- as reported on Slack and in issue #931 the gateway scaling code
was scaling to zero replicas as a result of the "proportional
scaling" added by @Templum's PR. This commit added a failing test
which was fixed by adding boundary checking - now if the scaling
amount is "0" we keep the current amount of replicas.
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
Trivial change to add logging around scale from zero events in scaling.go.
Previously scale from zero events were not logged in the same way that normal
scaling events are. This change adds log writes to show when a scale from zero
was requested and when a function successfully moved to > 0 replicas.
Signed-off-by: Richard Gee <richard@technologee.co.uk>
- Covers part of 919 by making the HTTP client used for proxying
stop following redirects. Tested with a stateless microservice,
but additional code changes may be requierd in the queue-worker,
the watchdogs and other areas.
Tested on Swarm with stateless microservice (Node.js) issuing
a redirect via Location header.
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
**What**
- Split the table into official providers and community providers
- Move the site links to the site column
Signed-off-by: Lucas Roesler <roesler.lucas@gmail.com>
- Removes use of "our" from CONTRIBUTING guide
- Updates/adds README.md files
- Commnents and typo fix in watchdog
- Adds good/bad examples of commit messages
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
- updates comments and adds where missing
- updates locks so that unlock is done via defer instead of
at the end of the statement
- extracts timeout variable in two places
- remove makeClient() unused method from metrics package
No-harm changes tested via go build.
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
This commit adds a line to the issue template
to request the faas-cli version and provider
information to the issue template to help diagnose
issues faster
Signed-off-by: Burton Rheutan <rheutan7@gmail.com>
Within MakeScalingHandler() there is a call to GetReplicas() which was not returning an error when a non-200 http response was received from /system/function/. The call would also return a populated struct, so the perception was that a function existed an had been scaled to zero. This meant that the function would be added to the function cache and the code would continue into SetReplicas() where an attempt would be made to scale up a non-existent function.
This change amends GetReplicas() so that it will return an error if the gateway returns anything other than a 200 reponse code from the /system/function/ endpoint. This causes MakeScalingHandler() to return earlier with an error indicating that the function could not be found. The cache.Set call is also moved to after the error check so that the cache is only updated to include existent functions.
During investigations as to the cause of #876 tests were added to function_cache to check that Get() is behaving as intended when function exists and when not. Tests are also added to plugin/external to test that GetReplicas() and SetReplicas() are following their intended modes of operation when 200 and non-200 responses are received from the gateway.
Signed-off-by: Richard Gee <richard@technologee.co.uk>
- the path was converted to lowercase which meant the samples
would not build, this fixes that error.
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
- the shutdown sequence meant that the kubelet was still passing
work to the watchdog after the HTTP socket was closed. This change
means that the kubelet has a chance to run its check before we
finally stop accepting new connections. It will require some
basic co-ordination between the kubelet's checking period and the
"write_timeout" value in the container.
Tested with Kubernetes on GKE - before the change some Pods were
giving a connection refused error due to them being not detected
as unhealthy. Now I receive 0% error rate even with 20 qps.
Issue was shown by scaling to 20 replicas, starting a test with
hey and then scaling to 1 replica while tailing the logs from the
gateway. Before I saw some 502, now I see just 200s.
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
This change adds link to the table of contents on the
community.md file as well as adding links to return to top
from the headers.
Signed-off-by: Burton Rheutan <rheutan7@gmail.com>