During some exploratory testing, I ran into an issue where
the gateway would attempt to scale a deployment from zero
replicas to min, despite there already being min replicas.
Why?
The scaling logic was looking for Available replicas when
it should have looked for Desired replicas. So when a
deployment had zero ready replicas due to readiness checks
failing, the gateway was attempting to scale from zero
to min.
This logic has been corrected and separated from the
a holding pattern where the gateway waits for a ready
replica.
Tested with KinD and an edited function which had a
readiness probe, which was failing and no ready
replicas. As desired, the gateway did not scale to min.
However, when setting desired replicas to zero, the
gateway did scale up as expected.
This change also modifies all print statements for
"seconds" and makes them use 4 decimal places instead of
the default which was a longer, more verbose string for
the logs.
Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
Introduces a single-flight call to a function's health
endpoint to verify that it is registered with an Istio
sidecar (Envoy) before letting the invocation through.
Results are cached for 5 seconds, before a probe is
required again.
Tested without Istio, with probe_functions environment
variable set to true, I saw a probe execute in the logs.
Fixes: #1721 for Istio users.
Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alex@openfaas.com>
Enables publishing to various topics according to annotations
on the functions. The function cache is moved up one level so
that it can be shared between the scale from zero code and the
queue proxy.
Unit tests added for new internal methods.
Tested e2e with arkade and the newest queue-worker and RC
gateway image with two queues and an annotation on one of the
functions of com.openfaas.queue. It worked as expected including
with multiple namespace support.
Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
Allows alerts to trigger functions to scale when they
also have an optional namespace set.
Tested e2e with Kubernetes 1.15 and a non-default namespace.
Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
- extracting this package means it can be used in other components
such as the asynchronous nats-queue-worker which may need to
invoke functions which are scaled down to zero replicas.
Ref: https://github.com/openfaas/nats-queue-worker/issues/32
Tested on Docker Swarm for scaling up, already scaled and not
found error.
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
Trivial change to add logging around scale from zero events in scaling.go.
Previously scale from zero events were not logged in the same way that normal
scaling events are. This change adds log writes to show when a scale from zero
was requested and when a function successfully moved to > 0 replicas.
Signed-off-by: Richard Gee <richard@technologee.co.uk>
Within MakeScalingHandler() there is a call to GetReplicas() which was not returning an error when a non-200 http response was received from /system/function/. The call would also return a populated struct, so the perception was that a function existed an had been scaled to zero. This meant that the function would be added to the function cache and the code would continue into SetReplicas() where an attempt would be made to scale up a non-existent function.
This change amends GetReplicas() so that it will return an error if the gateway returns anything other than a 200 reponse code from the /system/function/ endpoint. This causes MakeScalingHandler() to return earlier with an error indicating that the function could not be found. The cache.Set call is also moved to after the error check so that the cache is only updated to include existent functions.
During investigations as to the cause of #876 tests were added to function_cache to check that Get() is behaving as intended when function exists and when not. Tests are also added to plugin/external to test that GetReplicas() and SetReplicas() are following their intended modes of operation when 200 and non-200 responses are received from the gateway.
Signed-off-by: Richard Gee <richard@technologee.co.uk>
Existing code has been used for scaling up and querying replicas.
This meant the new code was deleted and there is less duplication
now.
The cache store a whole query response rather than just the
available replica count and the tests were updated. This has been
tested with Docker swarm and the image:
openfaas/gateway:scale-17-07-2018
This feature now needs the env-var of scale_from_zero to be enabled
in order to turn on the scaling behaviour.
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
This change allows functions to be "idled" or scaled to zero
replicas and then be invoked later on. There is a penalty to
scaling up - the API gateway proxy will block until the function
is ready.
A cache is included to off-set the calls to upstream API to check
on readiness along with unit tests.
Testing via scaling to zero replicas and then invoking function.
On Swarm I observed 3 seconds on an Intel Nuc i5 for scaling back
from zero replicas.
Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>