15 Commits

Author SHA1 Message Date
Alex Ellis (OpenFaaS Ltd)
8e711b3a0c Use Desired Replicas when scaling from zero
During some exploratory testing, I ran into an issue where
the gateway would attempt to scale a deployment from zero
replicas to min, despite there already being min replicas.

Why?

The scaling logic was looking for Available replicas when
it should have looked for Desired replicas. So when a
deployment had zero ready replicas due to readiness checks
failing, the gateway was attempting to scale from zero
to min.

This logic has been corrected and separated from the
a holding pattern where the gateway waits for a ready
replica.

Tested with KinD and an edited function which had a
readiness probe, which was failing and no ready
replicas. As desired, the gateway did not scale to min.

However, when setting desired replicas to zero, the
gateway did scale up as expected.

This change also modifies all print statements for
"seconds" and makes them use 4 decimal places instead of
the default which was a longer, more verbose string for
the logs.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2022-09-08 11:21:29 +01:00
Alex Ellis (OpenFaaS Ltd)
88eea5f62e Feature for probing functions
Introduces a single-flight call to a function's health
endpoint to verify that it is registered with an Istio
sidecar (Envoy) before letting the invocation through.

Results are cached for 5 seconds, before a probe is
required again.

Tested without Istio, with probe_functions environment
variable set to true, I saw a probe execute in the logs.

Fixes: #1721 for Istio users.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alex@openfaas.com>
2022-07-07 10:35:07 +01:00
Alex Ellis (OpenFaaS Ltd)
01841f605c Use sync package from unofficial Go library
Uses the sync package from the unofficial Go library instead
of simpler solution.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alex@openfaas.com>
2022-06-29 09:43:58 +01:00
Alex Ellis (OpenFaaS Ltd)
6ed0ab71fb Move to single-flight for back-end queries
When querying for replicas during a scale up event, then the
gateway can overwhelm the provider with requests. This is
especially true under high concurrent load.

The changes in this PR limit the inflight requests.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alex@openfaas.com>
2022-06-29 09:43:58 +01:00
Alex Ellis (OpenFaaS Ltd)
d85d5e7239 Export new metrics for OpenFaaS Pro scaling
* Add service target metric
* Add service min replicas metric
* Add scale type metric

These combined allow new auto-scaling modes and parameters
for OpenFaaS Pro customers.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2022-01-24 14:30:08 +00:00
Alistair Hey
ca3d53c0a5 Add publish step to github actions
Signed-off-by: Alistair Hey <alistair@heyal.co.uk>
2020-12-10 14:54:38 +00:00
Alex Ellis (OpenFaaS Ltd)
5741610f31 Return cached query when hit
The code was calling into the cache twice, even if the first
call was a cache hit and not a miss.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2020-05-06 18:32:51 +01:00
Alex Ellis (OpenFaaS Ltd)
18f6c720b5 Extract a caching function_query type
This type abstracts the function_query type and introduces an
interface for testing and substitution.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2020-04-22 15:26:42 +01:00
Alex Ellis (OpenFaaS Ltd)
2bfca6d848 Publish to multiple topics
Enables publishing to various topics according to annotations
on the functions. The function cache is moved up one level so
that it can be shared between the scale from zero code and the
queue proxy.

Unit tests added for new internal methods.

Tested e2e with arkade and the newest queue-worker and RC
gateway image with two queues and an annotation on one of the
functions of com.openfaas.queue. It worked as expected including
with multiple namespace support.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2020-04-22 15:26:42 +01:00
Vivek Singh
9fd8a59b89 Update logs to reudce verbosity and consistency
Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
2020-01-24 12:39:16 +00:00
Alex Ellis (OpenFaaS Ltd)
df4126d8f5 Scale functions with namespace option
Allows alerts to trigger functions to scale when they
also have an optional namespace set.

Tested e2e with Kubernetes 1.15 and a non-default namespace.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2019-09-20 18:38:55 +01:00
Alex Ellis (VMware)
299e5a5933 Read config values from environment for max_conns tuning
- max_conns / idle / per host are now read from env-vars and have
defaults set to 1024 for both values
- logging / metrics are collected in the client transaction
rather than via defer (this may impact throughput)
- function cache moved to use RWMutex to try to improve latency
around locking when updating cache
- logging message added to show latency in running GetReplicas
because this was observed to increase in a linear fashion under
high concurrency
- changes tested against 3-node bare-metal 1.13 K8s cluster
with kubeadm

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2019-02-04 11:50:25 +00:00
Alex Ellis (VMware)
b4c12f824b Make use of cache in scaling
- this reinstates the cache to reduce the count of lookups to the
provider when checking if scaling is needed.

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2018-11-07 13:49:56 +00:00
Alex Ellis (VMware)
117707df14 Enable backoff/retries on scaling up
- this change is needed for Docker Swarm which may give an error
when several concurrent requests come in to scale a deployment.

Tested on Docker Swarm before/after with the hey tool and figlet
scaled down to zero replicas.

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2018-11-07 13:49:56 +00:00
Alex Ellis (VMware)
9cea08c728 Extract scaling from zero
- extracting this package means it can be used in other components
such as the asynchronous nats-queue-worker which may need to
invoke functions which are scaled down to zero replicas.

Ref: https://github.com/openfaas/nats-queue-worker/issues/32

Tested on Docker Swarm for scaling up, already scaled and not
found error.

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2018-11-01 15:10:08 +00:00