28 Commits

Author SHA1 Message Date
Alex Ellis (OpenFaaS Ltd)
8e711b3a0c Use Desired Replicas when scaling from zero
During some exploratory testing, I ran into an issue where
the gateway would attempt to scale a deployment from zero
replicas to min, despite there already being min replicas.

Why?

The scaling logic was looking for Available replicas when
it should have looked for Desired replicas. So when a
deployment had zero ready replicas due to readiness checks
failing, the gateway was attempting to scale from zero
to min.

This logic has been corrected and separated from the
a holding pattern where the gateway waits for a ready
replica.

Tested with KinD and an edited function which had a
readiness probe, which was failing and no ready
replicas. As desired, the gateway did not scale to min.

However, when setting desired replicas to zero, the
gateway did scale up as expected.

This change also modifies all print statements for
"seconds" and makes them use 4 decimal places instead of
the default which was a longer, more verbose string for
the logs.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2022-09-08 11:21:29 +01:00
Alex Ellis (OpenFaaS Ltd)
88eea5f62e Feature for probing functions
Introduces a single-flight call to a function's health
endpoint to verify that it is registered with an Istio
sidecar (Envoy) before letting the invocation through.

Results are cached for 5 seconds, before a probe is
required again.

Tested without Istio, with probe_functions environment
variable set to true, I saw a probe execute in the logs.

Fixes: #1721 for Istio users.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alex@openfaas.com>
2022-07-07 10:35:07 +01:00
Alex Ellis (OpenFaaS Ltd)
cc2f38938e Add HTTP status code to histogram
The histogram for gateway_functions_seconds excluded the status
code that gives important information for setting up SLOs.

Fixes: #1725

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alex@openfaas.com>
2022-06-01 10:14:18 +01:00
Alex Ellis (OpenFaaS Ltd)
feb649be86 Turn off query for usage for invocations
Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2022-01-24 14:38:12 +00:00
Alex Ellis (OpenFaaS Ltd)
d85d5e7239 Export new metrics for OpenFaaS Pro scaling
* Add service target metric
* Add service min replicas metric
* Add scale type metric

These combined allow new auto-scaling modes and parameters
for OpenFaaS Pro customers.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2022-01-24 14:30:08 +00:00
Alex Ellis (OpenFaaS Ltd)
4ced7cacd0 Update scaling polling interval
Polls every 100 ms instead of every 50ms for use with a new
faas-netes PR:

https://github.com/openfaas/faas-netes/pull/726

The call is much faster now, so this request should me made
at a lower frequency.

Also adds error handling for URL building in the external
service query code.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2020-12-10 14:00:31 +00:00
Alex Ellis (OpenFaaS Ltd)
11ff356cce Remove service namespace from scale request
This field doesn't appear to be used and is supplied via the
querystring which faas-netes (the only ns-enabled provider
already consumes)

Related to PR:

https://github.com/openfaas/faas-netes/pull/671

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2020-07-20 10:19:26 +01:00
Alex Ellis (OpenFaaS Ltd)
2bfca6d848 Publish to multiple topics
Enables publishing to various topics according to annotations
on the functions. The function cache is moved up one level so
that it can be shared between the scale from zero code and the
queue proxy.

Unit tests added for new internal methods.

Tested e2e with arkade and the newest queue-worker and RC
gateway image with two queues and an annotation on one of the
functions of com.openfaas.queue. It worked as expected including
with multiple namespace support.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2020-04-22 15:26:42 +01:00
Alex Ellis (OpenFaaS Ltd)
df4126d8f5 Scale functions with namespace option
Allows alerts to trigger functions to scale when they
also have an optional namespace set.

Tested e2e with Kubernetes 1.15 and a non-default namespace.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2019-09-20 18:38:55 +01:00
Alex Ellis (OpenFaaS Ltd)
df97efafae Migrate away from requests package for Function structs
The function deployment and status structs have been moved away
into the faas-provider package.

Tested with a build, running tests, and CI.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
2019-08-05 12:58:30 +01:00
Alex Ellis
1cf030da48 Differentiate external service auth from user auth
Signed-off-by: Alex Ellis <alexellis2@gmail.com>
2019-06-09 20:08:39 +01:00
Alex Ellis (VMware)
299e5a5933 Read config values from environment for max_conns tuning
- max_conns / idle / per host are now read from env-vars and have
defaults set to 1024 for both values
- logging / metrics are collected in the client transaction
rather than via defer (this may impact throughput)
- function cache moved to use RWMutex to try to improve latency
around locking when updating cache
- logging message added to show latency in running GetReplicas
because this was observed to increase in a linear fashion under
high concurrency
- changes tested against 3-node bare-metal 1.13 K8s cluster
with kubeadm

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2019-02-04 11:50:25 +00:00
Alex Ellis (VMware)
9cea08c728 Extract scaling from zero
- extracting this package means it can be used in other components
such as the asynchronous nats-queue-worker which may need to
invoke functions which are scaled down to zero replicas.

Ref: https://github.com/openfaas/nats-queue-worker/issues/32

Tested on Docker Swarm for scaling up, already scaled and not
found error.

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2018-11-01 15:10:08 +00:00
Alex Ellis
bd39b9267a Update comments
- updates comments and adds where missing
- updates locks so that unlock is done via defer instead of
at the end of the statement
- extracts timeout variable in two places
- remove makeClient() unused method from metrics package

No-harm changes tested via go build.

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2018-10-03 13:16:28 +01:00
Richard Gee
df6f4c49f2 Add checking for existent function in GetReplicas
Within MakeScalingHandler() there is a call to GetReplicas() which was not returning an error when a non-200 http response was received from /system/function/.  The call would also return a populated struct, so the perception was that a function existed an had been scaled to zero.  This meant that the function would be added to the function cache and the code would continue into SetReplicas() where an attempt would be made to scale up a non-existent function.

This change amends GetReplicas() so that it will return an error if the gateway returns anything other than a 200 reponse code from the /system/function/ endpoint.  This causes MakeScalingHandler() to return earlier with an error indicating that the function could not be found.  The cache.Set call is also moved to after the error check so that the cache is only updated to include existent functions.

During investigations as to the cause of #876 tests were added to function_cache to check that Get() is behaving as intended when function exists and when not.  Tests are also added to plugin/external to test that GetReplicas() and SetReplicas() are following their intended modes of operation when 200 and non-200 responses are received from the gateway.

Signed-off-by: Richard Gee <richard@technologee.co.uk>
2018-09-23 12:39:13 +01:00
Alex Ellis (VMware)
3598da2e51 Enable basic auth for service query / scaling on provider
- this is a blocking issue for auth with Docker Swarm
fixes #879

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2018-09-19 20:52:14 +01:00
Vivek Singh
07f3e8f624 Accept 202 as valid code for /system/scale-function/<function>
Updated gateway to accept 202 as valid response code for
/system/scale-function/<function> along with 200.

Fixes: #faas-netes/245

Signed-off-by: Vivek Singh <vivekkmr45@yahoo.in>
2018-07-16 11:13:11 +01:00
Alex Ellis (VMware)
811bbe6031 Apply gofmt
Previous PR from Simon or Ken broke build due to missing gofmt
in the PR. This PR applies it to resolve the build issue.

Signed-off-by: Alex Ellis (VMware) <alexellis2@gmail.com>
2018-04-11 20:46:20 -07:00
Templum
0b28e96a58 Adjusted code accordingly to feedback
Based on the received feedback I updated the documentation of the function. Also replaced variable temp by an more declaritive variable.
Signed-off-by: Simon Pelczer <templum.dev@gmail.com>
2018-04-11 19:30:43 -07:00
Templum
6cd6975fe5 Moved label extraction to dedicated function.
Further I created some unit test which should cover all relevant scenarios for the created function.

Signed-off-by: Simon Pelczer <templum.dev@gmail.com>
2018-04-11 19:30:43 -07:00
Simon Pelczer
7fe67d7af6 Implemented the autoscaling steps to be proportions of the max replicas.
Introduced an new label to set the scaling factor that is used to calculate th proportions, setting it to 0 also allows to disable scaling.
Updated the tests to reflect the changes and added a new test which shows that setting the scaling factor to 0 indeed does disable scaling.
Ensured that the scaling factor is always between [0 and 100].

Signed-off-by: Simon Pelczer <templum.dev@gmail.com>
2018-04-11 19:30:43 -07:00
Alex Ellis
30928739ee Use context for upstream timeouts
Signed-off-by: Alex Ellis <alexellis2@gmail.com>
2018-03-05 12:49:25 +00:00
Alex Ellis
23a7187435 Refactoring: variable names, adding tests and http constants
Signed-off-by: Alex Ellis <alexellis2@gmail.com>
2017-12-05 06:50:08 -06:00
Alex Ellis
2452fdea0b Allow min-scale
Signed-off-by: Alex Ellis <alexellis2@gmail.com>
2017-12-05 06:50:08 -06:00
John McCabe
89878f0c8a Migrate from alexellis org to openfaas
Note, not all `alexellis/github` references should be changed, there are
a number of repos which are not part of the openfaas org, this commit
excludes those.

Signed-off-by: John McCabe <john@johnmccabe.net>
2017-10-04 09:18:06 +01:00
Alex Ellis
c5815d36ab add_missing_mit_header_27_aug
Signed-off-by: Alex Ellis <alexellis2@gmail.com>
2017-08-27 22:37:19 +01:00
Alex Ellis
f165ce2ca7 External replica proxy 2017-08-08 09:14:46 +01:00
Alex Ellis
33381a783a External alert-handler support. 2017-08-08 09:14:46 +01:00