Envoy statistics
By default, Aspen Mesh configures the Istio proxy (Envoy) to record a minimal set of statistics to reduce the overall performance footprint of the installed sidecars. The default collection keys are:
cluster_manager
listener_manager
server
cluster.xds-grpc
wasm
These default statistics are sufficient for most applications, but some use cases need additional statistics to fully understand what is happening within your service mesh. Aspen Mesh lets you capture additional Envoy statistics, such as the number of request retries. To do so, update your mesh proxy configuration or add the appropriate annotations to specific workloads or gateways.
The list of Envoy statistics is available here (note that not all of them may be available in a given release).
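For the mesh-wide option, a minimal sketch follows. It assumes your installation exposes Istio's standard `meshConfig.defaultConfig` block (for example in Helm values or an IstioOperator spec). Where that block lives in an Aspen Mesh install is an assumption here, so check the values file for your release before applying it.

```yaml
# Sketch only: the location of meshConfig in your install is an assumption.
meshConfig:
  defaultConfig:
    proxyStatsMatcher:
      # Applies to every sidecar and gateway that uses the default proxy config.
      inclusionPrefixes:
        - "upstream_rq_retry"
```

A mesh-wide setting affects every proxy, so the per-workload annotation is usually the lower-risk choice when only one service is under investigation.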
Warning
Including additional Envoy statistics might significantly increase the number of time series collected by Prometheus. Special care may need to be taken when configuring Prometheus to reduce cardinality.
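One way to limit the extra time series is to drop the ones you do not need at scrape time. The fragment below is a sketch of a Prometheus `metric_relabel_configs` rule. The job name and the metric names in the regex are assumptions for illustration, so confirm the names your proxies actually export before using it.

```yaml
scrape_configs:
  - job_name: "envoy-stats"   # hypothetical job name
    # ... keep the existing discovery and relabeling settings for this job ...
    metric_relabel_configs:
      # Drop retry sub-counters that are not needed (assumed metric names),
      # while keeping the main retry counter.
      - source_labels: [__name__]
        regex: "envoy_cluster_upstream_rq_retry_(backoff_exponential|backoff_ratelimited|overflow)"
        action: drop
```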
Example: Counting how many times a request is automatically retried for a particular workload
This can be accomplished by adding the Envoy upstream_rq_retry statistic as part of the proxy.istio.io/config annotation on the workload under observation.
Below is an example of the annotation:
metadata:
  annotations:
    proxy.istio.io/config: |-
      proxyStatsMatcher:
        inclusionPrefixes:
        - "upstream_rq_retry"
Here is the sleep Deployment, modified to use the annotation:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleep
  template:
    metadata:
      labels:
        app: sleep
      annotations:
        proxy.istio.io/config: |-
          proxyStatsMatcher:
            inclusionPrefixes:
            - "upstream_rq_retry"
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - name: sleep
        image: curlimages/curl
        command: ["/bin/sh", "-c", "while true; do curl -XGET https://httpbin.org/status/500 && sleep 5000; done"]
        imagePullPolicy: IfNotPresent
Warning
Once applied, you must restart your pod so that the Istio proxy (Envoy) picks up the stats matcher configuration.
With the annotation in place and the pod restarted, the proxy reports the additional retry statistics, for example:
cluster.outbound|80||httpbin.example.svc.cluster.local.retry.upstream_rq_503: 3
cluster.outbound|80||httpbin.example.svc.cluster.local.retry.upstream_rq_5xx: 3
cluster.outbound|80||httpbin.example.svc.cluster.local.retry.upstream_rq_completed: 3
cluster.outbound|80||httpbin.example.svc.cluster.local.upstream_rq_retry: 3
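If a prefix match is too narrow or too broad, Istio's `proxyStatsMatcher` also accepts `inclusionRegexps` and `inclusionSuffixes` alongside `inclusionPrefixes`. The variant below is a sketch showing those fields. The specific patterns are illustrative assumptions, not values taken from this guide.

```yaml
metadata:
  annotations:
    proxy.istio.io/config: |-
      proxyStatsMatcher:
        # Match the retry counters anywhere in the stat name (illustrative pattern).
        inclusionRegexps:
        - ".*upstream_rq_retry.*"
        # Match stats whose name ends with this string (illustrative pattern).
        inclusionSuffixes:
        - "upstream_rq_timeout"
```

Broad regular expressions can match many statistics per cluster, so the Prometheus cardinality warning applies here as well.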