Metrics#

Overview#

Introduction#

The DNS controller exposes metrics that you can scrape using Prometheus.

Metric endpoint configuration#

DNS controller pod metrics are available at the following endpoint.

| Port | Endpoint |
| --- | --- |
| 8080 | /metrics |
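
As a rough illustration of consuming this endpoint outside Prometheus, the sketch below builds the scrape URL and filters DNS controller samples out of text in the Prometheus exposition format. The `http` scheme, the pod IP, the helper names, and the sample text are assumptions made for illustration; real output from the controller will differ.

```python
import re

# Hypothetical sample of scraped text; real controller output will differ.
SAMPLE = """\
# HELP dns_controller_dns_lookups_total The number of DNS lookups
# TYPE dns_controller_dns_lookups_total counter
dns_controller_dns_lookups_total{type="a"} 12
dns_controller_dns_lookups_total{type="aaaa"} 7
go_goroutines 42
"""

# metric_name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def metrics_url(pod_ip, port=8080, path="/metrics"):
    """Build the scrape URL (plain HTTP is an assumption, not stated in the docs)."""
    return f"http://{pod_ip}:{port}{path}"


def parse_metrics(text, prefix="dns_controller_"):
    """Return (name, labels, value) tuples for samples whose name starts with prefix."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        if not name.startswith(prefix):
            continue  # ignore controller-runtime default metrics
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    print(metrics_url("10.0.0.5"))
    for sample in parse_metrics(SAMPLE):
        print(sample)
```

In a live cluster, the text passed to `parse_metrics` could come from `urllib.request.urlopen(metrics_url(pod_ip))`, assuming the pod is reachable from where the script runs.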

Metric types#

Metric types are defined by Prometheus. Learn about Prometheus metric types.

Metric categories#

The DNS controller exposes two categories of metrics:

  • Default metrics: The metrics exposed by the controller runtime. Learn about the default metrics.

  • Custom metrics: The metrics exposed by the DNS controller application code, which are described below.

About SOA records#

When the DNS controller executes a query for an IPv4-address (A) or IPv6-address (AAAA) record and that record doesn’t exist on the Kubernetes DNS server, the DNS controller receives an NXDOMAIN response from the Kubernetes DNS server. The DNS controller then executes a query for a start-of-authority (SOA) record to obtain a negative caching time to live (TTL). The DNS controller uses the negative caching TTL to determine how long to wait before executing another query for the IPv4-address (A) or IPv6-address (AAAA) record.
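
The retry timing described above can be sketched as follows. This is not the DNS controller's code: it assumes the common RFC 2308 convention, in which the negative caching TTL is the smaller of the SOA record's own TTL and its MINIMUM field. The `SoaRecord` type, the function names, and the `default_wait` fallback are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoaRecord:
    ttl: int      # TTL of the SOA record itself, in seconds
    minimum: int  # MINIMUM field of the SOA record, in seconds


def negative_ttl(soa: SoaRecord) -> int:
    """Negative caching TTL per RFC 2308: min(SOA TTL, SOA MINIMUM)."""
    return min(soa.ttl, soa.minimum)


def next_query_time(now: float, soa: Optional[SoaRecord],
                    default_wait: int = 30) -> float:
    """Earliest time to retry an A/AAAA query after an NXDOMAIN response.

    default_wait is a hypothetical fallback for when no SOA record was obtained.
    """
    wait = negative_ttl(soa) if soa is not None else default_wait
    return now + wait


if __name__ == "__main__":
    soa = SoaRecord(ttl=3600, minimum=300)
    print(negative_ttl(soa))             # 300
    print(next_query_time(1000.0, soa))  # 1300.0
```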

Observe metrics in Grafana using the sample dashboard#

Introduction#

If you want to observe metrics for the DNS controller in Grafana, you can import the sample DNS controller dashboard provided with Aspen Mesh to use as a starting point for your own dashboard.

Before you begin#

Before you can observe metrics for the DNS controller in Grafana, make sure the following software is installed on your cluster:

  • Prometheus

  • Grafana

  • The DNS controller

Import the sample dashboard#

  1. Log in to Grafana.

  2. Click Dashboards in the left-side menu.

  3. Click the New pop-up menu, and then choose Import.

  4. Upload the following dashboard JSON file:

    • Directory: samples/aspenmesh/dashboards (in the Aspen Mesh release directory)

    • File: dns-controller-dashboard.json

  5. Select your Prometheus data source in the pop-up menu.

  6. Click Import.

Metrics#

dns_controller_created_istio_service_entries_total#

Description#

The number of created managed, static-resolution Istio service entries

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| namespace | The namespace of the managed, static-resolution Istio service entry | Example: bookinfo |

dns_controller_deleted_istio_service_entries_total#

Description#

The number of deleted managed, static-resolution Istio service entries

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| host | The host of the managed, static-resolution Istio service entry | Example: httpbin.org |
| name | The name of the managed, static-resolution Istio service entry | Example: httpbin-78dd67af5b |
| namespace | The namespace of the managed, static-resolution Istio service entry | Example: bookinfo |

dns_controller_dns_exchange_failures_total#

Description#

The number of DNS exchange (communication) failures

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| response_code | The exchange response code | exchange: No response was returned; a numeric DNS response code, for example 2 (ServFail); no_answer: The response code was 0 (NoError), but no answer was included |

dns_controller_dns_lookup_failures_total#

Description#

The number of DNS lookup failures

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| soa_record | If type is a or aaaa, whether the lookup results contained a start-of-authority (SOA) record | true: Contained an SOA record; false: Did not contain an SOA record (this value is also used when type is soa) |
| type | The type of DNS record queried | a: IPv4 address; aaaa: IPv6 address; soa: Start of authority |

dns_controller_dns_lookups_total#

Description#

The number of DNS lookups

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| type | The type of DNS record queried | a: IPv4 address; aaaa: IPv6 address; soa: Start of authority |
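
Because both this counter and dns_controller_dns_lookup_failures_total carry a type label, the two can be combined into a per-type failure ratio. The sketch below assumes that dns_controller_dns_lookups_total counts every lookup, including failed ones, which the description does not state explicitly. The function name and input shape are hypothetical.

```python
from collections import defaultdict


def failure_ratio_by_type(lookups, failures):
    """Compute failures / lookups per DNS record type.

    Both arguments are iterables of (labels, value) pairs, where labels is a
    dict that contains at least the "type" label. Other labels, such as
    soa_record on the failure counter, are summed away.
    """
    totals = defaultdict(float)
    failed = defaultdict(float)
    for labels, value in lookups:
        totals[labels["type"]] += value
    for labels, value in failures:
        failed[labels["type"]] += value
    # A ratio is undefined when no lookups were recorded for that type.
    return {t: (failed[t] / n if n else None) for t, n in totals.items()}


if __name__ == "__main__":
    lookups = [({"type": "a"}, 100.0), ({"type": "aaaa"}, 50.0)]
    failures = [
        ({"type": "a", "soa_record": "true"}, 3.0),
        ({"type": "a", "soa_record": "false"}, 2.0),
    ]
    print(failure_ratio_by_type(lookups, failures))
```

Since both metrics are counters, ratios computed from the increase over a time window are usually more informative than ratios of the raw totals.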

dns_controller_dns_query_duration_seconds#

Description#

This metric consists of two submetrics that allow you to determine the average number of seconds the DNS controller spent executing a DNS query (lookup):

  • dns_controller_dns_query_duration_seconds_sum: The total number of seconds the DNS controller spent executing DNS queries

  • dns_controller_dns_query_duration_seconds_count: The number of DNS queries the DNS controller executed
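
The average follows from dividing the sum by the count. A minimal sketch, with hypothetical function names and made-up sample values, shows both the lifetime average and the average over a window between two scrapes:

```python
def average_duration(sum_seconds, count):
    """Lifetime average: total seconds divided by number of queries."""
    if count == 0:
        return None  # no queries recorded yet
    return sum_seconds / count


def average_over_window(sum_start, count_start, sum_end, count_end):
    """Average duration of the queries executed between two scrapes."""
    delta_count = count_end - count_start
    if delta_count <= 0:
        return None  # no new queries, or the counters were reset
    return (sum_end - sum_start) / delta_count


if __name__ == "__main__":
    # Made-up values for the _sum and _count series at two scrape times.
    print(average_duration(2.5, 10))
    print(average_over_window(2.5, 10, 4.0, 20))
```

The windowed form is generally the more useful one for dashboards, because the lifetime average changes very slowly once the controller has been running for a while.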

Type#

Histogram

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| type | The portion of the DNS-query-execution timespan | fqdn: The complete time executing queries for A (when applicable) and AAAA (when applicable) records; a: Executing queries for IPv4-address (A) records; aaaa: Executing queries for IPv6-address (AAAA) records; soa: Recursively executing queries for start-of-authority (SOA) records (when the A or AAAA query didn’t provide the SOA record) |

dns_controller_dns_static_entries#

Description#

The number of Aspen Mesh DNS static entries being managed

Type#

Gauge

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| namespace | The namespace of the Aspen Mesh DNS static entry | Example: bookinfo |

dns_controller_dns_truncated_responses_total#

Description#

The number of DNS responses that were truncated because their size exceeded the UDP limit

Type#

Counter

Labels#

(No labels are reported for this metric.)

dns_controller_empty_dns_record_lookups_total#

Description#

The number of empty DNS records returned

Type#

Counter

Labels#

(No labels are reported for this metric.)

dns_controller_kube_apiserver_errors_total#

Description#

The number of Kubernetes API-server request errors

Type#

Counter

Labels#

(No labels are reported for this metric.)

dns_controller_kubernetes_query_duration_seconds#

Description#

This metric consists of two submetrics that allow you to determine the average number of seconds the DNS controller spent executing a Kubernetes query:

  • dns_controller_kubernetes_query_duration_seconds_sum: The total number of seconds the DNS controller spent executing Kubernetes queries

  • dns_controller_kubernetes_query_duration_seconds_count: The number of Kubernetes queries the DNS controller executed

Type#

Histogram

Labels#

(No labels are reported for this metric.)

dns_controller_kubernetes_update_duration_seconds#

Description#

This metric consists of two submetrics that allow you to determine the average number of seconds the DNS controller spent executing a Kubernetes update:

  • dns_controller_kubernetes_update_duration_seconds_sum: The total number of seconds the DNS controller spent executing Kubernetes updates

  • dns_controller_kubernetes_update_duration_seconds_count: The number of Kubernetes updates the DNS controller executed

Type#

Histogram

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| type | The type of update | delete: Deleting a managed, static-resolution Istio service entry; create: Creating a managed, static-resolution Istio service entry; update: Updating a managed, static-resolution Istio service entry; status: Updating the status of a managed, static-resolution Istio service entry |

dns_controller_missing_dns_static_entries_total#

Description#

The number of missing Aspen Mesh DNS static entries. An Aspen Mesh DNS static entry is considered missing when an operator deletes the Aspen Mesh DNS static entry.

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| namespace | The namespace of the Aspen Mesh DNS static entry | Example: bookinfo |

dns_controller_missing_istio_service_entry_total#

Description#

The number of missing managed, static-resolution Istio service entries. A managed, static-resolution Istio service entry is considered missing when one of the following is true:

  • An Aspen Mesh DNS static entry exists for the off-mesh service, but the DNS controller is unable to obtain an IP address for that service and therefore can’t create the related managed, static-resolution Istio service entry.

  • An operator deletes the managed, static-resolution Istio service entry. When this occurs, the DNS controller re-creates the managed, static-resolution Istio service entry.

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| name | The name of the managed, static-resolution Istio service entry | Example: httpbin-78dd67af5b |
| namespace | The namespace of the managed, static-resolution Istio service entry | Example: bookinfo |

dns_controller_negative_ttl_total#

Description#

The number of times a negative TTL was set

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| host | The host of the Aspen Mesh DNS static entry | Example: httpbin.org |
| name | The name of the Aspen Mesh DNS static entry | Example: httpbin |
| namespace | The namespace of the Aspen Mesh DNS static entry | Example: bookinfo |

dns_controller_refresh_did_not_update_istio_service_entry_total#

Description#

The number of times a managed, static-resolution Istio service entry expired and was resolved again, but the resolution didn’t cause any changes

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| name | The name of the managed, static-resolution Istio service entry | Example: httpbin-78dd67af5b |
| namespace | The namespace of the managed, static-resolution Istio service entry | Example: bookinfo |

dns_controller_updated_istio_service_entries_total#

Description#

The number of updates to managed, static-resolution Istio service entries

Type#

Counter

Labels#

| Label name | Description | Label value |
| --- | --- | --- |
| name | The name of the managed, static-resolution Istio service entry | Example: httpbin-78dd67af5b |
| namespace | The namespace of the managed, static-resolution Istio service entry | Example: bookinfo |