Learn about the Tanzu Observability Usage Integration.

This page provides an overview of what you can do with the Tanzu Observability Usage integration. Only a limited number of integration documentation pages contain setup steps and instructions. If you do not see the setup steps here, navigate to the Operations for Applications GUI. The detailed instructions for setting up and configuring all integrations, including the Tanzu Observability Usage integration, are on the Setup tab of each integration.

  1. Log in to your Operations for Applications instance.
  2. Click Integrations on the toolbar, then search for and click the Tanzu Observability Usage tile.
  3. Click the Setup tab to see the most recent instructions.

Operations for Applications Usage Integration

The Operations for Applications Service and Proxy Data dashboard allows you to examine internal metrics and check whether your Operations for Applications instance is behaving as expected.

The Operations for Applications Ingestion Policy Explorer dashboard provides a granular breakdown of Operations for Applications ingestion across your organization by ingestion policies, accounts, sources, and types. Use this dashboard to identify who is contributing the most to your Operations for Applications usage and manage your overall usage.

The Operations for Applications Namespace Usage Explorer dashboard breaks down metrics usage by integration and lets you drill down further into the metric namespaces.

The Usage (PPS) vs Remaining Balance (PPS P95) for Burndown dashboard shows your monthly PPS usage against your remaining burndown balance. This dashboard applies only to customers who have burndown commit contracts with Operations for Applications.

The Committed Rate vs Monthly Usage (PPS P95) for Billable dashboard shows your monthly PPS usage against your monthly billable commitment. This dashboard applies only to customers who have billable commit contracts with Operations for Applications.
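
Both dashboards above report usage as PPS P95, that is, the 95th percentile of points-per-second samples. The following minimal sketch shows one common way to compute a 95th percentile (the nearest-rank method). The function name and the sample data are hypothetical, and the exact percentile method that the service uses is not stated on this page, so treat this as an illustration of the concept only.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method.

    Hypothetical helper for illustration; not part of any product API.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest rank: the smallest value such that at least pct percent
    # of the samples are less than or equal to it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical PPS samples: 1, 2, ..., 100 points per second.
pps_samples = list(range(1, 101))
p95 = percentile_nearest_rank(pps_samples, 95)
print(f"PPS P95: {p95}")
```

A P95 value discounts the top 5% of samples, so short ingestion spikes have less influence on the reported usage than they would on a maximum.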

Operations for Applications internal metrics have the following prefixes.

To modify the Operations for Applications Usage alerts, install them and clone them. You must update the required fields in cloned alerts.

Alerts

  • High rate of host IDs observed: Alerts when a tenant reports a high rate of new host (source) IDs to Tanzu Observability.
  • High rate of metric IDs observed: Alerts when a tenant reports a high rate of new metric IDs to Tanzu Observability.
  • High rate of string IDs observed: Alerts when a tenant reports a high rate of new string (point tag) IDs to Tanzu Observability.
  • Alert Webhooks failed: Alerts when alert webhooks end with a 4xx or 5xx error.
  • Proxy check-in with an invalid token observed: Alerts when a proxy checks in by using an invalid token.
  • Proxy Network Latency (P95): Reports the 95th percentile of the proxy latency, that is, the time it takes for the proxy to push its metrics. Consistently large values indicate that the network is experiencing latency.
  • Proxy Data Received Lag (P95): Reports the 95th percentile of the time difference (in milliseconds) between the timestamp on a point and the time that the proxy received it. Large values indicate backfilling of old data or clock drift in the sending systems. You can also graph other percentiles.
  • Proxy Backlog (spans) has been accumulating: Alerts when there are span backlogs on the proxy. Backlogs mean that the proxy is queuing spans because either the data transmission between the proxy and the Tanzu Observability service has been blocked, or the data is being pushed back by the service because the ingestion limit is reached.
  • Proxy Backlog (histograms) has been accumulating: Alerts when there are histogram backlogs on the proxy. Backlogs mean that the proxy is queuing histograms because either the data transmission between the proxy and the service has been blocked, or the data is being pushed back by the service because the ingestion limit is reached.
  • Proxy rate limiter is activated: Alerts when the 30-minute moving sum of the points-per-second rate is consistently high. Check which proxy is affected by the data being pushed back.
  • Proxy Backlog (points) has been accumulating: Alerts when there are metric backlogs on the proxy. Backlogs mean that the proxy is queuing metric points because either the data transmission between the proxy and the Tanzu Observability service has been blocked, or the data is being pushed back by the service because the ingestion limit is reached.
  • High proxy JVM memory heap usage observed: Alerts when the heap memory usage of the proxy is consistently high. Make sure that the memory of the proxy is reasonable.
  • Tanzu Observability rate limits exceeded: Alerts when the points per second that are being rate limited reach a certain threshold.
  • Invalid Alert Condition Found: Alerts when some alerts are in an invalid state. To list the alerts that are currently in an invalid state, search for the invalid status in the alert list.
  • Remaining Balance: Alerts when the usage balance is below 5% of the total contract rate.
  • Percentage of Usage Scanned (Real-Time): Alerts when the total percentage of usage scanned is above 95% of the committed rate.
  • Percentage of Usage Ingested (Real-Time): Alerts when the total percentage of usage ingested is above 95% of the committed rate.
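
The three usage alerts at the end of the list compare a value against a fixed percentage of a contract figure. The minimal sketch below illustrates that arithmetic with the 5% and 95% thresholds stated above. The function names, parameter names, and sample numbers are hypothetical, and the actual alert conditions are defined in the installed alerts rather than in this code.

```python
def remaining_balance_low(remaining_balance, total_contract_rate, threshold_pct=5.0):
    """Return True when the remaining balance is below threshold_pct of the contract.

    Hypothetical helper; both arguments are assumed to be in the same unit.
    """
    return remaining_balance < total_contract_rate * threshold_pct / 100


def usage_above_commit(usage, committed_rate, threshold_pct=95.0):
    """Return True when usage exceeds threshold_pct of the committed rate.

    Hypothetical helper that covers both the scanned and the ingested case.
    """
    if committed_rate <= 0:
        raise ValueError("committed_rate must be positive")
    return usage * 100 / committed_rate > threshold_pct


# Hypothetical numbers: a committed rate of 10,000 PPS.
print(remaining_balance_low(400, 10_000))   # 400 is below 5% of 10,000 (500)
print(usage_above_commit(9_600, 10_000))    # 96% is above the 95% threshold
print(usage_above_commit(9_500, 10_000))    # exactly 95% does not fire
```

The comparisons are strict, matching the wording "below 5%" and "above 95%", so a value that sits exactly on the threshold does not trigger the check in this sketch.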