Bill shock is not an accident. It is the pricing model.
Engineering teams describe it as a rite of passage. You deploy Datadog, the UX is excellent, adoption spreads, and then one quarter the invoice arrives at three, five, or ten times what was budgeted. Reddit threads, blog posts, and finance escalations tell the same story in slightly different words. It is so common that "Datadog tax" is a meme among SREs.
The reason it keeps happening is not that finance teams are bad at forecasting. It is that Datadog bills on more than 25 separately metered SKUs, and several of them charge twice for the same data. Once you see how the meter runs, the shock stops feeling random.
The double charge on logs
Datadog log management is billed in two line items. You pay $0.10 per GB to ingest. Then you pay $1.70 per million events to index at the standard 15-day retention tier. The same log you just sent in gets metered again the moment it becomes searchable.
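For a typical environment, assuming roughly one million events per GB (an average event of about 1 KB), the effective cost lands around $1.80 per GB once both lines are added together. At that rate, 100 GB per day is roughly $65,700 per year for log management alone, and 500 GB per day is roughly $328,500 per year at list price. Smaller events push the total higher, because indexing is metered per event rather than per byte. Add Cloud SIEM at $0.20 per GB analyzed and you are paying a third time for the same bytes.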
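The two-line arithmetic is easy to check for your own volumes. The sketch below is a rough model, not a Datadog calculator: the rates are the list prices quoted above, while the events-per-GB figure, the function names, and the absence of any volume discount are assumptions. Annual totals move a lot with average event size, so treat the output as illustrative.

```python
# Rough model of the two log line items plus optional Cloud SIEM.
# Rates come from the text above; everything else is an assumption.
INGEST_PER_GB = 0.10        # USD per GB ingested
INDEX_PER_MILLION = 1.70    # USD per million events indexed (15-day retention)
SIEM_PER_GB = 0.20          # USD per GB analyzed by Cloud SIEM


def effective_cost_per_gb(events_per_gb: float, include_siem: bool = False) -> float:
    """Ingest cost plus indexing cost for one GB of logs."""
    cost = INGEST_PER_GB + INDEX_PER_MILLION * events_per_gb / 1_000_000
    if include_siem:
        cost += SIEM_PER_GB
    return cost


def annual_log_cost(gb_per_day: float,
                    events_per_gb: float = 1_000_000,
                    include_siem: bool = False,
                    days: int = 365) -> float:
    """List-price annual cost, assuming every ingested event is also indexed."""
    return gb_per_day * days * effective_cost_per_gb(events_per_gb, include_siem)


if __name__ == "__main__":
    # Average event size drives the indexing line: smaller events, more events per GB.
    for events_per_gb in (500_000, 1_000_000, 2_000_000):
        yearly = annual_log_cost(100, events_per_gb)
        print(f"{events_per_gb:>9,} events/GB -> ${yearly:,.0f} per year at 100 GB/day")
```

Running it with your own daily volume and average event size shows how much of the total comes from the indexing line rather than the ingestion line.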
Most teams do not realize the indexing line exists until they read the invoice carefully. The ingestion number anchors the mental model. The indexing number is where the surprise lives.
Custom metrics: 30 to 52 percent of the bill, quietly
Pro tier includes 100 custom metrics per host. Enterprise includes 200. Every metric beyond that bills at $5 per 100 metrics per month. The trap is cardinality. A single metric tagged with a high-cardinality key such as user_id or request_id, then multiplied across dimensions like region, expands into thousands of unique time series, and every unique combination is a separate billable custom metric.
Industry data puts custom metrics at 30 to 52 percent of total Datadog spend at scale. OpenTelemetry metrics are all billed as custom metrics. Teams who instrumented with best intentions discover that the metric they added to debug a single incident is now costing more than the host it runs on.
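To see how quickly tags multiply, here is a minimal sketch of the cardinality math. It assumes the worst case where every tag combination actually appears, and that the per-host allotment is pooled across all hosts; the function names and example tag counts are hypothetical, and the rates are the ones quoted above.

```python
from math import prod

OVERAGE_PER_100 = 5.00                              # USD per 100 custom metrics per month
INCLUDED_PER_HOST = {"pro": 100, "enterprise": 200}  # allotments quoted in the text


def series_count(tag_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of distinct values per tag."""
    return prod(tag_cardinalities.values())


def monthly_overage(total_custom_metrics: int, hosts: int, plan: str = "pro") -> float:
    """Overage in USD, assuming the per-host allotment is pooled across hosts."""
    allotment = hosts * INCLUDED_PER_HOST[plan]
    billable = max(0, total_custom_metrics - allotment)
    return billable / 100 * OVERAGE_PER_100


if __name__ == "__main__":
    base = {"endpoint": 50, "region": 5, "status": 4}
    print(series_count(base))                        # 1000 series for one metric
    with_user = {**base, "user_id": 2_000}           # one extra high-cardinality tag
    print(series_count(with_user))                   # 2000000 series for the same metric
    print(monthly_overage(series_count(with_user), hosts=50))
```

Real counts are usually lower than the product, since not every combination occurs, but the direction is the same: one extra high-cardinality tag multiplies the billable count rather than adding to it.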
The 25-plus SKU problem
Infrastructure monitoring, APM, log ingest, log indexing, Flex Logs storage, Cloud SIEM, Cloud Workload Security, CSPM, Application Security, Database Monitoring, Synthetic tests, Real User Monitoring, Continuous Profiler, Network Performance Monitoring, Serverless, Data Streams Monitoring, and more. Each meters differently. Some are per host. Some are per GB. Some are per million events. Some are per session or per test run.
The practical consequence: nobody on the team can answer the question "what does it cost to send this source to Datadog?" without building a spreadsheet that reconciles seven meters against one month of invoices. Cost modeling is not hard because the engineers are bad at math. It is hard because the pricing surface is engineered that way.
The architectural distortion
The second-order effect is worse than the invoice. When cost scales with host count and custom metric count, engineering decisions start to optimize for the bill rather than the architecture. Teams consolidate workloads onto larger instances to reduce host count. Engineers stop adding the metric they need, stop enabling tracing on the new service, stop writing the log line that would have shortened the next incident.
This is the Datadog tax. Not the invoice itself but the quiet tax on observability coverage. You get a cleaner bill and a blind spot in production. By the time someone notices, the MTTR on the next outage is the real cost.
What routing changes
A Cribl pipeline sits between your Datadog Agents and Datadog. It stands in for the Datadog intake using the Datadog Agent Source, so the Agents stay installed as they are and only their intake endpoint is repointed at the pipeline. Every event passes through the pipeline before it reaches the meter.
Logs: a high-value route sends security-relevant and investigation-relevant events to Datadog at full fidelity. Everything else routes to S3 at roughly $0.023 per GB per month. When investigation requires historical data, Cribl Search queries the S3 archive in place. Datadog only indexes the fraction of logs that drive dashboards and alerts.
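The routing decision itself is simple to reason about. The sketch below is plain Python, not Cribl configuration: the source names, log levels, and event fields are hypothetical stand-ins for whatever your team defines as high value.

```python
# Illustrative routing rule: high-value events go to Datadog, the rest to the archive.
# The sets below are hypothetical examples, not a recommended policy.
HIGH_VALUE_SOURCES = {"auth", "audit", "firewall"}
HIGH_VALUE_LEVELS = {"error", "critical"}


def route(event: dict) -> str:
    """Return the destination name for a single log event."""
    if event.get("source") in HIGH_VALUE_SOURCES:
        return "datadog"
    if event.get("level") in HIGH_VALUE_LEVELS:
        return "datadog"
    return "s3"


def split_volume(events: list[dict]) -> dict[str, int]:
    """Total bytes per destination, to estimate how much still reaches the meter."""
    totals = {"datadog": 0, "s3": 0}
    for event in events:
        totals[route(event)] += event.get("bytes", 0)
    return totals


sample_events = [
    {"source": "auth", "level": "info", "bytes": 400},
    {"source": "nginx", "level": "debug", "bytes": 900},
    {"source": "nginx", "level": "error", "bytes": 700},
    {"source": "app", "level": "info", "bytes": 1200},
]

if __name__ == "__main__":
    print(split_volume(sample_events))
```

Running a rule like this over a day of sampled logs gives a first estimate of the fraction that would still be ingested and indexed.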
Custom metrics: the Aggregations function converts verbose events into summary metrics before Datadog sees them. The Eval function strips high-cardinality tags that were exploding the meter. What arrives at Datadog is a lower-cardinality metric stream with the same analytical value.
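The idea behind both steps can be sketched in a few lines. This is not how the Aggregations or Eval functions are implemented or configured; it is a plain-Python illustration, with hypothetical tag names and data shapes, of dropping high-cardinality tags and then rolling points up by what remains.

```python
from collections import defaultdict

# Tags assumed to be high cardinality for this example.
DROP_TAGS = {"user_id", "request_id"}


def strip_tags(tags: dict[str, str]) -> dict[str, str]:
    """Remove tags that would create one time series per user or request."""
    return {k: v for k, v in tags.items() if k not in DROP_TAGS}


def aggregate(points: list[dict]) -> dict:
    """Roll points up to count and sum per (metric name, remaining tags)."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for point in points:
        key = (point["name"], tuple(sorted(strip_tags(point["tags"]).items())))
        totals[key]["count"] += 1
        totals[key]["sum"] += point["value"]
    return dict(totals)


sample_points = [
    {"name": "checkout.latency_ms", "value": 120.0,
     "tags": {"endpoint": "/pay", "user_id": "u1"}},
    {"name": "checkout.latency_ms", "value": 80.0,
     "tags": {"endpoint": "/pay", "user_id": "u2"}},
    {"name": "checkout.latency_ms", "value": 50.0,
     "tags": {"endpoint": "/cart", "user_id": "u1"}},
]

if __name__ == "__main__":
    for key, stats in aggregate(sample_points).items():
        print(key, stats)
```

Three raw points with two distinct users collapse into two series keyed by endpoint. The per-user detail is gone from the metric stream, so this only makes sense for tags nobody queries in dashboards or monitors.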
Cost visibility: the Data Reduction Value report shows exactly what was routed where, per source and per destination. One dashboard answers the question that seven Datadog invoice line items cannot.
What does not change
Your dashboards keep working. Your monitors keep firing. Your team keeps using the Datadog UX they actually like, because this is not a migration story. Datadog is sticky because the product is good. The goal is not to leave. The goal is to stop paying three times for data that only needed to arrive once.
Engineers stop self-censoring. Collection at the source no longer drives the Datadog bill, because the pipeline decides what reaches the meter. Instrument fully. Route intelligently. Keep the UX you paid for.
What to look for on your next invoice
Three quick diagnostics. First: add the log ingestion line and the log indexing line together. If the indexing line is more than half of that total, indexing is costing more than ingestion and a routing layer is likely to pay for itself quickly. Second: find the custom metrics line. If it is more than 25 percent of total spend, aggregation upstream is the fastest lever. Third: count the SKUs. If you cannot answer in under a minute which SKU a given source hits, you do not have a cost model; you have a guess.
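If the invoice is already in a spreadsheet, the three checks can be scripted. The sketch below assumes the invoice has been reduced to a mapping of line item to monthly dollars; the key names and sample amounts are hypothetical and will not match Datadog's actual line item labels.

```python
def diagnose(invoice: dict[str, float]) -> dict:
    """Apply the three invoice checks to a {line_item: monthly_usd} mapping."""
    total = sum(invoice.values())
    ingestion = invoice.get("log_ingestion", 0.0)
    indexing = invoice.get("log_indexing", 0.0)
    custom = invoice.get("custom_metrics", 0.0)
    custom_share = custom / total if total else 0.0
    return {
        # Check 1: indexing is more than half of ingestion plus indexing.
        "indexing_exceeds_ingestion": indexing > ingestion,
        # Check 2: custom metrics above 25 percent of total spend.
        "custom_metrics_share": custom_share,
        "custom_metrics_over_25_percent": custom_share > 0.25,
        # Check 3: how many separately metered line items are in play.
        "sku_count": len(invoice),
    }


sample_invoice = {
    "infrastructure": 9_000.0,
    "apm": 6_000.0,
    "log_ingestion": 300.0,
    "log_indexing": 5_100.0,
    "custom_metrics": 9_600.0,
}

if __name__ == "__main__":
    for check, result in diagnose(sample_invoice).items():
        print(f"{check}: {result}")
```

With the sample figures, indexing dwarfs ingestion and custom metrics sit at 32 percent of spend, so both levers would apply.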
None of these require Cribl to diagnose. They require ten minutes with the invoice. The remediation is where the pipeline earns its place.