
Headlines Love Averages—Decisions Need Distributions

  • Writer: Bill Kantor
  • Nov 7
  • 4 min read

The presenter at a recent AI conference put up this slide (courtesy of McKinsey) to demonstrate that “AI maturity varies significantly across and within sectors.” They did this by measuring something McKinsey calls AI Quotient (AIQ) across “800 companies and 20K participants worldwide.”


Source: McKinsey AIQ by Sector slide (conference photo)

If you’ve been following our blogs, you know that we rail against showing simple average (mean) statistics. Point statistics like averages have their place, but they can be very misleading. You need to see the distribution of the data to know whether the average is useful and whether you can draw any conclusions. In business, people often show only averages. Sometimes it’s innocuous. But sometimes it can be used to hide the underlying variability and risks.


So my first reaction was to commend McKinsey. Instead of just presenting an average measurement (horizontal blue bars) for each sector, they also showed quantile range statistics of the data distribution (vertical blue bars). While not showing the full nature of the data, quantile ranges help you assess the distribution of values in a sampled population. The high and low values bound a fixed portion of the population (typically 95%, but other ranges are commonly used).


But wait! Quickly looking at the chart, I concluded—contrary to the slide title—that AI maturity was indistinguishable across sectors. Every sector’s mean value fell within the quantile ranges of every other sector.


But wait, look again (upper right key). These are min/max values, not quantile ranges. They show 100% of the data, meaning each sector’s statistic includes outliers. Quantiles are intentionally chosen at less than 100% (typically 95%, 80%, or another range) to help you understand more about the population by excluding outliers. Because a min/max range includes 100% of the data, it is highly sensitive to outliers and offers no information about the distribution or reliability of the data. So you really can’t say anything about AI maturity varying across sectors.
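To make the difference concrete, here is a minimal sketch in Python using synthetic scores (purely hypothetical numbers, not McKinsey’s data) that shows how a single outlier stretches a min/max range while a 95% quantile range barely moves:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical AIQ-like scores for one sector (synthetic, for illustration only).
scores = rng.normal(loc=50, scale=5, size=200)
scores_with_outlier = np.append(scores, 95)  # add one extreme respondent

for label, data in [("no outlier", scores), ("with outlier", scores_with_outlier)]:
    lo, hi = data.min(), data.max()                 # min/max: 100% of the data
    q_lo, q_hi = np.quantile(data, [0.025, 0.975])  # middle 95% of the data
    print(f"{label:>12}: min/max {lo:5.1f}-{hi:5.1f} | "
          f"95% quantile range {q_lo:5.1f}-{q_hi:5.1f}")
```

In this synthetic example, one extreme respondent roughly doubles the min/max range but leaves the 95% quantile band essentially unchanged, which is exactly why the full range tells you so little about the underlying distribution.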


In defense of McKinsey, the second part of their narrative—AI maturity varies significantly within sectors—is sort of supported by the data. “Sort of” because, as before, you can’t tell if the ranges are distorted by a few outliers. Again, it would have been more informative to show quantile ranges. But strictly speaking there are big differences within each sector.


The bottom part of the slide is also troubling. It shows the “Increase in spread between AI Quotient leaders and laggards: 2016 – 2019 to 2020 – 2022.” In the 2016 to 2019 era, AI was mostly ML but not anything close to the Generative AI we have today. That only became a serious business resource following the public release of ChatGPT in November 2022—maybe (probably?) after the data collection dates in their survey. So this whole slide likely predates modern Generative AI. Not particularly relevant today.


To their credit, McKinsey acknowledges that there is insufficient sample size in their public sector data to make meaningful comparisons of this statistic. 


So what can we conclude? 


Not much. That AI maturity varies considerably if you include outliers, and maybe that maturity has increased across the board between the two eras.


I’m not sure if or how that helps. It creates the illusion of a serious study that should drive real decisions. I see nothing useful in this slide other than that the range of AIQ is quite broad.


Notwithstanding this example, I am sure that the folks at McKinsey understand these issues. I didn’t write this as a critique of them. My intent was to underscore the need for basic statistics on business data. Here’s a basic guide (see Appendix) to follow if you want to get better at communicating and making data-driven decisions.


Appendix

Guidelines for basic business statistics.


  1. Quantify uncertainty: Add quantile range statistics (like 95% or 80% bands) for key estimates (see the sketch after this list). Example: “CAC: mean $7.8k; middle 95% of observations: $5.2k–$11.9k.”

  2. Show distributions, not just averages: Use histograms or (more advanced) box/violin plots (great description here); at least add the median alongside the mean. Example: “Median win rate 28% (80% between 22–34%), mean 30%.”

  3. Include sample size: Report the n behind every statistic; small samples deserve extra caution.

  4. Define precisely what is being measured: Give units, formula, and any filters. Example: “Sales cycle = close date − first meeting, calendar days, excludes <5-day deals.”

  5. Handle outliers transparently: State how you handled outliers. Show results with/without them if impactful.

  6. Segment before concluding: Break results by meaningful slices (e.g., new vs. renewal; or SMB vs. Enterprise). Many “overall” effects vanish or flip in segments.

  7. Compare like with like: Keep definitions and cohorts consistent over time; note any changes in data collection. Example: “Q3 includes a new product line; prior quarters don’t.”

  8. Time-window clarity: State the exact period and its rationale; call out seasonality or events that may skew results.

  9. Make the visual truthful:

     • Start axes at sensible baselines; avoid deceptive truncation.

     • Label units.

     • Follow the principle of proportional ink.
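If you want to put guidelines 1 through 6 into practice, a minimal sketch like the following can help (in Python; the segments, sample sizes, and lognormal parameters are entirely made up for illustration). It reports sample size, mean, median, and 80%/95% quantile bands per segment, then shows the overall summary with and without the most extreme values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sales-cycle data in calendar days, split by a made-up segment.
segments = {
    "SMB":        rng.lognormal(mean=3.4, sigma=0.4, size=120),
    "Enterprise": rng.lognormal(mean=4.2, sigma=0.5, size=45),
}

def summarize(name, days):
    """Print n, mean, median, and 80%/95% quantile bands for one group."""
    q10, q90 = np.quantile(days, [0.10, 0.90])      # middle 80%
    q025, q975 = np.quantile(days, [0.025, 0.975])  # middle 95%
    print(f"{name:>10} (n={len(days):3d}): mean {days.mean():5.1f}d, "
          f"median {np.median(days):5.1f}d, "
          f"80%: {q10:5.1f}-{q90:5.1f}d, 95%: {q025:5.1f}-{q975:5.1f}d")

# Segment before concluding (guideline 6).
for name, days in segments.items():
    summarize(name, days)

# Handle outliers transparently (guideline 5): report with and without them.
all_days = np.concatenate(list(segments.values()))
trimmed = all_days[all_days <= np.quantile(all_days, 0.99)]  # drop top 1%
summarize("overall", all_days)
summarize("trimmed", trimmed)
```

Add a histogram or box plot of the same data (guideline 2) and a note on the time window and metric definition (guidelines 4 and 8), and you have covered most of the list in a few lines.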


Bonus (not strictly statistics): State the decision implication. If you’ve shown the data and the basic stats are sound, finish by answering: “So what?”  What should you do differently because of the data-driven insight? And if the data are not sound, if they don’t support a decision, say so. 


This isn’t about confidence intervals or distributions; it’s about making sure the analysis gets the attention it deserves and drives changes in behavior.
