marketingscience.dev

We're Statisticians, Not Magicians: Why Some Channels Can't Be Modeled

No variability, no model. Affiliates aren't a media channel. CRM isn't media. You don't choose theta — the data does. A field-tested rant on the limits of MMM, and the two places teams cheat.

There's a phrase I say when I'm annoyed, and people who've worked with me know it well:

"We're statisticians, not magicians."

I usually pull it out when someone hands me a channel with no variability and asks the model to tell them what it's worth. So let me unpack what that actually means, because behind the joke is one of the most important and least-taught ideas in marketing measurement: some channels cannot be modeled, and no amount of cleverness changes that.

No variability, no model

Here's the mechanism, stated plainly. The way linear regression works, complicated or not, is by relating changes in your inputs to changes in your outcome. If your spend is always the same, the model can't separate the channel's effect from the baseline: a constant spend column is indistinguishable from the intercept. There's no signal to attribute. You can run the fanciest Bayesian sampler in the world over a flat line and it will tell you nothing, because there's nothing in the data to find.
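To make that concrete, here is a minimal sketch in Python with NumPy. The spend figures are made up for illustration. It checks the rank of the design matrix (intercept plus one spend column) for a flat channel and for one that alternates between high and low weeks. A rank of 1 means the spend column carries no information beyond the intercept, so no coefficient can be estimated for it.

```python
import numpy as np

weeks = 52
intercept = np.ones(weeks)

# Hypothetical flat channel: the same spend every single week.
flat_spend = np.full(weeks, 10_000.0)

# Hypothetical channel with variation: alternating high and low weeks.
varied_spend = np.where(np.arange(weeks) % 2 == 0, 15_000.0, 5_000.0)

X_flat = np.column_stack([intercept, flat_spend])
X_varied = np.column_stack([intercept, varied_spend])

# Rank 1: the spend column is just a multiple of the intercept column.
rank_flat = np.linalg.matrix_rank(X_flat)
# Rank 2: the spend column adds information the intercept does not have.
rank_varied = np.linalg.matrix_rank(X_varied)

print(rank_flat)    # 1
print(rank_varied)  # 2
```

The flat case is not a "hard" estimation problem. It is an impossible one: any split of the baseline between the intercept and the channel fits the data equally well.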

This comes up most with content teams on fixed budgets. Same cadence, same number of articles, same value, every week. They'll often insist their channel doesn't saturate — that the slope is high and the returns keep coming. And I get why they believe it: content is so low-frequency on the budget side that they're permanently sitting on the early part of the curve. They never spend enough to saturate their own channel, so they never see saturation. But belief isn't variability. If there's no variation in the data, there is absolutely no model that can save you.

The cleanest version of this I ever lived through was sponsorship. I had a sponsorship channel and zero usable variability — because the deal was a yearly contract, and the finance team amortized it. They took the total value, divided by 12, and gave me twelve identical monthly numbers. Twelve copies of the same row. There is nothing — nothing — a statistician can do with that. The spend was constant by construction. The only fix is to introduce variability: shut it down for a stretch, ramp it up and down, give the model something to chew on. And if the team won't do that and you can't convince them? Then, again: I'm not a magician.

You don't choose theta — the data does

This is the single most common confusion I see, so let me put it bluntly. In an MMM you have an adstock transformation (the idea that a TV ad today still has some effect tomorrow, and the day after) governed by a decay parameter, theta. People ask me, "how do you decide which theta to use?"

You don't. You let the data find it.

Mechanically, theta defines the shape of the decay curve. A theta near zero means an immediate, intent-driven channel — you see the ad, react, forget it. A theta near one means the effect persists almost forever. Different channels decay differently, so you'd expect paid search to sit low and TV to sit higher. But you don't pick the number. You run a grid search: for every channel, you sweep theta from 0 to 1 in small increments, fit each candidate curve to the data, and keep the one that best matches what you actually observe. In a frequentist model that's a literal for loop over a grid. In the Bayesian world it's the same idea, just coupled into the inference — you set a prior range ("I'd expect adstock to live between here and here") and the sampler explores it. Either way, you don't choose theta. You get it out of the data.
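The frequentist version of that loop can be sketched as follows. This is an illustration under stated assumptions, not anyone's production code: I'm assuming a simple geometric adstock (each week carries over a fraction theta of the previous week's adstocked value), a single channel, and synthetic data generated with a known theta so you can see the search recover it. The names `geometric_adstock` and `grid_search_theta` are hypothetical helpers, not functions from any MMM library.

```python
import numpy as np

def geometric_adstock(spend, theta):
    """Carry over a fraction `theta` of the previous period's adstocked value."""
    out = np.zeros(len(spend))
    carry = 0.0
    for t, x in enumerate(spend):
        carry = x + theta * carry
        out[t] = carry
    return out

def grid_search_theta(spend, outcome, grid):
    """Return the theta whose adstocked spend gives the lowest squared error."""
    best_theta, best_sse = None, np.inf
    for theta in grid:
        # Fit intercept + adstocked spend by ordinary least squares.
        X = np.column_stack([np.ones(len(spend)), geometric_adstock(spend, theta)])
        coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
        sse = float(np.sum((outcome - X @ coef) ** 2))
        if sse < best_sse:
            best_theta, best_sse = float(theta), sse
    return best_theta

# Synthetic data with a known decay, purely for illustration.
rng = np.random.default_rng(0)
n_weeks = 156
spend = rng.uniform(0, 100, size=n_weeks)
true_theta = 0.6
outcome = 50 + 2.0 * geometric_adstock(spend, true_theta) + rng.normal(0, 2, size=n_weeks)

# Sweep theta from 0 up to 0.95 in steps of 0.05 (theta = 1 never decays, so skip it).
grid = np.round(np.arange(0.0, 1.0, 0.05), 2)
best = grid_search_theta(spend, outcome, grid)
print(best)
```

Note what makes this work: the synthetic spend varies a lot from week to week. Run the same search on a flat spend series and every theta fits equally badly, which is the previous section's point all over again.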

Why does this matter so much? Because I keep running into teams who force it. I had a conversation with a company where another team had made the vendor hard-code specific adstock values. That is one of the things you should never do in an MMM. Fixing the parameter by hand defeats the entire point of the exercise.

That doesn't mean you don't look at the output critically. If the model says paid search has an adstock lasting four weeks, push back: that channel doesn't behave like that. If it says TV decays instantly, ask why. But the move is to fix the model, not to override the parameter. Let the grid search surface the plausible range, then sanity-check it. Don't muck with the dial directly. If your head of digital wants to hand-set saturation parameters, push back. Let the data speak.

A small honesty note: even Google trusts grid search more than they can test it. I sat in a conversation with them about how you'd actually run an incrementality test for adstock, and even they were a bit "yeah, we have some ideas." We sketched a multi-cell design (always-on in one cell, intermittent in another, dark in a third, then compare the decay across cells), but if your true adstock runs to 12 weeks you'd need something like 12 cells, and nobody's running that. So we take it for granted that grid search finds a close-enough parameter we can trust. Which is just the old line: all models are wrong, some are useful. There are always assumptions. This is one of them.

Affiliates aren't a media channel

Now the reframe that tends to land hardest, because it contradicts how almost every org is structured.

Affiliates usually sit on the media budget, under the media director, treated as a media channel. For most of my career I've had to make an allowance and model them that way. But think it through and affiliates are not a media channel at all.

Why? Because you are not buying exposure. Your cost has nothing to do with how much advertising got shown. Your cost is linearly tied to the outcome — someone converted, and you pay a commission. That changes everything:

  • The effect is literally linear. Spend doubles, outcome roughly doubles, because the relationship is mechanical, not behavioral.
  • There's no adstock and no saturation — for you. It's the affiliate who faces the media optimization problem. The deal is: I pay a commission, and they handle the advertising however they like. They're the one dealing with diminishing returns and adstock on their campaigns. You're buying the result.
  • It's brutally collinear with your outcome variable. The more you "spend" on affiliates, the more outcome you booked — because the spend is a fraction of the outcome.
  • You don't really control the budget. Unless your contract was written badly, you pay a commission because the lifetime value of that customer exceeds the affiliate cost. So you're always "winning." You spent $2M more on affiliates? That mechanically means you got more revenue. That's not a media response curve. That's arithmetic.
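The "that's arithmetic" point can be shown in a few lines. In this sketch the commission rate and the weekly revenue figures are hypothetical, and I'm assuming a flat percentage commission. Real contracts can have tiers or bounties, but the spend is still computed from the outcome.

```python
import numpy as np

commission_rate = 0.10  # hypothetical: 10% of affiliate-driven revenue

# Hypothetical weekly affiliate-driven revenue.
affiliate_revenue = np.array([120_000.0, 95_000.0, 140_000.0, 180_000.0, 110_000.0])

# "Spend" is not a decision here; it is computed from the outcome after the fact.
affiliate_spend = commission_rate * affiliate_revenue

correlation = np.corrcoef(affiliate_spend, affiliate_revenue)[0, 1]
slope = np.polyfit(affiliate_spend, affiliate_revenue, 1)[0]

print(round(correlation, 6))  # 1.0
print(round(slope, 6))        # 10.0
```

A perfect correlation and a slope of exactly 1 / commission_rate is the contract talking, not a response curve. The regression has recovered the commission terms and nothing about media effectiveness.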

So when you drop affiliates into an MMM as media, you're modeling a channel that has no adstock, no saturation, and is mechanically tied to your KPI. It's one of my pet peeves. (Gambling is a fun edge case — huge spend, but it doesn't behave like a channel either.)

CRM isn't media — that's propensity modeling

Same family, different flavor: CRM campaigns. CRM doesn't have adstock. I send you the email, you read it or you don't, and we're done. There's no carryover decaying over weeks. And there's no spend-saturation effect either, because the unit cost is always the same.

You can argue there's a curve hiding in there — a "how spammy am I" curve. Keep emailing and your probability of landing in the spam folder goes up, and then I'm not reading anything. Fine. But notice what that is: that's propensity modeling, not media mix modeling. The dynamics that justify putting a variable into an MMM — adstock, saturation, a media response curve — aren't there. This is one of my biggest gripes about treating CRM as a media variable: it doesn't behave like media, so the machinery you're applying doesn't fit the phenomenon.

The general pattern across affiliates and CRM: these are channels that don't behave like channels. They don't saturate and they don't carry over the way paid media does. Forcing them through MMM transformations gives you a number, but it's a number describing a process the model wasn't built for.

The two places teams cheat

Pull all of this together and you land on the two spots where teams reliably muck around — the two places people put their thumb on the scale:

1. Fixed adstock. Manually setting the adstock parameter instead of letting grid search find it. As I said: never do this. The whole value of the model is that it surfaces the decay from the data. Hand-code it and you've just encoded your bias as a result and called it a model.

2. Over-tight priors. This is the Bayesian version of the same sin. In a Bayesian MMM your prior is your domain knowledge before seeing the data, and a healthy prior gets pulled by the data when the evidence is strong enough. Priors themselves are a feature, not a bug: in marketing you usually have small data (an MMM is often three years of weekly data, about 156 points, not exactly big data), so injecting reliable prior knowledge such as incrementality test results is genuinely valuable.

But if you make the prior really, really tight, you've constrained the model so hard that it would take overwhelming evidence to move it. You're no longer letting the data update your beliefs; you're forcing the answer you already wanted. And note where priors live: in a well-built MMM you put priors on ROI and hyperparameters, not directly on channel contributions. Teams that clamp those priors shut are doing the same thing as the fixed-adstock crowd, just with more sophisticated cover.
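Here is a toy version of what "too tight" does, using the textbook normal-normal conjugate update rather than a full MMM. Everything in it is an assumption for illustration: ROI is treated as a single normal mean with known noise, the team's prior belief is 3.0, the data average 1.0 over 156 weeks, and `posterior_mean` is a hypothetical helper. The only thing that changes between the two runs is the prior's standard deviation.

```python
def posterior_mean(prior_mean, prior_sd, data_mean, data_sd, n):
    """Posterior mean of a normal mean with known noise sd (conjugate update)."""
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / data_sd**2
    # The posterior mean is a precision-weighted average of prior and data.
    return (prior_precision * prior_mean + data_precision * data_mean) / (
        prior_precision + data_precision
    )

n_weeks = 156  # roughly three years of weekly data

# Loose prior: the data are allowed to pull the estimate toward what they show.
loose = posterior_mean(prior_mean=3.0, prior_sd=1.0, data_mean=1.0, data_sd=2.0, n=n_weeks)

# Very tight prior: the same data barely move the estimate.
tight = posterior_mean(prior_mean=3.0, prior_sd=0.01, data_mean=1.0, data_sd=2.0, n=n_weeks)

print(round(loose, 2))  # 1.05
print(round(tight, 2))  # 2.99
```

Same data, same model, opposite conclusions. With the loose prior the estimate lands near what the data show; with the tight one you get your prior back with a decimal place of decoration.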

So what do you do?

If a channel genuinely can't support a model, the honest answer is to make it modelable, not to fake it. Introduce variability. Shut the channel off for a week. Run a seesaw-style test where you push spend up and down over a 12-week window and watch the channel react. Run an incrementality test and feed the result back in as a prior. The MMM is causal in nature (it's trying to estimate the impact of turning a channel off), so the best calibration you can give it is an actual experiment.

And if the team won't introduce variability, and the channel doesn't behave like media in the first place? Then you say it out loud, kindly but clearly. The model is not the bottleneck. The data is. And we're statisticians, not magicians.


This is the stuff I spend most of my time on — MMM, attribution, experimentation, and knowing which channels you can trust a model to explain and which you can't. I teach it at marketingscience.dev. If you're the analyst who has to sit across from the feisty above-the-line team and explain why their channel won't model, you'd probably get something out of the course.