Models Are Just Data Interpolators

by Justin Skycak (@justinskycak) on

Asking a model to extrapolate is like asking a pig to fly.

Want to get notified about new posts? Join the mailing list and follow on X/Twitter.


If there’s one thing I’ve learned about ML/datasci over the years, it’s that models are just data interpolators. They’re just tools to run “continuous” queries on an otherwise discrete data space.

The difference between sophisticated models and simpler models is that sophisticated models can internally represent a curvier, higher-dimensional data space, whereas simpler models have to collapse their representation to fit within their model capacity (which causes them to lose information).

If you can frame a task as interpolation, the model will typically perform very well, and you can make it do even better by teeing things up to make interpolation even easier.

If you ask the model to extrapolate, then you’re almost certainly going to be disappointed with the results, because you might as well be asking a pig to fly.
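To make the contrast concrete, here is a minimal sketch, not a benchmark. It assumes a toy setup of my own choosing: 50 samples of sin(x) on [0, 2π] and a 3-nearest-neighbor averager standing in for "the model" (the names `knn_predict` and `abs_error` are illustrative). A query inside the sampled range is answered by blending nearby points; a query outside it can only reuse the points at the edge.

```python
import math

# Training data: 50 evenly spaced samples of sin(x) on [0, 2*pi].
train_x = [i * 2 * math.pi / 49 for i in range(50)]
train_y = [math.sin(x) for x in train_x]

def knn_predict(x, k=3):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def abs_error(x):
    """Absolute error of the prediction against the true function."""
    return abs(knn_predict(x) - math.sin(x))

inside = 1.0              # within the sampled range: interpolation
outside = 2.5 * math.pi   # beyond the sampled range: extrapolation

interp_error = abs_error(inside)
extrap_error = abs_error(outside)
print(f"interpolation error at x={inside:.2f}: {interp_error:.3f}")
print(f"extrapolation error at x={outside:.2f}: {extrap_error:.3f}")
```

In this toy case the extrapolation error should come out far larger than the interpolation error, because the model has no information about what the function does past the last sample.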

∗     ∗     ∗

A concrete takeaway: If a model’s mistakes indicate gaps in your data, then no amount of additional sophistication will make up for the gaps, because it’s not a sophistication issue. It’s a data issue.

If you’re trying to fit an ML model on a data set and it keeps screwing up on things that are not interpolable within the data set, you can’t solve that problem by increasing the model complexity. The only way to solve the problem is to improve the data set.
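A rough sketch of this point, under assumptions that are entirely mine: the target function has a bump at x = 3, the training set covers [0, 2] and [4, 6] but nothing in between, and "model complexity" is the number of bins in a piecewise-constant model (`fit`, `predict`, and the bin counts are hypothetical choices for illustration). Adding bins should help where data exists and do nothing for the gap.

```python
import math

LO, HI = 0.0, 6.0  # input range the model is asked about

def target(x):
    # Hypothetical ground truth: a gentle wave plus a bump centred at x = 3.
    return 0.5 * math.sin(2 * x) + 2.0 * math.exp(-4.0 * (x - 3.0) ** 2)

# Training data covers [0, 2] and [4, 6] densely but has nothing in (2, 4),
# which is exactly where the bump lives.
train_x = [2 * i / 199 for i in range(200)] + [4 + 2 * i / 199 for i in range(200)]
train_y = [target(x) for x in train_x]

def bin_index(x, n_bins):
    """Map x to a bin number, clamped to the valid range."""
    return min(max(int((x - LO) / (HI - LO) * n_bins), 0), n_bins - 1)

def fit(n_bins):
    """Piecewise-constant model: mean target per bin (None if the bin is empty)."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, y in zip(train_x, train_y):
        b = bin_index(x, n_bins)
        sums[b] += y
        counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

def predict(means, x):
    """Use the bin containing x, falling back to the nearest non-empty bin."""
    b = bin_index(x, len(means))
    filled = [i for i, m in enumerate(means) if m is not None]
    return means[min(filled, key=lambda i: abs(i - b))]

def mean_abs_error(means, xs):
    return sum(abs(predict(means, x) - target(x)) for x in xs) / len(xs)

# Evaluation points: some where training data exists, some inside the gap.
covered_x = [0.05 + 0.1 * i for i in range(20)] + [4.05 + 0.1 * i for i in range(20)]
gap_x = [2.5 + 0.1 * i for i in range(11)]

results = {}
for n_bins in (4, 16, 64):  # more bins = more capacity
    means = fit(n_bins)
    results[n_bins] = (mean_abs_error(means, covered_x), mean_abs_error(means, gap_x))
    print(f"bins={n_bins:2d}  covered error={results[n_bins][0]:.3f}  "
          f"gap error={results[n_bins][1]:.3f}")
```

Every prediction here is an average of training targets, so no bin count can produce the bump: the information simply is not in the data. Only adding samples from (2, 4) would bring the gap error down.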

Sure, as you keep expanding your data set (not just the number of points but the actual dimensionality and curvature of the surface), at some point your “data space” gets complex enough that you need a more sophisticated model to represent it.

But you can tell that it’s happening because you’ll notice that your model is making mistakes on stuff that you know for a fact is comprehensively covered in the data set.
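One way to sketch that check is to triage mistakes by how close they sit to the training data. This is a hypothetical 1-D heuristic, not a standard procedure: `triage_mistakes` and the `radius` threshold are names and values I made up for illustration, and in a real feature space you would need a distance measure that makes sense for your data.

```python
def nearest_distance(x, train_xs):
    """Distance from x to the closest training input."""
    return min(abs(x - t) for t in train_xs)

def triage_mistakes(mistakes, train_xs, radius):
    """Split mistaken inputs into those near training data and those in gaps.

    'covered' mistakes hint at a capacity problem (the data was there);
    'gap' mistakes hint at a data problem (nothing nearby to interpolate from).
    """
    covered = [x for x in mistakes if nearest_distance(x, train_xs) <= radius]
    gap = [x for x in mistakes if nearest_distance(x, train_xs) > radius]
    return covered, gap

# Hypothetical training inputs: dense on [0, 2] and [4, 6], nothing in between.
train_xs = [2 * i / 199 for i in range(200)] + [4 + 2 * i / 199 for i in range(200)]

# Hypothetical inputs where the model's predictions were judged wrong.
mistakes = [0.5, 1.7, 3.0, 3.2, 5.1]

covered, gap = triage_mistakes(mistakes, train_xs, radius=0.1)
print("near training data (suspect capacity):", covered)
print("far from training data (suspect data gap):", gap)
```

If most mistakes land in the first list, a more sophisticated model is worth trying; if most land in the second, the fix is more data in those regions.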


