#86 - Would you hire a fraud analyst who knows everything?

“We abandoned the approach of one model to rule them all.”

That’s Gilit Saporta, VP Product at DoubleVerify’s fraud lab, in episode #11 of the TSFS podcast (out today!).

Every major AI vendor right now is selling some version of a world model - one system trained on everything, detecting everything.

On the surface, this approach makes sense.

After all, at least when it comes to LLMs, we were “trained” to correlate dataset size with benchmark performance.

But Gilit’s experience at DoubleVerify matches my experience at Sardine - AI agents fare better when they are more specialized and granular.

And that’s a common theme I see surfacing a lot in fraud solutions, no matter the underlying tech. So let’s talk about it.

What should your AI agent replace?

The failure I keep seeing in agentic workflows is the same one I’ve seen in rules and models: one system trying to tell all possible stories.

But the workflows that actually work are the ones where multiple agents operate in sequence, each focused on solving a specific problem.

As Gilit smartly pointed out, AI agents shouldn’t be designed to replace employees. They should be designed to replace specific manual tasks.

Less context and narrower instructions mean higher accuracy and fewer hallucinations.

It also enables specialized agents to focus on finding the “good story”, allowing your workflow to identify false positives.

Take the example Gilit shared on the podcast: a 3am traffic spike. A general-purpose agent would default to a bad explanation - a bot attack.

Why? Because nothing in its training accounts for “This might be a Taylor Swift concert sale that just opened.”

Granularity is what gives the agent the context to ask: what else could this be?
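To make the "multiple narrow agents in sequence" idea concrete, here is a minimal sketch built around the 3am spike. Everything in it is an assumption for illustration: the `Step` structure, the field names, and the thresholds are hypothetical, and the plain functions stand in for what would really be narrowly prompted agents. The point it shows is that each step only receives the fields it declares in `needs`, nothing else.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    needs: tuple                  # the only case fields this step is allowed to see
    run: Callable[[dict], dict]   # returns findings that get merged into the case


def run_pipeline(case, steps):
    """Run narrow steps in order, handing each one only the context it asked for."""
    case = dict(case)  # work on a copy so the caller's dict is untouched
    for step in steps:
        narrow_context = {key: case[key] for key in step.needs}
        case.update(step.run(narrow_context))
    return case


# Hypothetical stand-ins for three specialized agents.
def measure_spike(ctx):
    return {"spike_ratio": ctx["requests_per_min"] / ctx["baseline_per_min"]}


def look_for_good_story(ctx):
    return {"known_event": "ticket_sale" in ctx["merchant_events"]}


def recommend(ctx):
    if ctx["spike_ratio"] <= 3:   # illustrative threshold, not a recommendation
        return {"recommendation": "ignore"}
    if ctx["known_event"]:
        return {"recommendation": "monitor"}
    return {"recommendation": "escalate"}


STEPS = [
    Step("measure_spike", ("requests_per_min", "baseline_per_min"), measure_spike),
    Step("look_for_good_story", ("merchant_events",), look_for_good_story),
    Step("recommend", ("spike_ratio", "known_event"), recommend),
]

spike_at_3am = {
    "requests_per_min": 12000,
    "baseline_per_min": 2000,
    "merchant_events": ["ticket_sale"],
    "raw_logs": "...",  # present in the case, but no step ever sees it
}

result = run_pipeline(spike_at_3am, STEPS)
print(result["recommendation"])  # monitor
```

Remove `"ticket_sale"` from `merchant_events` and the same spike gets escalated. The second step exists only to ask "what else could this be?", which is the job a general-purpose agent tends to skip.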

Your rules had this problem first

The good story / bad story approach isn’t new to AI agents. It’s the frame I use every time I sit down to build a fraud rule.

Here’s the gist of it: For any anomaly, you construct two competing explanations. The fraud one, and the legitimate one. Then you figure out which fits the evidence.
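If it helps to see the frame as code, here is a rough sketch using the 3am spike from earlier. It is not a production scoring method: the `Story` structure, the event fields, the cutoffs, and the "fraction of checks satisfied" score are all assumptions I made up to illustrate the idea of weighing two explanations against the same evidence.

```python
from dataclasses import dataclass


@dataclass
class Story:
    name: str
    checks: list  # each check takes the event dict and returns True if the evidence fits


def support(story, event):
    """Fraction of a story's checks that the evidence satisfies."""
    hits = sum(1 for check in story.checks if check(event))
    return hits / len(story.checks)


def pick_story(event, bad, good):
    """Return the name of the better-supported story, or 'undecided' on a tie."""
    bad_score, good_score = support(bad, event), support(good, event)
    if bad_score > good_score:
        return bad.name
    if good_score > bad_score:
        return good.name
    return "undecided"


# Hypothetical checks and cutoffs, for illustration only.
bot_attack = Story("bot_attack", [
    lambda e: e["hour"] < 5,
    lambda e: e["new_device_ratio"] > 0.8,
    lambda e: e["checkout_completion_rate"] < 0.05,
])
ticket_sale = Story("ticket_sale", [
    lambda e: e["scheduled_sale_live"],
    lambda e: e["checkout_completion_rate"] > 0.3,
    lambda e: e["new_device_ratio"] < 0.5,
])

spike = {
    "hour": 3,
    "new_device_ratio": 0.35,
    "scheduled_sale_live": True,
    "checkout_completion_rate": 0.42,
}

print(pick_story(spike, bot_attack, ticket_sale))  # ticket_sale
```

The bad story only gets one of its three checks (the hour), while the good story gets all three. A rule that only knew the bot-attack checks would have had nothing to compare against.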

The problem is that most fraud rules only tell one story.

They're designed to detect fraud patterns - velocity, geo mismatches, device anomalies. But most don’t have a mechanism for telling a good story.

And even if you introduce great exclusions into each of your rules, your system as a whole will still be skewed toward telling bad stories.

To actually solve this, you need dedicated approval rules. Separate logic, designed from the ground up to identify and approve legitimate users who your fraud stack would otherwise block.
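Here is one way that separation could look, as a minimal sketch. The rule names, transaction fields, and thresholds are hypothetical, and the policy that an approval rule always overrides a fraud hit is a simplifying assumption; your own stack might route those cases to review instead.

```python
# Fraud rules tell the bad story. Hypothetical examples.
FRAUD_RULES = {
    "high_velocity": lambda t: t["orders_last_hour"] > 5,
    "geo_mismatch": lambda t: t["ip_country"] != t["billing_country"],
}

# Approval rules tell the good story. They live in their own set,
# not as exclusions buried inside each fraud rule.
APPROVAL_RULES = {
    "long_tenured_customer": lambda t: (
        t["account_age_days"] > 365
        and t["prior_chargebacks"] == 0
        and t["device_seen_before"]
    ),
}


def decide(txn, fraud_rules=FRAUD_RULES, approval_rules=APPROVAL_RULES):
    """Return (decision, names of the rules that drove it)."""
    fraud_hits = [name for name, rule in fraud_rules.items() if rule(txn)]
    approval_hits = [name for name, rule in approval_rules.items() if rule(txn)]
    if not fraud_hits:
        return "approve", []
    if approval_hits:
        # Assumed policy: a matching approval rule rescues the transaction.
        return "approve", approval_hits
    return "block", fraud_hits


# A loyal customer buying while abroad trips the geo rule but is rescued.
traveler = {
    "orders_last_hour": 1,
    "ip_country": "FR",
    "billing_country": "US",
    "account_age_days": 900,
    "prior_chargebacks": 0,
    "device_seen_before": True,
}
print(decide(traveler))  # ('approve', ['long_tenured_customer'])
```

Because the approval logic sits in one place, you can measure it on its own: how many blocks it reverses, and how many of those reversals turn out to be fraud.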

This makes things so much simpler. Give it a try.

Does your ML model know your business?

Generic vs. granular is a theme we encounter in fraud ML models as well.

Consortium data is one of the most compelling pitches in fraud prevention. More signals, broader coverage, and you start with a dataset without having to build one. On paper, it’s an obvious win.

But here’s the thing:

I’ve seen hundreds of fraud models in my career. And every time they were trained on a particular client’s data, they performed better. Every time.

Consortium data is optimized to detect generic fraud - pooled and averaged across thousands of businesses. But fraud isn’t evaluated on average accuracy.

It’s evaluated on accuracy at your specific business, for your specific customers.

Ironically, it means that the more sources of data we feed a model, the worse it will perform for each of them individually.

Or to put it more accurately, the more its performance will be just… average.
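A toy example makes the averaging effect easier to see. To be clear about what this is: the data is invented, the two businesses are hypothetical, and a single "flag if the amount is above a threshold" rule is a stand-in for a real model. It is not a claim about how any actual consortium model behaves, only an illustration of the arithmetic.

```python
def accuracy(samples, threshold):
    """Share of samples where 'flag if amount > threshold' matches the label."""
    correct = sum(
        1 for amount, is_fraud in samples if (amount > threshold) == is_fraud
    )
    return correct / len(samples)


def best_threshold(samples):
    """Pick the candidate threshold with the highest accuracy on these samples."""
    candidates = sorted({amount for amount, _ in samples})
    return max(candidates, key=lambda t: accuracy(samples, t))


# Invented (amount, is_fraud) pairs for two very different businesses.
luxury_store = [(500, False), (700, False), (900, False), (1200, True)]
coffee_shop = [(5, False), (10, False), (20, False), (80, True)]
pooled = luxury_store + coffee_shop

pooled_t = best_threshold(pooled)      # 900
own_t = best_threshold(coffee_shop)    # 20

print(accuracy(pooled, pooled_t))       # 0.875
print(accuracy(coffee_shop, pooled_t))  # 0.75
print(accuracy(coffee_shop, own_t))     # 1.0
```

The pooled rule looks fine on paper at 87.5%, but for the coffee shop it misses the only fraud case, because an $80 order looks tiny next to the luxury store's normal traffic. Trained on its own data, the same simple rule catches it.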

But I see teams that keep falling for it because breadth feels safer. More data, seeing fraud attacks before they happen - that sounds good, right?

So instead of insisting that their vendor train a model on their data, they insist on getting irrelevant, averaged insights from peers they have nothing in common with.

Side note: Consortium models do have their uses, especially when your data is unavailable for training or when labels are missing.

One question across all three layers

Whether you’re writing rules, choosing an ML vendor, or spinning up your first AI agents - the question is always the same: is this solution accurate enough?

And the more it tries to do - more data sources, more use cases, more stories - the more average its performance will turn out to be.

The fix?

Abandon the fantasy of a perfect solution that can do everything. Instead, deploy granular solutions that tackle niche problems.

That’s the daily grind.

And that’s also what separates good fraud teams from average ones.

If that’s something you want to learn more about, check out my full interview with Gilit.

Where in your current stack are you running a blanket model where you probably shouldn’t be? Hit reply and let me know.

In the meantime, that’s all for this week.

See you next Saturday.


P.S. If you feel like you're running out of time and need some expert advice on getting your fraud strategy on track, here's how I can help you:

Free Discovery Call - Unsure where to start or have a specific need? Schedule a 15-min call with me to assess if and how I can be of value.
Schedule a Discovery Call Now »

Consultation Call - Need expert advice on fraud? Meet with me for a 1-hour consultation call to gain the clarity you need. Guaranteed.
Book a Consultation Call Now »

Fraud Strategy Action Plan - Is your Fintech struggling with balancing fraud prevention and growth? Are you thinking about adding new fraud vendors or even offering your own fraud product? Sign up for this 2-week program to get your tailored, high-ROI fraud strategy action plan so that you know exactly what to do next.
Sign-up Now »

 

