
AI is Training on the Wrong Humans

1.5 billion unbanked adults represent one of the most consistent financial datasets on the planet. It is just invisible to every model being built today.

I have sat across from people who move money more carefully than most people I know in finance.

Construction workers in Dubai sending $400 home to India or elsewhere in South Asia every month. On time. Every time. For years. No missed payments, no defaults, no credit card debt. Just disciplined, consistent financial behaviour that would make any risk model happy.

Except it never makes it into any risk model.

No credit file. No bank statement. No address history that maps to the systems we traditionally rely on. The data trail is a mobile number and a remittance receipt. That is it.

The gap nobody is naming correctly

Here is what strikes me: we are in the middle of the most ambitious period of AI development in history. OpenAI, Anthropic, Google DeepMind, Meta AI, all racing to build systems that understand human behaviour at scale.

But the humans I work with every day are not feeding into those datasets at all.

Not because anyone decided to exclude them. Because the infrastructure that generates financial data was never built for them in the first place. Formal systems reward formal participation. If you were never inside the system, you never generated the signal. And if you never generated the signal, the LLM never learned you exist.

That is not a bias problem in the way the industry usually talks about it. Bias implies the data is there but skewed. This is an absence problem, and absence is a much harder problem to fix.

The most consistent dataset nobody uses

The low-income migrant worker corridors, Gulf to South Asia and Europe to West Africa, produce some of the most behaviourally consistent financial data on the planet. That data is just fragmented across remittance operators, telecom records, and employer payroll systems. Never aggregated. Never used to train anything.

The labs building general intelligence are building it on a model of human economic life that reflects roughly the top third of the global income distribution with reasonable accuracy. Below that, it guesses.

That is not a niche concern. Approximately 1.5 billion unbanked adults are not an edge case. They are a large proportion of the world. The companies that figure out how to build ground truth from this population will have something the current LLMs do not have: what financial decision-making actually looks like under real constraint, at real scale.

Why this matters for AI

Combine that ground truth with what today's models already know, and you get a more complete picture of human intelligence, the kind of intelligence AI claims to be training on.

Right now, it is not. It is training on the articulate, the documented, the formally included.

The rest of humanity is teaching it nothing, not because they have nothing to teach, but because nobody built the infrastructure to listen.


Originally posted on LinkedIn. Christian Buchholz is a co-founder of myZoi, a CBUAE-licensed fintech serving unbanked migrant workers across the GCC.