Contents:
  • The Integration Problem Nobody Talks About
  • Algorithms Sometimes Learn the Wrong Lessons, Especially in Fintech
  • The Hallucination Problem
  • The Legacy System Trap in Fintech
  • Compliance: Where AI Meets Immovable Object
    • EU AI Act Requirements
    • DORA
    • The Multi-Jurisdiction Problem
  • The Human Factor Banks Keep Relearning
  • What Works and What’s Stuck in Fintech
    • Currently Working in Production
    • Stuck in Pilot Purgatory
  • Why Financial Innovation Still Starts With Humans
  • What This Means for Financial Services Leaders

Here is a number worth sitting with: global banks poured $30 billion into AI in 2025, and roughly 70 percent of those implementations have no measurable return on investment. The gap between what AI promises and what it delivers in financial services is not primarily a technology problem. It is an infrastructure problem, an organizational problem, a regulatory problem, and in some cases a fundamental misunderstanding of what machine learning can and cannot do.

That gap will not close by spending more on AI development. It closes when institutions understand the specific constraints of deploying AI with other people’s money, in a regulated industry, on top of systems that predate the internet. 

This article examines where those constraints come from, what AI genuinely does well in fintech today, and what still belongs in the “aspirational” column despite years of significant investment.


The Integration Problem Nobody Talks About

Every payment processor, lender, and bank runs on different data formats, proprietary APIs, and systems that were never designed to communicate with each other.

The payroll platform doesn’t know what the banking app knows. The expense management tool operates independently of the accounting software. Payment processors have no visibility into cash flow forecasting, even when all three tools belong to the same organization.

This creates a genuine paradox for anyone trying to deploy AI agents in a financial context: those systems need unified, real-time data streams to function, but financial infrastructure is deliberately fragmented, partly for security reasons and partly for competitive ones. No institution wants its systems too legible to outsiders.

Open banking initiatives and frameworks like the Model Context Protocol are moving in the right direction, but they remain emerging infrastructure rather than established standards. Bridging these gaps requires specialized expertise in connecting payment processors, KYC/AML systems, and banking APIs into working platforms, the kind of work Softjourn has been doing with financial clients for over 20 years.

The organizational cost compounds the technical one. Financial institutions deploying AI are not just installing software; they are changing how decisions get made, how data flows, and how employees interact with systems.

That requires retraining programs, new governance frameworks, and a cultural shift in organizations where “that’s how we’ve always done it” is not a bug but a feature. McKinsey research confirms what many banks discover firsthand: productivity often drops initially as legacy processes collide with automated workflows, and even powerful tools produce nothing useful if the people using them don’t trust or understand them.

Algorithms Sometimes Learn the Wrong Lessons, Especially in Fintech

AI doesn’t create bias. It scales existing bias with a precision that manual processes never could.

Hello Digit’s automated savings algorithm illustrates this clearly. The system was designed to help users save money without causing overdrafts. Instead, it caused exactly the problem it promised to prevent, repeatedly enough that the CFPB found nearly 70,000 overdraft reimbursement requests filed since 2017. The $2.7 million penalty wasn’t simply for the overdrafts but for the disconnect between what the product marketed and what the algorithm actually did.

The structural problem runs deeper than product design. AI learns from historical data, and historical data in financial services reflects decades of discriminatory human decisions. Investigations into mortgage approval algorithms in the US found racial discrimination patterns embedded in the models, not because engineers built bias intentionally, but because training data reflected historical redlining and discriminatory lending practices. The AI learned to replicate those patterns because, from its perspective, it was identifying features that correlated with historical outcomes.


What makes this harder to fix is the problem of proxy discrimination. Even after engineers remove race, gender, and other protected characteristics from a model, the algorithm often finds correlated variables that serve as functional stand-ins:

  • ZIP codes correlating with higher default rates, largely because those areas were historically marginalized
  • Employment history that serves as a proxy for socioeconomic status
  • Education credentials that encode class and race signals
  • Shopping patterns and social media activity that reveal demographic information through indirect means

Building robust fraud prevention and KYC systems that don’t reproduce these patterns requires deliberate bias mitigation at every stage of development, not as a final review step but as a design constraint from day one.

The Hallucination Problem

In consumer applications, AI hallucinations (instances where a model generates plausible but completely false information) tend to be annoying at worst. In fintech, they are compliance nightmares. The specific danger is not that AI occasionally makes mistakes. It’s that AI makes mistakes with the same confident tone it uses when it’s correct. A fabricated loan calculation sounds exactly like a real one. An invented regulatory requirement reads like actual policy. Fictional investment performance data is presented with the same authority as genuine figures.

NatWest understood this when they partnered with IBM specifically to train their AI assistant Cora+ with safeguards against the behaviors common in open AI models. A wrong account balance is a customer service problem. A hallucinated interest rate, loan term, or compliance requirement is a legal liability.


The downstream effects of a single calculation error in a financial context are also worth understanding clearly. Wells Fargo’s mortgage modification underwriting tool had a calculation error in 2018 that resulted in more than 500 customers losing their homes and hundreds more being denied loan modifications they qualified for. The problem was not one mistake but an error that propagated through every process depending on that calculation: credit risk assessment, capital reserve requirements, secondary market valuations, and regulatory reporting.

Large language models are designed to generate text that sounds plausible given what came before it. This is not the same as generating text that is accurate. In any context where “sounds right” and “is right” must be identical, that distinction matters enormously.
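One common mitigation follows directly from that distinction: never let a language model be the source of record for a number. A hedged sketch, with hypothetical function names, of validating a model-quoted monthly payment against the deterministic amortization formula before it ever reaches a customer:

```python
def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
    """Standard amortized loan payment: P * r / (1 - (1 + r)^-n)."""
    r = annual_rate / 12
    if r == 0:
        return principal / months
    return principal * r / (1 - (1 + r) ** -months)

def validate_quoted_payment(quoted: float, principal: float,
                            annual_rate: float, months: int,
                            tolerance: float = 0.01) -> bool:
    """Reject any model-generated figure that disagrees with the
    deterministic calculation beyond a small rounding tolerance."""
    expected = monthly_payment(principal, annual_rate, months)
    return abs(quoted - expected) <= tolerance * expected

# A $200,000 loan at 6% over 30 years: the correct payment is ~$1,199.10
assert validate_quoted_payment(1199.10, 200_000, 0.06, 360)
# A plausible-sounding but hallucinated figure is caught
assert not validate_quoted_payment(1450.00, 200_000, 0.06, 360)
```

The design point is that the guardrail is boring, deterministic code; the model drafts the language, but the arithmetic comes from a system that cannot hallucinate.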

The Legacy System Trap in Fintech

There is a real irony in the infrastructure picture for major financial institutions: the organizations with the most capital to invest in AI are frequently the ones running on technology built in the 1980s. Only 52 percent of banks have fully implemented basic AI chatbots and biometric security, and those technologies predate the current AI wave by years. The bottleneck is not algorithmic sophistication. It’s the unglamorous work of modernizing core systems built when “real-time” meant same-day processing.

This is where newer fintechs hold a structural advantage that has little to do with the quality of their AI. A digital bank founded in 2020 doesn’t need to figure out how to make a machine learning model communicate with a mainframe that predates the internet. It deploys models directly into microservices architectures designed for exactly that purpose.

The AI itself, available from OpenAI, Anthropic, Google, or dozens of other providers, is not the differentiator. The infrastructure capable of using it in production, without layers of middleware introducing latency, errors, and security vulnerabilities, is.

Softjourn’s work modernizing UPC’s Open Banking Platform reflects how complex this infrastructure actually is. What looked like a straightforward cloud migration required phased environment isolation, full infrastructure-as-code through Terraform, meticulous PCI DSS compliance at every stage, and coordination across multiple stakeholders. The AI capabilities UPC wanted to build on top of that infrastructure depended entirely on getting the foundation right first.

In that engagement, the largest source of delay in getting better automation was never the technology. It was training and upskilling the people who had to use it. AI doesn’t simplify this challenge. It adds a new layer to it, because institutions now need employees who can oversee AI-generated decisions, explain algorithmic outputs to customers, and know when to override automated recommendations.

Compliance: Where AI Meets Immovable Object

The regulatory landscape for AI in financial services is a place where rules are still being written, enforcement is accelerating, and AI’s experimental nature runs directly into compliance’s demand for certainty.


EU AI Act Requirements

Institutions must explain exactly how AI reached every decision, prove the model doesn’t discriminate against protected groups, demonstrate continuous monitoring for drift and degradation, and build governance infrastructure around every AI system they deploy. This includes model audit tooling, bias detection pipelines, and documentation frameworks. This is not a documentation exercise. It requires entirely new organizational structures.
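Continuous drift monitoring, one of the requirements above, is commonly implemented with the Population Stability Index (PSI), which compares the live score distribution against the training-time baseline. A minimal sketch using synthetic scores; the 0.1/0.25 cutoffs are a widespread industry rule of thumb, not a regulatory mandate:

```python
import math
import random

def psi(baseline, live, bins=10):
    """Population Stability Index between a baseline score distribution
    and live scores. Rule of thumb: < 0.1 stable, 0.1-0.25 moderate
    drift, > 0.25 significant drift worth investigating."""
    lo, hi = min(baseline), max(baseline)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            counts[sum(v > e for e in edges)] += 1
        # A small floor avoids log(0) for empty buckets
        return [max(c / len(values), 1e-6) for c in counts]

    b, l = bucket_shares(baseline), bucket_shares(live)
    return sum((li - bi) * math.log(li / bi) for bi, li in zip(b, l))

random.seed(0)
baseline = [random.gauss(0.50, 0.1) for _ in range(5000)]
stable   = [random.gauss(0.50, 0.1) for _ in range(5000)]
drifted  = [random.gauss(0.65, 0.1) for _ in range(5000)]
print(f"stable PSI:  {psi(baseline, stable):.3f}")   # near zero
print(f"drifted PSI: {psi(baseline, drifted):.3f}")  # well above 0.25
```

In production the same calculation would run on a schedule against live scoring traffic, with alerts wired into the incident process rather than a print statement.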

DORA

The Digital Operational Resilience Act went into effect in early 2025, requiring real-time incident detection, comprehensive vendor audits, and traceability across infrastructure. Institutions cannot simply deploy AI and monitor informally. When models degrade, the monitoring system must catch it. When third-party AI services are used, vendor management frameworks must verify that they meet regulatory standards. When AI failures occur, incident response must contain them before they become systemic.

The Multi-Jurisdiction Problem

For institutions operating across borders, the compliance picture compounds. What satisfies EU requirements may not meet UK standards post-Brexit. What works in Singapore’s regulatory sandbox may be prohibited in the United States. The UK’s operational resilience regime differs from the EU’s DORA in meaningful ways. The US CFPB takes a harder line on algorithmic accountability than most international regulators.

Emerging markets are each developing their own AI governance standards with limited coordination between them. Only 9 percent of UK bank executives feel prepared for upcoming AI regulations.

The fundamental tension is structural rather than solvable: AI systems in fintech that learn and adapt continuously are inherently unpredictable, but compliance requires certainty. Most institutions discover that compliant AI deployment means deliberately slowing model updates, freezing versions for regulatory review, and accepting that cutting-edge capability will always outpace what can be safely deployed in a regulated context.

Softjourn worked through exactly this tension with Bullet, an Irish accounting software provider that needed to meet Central Bank of Ireland requirements while modernizing its infrastructure. The answer was a phased migration to a scalable AWS architecture with disaster recovery, zero-downtime deployment, and secure data management built in from the start, ultimately making Bullet one of the few CBI-licensed AWS resellers in the country.

The Human Factor Banks Keep Relearning

Klarna’s trajectory tells a useful story. The company initially declared that its AI chatbot was handling the work of 700 employees and would improve profits by $40 million. CEO Sebastian Siemiatkowski later acknowledged publicly that customers needed to know there would always be a human available when they wanted one. The company lifted its 18-month hiring freeze and began recruiting for customer service roles again.

This is not a story about AI failing technically; the chatbot worked as designed. It’s a story about what customers actually want when dealing with financial stress, disputed charges, or complex situations where an algorithmic response feels inadequate. The option of human contact turned out to matter independently of whether customers ever used it.


Grasshopper Bank articulates the same insight from the lending side. Their CTO has been explicit that AI recommendations will never replace human judgment in lending and due diligence, and that using generative AI in risk and credit decisioning carries fundamental risks because those models function as black boxes. A lack of transparency in credit decisions doesn’t just undermine customer trust. It undermines the auditability that regulators require.

The institutional challenge is that every new AI system requires ongoing retraining, change management, and cultural adaptation in industries where stability and predictability are foundational values. Financial institutions aren’t asking employees to learn new software. They’re asking people who have spent decades mastering a particular way of working to change how they assess risk, interact with customers, and make decisions, while maintaining the same accuracy and compliance standards. The result is predictable: organizational exhaustion, where staff spend more time learning and adapting than they save through automation.

What Works and What’s Stuck in Fintech

The clearest way to understand the current state of AI in fintech is to look at what is deployed in production versus what lives permanently in pilot status.

Currently Working in Production

Fraud detection operates at roughly 95 percent accuracy with acceptable false positive rates and has been genuinely production-ready for several years. Basic chatbots handle high-volume, low-stakes queries reliably. Transaction monitoring flags suspicious activity effectively.

Document processing automation has measurably reduced manual handling in back-office operations. What these share is that they are narrow, well-defined problems with clear success metrics and tolerances for error that make sense given the stakes involved. A fraud flag that turns out to be wrong is annoying. A credit denial that turns out to be wrong is a legal and financial liability.
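The headline accuracy figure deserves a caveat: fraud is rare, so accuracy alone can look excellent even when most flags hit legitimate customers. A small sketch with hypothetical volumes shows why fraud teams track precision and false positive rate instead:

```python
def fraud_metrics(tp, fp, tn, fn):
    """Derive the metrics that actually matter for fraud operations from
    a confusion matrix; accuracy alone hides class imbalance."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),           # share of flags that are real fraud
        "recall": tp / (tp + fn),              # share of fraud actually caught
        "false_positive_rate": fp / (fp + tn), # share of good traffic flagged
    }

# Hypothetical day of 1,000,000 transactions with 0.1% fraud (1,000 cases)
m = fraud_metrics(tp=900, fp=4_000, tn=995_000, fn=100)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
# Accuracy is ~99.6%, yet roughly 4 of every 5 flags hit a legitimate customer.
```

This is why "95 percent accuracy with acceptable false positive rates" is the qualifier that matters: the second half of the sentence, not the first, determines customer friction.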


Stuck in Pilot Purgatory

Credit decisioning without human review, autonomous compliance monitoring, fully automated wealth management, and unsupervised lending decisions remain aspirational for most institutions. What they share is a requirement for reliability, explainability, and accountability that current AI cannot consistently deliver at scale under regulatory scrutiny with real customer money.

Banks report that 70 percent of AI use cases don’t have reported outcomes or measurable ROI. The failure mode is rarely “didn’t work in the lab.” It’s almost always “couldn’t scale from controlled test to real money, real customers, real regulatory pressure.”

Goldman Sachs CEO David Solomon wants to completely reimagine processes with AI, but has been clear that this won’t mean fewer people. Citigroup is assessing AI’s fit across more than 50 of its most complex processes, but implementations target new sources of efficiency rather than full end-to-end automation.

The distinction between efficiency and automation matters here: efficiency means AI handles parts of workflows while humans retain decision authority. Full automation means AI owns the process entirely. The former is achievable today across many use cases through collaboration with fintech consulting experts. The latter remains aspirational for anything high-stakes.
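That efficiency-versus-automation split can be expressed directly as an escalation policy: the model only owns decisions that are both low-stakes and high-confidence, and everything else routes to a person. A minimal sketch with illustrative thresholds (the limits and field names are hypothetical, not recommendations):

```python
from dataclasses import dataclass

@dataclass
class Decision:
    action: str   # "auto_approve" or "human_review"
    reason: str

def route(amount: float, model_confidence: float,
          auto_limit: float = 1_000.0, min_confidence: float = 0.95) -> Decision:
    """Human-in-the-loop routing: automation handles the narrow case
    where both stakes and uncertainty are low; humans keep decision
    authority everywhere else."""
    if amount > auto_limit:
        return Decision("human_review", f"amount {amount:.2f} exceeds auto limit")
    if model_confidence < min_confidence:
        return Decision("human_review", f"confidence {model_confidence:.2f} below floor")
    return Decision("auto_approve", "low-stakes, high-confidence")

print(route(250.0, 0.99).action)     # auto_approve
print(route(250.0, 0.80).action)     # human_review (low confidence)
print(route(50_000.0, 0.99).action)  # human_review (high stakes)
```

The audit trail matters as much as the routing itself: every `Decision` carries a stated reason, which is the kind of explainability regulators ask for when an automated outcome is challenged.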

Why Financial Innovation Still Starts With Humans

AI works best when the present closely resembles the past. This makes it genuinely powerful for optimization problems, but it creates a fundamental ceiling when it comes to innovation.


Consider what an algorithm analyzing 2005 transaction data would have predicted about the future of financial services. It would not have foreseen a market for splitting restaurant bills through smartphones. It would not have anticipated peer-to-peer home rentals creating a new category of payment needs. It would not have predicted automated spare-change investing as a product category. None of those behaviors existed in the training data, so no model could have surfaced them.

The 2025 explosion of prediction markets, the GENIUS Act creating frameworks for stablecoin adoption, and the embedded finance wave reshaping how consumers interact with financial services were conceived by people who understood changing customer behavior, regulatory opportunities, and emerging technological possibilities in ways no historical dataset could have revealed. AI excels at scaling these innovations once they exist: optimizing pricing, improving user matching, and detecting fraud in new payment types. The creative work of asking “what if we did this completely differently?” is still human territory.

Financial crises make the same point from a different angle. The 2008 housing crisis, the COVID-19 economic shock, and the 2023 banking crisis shared one defining characteristic: none of them resembled anything that existed in prior training data. AI trained on decades of market stability cannot advise clients during genuine instability because its operating principle, that future patterns will resemble past patterns, breaks down exactly when guidance is most needed.

AI has become a strategic decision, according to Deloitte. But the strategic decisions that actually determine competitive advantage (identifying emerging customer needs before data shows them, building regulatory frameworks for instruments that don’t fully exist yet, forming ecosystems through partnership negotiation and shared standard-setting) require human judgment that synthesizes technical possibility, market conditions, regulatory constraints, and organizational capability in ways training data cannot encode.

What This Means for Financial Services Leaders

The AI-in-fintech market reached $30 billion in 2025 and is projected to reach $97.7 billion by 2034. Every breakthrough enabling that growth was conceived by humans. The hundreds of billions in annual value that generative AI could theoretically add to global banking comes from AI’s ability to execute and scale human-conceived strategies, not from inventing those strategies.


The organizations positioned to capture that value are not necessarily the ones with the best models. They’re the ones that have built the infrastructure capable of using those models in production, the governance frameworks that satisfy regulators without paralyzing development, and the human-AI collaboration models that put automation where it genuinely helps and human judgment where the stakes require it.

The constraints explored throughout this article (infrastructure complexity, regulatory demands, bias risks, the innovation ceiling) are not temporary problems waiting to be solved. They are features of the environment that AI-enabled financial services must be built around. Accepting them as design inputs rather than obstacles is what separates institutions that deploy AI effectively from those still running pilot programs years after their initial investment.

Contact Softjourn’s fintech consulting team to discuss your development strategy and find out how to build a compliant, scalable AI-enabled infrastructure that holds up in production.

 

1. Deloitte, “AI ROI: The paradox of rising investment and elusive returns,” 2025.
2. McKinsey, “Unlocking profitable B2B growth through gen AI.”