Modular, domain-specific AI is better aligned with India's BFSI than one-size-fits-all monolithic LLMs.

From Experimentation to Operation
Indian financial institutions stand at an inflection point. The early excitement around generalised artificial intelligence has given way to a harder question: how can we actually deploy AI to solve real business problems within the constraints we face?
The answer: AI Agents. These compact, domain-specialised models, orchestrated to handle specific, well-bounded financial tasks, are proving to be more aligned with the economic, infrastructural, and regulatory realities of Indian BFSI than approaches that attempt to apply a single AI system across all use cases.
The constraints that define financial services in India are not obstacles to be overcome; they are signals pointing toward the right architectural choices. We will examine what those constraints are, why they matter, and how AI Agents address them in ways that create immediate, measurable business value.

The Structural Realities Shaping Indian BFSI in 2026
Economic Pressure: High Volume, Low Ticket, Thin Margins
India's digital payment infrastructure has transformed the sector's operational profile. The Unified Payments Interface (UPI) now accounts for 83.7 percent of all digital transaction volume, with volumes exceeding 19 billion transactions in 2024.[1] Daily UPI volumes continued to surge through 2025, and projections suggest the trend will sustain throughout 2026.
At this scale, the unit economics of artificial intelligence become critical. Each transaction that touches an AI system incurs an inference cost. When processing millions of small-ticket transactions daily (micro-loans, sachet insurance products, vernacular banking queries), that per-call cost compounds rapidly. A model that functions acceptably in a proof-of-concept with 10,000 transactions becomes economically unviable when scaled to billions.
Industry research confirms the constraint: inference on a 7-billion-parameter model costs 10 to 30 times less in latency and computational resources than inference on a 70-billion-parameter or larger model.[2] For institutions processing billions of transactions annually, this difference translates to millions of rupees in operational expense. More often, it determines whether an automation is even economically possible.
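The scale effect can be made concrete with back-of-the-envelope arithmetic. The per-call costs and annual volume below are illustrative assumptions, not measured figures; only the rough ratio between the two models reflects the research cited above.

```python
# Illustrative only: the per-call costs and annual volume are assumptions.
SMALL_MODEL_COST = 0.002       # assumed rupees per call for a 7B-class model
LARGE_MODEL_COST = 0.040       # assumed 20x higher for a 70B-class model
ANNUAL_CALLS = 2_000_000_000   # assumed AI-touched transactions per year

small_total = SMALL_MODEL_COST * ANNUAL_CALLS
large_total = LARGE_MODEL_COST * ANNUAL_CALLS

print(f"7B-class model:  ₹{small_total:,.0f} per year")
print(f"70B-class model: ₹{large_total:,.0f} per year")
print(f"Difference:      ₹{large_total - small_total:,.0f}")
```

Under these assumed figures, the gap between the two architectures runs to crores of rupees annually, before any accuracy or latency considerations.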
Many financial services use cases in India—algorithmic credit scoring, insurance claims triage, fraud detection on micro-transactions—are only viable if the underlying AI model is extremely efficient on a per-call basis.

Infrastructure Constraint: Connectivity and Last-Mile Reach
Despite the growth of digital rails, significant portions of Tier-3 to Tier-6 India still experience intermittent connectivity. Many banking correspondents, branch staff, and service delivery points operate in environments where assuming a continuous, high-bandwidth connection is unrealistic. A customer enquiry from a village in Jharkhand or a small business in the North East may arrive over network conditions that would disrupt cloud-dependent systems.
The last-mile problem persists not because of technological failure, but because of simple economics: densification of telecommunications infrastructure in low-population-density areas remains challenging. Financial inclusion, however, requires services to reach precisely these areas. This creates an architectural challenge: how to deploy sophisticated financial services in infrastructure-constrained environments?
Many AI deployment models assume centralised processing: a query goes to the cloud, a large model responds, and the result returns. This pattern breaks when connectivity is unreliable. Even high-latency connections can work if the system is designed to handle them; continuous disconnection cannot.
Data Diversity: Heterogeneous Inputs, Task Specificity
BFSI workloads are profoundly heterogeneous. A single institution may need to process vernacular voice queries in 22+ scheduled Indian languages, legacy COBOL and Java code, ISO 20022 payment messages with complex nested structures, handwritten medical records for insurance underwriting, and unstructured UPI transaction histories for alternative credit scoring.
This heterogeneity creates a challenge for generalised approaches: a single model trained to handle all of these tasks must allocate parameters to every domain, spreading itself too thin to excel at any single one. A model trained primarily on English text and global financial data may struggle with colloquial Hindi or Telugu; a model fine-tuned for natural language may be mediocre at code refactoring; a model optimised for payment messaging may perform poorly on unstructured medical data.
Task heterogeneity compounds this issue. The cognitive and computational steps required to reconcile an ISO 20022 payment message differ fundamentally from those required to summarise a medical report. The reasoning required to triage an insurance claim differs from the logic needed to score alternative credit. Applying the same model architecture to all tasks creates inefficiency.

Regulatory Constraint: Transparency, Control, and Auditability
The Digital Personal Data Protection (DPDP) Act, 2023, and associated RBI guidelines have established strict requirements for data localisation, purpose limitation, and auditability.[3] Financial institutions are expected to maintain continuous visibility and control over the data they process and the decisions their systems make.

This introduces a fundamental tension for certain AI architectures. Generalised models trained on internet-scale data and operated by external vendors create compliance friction: customer data must be sent outside institutional firewalls for processing, introducing localisation risks; the reasoning chains within these models are often opaque, making it difficult to explain decisions to regulators or customers; and the models are not fine-tuned on institution-specific regulatory frameworks, creating misalignment between what the model has learned and what the institution is permitted to do.
Regulators increasingly expect that critical decisions—loan rejections, high-value claims denials, policy exceptions—be explainable and auditable. When a regulator asks, "Why was this customer's loan rejected?", an answer of "the model determined it" is insufficient. An institution must be able to point to the specific data and reasoning that led to the decision.
The Convergence: Why These Constraints Define the Problem
These four constraints (economic pressure, infrastructure limitations, data diversity, and regulatory requirements) are not separate problems; they are interconnected aspects of the Indian financial services operating environment. Nor are they temporary: they will continue to define the sector in 2026 and beyond.
An AI architecture that works in this environment must be:
· Economically efficient: Low cost per inference, low training overhead, allowing incremental investment tied to clear business metrics.
· Architecturally distributed: Capable of running at the edge, offline, on devices and systems under institutional control.
· Task-aligned: Optimised for specific problems rather than attempting to be general-purpose.
· Sovereignly controlled: Deployable on-premise or within tightly governed private cloud environments, with full institutional visibility and auditability.
AI Agents: The Optimal Solution
AI Agents are compact, domain-specialised models that are orchestrated to perform well-bounded tasks within institution-controlled environments. They are typically deployed with parameter counts in the range of 3 to 13 billion, fine-tuned on institution-specific or domain-specific data, and often paired with Retrieval-Augmented Generation (RAG), rules engines, and human-in-the-loop governance.
The design philosophy is fundamentally different from generalised approaches: rather than attempting to be all things to all use cases, an AI Agent is optimised for a specific task and bundled with the data, governance, and operational constraints necessary to execute that task reliably and safely.
How AI Agents Address the Constraints
On economics: A 7-billion-parameter model fine-tuned for a specific task consumes 10 to 30 times less compute per inference than a 70+ billion-parameter general-purpose model.[2] This unlocks automation at cost points that were previously uneconomical. A 50,000-rupee loan can now be underwritten algorithmically; a micro-insurance claim can be processed through AI assistance; a rural customer can receive vernacular financial guidance—all at inference costs that preserve institutional margins.
On infrastructure: AI Agents are designed for edge deployment. A specialist agent running on a tablet used by banking correspondents can detect fraud, score credit, fill compliance forms, and verify KYC requirements—all offline. When connectivity returns, results and learnings are synchronised. Sensitive transaction data never leaves the device; only aggregated model updates are sent to central servers. This pattern, called federated learning, makes financial services more resilient in infrastructure-constrained areas whilst preserving privacy.
On data diversity: Rather than forcing heterogeneous tasks into a single model, an institution can deploy a specialised agent for each task. A code-refactoring agent trained on legacy syntax and modern microservices patterns; a vernacular agent trained on Indian speech and financial terminology; an ISO 20022 reconciliation agent trained on payment semantics; a medical underwriting agent trained on clinical signals. Each agent is optimised for its domain, resulting in superior accuracy, lower cost, and simpler governance.
On regulation: AI Agents running on institutional infrastructure with RAG constraints ensure data stays within firewalls and decision-making is auditable. When paired with RAG, agents retrieve answers strictly from verified institutional knowledge bases—policy documents, regulatory frameworks, product definitions—with no invention. Every answer is traceable to a source. Every decision leaves a reasoning trace that can be reviewed by a human officer and captured for regulatory examination.
The objective is to make AI trustworthy enough to deploy in a regulated environment.
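The traceability requirement described above can be illustrated with a minimal sketch. The knowledge base, document identifiers, and keyword-overlap retriever below are illustrative stand-ins; a production system would use embedding-based retrieval over the institution's actual policy corpus. The essential property is that every answer carries the source it was drawn from.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    doc_id: str   # e.g. an internal policy circular reference (illustrative)
    text: str

# Illustrative knowledge base; real entries would be policy and product documents.
KNOWLEDGE_BASE = [
    Passage("KYC-POLICY-v3.2", "Aadhaar e-KYC is accepted for accounts below the full-KYC threshold."),
    Passage("LOAN-POLICY-v1.8", "Loans above Rs 10 lakh require two independent income proofs."),
]

def retrieve(query: str) -> Passage:
    """Toy keyword-overlap retriever; real systems would use embeddings."""
    words = set(query.lower().split())
    return max(KNOWLEDGE_BASE, key=lambda p: len(words & set(p.text.lower().split())))

def answer(query: str) -> dict:
    passage = retrieve(query)
    # The agent may only answer from the retrieved passage; the source id is
    # returned alongside, so every response is traceable for audit.
    return {"answer": passage.text, "source": passage.doc_id}

print(answer("What KYC is accepted for small accounts?"))
```

Because the source identifier travels with every answer, a reviewer or regulator can trace any response back to a specific, versioned institutional document.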

AI Agents in Practice: Where They Create Value
Core Banking: Accelerating Legacy Modernisation
The Opportunity. Most Indian banks operate on legacy cores. These systems are stable but brittle. Modernising them is seen as high-risk ("open-heart surgery") and has been repeatedly deferred. Yet the cost of maintaining legacy talent and the missed opportunity of deploying modern functionality have become untenable.
Additionally, integration between cores and new platforms (fintechs, open finance APIs, payment networks) remains fragile. A partner changes a data field, and the integration breaks. ISO 20022 migration, now underway globally, introduces richer data that traditional reconciliation engines struggle to utilise.
AI Agents at Work. Code-refactoring agents, trained on legacy syntax and modern patterns, can analyse decades-old code and infer the business logic embedded within it. They generate dependency graphs, identify redundant subroutines, and propose microservices-oriented refactors that preserve financial correctness while enabling cloud-native deployment. This reduces reliance on scarce legacy talent and transforms modernisation from a one-time capital project into a continuous operational capability.
Semantic gateway agents deployed within API gateways act as intelligent translators. When a partner changes data schema—from "cust_ID" to "customer_identifier"—the agent infers semantic equivalence and automatically remaps the data to the bank's internal schema. This creates "self-healing APIs" that reduce integration fragility and maintenance burden.
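The remapping step can be sketched as follows. Plain string similarity stands in for the semantic inference a real gateway agent would perform with a language model, and the canonical field names and similarity cutoff are assumptions for illustration.

```python
import difflib

# Illustrative canonical schema for the bank's internal systems.
CANONICAL_FIELDS = ["customer_identifier", "account_number", "txn_amount"]

def remap(partner_record: dict, threshold: float = 0.5) -> dict:
    """Map a partner's (possibly renamed) fields onto the canonical schema.

    String similarity is a stand-in for the semantic matching a gateway
    agent would perform; unmatched fields pass through unchanged.
    """
    canonical = {}
    for key, value in partner_record.items():
        match = difflib.get_close_matches(key.lower(), CANONICAL_FIELDS,
                                          n=1, cutoff=threshold)
        canonical[match[0] if match else key] = value
    return canonical

# A partner renames "customer_identifier" to "cust_ID"; the gateway heals it.
print(remap({"cust_ID": "C-1029", "txn_amount": 2500}))
```

In a real deployment the agent's proposed remapping would itself be logged and reviewable, so that a wrong inference can be caught and corrected.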
ISO 20022 reconciliation agents parse complex payment messages and reason through unstructured remittance information—"payment for invoices 342 and 345, less 2% early settlement discount"—to match payments to open receivables with near-human accuracy. Manual back-office effort drops; settlement cycles accelerate; working capital improves.
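Once the agent has parsed the invoice numbers and discount out of the free-text remittance line, the matching step reduces to a constrained search over open receivables. The ledger, tolerance, and discount below are illustrative; the language-understanding step itself is not shown.

```python
from itertools import combinations

# Illustrative open receivables: invoice id -> outstanding amount in rupees.
OPEN_RECEIVABLES = {"342": 10_000.0, "343": 7_500.0, "345": 4_000.0}

def match_payment(amount: float, discount_pct: float = 0.0, tolerance: float = 0.01):
    """Return the list of invoice ids whose discounted total equals the payment."""
    for r in range(1, len(OPEN_RECEIVABLES) + 1):
        for combo in combinations(OPEN_RECEIVABLES, r):
            total = sum(OPEN_RECEIVABLES[i] for i in combo) * (1 - discount_pct)
            if abs(total - amount) <= tolerance:
                return sorted(combo)
    return None  # no combination matched; escalate to a human reviewer

# "payment for invoices 342 and 345, less 2% early settlement discount":
# (10,000 + 4,000) * 0.98 = 13,720
print(match_payment(13_720.0, discount_pct=0.02))
```

The exhaustive search is fine at this toy scale; real reconciliation engines prune the candidate space before matching.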
Insurance: Defending Against AI-Enabled Fraud While Accelerating Claims
The Opportunity. The insurance sector faces dual pressures: the need to increase penetration and close the protection gap, and the rising threat of AI-enabled fraud. Deepfakes and synthetic identities are becoming industrialised. Traditional claims processing cannot distinguish synthetic images from authentic ones; manual review is slow and expensive. Health insurance underwriting remains bottlenecked by the need to manually digitise and interpret heterogeneous medical records.
AI Agents at Work. Forensic agents trained to detect the digital artefacts of synthetic images analyse pixel noise patterns, metadata inconsistencies, and lighting anomalies. When a claim is submitted, the forensic agent screens attachments for authenticity markers whilst simultaneously cross-referencing claim narratives against external signals—weather reports, traffic camera feeds, social media consensus. This creates a robust defence against industrialised fraud whilst enabling "straight-through processing" for genuine claims—reducing settlement times from days to hours.
Underwriting agents equipped with vision capabilities ingest scanned medical reports, handwritten doctor's notes, and lab results. They extract clinical entities (diagnoses, medications, vital signs) and reason through medical history to flag pre-existing conditions and risk factors that keyword matching would miss. Underwriting latency falls; customer acquisition costs drop; premium pricing becomes fairer and more granular.
Policy companion agents transform static PDFs into interactive, queryable interfaces. A customer asks, "If I am diagnosed with dengue fever, is my hospitalisation covered?" The agent reasons through their specific policy clauses, calculates sub-limits and waiting periods, and provides a legally accurate answer: "Yes, up to ₹5,000 per day, after a 30-day waiting period." Transparency increases; mis-selling decreases; grievance redressal burden falls.
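A policy companion is safest when the language model parses the question but the numbers come from structured policy data. A minimal sketch of that deterministic layer, with illustrative policy terms:

```python
from datetime import date

# Illustrative policy record; real terms would come from the policy system.
POLICY = {
    "start_date": date(2025, 6, 1),
    "covers": {"dengue fever": {"per_day_limit_inr": 5_000, "waiting_days": 30}},
}

def coverage(condition: str, as_of: date) -> str:
    """Answer a coverage question from structured policy terms, never invention."""
    terms = POLICY["covers"].get(condition.lower())
    if terms is None:
        return "Not covered under this policy."
    days_active = (as_of - POLICY["start_date"]).days
    if days_active < terms["waiting_days"]:
        return f"Covered, but the {terms['waiting_days']}-day waiting period has not elapsed."
    return f"Yes, up to ₹{terms['per_day_limit_inr']:,} per day."

print(coverage("Dengue Fever", as_of=date(2025, 9, 15)))
```

Because sub-limits and waiting periods are computed rather than generated, the answer is legally accurate by construction for the terms on record.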
Lending: Credit for the "New Data-Rich, File-Thin" Segment
The Opportunity. Millions of Indian MSMEs and new-to-credit individuals lack formal credit histories. Traditional credit scoring models cannot underwrite them. Manual underwriting for small-ticket loans of around ₹50,000 is uneconomical. Meanwhile, collections remain contentious, often relying on aggressive tactics that damage relationships and attract regulatory scrutiny.
AI Agents at Work. Alternative-data agents synthesise the "digital exhaust" of borrowers: UPI transaction histories, bank statements, and GST filings. Using Account Aggregator consent frameworks, these agents construct granular cash-flow narratives. They distinguish business expenses from personal consumption in UPI transactions, identify seasonality in income, flag external shocks, and infer repayment capacity with surprising accuracy. This makes micro-lending to MSMEs and new-to-credit individuals economically viable.
Collections agents combine sentiment analysis with rule-bound restructuring logic. When a borrower shows financial distress—missed payments, urgent language patterns—the agent detects this and reasons through pre-approved restructuring options. Rather than rigidly demanding payment, the agent offers: "I understand you have a medical emergency. Based on your history with us, we can defer this month's EMI to the end of your tenure. Would that help?" This improves recovery rates while adhering to RBI guidelines on ethical conduct.
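The rule-bound part of such an agent can be sketched as an eligibility filter over pre-approved options. The option names, thresholds, and distress signal below are illustrative placeholders; in practice the distress flag would come from sentiment analysis over the borrower's interactions.

```python
# Illustrative pre-approved restructuring options and their eligibility rules.
PRE_APPROVED_OPTIONS = {
    "defer_one_emi": {"min_on_time_payments": 12, "max_missed": 2},
    "extend_tenure_3m": {"min_on_time_payments": 24, "max_missed": 3},
}

def eligible_offers(on_time_payments: int, missed_payments: int,
                    distress_detected: bool) -> list:
    """Offer only pre-approved options, and only when distress is detected."""
    if not distress_detected:
        return []  # no restructuring offered without distress signals
    return [
        name for name, rule in PRE_APPROVED_OPTIONS.items()
        if on_time_payments >= rule["min_on_time_payments"]
        and missed_payments <= rule["max_missed"]
    ]

# A borrower with a strong history and detected distress:
print(eligible_offers(on_time_payments=30, missed_payments=1, distress_detected=True))
```

Constraining the agent to a pre-approved menu is what keeps the empathetic conversation within the institution's credit policy and RBI conduct guidelines.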
Unified Lending Interface (ULI) companion agents monitor merchant context—inventory levels, cash flow patterns, seasonal cycles. They predict working capital gaps before they occur and proactively offer credit at the moment of need. A merchant app prompts: "You typically stock up for Diwali next week, but your cash balance is low. Approve a ₹2 lakh inventory loan?" This integrates credit seamlessly into livelihood activities, aligning with the NSFI 2025-30 objective of linking financial services to real economic activity.
Wealth Management: Trustworthy Personalisation at Scale
The Opportunity. Wealth management is shifting from an exclusive, high-touch service for the affluent to a digitised, mass-market offering. Yet retail investors distrust opaque algorithmic advice. They need explanations. Financial literacy, especially among Gen Z, is uneven. Research reports are dense and inaccessible to most investors.
AI Agents at Work. Advisory agents pair portfolio optimisation with narrative explanations. When recommending asset reallocation, the agent generates a personalised explanation: "We are moving 5 percent of your portfolio from small-cap to large-cap funds. The reasoning is that market volatility is expected to rise, and this aligns with your goal of preserving capital for your house purchase next year." This builds trust and reduces panic-selling during downturns.
Research summarisation agents automatically distil dense institutional research into personalised formats. A Gen Z user might receive a bulleted summary with key risks highlighted; a sophisticated investor gets a detailed technical note. The same underlying research; different granularity and language. This democratises access to institutional-grade insights.
Financial literacy mentor agents engage younger investors in interactive dialogues and scenario simulations. "The market just crashed 10 percent. What would you do?" Rather than lecturing, the agent tailors feedback to build financial discipline and long-term thinking. This fulfils NSFI mandates on financial literacy while creating a pipeline of educated future customers.


Implementing AI Agents: A Tactical Blueprint for 2026
Model Garden Architecture
Deploy a modular ensemble, not a monolith. A lightweight router agent classifies incoming requests by domain—credit, complaint, transaction, fraud signal—and dispatches to specialist agents. Each specialist has clear governance, versioning, and risk profiles. This mirrors microservices principles and allows the risk function to evaluate and approve each agent independently.
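A minimal sketch of the router-plus-specialists pattern follows. Keyword routing stands in for the small classification model a real router agent would use, and the agent names and keywords are illustrative.

```python
# Illustrative specialist agents; in production each would wrap its own model,
# governance policy, and version history.
def credit_agent(req): return f"[credit] scoring: {req}"
def fraud_agent(req): return f"[fraud] screening: {req}"
def complaint_agent(req): return f"[complaint] triaging: {req}"

SPECIALISTS = {
    "credit": credit_agent,
    "fraud": fraud_agent,
    "complaint": complaint_agent,
}

# Keyword routing is a stand-in for a lightweight classification model.
ROUTING_KEYWORDS = {
    "credit": ["loan", "emi", "credit"],
    "fraud": ["suspicious", "unauthorised", "fraud"],
    "complaint": ["complaint", "grievance", "refund"],
}

def route(request: str) -> str:
    text = request.lower()
    for domain, keywords in ROUTING_KEYWORDS.items():
        if any(k in text for k in keywords):
            return SPECIALISTS[domain](request)
    return "[human] escalated for manual handling"  # unclassified -> human

print(route("Unauthorised UPI debit of Rs 4,000"))
```

Because each specialist sits behind its own entry in the dispatch table, it can be versioned, risk-assessed, and replaced without touching the others, which is the point of the model-garden design.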

Edge Deployment and Federated Learning
Deploy agents locally on devices used by banking correspondents, branch staff, and customers. Fraud detection, credit scoring, and compliance checks run offline. When connectivity returns, results and learnings are synchronised to central systems. Sensitive transaction data never leaves devices; only aggregated model updates are sent to servers. This pattern preserves privacy, reduces bandwidth demand, and enables reliable service in infrastructure-constrained areas.
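The aggregation step can be sketched with plain federated averaging. The weight vectors and per-device deltas below are toy values; a production system would operate on real model tensors and add secure aggregation so individual device updates are not inspectable.

```python
def aggregate(global_weights, device_deltas):
    """Average per-device weight deltas and apply them to the global model.

    Only these deltas ever leave the devices; the raw transaction data that
    produced them stays local.
    """
    n = len(device_deltas)
    avg_delta = [sum(d[i] for d in device_deltas) / n
                 for i in range(len(global_weights))]
    return [w + dw for w, dw in zip(global_weights, avg_delta)]

global_model = [0.5, -0.2]
# Deltas computed locally on three correspondents' tablets (toy values):
deltas = [[0.1, 0.0], [0.3, -0.1], [0.2, 0.1]]
print(aggregate(global_model, deltas))
```

The updated global model is then pushed back to the devices on the next sync, closing the learning loop without centralising sensitive data.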

Human-in-the-Loop Governance
For Tier 1 decisions—loan rejections, high-value claims, policy exceptions—agents operate in "draft mode." They prepare recommendations with visible reasoning traces. A human officer reviews, approves, or overrides. Overrides are fed back into the system to improve subsequent agent behaviour. This transforms governance into a continuous learning mechanism, keeping AI aligned with institutional ethics and regulatory expectations.
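The draft-and-override loop can be sketched as a small data structure; the field names are illustrative. The key behaviour is that every override is captured, so disagreements between officers and agents become training signal rather than lost information.

```python
from dataclasses import dataclass, field

@dataclass
class Draft:
    case_id: str
    recommendation: str   # the agent's proposed decision
    reasoning: str        # visible reasoning trace for the reviewing officer

@dataclass
class ReviewLog:
    overrides: list = field(default_factory=list)

    def review(self, draft: Draft, officer_decision: str) -> str:
        """The human decision is final; overrides are logged for retraining."""
        if officer_decision != draft.recommendation:
            self.overrides.append(
                (draft.case_id, draft.recommendation, officer_decision))
        return officer_decision

log = ReviewLog()
draft = Draft("LN-9241", "reject", "Debt-to-income ratio above policy threshold.")
final = log.review(draft, officer_decision="approve")
print(final, len(log.overrides))
```

The override log doubles as an audit artefact: it shows regulators both what the agent proposed and what the accountable human decided.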
Monitoring and Continuous Improvement
Establish clear SLAs, accuracy thresholds, and cost baselines for each agent. Track overrides, customer complaints, and regulatory feedback. Retire underperforming agents; upgrade successful ones. Use retrieval logs and decision records to continuously refine knowledge bases and prompts. This creates a feedback loop where operational experience continuously improves agent performance.
Alignment with India's Financial Strategy
The National Strategy for Financial Inclusion (NSFI) 2025-30 sets an ambitious agenda: expanding access to savings, credit, insurance, and pensions across all segments; ensuring services are suitable and affordable; and linking financial inclusion to real livelihood outcomes.[4] The strategy is built on principles of financial safety, security, resilience, and discipline.
AI Agents are structurally well-positioned to advance this agenda. By enabling cost-effective, explainable, and sovereign AI, they make it economically viable to extend sophisticated financial services to underserved segments. Vernacular interfaces remove literacy barriers. Edge deployment reaches unconnected areas. Algorithm transparency builds trust. Modular, scalable architecture aligns with the distributed nature of India's financial inclusion journey.
This is not incidental; it is structural. The constraints that make certain AI approaches economically unviable in India are the same constraints that define financial inclusion. By solving for these constraints, AI Agents simultaneously solve for India's financial future.
Those waiting for the "perfect" generalised AI solution may find themselves outpaced by those who chose precision over generality, and operation over experimentation.
AI Agents are not a replacement for all AI work. Rather, they are the pragmatic workhorse for the high-volume, low-latency, tightly-governed tasks that define day-to-day BFSI operations. They reflect a maturation in how the sector approaches technology: moving from "what can AI theoretically do?" to "what can AI actually do within our constraints?"
Small is the new smart. And in Indian BFSI, smart means building for your own constraints, not against them.
References
[1] Reserve Bank of India. (2025). UPI's share in India's digital payments surged to 83%: RBI report. Government of India, Ministry of Finance. Retrieved from https://ddnews.gov.in/en/upis-share-in-indias-digital-payments-surged-to-83-rbi-report/
[2] Anaconda. (2025). Small Language Models: The Efficient Future of AI. Research on computational efficiency and inference cost comparison between SLMs and LLMs. https://www.anaconda.com/blog/small-language-models-efficient-future-ai
[3] Ministry of Electronics and Information Technology (MeitY), Government of India. (2024). Digital Personal Data Protection Act, 2023. Retrieved from https://www.meity.gov.in/static/uploads/2024/06/2bf1f0e9f04e6fb4f8fef35e82c42aa5.pdf; Reserve Bank of India. (2024–2025). Data Localization and Storage Guidelines for Payment Systems. Retrieved from https://www.sisainfosec.com/blogs/data-protection-regulations-in-banking-india/
[4] Reserve Bank of India. (2025). National Strategy for Financial Inclusion (NSFI) 2025–30. Government of India. Retrieved from https://www.galaxyclasses.co.in/details?res_type=ca&res_id=9019
Published by Humane Technologies
This paper is published as independent expert analysis for informational purposes only and does not constitute legal, regulatory, or investment advice. Institutions should evaluate all architectural and governance choices in light of their own risk appetite, regulatory obligations, and supervisory guidance. The information herein reflects the state of technology and policy as of January 2026.
© Humane Technologies. All rights reserved.