Utkarshini: The Vedic AI That Actually Understands the Texts
Adaptiv Admin
May 13, 2026 · 7 min read

Ask any AI assistant today what the Isha Upanishad teaches about renunciation, and you will receive a confident, fluent, and subtly wrong answer - a verse that is misattributed, a nuance that is collapsed, or a thousand-year tradition reduced to a greeting-card paragraph that sounds plausible but isn't quite right. We are building Utkarshini because that is not good enough.
Utkarshini (from Sanskrit, meaning one who elevates, signifying excellence, brilliance, and progression) is Project Bhaskar's project to create the first AI model trained deeply and specifically on the Vedas and Upanishads. Not a general-purpose chatbot that has incidentally read some Sanskrit. A specialist: a system that has absorbed the Rig Veda, the Atharva Veda, the principal Upanishads, and the great commentarial traditions of Shankaracharya, Madhvacharya, and Ramanuja. A system that understands not just the words, but the philosophical architecture behind them.
"Can't I just ask ChatGPT about the Vedas?"
It's a fair question. Modern AI is genuinely impressive, and it has read an enormous amount of text. But it has a significant Indic gap. Most Western-trained models fragment Devanagari script badly - a single Sanskrit word can become a dozen meaningless pieces in the model's internal representation, like shredding a manuscript before reading it. Testing in 2024 confirmed what Sanskrit scholars already suspected: even the most advanced models produce grammatically incorrect and logically flawed Sanskrit. They hallucinate verses, confuse texts, and give shallow answers to questions that any informed reader of the Gita would answer correctly.
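The fragmentation problem is easy to see at the byte level. Most modern LLM tokenizers start from UTF-8 bytes, and every Devanagari character costs three bytes, so a short Sanskrit word presents far more raw symbols to the model than its romanised equivalent. A minimal sketch (the specific word is just an illustration):

```python
# Devanagari code points occupy 3 bytes each in UTF-8, so byte-level
# tokenizers see many more raw symbols per word than for Latin script.
word_devanagari = "धर्म"   # "dharma" written in Devanagari: 4 code points
word_latin = "dharma"      # the same word romanised: 6 code points

print(len(word_devanagari.encode("utf-8")))  # 12 bytes
print(len(word_latin.encode("utf-8")))       # 6 bytes
```

Without Devanagari-specific merges learned during tokenizer training, each of those bytes can surface as its own token - the "shredded manuscript" effect described above.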
The problem is structural, not superficial. These models were trained on vast oceans of English-language internet text, with Sanskrit and Vedic literature as a rounding error. When they generate an "explanation," they are pattern-matching across that impoverished representation. The results feel wrong to anyone who knows the material — and they should.
"The Vedic corpus is the cornerstone of civilisational wisdom for over a billion people. It deserves more than a confident hallucination."
What we are building: two deliverables, one lasting asset
Utkarshini is, technically, two things at once. The first is a fine-tuned 8-billion-parameter AI model capable of answering complex philosophical questions, citing specific verses, explaining concepts across multiple commentarial traditions, and working in English, Hindi, and Sanskrit. The second, and in many ways more important, is the parallel corpus underneath it.
For every verse in the principal Upanishads and Rig Veda, we are assembling the Sanskrit original in Devanagari, its transliteration, a word-by-word grammatical breakdown, multiple English translations, and the classical commentaries of the major Vedantic traditions, side by side. This kind of structured, aligned, scholar-reviewed dataset does not currently exist in machine-readable form. It is the foundational scholarly work that makes everything else possible, not just this model, but every future model, research project, or digital humanities application that wants to engage seriously with this literature. The corpus is the moat. The model is the demonstration layer.
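One way to picture an aligned record in such a corpus is as a simple schema. This is an illustrative sketch only - the field names and the example verse entry are hypothetical, not the project's actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class VerseRecord:
    """One aligned entry in a hypothetical parallel corpus."""
    text_id: str                 # e.g. "isha_upanishad"
    verse_ref: str               # e.g. "1.1"
    devanagari: str              # canonical Devanagari source text
    transliteration: str         # IAST romanisation
    word_breakdown: list         # per-word grammar, e.g. {"word": ..., "gloss": ...}
    translations: dict = field(default_factory=dict)   # translation label -> English rendering
    commentaries: dict = field(default_factory=dict)   # tradition -> commentary excerpt

# A partially filled example record (opening words of Isha Upanishad 1.1)
record = VerseRecord(
    text_id="isha_upanishad",
    verse_ref="1.1",
    devanagari="ईशावास्यमिदं सर्वम्",
    transliteration="īśāvāsyam idaṃ sarvam",
    word_breakdown=[{"word": "īśāvāsyam", "gloss": "to be pervaded by the Lord"}],
)
print(record.verse_ref)  # 1.1
```

Keeping every layer side by side in one record is what makes the dataset useful beyond any single model: alignment validation, cross-tradition comparison, and training-data generation all read from the same structure.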
The project's full name carries its own logic: User-interface for Testing, Knowledge Annotation and Review from Scrapers by Humans of Indic Information. But the spirit is simpler than the acronym suggests.
Scholarship before engineering
Building a genuine Vedic specialist requires more than feeding texts into a computer. The knowledge base we are constructing draws from three tiers: the primary Sanskrit originals across all four Vedas and the 108 Upanishads; the great commentaries of Shankaracharya, Sayana, Madhvacharya, and Ramanuja; and a broader Indic context including the Bhagavad Gita, Brahma Sutras, and Yoga Sutras, alongside modern scholarly interpretations from Radhakrishnan and Max Müller. The model learns not just the texts themselves, but the full tradition of thinking about them - Advaita, Dvaita, Vishishtadvaita, and modern academic lenses all represented. Otherwise, we risk building a sectarian AI rather than a scholarly one.
One challenge most people don't anticipate: the same Sanskrit verse exists in dozens of incompatible digital encodings - Devanagari Unicode, IAST, Harvard-Kyoto, ITRANS, Velthuis. They are all the same text, but to a machine they look like entirely different strings. We have built a normalisation pipeline that converts every source into a single canonical form before any training begins. It is unglamorous work. It is also completely essential.
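To make the encoding problem concrete, here is a deliberately tiny normaliser that maps a few Harvard-Kyoto conventions onto IAST. It is a toy sketch, not our production pipeline: it covers only single-character correspondences, and a real system must handle all five encodings plus context-sensitive rules, ideally via an established transliteration library.

```python
# Harvard-Kyoto uses capital letters (and a few lowercase substitutions)
# where IAST uses diacritics. This toy map covers common one-to-one cases.
HK_TO_IAST = {
    "A": "ā", "I": "ī", "U": "ū",
    "R": "ṛ", "T": "ṭ", "D": "ḍ", "N": "ṇ",
    "z": "ś", "S": "ṣ", "M": "ṃ", "H": "ḥ", "G": "ṅ", "J": "ñ",
}

def hk_to_iast(text: str) -> str:
    """Convert a Harvard-Kyoto string to IAST, one character at a time."""
    return "".join(HK_TO_IAST.get(ch, ch) for ch in text)

print(hk_to_iast("AtmA"))   # ātmā
print(hk_to_iast("yogaH"))  # yogaḥ
```

Every source text passes through a step like this before training, so the model only ever sees one canonical form of each verse.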
Automated tools cannot assess theological accuracy, which is why Sanskrit scholars are in the loop throughout - validating alignments, reviewing synthetic training data, and anchoring the model's outputs in authentic tradition rather than digital inference. Even five hundred expert-reviewed question-and-answer pairs change the model's reliability substantially.

Who Utkarshini is for
The spiritual wellness market is projected to reach nearly $10 billion by 2035, with faith-tech among its fastest-growing segments. Over 25 million Christians now use AI-powered Bible study tools, and India's Sri Mandir has 40 million downloads. The infrastructure for faith-tech at scale is clearly there. What doesn't exist is the Hindu equivalent of a genuinely knowledgeable AI - one that can give a student a verse-grounded answer about karma, help a researcher compare Shankara's and Ramanuja's readings of the same passage, or explain to a diaspora family in Chicago what a line from the Chandogya Upanishad actually means.
That diaspora market matters particularly. The Indian communities in the US, UK, UAE, and Australia represent a high-intent audience for exactly this kind of tool - people who want to connect with their heritage but face the language barrier of Sanskrit, and who have found that existing AI tools give them confident nonsense when they try. Utkarshini is being built for them as much as for scholars.
For Project Bhaskar, this also connects directly to our broader work in Indian art digital preservation. The Mughal miniatures illustrating Upanishadic stories, the Pahari paintings of Vedic scenes, the Bengal school works drawing from philosophical traditions - Utkarshini is the knowledge layer needed to contextualise all of it. The philosophical traditions behind the art deserve the same rigour as the art itself. The project also aligns squarely with two Indian government initiatives: the Gyan Bharatam Mission, which has committed ₹482 crore to digitising India's manuscript heritage, and the IndiaAI Mission, which has allocated ₹10,300 crore specifically for indigenous AI solutions.
"We are not building a chatbot. We are building the first structured, machine-readable record of the Vedic canon — and demonstrating it through an AI that actually knows what it's talking about."
Carrying the past forward, not preserving it in amber
Utkarshini is not a nostalgia project. It is not about freezing a tradition or building a digital shrine. It is about making one of humanity's great intellectual inheritances legible, accurately, at scale, and for the first time, using the best available tools. The Vedas and Upanishads have survived three thousand years of oral transmission, manuscript copying, colonial disruption, and neglect. They do not need our protection. They need our attention.
At Project Bhaskar, we think that is what responsible use of technology looks like: not chasing the next benchmark, but turning powerful tools toward problems that matter and have been ignored. Nobody has built what Utkarshini describes. The compute costs are modest. The scholarly work is hard. The result, if we do it right, will outlast the model version it runs on - because the corpus underneath it will still be there when better base models emerge, waiting to be used again.
That is the bet. That is Utkarshini.
This project is bigger than one lab.
Building the first structured, scholar-reviewed parallel corpus of Vedic and Upanishadic literature is not a task for a small team alone. We are actively looking for collaborators, validators, and partners who want to see this exist in the world, and who have something to contribute to making it right.

Drop us a note at hello@adaptiv.me. We would love to hear from you even if none of those categories fit but you care about this work - as a practitioner, a member of the diaspora, a museum professional, a funder, or simply someone who thinks this should exist. The corpus we are building will outlast any single version of the model. Every person who contributes to getting it right is part of what makes it last.
Stay close to the project. Follow us on LinkedIn and Instagram.