Mapping India's Court Data Stack: From NJDG to APIs to AI Agents

Most Indian legaltech market maps in circulation today skip the layer that every other layer depends on. They show you contract lifecycle management, online dispute resolution, AI copilots, litigation finance, RegTech, legal marketplaces. They do not show you where the data comes from.

Court data is the substrate for most of Indian legaltech. An AI drafting tool needs judgment text. A litigation intelligence product needs case status. A background verification API needs litigation history. A case management SaaS needs hearing schedules. None of it works without a reliable court data pipeline underneath.

This post maps that stack end to end. Five layers, from the public source at the bottom to the AI agent at the top, with the companies operating at each layer.

Layer 1: The public source

Everything starts here. This is the canonical, government-maintained layer where court data originates.

eCourts Services (services.ecourts.gov.in). District and subordinate court records across 37 states and union territories.
National Judicial Data Grid (njdg.ecourts.gov.in). Live pendency, disposal, and institution statistics across tiers.
High Court portals. 25 separate High Court websites, each with its own interface, case information system, and cause-list format.
Supreme Court of India (main.sci.gov.in). Registry, cause lists, judgments, and case status.
Specialised tribunals. NCLT, NCLAT, ITAT, CAT, DRT, Consumer Commissions, each on their own portal.

This layer is comprehensive, authoritative, and free. It is also, by design, built for single-case lookup rather than bulk or programmatic access. Captchas, fragmented interfaces, and the absence of a third-party API are deliberate. They protect the public source from scraping abuse.

Read our full primer on eCourts here: Inside eCourts: How India Digitised 29,600 Courts.

Layer 2: Aggregation and structuring

This is the layer that turns scattered public records into a queryable dataset. It is also the layer most investors miss.

Aggregation is hard because the source is fragmented. A single operator has to ingest from 25+ High Court portals plus thousands of district courts, each running a Case Information System (CIS) with slightly different quirks, then normalise the data, resolve entities (the same advocate routinely appears under several name spellings), detect updates, and keep the whole pipeline fresh. The work is operational and engineering-heavy, not glamorous, and does not produce shiny demos. It also takes years to get right.
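
To make two of those steps concrete, here is a minimal Python sketch of name-based entity resolution and update detection. It is an illustration under stated assumptions, not a description of any provider's actual pipeline: the function names, the honorific list, and the record fields (status, next_hearing) are all hypothetical.

```python
import hashlib
import json
import re
import unicodedata
from collections import defaultdict

# Illustrative honorifics to ignore; a real list would be longer (assumption).
TITLES = {"adv", "advocate", "shri", "smt", "mr", "mrs", "ms", "dr"}


def normalise_name(raw: str) -> str:
    """Build a comparison key: lowercase, no punctuation, no titles, sorted tokens."""
    text = unicodedata.normalize("NFKD", raw).lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # dots, commas, digits become spaces
    tokens = [t for t in text.split() if t not in TITLES]
    return " ".join(sorted(tokens))


def group_advocates(raw_names):
    """Group raw spellings that share the same normalised key."""
    groups = defaultdict(list)
    for name in raw_names:
        groups[normalise_name(name)].append(name)
    return dict(groups)


def record_fingerprint(record: dict) -> str:
    """Stable hash of a case record; key order does not affect the result."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_cases(stored: dict, fetched: dict) -> list:
    """Return ids of fetched records that are new or differ from the stored fingerprint."""
    return [
        case_id
        for case_id, record in fetched.items()
        if stored.get(case_id) != record_fingerprint(record)
    ]


if __name__ == "__main__":
    spellings = ["Adv. R.K. Sharma", "SHARMA, R. K.", "Shri R K Sharma", "Meena Iyer"]
    for key, variants in group_advocates(spellings).items():
        print(key, "<-", variants)
```

Exact-key matching like this only catches the easy variants. A spelling such as "RK Sharma" or "R. Kumar Sharma" would land in a different group, which is why a production pipeline would likely add fuzzy matching and a stable identifier where one is available. The fingerprint comparison is the cheap part of update detection; deciding how often to re-fetch each case is the harder operational question.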

Players operating here:

eCourtsIndia. 26.8 crore case records, 29 lakh advocate profiles, Supreme Court, High Court, district court, NCLT, and NCLAT coverage. REST API and MCP access for developers. Launched April 2025.
IndianKanoon. Free-tier search over a large judgment corpus, bootstrapped, a much-loved utility for Indian lawyers for over a decade.
LegitQuest. Research plus background verification, partial district court coverage.

Layer 2 is what makes everything above it possible. Every research platform, every AI agent, every due-diligence product is only as good as the aggregation layer it depends on.

Layer 3: Legal research databases

This is the oldest commercial layer in Indian legaltech. It was built for Supreme Court and High Court case research and is typically sold to urban firms and chambers at annual subscriptions of ₹15,000 to ₹50,000 per seat.

Manupatra. Curated Indian case law, statutes, commentary. Legacy incumbent.
SCC Online. Supreme Court Cases Online. The other legacy incumbent.
Casemine. Research with AI-assisted discovery.
Nearlaw, LawAtYourFingertips. Smaller research DBs.

Strong at what they were built for. Largely absent from the district-court tier, which is where 80 percent of Indian advocates practise.

Layer 4: AI agents and copilots

This is the most visible layer in 2026. Funded, press-covered, and full of experimentation.

Jhana.ai. AI research copilot. Seed round in 2025.
NyayNidhi. AI-assisted drafting. Early-stage funding in 2025.
Lucio. AI-native legal workflows.
SpotDraft, SimpliContract. AI-assisted contract lifecycle management (contract data rather than court data).

The honest observation about Layer 4 is that it depends entirely on Layer 2 and Layer 3. Without structured court data beneath it, an agent has nothing to reason over. Several Layer 4 products currently rely on a mix of IndianKanoon, Manupatra, scraped public sources, and custom pipelines, which is why the best of them are moving toward structured-data partnerships.

Layer 5: Applications and workflows

The top of the stack. Products that consume the lower layers and package them for a specific workflow.

Case management SaaS. Tools that help advocates manage their active matters, hearings, and clients.
Online dispute resolution. Presolv360, Jupitice, Sama, others. Arbitration and mediation platforms.
Litigation finance. LegalPay, FightRight. Capital for claimants.
Compliance and RegTech. IDfy, Redacto, RegisterKaro. Compliance automation for companies.
Background verification. BGV vendors running court-record checks as part of hiring and lending decisions.
Legal marketplaces. Vakilsearch, LegalKart, Lawyered. Consumer and SME legal services.

The stack, at a glance

Layer	What it does	Representative players
5. Applications	Workflow products: case management, ODR, litigation finance, compliance, BGV, marketplaces	Presolv360, LegalPay, IDfy, Vakilsearch
4. AI agents	Research copilots, drafting assistants, AI workflows, AI-assisted CLM	Jhana, NyayNidhi, Lucio, SpotDraft
3. Research DBs	Curated case law for SC and HC research	Manupatra, SCC Online, Casemine
2. Aggregation	Structuring public court data into queryable datasets and APIs	eCourtsIndia, IndianKanoon, LegitQuest
1. Public source	Canonical government data	eCourts, NJDG, SC, HC portals, tribunals

Where the Indian market is underbuilt

Looking at the stack honestly, three patterns stand out.

Layer 2 is the bottleneck. More than 90 percent of Indian legaltech investment in 2024-25 went to Layers 4 and 5. Very little capital has flowed into the aggregation layer, despite it being the choke point. This is the opposite of how the US stack evolved, where PACER-based ingestion projects such as CourtListener had built a mature data layer before the application explosion.

District courts are under-represented at every layer above 1. Research DBs cover Supreme Court and High Court well and district courts barely at all. AI agents inherit that bias. The roughly 1.6 million advocates practising at the district tier have been effectively invisible to most Indian legaltech.

Vernacular is a green field. The entire stack, from Layer 2 upward, is built in English. India’s trial courts produce orders and cause lists in 22 scheduled languages. An AI agent that reads Hindi, Tamil, Marathi, and Bengali court documents does not yet exist at scale.

What this means for investors and operators

If you are investing in Layer 4 or Layer 5, the due-diligence question that matters most is where the data comes from. An AI drafting product with no structured court data behind it is a thin wrapper. A case management tool without hearing alerts is a contact book. The moat in Indian legaltech, for the next five years at least, sits in Layer 2.

If you are operating at Layer 4 or 5, the decision you will face within twelve months is whether to build your own ingestion pipeline or partner with a Layer 2 provider. The companies that figure this out early will look very different in 2028 from the companies that do not.

What this means for eCourtsIndia

We sit at Layer 2, deliberately. Our job is to turn the public source into the queryable, API-accessible dataset that every layer above us depends on. That is why we have built our REST API and MCP access, why we publish coverage across 37 states and union territories, and why 26.8 crore case records sit inside our platform today.

Over time, we expect the Indian court data stack to look less like a pyramid and more like a pipeline: the public source feeding the aggregation layer, the aggregation layer powering research, AI agents, and applications, and every layer getting smarter as the ones below it get richer. That is the direction we are building toward.

If you are building at Layer 4 or Layer 5 and need structured court data, our API is available at ecourtsindia.com. Search any case free at ecourtsindia.com/search.

Sources and further reading

Tracxn, 2025: Legaltech India Funding Summary.
Ken Research, 2024: India Legaltech Outlook.
Grand View Research, 2024: India Legal AI Market Report.
eCommittee, Supreme Court of India: Annual Reports.
Bar Council of India: Enrolment statistics.
Harshith Viswanath, The LegalTech Thesis: Mapping India’s LegalTech Ecosystem, March 2026.
