Indian data is staring at a credibility crisis with official numbers on a range of subjects – from Covid deaths to jobs – being questioned by independent experts. But not too long ago, the country was seen as a world leader in data collection, writes author and historian Nikhil Menon.

Soon after India became independent from British rule, the country took inspiration from the Soviet Union and organised its economy through centralised five-year plans. This made it imperative for policymakers to have access to accurate, granular information about India’s economy.

Here, India faced a problem – as its first Prime Minister Jawaharlal Nehru put it, “we have no data”, because of which “we function largely in the dark”. Setting up a vast data infrastructure was meant to turn on the lights.

Perhaps the most transformative of the changes introduced was the National Sample Survey, which was established in 1950. It was intended to be a series of sprawling, nationwide surveys that captured information on all aspects of the economic life of citizens.

The idea was that since it would be impossible (or prohibitively expensive) to collect statistics from every household across the nation, it was better to develop a robust and representative sample, so that figures for the whole could be estimated from a small fraction.
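The logic of estimating the whole from a fraction can be illustrated with a minimal sketch. It assumes simple random sampling and an entirely made-up population of household spending figures; it is not a description of the National Sample Survey's actual design, and the function name `estimate_total` is purely illustrative.

```python
import random


def estimate_total(sample_values, population_size):
    """Scale the sample mean up to the full population size."""
    sample_mean = sum(sample_values) / len(sample_values)
    return sample_mean * population_size


# Hypothetical population: monthly spending of 100,000 households.
# These figures are invented for illustration only.
rng = random.Random(0)
population = [rng.uniform(50, 150) for _ in range(100_000)]

# Survey just 2% of households, each equally likely to be chosen.
sample = rng.sample(population, 2_000)

true_total = sum(population)
estimated_total = estimate_total(sample, len(population))
relative_error = abs(estimated_total - true_total) / true_total

print(f"True total:      {true_total:,.0f}")
print(f"Estimated total: {estimated_total:,.0f}")
print(f"Relative error:  {relative_error:.2%}")
```

In this toy setting, a 2% sample typically lands close to the true total. That is the trade the surveyors were making: a small loss of precision in exchange for a survey that could actually be carried out.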

It was, according to an assessment published by the Hindustan Times newspaper in 1953, “the biggest and most comprehensive sampling inquiry ever undertaken in any country in the world”.

Nehru handed the responsibility of running the survey to scientist PC Mahalanobis – now called the father of Indian statistics – and the organisation he founded, the Indian Statistical Institute. They faced enormous challenges. In order to properly survey the 1,833 sample villages out of the total 560,000, the short-staffed institute needed investigators who could together negotiate 15 languages and 140 local systems of measurement.

In his diary, Mahalanobis wrote about this highly complex operation with the excitement of an adventurer. There were “wild areas” in Orissa where investigators had to be accompanied by armed guards through forests; sometimes they had to cross snow-clad high Himalayan passes.

His language reflected the attitudes of the times. In Assam, the surveyors met “the most civilized people” and also “uncivilized naked tribes” who didn’t speak a known language. The tribals “know not what money means,” he wrote, and “laugh at the word economic development”. Elsewhere, criminals harassed National Sample Survey staff. A persistent problem was the jungle: “dense, impregnable forests with wild animals and epidemic tropical diseases”. In some parts, the danger was even starker: investigators complained of having to fight through forests “infested with wild-beasts and man eaters”.

But the results of the survey were extraordinary, delivering remarkably granular information about the daily lives of Indians. The second survey, for example, told policymakers exactly how much one Chidambaram Mudaly, his wife, three daughters, and mother-in-law – who lived in a remote village in the southern state of Tamil Nadu – spent on ghee, rice, wheat, salt, tea, chillies and other essentials. While this family’s financial outlay wasn’t significant on its own, when aggregated with tens of thousands of other data points, it allowed economic planners and policymakers to understand the economy in a fundamentally different way.

Since then, the National Sample Survey has consistently yielded fine-grained detail about economic life in India, helping assess poverty, employment, consumption and expenditure, to mention just a few indicators.

Furthermore, it has contributed to policymaking at a global level. The methods it pioneered are now used by the World Bank and the United Nations. As Nobel Prize-winning economist Angus Deaton and co-author Valerie Kozel wrote in 2005: “Where Mahalanobis and India led, the rest of the world has followed, so that today, most countries have a recent household income or expenditure survey.” “Most countries,” they continued, “can only envy India in its statistical capacity”.

Economists TN Srinivasan, Rohini Somanathan, Pranab Bardhan and another Nobel laureate, Abhijit Banerjee, have since argued that there is “no other instance of an entirely homegrown institution in a developing country becoming a world leader in a large field of general interest”.

The rise of modern India’s data capacities is inseparable from the story of the enigmatic Mahalanobis (who was known as “the Professor”). What began as a chance encounter with a journal of statistics aboard a ship – when Mahalanobis was returning from Cambridge to Calcutta during World War One – resulted in the transformation of India’s data infrastructure in the decade after the country’s independence.

Drawing some of the leading statisticians and economists in the world to the lush Calcutta (now known as Kolkata) campus of the Indian Statistical Institute, Mahalanobis helped elevate it to an institution of international excellence. Together, they helped calculate India’s GDP, staffed the Central Statistical Organization, designed the National Sample Survey, and brought India its first-ever digital computers. Much of the institutional architecture they built holds up even today, and India marks Mahalanobis’ birth date, 29 June, as “Statistics Day”.

But today, Indian data appears to be in crisis. As the world surges into the era of “big data”, India risks being left behind. The Economist recently sounded the alarm, warning that the country’s “statistical infrastructure is crumbling”. Official figures on issues ranging from Covid mortality to education to poverty are all increasingly distrusted by independent observers and experts – which has alarming implications for policymaking and government accountability.

What makes this especially unfortunate is that India was once a trailblazer in this field. The country would do well to take pride in that inheritance and restore its lost lustre.