Studying Indian Politics with Large-scale Data: Indian Election Data 1961–Today

Authors: Francesca R. Jensenius, Gilles Verniers
Published date: 01 December 2017
Subject Matter: Notes on Methods
Studies in Indian Politics
5(2) 269–275
© 2017 Lokniti, Centre for the Study of Developing Societies
SAGE Publications
DOI: 10.1177/2321023017727984
Francesca R. Jensenius1
Gilles Verniers2

There has been a data revolution in the study of Indian politics in recent years. Indian politics has always
been a fascinating field, in comparative perspective and in its own right. Accounts of fieldwork in India
have challenged and informed discussions about democracy in the developing world for decades. Since
the 1990s, the National Election Studies conducted by Lokniti, CSDS, have enabled informed quantitative
analyses of political patterns across the country. More recently, technological innovations and the
increasing availability of online data sources have opened the door to new ways of studying Indian
politics quantitatively. Scholars can now create and merge a wide range of large-scale datasets, making
it possible to establish new empirical trends, fact-check commonly held political narratives and test
hypotheses developed in other democratic contexts.
In this research note, we first describe some techniques and tools used for creating and merging
large-scale datasets. Next, we introduce two datasets we have developed: constituency-level datasets of Indian
State Elections and General Elections from 1961 until today.3 We describe the process of creating these
datasets, the efforts involved in cleaning the data and how the data can be utilized. In conclusion, we
offer some reflections on the limitations of over-relying on quantitative data in research on Indian
politics. We hope to get more scholars and practitioners interested in using the publicly available datasets
developed by ourselves and others, and to inspire students and scholars to invest in the quantitative skills
needed to develop new quantitative datasets.
Note: This section is coordinated by Divya Vaid.
1 Senior Research Fellow, Norwegian Institute of International Affairs (NUPI), Oslo, Norway.
2 Assistant Professor of Political Science and Co-Director, Trivedi Centre for Political Data, Ashoka University, Rajiv Gandhi Education City, Sonipat, Haryana.
3 We wish to acknowledge Dr. Sudheendra Hangal’s contribution towards restructuring the dataset, and wish to thank research assistants at UC Berkeley as well as the research team at TCPD, Ashoka University, who helped with the tedious job of cleaning these data.
Corresponding author: Francesca R. Jensenius, Senior Research Fellow, Norwegian Institute of International Affairs (NUPI), C.J. Hambros plass 2D, 0033 Oslo, Norway.

Data Sources for the Study of Indian Politics
India has some extraordinary data sources for studying politics and society. The decennial censuses
provide large amounts of information about villages and towns, which can be aggregated to higher
administrative levels. Several large surveys, including the Indian Sample Surveys and the Indian National
Election Studies, are internationally acclaimed for their innovative design and sizable samples. Most
of India’s government agencies also collect impressive amounts of information about their work.
Of particular importance for political scientists is that the Election Commission of India (ECI) publishes
detailed information about all elections in India, and, since the early 2000s, on the socio-economic
backgrounds of election candidates.4
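To illustrate what aggregating village-level information to higher administrative levels can look like in practice, the following is a minimal Python sketch. It is not the procedure used for any dataset described in this note: the records, field names and figures are hypothetical, and the helper function `aggregate` is introduced here purely for illustration.

```python
from collections import defaultdict

# Hypothetical village-level records. The field names and figures are
# illustrative assumptions, not taken from any actual census table.
villages = [
    {"state": "A", "district": "D1", "village": "V1", "population": 1200, "literate": 600},
    {"state": "A", "district": "D1", "village": "V2", "population": 800, "literate": 500},
    {"state": "A", "district": "D2", "village": "V3", "population": 2000, "literate": 900},
    {"state": "B", "district": "D3", "village": "V4", "population": 500, "literate": 300},
]


def aggregate(records, keys, value_fields):
    """Sum value_fields over all records that share the same values for keys."""
    totals = defaultdict(lambda: dict.fromkeys(value_fields, 0))
    for rec in records:
        group = tuple(rec[k] for k in keys)
        for field in value_fields:
            totals[group][field] += rec[field]
    return dict(totals)


# Aggregate the same village records to two higher administrative levels.
district_totals = aggregate(villages, ("state", "district"), ("population", "literate"))
state_totals = aggregate(villages, ("state",), ("population", "literate"))

# Rates are recomputed from the summed counts rather than averaged
# across villages, so that larger villages carry proportionate weight.
state_literacy = {
    group: sums["literate"] / sums["population"]
    for group, sums in state_totals.items()
}
```

Summing counts first and deriving rates afterwards is a deliberate choice in this sketch: averaging village-level rates directly would give a village of 500 the same weight as one of 2,000.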
However, scholars find it difficult to use some of these data sources for statistical analysis. Many
sources are hard to come by, as they have been recorded and kept by single offices at the state or district
level. Census data prior to 1991 were published in book format, not in soft copy. Historical election
results are published as PDFs. Furthermore, public data in India are often available in inconsistent or
impractical formats, making it cumbersome (or near-impossible) to merge them or work with them.
The sheer scale of the data, and the time and effort required to digitize data manually from handwritten
notes, books, PDFs and scanned documents—or simply to...
