Open government data in India: an answer to India’s logjam

March 24, 2016 by Natasha Agarwal

At a recently concluded international open data conference, the international data community identified five priority areas for harnessing open data for sustainable development: 1. deliver shared principles for open data; 2. develop and adopt good practices and open standards for data publication; 3. build capacity to produce and use open data effectively; 4. strengthen open data innovation networks; and 5. adopt common measurement and evaluation tools.

With the launch of data.gov.in in October 2012, India embarked on its own journey of opening government data to the public. On the surface it appears to be a success: over 21,000 resources published, 5.6 million views and 2.24 million downloads.[1] Nevertheless, in a policy brief, I find that data.gov.in, which facilitates the dissemination of India’s open government data, has only just begun its journey: critical datasets continue to be missing from the portal, and the available ones are more often than not outdated, duplicated, incomplete, lacking in semantic interoperability, and inadequately referenced. Top-level metadata, such as the data collection methodology and a description of the variables, are also either missing or incomplete. Besides, there is no effective communication mechanism for querying and troubleshooting.

India was one of the pioneering countries in embracing the Open Government Data (OGD) initiative, with a robust and one-of-its-kind National Data Sharing and Accessibility Policy (NDSAP). Some of the challenges it faces in aligning itself with the international community, along with possible solutions, are:

Problem – Lack of clarity in NDSAP itself. For instance, the policy currently requires government agencies to segregate datasets into high- and low-value, and to upload as many high-value datasets as possible. This is in addition to publishing at least five high-value datasets on data.gov.in within three months of the notification of the policy.

Solution – It is essential that the government decommission the prioritization requirement. If, for legislative reasons, it is unable to do so, then government agencies should use analytics (such as Google Analytics) to identify and make available datasets that receive at least 100 unsuccessful requests. The government, for its part, can monitor the same analytics to ensure that the agencies deliver.
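The threshold rule above can be illustrated with a minimal sketch. It assumes that an agency can export, from whatever analytics tool it uses, a plain list of dataset names that users searched for or requested without success; the function name, the sample log and the normalization step are hypothetical and are not part of NDSAP or of any analytics product.

```python
from collections import Counter

# Threshold taken from the recommendation: at least 100 unsuccessful requests.
THRESHOLD = 100


def datasets_to_prioritize(failed_requests, threshold=THRESHOLD):
    """Return the names of datasets requested unsuccessfully at least `threshold` times.

    `failed_requests` is an assumed export: one dataset name per failed request.
    Names are lower-cased and stripped so trivial variants are counted together.
    """
    counts = Counter(name.strip().lower() for name in failed_requests)
    return sorted(name for name, n in counts.items() if n >= threshold)


# Hypothetical log of failed requests, for illustration only.
log = ["district gdp"] * 120 + ["village employment"] * 99 + ["District GDP "] * 5
print(datasets_to_prioritize(log))
```

In this illustrative log only the district GDP dataset crosses the threshold; the same count, run periodically, is what the government could monitor to check on the agencies.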

Problem – Inconsistency in the implementation of NDSAP. For instance, several datasets uploaded on data.gov.in are on performance indicators such as the Gross Domestic Product (GDP) or the employment levels of the country, a state, a district, a sub-district, a village or a town. Such data is available over time. However, these datasets operate in silos, and have no dimension that facilitates semantic interoperability.

Solution – One way to achieve consistency could be to integrate the e-Governance standards laid down in the National e-Governance Plan (NeGP) with NDSAP, and to make it mandatory for agencies to comply with those standards when uploading datasets on data.gov.in. Implementing the Metadata and Data Standards (MDDS) would then ensure that government agencies adopt practices that standardize data and metadata, such that the precise meaning of information is understood within a single government agency, across government agencies, and over time.
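As a rough illustration of what mandatory compliance could look like at upload time, the sketch below rejects a dataset whose top-level metadata is missing or blank. The field names are hypothetical and chosen only to mirror the gaps noted earlier (collection methodology, variable descriptions); they are not taken from MDDS, and the actual standard would define its own required elements.

```python
# Hypothetical required fields; NOT the MDDS schema.
REQUIRED_FIELDS = ("title", "collection_methodology", "variable_descriptions")


def is_blank(value):
    """Treat None, empty containers and whitespace-only strings as missing."""
    if isinstance(value, str):
        return not value.strip()
    return not value


def missing_metadata(record, required=REQUIRED_FIELDS):
    """Return the required metadata fields that are absent or blank in `record`."""
    return [field for field in required if is_blank(record.get(field))]


# Hypothetical upload: the methodology is blank and the variables are undescribed.
upload = {"title": "District-wise employment", "collection_methodology": "  "}
print(missing_metadata(upload))
```

An upload workflow could refuse to publish until this list is empty, which would make the completeness requirement enforceable rather than advisory.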

Problem – Inadequate interaction between the consumers of data (namely data users such as researchers, analysts and application developers, amongst others) and its suppliers (government agencies). This is largely because data suppliers are unaware of the economic benefits attached to opening government data, and data users are either unaware of the existence of data.gov.in or prefer to use the agencies’ web portals because of ease of operation and familiarity.

Solution – Strengthen the capacities of government agencies by encouraging private participation (by individual researchers, academic institutions, or data miners, for example) in the entire process from data collection to data dissemination. This could be achieved by developing internship programmes and consultation roles, or by maintaining a roster of private participants who could be brought in as and when the need arises. This would also help in developing a pool of resources for troubleshooting queries posted on the platform, and in building an accessible archive of resolved queries. In addition, it would reinforce the motivation of government agencies by cutting down on bureaucracy and encouraging them to innovate.

Problem – Inadequate infrastructure and capacity constraints that data suppliers face in implementing the dissemination of OGD.

Solution – Build an independent data infrastructure agency within India’s OGD initiative that balances physical and intellectual capacity and enables the centralization of dissemination activities. Such an agency would invest in computers with multiple, up-to-date statistical packages, seamless internet connectivity, a host of supercomputers for processing larger datasets, and an engineering team to troubleshoot computer problems, so as to sustain a functioning data infrastructure. It could be established both centrally and locally, the latter to meet local government requirements.

To conclude, the Indian government should commit to improving and sustaining the OGD initiative by taking into consideration the recommendations highlighted above. Further identification and rectification of similar barriers can help India in internationalizing its commitment to open data.

