A guest post from CTIC exploring work towards developing common open data standards. CTIC has been involved in many of the most important open data projects in Spain over recent years, from the implementation of the national catalogue and multiple initiatives at local level to working as a partner in the SharePSI2.0 project, which brings together 40 partners from 25 countries with the aim of harmonizing open data in Europe and identifying good practices.
In the early days of the open data movement, the call was for “raw data now”, but more recently we’ve been learning to be more strategic. To paraphrase the campaign led by Tim Berners-Lee on creating a ‘Web we Want’, it’s time to talk about the “Open Data we Want” and how it should be published using common standards.
The growth of open data
Since the “Big Bang” that occurred with the launch of Data.gov in 2009, we have seen an expanding Open Data universe, with an almost exponential growth in terms of Open Data initiatives and portals throughout the world. In Europe, we have seen a strong push from the European Commission through the development of the legal frameworks for open data and the launch of projects for the development and dissemination of Open Data, such as:
- the ePSIPlatform.eu;
- the LAPSI thematic network (Legal Aspects of Public Sector Information) 2.0; and
- the planned Pan-European open data catalogue.
In Spain, this process began in 2010 with the publication of catalogues in the Basque Country and in the city of Zaragoza, followed in 2011 by the publication of the National Catalogue datos.gob.es. Since then the process has grown to include more than 110 open data initiatives at all levels of government: national, regional, and local (although some initiatives have fallen by the wayside).
Besides the momentum that the government of Spain has provided for open data, a key driver of the success of the national catalogue has been the process of federating catalogues. Through this, data from more than 75 catalogues in Spain are brought together representing about 60% of the entire national catalogue. This is enabled by the catalogue description standard DCAT, developed by the W3C.
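To make the idea of federation concrete, here is a minimal sketch of how a national catalogue might absorb dataset descriptions from regional and local catalogues. It is an illustration only, not how datos.gob.es is implemented: the `Dataset` fields are loosely modelled on DCAT and Dublin Core properties (`dct:identifier`, `dct:title`, `dct:publisher`), the `federate` and `federated_share` helpers are hypothetical, and the sample records are invented. A real harvester would read the RDF that each catalogue publishes rather than in-memory objects.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    identifier: str  # plays the role of dct:identifier
    title: str       # plays the role of dct:title
    publisher: str   # plays the role of dct:publisher


@dataclass
class Catalog:
    title: str
    datasets: list[Dataset] = field(default_factory=list)


def federate(national: Catalog, sources: list[Catalog]) -> Catalog:
    """Return a new catalogue with the national datasets plus every dataset
    harvested from the source catalogues, skipping duplicate identifiers."""
    seen = {d.identifier for d in national.datasets}
    merged = list(national.datasets)
    for source in sources:
        for dataset in source.datasets:
            if dataset.identifier not in seen:
                seen.add(dataset.identifier)
                merged.append(dataset)
    return Catalog(title=national.title, datasets=merged)


def federated_share(national: Catalog, federated: Catalog) -> float:
    """Fraction of the federated catalogue that came from harvested sources."""
    if not federated.datasets:
        return 0.0
    harvested = len(federated.datasets) - len(national.datasets)
    return harvested / len(federated.datasets)


# Invented sample data, for illustration only.
national = Catalog("National", [
    Dataset("es-001", "State budget", "National government"),
    Dataset("es-002", "Public contracts", "National government"),
])
city = Catalog("City", [
    Dataset("city-001", "Bus stops", "City council"),
    Dataset("city-002", "Air quality", "City council"),
])
region = Catalog("Region", [
    Dataset("reg-001", "Census", "Regional government"),
    Dataset("city-001", "Bus stops", "City council"),  # duplicate, skipped
])

federated = federate(national, [city, region])
print(len(federated.datasets), federated_share(national, federated))
```

Because every source describes its datasets with the same properties, the merge needs no per-catalogue logic. That is the practical benefit of a shared description standard.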
This progress was measured in the latest Estudio de caracterización del Sector infomediario en España (2014), a report that takes into account, among other factors, the economic impact of the sector, employment and open data.
Considering all of this, it would seem easy to call the Open Data initiative a resounding success. However, if we remember the expectations raised initially, it is clear that although we have made progress, we have not yet generated the full value that open data promised.
Standardizing key datasets
One of the main obstacles to the reuse of data is the lack of standardization. The same datasets provided by the city of Madrid should be easily discoverable in Barcelona, Paris, Amsterdam or Ottawa. Standardizing this data would allow an application created by an entrepreneur in Spain to work with it, potentially, anywhere in the world.
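A small sketch may help show what the lack of a shared schema costs a reuser. The field names below (`name`, `capacity`, `free_spaces`) and both sample cities are invented for illustration and are not taken from any published vocabulary. The point is only that a publisher following the common schema needs no extra work, while every other publisher needs its own hand-written mapping.

```python
from __future__ import annotations

# Hypothetical common schema for a car-park dataset.
COMMON_FIELDS = ("name", "capacity", "free_spaces")


def to_common(record: dict, mapping: dict[str, str]) -> dict:
    """Translate one publisher-specific record into the common schema.
    `mapping` maps each common field name to the publisher's own name."""
    return {common: record[mapping.get(common, common)] for common in COMMON_FIELDS}


def total_free_spaces(car_parks: list[dict]) -> int:
    """Application logic written once against the common schema."""
    return sum(park["free_spaces"] for park in car_parks)


# City A already publishes in the common schema: usable as-is.
city_a = [{"name": "Centro", "capacity": 200, "free_spaces": 35}]

# City B uses its own field names, so the reuser must maintain a mapping.
city_b_raw = [{"nombre": "Norte", "plazas": 150, "libres": 12}]
CITY_B_MAPPING = {"name": "nombre", "capacity": "plazas", "free_spaces": "libres"}
city_b = [to_common(record, CITY_B_MAPPING) for record in city_b_raw]

print(total_free_spaces(city_a + city_b))
```

With ten cities and no standard, the reuser maintains ten mappings. With a standard, the application code is the only thing to write.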
The process of standardization requires a continuous and on-going dialogue between those who produce data and those who use it. Practical standardization should focus on the most popular datasets, in order to deliver the maximum benefit.
In Spain, this process began in early 2015 with the publication of the Spanish standard UNE 178 301:2015, created by the standardization group Smartcities. It defines a set of indicators divided into 5 axes (political, organizational, technical, legal and economic) as well as a metric for assessing the maturity of open data initiatives in cities. Even more interestingly, it defines 10 datasets, along with their corresponding vocabularies, that governments should publish. In doing so, the recommendation paves the way towards standardization, at least for cities.
The ten datasets and their schemas are listed here:
- Leisure and Culture. Culture Agenda
- Demography. Population (Census): Using The Data Cube Vocabulary and vocabularies for Age, Gender and Geography.
- Environment. Air quality
- Public sector. Contracts
- Public sector. Initial budget and execution
- Transport. Public car parking (link forthcoming)
- Transport. Regular Bus timetables, lines, stops, fares…
- Transport. State of Traffic
- Tourism. Places of touristic interest
- Town Planning. Street guide
Over time, this process must cover a greater number of datasets and recommended vocabularies, and must extend to other levels of government in order to improve the coordination and standardization of public catalogues. That would mean a real improvement in rates of reuse and in the economic value produced.
How can we encourage more global collaboration around this standard setting? Or is standardisation always a national task?