A guest post from Yasodara Cordova, from W3C Brazil, on the Data on the Web Best Practices, and the need to link together policy and technology conversations.
The activists who advocate for Open Data, and the Governments that are involved through initiatives like the Open Government Partnership, sometimes seem to be dancing out of step. Maybe this is because we have not reached the tipping point in the open data ecosystem where data is generating both advances in transparency and a chain of large-scale business innovations and economic growth. Researchers looking at what can stimulate open data ecosystems seek to discover methods and processes that will lead data publishers to provide resources that meet the needs of stakeholders: from developers and businesses, to nonprofit institutions, and even individuals, each of whom has specific demands of data.
On the other hand, there have been many successful efforts towards the opening of data. Governments have opened their data even if they have not yet adopted common standards or followed general guidelines for data publication, proving that there is momentum behind the opening up of data and a need to show results fast on transparency. But now that more data is in the open, maybe now is the time to start using standards in order to move beyond simple publication and to reach the velocity of data re-use that everybody involved in transparency and open data activism expects and is aiming for.
Although there are no foregone conclusions to the question of how to accelerate open data ecosystems, there is a set of clear hypotheses surrounding open data re-use. In Brazil, for example, the method of using hackathons as “appetizers” has been widely applied in an attempt to show the possible benefits of using open data: its ability to bring transparency and to increase the number of apps on offer for citizens to use.
In the last year alone there were more than 10 hackathons and challenges around open data in Brazil. Many, like the Open Data Day Brazil held at the Calango Hacker Club in Brazil’s capital, had impressive results. But more than apps, the work of the civic hackers brought up discussions about best practices for the publication of Open Data. For example, the W3C Brazil office launched a challenge involving data from the Ministry of Justice, in partnership with the ministry itself, and the most important outcome of the process was not the apps created, but a GitHub-based discussion around the quality of the data. Through this conversation, developers were able to clean the data and bring it up to key standards, whilst also sharing the results of their work as a foundation for others to build upon. These examples, amongst others, point to the importance of offering data using international standards in order to enable greater data re-use, and to meet not just the data publication objectives, but also the data use objectives, of initiatives like the OGP. Working towards the adoption of standards and best practices around open data is something that we, as a community, need to focus on.
Based on this premise, W3C launched a Working Group in 2013 to work on Best Practices for Data on the Web. Since then, the group has explored many data publishing challenges based around a set of use cases collected during the first phase of the working group.
These use cases were important for identifying and selecting the priority challenges for effective publication of data on the web. These challenges are described in the picture below, and each is connected with particular technical aspects. In response to each challenge, the working group has put forward best practices that are still in development and open for discussion in the W3C forums.
The first draft of the Best Practices Document has a rough translation to Portuguese that can be accessed here. The work of the CSV on the Web Working Group is also related to supporting the publication and use of Open Data on the Web.
The frontier between Public Data and Open Data: do we have to address that?
Although the recommendations made by the DWBP group focus on technical aspects of open data, it’s important to note that political issues are always present and tightly connected with best practices. Discussions over data licensing, privacy and security, for example, have an important place at the policy table, and can’t be dismissed or set aside as purely technical issues when talking in fora like the Open Government Partnership.
Thus, to develop best practices effectively we need to formulate many questions:
- Where are the frontiers between technical actions and policy responsibilities?
- Can we navigate these blurred lines by establishing technical guidelines written in bills and laws?
- How far are best practices for data on the web an open data issue, and how far do they also relate to the growing field of data mining from public data, where new techniques are being used to detect and address questions concerning resource management in cities and other problematic scenarios, such as the spread of diseases? Do these represent fields where data best practices need to be discussed in fora like the OGP?
Standardising technologies and methods for sharing data has a vital role in triggering the whole cycle of Open Data, because it leads to greater interoperability and reusability of data. Datasets that are machine readable are ready to be used by developers and consumed by services and applications in a faster and more efficient way. Also, the methods adopted by agencies that work with public data, and the methods used to release this data, can be addressed as issues to be solved, which can be an important step to unlock the power of Open Data as a tool for transparency and accountability.
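To make the point about machine readability concrete, here is a minimal sketch in Python of how an application might consume a dataset published as CSV together with a small descriptive metadata record. Everything in it is an assumption made for illustration: the dataset, its column names, its values, the metadata fields and the helper functions `load_dataset` and `total_by_municipality` are hypothetical and do not come from any real publication or from the Best Practices document itself.

```python
import csv
import io
import json

# Hypothetical dataset: column names and values are invented for illustration.
CSV_TEXT = """municipality,year,complaints
Brasilia,2014,120
Recife,2014,85
Brasilia,2015,140
"""

# Hypothetical metadata a publisher might release alongside the data.
METADATA_JSON = json.dumps({
    "title": "Consumer complaints (sample)",
    "license": "CC-BY-4.0",
    "columns": ["municipality", "year", "complaints"],
})


def load_dataset(csv_text, metadata_json):
    """Parse the CSV and check its header against the declared columns."""
    metadata = json.loads(metadata_json)
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != metadata["columns"]:
        raise ValueError("CSV header does not match the declared columns")
    rows = [
        {
            "municipality": record["municipality"],
            "year": int(record["year"]),
            "complaints": int(record["complaints"]),
        }
        for record in reader
    ]
    return metadata, rows


def total_by_municipality(rows):
    """Sum the complaints column per municipality."""
    totals = {}
    for row in rows:
        name = row["municipality"]
        totals[name] = totals.get(name, 0) + row["complaints"]
    return totals


metadata, rows = load_dataset(CSV_TEXT, METADATA_JSON)
totals = total_by_municipality(rows)
print(metadata["title"], metadata["license"], totals)
```

Because the header and the license are declared in a form a program can read, a developer can check the structure and the terms of re-use before doing any analysis, with no manual cleaning step. The same data released as a scanned document or a formatted spreadsheet would need that cleaning first.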
The big question is how we set, and continue to develop, these best practices, and how we get both technology and policy groups involved in a shared debate.