4th IODC

Blog IODC 2016
Madrid. October 6-7, 2016


From digitalization to open data: challenging memory institutions

Ana Alvarez is Web Manager at the Museo Thyssen-Bornemisza in Madrid. She has a BA in Medieval History (Complutense University, Spain) and a Master’s in Museum Studies (Leicester University, UK). Her professional career has developed around digital content in the cultural sector, covering digitisation, websites and information management, not only at museums. She has been a consultant on ICT projects related to cultural heritage and on a digital educational content project for the Spanish Education repository at Red.es (Spanish Ministry of Industry), and Spanish Representative at the European NRG (National Representatives Group on Digitisation in Culture), which was the seed of the European Commission’s Digital Culture projects such as Europeana. Back in the museum world, she is involved not only in web management but also in strategic digital curation issues. She is a member of the Board of SEDIC, where she is involved in promoting training and dissemination initiatives for information professionals.

César Iglesias is a partner at Kuroshiro and an attorney and consultant with over 13 years’ working experience in the fields of Information Technologies (IT) and Intellectual Property (IP), both as in-house and external counsel and as a consultant for private companies and Public Administrations. He has degrees in both Law (ICADE) and Economics (UNED) and an International Executive Master in Business Administration (IE). He also holds a CISA (Certified Information Systems Auditor) certificate. He is a member of Madrid’s Bar Association, the Information Systems Audit and Control Association (ISACA) and the Spanish Association for the Study and Teaching of Copyright (ASEDA), where he is the General Secretary. In the public sector, César Iglesias has acted as legal counsel for Red.es in the development of Electronic Administration Services and as an independent expert for the European Commission in projects for the development of Europeana and the Open Data initiative. In the private sector, César Iglesias, acting as an independent attorney, has provided legal advice on compliance and personal data protection regulations for many years. César Iglesias has also acted as Legal Manager for a number of companies, including the ‘Sociedad General de Autores’ (SGAE) and, currently, ISDE. He has written numerous articles and books on IT Law and Copyright. He lectures regularly on a number of postgraduate courses and is a regular speaker at seminars and conferences.

At the end of 2015, the law that establishes the rules for the re-use of public sector information in Spain was amended to make it compulsory for public libraries, museums and archives to make their data reusable. This change followed the 2013 amendment of Directive 2003/98/EC on the re-use of public sector information, also known as the “PSI Directive”.

The law has raised several questions: Is a work of art such as a sculpture or a book “data”? What is considered “data”? How and when should memory institutions publish their data? Is Open Data synonymous with data reuse?

First step: Digitisation

Online cataloguing and digitisation are the current basis for asset management in cultural institutions. For many years, and with a huge investment of resources, these institutions have made an enormous effort to reorganise their data in order to share it, at first internally and nowadays mostly through their websites or through services and digital content projects (retrieval tools, collaborative projects with other institutions, mobile apps, interactive galleries, audiovisuals, etc.). Progress has not been even among memory institutions: libraries have been in the vanguard, partly due to a more extensive standardisation of their data sets. Archives and museums, due to the particularities of their collections, started standardising later, but the work is already under way and they are catching up.

Reuse of public sector information and Open Data

“Reuse of public sector information” is a concept derived from the intuition that data already collected by public sector bodies, and already paid for by the taxpayer, can have a “second life” in the private sector.

Meteorological data is the classic example. It is too costly to collect for most of the institutions and enterprises that are in a position to deliver it to interested people (such as TV companies). Sharing this information with the private sector makes it accessible to more people (through open TV channels, for example), delivers a better public service (a hurricane warning) and allows the creation of added value.

The reuse of public sector information aims not only to foster transparency in the public sector, which is clearly a social and political goal, but also to fully develop the economic potential of that information. This potential is realised, for example, by facilitating the development of new products, services and solutions, and the creation of jobs, in the digital content industry, where content is the major driver of growth. This pragmatic approach is sometimes not well understood or shared by the cultural sector, which takes a more altruistic point of view[1].

The evolution from “reuse of public sector information” to “Open Data”, i.e. automated reuse of public sector information, has been possible thanks to the Semantic Web technologies developed by the W3C (World Wide Web Consortium) since the nineties. Linked Data is a method of publishing structured data so that it can be interlinked and become more useful through semantic queries. The application of standard web technologies such as HTTP, RDF and URIs to the cultural field, or LAM (Libraries, Archives and Museums), is known as LODLAM, or Linked Open Data for memory institutions. Its adoption in projects like Europeana has made it possible not only to link the metadata of different collections but also to have the results reused immediately by third parties[2].
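
As a rough illustration of what “structured data identified by URIs” can look like, the sketch below describes one artwork as RDF-style triples and writes them out in N-Triples syntax. It is a minimal sketch, not the way Europeana or any particular museum publishes its data: the example.org URIs, the artwork and the helper names (uri, literal, to_ntriples, objects_of) are all hypothetical. Only the Dublin Core Terms and OWL namespaces are real vocabularies.

```python
# Minimal Linked Data sketch using only the standard library.
# ASSUMPTION: every example.org URI below is a placeholder, not a real identifier.

DCTERMS = "http://purl.org/dc/terms/"                 # Dublin Core Terms namespace
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # links two URIs for the same thing


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, lang=None):
    """Quote a plain-text value, optionally with a language tag."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def objects_of(triples, subject, predicate):
    """Tiny stand-in for a semantic query: all objects for a subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


artwork = uri("http://example.org/museum/artwork/123")  # hypothetical museum record

triples = [
    (artwork, uri(DCTERMS + "title"), literal("Portrait of a Lady", "en")),
    (artwork, uri(DCTERMS + "creator"), uri("http://example.org/museum/artist/45")),
    # Interlinking: the same object as described by a (hypothetical) aggregator.
    (artwork, uri(OWL_SAME_AS), uri("http://example.org/aggregator/item/abc")),
]

print(to_ntriples(triples))
print(objects_of(triples, artwork, uri(OWL_SAME_AS)))
```

The third statement is the one that makes the data “linked”: because both records are named by URIs, a third party can follow the link and merge the two descriptions without asking either institution.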

We may now come back to the questions arising from the approval of the new Spanish law regarding the reuse of data from LAM.

What is understood by data regarding cultural assets?

Data comprises not only the digital depiction of the work of art or book, or the information made available through a website or catalogue, but also the metadata. Therefore, cultural institutions need to prepare their digital information not only to provide content but also to allow third parties to access and reuse their metadata and the raw data derived from the digitisation of texts, images, audio and video.

How to publish open data?

Linked Data representation is the first, technical, step towards open data. The second, legal, step is the adoption of a licence for the reuse of the data. Whether a linked data project is open or not is determined by the nature of the licence that the project uses.

Following the 5-star deployment scheme for Open Data, developed by Tim Berners-Lee,[3] an open licence is the basis for the first level of openness, but automatic reuse, which is necessary for Open Data, can only be achieved from the third level upwards. This scheme measures how well the data is integrated into the web, but not other aspects such as ease of reuse, which will be the subject of the Open Data certificates that are currently being developed.
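
The cumulative nature of the 5-star scheme can be sketched in a few lines of code. This is only an illustration under stated assumptions: the Dataset class, its field names and the sample data sets are hypothetical, and the five criteria are a paraphrase of the levels described at 5stardata.info (open licence on the web, structured data, non-proprietary format, URIs, links to other data). The threshold of three stars for automatic reuse follows the statement above.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published data set (field names are assumptions)."""
    open_licence_on_web: bool     # 1 star: on the web, in any format, under an open licence
    structured: bool              # 2 stars: machine-readable structured data
    non_proprietary_format: bool  # 3 stars: open format such as CSV
    uses_uris: bool               # 4 stars: things are identified by URIs
    linked_to_other_data: bool    # 5 stars: links out to other data sets


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: a level only counts if every lower level is met."""
    levels = [
        ds.open_licence_on_web,
        ds.structured,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for met in levels:
        if not met:
            break
        stars += 1
    return stars


def supports_automatic_reuse(ds: Dataset) -> bool:
    """Per the text above, automatic reuse needs the third level or higher."""
    return star_rating(ds) >= 3


# Hypothetical examples of what a memory institution might publish.
scanned_catalogue = Dataset(True, False, False, False, False)
csv_export = Dataset(True, True, True, False, False)
linked_collection = Dataset(True, True, True, True, True)

for name, ds in [
    ("scanned catalogue", scanned_catalogue),
    ("CSV export", csv_export),
    ("linked collection", linked_collection),
]:
    print(name, star_rating(ds), supports_automatic_reuse(ds))
```

Note that a data set without an open licence scores zero in this sketch however well structured it is, which mirrors the point that the licence, not the technology, decides whether the data is open.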

Besides, the choice of open licences to enable reuse is not a minor issue, as it has a major effect on the quantity and quality of data reuse. In the cultural field there is no consensus on which licence is the most suitable. Creative Commons’ CC0 and Public Domain licences are clearly considered “open” because they waive the exercise of any copyright over the data. Licences that require the user to acknowledge the author and to use the data only for non-commercial purposes, when applied to data sets, may raise doubts about the “openness” of the project. It should be noted that national legislation may limit the licences that may be used by the public sector.



Is Open Data synonymous with data reuse?

Open Data goes beyond the minimum legal requirements for public sector data reuse.

However, it allows institutions to develop reuse strategies in order to:

  1. Maximise the reduction in the cost of handling reuse requests.
  2. Increase the visibility of the institution.
  3. Improve the quantity and quality of the content provided on the institution’s website.
  4. Fulfil the institution’s public duty of making its content available to the public.
  5. Improve the internal management of the institution.
  6. Develop research and knowledge on the works held by the institution.

These issues regarding Open Cultural Data will be discussed at a pre-event to the International Open Data Conference 2016. The pre-event will be held at the Spanish National Library, and it is being organised by the Spanish Ministries of Culture and Industry and SEDIC (professional association of Information and Scientific Documentation).

A full version of this post in Spanish can be found on the SEDIC blog.


[1] Henninger, Maureen. The Value and Challenges of Public Sector Information. Cosmopolitan Civil Societies: An Interdisciplinary Journal. 2013, Vol. 5, Issue 3, pp. 75-95. Available at: https://epress.lib.uts.edu.au/journals/index.php/mcs/article/download/3429/3851 (accessed 1/08/2016).

[2] http://labs.europeana.eu/api/linked-open-data-introduction

[3] http://5stardata.info/


Cover photo by James Kemp
