Guest post from José L. Marín
Mr. Marín holds degrees in Telecommunication Engineering and Business Administration from the University of Valladolid (Spain). He has developed his professional career at Gateway S.C.S. (owner of the brand “EUROALERT.NET”), where he is currently a partner, a member of the management board, and CEO. In this position, Mr. Marín supervises the huge challenge undertaken by Euroalert: to build a pan-European platform that aggregates data on EU public procurement, which represents about 18% of EU GDP, and to deliver commercial services for SMEs and organizations all over the world.
He currently takes part in the editorial coordination of the Pocket Innova book collection, which focuses on innovation, and is the author of the book “Web 2.0. Una descripción muy sencilla de los cambios que estamos viviendo”, published by Netbiblo (2010). He has been a speaker at events such as FICOD09, PSI Meeting 2010, Digital Agenda Assembly, Share PSI and SICARM, and at universities such as Oviedo, Almería and Girona, always with the aim of promoting the release of public sector data in open and reusable formats.
In conversations between members of the open data community, especially those responsible for providing data, one often overhears statements such as “it’s necessary to stimulate the demand for open data,” “we can’t reach the reusers,” or “it would be interesting if data providers and reusers talked more.” I am sure that you have heard such statements on many occasions.
Most probably, this uneasiness is not unknown to the IODC organizers, who must be aware that previous editions of the event focused mostly on what is usually called the “supply side,” that is, the public organizations in charge of keeping and publishing open datasets. In Spain, however, possibly because it is the Ministry of Industry that promotes open data policies, reuse companies have always been encouraged to have a strong presence at open data events. This will surely be noticeable in the program of the 4th IODC next October.
However, I would like to tell you a secret that could help explain why the long-awaited demand for open data apparently does not exist: it turns out that for reuse companies, it is often more productive to obtain data from the web than to use open data portals. Unfortunately, technologies for extracting data from documents have advanced much faster in recent years than the datasets available in portals.
Even though this is quite inefficient and we may not like it, in many sectors it is currently the only way for companies to generate value from data. In other sectors, where no data is published either in documents or in datasets, there is no demand to stimulate. Companies, especially small companies, survive on the value that they can create and sell today, not on future promises.
If you were a company, where would you put your resources? Into an open source library that improves a data-extraction algorithm for PDFs, or into circular arguments about the best way of opening data?
Speaking from the “demand side,” I would like IODC 2016 to be a turning point: not so much to define more standards, indexes, policies and laws, but to reach an agreement to publish more useful datasets.
If we actually aim to encourage innovation and the creation of value from open data, I suggest we flood portals with useful datasets. What could go wrong? In fact, much of this data is already inside documents published on the web, and a great deal of effort is being put into extracting and cleaning it when it could instead be put into creating value from data.