
5. Towards a Theory of Value Generation through Open Data

5.1 Seven Dimensions of Liquid Open Data

Constructs are the foundation of theory. Theory can be viewed as a “system of constructs and variables in which the constructs are related to each other by propositions and the variables are related to each other by hypotheses” (Bacharach, 1998, 498). Just as constructs are the building blocks of strong theory, clear and accurate terms are the basis of strong constructs (Suddaby, 2010). Accordingly, in order to offer conceptual clarity, we must articulate the constructs of our proposed theory.

Providing open data (the supply-side view) has been proposed as being a matter of availability, accessibility, format and license (Davies, 2010). From the demand point of view, openness is proposed to combine unrestricted availability of data with accessibility and technical interoperability (Tammisto & Lindman, 2012). In practice-oriented literature, the term open data is interpreted in a variety of ways, as evidenced by the many different working definitions found online. The Open Knowledge Foundation defines open data as “data that can be freely used, re-used and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike” (OKF, 2015). However, this definition lacks references to the technical dimensions of open data. Alternatively, Berners-Lee's five stars of linked data specify a number of technical dimensions. However, the five stars are not really an open data definition, but rather a maturity model that focuses on how to gradually transform data into linked data. Linked data is a method of publishing structured data so that they can be interlinked and discovered through semantic queries.

An overview of multiple definitions of open data is presented in Verhulst et al. (2014). This overview shows that the currently used definitions usually highlight 2-4 dimensions, and no two definitions are based on exactly the same dimensions. Thus, I turned to my data to see whether they could reveal the most relevant dimensions. From my qualitative data from the Danish BDP, I could infer that members of the program had very different opinions on the most important attributes of data as a valuable resource. Some of the basic data were already free, and program participants working with these particular datasets focused mostly on improving the quality and coherence of the data. Other BDP members were working on data modeling, and yet others were responsible for developing a common license. A group of members and data users advocated making data available free-of-charge. Programmers I interviewed were focused on the technical challenges involved in developing APIs and web services. Other technical people were designing and building the technical open data infrastructure. I concluded that many of the challenges faced by program participants were directly related to the multi-dimensionality of the open data phenomenon.

For the quantitative studies, I had been using empirical data from the Open Data Barometer, which defines “truly open” data as data that are available online, in bulk, and under an explicit open license (Davies, 2013). Although considerable effort has been made to make diverse government data available to the public, less than one in ten public datasets reviewed in seventy-seven countries in 2013 could be classified as truly open according to the Open Data Barometer definition (Davies, 2013; Höchtl et al., 2014; Zuiderwijk & Janssen, 2014). Thus, the question remained: which dimensions are important for stimulating engagement with and use of open data? In order to corroborate my data against previous scientific research, I reviewed a number of academic articles that have suggested the main barriers and enablers encountered by several ODIs. The barriers that could be traced to attributes related to the data themselves can be broadly classified as follows (Conradie & Choenni, 2014; de Vries et al., 2011; Halonen, 2012; Janssen et al., 2012; Martin et al., 2014; Meyer-Schönberger & Zappia, 2011):

1) Availability issues (data only available to specific groups or not available at all).

2) Economic issues (excessively high prices limit use).

3) Legal issues (non-standardized licenses, unavailable licenses, laws that limit dissemination or potential use of data).

4) Usability issues (data are not published under open standards or in machine-readable formats, data quality lacking).

5) Discoverability issues (links to data buried in website hierarchies, no use of metadata to facilitate search, data not linked to other data, no central repository or portal).

6) Accessibility issues (lack of download possibilities, no bulk download, web services or open APIs, insecure access, non-standardized or unsustainable access opportunities).

7) Interoperability issues (data buried in silos, not published with compatible identifiers, no data models that explain syntax and semantics, or data models not openly published).

Thus, through triangulation of qualitative and quantitative data, as well as a synthesis of the dimensions found in the literature, I arrived at and defined a construct I call liquid open data. The extra word liquid is intended to indicate the importance of the technical dimensions for making open data more useful to external re-users. Truly liquid open data are defined as data that are available online, free-of-charge and under an open access license, published in machine-readable formats, easily discoverable, accessible and conceptually coherent. Liquid open data can be re-used without discrimination or limitation, linked to other data and streamed across systems.

Table 2 presents a framework consisting of seven dimensions of liquid open data.

Table 2: The Seven Dimensions of Liquid Open Data

| Dimension | Description |
|---|---|
| Strategic: Availability | Availability reflects the strategic importance of open data, ranging from all government data that are not subject to privacy or national security limitations being open to all by default, to government data in general not being available outside of organizational boundaries. |
| Economic: Affordability | Affordability is an economic dimension and refers to the pricing of data, ranging from data being completely free-of-charge to data being extremely expensive. |
| Legal: Reusability | Reusability is a legal dimension and depends on the type of license used for government data intended for reuse. The license can range from a type of Creative Commons license that allows anyone to use the data for whatever purpose they like, to very strict licenses that allow use for a single purpose only. |
| Technical: Usability | Usability is a technical dimension and refers to the clarity and ease with which we can interact with the data. If data are not usable, it is difficult to use them for purposes other than those originally intended. Data need to be of high quality and presented using standardized machine-readable data formats. |
| Technical: Discoverability | Discoverability is also a technical dimension and refers to whether potential users can easily discover the data and find information about the data. Highlights use of metadata. |
| Technical: Accessibility | Accessibility refers to whether data are easily, consistently and securely accessible and downloadable or streamable. Highlights use of open standards. |
| Technical/conceptual: Interoperability | Interoperability of data refers to data being conceptually open. Interoperable data are published in a way that enables the data to be used outside of the context within which they were collected. Interoperability of data depends both on structure (syntax) and meaning (semantics) and calls for use of unique identifiers for linkability. If data are interoperable, they are also “liquid” in the sense that they can stream across systems and easily be linked to other data. |

Openness as defined in the context of liquid open data is not a binary feature, where data are either open or closed in the sense of them being available or not. Rather, we can view openness as a continuous, multidimensional variable, ranging from data being closed over all dimensions, to data being truly liquid and open. Accordingly, data can have different degrees of openness, distributed across these dimensions, as explained in Table 2 and in more detail in Paper VI. All of the above dimensions will influence how users can and will engage with the data. Within each dimension, ODIs should balance the costs and the benefits related to supporting the desired value generating mechanisms (these are discussed in section 5.2). As a practical result and extension of this study, I am using this framework to design a methodology for use in ODIs that are interested in stimulating the desired impact from open data with the limited resources at hand. Figure 8 illustrates an example evaluation of three important datasets in Denmark over the seven dimensions. The evaluation is based on my own observations as well as on data from the Open Data Barometer in 2013 and 2014 and from the Open Data Index (Open Knowledge Foundation).
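The idea of openness as a continuous, multidimensional variable can be made concrete with a small data structure. The following Python sketch is only an illustration: the class name `OpennessProfile`, the 0.0 to 1.0 scale, the `threshold` parameter and the example values are all assumptions introduced here, and they are not the operationalization used in Paper VI or in Figure 8.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OpennessProfile:
    """Hypothetical openness profile of one dataset.

    Each of the seven dimensions is scored from 0.0 (closed) to 1.0
    (fully open). The scale is an assumption made for illustration.
    """

    availability: float
    affordability: float
    reusability: float
    usability: float
    discoverability: float
    accessibility: float
    interoperability: float

    def __post_init__(self):
        # Reject scores outside the assumed [0, 1] range.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name} must lie in [0, 1], got {value}")

    def weakest_dimensions(self):
        """Return the names of the dimension(s) with the lowest score."""
        lowest = min(getattr(self, f.name) for f in fields(self))
        return [f.name for f in fields(self) if getattr(self, f.name) == lowest]

    def is_truly_liquid(self, threshold=1.0):
        """True only if every dimension reaches the (assumed) threshold."""
        return all(getattr(self, f.name) >= threshold for f in fields(self))


# Made-up example: free and openly licensed data that are hard to link.
example = OpennessProfile(
    availability=1.0,
    affordability=1.0,
    reusability=1.0,
    usability=0.75,
    discoverability=0.5,
    accessibility=0.75,
    interoperability=0.25,
)

print(example.weakest_dimensions())  # ['interoperability']
print(example.is_truly_liquid())     # False
```

Keeping the seven scores separate, rather than collapsing them into a single number, mirrors the argument that a dataset can be open in the strategic, economic and legal sense while remaining far from liquid in the technical dimensions.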

Figure 8: Evaluation of Datasets based on the Seven Dimensions of Liquid Open Data

An important aspect of achieving liquid open data is the implementation of a technical open data infrastructure, which can be crucial for the last four dimensions of liquid open data. Implementation and governance of such an infrastructure was amongst the topics of discussion in Paper VI. As can be seen in Figure 8, open data originate from a variety of organizations, which I call data custodians. Accordingly, successful dissemination of liquid open data can rarely be achieved by an individual public organization in isolation. Collaboration between multiple stakeholders across the public and the private sectors is an important success factor for an ODI (Ubaldi, 2013). The choice of governance mechanisms can be a deciding factor for the success of such collaborations (Provan & Kenis, 2008). ODI governance must address potentially competing motives and ensure that individual data custodians have sufficient autonomy, without introducing too much governance-related overhead, as resources are generally scarce. Paper VI addresses the following four ODI governance tensions in more detail.

Tension 1: Simplicity vs. Comprehensiveness

Every ODI has to balance its ambitions for open data with the level of funding it receives. While disseminating high quality data that are liquid and open across all seven dimensions of liquid open data is a tempting idea, it is very difficult to achieve in reality. The approach chosen by the BDP was to focus on a limited number of key datasets and to develop comprehensive data modeling principles for these data. These data modeling principles are based on EU INSPIRE standards whenever possible and are general enough to be reused by other public sector organizations in their efforts to publish liquid open data. Interoperability of data across different domains can thus be improved, without including too many sets of data in the first round of data modeling and data publishing.

Tension 2: Autonomy vs. Control

Open government data do not originate from a single organization and can therefore be difficult to publish in a coherent manner. However, while each data custodian is collecting data for its own (regulatory) purposes, these data can undeniably be of much use to other organizations. The data custodians must have enough autonomy to fulfill their individual roles, while contributing towards a common goal of liquid open data, which means using a set of common standards. In order to achieve a balance between those two competing demands, the BDP chose a governance approach called System-of-Systems governance (see Paper VI for more detail). A System-of-Systems can be defined as a collaborative set of systems where the components are independent dedicated systems that are separately acquired and integrated to form a single system, yet maintain a continuous operational existence independent of the collaborative system (Rechtin & Maier, 2000). This style of governance seems to be well suited for a constellation of loosely coupled participants, although it demands a high level of network governance skills (Provan & Kenis, 2008).

Tension 3: Exploration vs. Exploitation

Promoting a common goal for publishing liquid open data is highly important. However, it is not likely that all public and private organizations share a common view on the most important value drivers of open data. While some may emphasize the importance of private sector innovation, others may be more interested in efficiency gains within the public sector. Consequently, these stakeholders may subscribe to different, perhaps competing implementation methods. The BDP case study indicates that these tensions might actually be resolved by maintaining a focus on all seven dimensions of liquid open data, which was possible in this case due to the limited number of datasets included in the program. The infrastructural features of open data have resulted in some unexpected synergies across value generating mechanisms, reflecting the serendipitous value generation opportunities offered by open data. Infrastructural resources are considered as shared means to many ends, which satisfy the following three criteria: 1) they are non-rivalrous, 2) social demand is driven primarily by downstream productive activities, 3) the resource can be used as an input for a wide range of purposes (the general-purpose criterion) (Frischmann, 2012).

Tension 4: Short-term Gains vs. Long-term Investment

While it is natural for any ODI to focus on low-hanging fruit, it is important not to do so at the expense of future, currently unknown applications of the data. Thus, it remains a challenge to find a balance between openly publishing a large number of datasets and ensuring that continued publication of these data is actually sustainable. The two case studies conducted as a part of this study revealed that users hesitate to engage with open data unless they are convinced of the sustainability of the data resource. The chosen strategy of the BDP was to finance their ODI upfront, by effectively transferring funds from organizations that are expected to benefit from the liquid open data to the data custodians, which are responsible for remodeling data and building the technical infrastructure. This ensured enough funding to reorganize data-collection and data modeling efforts across central and local government, which is a long-term investment. Simultaneously, individual data stewards had the means to publish their open data through new data services, making the data available for interested users, although not yet in a fully coherent manner.

These governance mechanisms are important for understanding how data can become open and liquid. Because they concern how data become liquid rather than what happens once they are, they fall beyond the scope of the conceptual model I present as the main contribution of this dissertation. The conceptual model focuses solely on the relationship between data that are already liquid and open to some degree or another, the mediating variables that are intended to reflect manifestations of underlying generative mechanisms, and the resulting impacts, conceptualized as sustainable value.