2 Background and related work


Figure 1: The concept of SOA [36].

indicate exceptional situations, so during the implementation of a CWS, special procedures (compensation or fault handlers) to handle them should be designed. In the literature [3, 4, 9, 10, 13, 14, 15, 22, 25] there are works that address the problem of how to implement replication techniques in web services, and thus how to make web services more reliable. The number of methods dedicated to the specific challenges of CWS is, however, rather small, and these methods are often based on assumptions (e.g. reduced autonomy) that are not consistent with the goals of composite web services.

The work presented in this paper aims to preserve the autonomy of web services used in CWS, which means that only the model of the CWS and the results obtained during its executions are taken into account.

The presented work is based on the concept of analyzing the workflow of the CWS in order to determine alternatives, which enable reliable execution even if used components encounter failures. Searching for those alternatives is performed before the execution of the CWS, and the choice of an appropriate alternative to execute is made before each subsequent execution of the CWS.

In this paper colored Petri Nets are used, which offer a modeling language for the design, specification and simulation of systems [23]. Colored Petri Nets have a graphical representation as well as formal mathematical foundations, and they offer many formal verification methods [23]. One of them, namely the analysis of an occurrence graph [24], is used in this work. It is the basis for inferring the reachability of a success state of the CWS from states that represent failures. This approach does not require knowledge of or control over the interacting WS, and thus it ensures that their autonomy is not limited in the CWS.
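The reachability question described above can be illustrated with a minimal sketch. It assumes that the occurrence graph has already been generated and is available as a plain adjacency mapping; the state names and the helper `can_reach` are hypothetical and are not taken from the paper or from any Petri net tool.

```python
from collections import deque


def can_reach(graph, start, goal):
    """Breadth-first search: is `goal` reachable from `start`?"""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for successor in graph.get(state, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return False


# Hypothetical occurrence graph of a small CWS (illustration only).
occurrence_graph = {
    "start": ["ws_a_ok", "ws_a_failed"],
    "ws_a_ok": ["success"],
    "ws_a_failed": ["ws_b_ok", "ws_b_failed"],
    "ws_b_ok": ["success"],
    "ws_b_failed": [],
    "success": [],
}

# For each failure state, check whether the success state is still reachable.
recoverable = {
    state: can_reach(occurrence_graph, state, "success")
    for state in ("ws_a_failed", "ws_b_failed")
}
```

In this toy graph a failure of the first service still leaves a path to the success state, whereas a failure of the alternative service does not.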

The remainder of the paper is organized as follows. Section 2 presents the definitions of web services, followed by a review of solutions for fault and failure management in web services. Section 3 describes the proposed solution, and sections 4 and 5 present modeling and analysis details. The execution is described in section 6. The paper ends with conclusions and an outlook on future work.

In this section we present the definitions of web services, as well as ways in which we can compose them. Then we define faults and failures and we review the approaches proposed in the literature to deal with them in web services.

2.1 Atomic and composite WS

WS are currently the most popular implementation of the Service Oriented Architecture (SOA) design and integration approach. The basic principle that underlies SOA is presented in Figure 1. According to this principle, an abstract description of a service, understood as software functionality, is published in a registry (discovery facility). There it can be found by a requestor that wants to use it. After finding the description, the requestor binds to the service, that is, it obtains enough information to connect to the service over a network [36].
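The publish, find and bind steps can be sketched in a few lines. This is only an in-memory illustration under simplifying assumptions: the class `Registry`, the function `bind`, the `ENDPOINTS` table standing in for the network, and the example address are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ServiceDescription:
    name: str
    endpoint: str  # hypothetical network address of the service


class Registry:
    """Discovery facility: stores published service descriptions."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        self._entries[description.name] = description

    def find(self, name: str) -> Optional[ServiceDescription]:
        return self._entries.get(name)


# Stand-in for the network: maps an endpoint to a callable service.
ENDPOINTS: Dict[str, Callable[[str], str]] = {
    "http://example.org/echo": lambda message: f"echo: {message}",
}


def bind(description: ServiceDescription) -> Callable[[str], str]:
    """Resolve a description to something the requestor can invoke."""
    return ENDPOINTS[description.endpoint]


registry = Registry()
registry.publish(ServiceDescription("echo", "http://example.org/echo"))  # provider
found = registry.find("echo")                                            # requestor
service = bind(found)
result = service("hello")
```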

The broad idea of the Service Oriented Architecture is adopted by WS. The notion of WS therefore refers to systems that are built from several networked modules, but it also refers to a set of standards that supports the implementation of such applications [38]. In the first sense a web service can be described as an interface that collects operations accessible over the network, where access is possible by sending standard XML messages [21]. In the second sense the web service is treated as the WSDL (Web Service Description Language [34]) description of a group of operations, which are invoked over a network using SOAP (Simple Object Access Protocol [33]) messages. These operations can be published in a registry with UDDI (Universal Description, Discovery and Integration [28]).

The main advantage of web services is their interoperability, which means that they allow communication and cooperation between software components that are implemented in different programming languages or deployed on different platforms. The web services technology achieves this by using the already mentioned set of standards as well as by relying on XML-based artifacts for describing, publishing and invoking activities. Choosing XML as the underlying language is an important element of ensuring interoperability, because XML is machine and platform independent.

The next important attribute of web services is their composability (aggregation). Web services can be aggregated (orchestrated or choreographed) to work together, in order to provide more complex functionality. In most cases the goal of such a composition is to model a business process, like supply-production chains or planning services. There are two aspects in which a composition of web services can represent a business process: orchestration and choreography [29]. Orchestration is an executable business process that interacts with other web services, and is controlled by one party. Choreography is more collaborative, and it allows involved parties to define their role in the interaction, so it tracks sequences of exchanged messages.

For both types of modeling the BPEL (Business Process Execution Language) specification, which supports both abstract and executable business processes [26], can be used. Other specifications that serve the same purpose are the Business Process Management Language (BPML) together with the Web Services Choreography Interface (WSCI) [29]. There are thus different ways in which web services can be aggregated and can recursively build other web services.

2.2 Faults and failures

A software failure is a result returned by a program that is not expected and deviates from specified requirements [27]. A failure occurs when an application is executed in particular conditions, and is caused by a fault, which is understood as a defect in an implementation [27]. To deal with faults, there are two general strategies: fault prevention and fault tolerance [2]. In the first strategy efforts are made to avoid or to remove faults existing in an implementation by testing and analyzing the code. It is based on the assumption that it is possible to predict all use cases of a program and all conditions of its execution. Such an assumption is however invalid if a software system consists of a larger number of components or modules. Then the way to deal with faults is fault tolerance, which assumes that faults are present in software. This inherent presence of faults in applications may result in failures, so the responsibility of fault tolerant systems is to provide means to cope with encountered failures [2].

If components that are able to perform computations to achieve a goal of a software system are spread over a network, the system is distributed. These components may be called processes [19] or servers providing services [12]. The services are simply collections of operations, which are executed after receiving defined inputs. There are many methods to specify the processes or services; however, they must at least identify a set of inputs and corresponding outputs. If, after receiving an input, a server's output or state differs from the service's specification, a failure occurs. We can distinguish four main categories of failures in distributed systems [12]:

- omission failure - if a server omits a response,

- timing failure - if a response comes outside of a specified time interval (if a response is received after this interval, the failure is called a performance failure),

- response failure - if a response comes on time, but its value is wrong or a server’s state after the response is incorrect,

- crash failure - if a server fails to produce outputs for several subsequent inputs and it must be restarted.
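From the point of view of a client, the four categories can be told apart roughly as in the following sketch. It is an illustration under stated assumptions, not a definition from [12]: the function `classify`, its parameters and the `crash_threshold` of three consecutive omissions are hypothetical, and only the response value is checked, not the server's internal state.

```python
from enum import Enum


class Failure(Enum):
    OMISSION = "omission"
    TIMING = "timing"
    RESPONSE = "response"
    CRASH = "crash"


def classify(response, expected, elapsed, window,
             consecutive_omissions=0, crash_threshold=3):
    """Classify one observed interaction; return None if it is correct.

    response: value returned by the server, or None if nothing arrived
    window:   (earliest, latest) allowed response time
    consecutive_omissions: omissions observed directly before this call
    crash_threshold: assumed number of omissions treated as a crash
    """
    earliest, latest = window
    if response is None:
        if consecutive_omissions + 1 >= crash_threshold:
            return Failure.CRASH
        return Failure.OMISSION
    if not (earliest <= elapsed <= latest):
        return Failure.TIMING
    if response != expected:
        return Failure.RESPONSE
    return None
```

For example, `classify(41, 42, 0.5, (0.0, 1.0))` reports a response failure, because the value arrives on time but is wrong.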

2.3 Overcoming faults and failures in web services

In the following sections we present the solutions to deal with faults and failures in web services. First we show the methods used in standards (like WSDL or BPEL), and then other solutions present in the literature.

2.3.1 Standards

The current set of standards, which provides means for describing, executing and composing web services, addresses failures through fault tolerance based on the exception handling concept.

The WSDL standard [34], which specifies a web service's description, allows declaring, for each operation available in the web service, a special fault message with its name and content. If the web service's transport protocol is SOAP [33], then at execution time the description of the fault message is transformed into a special type of message, namely a SOAP fault. It contains information about the general type of the fault, its name and the details specified in the WSDL description. A SOAP fault can be mapped to an exception in a programming language (for example in Java) and handled there [33].
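The mapping of a SOAP fault to an exception can be sketched as follows. The sketch assumes a SOAP 1.1 envelope, in which a fault carries the `faultcode`, `faultstring` and `detail` elements; the exception class `SoapFaultError`, the function `raise_on_fault` and the sample fault content (`OrderNotFound`, `orderId`) are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"soap": "http://schemas.xmlsoap.org/soap/envelope/"}  # SOAP 1.1 envelope


class SoapFaultError(Exception):
    """Hypothetical exception type carrying the content of a SOAP fault."""

    def __init__(self, code, reason, detail):
        super().__init__(f"{code}: {reason}")
        self.code = code
        self.reason = reason
        self.detail = detail


def raise_on_fault(envelope_xml):
    """Raise SoapFaultError if the SOAP body contains a Fault element."""
    root = ET.fromstring(envelope_xml)
    fault = root.find("soap:Body/soap:Fault", NS)
    if fault is None:
        return None
    detail = fault.find("detail")
    detail_text = "".join(detail.itertext()).strip() if detail is not None else ""
    raise SoapFaultError(
        fault.findtext("faultcode", default=""),
        fault.findtext("faultstring", default=""),
        detail_text,
    )


# Made-up fault message for illustration.
SAMPLE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Server</faultcode>
      <faultstring>OrderNotFound</faultstring>
      <detail><orderId>17</orderId></detail>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>"""

try:
    raise_on_fault(SAMPLE)
except SoapFaultError as err:
    caught = err  # the client handles the fault like any other exception
```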

The BPEL language, which is considered the standard language for implementing compositions, offers two mechanisms that are used to deal with failures: compensation and fault handlers [26]. Compensation handlers are used to perform backward error recovery, so they allow defining procedures that can be invoked to undo changes that were made prior to the occurrence of a failure. Fault handlers are used for forward error recovery, that is, to overcome failures successfully. Both of the mentioned structures, compensation and fault handlers, are defined for a scope, which is BPEL's unit of processing [26].
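The difference between the two mechanisms can be illustrated with a small Python analogy. It is not BPEL and it simplifies the semantics of scopes considerably: the function `run_scope` and all step names are hypothetical, and a fault handler, when present, simply takes precedence over compensation.

```python
def run_scope(steps, fault_handler=None):
    """Run (action, compensation) pairs as one scope.

    On a failure, a fault handler (if given) performs forward recovery;
    otherwise the compensations of the completed steps are invoked in
    reverse order (backward recovery).
    """
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception as fault:
        if fault_handler is not None:
            return fault_handler(fault)
        for compensation in reversed(completed):
            compensation()
        return "compensated"
    return "completed"


log = []


def reserve():
    log.append("reserve")


def cancel_reservation():
    log.append("cancel reservation")


def charge():
    raise RuntimeError("payment service unavailable")


def refund():
    log.append("refund")


steps = [(reserve, cancel_reservation), (charge, refund)]

backward = run_scope(steps)  # no fault handler: undo completed steps
backward_log = list(log)

log.clear()
forward = run_scope(steps, fault_handler=lambda fault: f"handled: {fault}")
forward_log = list(log)
```

In the backward case the reservation is cancelled after the payment step fails; in the forward case the handler produces a result and nothing is undone.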

2.3.2 Other solutions

Solutions for the problem of dealing with failures in web services that are not composed (atomic WS) are in most cases based on replication techniques. Deron et al. [13] propose to deploy the same web service on many hosts. The description of the web service is then enhanced with an additional element indicating the locations of primary and backup servers. Each request to the web service is logged and stored, so in case of a failure (detected after a request), the next host is chosen to take over the primary server role and respond to the request. Liu et al. [25] give a similar solution, but make the fault tolerance transparent to clients. They split a web service into two layers: one is responsible for delivering its functionality and the other for providing fault tolerance (by replication management, logging services and recovery from logs). The replication mechanism is also used by Dobson [15], but he investigates the possibility of using the BPEL language to implement a fault tolerant web service. In his proposal BPEL serves as an additional layer (previously realized by SOAP engines) for choosing an atomic web service according to its accessibility status. When one host with a web service crashes, another is invoked, which is implemented in a compensation handler of the web service.
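The general idea of primary and backup replicas can be sketched as follows. This is a generic failover loop under simplifying assumptions and not the implementation of any of the cited approaches: the function `invoke_with_failover`, the replica callables and the use of `ConnectionError` to signal a crashed host are hypothetical.

```python
def invoke_with_failover(replicas, request, request_log):
    """Send `request` to the first replica that answers.

    replicas: callables ordered as primary, first backup, second backup, ...
    Returns the index of the replica that answered and its response.
    """
    request_log.append(request)  # keep the request so a backup can replay it
    for index, replica in enumerate(replicas):
        try:
            return index, replica(request)
        except ConnectionError:
            continue  # this host is considered crashed; try the next one
    raise RuntimeError("all replicas failed")


def primary(request):
    raise ConnectionError("primary host is down")


def backup(request):
    return f"done: {request}"


request_log = []
served_by, response = invoke_with_failover([primary, backup], "order-1", request_log)
```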

The concepts for managing server failures presented above are not suitable for composite web services, because they do not consider the hierarchical structure of such web services. This issue is addressed in the following approaches. Dialani et al. [14] designed an architecture that consists of an application layer with a general fault manager and of a service layer that is enhanced with messaging capabilities. Such a distinction makes it possible to introduce local and global recovery mechanisms; the first one is responsible for the recovery of an individual web service, whereas the second one is used by the entire application. Dialani et al. propose backward error recovery, so in case of a failure, the system goes back to the previous valid state, and all services that might be affected are notified. This solution assumes that an application that composes atomic web services can control them; unfortunately, in most CWS this assumption is invalid.

| Authors | Type of WS | Type of failures | Description |
|---|---|---|---|
| BPEL, WSDL [26, 34] | Composite | Declared exceptions | Exception handling, which is based on fault messages. |
| Deron et al. [13] | Atomic | Crashes | Replication of web services, with an additional element in a WSDL description for specifying locations of primary and backup servers. |
| Liu et al. [25] | Atomic | Crashes | Replication of web services that is transparent for clients, and uses two layers of WS. |
| Dobson [15] | Atomic | Omission failures, crashes | Using BPEL's compensation handlers to manage replication of WS. |
| Dialani et al. [14] | Composite | Crashes | Informing cooperating web services about failures, assumes control over atomic web services. |
| Ardissono et al. [3, 4] | Composite | Exceptions | Reasoning about possible cause of an exception, assumes knowledge about atomic web services. |
| Issarny et al. [22] | Composite | Exceptions thrown in parallel execution | Concept of Coordinated Atomic (CA) actions adapted to web services (gathering all cooperating web services and performing recovery procedures). |
| Chafle et al. [10, 9] | Decentralized composition | Exceptions | Inserting handlers into each part of a distributed composition. |

Table 1: The summary of solutions for overcoming faults and failures in web services.

The work of Ardissono et al. [3, 4] does not assume any control over composed web services. Their framework reasons about possible causes of an exception and chooses the best fault handler for it. Their main concept is to use a local diagnoser with each web service, so in case of an exception the diagnoser produces local hypotheses about its cause. These hypotheses are verified by the global diagnoser, which has knowledge about the whole composition and can generate global diagnostic hypotheses. Ardissono et al. use a model-based diagnosis, which allows modeling a composition as a set of components that store exchanged variables. Although control over web services is not assumed in this work, the framework still requires a substantial amount of knowledge about the behavior of individual web services to build the diagnostic model.

The next two solutions consider slightly different problems and explore specific approaches to the composition of web services. Issarny et al. [22] solve the problem of how to perform forward error recovery if individual web services are invoked in a composition in parallel. They adapt the concept of Coordinated Atomic (CA) actions used in decentralized systems. The CA actions are used to control cooperative concurrency and exception handling, by gathering all interacting threads and synchronizing their initial and final states [30]. In the web services domain this works similarly, and web services are participants of a CA action. In case of a failure there are procedures that deal with global results of concurrently executed services [22]. Chafle et al. [10] propose a decentralized mechanism for composing web services, in contrast to the centralized one defined in BPEL. Although decentralization may improve performance, it makes fault handling more complex, because all parts of a decentralized composition can throw exceptions. To overcome this, handlers are inserted into each part and then, according to the type of the part, appropriate data are gathered and sent to related nodes [9].

All solutions presented above are gathered and compared in Table 1. It can be concluded from this summary that solutions for composite web services are based on the assumption of having control over or knowledge about the used WS. Such an assumption is in conflict with the interoperability paradigm, because it requires an additional dependency between components.

In document View of Eighth Workshop and Tutorial on Practical Use of Coloured Petri Nets and the CPN Tools, Aarhus, Denmark, October 22-24, 2007 (Sider 92-96)