Service discovery, composition and execution for large scale open environments
In today's largest open environment, the World Wide Web, human-to-computer interactions are predominant. The natural evolution envisioned by many people is towards a Web in which the majority of interactions will be computer-to-computer. This has certain benefits but also raises significant research challenges. For addressing some of these challenges, we use the concept of service to model computer-to-computer interactions. A service is delivered by a provider upon request from a consumer. In this thesis, we address research issues related to the discovery of services, their composition as workflows, and the optimal execution of these workflows with respect to given cost measures. In addressing these research issues, we concentrate on those aspects specific to open environments, in particular the large number of possible services. This is taken into account for both discovery and composition. Prevailing approaches to composition assume all services to be known a prior, whereas in our case services are incrementally retrieved from a directory. The composer uses the current search state to iteratively generate new queries to the directory. As these queries can be quite complex, the directory supports an expressive formalism both for the description of the service request and for the control of the discovery process. As the result set of a query can be potentially large, guiding the discovery process is key to speed up the composition process. This is achieved by user-specified selection and ranking functions that encode algorithm-specific heuristics. In order for the discovery process to be efficient, we have designed our directory as a search tree where service descriptions are numerically represented as sets of intervals. Inner nodes in the search tree provide bounds regarding potential results in each sub-tree. These bounds are used for guiding the search process in a best-first manner. We reconcile flexibility with efficiency by making the internal organization of the directory completely transparent to the client. The original query is transformed into a relaxed version that is used for evaluating internal nodes of the directory. As in open environments directories are accessed by multiple clients, consistency of the directory is ensured through a technique based on multiversion concurrency control, well integrated with the internals of the directory system. In our case, a version represents a complete read-only search tree. Our composition engine uses type-related constraints between services in order to automatically generate workflows. In addition to completely matching services, our system also accommodates partially matching services. In this case, the composition engine integrates several services using a switch, although each service taken alone may fail type compatibility. Based on runtime values, the appropriate service is used such that type-compatibility is ensured. Composed services formulated as workflows can be optimally executed according to given cost metrics. For this purpose, we introduce a special execution mediation layer, formed by execution sites, each situated in the proximity of one ore more services. Parts of the workflow are optimally distributed across execution sites.
EPFL_TH3371.pdf
restricted
13.37 MB
Adobe PDF
865ba2704a508130d8baf6e338d90c7c