Steffen Lamparter, Anupriya Ankolekar, Rudi Studer
University of Karlsruhe (TH)
FZI Research Center for Information Technologies,
Categories and Subject Descriptors: H.3.5 [Information Systems]: On-line Information Services - Web-based services; H.3.3 [Information Systems]: Information Storage and Retrieval - Selection process.
General Terms: Algorithms, Languages, Economics.
Keywords: Web Services, Customisation, Preference-based Service Selection.
Web service discovery and selection have been extensively studied in recent years. As the set of available Web services may not be known a priori, may change frequently or service requester requirements and preferences may change, the problem of dynamic Web service selection is a fundamental one. Considerable research and industry effort has focussed on the (semantic) description of Web services, leading to standards such as WSDL , WSMO  and OWL-S . One of the key open challenges is performing dynamic service selection for highly configurable Web services with dynamic user preferences. Web services are typically highly configurable, with significant service customisation possibilities and a choice of quality-of-service (QoS) properties, e.g. delivery/response times, naturally each with its own price. Customisation is critical for them to be able to differentiate themselves from competing Web services and offer a better service experience to their customers. Service requesters themselves have certain preferences on which service configurations they want to use and their preferences may change dynamically. Given their significance, there is a strong need to support customisable services. However, we lack efficient methods for the representation and matching of configurable services.
Current approaches to modelling Web service configurations and requester preferences, such as WS-Agreement , enumerate the various possible service configurations, which is inefficient when dealing with multiple service attributes and their values. For example, a service described by five attributes, each with five possible values, already leads to 3125 different configurations. Given this combinatorial increase, a functional description of service attributes and their associated prices or preferences would be more appropriate.
In this paper, we model service configurations and associated preferences compactly using utility function policies [17,20], which provide a declarative mechanism for efficiently attaching price information to attribute values. This allows us to draw from the vast literature on efficient multi-attribute decision theory methods to develop an algorithm for optimal service selection. In addition, in order to be able to compare service attributes correctly, we need to describe them in a way that captures their semantics. We use ontologies to describe services attributes and their values semantically, and extend the resulting semantic representation to include offers and requests. This allows us to define appropriate attribute value matching rules for each kind of attribute.
The contributions of this work are three-fold. First, we develop an OWL ontology for configurable Web service offers and requests that can represent (execution-) context-dependent user preferences for functional and non-functional (e.g. QoS) properties within a standards-based specification language, thus extending current semantic Web service description frameworks, such as OWL-S and WSMO. Second, we present an optimal service selection mechanism in the context of this framework and demonstrate its feasibility by analysing its performance. For large numbers of highly configurable offers, given the assumption of additive functions, experimental results indicate that the algorithm introduces an overhead of only 2 sec. compared to random selection, while giving optimal results. This represents a 35% slowdown at 1000 offers and 1600 configurations per offer, which only decreases as the number of configurations rises. Third, our framework enables flexible matching by specifying declarative logic-based matching rules rather than hard-coding the matching algorithm as is usual. By providing a declarative mechanism that integrates optimization techniques, such as linear programming, we achieve computational tractability while obtaining a flexible system where different optimization and matching algorithms can be seamlessly plugged in.
In the following, we first discuss the requirements for a preference-based service selection framework informally through a scenario in Section 2 and discuss the state-of-the-art with respect to these requirements. We then develop an abstract model that addresses the requirements for configurable services, preferences over configurations and service selection in Section 3. To enable the exchange of offers and requests in open heterogeneous environments, the abstract model is implemented in Section 4 using standard Web languages. Based on this formalization, matching and optimization rules for service selection are presented and evaluated (Section 5). In Section 6, we present a proof-of-concept implementation of our optimal service selection framework within the scenario developed in Section 2. We conclude the paper with thoughts on the feasibility of optimal service selection within current Web scenarios and point to future work directions.
Consider Annika, a mobile phone user, who is currently in the city of Karlsruhe in Germany and wants to know the driving directions to Munich as soon as possible. Annika's mobile network operator, Mobifhon, provides route planning services for several countries to its customers, dynamically outsourced from third party route planning services on the Web, as sketched in Figure 1. Thus, the service selection takes place at Mobifhon's end. The service selection is therefore not constrained by the limited resources and partial connectivity of Annika's mobile phone, while allowing Mobifhon to aggregate demands and thus procure better discounts for services than if each customer were to transact individually. Mobifhon only sends the final route to Annika's mobile phone. For the sake of illustration, in the we have chosen a relatively general scenario for identifying the requirements, but one could imagine several simplifications, e.g. where the service repository is located with Mobifhon.
Since WSDL lacks any support for modelling QoS characteristics of Web services, there have been several proposals to extend WSDL with concrete QoS metrics, e.g. the Web service Level Agreement (WSLA) project  and Web Service Modeling Language (WSML) . By generalising beyond QoS attributes, XML query languages, like XQuery , and policy languages, like WS-Policy , enable the expression of constraints on arbitrary attributes. However, these approaches lack appropriate support for attaching prices and also preferences which are addressed by requirement (R2).(R2) Context-dependent Preferences: Many of Mobifhon's customers have different preferences about the services they use, based on their current context, location, activities, etc. For example, Annika typically prefers to travel on highways but she is currently on vacation and wants to travel through scenic country roads, possibly making several stops at attractions on the way. Mobifhon chooses route planning services, taking into account the requirements and preferences of the user as well as its own preferences, e.g. on service characteristics such as availability, response time, supported encryption methods etc. Annika's preferences may also depend on her implicit context. For example, if she has an upcoming appointment in Munich the next day, she is more likely to prefer a short route than a long scenic route. In fact, her context-dependent preferences may be predefined, allowing Mobifhon to choose her preferences based on her context dynamically. To enable this, we need a way to describe requests and preferences for particular service configurations declaratively in terms of their attributes.
Requirement (R1) and (R2) are addressed by languages such as WS-Agreement  and the Web Service Offering Language (WSOL) , which introduce classes of services that roughly correspond to what we call service configurations and attach price and preference information to configurations. Both languages require the enumeration of the set of possible configurations. This is clearly inefficient for Web services whose attribute space is very large or even infinite (contradiction to (R4)). For example, the charges for route plans levied by the route planning service may decrease linearly with the desired response time. To cover such requirements, we use Utility Function Policies  to describe preferences and prices as a function of service attribute values, and we use declarative rules to model the context-sensitive nature of the preferences. This is similar to the approach presented in , where personalized service selection is realized by expressing preferences declaratively using SQL. However, since there is no rule support, context-dependent preferences cannot be expressed. In addition, matching algorithms required by (R3) are not supported there.(R3) Semantic Web Service Descriptions: One of the advantages of using semantic description languages like OWL-S and WSMO is that one can use logical reasoning, in particular class subsumption, to bridge different levels of abstraction that occur when specifying requests and offers. Thus, if Annika has specified that she wants to know about attractions on the route, Mobifhon can identify route planning services with information about historical sites as being relevant. Annika may also not want to define preferences for all attributes. She may not know which attributes are used by the services she is interested in and even if she did, it would be too tedious. For example, she may want to say that she generally prefers historical sites to museums without specifying which particular types of each she prefers and by how much. The service matching algorithm needs to match her general preference to the actual attractions information provided by individual services. By modelling attributes as classes in a Semantic Web language, we can classify them into attribute hierarchies. It would not be possible to rely on semantic reasoning alone for service discovery and selection, as others  have also argued, since this only results in a coarse ranking.
There have been previous efforts to augment OWL-S and WSMO with QoS extensions [15,33]. In addition, [21,35] propose dynamic binding for Web service compositions using semantic service descriptions. However, as with the Web service description languages, they cannot be used to describe complex functional relations. Recent research [12,26] has tried to express such relations declaratively, without however investigating how the performance of service selection is affected by the modelling of preferences and matching rules. Thus, these approaches often fail with respect to (R4). We use a decidable fragment of the rule language SWRL  to express complex functional relations declaratively, and discuss the effects of modelling on the performance of selection.(R4) Communication and Computational Efficiency: A lesser requirement, yet nonetheless critical for resource-constrained environments such as mobile services, is that the chosen representation be designed for communication efficiency and computational tractability. I.e., the request has to be expressed in an efficient way (e.g. by avoiding enumeration of all possible configurations) and the selection algorithm has to be efficient enough to enable run-time selection.
Policy languages are state-of-the-art for expressing Web service configurations. However, as already discussed, while alleviating the problem, they cannot solve it due to the exponential size of the attribute space. We circumvent the problem using functional representations (as suggested in [10,4]) by introducing Utility Function Policies. There is a vast amount of work in economics, and particularly in operations research, addressing the computational efficiency of decision making algorithms. In the context of service selection, this is investigated in , focusing on the complexity of service selection with one time costs and e.g. in [40,35] for service compositions. Like these approaches, our work utilises efficient optimization techniques for service selection, but also augments them with the required service description and matching models.
Based on these requirements, in the next section we develop an abstract model for representing and selecting configurable services.
Web service selection is the problem of selecting the best offer made by a service provider given a request. In order to perform Web service selection, one requires (i) means for communicating service offers as well as requests to the other party and (ii) an algorithm for ranking the offers with respect to the request. Bidding languages are a well-established means for communicating requests and offers within economic literature.
Our abstract model essentially describes a Web service, Web service offers and requests and the Web service selection problem. We take a fairly abstract view of a Web service in our model and consider it to be fully described by properties . Such properties might comprise service input and output, behavioural aspects of a service, QoS attributes, etc., thus covering existing Web service description approaches as well as fulfilling (R1). Such a general description of a Web service allows us to abstract from various existing Web service description frameworks, while simultaneously allowing us to utilise existing decision-theoretic algorithms for multi-attribute products.
During the execution of a Web service, each attribute is assigned a value. A set of Web service configurations comprises all possible combinations of attribute values, i.e. . For example, considering the attributes Attractions, Highways and Response Time a concrete configuration would be a service providing routes including highways and information about nearby attractions within 10 seconds. A specific value of is denoted by .
A Web service contract is defined as a tuple , where agent provides a Web service with configuration to a customer at a price of . Furthermore, let denote the set of all contracts involving provider , and the set of contracts involving customer . Not all possible contracts are acceptable to an agent, and thus, only subsets and are requested or offered, respectively.
For our bidding language for highly configurable Web services (R1), we draw from bidding languages for multi-attribute products [10,4]. In this context, a common technique to efficiently encode pricing information (R4) is the use of functions that represent the relationship between Web service configurations and their prices or utilities (as discussed in (R2)). This avoids the combinatorial explosion that results from adding price markups to each configuration.
Thus an offer assigns an additive pricing function to a Web service description, mapping the configurations of the offer to a certain price. Analogously, we introduce a functional form for representing Web service requests. One major difference though is that a requester's willingness to pay might depend on a runtime specific context (R2). Therefore, we introduce a set of execution contexts, where the represent different context dimensions, such as current location of a mobile device, time of service execution, history of past transactions. Any denotes a concrete execution context.
A configuration which is not requested is scored as minus infinity.
Due to the additive form of the scoring function , we have to assume mutual preferential independency  between the attributes in the scoring function. This holds if the utility of an attribute does not depend on the value of another attribute. For example, the score for a certain guaranteed response time will not change if the type of indicated attractions changes.
Service selection in the case of configurable services involves finding the best provider and her best offer. Therefore, we have to solve two maximization problems: first, the best contract for a given provider has to be identified, the so-called Multiattribute Matching Problem (MMP) . Second, based on these results, the best provider can be chosen. Service selection often requires tradeoffs between the various attributes of configurable services. For example, it can be hard to decide between a slow route planning service that provides a lot of detailed information about en route attractions and a fast service that provides only imprecise route information. In economic literature, multi-attribute utility theory (e.g. ) uses utility functions to make such decisions, mapping each alternative to a measure that can be used to rank the alternatives. In our case, the utility of a service configuration is given by a quasi-linear function representing the difference between the requester's preference score for the configuration and its price. The MMP is thus defined as follows:
Our assumptions of additive pricing and scoring functions allow us to simplify the MMP. In particular, we can decompose the calculation into individual subproblems which can be solved independently. The following equations (4-6) use this property to solve Equation 3 efficiently by reducing the number of iterations from to . The binary decision variable is associated with each attribute value and denotes whether the value is part of the best configuration. Equation 5 ensures that exactly one attribute value is selected for each attribute. Since the following integer programming formulation has a totally unimodular constraint matrix and only integers on the constraints' right-hand sides, the problem can be solved efficiently using the simplex algorithm .
In a second step, we have to find the best provider from the set of all offers. Obviously this implies that the best contract for each provider is known, i.e. the MMP is solved for each pair of a request and an offer. We can then determine the best provider by solving Local Allocation Problem (LAP).
Solving LAP is linear with respect to the number of offers and requires steps. However, there are several scenarios where LAP is not sufficient for service selection. First, if we relax the requirement for quasi-linearity of the utility function, e.g. by allowing one time costs, the problem will get considerably more complex. It can be shown by reduction of the Uncapacitated Facility Location Problem that computing the optimal service in such scenarios is in . Second, for the problem formulation in this section, we assume that offered services are always available for all requesters and that possible resource limitations are handled at the provider side, e.g. by adapting the guaranteed service levels or by increasing server capacity. LAP also needs to be extended to handle scarce resources. This is done, e.g. in  using a double auction or in  by means of scheduling algorithms. Third, LAP could be generalized for entire service compositions such as in . The techniques presented later in this paper can also be applied to these new problems, requiring only the rewrite of a single rule to change the selection strategy.
Given the abstract selection model above, we now focus on implementing this model using existing standards and tools for the open and heterogenous Web environment. We use the Web Ontology Language OWL  together with its rule extension SWRL  to implement our service selection framework, which allows us to perform sophisticated matchmaking and ranking of services by means of logical inferencing. We build on well-known notions of matching for Semantic Web services, such as subsumption-based ``plugin" or ``exact" matches  and develop a flexible and extensible framework of declarative matching and optimization rules.
OWL is an ontology language standardized by the World Wide Web Consortium (W3C)  and is based on the description logic (DL) formalism . Due to its close connection to DL it facilitates logical inferencing and allows to derive conclusions from an ontology that have not been stated explicitly. We briefly review some of the modelling constructs of OWL using its abstract syntax.
The main elements of OWL are individuals, properties that relate individuals to each other and classes that group together individuals which share some common characteristics. Classes as well as properties can be put into subsumption hierarchies. Furthermore, OWL allows for describing classes in terms of complex class constructors that pose restrictions on the properties of a class. For example, the statement Class( partial restriction( someValuesFrom )) describes the class of big cities, which are connected to some Highway. The keyword partial means that any big city is connected to some highway, but not any city connected to a highway is also necessarily big, which would be achieved by using the keyword complete instead. Subclass relationship can be expressed by a statement like SubClassOf( ), saying that any big city is also interesting. Individuals can be related to classes and assigned values by a statement like Individual( type( ) value( ) value( )). Besides introducing Munich as a big German city, this statement also includes a data value for the city's population, which is supported by OWL for various datatypes such as integer or string.
An OWL ontology consists of statements like the ones above, considered logical axioms from which an agent can draw logical consequences. For example, given an ontology consisting of the above statements, it follows that Munich is an interesting city, which is denoted by .
For the declarative formulation of matching directives in form of rules, we require additional modelling primitives not provided by OWL. We use the Semantic Web Rule Language (SWRL)  which allows us to combine rule approaches with OWL. We restrict ourselves to a fragment of SWRL called DL-safe rules1 , which is more relevant for practical applications due to its tractability and support by inference engines such as KAON22. For the notation of rules we rely on a standard first-order implication syntax.
In this section we present an ontology-based modelling approach for Web service offers and requests which is in line with Definition 1 and 2 of our abstract model. For the reader's convenience we present the most parts of our ontological model informally via UML class diagrams, where UML classes correspond to OWL concepts, UML associations to object properties, UML inheritance to subconcept relations, UML attributes to OWL datatype properties and UML dependencies to OWL class instantiation .
Figure 2 shows a top-level view of our ontological model, which can be split into two conceptual levels: the upper part a) captures the elements of the abstract model introduced in Section 3, while the lower part b) exemplarily captures certain available domain ontologies that are plugged in for the matchmaking of attribute values.
Recalling definitions 1 and 2, offers and requests specify a set of supported configurations in our abstract model, which they map to prices and scores, respectively. This is captured in the ontological model shown in Figure 2 a) by the classes in the two boxes for description of Web services and policies. The classes and are introduced as subclasses of the more general , by which they are connected to the policies used to define their pricing and scoring functions. They represent the sets and of configurations for a provider or requester. Instead of relating offers and requests to configurations directly, as done in the abstract model, we introduce the intermediary concept Service to capture technical service-specific aspects. Offers and requests are then related to a service which in turn supports various configurations. Referring to pairs of attributes and their values , the class Configuration represents the combinations of attribute values that a provider or customer support. This upper part of our ontology can be seen as extensions of existing Web service description ontologies such as OWL-S or WSMO by using these ontologies to define the type of Attributes. For instance, by introducing the concepts Input, Output, Result, etc. as specialisation of Attribute our ontology can be aligned to the OWL-S profile.
To give an example, recall our mobile phone scenario, where our user Annika requests route planning from Karlsruhe to Munich in at most 30 seconds while she wants nearby castles to be indicated along the route. For convenience, we illustrate the instantiation of elements for service description, such as , by an intuitive notation of pairs of attributes and their values, while we use OWL abstract syntax for details concerning the attribute values in domain ontologies. Colon-separated namespace prefixes indicate the origin of an entity in a domain ontology. An example of a supported configuration for a service that Mobifhon launches as a request based on the above parameters could look as follows.
On the other hand, an example for a configuration supported by an appropriate provider could look like this.
restriction( someValuesFrom( ))))
Definition 1 and 2 introduce a compact functional form for expressing pricing and scoring information. This avoids introducing a separate class Price to model the tertiary relation between Price, Configuration and Offer/Request explicitly. Although such an approach would be most natural, it would result in a significant modelling overhead and does hardly scale up, as shown in our previous work . For modelling functions we introduce the notion of Policies. Generally policies are declarative rules that guide the decision making of an agent. We use a specific class of policies, called utility-function policies , which allow for representing trade-offs between different Web service attributes by mapping their values to a comparable quantitative measure. Approaches on how such policies are expressed via an ontology are discussed in . Namely there are three modelling techniques: Point Based Functions, Piecewise Linear Functions and Pattern-based Functions. To illustrate the idea, the concept Point Based Function is introduced in more detail.
A Point Based Function can be used for discrete attributes and is modeled by specifying sets of Points that explicitly map attribute values referred to in an Attribute Value Pair to a price or . To indicate the Attribute for which a certain Policy is applicable, the isAssignedTo-relation is introduced that points to one of the Attributes in the Web service description. Coming back to our example, assume Annika generally prefers cultural attractions to sports events with the only exception that she hates museums. We can model such a preference structure by instantiating a point-based policy function assigned to the attribute Attractions. Mapping a utility of , and 0 to the three alternatives results in the following specification of the function's component for this particular attribute.
The table maps alternative values for the attribute Attractions to the utility values that specify Annika's preferences.
Assuming appropriate domain ontologies are available and agreed by providers and customers, they are linked to our ontology by their elements, such as the class or the individual , being instances of the class . The value for the attribute EndPoint would be a URI like http://geo.owl#Munich that points to a location in a geographic ontology3.
In our example, the notions of ``route planning service" and ``service that supports navigation" are captured in the service classification ontology that states them to be equivalent.
Class( complete intersectionOf(
restriction( someValuesFrom( ))))
Also the values ``Karlsruhe", ``Munich" and ``Germany" are covered in a domain ontology, namely in the geographical ontology , where the two cities are stated to be located in Germany.
Individual( type( ) value( ))
Individual( type( ) value( ))
Moreover, the values ``cultural attraction" and ``castle" for the attribute Attractions are related by subsumption in the ontology that describes notions of leisure.
Having shown how descriptions of configurable Web services offers and requests are captured by an ontological model, we now describe how the actual selection of a service is carried out by solving LAP using logical inferencing on the ontological elements introduced above.
In order to derive a ranked list of the offer instances from the knowledge base, we formulate a query that refers to a Request instance containing preferences and to an instance representing the current Context. In addition, to reduce the number of offers that have to be ranked we can add mandatory conditions directly to the query using the SPARQL FILTER element. This is exemplified in Query 8, where only services that provide a guaranteed response time of less than 20 sec. are retrieved.
Conceptually answering such a query can be broken down into two highly connected parts: (i) determining matches between offers and requests by comparing the respective service attributes, and (ii) ranking the various configurations provided in the offer according to the preferences in the request. In the following, we first discuss how matching of attribute values is realized and show how these matching rules are applied to define variants of optimization predicate mmp used in Query 8. Subsequently, we evaluate these algorithms in terms of performance and discuss their applicability for service selection.
The comparison between a requested attribute value and an offered attribute value is fundamental. In the context of matchmaking in the Semantic Web, various techniques have been proposed for comparing the characteristics of two semantically annotated services. The most widely used approach for descriptions based on OWL classes is to use DL inferencing, distinguishing between several notions of match based on subsumption or concept satisfiability. For two OWL classes and that represent attribute values of a request or an offer, the degrees of match proposed in  are: exact if and are equivalent, plugin if is a subclass of , subsumes if is a subclass of , intersect if the conjunction of and is satisfiable, and fail if the conjunction of and is unsatisfiable.
We support these notions of match in our framework, and also allow for others by including customisable matching predicates into the service selection algorithm. In fact, since we use a declarative formalism to describe how attribute values are matched, a user can bring in arbitrarily complex matchmaking behaviour expressed in rules which facilitates the adaption of the selection component to changing service descriptions (e.g. with new attributes). Contrarily, other approaches use hard-coded algorithms to process the results of attribute matching that are specific to e.g. input/output matching, as done in . Since we keep attributes in Web service configurations rather generic, the way in which two attribute values are compared strongly depends on the domain of interest they originate from and on the way in which they are represented in there.
In our example, the values of the attributes ServiceType and IndicatedAttractions are concepts in an ontology, and thus, the formerly described degrees of match apply to them and can be used for their comparison. The following rule definition specifies the matching predicate for service types, requiring them to yield an exact match.
In our example, the indeed yields an exact match with the provided , since entails their equivalence. The attribute ServiceType itself is an instance of the class in the model shown in Figure 2. Analogously, values of the attribute Attractions could be matched using the predicate instead of , and again the provided attribute value would match the requested one, since subClassOf( ).
The values of the attributes StartPoint and EndPoint from the example represent individual locations in a geographic ontology and require a different treatment. Here the modeller specifies the customised matching behavior for location attributes by introducing a class as a subclass of , of which StartPoint and EndPoint are instances. The appropriate matching behaviour is then captured by the following rule.
Finally, there are attributes which do require a complex matching in terms of logical reasoning, but where a simple and efficient string comparison or arithmetic calculations are sufficient. In this case, the modeler of a matching rule can include predefined builtin predicates defined in SWRL. From our example, the QoS attribute ResponseTime falls into this category, and is processed according to the following rule specification.
In the following, we show how the matching predicates introduced in the previous section can be used within the optimization rule in order to determine the utility measure for the individual attribute values. We define three alternative variants of the mmp-predicate, which considerably differ in their underlying assumptions, applicability and performance characteristics. While the fist variant [V1] implements the ranking based on enumerating the configurations (Equation 3), [V2] solves MMP on per attribute basis using Equation 4-6. [V3] goes a step beyond [V2] by utilizing the linear program formulation.[V1] This variant implements Equation 3, where a ranking of all offers and configurations is derived by evaluating all possible configurations for each offer according to a request. We can model the problem purely based on DL-safe rules using some standard SWRL builtin functions. Rule 15 calculates the difference between score and price of each Configuration that is supported by an Offer as well as asked for in the Request. The compare-predicate defined in Rule 14 is used to match two configurations.
Since pricing and scoring information are not explicitly given, the two predicates price and score are used to calculate this information based on the Policies defined in the offer or request, respectively. Rule 16 calculates the score by evaluating for all provided in a configuration . Attribute values are matched by means of the matching predicate defined in the previous section. The price-relation is defined analogously, but without the context-dependency represented by the relation isValidIn.
Advantages of this approach are that one can get a full ranking of all configurations, which might be required in some applications. Furthermore, it can be modelled purely based on standard modelling primitives provided by OWL-DL and SWRL. However, the disadvantages are also evident. Since the approach is based on enumerating all configurations, a finite number of configurations is required and thus the approach is not suitable in the presence of continuous attributes. As already discussed in Section 3, another fundamental problem is the complexity with respect to the number of required utility calculations.
[V2] The second variant of the mmp-predicate implements the decomposed optimization algorithm described in Equation (4-6). In this context we utilize the additive structure of the pricing as well as scoring functions: the optimal value for each attribute is determined separately and the overall price/score is calculated based on these measures. Equation 17 determines the utility of an offer according to request in a specific execution context.
Since the calculation of the optimal value for a certain attribute requires iterating over an unknown number of attribute values (instances), the calculation cannot be directly expressed in SWRL. We thus use a builtin function, called optFkt, to determine the attribute value maximizing the utility of attribute . Algorithm 1 shows the implementation of the builtin-predicate specifically for Point-based Functions. In the predicate optFkt for each attribute the requester and provider policies are queried from the knowledge base and the attribute value leading to the maximal utility is determined. A major advantage of the approach is that this query uses the match-predicate defined in the ontology. Thus, the correct matching algorithm is used for each attribute automatically and the implementation of the builtin is domain independent.
[V3] The third variant of the algorithm implements also the decomposed ranking algorithm described in Equation (4-6), but with some additional optimizations.
This time we use a linear programm to calculate the optimal attribute value. The calculation is encapsulated within the builtin optLP (Algorithm 2). The builtin performs a query to get the relevant utilities for the attribute values. This is done again by utilising the match-predicate from the ontology. The optimization problem is constructed and solved using a standard optimization library.4 This approach has the advantage that we can use the efficient implementations for solving integer linear programs provided by standard tools.In contrast to the first optimization algorithm, variant [V2] and [V3] can be easily adapted to handle continuous attributes by introducing appropriate builtins for optFkt and optLP. However, it is not possible to get a ranked list enumerating all offers and configurations, as it is possible using the first approach. Nevertheless, for most applications determining the ranked list of offers is sufficient.
In the next section we compare the different modelling approaches with respect to their performance in the selection process.
[100 configurations per offer]
[1000 configurations per offer]
[1000 offers in KB]
Having introduced an approach for preference-based selection of configurable Web services, the question arises how this increased expressivity influences the performance of the selection process. In particular, we are interested in the trade-off between performance and optimality. Therefore, the three selection variants introduced in Section 5.2 are compared to an algorithm that randomly selects an offer and configuration. All algorithms are evaluated for varying number of offers in the knowledge base and varying numbers of configurations per offer. Each of these settings is evaluated by means of a simulation. Only settings with string matching rules have been used. Performance evaluations of query answering with more complex matching rules is a complementary question and has already been elaborated in  for KAON2. For each setting, instances of offers, requests and contexts are randomly generated using a uniform distribution and stored in the knowledge base. Then SPARQL-queries are generated according to Equation 8 (without any FILTER condition) referring to a specific execution context and request in the knowledge base. The time between sending the query and receiving the result is measured. In order to avoid possible network delays the simulation is done on a single machine. For each setting the average query time is determined based on 20 simulation runs. Using this simulation setup the following issues are addressed:How does the performance change when moving towards preference- and context-aware selection strategies? How expensive is optimality? To investigate the additional time required for evaluating the offers and configurations according to preferences, we compare the most general optimal variant [V1] with a baseline algorithm that randomly selects an offer and corresponding configuration from the knowledge base. Figure 3 shows the interrelation between the number of offers in the knowledge base, the number of configurations in an offer and the resulting query time. In the first setting (Figure 3(a)) each bid contains exactly 100 configurations. Service selection can be done in less than a second for 2000 offers using the random algorithm, while [V1] requires about 17 seconds. As depicted in Figures 3(b) and 3(c), the gap increases further for more demanding settings. However, the random approach leads to a considerable loss in utility for the requester. Assuming a uniform distribution of the prices and scoring values in [0,1] and a reasonable number of offers () an optimal algorithm leads to a utility of almost 1 while the random algorithm results only in a average utility of 0.Can we improve the performance of optimal selection by constraining the bidding language? How does the performance of the optimal selection variants differ? Variants [V2] and [V3] of the mmp-predicate assume an additive structure of the pricing and scoring function. As discussed above, this allows a more efficient implementation. [V2] reduces the runtime compared to [V1] from 17 to 11 seconds in the first setting and from 477 to 41 in the second setting (both with 2000 offers). [V3] further reduces the runtime to 5 and 13 seconds, respectively. If we now compare this improved performance to the random algorithm, the cost of optimality is rather moderate. In particular, considering setting 3 with 1000 offers and 1600 configurations per offer, there is only a slowdown by 35% when moving from random selection to [V3] (Figure 3(c)). Comparing this number to smaller settings we can observe much greater slowdowns which, at first glance, seems contradictory. However, this observation can be explained by the fact that for large-scale scenarios (more that 1000 configurations per offer) query answering becomes the predominant factor compared to the optimization in [V3]. Since query answering is required for both algorithms, variant [V3] and random selection converge.
In this section, we presented an flexible approach for assigning syntactic as well as semantic matching predicates to attributes. These predicates are automatically applied for matching attribute values in the optimization process. In general, the results of the performance evaluation are promising since the fastest approach allows ranking up to 2000 offers with a reasonable number of configurations below 15 seconds. Considering the fact that these 2000 offers all fulfill the mandatory conditions defined in the FILTER condition of Query 8, this can already be seen as a very large scenario. Moreover, we analysed the worst-case scenario where all offers provide all possible configurations and all attributes are discrete. Optimization on discrete attributes is more time consuming compared to the continuous case because techniques like differentiation are not applicable. Therefore, we expect better performance in a real-world use case. In our mobile scenario, only up to 20 different route planning providers might be available, whereas the number of possible configurations may easily exceed 1000. However, it is unlikely that all of them are offered by all providers.
As a further result of our performance study, it becomes clear that providing expressive means for modelling preferences as done in [12,26] is not sufficient without ensuring that the way they are used allows for the implementation of efficient selection algorithms. Comparing the results for [V2] and [V3], we can identify the absence of the additivity assumption as the major source of complexity (improvement from [V1] to [V2]). Using an efficient implementation for solving the optimization problem provides a relatively minor improvement (improvement from [V2] to [V3]) in performance. Therefore, in many cases, especially if service selection has to be done at runtime, restricting the expressiveness of the bidding language is a viable way to considerably increase performance. Even if preferential independency does not hold exactly, additive functions often provide a good approximation . If this simplification is not possible, other methods for improving the performance of [V1] can be introduced, e.g. a caching mechanism for prices and scores that reduces the number of rule evaluations .
A further conclusion is that in some very demanding settings, reducing the set of relevant offers is crucial. This can be realized by adding additional mandatory conditions through FILTER expressions.
As a proof of concept, we implemented the algorithms presented in this paper in a framework consisting of two components.5 The first is a server component that provides a repository for service offers and requests that can be queried via a Web service interface and the DL reasoner KAON2. KAON2 is chosen because it supports the logical fragment required for our offer and request descriptions, while being optimized for large-scale query answering . This component corresponds to the service repository in Figure 1.
The second component is a client tool that facilitates the specification of Web service offers and requests by providing a GUI for specifying SPARQL-queries and utility function policies. Generally, offers are transferred to the server, whereas for requests the user decides whether they should also be stored as policies on the server to enable further reuse (cf. (R4)) or formulated directly as a query. The client is supplemented by a WS-BPEL engine  that allows the specification of service compositions. For example, the second component could be used by the network operator in our initial example to implement its application.
In order to implement dynamic binding of services, we utilise the distinction between ports and port types in WS-BPEL. This feature allows us to dynamically re-assign end points as long as the service candidates have an identical interface, i.e. port type. To support this, the client tool allows extending the process as follows: before each dynamic service invocation, a WS-BPEL invoke-operation for the selection service is introduced that provides the binding information for the following service. We realize the Binding by Constraint paradigm  by specifying a SPARQL query (such as Query 8) that is passed to the selection service. In this case, SPARQL provides a standardized language for identifying suitable services without referring to a concrete name or identifier. Parts of the query are generated at development time of the process, while others can be added dynamically at runtime. To illustrate this approach, Listing 1 shows an excerpt form the WS-BPEL process of the route planning scenario introduced in Section 2. In lines 2-4 the user's request is received, the contained clientId is passed to the location service, where the current country of the user is determined (lines 8-10). Then the SPARQL query, which statically refers to request ns1:RequestOperatorX containing the providers preferences, is extended by the context location (lines 12-15) and passed to the selection service which is invoked in lines 18-20. After receiving the address of the best service, the corresponding port is assigned to the port type of the following partner link (lines 21-25) and the route planning service is invoked by passing the original user request containing the start and end point (lines 26-28).
In case service candidates have different interfaces, dynamic selection requires complex interface mappings.  present an approach for dynamic binding of services using reflection. However, this cannot be used directly for WS-BPEL.
Listing 1: Flexible binding in WS-BPEL
In this paper, we have provided a formal and standards-based representation of Web service configurations and user preferences over these configurations meeting the requirements (R1) and (R2) introduced in Section 2. Our approach avoids enumerating offered/requested configurations and thus significantly reduces storage requirements and increases communication efficiency (R4). In addition, we have presented a service selection algorithm that seamlessly integrates syntactic as well as semantic matching (R3) with efficient optimization techniques. In contrast to other work in this area, we do not restrict ourselves to logical and/or similarity-based matching approaches, but allow customisable matching predicates that can be declaratively assigned to service attributes in a very flexible way. In order to quantify the overhead introduced by the additional expressivity and the optimality requirement, we evaluated the performance of the different ranking algorithms, showing that an algorithm that implements the linear programming formulation of the optimization problem introduces only a small overhead compared to random selection. This holds particularly for large-scale scenarios with a high number of configurations per offer. Another important finding is that the performance depends crucially on the way offers and request are modelled. According to our evaluation, additive preference and price functions are required to dynamically select services in large-scale scenarios in a computationally tractable manner (R4). Finally, as a proof of concept we applied our framework for dynamic Web service selection in WS-BPEL.
As future work, we plan to move from selecting single services to service selection for an entire process. This can be realized simply by adding rules and builtin predicates implementing the more complex optimization algorithms (e.g. ). In addition, we plan to integrate behavioral matching rules as presented in , which would allow defining preferences over temporal properties of a service. For example, Annika might prefer services which provide the route information before paying. In terms of implementation, we plan to address the problem of dynamic binding in WS-BPEL beyond simple port re-assignments.