WS-Discovery

Steve Loughran
Apache Web Services
Copyright (c) 2003 Apache Software Foundation

This document describes a protocol for finding Web Services on a LAN or campus-wide network by multicasting XML request messages. The underlying payload is similar to that of the Service Location Protocol (SLP), though the directory agent process is omitted. To scale well, integration with UDDI is required. While it is easy to fault the design, mainly on security and scalability, the initial implementation binds to Axis and automatically exports running Axis services, enabling local clients to find services without a server.

rationale

This specification aims to provide dynamic discovery of web services on a Local Area Network. It will not scale up to "the Internet", or indeed to a large Intranet, but it lets programs find local implementations of well-known services, including UDDI registries. The envisaged uses are embedded web services, workgroup computing systems, and bootstrapping server-side clusters in a 'near-zero' configuration environment.

Every server exporting services can run a discovery server, responding to queries for the services that it offers. The protocol is not SOAP-centric; you could also use it to enumerate REST objects that the client knows about. The system just maps URIs to URLs and leaves it to the applications and the entities at the end of those URLs to work out the details among themselves.

protocol

Server listens on a well-known (potentially IANA-assigned) multicast group, awaiting resolution requests.
Client builds a request object, comprising:

requestID: xsd:int
type: xsd:string
URI: xsd:anyURI
URL: xsd:anyURI
scope: xsd:string (can be "")
expires: xsd:int (optional)

The XML Schema is as follows:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="discovery">
    <xs:annotation>
      <xs:documentation>message of specified type</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="uri" type="xs:anyURI">
          <xs:annotation>
            <xs:documentation>uri of service</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="scope" type="xs:string">
          <xs:annotation>
            <xs:documentation>scope of request, leave empty for 'any' scope</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="url" type="xs:anyURI" minOccurs="0">
          <xs:annotation>
            <xs:documentation>endpoint of service</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="expires" type="xs:int" minOccurs="0">
          <xs:annotation>
            <xs:documentation>expiry time as time_t integer, always in UTC</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="description" type="xs:string" minOccurs="0">
          <xs:annotation>
            <xs:documentation>optional description of endpoint</xs:documentation>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="id" type="xs:int" use="required"/>
      <xs:attribute name="type" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

The "type" attribute is the key to the system. The protocol supports a series of verbs:

"Find": look up any implementations of the supplied URI.
"Found": a found URL mapped from the requested URI.
"Advertisement": to be used in any announcement extensions.

Requests can be sent out with a TTL of 1, 2, 3 or more. If the client wants to find any implementation, rather than enumerate all implementations in a location, then it should slowly increase the TTL until it gets a response, using the same requestID each time.
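
As an illustration only, the client side might be sketched in Python as follows. The sketch builds a "Find" message that follows the schema above and then resends it with a growing TTL until something answers. The multicast group, the port, the maximum TTL and the wait time are all assumptions made for the example, as are the function names; this document does not assign any of them.

```python
import socket
import xml.etree.ElementTree as ET

# Hypothetical values: the specification does not assign a group or port.
GROUP, PORT = "239.255.0.1", 5555


def build_find(request_id, uri, scope=""):
    """Serialise a "Find" request as a UTF-8 <discovery> document."""
    root = ET.Element("discovery", {"id": str(request_id), "type": "Find"})
    ET.SubElement(root, "uri").text = uri
    ET.SubElement(root, "scope").text = scope  # empty means 'any' scope
    return ET.tostring(root, encoding="utf-8")


def find_any(uri, request_id, scope="", max_ttl=4, wait=1.0):
    """Resend one request with TTL 1, 2, ... until a reply arrives.

    Returns the raw bytes of the first reply, or None if nothing answered.
    The same request_id is used for every attempt so that servers which
    have already seen the request can ignore the repeats.
    """
    payload = build_find(request_id, uri, scope)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(wait)
        for ttl in range(1, max_ttl + 1):
            sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
            sock.sendto(payload, (GROUP, PORT))
            try:
                data, _sender = sock.recvfrom(8192)
            except socket.timeout:
                continue  # no answer at this radius; widen the search
            return data
    return None
```

Keeping max_ttl small in a client like this also bounds how far a single query can travel.
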
Services should ignore repeated requests from the same host with the same request ID.

Server receives the "Find" request. If the combination of (senderIP, requestID) is in the list of recent requests, the server does not respond to the request. If the request merits a response, then the server looks to see if it has an entry for that URI in the same scope. If it does, it sleeps for a short period of time (20-500ms), then posts the response directly back to the client as a unicast datagram.

A response is a "Found" message containing the same requestID and URI as the request, adding the URL, and optionally an expiry time and description. The expiry time is a time_t value: seconds since 1970-01-01 00:00 UTC. An xsd:dateTime value would be the natural fit, were it not for the ongoing cross-platform issues with binding dateTime values to specific time zones.

There are no negative ("no-match") responses, so a failed lookup generates no reply traffic. Network load therefore scales not with the number of servers on the network but with the number of clients using it, and then with the number of successful responses to their requests.

extensibility

There are different ways to extend this. The protocol could support new request types, with new responses. The content of requests and responses may change too. How can we address this? The proposed solution is XML schemas: if the schema of the request is not recognised, then the server ignores the message. The schema of the response must be bound to that of the request; you cannot change the response schema without also changing the request schema. A future iteration may also add an xsd:any section to the payload.

What about replacing the current payload with a full SOAP envelope? It would let us add authentication via WS-Security, and do other things. But in the absence of a standard for SOAP-over-multicast-datagrams, and given the size limitations of such datagrams, it doesn't seem immediately appealing.
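
Returning to the server-side rules in the protocol section, the handling of a "Find" request might be sketched like this in Python. The class name, the in-memory service table, the 30-second window for remembering recent requests and the choice to return only the first match are assumptions made for the example, not part of the specification. A message that does not parse, or that is not a recognised "Find", is dropped without a reply.

```python
import random
import time
import xml.etree.ElementTree as ET


class Responder:
    """Answers "Find" requests from a local table of (uri, scope) -> url."""

    def __init__(self, services, recent_window=30.0):
        self.services = dict(services)      # {(uri, scope): url}
        self.recent = {}                    # {(sender_ip, request_id): time seen}
        self.recent_window = recent_window  # seconds; an assumed value

    def handle(self, sender_ip, data, now=None):
        """Return the bytes of a "Found" reply, or None to stay silent."""
        now = time.monotonic() if now is None else now
        try:
            root = ET.fromstring(data)
        except ET.ParseError:
            return None                     # not XML: ignore
        if root.tag != "discovery" or root.get("type") != "Find":
            return None                     # unrecognised message: ignore
        request_id, uri = root.get("id"), root.findtext("uri")
        if request_id is None or uri is None:
            return None
        # Drop stale entries, then suppress repeats of (senderIP, requestID).
        self.recent = {k: t for k, t in self.recent.items()
                       if now - t < self.recent_window}
        key = (sender_ip, request_id)
        if key in self.recent:
            return None
        self.recent[key] = now
        scope = root.findtext("scope") or ""
        for (svc_uri, svc_scope), url in self.services.items():
            # An empty request scope matches a service in any scope.
            if svc_uri == uri and scope in ("", svc_scope):
                reply = ET.Element("discovery",
                                   {"id": request_id, "type": "Found"})
                ET.SubElement(reply, "uri").text = uri
                ET.SubElement(reply, "scope").text = svc_scope
                ET.SubElement(reply, "url").text = url
                return ET.tostring(reply, encoding="utf-8")
        return None                         # no match: no network traffic


def response_delay():
    """Random pause, in seconds, before replying (the 20-500ms range)."""
    return random.uniform(0.020, 0.500)
```

A caller would sleep for response_delay() seconds before sending a non-None reply back to the sender as a unicast datagram.
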
One of the interesting challenges is what recipients of a multicast SOAP message should do with mustUnderstand headers they don't understand. The SOAP rules say 'fault', but that means every recipient is going to send a fault back, which is the wrong thing to do.

Implementation Issues

configuration

Where does the system get configured? It can be directly hooked to the SOAP server. WS-Discovery aware endpoints should be able to provide extra configuration information in their deployment metadata, which for Axis implies the deployment descriptor. Web applications should also be able to add or delete entries. This lets a Web Service hide entries, but it also lets a local WS-Discovery server return references to remote services, such as a distant UDDI service.

discovery of WS-Discovery servers

Should WS-Discovery servers be discoverable in their own right, just as SLP directory agents are? Yes, if you want to permit unicast interrogation and management. No, if you want to make it harder to find system information and vulnerabilities. A WS-Discovery server may choose to respond to requests to find the WS-Discovery service; this option could be user configurable. The current service string to search for is "service:axis-discovery".

The response to such a request is a URL indicating the UDP ports supported; in the absence of a widespread 'udp:' URL scheme, we choose to declare the use of this URL scheme in our responses. If a URL contains something such as udp://192.168.4.4:1434, then it means that the machine at IP address 192.168.4.4 is responding to unicast datagrams on port 1434. This extends the SLP responses, which introduced tcp: URLs. These should not be confused with .NET remoting URLs, which adopt the same tcp: prefix to mean .NET remoting connections exclusively.
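
For illustration, a client could pull the host and port out of such a udp: URL with Python's standard URL parser. This is a minimal sketch; the function name is hypothetical, and rejecting every scheme other than udp is an assumption made for the example.

```python
from urllib.parse import urlsplit


def parse_udp_url(url):
    """Split a udp://host:port URL into (host, port), or raise ValueError."""
    parts = urlsplit(url)
    # parts.port itself raises ValueError for a malformed port number.
    if parts.scheme != "udp" or parts.hostname is None or parts.port is None:
        raise ValueError("not a udp://host:port URL: %r" % (url,))
    return parts.hostname, parts.port


host, port = parse_udp_url("udp://192.168.4.4:1434")
```
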
This clash over the tcp: prefix should not have adverse effects; it merely serves to demonstrate why meaningful prefixes should be used and the W3C addressing recommendations (http://www.w3.org/Addressing/) followed.

Internationalisation

The requests and responses must be in the UTF-8 encoding.

limitations and risks

Multicast IP does scale, but things get very chatty. The protocol needs to minimise chattiness and collisions by having servers not repeat responses, and by adding a random delay before responding.

Multicast IP does not work on ad-hoc WLANs, that is, wireless networks without an access point. Some network stacks refuse to allow applications to bind to a multicast address, though there are hints this may just be a firmware defect.

Could it be used to amplify a DoS attack? Yes, if someone spoofs an IP address and asks for a popular service with a high TTL: all implementations would respond. If servers could determine the number of hops from the client, they could restrict responses to local systems only. As we cannot do that, we can code our client to limit the TTL to a maximum well below 255.

Could servers be subjected to a DoS attack? Not really. The computational load of a request is minor: an unmarshal of an XML document no larger than a single datagram (taken here to be at most 8192 bytes), a hash table lookup, and the generation of an XML response document. A logging attack is possible if the server logs serious failures to the file system; the log system should be configured not to save full details to file, or (better) to use a rolling file system logger.

What about security? There is currently no security or authentication in WS-Discovery; after you get an endpoint from it, you have to negotiate with the endpoint to see if you trust it. The risk here is twofold. First, a malicious client could issue many requests, generating server load. Second, a malicious server could issue false responses, listing endpoints that were invalid.
This could deny clients service, especially if the invalid endpoints were themselves malicious and deliberately slow, acting as tar-pit endpoints.

Security could be addressed by signing requests and responses. However, besides running into the size limits of datagrams, signing introduces a need for authentication, which implies some authentication and authorisation framework. The obvious solutions would be XML-Signature and perhaps even WS-Security, the latter needing a SOAP-esque payload first.