passing information. (As of draft version 1.2, SOAP is no longer an acronym.) Despite
being unidirectional, SOAP messages can be combined to implement
request/response processes, or even more sophisticated interactions. In addition to the
sending and the receiving nodes of a SOAP message, SOAP message routing includes
intermediary nodes. SOAP intermediaries should not be confused with intermediaries
in any underlying protocol. For instance, HTTP messages may be routed through intermediaries.
However, these intermediaries are not involved in the processing of SOAP
messages. SOAP intermediaries play a role in the handling or processing of a message
at the application level.
SOAP describes an XML-based markup language “for exchanging structured and
typed information.” The information passed in a SOAP message can either represent
documents or remote procedure calls (RPCs) that invoke specific procedures at the service
provider. A SOAP document could be a purchase order or an airline reservation
form. On the other hand, an RPC can invoke software to charge a purchase. There are
no clear guidelines to determine when a document or an RPC should be used. The system
designer will make this decision.
Web Services using SOAP have gained popularity very quickly. The concept of an
XML RPC was created in 1998 by David Winer of Userland Software. The XML-RPC
specification was released in 1999 and was the work of Winer, Don Box of DevelopMentor,
and Mohsen Al-Ghosein and Bob Atkinson of Microsoft. While the specification
was published as XML-RPC, the working group adopted the working name
SOAP. Soon after, SOAP 0.9 and 1.0 were released. In March 2000, IBM joined the group
and worked on the SOAP 1.1 specification. The 1.1 version was adopted by the W3C as
a recommendation. SOAP version 1.2 currently exists as a series of working drafts
(W3C 2002e, W3C 2002f). In addition to the working drafts, there is a SOAP 1.2 Primer
(W3C 2002d) that takes the information in the working drafts and describes SOAP features
using actual SOAP messages.
The discussion in this section is based on the SOAP 1.2 working drafts. The discussion
is not meant to be all encompassing and is a brief overview of the protocol. The
reader should consult the W3C drafts or other books on SOAP to get further details.
As with any other protocol, there are two portions to the SOAP Protocol: a description
of the messages that are to be exchanged, including the format and data encoding
rules, and the sequence of messages exchanged. As the reader will see, there isn’t a lot
of specificity to SOAP. This is by design. Rather than overspecifying and trying to
anticipate every possible outcome, the SOAP designers took a minimalist approach.
SOAP specifies the skeleton of a message format—very little else is required. This
approach allows messages to be tailored to application-specific uses. In addition to the
protocol, there are protocol bindings that describe how SOAP can be transported using
different underlying transport protocols. Currently, HTTP is the only underlying protocol
with a binding referenced in the SOAP specification, but others are possible and
not excluded by the specification.
SOAP Message Processing
The two main nodes in processing a SOAP message are the initial message sender and
the ultimate message receiver. In addition, SOAP intermediaries, which are message
receivers that later forward the message toward the ultimate receiver, also have a role
in the processing of a SOAP message. For instance, in Figure 2.3, the buyer’s system
may send a purchase order to the seller via the buyer’s accounts payable system. The
accounts payable system records the details of the purchase so that, when an invoice
from the seller is received, the information needed to authorize a payment is already
entered. The accounts payable system is an intermediary. When the accounts payable
system completes its tasks, it is responsible for transmitting the purchase order to the
seller.
The buyer’s system can target portions of the SOAP message at different receivers.
The body of the message is, by definition, intended for the ultimate receiver of the message.
Other receivers may examine the body and process information in it but must not
modify it. The ultimate receiver must be able to understand and process the body. If it
can’t process the message, a SOAP fault is generated and returned to the sender. Unlike
the message’s body, elements in the message’s header can be:
■■ Explicitly targeted at specific receivers via a URI
■■ Targeted at a receiver based on its relative position in the processing chain
■■ Targeted using some application-defined role
Except for the ultimate receiver, all other receivers of the SOAP message are SOAP
intermediaries. When a URI is used, the URI can specify a unique and concrete
receiver, for example, by using the receiver’s URL.
When the relative position is used to specify a target, two predefined roles, next and
ultimateReceiver, are available. Next is a role assumed by the next receiver of a message.
UltimateReceiver is the ultimate receiver of the message. If no role is associated with the
element, the ultimate receiver is assumed to be the target.
A third predefined role, none, indicates that no receiver should process the element.
An element targeted at none may not be processed by any receiver but may contain
data that is examined in the course of processing other elements.
The third option for targeting a header element is application specific. It will
probably be used to target header elements to nodes performing an application-specific
function, such as manager or accounting.
It is possible for a receiver to fill more than one role. For instance, an element could
be targeted at a receiver based on a URL and based on its role as the next receiver.
The creator of the header element can specify that the targeted receiver must process
the header or whether it is acceptable for the targeted receiver to ignore the header element.
If the targeted receiver must process the header, it is said that the receiver must
understand the header. If there is a requirement to understand the element but the
receiver does not understand it, the receiver must stop all processing of the message
and return a SOAP fault code. By marking a header as must understand, the creator
can force a receiver to process the header. This is useful for making sure that security-related
information is properly processed.
Processing order
SOAP prescribes an order for processing the SOAP-specific parts of a message. This
description follows the SOAP version 1.2 Part 1: Messaging Framework (W3C 2002e). Processing
of the SOAP message must be performed as though it were done in the following
order. First, the receiver must decide what roles it will play. Is it only the next
receiver or is it also the ultimate receiver? The node can use information contained in
headers or the body to make the decision.
Next, the node must identify header elements targeted at it and that it must understand
and decide whether it can process these blocks. If it cannot, all processing must
end and a SOAP fault must be generated. For the ultimate receiver, processing of the body
should not be considered at this step in deciding whether to generate a fault.
If all mandatory headers can be processed, the node should process the headers and,
in the case of the ultimate receiver, process the message body. The node can choose to
ignore header elements that are not mandatory for it to process. Other faults may be
generated during this phase.
Finally, if the recipient is an intermediary, it must remove header elements targeted
at it, insert any new header elements needed, and pass the message on to the next
receiver with the body unmodified.
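The four steps above can be sketched in code. The following is a minimal illustration, not an implementation of the specification: the HeaderBlock and Message classes, the process function, and the short role names are all hypothetical, and the node's roles and the set of header names it understands are simply passed in as arguments.

```python
from dataclasses import dataclass, field

# Short stand-ins for the predefined roles (the real values are URIs).
NEXT = "next"
ULTIMATE = "ultimateReceiver"
NONE = "none"


class SoapFault(Exception):
    """Raised when a mandatory header block cannot be processed."""


@dataclass
class HeaderBlock:
    name: str
    role: str = ULTIMATE          # no role given means the ultimate receiver
    must_understand: bool = False


@dataclass
class Message:
    headers: list = field(default_factory=list)
    body: str = ""


def process(message, roles, understood, new_headers=()):
    """Apply the processing steps for a node acting in `roles`."""
    # Step 1 (deciding the roles) happened before the call.
    targeted = [h for h in message.headers
                if h.role != NONE and h.role in roles]
    # Step 2: every mandatory targeted block must be understood, else fault.
    for h in targeted:
        if h.must_understand and h.name not in understood:
            raise SoapFault(f"mandatory header not understood: {h.name}")
    # Step 3: process understood blocks; optional unknown blocks are ignored.
    processed = [h.name for h in targeted if h.name in understood]
    if ULTIMATE in roles:
        return processed, None    # the ultimate receiver consumes the body
    # Step 4: an intermediary drops the blocks targeted at it, adds any new
    # ones, and forwards the body unchanged.
    kept = [h for h in message.headers if h not in targeted]
    forwarded = Message(headers=kept + list(new_headers), body=message.body)
    return processed, forwarded
```

Calling process with roles {NEXT} models an intermediary, which gets back a message to forward; adding ULTIMATE to the roles models the final node, for which nothing is forwarded.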
Open items
After this description of SOAP message processing, you may be curious to know:
■■ How does a receiver know what role it is playing? The recipient of a message is
always the next receiver, but is it also the ultimate receiver?
■■ How does a receiver decide what order it is going to use to process the headers?
■■ How does a node know who the next receiver is so that the message can be
routed to it?
These are all very good questions, but the SOAP specification does not answer them.
These decisions can be determined using some algorithm programmed into the application,
or determined by some other method that is outside the scope of SOAP.
Once these decisions have been made, instructions that reflect the answers can be
contained in the headers of the message itself. For instance, the originator of the message
can include routing information and more detailed processing instructions in the
header. Or each node can insert instructions for the next.
Message Format
The basic minimal form of a SOAP message is shown in the XML document below. A
data encoding using only built-in types and no additional definitions or declarations is
recommended in the specification. This minimal schema allows SOAP message validation
without XML Schema documents. However, application-specific XML schemas
are allowed, which may require additional validation. DTDs are explicitly disallowed.
Each SOAP message is identified as an XML 1.0 document that has one element with
the local name envelope. It is qualified with the namespace
http://www.w3.org/2002/06/soap-envelope. Besides qualifying the namespace as a SOAP namespace, the
URL identifies the version of SOAP used. In this discussion, we use the June 2002 version
of SOAP 1.2. Attributes are also qualified by the soap-envelope namespace. The
envelope has child elements of an optional header and a required body that we will
describe later.
...
...
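As a rough illustration of the envelope structure just described, the sketch below builds an envelope with an optional header and a required body using Python's standard xml.etree.ElementTree module. The function name build_envelope and the env prefix are our own choices, and the capitalized local names Envelope, Header, and Body are an assumption about the spelling used in the W3C drafts; the namespace is the one quoted in the text.

```python
import xml.etree.ElementTree as ET

# Namespace quoted in the text for the June 2002 draft of SOAP 1.2.
ENV_NS = "http://www.w3.org/2002/06/soap-envelope"


def build_envelope(body_children=(), header_children=()):
    """Return an envelope element with an optional header and a required body."""
    ET.register_namespace("env", ENV_NS)   # prefix choice is arbitrary
    envelope = ET.Element(f"{{{ENV_NS}}}Envelope")
    if header_children:
        # The header is optional, so it is created only when needed.
        header = ET.SubElement(envelope, f"{{{ENV_NS}}}Header")
        header.extend(header_children)
    # The body is required, even when it has no children.
    body = ET.SubElement(envelope, f"{{{ENV_NS}}}Body")
    body.extend(body_children)
    return envelope
```

A caller would pass application-defined elements, each qualified by its own namespace, as the body children, and serialize the result with ET.tostring.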
Beyond what we have just discussed, there are no required elements within the
SOAP envelope that convey the meaning or intent of the message. There is no requirement
to include the identity of the sender or the receiver, the time or date the message
was created, or a message title. It is expected that each application will define these elements,
if they are required.
While the SOAP specification describes how an RPC can be represented in a SOAP
message, there is no requirement to use the representation described. And, even if the
encoding is used, there is no indicator in the message itself that the message body represents
an RPC. With the exception of guidance on how to encode arguments to an
RPC, the receiver is left to determine how to interpret the contents of the message. It is
expected that the receiver does this, in part, through the use and understanding of
namespaces that associate elements and attributes with the application implemented
by the receiver.
SOAP Message Header
A SOAP Message Header, shown in a modified version of the message from above, is
an optional part of a SOAP message. Its local name is header, and it is qualified using
the same namespace as the envelope, http://www.w3.org/2002/06/soap-envelope. The
header can contain zero or more namespace-qualified child elements. Two attributes,
role and mustUnderstand, can be associated with child elements of the header. In the
example, hdr1 is qualified in the www.widgets.com/logging namespace.
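[Listing not recoverable from the source. The surviving fragments show header element hdr1 carrying a role attribute whose value ends in role/next and the attribute sec:mustUnderstand=”true”.]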
Unlike the message’s body, which may not be modified, the message’s header is a
dynamic part of the message. Intermediaries are required to delete header elements
targeted at them and can add header elements as needed. Adding the same header
back in that was deleted is acceptable.
Role
SOAP header elements are targeted at SOAP nodes. A node performs some function
in processing or routing the message. The value of the SOAP role attribute can be
designated explicitly via a URI or relatively via three predefined values, next, ultimateReceiver,
or none. These relative values correspond to the roles described previously
in the section on SOAP message processing. That is, if the header is targeted at
next, then the next receiver processes the header. If the header is targeted at the ultimateReceiver,
then the ultimate receiver processes the element. Finally, if none is the role
targeted, no receiver processes the element. If no role attribute is specified, the default
is UltimateReceiver, the ultimate receiver. In the example above, the header is targeted
at the next recipient. The namespace of the header hints that the header is targeted at a
logging intermediary that will log the order before it goes to the seller.
Each header element will be processed by at most one role. However, nodes playing
other roles may examine headers not targeted at them. If the node is an intermediary,
it must delete from the message any header elements targeted at it and may add other
header elements for subsequent receivers before passing it on. It is not considered a
fault if the ultimate receiver receives the message and there are header elements that
are not targeted at it. A receiver must decide for itself whether it is the next receiver or
the ultimate receiver.
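The targeting rules in this section can be expressed as a short selection routine. This is a sketch under stated assumptions: the function blocks_for is hypothetical, the role URIs are formed by appending /role/next, /role/none, and /role/ultimateReceiver to the envelope namespace (our reading of the drafts), and the local name Header is assumed to be capitalized.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2002/06/soap-envelope"
# Assumed spelling of the three predefined role URIs.
ROLE_NEXT = ENV_NS + "/role/next"
ROLE_NONE = ENV_NS + "/role/none"
ROLE_ULTIMATE = ENV_NS + "/role/ultimateReceiver"


def blocks_for(envelope_xml, my_roles):
    """Return the tags of header blocks targeted at a node playing `my_roles`."""
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{ENV_NS}}}Header")
    if header is None:
        return []                 # the header is optional
    selected = []
    for block in header:
        # A missing role attribute defaults to the ultimate receiver.
        role = block.get(f"{{{ENV_NS}}}role", ROLE_ULTIMATE)
        # Blocks targeted at none are never selected for processing.
        if role != ROLE_NONE and role in my_roles:
            selected.append(block.tag)
    return selected
```

An intermediary would call blocks_for with only ROLE_NEXT (plus any application-defined role URIs it plays), while the ultimate receiver would pass both ROLE_NEXT and ROLE_ULTIMATE.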
MustUnderstand
Besides identifying a header element as intended for a particular receiver, the creator
of a header element may designate that the targeted receiver mustUnderstand it. In
other words, the receiver must know what to do with the header. The receiving software
must understand the semantics of the names in the header element and be able to
process the element accordingly. The header in the previous example, hdr1, must be
understood by the recipient. If the header namespace is not known to it, the receiver
must stop processing the message. Ideally, the processing node should return a SOAP
fault to the requester. But, depending on the protocols used and the routing, there are
conditions where this is not possible.
SOAP Message Body
A message body must have the local name of body. It must be associated with the
http://www.w3.org/2002/06/soap-envelope namespace. Child elements are optional, and
multiple child elements are allowed. No body-specific attributes are defined. The message
body is targeted at the ultimate receiver, who must understand the body.
Remember that SOAP is a unidirectional protocol. It is often difficult to keep that in
mind. It is natural to think of SOAP as a request/response protocol. But, there is no
requirement to return a response for a message received. Still, message body child elements
have been defined that are the logical consequence of certain inputs. Because of
this, our discussion of the message body will be divided into request message body
elements and response message body elements. However, the reader should keep in
mind that the SOAP protocol regards communication in each direction as separate and
unrelated events. The options for returning a response to a SOAP
request are discussed in the section on protocol bindings.
Request message body elements
A SOAP request message body may contain zero or more child elements. If multiple
child elements are present they can represent a single unit of work, multiple units of
work, or some combination of work and data. Request body elements can be divided
into two categories, document type and RPC type. The distinction is subtle. There is
nothing that distinguishes an RPC message body from a document body.
Document body elements are analogous to paper documents. Most likely, they will
be forms that have an understood structure such as purchase orders, invoices, itineraries,
or prescriptions. In order for the document to be processed correctly, it is important
that the ultimate receiver be cognizant of the namespace that defines the elements
of the document.
RPC message bodies are XML-based remote procedure calls. SOAP Version 1.2 Part
2: Adjuncts (W3C 2002f) describes how to encode data structures used by programming
languages to convey parameters in procedure calls. SOAP does not mandate the use of
these encoding rules and acknowledges the possibility of using other encoding rules.
However, use of other encodings will adversely impact the interoperability of the RPC.
Two options exist for encoding the arguments of an RPC. First, the SOAP RPC invocation
can be a struct where the name of the struct corresponds to the procedure or
method name. Each input or in/out argument to the procedure is a child element
structure with a name and type corresponding to the name and type of the parameter
in the procedure signature. The second RPC encoding method is to encode each argument
as an element of an array. The name of the array corresponds to the name of the
procedure and the position in the array corresponds to the position in the argument
list. If problems occur, several RPC specific faults have been defined which will be
described later.
The following example invokes an RPC called buy. This RPC is in the form of a
structure and takes two arguments, the order and the shipInfo. Note that there is no
explicit indication that this is an RPC invocation.
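[Listing not recoverable from the source. The surviving fragments show a header attribute value ending in envelope/role/next and the attribute sec:mustUnderstand=”true”.]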
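The struct form of an RPC invocation can be sketched as follows. This is an illustration only: rpc_struct_call is a hypothetical helper, urn:example:purchasing is a made-up application namespace, and qualifying the argument elements with that namespace is our choice rather than something the text prescribes.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2002/06/soap-envelope"
APP_NS = "urn:example:purchasing"   # hypothetical application namespace


def rpc_struct_call(procedure, arguments):
    """Encode a call as a struct named after the procedure."""
    envelope = ET.Element(f"{{{ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{ENV_NS}}}Body")
    # The struct element takes the name of the procedure being invoked.
    call = ET.SubElement(body, f"{{{APP_NS}}}{procedure}")
    # Each argument becomes a child element named after the parameter.
    for name, value in arguments.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return envelope
```

For the buy example, rpc_struct_call("buy", {"order": ..., "shipInfo": ...}) yields a body whose single child is a buy element with order and shipInfo children. Nothing in the result marks it as an RPC; a document with the same shape would look identical.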
Response message body elements
The content of response message bodies can be documents, RPC responses, or a SOAP
fault. Just as a document can be received, a document can result from the receipt of a
document. For instance, a reservation request can result in the creation of an itinerary.
Using SOAP to transmit a document has already been described, so the discussion will
not be repeated here.
The response to an RPC can be a structure or an array. The name of the structure is
identical to the name of the procedure or method that is returning the information. If
the procedure or method returns a value, it must be named result, and it must be namespace
qualified with http://www.w3.org/2002/06/soap-rpc. Every other output or
input/output parameter must be represented by an element with a name corresponding
to the parameter name. If an array is used, the result must be the first element in the
array. The result element, if there is one, is followed by array elements for each out or
in/out parameter, in the order they are specified in the procedure signature. The following
example illustrates the response to the RPC invocation from the previous section.
For this response, there is no special header targeted at the recipient. A result is
returned indicating the status of the RPC invocation.
[Listing not recoverable from the source. The only surviving fragment is the value okay.]
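A struct-style RPC response along the lines described above might be assembled like this. The helper rpc_struct_response and the application namespace are hypothetical, and the RPC namespace string is an assumption about the exact spelling used in the June 2002 drafts.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2002/06/soap-envelope"
RPC_NS = "http://www.w3.org/2002/06/soap-rpc"   # assumed spelling of the RPC namespace
APP_NS = "urn:example:purchasing"               # hypothetical application namespace


def rpc_struct_response(procedure, result=None, outputs=None):
    """Encode a response as a struct named after the procedure."""
    envelope = ET.Element(f"{{{ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{ENV_NS}}}Body")
    response = ET.SubElement(body, f"{{{APP_NS}}}{procedure}")
    if result is not None:
        # The return value is always named result in the RPC namespace.
        ET.SubElement(response, f"{{{RPC_NS}}}result").text = str(result)
    # Out and in/out parameters follow, named after the parameters.
    for name, value in (outputs or {}).items():
        ET.SubElement(response, f"{{{APP_NS}}}{name}").text = str(value)
    return envelope
```

A response to the buy invocation that reports only a status would be rpc_struct_response("buy", result="okay").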
A SOAP output message may also contain a SOAP fault. SOAP faults are generated
in response to errors or to carry other status information. This is the only body child
element that is defined by SOAP. The element must have a local name of fault and a
namespace of http://www.w3.org/2002/06/soap-envelope. Only one fault element may
appear in the message body. Child elements of code and reason are required within the
fault element. Other child elements, node, role, and detail, are optional. Code is a
structure that consists of a value that designates the high-level fault and an optional
subcode that provides additional details on the fault. Reason is a human-readable representation
of the fault. Node identifies the SOAP node that encountered the fault. Role
identifies what role the node was operating in when the fault occurred. Finally, detail
carries application-specific fault information. SOAP defined faults are:
■■ Version mismatch.
■■ Inability to understand or process a mandatory header.
■■ A DTD was contained in the message.
■■ A data encoding was referenced that is not recognized by the processor.
■■ The message was incorrectly formatted or did not contain needed information.
■■ The message could not be processed for some reason other than that the message
was malformed or incomplete.
For SOAP RPC, additional fault codes have been defined. Fault codes can be
extended to handle application specific needs.
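The shape of a fault element can be sketched as below. This is a hedged illustration: build_fault is a hypothetical helper, the capitalized element names (Fault, Code, Value, Subcode, Reason, Node, Role) are our assumption about the drafts' spelling, and the code value passed in is just a placeholder string, since the exact fault code names are not given in the text.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://www.w3.org/2002/06/soap-envelope"


def build_fault(code_value, reason, subcode=None, node=None, role=None):
    """Return a fault element with the required code and reason children."""
    fault = ET.Element(f"{{{ENV_NS}}}Fault")
    # Code is a structure: a high-level value plus an optional subcode.
    code = ET.SubElement(fault, f"{{{ENV_NS}}}Code")
    ET.SubElement(code, f"{{{ENV_NS}}}Value").text = code_value
    if subcode is not None:
        sub = ET.SubElement(code, f"{{{ENV_NS}}}Subcode")
        ET.SubElement(sub, f"{{{ENV_NS}}}Value").text = subcode
    # Reason is the human-readable explanation.
    ET.SubElement(fault, f"{{{ENV_NS}}}Reason").text = reason
    # Node and role are optional and identify where the fault arose.
    if node is not None:
        ET.SubElement(fault, f"{{{ENV_NS}}}Node").text = node
    if role is not None:
        ET.SubElement(fault, f"{{{ENV_NS}}}Role").text = role
    return fault
```

The returned element would be placed as the only child of the message body. Application-specific detail is omitted here for brevity.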
SOAP Features
Key to SOAP’s future success is the ability to add capabilities to it and extend it. SOAP
features are abstract capabilities related to the exchange of messages between SOAP
nodes. These capabilities can include reliability, guaranteed delivery, and security.
If a feature is implemented within a SOAP node, the feature is implemented by
modifying the SOAP processing model. If the feature affects the interaction between
two successive nodes, the feature is implemented as part of the SOAP protocol binding.
One limitation of a protocol binding is that it relates two nodes connected by a single
transmission. End-to-end transmission may be implemented using different
protocols, requiring multiple transmissions. In these cases, the feature should be
expressed in SOAP header blocks and implemented by the processing model.
Features are expressed as modules or Message Exchange Patterns. Modules are
expressed as SOAP header blocks. The content and semantics of the header blocks
must be clearly and completely stated. In addition, if the operation of the module
affects the operation of other SOAP features, these effects must be identified.
A Message Exchange Pattern (MEP) is a template, defined in the SOAP specification
(W3C 2002f), used to describe the exchange of messages between SOAP nodes. A major
part of specifying a binding is to describe how a protocol is used to implement any
MEPs it claims to support. Two MEPs, Request-Response and Response, have been
defined so far. The request-response MEP is exactly what we’d expect and is used
for RPCs. The response MEP is the sending of a SOAP response after receiving a
non-SOAP request. The MEP describes actions from the point of view of both the
requesting and the responding nodes.
The MEP is a distributed state-based specification of a node’s operation. At any particular
point in a message exchange, a node is in a specific state. Upon receipt of an
input, sending an output, or the arrival of some other event, the node enters a new
state and undertakes some processing.
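To make the state-based view concrete, here is a toy state machine for the requesting node in a request-response exchange. The state and event names are invented for this sketch and are not the ones defined in the specification; only the idea of moving from state to state on each event is taken from the text.

```python
# Hypothetical states and events for the requesting node.
TRANSITIONS = {
    ("init", "send_request"): "requesting",
    ("requesting", "request_sent"): "waiting",
    ("requesting", "transport_error"): "fail",
    ("waiting", "response_received"): "success",
    ("waiting", "transport_error"): "fail",
}


def advance(state, event):
    """Return the state entered when `event` occurs in `state`."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state!r}") from None


def run(events, state="init"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = advance(state, event)
    return state
```

A binding specification plays the role of the TRANSITIONS table: it says which protocol-level events move each node from one state to the next.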
HTTP Binding
Many underlying protocols can be used to transmit SOAP messages. The selected
underlying protocol may also provide additional features such as assured delivery,
correlation of a response to a request, or error correction and detection that enhance
SOAP. In addition, the underlying protocol may support patterns of message exchange
that are more complex than the simple one-way exchange specified by SOAP.
A SOAP protocol binding describes how an underlying transmission protocol is
used to transmit the SOAP message. A binding framework is used as a formal method
to describe the relationship between SOAP and its underlying transmission protocol. It
describes the operation of one node as it exchanges and processes a single message.
Other functionality supported by the binding is also described in the framework.
SOAP defines a default HTTP binding (W3C 2002f). Unless otherwise agreed to,
SOAP over HTTP is transmitted using this binding. The binding supports the request-response
and Response MEPs and specifies how HTTP is used to implement each pattern.
For the request-response MEP, the HTTP protocol binding describes how requests
are transmitted using HTTP by the requesting node and how the responses are sent in
the responding state at the responding node. SOAP request messages are sent using
HTTP POST requests. The HTTP URL identifies the target node as well as the application
that receives the message. The SOAP message is carried as the body of an HTTP
POST. The HTTP content-type header must be application/soap. The corresponding
response is returned using the HTTP response. This provides a natural way to correlate
the SOAP request with its response.
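The request side of this exchange can be sketched with Python's standard urllib.request module. The helper soap_post_request is hypothetical, and the content type simply follows the value given in the text; the sketch only constructs the request object and does not send anything.

```python
import urllib.request


def soap_post_request(url, envelope_xml, content_type="application/soap"):
    """Build (but do not send) an HTTP POST carrying a SOAP message."""
    # The serialized envelope becomes the body of the POST.
    data = envelope_xml.encode("utf-8")
    return urllib.request.Request(
        url,                                   # identifies the target node and application
        data=data,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would transmit it, and the HTTP response to that same request would carry the SOAP response, which is what gives the correlation described above.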
In the HTTP binding, the SOAP response message is sent in the response to an HTTP
request. For the request-response MEP, the SOAP request message is sent in an HTTP
POST request. For the SOAP response MEP, the request is transmitted as an HTTP GET
request. The HTTP binding only supports this MEP to request information. When used
in this way, the interaction will be indistinguishable from conventional HTTP information
retrieval. The MEP can only be used when there are no intermediaries between the
initial sender and the ultimate receiver. The information retrieved must be identified
by the URL alone because there is no SOAP message envelope to transmit additional
identification to the service provider.
SOAP Usage Scenarios
SOAP is a very simple protocol, but this simplicity supports many kinds of interactions,
some of them very complex. To illustrate the variety of ways that SOAP can be
used, the W3C XML Protocol Working Group has sponsored the creation of SOAP Version
1.2 Usage Scenarios (W3C 2002g).
SOAP Usage Scenarios span the basic, one-way SOAP message transmission to
request/response to intermediaries. The scenarios also cover the provision of features
such as caching, routing, and quality of service. Familiarity with these scenarios gives
more appreciation of the ways that SOAP can be used.
Universal Description Discovery and Integration
One perceived obstacle to widespread, easy access to Web Services is limited ability to
locate suitable Web Services. If an enterprise needs a service that it doesn’t already use,
how does it discover providers that offer the service? Today, enterprises make use of
various directories to identify a vendor or products or services of interest. The directories
offered by the phone company are an example of one such type of directory, but
industry-specific directories are also possible. To provide information on Web Services
available over the Internet, a comparable type of Internet facility has been conceived.
A consortium of companies, including Ariba, IBM, and Microsoft, began developing
the concept of an Internet business directory. The result is the UDDI Project. UDDI continues
to be a collaborative effort of concerned businesses. Unlike the topics that have
been discussed so far, UDDI is more than a specification or standard. It encompasses
an infrastructure that implements the standard and allows Internet-wide, all-inclusive
search and discovery of Web Services.
UDDI includes a structured way to describe a business, the services that are offered
by the business, and the programmatic interface to the services. Data is organized so
that a business may offer multiple services, and a service (which may have been developed
by a separate organization) may be offered by more than one business.
UDDI is a Web accessible directory and is built on SOAP over HTTP. A UDDI registry
is basically a Web Service. Two sets of SOAP interfaces have been defined. One set
of interfaces for potential subscribers supports searching for services or direct retrieval
of details about known services of interest. While UDDI is built on SOAP, it should be
pointed out that the services described in the directory are not required to be SOAP
services. The directory’s discovery services can also be used to mitigate problems that
occur during runtime access of the registered Web Service. If a service is not accessible
at a previously published location, the registry can be updated to refer to a location
where the service can be accessed. Service subscribers can then update their location
caches. The second set of interfaces is for use by service providers and supports saving
descriptions, deleting descriptions, and security for access to these services.
The infrastructure conceived by the UDDI Project is a single, distributed network of
directory operators called the UDDI Business Registry. Business and service descriptions
published by the Business Registry are intended to be publicly available to anyone
without restriction. Publishing and deleting information are subject to authorization
checks. Publishing a description at one node results in the description being propagated
to and available at all nodes. IBM, Microsoft, SAP, and HP operate nodes.
An alternative to public business registries is the private registry, which makes Web Services
known to a community of potential subscribers. The community can be based on
a common line of business, such as building construction or manufacturing lawn furniture,
or the community could be a single company. Private registries cater to subscribers
who have common interests and needs. Unlike the business registry, access to
a private registry may not be open to everyone, and controlling access to the information
becomes important.
A business registry contains a variety of company-specific data so that a potential
subscriber can decide whether it wants to do business with the service provider and, if
it does, what must be done to use the service. Besides the name of the company, the
registry can include other identifying information, such as tax number, a text description,
and contact information. Industry segment or business categorization descriptors
support use of the registry for searches based on industry. Potential subscribers can
locate companies that offer the type of services they need. Finally, the registry can contain
technical and programmatic descriptions of the Web Services offered by the company
so that programmers have the information they need to interface with the Web
Services offered.
Five structures have been defined for UDDI entries. They are businessEntity, businessService,
bindingTemplate, tModel, and publisherAssertion. The diagram in Figure 2.4,
taken from UDDI Version 2.0 Data Structure Reference, UDDI Open Draft specification
8, June 2001, illustrates their relationship.
The businessEntity structure represents a business. The structure is made up of a Universally
Unique ID (UUID) that is assigned to each business entity and can also
include a business name, a description, and the contact information found in white pages.
In addition, it can carry identifier and category descriptors, which can be used to classify businesses
and the services they provide. Finally, the structure optionally includes one or more
businessService structures.
The businessService structure includes data about a service being offered by the business.
This structure contains a UUID that is assigned to each business service, an
optional text-based description of the service and category descriptors, and zero or
more binding templates.
The bindingTemplate structure identifies how and where a service can be accessed.
Each binding template is assigned a UUID and contains an address that can be used to
call a Web Service. This address can be a URL or an e-mail address. The tModelInstanceDetails
element of the binding template identifies a specific tModel that contains
the details of the interface used to access the Web Service. The bindingTemplate includes
zero or more tModels.
The tModel structure contains the technical specification of the Web Service interface.
It contains a UUID for the tModel, a name, and a description. tModels can contain
identifier and category descriptors.
The publisherAssertion provides a way for two businesses to assert a joint relationship.
For this to work, both businesses must agree to the assertion before it is published.
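The containment relationships among the first four structures can be sketched with plain data classes. This is a simplified model for illustration, not the UDDI schema: the class and field names are our own, keys are generated locally with Python's uuid module rather than assigned by a registry, and publisherAssertion is left out.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


def _new_key():
    # Stand-in for the UUID a registry would assign.
    return str(uuid.uuid4())


@dataclass
class TModel:
    name: str
    description: str = ""
    key: str = field(default_factory=_new_key)


@dataclass
class BindingTemplate:
    access_point: str                       # a URL or an e-mail address
    tmodel_keys: List[str] = field(default_factory=list)
    key: str = field(default_factory=_new_key)


@dataclass
class BusinessService:
    description: str = ""
    bindings: List[BindingTemplate] = field(default_factory=list)
    key: str = field(default_factory=_new_key)


@dataclass
class BusinessEntity:
    name: str
    description: str = ""
    services: List[BusinessService] = field(default_factory=list)
    key: str = field(default_factory=_new_key)


def access_points(entity, tmodel_key):
    """List the addresses of an entity's bindings that reference a tModel."""
    return [b.access_point
            for s in entity.services
            for b in s.bindings
            if tmodel_key in b.tmodel_keys]
```

Because a binding refers to a tModel by key rather than containing it, the same interface description can be shared by services offered by different businesses.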
While UDDI depends on SOAP for its API structure, the services listed in a UDDI
Registry need not be limited to SOAP-based services. Likewise, SOAP subscribers are
not limited to using UDDI registries to locate Web Services. Subscribers can learn of the
existence of a Web Service through word of mouth, from an advertisement, or by looking
up the desired service in a paper-based phone directory. Once the service has been
located, details of the service can be provided to the subscriber by email, on a floppy
disk, or in a manual. There is no tight coupling between SOAP and UDDI. UDDI is not
needed in order for SOAP to succeed. For all these reasons, adoption of UDDI is not
happening as quickly as its backers expected. As we will see in the next section, the
same holds true for the relationship between UDDI and WSDL.
WSDL
To ease the burden of developing SOAP code, a vendor standard for an XML-based
language to describe the SOAP interface has been developed. The initial Web Services
Description Language (WSDL) (Microsoft 2001) specification was a joint development of
Ariba, IBM, and Microsoft. The WSDL 1.1 specification was turned over to the W3C,
which published it as a note (W3C 2001b) in March 2001. The W3C Web Services
Description Working Group is now working on further development of the language.
Earlier, we discussed concerns about the verbosity of XML. WSDL compounds the problem:
a WSDL description is several times larger than the SOAP messages it describes.
Luckily, WSDL is usually only used during design and development
of Web Services applications. We should also note that even though WSDL is
text-based, human beings were not meant to comprehend WSDL. It is a machine-generated
and machine-processed markup language used with software development
tools. Finally, WSDL is its own markup language, not SOAP, so it is understandable if
a WSDL document looks unfamiliar to a reader who knows SOAP.
Since we don’t expect that human beings will have to dissect a WSDL specification,
we won’t go into the details of WSDL. Instead, we’ll discuss its structure and describe
how it specifies the interfaces to Web Services.
WSDL documents describe logical and concrete details of the Web Service. The logical
part of the WSDL document describes characteristics of Web Services that are
determined by the service developer and are valid regardless of the actual implementation.
The concrete part of the document describes aspects of the service that are
decided by the service provider. This supports the independent development of Web
Services that may be offered by different service providers. Figure 2.5 shows the parts
of a WSDL document.
To define an interface with WSDL, we begin by defining the types of data exchanged
across the interface. The types portion of a WSDL document declares the namespaces
and datatypes used in the Web Services messages that constitute the service. It defines
application-specific data types. Data is then organized into messages. In the case of
SOAP, message descriptions only apply to the body of the SOAP message. Headers are
defined elsewhere within the WSDL document. The portType defines the operations supported
by a logical endpoint and the messages sent or received. For instance, a SOAP
service provider receives a message and generates a response to the received message.
The messages received and sent in response are defined in the message portion of the
WSDL document.
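To make the logical part concrete, the following Python sketch parses a small hand-written WSDL 1.1 fragment and lists each operation with the messages it receives and sends. The Charge service, its example.com namespace, and the function name list_operations are invented for illustration; only the WSDL and XML Schema namespace URIs are taken from the specifications.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"w": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical logical (abstract) part of a WSDL 1.1 document.
ABSTRACT_WSDL = """\
<definitions name="Charge"
    targetNamespace="http://example.com/charge"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://example.com/charge"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <types>
    <xsd:schema targetNamespace="http://example.com/charge">
      <xsd:element name="amount" type="xsd:decimal"/>
      <xsd:element name="receipt" type="xsd:string"/>
    </xsd:schema>
  </types>
  <message name="ChargeRequest">
    <part name="body" element="tns:amount"/>
  </message>
  <message name="ChargeResponse">
    <part name="body" element="tns:receipt"/>
  </message>
  <portType name="ChargePortType">
    <operation name="Charge">
      <input message="tns:ChargeRequest"/>
      <output message="tns:ChargeResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return (portType, operation, input message, output message) tuples."""
    root = ET.fromstring(wsdl_text)
    operations = []
    for port_type in root.findall("w:portType", WSDL_NS):
        for op in port_type.findall("w:operation", WSDL_NS):
            inp = op.find("w:input", WSDL_NS)
            out = op.find("w:output", WSDL_NS)
            operations.append((
                port_type.get("name"),
                op.get("name"),
                inp.get("message") if inp is not None else None,
                out.get("message") if out is not None else None,
            ))
    return operations


for entry in list_operations(ABSTRACT_WSDL):
    print(entry)
```

Nothing in this fragment says which protocol carries the messages or where the service lives, which is exactly the separation the logical part is meant to preserve.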
Up to this point, no implementation-specific information should be specified. For
instance, the protocol used to transmit the messages, the encoding used for the data,
and the location of the actual ports that are the connection points should not have been
given. These features are regarded as differentiators for different Web Services
providers. The service provider rather than the service developer makes these choices.
The bindings, ports, and service portions of a WSDL document specify this information.
First, bindings are used to specify the underlying protocol used to transport the
messages in a portType (portTypes were previously defined in the logical portion of
the WSDL document). The binding also specifies the encoding for the messages that
are part of the operations in the portType. A port specifies the address at which the service
is available. Finally, a service specifies the ports at which the service is available.
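The way a service, its ports, and their bindings fit together can be sketched by following the references in a small example. The WSDL fragment below is hand-written for illustration, and its names (ChargeService, ChargePort, ChargeSoapBinding), its example.com address, and the function describe_endpoints are assumptions rather than part of any real service. The sketch also assumes that every port uses the SOAP binding.

```python
import xml.etree.ElementTree as ET

NS = {
    "w": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Hypothetical concrete part of a WSDL 1.1 document.
CONCRETE_WSDL = """\
<definitions name="Charge"
    targetNamespace="http://example.com/charge"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://example.com/charge">
  <binding name="ChargeSoapBinding" type="tns:ChargePortType">
    <soap:binding style="document"
        transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="Charge">
      <soap:operation soapAction="http://example.com/charge/Charge"/>
      <input><soap:body use="literal"/></input>
      <output><soap:body use="literal"/></output>
    </operation>
  </binding>
  <service name="ChargeService">
    <port name="ChargePort" binding="tns:ChargeSoapBinding">
      <soap:address location="http://example.com/services/charge"/>
    </port>
  </service>
</definitions>
"""


def local_name(qname):
    # Drop the namespace prefix from a value such as "tns:ChargePortType".
    return qname.split(":", 1)[-1]


def describe_endpoints(wsdl_text):
    """For each port, report its binding's portType, transport, and address."""
    root = ET.fromstring(wsdl_text)
    bindings = {b.get("name"): b for b in root.findall("w:binding", NS)}
    endpoints = []
    for service in root.findall("w:service", NS):
        for port in service.findall("w:port", NS):
            binding = bindings[local_name(port.get("binding"))]
            endpoints.append({
                "service": service.get("name"),
                "port": port.get("name"),
                "port_type": local_name(binding.get("type")),
                "transport": binding.find("soap:binding", NS).get("transport"),
                "address": port.find("soap:address", NS).get("location"),
            })
    return endpoints


for endpoint in describe_endpoints(CONCRETE_WSDL):
    print(endpoint)
```

A second provider offering the same portType would supply its own binding, port, and service elements with a different address, leaving the logical part untouched.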
WSDL is not tightly coupled to SOAP, and the interface WSDL describes can be
accessed via other protocols. Several bindings extend WSDL to account for differences
in underlying transport protocol. There is a SOAP binding, an HTTP GET or POST
binding without SOAP, and an SMTP binding. The SOAP binding describes how to
specify whether a message body is a document type or an RPC type. If the message is
an RPC, it describes how to identify arguments. The SOAP binding also includes the
definition of header elements and header fault elements.
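As a rough illustration of the document versus RPC distinction, the sketch below reads the style attribute from a hand-written SOAP binding. In WSDL 1.1 the style on an individual soap:operation overrides the style on the enclosing soap:binding, and "document" is assumed when neither is given. The binding, its two operations, the example.com URIs, and the function name operation_styles are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "w": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Hypothetical binding: RPC by default, one operation overridden to document.
BINDING_WSDL = """\
<definitions name="Orders"
    targetNamespace="http://example.com/orders"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://example.com/orders">
  <binding name="OrderSoapBinding" type="tns:OrderPortType">
    <soap:binding style="rpc"
        transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="Charge">
      <soap:operation soapAction="http://example.com/orders/Charge"/>
    </operation>
    <operation name="SubmitOrder">
      <soap:operation soapAction="http://example.com/orders/SubmitOrder"
          style="document"/>
    </operation>
  </binding>
</definitions>
"""


def operation_styles(wsdl_text):
    """Map each bound operation name to "rpc" or "document"."""
    root = ET.fromstring(wsdl_text)
    styles = {}
    for binding in root.findall("w:binding", NS):
        soap_binding = binding.find("soap:binding", NS)
        # The binding-level style is the default for its operations.
        default = "document"
        if soap_binding is not None:
            default = soap_binding.get("style", "document")
        for op in binding.findall("w:operation", NS):
            soap_op = op.find("soap:operation", NS)
            if soap_op is not None:
                styles[op.get("name")] = soap_op.get("style", default)
            else:
                styles[op.get("name")] = default
    return styles


print(operation_styles(BINDING_WSDL))
```

Here the Charge operation inherits the RPC style from the binding, while SubmitOrder is declared as a document exchange, mirroring the earlier contrast between charging a purchase and submitting a purchase order.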
Because Web Service descriptions allow independently specified components, there
is a lot of redundancy in a WSDL document. Operations reference messages. Bindings
reference operations and messages, and further define how the operations and messages
are transmitted. This redundancy, combined with the use of XML, is responsible for the
large size of a WSDL document compared to the actual SOAP messages it defines. This
is the price of modularity.
WSDL is loosely coupled with SOAP and UDDI. There is a SOAP binding for WSDL,
but there are also HTTP GET and POST bindings and an SMTP binding. SOAP interfaces
can be specified by other means. There is guidance for using WSDL to provide the tModel
and binding template of a UDDI entry, but UDDI could be used with other description
languages, and there are other ways to distribute WSDL interface specifications.
A SOAP service provider is not required to use WSDL. WSDL’s verbosity makes it
difficult for human beings to understand and is an impediment to its acceptance.
Developers who use WSDL are those using development tools that automatically generate
and consume WSDL interface descriptions.