IVOA Note 2009 August 5
Version 1.0 (2009 May 14)
Alasdair Allan, University of Exeter, UK
Robert B. Denny, DC-3 Dreams, SP, Mesa, AZ, USA
This document describes a simple TCP-based protocol for transporting VOEvent messages from authors, through brokers, to subscribers. The protocol has been in successful use on the VOEvent network for several years. Now that it has been proven in practice, the community needs a reference from which to implement it. This IVOA Note provides the information needed to implement the protocol and to communicate compatibly within the VOEvent network.
Revised in V1.1
Version 1.1 adds an optional <Result> sub-element (containing text) within the optional <Meta> element. This is intended to convey details on errors encountered if the Transport response is nak but may also be used for informational purposes in ack messages.
Status of This Document
This is an IVOA Note expressing suggestions from and opinions of the authors. It is intended to share best practices, possible approaches, or other perspectives on interoperability with the Virtual Observatory. It should not be referenced or otherwise interpreted as a standard specification.
A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.
Robert W. White, ex LANL, now at NREL
Phillip Warner, ex NOAO
Robert Seaman, NOAO
The VOEvent network has evolved around a relatively small set of specific requirements for dissemination of transient events. Development has focused on the VOEvent messages themselves, resulting in a standard for these messages. Meanwhile, the means for moving VOEvent messages around the network have evolved as a separate issue. Several protocols are in use, some of which were borrowed from other applications.
In order for VOEvent to grow successfully, a standard transport protocol for VOEvent messages is needed. The purpose of this protocol is to move a VOEvent message between a sender and a receiver. Refer to Figure 1 below. Senders include both VOEvent message authors who originate messages and VOEvent brokers which disseminate messages to subscribers. Receivers include both subscribers who use the VOEvent messages and VOEvent brokers which receive messages from Authors for dissemination to subscribers.
The protocol described herein is intentionally as simple as possible while still accomplishing the required task. There are some value-added VOEvent services being developed which will require more complex protocols. It is not the purpose of this protocol to compete with those being used for value-added services. This protocol is intended strictly for the transmission of messages from authors to brokers to subscribers. The VOEvent network should, at a minimum, provide a universal distribution service which supports destination filtering. High-traffic authors such as Pan-STARRS will require some source filtering to prevent flooding of the VOEvent network. This issue is beyond the scope of this document.
Figure 1. VOEvent network architecture showing node roles
As of summer 2009, the VOEvent network operates over a backbone consisting of four brokers which are interconnected. See Figure 2 below. It is expected that additional brokers will be added to the network, some of which will not operate as part of the backbone.
Figure 2. VOEvent Backbone (2009)
The Transport protocol operates over TCP connections, and relies on TCP's guaranteed error-free in-order delivery of data. The payload of the transport protocol messages is either VOEvent XML or Transport XML. The latter is used by the Transport protocol itself and is invisible to protocol clients.
All messages are sent over the TCP connection preceded by a 4-byte network-ordered count, followed immediately by the payload data. The 4-byte count is interpreted as a 32-bit integer equal to the number of payload bytes following the count bytes. The payload is considered an opaque collection of bytes at this level, but as described elsewhere herein, all messages are XML documents. No checksum or digest check data is included; the protocol relies on TCP’s guaranteed error-free delivery of data.
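The framing rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `frame` and `unframe` are hypothetical, and treating the count as an unsigned integer is an assumption, since the text only says "32-bit integer".

```python
import struct

# "!I" = network (big-endian) byte order, unsigned 32-bit integer.
# Treating the count as unsigned is an assumption of this sketch.
_COUNT = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix the opaque payload with its 4-byte length."""
    return _COUNT.pack(len(payload)) + payload


def unframe(buffer: bytes):
    """Split one complete message off the front of a receive buffer.

    Returns (payload, rest), or (None, buffer) if more bytes are needed.
    """
    if len(buffer) < _COUNT.size:
        return None, buffer
    (count,) = _COUNT.unpack_from(buffer)
    end = _COUNT.size + count
    if len(buffer) < end:
        return None, buffer
    return buffer[_COUNT.size:end], buffer[end:]
```

Because TCP is a byte stream, a single read may return less or more than one message; `unframe` is therefore written to work on an accumulating buffer rather than on one read at a time.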
The VOEvent network consists of three types of nodes (refer to Figure 1 above):
o Author
o Broker
o Subscriber
The flow of messages is over two types of connections:
o Author to broker
o Broker to subscriber
Each type of connection is discussed qualitatively below.
When an author wants to submit a VOEvent message to the network, it opens a TCP connection to a broker, sends the message, waits for a response from the broker then closes the TCP connection. The response from the broker is a Transport message.
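The author-side exchange just described (connect, send, wait for the response, close) might look like the following sketch. The names `recv_exact`, `exchange`, and `publish_event` are illustrative, the broker host and port are left as parameters because this Note does not fix them, and the 30-second timeout is an arbitrary assumption.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes early."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("connection closed before message completed")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def exchange(sock: socket.socket, voevent: bytes) -> bytes:
    """Send one framed VOEvent and return the payload of the framed reply."""
    sock.sendall(struct.pack("!I", len(voevent)) + voevent)
    (count,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, count)


def publish_event(host: str, port: int, voevent: bytes,
                  timeout: float = 30.0) -> bytes:
    """Open a connection, submit one VOEvent, return the reply, then close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, voevent)
```

The returned bytes are the broker's Transport message; the caller still has to inspect its role to learn whether the submission was accepted.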
When a subscriber wants to receive VOEvent message traffic, it opens a TCP connection to a broker. This connection is kept open continuously. When the broker receives a message, it relays a copy of the VOEvent message to all of the connected subscribers. Thus, a subscriber must continuously listen on the TCP connection and be prepared to receive VOEvent messages at any time, even when it is busy processing a previously received VOEvent message. When a subscriber receives a VOEvent message from its broker, it sends back a response. The response from the subscriber is a Transport message.
Traffic between brokers uses the preceding methods. Each broker is just a “subscriber” as far as every other broker is concerned. A broker that wishes to receive a feed from another broker should connect to that broker’s subscriber port. No special protocol features are needed.
All connections over which a broker sends VOEvent messages are kept open continuously. Basic TCP does not provide any dead-peer indication. Furthermore, NAT proxies and firewalls might sever a TCP connection after some period of inactivity. This gives rise to the need for keep-alive messages. The Transport protocol (see Section 5) provides for keep-alive. The broker periodically sends a Transport iamalive message, to which the subscriber replies with a copy of that message plus some optional identification information.
At both ends of the continuous connection, the node expects to receive either an iamalive message or the response to its own iamalive message. If neither is seen, the node assumes that the connection has been lost or the peer is dead. The node that is responsible for opening the connection then tries to re-open it. It keeps trying periodically "forever", with a geometric back-off algorithm, until it once again has an open connection.
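One possible shape for the geometric back-off is sketched below. The initial delay, growth factor, and ceiling are assumptions chosen for illustration; this Note does not prescribe values. The `connect` argument stands for whatever callable opens the TCP connection and raises `OSError` on failure.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(initial: float = 1.0, factor: float = 2.0,
                   maximum: float = 600.0) -> Iterator[float]:
    """Yield an endless geometric series of delays, capped at `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def reconnect(connect: Callable[[], T],
              sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `connect` until it succeeds, waiting longer after each failure."""
    for delay in backoff_delays():
        try:
            return connect()
        except OSError:
            sleep(delay)
    raise AssertionError("unreachable: backoff_delays() never ends")
```

Capping the delay keeps a long outage from pushing the retry interval so high that the node takes hours to notice the broker has returned.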
Transport messages are XML documents. There are three Transport messages:
o Iamalive
o Iamalive response
o Response to VOEvent message (ack/nak)
All three Transport messages have the same general syntax, and are defined by the Transport schema (see Appendix A Transport Schema). The role attribute of the root <Transport> element distinguishes between connection maintenance messages (iamalive) and VOEvent message receipt responses (ack or nak).
The schema is given in Appendix A Transport Schema.
The iamalive message is indicated by role="iamalive". The <Origin> element contains the IVORN of the broker which is managing the connection. The <TimeStamp> element contains the date/time (UTC) at which the message is sent.
Figure 3. Sample iamalive message
The iamalive response is an extension of the initial iamalive message. It also has a role of iamalive. It must include an additional <Response> element containing the IVORN of the subscriber. It may include a <Meta> element with <Param> sub-elements which give additional information about the subscriber or any other relevant information. <Param> elements have no content and must contain name and value attributes. The names and values may be any string. The <TimeStamp> element contains the date/time (UTC) at which the response is sent.
Figure 4. Sample iamalive response message
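As an illustration of the two keep-alive messages described above, the following sketch builds them with Python's standard XML library. Several details are assumptions rather than statements of this Note: the namespace and schema declarations on the root element are omitted, the element order is a guess, and the timestamp is written as ISO 8601 with a trailing "Z". The schema in Appendix A is authoritative on all three. The function names and example IVORNs are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def utc_now() -> str:
    # ISO 8601 with a trailing "Z" is an assumption; Appendix A governs.
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def make_iamalive(broker_ivorn: str) -> bytes:
    """Build the broker's keep-alive message."""
    root = ET.Element("Transport", {"role": "iamalive"})
    ET.SubElement(root, "Origin").text = broker_ivorn
    ET.SubElement(root, "TimeStamp").text = utc_now()
    return ET.tostring(root, encoding="utf-8")


def make_iamalive_response(iamalive: bytes, subscriber_ivorn: str,
                           params=None) -> bytes:
    """Build the subscriber's reply to a received keep-alive message."""
    received = ET.fromstring(iamalive)
    if received.get("role") != "iamalive":
        raise ValueError("not an iamalive message")
    root = ET.Element("Transport", {"role": "iamalive"})
    # Echo the broker's IVORN, then identify the subscriber.
    ET.SubElement(root, "Origin").text = received.findtext("Origin")
    ET.SubElement(root, "Response").text = subscriber_ivorn
    ET.SubElement(root, "TimeStamp").text = utc_now()
    if params:
        meta = ET.SubElement(root, "Meta")
        for name, value in params.items():
            # <Param> elements are empty; name and value are attributes.
            ET.SubElement(meta, "Param", {"name": name, "value": value})
    return ET.tostring(root, encoding="utf-8")
```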
The VOEvent message receipt response is similar to the iamalive response except the role is either ack or nak, the <Origin> is the IVORN of the just-received VOEvent message, and an optional <Result> element may accompany the <Param> elements. <Result> may contain any string, and it is recommended that it contain a human-readable error message if role is nak. The <TimeStamp> element contains the date/time (UTC) at which the response is sent.
Figure 5. Sample VOEvent message receipt response (ack)
Figure 6. Sample VOEvent message receipt response (nak)
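A receipt response can be produced and interpreted along the following lines. As before, this is a sketch under assumptions: namespace declarations are omitted, the element order and timestamp format are guesses to be checked against Appendix A, and `make_receipt` and `read_receipt` are hypothetical names. The <Result> element is placed inside <Meta>, as stated in the V1.1 revision note.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def make_receipt(accepted: bool, voevent_ivorn: str, responder_ivorn: str,
                 result: str = None) -> bytes:
    """Build an ack (accepted) or nak (rejected) for a received VOEvent."""
    root = ET.Element("Transport", {"role": "ack" if accepted else "nak"})
    ET.SubElement(root, "Origin").text = voevent_ivorn
    ET.SubElement(root, "Response").text = responder_ivorn
    # Timestamp format (ISO 8601, trailing "Z") is an assumption.
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    ET.SubElement(root, "TimeStamp").text = stamp
    if result is not None:
        meta = ET.SubElement(root, "Meta")
        ET.SubElement(meta, "Result").text = result
    return ET.tostring(root, encoding="utf-8")


def read_receipt(payload: bytes):
    """Return (accepted, result_text) for an ack/nak Transport message."""
    root = ET.fromstring(payload)
    role = root.get("role")
    if role not in ("ack", "nak"):
        raise ValueError("not a receipt response: role=%r" % role)
    return role == "ack", root.findtext("Meta/Result")
```

A sender would typically treat a nak as an error and surface the <Result> text, when present, to the operator.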
This section describes the operation and sequencing of Transport for each end of a connection between an author and a broker, as well as between a broker and a subscriber. See section 3 above for a qualitative discussion of the protocol from the viewpoint of each entity. In the sections below, the “message” is the combination of the byte count and payload as described in section 2.1 above.
The protocol diagrams in the following sections depict layers between which are interfaces. Figure 7 below shows the overall architecture of communication between authors, brokers, and subscribers. Each entity (author, broker, and subscriber) uses the Transport layer to send and/or receive VOEvent messages. The protocol diagrams illustrate this in detail.
Figure 7. Layered architecture
The author client calls a single function (e.g. PublishEvent(message)). The Transport layer attempts to send the message to the broker as shown in Figure 8 below. If successful, the function returns normally. It is the responsibility of the author client to apply any digital signature (see section 7.2) to the message before sending it to the Transport layer. It is the responsibility of the Transport layer to prepend the byte count (see section 2.1) to the message before sending to the broker, and to interpret the ack/nak message (see section 5.3) from the broker.
Figure 8. Transport protocol at an author node
The broker responds to two events from the Transport layer, as shown in Figure 9. One event indicates that a request for a TCP connection has arrived, and the broker must respond with accept or reject status. This is where the broker can provide IP white list access control (see section 7.1). The other event indicates that a message has been received, and the broker must respond with accept or reject status. This is where the broker can validate the XML and check any digital signature (see section 7.2). It is the responsibility of the Transport layer to remove the byte count (see section 2.1) from the incoming message, and to create the ack/nak message (see section 5.3).
Figure 9. Transport protocol at broker, receiving from author
See Figure 10. Subscribers connect to the broker and leave the connection open “forever”. The broker responds to an event from the Transport layer indicating that a subscriber wants to open a TCP connection, and the broker must respond with accept or reject status. This is where the broker can provide IP white list access control (see section 7.1). If accepted, the broker would presumably add this subscriber to its distribution list.
See section 4 Connection Maintenance. Once the subscriber has successfully connected to the broker, the broker’s Transport layer begins sending periodic iamalive messages to the subscriber, who replies with an iamalive reply (see sections 5.1 and 5.2). If the subscriber’s iamalive reply messages stop arriving, the Transport layer assumes that the subscriber is dead or gone, and closes the TCP connection. The iamalive process continues until the subscriber closes the TCP connection or the broker’s Transport layer stops receiving iamalive replies from the subscriber.
To send a message to the subscriber, the broker calls a single function (e.g. SendEvent(message)). The Transport layer attempts to send the message to the subscriber. If successful, the function returns normally. It is the responsibility of the Transport layer to prepend the byte count (see section 2.1) to the message before sending to the subscriber, and to interpret the ack/nak message (see section 5.3) from the subscriber. If there is an error (nak received, dead-peer, or subscriber closed the connection), the Transport layer returns or raises an error. Presumably, this would result in the broker removing the subscriber from its distribution list.
Figure 10. Transport protocol at broker, sending to subscriber
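The broker-side bookkeeping suggested above (drop a subscriber when sending to it fails) could be arranged as in this sketch. `TransportError` and `distribute` are hypothetical names, and `send_event` stands in for the Transport layer's SendEvent function, assumed here to raise on a nak, a dead peer, or a closed connection.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")


class TransportError(Exception):
    """Hypothetical error raised by the Transport layer when sending fails."""


def distribute(message: bytes, subscribers: List[S],
               send_event: Callable[[S, bytes], None]) -> List[S]:
    """Send `message` to every subscriber; return those still in good standing."""
    remaining = []
    for subscriber in subscribers:
        try:
            send_event(subscriber, message)
        except TransportError:
            # nak, dead peer, or closed connection: drop from the list.
            continue
        remaining.append(subscriber)
    return remaining
```

Returning a new list instead of mutating the input avoids modifying the distribution list while iterating over it.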
See Figure 11. Subscribers connect to the broker and leave the connection open “forever”. If the connection is successful, the subscriber will begin receiving (VOEvent) messages.
See section 4 Connection Maintenance. Once the subscriber has successfully connected to the broker, the subscriber’s Transport layer begins listening for periodic iamalive (see section 5.1) messages from the broker. For each iamalive message received, the Transport layer responds with an iamalive reply (see section 5.2). If the broker’s iamalive messages stop arriving, the subscriber’s Transport layer must attempt to re-connect to the broker. Reconnection attempts should probably use some sort of geometric back-off algorithm. If the subscriber’s Transport layer cannot re-establish the connection, it should probably raise/throw an error to the subscriber client.
The subscriber client responds to an event advising that a (VOEvent) message has arrived, and the subscriber client must respond with accept or reject status. This is where the subscriber can validate the XML and check any digital signature that may be present. It is the responsibility of the Transport layer to remove the byte count (see section 2.1) from the incoming message, and to create and send the ack/nak message (see section 5.3).
Figure 11. Transport protocol at subscriber
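On the subscriber side, each payload that arrives is either a keep-alive or a VOEvent, and each calls for a different reply. The sketch below shows one way to decide which reply is owed. It assumes that the root element of a VOEvent is named VOEvent and carries its identifier in an ivorn attribute, as in the VOEvent specification; the function names are hypothetical, and `accept_event` stands for the subscriber client's accept/reject decision.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def classify(payload: bytes):
    """Return ("iamalive", origin) or ("voevent", ivorn) for a payload."""
    root = ET.fromstring(payload)
    name = local_name(root.tag)
    if name == "Transport" and root.get("role") == "iamalive":
        origin = next((child.text for child in root
                       if local_name(child.tag) == "Origin"), None)
        return "iamalive", origin
    if name == "VOEvent":
        return "voevent", root.get("ivorn")
    raise ValueError("unexpected message: <%s>" % name)


def reply_for(payload: bytes, accept_event):
    """Return (role, origin) for the Transport reply this payload requires."""
    kind, origin = classify(payload)
    if kind == "iamalive":
        return "iamalive", origin
    # The subscriber client decides; the Transport layer builds the reply.
    return ("ack" if accept_event(payload) else "nak"), origin
```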
Clearly, unrestricted submission of VOEvent messages into the network poses a risk to subscribers. Rogue messages can cause unwanted activity at subscriber observatories. A “denial of observing” attack could overwhelm subscriber observatories with apparently interesting but bogus events, triggering observations which could preempt their real work. A “denial of service” attack could overwhelm a broker with messages to send to its connected subscribers and brokers.
In addition, there should be some way to control who can subscribe to VOEvents coming from brokers. Unwanted subscribers pose little risk apart from the potential of overwhelming brokers with connections requiring their service. Nonetheless, some form of access control should be provided.
The VOEvent architecture is such that access control is best provided by brokers. Since an author must connect to a broker to submit messages, and subscribers must connect to a broker to receive messages, the broker is the place to allow or deny access by allowing or denying these connections. The next two sections describe two ways to provide access control. A broker may choose to implement either or both.
The simplest form of access control is IP address white listing. This scheme requires the broker to know a priori the IP addresses or IP address ranges from which allowed authors and subscribers can connect.
If a connection request comes in from an IP address not on the white list, it is immediately denied by the broker. If the connection is allowed, and the connected entity tries to send a VOEvent message (the entity is trying to be an author), a second white list is consulted to see if the entity is an allowed author. If not, the message is refused with a nak response from the broker. Presumably, entities who are allowed subscribers would not be hostile (well at least not for long!), and therefore would pose negligible denial of service risk.
This simple but effective means of providing access control has been used successfully by (e.g.) GCN, thus establishing precedent for its use.
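The two-list scheme described in this section can be sketched with Python's standard `ipaddress` module. The class and function names are hypothetical, and the addresses in any example use should come from documentation ranges such as 192.0.2.0/24.

```python
import ipaddress


class WhiteList:
    """A set of allowed IPv4/IPv6 addresses or CIDR ranges."""

    def __init__(self, entries):
        # A bare address such as "192.0.2.7" becomes a single-host network.
        self._networks = [ipaddress.ip_network(entry, strict=False)
                          for entry in entries]

    def allows(self, address: str) -> bool:
        ip = ipaddress.ip_address(address)
        return any(ip in network for network in self._networks)


def classify_peer(address: str, connect_list: WhiteList,
                  author_list: WhiteList) -> str:
    """Return "reject", "subscriber", or "author" for a connecting address."""
    if not connect_list.allows(address):
        return "reject"          # deny the TCP connection outright
    if author_list.allows(address):
        return "author"          # may also submit VOEvent messages
    return "subscriber"          # any submitted message gets a nak
```

Keeping the author list separate from the connection list mirrors the text: a peer can be trusted to receive events without being trusted to inject them.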
As the VOEvent network grows in size, IP address white listing may become unwieldy to manage. The administrator of a broker is required to maintain its white list, which can grow large and become time-consuming to keep current. New subscribers, and more importantly authors, need to be vetted and added to the list(s). IP addresses, while relatively stable in static networks, can still change. And how does one know that an IP address is no longer in use?
If the VOEvent network is to accommodate large numbers of smaller educational and amateur observatories, there is another larger issue with white listing. Often, these smaller observatories are connected to the internet with consumer-class cable or DSL, and these connections are usually via dynamic IP addresses. Their IP addresses can change on a daily or at least weekly basis. This could pose an intractable task to the broker administrator.
An alternative to IP address white listing is the use of digital signatures on messages. A simple and transparent means to digitally sign XML messages has been presented in the IVOA Note A Proposal for Digital Signatures in VOEvent Messages. The signing system uses cross-platform open source software and does not require expensive X.509 digital certificates. Refer to the aforementioned document for details. Digital signatures for authentication are also covered in the IVOA Note Single-Sign-On Authentication for the IVO: introduction and description of principles, as well as Allen (2008). Note that the Proposal for Digital Signatures contains a much simpler infrastructure which provides virtually all of the needs expressed in the SSO document; it applies only to VOEvent (XML) messages and not to general access by users to computing resources.
A digital signature on messages coming to a broker from a connected entity (author or subscriber) can be used for access control. The broker needs to have a known genuine copy of the entity’s public key. If so, the broker can validate the signature on messages it receives, and thus have positive identification of the connected entity. Effectively, the public keys in the broker’s possession are a form of white list with which the broker can control access. With this access control scheme, the IP address of the connected entity is irrelevant.
Authors connect, send a VOEvent message, and then disconnect (see section 6.2). All that’s needed is for the author to sign the VOEvent message being submitted. The broker’s acceptance/rejection criteria include validating the signature. If that succeeds, the broker accepts the message and returns an ack. Otherwise, if the broker does not have the author’s public key or if validation fails, the message is discarded and the broker returns a nak. It is recommended that a <Result> element be included explaining that the signature validation failed. See section 5.3.
The benefits of having authors sign their VOEvent messages extend beyond broker access control. The presence of a signature on an author’s VOEvent message can be used by subscribers to guarantee the integrity of the message.
Subscribers do not send VOEvent messages to the broker (see section 6.5). To accommodate subscriber authentication using a signed message, the broker sends a Transport authenticate message to the subscriber immediately after the subscriber connects. The subscriber returns a digitally signed authenticate response, with which the broker authenticates the subscriber as before. If the authentication succeeds, the broker leaves the TCP connection open and begins to deliver VOEvent messages to the subscriber. If the authentication fails, the broker simply closes the TCP connection.
Note: For clarity, and because they are optional, the subscriber authentication messages just described are not included in the preceding sections describing the protocol. They are virtually identical to the iamalive and iamalive response messages (see sections 5.1 and 5.2) except for the role and the presence of a digital signature in the response. Samples are shown in Figure 12 and Figure 13 below.
The broker should include some way to prevent an endless cycle of connect-reject-close attempts from a given IP address (e.g. an IP black list with manual removal or an automatic tarpitting scheme). There is no need for the subscriber to digitally sign subsequent iamalive responses as long as the TCP connection remains open. If the connection is lost, forcing a reconnect, the authenticate exchange will occur as before.
Figure 12. Sample authenticate message
Figure 13. Sample authenticate response message with digital signature
Figure 14. Graphical Representation of Transport Schema
Figure 15. Transport Schema
 IVOA Note A Proposal for Digital Signatures in VOEvent Messages
 As defined by the Internet Protocol, also called “big endian” ordering. The Berkeley sockets API defines functions htonl() and ntohl() which convert (as needed) between the local processor-native byte ordering and the Internet Protocol standard.
 As a result, the format of the transport document is opaque to the transport layer. Therefore both ASCII and UTF-8 are equally supported.
 TCP does support a keep-alive service, but it is considered by some to be controversial and is not accessible from some socket APIs.
 The function may return a boolean indicating success/failure, or it may return nothing (void). In the latter case, the Transport layer must raise/throw an error condition if sending fails.
 Subscriber connections from one broker to another will be relatively small in number. Therefore, IP address white listing should suffice for these connections.