HL7 v2 Related Tools that use Java/XML Technology, SAX, and TRAX.

$Id$

Gunther Schadow, Regenstrief Institute.

CONTENTS

- HL7XMLReader    An HL7 v2 parser that produces XML/SAX events.

- HL7XMLWriter    An HL7 v2 builder that outputs traditional HL7 v2
                  from XML/SAX events.

- MLLPServer      A multi-threaded TCP/IP server that understands the
                  HL7 Minimal Lower Layer Protocol (MLLP) to handle
                  message streams using an XSLT transform.

- MLLPDriver      Implements the MLLP on a pair of Input/Output streams
                  that can be used for driving a client connection.

DETAIL

----------------------------------------------------------------------

The HL7XMLReader parses HL7 v2 into an XML format. It comes with a
growing number of XSLT transforms for common tasks and demos of what
one can do with this.

CLASS HL7XMLReader implements org.xml.sax.XMLReader

This is a very simple parser for HL7 messages that generates a very
simple XML format. This is NOT the same format as the HL7 v2xml
specification, for a number of reasons. The most important reason is
that the v2xml format cannot be generated from HL7 instances alone
without much help from the specification (e.g., it needs to know which
data type a certain field has, and it requires groups to be identified
as elements of their own).

[Example XML instance; the element markup did not survive, only its
text content:]

  SIMPLE CONTENT
  FIRST COMPONENT, SECOND COMPONENT, THIRD COMPONENT
  FIRST REPETITION, SECOND REPETITION, THIRD REPETITION
  FIRST COMPONENT, FIRST ITEM, SECOND COMPONENT
  ...

This is what I call "lazy structure", i.e., structural tags are only
used at the point where they are really needed. It is easy to use in
XSLT and XPath. The first node in a tag is the first field content.
The next node, if any, is a structural tag that tells you on which
structural level the first text node was. Since HL7 has no mixed
content models, there is never any ambiguity.

This follows the lazy spirit of HL7 v2.
It does so not because the author believes that this is a good way of
thinking about and handling information, but because the real world is
just that messy. After having done all a person can do to produce a
structure-anal HL7 parser (ProtoGen/HL7), the author has given up any
hope that HL7 v2.x use will ever get there.

This class behaves like an XML SAX parser, i.e., upon reading an HL7
message it generates SAX events. It is extremely simple and extremely
easy to use with standard XML tools in Java. One can simply run the
HL7 message through an XSLT transform. And this is really the main
purpose of this class: to open up HL7 v2.x messages, however ugly, to
the world of powerful XSLT transforms. This can be used to drive
message processors or just message transformers that end up emitting
the result of the transformation in HL7 v2 syntax.

Note also that there is no guarantee that the result is actually an
HL7 message. It could be a batch or a continuation of a preceding
message. That's why the top-level element isn't called "message" but
simply "hl7".

USAGE

You can invoke this parser in various ways according to the TRAX
specification, as this class implements the SAX XMLReader interface.
I recommend using Saxon v7 or higher as follows:

  $ saxon7 -x org.regenstrief.xhl7.HL7XMLReader test.hl7 deep-identity-transform.xsl

For testing and simple tasks the HL7XMLReader class can also be
invoked directly to simply output the XML with indentation and no
further transform.

  $ java org.regenstrief.xhl7.HL7XMLReader file:test.hl7

(Notice that the argument is a URL that must begin with a URL scheme
and a colon.)

TRANSFORMS

The lazy structure may not be good to use in all circumstances. Hence
there are two additional structure models, eager and normalized, with
transforms between them:

  lazy2eager.xsl
  eager2lazy.xsl
  eager2normalized.xsl
  normalized2eager.xsl

And finally there is a transform to produce HL7 in traditional
encoding format:
  lazy2traditional.xsl

----------------------------------------------------------------------

The MLLPServer is a single server program that can set up a number of
TCP/IP server threads, each of which serves one port and does so in a
multi-threaded fashion. The servers are configured from a single XML
configuration file such as the following:

[Example XML configuration not preserved in this copy.]

In this example configuration, four servers are defined that listen on
ports 7001, 7002, 7003, and 7005. They all use SAXON as their XSLT
engine. The reader tag names the XMLReader to use; here it selects the
HL7 v2 reader (see above). All action occurs in the XSLT transform.

Once a client makes a connection, a handler thread serves this
connection until the client closes it. The connection can typically
remain up for a long time with thousands of messages being
transmitted. The MLLP protocol delineates the boundaries of messages
sent by the client and regulates when the server can send a response.

Notice that each port can handle many connections (the backlog value
has nothing to do with limiting the number of established connections
to a port). It's a very common mistake in HL7-related TCP/IP
programming to assume that a port can only serve a single client. Not
here :-)

The error tag specifies a transform that is called if the normal
transformation fails, either because the input is invalid or because
of any other processing error signified by a thrown exception.

SSL Connections

SSL is supported in a very basic but workable way. If the server
configuration includes the element that names a socket factory class,
that javax.net.ServerSocketFactory is obtained through its
getDefault() method. Most likely you will specify
class="javax.net.ssl.SSLServerSocketFactory" here, as shown in the
example. You will have to set system properties to configure keystore
and truststore:
- javax.net.ssl.keyStore
- javax.net.ssl.keyStorePassword
- javax.net.ssl.trustStore
- javax.net.ssl.trustStorePassword

For example, start the server as follows:

  java -Djavax.net.ssl.keyStore=test.jks \
       -Djavax.net.ssl.keyStorePassword=changeit \
       -Djavax.net.ssl.trustStore=test.jks \
       -Djavax.net.ssl.trustStorePassword=changeit \
       org.regenstrief.xhl7.MLLPServer file:mllp.xml

and the test client as follows:

  java -Djavax.net.ssl.trustStore=test.jks \
       -Djavax.net.ssl.trustStorePassword=changeit \
       org.regenstrief.xhl7.MLLPClient -s localhost 7005

[For reference check out:
http://www-106.ibm.com/developerworks/java/library/j-customssl]

----------------------------------------------------------------------

The MLLPDriver handles the MLLP protocol. For example, given a
connected Socket (or any other pair of Input/OutputStreams, e.g.,
System.in/.out), the MLLPDriver is used as follows:

  MLLPDriver driver = new MLLPDriver(socket.getInputStream(),
                                     socket.getOutputStream(),
                                     true);

The MLLPDriver then provides a pair of MLLP-driven Input- and
OutputStreams (sort of proxy streams):

  driver.getInputStream();
  driver.getOutputStream();

The driver manages the MLLP half-duplex state, but does so pretty
transparently and without many annoying exceptions.

The boolean argument to the constructor specifies whether this driver
starts out as a sender or as a receiver. This argument is called
"clearToSend". Once the line is clear to send, the MLLP
start-of-transmission control is sent with the first write to the
output stream.

When the driver is in sending state, the first attempt to read from
the input stream terminates the sending state (the end-of-transmission
sequence is sent) and skips forward to immediately behind the next
start-of-transmission control character received from the peer.

When in receiving mode, the driver transparently buffers any output
sent while the MLLP state does not allow for writing because the peer
is not done writing.
Then, when the peer has finished writing, the buffered output is sent.
Once the driver is in sending state, all further output is sent
directly to the real output stream, without further buffering.

In order to terminate sending without immediately attempting to
receive data, one can explicitly call

  driver.turn();

to turn over the connection to the peer.

In order to find out whether a new message is available, one can call

  driver.hasMoreInput()

which blocks until more input is available or the stream is closed.
If the client closes the input, driver.hasMoreInput() returns false.
A closed input channel can also be detected with the predicate

  driver.inputClosed()

which returns true if the input is closed.

The fact that the peer is done sending a message is apparent from the
input stream, which keeps returning -1 (the EOF mark). Only after the
driver has sent something (or, alternatively, after driver.turn() has
been called without anything sent, resulting in an empty transmission
frame) will a subsequent read return actual data.

Before abandoning a client driver, do close it so that the connection
is closed and the server is freed:

  driver.close();
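
For readers who have not seen MLLP on the wire, here is a minimal
sketch of the framing that the start-of-transmission and
end-of-transmission controls refer to. It is not the MLLPDriver
implementation and does not model the half-duplex state; the class
MllpFraming and its methods are hypothetical names used for
illustration only. It assumes the usual MLLP control characters: <VT>
(0x0B) before a message and <FS><CR> (0x1C 0x0D) after it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of the xhl7 package: bare MLLP framing. */
final class MllpFraming {
    static final int START_BLOCK = 0x0B;      // <VT>, start of transmission
    static final int END_BLOCK = 0x1C;        // <FS>, end of transmission
    static final int CARRIAGE_RETURN = 0x0D;  // <CR>, follows <FS>

    /** Writes one message wrapped as <VT> message <FS><CR>. */
    static void writeFrame(OutputStream out, byte[] message) throws IOException {
        out.write(START_BLOCK);
        out.write(message);
        out.write(END_BLOCK);
        out.write(CARRIAGE_RETURN);
        out.flush();
    }

    /** Reads the next frame; returns null if the stream ends before a frame starts. */
    static byte[] readFrame(InputStream in) throws IOException {
        int b;
        // Skip anything that precedes the start-of-transmission character.
        while ((b = in.read()) != START_BLOCK) {
            if (b == -1) {
                return null;
            }
        }
        ByteArrayOutputStream message = new ByteArrayOutputStream();
        // Collect bytes up to, but not including, the end-of-transmission character.
        while ((b = in.read()) != END_BLOCK) {
            if (b == -1) {
                throw new java.io.EOFException("stream closed inside a frame");
            }
            message.write(b);
        }
        in.read(); // consume the <CR> that completes the end sequence
        return message.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Made-up two-segment message; HL7 v2 segments end with <CR>.
        byte[] hl7 = "MSH|^~\\&|SENDER\rPID|1\r".getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(wire, hl7);
        byte[] back = readFrame(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(new String(back, StandardCharsets.ISO_8859_1).replace('\r', '\n'));
    }
}
```

The sketch round-trips one message through an in-memory buffer. With a
real connection the same two methods would be given the socket's
streams, which is the bookkeeping the MLLPDriver proxy streams take
care of for you.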