HL7 v2 Related Tools that use Java/XML Technology, SAX, and TRAX.
$Id$
Gunther Schadow, Regenstrief Institute.
CONTENTS
- HL7XMLReader An HL7 v2 parser that produces XML/SAX events.
- HL7XMLWriter An HL7 v2 builder that outputs traditional HL7 v2 from
XML/SAX events.
- MLLPServer A multi-threaded TCP/IP server that understands the HL7
Minimal Lower Layer Protocol (MLLP) to handle message
streams using an XSLT transform.
- MLLPDriver Implements the MLLP on a pair of Input/Output streams
that can be used for driving a client connection.
DETAIL
----------------------------------------------------------------------
The HL7XMLReader
can parse HL7 v2 into an XML format. It comes with a growing number of
XSLT transforms for common tasks and demos of what one can do with this.
CLASS HL7XMLReader implements org.xml.sax.XMLReader
This is a very simple parser for HL7 messages that generates a very
simple XML format. This is NOT the same format as the HL7 v2xml
specification, for a number of reasons. The most important reason is
that the v2xml specification format cannot be generated from HL7
instances alone without much help from the specification (e.g., which
data type a certain field has), and it requires groups to be identified
as elements of their own.
SIMPLE CONTENT
FIRST COMPONENT
SECOND COMPONENT
THIRD COMPONENT
FIRST REPETITION
SECOND REPETITION
THIRD REPETITION
FIRST COMPONENT FIRST ITEM
SECOND COMPONENT
...
.........
This is what I call "lazy structure", i.e., structural tags are only
used at the point where they are really needed. It is easy to use in
XSLT and XPath. The first node in a tag is the first field
content. The next node, if any, is a structural tag that will tell you
on what structural level the first text node was. Since HL7 has no
mixed content models, there is never any ambiguity.
This follows the lazy spirit of HL7 v2. It does so not because the
author believes that that is a good way of thinking and handling
information, but because the real world is just that messy and after
having done all a person can do to produce a structure-anal HL7 parser
(ProtoGen/HL7) the author has given up any hope that HL7 v2.x use will
ever get there.
This class behaves like an XML SAX parser, i.e., upon reading an HL7
message it generates SAX events. It is extremely simple and extremely
easy to use with standard XML tools in Java. One can simply run the
HL7 message through an XSLT transform. And this is really the main
purpose of this class: to open up HL7 v2.x messages of any ugliness
to the world of powerful XSLT transforms. This can be used to drive
message processors or just message transformers that end up emitting
the result of the transformation in HL7 v2 syntax.
Note also that there is no guarantee that the result is actually an HL7
message. It could be a batch or a continuation of a preceding
message. That's why the top-level element isn't called "message" but
simply "hl7".
USAGE
You can invoke this parser in various ways according to the TRAX
specification, as this class implements the SAX XMLReader interface.
I recommend using Saxon v7 or higher, as follows:
$ saxon7 -x org.regenstrief.xhl7.HL7XMLReader test.hl7 deep-identity-transform.xsl
Now for testing and simple tasks the HL7XMLReader class can be
invoked directly to simply output the XML with indentation and
no further transform.
$ java org.regenstrief.xhl7.HL7XMLReader file:test.hl7
(Notice that the argument is a URL that must begin with a URL scheme
and colon.)
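The same can be done from Java code through TRAX. The following is only
a sketch: it wraps any XMLReader in a SAXSource and runs the identity
transform over its SAX events. So that it runs without xhl7 on the
classpath, the JDK's own XML parser stands in for the reader; swapping
in new org.regenstrief.xhl7.HL7XMLReader() assumes a public no-argument
constructor. The class name TraxReaderSketch, its methods, and the demo
input are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class TraxReaderSketch {

    /** Runs the identity transform over the SAX events produced by the reader. */
    static String toXml(XMLReader reader, InputSource input) {
        try {
            // newTransformer() without a stylesheet yields the identity transform;
            // pass a StreamSource for an .xsl file to apply a real stylesheet.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new SAXSource(reader, input), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    /** Stand-in reader: the JDK's XML parser. Any XMLReader can be used instead. */
    static XMLReader standInReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        // Arbitrary demo input, not the actual output format of HL7XMLReader.
        InputSource input = new InputSource(new StringReader("<hl7>demo</hl7>"));
        System.out.println(toXml(standInReader(), input));
    }
}
```

With the HL7 reader in place of the stand-in, the InputSource would point
at the HL7 file (for example new InputSource("file:test.hl7")) rather
than at XML text.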
TRANSFORMS
The lazy structure may not suit all circumstances. Hence there are
two additional structure models, eager and normalized, with
transforms to and from those.
lazy2eager.xsl
eager2lazy.xsl
eager2normalized.xsl
normalized2eager.xsl
And finally there is a transform to produce HL7 in traditional
encoding format.
lazy2traditional.xsl
----------------------------------------------------------------------
The MLLPServer
is a single server program that can set up a number of TCP/IP
servers, each of which serves one port in a multi-threaded
fashion.
The servers are configured from a single XML configuration file such
as the following:
In this example configuration, four servers are defined that
listen on ports 7001, 7002, 7003, and 7005. They all use SAXON as their
XSLT engine. The reader tag sets an XMLReader, which is used to set
the HL7 v2 reader (see above). All action occurs in the XSLT
transform.
Once a client makes a connection, a handler thread serves this
connection until the client closes the connection. The connection
can typically remain up for a long time with thousands of messages
being transmitted. The MLLP protocol delineates the boundary of
messages sent by the client and regulates when the server can send
a response.
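As an illustration of that framing, here is a minimal sketch, not the
code used by MLLPServer. It assumes the usual MLLP block characters: a
start block of 0x0B, and an end block of 0x1C followed by a carriage
return (0x0D). The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class MllpFramingSketch {
    static final byte START_BLOCK = 0x0B;      // <VT>
    static final byte END_BLOCK = 0x1C;        // <FS>
    static final byte CARRIAGE_RETURN = 0x0D;  // <CR>

    /** Wraps one message in an MLLP frame: <VT> message <FS><CR>. */
    static byte[] frame(byte[] message) {
        byte[] framed = new byte[message.length + 3];
        framed[0] = START_BLOCK;
        System.arraycopy(message, 0, framed, 1, message.length);
        framed[framed.length - 2] = END_BLOCK;
        framed[framed.length - 1] = CARRIAGE_RETURN;
        return framed;
    }

    /** Extracts all complete messages from a byte stream of MLLP frames. */
    static List<byte[]> unframe(byte[] stream) {
        List<byte[]> messages = new ArrayList<>();
        ByteArrayOutputStream current = null; // null while outside a frame
        for (int i = 0; i < stream.length; i++) {
            byte b = stream[i];
            if (current == null) {
                // Bytes between frames are ignored until a start block arrives.
                if (b == START_BLOCK) {
                    current = new ByteArrayOutputStream();
                }
            } else if (b == END_BLOCK && i + 1 < stream.length
                    && stream[i + 1] == CARRIAGE_RETURN) {
                messages.add(current.toByteArray());
                current = null;
                i++; // skip the carriage return that closes the frame
            } else {
                current.write(b);
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        byte[] message = "MSH|^~\\&|demo\r".getBytes(StandardCharsets.US_ASCII);
        List<byte[]> received = unframe(frame(message));
        System.out.println(received.size() + " message(s) recovered");
    }
}
```

Because the boundary is carried in the byte stream itself, one
connection can stay open for any number of messages.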
Notice that each port can handle many connections (the backlog
value has nothing to do with limiting the number of established
connections to a port). It's a very common mistake in HL7-related
TCP/IP programming to assume that a port can only serve a single
client. Not here :-)
The error tag specifies a transform that is called if the normal
transformation fails either because the input is invalid or
because of any other processing error signified by an exception
thrown.
SSL Connections
SSL is supported in a very basic but adequate way. If you specify
a socket factory element in the server
configuration, that javax.net.ServerSocketFactory is instantiated
with getDefault(). Most likely you will specify
class="javax.net.ssl.SSLServerSocketFactory" here, as shown
in the example. You will have to set system properties to configure
the keystore and truststore.
- javax.net.ssl.keyStore
- javax.net.ssl.keyStorePassword
- javax.net.ssl.trustStore
- javax.net.ssl.trustStorePassword
For example, start the server as follows:
java -Djavax.net.ssl.keyStore=test.jks \
-Djavax.net.ssl.keyStorePassword=changeit \
-Djavax.net.ssl.trustStore=test.jks \
-Djavax.net.ssl.trustStorePassword=changeit \
org.regenstrief.xhl7.MLLPServer file:mllp.xml
and the test client as follows:
java -Djavax.net.ssl.trustStore=test.jks \
-Djavax.net.ssl.trustStorePassword=changeit \
org.regenstrief.xhl7.MLLPClient -s localhost 7005
[For reference check out:
http://www-106.ibm.com/developerworks/java/library/j-customssl]
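What "instantiated with getDefault()" amounts to can be sketched as
follows. This is an assumption about the mechanism, not the server's
actual code: the class named in the configuration is loaded by name and
its static getDefault() method is called. The helper factoryFor is
hypothetical.

```java
import java.lang.reflect.Method;
import javax.net.ServerSocketFactory;

class SocketFactorySketch {

    /** Loads the named factory class and calls its static getDefault(). */
    static ServerSocketFactory factoryFor(String className) {
        try {
            Class<?> factoryClass = Class.forName(className);
            Method getDefault = factoryClass.getMethod("getDefault");
            // getDefault() is static, so no receiver object is needed.
            return (ServerSocketFactory) getDefault.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "not a usable socket factory: " + className, e);
        }
    }

    public static void main(String[] args) {
        // The default SSL factory reads the javax.net.ssl.* system
        // properties listed above when it is first used.
        ServerSocketFactory factory =
                factoryFor("javax.net.ssl.SSLServerSocketFactory");
        System.out.println(factory.getClass().getName());
    }
}
```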
-----------------------------------------------------------------------
The MLLPDriver
handles the MLLP protocol. For example, given a connected Socket (or
any other pair of Input/OutputStreams, e.g. System.in/System.out), the
MLLPDriver is used as follows:
MLLPDriver driver = new MLLPDriver(socket.getInputStream(),
socket.getOutputStream(),
true);
The MLLPDriver then provides a pair of MLLP-driven Input- and
OutputStreams (proxy streams of sorts):
driver.getInputStream();
driver.getOutputStream();
The driver manages the MLLP half-duplex state but does so pretty
transparently and without many annoying exceptions.
The boolean argument to the constructor specifies whether this
driver starts out as a sender or as a receiver. This argument
is called "clearToSend". Once the line is clear to send, the MLLP
start-of-transmission control is sent as soon as the first write
to the output stream occurs.
When the driver is in sending state the first attempt to read
on the input stream terminates the sending state (end-of-
transmission sequence is sent) and skips forward to immediately
behind the next start-of-transmission control character received
from the peer.
When in receiving mode, the driver transparently buffers any
output sent while the MLLP state does not allow for writing because
the peer is not done writing. Then, when the peer has finished
writing, the buffered output is sent. Once the driver is in sending
state, all further output is directly sent to the real output stream,
without further buffering.
In order to terminate sending without immediately attempting to
receive data, one can explicitly call
driver.turn();
to turn over the connection to the peer.
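The half-duplex behavior described above can be modeled in a few lines.
This is a toy model under stated assumptions, not the MLLPDriver source:
the class HalfDuplexSketch and the method peerFinished() are made up, a
ByteArrayOutputStream stands in for the real output stream, and the
framing bytes are the usual MLLP 0x0B start and 0x1C 0x0D end. What the
real driver does when turn() is called while receiving is not stated
here; the sketch simply ignores the call.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class HalfDuplexSketch {
    private static final int START_BLOCK = 0x0B;
    private static final int END_BLOCK = 0x1C;
    private static final int CARRIAGE_RETURN = 0x0D;

    private final ByteArrayOutputStream wire;  // stand-in for the real output
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private boolean clearToSend;
    private boolean frameOpen = false;

    HalfDuplexSketch(ByteArrayOutputStream wire, boolean clearToSend) {
        this.wire = wire;
        this.clearToSend = clearToSend;
    }

    /** Sends directly when clear to send, otherwise buffers the data. */
    void write(byte[] data) {
        if (!clearToSend) {
            pending.write(data, 0, data.length);
            return;
        }
        openFrameIfNeeded();
        wire.write(data, 0, data.length);
    }

    /** Hypothetical hook: the peer's end-of-transmission has been read. */
    void peerFinished() {
        clearToSend = true;
        if (pending.size() > 0) {
            openFrameIfNeeded();
            wire.write(pending.toByteArray(), 0, pending.size());
            pending.reset();
        }
    }

    /** Ends our transmission and hands the line over to the peer. */
    void turn() {
        if (!clearToSend) {
            return; // assumption: nothing to turn over while receiving
        }
        openFrameIfNeeded(); // nothing written yet means an empty frame
        wire.write(END_BLOCK);
        wire.write(CARRIAGE_RETURN);
        frameOpen = false;
        clearToSend = false;
    }

    private void openFrameIfNeeded() {
        if (!frameOpen) {
            wire.write(START_BLOCK);
            frameOpen = true;
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        HalfDuplexSketch driver = new HalfDuplexSketch(wire, false);
        driver.write("ACK".getBytes(StandardCharsets.US_ASCII)); // buffered
        System.out.println("on the wire while receiving: " + wire.size());
        driver.peerFinished(); // buffered output is sent now
        driver.turn();
        System.out.println("on the wire after turn: " + wire.size());
    }
}
```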
In order to find out whether a new message is available, one can
call
driver.hasMoreInput()
which blocks until more input is available or the stream is
closed. If the client closes the input, driver.hasMoreInput()
returns false. A closed input channel can also
be detected with the predicate
driver.inputClosed()
that returns true if the input is closed.
That the peer is done sending a message is apparent from
the input stream, which keeps returning -1 (the EOF mark). Only after
the driver has sent something (or, alternatively, after driver.turn()
has been called without anything sent, resulting in an empty
transmission frame) will a subsequent read return actual data.
Before abandoning a client driver, do close it so that the connection
is closed and the server is freed.
driver.close();