
During the conversion, any illegal UTF-8 byte sequences, such as a lone 0xB0, are converted to the UTF-16 replacement character, 0xFFFD. This mechanism is part of the Unicode specification. When the JMS BizTalk adapter receives the JMS Text Message, it must convert the contained text to the expected encoding, UTF-8, before submitting the message to the BizTalk DB. As per the Unicode specification, the UTF-16 replacement character, 0xFFFD, is encoded in UTF-8 as three bytes: 0xEF, 0xBF and 0xBD. When the File Adapter was used, the pipeline threw an exception because the byte 0xB0 was illegal. With the JMS adapter, the UTF-8 replacement bytes are perfectly legal, hence the pipeline disassembled the document correctly.

The JMS specification says this about JMS Text Messages: “The inclusion of this message type is based on our presumption that String messages will be used extensively. One reason for this is that XML will likely become a popular mechanism for representing the content of JMS messages.” Of course, this was written in 1999, when XML was still used to mark up content rather than define it.

Should one use Text Messages to carry XML documents as payloads? I believe the answer is ‘no’.
At some point, Java code similar to this executed. The code is Java 7.

    byte[] rawBytes = Files.readAllBytes(Paths.get(somePath));
    String jmsMessageBody = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(rawBytes)).toString();

A UTF-8 XML file is read as raw bytes, then explicitly converted from UTF-8 to a Java String, which is UTF-16 internally. The string, jmsMessageBody, is used to create a JMS Text Message that will be published to a queue. Though not entirely obvious, the second line of code above has just performed a conversion.
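To make the silent substitution concrete, here is a minimal sketch that applies the same decode call to a few hand-built bytes and then re-encodes the result as UTF-8. The class name LenientDecodeDemo, the helper names and the sample bytes are illustrative assumptions, not the adapter's or the customer's actual code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class LenientDecodeDemo {
    // Same call as in the article: Charset.decode replaces malformed input
    // with U+FFFD instead of reporting an error.
    static String decodeLeniently(byte[] rawBytes) {
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(rawBytes)).toString();
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "<t>20" + a lone 0xB0 (degree symbol in Windows 1252) + "</t>"
        byte[] rawBytes = {'<', 't', '>', '2', '0', (byte) 0xB0, '<', '/', 't', '>'};

        String jmsMessageBody = decodeLeniently(rawBytes);
        byte[] reencoded = jmsMessageBody.getBytes(StandardCharsets.UTF_8);

        System.out.println(toHex(rawBytes));  // 3C 74 3E 32 30 B0 3C 2F 74 3E
        System.out.println(toHex(reencoded)); // 3C 74 3E 32 30 EF BF BD 3C 2F 74 3E
    }
}
```

The single illegal byte goes in, and three perfectly legal bytes come out, with no exception along the way.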

In UTF-8, the degree symbol is multi-byte, 0xC2 0xB0. If the document is composed without regard to the underlying encoding, it’s very easy to end up with a UTF-8 document containing an illegal character. The problem is that just viewing the document in a text editor isn’t going to tell you that. To be absolutely sure, you need a good binary editor that has the ability to convert between encodings according to the specifications behind the encodings.

Recently, a customer using the JNBridge JMS Adapter for BizTalk Server ran into some unexpected behavior. The JMS BizTalk adapter was replacing the File Adapter as the customer was moving to a messaging solution centered on JMS. Occasionally, a UTF-8 XML document would contain an illegal character. When the File Adapter was used, the XML pipeline would throw an exception during disassembly when the illegal UTF-8 character was encountered. The failed message was routed to a directory where the illegal characters were presumably fixed and the message resubmitted. When the customer moved to the JMS adapter, there were no failed messages, even for documents containing an illegal character, in this case the one-byte degree symbol. The final XML document, wherever it was routed to, now contained ‘�’ instead. The byte 0xB0 had been replaced by three bytes: 0xEF, 0xBF and 0xBD. The customer, understandably, was confused.

The problem got its start when the message was published to the JMS queue.
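Short of a binary editor, a strict decoder can also tell you whether a byte sequence is legal UTF-8. The sketch below uses the standard java.nio CharsetDecoder configured to report malformed input rather than replace it. The class and method names (Utf8Validator, firstInvalidOffset) are hypothetical, and this is only one way to run such a check, not a description of what the BizTalk pipeline does internally.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Validator {
    // Returns the offset of the first byte that is not legal UTF-8,
    // or -1 if the whole array decodes cleanly.
    static int firstInvalidOffset(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never produces more chars than it consumes bytes,
        // so this buffer cannot overflow.
        CharBuffer out = CharBuffer.allocate(bytes.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        // On error, the input buffer is positioned at the offending bytes.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        byte[] windows1252Degree = {'2', '0', (byte) 0xB0, 'C'};
        byte[] utf8Degree = {'2', '0', (byte) 0xC2, (byte) 0xB0, 'C'};

        System.out.println(firstInvalidOffset(windows1252Degree)); // 2
        System.out.println(firstInvalidOffset(utf8Degree));        // -1
    }
}
```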
My point is that the very first line of the document, the XML declaration, doesn’t necessarily indicate that what you see is what you get. For the most part, UTF-8 and Windows 1252 encodings are identical as long as every character is a single byte in UTF-8, that is, in the ASCII range. However, the degree symbol, °, is different. In Windows 1252 (and many other encodings), the degree symbol in hex is 0xB0, a single byte.
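The difference is easy to see by encoding the same characters both ways. This is a small sketch that assumes the running JVM provides the windows-1252 charset (common JDKs do, but it is not one of the charsets Java guarantees). The class name DegreeSymbolBytes is illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DegreeSymbolBytes {
    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 is available in this JVM.
        Charset windows1252 = Charset.forName("windows-1252");
        String ascii = "C";
        String degree = "\u00B0"; // the degree symbol

        // ASCII characters are the same single byte in both encodings.
        System.out.println(hex(ascii.getBytes(windows1252)));             // 43
        System.out.println(hex(ascii.getBytes(StandardCharsets.UTF_8)));  // 43

        // The degree symbol is one byte in Windows 1252, two in UTF-8.
        System.out.println(hex(degree.getBytes(windows1252)));            // B0
        System.out.println(hex(degree.getBytes(StandardCharsets.UTF_8))); // C2 B0
    }
}
```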

Consider a simple XML document composed in the Visual Studio XML editor. If this file is saved to disk, the encoding is truly UTF-8, including the UTF-8 byte order mark (BOM). However, if I compose this same document in a generic text editor and save it to disk, the encoding will most likely be Windows 1252, not UTF-8. Let’s say I highlighted and copied the XML from the Visual Studio XML editor to the clipboard. When I paste the XML into another editor and save it to disk, the resulting encoding will not be UTF-8 because the encoding copied and pasted is Windows 1252.
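One quick way to see which kind of file an editor actually wrote is to look for the UTF-8 byte order mark, the three bytes 0xEF 0xBB 0xBF, at the start of the data. The sketch below works on an in-memory byte array; the class name BomCheck and the sample bytes are illustrative assumptions. Note that a BOM only signals intent: it does not guarantee that the remaining bytes are legal UTF-8, and its absence does not prove the file is not UTF-8.

```java
class BomCheck {
    // True if the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        byte[] withoutBom = {'<', 'a', '/', '>'};

        System.out.println(hasUtf8Bom(withBom));    // true
        System.out.println(hasUtf8Bom(withoutBom)); // false
    }
}
```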

How does one view an XML document? Notepad, or the XML editor in Visual Studio? XML is text, after all, so theoretically any text editor will do the job.
