“Escaped” Data Characters


Datafile Software


Data in an XML file can consist of virtually any character. However, special processing is required if the data contains any of five characters that serve specific functions within an XML file. When present in data, these characters are converted to special “escape” sequences on output to an XML file, and converted back to their original form on input from an XML file. Each escape sequence starts with an ampersand “&” and is terminated by a semi-colon “;”.

The special characters are:

< The opening angle bracket is interpreted as the start of an element tag, and is converted to the sequence “&lt;” (that is, an ampersand, the letters “lt” for “less than”, and a semi-colon)

& The ampersand itself introduces an escape sequence, and is converted to “&amp;”

> The closing angle bracket closes an element tag, and is converted to “&gt;” (“gt” for “greater than”)

" The double quote is used to enclose the value of an attribute, and each one is converted to the sequence “&quot;”

' The apostrophe can be used as an alternative to the double quote for enclosing an attribute value, and is converted to the sequence “&apos;”
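
The conversion in both directions can be sketched in a few lines of Python using the standard library's xml.sax.saxutils module. This is an illustration only, not a description of how the Datafile software performs the conversion; the names QUOTE_ENTITIES, to_xml and from_xml are hypothetical and chosen for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() converts &, < and > by default. The two quote characters are
# supplied as extra entities so that all five characters are converted.
# (QUOTE_ENTITIES, to_xml and from_xml are illustrative names only.)
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def to_xml(data: str) -> str:
    """Convert the five special characters to escape sequences (output)."""
    return escape(data, QUOTE_ENTITIES)


def from_xml(text: str) -> str:
    """Convert escape sequences back to the original characters (input)."""
    return unescape(text, {"&quot;": '"', "&apos;": "'"})


original = 'Smith & Sons <"Best" in town>, est\'d 1990'
escaped = to_xml(original)
print(escaped)
print(from_xml(escaped) == original)
```

The order of conversion matters: the ampersand is converted first on output and last on input. Otherwise the ampersands that begin the other escape sequences would themselves be converted a second time.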

According to the XML standard, only < and & must always be replaced by their escape sequences (&lt; and &amp;) in element content. The other three are often converted as well: &gt; for symmetry with &lt;, and &quot; and &apos; to prevent them being misread as XML markup rather than data, which matters when the data appears inside an attribute value.
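
The difference between the characters that must be escaped and those that are escaped by convention can be checked with a small Python sketch using the standard library's xml.etree.ElementTree parser. The helper name parses and the sample element name note are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Literal >, " and ' are accepted in element content...
print(parses("<note>5 > 3, \"quoted\" and 'single'</note>"))  # True

# ...but a literal < or & is rejected as malformed markup.
print(parses("<note>3 < 5</note>"))          # False
print(parses("<note>Smith & Sons</note>"))   # False

# Inside an attribute value, the enclosing quote character must be escaped.
print(parses('<note title="say &quot;hi&quot;"/>'))  # True
print(parses('<note title="say "hi""/>'))            # False
```

Converting all five characters everywhere, as described above, avoids having to track whether a given piece of data will end up in element content or in an attribute value.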

Article ID: 1803
Created On: Mon, Jul 9, 2012 at 1:46 PM
Last Updated On: Thu, Jun 22, 2023 at 5:00 PM

Online URL: https://kb.datafile.co.uk/article/“escaped”-data-characters-1803.html