XML Background

Datafile Software

This background note introduces XML. There are many publications about XML, and you will find tutorials and a great deal of other information on the web; see the Useful Links section at the end of this appendix.

XML stands for eXtensible Markup Language. It is a W3C†-endorsed markup language for formatting documents that contain structured information, so that they can be sent and processed electronically. In practice it is proving to have a very wide variety of uses, but the description below mainly concerns its use as a way of exchanging data.

In non-electronic terms, almost any document that we pick up has some structure; otherwise we wouldn’t understand what it is for or how to use it. “Structured information” means both the content (the words, pictures and figures) and what that content is for. For example, the content of a heading tells us what to expect from the paragraphs that follow; the content of a footnote provides further information about the word, phrase or topic that is footnoted; the content of graphics and tables illustrates, in graphical and tabular form, what is written in the associated text; and long documents often have an index whose content points us to specific pages.

Content can be prepared in electronic form easily enough by keying it in. A markup language is a way of recording the structure of an electronic document alongside its content, so that each piece of content is labelled with its purpose. XML is becoming the internationally accepted standard for adding such markup instructions to electronic documents.
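As a minimal sketch of what such markup can look like, the Python snippet below parses a small XML fragment with the standard library's xml.etree.ElementTree module and lists each piece of content next to the label that says what it is for. The element names (document, heading, paragraph, footnote) are invented for this illustration; XML leaves the choice of names to whoever designs the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element names are made up for this illustration.
SAMPLE = """\
<document>
  <heading>XML Background</heading>
  <paragraph>XML is endorsed by the W3C<footnote>World Wide Web Consortium</footnote>.</paragraph>
</document>
"""

def describe_structure(xml_text):
    """Return (tag, leading text) pairs for every element, in document order."""
    root = ET.fromstring(xml_text)
    return [(el.tag, (el.text or "").strip()) for el in root.iter()]

for tag, text in describe_structure(SAMPLE):
    print(f"{tag}: {text!r}")
```

The content (the words) and its purpose (heading, paragraph, footnote) travel together in the same file, which is what allows a program to tell them apart.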

The concept is not new. The World Wide Web was developed on the back of HTML (HyperText Markup Language), which is itself an application of SGML (Standard Generalized Markup Language). SGML has its roots in work done in the 1970s and became an ISO standard in 1986. But HTML is concerned with presenting text and graphics, and is not designed to differentiate data.

For example, whilst you might receive an invoice as an HTML page and display it in your browser, no computer application could reliably extract the essential pieces of data from that page, such as the issuing company, date, reference, line details, tax amount and invoice total. HTML has no way to differentiate that data from text narrative such as the company name and address, the words “Date” and “Invoice Number”, the column headings for the invoice details, “VAT” and “Total”. Worse, invoices from different companies will all look different (so their HTML pages would all be very different), and the necessary data will be in different places and in different formats.

XML provides a technique whereby such documents can be defined in electronic form such that the data elements can be recognised and extracted by a processing program, and the document itself re-created in human-readable form using your computer’s browser program. The XML document itself is merely a computer file that can be sent and received by any means, although the Internet will be the most common transport medium.
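To make the idea concrete, here is a minimal sketch of a processing program that extracts the data elements from an invoice held as XML. It uses Python's standard xml.etree.ElementTree module. The invoice layout, including every tag and attribute name (invoice, reference, supplier, line and so on) and all the values, is an assumption made up for this example; it is not Datafile's format or any published standard.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical invoice layout: tag names, attribute names and values are illustrative only.
INVOICE_XML = """\
<invoice reference="INV-0001" date="2012-07-09">
  <supplier>Example Supplies Ltd</supplier>
  <line><description>Widget</description><quantity>2</quantity><unit_price>10.00</unit_price></line>
  <line><description>Gadget</description><quantity>1</quantity><unit_price>5.50</unit_price></line>
  <tax>5.10</tax>
  <total>30.60</total>
</invoice>
"""

def extract_invoice(xml_text):
    """Pull the header fields, line details, tax and total out of the XML."""
    root = ET.fromstring(xml_text)
    lines = [
        {
            "description": line.findtext("description"),
            "quantity": int(line.findtext("quantity")),
            "unit_price": Decimal(line.findtext("unit_price")),
        }
        for line in root.findall("line")
    ]
    return {
        "reference": root.get("reference"),
        "date": root.get("date"),
        "supplier": root.findtext("supplier"),
        "lines": lines,
        "tax": Decimal(root.findtext("tax")),
        "total": Decimal(root.findtext("total")),
    }

invoice = extract_invoice(INVOICE_XML)
# Cross-check the stated total against the line details plus tax.
net = sum(item["quantity"] * item["unit_price"] for item in invoice["lines"])
print(invoice["supplier"], invoice["reference"], invoice["total"])
print("Total agrees with lines:", net + invoice["tax"] == invoice["total"])
```

Because every value is labelled by its tag, the program does not need to know where on the page the supplier printed the date or the total. It only needs sender and receiver to agree on the tag names.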

As a final illustration of the intended universality of XML, the document you are reading could itself be represented in an XML file. The text would still appear as text in the file, but the presentation of the document, in terms of paragraphs, illustrations, headings, indents, page breaks, contents pages and so on, could all be represented with embedded XML markup.

It happens that this document was created using Microsoft Word, but its XML version could be sent to anyone, even if they don’t have Word, and it could at least be displayed in a web browser in a form that looks much as you are seeing it now.

† W3C is the World Wide Web Consortium, and is the closest thing to a standards organisation that the web possesses. It has more than 400 member companies, largely from the IT industries, including AT&T, Apple, IBM, Intel, Microsoft, and even the UK’s BT. You can find out more about W3C at www.w3c.org.