There are a few differences between DocBook XML and DocBook SGML. Handling them should be relatively easy for most small documents, and many authors will need to change nothing beyond the XML and DocBook declarations at the start of their document.
For others, here is a list of what you should keep in mind when converting your documents from SGML to XML.
Differences between XML and SGML elements |
---|
An XML element typically has three parts: the start tag, the content (your words), and the end tag. Qualifiers are added in the start tag and are known as attributes. Each attribute has a name and a quoted value.
For example, in a filename element whose start tag is <filename class="directory">, the start tag contains one attribute (class) with a value of "directory". The end tag, </filename>, must not contain any attributes. |
Element names (tags) and attribute names are case-sensitive; in DocBook XML they are lowercase. The following will not validate because the end tag </PARA> is uppercase while the start tag <para> is lowercase:
<para>This part will fail XML validation</PARA> |
All attribute values in the start tag must be quoted. You can use either single (') or double (") quotes, but not backquotes (`) or "smart quotes". The quote character that opens a value must be the same one that closes it. In other words, name="this" would validate, but name='that" would not.
Elements that have a start tag but no end tag are referred to as "empty" because they do not contain (wrap around) anything. These tags must still be closed with a trailing slash (/). For example, xref must be written as <xref linkend="software"/>. You may not put any spaces between the / and the >, although you may put a space after the final attribute: <xref linkend="foo" />.
Processing instructions are sent to the transformation engine (DSSSL or XSLT). In XML they must have a question mark at the end of the tag, immediately before the closing >. All processing instructions are removed from the output stream. The XML version of a dbhtml processing instruction looks like this:
<?dbhtml filename="foo"?> |
If you're converting from SGML to XML, make sure that file names refer to .xml files rather than .sgml files. Some tools may get confused if a .sgml file contains XML.
SGML allowed tag minimization, in which the element name is left out of the end tag. Example: <para>This is foo.</> Tag minimization is not supported in XML, and its use is discouraged in DocBook.
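
Most of the rules above can be checked mechanically before running a full DocBook toolchain. The following is a minimal sketch, not part of any DocBook tool, that uses Python's standard `xml.etree.ElementTree` parser. The helper name `is_well_formed` and the element content in the samples are made up for illustration. It tests only XML well-formedness, not validity against the DocBook DTD, so a fragment that passes here could still fail validation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML.

    Hypothetical helper: this checks syntax only and does not
    validate the fragment against the DocBook DTD.
    """
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


# Sample fragments, one per rule discussed above.
# The element content ("example", "text") is placeholder text.
samples = [
    ('<filename class="directory">example</filename>', "quoted attribute"),
    ("<para>This part will fail XML validation</PARA>", "mismatched case"),
    ("<filename class=directory>example</filename>", "unquoted attribute"),
    ("<filename class='directory\">example</filename>", "mismatched quotes"),
    ('<para>See <xref linkend="software"/>.</para>', "closed empty tag"),
    ('<para>See <xref linkend="software">.</para>', "unclosed empty tag"),
    ('<xref linkend="foo" />', "space before the slash"),
    ('<xref linkend="foo"/ >', "space after the slash"),
    ('<para><?dbhtml filename="foo"?>text</para>', "XML processing instruction"),
    ('<para><?dbhtml filename="foo">text</para>', "SGML processing instruction"),
    ("<para>This is foo.</>", "minimized end tag"),
]

for fragment, rule in samples:
    status = "ok" if is_well_formed(fragment) else "NOT well-formed"
    print(f"{rule:28} {status:16} {fragment}")
```

Running a check like this over each converted file is a quick way to catch leftover SGML habits, such as minimized end tags or unclosed empty elements, before handing the document to a validating parser.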