Introduction to XML Professional

This section gives you a brief introduction to the XML file format and demonstrates the design process for XML files from an Xbase++ programmer's point of view. It then discusses how to define configuration data for an Xbase++ application, using a data-driven approach for loading and building Database Engines.

What is XML?

To answer this question, we will start with a definition found in one of the numerous books that introduce XML:

XML enables you to define a grammar for marking up documents in terms of semantic tags and their attributes.

This is quite a good definition of XML. However, don't be disappointed if it doesn't tell you much. Some of its terms need an explanation:

Grammar

This term describes the words allowed in a language and how or where they may appear in a sentence.

Semantic

This term describes the meaning of the words allowed in a language. For example, "Deutsch" is not part of the English language; it is part of the German language. The words "Deutsch" and "German" have the same meaning. Only the language they belong to differs.

Mark-up

This term describes all characters in a text that define how text is displayed or printed. For example, italic or bold markers are included in a text but do not contribute to its contents. The markers are invisible when the text is displayed or printed. However, they are visible in an ASCII or HEX editor.

Tag

A tag is a word enclosed in an opening and closing delimiter. <HTML> is a tag existing in all HTML documents, for example.

Attribute

An attribute is part of a tag and defines additional information for mark-up. For example, <TR ALIGN="RIGHT"> is a string defining the HTML tag <TR> whose ALIGN attribute has the value "RIGHT".

With these terms defined, the XML definition above is much easier to understand: XML allows you to define your own grammar for mark-up, i.e. in XML you can define the names of tags and their meaning, the context in which they may appear, and the attributes of tags and their meaning. As a matter of fact, XML is a meta-language that allows you to define your own language for describing documents or data structures, and that is what makes XML the file format of choice for managing configuration data.

Since XML is a mark-up language, similar to HTML, there are some rules to comply with. The first rule is that all XML files must be "well-formed". Among other things, this means that every opening tag has a matching closing tag, that tags are nested without overlapping, and that all content is enclosed in a single root tag. The second rule concerns the <?xml?> line, the XML declaration, which identifies the file as being an XML file and includes version information. XML 1.0 recommends the declaration rather than strictly requiring it, but when it is present, it must be the very first thing in the file. Strictly speaking, the declaration is not a tag, which is why it has no closing counterpart. The following XML code illustrates a simple, well-formed XML document used to describe the data structure for a customer address:

<?xml version="1.0"?>
<Customer>
  <Address>
    <Street>1562 FIRST AVE</Street>
    <City>New York City</City>
    <Zip>KOC 1HO</Zip>
    <State>NY</State>
    <Country>USA</Country>
  </Address>
</Customer>

This example demonstrates how data is organized in XML files. They contain information that can be represented as a tree, or as a parent/child hierarchy. <Customer> has a child tag <Address>, <Address> has the child tags <Street>, <City>, <Zip> and so forth. The opening tag defines the meaning (semantics) of the embedded information and the closing tag marks the end of the data. The tag names define the allowed words (grammar) of the data structure.
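The parent/child hierarchy can be made visible with a few lines of code. The following is a minimal sketch, written in Python with its standard xml.etree.ElementTree module purely for illustration. It is not Xbase++ code, and the function name outline() is an assumption made for this example. It walks the customer document and produces one indented line per tag.

```python
import xml.etree.ElementTree as ET

CUSTOMER_XML = """<?xml version="1.0"?>
<Customer>
  <Address>
    <Street>1562 FIRST AVE</Street>
    <City>New York City</City>
    <Zip>KOC 1HO</Zip>
    <State>NY</State>
    <Country>USA</Country>
  </Address>
</Customer>"""


def outline(element, depth=0):
    """Return one indented line per element, parents before their children."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (": " + text if text else "")
    lines = [line]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CUSTOMER_XML)
print("\n".join(outline(root)))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the XML file.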

As you can see, XML is a pretty simple file format. It is flexible enough to represent highly complex data while remaining a plain-text file format. XML files can be used whenever you have to store information outside a database or exchange information between applications. In other words, whenever you need a file format to exchange data or to store configuration data, you can simply start drafting your own XML document. XML is a safe choice in both cases.

Furthermore, because XML is already a well-accepted standard, you can expect more and more tools to become available over time which ease the editing, parsing and exchange of XML-based information. A couple of tools for editing XML documents are available already: one is Microsoft's XML Notepad (http://msdn.microsoft.com/xml/notepad/intro.asp), another, more sophisticated one is XMLSPY (http://www.xmlspy.com).

XML data for Xbase++ applications

The task we are going to solve is to define configuration data for an Xbase++ application that tells the program which DatabaseEngines (DBEs) to load and to build at program start. In other words, we will replace the DbeSys() procedure with a data-driven approach. XML allows us to describe exactly the data required for calling the functions DbeLoad() and DbeBuild(), both of which must be called in an application program to accomplish this task. The following code shows an example of an XML file that contains the required data. We will discuss its design process step by step in the next section.

<?xml version="1.0"?>

<Configuration>

  <DatabaseEngines>
    <load>DBFDBE</load>
    <load>NTXDBE</load>

    <build name="DBFNTX">
      <data>DBFDBE</data>
      <order>NTXDBE</order>
    </build>
  </DatabaseEngines>

</Configuration>

Designing a configuration file in XML

As always, the design stage is the most time consuming part of the software development process. This is also the case for our small configuration file project. So let's start with the design.

The first thing an XML file should contain is the <?xml?> declaration identifying the XML version used in the file. The XML 1.0 specification recommends the declaration rather than strictly requiring it, but if it is present, nothing may precede it, not even a blank line or a comment. An XML parser rejects a file that violates this rule.
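How strictly a parser reacts to a malformed file can be tried out directly. The following is a minimal sketch in Python, using the standard xml.etree.ElementTree module for illustration only. The helper name is_well_formed() and the three sample strings are assumptions made for this example. It checks one correct document, one with closing tags in the wrong order, and one where a comment precedes the declaration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A correct document: declaration first, one root tag, proper nesting.
good = '<?xml version="1.0"?>\n<Configuration><DatabaseEngines/></Configuration>'

# Closing tags in the wrong order: the tags overlap instead of nesting.
crossed = ('<?xml version="1.0"?>\n'
           '<Configuration><DatabaseEngines></Configuration></DatabaseEngines>')

# A comment placed before the declaration: the declaration is no longer first.
late_declaration = '<!-- settings -->\n<?xml version="1.0"?>\n<Configuration/>'

for label, text in (("good", good), ("crossed", crossed),
                    ("late declaration", late_declaration)):
    print(label, "->", is_well_formed(text))
```

A configuration loader can use such a check to report a damaged file at program start instead of continuing with incomplete settings.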

The next thing to do is to define the tags, or the words, describing our configuration data in a way that allows us to extend configuration files in the future, whenever this may be required. We start by defining the root tag and name it <Configuration> because that's what our XML file is all about. The root tag is going to have child tags which identify various kinds of configuration data. At the moment, however, we are dealing only with Database Engines and which ones to load and build, so the first child tag is <DatabaseEngines>. This tag will embed all DBE-related data in the XML file. The design process leads us to a basic structure for the configuration file:

<?xml version="1.0"?>

<Configuration>

  <DatabaseEngines>
    <!-- our DBE specific tags will be here -->
  </DatabaseEngines>

</Configuration>

There is a dedicated section for Database Engines and we must fill it with additional child tags in order to support the load and build operations for DatabaseEngines. Therefore, we have to take a close look at the parameter interface of the specific functions we need to support.

// load a DatabaseEngine 
DbeLoad( <cDbeFile>,  [<lHidden>] ) --> lSuccess 

// build a DatabaseEngine 
DbeBuild( <cCompoundDBE>  , ; 
          <cDataDBE>      , ; 
         [<cOrderDBE>]    , ; 
         [<cRelationDBE>] , ; 
         [<cDictionaryDBE>] ) --> lSuccess 

Again, our first step is naming our child tags, which is a pretty easy task. We just call them <load> and <build> so that the names convey the right meaning. Then we define the syntax of our child tags. Let's start with the DbeLoad() operation.

Due to the nature of the DbeLoad() function, the <load> tag is simple. DbeLoad() requires at least one parameter: the name of the DBE. As a consequence, we decide to design the tag as <load>....</load>, where the name of the DBE to be loaded is placed between the opening and closing tag. The following code shows the resulting syntax of the new child tag embedded in its parent tag:

<DatabaseEngines>
  <load>DBFDBE</load>
  <load>NTXDBE</load>
</DatabaseEngines>
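Reading the <load> tags back is straightforward. The following is a minimal sketch in Python (standard xml.etree.ElementTree module, used for illustration instead of Xbase++). The function name load_names() is an assumption made for this example. It collects the DBE names in the order in which they appear in the file.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """<DatabaseEngines>
  <load>DBFDBE</load>
  <load>NTXDBE</load>
</DatabaseEngines>"""


def load_names(engines):
    """Collect the DBE names from the direct <load> children, in document order."""
    return [(node.text or "").strip() for node in engines.findall("load")]


engines = ET.fromstring(FRAGMENT)
for name in load_names(engines):
    # In an Xbase++ program, this is the point where DbeLoad( name ) would be called.
    print("DbeLoad:", name)
```

Because each <load> tag carries exactly one value, every tag translates into exactly one DbeLoad() call.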

The build operation is more complex. The function prototype accepts several parameters with different meanings. However, to keep things simple in our example, we limit the supported parameters and call DbeBuild() as follows:

DbeBuild( cNewDbeName, cDataDbe, cOrderDbe ) 

By taking a close look at these parameters and their meaning, we can isolate the cNewDbeName parameter as the name of the DBE to be built. To reflect this in our <build> tag, we add an attribute to the tag. The syntax is then as follows:

<build name="MYDBE">

DbeBuild() needs cDataDbe specified as the data provider and cOrderDbe as the order provider. However, there could be more component DBEs, and we want to stay flexible and keep all options open. Therefore, we add those DBEs as child tags of the <build> tag and use the provider types, <data> and <order>, as the names of the child tags. The following code illustrates our new <build> tag:

<build name="DBFNTX">
  <data>DBFDBE</data>
  <order>NTXDBE</order>
</build>
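The mapping from a <build> tag to the three parameters of the simplified DbeBuild() call can be sketched as follows. This is an illustration in Python (standard xml.etree.ElementTree module), not Xbase++ code. The function name build_arguments() and its error handling are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """<build name="DBFNTX">
  <data>DBFDBE</data>
  <order>NTXDBE</order>
</build>"""


def build_arguments(build):
    """Map one <build> element to (compound DBE name, data DBE, order DBE)."""
    name = build.get("name")  # attribute names are case-sensitive in XML
    if not name:
        raise ValueError("<build> needs a name attribute")
    data = build.findtext("data")
    if not data:
        raise ValueError("<build> needs a <data> child")
    order = build.findtext("order")  # optional, like [<cOrderDBE>] in the prototype
    return name, data.strip(), order.strip() if order else None


print(build_arguments(ET.fromstring(FRAGMENT)))
```

The attribute carries the one value that identifies the result, while the child tags carry the components. A further provider type could be added later as another child tag without changing the existing ones.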

Now, let us put the pieces together and see what the entire configuration file looks like:

<?xml version="1.0"?>

<Configuration>

  <DatabaseEngines>
    <load>DBFDBE</load>
    <load>NTXDBE</load>
    <load>CDXDBE</load>

    <build name="DBFNTX">
      <data>DBFDBE</data>
      <order>NTXDBE</order>
    </build>

    <build name="DBFCDX">
      <data>DBFDBE</data>
      <order>CDXDBE</order>
    </build>
  </DatabaseEngines>

</Configuration>

The section holding configuration data starts with <Configuration> as the root node. It embeds a <DatabaseEngines> tag which defines data related to DBE configuration. The child tags recognized for this tag are <load> and <build>. Each <load> tag embeds the name of a DBE to be loaded. The <build> tag, in turn, requires additional child tags to specify the DBE components of the resulting DBE, while the name of the new DBE is given in the "name" attribute of the <build> tag.

By processing this XML configuration file, an Xbase++ application would load three component Database Engines and build two compound DBEs. If there is another DBE required, we would modify only the configuration file and add <load> and <build> tags embedding appropriate data. There is no need to recompile or relink an application.
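The processing described here can be sketched end to end. The following Python example (standard xml.etree.ElementTree module) is an illustration under stated assumptions, not the Xbase++ implementation: configure_engines() is a name made up for this example, and the two callables passed to it are stand-ins for DbeLoad() and DbeBuild() that merely record the calls a real program would make.

```python
import xml.etree.ElementTree as ET

CONFIG = """<?xml version="1.0"?>
<Configuration>
  <DatabaseEngines>
    <load>DBFDBE</load>
    <load>NTXDBE</load>
    <load>CDXDBE</load>
    <build name="DBFNTX">
      <data>DBFDBE</data>
      <order>NTXDBE</order>
    </build>
    <build name="DBFCDX">
      <data>DBFDBE</data>
      <order>CDXDBE</order>
    </build>
  </DatabaseEngines>
</Configuration>"""


def configure_engines(xml_text, dbe_load, dbe_build):
    """Call dbe_load for every <load>, then dbe_build for every <build>."""
    root = ET.fromstring(xml_text)
    engines = root.find("DatabaseEngines")
    if engines is None:
        return  # no DBE section in the file: nothing to configure
    for node in engines.findall("load"):
        dbe_load((node.text or "").strip())
    for node in engines.findall("build"):
        dbe_build(node.get("name"), node.findtext("data"), node.findtext("order"))


calls = []  # records what would be passed to DbeLoad() and DbeBuild()
configure_engines(
    CONFIG,
    dbe_load=lambda name: calls.append(("load", name)),
    dbe_build=lambda name, data, order: calls.append(("build", name, data, order)),
)
for call in calls:
    print(call)
```

All load operations run before the first build operation, so a compound DBE is never built from a component that has not been loaded yet.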

An important aspect of the XML tag design is that DBE-related data is embedded in the <DatabaseEngines> tag, which itself does not contribute any data to DbeLoad() or DbeBuild(). However, we gain the flexibility of re-using existing tags in another context in the future. For example, we could create a new child tag for <Configuration> and name it <DynamicDLL>. This section could support DllLoad() functionality and could configure additional DLLs to be loaded at start-up. In this case, the <load> tag could be re-used since it has the same meaning for loading DLLs implemented in Xbase++ or other languages.
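The benefit of the enclosing section tag shows up when the same <load> tag appears in two sections. The following Python sketch (standard xml.etree.ElementTree module, for illustration only) groups the <load> values by their parent section. The function name loads_by_section() and the DLL name MYTOOLS.DLL are hypothetical and chosen just for this example.

```python
import xml.etree.ElementTree as ET

CONFIG = """<?xml version="1.0"?>
<Configuration>
  <DatabaseEngines>
    <load>DBFDBE</load>
  </DatabaseEngines>
  <DynamicDLL>
    <load>MYTOOLS.DLL</load>
  </DynamicDLL>
</Configuration>"""


def loads_by_section(xml_text):
    """Group the <load> values by the tag name of their parent section."""
    root = ET.fromstring(xml_text)
    return {
        section.tag: [(node.text or "").strip() for node in section.findall("load")]
        for section in root  # the direct children of <Configuration>
    }


sections = loads_by_section(CONFIG)
for section, names in sections.items():
    print(section, names)
```

The parent tag decides which function receives the value: names found under <DatabaseEngines> would go to DbeLoad(), names found under <DynamicDLL> to DllLoad().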

XML and Internet Explorer 5.0

A nice feature of Internet Explorer 5 is its capability to browse XML files. You can load an XML file into IE5 by dragging it onto the browser window. You then see a tree displaying the structure of your XML file. This is an easy way to inspect XML files. You can even click the "-" or "+" icon on a node to collapse or expand its subtree.
