My name is Keith Alcock, as you can see, and the formal title of this presentation is XML Documents, Schemas, and Transformations.

The informal title is XMLSpy demo.

I have a very small number of slides just to get us started and to slow us down at the end, but I’m going to try to spend some time with the software and maybe I can even get someone out there to run it after we get warmed up.

To get us rolling let me first tell you...

Where I found the information I have on this topic.

I started my present investigation into XML with a book, a very thick book, which in retrospect I can’t recommend. It is a place to start and this one does a reasonable job explaining what all of the abbreviations mean and how they relate. I have read worse books on XML, but luckily I can’t remember much about them.

Does anyone here have a book that they can actually recommend?

I read the book because I was about to write an application, by chance in C++ and Smalltalk, that processes XML data. The C++ part does it without any library support and can stand alone. The Smalltalk portion uses a wrapper around a SAX (Simple API for XML) implementation. In doing this I became aware of some products that are used in XML development. One was simply an updated version of my programmer's editor and another is XMLSpy, which comes with a couple of tutorials that are listed here.
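For anyone who has not seen SAX, here is a minimal sketch of the event-driven style, written in Python rather than the Smalltalk wrapper I used. The sample document and its element names are made up for illustration and are not the application's real format.

```python
import xml.sax

# Hypothetical sample document; the element names are invented for this sketch.
SAMPLE = b"<document><page><line>Hello</line><line>World</line></page></document>"


class ElementCounter(xml.sax.ContentHandler):
    """Counts how many times each element name is opened."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # The parser calls this once for every opening tag it reads.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(data):
    """Parse the XML bytes and return a dict of element name to count."""
    handler = ElementCounter()
    xml.sax.parseString(data, handler)
    return handler.counts


counts = count_elements(SAMPLE)
print(counts)
```

The point is that a SAX parser never builds a tree. It reports each tag as it goes by, and the handler decides what to keep.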

Also, I read a couple of pages from the SVG Spec.

XML we have talked about a couple of times already.

It is used to make documents that are often stored as files.

Martin mentioned a DTD before and a Schema is very similar in that it defines the structure of an XML document, much like one would for a database. Since XML is hierarchical, a tree structure is described for each file rather than a linear set of fields. A transformation is just a conversion from one format to another, often from XML to HTML.
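To make the idea of a transformation concrete, here is a small sketch that converts a tiny XML document to HTML. It is hand-written Python using the standard library, not XSLT, and the `document` and `line` element names are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; element names are invented for this sketch.
SAMPLE = "<document><line>First</line><line>Second</line></document>"


def to_html(xml_text):
    """Turn every <line> in the source into a <p> inside an HTML body."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for line in source.iter("line"):
        paragraph = ET.SubElement(body, "p")
        paragraph.text = line.text
    return ET.tostring(html, encoding="unicode")


html_text = to_html(SAMPLE)
print(html_text)
```

An XSLT stylesheet expresses the same kind of mapping declaratively, as templates matched against the source tree, rather than as a loop.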

When we do the transformation later, you’ll notice that some XPath statements are involved. These are akin to database queries. One of the transformations will be to a format that Martin also mentioned, SVG. One could consider it a kind of PostScript for XML.
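Here is a rough sketch of what I mean by XPath being like a query. It uses Python's ElementTree, which supports only a limited subset of XPath, and a made-up document whose element and attribute names are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the names are invented for this sketch.
SAMPLE = """<document>
  <page number="1"><line>First</line><line>Second</line></page>
  <page number="2"><line>Third</line></page>
</document>"""

root = ET.fromstring(SAMPLE)

# Roughly "select every line on every page".
all_lines = [line.text for line in root.findall("./page/line")]

# Roughly "select the lines where the page number is 2".
page_two = [line.text for line in root.findall("./page[@number='2']/line")]

print(all_lines)
print(page_two)
```

The path plays the role of the table and the predicate in square brackets plays the role of the WHERE clause.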

With that in mind, tonight I am going to get some XML data from an application, write a schema to describe and validate it, and then alter it in a pair of ways that will help us view it differently.

The data I have alluded to has to do with something called ReadSmart™, which is made by Language Technologies, Inc., a company I do consulting work for. We have a plug-in for Adobe Acrobat that converts a PDF file to an XML file that contains enough information for our formatting purposes. A formatter reads in the abstract document described in the XML file and formats it. The plug-in then does an XML to PDF conversion, effecting any of the formatting changes.

I’ll do a quick demonstration of this.

Here is what the data looks like in a text editor.

Any questions?

One of the reasons for the XML file is that this provides an opportunity for human intervention…and human error. If someone goes and edits this file, it would be good to be able to easily detect that there is a problem with it.
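As a sketch of the simplest kind of detection, here is a Python check that a hand-edited file is still well-formed XML. This is a weaker test than validating against a schema, which the Python standard library does not do, and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return (True, "") if the text parses, else (False, error message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


# Hypothetical examples: the second one is missing the </line> closing tag.
good = "<document><page><line>ok</line></page></document>"
bad = "<document><page><line>oops</page></document>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

A schema goes further than this: it catches a file that parses fine but has the wrong elements, or the right elements in the wrong places.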

When I tried this out for practice, I went through the tree structure of the document four times and that’s how we’ll start. Martin described last time how there are elements and attributes. Elements are containers while attributes are described within the tags using strings.
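To illustrate the difference between elements and attributes, here is a short Python sketch. The `page` and `line` names and the attribute values are assumptions invented for the example, not the real file format.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: <page> is an element with two attributes and two children.
SAMPLE = '<page number="1" width="612"><line>Hello</line><line>World</line></page>'

page = ET.fromstring(SAMPLE)

# Attributes live inside the opening tag and are always strings.
attributes = dict(page.attrib)

# Elements are containers: iterating over one yields its child elements.
children = [child.tag for child in page]
texts = [child.text for child in page]

print(attributes)
print(children)
print(texts)
```

Note that the width comes back as the string "612", not a number. Saying that it must be a number is the kind of thing a schema is for.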

These XML files can be used for more than just formatting these PDF documents. Let’s look at a couple of possibilities.

XSLT Designer is not integrated well with XMLSpy and it seems limited to doing XML to HTML transforms, which are admittedly the most popular. These it can do with drag and drop simplicity. Let’s try it. The SVG transformation I had to do by hand and it took a lot of searching through manuals. Let’s take a look at how it works rather than retyping it all.

In reading up on this topic and trying things out, I’ve formed a couple of opinions.

Both the book and the software are too complex in my opinion and I will be looking for better materials. Instead of learning the complex software, learn the language. It would be easy to exhaust the capabilities of XSLT Designer, for example, and then the transformation will have to be coded by hand. This might not apply to things like validation or SOAP, which you need the computer to help you with, but it does apply to the programming part.

I am impressed with XML and am happy using it.