Tutorials1

This tutorial is meant as a first exposure to the tools and as such, is the simplest possible example for Squash that I could think of. We will create all the files we need, run Squash, and view the output.

This tutorial will assume that we have already built and installed tXSchema, and that we are running squash from a command line (using csh or one of its derivatives). If this is not the case, the general ideas below will still be applicable.

We'll also assume that you will be working out of the default tXSchema directory structure, with the base absolute path to that directory being $TXSCHEMA. (Note: it isn't necessary to actually create an environment variable by this name; the tutorial will only refer to this name (as opposed to /home/sthomas/...) for convenience. In my scenario, $TXSCHEMA will refer to the location /Users/doofus/Documents/workspace/tXSchema on my machine.)

Scenario: We have been keeping track of Steve's salary in an XML file, when all of a sudden he gets a raise (way to go Steve!). So we have two versions of the file - one with the salary before the raise ($45,000) and one with the salary after the raise ($51,000).

The tutorial consists of creating 6 documents by hand, running the tool, and viewing the output.

Getting our ducks in a row

We first need to create/organize some files and our working environment.

What tXSchema has already given us

  • A set of Schema documents located in the $TXSCHEMA/etc directory.
    • TDSchema.xsd - Used to validate the temporal document that we are going to create.
    • TSSchema.xsd - Used to validate the temporal schema document that we are going to create.
    • ASchema.xsd - Used to validate the annotation document that we are going to create.
  • The squash runscript that takes a temporal document file as input (described later).
  • A "properties.txt" file (located in the $TXSCHEMA/etc directory) that holds system wide preferences. Open the file and make sure everything looks correct (i.e., full paths to files are correct).
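A quick way to confirm that these files are where the tutorial expects them is a small shell check. The sketch below assumes a POSIX sh rather than csh, and the function name check_txschema_etc is made up for illustration. It only tests that the four files named above exist under the given etc directory; it does not look inside properties.txt, so the paths in that file still need to be checked by eye.

```shell
# Hypothetical helper (not part of tXSchema): report which of the files
# listed above are missing from a tXSchema etc directory.
check_txschema_etc() {
    etc_dir=$1
    missing=0
    for f in TDSchema.xsd TSSchema.xsd ASchema.xsd properties.txt; do
        if [ -f "$etc_dir/$f" ]; then
            echo "found:   $etc_dir/$f"
        else
            echo "MISSING: $etc_dir/$f" >&2
            missing=1
        fi
    done
    # Exit status 0 if every file was found, 1 otherwise.
    return "$missing"
}

# Usage, assuming TXSCHEMA has been set in the current shell:
#   check_txschema_etc "$TXSCHEMA/etc"
```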



What we need to create ourselves

  • Let's create a directory for our run. Let's call it myRun1:
$ cd $TXSCHEMA/src/test_cases
$ mkdir myRun1
$ cd myRun1
  • Let's create the "Snapshot Schema:" the schema against which salary_version1.xml and salary_version2.xml validate. Call it "mySnapshotSchema.xsd".


mySnapshotSchema.xsd

<?xml version="1.0" encoding="UTF-8"?>

<!--    mySnapshotSchema.xsd. Created by Stephen W. Thomas as
        a simple example. 
        June 2008.
-->

<xsd:schema     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                elementFormDefault="qualified" 
                attributeFormDefault="unqualified">

  <xsd:element name="person">
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:element ref="personData"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name="personData">
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:element name="personID" type="xsd:integer" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="firstName" type="xsd:string" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="salary"    type="xsd:string" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>


  • Let's create the "Non-Temporal Data:" a series of simple, regular XML documents. Since this is a simple example, let's make only two: call them salary_version1.xml and salary_version2.xml. Note that salary_version2.xml will be exactly the same as version 1 except for one minor difference (the salary value).


salary_version1.xml

<?xml version="1.0" encoding="UTF-8"?>
<person     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
            xsi:noNamespaceSchemaLocation="./mySnapshotSchema.xsd">
    <personData>
        <personID>67</personID>
        <firstName>Steve</firstName>
        <salary>45000</salary>
    </personData>
</person>


salary_version2.xml

<?xml version="1.0" encoding="UTF-8"?>
<person     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
            xsi:noNamespaceSchemaLocation="./mySnapshotSchema.xsd">
    <personData>
        <personID>67</personID>
        <firstName>Steve</firstName>
        <salary>51000</salary>
    </personData>
</person>
  • At this point, we should be able to use a traditional validator tool to make sure our schema and XML documents are formatted correctly. Let's use libxml2's xmllint tool (which comes standard with most *nix distributions).
$ xmllint --schema mySnapshotSchema.xsd salary_version1.xml
$ xmllint --schema mySnapshotSchema.xsd salary_version2.xml

After each of the above commands, xmllint echoes the document and then prints a line similar to "salary_version1.xml validates". (Adding the --noout option suppresses the echoed document.) If any errors occurred, read the output produced by xmllint; the output is usually helpful.

  • Let's create the "annotation" - which is an xml document describing which elements can change over time (logical) and where in the file a timestamp can be placed (physical). Call the file "Annotations.xml".


Annotations.xml

<?xml version="1.0" encoding="utf-8"?>
<annotationSet xmlns="http://www.cs.arizona.edu/tau/ASchema"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://www.cs.arizona.edu/tau/ASchema
                                    ../../../etc/ASchema.xsd">

  <logical>
    <item target="salary">
      <transactionTime kind="state" content="varying" existence="constant" />
      <itemIdentifier name="@text" timeDimension="transactionTime">
        <field path="/@text"/>
      </itemIdentifier>
    </item>
  </logical>
  <physical>
    <default>
      <format plugin="XMLSchema" granularity="days"/>
    </default>
    <stamp target="salary">
      <stampKind timeDimension="transactionTime" stampBounds="extent" />
    </stamp>
  </physical>
</annotationSet>


After the file is created, it should validate against the ASchema.xsd document that comes standard with tXSchema. Let's check to make sure.

$ xmllint --schema ../../../etc/ASchema.xsd Annotations.xml
  • Let's create the "temporal schema" - which is an xml document that simply references the Snapshot Schema and Annotation documents we just created. Call the file "myTempSchema.xml".


myTempSchema.xml

<?xml version="1.0" encoding="utf-8"?>
<temporalSchema xmlns="http://www.cs.arizona.edu/tau/TSSchema"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://www.cs.arizona.edu/tau/TSSchema
                                    ../../../etc/TSSchema.xsd">
  <conventionalSchema>
    <include schemaLocation="mySnapshotSchema.xsd"/>
  </conventionalSchema>
  <annotationSet>
    <include location="Annotations.xml"/>
  </annotationSet>

</temporalSchema>


After the file is created, it should validate against the TSSchema.xsd document that comes standard with tXSchema. Let's check to make sure.

$ xmllint --schema ../../../etc/TSSchema.xsd myTempSchema.xml
  • Let's create the "temporal document" - which is an xml document that simply references the temporal schema and Snapshot XML files that we've created. This document is the actual input to Squash. Call the file "temporalSalary.xml".


temporalSalary.xml

<?xml version="1.0" encoding="utf-8"?>
<temporalRoot xmlns="http://www.cs.arizona.edu/tau/TDSchema"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://www.cs.arizona.edu/tau/TDSchema
                                  ../../../etc/TDSchema.xsd">
  <temporalSchemaSet>
    <temporalSchema schemaLocation="./myTempSchema.xml"/>
  </temporalSchemaSet>

  <sliceSequence>
    <slice location="./salary_version1.xml" begin="2002-06-21" end="2002-06-22"/>
    <slice location="./salary_version2.xml" begin="2002-06-22" end="2002-06-23"/>
  </sliceSequence>

</temporalRoot>


After the file is created, it should validate against the TDSchema.xsd document that comes standard with tXSchema. Let's check to make sure.

$ xmllint --schema ../../../etc/TDSchema.xsd temporalSalary.xml
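
The individual xmllint checks in this section can be collected into one pass, which is handy to rerun after editing any of the files. The following is a minimal sketch for a POSIX sh, not something shipped with tXSchema: the function name validate_pairs and the "schema:document" argument convention are assumptions made for this example. It relies only on xmllint's --noout and --schema options and on xmllint returning a non-zero exit status when validation fails.

```shell
# Hypothetical helper: validate each "schema:document" pair with xmllint.
# XMLLINT may be set to use a different executable; the default is the
# xmllint found on PATH.
validate_pairs() {
    status=0
    for pair in "$@"; do
        schema=${pair%%:*}   # text before the first colon
        doc=${pair#*:}       # text after the first colon
        # --noout suppresses the echoed document; errors and the
        # "validates" line are still printed.
        if ! "${XMLLINT:-xmllint}" --noout --schema "$schema" "$doc"; then
            status=1
        fi
    done
    # 0 if every document validated, 1 if any of them failed.
    return "$status"
}

# Usage, run from the myRun1 directory:
#   validate_pairs \
#       mySnapshotSchema.xsd:salary_version1.xml \
#       mySnapshotSchema.xsd:salary_version2.xml \
#       ../../../etc/ASchema.xsd:Annotations.xml \
#       ../../../etc/TSSchema.xsd:myTempSchema.xml \
#       ../../../etc/TDSchema.xsd:temporalSalary.xml
```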




Running Squash

Assuming the installation is complete and your environment is set up correctly, running squash is easy:

$ squash temporalSalary.xml


You may see a few warning messages or other debug output from tXSchema, but unless you see a stack trace being dumped to stderr, squash has completed successfully.
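
Since scanning the debug output for a stack trace is easy to get wrong, it can help to also confirm that the output file (temporalSalary_squashed.xml, described under Viewing Output) was actually written. The sketch below assumes a POSIX sh; the function name check_squash_output is made up, and the naming rule it uses (strip ".xml", append "_squashed.xml") is inferred from the single example in this tutorial rather than taken from the squash documentation.

```shell
# Hypothetical helper: given the temporal document that was passed to
# squash, check that the matching "_squashed.xml" file exists and is
# not empty. The naming rule is an assumption based on this tutorial.
check_squash_output() {
    input=$1
    output="${input%.xml}_squashed.xml"
    if [ -s "$output" ]; then
        echo "squash output present: $output"
    else
        echo "squash output missing or empty: $output" >&2
        return 1
    fi
}

# Usage, after running squash in the myRun1 directory:
#   check_squash_output temporalSalary.xml
```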


Viewing Output

The output of squash is "temporalSalary_squashed.xml", which is another temporal document that summarizes all the versions of the "Non-Temporal" XML documents. The file should look something like:


temporalSalary_squashed.xml

<?xml version="1.0" encoding="UTF-8"?>
<temporalRoot xmlns:tv="http://www.cs.arizona.edu/tau/TVSchema" begin="2002-06-21" end="2002-06-23">
  <temporalSchemaSet>
     <temporalSchema schemaLocation="./myTempSchema.xml"/>
  </temporalSchemaSet>
  <person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <personData>
        <personID>67</personID>
        <firstName>Steve</firstName>
	<salary_RepItem isItem="y" originalElement="salary">
           <salary_Version>
              <tv:timestamp_TransExtent begin="2002-06-21" end="2002-06-22"/>
              <salary>45000</salary>
           </salary_Version>
           <salary_Version>
              <tv:timestamp_TransExtent begin="2002-06-22" end="2002-06-23"/>
              <salary>51000</salary>
           </salary_Version>
        </salary_RepItem>
     </personData>
  </person>
</temporalRoot>



This file can now be used as input to Temporal Validator. See the tutorials page for a list of tutorials that use Temporal Validator.

The same process can be used for temporal documents with more slices (i.e. more versions of salary_....xml).
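
With more than a couple of versions, writing the slice list by hand gets tedious. As a rough sketch (assuming a POSIX sh), the function below prints a temporal document with the same layout as temporalSalary.xml for any number of slices. The name make_temporal_doc and the "location,begin,end" argument format are invented for this example, and salary_version3.xml with its dates is a hypothetical third version that does not exist in this tutorial.

```shell
# Hypothetical helper: print a temporal document for any number of slices.
# Each argument has the form "location,begin,end".
make_temporal_doc() {
    cat <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<temporalRoot xmlns="http://www.cs.arizona.edu/tau/TDSchema"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://www.cs.arizona.edu/tau/TDSchema
                                  ../../../etc/TDSchema.xsd">
  <temporalSchemaSet>
    <temporalSchema schemaLocation="./myTempSchema.xml"/>
  </temporalSchemaSet>

  <sliceSequence>
EOF
    for slice in "$@"; do
        location=${slice%%,*}   # first field
        rest=${slice#*,}        # "begin,end"
        begin=${rest%%,*}
        end=${rest#*,}
        printf '    <slice location="%s" begin="%s" end="%s"/>\n' \
            "$location" "$begin" "$end"
    done
    cat <<'EOF'
  </sliceSequence>

</temporalRoot>
EOF
}

# Usage (the third slice is hypothetical):
#   make_temporal_doc \
#       ./salary_version1.xml,2002-06-21,2002-06-22 \
#       ./salary_version2.xml,2002-06-22,2002-06-23 \
#       ./salary_version3.xml,2002-06-23,2002-06-24 > temporalSalary.xml
```

The generated file should still be checked against TDSchema.xsd with xmllint before it is passed to squash.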



Page last modified on June 15, 2009, at 01:25 PM