XML Smart Data Exchange

(Version 1 - Updated June 2011)
(Updated 9/15/2011 to correct usage of recordID)

BizInt Solutions has worked together with vendors of analysis tools to create an XML format suitable for transferring data between software tools. This XML has been implemented by BizInt Smart Charts and by Vantage Point, and an implementation by INTELLIXIR is in progress.

This document describes the XML Smart Data Exchange format. In addition to describing the format, this document explains the key features to include in any set of results destined for import into BizInt Smart Charts.

User documentation for using this feature can be found here.

XML Elements

The document element for this format is InterToolExchange. This element contains all of the data to be transferred.

<?xml [version] [encoding] ?>
<InterToolExchange>

Information about the dataset contained in the exchange is presented in a list of dataset descriptions.

<datasetList>
 <dataset source="com.bizcharts">
  <title>This is a descriptive title of the data set</title>
  <user>John User</user>
  <generator>tool and version information</generator>
  <description label="Search String">the search string</description>
  <description label="another">another descriptive string</description>
 </dataset>
</datasetList>

Each exchange should contain at least one dataset describing the collection.

A tool may add its dataset description to the end of the datasetList, or may include only its own dataset description. The list would function as documentation of how this exchange was created.
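
As a rough illustration of the first option (adding a description to the end of the list), the sketch below uses Python's standard xml.etree.ElementTree module. The source identifier org.example.tool, the titles, and the helper name append_dataset are hypothetical. Placing a newly created datasetList first in the document is an assumption, not something this format mandates.

```python
import xml.etree.ElementTree as ET

def append_dataset(root, source, title, generator):
    """Append a dataset description to the end of datasetList."""
    dataset_list = root.find("datasetList")
    if dataset_list is None:
        # Assumption: when no list exists yet, put it ahead of the records.
        dataset_list = ET.Element("datasetList")
        root.insert(0, dataset_list)
    dataset = ET.SubElement(dataset_list, "dataset", {"source": source})
    ET.SubElement(dataset, "title").text = title
    ET.SubElement(dataset, "generator").text = generator
    return dataset

exchange = ET.fromstring(
    "<InterToolExchange><datasetList>"
    '<dataset source="com.bizcharts"><title>Original export</title></dataset>'
    "</datasetList></InterToolExchange>"
)
# "org.example.tool" is a made-up source name for the second tool.
append_dataset(exchange, "org.example.tool", "Clustered subset", "Example Tool 0.1")
sources = [d.get("source") for d in exchange.iter("dataset")]
```

After the call, the list records both tools in the order in which they handled the data.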

An InterToolExchange contains a list of fielded records. Each record is self-describing, allowing data from a variety of sources to be mixed in a single export.

<record>
...
</record>
<record>
...
</record>

Meta-data related to the display or presentation of a record may be included in each record. A small selection of meta labels is defined by the interface [tbd]. Others may be included by tools as needed.

<record>
<meta label="database">FOO</meta>
<meta label="an">2001:123456</meta>
<meta label="updateDate">2001-04-25</meta>
<meta label="databaseLabel">The FOO Database</meta>
<meta label="copyright">Copyright 2001 Foo Bar Inc</meta>
<meta label="recordURL">http://foo.bar/123456.pl</meta>
<meta label="groupID">1</meta>

In certain circumstances, a piece of meta-data may have more than one value (e.g. multiple recordURL values, or multiple instances of an, when a record is a composite of several sources). In this case, meta may contain value elements. An attribute numItems may be included with meta, indicating the number of values in the meta.

<meta label="an">
  <value>first AN</value>
  <value>second AN</value>
</meta>

Meta-data can be thought of as a generic presentation of record attributes.
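
A receiving tool has to cope with both forms of meta: plain text content and a list of value elements. The following is a minimal sketch of one way to read either form in Python; the function name meta_values is our own and is not part of the format.

```python
import xml.etree.ElementTree as ET

def meta_values(record, label):
    """Return every value stored under a meta label as a flat list of strings."""
    values = []
    for meta in record.findall("meta"):
        if meta.get("label") != label:
            continue
        children = meta.findall("value")
        if children:
            # Multi-valued form: one <value> element per item.
            values.extend((v.text or "") for v in children)
        elif meta.text and meta.text.strip():
            # Single-valued form: the text sits directly inside <meta>.
            values.append(meta.text.strip())
    return values

record = ET.fromstring(
    "<record>"
    '<meta label="database">FOO</meta>'
    '<meta label="an"><value>first AN</value><value>second AN</value></meta>'
    "</record>"
)
```

Returning a list in every case means callers do not need to know in advance which form the exporting tool chose.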

A tool is not expected to export 100% of its internal data related to a record using this interface, although it could if so desired. Rather, the fields of data which are required by a second tool could be exported as a subset of the record. Tools should probably offer a means for a user to specify what they want to export.

An intended use for this interface is for a round-trip analysis, where data is exported from Tool 1 into Tool 2. Tool 2 will perform operations on the exported data, editing field contents, creating new derived content, and/or selecting a subset of the data. These work products will then be exported from Tool 2 back to Tool 1, where they will be integrated with the original data. To facilitate this, a unique record identifier may be exported by a tool along with each record. All tools MUST preserve any record identifiers produced by other tools found with a record, and export them as part of the record.

<meta label="recordID">com.bizcharts:1234-abcd:3</meta>
<meta label="recordID">fr.intellixir:id.abcd.4321.1</meta>
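
The preservation rule can be sketched as follows. This is only an illustration: the identifier org.example.tool:rec-42 and the helper export_record are hypothetical, and the sketch assumes each recordID is carried as plain text in its own meta element, as in the sample above.

```python
import xml.etree.ElementTree as ET

def export_record(incoming, own_id):
    """Build an outgoing record that keeps all incoming recordIDs and adds our own."""
    outgoing = ET.Element("record")
    seen = []
    for meta in incoming.findall("meta"):
        if meta.get("label") == "recordID":
            # Identifiers from other tools are copied through unchanged.
            ET.SubElement(outgoing, "meta", {"label": "recordID"}).text = meta.text
            seen.append(meta.text)
    if own_id not in seen:
        ET.SubElement(outgoing, "meta", {"label": "recordID"}).text = own_id
    return outgoing

incoming = ET.fromstring(
    '<record><meta label="recordID">com.bizcharts:1234-abcd:3</meta></record>'
)
# Hypothetical identifier for the second tool in the round trip.
outgoing = export_record(incoming, "org.example.tool:rec-42")
ids = [m.text for m in outgoing.findall("meta")]
```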

The data in a record consists of a list of one or more fields. The fields are not contained in a list or other container (record is the container object).

<field>...</field>
<field>...</field>

Each field has the following attributes. Only the label attribute is required:

label (REQUIRED) = user label for the field
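vlabels = labels for sub-fields in a vector (the samples in this document carry these in a <vlabels> child element)
vsep = separator used between vector sub-items in place of <sep />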
tag = an optional name for a field which is expected to be consistent for a particular type of data from a particular generator
datatype = text (default), vector, image, url, number, date
state = original (default), edited, generated
numItems = number of values

Every field contains, at least, a label and a value. The label is assumed to be human-readable, but it could be a simple tag. Values are always text (never binary data). If a field contains multiple data items, each is placed in its own value.

<field label="IPC 8">
  <value>B65B7/28</value>
  <value>B65D41/56</value>
</field>

Some fields contain related data items on a single line. We call this data type a vector (datatype="vector"). Elements of a vector are separated by an item separator element <sep />, which would take the place of a tab in many presentations. Vector data elements must have a label for each sub-item, contained in a <vlabels> element.

Here is a vector field showing a small patent family.

<field label="Patent Family" datatype="vector">
  <vlabels>Number<sep />Kind<sep />Date</vlabels>
  <value>US1234567<sep />A<sep />2001-01-15</value>
  <value>CA1234567<sep />B<sep />2004-12-31</value>
</field>

An export may choose to use a different separator for vector elements. This is specified in a vsep attribute. The following sample presents the same family using the separator "||".

<field label="Patent Family" datatype="vector" vsep="||">
  <vlabels>Number||Kind||Date</vlabels>
  <value>US1234567||A||2001-01-15</value>
  <value>CA1234567||B||2004-12-31</value>
</field>

Whitespace is not required, and is included in our samples for clarity only.

Fields which are empty do not need to be present.

Sub-fields in a vector may be empty, which may cause consecutive <sep /> elements with no intervening space (e.g. <sep /><sep />).
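
To make the vector rules concrete, here is a hedged sketch of a reader that handles both the <sep /> form and the vsep form, including empty sub-fields. The function names split_items and parse_vector are our own. The sketch does not trim whitespace, and it assumes the field contains a vlabels element.

```python
import xml.etree.ElementTree as ET

def split_items(element, vsep=None):
    """Split one <vlabels> or <value> element into its sub-items."""
    if vsep is not None:
        return (element.text or "").split(vsep)
    items = [element.text or ""]
    for sep in element.findall("sep"):
        # ElementTree stores the text that follows <sep /> in its tail.
        items.append(sep.tail or "")
    return items

def parse_vector(field):
    """Return one dict per <value>, keyed by the sub-field labels."""
    vsep = field.get("vsep")
    labels = split_items(field.find("vlabels"), vsep)
    return [dict(zip(labels, split_items(v, vsep))) for v in field.findall("value")]

field = ET.fromstring(
    '<field label="Patent Family" datatype="vector">'
    "<vlabels>Number<sep />Kind<sep />Date</vlabels>"
    "<value>US1234567<sep /><sep />2001-01-15</value>"
    "</field>"
)
rows = parse_vector(field)
```

In this sample the Kind sub-field is empty, so the two <sep /> elements are adjacent and the reader yields an empty string for Kind.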

Since this interchange format is intended to allow tools to modify records, a state attribute is provided to flag a field as "original", "edited", or "generated". The default is original, so in general the attribute should not be included for unmodified fields.

<field state="edited"...

The intention of this information is to indicate how the receiving tool should treat the exchanged information. 'edited' is used to indicate that the data can replace the original data in the record. 'generated' would be used to indicate that new classification data (or other derived information) could be used to augment the existing record.
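
One possible reading of these semantics, sketched in Python: the row is modelled as a plain dictionary from label to list of values, which is our own simplification and not how any particular tool stores its data. Leaving the row untouched for original fields is likewise an assumption.

```python
import xml.etree.ElementTree as ET

def apply_field(row, field):
    """Merge one exchanged <field> into a row modelled as {label: [values]}."""
    label = field.get("label")
    values = [v.text or "" for v in field.findall("value")]
    state = field.get("state", "original")  # "original" is the documented default
    if state == "edited":
        # Edited data can replace the original data in the record.
        row[label] = values
    elif state == "generated":
        # Generated data augments the existing record.
        row.setdefault(label, []).extend(values)
    # Assumption: "original" fields carry nothing new, so the row is left alone.
    return row

row = {"Title": ["Old title"]}
apply_field(row, ET.fromstring(
    '<field label="Title" state="edited"><value>New title</value></field>'))
apply_field(row, ET.fromstring(
    '<field label="Cluster" state="generated"><value>Packaging</value></field>'))
```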

A tool may generate new records in the exchange. These records will not be marked in any way except that they will only have a recordID from the generating tool.

One attribute of a record is the groupID, carried as a meta value. This is used to indicate a relationship between all records with the same groupID, such as records which describe the same patent family. We envision that one use of newly generated records is to summarize a group of records. In this case, the new record should have the same groupID as the records it summarizes.
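
A receiving tool might collect related records along these lines. This is a sketch only; the helper name records_by_group is hypothetical, and it assumes groupID is exported as a single-valued meta, as in the earlier record sample.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def records_by_group(root):
    """Map each groupID to the list of records that carry it."""
    groups = defaultdict(list)
    for record in root.iter("record"):
        for meta in record.findall("meta"):
            if meta.get("label") == "groupID" and meta.text:
                groups[meta.text.strip()].append(record)
    return dict(groups)

exchange = ET.fromstring(
    "<InterToolExchange>"
    '<record><meta label="groupID">1</meta></record>'
    '<record><meta label="groupID">2</meta></record>'
    '<record><meta label="groupID">1</meta></record>'
    "</InterToolExchange>"
)
sizes = {gid: len(recs) for gid, recs in records_by_group(exchange).items()}
```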

Required Elements for Adding Data to an Existing Chart

BizInt Smart Charts can import an InterToolExchange document and use the results to modify an existing chart. In order to do this, each record must carry a recordID from com.bizcharts. That is to say, each record must be a transformation of a record generated by BizInt Smart Charts.

An existing chart may be modified in one of three ways, and the required information varies slightly in each case.

The first modification is selection of rows in a chart. In this case, the InterToolExchange requires only a list of records containing a recordID meta value. No fields need to be present. The rows whose recordIDs are present in the exchange will be selected.

<record>
<meta label="recordID">com.bizcharts:1234-abcd:3</meta>
</record>
<record>
<meta label="recordID">com.bizcharts:1234-abcd:7</meta>
</record>
<record>
<meta label="recordID">com.bizcharts:1234-abcd:8</meta>
</record>
<record>
<meta label="recordID">com.bizcharts:1234-abcd:9</meta>
</record>
<record>
<meta label="recordID">com.bizcharts:1234-abcd:15</meta>
</record>
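
For illustration, the identifiers relevant to a selection could be gathered like this. Matching on the prefix com.bizcharts: is an assumption based on the samples in this document, not a documented rule, and the function name chart_record_ids is our own.

```python
import xml.etree.ElementTree as ET

def chart_record_ids(root, prefix="com.bizcharts:"):
    """Collect, in document order, the recordIDs that start with the given prefix."""
    ids = []
    for record in root.iter("record"):
        for meta in record.findall("meta"):
            text = (meta.text or "").strip()
            if meta.get("label") == "recordID" and text.startswith(prefix):
                ids.append(text)
    return ids

exchange = ET.fromstring(
    "<InterToolExchange>"
    '<record><meta label="recordID">com.bizcharts:1234-abcd:3</meta></record>'
    '<record><meta label="recordID">fr.intellixir:id.abcd.4321.1</meta></record>'
    '<record><meta label="recordID">com.bizcharts:1234-abcd:7</meta></record>'
    "</InterToolExchange>"
)
selected = chart_record_ids(exchange)
```

The record that carries only an identifier from another tool contributes nothing to the selection.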

The second modification is to add new data cells to chart rows. The exchange should include the new data in a field of the corresponding record. The label of the field will be used as the column label in the chart.

The attribute state="generated" is assumed if the field label does not match a column title in the chart at the time of the import. In most cases this means the field was not present in the original export.

<record>
 <meta label="recordID">com.bizcharts:1234-abcd:3</meta>
 <field label="New Information" state="generated">
  <value>This is the new item</value>
 </field>
</record>
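
The rule for when a field is treated as generated can be sketched as below. The set of column titles and the function name effective_state are hypothetical stand-ins; how BizInt Smart Charts actually performs the comparison is not specified here.

```python
import xml.etree.ElementTree as ET

def effective_state(field, column_titles):
    """Treat a field as generated when its label matches no current column title."""
    if field.get("label") not in column_titles:
        return "generated"
    return field.get("state", "original")

# Hypothetical column titles of the chart at import time.
columns = {"Title", "Patent Family"}
new_field = ET.fromstring(
    '<field label="New Information"><value>This is the new item</value></field>')
old_field = ET.fromstring(
    '<field label="Title"><value>Unchanged</value></field>')
```

Under this reading, new_field is handled as generated even though it carries no state attribute, while old_field keeps the default state.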

Selection and adding data may be combined in an exchange.

A final modification is to edit existing cell contents in the chart. In order to edit contents, the attribute state="edited" must be present in the field. Furthermore, BizInt Smart Charts requires the field to contain the tag that appeared in the original exchange generated by BizInt Smart Charts.

<record>
 <meta label="recordID">com.bizcharts:1234-abcd:7</meta>
 <field label="Old Field" tag="OLDTAG" state="edited">
  <value>We replace the old value with this</value>
  <value>and also with this</value>
 </field>
</record>
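
Before sending edits back, an exporting tool may want to check that every edited field still carries its tag. The following is a minimal sketch of such a check; the function name untagged_edits and the label "Other Field" are made up for the example.

```python
import xml.etree.ElementTree as ET

def untagged_edits(root):
    """Return (recordID, label) pairs for edited fields that carry no tag attribute."""
    problems = []
    for record in root.iter("record"):
        # Use the first recordID found on the record to identify it in the report.
        record_id = next(
            (m.text for m in record.findall("meta") if m.get("label") == "recordID"),
            None,
        )
        for field in record.findall("field"):
            if field.get("state") == "edited" and not field.get("tag"):
                problems.append((record_id, field.get("label")))
    return problems

exchange = ET.fromstring(
    "<InterToolExchange><record>"
    '<meta label="recordID">com.bizcharts:1234-abcd:7</meta>'
    '<field label="Old Field" tag="OLDTAG" state="edited"><value>ok</value></field>'
    '<field label="Other Field" state="edited"><value>missing tag</value></field>'
    "</record></InterToolExchange>"
)
problems = untagged_edits(exchange)
```

The check only confirms that a tag is present; it cannot confirm that the tag matches the one in the original export.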

Required Elements for Creating a New Chart

[tbd]