CaseWiki:XML Embedding Extension
What Is It?
The XML Embedding Extension allows XML fragments to be embedded in a page.
Why Embed XML?
XML is easily read by computers. It can be used to store and retrieve information about an object.
Why Use a Wiki?
Say you are creating a list of all the buildings on a campus. How do you maintain an up-to-date listing of those buildings? How do you store the data? Sure, you could have a central file somewhere, but who has access to it? What happens when a building is built? Who updates the record? By storing the data in a wiki, you make the data publicly available and give people an easy way to update it. As an added bonus, you can view the history of that data.
How It Works
Saving XML
XML can be embedded between <xmldata> and </xmldata> tags in an article. At save time, this XML is extracted and verified. The first step of verification is to check whether the text is actually well-formed XML. If it is, it is validated against the RelaxNG schema defined at CaseWiki:RelaxNG. If either step fails, nothing special happens and the article is saved as normal. If both steps succeed, the XML structure is stored in a special database table. The table has four fields: a unique ID, an article ID, the root element name, and the XML itself.
Next, the XML is examined more closely. The extension looks for all <wiki> elements. It takes the contents of each one (text, as defined by the schema on this site) and checks whether that text names an article in the Case Wiki. If it does, an entry is made in a second database table recording that this XML block references that wiki article.
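The save-time steps above can be sketched roughly as follows. This is an illustration in Python, not the extension's actual code. The table names, column names, and function names are all hypothetical, the set of existing article titles stands in for a real wiki lookup, and the RelaxNG validation step is only marked with a comment because the standard library has no RelaxNG validator.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Matches the body of every <xmldata>...</xmldata> block in the article text.
XMLDATA_RE = re.compile(r"<xmldata>(.*?)</xmldata>", re.DOTALL)


def create_tables(db):
    # Table 1: the four fields described above (names are hypothetical).
    db.execute(
        "CREATE TABLE xml_block ("
        "id INTEGER PRIMARY KEY, article_id INTEGER, root_name TEXT, xml TEXT)"
    )
    # Table 2: which wiki articles each stored XML block references.
    db.execute("CREATE TABLE xml_reference (block_id INTEGER, article_title TEXT)")


def save_xml_blocks(db, article_id, article_text, existing_titles):
    """Store each well-formed <xmldata> block; return the number stored."""
    stored = 0
    for fragment in XMLDATA_RE.findall(article_text):
        fragment = fragment.strip()
        try:
            root = ET.fromstring(fragment)
        except ET.ParseError:
            continue  # not well-formed: nothing special happens
        # A real implementation would validate against the RelaxNG schema here.
        cur = db.execute(
            "INSERT INTO xml_block (article_id, root_name, xml) VALUES (?, ?, ?)",
            (article_id, root.tag, fragment),
        )
        # Record a reference for every <wiki> element that names a real article.
        for wiki in root.iter("wiki"):
            title = (wiki.text or "").strip()
            if title in existing_titles:
                db.execute(
                    "INSERT INTO xml_reference (block_id, article_title) VALUES (?, ?)",
                    (cur.lastrowid, title),
                )
        stored += 1
    return stored
```

Calling `save_xml_blocks(db, 7, text, {"Tomlinson Hall"})` on an in-memory `sqlite3` database would store one row per well-formed block and one reference row per `<wiki>` element whose text is in the given set of titles.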
Displaying XML
Nobody likes reading straight XML. It is boring and worthless. This is where MediaWiki's parser hook comes to the rescue. A parser rendering hook is registered for the <xmldata> tag. The string passed to the hook rendering function is the actual XML that is saved inside the page. Again, the XML is verified against the RelaxNG schema on this site. Next, the name of the root element of the XML is found. The rendering function then looks for an XSLT document at Template:XSLT:rootName where rootName is the name of the root element. If this template exists, the XSLT is extracted and applied to the XML document. The resulting transformation is returned and the user sees something meaningful.
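As a rough illustration of the lookup described above, the sketch below derives the stylesheet page title from the root element and applies a transform if that page exists. The function names, the `templates` dictionary standing in for wiki pages, and the injected `transform` callable are all hypothetical. Python's standard library has no XSLT processor, so the transform is passed in by the caller. Falling back to the raw XML when no template exists is an assumption, since the text above does not say what happens in that case.

```python
import xml.etree.ElementTree as ET


def xslt_template_title(xml_text):
    """Return the wiki page title where the stylesheet for this XML would live."""
    root = ET.fromstring(xml_text)
    return "Template:XSLT:" + root.tag


def render_xmldata(xml_text, templates, transform):
    """Render an <xmldata> body, falling back to the raw XML.

    `templates` maps page titles to XSLT source. `transform` is any
    callable (xml_text, xslt_text) -> str supplied by the caller.
    """
    try:
        title = xslt_template_title(xml_text)
    except ET.ParseError:
        return xml_text  # not well-formed: show the text unchanged
    xslt = templates.get(title)
    if xslt is None:
        return xml_text  # assumption: no template means no transformation
    return transform(xml_text, xslt)
```

For a document whose root element is `building`, `xslt_template_title` returns `Template:XSLT:building`, which matches the template listed under Examples.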
Retrieving XML
Storing the XML in a separate database table makes obtaining information relatively easy. We further this ease by indexing the root element of every XML document. A web service could easily be written to connect to the database and return all documents with a specific root element. Alternatively, we could return all documents that reference a specific wiki article. The possibilities for using the data are almost limitless. For example, say you store XML in your wiki about every building on your campus. This XML contains GPS coordinates. You could query for a list of buildings, extract the GPS coordinates from every building, and then do something useful with the data.
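The kind of query described above might look like the following sketch. It is only an illustration: the table layout and its names are hypothetical, and the `name`, `gps`, `latitude`, and `longitude` element names are assumptions about what a building document could contain, not the schema actually used on this site.

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_table(db):
    # Hypothetical layout: unique ID, article ID, root element name, XML text.
    db.execute(
        "CREATE TABLE xml_block ("
        "id INTEGER PRIMARY KEY, article_id INTEGER, root_name TEXT, xml TEXT)"
    )
    # Indexing the root element name keeps lookups by document type cheap.
    db.execute("CREATE INDEX xml_block_root ON xml_block (root_name)")


def documents_with_root(db, root_name):
    """Return the XML text of every stored document with the given root element."""
    rows = db.execute(
        "SELECT xml FROM xml_block WHERE root_name = ? ORDER BY id", (root_name,)
    )
    return [row[0] for row in rows]


def building_coordinates(db):
    """Map building name -> (latitude, longitude) for buildings that have both."""
    coords = {}
    for xml_text in documents_with_root(db, "building"):
        root = ET.fromstring(xml_text)
        name = root.findtext("name")
        lat = root.findtext("gps/latitude")
        lon = root.findtext("gps/longitude")
        if name and lat and lon:
            coords[name] = (float(lat), float(lon))
    return coords
```

A web service would wrap something like `building_coordinates` and hand the result to a mapping tool. Documents without coordinates are skipped rather than treated as errors.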
Case Wiki's Implementation
This wiki has a special namespace (Metadata) where the storing of XML data is allowed. For all other namespaces, embedded XML data will be stripped from the article text at save time.
We have chosen not to allow XML to be embedded in normal articles because it would just confuse users. Editing XML is not for the average user! Instead, XML is stored in a separate namespace, and the XML schema is designed so that every XML element has the option of referencing wiki topics. A quick parser hook was written that dereferences all objects that reference an article and displays links to those pages. In the future, we hope to expand this ability into something more practical.
Examples
- Metadata:Buildings/Tomlinson Hall
- Template:XSLT:building
- Tomlinson Hall
- Metadata:Food/Einstein Bagel
- Nord Hall
Limitations
The biggest limitation to this extension is that it requires users to manually type XML in pages. This is not desirable! The following alternatives have been discussed:
- Create a form input for data
  - Although practical, how does one actually do this? The holy grail would be a schema-to-form converter. Without one, you need a static schema or else the forms break.
- Create an alternative data entry method
  - What about simple delimited entries? For example:
<xmldata> BUILDING name=Tomlinson Hall wiki=Tomlinson Hall abbr=TOML height=3 </xmldata>
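To make the delimited-entry idea concrete, here is one way such an entry could be turned into XML. This is only a sketch of a possible syntax, not something the extension does. It assumes the first word names the root element, that each `key=value` pair becomes a child element, and that a value runs until the next ` key=` or the end of the entry, so values may contain spaces. The function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# A value runs (lazily) until the next " key=" or the end of the entry.
PAIR_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|\s*$)", re.DOTALL)


def delimited_to_xml(entry):
    """Convert 'TYPE key=value key=value ...' into an XML string."""
    parts = entry.strip().split(None, 1)
    if not parts:
        raise ValueError("empty entry")
    # Assumption: the leading word, lowercased, is the root element name.
    root = ET.Element(parts[0].lower())
    rest = parts[1] if len(parts) > 1 else ""
    for key, value in PAIR_RE.findall(rest):
        ET.SubElement(root, key).text = value.strip()
    return ET.tostring(root, encoding="unicode")


print(delimited_to_xml(
    "BUILDING name=Tomlinson Hall wiki=Tomlinson Hall abbr=TOML height=3"
))
# <building><name>Tomlinson Hall</name><wiki>Tomlinson Hall</wiki><abbr>TOML</abbr><height>3</height></building>
```

The format is flat by design: it cannot express nested elements, and a value that itself contains a space followed by `word=` would be split incorrectly. It would therefore only suit simple record-like documents.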
The other limitation is finding a use for this data. How could you use it? Why should it be done this way?
Extension Source Code
Source code for this extension can be found at http://opensource.case.edu/projects/MediaWikiHacks.
