Constructing Complex Results with SPARQL

A quick side note:

At several points, I have compared SQL to the W3C language stack. One thing that was awkward in the OWL/SWRL combination was the lack of a way to construct complex results (such as blank nodes and new sub-graphs), something SQL handles with its data manipulation language.  Ideally, we want to be able to do everything with the new stack that the older relational languages could do, without needing to leave the session (all in the same script).

As I was reviewing SPARQL, the W3C query language, I came across the CONSTRUCT query type.  This allows a query to create a new graph (a group of triples) built from the data in the matched triples.  I am starting to look at this as a possible way to get around the issue.  It appears that if OWL/SWRL takes care of making the logical inferences, the facts can then be gathered and re-formed into a new “shape” using SPARQL CONSTRUCT queries.  The problem is that while OWL/SWRL live in the session, SPARQL really lives outside, in the sense that the SPARQL query is initiated externally to pull information from the knowledge base or working memory.  If this knowledge were needed in the session, it would have to be pulled, then reinserted by some external process.
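To make the idea concrete, here is a minimal sketch in plain Python of what a CONSTRUCT-style query does: match a set of patterns against the triples, then stamp out a template once per match, minting a fresh blank node each time. This is not a SPARQL engine and not any real library's API. The function names, the `old:` and `new:` prefixes and the sample data are all made up for illustration.

```python
from itertools import count

def is_var(term):
    """Variables are written SPARQL-style, with a leading '?'."""
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; None if it cannot."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:   # variable already bound to something else
                return None
        elif p != t:                        # constant does not match
            return None
    return new

def solutions(where, triples):
    """Return every variable binding that satisfies all patterns in `where`."""
    bindings = [{}]
    for pattern in where:
        bindings = [
            extended
            for b in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, b)) is not None
        ]
    return bindings

def construct(template, where, triples):
    """Instantiate `template` once per solution, with fresh blank nodes each time."""
    fresh = count(1)
    result = set()
    for b in solutions(where, triples):
        bnodes = {}                         # blank node labels are scoped to one solution
        for pattern in template:
            out = []
            for term in pattern:
                if is_var(term):
                    out.append(b[term])
                elif term.startswith("_:"):
                    if term not in bnodes:
                        bnodes[term] = f"_:b{next(fresh)}"
                    out.append(bnodes[term])
                else:
                    out.append(term)
            result.add(tuple(out))
    return result

# Hypothetical data in an "old" schema.
source = [
    ("ex:alice", "old:worksFor", "ex:acme"),
    ("ex:bob", "old:worksFor", "ex:acme"),
    ("ex:acme", "rdf:type", "old:Company"),
]
where = [("?p", "old:worksFor", "?org"), ("?org", "rdf:type", "old:Company")]
# Re-shape each match into a new "Employment" node in a "new" schema.
template = [
    ("_:job", "rdf:type", "new:Employment"),
    ("_:job", "new:employee", "?p"),
    ("_:job", "new:employer", "?org"),
]
new_graph = construct(template, where, source)
```

Each of the two matches produces its own blank node with three triples hanging off it, which is exactly the kind of new sub-graph that is hard to get out of OWL/SWRL alone. Note that the result is a separate graph. Getting it back into the session is still the external step described above.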

Since the current work involves existing ontologies, new object construction will be an issue. I need to be able to take data in one OWL schema and construct equivalent objects using the types of a new ontology.  One project that is sure to come up is converting existing data to an upper ontology (such as SUMO).

As with any part of the stack, whether it works on a given platform is always in question.  It will need to be tested. To do the project above, I am creating some convenience modules in Java to let me construct an ETL process for OWL data (much like the loader components I would use with a SQL database or XML with ESB components).

Back to Ontology again.

Back to Ontology

The three things that are needed for serious Semantic Web applications are a standard set of languages, engines that know how to deal with these languages and knowledge bases in these languages.  This blog is supposed to be about Ontology, that is, Knowledge Representation (KR).  So far, I have had to spend a lot of time on tools and languages. With the testing of engines so far, there is enough of a platform with Pellet, OWL/SWRL/RDF and JENA triplestores to do general Ontology work, so let’s shelve all that for a while.

First of all, the W3C languages (OWL/SWRL/RDF) are not the only language group that can be used for serious Ontology work.

  • KIF was created in the 1990s and had several variants, including SUO-KIF, a version currently in use on the SUMO project.  KIF is logic in a LISP-like syntax and is expressive and quite easy to learn. One thing it lacks is RDF’s partitioning (KIF has a single name space, so the names tend to get long for disambiguation).
  • MELD is the language used by CYC and the open source OPEN-CYC (a publicly released spin-off project which contains a subset of the full CYC product).  This is another LISP-like language with several extensions to support the special inference capabilities of the CYC engine.
  • Common Logic is an upcoming ISO standard language for logic.  Like RDF, it comes in several syntaxes.  Common Logic Interchange Format (CLIF) is a LISP-like syntax (which just proves that computer scientists seriously love their nested parentheses). However, there are other syntaxes such as CGIF (Conceptual Graph Interchange Format), XCL (XML-based notation for Common Logic) and CLCE (Common Logic Controlled English).  The latter is significant, because it represents a trend towards controlled natural languages for specifying logic (also see SBVR and ACE).

For now, I am sticking to OWL/SWRL, but CLCE and others like it are serious enticements.

So, what next? Here are some important areas that look interesting:

  • Existing Ontology Standards – There are literally thousands of published standards available on the web.  Take a look at SchemaWeb or Swoogle. There are a few, though, that have been especially well crafted and have gained wide acceptance. These are important because when doing Ontology work, it is better not to start from scratch every time a project starts.
    • Upper Ontologies – These try to act as universal definitions of everything. Generally, they are abstract and are meant to be the core to which other (domain-specific) ontologies link their terms, so that everything linked shares a single world-view and supports cross-inference.
    • Domain-specific Ontologies – There are a number of well-accepted standard ontologies for things like various kinds of time, standards, spatial relations and so on.
  • Semantic Design Patterns – What is the best approach to representing various kinds of knowledge domains?  While the whole SW effort has been directed almost exclusively at tools and languages, very little work has been done on exactly what knowledge should be constructed and how.  Fortunately, philosophers have been arguing these fine points for centuries.  So what approaches are good?  Some areas to examine:
    • Space - There are many ways to represent space (volumes, cartography, grids), ways to measure space (units) and relations in space (topological, compass directions, containment, touching regions, mereology).
    • Time - There are many types of time, measurement systems for time and relations in time.
    • Events (what/where/who and so on) – How to represent things happening sequentially in space and time.
    • Narrative and reification – This covers notions about statements, including who believes them, who said them and in which form, when they were said and whether they are still valid.  This is important in a number of areas, such as understanding narratives and tracking business requirements in systems engineering.
    • Solution Frameworks – If you have knowledge about a problem, how do you arrange the knowledge (“frame the problem”) so that there is a clear way to solve it?  It is amazing how much tacit knowledge is involved in “simple” high-school mechanics problems, and how hard that knowledge can be to describe in a logical form.
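As a small taste of the reification item in the list above, here is a sketch in plain Python of the standard RDF reification pattern: a statement is described by a node of type rdf:Statement with rdf:subject, rdf:predicate and rdf:object, and notes about who said it and when are attached to that node. The rdf: terms are the real RDF vocabulary. The `reify` helper, the `ex:` names and the sample values are made up for illustration.

```python
def reify(triple, stmt_id, annotations):
    """Describe `triple` as an rdf:Statement node and attach annotations to it."""
    s, p, o = triple
    triples = [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    ]
    # Annotations describe the statement itself, not its subject.
    triples.extend((stmt_id, prop, value) for prop, value in annotations.items())
    return triples

# A hypothetical requirement, with hypothetical provenance properties.
claim = ("ex:pump7", "ex:maxPressure", "250")
described = reify(
    claim,
    "_:s1",
    {"ex:statedBy": "ex:alice", "ex:statedOn": "2009-05-14"},
)
```

The point to notice is that the reified form talks about the claim without asserting it: the original triple is not part of the output. That is what makes it possible to record who believes a statement, or that it is no longer valid, without committing the knowledge base to it.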

More OWL References

Well, since we are on the topic of basic OWL materials…

The primary references are:

  • OWL 2 Primer – The W3C primers are typically good material, though light on basic concepts, but the new one has an especially good feature.  It is wired so that you can choose which of the syntaxes you want to see for each of the examples.  The examples are fairly comprehensive and come in 5 dialects (functional, RDF/XML, Turtle, Manchester and OWL/XML).  This is especially useful when you are translating between dialects, since you can just turn on the two you are using, search for a keyword and see the target syntax.
  • Manchester Protégé OWL tutorial – This is the original tutorial that the Protégé team prefers.  Aside from introducing Protégé and OWL, it also teaches using the new OWL 2 Manchester syntax and teaches basic Ontology techniques.
  • Semantic Web Programming Blog – I just found this one yesterday.  It only started in March 2009 (about 2 months before this blog), but it contains really good material on OWL 2 techniques and other useful information.  It is backed by a book (link on the site).

When I started looking at OWL about 4 years ago, the materials on the web were very limited, but things are getting better by the day.

Thinking Machines

Let’s face it – computers are stupid. I have been talking to them for over 22 years and have thrown piles of code into them, but they are pretty much as stupid as ever. None of them can match the smooth-talking moon base computers of Heinlein or the robots of Asimov and I think we have all been quite patient, but it is time.

The sad part for a lot of us is that the parts have been lying around in the lab for decades.  The early inference engines were being developed as early as the 1960s and they have been improving continuously. With the semantic web and RuleML communities pushing standards, content is much easier to create.  We have tools.

One problem is that the communities involved are small enough (in terms of the web) that information gets lost, links are not comprehensive and basic questions from outsiders don’t get answered.  Also, with the tools and standards rolling over every 3 years, the churn is causing a lot of confusion.

This blog will try to cover useful information I could not find quickly when I searched the usual sites.  Hopefully it will save someone else the trouble. Generally, it should be on the following topics:

  • Inference engines, rule engines
  • Knowledge and rule interchange languages (OWL (1,2), SWRL, RDF, CL, KIF)
  • Tools (Protégé, JENA and other APIs – mostly in Java, but …)
  • Ontology and Knowledge System design – Techniques, design patterns, approaches.

So much to do…
