<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Alan Meech - Ontology Blog</title>
	<atom:link href="http://alanmeech.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://alanmeech.wordpress.com</link>
	<description>Slightly Logical Computing</description>
	<lastBuildDate>Fri, 04 Feb 2011 00:06:44 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='alanmeech.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Alan Meech - Ontology Blog</title>
		<link>http://alanmeech.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://alanmeech.wordpress.com/osd.xml" title="Alan Meech - Ontology Blog" />
	<atom:link rel='hub' href='http://alanmeech.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Chunking a Timeline</title>
		<link>http://alanmeech.wordpress.com/2010/12/08/chunking-a-timeline/</link>
		<comments>http://alanmeech.wordpress.com/2010/12/08/chunking-a-timeline/#comments</comments>
		<pubDate>Thu, 09 Dec 2010 03:31:22 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[Time]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=175</guid>
		<description><![CDATA[So, say you have the case in the last entry and a rapidly growing number of event before/after relations.  Either you pre-calculate the entries (which uses lots of space but minimizes the query time) or you leave it up to the engine to run the rules in real time (in which case even the smartest engine [...]]]></description>
			<content:encoded><![CDATA[<p>So, say you have the case in the last entry and a rapidly growing number of event before/after relations.  Either you pre-calculate the entries (which uses lots of space but minimizes the query time) or you leave it up to the engine to run the rules in real time (in which case even the smartest engine will need to calculate a huge pile of inferences at query time).  One way or another, the rules need to get fired, and for even a medium-sized database this could make operations crawl.</p>
<p>One way humans deal with the problem is partitioning.  Think about it: if someone asks whether Gandhi was born after King Solomon, most people do not need to calculate or even know the dates.  They immediately know that King Solomon was in &#8220;ancient times&#8221; and Gandhi was in &#8220;the twentieth century&#8221;, and these blocks of time are easy to compare.  Unconsciously, we file time events into groups in our memory; the groups have time intervals, and the intervals sit deep in our memory framework (heavily pre-calculated).</p>
<p>To set this up, various groups of events (&#8220;births in ancient Greece&#8221;, &#8220;current events in 1949&#8221;, &#8220;stages of construction on the Hoover Dam&#8221;) are created with set time intervals, and the before/after relations between the groups are created (or implied). Some of the groups will overlap in their intervals. Events are put into those groups and, within each group, before/after relations are created (or implied) for each of the events. This two-tiered system allows the order of some pairs of events to be determined by partition alone, as long as the two intervals are non-overlapping.  (There are other things that can be done with the overlapping intervals.)</p>
<p>Note that the exact time of the event does not need to be known.  To put an event into a group, the group should have a known interval (&#8220;events in 1940&#8221; has a definite time limit, even if all we know about its events is that they fell somewhere inside that year), but even the boundaries can be fuzzy (all &#8220;ancient Roman times&#8221; were before &#8220;the 1800s&#8221;).  In current events or project management, for instance, this gets important when tracking dialogs, news stories or other sequences, since the date something happened may be known, but the exact time when a specific part of the sequence happened is not &#8211; just what came before and after.  We humans do this all day long, every day.</p>
<p>To take advantage of this approach, however, you really need a different relation (&#8220;fast before/after&#8221; maybe) that uses these rules:</p>
<ol>
<li>If the two events are in the same group, use the direct before/after relations.</li>
<li>If the two events are in different, non-overlapping groups, use the relations between the groups to determine the before/after relation.</li>
<li>Otherwise, things get more complicated and it may not be possible to say whether the events are definitely before or after each other. There are some smart approaches that can be applied, but they are not fast.</li>
</ol>
<p>This can be done in rules inside the engine, but there is also the option of offloading the partition function. In SWRL, for instance, a custom built-in function could be defined to offload some of this functionality into a relational database.  Relational databases can take advantage of indexing to speed up the queries in each of the first two steps.  This is especially true if you have a lot of highly interconnected events in a group, such as in long-running dialogs, stories or meeting notes.  If you are doing knowledge engineering, this could be an index of subject matter expert interviews, for instance.</p>
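<p>A minimal sketch of the three rules in Python, kept outside the engine as suggested above.  Every name here (<em>Group</em>, <em>fast_before</em>, the sample events and the rough year ranges) is hypothetical and chosen only for illustration; a real system would hold these tables in an indexed store rather than in dictionaries, and this is not SWRL built-in code.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Group:
    """A named chunk of the timeline with a (possibly rough) year interval."""
    name: str
    start: int  # earliest possible year, negative for BCE
    end: int    # latest possible year


# Illustrative groups; the intervals are rough assumptions, not reference data.
GREECE = Group("births in ancient Greece", -800, -100)
HOOVER = Group("stages of construction on the Hoover Dam", 1931, 1936)
NEWS_1933 = Group("current events in 1933", 1933, 1933)

# Which group each event was filed under.
GROUP_OF = {
    "birth of Socrates": GREECE,
    "river diverted": HOOVER,
    "first concrete poured": HOOVER,
    "bank holiday declared": NEWS_1933,
}

# Direct before/after facts, stored per group as (earlier, later) pairs.
# Assumed to be already closed under transitivity inside each group.
BEFORE_IN_GROUP = {
    HOOVER: {("river diverted", "first concrete poured")},
}


def fast_before(a: str, b: str) -> Optional[bool]:
    """Return True if a is before b, False if a is after b, None if undecided."""
    ga, gb = GROUP_OF[a], GROUP_OF[b]
    if ga == gb:
        # Rule 1: same group, so use the direct relations.
        pairs = BEFORE_IN_GROUP.get(ga, set())
        if (a, b) in pairs:
            return True
        if (b, a) in pairs:
            return False
        return None
    # Rule 2: different groups whose intervals do not overlap.
    if ga.end < gb.start:
        return True
    if gb.end < ga.start:
        return False
    # Rule 3: overlapping intervals, so there is no fast answer.
    return None


print(fast_before("birth of Socrates", "river diverted"))      # True (rule 2)
print(fast_before("first concrete poured", "river diverted"))  # False (rule 1)
print(fast_before("river diverted", "bank holiday declared"))  # None (rule 3)
```

<p>Returning <em>None</em> for the third case keeps &#8220;not known quickly&#8221; distinct from &#8220;after&#8221;, so a caller can still fall back to the slower rule-based reasoning.</p>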
<p>As loose as these groups may be, even with fuzzy intervals, they can dramatically cut down on the number of events that need to be considered in a query, so, applied properly, they may allow knowledge bases to grow by an order of magnitude.</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2010/12/08/chunking-a-timeline/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
		<item>
		<title>Customizing Time Concepts</title>
		<link>http://alanmeech.wordpress.com/2010/11/12/customizing-time-concepts/</link>
		<comments>http://alanmeech.wordpress.com/2010/11/12/customizing-time-concepts/#comments</comments>
		<pubDate>Fri, 12 Nov 2010 04:11:37 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[Ontology Topics]]></category>
		<category><![CDATA[Time]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=168</guid>
		<description><![CDATA[One issue most people have with applying high-level ontologies is that they normally are very abstract and it is hard to see how they apply to the problem at hand.  Time ontologies have been studied for centuries, so philosophers have had plenty of time to simplify and extract the essence of the concepts, but an [...]]]></description>
			<content:encoded><![CDATA[<p>One issue most people have with applying high-level ontologies is that they normally are very abstract and it is hard to see how they apply to the problem at hand.  Time ontologies have been studied for centuries, so philosophers have had plenty of time to simplify and extract the essence of the concepts, but an ontology developer now needs to drag them down from the clouds to the problem area.  This can be a problem.</p>
<p>For example, history is full of births and genealogy (kings and emperors, for instance).  Births are instants, even if the date is not known, and genealogies are sets of relations between births.  If dates are known for a birth, then they can be attached to the birth event.  This part is simple.</p>
<p>However, there is also general understanding that rules apply to the sequence of time events.  For instance, a person cannot be born before his parents (either of them).  This is also transitive, so it would apply to grandparents as well, and so on.  Prior (see reference in the previous entry) lays out a fairly comprehensive logical system for representing relative time and operators which covers the representation of before, after and so on.  If these rules are applied in a birth and genealogy ontology, then they would also have to be repeated in many other time-based areas where relative sequences were used (project planning, development, speech, newscast items).</p>
<p>To avoid this, intermediate or upper ontologies can be used to hold the rules and general relations, then application areas can state sub-property relations to &#8220;inherit&#8221; from the general ontologies, allowing rule reuse.</p>
<p>Using <a href="http://www.w3.org/TR/owl-time/">OWL-Time</a> as a base ontology, say a new ontology is created which defines the following rules (SWRL presentation, sort of):</p>
<p style="padding-left:30px;"><strong><em>@prefix time: &lt;http://www.w3.org/2006/time#&gt;</em></strong></p>
<p style="padding-left:30px;"><strong><em>time:Instant(?t1), time:Instant(?t2), time:Instant(?t3), time:before(?t1,?t2), time:before(?t2,?t3) -&gt; time:before(?t1,?t3) .</em></strong></p>
<p style="padding-left:30px;"><strong><em>time:Instant(?t1), time:Instant(?t2), time:Instant(?t3), time:after(?t1,?t2), time:after(?t2,?t3) -&gt; time:after(?t1,?t3) .</em></strong></p>
<p>Pretty simple stuff, of course.  Any serious ontology would also include combinations of <em><strong>time:Interval</strong></em> and <em><strong>time:Instant</strong></em> and there are a number of more interesting axioms in Prior&#8217;s paper that could be stated as rules in the ontology.</p>
<p>In the Births and Genealogies (<em><strong>gen:</strong></em>) ontology, these concepts and rules would be inherited and extended.  The primary concept is <em><strong>gen:Person</strong></em>, which could have a birth property, but to take advantage of the existing types, births are stated as a type of <em><strong>time:Instant</strong></em> and related with an object property (<em><strong>gen:birth</strong></em>).  The basic relation between children and parents might be <em><strong>gen:parentOf</strong></em> (with sub-properties <em><strong>gen:motherOf</strong></em> and <em><strong>gen:fatherOf</strong></em>, normally).  If relative time is an issue for the application, a rule can be stated thus:</p>
<p style="padding-left:30px;"><em><strong>gen:Person(?child), gen:Person(?parent), gen:birth(?child,?t1), gen:birth(?parent, ?t2), gen:parentOf(?parent,?child) -&gt; gen:before(?t2, ?t1).</strong></em></p>
<p>When this rule fires, the system will know that parents&#8217; birth dates are before their children&#8217;s birth dates and, provided <em><strong>gen:before</strong></em> is declared a sub-property of <em><strong>time:before</strong></em> so that the upper-level ontology rules on <em><strong>time:Instant</strong></em> apply, the same will be true of the grandparents and great-grandparents.</p>
<p>Of course, if the system is capable of running simple rules like this (and that is a stretch at the moment), the above rule will lead to a rapidly growing set of assertions about dates.  If you give 10 kings (just a paternal line), the last will have 9 assertions, the one before will have 8 assertions and so on: 45 in all, a total that grows with the square of the length of the line.  Considering that other &#8220;interesting&#8221; relations and rules may have been added at the upper-level time ontology and that this is only one small dimension that might be needed in even a basic genealogy knowledge base, this will be a serious scalability issue.</p>
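<p>To get a feel for the growth, the transitive rule can be simulated in a few lines of Python.  This is only a sketch: the function names <em>closure_of_before</em> and <em>paternal_line</em> and the numbered kings are hypothetical, and a real rule engine would not necessarily materialise the pairs this way.</p>

```python
def closure_of_before(direct):
    """Return every (earlier, later) pair implied by transitivity."""
    closed = set(direct)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            for (c, d) in list(closed):
                # before(a, b) and before(b, d) imply before(a, d)
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


def paternal_line(n):
    """Direct parent-before-child facts for a single line of n kings."""
    return {(i, i + 1) for i in range(1, n)}


for n in (10, 20, 40):
    print(n, len(closure_of_before(paternal_line(n))))
```

<p>For 10, 20 and 40 kings this prints 45, 190 and 780 pairs, that is n(n-1)/2 for a single line of n births.</p>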
<p>Normally, a historical application would need to have more than a single line of rulers, and a typical question might be &#8220;Who else was alive during his lifetime?&#8221;  To answer concurrency problems in history, there is no choice but to keep these relations. How can this be improved?</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2010/11/12/customizing-time-concepts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
		<item>
		<title>Time Ontologies</title>
		<link>http://alanmeech.wordpress.com/2009/10/05/time-ontologies/</link>
		<comments>http://alanmeech.wordpress.com/2009/10/05/time-ontologies/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 02:05:23 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Time]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=163</guid>
		<description><![CDATA[The first topic area I want to look at is Time. Time is a rich area of thought and greatly complicates most existing human languages.  We daily discuss notions about future and past, continuous and instant time references, schedules, events, Now versus Then and a host of temporal relations, and we never give it a [...]]]></description>
			<content:encoded><![CDATA[<p>The first topic area I want to look at is Time. Time is a rich area of thought and greatly complicates most existing human languages.  We daily discuss notions about future and past, continuous and instant time references, schedules, events, <em>Now</em> versus <em>Then </em>and a host of temporal relations, and we never give it a second thought. Great minds have been studying the problem of Time as long as language has had tenses.  As a matter of fact, some of the recent work in the logic of Time has been in the area of Tense, some of which is summed up in this article:</p>
<ul>
<li>Prior, A. N.  (1971).<em> Recent Advances in Tense Logic</em>.  In E. Freeman and W. Sellars (Ed.), <em>Basic Issues in the Philosophy of Time</em>, The Open Court Publishing Co., La Salle, Illinois.</li>
</ul>
<p>Many systems have been proposed over time for dealing with Time.  Pat Hayes published a good summary of the various theories that were in use in 1995 in his <a href="http://www.ihmc.us/users/phayes/Pub/timeCatalog.pdf"><em>A Catalog of Temporal Theories</em></a>.  Among other things, this catalog includes his breakdown of major types of Time covered by the current theories.  Some of the common notions that are really important include:</p>
<ul>
<li><strong>Calendar Instants</strong> (Absolute Timestamps) and <strong>Intervals</strong> &#8211; These concepts represent a known point in time relative to an absolute time frame, such as the Gregorian Calendar (&#8220;June 23, 1883 at 3:05 pm&#8221;).  Whether a time like this is an instant or a duration depends on the practical usage of the term, for knowledge workers, if not philosophers. There are large and rich systems of relations between intervals and times (before, after, during, ending at the same time).</li>
<li><strong>Recurrent Instants </strong>and <strong>Intervals</strong> &#8211; Whether in absolute time or not, there are applications that need to record recurring times for rules (&#8220;closed on Sundays&#8221;), schedules (&#8220;Conferences held yearly&#8221;, &#8220;Meeting recurs every Tuesday at 8:30 am&#8221;) and so on. A related notion is the use of relative times in scripts (&#8220;Bring to a boil for 5 minutes, then turn to minimum and simmer for 10 minutes before serving.&#8221;).</li>
<li><strong>Durations</strong> &#8211; A duration is a length of time (with or without a related absolute start and end time), such as &#8220;The half-life of Thorium&#8221;.</li>
<li><strong>Units of Time</strong> &#8211; Aside from time frames like Gregorian time, the units used to measure time are also an area of concepts, like &#8220;Second&#8221; and the conversions between units.</li>
</ul>
<p>These are common and basic, but like all foundational concepts, they build into larger concepts, like Events, Schedules, Scripts and Change Management Processes, all of which are crucial in knowledge base construction (or DB Schemas).  Likewise, dozens of critical relations exist between the basic concepts, like before, after, occurs during and so on.</p>
<p>So, naturally, most of the Upper Ontologies discussed earlier devote a portion of their content to ideas of Time and its relations. SUMO, for instance, has dozens of time concepts high in its class hierarchy.  There are also widespread domain-specific ontologies that deal with time, notably the <a href="http://www.w3.org/TR/owl-time/">OWL-Time ontology</a>.</p>
<p>In most information systems, the Calendar Time concepts are the most important. They represent time elements in logs, financial transactions, historical events, meeting dates and many other domain concepts. Most ontologies that represent times support both atomic timestamps and &#8220;exploded&#8221; time formats.  Atomic formats are compact, such as the XML date-time type, which represents a time and can be compared to another date-time.  &#8220;Exploded&#8221; formats break the various elements (year, month, day, hour&#8230;) into attributes of the Timestamp concept, which allows partial representations (&#8220;Thursday&#8221; of any week), unit conversions (the number of seconds between two dates in 1993) and other more complicated operations. The two forms can be converted into each other when convenient in an application.</p>
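<p>The difference between the two formats can be sketched with Python&#8217;s standard <em>datetime</em> module.  The helper name <em>explode</em> and the sample timestamps are made up for illustration, and the dictionary merely stands in for the attributes of a Timestamp concept.</p>

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def explode(ts):
    """Break an atomic timestamp into separately addressable parts."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
        "weekday": WEEKDAYS[ts.weekday()],  # weekday() counts Monday as 0
    }


# Atomic form: one compact value (XML dateTime style) that compares directly.
start = datetime.fromisoformat("1993-03-01T00:00:00")
end = datetime.fromisoformat("1993-03-02T12:30:00")
print(start < end)

# Exploded form: partial matches such as "any Monday" become simple lookups.
print(explode(start)["weekday"])

# Unit conversion: the number of seconds between two dates in 1993.
print((end - start).total_seconds())
```

<p>This prints True, Monday and 131400.0 (one day plus twelve and a half hours, in seconds).</p>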
<p>In the next few entries, I will look at some of these concepts.</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2009/10/05/time-ontologies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
		<item>
		<title>Constructing Complex Results with SPARQL</title>
		<link>http://alanmeech.wordpress.com/2009/10/03/constructing-complex-results-with-sparql/</link>
		<comments>http://alanmeech.wordpress.com/2009/10/03/constructing-complex-results-with-sparql/#comments</comments>
		<pubDate>Sun, 04 Oct 2009 02:17:48 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Pellet]]></category>
		<category><![CDATA[SWRL]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=160</guid>
		<description><![CDATA[A quick side note: At several points, I have compared SQL to the W3C language stack and one of the capabilities in SQL that was awkward in the OWL/SWRL combination was the lack of a way to construct complex results (such as blank nodes and new sub-graphs) in the way that SQL can do with [...]]]></description>
			<content:encoded><![CDATA[<p>A quick side note:</p>
<p>At several points, I have compared SQL to the W3C language stack, and one thing that was awkward in the OWL/SWRL combination was the lack of a way to construct complex results (such as blank nodes and new sub-graphs) in the way that SQL can with its data manipulation language.  Ideally, we want to be able to do everything with the new stack that the older relational languages could do, and without needing to leave the session (all in the same script).</p>
<p>As I was reviewing SPARQL, the W3C query language, I came across the CONSTRUCT query type.  This allows a query to create a new graph (group of triples) constructed from the data in the triples.  I am starting to look at this as a possible way to get around the issue.  It appears that if OWL/SWRL takes care of making the logical inferences, the facts can then be gathered and re-formed into a new &#8220;shape&#8221; using SPARQL CONSTRUCT queries.  The problem is that while OWL/SWRL live in the session, SPARQL really lives outside, in the sense that the SPARQL query is initiated externally to pull information from the knowledge base or working memory.  If this knowledge was needed in the session, it would need to be pulled, then reinserted by some external process.</p>
<p>Since my current work involves existing ontologies, new object construction will be an issue. I need to be able to take data in one OWL schema and construct equivalent objects in the types of a new ontology.  One project that is sure to come up is converting existing data to an upper ontology (such as SUMO).</p>
<p>As with any part of the stack, whether it works on a given platform is always in question, so it will need to be tested. To do the project above, I am creating some convenience modules in Java to let me construct an ETL process for OWL data (much like the loader components I would use with a SQL database or XML with ESB components).</p>
<p>Back to Ontology again.</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2009/10/03/constructing-complex-results-with-sparql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
		<item>
		<title>Upper Ontologies</title>
		<link>http://alanmeech.wordpress.com/2009/09/25/upper-ontologies/</link>
		<comments>http://alanmeech.wordpress.com/2009/09/25/upper-ontologies/#comments</comments>
		<pubDate>Fri, 25 Sep 2009 22:44:11 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[Upper Ontology]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=148</guid>
		<description><![CDATA[Ontologies are groups of concepts, relations and rules that define what we know. This can include definitions of types, or the known instances themselves. There are two main groups: Domain-Specific Ontologies (DSO) &#8211; Domain-specific ontologies define groups of concepts about a particular area of interest. These are defined by subject matter experts who use the [...]]]></description>
			<content:encoded><![CDATA[<p>Ontologies are groups of concepts, relations and rules that define what we know. This can include definitions of types, or the known instances themselves.  There are two main groups:</p>
<ul>
<li>Domain-Specific Ontologies (DSO) &#8211; Domain-specific ontologies define groups of concepts about a particular area of interest.  These are defined by subject matter experts who use the definitions.</li>
<li>Upper Ontologies (UO) &#8211; Upper ontologies try to define the universe of concepts at a very abstract level.  Instead of describing concrete types, they concentrate on notions like Binary Relations and Temporal Entities.  They are not directly useful for defining concrete concepts.  Instead they are used to relate more concrete concepts by giving designers a set of high-level classifications for objects.  The intent is that if they are used for a large number of DSO&#8217;s, they will enable rich inference across domains, allow data interchange and save everyone the effort of figuring out the basic concepts to use.  UO&#8217;s are generally defined by philosophers and can have a heavy learning curve.</li>
</ul>
<p>A list of some of the available UO&#8217;s is:</p>
<table border="1">
<tbody>
<tr>
<th>Ontology</th>
<th>Language</th>
<th>Created</th>
<th>License</th>
<th>Concepts</th>
<th>Comments</th>
</tr>
<tr>
<td><a href="http://www.ontologyportal.org/">SUMO</a></td>
<td>SUO-KIF</td>
<td>2001</td>
<td>Open Source</td>
<td>1000</td>
<td>Endorsed by IEEE</td>
</tr>
<tr>
<td><a href="http://www.ontologyportal.org/">SUMO/MILO</a></td>
<td>SUO-KIF</td>
<td>2001</td>
<td>Open Source</td>
<td>20000</td>
<td>This is SUMO with a second expanded layer of concrete concepts.</td>
</tr>
<tr>
<td><a href="http://www.cyc.com/">Upper CYC</a></td>
<td>CycL</td>
<td>1984</td>
<td>Commercial</td>
<td>6000</td>
<td>In addition to the core, the full commercial system has hundreds of thousands of concepts. It is probably one of the most extensive in the world.</td>
</tr>
<tr>
<td><a href="http://www.cyc.com/cyc/opencyc">Open CYC</a></td>
<td>OWL</td>
<td>2002</td>
<td>Open Source</td>
<td>6000</td>
<td>The upper ontology should be the same as CYC, however it contains mainly just the concepts, not the rules that go with them. There is a lot of content that is not the upper ontology.</td>
</tr>
<tr>
<td><a href="http://www.loa-cnr.it/DOLCE.html">DOLCE</a></td>
<td>KIF</td>
<td>2003</td>
<td>Open Source</td>
<td>100</td>
<td>The upper ontology is small, but like CYC, is only one of the components of the full ontology.</td>
</tr>
<tr>
<td><a href="http://www.onto-med.de/ontologies/gfo.html">GFO</a></td>
<td>FOL/KIF</td>
<td>1999</td>
<td>Open Source</td>
<td>79</td>
<td>Used in some medical projects.</td>
</tr>
<tr>
<td><a href="http://proton.semanticweb.org/">PROTON</a></td>
<td>OWL</td>
<td>2006</td>
<td>Open Source</td>
<td>300</td>
<td>Research ontology</td>
</tr>
<tr>
<td><a href="http://www.umbel.org/">UMBEL</a></td>
<td>OWL</td>
<td>2009</td>
<td>Open Source</td>
<td>20,093</td>
<td>Uses some components of OpenCYC as a basis.</td>
</tr>
<tr>
<td><a href="http://www.w3.org/2004/02/skos/">SKOS</a></td>
<td>OWL</td>
<td>2005</td>
<td>Open Source</td>
<td>32</td>
<td>W3C ontology</td>
</tr>
<tr>
<td><a href="http://www.micra.com/COSMO/">COSMO</a></td>
<td>OWL</td>
<td>2006</td>
<td>Open Source</td>
<td>5200</td>
<td>See FTP site for an overview (COSMOoverview.doc)</td>
</tr>
</tbody>
</table>
<p>For further information on these ontologies, you can look at the web sites.  Also, here are a few of the references I have used:</p>
<ul>
<li>Salim K. Semy, Mary K. Pulvermacher, Leo J. Obrst. (2004) <a href="http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA459575&amp;Location=U2&amp;doc=GetTRDoc.pdf">Toward the Use of an Upper Ontology for U.S. Government and U.S. Military Domains: An Evaluation</a></li>
<li>Viviana Mascardi, Valentina Cordì, Paolo Rosso. (2006) <a href="http://www.disi.unige.it/person/MascardiV/Download/DISI-TR-06-21.pdf">A Comparison of Upper Ontologies</a></li>
</ul>
<p>My recommendation?  Choose one: it will save you time and extend the usefulness of your design.  However, be prepared to commit some time to learning how the concepts work.  If you are doing serious work with ontology in general, it helps to have a design strategy, and a UO will give you that.</p>
<p>CYC is notable, not only because it is HUGE, but because it is comprehensive and has an excellent system of Micro-Theories which allow a federated approach to dealing with ambiguous and mismatched knowledge domains (fungus in medicine versus fungus in botany, for instance). However, it is bound to the proprietary CYC engine which makes interchange difficult.</p>
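<p>To make the idea of context-scoped knowledge concrete, here is a minimal Python sketch. It is not CycL and does not use the CYC engine: the <code>ContextStore</code> class, the context names and the terms are all hypothetical, and real Micro-Theories are considerably richer than this. It only shows how the same term can carry different assertions in different contexts without conflict.</p>

```python
from collections import defaultdict


class ContextStore:
    """Toy knowledge store in which every assertion lives in a named context.

    Hypothetical illustration only; this is not how CYC is implemented.
    """

    def __init__(self):
        self._facts = defaultdict(set)

    def assert_in(self, context, subject, predicate, obj):
        """Record a (subject, predicate, obj) statement inside one context."""
        self._facts[context].add((subject, predicate, obj))

    def ask(self, context, subject, predicate):
        """Return the objects asserted for subject/predicate in that context only."""
        return {
            o
            for (s, p, o) in self._facts[context]
            if s == subject and p == predicate
        }


kb = ContextStore()
# The same term, described from two different points of view.
kb.assert_in("MedicineContext", "Fungus", "isa", "InfectiousAgent")
kb.assert_in("BotanyContext", "Fungus", "isa", "StudiedOrganism")

print(kb.ask("MedicineContext", "Fungus", "isa"))
print(kb.ask("BotanyContext", "Fungus", "isa"))
```

<p>Because each query names its context, the two views of "Fungus" never have to be reconciled into one definition.</p>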
<p>One seriously important thing about CYC is that it has an excellent tutorial system.  These documents cover a number of topics, including their use of events and time concepts. Anyone starting on ontology work should go through these materials.</p>
<p>I had to choose one a few years ago, so my choice was SUMO. I support it because it had been available for a while and is fairly comprehensive, but more because I am familiar with it at this point.  I don&#8217;t think it is necessarily the best, but it is more than enough for the work that I do.  I will expand on it in some of the discussions ahead.</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2009/09/25/upper-ontologies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
		<item>
		<title>Back to Ontology</title>
		<link>http://alanmeech.wordpress.com/2009/09/19/back-to-ontology/</link>
		<comments>http://alanmeech.wordpress.com/2009/09/19/back-to-ontology/#comments</comments>
		<pubDate>Sun, 20 Sep 2009 00:52:45 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Ontology]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=144</guid>
		<description><![CDATA[The three things that are needed for serious Semantic Web applications are a standard set of languages, engines that know how to deal with these languages and knowledge bases in these languages.  This blog is supposed to be about Ontology, that is, Knowledge Representation (KR).  So far, I have had to spend a lot of [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=alanmeech.wordpress.com&amp;blog=7953574&amp;post=144&amp;subd=alanmeech&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The three things that are needed for serious Semantic Web applications are a standard set of languages, engines that know how to deal with these languages and knowledge bases in these languages.  This blog is supposed to be about Ontology, that is, Knowledge Representation (KR).  So far, I have had to spend a lot of time on tools and languages. With the testing of engines so far, there is enough of a platform with Pellet, OWL/SWRL/RDF and JENA triplestores to do general Ontology work, so let&#8217;s shelve all that for a while.</p>
<p>First of all, the W3C languages (OWL/SWRL/RDF) are not the only language group that can be used for serious Ontology work.</p>
<ul>
<li>KIF was created in the 1990s and had several variants, including SUO-KIF, a version currently in use on the <a href="http://www.ontologyportal.org/">SUMO</a> project.  KIF is logic in a LISP-like syntax; it is expressive and quite easy to learn. One thing it lacks is RDF&#8217;s partitioning (KIF has a single name space, so the names tend to get long for disambiguation).</li>
<li>MELD is the language used by <a href="http://www.cyc.com/">CYC</a> and the open source <a href="http://www.opencyc.org/">OPEN-CYC</a> (a publicly released spin-off project which contains a subset of the full CYC product).  This is another LISP-like language with several extensions to support the special inference capabilities of the CYC engine.</li>
<li><a href="http://common-logic.org/">Common Logic</a> is an ISO standard language for logic (ISO/IEC 24707).  Like RDF, it comes in several syntaxes.  Common Logic Interchange Format (CLIF) is a LISP-like syntax (which just proves that computer scientists seriously love their nested parentheses). However, there are other syntaxes such as CGIF (Conceptual Graph Interchange Format), XCL (XML-based notation for Common Logic) and CLCE (Common Logic Controlled English).  The latter is significant, because it represents a trend towards controlled natural languages for specifying logic (also see SBVR and ACE).</li>
</ul>
<p>For now, I am sticking to OWL/SWRL, but CLCE and others like it are serious enticements.</p>
<p>So, what next? Here are some important areas that look interesting:</p>
<ul>
<li><strong>Existing Ontology Standards</strong> &#8211; There are literally thousands of published standards available on the web.  Take a look at <a href="http://www.schemaweb.info/">SchemaWeb </a>or <a href="http://swoogle.umbc.edu/">Swoogle</a>. There are a few, though, that have been especially well crafted and have gained wide acceptance. These are important because when doing Ontology work, it is better not to start from scratch every time a project starts.
<ul>
<li> <strong>Upper Ontologies</strong> &#8211; These try to act as universal definitions of everything. Generally, they are abstract and are meant to be the core to which other (domain-specific) ontologies link their terms, so that all of them can be linked into a single world-view and support cross-inference.</li>
<li><strong>Domain-specific Ontologies</strong> &#8211; There are a number of well-accepted standard ontologies for things like various kinds of time, standards, spatial relations and so on.</li>
</ul>
</li>
<li><strong>Semantic Design Patterns</strong> &#8211; What is the best approach to representing various kinds of knowledge domains?  While the whole SW effort has been directed almost exclusively at tools and languages, very little work has been done on exactly what and how all this knowledge should be constructed.  Fortunately, philosophers have been arguing these fine points for centuries.  So what approaches are good?  Some areas to examine:
<ul>
<li><strong>Space </strong>- There are many ways to represent space (volumes, cartography, grids), ways to measure space (units) and relations in space (topological, compass directions, containment, touching regions, mereology).</li>
<li><strong>Time </strong>- There are many types of time, measurement systems for time and relations in time.</li>
<li><strong>Events </strong>(what/where/who and so on) &#8211; How to represent things happening sequentially in space and time.</li>
<li><strong>Narrative and reification</strong> &#8211; This covers notions about statements, including who believes them, who said them and in which form, when they were said and whether they are still valid.  This is important in a number of areas, such as understanding narratives and tracking business requirements in systems engineering.</li>
<li><strong>Solution Frameworks</strong> &#8211; If you have knowledge about a problem, how do you arrange the knowledge (&#8220;frame the problem&#8221;) so that there is a clear way to solve it?  It is amazing how much tacit knowledge is involved in &#8220;simple&#8221; high-school mechanics problems, and how hard that knowledge can be to describe in a logical form.</li>
</ul>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2009/09/19/back-to-ontology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
		<item>
		<title>Performance Charts for File Example</title>
		<link>http://alanmeech.wordpress.com/2009/09/18/performance-charts-for-file-example/</link>
		<comments>http://alanmeech.wordpress.com/2009/09/18/performance-charts-for-file-example/#comments</comments>
		<pubDate>Sat, 19 Sep 2009 00:49:52 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[Pellet]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=135</guid>
		<description><![CDATA[In an earlier post, I noted that the approach to the use of Pellet would need to be changed for the Files rule engine.  The file approach was not working well for the volume of data in the files, so a database was engaged, and I noted that a more efficient approach would be to [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=alanmeech.wordpress.com&amp;blog=7953574&amp;post=135&amp;subd=alanmeech&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In an <a href="http://alanmeech.wordpress.com/2009/08/22/resources-used-in-files-example/">earlier post</a>, I noted that the approach to the use of Pellet would need to be changed for the Files rule engine.  The file approach was not working well for the volume of data in the files, so a database was engaged, and I noted that a more efficient approach would be to arrange to do only one small batch of inferences at a time, in this case, processing a single file.  Of course, numbers are better.</p>
<p>So, the application was set up with timers and memory usage logging and three approaches were used:</p>
<ul>
<li>File1 &#8211; The original file approach, where the ontology and rules were loaded into memory from RDF files, the file data was read in and added to RDF in memory, all results were inferred and the results were extracted.</li>
<li>DB1 &#8211; The first database approach used a Derby database (easy, in-process, portable and file-based) with a JENA ModelRDB model as storage.  The file data was still loaded into memory in one pass, as in the first approach, and only then stored in the DB model.  This is intended to test whether the Pellet reasoner is doing any smart background saves while processing.</li>
<li>DB2 &#8211; The second database approach uses the same DB Model for storage, but after the ontology and rules were loaded, the files and directories were loaded one at a time and for each, the inferences were done and stored to the DB model.  A commit was done at the end of each batch to flush to the DB so the transactions would not get too large.  This limits the amount of inference needed at each step, since all the previous inferences are already available as facts in the DB model.  If Pellet takes advantage of this, the amount of memory usage should drop drastically.</li>
</ul>
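<p>The difference between the load-everything approaches and DB2 can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it does not use Pellet, JENA or Derby, the <code>infer</code> function is a toy stand-in for a reasoner, the <code>ex:</code> terms are hypothetical, and a plain set plays the part of the DB model. It also assumes the rule only needs the facts of the current item, which is the best case for batching.</p>

```python
Triple = tuple[str, str, str]


def infer(facts: set[Triple]) -> set[Triple]:
    """Toy stand-in for a reasoner: flag any file larger than 1000 bytes."""
    return {
        (s, "rdf:type", "ex:LargeFile")
        for (s, p, o) in facts
        if p == "ex:size" and int(o) > 1000
    }


def run_all_at_once(items: list[set[Triple]]) -> tuple[set[Triple], int]:
    """Load every item, infer once over everything, keep it all in memory."""
    working: set[Triple] = set().union(*items)
    working |= infer(working)
    return working, len(working)  # peak working set is the whole dataset


def run_in_batches(items: list[set[Triple]]) -> tuple[set[Triple], int]:
    """Infer one item at a time and commit each result to the store."""
    store: set[Triple] = set()  # stands in for the database-backed model
    peak = 0
    for item in items:
        working = set(item)
        working |= infer(working)
        peak = max(peak, len(working))
        store |= working  # "commit": inferred triples become plain facts
    return store, peak


# Ten hypothetical files with sizes 0, 500, 1000, ... 4500 bytes.
items = [{(f"ex:file{i}", "ex:size", str(i * 500))} for i in range(10)]
all_store, all_peak = run_all_at_once(items)
batch_store, batch_peak = run_in_batches(items)
print(all_peak, batch_peak)
```

<p>Both functions end up with the same stored triples, but the in-memory working set peaks at 17 triples in the first case and at 2 in the second. Rules that relate one file to another would need earlier facts to be read back from the store, so a real reasoner cannot be expected to show savings this clean.</p>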
<p>For each run, a set number of files and directories were read.  The number of files is not an exact measure, since on average, files and directories use different amounts of data, but overall, the average should show a reliable trend.  Likewise, the measure should have been in triples, but this also should not affect the trend much.  At the end of each run, the database was deleted so each run started from scratch with an empty database.</p>
<p>The results are shown in the following two charts.  The first is the memory chart, where the vertical axis is the number of Megabytes allocated by the JVM by the end of the run (Runtime.getRuntime().totalMemory() &#8211; crude, but indicative) versus the number of files scanned.<img class="aligncenter size-full wp-image-139" title="Pellet-MemoryPerformance" src="http://alanmeech.files.wordpress.com/2009/09/pellet-memoryperformance1.png?w=447&#038;h=320" alt="Pellet-MemoryPerformance" width="447" height="320" /></p>
<p>Some notes about this:</p>
<ul>
<li>The first approach (&#8220;File1&#8221;) simply loaded all the file data, ontology and rules into memory, did the inference and extracted the complete results.  Memory rose rapidly and soon exceeded the limits of the machine.</li>
<li>The second approach (&#8220;DB1&#8221;) showed a similar result, which is not surprising if Pellet is doing exactly the same thing and really only stores the results at the very end when the model changes are committed.  This approach does not gain anything.</li>
<li>The third approach (&#8220;DB2&#8221;), which commits frequently, reducing the inference results to base facts in the DB model and avoiding large numbers of inferences at once, shows much better results.  As a matter of fact, it is almost flat &#8211; which is good for applications that need to scale.</li>
</ul>
<p>Another thing to note (which is not really seen in the chart) is that the file approach starts out at almost zero when the input data is small (important for small applications &#8211; if anyone actually has that luxury&#8230;).  However, both DB approaches show a minimal baseline above zero, which if you have the actual data, works out to about 12 MB.  This means that the DB model itself is initially taking a constant amount of memory.  I am not sure if that would be repeated for each triplestore used, so it is good to keep in mind.</p>
<p>The time plot is shown below for the same data, where the vertical axis is time in seconds for the given number of files to be read.</p>
<p><img class="aligncenter size-full wp-image-140" title="Pellet-TimePerformance" src="http://alanmeech.files.wordpress.com/2009/09/pellet-timeperformance1.png?w=470&#038;h=318" alt="Pellet-TimePerformance" width="470" height="318" /></p>
<p>Some things to note here:</p>
<ul>
<li>The database is deleted between runs, so a minimal overhead is incurred while JENA sets up the triplestore tables in Derby.  That appears to work out to about 14 seconds.</li>
<li>The DB1 results are worse overall than the File1 results.  This can be explained by the extra time needed to store the inference results in the database at the end of the DB1 run, whereas the File1 approach was able to do a quick file dump.</li>
<li>The DB2 times (subtracting the DB initialization) rose much less rapidly than the other two approaches.  As expected, this approach is far more efficient and scalable.  However, it is not a proportional improvement, since Pellet still has to load a lot of data from the store in order to do the inference.</li>
</ul>
<p>I do not know the internals of Pellet and its usage of the JENA ModelRDB for triplestores.  There are a lot of potential variables even in a simple example like this and results could vary widely from application to application.  However, it does give some indications that can be useful for planning.</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2009/09/18/performance-charts-for-file-example/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>

		<media:content url="http://alanmeech.files.wordpress.com/2009/09/pellet-memoryperformance1.png" medium="image">
			<media:title type="html">Pellet-MemoryPerformance</media:title>
		</media:content>

		<media:content url="http://alanmeech.files.wordpress.com/2009/09/pellet-timeperformance1.png" medium="image">
			<media:title type="html">Pellet-TimePerformance</media:title>
		</media:content>
	</item>
		<item>
		<title>HermiT &#8211; A New OWL/SWRL Engine</title>
		<link>http://alanmeech.wordpress.com/2009/09/17/hermit-a-new-owlswrl-engine/</link>
		<comments>http://alanmeech.wordpress.com/2009/09/17/hermit-a-new-owlswrl-engine/#comments</comments>
		<pubDate>Thu, 17 Sep 2009 23:54:19 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[Hermit]]></category>
		<category><![CDATA[SWRL]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=126</guid>
		<description><![CDATA[Earlier this year, the first release of the HermiT engine was announced.  This engine has done quite well in the OWL 2 conformance tests (it is neck and neck with Pellet!) and with the 1.0 release shows initial support for SWRL. It has been run through the SWRL test suite and passed all the tests except the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=alanmeech.wordpress.com&amp;blog=7953574&amp;post=126&amp;subd=alanmeech&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Earlier this year, the first release of the <a title="HermiT Engine" href="http://hermit-reasoner.com/" target="_blank">HermiT</a> engine was announced.  This engine has done quite well in the <a title="OWL Test Suite Status" href="http://www.w3.org/2007/OWL/wiki/Test_Suite_Status" target="_blank">OWL 2 conformance tests</a> (it is neck and neck with Pellet!) and with the 1.0 release shows initial support for SWRL. It has been run through the SWRL test suite and passed all the tests except the ones dealing with built-in functions, which is the next area of development. Considering how short a time the engine has been available, this is looking quite impressive.  I will update the results as new versions become available.</p>
<p>(This last bit is important.  I have been trying a number of engines lately and aside from Pellet and HermiT, I have not found any engines yet that are providing good support for SWRL. I will publish any positive results I get with other engines in future posts. It is irritating to me, though, that given the importance of rules in inference, five YEARS after it became the only widely accepted standard, SWRL support is still limited to a few players.)</p>
<p>HermiT is another Java implementation and it is bundled into a single JAR file, which includes a version of the OWL-API, so it is quite easy to include in projects.  If you need sample code for setting up HermiT, you can get details from the HermiT site above or you can grab a copy of the HermiT test code from the <a title="SWRL Test Suite" href="http://semwebcentral.org/projects/swrl-test-suite/" target="_blank">SWRL Test Suite</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2009/09/17/hermit-a-new-owlswrl-engine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
		<item>
		<title>SWRL Test Suite</title>
		<link>http://alanmeech.wordpress.com/2009/09/16/swrl-test-suite/</link>
		<comments>http://alanmeech.wordpress.com/2009/09/16/swrl-test-suite/#comments</comments>
		<pubDate>Thu, 17 Sep 2009 01:41:30 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[Rules]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=124</guid>
		<description><![CDATA[There has been quite a lot of SWRL activity on some of the forums lately, so I have pushed the test suite onto the SemWebCentral site. The suite contains SWRL samples in RDF/XML, N3 and Turtle, and includes Java-based tests for the Pellet and SWRL OWL engines. The tests are not comprehensive, but they hit [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=alanmeech.wordpress.com&amp;blog=7953574&amp;post=124&amp;subd=alanmeech&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>There has been quite a lot of SWRL activity on some of the forums lately, so I have pushed the test suite onto the SemWebCentral site. The suite contains SWRL samples in RDF/XML, N3 and Turtle, and includes Java-based tests for the Pellet and SWRL OWL engines.</p>
<p>The tests are not comprehensive, but they hit most of the areas I needed in my latest development work, so they should at least be a good source of samples.</p>
<p style="text-align:left;">Currently, they are available by anonymous SVN access from the site:</p>
<ul>
<li><a title="Test Suite" href="http://semwebcentral.org/projects/swrl-test-suite/">http://semwebcentral.org/projects/swrl-test-suite/</a></li>
</ul>
<p>If you have any trouble with them, log a question on the help forum on the site. I hope it helps.</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2009/09/16/swrl-test-suite/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
		<item>
		<title>Resources used in Files Example</title>
		<link>http://alanmeech.wordpress.com/2009/08/22/resources-used-in-files-example/</link>
		<comments>http://alanmeech.wordpress.com/2009/08/22/resources-used-in-files-example/#comments</comments>
		<pubDate>Sat, 22 Aug 2009 19:20:39 +0000</pubDate>
		<dc:creator>Alan Meech</dc:creator>
				<category><![CDATA[Pellet]]></category>
		<category><![CDATA[Rules]]></category>

		<guid isPermaLink="false">http://alanmeech.wordpress.com/?p=116</guid>
		<description><![CDATA[One other note about the file example used earlier.  As the number of asserted individuals rises, the time and memory used by Pellet to do the classification and rules increases, as can be expected.  Given the current case, with 4 SWRL rules and reading the individuals from an RDF/XML file (on a 1 GB laptop), [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=alanmeech.wordpress.com&amp;blog=7953574&amp;post=116&amp;subd=alanmeech&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>One other note about the file example used earlier.  As the number of asserted individuals rises, the time and memory used by Pellet to do the classification and rules increases, as can be expected.  Given the current case, with 4 SWRL rules and reading the individuals from an RDF/XML file (on a 1 GB laptop), the numbers look like this:</p>
<table style="border:1px solid #777;" border="0" cellspacing="0">
<thead>
<tr>
<td style="border:1px solid #777;"><strong># of Individuals</strong></td>
<td style="border:1px solid #777;"><strong>Inference Time (sec)</strong></td>
<td style="border:1px solid #777;"><strong>Memory (bytes)</strong></td>
</tr>
</thead>
<tbody>
<tr>
<td style="border:1px solid #777;">8</td>
<td style="border:1px solid #777;">0.375</td>
<td style="border:1px solid #777;">5,177,344</td>
</tr>
<tr>
<td style="border:1px solid #777;">100</td>
<td style="border:1px solid #777;">2.7655</td>
<td style="border:1px solid #777;">42,745,856</td>
</tr>
<tr>
<td style="border:1px solid #777;">230</td>
<td style="border:1px solid #777;">12.406</td>
<td style="border:1px solid #777;">187,015,168</td>
</tr>
<tr>
<td style="border:1px solid #777;">500</td>
<td style="border:1px solid #777;">61.015</td>
<td style="border:1px solid #777;">780,402,688</td>
</tr>
<tr>
<td style="border:1px solid #777;">1000</td>
<td style="border:1px solid #777;">N/A</td>
<td style="border:1px solid #777;">Out of Memory</td>
</tr>
</tbody>
</table>
<p>Since it is difficult to know in advance how many files will be imported in a batch in this example, scalability becomes an important issue.  If the number of files in a batch (for whatever reason) happens to get near 1000, the inference will fail on this setup.</p>
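<p>As a rough check on how fast these numbers grow, the sketch below estimates the growth exponent between consecutive rows of the table, assuming the cost behaves roughly like n^k between measurements. That power-law model and the names <code>ROWS</code>, <code>local_exponent</code>, <code>project</code> and <code>summarize</code> are assumptions made for illustration, not anything Pellet reports. The 8-individual row is left out, since such a small run is likely dominated by fixed overhead.</p>
<pre>
```python
import math

# (individuals, inference time in seconds, memory in bytes), copied from the table.
ROWS = [
    (100, 2.7655, 42_745_856),
    (230, 12.406, 187_015_168),
    (500, 61.015, 780_402_688),
]


def local_exponent(n1, y1, n2, y2):
    """Exponent k such that y2 / y1 == (n2 / n1) ** k (power-law assumption)."""
    return math.log(y2 / y1) / math.log(n2 / n1)


def project(n_from, y_from, n_to, k):
    """Extrapolate y from n_from to n_to using exponent k."""
    return y_from * (n_to / n_from) ** k


def summarize(rows):
    """Return (n1, n2, time exponent, memory exponent) for consecutive rows."""
    out = []
    for (n1, t1, m1), (n2, t2, m2) in zip(rows, rows[1:]):
        k_time = local_exponent(n1, t1, n2, t2)
        k_mem = local_exponent(n1, m1, n2, m2)
        out.append((n1, n2, k_time, k_mem))
    return out


if __name__ == "__main__":
    for n1, n2, k_time, k_mem in summarize(ROWS):
        print(f"{n1} to {n2}: time ~ n^{k_time:.2f}, memory ~ n^{k_mem:.2f}")
    n_last, _t_last, m_last = ROWS[-1]
    k_mem_last = summarize(ROWS)[-1][3]
    projected = project(n_last, m_last, 1000, k_mem_last)
    print(f"projected memory at 1000 individuals: {projected / 2**30:.1f} GiB")
```
</pre>
<p>With the figures above, the exponents for both time and memory come out close to 2, so growth is roughly quadratic in the number of individuals. Under that assumption the projection for 1000 individuals is more than twice 1 GB, which is consistent with the out-of-memory row if the &#8220;1 GB&#8221; refers to the laptop&#8217;s RAM.</p>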
<p>Some common suggestions:</p>
<ul>
<li><strong>Increase Memory</strong> &#8211; &#8220;Memory is cheap&#8221; is a very common response when this issue is raised.  It can be a fast fix in a pinch. However, most industry people (system architects, for instance) will shoot this down immediately for a number of reasons.
<ul>
<li>No matter how much memory you throw at a solution, if the input is unbounded, eventually there is a risk that the new limit will be reached (unexpectedly).  Normally this will happen during a demonstration to upper management &#8230;</li>
<li>&#8220;Real&#8221; applications (enterprise, commercial) have to share resources in an infrastructure and are expected to behave nicely. Resources like memory are frequently shared with other virtual servers (VMs) in the same way as disk space is on a SAN, and processor speed is throttled by server.  Even if an application has &#8220;full access&#8221; to a server of its own, when it is deployed to production, there may be new limits on what it can use.</li>
<li>While this application is mostly dealing with single-thread batch processing, most rule applications in an infrastructure are dealing with any number of concurrent threads.  If all of those threads have unbounded memory, no amount of memory would be safe.</li>
</ul>
</li>
<li><strong>Tune the Engine</strong> &#8211; In any rule (or knowledge) base, there are features of the engine that can be turned off to conserve resources.  (Try the information in the <a title="Pellet 2 FAQ" href="http://clarkparsia.com/pellet/faq/" target="_blank">Pellet FAQ</a> for instance.) Optimization is good in any application, especially if the gains are good. However, no matter how much you tune the engine, if the number of instances coming into the application is unbounded, eventually a spike in the number of input instances will hit the magic limit.</li>
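<li><strong>Process a Fixed Number of Files</strong> &#8211; Typically, a rules application will look at a single case at a time and process the results.  Capping the number of files handed to the reasoner in each run keeps memory use bounded no matter how large the batch is.</li>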
</ul>
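<p>The third suggestion amounts to putting a hard cap on how many individuals the reasoner sees per run. Here is a minimal sketch of that cap, assuming the file scanner can be treated as a (possibly very long) iterator of paths. <code>batches</code> and <code>MAX_BATCH</code> are hypothetical names, and the value 100 is only a placeholder taken from one of the measured rows above, not a recommendation.</p>
<pre>
```python
from itertools import islice

# Hypothetical cap, taken from a table row that fit comfortably in memory.
MAX_BATCH = 100


def batches(paths, size=MAX_BATCH):
    """Yield lists of at most `size` paths without reading the whole input first."""
    if not size >= 1:
        raise ValueError("size must be at least 1")
    iterator = iter(paths)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # A generator stands in for the file scanner here.
    incoming = (f"file_{i}.txt" for i in range(250))
    for chunk in batches(incoming):
        print(len(chunk), chunk[0], chunk[-1])
```
</pre>
<p>Each chunk would go to a separate reasoner run, so peak memory depends on the cap rather than on the size of the import. Setting the cap to 1 gives the single-case-at-a-time behaviour. This only works when the rules do not need to relate individuals that end up in different chunks.</p>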
<p>It really depends on the application, of course. Research applications (and heavy-AI applications in general) are frequently given more resources than typical enterprise applications.  There, tuning and a set limit on input instances are usually possible.</p>
<p>In this case, the chosen approach is the third option.  One way to do this is to merge the file scanner part of the application into the classification step, load a copy of the file ontology (OWL and SWRL rules) as a base ontology, and for each file or directory found:</p>
<ol>
<li>Assert the file information (name and so on).</li>
<li>Run classification.</li>
<li>Extract the results and act on them.</li>
</ol>
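<p>The three steps above can be sketched as a loop. This is not Pellet code: a &#8220;fact&#8221; here is a plain (subject, property, value) tuple, and <code>toy_reasoner</code> is a made-up stand-in with one invented rule, included only so the example runs. In the real application that call would be replaced by adding the asserted individual to a copy of the base ontology and asking the reasoner to classify it. <code>BASE</code>, <code>scan</code>, <code>describe</code> and <code>process_tree</code> are likewise hypothetical names.</p>
<pre>
```python
import os

# Hypothetical base "ontology": facts shared by every run, loaded once.
BASE = frozenset({("TextFile", "subClassOf", "File")})


def scan(root):
    """Yield each directory and file under root, one path at a time."""
    for dirpath, _dirnames, filenames in os.walk(root):
        yield dirpath
        for name in filenames:
            yield os.path.join(dirpath, name)


def describe(path):
    """Step 1: assert the file information (name and so on) for one path."""
    kind = "Directory" if os.path.isdir(path) else "File"
    return {
        (path, "type", kind),
        (path, "name", os.path.basename(path)),
        (path, "extension", os.path.splitext(path)[1].lower()),
    }


def toy_reasoner(facts):
    """Step 2 stand-in: one invented rule, a File with extension .txt is a TextFile."""
    subjects = {s for (s, p, v) in facts if p == "extension" and v == ".txt"}
    return {(s, "type", "TextFile") for s in subjects if (s, "type", "File") in facts}


def process_tree(root, act, base=BASE, reasoner=toy_reasoner):
    """Classify one path per run, so each run holds the base plus one individual."""
    count = 0
    for path in scan(root):
        working = base | describe(path)  # fresh working set for this path only
        inferred = reasoner(working)     # step 2: run classification
        act(path, inferred)              # step 3: extract the results and act on them
        count += 1
    return count


if __name__ == "__main__":
    process_tree(".", lambda path, inferred: print(path, sorted(inferred)))
```
</pre>
<p>Because the working set is rebuilt for every path, each run holds only the base plus one individual, at the cost of running the reasoner once per file. As with any one-at-a-time scheme, rules that need to compare two files cannot fire this way.</p>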
<p>This kind of issue pops up frequently, so we will be dealing with it again.</p>
]]></content:encoded>
			<wfw:commentRss>http://alanmeech.wordpress.com/2009/08/22/resources-used-in-files-example/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/fc4e03498c7d20337121405f23632d3c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">alanmeech</media:title>
		</media:content>
	</item>
	</channel>
</rss>
