Bayesian Network for Ontology Mapping

I found an interesting application of Bayesian networks to the Semantic Web: ontology mapping between two different ontologies, i.e. determining how similar Onto1:Concept1 is to Onto2:Concept2. The goal is to map two different concepts and obtain a value that corresponds to their degree of similarity.
The details of how it works are pretty interesting. First, the ontologies are converted to Bayesian networks using a framework called BayesOWL (references are in the paper). The resulting Bayesian network preserves the semantics of the original ontologies and supports ontology reasoning, within and across ontologies, using Bayesian inference. BayesOWL also provides methods that use the available probability constraints about classes and inter-class relations to construct the conditional probability tables of the network.
The framework needs three kinds of probabilities: prior distributions that capture uncertainty about concepts, conditional distributions for relations between classes in the same ontology, and joint distributions for semantic similarity between concepts in different ontologies. These are learned using text classification techniques: each concept is associated with a group of sample text documents called exemplars, retrieved from a search engine.
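
To make the idea a bit more concrete, here is a minimal sketch of how a similarity value could be derived from a joint probability estimated over classified documents. I am assuming the Jaccard coefficient as the similarity measure, and the function names and the per-document labels are made up for illustration; this is not BayesOWL's actual API, just a toy version of the reasoning.

```python
def estimate_probabilities(labels_a, labels_b):
    """Estimate P(A), P(B) and P(A and B) from per-document classifier decisions.

    labels_a[i] / labels_b[i] say whether document i was classified as
    belonging to concept A / concept B (hypothetical classifier output).
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and of equal length")
    p_a = sum(labels_a) / n
    p_b = sum(labels_b) / n
    p_joint = sum(1 for a, b in zip(labels_a, labels_b) if a and b) / n
    return p_a, p_b, p_joint


def jaccard_similarity(p_joint, p_a, p_b):
    """Jaccard coefficient: P(A and B) / P(A or B)."""
    p_union = p_a + p_b - p_joint
    if p_union <= 0:
        return 0.0
    return p_joint / p_union


# Made-up classification results for 8 sample documents (assumption).
in_concept_1 = [True, True, True, False, True, False, False, True]
in_concept_2 = [True, False, True, False, True, True, False, False]

p_a, p_b, p_joint = estimate_probabilities(in_concept_1, in_concept_2)
similarity = jaccard_similarity(p_joint, p_a, p_b)
print(f"P(A)={p_a}, P(B)={p_b}, P(A,B)={p_joint}, similarity={similarity}")
```

A value close to 1 would suggest that the two concepts cover mostly the same documents, while a value close to 0 would suggest that they barely overlap.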

The main idea of this research is to provide a simple and efficient way of determining the semantic similarity of two concepts in distinct ontologies. I didn’t know this approach before, but it’s worth keeping in mind when doing research in Semantic Web technologies.

Reference:
Pan, Rong, et al. A Bayesian Network Approach to Ontology Mapping. The Semantic Web – ISWC 2005. Lecture Notes in Computer Science. Springer, Berlin / Heidelberg, 2005. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122.3201&rep=rep1&type=pdf


A complete chapter for XML, it deserves it!

It has been a while since my first encounter with XML syntax. Even though I didn’t know much about the specification, it seemed like a very straightforward way to specify data structures for a file that required a specific data schema. Because of its similarity to HTML, XML was a natural step for me, and I had no problem with it at all. Nevertheless, the importance of XML goes beyond the practical uses that most people find for it in their daily activities or projects. It is important to realize the huge impact that a technology like XML has on many applications, an impact that is not obvious at first sight.

I guess my story is like many others: at first, you don’t realize the importance or scope of the things you are using and learning. Because of its ease of use, people may underestimate the real value of a technology they use constantly, in this case XML. XML lets machines process data and communicate with each other, since it provides a machine-readable structure. The standard derives from SGML, the meta-language that was also used to define HTML; XML itself is a simplified subset of SGML.

XML has become the basic serialization format for applications such as Web Services and graph serialization under RDF/XML, which will enable core functionality on the internet of tomorrow. Nowadays most major web services can return information in XML (among other formats), and XML is also the serialization format for information interchange in remote calls such as SOAP, whose messages are basically XML documents. Core Semantic Web technologies build on the XML specification too: RDF/XML serializes graphs into XML files, and OWL, used to build ontologies, is commonly serialized the same way.

Semantic Web Stack


But why has XML become so popular and “standard”? Many advantages provided by the specification made it the favorite syntax for information serialization and schematization on the web. A clean structure, readable by both humans and machines, makes XML usable in many types of applications, from process communication in distributed systems to database schemas and data transactions. Easy-to-use and powerful query languages such as XPath provide access to every piece of information in the structure of an XML file.
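
As a small illustration of that kind of access, here is a sketch using Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. The catalog document, its tag names and its attribute names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Invented sample document; the tags and attributes are assumptions.
CATALOG_XML = """
<catalog>
  <book id="b1" lang="en">
    <title>Semantic Web Primer</title>
    <price>35.50</price>
  </book>
  <book id="b2" lang="es">
    <title>Bases de Datos</title>
    <price>20.00</price>
  </book>
</catalog>
"""

root = ET.fromstring(CATALOG_XML)

# Path expression: every <title> under every <book> child of the root.
titles = [title.text for title in root.findall("./book/title")]

# Attribute predicate: only the books whose lang attribute is "en".
english_ids = [book.get("id") for book in root.findall("./book[@lang='en']")]

# Combine a predicate and a child step to reach one specific value.
price_b2 = root.find("./book[@id='b2']/price").text

print(titles)
print(english_ids)
print(price_b2)
```

Full XPath offers much more (axes, functions, numeric comparisons), but even this subset is enough to address individual pieces of a document by their position in the structure.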

Modularization also carries a lot of weight in its popularity. Since the web is a distributed system, it is also a modular one, comprised of many heterogeneous elements. XML must deal with disambiguation of terms, so it has namespaces, in which terms/tags can be defined without the risk of confusing them with identically named terms in another domain or namespace; these are the basis for ontologies and RDF. But its most important feature is flexibility. The ability to define any schema, tags, namespace or domain gives any party the power to define its own language for serialization. Using XML Schema (in the old days, DTDs), they can maintain the consistency of communications in a distributed systems environment, or the consistency of a database schema.
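
Here is a minimal sketch of the namespace idea, again with Python's standard xml.etree.ElementTree module. The two vocabularies and their example.org URIs are hypothetical; the point is only that two tags with the same local name stay distinct because they live in different namespaces.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies that both define a <title> tag.
DOC = """
<order xmlns:book="http://example.org/books"
       xmlns:film="http://example.org/films">
  <book:title>Dune</book:title>
  <film:title>Dune (1984)</film:title>
</order>
"""

# Prefix-to-URI mapping used only for lookups on the Python side.
NS = {
    "book": "http://example.org/books",
    "film": "http://example.org/films",
}

root = ET.fromstring(DOC)

# The same local name resolves to different elements per namespace.
book_title = root.find("book:title", NS).text
film_title = root.find("film:title", NS).text

# Internally, each tag is stored with its full namespace URI.
expanded_tags = [child.tag for child in root]

print(book_title, "|", film_title)
print(expanded_tags)
```

The prefixes themselves are only shorthand: what identifies a term is the namespace URI plus the local name, which is the same idea RDF relies on when it identifies terms by URI.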

All in all, XML has a full set of features that make it a worthy candidate for its own chapter in any book related to web technology: Semantic Web, Databases, Web Services, Protocols, Distributed Systems, etc.

RDF databases really differ from NoSQL databases!

Working with RDF data, you quickly realize the need to store that huge amount of information in an engine that is easy to query. Lately there has been an explosion of so-called NoSQL databases, which are based on the flexibility of schema-free designs. But RDF data and RDF databases differ from this type of architecture, even though they share some aspects.

Anyway, I found a pretty interesting article by Arto Bendiken, a cofounder of Datagraph, which describes the differences between these approaches. It is clearly explained and worth the read.

You can find the article here.