Instance Table - Build a parser for the source pages

II. Build a parser for the source pages

31.1.3.9 Instance Table

It lists the instance name, its attributes and their values for all the instances in the Ontology.

Some of the intermediate views of the developed Ontology are shown in the Appendix –

Introduction

To improve the quality of this project, the developed recipes Ontology has been extended by merging it [see merging theory in chapter 10.3] with a comprehensive wines Ontology, which was found in the DAML library

[http://www.daml.org/cgi-bin/hyperdaml?http://ontolingua.stanford.edu/doc/chimaera/ontologies/wines.daml]. This initiative to share knowledge over the Web is described in detail in chapter 23.3.3.1.

Reusing Ontologies: Merging

The full list of available Ontologies was carefully studied, looking for ones that could complement the recipes domain. The wines Ontology is a good candidate: it includes an extensive description of wines and meals. The two Ontologies can be merged because they share several concepts about the same subject.

The wines and the recipes Ontologies can be considered Local Ontologies [see chapter XXX], as they both refer to specific domains. By merging them (and possibly others), a Regional Ontology can be formed that refers to a wider context (drinks and food, for example).

Some upper Ontologies have also been found in

[http://www.daml.org/cgi-bin/hyperdaml?http://reliant.teknowledge.com/DAML/SUMO.owl]

(Suggested Upper Merged Ontology, a proposal to the IEEE Standard Upper Ontology effort). However, relating the recipes Ontology to an upper-level Ontology is a very complex task that requires an in-depth investigation. As the merging process is out of the scope of this project, only one merge has been performed. This extension of the project is intended to show how the merging process works and how important it is to reuse and merge existing information.

The source code of this Ontology can be found in Appendix-18.

It is written in DAML, which does not present any problem, as the chosen Ontology editor can import and merge Ontologies in a wide range of languages (DAML+OIL being one of them).

Merging method

The merging process has to be guided and supervised by an expert, because additional semantic information must be provided besides the two Ontologies to merge. This is because Ontologies normally do not share the same structure, and different words are often used to designate the same concept.

The merging of two Ontologies in WebODE (with the ODEMerge [explain or reference] service) is based on additional semantic information provided by the developer, which gives the service a mechanism to compare related concepts. The following inputs are required:

• The two Ontologies to merge

• Additional semantic information:

o A synonym table, which contains the synonyms between both Ontologies.

o A hyperonym table [see Glossary: Hyponym], which contains the hyponym (subclass relationship) links between concepts of both Ontologies.

This additional information guides the merging service by stating how to connect both Ontologies. The output is a new Ontology in which the concepts of both are related.
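The role of the two tables can be pictured with a small sketch. It is not ODEMerge's algorithm, which is not described here. It only assumes that each Ontology is reduced to a mapping from a concept to its direct superclasses, that the synonym table maps a concept of the second Ontology to its equivalent in the first, and that the hyperonym table is a list of (subclass, superclass) pairs. The function name merge_taxonomies and all concept names are hypothetical.

```python
from collections import defaultdict


def merge_taxonomies(tax_a, tax_b, synonyms, hyperonyms):
    """Merge two taxonomies given as {concept: set_of_direct_superclasses}.

    synonyms:   {concept_in_b: concept_in_a}, treated as one single concept
    hyperonyms: iterable of (subclass, superclass) links across both inputs
    """
    def canonical(name):
        # A concept of B that has a synonym in A is replaced by A's name.
        return synonyms.get(name, name)

    merged = defaultdict(set)
    for taxonomy in (tax_a, tax_b):
        for concept, parents in taxonomy.items():
            merged[canonical(concept)].update(canonical(p) for p in parents)
            for p in parents:
                merged[canonical(p)]  # make sure every superclass is a key
    for child, parent in hyperonyms:
        merged[canonical(child)].add(canonical(parent))
        merged[canonical(parent)]
    return dict(merged)


# Hypothetical skeletons, only for illustration.
recipes = {"ingredient": set(), "drink": {"ingredient"},
           "food": {"ingredient"}, "course": set()}
wines = {"beverage": set(), "wine": {"beverage"},
         "red_wine": {"wine"}, "seafood_course": set()}
synonyms = {"beverage": "drink"}              # same concept, different word
hyperonyms = [("seafood_course", "course")]   # subclass link across Ontologies

merged = merge_taxonomies(recipes, wines, synonyms, hyperonyms)
```

In this sketch the synonym entry makes "wine" a subclass of the recipes concept "drink", and the hyperonym entry attaches "seafood_course" under "course". Without the two tables the result would simply be two disconnected taxonomies.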

All the related work, the recipes Ontology skeleton, the wines Ontology skeleton and explanations, the synonym table, the hyperonym table and the resulting Ontology can be found in [Appendix-5].

With this new Ontology, the one previously developed for the recipes domain is extended and enriched. Entities that are missing or incomplete in either Ontology can be complemented through the merge, so merging Ontologies is a good way to make building a complete Ontology easier.

The merging service currently merges only concepts, not the other components of the Ontology (attributes, axioms, synonyms, etc.). This knowledge is carried over to the new Ontology as long as the element(s) it belonged to are also present in the new taxonomy.

The last step is to fill the Ontology with the instances of each entity. This could be done by hand by the developer, but the main objective of this project is to populate the Ontology automatically by retrieving online information. Once the Ontology has been filled with all the instances the IE system is able to recognize, it becomes the knowledge base, in which all the information is fully defined and related.


The Ontology editor incorporates a mechanism to evaluate the designed Ontologies, called OntoClean. OntoClean is an evaluation methodology that analyzes Ontologies and states whether they are correct or not. It was developed by N. Guarino and C. Welty. It is based on philosophical notions that are used to analyze the conceptual model of an Ontology.

The recipes Ontology has been evaluated with this methodology and the following information was obtained: “Synchronous Evaluation OntoClean for recipes 0 ERRORS FOUND”

This means that the Ontology is well formed following Methontology and has no errors in its design.
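To give an idea of the kind of check behind such a report: OntoClean assigns metaproperties such as rigidity to each concept and then applies constraints to the taxonomy, one of which is that an anti-rigid concept cannot subsume a rigid one. The sketch below illustrates only that single constraint. It is not the editor's implementation, and the taxonomy, the metaproperty tags and the function names are hypothetical and are not taken from the recipes Ontology.

```python
def ancestors(concept, parents):
    """Return every direct or indirect superclass of a concept."""
    seen, stack = set(), list(parents.get(concept, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def rigidity_violations(parents, rigidity):
    """List (rigid_concept, anti_rigid_superclass) pairs.

    parents:  {concept: set_of_direct_superclasses}
    rigidity: {concept: "+R" (rigid) or "~R" (anti-rigid)}
    """
    violations = []
    for concept in parents:
        if rigidity.get(concept) != "+R":
            continue
        for sup in ancestors(concept, parents):
            if rigidity.get(sup) == "~R":
                violations.append((concept, sup))
    return sorted(violations)


# Hypothetical example: being a cook is a role (anti-rigid), being a person
# is essential to its instances (rigid), so "person is-a cook" is flagged.
parents = {"person": {"cook"}, "chef": {"cook"}, "cook": set()}
rigidity = {"person": "+R", "chef": "~R", "cook": "~R"}

violations = rigidity_violations(parents, rigidity)
```

An empty list from a check of this kind corresponds to the "0 ERRORS FOUND" outcome for the constraint in question.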


METHONTOLOGY is a methodology that guides the creation of Ontologies incrementally through their whole life cycle. It prescribes first creating the terms of the Ontology (concepts, instances, properties, etc.) and then structuring them with relationships, axioms and formulas to build the taxonomy. It also covers the verification and validation of the Ontology.


Converting the developed Ontology to DAML is automatic: the Ontology editor incorporates an option to export it. This language has been chosen as the intermediate one because both the Ontology editor and the IE tool support it. The exported document is shown in Appendix-16.


To set up this annotation tool it is necessary to run two files: a Melita client, which comprises the GUI through which the user interacts with the application, and a server file, which connects with Amilcare.

The inputs necessary to run the annotation tool are the following:

• A training corpus: A set of representative files of the domain. It is not included in the Appendix due to its large volume; please refer to the enclosed CD to consult these files. Melita’s specifications state that these files can be HTML files, but many bugs were found when working with them, ranging from poor performance to loading errors. The input files therefore had to be plain text documents to avoid these problems.

• An Ontology: The skeleton of the Ontology that guides the annotation process. It has to be a very simple Ontology, without attributes or relationships, because the tool cannot cope with them.

It only annotates instances of the Ontology concepts. This is why the Ontology had to be remade several times to adapt it to this tool (and to the IE tool as well). All the attributes had to be remodeled into concepts related to the concept they belong to. The new Ontology model is shown in [Appendix-13]. This is the final domain model made in this project.

The tool has another drawback regarding the Ontology language: it does not use any standard language (unlike the IE tool, which can import a DAML Ontology), nor does it support any web-oriented language, as would be desirable. The Ontology therefore had to be rewritten by hand in Melita’s own notation. The picture below shows the Ontology:

Melita’s Ontology

things(X) ==> concept(X) v relation(X).

concept(X) ==> general_features(X) v ingredient_description_part(X) v way_of_doing(X) v nutritional_value(X) v course(X).

general_features(X) ==> name(X) v retrieved_from(X) v posted_by(X) v price_person(X) v difficulty(X) v cooking_time(X) v number_of_servings(X).

nutritional_value(X) ==> fats(X) v carbohydrates(X) v cholesterol(X) v proteins(X) v calories(X) v good_for(X).

ingredient_description_part(X) ==> ingredient_description(X).

ingredient_description(X) ==> ingredient(X) v quantity(X) v measure(X).

ingredient(X) ==> food(X) v drink(X).

food(X) ==> fish(X) v dairy_produt(X) v vegetable(X) v fat_oil(X) v cereal_grain(X) v egg(X) v meat(X) v fruit(X) v miscellaneous(X) v cereal_grain_based(X).

vegetable(X) ==> spice(X).

In this notation only the entities and the hierarchy between them can be specified; the kind of relationship between two entities cannot be stated, and no attributes can be defined.

Named relationships can be declared in this notation as terms under relation(X); the picture below shows an example:

Relationships in Melita’s Ontology

things(X) ==> concept(X) v relation(X).

relation(X) ==> IS-A(X) v is-part-of(X) v is_made_of(X) v consists_of(X) v etc …

Although relations can be defined in this language and appear in the left part of the screen, and although the User Manual and other documentation state otherwise, after several attempts it was found that it is not possible to annotate the texts with relationships.

After further inquiries, Melita’s developers admitted that it is impossible to perform this action with the annotation tool. As they explained, the first developer of the tool thought it would be easy to build a heuristic to annotate any kind of relationship, but he later realized the difficulty of the task and dropped it. Unfortunately, the feature is still described in the documentation provided with the tool, which confuses the user. This mistake should be removed from the documentation.

• Accuracy levels:

These parameters can be set as the user wishes. They have two different scopes: they can be defined globally, for all the entities of the Ontology, or locally, giving a different accuracy value to each element of the Ontology.

• Annotations: This is a further input needed to run the annotation tool. The annotations are made manually by the user, who highlights the words to be identified in the text. This is done very easily by dragging an Ontology concept from the left part of the screen and dropping it on the word(s) to be identified.
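Since the Ontology input had to be written by hand in Melita’s rule notation, it can be useful to read such a file back and inspect the hierarchy it defines. The sketch below does that for rules of the form parent(X) ==> child(X) v child(X). The grammar is inferred only from the listing shown earlier, not from any Melita specification, and the name parse_melita_rules is hypothetical.

```python
import re

# One rule per line: head(X) ==> term(X) v term(X) ... with an optional final dot.
RULE = re.compile(r"^\s*([\w-]+)\(X\)\s*==>\s*(.+?)\.?\s*$")
TERM = re.compile(r"([\w-]+)\(X\)")


def parse_melita_rules(text):
    """Return {parent: [children]} for every rule found in the text."""
    taxonomy = {}
    for line in text.splitlines():
        match = RULE.match(line)
        if match:
            parent, body = match.groups()
            taxonomy.setdefault(parent, []).extend(TERM.findall(body))
    return taxonomy


# A few rules copied from the Ontology listing above.
rules = """
things(X) ==> concept(X) v relation(X).
nutritional_value(X) ==> fats(X) v carbohydrates(X) v cholesterol(X) v proteins(X) v calories(X) v good_for(X).
ingredient_description(X) ==> ingredient(X) v quantity(X) v measure(X).
ingredient(X) ==> food(X) v drink(X).
vegetable(X) ==> spice(X).
"""

taxonomy = parse_melita_rules(rules)
```

Because a term is only recognized when it is followed by (X), a trailing "etc …" as in the relationships example is simply ignored by this sketch.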

The picture below shows a recipe annotated with Melita’s annotation tool, to clarify the annotation process:

As seen in the picture, the GUI is very intuitive and allows users without experience to highlight concepts in the texts.

The Ontology is displayed in the left part of the screen, and the documents to be annotated can be displayed one after another in the right part.
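As noted for the training corpus, HTML pages caused problems in the tool and plain text documents were used instead. How the conversion was carried out is not stated; the sketch below shows one possible way to do it with Python's standard html.parser module. The class and function names are hypothetical, and the sample page is invented for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of a page, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Return the text content of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Invented sample page.
sample = ("<html><head><style>p {color: red}</style></head>"
          "<body><h1>Paella</h1><p>2 cups of rice</p></body></html>")
plain = html_to_text(sample)
```

Each text fragment ends up on its own line, which keeps ingredient lines separate in the resulting plain text file.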

In document Ontology-based semantic querying of the Web with respect to food recipes (Pages 91-96)