Thursday, April 6, 2017

Configuring the Index Info folder for Sesame in PowerAqua: how do I do it with Virtuoso?

In multi_index_properties.xml :

<ONTOLOGY_INDEX_DB>jdbc:mysql://localhost:3306/ontologyindexrelation</ONTOLOGY_INDEX_DB>
<ONTOLOGY_INDEX_DB_LOGIN>root</ONTOLOGY_INDEX_DB_LOGIN>
<ONTOLOGY_INDEX_DB_PASSWORD>password</ONTOLOGY_INDEX_DB_PASSWORD>
<INDEX_GLOBAL_PATH>/var/lib/tomcat7/webapps/poweraqua/LuceneIndexes/</INDEX_GLOBAL_PATH>
<INDEX_INFO_FOLDER>./indexListInformation/indexkmiweb07-philips/</INDEX_INFO_FOLDER>


I do not know how to configure INDEX_INFO_FOLDER for Virtuoso. For Sesame, there is a folder at ./PowerAqua/code/netbeans/PowerAquaOpenSource.zip/IndexListInformation/

IndexListInformation has the folders:

"indexkmiweb07-linkeddata  indexkmiweb07-pg    indexkmiweb07-uam
indexkmiweb07-mysql       indexkmiweb07-trec  indexmooneyexample"

These names parallel the index directories found under INDEX_GLOBAL_PATH:

"indexkmiweb07linkedata  indexkmiweb07mysql  indexkmiweb07pg  indexkmiweb07trec  indexkmiweb07-trec  indexkmiweb07uam"


The indexkmiweb07-linkeddata folder in IndexListInformation contains two files:

"index_properties.xml  service_properties.xml"

index_properties.xml   :

<?xml version="1.0" encoding="UTF-8"?>
<CONFIGURATION>
<INDEX>
<INDEX_DIRECTORY>indexkmiweb07linkedata/index_dir/</INDEX_DIRECTORY>
<SPELL_INDEX_DIRECTORY>indexkmiweb07linkedata/spell_index_dir/</SPELL_INDEX_DIRECTORY>
<METADATA_INDEX_DB>jdbc:mysql://localhost:3306/metadataindex</METADATA_INDEX_DB>
<METADATA_INDEX_DB_LOGIN>root</METADATA_INDEX_DB_LOGIN>
<METADATA_INDEX_DB_PASSWORD />
<METADATA_INDEX_TABLE>indexkmiweb07linkedata</METADATA_INDEX_TABLE>
</INDEX></CONFIGURATION>
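
To see how the two files might fit together, here is a minimal Python sketch that reads the index_properties.xml quoted above and joins its relative directories onto INDEX_GLOBAL_PATH. That PowerAqua resolves the paths by simple joining is my assumption, based only on the directory layout; the function name resolve_index is made up for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET

# Value copied from multi_index_properties.xml above.
INDEX_GLOBAL_PATH = "/var/lib/tomcat7/webapps/poweraqua/LuceneIndexes/"

# Contents of indexkmiweb07-linkeddata/index_properties.xml as quoted above.
INDEX_PROPERTIES = """<?xml version="1.0" encoding="UTF-8"?>
<CONFIGURATION>
<INDEX>
<INDEX_DIRECTORY>indexkmiweb07linkedata/index_dir/</INDEX_DIRECTORY>
<SPELL_INDEX_DIRECTORY>indexkmiweb07linkedata/spell_index_dir/</SPELL_INDEX_DIRECTORY>
<METADATA_INDEX_DB>jdbc:mysql://localhost:3306/metadataindex</METADATA_INDEX_DB>
<METADATA_INDEX_DB_LOGIN>root</METADATA_INDEX_DB_LOGIN>
<METADATA_INDEX_DB_PASSWORD />
<METADATA_INDEX_TABLE>indexkmiweb07linkedata</METADATA_INDEX_TABLE>
</INDEX></CONFIGURATION>
"""


def resolve_index(index_properties_xml, index_global_path):
    """Return one dict per <INDEX> element with absolute index paths.

    Assumption: relative directories are joined onto INDEX_GLOBAL_PATH.
    """
    root = ET.fromstring(index_properties_xml.encode("utf-8"))
    resolved = []
    for index in root.iter("INDEX"):
        def text(tag, index=index):
            # Empty elements such as <METADATA_INDEX_DB_PASSWORD /> yield "".
            return (index.findtext(tag) or "").strip()

        resolved.append({
            "index_dir": posixpath.join(index_global_path, text("INDEX_DIRECTORY")),
            "spell_index_dir": posixpath.join(index_global_path, text("SPELL_INDEX_DIRECTORY")),
            "metadata_db": text("METADATA_INDEX_DB"),
            "metadata_table": text("METADATA_INDEX_TABLE"),
        })
    return resolved


if __name__ == "__main__":
    for entry in resolve_index(INDEX_PROPERTIES, INDEX_GLOBAL_PATH):
        print(entry)
```

If the joined paths match directories that actually exist under LuceneIndexes, that would support the reading that INDEX_INFO_FOLDER only points at these descriptor files, while the Lucene data itself lives under INDEX_GLOBAL_PATH.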


service_properties.xml :

<?xml version="1.0" encoding="UTF-8"?>
<CONFIGURATION>
<PLUGIN_MANAGER>/Applications/apache-tomcat-5.5.23/webapps/powerAqua/WEB-INF/aquaplugins</PLUGIN_MANAGER>
<REPOSITORY>
        <SERVER>http://kmi-web03.open.ac.uk:8080/sesame</SERVER>
        <PROXY>wwwcache.open.ac.uk</PROXY>
        <PORT>80</PORT>
        <LOGIN></LOGIN>
        <PASSWORD></PASSWORD>
        <PLUGIN_TYPE>sesame</PLUGIN_TYPE>
        <REPOSITORY_NAME>travel_destinations</REPOSITORY_NAME>
        <TYPE>OWL</TYPE>
</REPOSITORY>
...
</CONFIGURATION>
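
A small Python sketch for listing what a service_properties.xml declares, which may help when comparing the Sesame entries against whatever a Virtuoso entry would need. The XML below is the first REPOSITORY block quoted above, with the elided part of the file left out; list_repositories is a hypothetical helper name, not part of PowerAqua.

```python
import xml.etree.ElementTree as ET

# First <REPOSITORY> block of service_properties.xml as quoted above.
# The part of the file that was elided in the post is simply omitted here.
SERVICE_PROPERTIES = """<?xml version="1.0" encoding="UTF-8"?>
<CONFIGURATION>
<PLUGIN_MANAGER>/Applications/apache-tomcat-5.5.23/webapps/powerAqua/WEB-INF/aquaplugins</PLUGIN_MANAGER>
<REPOSITORY>
        <SERVER>http://kmi-web03.open.ac.uk:8080/sesame</SERVER>
        <PROXY>wwwcache.open.ac.uk</PROXY>
        <PORT>80</PORT>
        <LOGIN></LOGIN>
        <PASSWORD></PASSWORD>
        <PLUGIN_TYPE>sesame</PLUGIN_TYPE>
        <REPOSITORY_NAME>travel_destinations</REPOSITORY_NAME>
        <TYPE>OWL</TYPE>
</REPOSITORY>
</CONFIGURATION>
"""


def list_repositories(xml_text):
    """Return one dict per <REPOSITORY>, mapping child tag to its text."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    repositories = []
    for repo in root.iter("REPOSITORY"):
        # Empty elements such as <LOGIN></LOGIN> become empty strings.
        repositories.append({child.tag: (child.text or "").strip() for child in repo})
    return repositories


if __name__ == "__main__":
    for repo in list_repositories(SERVICE_PROPERTIES):
        print(repo["PLUGIN_TYPE"], repo["SERVER"], repo["REPOSITORY_NAME"])
```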

The SQL index tables for these Sesame repositories appear to have been built with:

./PowerAquaOpenSource/src/SesamePlugin/SesameURLDatabaseTransformer.java


Virtuoso has a full-text index that is queried with bif:contains:

http://docs.openlinksw.com/virtuoso/virtuosofaq8/
http://docs.openlinksw.com/virtuoso/rdfpredicatessparql/

It may be accessed by: ?? ..
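
One way the full-text index might be reached is through an ordinary SPARQL query that uses bif:contains on a label. The sketch below only builds the query text and the request URL; it does not contact a server. The endpoint http://localhost:8890/sparql is an assumption (Virtuoso's usual default on a local install), matching on rdfs:label is my own choice, and the function names are made up.

```python
from urllib.parse import urlencode

# Assumption: a local Virtuoso instance with its SPARQL endpoint on the default port.
SPARQL_ENDPOINT = "http://localhost:8890/sparql"


def build_label_search(keyword, limit=20):
    """Build a SPARQL query that full-text matches rdfs:label values.

    The text expression is wrapped in single quotes inside the double-quoted
    literal so that a multi-word keyword is treated as one phrase.
    """
    # Replace characters that would break out of the quoted expression.
    cleaned = keyword.replace("\\", " ").replace("'", " ").replace('"', " ")
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?entity ?label WHERE {\n"
        "  ?entity rdfs:label ?label .\n"
        f"  ?label bif:contains \"'{cleaned}'\" .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(keyword, endpoint=SPARQL_ENDPOINT):
    """Return a GET URL for the query, asking for JSON results."""
    params = {
        "query": build_label_search(keyword),
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_label_search("wine"))
    print(build_request_url("wine"))
```

The rows returned by such a query (entity URI and label) are roughly the kind of data that would have to be copied into the MySQL tables PowerAqua expects, if that route is taken.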




How to access the index from Virtuoso is not clear to me. I know that I can filter in Virtuoso with bif:contains, and could then build an ontologyindexrelation database containing an ontologyindextable with the columns:

id ontologyId indexManagerId


In the metadataindex database, there are several tables, each with the columns:

id ontologyId entityURI classURI classLabel


The ontologyId is shared across tables.

For example, for https://www.w3.org/TR/2003/PR-owl-guide-20031215/food there is a table in the metadataindex database named:

indexkmiweb07mysqldirectclasses

In addition, I know that the food ontology, with the id 1334540344, appears in these tables of the metadataindex database:

ontologyId    table
1334540344    indexkmiweb07mysqldirectclasses
1334540344    indexkmiweb07mysqldirectsubclasses
1334540344    indexkmiweb07mysqldirectsuperclasses
1334540344    indexkmiweb07mysqlequivalent
1334540344    indexkmiweb07mysqlsubclasses
1334540344    indexkmiweb07mysqlsuperclasses


This dumps all rows for a particular ontologyId and table (substitute the table name inside the backticks, and the id, as needed):

SELECT * FROM `indexkmiweb07mysqldirectsubclasses` WHERE `indexkmiweb07mysqldirectsubclasses`.ontologyId = 1334540344;
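
The same query can be repeated over all six tables in one go. The Python sketch below uses an in-memory SQLite database purely as a stand-in for the MySQL metadataindex database so that it runs anywhere; the column names come from the list above, while the helper names and the example.org row are made up. With a real MySQL connector the placeholder would typically be %s rather than ?.

```python
import sqlite3

INDEX_TABLE = "indexkmiweb07mysql"
TABLE_SUFFIXES = (
    "directclasses",
    "directsubclasses",
    "directsuperclasses",
    "equivalent",
    "subclasses",
    "superclasses",
)
COLUMNS = "id, ontologyId, entityURI, classURI, classLabel"


def create_empty_tables(conn, base_table=INDEX_TABLE):
    """Create empty stand-in tables with the columns listed in the post."""
    for suffix in TABLE_SUFFIXES:
        conn.execute(
            f"CREATE TABLE `{base_table}{suffix}` ("
            "id INTEGER PRIMARY KEY, ontologyId INTEGER, "
            "entityURI TEXT, classURI TEXT, classLabel TEXT)"
        )


def rows_for_ontology(conn, ontology_id, base_table=INDEX_TABLE):
    """Return {table name: rows} for one ontologyId across all six tables."""
    results = {}
    for suffix in TABLE_SUFFIXES:
        table = base_table + suffix
        # Table names cannot be bound as parameters, so validate them instead.
        if not table.isalnum():
            raise ValueError(f"unexpected table name: {table!r}")
        cursor = conn.execute(
            f"SELECT {COLUMNS} FROM `{table}` WHERE ontologyId = ?",
            (ontology_id,),
        )
        results[table] = cursor.fetchall()
    return results


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_empty_tables(conn)
    # Placeholder row for illustration only, not real index data.
    conn.execute(
        "INSERT INTO indexkmiweb07mysqldirectsubclasses "
        "(ontologyId, entityURI, classURI, classLabel) VALUES (?, ?, ?, ?)",
        (1334540344, "http://example.org/a", "http://example.org/B", "B"),
    )
    for table, rows in rows_for_ontology(conn, 1334540344).items():
        print(table, len(rows))
```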



Also note (in index_properties.xml):

<?xml version="1.0" encoding="UTF-8"?>
<CONFIGURATION>

<!-- In order to acess these indexes import the mysql metadata tables in mysql indexes backups -->
<INDEX>
<INDEX_DIRECTORY>indexkmiweb07mysql/index_dir/</INDEX_DIRECTORY>
<SPELL_INDEX_DIRECTORY>indexkmiweb07mysql/spell_index_dir/</SPELL_INDEX_DIRECTORY>
<METADATA_INDEX_DB>jdbc:mysql://localhost:3306/metadataindex</METADATA_INDEX_DB>
<METADATA_INDEX_DB_LOGIN>root</METADATA_INDEX_DB_LOGIN>
<METADATA_INDEX_DB_PASSWORD></METADATA_INDEX_DB_PASSWORD>
<METADATA_INDEX_TABLE>indexkmiweb07mysql</METADATA_INDEX_TABLE>
</INDEX>
</CONFIGURATION>

and (PowerAquaOpenSource/indexListInformation/indexkmiweb07-mysql/index_properties.xml):
 
<?xml version="1.0" encoding="UTF-8"?>
<CONFIGURATION>
<INDEX>
<INDEX_DIRECTORY>indexkmiweb07mysql/index_dir/</INDEX_DIRECTORY>
<SPELL_INDEX_DIRECTORY>indexkmiweb07mysql/spell_index_dir/</SPELL_INDEX_DIRECTORY>
<METADATA_INDEX_DB>jdbc:mysql://localhost:3306/metadataindex</METADATA_INDEX_DB> 
<METADATA_INDEX_DB_LOGIN>root</METADATA_INDEX_DB_LOGIN> 
<METADATA_INDEX_DB_PASSWORD /> 
<METADATA_INDEX_TABLE>indexkmiweb07mysql</METADATA_INDEX_TABLE> 
</INDEX></CONFIGURATION>





and (PowerAquaOpenSource/indexListInformation/indexkmiweb07-mysql/service_properties.xml):

<?xml version="1.0" encoding="UTF-8"?>
<CONFIGURATION>
<PLUGIN_MANAGER>/Applications/apache-tomcat-5.5.23/webapps/powerAqua/WEB-INF/aquaplugins</PLUGIN_MANAGER>
<REPOSITORY>
 <SERVER>http://kmi-web03.open.ac.uk:8080/sesame</SERVER>
 <PROXY>wwwcache.open.ac.uk</PROXY>
 <PORT>80</PORT>
 <LOGIN></LOGIN>
 <PASSWORD></PASSWORD>
 <PLUGIN_TYPE>sesame</PLUGIN_TYPE>
 <REPOSITORY_NAME>WINE_FOOD</REPOSITORY_NAME>
 <TYPE>OWL</TYPE>
</REPOSITORY>
</CONFIGURATION>



Thus, in multi_index_properties.xml (this works after adding an indexListInformation folder to powerAquaLinked, which I renamed poweraqua):

<ONTOLOGY_INDEX_DB>jdbc:mysql://localhost:3306/ontologyindexrelation</ONTOLOGY_INDEX_DB>
<ONTOLOGY_INDEX_DB_LOGIN>root</ONTOLOGY_INDEX_DB_LOGIN>
<ONTOLOGY_INDEX_DB_PASSWORD>password</ONTOLOGY_INDEX_DB_PASSWORD>
<INDEX_GLOBAL_PATH>/var/lib/tomcat7/webapps/poweraqua/LuceneIndexes/</INDEX_GLOBAL_PATH>
<INDEX_INFO_FOLDER>./indexListInformation/indexkmiweb07mysql/</INDEX_INFO_FOLDER>


 ./poweraqua/LuceneIndexes/indexkmiweb07mysql/

contains the subdirectories:

index_dir  spell_index_dir

/poweraqua/LuceneIndexes/indexkmiweb07mysql/index_dir contains:

_9tf.cfs  _instances  segments_38  segments.gen

/poweraqua/LuceneIndexes/indexkmiweb07mysql/index_dir/_instances contains:

_6fs.cfs  segments_1mk  segments.gen

/poweraqua/LuceneIndexes/indexkmiweb07mysql/spell_index_dir contains:

_1s.cfs  _instances  segments_3n  segments.gen

/poweraqua/LuceneIndexes/indexkmiweb07mysql/spell_index_dir/_instances contains:

_1p.cfs  segments_3h  segments.gen
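
A quick way to check that an index folder has the layout listed above is to look for the four directories and a segments file in each. This is only a sketch: the function name is made up, and treating "has a file whose name starts with segments" as the sign of a Lucene index is my own rule of thumb based on the listings above.

```python
from pathlib import Path

# Sub-directories seen under LuceneIndexes/indexkmiweb07mysql/ in the listings above.
EXPECTED_SUBDIRS = (
    "index_dir",
    "index_dir/_instances",
    "spell_index_dir",
    "spell_index_dir/_instances",
)


def check_lucene_index(index_global_path, index_name):
    """Return {sub-directory: True/False} for one index folder.

    A sub-directory passes if it exists and holds at least one file whose
    name starts with "segments" (for example segments.gen or segments_38).
    """
    base = Path(index_global_path) / index_name
    report = {}
    for sub in EXPECTED_SUBDIRS:
        directory = base / sub
        report[sub] = directory.is_dir() and any(
            entry.name.startswith("segments") for entry in directory.iterdir()
        )
    return report


if __name__ == "__main__":
    # Path taken from INDEX_GLOBAL_PATH in multi_index_properties.xml.
    global_path = "/var/lib/tomcat7/webapps/poweraqua/LuceneIndexes/"
    for sub, ok in check_lucene_index(global_path, "indexkmiweb07mysql").items():
        print(sub, "ok" if ok else "missing")
```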

It is confusing that PowerMap is apparently associated with Sesame in the multi_index_properties.xml file. All of the ontologies in the Sesame repositories are associated with Lucene indexes and with MySQL tables in the metadataindex database and the ontologyindextable. I am not certain what the subclasses, direct classes, direct subclasses, direct superclasses, equivalent classes, and superclasses of an ontology are.

In multi_index_properties.xml, what should INDEX_GLOBAL_PATH and INDEX_INFO_FOLDER be with respect to Virtuoso? For Sesame and Lucene it seems straightforward:
<INDEX_GLOBAL_PATH>/var/lib/tomcat7/webapps/poweraqua/LuceneIndexes/</INDEX_GLOBAL_PATH>
<INDEX_INFO_FOLDER>./indexListInformation/indexkmiweb07mysql/</INDEX_INFO_FOLDER>

"By default files are automatically free text indexed as they are inserted into Virtuoso. This is very convenient but can be time consuming if you frequently insert or update text files. For this reason Virtuoso can be set to index in batch mode at a particular interval in minutes."
http://docs.openlinksw.com/virtuoso/webdavadmin/



Perhaps I could build an index (or load /poweraqua/LuceneIndexes/indexkmiweb07mysql) with Solr. Some starting points:

Solr and Lucene First Steps

https://www.youtube.com/watch?v=rq3stslcvIw

Can a raw Lucene index be loaded by Solr?
http://stackoverflow.com/questions/2715973/can-a-raw-lucene-index-be-loaded-by-solr

See also:

Full-text search with Lucene and neat things you can do with it - Itamar Syn-Hershko @ NDC 2012

https://www.youtube.com/watch?v=Nf9p-d01p78

Also, I am guessing the port and path for ONTOLOGY_INDEX_DB based on the output of "netstat -lntu".


Uploading the food ontology to Virtuoso:

Perhaps you could also try: http://docs.openlinksw.com/virtuoso/webdavadmin/

Doing Free Text Search in Virtuoso:

http://docs.openlinksw.com/virtuoso/ch-freetext/

To install Solr (which is built on Lucene) see: https://www.digitalocean.com/community/tutorials/how-to-install-solr-5-2-1-on-ubuntu-14-04

