SquirrelRDF is available in the following locations:
https://jena.svn.sourceforge.net/svnroot/jena/SquirrelRDF/trunk
Help is available from jena-dev, IRC (freenode, #jena), or direct email.
It is distributed under the Jena licence.
There is a lot of structured information out there, but it just isn't in RDF. It isn't always possible, or desirable, to dump this data and convert it to RDF. It may not be possible to access the raw data, and, regardless, keeping this RDF version up to date would not be trivial.
SquirrelRDF is a tool which allows non-RDF data stores (or, perhaps, not explicitly RDF) to be queried using SPARQL. In its current form this includes relational databases (via JDBC) and LDAP servers (via JNDI). It provides an ARQ QueryEngine (for Java access), a command line tool, and a servlet for SPARQL HTTP access. As a result the information now looks like RDF, and is always current.
SquirrelRDF exposes the mapped store in a rather 'raw' form. It makes no attempt, for example, to reveal implicit relations between objects (suggested by foreign keys), or normalise denormalised data. This simplifies Squirrel's task, focusing it on mapping to RDF and ignoring the complex task of transforming between vocabularies or ontologies, which are better left to pure RDF tools. Here are some approaches:
CONSTRUCT, which is a more powerful means to change the shape of the results.
You will need:
Put the jar files in lib/ if you want to build or test SquirrelRDF. Otherwise just ensure that they, together with lib/squirrelrdf.jar, are on your CLASSPATH.
In this section I will assume you are armed with a configuration file, config.ttl. See below for details on configuration.
The command line tool provides an easy way to check that all is working correctly:

    lewis:~/ pldms$ java squirrelrdf.Query config.ttl \
        "SELECT * WHERE { ?s <http://example.com/people_name> ?name }"
    WARN [main] (QueryEngine.java:106) - Default model is null in the dataset
    -----------------------------------------------
    | s                                | name     |
    ===============================================
    | <http://example.com/people;id=1> | "Damian" |
    | <http://example.com/people;id=2> | "Libby"  |
    | <http://example.com/people;id=3> | "Dan"    |
    | <http://example.com/people;id=4> | "Danny"  |
    -----------------------------------------------

The second argument can also be a file containing the query. If no query argument is given, the query is taken from STDIN.
SquirrelRDF implements ARQ's QueryEngine. From this one can execute ASK, SELECT and CONSTRUCT queries:

    Model config = FileManager.get().loadModel(configFile);
    Query query = QueryFactory.create(theQuery);
    QueryEngine qe = new SQLQueryEngine(query, config);
    // or new LdapQueryEngine(query, config); for LDAP
    qe.setDataset(DatasetFactory.create()); // empty data set
    ResultSet results = qe.execSelect();

SquirrelRDF includes a servlet (squirrelrdf.Servlet) and an example web app to get you started. Copy the libraries to webapp/WEB-INF/lib, and your configuration to webapp/WEB-INF/map.ttl. Deploy this web application, for example as 'squirrel', and you should be able to execute a query by visiting http://localhost:8080/squirrel/, or from the command line:

    lewis:~/ pldms$ curl http://localhost:8080/squirrel/model \
        -d 'query=SELECT * WHERE { ?s <http://example.com/people_name> ?name }'

The servlet was written for a simple demonstration, and as a result is pretty limited. It can execute SELECT queries, and return results in XML. You can also give a stylesheet parameter, which will add a processing instruction to the resulting XML. ASK, CONSTRUCT, JSON results, and so on should be easy to add, since ARQ supports them all.
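For callers who prefer Java to curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not part of SquirrelRDF: the class `ServletQuerySketch` and its methods are hypothetical names, the only form parameter used is the `query` parameter shown in the curl call, and UTF-8 is assumed for both the request and the response.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical helper; not a SquirrelRDF class.
final class ServletQuerySketch {

    // Builds the form-encoded body: query=<URL-encoded SPARQL>.
    static String formBody(String sparql) {
        try {
            return "query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // POSTs the query and returns the response body as text
    // (assumed to be UTF-8 encoded XML for a SELECT query).
    static String post(String endpoint, String sparql) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(formBody(sparql).getBytes("UTF-8"));
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = conn.getInputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return buffer.toString("UTF-8");
    }
}
```

With the example web app deployed as 'squirrel', a call such as `ServletQuerySketch.post("http://localhost:8080/squirrel/model", "SELECT * WHERE { ?s <http://example.com/people_name> ?name }")` should return the same XML that curl prints.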
The relational database map follows roughly what is described in [1]. It performs no model mapping, unlike [2].
The database mapping can be automatically configured using the squirrelrdf.ExtractConfig tool. Take your database details, and a namespace, and pass them to the tool. The result is a configuration in Turtle:

    lewis:~/ pldms$ java squirrelrdf.ExtractConfig \
        jdbc:mysql://localhost/conference \
        com.mysql.jdbc.Driver \
        user password http://example.com/db/ > dbmap.ttl

You can also use squirrelrdf.ExtractConfig --list-tables to show the available tables, and append them to the command above, if you don't want to map every table (useful in SQL Server, if memory serves). Invoke with no arguments for usage details.
Here's a simple example:

    @prefix db:   <http://jena.hpl.hp.com/schemas/rdbmap#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix ex:   <http://example.com/db/> .
    @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix :     <#> .
    @prefix owl:  <http://www.w3.org/2002/07/owl#> .

    ex:map a db:Map ;
        db:mapsClass ex:people .

Here is a map, and it maps just one class, ex:people.

    <jdbc:mysql://localhost/conference> a db:Database ;
        db:user "username" ;
        db:pass "password" ;
        db:driver "com.mysql.jdbc.Driver" .

This is a database, with all the details necessary to talk to it.

    ex:people a rdfs:Class ;
        db:primaryKey ex:people_id ;
        db:database <jdbc:mysql://localhost/conference> ;
        db:table "people" .

This class, which is mapped, corresponds to the table "people", in the given database. It has a primary key:

    ex:people_id a rdf:Property ;
        rdfs:domain ex:people ;
        db:col "id" ;
        db:colType "int" .

This is a property of ex:people. It maps to the column "id". The column type given is not used currently, but we can see it's an integer.

    ex:people_name a rdf:Property ;
        rdfs:domain ex:people ;
        db:col "name" ;
        db:colType "varchar" .

And another property of ex:people, called ex:people_name. The class and property URIs aren't significant, incidentally; they are simply what ExtractConfig generates.
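The naming pattern visible in this example (a property named after the table, an underscore, and the column) can be sketched in a few lines. This is an illustration inferred from the example configuration only, not ExtractConfig's actual code; the class `ColumnConfigSketch` and its method are hypothetical.

```java
// Hypothetical illustration; not part of SquirrelRDF.
final class ColumnConfigSketch {

    // Returns the Turtle block for one column, in the shape of the
    // example configuration: <prefix>:<table>_<column> with its
    // domain, column name and column type.
    static String columnBlock(String prefix, String table, String column, String colType) {
        String property = prefix + ":" + table + "_" + column;
        return property + " a rdf:Property ;\n"
             + "    rdfs:domain " + prefix + ":" + table + " ;\n"
             + "    db:col \"" + column + "\" ;\n"
             + "    db:colType \"" + colType + "\" .\n";
    }

    public static void main(String[] args) {
        // Prints the ex:people_name block from the example.
        System.out.print(columnBlock("ex", "people", "name", "varchar"));
    }
}
```

Since the URIs are not significant, any other naming scheme would work equally well in a hand-written map, as long as db:col names the real column.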
This mapping turns the following people table into RDF:

id | name |
---|---|
1 | Damian |
2 | Libby |


    ex:people;id=1 a ex:people ;
        ex:people_id 1 ;
        ex:people_name "Damian" .

    ex:people;id=2 a ex:people ;
        ex:people_id 2 ;
        ex:people_name "Libby" .

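The row-to-triples pattern shown above can be sketched as follows. This is a minimal illustration based only on the URIs visible in the examples (class URI, then `;`, then the key column and value); it is not SquirrelRDF's implementation, and `RowMappingSketch` and its methods are hypothetical names. Numbers are written bare and everything else as a plain string literal, which is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of SquirrelRDF.
final class RowMappingSketch {

    // Subject URI for a row: <class URI>;<key column>=<key value>.
    static String subjectUri(String classUri, String keyColumn, Object keyValue) {
        return classUri + ";" + keyColumn + "=" + keyValue;
    }

    // Renders one row as N-Triples-like lines: a type triple, then one
    // triple per column using the <table>_<column> property naming.
    static List<String> triples(String namespace, String table,
                                String keyColumn, Map<String, Object> row) {
        String classUri = namespace + table;
        String subject = "<" + subjectUri(classUri, keyColumn, row.get(keyColumn)) + ">";
        List<String> out = new ArrayList<>();
        out.add(subject + " a <" + classUri + "> .");
        for (Map.Entry<String, Object> cell : row.entrySet()) {
            Object value = cell.getValue();
            String object = (value instanceof Number)
                    ? value.toString()
                    : "\"" + value + "\"";
            String property = "<" + classUri + "_" + cell.getKey() + ">";
            out.add(subject + " " + property + " " + object + " .");
        }
        return out;
    }
}
```

Passing the first row of the table (id 1, name "Damian") with the namespace `http://example.com/db/` yields a type triple plus one triple each for people_id and people_name, all sharing the subject `<http://example.com/db/people;id=1>`.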
In summary:
No { ?s ?p ?o }, I'm afraid, or even { :foo ?p :bar }. Sorry.
You can't query for type ({ ?s a ?type }) at the moment. Because the relational type system is stronger than RDF's, giving a type is often redundant, and doesn't change the SQL query. However, { ?s a ex:type } is the notable exception to this.
You may have noticed that databases are associated with classes, not maps. So a map can involve more than one database, which you may find useful.
If a table has no primary key, SquirrelRDF can't identify rows, and returns blank nodes as subjects. This can result in oddities with optionals and unions.
The LDAP Mapper is less complex, and less mature, than the RDB mapper. On the other hand LDAP is quite close to RDF, and so has fewer issues.
No automatic configuration here, alas, but it isn't too hard. The map just maps properties to attributes, although there is some additional work depending on the range of the attribute.

    @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix lmap: <http://jena.hpl.hp.com/schemas/ldapmap#> .
    @prefix ex:   <http://example.com/schemas/hpcorp#> .
    @prefix geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#> .

    <> a lmap:Map ;
        lmap:server <ldap://ldap.example.com/o=example.com> ;

An LDAP map, mapping this server (starting the search at the base o=example.com).

        lmap:mapsProp [
            lmap:property foaf:name ;
            lmap:attribute "cn" ;
        ] ;

Map name to the cn attribute.

        lmap:mapsProp [
            lmap:property foaf:homepage ;
            lmap:attribute "webpage" ;
            a lmap:URIProperty ;
        ] ;
        lmap:mapsProp [
            lmap:property foaf:mbox ;
            lmap:attribute "uid" ;
            a lmap:EmailProperty ;
        ] ;

The values of webpage and uid are both URIs. In the case of the latter, however, mailto: will be prepended to the value.

        lmap:mapsProp [
            lmap:property foaf:based_near ;
            lmap:attribute "workLocation" ;
            a lmap:ObjectProperty ;
        ] ;
        lmap:mapsProp [
            lmap:property geo:lat ;
            lmap:attribute "latitude" ;
        ] ;
        lmap:mapsProp [
            lmap:property geo:long ;
            lmap:attribute "longitude" ;
        ] ;
        .

workLocation points to another LDAP node, which holds the work location.
The result is that the following query now works (skipping prefixes):

    SELECT ?lat ?long
    WHERE {
        ?person foaf:name "Damian Steer" ;
                foaf:based_near [ geo:lat ?lat ; geo:long ?long ] .
    }

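The difference between a plain mapped property, an lmap:URIProperty and an lmap:EmailProperty comes down to how the attribute value is turned into an RDF term. The following is a minimal sketch of that rule as described above, not SquirrelRDF's code: `LdapValueSketch`, its `Kind` enum and the sample values are hypothetical, and lmap:ObjectProperty is left out because it refers to another LDAP node rather than converting a value.

```java
// Hypothetical illustration; not part of SquirrelRDF.
final class LdapValueSketch {

    // LITERAL: no extra type given in the map; URI: lmap:URIProperty;
    // EMAIL: lmap:EmailProperty.
    enum Kind { LITERAL, URI, EMAIL }

    // Renders an attribute value as an RDF term in Turtle-like syntax.
    static String toTerm(Kind kind, String value) {
        switch (kind) {
            case URI:
                return "<" + value + ">";           // value is already a URI
            case EMAIL:
                return "<mailto:" + value + ">";    // mailto: is prepended
            default:
                return "\"" + value + "\"";         // plain literal
        }
    }
}
```

For example, a cn value would be rendered as a quoted literal, while a uid value such as the made-up `someone@example.com` would become `<mailto:someone@example.com>`.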
As with the RDB mapper, no { ?s ?p ?o } and friends. Sorry.
Some attributes have multiple values, such as class in one case, where the class value was the closure over subclasses. This needs fixing.
Because multiple values don't work I couldn't do type support. If people want it, it will happen. You can see the beginnings in the schema.
[1] Relational Databases on the Semantic Web
[2] D2RQ