The FieldQuery is part of the Java API defined in the org.apache.stanbol.entityhub.servicesapi bundle.
The JSON representation of a Field Query supports the following keys:
"selected": JSON array with the names of the fields selected by this query
"offset": the offset of the first result returned by this query
"limit": the maximum number of results returned
"constraints": JSON array holding all the constraints of the query
"ldpath": LDPath program that is executed for all results of the query. A more powerful alternative to the "selected" parameter for defining the information returned for query results.
A simple Field Query that selects rdfs:label and rdf:type, uses no offset and returns at most three results. The constraints are omitted:
{
    "selected": [
        "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label",
        "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"],
    "offset": "0",
    "limit": "3",
    "constraints": [...]
}
The following example uses an LDPath program to select rdf:type and to return the rdfs:label values as schema:name. The offset is set to 5 and a maximum of 5 results is returned. With a page size of 5 this corresponds to the second page of results.
{
    "ldpath": "schema:name = rdfs:label;rdf:type;",
    "offset": "5",
    "limit": "5",
    "constraints": [...]
}
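The relation between result pages, "offset" and "limit" can be sketched in a few lines of Python. This is only an illustration: the helper name `paged_query` and the zero-based page index are assumptions, not part of the Entityhub API, and the empty constraints list is a placeholder for real constraints.

```python
import json


def paged_query(ldpath, page, page_size, constraints):
    """Build a Field Query dict for a zero-based result page (hypothetical helper)."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size >= 1")
    return {
        "ldpath": ldpath,
        # the offset is the number of results skipped before this page
        "offset": str(page * page_size),
        "limit": str(page_size),
        "constraints": constraints,
    }


# second page (index 1) with five results per page
query = paged_query("schema:name = rdfs:label;rdf:type;", 1, 5, [])
print(json.dumps(query, indent=4))
```

The offset and limit are serialized as strings only to mirror the examples in this document.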
Constraints are always applied to a field. Currently only a single constraint per field is supported. This is a limitation of the implementation, not a conceptual one.
While there are five different constraint types, the following attributes are required by all of them:
field: the field the constraint is applied to
type: the type of the constraint. One of "reference", "value", "text", "range" or "similarity"
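A client could check the two required attributes before sending a query. The following Python sketch assumes a constraint is held as a plain dict; the function `check_required` is hypothetical and only mirrors the rules listed above.

```python
# the five constraint types named in this document
CONSTRAINT_TYPES = {"reference", "value", "text", "range", "similarity"}


def check_required(constraint):
    """Return a list of problems; an empty list means both required keys are present."""
    problems = []
    if not constraint.get("field"):
        problems.append("missing 'field'")
    ctype = constraint.get("type")
    if ctype is None:
        problems.append("missing 'type'")
    elif ctype not in CONSTRAINT_TYPES:
        problems.append("unknown type: %r" % (ctype,))
    return problems


print(check_required({"type": "reference"}))
```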
Each of the five constraint types is described below.
Additional keys of the Reference Constraint:
value (required): the URI value(s). For a single value a string can be used. Multiple values need to be passed as a JSON array.
mode: if multiple values are passed, this specifies whether query results must have "any" or "all" of the passed values (default: "any")
Search for instances of the type Place as defined in the DBpedia ontology:
{
    "type": "reference",
    "field": "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type",
    "value": "http:\/\/dbpedia.org\/ontology\/Place"
}
Search for entities that link to all of the following entities. NOTE that the field "http://stanbol.apache.org/ontology/entityhub/query#references" is special, as it causes a search in any outgoing relation. See the special fields section for details.
{
    "type": "reference",
    "field": "http:\/\/stanbol.apache.org\/ontology\/entityhub\/query#references",
    "value": [
        "http:\/\/dbpedia.org\/resource\/Category:Capitals_in_Europe",
        "http:\/\/dbpedia.org\/resource\/Category:Host_cities_of_the_Summer_Olympic_Games",
        "http:\/\/dbpedia.org\/ontology\/City"
    ],
    "mode": "all"
}
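The two reference constraint forms shown above (a single URI as string, several URIs as array with a "mode") could be produced by one small helper. This is a sketch under the assumption that constraints are built as Python dicts; `reference_constraint` is a hypothetical name.

```python
def reference_constraint(field, values, mode=None):
    """Build a reference constraint; 'values' is one URI or an iterable of URIs."""
    if isinstance(values, str):
        value = values          # single value: plain string
    else:
        value = list(values)    # multiple values: JSON array
        if not value:
            raise ValueError("at least one value is required")
    constraint = {"type": "reference", "field": field, "value": value}
    if mode is not None:
        if mode not in ("any", "all"):
            raise ValueError("mode must be 'any' or 'all'")
        constraint["mode"] = mode
    return constraint


REFERENCES = "http://stanbol.apache.org/ontology/entityhub/query#references"
print(reference_constraint(
    REFERENCES,
    ["http://dbpedia.org/resource/Category:Capitals_in_Europe",
     "http://dbpedia.org/ontology/City"],
    mode="all"))
```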
Value Constraints are very similar to Reference Constraints; however, they can be used to check the values of fields for any data type.
If no data type is defined, the data type is guessed based on the JSON type of the provided value. For details please see the table below.
Additional keys:
value (required): the value(s). For multiple values a JSON array must be used.
datatype: the data type of the value as a string. Multiple data types can also be passed by using a JSON array. Note that if no datatype is defined, the default is guessed based on the type of the passed value.
mode: if multiple values are passed, this specifies whether query results must have "any" or "all" of the passed values (default: "any"). For a usage example see the second Reference Constraint example.
Search for all entities with an altitude of 34 meters. Note that a string is passed as value, but the datatype is set explicitly ("xsd:int" in this example):
{
    "selected": [
        "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label"],
    "offset": "0",
    "limit": "3",
    "constraints": [{
        "type": "value",
        "value": "34",
        "field": "http:\/\/www.w3.org\/2003\/01\/geo\/wgs84_pos#alt",
        "datatype": "xsd:int"
    }]
}
The same can be achieved by passing the number 34 and not specifying the datatype. In this case "xsd:integer" would be guessed based on the provided value. Note however that this would not work for "xsd:long".
{
    "type": "value",
    "value": 34,
    "field": "http:\/\/www.w3.org\/2003\/01\/geo\/wgs84_pos#alt"
}
Expected results on DBpedia.org for this query include Berlin and Baghdad.
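The difference between the two variants is only visible in the serialized JSON: a quoted string with an explicit datatype versus a bare number without one. The Python sketch below shows both; `value_constraint` is a hypothetical helper, not an Entityhub function.

```python
import json


def value_constraint(field, value, datatype=None):
    """Build a value constraint; the datatype key is only added when given."""
    constraint = {"type": "value", "field": field, "value": value}
    if datatype is not None:
        constraint["datatype"] = datatype
    return constraint


ALT = "http://www.w3.org/2003/01/geo/wgs84_pos#alt"
explicit = value_constraint(ALT, "34", datatype="xsd:int")  # string plus datatype
guessed = value_constraint(ALT, 34)                         # JSON number, no datatype

print(json.dumps(explicit))
print(json.dumps(guessed))
```

Python's `json.dumps` writes "/" unescaped, while the examples in this document use the optional "\/" escape; both forms are equivalent JSON.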
Additional keys of the Text Constraint:
text (required): the text to search for. Multiple values can be passed by using a JSON array. Note that multiple values are considered optional (e.g. passing "Barack Obama" returns entities that contain both "Barack" and "Obama", while passing ["Barack","Obama"] also returns entities that contain only one of the two words; combinations like ["Barack Obama","USA","United States"] are also allowed).
language: the language of the searched text as a string. Multiple languages can be passed as a JSON array. Passing "" as language includes values with missing language information. If no language is defined, values in any language are used.
patternType: one of "wildcard", "regex" or "none" (default is "none")
caseSensitive: boolean (default is "false")
(1) Searches for entities with a German rdfs:label starting with "Frankf".
(2) Searches for entities that contain "Frankfurt" OR "Main" OR "Airport" in any language. Typically "Frankfurt am Main Airport" should be ranked first because it contains all the optional terms.
{
    "type": "text",
    "language": "de",
    "patternType": "wildcard",
    "text": "Frankf*",
    "field": "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label"
}
{
    "type": "text",
    "text": ["Frankfurt","Main","Airport"],
    "field": "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label"
}
Expected Results on DBPedia.org for (1) include "Frankfurt am Main", "Eintracht Frankfurt" and "Frankfort, Kentucky" and for (2) the Airport of Frankfurt am Main, Frankfurt as well as Airport.
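Both text constraints above can be generated from the same set of keys. The following Python sketch is an assumption about how a client might do this; `text_constraint` is a hypothetical helper that omits keys left at their documented defaults.

```python
PATTERN_TYPES = ("wildcard", "regex", "none")


def text_constraint(field, text, language=None, pattern_type="none",
                    case_sensitive=False):
    """Build a text constraint; 'text' and 'language' may be a string or a list."""
    if pattern_type not in PATTERN_TYPES:
        raise ValueError("patternType must be one of %s" % (PATTERN_TYPES,))
    constraint = {"type": "text", "field": field, "text": text}
    if language is not None:
        constraint["language"] = language
    if pattern_type != "none":       # "none" is the documented default
        constraint["patternType"] = pattern_type
    if case_sensitive:               # false is the documented default
        constraint["caseSensitive"] = True
    return constraint


LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
prefix_search = text_constraint(LABEL, "Frankf*", language="de",
                                pattern_type="wildcard")
optional_terms = text_constraint(LABEL, ["Frankfurt", "Main", "Airport"])
print(prefix_search)
print(optional_terms)
```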
Additional keys of the Range Constraint:
lowerBound: the lower bound of the range (at least one of lower and upper bound MUST be defined)
upperBound: the upper bound of the range (at least one of lower and upper bound MUST be defined)
inclusive: used for both upper and lower bound (default is "false")
The following query combines two range constraints and a reference constraint to search for cities with at least one million inhabitants that lie at least 1000 meters above sea level.
Note that the range constraint for the population needs to pass the datatype "xsd:long", because otherwise the passed value would be converted to "xsd:integer".
{
    "selected": [
        "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label",
        "http:\/\/dbpedia.org\/ontology\/populationTotal",
        "http:\/\/www.w3.org\/2003\/01\/geo\/wgs84_pos#alt"],
    "offset": "0",
    "limit": "3",
    "constraints": [{
        "type": "range",
        "field": "http:\/\/dbpedia.org\/ontology\/populationTotal",
        "lowerBound": 1000000,
        "inclusive": true,
        "datatype": "xsd:long"
    },{
        "type": "range",
        "field": "http:\/\/www.w3.org\/2003\/01\/geo\/wgs84_pos#alt",
        "lowerBound": 1000,
        "inclusive": true
    },{
        "type": "reference",
        "field": "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type",
        "value": "http:\/\/dbpedia.org\/ontology\/City"
    }]
}
Expected Results on DBPedia.org include Mexico City, Bogota and Quito.
The following query searches for persons born in 1946:
{
    "selected": [
        "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label",
        "http:\/\/dbpedia.org\/ontology\/birthDate",
        "http:\/\/dbpedia.org\/ontology\/deathDate"],
    "offset": "0",
    "limit": "3",
    "constraints": [{
        "type": "range",
        "field": "http:\/\/dbpedia.org\/ontology\/birthDate",
        "lowerBound": "1946-01-01T00:00:00.000Z",
        "upperBound": "1946-12-31T23:59:59.999Z",
        "inclusive": true,
        "datatype": "xsd:dateTime"
    },{
        "type": "reference",
        "field": "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type",
        "value": "http:\/\/dbpedia.org\/ontology\/Person"
    }]
}
Expected Results on DBPedia.org include Bill Clinton, George W. Bush and Donald Trump.
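The date bounds used for the birth year can be derived mechanically from the year. The Python sketch below reproduces the bounds of the example for an arbitrary year; the helper name `year_range_constraint` is an assumption made for illustration.

```python
import json


def year_range_constraint(field, year):
    """Build an inclusive xsd:dateTime range covering one calendar year (UTC)."""
    return {
        "type": "range",
        "field": field,
        # first and last millisecond of the year, formatted as in the example above
        "lowerBound": "%04d-01-01T00:00:00.000Z" % year,
        "upperBound": "%04d-12-31T23:59:59.999Z" % year,
        "inclusive": True,
        "datatype": "xsd:dateTime",
    }


born_1946 = year_range_constraint("http://dbpedia.org/ontology/birthDate", 1946)
print(json.dumps(born_1946, indent=4))
```

Because both bounds are inclusive, the range has to end at the last millisecond of the year rather than at the start of the following one.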
The Similarity Constraint allows selecting entities similar to the passed context. It is currently only supported by the Solr-based storage of the Entityhub. It cannot be implemented on storages that use SPARQL for search.
NOTE also that only a single Similarity Constraint can be used per Field Query.
Additional keys:
context (required): the text used as context to search for similar entities. Users can pass anything from single words up to the text of the current section or a whole document.
addFields: allows passing additional fields (properties) used for the similarity search. These fields are added to the value of the "field" key.
This example combines a filter for entities of the type Place with a similarity search for "Wolfgang Amadeus Mozart". The field http://stanbol.apache.org/ontology/entityhub/query#fullText is a special field that allows searching the full text (all textual and xsd:string values) of an Entity.
{
    "type": "reference",
    "value": "http:\/\/dbpedia.org\/ontology\/Place",
    "field": "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"
},
{
    "type": "similarity",
    "context": "Wolfgang Amadeus Mozart",
    "field": "http:\/\/stanbol.apache.org\/ontology\/entityhub\/query#fullText"
}
Expected results with the default DBpedia dataset include Salzburg. However, because the default dataset only includes the short rdfs:comment texts, the results of similarity searches are very limited. Typically the use of similarity searches already needs to be considered when indexing data sets.
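Two limits stated in this document apply to a query as a whole: a single constraint per field and a single Similarity Constraint per Field Query. A client-side check could look like the following Python sketch; `check_query_limits` is a hypothetical helper, and the server may report such problems differently.

```python
from collections import Counter


def check_query_limits(constraints):
    """Return a list of violations of the per-query limits; empty if none."""
    problems = []
    similarity = [c for c in constraints if c.get("type") == "similarity"]
    if len(similarity) > 1:
        problems.append("more than one similarity constraint")
    per_field = Counter(c.get("field") for c in constraints)
    for field, count in per_field.items():
        if count > 1:
            problems.append("%d constraints on field %s" % (count, field))
    return problems


mozart_places = [
    {"type": "reference",
     "value": "http://dbpedia.org/ontology/Place",
     "field": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},
    {"type": "similarity",
     "context": "Wolfgang Amadeus Mozart",
     "field": "http://stanbol.apache.org/ontology/entityhub/query#fullText"},
]
print(check_query_limits(mozart_places))
```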
Currently the following special fields are defined:
http://stanbol.apache.org/ontology/entityhub/query#fullText: allows searching within all natural language and xsd:string values that are linked with the Entity. This field is especially useful for Text Constraint and Similarity Constraint searches.
http://stanbol.apache.org/ontology/entityhub/query#references: allows searching for all entities referenced by this Entity. This includes other entities and xsd:anyURI values (e.g. foaf:homepage values). Because of this, Reference Constraints applied to this field are queries for the semantic context of an Entity.