How to insert data using a parametrized SPARQL query?

I have this insert query:
INSERT
{
<http://www.google.com/go/guest> <http://www.google.com/go/hasRelatives> ?state}
WHERE {?state a <http://www.google.com/go#State>.
filter ($state=<http://www.google.com/go/State-USA>)
}
If the state is equal to <http://www.google.com/go/State-USA>, I need to insert all the states of type <http://www.google.com/go#State>, which is exactly what the SPARQL insert query above does at the moment.
If not, I need to insert only the specified state, for example: <http://www.google.com/go/State-Alabama>
as in the query below:
INSERT { <http://www.google.com/go/guest> <http://www.google.com/go/hasRelatives> $state }
WHERE {?state a <http://www.google.com/go#State>.
filter ($state!=<http://www.google.com/go/State-USA>)
}
How could I write an if-else statement inside the insert that checks the value of ?state and then runs the matching insert?
How could I combine the two queries into only one, with the proper conditions?
The triples:
@prefix : <http://www.google.com/go#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://www.google.com/go/State-USA> a :State ;
rdfs:label "USA" .
<http://www.google.com/go/State-Michigan> a :State ;
rdfs:label "Michigan" .
<http://www.google.com/go/State-NewYork> a :State ;
rdfs:label "New York" .
<http://www.google.com/go/State-Alabama> a :State ;
rdfs:label "Alabama" .
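One way the two updates might be merged is to move the "if" into a single FILTER: when the parameter is the State-USA IRI the first half of the condition is true for every state, otherwise only the state equal to the parameter passes. The Python sketch below only builds the update string. It assumes the $state parameter is substituted on the client side before the update is sent; build_insert and the constant names are made up for illustration and are not part of any library.

```python
import re

GUEST = "http://www.google.com/go/guest"
HAS_RELATIVES = "http://www.google.com/go/hasRelatives"
STATE_CLASS = "http://www.google.com/go#State"
ALL_STATES = "http://www.google.com/go/State-USA"  # sentinel meaning "every state"

# Characters that may not appear inside a SPARQL IRI written as <...>.
_ILLEGAL_IRI_CHARS = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def build_insert(state_iri):
    """Return one INSERT that covers both the 'all states' and 'one state' case."""
    if _ILLEGAL_IRI_CHARS.search(state_iri):
        raise ValueError("not a valid IRI: %r" % state_iri)
    return (
        "INSERT {{ <{guest}> <{rel}> ?state }}\n"
        "WHERE {{\n"
        "  ?state a <{cls}> .\n"
        "  # Sentinel given: keep every state. Otherwise keep only the given one.\n"
        "  FILTER ( <{param}> = <{all}> || ?state = <{param}> )\n"
        "}}"
    ).format(guest=GUEST, rel=HAS_RELATIVES, cls=STATE_CLASS,
             all=ALL_STATES, param=state_iri)
```

Calling build_insert("http://www.google.com/go/State-Alabama") yields an update whose FILTER keeps only the Alabama resource, while passing the State-USA IRI keeps every resource typed as State, including State-USA itself. The IRI check is there because the value is pasted into the query text.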

Related

Regexextract of importdata from website GoogleSheets

The purpose is to extract the title and tags from a webpage.
I'm using IMPORTDATA and I want to have the results all in one row, like this:
[webpage] [title] [1st tag] [2nd tag] [3rd tag] [4th tag] ... [last tag]
I am stuck halfway through the process in Google Sheets.
First tab, Extracted: I've extracted the necessary lines from the big data.
=query({array_constrain(IMPORTDATA(A1),6375,10)},"WHERE (Col1 CONTAINS 'btn btn-secondary' AND Col1 CONTAINS 'href') or (Col1 CONTAINS 'meta property' AND Col1 CONTAINS 'og:title')")
Second tab, with REGEXEXTRACT: I extracted the text I need, but it only works for the first line (only the tags are extracted; the title is still missing because it spreads across a few columns...)
=REGEXEXTRACT(query({array_constrain(IMPORTDATA(A1),6375,10)},"WHERE (Col1 CONTAINS 'btn btn-secondary' AND Col1 CONTAINS 'href')"),"\>(.+)\
I don't know how to go further :( Any help is appreciated!
=ARRAYFORMULA({REGEXREPLACE(TEXTJOIN(", ",1,
QUERY(ARRAY_CONSTRAIN(SUBSTITUTE(IMPORTDATA(A2),"""",""),1000,15),
"where Col1 contains '<meta property=og:title content='")),
"<meta property=og:title content=| />",""),
TRANSPOSE(REGEXEXTRACT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
ARRAY_CONSTRAIN(SUBSTITUTE(IMPORTDATA(A2),"""",""),8000,3)),,50000)),
"where Col1 contains '<a class=btn btn-secondary'"),"\>(.*)+\<"))})
demo spreadsheet

Python code with SPARQL not working

I'm writing Python code to match the list of actors between DBpedia and Wikidata. First I retrieve the list of actors, with some additional information such as birth date and birth place, from DBpedia using SPARQL. Then, using the same list of actors retrieved from DBpedia, I try to retrieve some additional information from Wikidata, such as awards received. My Python code is throwing an error.
I have a hunch that the dbpedia portion of the query is timing out within wikidata. Skipping the federated binding and adding a limit, the query goes to completion, but takes several seconds. Un-comment the triple about the awards, and it times out.
Since there are problems with the SPARQL, I'm going to ignore the Python processing for now.
Independent of that, I found two glitches:
# missing prefixes
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT *
WHERE {
SERVICE <http://dbpedia.org/sparql> {
?c rdf:type <http://umbel.org/umbel/rc/Actor> ;
rdfs:label ?Actor
FILTER ( lang(?Actor) = "en" )
?c dbo:deathDate ?Death_date ;
dbo:birthPlace ?b
# date filtering not working... add cast
FILTER ( xsd:date(?Death_date) >= "1990-01-01"^^xsd:date )
?b rdfs:label ?birth_Place
FILTER ( lang(?birth_Place) = "en" )
?Starring rdf:type dbo:Film ;
dbo:starring ?c .
?c dbo:deathCause ?d .
?d dbp:name ?Cause_Of_Death .
?c owl:sameAs ?wikidata_actor
FILTER strstarts(str(?wikidata_actor), "http://www.wikidata.org")
}
# ?wikidata_actor wdt:P166 ?award_received.
}
LIMIT 9
Every SPARQL endpoint has its own unique personality. So in my opinion, federated queries (which use the service keyword and hit two or more endpoints) can be especially tricky. In case you're new to federation, here's an unrelated query that works.
There is some entity that tweets under the name 'darwilliamstour'. What is the name of that entity?
select *
where
{
?twitterer wdt:P2002 'darwilliamstour' .
service <http://dbpedia.org/sparql>
{
?twitterer rdfs:label ?name
}
}
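Since the original question was about running this from Python, here is a minimal sketch of sending a query to the Wikidata endpoint with only the standard library and an explicit timeout, so a slow federated call fails visibly instead of hanging. The helper names build_request and run_query and the User-Agent string are placeholders invented for this example; the endpoint URL is the public Wikidata Query Service.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query, endpoint=WIKIDATA_ENDPOINT):
    """Build a GET request that asks the endpoint for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Placeholder value; replace with something that identifies you.
            "User-Agent": "sparql-example/0.1 (hypothetical contact)",
        },
    )


def run_query(query, timeout=60):
    """Send the query and return the list of result bindings."""
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return json.load(resp)["results"]["bindings"]
```

With this in place, run_query(query_text) would return one dictionary per result row, and a timeout surfaces as an exception from urlopen, which makes it easier to tell a SPARQL problem from a Python one.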

Regex to match starts with

I need to update a table, setting the attribute MATCH to True where attribute_a STARTS with the value of attribute_b.
Somehow I can't get the correct syntax in PostgreSQL to do this pattern match.
UPDATE table
SET match= True
WHERE attribute_a ~ '^attribute_b' ;
E.g. MATCH should be True for attribute_a = 'Nelson Mandela' and attribute_b = 'Nelson'.
You do not need pattern matching; use left(), e.g.:
with my_table(attribute_a, attribute_b) as (
values
('Nelson Mandela', 'Nelson'),
('Donald Trump', 'Donald Duck'),
('John Major', 'John M')
)
select *
from my_table
where attribute_b = left(attribute_a, length(attribute_b));
attribute_a | attribute_b
----------------+-------------
Nelson Mandela | Nelson
John Major | John M
(2 rows)
If you absolutely want to use regex, you have to build the pattern with concat() or format(), like this:
select *
from my_table
where attribute_a ~ concat('^', attribute_b)
-- where attribute_a ~ format('^%s', attribute_b)
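One difference between the two approaches is worth spelling out: the regex version treats the value of attribute_b as a pattern, so a value containing a metacharacter such as "." can match rows that a plain prefix comparison would reject. The following is a Python analogue of that behaviour, not PostgreSQL code; the function names are invented for the illustration.

```python
import re


def starts_with(attribute_a, attribute_b):
    """Plain prefix test, the counterpart of left(a, length(b)) = b."""
    return attribute_a.startswith(attribute_b)


def starts_with_naive_regex(attribute_a, attribute_b):
    """Unescaped regex test, analogous to concat('^', attribute_b)."""
    return re.match("^" + attribute_b, attribute_a) is not None


def starts_with_escaped_regex(attribute_a, attribute_b):
    """Regex test with the value escaped so it is matched literally."""
    return re.match(re.escape(attribute_b), attribute_a) is not None


# "A.B" is meant literally, but "." matches any character in a regex.
print(starts_with("AxB Corp", "A.B"))                # False
print(starts_with_naive_regex("AxB Corp", "A.B"))    # True
print(starts_with_escaped_regex("AxB Corp", "A.B"))  # False
```

If the values in attribute_b are free text, the plain prefix comparison avoids this class of surprise.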

dc:Creator string literal vs. regex FILTER in SPARQL

I am using Europeana's Virtuoso SPARQL Endpoint.
I have been trying to search in SPARQL for content about a specific contributor. To my understanding, this could be carried out this way:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title
WHERE {
?objectInfo dc:title ?title .
?objectInfo dc:creator 'Picasso' .
}
Nevertheless, I get nothing in return.
Alternatively, I used FILTER regex to search for the literal.
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title ?creator
WHERE {
?objectInfo dc:title ?title .
?objectInfo dc:creator ?creator .
FILTER regex(?creator, 'Picasso')
}
This actually worked very well and returned correctly the results.
My question is: Is it possible to produce the SPARQL query without using FILTER to search the work of a particular artist?
Many thanks.
I don't think there are any objects with 'Picasso' literally as the creator. So a regex filter is a good choice, but slow.
Here's a way to find the strings your regex is matching:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?creator (count(?creator) as ?ccount)
WHERE {
?objectInfo dc:title ?title .
?objectInfo dc:creator ?creator .
FILTER regex(?creator, 'Picasso')
}
group by ?creator
order by ?ccount
It might have been easier for you to see that if you had displayed all variables in the select statement:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT *
WHERE {
?objectInfo dc:title ?title .
?objectInfo dc:creator ?creator .
FILTER regex(?creator, 'Picasso')
}
If you don't want to use a regex filter, you could enumerate all of the Picasso variants you are looking for:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT *
WHERE {
values ?creator { "Picasso, Pablo" "Pablo Picasso" } .
?objectInfo dc:title ?title .
?objectInfo dc:creator ?creator
}
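If the list of variants comes from a program rather than being typed by hand, the VALUES line can be generated. This is a small Python sketch under the assumption that the variants are plain strings without language tags; sparql_string and values_clause are hypothetical helper names.

```python
def sparql_string(text):
    """Quote a Python string as a SPARQL double-quoted literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def values_clause(variable, literals):
    """Build a VALUES clause binding ?variable to each literal in turn."""
    body = " ".join(sparql_string(item) for item in literals)
    return "VALUES ?{} {{ {} }}".format(variable, body)


print(values_clause("creator", ["Picasso, Pablo", "Pablo Picasso"]))
# VALUES ?creator { "Picasso, Pablo" "Pablo Picasso" }
```

Each listed string still has to equal the stored literal exactly, so a variant that carries a language tag in the data would not be matched by this plain-string form.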
bif:contains works on this endpoint and is pretty fast:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT *
WHERE {
?objectInfo dc:title ?title .
?objectInfo dc:creator ?creator .
?creator bif:contains 'Picasso'
#FILTER regex(?creator, 'Picasso')
}
1) Your first query has unconnected triple patterns.
2) Judging by the vocabulary description, I guess dc:creator expects a resource, i.e. a URI. Does using the URI of the entity Picasso not work?
+--------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| Term Name: creator | |
| URI: | http://purl.org/dc/elements/1.1/creator |
| Label: | Creator |
| Definition: | An entity primarily responsible for making the resource. |
| Comment: | Examples of a Creator include a person, an organization, or a service. Typically, the name of a Creator should be used to indicate the entity. |
+--------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
It would be good to see your data in order to decide whether FILTER on literals is necessary or not.

retrieving the class name of a specific subclass in owl

I am an rdflib beginner. I have an ontology with classes and sub-classes, and I need to look for a specific word in a subclass and, if it is found, return its class name.
I have the following code:
import rdflib
from rdflib import plugin
from rdflib.graph import Graph
g = Graph()
g.parse("test.owl")
from rdflib.namespace import Namespace
plugin.register(
'sparql', rdflib.query.Processor,
'rdfextras.sparql.processor', 'Processor')
plugin.register(
'sparql', rdflib.query.Result,
'rdfextras.sparql.query', 'SPARQLQueryResult')
qres = g.query("""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?subject ?object
WHERE { ?subject rdfs:subClassOf ?object }
""")
# n is a subclass name and its class name is good-behaviour, which I want to be the result
n="pity"
for (subj,pred,obj) in qres:
    if n in subj:
        print obj
    else:
        print "not found"
When I print the result of qres it returns a complete URL, and I need the name only of the sub-class and the class.
Can anyone help with this.
You can get your answer with RDFLib alone, without SPARQL, plus some Python string manipulation. If you prefer to use SPARQL, the Joshua Taylor answer to this question would be the way to go. You also don't need the SPARQL processor plugin with recent versions (4+) of RDFLib - see the "Querying with SPARQL" documentation.
To get the answer you are looking for you can use the RDFLib Graph method subject_objects to get a generator of subjects and objects with the predicate you are interested in, rdfs:subClassOf. Each subject and object will be an RDFLib URIRef, which is also a Python unicode object that can be manipulated using standard Python string methods. To get the suffix of the IRI, call the rsplit method of the object with '#' as the separator and take the last item in the returned list.
Here is your code reworked to do as described. Without the data, I can't fully test it but this did work for me when using a different ontology.
from rdflib import Graph
from rdflib.namespace import RDFS
g = Graph()
g.parse("test.owl")
# n is a subclass name and its class name is good-behaviour
# which i want to be the result
n = "pity"
for subj, obj in g.subject_objects(predicate=RDFS.subClassOf):
    if n in subj:
        # keep only the part of the IRI after the '#'
        print(obj.rsplit('#')[-1])
    else:
        print('not found')
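Splitting on '#' only works for ontologies whose IRIs use a hash namespace. As a rough sketch for the more general case, the helper below (local_name is a made-up name, not an RDFLib function) falls back to the last '/' when there is no '#'. It works on any string, so it accepts a URIRef as well.

```python
def local_name(iri):
    """Return the part of an IRI after the last '#', or else the last '/'."""
    text = str(iri)
    for separator in ("#", "/"):
        _, found, tail = text.rpartition(separator)
        if found and tail:
            return tail
    # No usable separator: hand back the input unchanged.
    return text


print(local_name("http://www.semanticweb.org/raya/ontologies/test6#Good-behaviour"))
# Good-behaviour
```

In the loop above this could stand in for obj.rsplit('#')[-1], i.e. print(local_name(obj)).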
You haven't shown your data, so I can't use your exact query or data, but based on your comments, it sounds like you're getting IRIs (e.g., http://www.semanticweb.org/raya/ontologies/test6#Good-behaviour) as results, and you want just the string Good-behaviour. You can use strafter to do that. For instance, if you had data like this:
@prefix : <http://stackoverflow.com/questions/20830056/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:retrieving-the-class-name-of-a-specific-subclass-in-owl
rdfs:label "retrieving the class name of a specific subclass in owl"@en .
Then a query like this will return results that have full IRIs:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?question where {
?question rdfs:label ?label .
}
---------------------------------------------------------------------------------------------------------
| question |
=========================================================================================================
| <http://stackoverflow.com/questions/20830056/retrieving-the-class-name-of-a-specific-subclass-in-owl> |
---------------------------------------------------------------------------------------------------------
You can use strafter to get the part of a string after some other string. E.g.,
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?q where {
?question rdfs:label ?label .
bind(strafter(str(?question),"http://stackoverflow.com/questions/20830056/") as ?q)
}
-------------------------------------------------------------
| q |
=============================================================
| "retrieving-the-class-name-of-a-specific-subclass-in-owl" |
-------------------------------------------------------------
If you define the prefix in the query, e.g., as so:, then you can also use str(so:) instead of the string form. If you prefer, you can also do the string manipulation in the variable list rather than in the graph pattern. That would look like this:
prefix so: <http://stackoverflow.com/questions/20830056/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select (strafter(str(?question),str(so:)) as ?q) where {
?question rdfs:label ?label .
}
-------------------------------------------------------------
| q |
=============================================================
| "retrieving-the-class-name-of-a-specific-subclass-in-owl" |
-------------------------------------------------------------