I am new to SPARQL. How can I filter for the cases where a value differs from every value in a set collected from different objects?
I want to use the query as part of a SHACL-SPARQL constraint.
I have no problem accessing the value I want to check, but the check is then done against each single value rather than against the whole list/series of values.
Example:
My variable ?value is 6.
I want to check whether ?value (6) is not equal to any of the values of ?obj.
?obj has a single value in each triple (different subjects), e.g. 1 in one case, 2 in another, 3 in another, ...
If I do FILTER (?value != ?obj), I get one result for every case where 6 != 1 and so on.
I want to be able to do ?value NOT IN (?obj), where ?obj is the list 1, 2, 3, ... I assume that in that case I would get just one result, saying that 6 is not found in the list.
So, two questions:
Is it possible to construct a list from ?obj as part of the query, so that I could eventually use NOT IN?
Is there another way to solve this problem?
Thanks in advance.
Thank you very much for the answer. I found a solution myself, but in case more efficient solutions are available I would be glad to know them.
SELECT $this ?value (COUNT(?type) AS ?types) ?typesall
WHERE {
  {
    # Count all rdf:type statements on the nodes attached to $this
    SELECT $this (COUNT(?typeall) AS ?typesall)
    WHERE {
      ?nscpointsa ex:class.assend $this .
      ?nscpointsa rdf:type ?typeall .
    }
    GROUP BY $this
  }
  $this $PATH ?value .
  OPTIONAL { $this ex:class2.control ?contr } .
  $this ex:class3.control1 ?control1 .
  ?nscpoints ex:class.assend $this .
  ?nscpoints rdf:type ?type .
  ?nscpoints ex:class.attr1 ?attr1 .
  # Keep only the attached nodes whose attr1 differs from ?value
  FILTER (bound(?contr) && ?control1 = true && ?value != ?attr1) .
}
GROUP BY $this ?value ?typesall
# Report $this only if every attached node passed the filter
HAVING (COUNT(?type) = ?typesall)
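The counting idea in the query above (the number of values that pass the != test must equal the total number of values) can be pictured outside SPARQL. The following is only a minimal Python sketch of that logic, not a SHACL-SPARQL implementation; the function name and the sample list are made up for illustration.

```python
def differs_from_all(value, attrs):
    """Counting formulation: every attr must pass the `!=` test."""
    passing = sum(1 for a in attrs if a != value)
    return passing == len(attrs)


# Hypothetical values standing in for the ?attr1 bindings
attrs = [1, 2, 3]
print(differs_from_all(6, attrs))  # equivalent to: 6 not in attrs
print(differs_from_all(2, attrs))  # equivalent to: 2 not in attrs
```

In SPARQL itself, the same "no matching value exists" test is commonly written with FILTER NOT EXISTS, which avoids the two counts.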
I want to apply the following transformation to a set of data in my Google Sheets spreadsheet:
6 views -> 6
73K views -> 73000
3650 -> 3650
163K views -> 163000
1.2K views -> 1200
52.5K -> 52500
All the data is in one column, and depending on the case I need to apply a specific transformation.
I tried to put all the regexes into one formula but failed: there was always a case that clashed between two of the regular expressions.
In the end I applied the regexes case by case in separate columns. It works fine, but I feel it could slow down the sheet, since I expect a lot of data to come into it.
Here is the sheet: spreadsheet
Thank you for your help!
Use regexreplace(), like this:
=arrayformula(
  iferror( 1 /
    value(
      regexreplace(
        regexreplace(trim(A2:A), "\s*K", "e3"),
        " views", ""
      )
    )
  ^ -1 )
)
See your sample spreadsheet.
Remove " views" with a regex: / views/gi
To handle "K" with or without a decimal part, capture the number in front of the K and multiply it by 1000 in a callback function:
txt.replace(/ views/gi, '').replace(/(\d*\.?\d+)K/gi, (match, num) => Math.round(num * 1000))
Code:
const arr = [`6 views`,
  `73K views`,
  `3650`,
  `163K views`,
  `1.2K views`,
  `52.5K`];
arr.forEach(txt => {
  // Strip the " views" suffix, then expand "<number>K" to number * 1000
  console.log(txt.replace(/ views/gi, '').replace(/(\d*\.?\d+)K/gi, (match, num) => Math.round(num * 1000)))
})
Output:
6
73000
3650
163000
1200
52500
Say your inputs are in column A. Empty cells allowed. In any other column,
=arrayformula(if(A2:A<>"",value(substitute(substitute(A2:A," views",""),"K","e3")),))
works.
Adjust the range A2:A as needed.
Also note that cells containing only an empty string are ignored.
Basically, since the regex engine in Google Sheets (RE2) doesn't support lookaround, it is more efficient to take advantage of the rather strict patterns in your data and use substitute() instead.
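For readers who want to check the substitute() idea outside of Sheets, here is a minimal Python sketch of the same two plain-text substitutions. The function name parse_views is made up for illustration, and the sketch assumes the inputs follow the strict patterns shown in the question.

```python
def parse_views(text: str) -> float:
    """Convert strings such as '73K views' or '1.2K' to a number.

    Mirrors the formula: drop ' views', turn 'K' into the exponent 'e3',
    then let the number parser do the multiplication.
    """
    cleaned = text.strip().replace(" views", "").replace("K", "e3")
    return float(cleaned)


samples = ["6 views", "73K views", "3650", "163K views", "1.2K views", "52.5K"]
results = [parse_views(s) for s in samples]
print(results)
```

Rewriting "K" as "e3" avoids any branching on whether a suffix is present, which is why no regular expression is needed.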
I hope you are doing well.
Here is the basic structure of my graph database. Components have estimation methods, estimation methods have parameters and parameters have data sources.
c -> em -> p -> ds
Where,
c stands for components
em stands for estimation methods
p stands for parameters
ds stands for data sources
I am able to query individuals in the structured format like this:
SELECT ?c ?em ?p ?ds WHERE {
  ?c wb:hasEstimationMethod ?em .
  OPTIONAL {
    ?em wb:hasParameter ?p .
    OPTIONAL {
      ?p wb:hasDataSource ?ds .
    }
  }
}
I use the OPTIONAL clause because an estimation method might not have any parameters and, similarly, a parameter might not have any data sources.
However, there are a few cases where, for example, the estimation method is unknown but we know the parameter. In that case the component is linked directly to its parameters, and I would like the estimation method to be left blank. Here is the output I would like to have:
c           | em                  | p           | ds
component-1 | estimation method-1 | parameter-1 | data source-1
component-2 |                     | parameter-2 | data source-2
component-3 |                     | parameter-3 |
Notice that the last two rows have missing information, which is what I want in my output in those cases. In other words, I want to skip a step in the hierarchical structure.
So my question is: how can I first query ?c wb:hasEstimationMethod ?em and, if it has no value, tell SPARQL to query ?c wb:hasParameter ?p instead, and similarly, if that has no value either, query ?c wb:hasDataSource ?ds?
Any help will be greatly appreciated! Please let me know if I am not using the right terminology. Have a wonderful day :)
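To make the desired fallback concrete, here is a small Python sketch of the "skip a missing level" logic over an in-memory stand-in for the graph. It is not SPARQL, and the dictionary layout, the helper names and the sample data are all hypothetical. It assumes each level is looked up on the nearest level that is known.

```python
# Hypothetical in-memory graph: predicate -> {subject: [objects]}
graph = {
    "hasEstimationMethod": {"component-1": ["estimation method-1"]},
    "hasParameter": {
        "estimation method-1": ["parameter-1"],
        "component-2": ["parameter-2"],
        "component-3": ["parameter-3"],
    },
    "hasDataSource": {
        "parameter-1": ["data source-1"],
        "parameter-2": ["data source-2"],
    },
}


def objects(predicate, subject):
    """Return all objects of (subject, predicate), or an empty list."""
    return graph[predicate].get(subject, [])


def rows_for(component):
    """Yield (c, em, p, ds) rows, leaving skipped levels as None."""
    # Use the estimation methods if present, otherwise skip that level.
    methods = objects("hasEstimationMethod", component) or [None]
    for em in methods:
        # Parameters hang off the method, or directly off the component.
        owner = em if em is not None else component
        params = objects("hasParameter", owner) or [None]
        for p in params:
            # Data sources hang off the parameter, or off the nearest known level.
            parent = p if p is not None else owner
            sources = objects("hasDataSource", parent) or [None]
            for ds in sources:
                yield (component, em, p, ds)


components = ["component-1", "component-2", "component-3"]
rows = [row for c in components for row in rows_for(c)]
for row in rows:
    print(row)
```

The None entries correspond to the blank cells in the desired output.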
I have a variable $yearMonth := "2015-02"
I have to search for this date in a Date element of type xs:dateTime.
I want to use a regular expression to find all files/documents having a date that matches "2015-02-??".
I have a path range index enabled on ModifiedInfo/Date.
I am using the following code but get an "Invalid cast" error:
let $result := cts:value-match(cts:path-reference("ModifiedInfo/Date"), xs:dateTime("2015-02-??T??:??:??.????"))
I have also tried the following code and get the same error:
let $result := cts:value-match(cts:path-reference("ModifiedInfo/Date"), xs:dateTime(xs:date("2015-02-??"),xs:time("??:??:??.????")))
Kindly help :)
It seems you are trying to use a wildcard search on a path range index of data type xs:dateTime.
Currently, MarkLogic doesn't support this. There are multiple ways to handle this scenario:
1. You may create a field index.
2. You may change it to a string index, which supports wildcard search.
3. You may run this workaround with your existing setup:
for $x in cts:values(cts:path-reference("ModifiedInfo/Date"))
return if (starts-with(xs:string($x), '2015-02')) then $x else ()
This query fetches the values from the lexicon, and you can then filter for your desired date.
You can solve this by combining a couple of range queries inside an and-query. The example uses cts:path-range-query, since your index is a path range index on ModifiedInfo/Date:
let $target := "2015-02"
let $low := xs:dateTime($target || "-01T00:00:00")
let $high := $low + xs:yearMonthDuration("P1M")
return
  cts:search(
    fn:doc(),
    cts:and-query((
      cts:path-range-query("ModifiedInfo/Date", ">=", $low),
      cts:path-range-query("ModifiedInfo/Date", "<", $high)
    ))
  )
From the cts:element-range-query documentation:
If you want to constrain on a range of values, you can combine multiple cts:element-range-query constructors together with cts:and-query or any of the other composable cts:query constructors, as in the last part of the example below.
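The range approach boils down to a half-open interval: the first instant of the month is included and the first instant of the next month is excluded. Here is a minimal Python sketch of that boundary arithmetic, independent of MarkLogic. The function name and the sample timestamps are made up for illustration.

```python
from datetime import datetime


def month_bounds(year_month):
    """Return (low, high) so that low <= d < high covers the month 'YYYY-MM'."""
    year, month = map(int, year_month.split("-"))
    low = datetime(year, month, 1)
    # Roll over to January of the next year after December.
    high = datetime(year + month // 12, month % 12 + 1, 1)
    return low, high


# Hypothetical sample values standing in for the indexed Date elements
dates = [
    datetime(2015, 1, 31, 23, 59, 59),
    datetime(2015, 2, 1, 0, 0, 0),
    datetime(2015, 2, 28, 12, 30, 0),
    datetime(2015, 3, 1, 0, 0, 0),
]
low, high = month_bounds("2015-02")
matches = [d for d in dates if low <= d < high]
print(matches)
```

Using ">=" for the lower bound and "<" for the upper bound means no value can fall in two adjacent months, and no knowledge of the month's length is needed.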
You could also consider calling cts:values with a cts:query parameter that searches for values between, for instance, 2015-02-01 and 2015-03-01. Bear in mind, though, that if multiple dates occur within one document, you will still need to post-filter manually (as in option 3 of Navin's answer), but it could potentially speed up the post-filtering a lot.
HTH!
I want to filter my SPARQL query for specific keywords while at the same time excluding other keywords. I thought this could easily be accomplished with FILTER (regex(str(?var),"includedKeyword","i") && !regex(str(?var),"excludedKeyword","i")). It works without the "!" condition, but not with it. I also tried separate FILTER statements, to no avail.
I used this query on http://europeana.ontotext.com/ :
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
SELECT DISTINCT ?CHO
WHERE {
  ?proxy dc:subject ?subject .
  FILTER ( regex(str(?subject),"gemälde","i") && !regex(str(?subject),"Fotografie","i") )
  ?proxy edm:type "IMAGE" .
  ?proxy ore:proxyFor ?CHO .
  ?agg edm:aggregatedCHO ?CHO ; edm:country "germany" .
}
But I always get the result on the first row with the title "Gemäldegalerie", which has a dc:subject of "Fotografie" (the one I want excluded). I think the problem lies in the fact that one object from the Europeana database can have more than one dc:subject property, so maybe it looks only for one of these properties while ignoring the other ones.
Any ideas? Would be very thankful!
The problem is that your combined filter checks both conditions against the same binding of ?subject. So it succeeds if at least one value of ?subject satisfies both conditions (which is almost always true, because the string "Gemäldegalerie", for example, matches your first regex and does not match the second).
So for the negative condition, you need to formulate something that checks all possible values, rather than just one particular value. You can do this using SPARQL's NOT EXISTS filter, for example like this:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
SELECT DISTINCT ?CHO
WHERE {
  ?proxy edm:type "IMAGE" .
  ?proxy ore:proxyFor ?CHO .
  ?agg edm:aggregatedCHO ?CHO ; edm:country "germany" .
  ?proxy dc:subject ?subject .
  FILTER(regex(str(?subject),"gemälde","i"))
  FILTER NOT EXISTS {
    ?proxy dc:subject ?otherSubject .
    FILTER(regex(str(?otherSubject),"Fotografie","i"))
  }
}
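The difference between the two filters can be reproduced on a toy data set. The following Python sketch only mimics the semantics (one test per binding versus a test over all values of a proxy). It does not query Europeana, and the proxy names and subject lists are invented for illustration.

```python
import re

# Hypothetical data: each proxy with all of its dc:subject values
subjects_by_proxy = {
    "proxy-1": ["Gemäldegalerie", "Fotografie"],
    "proxy-2": ["Gemälde"],
    "proxy-3": ["Fotografie"],
}


def matches(pattern, text):
    """Case-insensitive regex test, like regex(str(?x), pattern, "i")."""
    return re.search(pattern, text, re.IGNORECASE) is not None


# Combined filter: each subject value is tested on its own, so one
# "good" value is enough to let the proxy through.
per_binding = sorted(
    proxy
    for proxy, subjects in subjects_by_proxy.items()
    if any(matches("gemälde", s) and not matches("Fotografie", s) for s in subjects)
)

# NOT EXISTS style: the proxy is dropped if any of its subjects
# matches the excluded keyword.
not_exists = sorted(
    proxy
    for proxy, subjects in subjects_by_proxy.items()
    if any(matches("gemälde", s) for s in subjects)
    and not any(matches("Fotografie", s) for s in subjects)
)

print(per_binding)
print(not_exists)
```

Only the second formulation drops the proxy that carries both a matching and an excluded subject.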
As an aside: since you are doing regular expression checks, and now combining them with a NOT EXISTS operator, this is likely to become very expensive for the query processor quite quickly. You may want to think about alternative ways to formulate your query (for example, using the exact subject string to include or exclude, which eliminates the regex), or even have a look at some non-standard extensions that the SPARQL endpoint might provide (OWLIM, for example, the store on which the Europeana endpoint runs, supports various full-text-search extensions, though I am not sure they are enabled in the Europeana endpoint).
I am trying to compare two string variables to find out whether one is contained in the other, specifically whether one is composed of the other as a whole word (so I would like to avoid retrieving that "information" contains "format"). I am interested only in results such as "information_management" includes "information".
I have tried both FILTER CONTAINS() and FILTER regex(), with the same results. How can I modify the query so that it requires a space either before or after the term?
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?l1 ?l2
WHERE
{
  ?term1 skos:prefLabel ?l1 .
  ?term2 skos:prefLabel ?l2 .
  FILTER(contains(?l1, ?l2))
}
So, if I understand you correctly, you want to find pairs of terms where one term is contained in the other but is not equal to it?
If so, you can add a !SAMETERM() call to the FILTER clause like so:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?l1 ?l2
WHERE
{
  ?term1 skos:prefLabel ?l1 .
  ?term2 skos:prefLabel ?l2 .
  FILTER(!SAMETERM(?l1, ?l2) && contains(?l1, ?l2))
}
Edit
Re-reading the question, I don't think I addressed all of it. For the case where you have the terms "format" and "information" and don't want them to be matched, you can do something like the following:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?l1 ?l2
WHERE
{
  ?term1 skos:prefLabel ?l1 .
  ?term2 skos:prefLabel ?l2 .
  FILTER(!SAMETERM(?l1, ?l2)
         && contains(?l1, ?l2)
         && ( STRENDS(STRBEFORE(?l1, ?l2), " ")
              || STRSTARTS(STRAFTER(?l1, ?l2), " ")
            ))
}
This requires that the text before the contained term ends with a space, or that the text after it starts with one. You may have to play around with this to get something that more closely models your constraints.
Another solution would be to construct a regex pattern on the fly, with the label to search in as the first argument and the pattern as the second, like:
FILTER(regex(?l1, concat("\\b", ?l2, "\\b")))
I'm not entirely sure that SPARQL/XML Schema supports \b, but I think most implementations will have it.
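The whole-term idea can be tried out in Python before tuning the SPARQL filter. This is a minimal sketch under one assumption of mine: a term counts as "whole" when it is not directly adjacent to a letter or digit, so both a space and an underscore act as separators. The function name is made up for illustration.

```python
import re


def contains_whole_term(label: str, term: str) -> bool:
    """True if `term` occurs in `label` without a letter or digit next to it."""
    if label == term:
        return False  # mirrors the !SAMETERM check: skip identical labels
    # re.escape keeps characters such as '.' or '(' in the term literal.
    pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
    return re.search(pattern, label) is not None


print(contains_whole_term("information management", "information"))
print(contains_whole_term("information_management", "information"))
print(contains_whole_term("information", "format"))
```

Explicit lookarounds are used instead of \b here because \b treats the underscore as a word character, so it would not see a boundary in "information_management".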