I am trying to perform a relatively simple regex query, but it uses a variable inside the regular expression. Is SPARQL not capable of this kind of concatenation, or am I using the wrong method? This is what I am trying to query:
SELECT *
WHERE {
?part local:part_start ?start .
?chunk local:long_region ?long_region
BIND(REPLACE(?long_region, ".{"+?start+"}(.{10}).*", "$1") AS ?regionX)
}
The result should be a small part of a long region of characters: the 10 characters that follow the start location.
No, + cannot be used for string concatenation in most SPARQL implementations. In principle an implementation can support this as an extended operator mapping, but I'm not aware of any that do.
Instead you can use the standard CONCAT() function to achieve your aim provided you are using a SPARQL 1.1 compliant engine:
SELECT *
WHERE {
?part local:part_start ?start .
?chunk local:long_region ?long_region
BIND(REPLACE(?long_region, CONCAT(".{", ?start, "}(.{10}).*"), "$1") AS ?regionX)
}
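To see what the concatenated pattern does, here is a minimal Python sketch of the same idea outside SPARQL. The sample string, the start value, and the helper name region_after are made up for illustration; re.sub stands in for REPLACE and plain string concatenation stands in for CONCAT().

```python
import re


def region_after(long_region: str, start: int, length: int = 10) -> str:
    """Skip `start` characters, then keep the next `length` characters."""
    # Same shape as CONCAT(".{", ?start, "}(.{10}).*") in the query above.
    pattern = ".{" + str(start) + "}(.{" + str(length) + "}).*"
    # If the input is shorter than start + length, nothing matches and the
    # input comes back unchanged, which mirrors how REPLACE behaves.
    return re.sub(pattern, r"\1", long_region)


sample = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # hypothetical region
print(region_after(sample, 3))
```

Note that the number has to be turned into a string before it is concatenated into the pattern; depending on the datatype of ?start, the SPARQL version may need STR(?start) for the same reason.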
According to the MongoDB documentation and the ICU documentation it should be possible to ignore full-width and half-width difference in Japanese text by utilizing collation.
I tried the following;
{ locale: "ja", caseLevel:true, strength:1}
with different strength values, but none of them works.
db.getCollection('mycollection')
.find({"desc":/バンド/})
.collation({ locale: "ja", caseLevel:true, strength:1})
This query does not return the following document:
{
"desc": "*EGRパイプバンド外れ"
}
Update
I found the reason: MongoDB does not apply collation to regex matches. If I query with an exact string match instead, the result is correct:
db.getCollection('mycollection')
.find({"desc":"*EGRパイプバンド外れ???"})
.collation({ locale: "ja", caseLevel:true, strength:1})
This query returns the *EGRパイプバンド外れ document.
The regex query does not, though. Any suggestions?
There is no way to make collation work with regex matching: the regex ignores any collation definition and applies only its own logic, which here means finding strings that contain the half-width ﾊﾞﾝﾄﾞ only.
The simplest way to achieve this is to add a step before you send the search text to your MongoDB client that produces both the half-width and the full-width form of the text. You can use some existing tool like this.
Then pass both forms to your find condition with $or:
db.mycollection.find({$or: [{"desc":/ﾊﾞﾝﾄﾞ/}, {"desc":/バンド/}]})
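The width-duplication step can be sketched in Python with only the standard library. This is an illustration under assumptions, not the tool the answer links to: the helper names are hypothetical, the reverse mapping covers only the half-width katakana block (U+FF61 to U+FF9F), and the result is a plain filter dictionary that a driver could pass to find.

```python
import re
import unicodedata

HALF_WIDTH_MARKS = ("\uFF9E", "\uFF9F")  # half-width voiced / semi-voiced marks


def _build_full_to_half() -> dict:
    """Invert NFKC over the half-width katakana block to get full -> half."""
    table = {}
    for cp in range(0xFF61, 0xFFA0):
        half = chr(cp)
        table[unicodedata.normalize("NFKC", half)] = half
        for mark in HALF_WIDTH_MARKS:
            full = unicodedata.normalize("NFKC", half + mark)
            if len(full) == 1:  # the pair composed into one voiced kana
                table[full] = half + mark
    return table


FULL_TO_HALF = _build_full_to_half()


def to_full_width(text: str) -> str:
    # NFKC maps half-width katakana to full-width katakana. It also folds
    # other compatibility characters (for example full-width Latin letters).
    return unicodedata.normalize("NFKC", text)


def to_half_width(text: str) -> str:
    return "".join(FULL_TO_HALF.get(ch, ch) for ch in text)


def width_insensitive_filter(field: str, text: str) -> dict:
    """Build an $or filter that matches either width variant of `text`."""
    full = to_full_width(text)
    half = to_half_width(full)
    variants = sorted({full, half})
    return {"$or": [{field: {"$regex": re.escape(v)}} for v in variants]}


print(width_insensitive_filter("desc", "\u30D0\u30F3\u30C9"))
```

The search text is normalized to full width first so that mixed-width input still yields exactly two variants.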
Same issue;
Use of collation in mongodb $regex
I am working in an environment without a JSON parser, so I am using regular expressions to parse some JSON. The value I'm looking to isolate may be either a string or an integer.
For instance
Entry1
{"Product_ID":455233, "Product_Name":"Entry One"}
Entry2
{"Product_ID":"455233-5", "Product_Name":"Entry One"}
I have been attempting to create a single regex pattern to extract the Product_ID whether it is a string or an integer.
I can successfully extract both results with separate patterns using look around with either (?<=Product_ID":")(.*?)(?=") or (?<=Product_ID":)(.*?)(?=,)
However, since I don't know ahead of time which one I will need, I would like a one-size-fits-all pattern.
I have tried to use [^"] in the pattern, but I just can't seem to piece it together.
I expect to receive 455233-5 and 455233, but currently I receive "455233-5" with the quotes.
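Make the quote optional on both sides of the value:
(?<="Product_ID"\s*:\s*"?)[^"]+(?="?\s*,)
Note that the lookbehind is variable-length, so this needs an engine that supports that (.NET does, for example).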
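Python's re module rejects variable-width lookbehind, so a capture-group variant is one way to get the same result there. The sketch below is an alternative formulation, not the pattern from the answer; the function name is hypothetical, and it assumes the value is either a double-quoted string without embedded quotes or a bare integer.

```python
import re

# Group 1 captures a quoted string value, group 2 a bare integer value.
PRODUCT_ID = re.compile(r'"Product_ID"\s*:\s*(?:"([^"]*)"|(-?\d+))')


def extract_product_id(text: str):
    """Return the Product_ID value without quotes, or None if it is absent."""
    match = PRODUCT_ID.search(text)
    if match is None:
        return None
    return match.group(1) if match.group(1) is not None else match.group(2)


entry1 = '{"Product_ID":455233, "Product_Name":"Entry One"}'
entry2 = '{"Product_ID":"455233-5", "Product_Name":"Entry One"}'
print(extract_product_id(entry1))  # 455233
print(extract_product_id(entry2))  # 455233-5
```

Unlike the lookahead version, this does not require a comma after the value, so it also works when Product_ID is the last field.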
I'm doing some testing with MonetDB.
The gist of the query I'm trying to perform (using borrowed syntax) goes like this:
SELECT mystring FROM mytable WHERE mystring REGEXP 'myxpression';
MonetDB does not support this syntax, but the docs claim that it supports PCRE, so this may be possible. Still, the syntax eludes me.
See "Does MonetDB support regular expression predicates?":
The implementation is there in the MonetDB backend, the module that
implements it is pcre (to be found in MonetDB5 source tree).
I'm not sure whether it is available by default from MonetDB/SQL.
If not, with these two function definitions, you link SQL functions to the
respective implementations in MonetDB5:
-- case sensitive
create function pcre_match(s string, pattern string)
returns BOOLEAN
external name pcre.match;
-- case insensitive
create function pcre_imatch(s string, pattern string)
returns BOOLEAN
external name pcre.imatch;
If you need more, I'd suggest having a look at MonetDB5/src/modules/mal/pcre.mx in the source code.
Use select name from sys.functions; to check if the function exists, otherwise you will need to create it.
As an example, you may use pcre_imatch() like this:
SELECT mystring FROM mytable WHERE pcre_imatch(mystring, 'myexpression');
I have this piece of code that gets a session ID, converts it to a string, and then creates a set in Redis with a key such as {{1401,873063,143916},<0.16443.0>}. I'm trying to replace the { characters in this session ID with the letter "a".
OldSessionID= io_lib:format("~p",[OldSession#session.sid]),
StringForOldSessionID = lists:flatten(OldSessionID),
ejabberd_redis:cmd([["SADD", StringForSessionID, StringForUserInfo]]);
I've tried this:
re:replace(N,"{","a",[global,{return,list}]).
Is this a good way of doing this? I read that using regexp in Erlang is not advised.
Your solution works, and if you are comfortable with it, you should keep it.
For my part, I prefer a list comprehension: [case X of ${ -> $a; _ -> X end || X <- StringForOldSessionID ]. (just because I don't have to check the function documentation :o)
re:replace(N,"{","a",[global,{return,list}]).
Is this a good way of doing this? I read that regexp in Erlang is not
a advised way of doing things.
According to official documentation:
2.5 Myth: Strings are slow
Actually, string handling could be slow if done improperly. In Erlang, you'll have to think a little more about how the strings are used and choose an appropriate representation and use the re module instead of the obsolete regexp module if you are going to use regular expressions.
So either use re for strings, or avoid producing the { in the first place by using pattern matching.
If, say, N is {{1401,873063,143916},<0.16443.0>}, then
{{A,B,C},Pid} = N
and you can format A, B, C, and Pid into a string.
Since Erlang/OTP 20.0 you can use the string:replace/4 function from the string module.
string:replace/4 replaces SearchPattern in String with Replacement. The fourth argument indicates whether the leading, the trailing, or all occurrences of SearchPattern are to be replaced; string:replace/3 replaces only the leading one. The result is a list of chardata, so flatten it if you need a plain string.
string:replace(Input, "{", "a", all).
I can't get case insensitive searches to work for REGEX in SQLITE. Is the syntax supported?
SELECT * FROM table WHERE name REGEXP 'smith[s]*\i'
I would expect the following answers (assuming the database has these entries):
Smith
Smiths
smith
smitH <--- Not a typo but in database
Note - This is a small part of a larger REGEX, so I won't be using LIKE
As described by CL, this feature is not supported in SQLite. A simple solution to this problem is to "lowercase" the left-hand side of the REGEXP expression using lower:
SELECT * FROM table WHERE lower(name) REGEXP 'smith[s]*';
It is not ideal, but it works. Pay attention to diacritics, though: by default, lower only folds ASCII characters, so read its documentation if your text uses them.
The REGEXP function shipped with SQLite Manager is implemented in JavaScript as follows:
var regExp = new RegExp(aValues.getString(0));
var strVal = new String(aValues.getString(1));
if (strVal.match(regExp)) return 1;
else return 0;
To get case-insensitive searches with the JavaScript RegExp object, you would not add a flag to the pattern string itself, but pass the i flag in the second parameter to the RegExp constructor. However, the binary REGEXP operator does not have a third flags parameter, and the code does not try to extract flags from the pattern, so these flags are not supported in this implementation.
From https://www.sqlite.org/lang_expr.html
If an application-defined SQL function named "regexp" is added at run-time, then the "X REGEXP Y" operator will be implemented as a call to "regexp(Y,X)".
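The documentation passage above means the application decides how REGEXP behaves, including case sensitivity. As a minimal sketch, assuming Python's built-in sqlite3 module as the host application, a case-insensitive regexp function could be registered like this. The table name, the sample rows, and the function name regexp_ci are made up for illustration.

```python
import re
import sqlite3


def regexp_ci(pattern, value):
    """Backs "X REGEXP Y": SQLite calls regexp(Y, X), so the pattern comes first."""
    if value is None:
        return None  # NULL input stays NULL, so the row is filtered out
    return 1 if re.search(pattern, value, re.IGNORECASE) else 0


conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp_ci)

conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("Smith",), ("Smiths",), ("smith",), ("smitH",), ("Jones",)],
)

rows = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM people WHERE name REGEXP 'smith[s]*' ORDER BY rowid"
    )
]
print(rows)
```

Because the flag is applied inside the function, the pattern string itself needs no flag syntax, and the larger regex can stay as it is.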