I can't get case-insensitive searches to work with REGEXP in SQLite. Is the syntax supported?
SELECT * FROM table WHERE name REGEXP 'smith[s]*\i'
I would expect the following answers (assuming the database has these entries):
Smith
Smiths
smith
smitH <--- Not a typo but in database
Note - This is a small part of a larger REGEX, so I won't be using LIKE
As described by CL, this feature is not supported in SQLite. A simple solution to this problem is to "lowercase" the left-hand side of the REGEXP expression using lower:
SELECT * FROM table WHERE lower(name) REGEXP 'smith[s]*';
It is not ideal, but it works. Pay attention to diacritics, though: by default SQLite's built-in lower() only folds ASCII characters, so read the documentation for lower if your text uses them.
The REGEXP function shipped with SQLite Manager is implemented in JavaScript as follows:
var regExp = new RegExp(aValues.getString(0));
var strVal = new String(aValues.getString(1));
if (strVal.match(regExp)) return 1;
else return 0;
To get case-insensitive searches with the JavaScript RegExp object, you would not add a flag to the pattern string itself, but pass the i flag in the second parameter to the RegExp constructor. However, the binary REGEXP operator does not have a third flags parameter, and the code does not try to extract flags from the pattern, so these flags are not supported in this implementation.
From https://www.sqlite.org/lang_expr.html
If an application-defined SQL function named "regexp" is added at run-time, then the "X REGEXP Y" operator will be implemented as a call to "regexp(Y,X)".
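To illustrate the point above, here is a minimal sketch of an application-defined regexp function that always matches case-insensitively. It assumes Python's standard sqlite3 module rather than SQLite Manager, and the table name people and the sample rows are made up for the example. Note the argument order: SQLite passes the pattern first and the value second.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Called by SQLite for `value REGEXP pattern`; the pattern arrives first."""
    if value is None:
        return None  # NULL input stays NULL, so the row is filtered out
    return 1 if re.search(pattern, value, re.IGNORECASE) else 0


conn = sqlite3.connect(":memory:")
# Registering a two-argument function named "regexp" enables the REGEXP operator.
conn.create_function("regexp", 2, regexp)

# Hypothetical table and data, mirroring the names in the question.
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Smith",), ("Smiths",), ("smith",), ("smitH",), ("Jones",)],
)

rows = conn.execute(
    "SELECT name FROM people WHERE name REGEXP 'smith[s]*' ORDER BY rowid"
).fetchall()
matches = [name for (name,) in rows]
print(matches)
```

Because the flag is applied inside the function, the pattern string itself needs no trailing modifier.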
Related
From a PostgreSQL database, I'm trying to match 6 or more digits that come after a string that looks like "(OCoLC)" and I thought I had a working regular expression that would fit that description:
(?<=\(ocolc\))[0-9]{6,}
Here are some strings that it should return the digits for:
|a(OCoLC)08507541 will return 08507541
|a(OCoLC)174097142 will return 174097142
etc...
This seems to work to match strings when I test it on regex101.com, but when I incorporate it into my query:
SELECT
regexp_matches(v.field_content, '(?<=\(ocolc\))[0-9]{6,}', 'gi')
FROM
varfield as v
LIMIT
1;
I get this message:
ERROR: invalid regular expression: quantifier operand invalid
I'm not sure why it doesn't seem to like that expression.
UPDATE
I ended up just resorting to using a case statement, as that seemed to be the best way to work around this...
SELECT
CASE
WHEN v.field_content ~* '\(ocolc\)[0-9]{6,}'
THEN (regexp_matches(v.field_content, '[0-9]{6,}', 'gi'))[1]
ELSE v.field_content
END
FROM
varfield as v
As electricjelly noted, I'm kind of after just the numeric characters, but they have to be preceded by the "(OCoLC)" string, or they're not exactly what I'm after. This is part of a larger query, so I'm running a second CASE statement to set a boolean flag in cases where the start of the string wasn't "(OCoLC)". This seems to be more helpful anyway, as I'm probably going to want to preserve those other values somehow.
After looking over your question, it seems your error is caused by a syntax problem, not so much by the function not being available in your version of PostgreSQL, as I tested it on 9.6 and received the same error.
However, what you seem to want is to pull the numbers from a given field as in
|a(OCoLC)08507541 becomes 08507541
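An easy way to accomplish this would be to use regexp_replace.
The call would be:
regexp_replace(table.field, '\D', '', 'g')
The \D in the pattern matches every non-digit character and replaces it with nothing (hence the ''), so only the digits are returned. Note that the column name must not be quoted, otherwise the literal string 'table.field' is processed instead of the column's value.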
After doing some more searching, it looks like lookbehind constraints are only a feature of PostgreSQL server versions >= 9.6
https://www.postgresql.org/docs/9.6/static/functions-matching.html#POSIX-CONSTRAINTS-TABLE
The version I am running is version 9.4.6
https://www.postgresql.org/message-id/E1ZsIsY-0006z6-6T#gemulon.postgresql.org
So, the answer is it's not available for this version of PostgreSQL, but presumably this would work just fine in the latest version of the server.
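One way to sidestep the lookbehind altogether is a capture group: match the "(OCoLC)" prefix as part of the pattern and keep only the parenthesised digits. The sketch below shows the idea with Python's re module, purely as an illustration of the pattern logic; the function name extract_oclc_number is hypothetical, and the same pattern would still need to be tried in regexp_matches on the server in question.

```python
import re

# The prefix is matched but not captured; group 1 holds the digits only.
OCLC_NUMBER = re.compile(r"\(ocolc\)([0-9]{6,})", re.IGNORECASE)


def extract_oclc_number(field_content):
    """Return the digits following "(OCoLC)", or None when the prefix is absent."""
    match = OCLC_NUMBER.search(field_content)
    return match.group(1) if match else None


print(extract_oclc_number("|a(OCoLC)08507541"))
print(extract_oclc_number("|a(OCoLC)174097142"))
print(extract_oclc_number("|a(DLC)12345678"))
```

Rows without the prefix come back as None, which keeps them distinguishable from real matches.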
I'm doing some testing with MonetDB.
The gist of the query I'm trying to perform (using borrowed syntax) goes like this:
SELECT mystring FROM mytable WHERE mystring REGEXP 'myexpression';
MonetDB does not support this syntax, but the docs claim that it supports PCRE, so this may be possible. Still, the syntax eludes me.
Check "Does MonetDB support regular expression predicates?":
The implementation is there in the MonetDB backend, the module that
implements it is pcre (to be found in MonetDB5 source tree).
I'm not sure whether it is available by default from MonetDB/SQL.
If not, with these two function definitions, you link SQL functions to the
respective implementations in MonetDB5:
-- case sensitive
create function pcre_match(s string, pattern string)
returns BOOLEAN
external name pcre.match;
-- case insensitive
create function pcre_imatch(s string, pattern string)
returns BOOLEAN
external name pcre.imatch;
If you need more, I'd suggest having a look at MonetDB5/src/modules/mal/pcre.mx in the source code.
Use select name from sys.functions; to check if the function exists, otherwise you will need to create it.
As an example, you may use pcre_imatch() like this:
SELECT mystring FROM mytable WHERE pcre_imatch(mystring, 'myexpression');
I was handed some very badly written VB.NET code today and asked to migrate it to use ODP.NET. To shortcut this a little, I used Find/Replace to set BindByName = True on all of the command variables. Based on the first few code files, though, I thought all of these were named "cmd". Unfortunately, they aren't; the original author of the code actually named each command after its purpose, even though they only used one OracleCommand per function. They also apparently decided that Using was not worth doing, either.
Dim cmGetStatus As New OracleCommand
cmGetStatus.CommandType = CommandType.StoredProcedure
cmd.BindByName = True '<-- this was added by my previous replace with a regex
What regex could I use to grab all instances of "Dim ____ as New OracleCommand" and replace the variable name with "cmd"? What about the same sort of replacement on all instances of "_____.CommandType"? This would save me at least 8 hours of manual edits.
Search: (Dim ).*( As New OracleCommand)
Replace: $1cmd$2
Search: \w+(\.CommandType = CommandType\.StoredProcedure)
Replace: cmd$1
Group replacements are done with $1, $2, etc.
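As a quick way to check the group replacements before running them across the whole project, the same idea can be tried on a sample string. This is a sketch in Python's re module, not the editor's own engine: Python writes group references as \1 and \2 rather than $1 and $2, and the patterns here use \w+ instead of .* (my variation) so that only the variable name is replaced and any indentation is left alone.

```python
import re

# Sample lines taken from the question.
vb_source = (
    "Dim cmGetStatus As New OracleCommand\n"
    "cmGetStatus.CommandType = CommandType.StoredProcedure\n"
    "cmd.BindByName = True\n"
)

# Rename the declared variable, keeping the text captured on either side of it.
renamed = re.sub(r"(Dim )\w+( As New OracleCommand)", r"\1cmd\2", vb_source)
# Rename the variable in front of ".CommandType = ", keeping the property access.
renamed = re.sub(r"\w+(\.CommandType = )", r"cmd\1", renamed)

print(renamed)
```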
I wrote a MySQL query
select * from table where name like '%salil%'
which works fine, but it will not return records with the name 'sal-il' or 'sa#lil'.
So I want a query something like the one below
select * from table where remove_special_character_from(name) like '%salil%'
where remove_special_character_from(name) is a MySQL function or a regular expression that removes all the special characters from name before the like is executed.
No, MySQL before version 8.0 doesn't support regexp-based replace.
I'd suggest using normalized versions of the search terms, stored in separate fields.
So, at insert time you strip all non-alpha characters from the data and store it in the data_norm field for the future searches.
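A minimal sketch of that insert-time normalization, written in Python for illustration. The helper name normalize_name is hypothetical, and keeping digits and restricting the alphabet to ASCII letters are my assumptions; adjust the character class to whatever counts as "special" in your data.

```python
import re


def normalize_name(name):
    """Lower-case the name and drop every character that is not a-z or 0-9."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


names = ["salil", "sal-il", "sa#lil", "Sunil"]

# (name, data_norm) pairs as they would be written at insert time.
rows = [(name, normalize_name(name)) for name in names]

# Equivalent of: WHERE data_norm LIKE '%salil%'
hits = [name for name, norm in rows if "salil" in norm]
print(hits)
```

The search term should be passed through the same function before it is compared, so that both sides are normalized identically.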
Since I know no way to do this, I'd use a "calculated column" for this, i.e. a column which depends on the value of name but without the special characters. This way, the cost for the transformation is paid only once and you can even create an index on the new column.
See this answer for how to do this.
I agree with Aaron and Col. Shrapnel that you should use an extra column on the table e.g. search_name to store a normalised version of the name.
I noticed that this question was originally tagged ruby-on-rails. If this is part of a Rails application then you can use a before_save callback to set the value of this field.
In MySQL 5.1 you can use REGEXP to do regular expression matching like this:
SELECT * FROM foo WHERE bar REGEXP "baz"
see http://dev.mysql.com/doc/refman/5.1/en/regexp.html
However, take note that it will be slow, and you should do what the other posters suggested and store the clean value in a separate field.
I get travel confirmations that look like this:
"SQ 966 E 27JUL SINCGK"
= "Airline Space Flight Space BookingClass Space Date_with_Month_as_name Space 3LetterFrom 2LetterTo".
I can chop all this into pieces using a regex to submit it to a website. But the site expects 27/07/2009, or at least 27/07, instead of 27JUL. Is there a way to transform a regex result based on a piece of the input (Jan -> 01, Feb -> 02 ... Dec -> 12)?
(Regex flavour is Java)
DateFormat is a more appropriate class:
DateFormat output = new SimpleDateFormat("dd/MM", Locale.US);
DateFormat input = new SimpleDateFormat("dd MMM", Locale.US);
System.out.println(output.format(input.parse("24 Dec")));
output:
24/12
In Perl syntax (s{pattern}{replacement}):
s{([0-9][0-9])JAN}{$1/01}
s{([0-9][0-9])FEB}{$1/02}
s{([0-9][0-9])MAR}{$1/03}
s{([0-9][0-9])APR}{$1/04}
s{([0-9][0-9])MAY}{$1/05}
s{([0-9][0-9])JUN}{$1/06}
s{([0-9][0-9])JUL}{$1/07}
s{([0-9][0-9])AUG}{$1/08}
s{([0-9][0-9])SEP}{$1/09}
s{([0-9][0-9])OCT}{$1/10}
s{([0-9][0-9])NOV}{$1/11}
s{([0-9][0-9])DEC}{$1/12}
(Yes this is long and ugly, but it would probably work).
I would be very careful with doing this with regular expressions as they don't tell you how the conversion went.
Extract every bit of information manually. Sanity check everything, and then use the SimpleDateFormat parser to get a Date object you can use from there on.
It isn't a regex solution, but you could use SimpleDateFormat to help you with your final formatting. You should note in the JavaDoc that this is not a thread-safe option out of the box.
Alternatively, you could use DateFormatSymbols.getShortMonths() and iterate over the months to identify the index* and format your string manually.
*don't forget to add 1 ;)
edit:
I am not sure that what you are looking for is possible in Java regex without the ability to make code changes. The conditional constructs that Perl supports are not supported by Java because Java provides if-then-else support as a language feature.
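For comparison, the month-lookup idea (find the month's index, add 1, format manually) can be sketched with a replacement callback. This is a Python illustration rather than Java, and the names MONTHS, DATE_TOKEN and to_numeric_date are hypothetical; it assumes upper-case English month abbreviations, as in the sample confirmation.

```python
import re

# Upper-case short month names, in calendar order (index 0 is January).
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# Two digits immediately followed by a month name, e.g. 27JUL.
DATE_TOKEN = re.compile(r"\b(\d{2})(" + "|".join(MONTHS) + r")\b")


def to_numeric_date(text):
    """Rewrite every ddMMM token in the text as dd/mm."""
    def replace(match):
        day, month = match.group(1), match.group(2)
        # The list index is zero-based, so add 1 to get the month number.
        return "{}/{:02d}".format(day, MONTHS.index(month) + 1)

    return DATE_TOKEN.sub(replace, text)


print(to_numeric_date("SQ 966 E 27JUL SINCGK"))
```

The callback is where any sanity checks on the day or month would go, since a plain pattern replacement cannot report how the conversion went.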