Solr regex not working - regex

Good morning, everybody. This is my first time using Solr. After reading some documentation about q and fq, I want to write a query that returns all rows whose id ends in a given digit, for example 3.
I tried a regex condition like id:/.*3/, but it returns no data.
Now I want to do this using the mod function. Please explain your solution.
Thanks for the help.

You don't need a regular expression to do that; q=id:*3 works right out of the box.
You can also apply a function query to your field using the {!frange} parser, which returns every document for which the function that follows it evaluates to a value within the given range.
q={!frange l=0 u=0}mod(id, 3)
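This assumes that id is an integer. It returns only documents where the function evaluates to 0, because both the lower and upper bounds are 0 (and by default both bounds are included in the range). As written, that matches ids divisible by 3; to match ids whose last digit is 3, take the remainder modulo 10 and set both bounds to 3: q={!frange l=3 u=3}mod(id, 10)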
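To make the range logic concrete, here is a small Python sketch that mimics which ids an {!frange l=... u=...}mod(id, n) filter would keep. It is not Solr code; the function name frange_mod and the sample ids are made up for illustration.

```python
def frange_mod(ids, divisor, lower, upper):
    """Keep ids whose remainder lies in [lower, upper], both bounds inclusive."""
    return [i for i in ids if lower <= i % divisor <= upper]


# Hypothetical sample ids, only for illustration.
ids = [3, 9, 13, 20, 23, 30, 33]

# Same idea as {!frange l=0 u=0}mod(id, 3): ids divisible by 3.
divisible_by_three = frange_mod(ids, 3, 0, 0)

# Same idea as {!frange l=3 u=3}mod(id, 10): ids whose last digit is 3.
ends_in_three = frange_mod(ids, 10, 3, 3)

print(divisible_by_three)
print(ends_in_three)
```

The two calls differ only in the divisor and the bounds, which is the part that decides whether you get "divisible by 3" or "ends in 3".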

Related

Partially match integers in PostgreSQL queries

In my PostgreSQL 10 database I have a column of type integer. It holds product codes and must be searchable by a full code or by the leading part of a code. Each value is made of three parts: a five-digit part followed by two two-digit parts. Users can search for the first part only, the first and second parts, or all three.
For example, if the column contains 123451233 and the user searches for 12345 (the first part), I want to return 123451233. The same goes if the user searches for 1234512 or 123451233.
Unfortunately I cannot change the column's type or break the one column into three (one for every part). How can I do this? I cannot use LIKE. Maybe something like a regex for integers?
Thanks
Consider using simple arithmetic.
floor(log(value))::int + 1 returns the number of digits in the integer part of value (floor is needed because a plain ::int cast rounds to the nearest integer). Using this,
value/(10^(floor(log(value))::int-floor(log(search_input))::int))::int
returns value truncated to the same number of digits as search_input, so finally
search_input = value/(10^(floor(log(value))::int-floor(log(search_input))::int))::int
will do the trick.
It looks more complex, but it could also be more efficient than string manipulation.
PS: With an index such as create index idx on your_table(cast(your_column as text)); a search like
select * from your_table
where cast(your_column as text) like search_input || '%';
is the best option, IMO.
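For anyone who wants to check the arithmetic outside the database, here is a rough Python sketch of the same idea: cut the value down to as many digits as the search input, then compare. The function names are hypothetical, and the digit count uses an integer loop instead of a logarithm so rounding cannot get in the way. A string-prefix version is included for comparison.

```python
def num_digits(n: int) -> int:
    """Count the decimal digits of a positive integer using integer division."""
    digits = 1
    while n >= 10:
        n //= 10
        digits += 1
    return digits


def matches_prefix_arith(value: int, search: int) -> bool:
    """Drop trailing digits of value until it is as long as search, then compare."""
    extra = num_digits(value) - num_digits(search)
    if extra < 0:
        return False  # search has more digits than value, so it cannot be a prefix
    return value // 10 ** extra == search


def matches_prefix_text(value: int, search: int) -> bool:
    """Same check done on the text form, like casting to text in SQL."""
    return str(value).startswith(str(search))


for search in (12345, 1234512, 123451233, 12346):
    print(search, matches_prefix_arith(123451233, search), matches_prefix_text(123451233, search))
```

Both functions should agree for positive integers; the explicit check for a search input longer than the value avoids the division by zero that a negative exponent would cause in the SQL expression.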
You do not need regex functions. Cast the integer to text and use the function left(), for example:
create table my_table(code int); -- or bigint
insert into my_table values (123451233);
with input_data(input_code) as (
values('1234512')
)
select t.*
from my_table t
cross join input_data
where left(code::text, length(input_code)) = input_code;
code
-----------
123451233
(1 row)

Split string and get last element

Let's say I have a column which has values like:
foo/bar
chunky/bacon/flavor
/baz/quz/qux/bax
I.e. a variable number of strings separated by /.
In another column I want to get the last element from each of these strings, after they have been split on /. So, that column would have:
bar
flavor
bax
I can't figure this out. I can split on / and get an array, and I can see the function INDEX to get a specific numbered indexed element from the array, but can't find a way to say "the last element" in this function.
Edit: this one is simpler:
=REGEXEXTRACT(A1,"[^/]+$")
You could use this formula:
=REGEXEXTRACT(A1,"(?:.*/)(.*)$")
It can also be used as an ArrayFormula:
=ARRAYFORMULA(REGEXEXTRACT(A1:A3,"(?:.*/)(.*)$"))
Here's some more info:
the RegExExtract function
Some good examples of syntax
my personal list of Regex Tricks
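This formula does the same, but only when the value starts with a slash (as in /baz/quz/qux/bax), because it uses the number of slashes as the index:
=INDEX(SPLIT(A1,"/"),LEN(A1)-LEN(SUBSTITUTE(A1,"/","")))
It also references A1 three times, which is not preferable.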
You could do this too (COLUMNS gives the number of elements, so it is also the position of the last one):
=INDEX(SPLIT(A1, "/"), 1, COLUMNS(SPLIT(A1, "/")))
Also possible, perhaps best done on a copy, with Find and replace. Find:
.+/
Replace with nothing, with "Search using regular expressions" ticked.
You can try this!
split gives you an array of String, so you can access the last element through its length:
String message = "chunky/bacon/flavor";
String[] outSplited = message.split("/");
System.out.println(outSplited[outSplited.length -1]);
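For comparison, here is the same "last element after splitting" idea as a short Python sketch, once with a split and once with the [^/]+$ pattern from the first answer. The function names are made up for this example.

```python
import re


def last_segment_split(path: str) -> str:
    """Split on '/' and take the final piece (index -1 means 'last')."""
    return path.split("/")[-1]


def last_segment_regex(path: str) -> str:
    """Match the run of non-slash characters at the end of the string."""
    match = re.search(r"[^/]+$", path)
    return match.group(0) if match else ""


for value in ("foo/bar", "chunky/bacon/flavor", "/baz/quz/qux/bax"):
    print(last_segment_split(value), last_segment_regex(value))
```

Neither version depends on how many slashes the value contains or on whether it starts with one.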

perform substring extraction on data frame column

I have a dataframe with one column called 'full_url'. Each element of the column is just a URL. How do I write a function to remove the 'http://' from all of the elements at once? I need to use some kind of regex because some don't have it at all, some have https, etc. The closest I've gotten is gsub(".*//","",unlist(full_url))
but that also returns 'full_url1' 'full_url2' 'full_url3' ... as the row names for some reason.
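Without a reproducible example I'm not sure, but would something like this work?
sapply(df$full_url, function(x) ifelse(substr(x,1,7) == "http://", substr(x,8,nchar(x)), x), USE.NAMES = FALSE)
This uses sapply to go element by element and substr to check whether the first 7 characters are "http://". If they are, the element is replaced by the rest of the string (nchar(x) gives its length; length(x) would always be 1 here); if not, it stays as x. Note that this only handles "http://"; a single call such as sub("^https?://", "", df$full_url) would also cover https.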
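As a language-neutral illustration of the anchored-pattern idea, here is a minimal Python sketch. The strip_scheme name and the sample URLs are hypothetical; the point is only that anchoring the pattern at the start of the string leaves URLs without a scheme untouched.

```python
import re


def strip_scheme(url: str) -> str:
    """Remove a leading http:// or https:// and leave everything else alone."""
    return re.sub(r"^https?://", "", url)


# Hypothetical sample data standing in for the full_url column.
urls = ["http://example.com", "https://example.org/page", "example.net"]

print([strip_scheme(u) for u in urls])
```

Because the pattern is anchored with ^ and spells out the scheme, a "//" that appears later in a URL is not affected, unlike with the greedy .*// pattern.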

Simplest way to find out if at least one cell in a cell array matches a regular expression

I need to search a cell array and return a single boolean value indicating whether any cell matches a regular expression.
For example, suppose I want to find out if the cell array strs contains foo or -foo (case-insensitive). The regular expression I need to pass to regexpi is ^-?foo$.
Sample inputs:
strs={'a','b'} % result is 0
strs={'a','foo'} % result is 1
strs={'a','-FOO'} % result is 1
strs={'a','food'} % result is 0
I came up with the following solution based on How can I implement wildcard at ismember function of matlab? and Searching cell array with regex, but it seems like I should be able to simplify it:
~isempty(find(~cellfun('isempty', regexpi(strs, '^-?foo$'))))
The problem I have is that it looks rather cryptic for such a simple operation. Is there a simpler, more human-readable expression I can use to achieve the same result?
NOTE: The answer refers to the original regexp in the question: '-?foo'
You can avoid the find:
any(~cellfun('isempty', regexpi(strs, '-?foo')))
Another possibility: first concatenate all cells into a single string:
~isempty(regexpi([strs{:}], '-?foo'))
Note that you can remove the "-" sign in any of the above:
any(~cellfun('isempty', regexpi(strs, 'foo')))
~isempty(regexpi([strs{:}], 'foo'))
And that allows using strfind (with lower) instead of regexpi:
~isempty(strfind(lower([strs{:}]),'foo'))
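The "does any element match" check from the question can be sketched in Python as follows, using the anchored pattern ^-?foo$ and the question's four sample inputs. This is only an illustration of the logic, with a made-up function name, not a MATLAB replacement.

```python
import re

# Anchored, case-insensitive pattern from the question: "foo" or "-foo" only.
PATTERN = re.compile(r"^-?foo$", re.IGNORECASE)


def any_match(strs) -> bool:
    """Return True if at least one string matches the pattern."""
    return any(PATTERN.search(s) for s in strs)


print(any_match(["a", "b"]))
print(any_match(["a", "foo"]))
print(any_match(["a", "-FOO"]))
print(any_match(["a", "food"]))
```

Testing each element separately keeps the anchors meaningful; concatenating the elements first only works for unanchored patterns such as the original '-?foo'.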

Postgres set varchar field to regular expression of itself

I'm trying to normalise a data field by removing a fairly common postfix. I've got as far as using the substring() function in Postgres, but can't quite get it to work. For example, if I want to strip the postfix 'xyz' from any values that have it:
UPDATE my_table SET my_field=substring(my_field from '#"%#"xyz' for '#');
But this is having some weird effects that I can't pin down. Any thoughts? Many thanks as always.
update my_table
set my_field = regexp_replace(my_field, 'xyz$', '')
where my_field ~ 'xyz$';
This will also change the value 'xyz' into an empty string. I don't know if you want that (or if the suffix can exist "on its own").
The where clause is not strictly necessary but will make the update more efficient because only those rows are updated that actually meet the criteria.
UPDATE my_table
SET my_field = left(my_field, -3)
WHERE my_field LIKE '%xyz';
For several reasons:
If you don't want to change every single row, always add a WHERE clause to your UPDATE. Even if only some rows are actually changed by the expression. An UPDATE from the same value to the same value is still an UPDATE and will produce dead rows and table bloat and trigger triggers ...
Use left() in combination with LIKE.
left() with a negative second parameter effectively trims that number of characters from the end of the string. left() was introduced with PostgreSQL 9.1. I quote the manual here:
When n is negative, return all but last |n| characters.
Always pick LIKE over a regular expression (~) if you can. LIKE is not as versatile, but much faster. (SIMILAR TO is rewritten as regular expression internally). Details in this related answer on dba.SE.
If you want to make sure that a minimum of characters remains:
WHERE my_field LIKE '_%xyz'; -- prepend as many _ as you want chars left
substring() would work like this (one possibility):
substring(my_field, '^(.*)xyz$');
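The behaviour of the statements above (strip the suffix only where it is present, optionally keep a minimum number of characters) can be sketched in Python like this. The strip_suffix function and its min_remaining parameter are hypothetical names for this illustration; min_remaining plays the role of the leading underscores in the LIKE pattern.

```python
def strip_suffix(value: str, suffix: str = "xyz", min_remaining: int = 0) -> str:
    """Remove suffix from the end of value if at least min_remaining characters stay."""
    remaining = len(value) - len(suffix)
    if value.endswith(suffix) and remaining >= min_remaining:
        return value[:remaining]
    return value  # no suffix, or too little would be left: keep the value unchanged


print(repr(strip_suffix("abcxyz")))
print(repr(strip_suffix("xyz")))
print(repr(strip_suffix("xyz", min_remaining=1)))
print(repr(strip_suffix("abc")))
```

With min_remaining=0 a bare 'xyz' becomes an empty string, as the first answer warns; with min_remaining=1 it is left alone, matching the '_%xyz' variant.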