How to Select Date Range using RegEx

I have date strings that look like this:
20120817110329
Which, as you can see, is formatted: YYYYMMDDHHMMSS
How would I select (using RegEx) dates that are between 7/15 and 8/20? Or what about 8/1 to 8/15?
I have this working for ranges that vary in only one digit position, but that is very limited:
^2012081[0-7] //selects 8/10 to 8/17
Update
Never forget the obvious (as pointed out by Wiseguy below): one can simply look for values between 20120715000000 and 20120820235959.

Since you're just querying a database field that contains these values, you could simply check for a value between 20120715000000 and 20120820235959 (14 digits, matching the YYYYMMDDHHMMSS format).
If you still want the regex, it ain't pretty, but this does it:
^20120(7(1[5-9]|2\d|3[01])|8([0-1]\d|20))\d{6}$
You basically have to account for each possible range by hand. The trailing \d{6} consumes the HHMMSS part of the 14-digit string.
^20120
(
  7
  (
    1[5-9]
    |2\d
    |3[01]
  )
  |
  8
  (
    [0-1]\d
    |20
  )
)
\d{6}$
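To sanity-check a pattern like this, it can help to compare it against a plain range comparison. The Python sketch below is only an illustration: it assumes 14-digit YYYYMMDDHHMMSS strings as in the question (so six digits follow the day), and the function names are made up for the example.

```python
import re
from datetime import datetime, timedelta

# Assumption: 14-digit stamps (YYYYMMDDHHMMSS), so six digits follow the day.
PATTERN = re.compile(r"^20120(7(1[5-9]|2\d|3[01])|8([01]\d|20))\d{6}$")


def in_range_regex(stamp: str) -> bool:
    """True if the stamp falls between 7/15 and 8/20 of 2012, per the regex."""
    return PATTERN.match(stamp) is not None


def in_range_compare(stamp: str) -> bool:
    """Same check without a regex: fixed-width digit strings sort like the dates they encode."""
    return "20120715000000" <= stamp <= "20120820235959"


def agree_on_every_day_of_2012() -> bool:
    """Walk through 2012 one day at a time and confirm both checks give the same answer."""
    day = datetime(2012, 1, 1, 11, 3, 29)
    while day.year == 2012:
        stamp = day.strftime("%Y%m%d%H%M%S")
        if in_range_regex(stamp) != in_range_compare(stamp):
            return False
        day += timedelta(days=1)
    return True
```

The comparison version is shorter and easier to adjust when the range changes, which is the point both answers make.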

I think this should work:
^2012(07(1[5-9]|[2-3][0-9])|08([0-1][0-9]|20))
Although the other answers are pretty much the same...
You can check this for more info.

Related

How to simplify this google sheets regex sequence?

I want to apply the following transformation to a set of data in my Google spreadsheet:
6 views -> 6
73K views -> 73000
3650 -> 3650
163K views -> 163000
1.2K views -> 1200
52.5K -> 52500
All the data is in one column, and depending on the case I need to apply a specific transformation.
I tried to put all the regexes into one formula but failed; there was always a case that needed two of the regular expressions at once.
Anyway, I ended up applying the regexes one case at a time in separate columns. It works fine, but I feel it could slow down the sheet, since I expect a lot of data to come into it.
Here is the sheet : spreadsheet
Thank you for your help !

Use regexreplace(), like this:
=arrayformula(
  iferror( 1 /
    value(
      regexreplace(
        regexreplace(trim(A2:A), "\s*K", "e3"),
        " views", ""
      )
    )
  ^ -1 )
)
See your sample spreadsheet.

In JavaScript (rather than a Sheets formula), first strip the trailing " views" using the regex: /\s*views$/i
To handle 'K' with or without a decimal value, match the whole number followed by K, drop the K, and multiply the number by 1000. Matching the whole number (rather than only the digits next to K) also covers inputs such as 5K or 1.05K.
Use a callback function as:
txt.replace(/\s*views$/i, '').replace(/^(\d*\.?\d+)K$/i, (_, n) => Math.round(n * 1000))
code:
const arr = [`6 views`,
  `73K views`,
  `3650`,
  `163K views`,
  `1.2K views`,
  `52.5K`];
arr.forEach(txt => {
  // Math.round guards against floating-point noise such as 1.1 * 1000.
  console.log(txt.replace(/\s*views$/i, '').replace(/^(\d*\.?\d+)K$/i, (_, n) => Math.round(n * 1000)))
})
Output:
6
73000
3650
163000
1200
52500

Say your inputs are in column A. Empty cells allowed. In any other column,
=arrayformula(if(A2:A<>"",value(substitute(substitute(A2:A," views",""),"K","e3")),))
works.
Adjust the range A2:A as needed.
Also note that non-empty cells with empty strings are ignored.
Basically, since Google Sheets' regex engine (RE2) doesn't support lookaround, it is more efficient to take advantage of the rather strict patterns in your data and use substitute() instead.
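The idea behind these formulas (drop " views", turn K into the exponent suffix e3, then parse the result as a number) can be sketched outside of Sheets as well. The Python below is a minimal illustration, not part of any answer; the function name is hypothetical and it assumes the inputs follow the strict patterns shown in the question.

```python
def views_to_number(text: str):
    """Convert strings such as '73K views' or '1.2K' to a number; return None for blanks."""
    # Same two substitutions as the Sheets formula: remove " views", rewrite K as e3.
    cleaned = text.strip().replace(" views", "").replace("K", "e3")
    if not cleaned:
        return None
    value = float(cleaned)  # "1.2e3" parses as 1200.0
    return int(value) if value.is_integer() else value


samples = ["6 views", "73K views", "3650", "163K views", "1.2K views", "52.5K"]
converted = [views_to_number(s) for s in samples]
```

Because the suffix is rewritten into scientific notation, no separate multiplication step is needed.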

Can I directly pass string parameter to Quicksight function as argument?

I made a parameter with a custom list of options 'MM', 'YYYY', and 'Q'. When a user selects one, I planned my calculated field to use it as an argument for the extract() function, like this:
extract(${period}, date)
I tried both omitting and including the quotes, but nothing works; the error says "At least one of the arguments in this function does not have correct type."
Is what I want possible?

From the little testing I've done it looks like extract requires a string literal as its first argument. This could be a bug and may be worth bringing to Amazon's attention.
As a workaround, you could solve this by using ifelse:
ifelse(
  ${period} = 'MM', extract('MM', {Date}),
  ${period} = 'YYYY', extract('YYYY', {Date}),
  extract('Q', {Date})
)
This is actually kind of nice because it gives you the opportunity to make the filter control more readable (e.g. Month, Year, Quarter) and then do:
ifelse(
  ${period} = 'Month', extract('MM', {Date}),
  ${period} = 'Year', extract('YYYY', {Date}),
  extract('Q', {Date})
)
This works for your example because your grouping options are well defined. However, it wouldn't work for a dynamic, less understood set of controls.
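The shape of this workaround (a fixed set of readable labels, each mapped to one hard-coded extraction, with a default branch) can be sketched in Python. This is only an analogy for the ifelse logic, not QuickSight code; the names EXTRACTORS and extract_period are invented for the illustration.

```python
from datetime import date

# One hard-coded extraction per readable label, like the branches of the ifelse.
EXTRACTORS = {
    "Month": lambda d: d.month,
    "Year": lambda d: d.year,
    "Quarter": lambda d: (d.month - 1) // 3 + 1,
}


def extract_period(label: str, d: date) -> int:
    """Pick the extraction for a label; anything unrecognized falls through to Quarter."""
    return EXTRACTORS.get(label, EXTRACTORS["Quarter"])(d)
```

As with the ifelse version, every supported option has to be listed up front, which is why the approach does not extend to an open-ended set of controls.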

SPARQL: combine and exclude regex filters

I want to filter my SPARQL query for specific keywords while at the same time excluding other keywords. I thought this could be easily accomplished with FILTER (regex(str(?var),"includedKeyword","i") && !regex(str(?var),"excludedKeyword","i")). It works without the "!" condition, but not with it. I also tried splitting it into separate FILTER statements, to no avail.
I used this query on http://europeana.ontotext.com/ :
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
SELECT DISTINCT ?CHO
WHERE {
  ?proxy dc:subject ?subject .
  FILTER ( regex(str(?subject),"gemälde","i") && !regex(str(?subject),"Fotografie","i") )
  ?proxy edm:type "IMAGE" .
  ?proxy ore:proxyFor ?CHO.
  ?agg edm:aggregatedCHO ?CHO; edm:country "germany".
}
But I always get the result on the first row with the title "Gemäldegalerie", which has a dc:subject of "Fotografie" (the one I want excluded). I think the problem lies in the fact that one object from the Europeana database can have more than one dc:subject property, so maybe it looks only for one of these properties while ignoring the other ones.
Any ideas? Would be very thankful!

The problem is that your combined filter checks both conditions against the same binding of ?subject. So it succeeds if at least one value of ?subject satisfies both conditions (which is almost always true, because the string "Gemäldegalerie", for example, matches your first regex and does not match the second).
So for the negative condition, you need to formulate something that checks all possible values, rather than just one particular value. You can do this using SPARQL's NOT EXISTS filter, for example like this:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
SELECT DISTINCT ?CHO
WHERE {
  ?proxy edm:type "IMAGE" .
  ?proxy ore:proxyFor ?CHO.
  ?agg edm:aggregatedCHO ?CHO; edm:country "germany".
  ?proxy dc:subject ?subject .
  FILTER(regex(str(?subject),"gemälde","i"))
  FILTER NOT EXISTS {
    ?proxy dc:subject ?otherSubject.
    FILTER(regex(str(?otherSubject),"Fotografie","i"))
  }
}
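The difference between the two filters can be illustrated with plain data. The Python sketch below is not SPARQL and the proxies and subjects are made up; it only mimics "test each ?subject binding on its own" versus "require that no subject of the proxy matches the excluded keyword".

```python
import re

# Hypothetical data: each proxy may carry several dc:subject values.
SUBJECTS = {
    "proxy1": ["Gemäldegalerie", "Fotografie"],
    "proxy2": ["Gemälde", "Malerei"],
    "proxy3": ["Fotografie"],
}


def matches(pattern: str, value: str) -> bool:
    """Case-insensitive regex test, like regex(str(?x), pattern, "i")."""
    return re.search(pattern, value, re.IGNORECASE) is not None


def per_binding_filter(subjects_by_proxy):
    """Combined filter: a proxy passes if ANY single subject is included and not excluded."""
    return sorted(
        proxy for proxy, subjects in subjects_by_proxy.items()
        if any(matches("gemälde", s) and not matches("fotografie", s) for s in subjects)
    )


def not_exists_filter(subjects_by_proxy):
    """NOT EXISTS version: some subject is included and NO subject is excluded."""
    return sorted(
        proxy for proxy, subjects in subjects_by_proxy.items()
        if any(matches("gemälde", s) for s in subjects)
        and not any(matches("fotografie", s) for s in subjects)
    )
```

With this data the per-binding version still returns proxy1, because "Gemäldegalerie" passes both tests on its own, while the NOT EXISTS version returns only proxy2.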
As an aside: since you are doing regular expression checks, and now combining them with a NOT EXISTS operator, this is likely to become very expensive for the query processor quite quickly. You may want to think about alternative ways to formulate your query (for example, using the exact subject string to include or exclude, which eliminates the regex), or even have a look at some non-standard extensions that the SPARQL endpoint might provide (OWLIM, for example, the store on which the Europeana endpoint runs, supports various full-text-search extensions, though I am not sure they are enabled in the Europeana endpoint).

How to make =NULL work in SQLite?

Given the following table:
Table: Comedians
=================
Id   First    Middle    Last
---  -------  --------  --------
1    Bob      NULL      Sagat
2    Jerry    Kal       Seinfeld
I want to make the following prepared query:
SELECT * FROM Comedians WHERE Middle=?
work for all cases. It currently does not work for the case where I pass NULL via sqlite3_bind_null. I realize that the query to actually search for NULL values uses IS NULL, but that would mean that I cannot use the prepared query for all cases. I would actually have to change the query depending on the input, which largely defeats the purpose of the prepared query. How do I do this? Thanks!

You can use the IS operator instead of =.
SELECT * FROM Comedians WHERE Middle IS ?
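A quick way to try this is Python's built-in sqlite3 module, where None is bound as NULL. The sketch below rebuilds the question's table in memory; the helper names are hypothetical and the column types are an assumption.

```python
import sqlite3


def make_comedians_db() -> sqlite3.Connection:
    """Create an in-memory copy of the Comedians table from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Comedians (Id INTEGER PRIMARY KEY, First TEXT, Middle TEXT, Last TEXT)"
    )
    conn.executemany(
        "INSERT INTO Comedians VALUES (?, ?, ?, ?)",
        [(1, "Bob", None, "Sagat"), (2, "Jerry", "Kal", "Seinfeld")],
    )
    return conn


def first_names_by_middle(conn: sqlite3.Connection, middle):
    """One prepared statement for both cases: IS compares NULL to NULL as true."""
    rows = conn.execute(
        "SELECT First FROM Comedians WHERE Middle IS ? ORDER BY Id", (middle,)
    )
    return [row[0] for row in rows]
```

Calling first_names_by_middle(conn, None) finds Bob, and first_names_by_middle(conn, "Kal") finds Jerry, without changing the query text.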

Nothing matches = NULL. The only way to check that is with IS NULL.
You can do a variety of things, but the straightforward one is...
WHERE
  middle = ?
  OR (middle IS NULL AND ? IS NULL)
If there is a value you know NEVER appears, you can change that to...
WHERE
COALESCE(middle, '-') = COALESCE(?, '-')
But you need a value that literally NEVER appears. Also, it obfuscates the use of indexes, but the OR version can often suck as well (I don't know how well SQLite treats it).
All things equal, I recommend the first version.
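One practical detail of the first version is that the parameter appears twice, so it has to be bound twice or given a name. The sketch below uses a named parameter with Python's sqlite3 module; the table setup and function names are assumptions made for the example.

```python
import sqlite3

# The same named parameter is used in both places, so it is bound only once.
SQL = """
SELECT First FROM Comedians
WHERE Middle = :middle
   OR (Middle IS NULL AND :middle IS NULL)
ORDER BY Id
"""


def build_db() -> sqlite3.Connection:
    """In-memory table with one NULL and one non-NULL middle name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Comedians (Id INTEGER, First TEXT, Middle TEXT, Last TEXT)")
    conn.executemany(
        "INSERT INTO Comedians VALUES (?, ?, ?, ?)",
        [(1, "Bob", None, "Sagat"), (2, "Jerry", "Kal", "Seinfeld")],
    )
    return conn


def find_first_names(conn: sqlite3.Connection, middle):
    """Run the OR-based query with the value (or None) bound to :middle."""
    return [row[0] for row in conn.execute(SQL, {"middle": middle})]
```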

NULL is not a value, but an attribute of a field. Instead use
SELECT * FROM Comedians WHERE Middle IS NULL

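If you want to match everything when the parameter is NULL:
SELECT * FROM Comedians WHERE Middle=IfNull(?, Middle)
(Rows whose Middle is itself NULL still won't match, because NULL = NULL is not true.)
If you want to match nothing when the parameter is NULL:
SELECT * FROM Comedians WHERE Middle=IfNull(?, 'DUMMY' || Middle)
(|| is SQLite's string concatenation operator; + would perform numeric addition.)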
See this answer: https://stackoverflow.com/a/799406/30225

SQL and regular expression to check if string is a substring of larger string?

I have a database filled with some codes like
EE789323
990
78000
These numbers are ALWAYS endings of a larger code. Now I have a function that needs to check if the larger code contains the subcode.
So if I have codes 90 and 990 and my full code is EX888990, it should match both of them.
However I need to do it in the following way:
SELECT * FROM tableWithRecordsWithSubcode
WHERE subcode MATCHES [reg exp with full code];
Is a regular expression like this even possible?
EDIT:
To clarify the issue I'm having, I'm not using SQL here. I just used that to give an example of the type of query I'm using.
In fact I'm using iOS with CoreData, and I need a predicate to fetch me only the records that match.
In the way that is mentioned below.

Given the observations from a comment:
Do you have two tables, one called tableWithRecordsWithSubcode and another that might be tableWithFullCodeColumn? So the matching condition is in part a join - you need to know which subcodes match any of the full codes in the second table? But you're only interested in the information in the tableWithRecordsWithSubcode table, not in which rows it matches in the other table?
and the laconic "you're correct" response, then we have to rewrite the query somewhat.
SELECT DISTINCT S.*
FROM tableWithRecordsWithSubcode AS S
JOIN tableWithFullCodeColumn AS F
ON F.Fullcode ...ends-with... S.Subcode
or maybe using an EXISTS sub-query:
SELECT S.*
FROM tableWithRecordsWithSubcode AS S
WHERE EXISTS(SELECT * FROM tableWithFullCodeColumn AS F
WHERE F.Fullcode ...ends-with... S.Subcode)
This uses a correlated sub-query but avoids the DISTINCT operation; it may mean the optimizer can work more efficiently.
That just leaves the magical 'X ...ends-with... T' operator to be defined. One possible way to do that is with LENGTH and SUBSTR. However, SUBSTR does not behave the same way in all DBMS, so you may have to tinker with this (possibly adding a third argument, LENGTH(s.subcode)). With the common 1-based SUBSTR, the suffix starts at position LENGTH(f.fullcode) - LENGTH(s.subcode) + 1:
LENGTH(f.fullcode) >= LENGTH(s.subcode) AND
SUBSTR(f.fullcode, LENGTH(f.fullcode) - LENGTH(s.subcode) + 1) = s.subcode
This leads to two possible formulations:
SELECT DISTINCT S.*
FROM tableWithRecordsWithSubcode AS S
JOIN tableWithFullCodeColumn AS F
ON LENGTH(F.Fullcode) >= LENGTH(S.Subcode)
AND SUBSTR(F.Fullcode, LENGTH(F.Fullcode) - LENGTH(S.Subcode) + 1) = S.Subcode;
and
SELECT S.*
FROM tableWithRecordsWithSubcode AS S
WHERE EXISTS(
SELECT * FROM tableWithFullCodeColumn AS F
WHERE LENGTH(F.Fullcode) >= LENGTH(S.Subcode)
AND SUBSTR(F.Fullcode, LENGTH(F.Fullcode) - LENGTH(S.Subcode) + 1) = S.Subcode);
This is not going to be a fast operation; joins on computed results such as required by this query seldom are.
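The EXISTS formulation can be tried against SQLite from Python. This is a sketch under stated assumptions: the column names subcode and fullcode are guesses based on the answer, the helper function is hypothetical, and because SQLite's SUBSTR is 1-based the start position is written as LENGTH(fullcode) - LENGTH(subcode) + 1.

```python
import sqlite3

QUERY = """
SELECT S.subcode
FROM tableWithRecordsWithSubcode AS S
WHERE EXISTS (
    SELECT 1 FROM tableWithFullCodeColumn AS F
    WHERE LENGTH(F.fullcode) >= LENGTH(S.subcode)
      AND SUBSTR(F.fullcode, LENGTH(F.fullcode) - LENGTH(S.subcode) + 1) = S.subcode
)
ORDER BY S.subcode
"""


def matching_subcodes(subcodes, fullcodes):
    """Return the subcodes that are a suffix of at least one full code."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tableWithRecordsWithSubcode (subcode TEXT)")
    conn.execute("CREATE TABLE tableWithFullCodeColumn (fullcode TEXT)")
    conn.executemany(
        "INSERT INTO tableWithRecordsWithSubcode VALUES (?)", [(s,) for s in subcodes]
    )
    conn.executemany(
        "INSERT INTO tableWithFullCodeColumn VALUES (?)", [(f,) for f in fullcodes]
    )
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result
```

For the question's example, matching_subcodes(["90", "990", "78000", "EE789323"], ["EX888990"]) keeps the two subcodes that EX888990 ends with.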

I'm not sure why you think that you need a regular expression... Just use the charindex function (SQL Server syntax; the first argument is the string to look for, the second the string to search in):
select something
from table
where charindex(subcode, code) <> 0
Edit:
To find strings at the end, you can create a pattern with the % wildcard from the subcode and match the full code against it:
select something
from table
where code like '%' + subcode
