Removing single and double quotes from BigQuery using regexp_extract - regex

I'm a total noob with regexp. All I want to do is to remove the single and double quotes from a string in BigQuery. I can remove the single and double quotes at the beginning of the string, but not the end:
SELECT regexp_extract(foo, r'\"new_foo\":\"(.*?)\"') AS new_foo
FROM [mybq:Schema.table]
All I get is NULL, but without regexp_extract I get the expected results. Help is appreciated.

Try something like this:
SELECT REGEXP_REPLACE(foo, r'([\'\"])', '') AS new_foo
FROM [mybq:Schema.table]

Your character class should be ["'], which matches either kind of quote.
You are also using the wrong function for this: regexp_extract pulls a match out of the string, whereas REGEXP_REPLACE('orig_str', 'reg_exp', 'replace_str') removes or replaces it.
Something like this:
SELECT REGEXP_REPLACE(word, r'[\'"]', '') AS new_foo
FROM [mybq:Schema.table]

select replace(word,'"','') as word
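
For comparison, the character-class approach from the answers above can be tried outside BigQuery. This is only a minimal Python sketch, assuming the column value is available as a plain string; strip_quotes is a hypothetical helper name, not a BigQuery function.

```python
import re

# Character class matching either a single or a double quote.
QUOTES = re.compile(r"""["']""")


def strip_quotes(value: str) -> str:
    """Remove every single and double quote, wherever it appears."""
    return QUOTES.sub("", value)


print(strip_quotes("\"it's quoted\""))  # its quoted
```

Unlike the single replace(word,'"','') call, the character class handles both quote types in one pass.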

Related

Google Sheets REGEXEXTRACT the second date of a string

So I have the following string example in a Google Spreadsheet:
AN_U_John_Doe_01.01.1900_24.01.2022.pdf
I want to REGEXEXTRACT the second date, which would be:
24.01.2022
I did the following, which works but I'm sure there's a better way to do this:
REGEXEXTRACT($A1;"(\d+\.\d+\.\d+\.)") which results in: 24.01.2022. (notice the dot at the end)
and then I do REGEXEXTRACT($B1;"(\d+\.\d+\.\d+)") which gets rid of the dot.
Is there a way to do this in one REGEXEXTRACT? Also, the front part of the string might not always be the same. It can be shorter or longer; only the dates are always at the end like that.
You can join your two REGEXEXTRACTs by using this:
=REGEXEXTRACT($A1,"(\d+\.\d+\.\d+)\.")
Or anchor on the file extension, escaping the dots so they match literal periods:
=REGEXEXTRACT(A1; "_(\d+\.\d+\.\d+)\.pdf$")
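
To check the anchored pattern outside Sheets, here is a small Python sketch. It assumes the file name always ends in .pdf preceded by an underscore and a dotted date, as in the example; last_date is a hypothetical helper name.

```python
import re
from typing import Optional

# Escaped dots match literal periods; $ ties the match to the end of the name.
SECOND_DATE = re.compile(r"_(\d+\.\d+\.\d+)\.pdf$")


def last_date(filename: str) -> Optional[str]:
    """Return the date just before the .pdf extension, or None if absent."""
    match = SECOND_DATE.search(filename)
    return match.group(1) if match else None


print(last_date("AN_U_John_Doe_01.01.1900_24.01.2022.pdf"))  # 24.01.2022
```

Because the pattern is anchored at the end, the length of the front part of the name does not matter.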

How to replace to_timestamp('some value') with GETDATE() string in Notepad++?

I have thousands of lines with different time values like to_timestamp('14/03/18 07:46:33,573000000','DD/MM/RR HH24:MI:SSXFF'), and I want to replace all of them with a single string, say GETDATE().
For example, I have the following entries in my file:
to_timestamp('14/03/18 07:46:33,573000000','DD/MM/RR HH24:MI:SSXFF')
to_timestamp('14/03/18 08:45:34,342000000','DD/MM/RR HH24:MI:SSXFF')
to_timestamp('04/01/18 18:15:08,119000000','DD/MM/RR HH24:MI:SSXFF')
Now I would like to replace all of them with GETDATE() string like below
GETDATE()
GETDATE()
GETDATE()
How can I achieve this in Notepad++ with a regular expression? Or is there any other way to achieve this?
If the file has no other function calls of this form, you can simply search for the regex (to_timestamp\(.*?\))
If you want to be more specific, use
(to_timestamp\('\d+\/\d+\/\d+\s\d+:\d+:\d+,\d+','DD\/MM\/RR\sHH24:MI:SSXFF'\)) and replace with GETDATE\(\)
Use the regex to_timestamp\(.*\)
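
If the file can be processed outside Notepad++, the same non-greedy search can be scripted. This is a sketch in Python, assuming no to_timestamp argument contains a closing parenthesis; to_getdate is a hypothetical helper name.

```python
import re

# Non-greedy .*? stops at the first closing parenthesis after to_timestamp(.
TO_TIMESTAMP = re.compile(r"to_timestamp\(.*?\)")


def to_getdate(text: str) -> str:
    """Replace every to_timestamp(...) call with the literal GETDATE()."""
    return TO_TIMESTAMP.sub("GETDATE()", text)


lines = (
    "to_timestamp('14/03/18 07:46:33,573000000','DD/MM/RR HH24:MI:SSXFF')\n"
    "to_timestamp('04/01/18 18:15:08,119000000','DD/MM/RR HH24:MI:SSXFF')"
)
print(to_getdate(lines))
```

In Python's replacement string the parentheses are literal, so they do not need the backslashes that the Notepad++ replace field uses.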

Data Validation in Pentaho using regular expression

I have this sample data. (Current_balance is a numeric field and has some bad records that need to be replaced.)
Accno,Cust_id,gender,DOB,Current_balance
0008647447654709299,87128110,M,29/02/1960,184126.23
0008650447626799299,143500723,F,4/18/1967,165198.85
0008651447674209299,479941323,M,5/5/1979,NULL
0008653447693589299,687746622,M,18-08-1981,#20
0008654447606469299,890134223,M,18-08-1983,0
0008655447659179299,684451923,F,10/9/1982,142.25
0008658447686789299,57470921,F,25-02-1978,458518.25
0008669447629759299,57470925,M,23-01-1981,xx
I need to validate the data in Pentaho and want output like this:
Accno,Cust_id,gender,DOB,Current_balance
0008647447654709299,87128110,M,29/02/1960,184126.23
0008650447626799299,143500723,F,4/18/1967,165198.85
0008651447674209299,479941323,M,5/5/1979,
0008653447693589299,687746622,M,18-08-1981,
0008654447606469299,890134223,M,18-08-1983,0
0008655447659179299,684451923,F,10/9/1982,142.25
0008658447686789299,57470921,F,25-02-1978,458518.25
0008669447629759299,57470925,M,23-01-1981,
That means the validator passes the good rows and replaces the bad values with null.
Can anyone suggest how I can do this?
I'm not sure about Pentaho, but to point you in the right direction, you can use the following regex:
,(?=[^,]+$)(?!\d+(\.\d{2})?$).*$
in multi-line mode. The decimal part is optional and the check is anchored to the end of the line, so a whole number such as 0 counts as valid.
If you replace all matches with ',' you should have the desired output. The header line's last field also matches, so skip the header or apply the replacement to data rows only.
As a Java string you just need to escape the backslashes:
,(?=[^,]+$)(?!\\d+(\\.\\d{2})?$).*$
So in Java I guess you'd use something like:
str = str.replaceAll("(?m),(?=[^,]+$)(?!\\d+(\\.\\d{2})?$).*$", ",");
The (?m) at the start is the multi-line flag mentioned above.
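
The lookahead idea can be checked against a few of the sample rows. The Python sketch below is not Pentaho-specific; it uses the pattern with the decimal part made optional and anchored, so that a balance of 0 is kept, and it leaves the header line untouched. blank_bad_balances is a hypothetical helper name.

```python
import re

# A comma followed by a last field that is NOT digits with an optional
# two-digit decimal part; re.MULTILINE lets $ match at the end of each line.
BAD_BALANCE = re.compile(r",(?=[^,]+$)(?!\d+(\.\d{2})?$).*$", re.MULTILINE)


def blank_bad_balances(csv_text: str) -> str:
    """Blank out invalid values in the last column, leaving the header as is."""
    header, *rows = csv_text.splitlines()
    cleaned = BAD_BALANCE.sub(",", "\n".join(rows))
    return "\n".join([header, cleaned])


sample = """Accno,Cust_id,gender,DOB,Current_balance
0008647447654709299,87128110,M,29/02/1960,184126.23
0008651447674209299,479941323,M,5/5/1979,NULL
0008653447693589299,687746622,M,18-08-1981,#20
0008654447606469299,890134223,M,18-08-1983,0
0008669447629759299,57470925,M,23-01-1981,xx"""

print(blank_bad_balances(sample))
```

Rows ending in NULL, #20 and xx come back with an empty last field, while 184126.23 and 0 pass through unchanged.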

how to process double quotes in Regex?

I want to use a regex on a div in HTML in order to remove a certain string. The string looks like:
age:7Refat"student" or like age:7Refat. I'm using the following command, which works for the second pattern:
$("#order_list").append($(this).text().replace(new RegExp("age:[0-9]+","g"),''));
But what if I want one general command for both patterns? The thing is, I don't know how to deal with the first pattern because it contains double quotes, and I can't write:
$("#order_list").append($(this).text().replace(new RegExp("age:[0-9]+"[a-z]"","g"),''));
or
$("#order_list").append($(this).text().replace(new RegExp("price:[0-9]+[\"a-z\"]","g"),''));
Either escape the quotes like you did in your third example (but I think you put them in the wrong place):
new RegExp("price:[0-9]+\"[a-z]\"","g")
or (better) use a regex literal:
/price:[0-9]+"[a-z]"/g
You may try this:
$("#order_list").append($(this).text().replace(new RegExp("age:[0-9]+\"[a-z]\"","g"),''));
instead of this:
$("#order_list").append($(this).text().replace(new RegExp("age:[0-9]+","g"),''));
This is what finally worked for me:
$("#order_list").append(($(this).text().replace(new RegExp("age:[0-9]+","g"),'')).replace(new RegExp("[a-zA-Z]+","g"),'').replace(new RegExp("\"+","g"),''));
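
The three chained replacements can also be folded into one pattern with an optional quoted part. The sketch below is in Python rather than jQuery and assumes the goal is to drop the whole age:... token, including the name and the optional double-quoted word; drop_age is a hypothetical helper name.

```python
import re

# age:<digits>, then any letters, then an optional double-quoted word.
# A single-quoted Python string lets the double quotes appear unescaped.
AGE_TOKEN = re.compile(r'age:[0-9]+[A-Za-z]*(?:"[A-Za-z]*")?')


def drop_age(text: str) -> str:
    """Remove age tokens with or without the trailing quoted word."""
    return AGE_TOKEN.sub("", text)


print(repr(drop_age('age:7Refat"student"')))  # ''
print(repr(drop_age("age:7Refat")))           # ''
```

Unlike replacing every letter in the text, this only touches letters that directly follow an age: token.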

Regular expression find and replace in Postgres

I have a table that contains a number of rows with columns containing a URL. The URL is of the form:
http://one.example1.com:9999/dotFile.com
I would like to replace all matches in that column with http://example2.com/dotFile.com while retaining everything after :9999. I have found some documentation on regexp_matches and regexp_replace, but I can't quite wrap my head around it.
To replace a fixed string, use the simple replace() function.
To replace a dynamic string, you can use regexp_replace() like this:
UPDATE YourTable
SET TheColumn = regexp_replace(
    TheColumn, 'http://[^:\s]+:9999(\S+)', 'http://example2.com\1', 'g'
);
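
To see what the capture group and the \1 backreference do before running the UPDATE, the same pattern can be tried in Python. This is only a sketch for experimenting with the regex, not a replacement for the SQL; rewrite_url is a hypothetical helper name.

```python
import re

# (\S+) captures everything after :9999 so it can be reused as \1.
OLD_URL = re.compile(r"http://[^:\s]+:9999(\S+)")


def rewrite_url(value: str) -> str:
    """Swap the host and port for example2.com, keeping the path."""
    return OLD_URL.sub(r"http://example2.com\1", value)


print(rewrite_url("http://one.example1.com:9999/dotFile.com"))
# http://example2.com/dotFile.com
```

URLs on a different port do not match the pattern and are returned unchanged.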
If you know the URL, you don't have to use a regex. The replace() function should work for you:
replace(string text, from text, to text)
Replaces all occurrences in string of substring from with substring to.
Example: replace('abcdefabcdef', 'cd', 'XX') returns abXXefabXXef
You could try:
UPDATE yourtable SET
yourcolumn = replace(yourcolumn, 'one.example1.com:9999','example2.com')
;