Extracting Google Ads CIDs using Google Sheets Regex - regex

I am trying to create a regex that will extract CIDs from a cell that contains both text and CIDs. Here is an example field of text I would like to extract the CIDs from:
Example1:
Pokemon FY19 - Instream
261-963-9423
Pokemon - pokedex FY19 - Bumper
334-724-7943
Example2:
Instream: 856-613-9156
Bumper: 999-448-5246
The CIDs are the XXX-XXX-XXXX ids.
I have tried this =REGEXEXTRACT(A2, "\d{3}-\d{3}-\d{4}") but it only returns the first CID in the field when I need it to return all.
I expect the output to be 261-963-9423 334-724-7943, but the output is just 261-963-9423.

You can try removing every line except the ones that start with digits:
=REGEXREPLACE(A2,"(?:^|\n)[A-Za-z_-].*",)
Or change all matching ids to (.*) and extract them later:
=REGEXEXTRACT(A2, "\Q"&REGEXREPLACE(A2, "\d{3}-\d{3}-\d{4}","\\E(.*)\\Q")&"\E")

=ARRAYFORMULA(SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(
REGEXEXTRACT(IFERROR(SPLIT(A1:A, CHAR(10))), "\d+-\d+-\d+")))
,,999^99))), " ", CHAR(10)))
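
For comparison outside of Sheets, here is a minimal Python sketch of the same idea: collect every match of the CID pattern instead of only the first one. The function name extract_cids and the sample strings are only illustrative (the samples are copied from the question), and it assumes CIDs always have the XXX-XXX-XXXX shape.

```python
import re

# Same pattern as in the question: three digits, three digits, four digits.
CID_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")


def extract_cids(cell_text: str) -> list[str]:
    """Return every CID found in the cell text, in order of appearance."""
    return CID_PATTERN.findall(cell_text)


example1 = (
    "Pokemon FY19 - Instream\n"
    "261-963-9423\n"
    "Pokemon - pokedex FY19 - Bumper\n"
    "334-724-7943"
)
example2 = "Instream: 856-613-9156\nBumper: 999-448-5246"

print(" ".join(extract_cids(example1)))
print(" ".join(extract_cids(example2)))
```

Unlike REGEXEXTRACT, which stops at the first match, findall keeps scanning to the end of the text, which is the behaviour the question asks for.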

Related

Regex: extract multiple URL strings from a cell of arrays

What is a clean regex pattern for matching URL strings that stops at the first comma? Trying to extract values from an array of arrays in Google Sheets.
Cell A1
{https://www.myshop.com/shop/the_first_shop,Marcus. White's. Shop.,ACTIVE,US};{https://www.myshop.com/shop/a-second-shop,The first! Shop,CLOSED,UK};{EMPTY,ClosedShop,CLOSED,IN}
Desired Output (Cell B1)
https://www.myshop.com/shop/the_first_shop,https://www.myshop.com/shop/a-second-shop
I have figured out how to get a clean array of matching values in my desired output cell using:
=trim(regexreplace(regexreplace(regexreplace(REGEXREPLACE(A2,"/(https?:\/\/[^ ]*)/"," "),";"," "),"}"," "),"{"," "))
But I can't find a regex pattern that stops at a comma. For example, this solution:
"/(https?:\/\/[^ ]*)/"
matches the first URL, but gives me back:
https://www.myshop.com/shop/the_first_shop,Marcus. White's. Shop.,ACTIVE,US https://www.myshop.com/shop/a-second-shop,The first! Shop,CLOSED,UK EMPTY,ClosedShop,CLOSED,IN
I'd go with REGEXREPLACE and use:
=REGEXREPLACE(A1,".*?(?:(https.*?)|$)","$1")
Just a trailing comma to deal with...
=REGEXREPLACE(REGEXREPLACE(A1,".*?(?:(https.*?(,))|$)","$1"),",$","")
A much longer alternative to REGEXREPLACE could be:
=TEXTJOIN(",",,QUERY(TRANSPOSE(SPLIT(SUBSTITUTE(SUBSTITUTE(A1,"{","}"),"}",","),",")),"Select Col1 where Col1 like 'http%'"))
regex pattern that stops at a comma
=REGEXEXTRACT(A1, "(https?:\/\/[^,]*)")
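
As a rough cross-check of the "stop at the first comma" pattern, here is a small Python sketch. The name extract_urls is hypothetical, and the sample string is the cell content from the question. It relies on the negated class [^,]* to end each match at the first comma.

```python
import re

# "[^,]*" consumes everything up to, but not including, the next comma.
URL_PATTERN = re.compile(r"https?://[^,]*")


def extract_urls(cell_text: str) -> list[str]:
    """Return every URL in the text, each cut off at its first comma."""
    return URL_PATTERN.findall(cell_text)


cell = (
    "{https://www.myshop.com/shop/the_first_shop,Marcus. White's. Shop.,ACTIVE,US};"
    "{https://www.myshop.com/shop/a-second-shop,The first! Shop,CLOSED,UK};"
    "{EMPTY,ClosedShop,CLOSED,IN}"
)

print(",".join(extract_urls(cell)))
```

Note that REGEXEXTRACT with a single capture group returns only the first match, whereas findall here returns all of them.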

Select the next line of the matched pattern in clob column using oracle regular expression

I have a clob column "details" in table xxx. I want to select the next line of the matched pattern using Regex.
Input Text (CLOB DATA) like below :( all placed in new line)
MODEL_DATA 1
TEST1:
NONE
TEST2:
NONE
INFO:
SERVICES,VALUED-YES
TYPE:
NONE
I tried to use INFO as the pattern match string and retrieve the next line of the text, but I was not able to do it with a regular expression function. Please help me resolve this.
Output :
SERVICES,VALUED-YES
You can use the below to get the details
select replace(regexp_substr(details,'INFO:'||chr(10)||'.+'),'INFO:')
from your_table;
You can also try the below to be operating-system independent
select replace(regexp_substr(details,'INFO:('||chr(10)||'|'||chr(13)||chr(10)||').+'),'INFO:')
from your_table;
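
The same "match the label, then take the following line" logic can be sketched in Python for testing outside the database. This is only an illustration, not Oracle syntax: the helper name line_after is hypothetical, and the sample text is the CLOB content from the question. An optional \r before \n covers both Unix and Windows line endings.

```python
import re
from typing import Optional


def line_after(label: str, text: str) -> Optional[str]:
    """Return the line that directly follows `label`, or None if absent."""
    # \r?\n accepts LF as well as CRLF; [^\r\n]+ captures one full line.
    match = re.search(re.escape(label) + r"\r?\n([^\r\n]+)", text)
    return match.group(1) if match else None


details = (
    "MODEL_DATA 1\nTEST1:\nNONE\nTEST2:\nNONE\n"
    "INFO:\nSERVICES,VALUED-YES\nTYPE:\nNONE"
)

print(line_after("INFO:", details))
```

Capturing the next line in a group avoids the extra replace step and also leaves no leading line break in the result.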

How to reshape timestamp in Google Sheets?

I have an imported cell in Google Sheets with the following string representing a time/date format:
2019-03-30T14:39:07-03:00
What would be the correct REGEXEXTRACT or DATEVALUE formula solution so that it will result in a valid Google Sheets date&time format?
The -03:00 at the end of the string can be ignored.
2019-03-30T14:39:07-03:00
should result in
yyyy-mm-dd hh:mm:ss
You can use REGEXREPLACE and use this regex,
.*?(\d{4}(?:-\d{2}){2}).(\d{2}(?::\d{2}){2}).*
And replace it with $1 $2.
I've tested and it works well in Google sheets.
Just use this,
=REGEXREPLACE(A1, ".*?(\d{4}(?:-\d{2}){2}).(\d{2}(?::\d{2}){2}).*", "$1 $2")
Put the formula in the cell where you want the result.
This regex finds the date even when it is surrounded by other text, so in each case the extracted date appears in whichever column holds the formula.
all you need is:
=SUBSTITUTE(LEFT(A1, 19), "T", " ")
for the whole column:
=ARRAYFORMULA(SUBSTITUTE(LEFT(A1:A, 19), "T", " "))
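
Both approaches above (regex replace, and take the first 19 characters then swap the T) can be sketched in Python for comparison. The function names are hypothetical. The third variant, which parses the string with the standard library, is an addition not found in the answers and assumes the input is always a valid ISO 8601 timestamp.

```python
import re
from datetime import datetime


def reshape_by_regex(timestamp: str) -> str:
    """Mirror of the REGEXREPLACE answer: keep the date and time groups."""
    pattern = r".*?(\d{4}(?:-\d{2}){2}).(\d{2}(?::\d{2}){2}).*"
    return re.sub(pattern, r"\1 \2", timestamp)


def reshape_by_slice(timestamp: str) -> str:
    """Mirror of SUBSTITUTE(LEFT(A1, 19), "T", " ")."""
    return timestamp[:19].replace("T", " ")


def reshape_by_parsing(timestamp: str) -> str:
    """Parse the ISO 8601 string and format it, dropping the UTC offset."""
    return datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:%M:%S")


raw = "2019-03-30T14:39:07-03:00"
print(reshape_by_regex(raw))
print(reshape_by_slice(raw))
print(reshape_by_parsing(raw))
```

The slice version is the shortest but depends on the timestamp starting at the first character; the regex version tolerates surrounding text.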

How can I separate a string by underscore (_) in google spreadsheets using regex?

I need to create some columns from a cell that contains text separated by "_".
The input would be:
campaign1_attribute1_whatever_yes_123421
And the output has to be in different columns (one per field), with no "_" and excluding the final number, as it follows:
campaign1 attribute1 whatever yes
It must be done using a regex formula!
help!
Thanks in advance (and sorry for my english)
=REGEXEXTRACT("campaign1_attribute1_whatever_yes_123421","(("&REGEXREPLACE("campaign1_attribute1_whatever_yes_123421","((_)|(\d+$))",")$1(")&"))")
What this does is replace every _ with parentheses to create capture groups, while also excluding the digit string at the end, and then surround the whole string with parentheses.
We then use REGEXEXTRACT to actually pull the pieces out; the groups automatically push them into their own cells/columns.
To solve this you can use the SPLIT and REGEXREPLACE functions
Solution:
Text - A1 = "campaign1_attribute1_whatever_yes_123421"
Formula - A3 = =SPLIT(REGEXREPLACE(A1,"_+\d*$",""), "_", TRUE)
Explanation:
In cell A3 we use SPLIT(text, delimiter, [split_by_each]). The text in this case is first cleaned with the regex formula =REGEXREPLACE(A1,"_+\d*$","") to remove 123421, which then gives you a column for each word delimited by "_".
A1 = "campaign1_attribute1_whatever_yes_123421"
A2 = "=REGEXREPLACE(A1,"_+\d*$","")" //This gives you: campaign1_attribute1_whatever_yes
A3 = SPLIT(A2, "_", TRUE) //This gives you: campaign1 attribute1 whatever yes, each in a separate column.
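
The two steps above (strip the trailing number, then split on the underscore) can be sketched in Python as follows. The function name split_fields is hypothetical, and the sketch assumes the final number is always at the very end of the string.

```python
import re


def split_fields(value: str) -> list[str]:
    """Drop a trailing "_<digits>" suffix, then split on underscores."""
    # Same pattern as the REGEXREPLACE step: underscores plus digits at the end.
    trimmed = re.sub(r"_+\d*$", "", value)
    return trimmed.split("_")


print(split_fields("campaign1_attribute1_whatever_yes_123421"))
```

Each list element corresponds to one output column in the Sheets version.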
I finally figured it out yesterday in stackoverflow (spanish): https://es.stackoverflow.com/questions/55362/c%C3%B3mo-separo-texto-por-guiones-bajos-de-una-celda-en...
It was simple enough after all...
The reason I asked for a regex-only solution for Google Sheets was that I need to use it in Google Data Studio (same regex functions as spreadsheets).
To get each column just use this regex extract function:
1st column: REGEXP_EXTRACT(Campaña, '^(?:[^_]*_){0}([^_]*)_')
2nd column: REGEXP_EXTRACT(Campaña, '^(?:[^_]*_){1}([^_]*)_')
3rd column: REGEXP_EXTRACT(Campaña, '^(?:[^_]*_){2}([^_]*)_')
etc...
The only thing that has to be changed in the formula to switch columns is the number inside {}, which is (column number - 1).
If you do not have the final number, just don't put the last "_".
Lastly, remember to do all the calculated fields again, because (for example) it gets an error with CPC, CTR and other Adwords metrics that are calculated automatically.
Hope it helps!
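
To illustrate how the {n} repetition count selects a column, here is a small Python sketch of the same pattern. The helper name nth_field is hypothetical, and Python's re module stands in for REGEXP_EXTRACT. The non-capturing group skips (column - 1) fields, and the capture group then takes the next one.

```python
import re
from typing import Optional


def nth_field(value: str, column: int) -> Optional[str]:
    """Return the 1-based `column`-th underscore-separated field.

    The field must be followed by another "_", as in the original pattern,
    so the last field of the string is never returned.
    """
    pattern = r"^(?:[^_]*_){%d}([^_]*)_" % (column - 1)
    match = re.search(pattern, value)
    return match.group(1) if match else None


campaign = "campaign1_attribute1_whatever_yes_123421"
for column in range(1, 5):
    print(column, nth_field(campaign, column))
```

Because the pattern ends with "_", asking for column 5 here finds no match, which is how the final number gets excluded.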

vim: search, capture & replace on different lines using regex

Relatively new linux/vim/regex user here. I want to use regex to search for a numerical pattern, capture it, and then use the captured value to append a string to the previous line. In other words...I have a file of format:
title: description_id
text: {en: '2. text description'}
I want to capture the values from the text field and append them to the beginning of the title field...to yield something like this:
title: q2_description_id
text: {en: '2. text description'}
I feel like I've come across a way to reference other lines in a search & replace but am having trouble finding that now. Or maybe a macro would be suitable. Any help would be appreciated...thanks!
Perhaps something like:
:%s/\(title: \)\(.*\n\)\(text: \D*\)\(\d*\)/\1q\4_\2\3\4/
Where we are searching for 4 parts:
1) "title: "
2) rest of line and \n
3) "text: " and everything until next digit in line
4) first string of consecutive digits in line
and spitting them back out, with 4) inserted between 1) and 2).
EDIT: Shorter solution by Peter in the comments:
:%s/title: \zs\ze\_.\{-}text: \D*\(\d*\)/q\1_/
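
For readers who want to try the four-group substitution outside of vim, here is a rough Python adaptation. It is not vim syntax, and the function name prefix_title_with_number is hypothetical. One deliberate difference: the third group uses [^\d\n]* because Python's \D also matches line breaks, and the fourth group requires at least one digit.

```python
import re

# Groups: 1) "title: "  2) rest of the title line plus the line break
#         3) "text: " up to the first digit on that line  4) the digits
TITLE_TEXT = re.compile(r"(title: )(.*\n)(text: [^\d\n]*)(\d+)")


def prefix_title_with_number(document: str) -> str:
    """Insert "q<digits>_" after "title: ", using digits from the next line."""
    return TITLE_TEXT.sub(r"\g<1>q\g<4>_\g<2>\g<3>\g<4>", document)


sample = "title: description_id\ntext: {en: '2. text description'}\n"
print(prefix_title_with_number(sample), end="")
```

The replacement re-emits all four groups so the text line is left unchanged, with only the q-prefixed number added to the title.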
Use \n for the new lines (and ^v+enter for new lines on the substitute line): A quick and not very elegant example:
:%s/title: description_id\n\ntext: {en: '\(\i*\)\(.*\)/title: q\1_description_id^Mtext: {en: '\1\2/