How can I achieve this price REGEX with REGEXMATCH in Google Spreadsheet? - regex

Here is the deal,
I want to allow user to enter this kind of entries in my price column:
1 or 1234 or 1234,1 or 1234,1234 ...
So I've used this regex, which works fine on the regex101 website:
^\d+(,\d+)?$
https://regex101.com/r/D5dAXx/1
The only problem is that it doesn't work well with Google Spreadsheet's REGEXMATCH function:
=REGEXMATCH(TO_TEXT(C2), "^\d+(,\d+)?$")
For example, these entries do not match:
1
12
1,123
whereas these entries match correctly:
1,1
1,12
Why is that and what could be the correct REGEX?

My problem was a bad format on the column.
When I entered:
12,1234
the format turned it into
12.1234
which was not matching my REGEXMATCH.
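This means that in Google Spreadsheet the data validation criterion is evaluated after the column's formatting has been applied, so the regex sees the reformatted value rather than what was typed.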
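For anyone who wants to check this outside the spreadsheet, here is a minimal Python sketch. It is only an illustration: it uses Python's `re` module rather than the engine behind REGEXMATCH, and `is_valid_price` is a hypothetical helper name, not anything from Sheets. For a pattern this simple the two should behave the same way.

```python
import re

# Same pattern as in the question: digits, optionally a comma and more digits.
PRICE_RE = re.compile(r"^\d+(,\d+)?$")

def is_valid_price(text: str) -> bool:
    """Return True if text looks like 1, 1234 or 1234,1234 (hypothetical helper)."""
    return PRICE_RE.match(text) is not None

# Values exactly as the user typed them
print(is_valid_price("1"))
print(is_valid_price("12,1234"))
# The same value after the column format turned the comma into a dot
print(is_valid_price("12.1234"))
```

The pattern itself accepts the typed values; only the reformatted value with a dot is rejected, which is consistent with the formatting being the culprit.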

Related

Azure data factory - mapping data flows regex implementation to format a number

I am creating a mapping data flow where I have a phone number column which can contain values like
(555) 555-1234 or
(555)555-1234 or
555555-1234
I want to extract the digits from this value. How can that be done? I have tried the functions below, with different variations, but nothing is working.
regexExtract("(555) 555-1234",'\d+)')
regexExtract("(555) 555-1234",'(\d\d\d\d\d\d\d\d\d\d)')
Because you have multiple phone formats, you would need to strip parentheses, spaces, and dashes, which would take several regexExtract calls and make your solution complicated.
Instead, I suggest that you use regexReplace to keep only the digits.
I tried it in ADF and it worked. For the sake of the demo, I added a derived column phoneNumber with the value (555) 555-1234.
In the derived column activity I added a new column 'validPhoneNumber' with a regexReplace expression like so:
regexReplace(phoneNumber,'[^0-9]', '')
Output: 5555551234
You can read about it here: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-expressions-usage#regexReplace
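The same keep-only-digits idea can be tried locally. This is a rough Python equivalent, not ADF expression syntax, and `digits_only` is a made-up name for the example:

```python
import re

def digits_only(phone: str) -> str:
    """Drop every character that is not an ASCII digit (hypothetical helper)."""
    return re.sub(r"[^0-9]", "", phone)

# The three formats from the question
for raw in ["(555) 555-1234", "(555)555-1234", "555555-1234"]:
    print(raw, "->", digits_only(raw))
```

A single replace handles all three formats, which is why it is simpler than chaining several extract calls.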

Regex for values that are in between spaces

I am new to regex and having difficulty obtaining values that are caught in between spaces.
I am trying to get the values "field 1" and "abc/def try" from the sample data below using only regex.
Currently I'm using (^.{18}\s+) to skip the first 18 characters, but I am at a loss as to how to grab values that contain spaces.
A1234567890 field 1 abc/def try
02021051812 12 test test 12 pass
3333G132021 no test test cancel
Any help/pointers will be appreciated.
If this text has fixed-width columns, you can match and trim the column values, since you know the number of characters between the start of the string and each column's text.
For example, this regex will work for the text you posted:
^(.*?)\s*(?<=.{19})(.*?)\s*(?<=^.{34})(.*?)\s*(?<=^.{46})
See the regex demo.
So, Column 2 starts at Position 19, Column 3 starts at Position 34 and Column 4 (end of string here) is at Position 46.
However, this regex is not very efficient, and it would be much better if the data format were fixed on the provider's side.
Since I don't know whether the data always has the same length, I created the following regex, which gives you one group per column you might want to use:
^((\s{0,1}\S{1,})*)(\s{2,})((\s{0,1}\S{1,})*)(\s{2,})((\s{0,1}\S{1,})*)
Regex demo
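The idea behind the second regex (words inside a column are separated by at most one space, columns by two or more) can also be sketched as a plain split in Python. Note the assumptions: the spacing in `sample` is invented, because the column widths of the original data are not visible in the post, and `split_columns` is a hypothetical helper.

```python
import re

def split_columns(line: str) -> list:
    """Split a line on runs of two or more whitespace characters."""
    return re.split(r"\s{2,}", line.strip())

# Hypothetical spacing; the real column widths are unknown.
sample = "A1234567890       field 1        abc/def try"
print(split_columns(sample))
```

This only works if no column value ever contains two consecutive spaces; for truly fixed-width data, slicing by position would be the safer choice.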

How to get text from URLs using regexp_extract in data studio

Example URLs:
/en/current-season/abc-note-book/2018-abc-note-book-arun-1
/en/current-season/xyz-note-book/2018-xyz-note-book-kumar-2
/en/current-season/pqr-note-book/2018-pqr-note-book-rahul-3
I want to extract the 'abc-note-book' section as column 1 from all the URLs.
Expected Result:
abc note book
xyz note book
pqr note book
I also need to extract the 'arun-1' section as column 2 from all the URLs.
Expected Result
arun-1
kumar-2
rahul-3
Please suggest how to extract these using regexp_extract in Data Studio, or whether there is another formula that can do it.
Thanks.
I created a Google Data Studio report (with Google Sheets embedded) to demonstrate. The required text can be extracted using the REGEXP_EXTRACT function, and in the case of Column 1, REGEXP_REPLACE can be used to replace each - with a space:
Column 1 (e.g. abc note book)
REGEXP_REPLACE(REGEXP_EXTRACT(URL, "/\\d+-(\\w+-\\w+-\\w+)"), "-", " ")
Column 2 (e.g. arun-1)
REGEXP_EXTRACT(URL, "(\\w+-\\d+)$")
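To sanity-check the two patterns outside Data Studio, here is a small Python sketch. It mirrors the formulas above with Python's `re` module (single backslashes, since Python raw strings need no extra escaping); the function names `note_book_name` and `trailing_slug` are made up for the example.

```python
import re

def note_book_name(url: str) -> str:
    """Column 1: the three-part name after the year, with hyphens as spaces."""
    match = re.search(r"/\d+-(\w+-\w+-\w+)", url)
    return match.group(1).replace("-", " ") if match else ""

def trailing_slug(url: str) -> str:
    """Column 2: the final word-number pair at the end of the URL."""
    match = re.search(r"(\w+-\d+)$", url)
    return match.group(1) if match else ""

urls = [
    "/en/current-season/abc-note-book/2018-abc-note-book-arun-1",
    "/en/current-season/xyz-note-book/2018-xyz-note-book-kumar-2",
    "/en/current-season/pqr-note-book/2018-pqr-note-book-rahul-3",
]
for url in urls:
    print(note_book_name(url), "|", trailing_slug(url))
```

Both patterns assume the name always has exactly three hyphen-separated parts and that the URL ends in a word followed by a number, as in the examples given.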

How to do a fast regex search on an HDF5 database

I have an HDF5 database with 100 million+ rows of text each storing a simple three column set of values:
ID WORD HEADWORD
1 the the
2 cats cat
3 sat sit
4 on on
5 the the
6 mats mat
...
I want to do a search on the "WORD" column to find all hits for "at" (i.e., 'cats', 'sat', 'mats').
In some other database (e.g. PostgreSQL) I might do this with a simple regex search '?at?'. If I could search the HDF5 index using regex, that would be fine, but I don't think this is possible. Any suggestions for how to do this kind of 'wildcard' (regex) search quickly?
Try the following regex:
[^\s]+[\s]+([a-zA-Z]*at[a-zA-Z]*)[\s]+[^\s]+
Group 1 of the regex above gives you the desired result: every value in the "WORD" column that contains "at" (i.e., 'cats', 'sat', 'mats').
Debuggex Demo
Regex Demo
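As a quick check of what the regex captures, here is a minimal Python sketch. It assumes each row is available as a single whitespace-separated string like the sample in the question; reading rows out of the HDF5 file is not shown, and `words_containing_at` is a hypothetical helper. It is a plain linear scan, so it illustrates the pattern rather than solving the speed problem.

```python
import re

# The regex from the answer: ID, whitespace, WORD containing "at", whitespace, HEADWORD.
ROW_RE = re.compile(r"[^\s]+[\s]+([a-zA-Z]*at[a-zA-Z]*)[\s]+[^\s]+")

def words_containing_at(rows):
    """Return group 1 (the WORD column) for every row whose WORD contains 'at'."""
    hits = []
    for row in rows:
        match = ROW_RE.match(row)  # anchored at the start of the row
        if match:
            hits.append(match.group(1))
    return hits

rows = ["1 the the", "2 cats cat", "3 sat sit", "4 on on", "5 the the", "6 mats mat"]
print(words_containing_at(rows))
```

With 100 million+ rows, applying this chunk by chunk would keep memory bounded, but every row is still visited once.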

REGEXEXTRACT - Error when trying to get a phone number from string

I am wondering if someone can help me get this formula right in Google Spreadsheets.
After a two-week event I get a spreadsheet with more than 2,000 rows of comments, which include phone numbers here and there. I am trying to extract the phone numbers from those strings.
example string: call at 228-219-4241 after
formula: =IFERROR(REGEXEXTRACT(V133,"^(?(?:\d{3}))?[-.]?(?:\d{3})[-.]?(?:\d{4})$"),"NOT FOUND!!!")
and I get "NOT FOUND!!!".
It works only when the cell contains just the number.
Cheers.
Your regex is too complicated, and you're restricting it with a rule that says the number is the first thing in the string. Change it to this:
=iferror(regexextract(A1,"\d{3}\-\d{3}\-\d{4}"))
In your example, '^' means the beginning of the line and '$' means the end, so you're saying the string will always start with the first 3 digits and end with the last 4, i.e. the cell contains nothing but the number.
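The difference between the anchored and unanchored pattern is easy to see in a small Python sketch. This uses Python's `re` module, not Sheets, and both `extract_phone` and the anchored variant `ANCHORED_RE` are written just for this illustration (the anchored one is a simplified stand-in, not the exact regex from the question).

```python
import re

PHONE_RE = re.compile(r"\d{3}-\d{3}-\d{4}")        # may match anywhere in the text
ANCHORED_RE = re.compile(r"^\d{3}-\d{3}-\d{4}$")   # the text must be the number only

def extract_phone(comment: str) -> str:
    """Return the first phone-like substring, or the fallback text."""
    match = PHONE_RE.search(comment)
    return match.group(0) if match else "NOT FOUND!!!"

comment = "call at 228-219-4241 after"
print(extract_phone(comment))
# The anchored pattern fails on the comment but succeeds on the bare number.
print(ANCHORED_RE.search(comment) is not None)
print(ANCHORED_RE.search("228-219-4241") is not None)
```

Dropping the anchors is what lets the number be found inside a longer comment.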