Find matching strings in table column Oracle 10g - regex

I am trying to search a varchar2 column in a table for matching strings using the value in another column. The column being searched allows free form text and allows words and numbers of different lengths. I want to find a string that is not part of a larger string of text and numbers.
Example: 1234a should match "Invoice #1234a" but not "Invoice #1234a567"
Steps Taken:
I have tried Regexp_Like(table2.Searched_Field,table1.Invoice) but get many false hits when the invoice number has a number sequence that can be found in other invoice numbers.

Suggestions:
Match only at end:
REGEXP_LIKE(table2.Searched_Field, table1.Invoice || '$')
Match exactly:
table2.Searched_Field = 'Invoice #' || table1.Invoice
Match only at end with LIKE:
table2.Searched_Field LIKE '%' || table1.Invoice
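The suggestions above cover the case where the invoice number sits at the end of the field. If it can appear anywhere in the free text, one option is to require that the characters directly before and after it are not letters or digits. Below is a minimal Python sketch of that idea; the function name `contains_whole_token` is hypothetical, and the Oracle equivalent would presumably wrap the invoice in POSIX classes such as `(^|[^[:alnum:]])` and `($|[^[:alnum:]])`, which is an assumption to verify on 10g.

```python
import re

def contains_whole_token(text, token):
    """Return True if token occurs in text with no letter or digit
    directly before or after it (hypothetical helper for illustration)."""
    # re.escape guards against invoice values that contain regex metacharacters.
    pattern = r"(?<![A-Za-z0-9])" + re.escape(token) + r"(?![A-Za-z0-9])"
    return re.search(pattern, text) is not None

print(contains_whole_token("Invoice #1234a", "1234a"))     # True
print(contains_whole_token("Invoice #1234a567", "1234a"))  # False
```

Escaping the invoice value matters in either language: an invoice containing a character such as `.` or `+` would otherwise be interpreted as part of the pattern.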

Related

Regex Replace First Number from Phone number if it is a 1

I've got a list of phone numbers in bigquery.
Some have the number 1 in front and some do not. I would like to remove the 1s using regex replace:
The data looks as follows:
16047779887
4037778776
And I would like to return:
6047779887
4037778776
Any help is appreciated!
SELECT REGEXP_REPLACE(column_name, r'^1', '')
FROM table
The ^ anchors the match to the start of the string, so only a leading 1 is removed (replaced with an empty string).
An unanchored pattern such as "1*" would instead remove every 1 anywhere in the number.
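To see why the start-of-string anchor matters, here is a small Python sketch (not BigQuery itself; `strip_leading_one` is a hypothetical name) that contrasts the anchored pattern `^1` with the unanchored `1*`.

```python
import re

def strip_leading_one(phone):
    """Remove a single leading '1' if present; leave all other digits alone."""
    return re.sub(r"^1", "", phone)

print(strip_leading_one("16047779887"))  # 6047779887
print(strip_leading_one("4037778776"))   # 4037778776

# The unanchored pattern deletes every 1, including ones inside the number.
print(re.sub(r"1*", "", "4031778776"))   # 403778776
```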

Alteryx - Split a string with an uncertain length into 5 characters per column

I am trying to split a string (the string length is uncertain; it could be 500 characters or 1500 characters) into multiple columns, and each column should only contain 5 characters.
For example,
If column A contains the string:
AAGANAB5ARAB7AAAB9AAAC--CAC--1ACMRD
Then, I need Column B to Column H to be:
AAGAN,
AB5AR,
AB7AA,
AB9AA,
AC--C,
AC--1,
ACMRD
Also, the string contains "-", but it is NOT a delimiter. It should be counted as part of the 5-character strings.
I know RegEx is probably the function I should use, and just by putting "(.....)" in the Regular Expression, Alteryx can extract the first 5 characters. But I don't know how to ask Alteryx to automatically split the entire string (length varies each row) to columns of 5 chars.
In Alteryx, use the RegEx tool (instead of the Formula tool with one of its REGEX functions). In the config panel of the RegEx tool, simply enter ..... as the RegEx; the key is to select "Split to Rows". This gives you rows with a new field that holds each result of the applied RegEx.
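The same tokenizing idea can be sketched outside Alteryx. The Python snippet below (the name `split_fixed_width` is hypothetical) returns every consecutive 5-character chunk. It uses `.{1,5}` rather than `.....` so that a shorter final chunk is kept instead of dropped, which is a design choice and not something the answer above specifies.

```python
import re

def split_fixed_width(text, width=5):
    """Split text into consecutive chunks of at most `width` characters."""
    # '.' matches '-' like any other character, so it is not treated as a delimiter.
    return re.findall(r".{1,%d}" % width, text)

chunks = split_fixed_width("AAGANAB5ARAB7AAAB9AAAC--CAC--1ACMRD")
print(chunks)
# ['AAGAN', 'AB5AR', 'AB7AA', 'AB9AA', 'AC--C', 'AC--1', 'ACMRD']
```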

Reg exp search in notes/comments/description data in PostgreSQL 10.7

I have a scenario that I am not able to handle in version 10.7. Basically, I have a data column containing notes/comments/descriptions, and I need to find a regular expression pattern inside that text.
For example, Data in the column : The SSN number is 760-56-6289
In the above data, 760-56-6289 is the actual SSN, which I need to find across all schemas/tables/columns using the defined regular expression pattern. The SSN value can have text before or after it.
Could you please let me know how to achieve this in PostgreSQL 10.7?
Please let me know if you need more information for the same.
SELECT
(regexp_matches(mycolumn, '^.*([\d]{3}-[\d]{2}-[\d]{4}).*$'))[1]
FROM mytable
The RegEx means:
Start of text: ^
arbitrary number of characters: .*
group of your number: (...)
3 digit characters: [\d]{3}
- character
2 digits: [\d]{2}
- character
4 digits: [\d]{4}
arbitrary number of characters: .*
end of text: $
regexp_matches() returns all captured groups as an array. Since there is only one group here, the array contains only one value: your number, which you can get with the index [1].
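As a cross-check of the pattern itself, here is a Python sketch of the same digit layout (3-2-4 separated by hyphens). The names `SSN_PATTERN` and `find_ssns` are hypothetical, and the lookarounds that reject longer digit runs are an added assumption, not part of the answer above. In PostgreSQL, `substring(mycolumn from '\d{3}-\d{2}-\d{4}')` should give a similar first-match result without the leading and trailing `.*`.

```python
import re

# Three digits, hyphen, two digits, hyphen, four digits,
# not embedded in a longer run of digits (assumption for illustration).
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")

def find_ssns(text):
    """Return every SSN-shaped substring found in text."""
    return SSN_PATTERN.findall(text)

print(find_ssns("The SSN number is 760-56-6289"))  # ['760-56-6289']
```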

google analytics regular expression filter to count the number of keywords

I'm trying to figure out a regular expression in Google Analytics to count the number of words in onsite search terms. The problem I have is that my onsite keywords are a single string containing + signs, which separate the words. For example, hot+water+bottle. This would obviously be a three-word keyphrase; storage+box would be a two-word keyphrase. I can get all terms containing + to show that more than one word was used, but I can't for the life of me show all queries containing exactly 2 words, 3 words, etc. Can anyone help?
A two-word query would have at least one "+", so I think the following regex will give you anything with one or more "+":
.*\+.*
.* = any number of characters
\+ = the plus sign (escaped, because a bare + is a quantifier)
so the pattern is (any number of characters)(plus)(any number of characters)
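The pattern above matches any term with one or more plus signs. To isolate terms with exactly N words, one approach is to anchor the pattern and forbid + inside each word, for example `^[^+]+\+[^+]+$` for two words. The Python sketch below builds such a pattern for any N; the function names are hypothetical, and whether the Google Analytics filter field accepts anchors and `{n}` quantifiers is an assumption to verify.

```python
import re

def exact_word_count_pattern(n):
    """Build a regex matching terms made of exactly n '+'-separated words."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # One word, followed by exactly n-1 repetitions of '+word'.
    return r"^[^+]+(\+[^+]+){%d}$" % (n - 1)

def has_word_count(term, n):
    """Return True if term consists of exactly n '+'-separated words."""
    return re.search(exact_word_count_pattern(n), term) is not None

print(exact_word_count_pattern(3))            # ^[^+]+(\+[^+]+){2}$
print(has_word_count("hot+water+bottle", 3))  # True
print(has_word_count("storage+box", 3))       # False
```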

extract number from string in Oracle

I am trying to extract a specific text from an Outlook subject line. This is required to calculate turn around time for each order entered in SAP. I have a subject line as below
SO# 3032641559 FW: Attached new PO 4500958640- 13563 TYCO LJ
My final output should be like this: 3032641559
I have been able to do this in MS excel with the formulas like this
=IFERROR(INT(MID([#[Normalized_Subject]],SEARCH(30,[#[Normalized_Subject]]),10)),"Not Found")
In the above formula, [#[Normalized_Subject]] is the name of the column in which the SO number exists. I have been asked to do this in Oracle, but I am very new to it. Your help on this would be greatly appreciated.
Note: in the above subject line the number 30 is common in every subject line.
The last parameter of REGEXP_SUBSTR() indicates the sub-expression you want to pick. In this case you can't just match 30 then some more numbers as the second set of digits might have a 30. So, it's safer to match the following, where x are more digits.
SO# 30xxxxxx
As a regular expression this becomes:
SO#\s30\d+
where \s indicates a space, \d indicates a numeric character, and the + means you want to match as many digits as there are. To return only the number, we can use the sub-expression option; for that you need sub-expressions, i.e. groups created where you want to split the string:
(SO#\s)(30\d+)
Put this in the function call and you have it:
regexp_substr(str, '(SO#\s)(30\d+)', 1, 1, 'i', 2)
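For comparison, the same grouping logic can be sketched in Python, where the capture group plays the role of the sub-expression argument. The function name `extract_so_number` is hypothetical; only the group holding the number is captured, since the `SO#` prefix is needed for matching but not for output.

```python
import re

def extract_so_number(subject):
    """Return the digits following 'SO# ' that start with 30, or None if absent."""
    # IGNORECASE mirrors the 'i' match parameter in the REGEXP_SUBSTR call.
    match = re.search(r"SO#\s(30\d+)", subject, flags=re.IGNORECASE)
    return match.group(1) if match else None

subject = "SO# 3032641559 FW: Attached new PO 4500958640- 13563 TYCO LJ"
print(extract_so_number(subject))  # 3032641559
```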