Can I insert variables in Regular Expression? [closed] - regex

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I want to use a regex to extract specific information from a text. Here is an example in semi-pseudocode (feel free to reply in semi-pseudocode too):
list = ["orange", "green", "grey"]
text = "The Orange is orange"
for word in list:
    if word == re.compile(r'word, text):
        capture Orange in order to have the noun
Beware! My question is whether variables (such as word above) can be used inside a regex, so that I can loop over a list and check whether any of its words appear in a text.
Do not focus on how to capture the Orange.

I think Biffen has the right idea: you're in a world of pain if you're using this for POS tagging. Anyway, this lets you check whether each word occurs in your text variable (the check is case-sensitive):
for word in list:
    if word in text:
        # Do what you want with word
If you want to use a regex, you can build the pattern from strings and use parentheses to capture part of the match. Then call group() on the match object to access what was captured:
import re

for word in list:
    # re.escape keeps any regex metacharacters in word literal
    pattern = re.compile(".*(" + re.escape(word) + ").*")
    m = pattern.match(text)
    if m:
        print(m.group(1))
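To make the idea of a variable inside a pattern concrete, here is a minimal sketch. The helper name find_words, the word boundaries, and the case-insensitive flag are assumptions added for illustration; they are not part of the question or the answer above.

```python
import re

def find_words(words, text):
    """Return every occurrence in text of any word in words, ignoring case."""
    found = []
    for word in words:
        # Build the pattern from the loop variable; re.escape keeps it literal,
        # and \b restricts the match to whole words.
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        found.extend(m.group(0) for m in pattern.finditer(text))
    return found

colours = ["orange", "green", "grey"]
text = "The Orange is orange"
matches = find_words(colours, text)
print(matches)
```

Because matching ignores case, both the capitalised and the lowercase occurrence are returned, and the caller can decide which one is the noun.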

how to add whitespace before certain characters or words in google sheet [closed]

I have a Google Sheet with a long text string in each cell.
The following is one of the text strings:
VACANCY...Test Hotels is hiring for the below position.#Job_Title : Director of a Revenue/ Revenue Manager#Hotel_Name : Signature Hotel#Job_Location : Dubai#Nationality : Selective#Experience : Mandatory hotel experience#Salary_Range : Unspecified#Benefits : Unspecified- Candidate should be currently in UAE and has relevant UAE Hotel experience.Please specify “Applying Position” in the subject line.Email CV: test#test.com#jobseekers #vacancy #Dubai #jobs #recruiters #hotels #manager #revenue
I want to add whitespace before certain words or characters so that it looks neat. For example, I want to add whitespace before "#", "Salary", " job location" etc.
How can I do that?
Note that any regex metacharacter in the search terms has to be escaped:
=REGEXREPLACE(A1,"(#|Salary|job location)"," $1")
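The same substitution can be sketched outside Sheets. The following Python version is only an illustration: the helper name add_space_before and the shortened sample string are assumptions, and re.escape takes care of the metacharacter escaping mentioned above.

```python
import re

def add_space_before(text, terms):
    """Insert one space in front of every occurrence of any term."""
    # Escape each term so characters such as . or ( are matched literally.
    alternation = "|".join(re.escape(t) for t in terms)
    return re.sub("(" + alternation + ")", r" \1", text)

sample = "Dubai#Salary_Range : Unspecified#Benefits : Unspecified"
spaced = add_space_before(sample, ["#", "Salary"])
print(spaced)
```

Matching is case-sensitive here, so a term such as "job location" would not match "Job_Location" in the sample data.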

How to use regular expression in Hive to extract the second integer? [closed]

Data:
BUY 2 FOR 5(STORES)
BUY 2 FOR 10(STORES)
What I tried:
regexp_extract(DATA, '.*? (\\d+) .*$', 2)
Desired result:
5
10
Like this:
regexp_extract(DATA, '^[^0-9]+?\\d+[^0-9]+?(\\d+)', 1);
or
regexp_extract(DATA, '^\\D+?\\d+\\D+?(\\d+)', 1);
The regex means: one or more non-digits at the beginning, one or more digits, one or more non-digits, and finally a capturing group of digits. You need to extract group number one.
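For readers who want to try the pattern outside Hive, here is a small Python sketch of the same expression. The function name second_integer is an assumption for illustration; single backslashes are used because the pattern is a Python raw string rather than a Hive string literal.

```python
import re

# Non-digits, first number, non-digits, then capture the second number.
SECOND_INT = re.compile(r"^\D+\d+\D+(\d+)")

def second_integer(s):
    """Return the second integer in s, or None if the pattern does not match."""
    m = SECOND_INT.search(s)
    return int(m.group(1)) if m else None

rows = ["BUY 2 FOR 5(STORES)", "BUY 2 FOR 10(STORES)"]
results = [second_integer(r) for r in rows]
print(results)
```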
Another solution is to split the string on non-digits and take the element at index 2 (index 0 is the empty string before the leading non-digits):
select split(DATA, '[^0-9]+')[2];
Or even simpler:
select split(DATA, '\\D+')[2]; --\\D+ means one or more non-digits
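The split approach can be sketched in Python as well. This is an illustration only; the helper name second_number_by_split is hypothetical, and Python's re.split is not the same function as Hive's split, although both give the same element at index 2 for this data.

```python
import re

def second_number_by_split(s):
    """Split on runs of non-digits and return the element at index 2, if any."""
    parts = re.split(r"\D+", s)
    # For input that starts with non-digits, parts[0] is an empty string,
    # so the second number sits at index 2.
    return parts[2] if len(parts) > 2 else None

second = second_number_by_split("BUY 2 FOR 10(STORES)")
print(second)
```

The index relies on the string starting with a non-digit; a string that starts with a number shifts every element down by one.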

Newline \n Characters between Double Quotes Simple Regex [closed]

I have a single line string like so (trying to turn it into a properly formatted csv):
customer id,description,card country\nBZkvIP2FFfhA3s,"Customer\n10019\nUS\n55769 - example#email.co,",US\nBZiFuAQ6Bd7iNw,"EVV c/o Company\r\n47713\r\nUS\r\n55761 - email#example.com",US\n
I want to find a simple regex that replaces the \n characters inside the "description" (which is always between double quotes) with a space. Then I will do a second replace for the remaining \n characters (which will be at the end of each csv line). So my end result will be formatted like so:
customer id,description,card country
BZkvIP2FFfhA3s,"Customer 10019 US 55769 - example#email.co,",US
BZiFuAQ6Bd7iNw,"EVV c/o Company\r 47713\r US\r 55761 - email#example.com",US
I can't figure out how to do this simply. I don't need a regex that handles a million exceptions, just one that matches every \n between " and ".
This should work as the "search" term:
(".*?)(\\n)(.*?")
and for your "replace" you need:
$1 $3
https://regex101.com/r/djG74I/4
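A search pattern of this shape consumes a whole quoted field per match, so it would need repeated passes when a field contains several \n sequences. As an alternative sketch (not the answerer's method), the Python below rewrites every quoted field in one pass with a replacement function. The name flatten_quoted and the shortened sample line are assumptions, and the input is assumed to contain the two literal characters backslash and n, as in the question.

```python
import re

def flatten_quoted(line):
    """Replace each literal backslash-n inside double quotes with a space."""
    def fix(match):
        # match.group(0) is one complete quoted field, quotes included.
        return match.group(0).replace("\\n", " ")
    return re.sub(r'"[^"]*"', fix, line)

raw = r'id,description\nA1,"Customer\n10019\nUS",US\n'
flat = flatten_quoted(raw)
rows = flat.split("\\n")  # split on the remaining literal \n sequences
print(rows)
```

Fields that contain escaped double quotes are not handled, in line with the request for something simple.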

Regular expression - Get specific part of string [closed]

I have the following sentence:
b.g The big bag of bits was bugged.
How can I exclude the b.g from it by using a regular expression?
I am sure I need a negative lookahead but I cannot get it right yet.
Something like
^(?!b\.g)
I would do it this way:
[^\S].*
[^\S] is a double negative equivalent to \s: it matches the first whitespace character, and .* then captures the rest of the line, so the match begins with that space. There is no need for a negative or positive lookbehind in this case.
Demo: regex101
If you prefer to do it with positive Lookbehind, you can do it this way
(?<=b\.g).*
Demo: regex101
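Both patterns can be checked quickly in Python. This is a minimal sketch; the variable names are illustrative, and the third pattern, which also skips the space, is an addition that is not in the answer above.

```python
import re

sentence = "b.g The big bag of bits was bugged."

# [^\S] behaves like \s, so the match starts at the first space.
via_class = re.search(r"[^\S].*", sentence).group(0)

# The lookbehind asserts that "b.g" precedes the match without consuming it.
via_lookbehind = re.search(r"(?<=b\.g).*", sentence).group(0)

# Including \s in the lookbehind drops the leading space as well.
trimmed = re.search(r"(?<=b\.g\s).*", sentence).group(0)

print(repr(via_class), repr(via_lookbehind), repr(trimmed))
```

The first two results are identical and keep the leading space; only the third starts directly at "The".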
sed 's/^...//' strips the first 3 characters, "b.g", but I doubt that's what you're really asking. Your ^ anchor appears to be a red herring.
You already have correct escaping for . period, just stick with that:
sed 's/b\.g//'
Python's positive lookbehind (?<=...) may be what you are looking for. Note that the captured text keeps the space that follows b.g:
>>> import re
>>> m = re.search(r'(?<=b\.g)(.*)', 'b.g The big bag of bits was bugged.')
>>> print(m.group(1))
 The big bag of bits was bugged.
In Python you could do something like this:
import re
w = 'b.g The big bag of bits was bugged.'
print(w)
# Escape the dot so it matches a literal period, not any character
d = re.compile(r'^b\.g\s')
a = d.sub('', w)
print(a)

Autohotkey: Regex for getting street name in address string [closed]

65 Gregory Street
;Gregory
141-145 Dickson Road
;Dickson
6B Malvern Avenue
;Malvern
230A John Street
;John
I'm trying to extract just the street name from a string: skip the numbers (even ones with letters in them) and extract only the first word after them. What's the correct expression for this?
Skip the first group of non-space characters, get the next non-space group, skip the rest:
street := RegExReplace(address, "^\S+ (\S+).*$", "$1")
For multiline text, you can process all lines at once with the m and `a options:
streets := RegExReplace(addresses, "m`a)^\S+ (\S+).*$", "$1")
Use regex101.com to test the expressions online.
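For comparison, the same expression can be sketched in Python, where re.MULTILINE plays the role of the m option. The names STREET and streets are illustrative, and this is not AutoHotkey code.

```python
import re

# Skip the first run of non-space characters, capture the next run,
# and ignore the rest of the line.
STREET = re.compile(r"^\S+ (\S+).*$", re.MULTILINE)

addresses = """65 Gregory Street
141-145 Dickson Road
6B Malvern Avenue
230A John Street"""

streets = STREET.findall(addresses)
print(streets)
```

With a single capturing group, findall returns just the captured street names, one per matching line.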