Regex to find substring from a string

I have strings like:
str1 = eval(sum(feat(57),feat(57),feat(66))/feat(57));
str2 = eval(sum(feat(47),feat(55),feat(86)));
str3 = eval(feat(47)/sum(feat(51),feat(52),feat(53)));
str4 = eval(feat(63)/sum(feat(57):feat(66)));
I want to write a regex that extracts the argument list of sum from each string:
str1_output = (feat(57),feat(57),feat(66))
str2_output = (feat(47),feat(55),feat(86))
str3_output = (feat(51),feat(52),feat(53))
str4_output = (feat(57):feat(66))
I tried in the following way:
output = re.findall(re.compile(r"sum.*"),str_name)
This gives the correct output for every string except str1.
Please suggest a way to get the desired output.

I guess you could try
sum\((?:\([^()]*\)|.)*?\)
It matches sum followed by its matching pair of parentheses and everything between them, allowing one level of nested parentheses inside.
Example at regex101.
Regards.
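As a rough sketch in Python (the language of the question's own attempt), the same idea can be wrapped in a capturing group so that only the parenthesised argument list comes back, without the leading sum. The names SUM_ARGS and sum_arguments are illustrative, and the pattern assumes the arguments nest at most one level deep, as in feat(57).

```python
import re

# Capture the parenthesised argument list that follows "sum".
# Assumption: arguments contain at most one level of nested parentheses.
SUM_ARGS = re.compile(r"sum(\((?:\([^()]*\)|[^()])*\))")


def sum_arguments(expr):
    """Return the argument list of the first sum(...) call, or None if absent."""
    match = SUM_ARGS.search(expr)
    return match.group(1) if match else None


if __name__ == "__main__":
    for expr in (
        "eval(sum(feat(57),feat(57),feat(66))/feat(57));",
        "eval(feat(63)/sum(feat(57):feat(66)));",
    ):
        print(sum_arguments(expr))
```

Using [^()] instead of . for the non-nested characters lets the quantifier be greedy, because it can never run past the closing parenthesis of sum.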

Related

Finding a string in a phrase

I am using regexpi to find a string in a phrase, but I ran into behavior I did not intend.
Let's say the words I need to find are anandalak and nandaki.
str1 = {'anandalak'};
str2 = {'nanda'};
button = {'nanda'};
Both of the following return me logical 1:
~cellfun('isempty',regexpi(str1,button))
~cellfun('isempty',regexpi(str2,button))
How can I avoid this? I need logical 0 in first case and logical 1 in the second.
You probably need to use the word-boundary anchors (\< and \>) to get the match you require.
You may try:
str1 = {'anandalak'}
str2 = {'nanda'}
button = {'\<nanda\>'} % Notice this
~cellfun(@isempty,regexpi(str1,button)) % Returns ans = 0 (no match)
~cellfun(@isempty,regexpi(str2,button)) % Returns ans = 1 (exact match)
You can find a sample run of the above implementation here.
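For comparison, here is a minimal Python sketch of the same whole-word test. Python's re module spells the word boundary as \b rather than MATLAB's \< and \>; the helper name matches_whole_word is made up for illustration.

```python
import re


def matches_whole_word(text, word):
    """Return True if `word` occurs in `text` as a whole word, ignoring case."""
    # re.escape guards against regex metacharacters in the search word.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    print(matches_whole_word("anandalak", "nanda"))  # substring only, so no match
    print(matches_whole_word("nanda", "nanda"))      # exact word, so a match
```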

RegularExpression get strings between new lines

I want to extract every string that sits on its own line, using a regular expression:
string someStr = "first
second
third
"
Expected result:
string str1 = "first";
string str2 = "second";
string str3 = "third";
Or, if you just want the first word of each line:
^(\w+).*$ with the multi-line flag.
Regex101 has a nice regex testing tool: https://regex101.com/r/JF3cKR/1
Just split it on "\n":
someStr.split("\n")
You can filter out the empty strings if you'd like.
Or, if you really want a regex, use ^.*$ with the multiline flag:
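import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> listOfLines = new ArrayList<String>();
// MULTILINE makes ^ and $ match at the start and end of every line
Pattern pattern = Pattern.compile("^.*$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher("first\nsecond\nthird\n");
while (matcher.find()) {
    listOfLines.add(matcher.group());
}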
Then you have:
listOfLines.get(0) = first
listOfLines.get(1) = second
listOfLines.get(2) = third
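The three approaches in this thread (plain split, a per-line regex, and first word per line) can be compared side by side in a short Python sketch. The variable names are illustrative. One difference worth hedging: in Python's multiline mode ^ also matches after a trailing newline, so the sketch uses .+ instead of .* to avoid collecting an empty final match.

```python
import re

text = "first\nsecond\nthird\n"

# Plain split: splitlines() drops the line terminators and adds no trailing empty entry.
lines_split = text.splitlines()

# Regex version: with re.MULTILINE, ^ and $ anchor at every line; .+ skips empty lines.
lines_regex = re.findall(r"^.+$", text, flags=re.MULTILINE)

# First word of each line only: findall returns the contents of the single group.
first_words = re.findall(r"^(\w+).*$", text, flags=re.MULTILINE)

if __name__ == "__main__":
    print(lines_split)
    print(lines_regex)
    print(first_words)
```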
You can use the following regex:
(\w+)(?=\n|"|$)
see demo
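A small Python sketch of how this lookahead pattern behaves; the names are illustrative. The lookahead requires the word to be followed directly by a newline, a double quote, or the end of the input, so on a line with several words only the last one is captured.

```python
import re

# A word that is immediately followed by a newline, a double quote, or the end of input.
WORD_BEFORE_BREAK = re.compile(r'(\w+)(?=\n|"|$)')


def words_before_breaks(text):
    """Return every word that sits directly before a line break, quote, or end of text."""
    return WORD_BEFORE_BREAK.findall(text)


if __name__ == "__main__":
    print(words_before_breaks("first\nsecond\nthird\n"))
    print(words_before_breaks("hello world\nfoo"))
```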

Regex - Private subtags RFC5646

Can someone please help me with a regex to pull the private-use subtags out of an RFC 5646 language tag?
Example strings
en-us-x-test-test1 = test,test1
en-gb-x-test-test2 = test,test2
fr-x-test-test3 = test,test3
I'm using QRegExp.
Thanks for any assistance
You don't need a regex here. Split your input on -, then take the last two strings and add a comma in between:
QString str = "en-us-x-test-test1";
QStringList list = str.split('-');
QString output = list.at(list.count()-2) + "," + list.at(list.count()-1);
Of course, you have to check the list length first to avoid an out-of-range index.
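A slightly more general variant of the split approach, sketched in Python: instead of assuming exactly two private subtags, it returns everything after the x singleton, which is what introduces the private-use part of an RFC 5646 tag. The function name private_subtags is hypothetical, and the sketch does no validation of the rest of the tag.

```python
def private_subtags(tag):
    """Return the subtags after the private-use singleton 'x', or [] if there is none."""
    parts = tag.split("-")
    for index, part in enumerate(parts):
        # Language tags are case-insensitive, so compare in lower case.
        if part.lower() == "x":
            return parts[index + 1:]
    return []


if __name__ == "__main__":
    print(",".join(private_subtags("en-us-x-test-test1")))
    print(",".join(private_subtags("fr-x-test-test3")))
```

Returning an empty list when no x is present also covers the length check mentioned above, since nothing is indexed directly.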

Shortcut to get a statement with certain pattern in R

I have to write the following as it is.
('trial1' = Ozone1, 'trial2' = Ozone2, trial3 = Ozone3,...........trial1000 = Ozone1000)
I want to write this with one command in R. How do I do it?
I tried it using paste0
Let us take only 5 as number of repetitions:
paste0("trial",1:5,"= Ozone", 1:5)
I get this as result.
"trial1= Ozone1" "trial2= Ozone2" "trial3= Ozone3" "trial4= Ozone4" "trial5= Ozone5"
But that is not what I want. I want the output to appear exactly like this, without the surrounding double quotes:
('trial1' = Ozone1, 'trial2' = Ozone2, 'trial3' = Ozone3, 'trial4' = Ozone4, 'trial5' = Ozone5)
As you can see, it should not be printed as a quoted string, i.e. not as "........".
How do I do it?
This will generate the string you want:
paste0('(', paste0("'trial", 1:1000, "' = Ozone", 1:1000, collapse = ', '), ')')
This will print the string without quotes:
print(paste0('(', paste0("'trial", 1:10, "' = Ozone", 1:10, collapse = ', '), ')'), quote = FALSE)
I hope that answers your question.
Use the collapse argument of paste0. The single quotes do not need escaping inside a double-quoted string, although \' is also accepted:
paste0("(", paste0("\'trial",1:5,"\' = Ozone",1:5, collapse=", "), ")")
[1] "('trial1' = Ozone1, 'trial2' = Ozone2, 'trial3' = Ozone3, 'trial4' = Ozone4, 'trial5' = Ozone5)"
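For readers more familiar with Python, the same build-then-collapse idea can be sketched as below. This is only an analogue of the R answers, not a replacement for them; the function name trial_mapping is made up for illustration.

```python
def trial_mapping(count):
    """Build "('trial1' = Ozone1, ..., 'trialN' = OzoneN)" for N = count."""
    # Format each pair first, then join them, which plays the role of collapse in paste0.
    pairs = ", ".join(f"'trial{i}' = Ozone{i}" for i in range(1, count + 1))
    return f"({pairs})"


if __name__ == "__main__":
    # print shows the text itself, without surrounding quotes.
    print(trial_mapping(5))
```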

Recursive tricks with regexp in Matlab

I tried to use regexprep to solve a problem. I'm given a string that represents a function; it contains patterns like 'sin(arcsin(f))', where f is any substring, and I need to replace each one with just the inner substring (f_2 in the example below). regexprep worked until I ran into a string like this:
str = 'sin(arcsin(sin(arcsin(f_2))))*x^2';
str = regexprep(str, 'sin\(arcsin\((\w*)\)\)','$1');
it returns
str =
sin(arcsin(f_2))*x^2
But I want it to be
str =
f_2*x^2
Is there any way to solve this, other than the obvious solution with a loop?
I was not able to test this, but I think I found an expression that you can call multiple times to do what you asked for; each time it will "strip" one sin(arcsin()) pair out of your equation. Once it stops changing, you're done.
(.*)sin\(arcsin\((.*?)\)\)(.*)$
Here is some Matlab code that shows how this might work:
str = 'sin(arcsin(sin(arcsin(f_2))))*x^2';
regex = '(.*)sin\(arcsin\((.*?)\)\)(.*)$';
oldlength = 0;
newlength = length(str);
% Each pass removes one sin(arcsin( ... )) pair; stop once the length is stable
while newlength ~= oldlength
    oldlength = newlength;
    str = regexprep(str, regex, '$1$2$3');
    newlength = length(str);
end
As I said - I could not test this. Let me know if you have any problems with this.
Demo of the regular expression:
http://regex101.com/r/bR9gC7
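The "apply until nothing changes" idea can be sketched in Python as follows. This sketch swaps in the asker's original, simpler pattern rather than the expression above, on the assumption that the innermost argument is a plain word such as f_2; the names PAIR and strip_sin_arcsin are illustrative.

```python
import re

# One sin(arcsin(<word>)) pair whose argument is a plain identifier.
PAIR = re.compile(r"sin\(arcsin\((\w*)\)\)")


def strip_sin_arcsin(expr):
    """Repeatedly remove innermost sin(arcsin(...)) pairs until a fixed point is reached."""
    while True:
        reduced = PAIR.sub(r"\1", expr)
        if reduced == expr:
            # No pair was removed in this pass, so the expression is fully reduced.
            return expr
        expr = reduced


if __name__ == "__main__":
    print(strip_sin_arcsin("sin(arcsin(sin(arcsin(f_2))))*x^2"))
```

Comparing the strings directly, rather than their lengths, keeps the stopping condition independent of what the replacement happens to produce.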
Change your pattern to search for 1 or more (+) nested sin(arcsin( occurrences:
str = 'sin(arcsin(sin(arcsin(f_2))))*x^2';
str2 = regexprep(str, '(sin\(arcsin\()+(\w*)(\)\))+','$2')
str2 =
f_2*x^2
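For reference, a Python sketch of the same single-pass pattern, using non-capturing groups so the inner word is group 1. The name collapse_nested is made up. One caveat: the pattern does not check that the number of opening sin(arcsin( prefixes equals the number of closing )) suffixes, so it assumes the input is already balanced.

```python
import re

# One or more nested sin(arcsin( prefixes, a word, then one or more )) suffixes.
NESTED = re.compile(r"(?:sin\(arcsin\()+(\w*)(?:\)\))+")


def collapse_nested(expr):
    """Replace each run of nested sin(arcsin(...)) calls with its innermost word."""
    return NESTED.sub(r"\1", expr)


if __name__ == "__main__":
    print(collapse_nested("sin(arcsin(sin(arcsin(f_2))))*x^2"))
```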