Delete multiple characters between separators with regular expressions


I have a string like "some_{abcd_etc}_text".
Everything between { } should be removed, including the braces themselves.
I need only the string "some_text" at the end.
How can this be done with a regex?

You could use this expression:
{.*?}

Sure, just replace this with an empty string:
{[^}]+}
Here is a Python example:
>>> from re import sub
>>> s = r'some_{abcd_etc}_text'
>>> sub(r'{[^}]+}', '', s)
'some__text'
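Note that both suggested patterns leave a double underscore behind, while the question asks for "some_text". The following is a minimal sketch, not taken from the answers above, that compares the two patterns and shows one possible way to also drop a single underscore in front of the braces. The variable names and the `_?` prefix are assumptions made for illustration.

```python
import re

s = 'some_{abcd_etc}_text'

# Lazy match: the shortest run of characters between { and }.
lazy = re.sub(r'\{.*?\}', '', s)

# Negated character class: anything that is not a closing brace.
negated = re.sub(r'\{[^}]+\}', '', s)

# Assumed variant: also consume one underscore directly before the
# opening brace, so that no double underscore is left behind.
collapsed = re.sub(r'_?\{[^}]*\}', '', s)

print(lazy)       # some__text
print(negated)    # some__text
print(collapsed)  # some_text
```

Whether the extra underscore should be removed depends on the real data; if separators other than "_" can occur, the prefix would need to be adjusted.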

Related

Regex to not match a specific string, but with additional check

So for example I have this string
var = 'column1;column2;column3\r\nval1;val2;val3\r\n;val4;val5;val6\r\n'
I want to find every \r\n and replace it with temp\r\n, except the one that follows column3.
I tried ^(?!.*column3).*$\r\n, but the \r\n part does not work.

You want to use a negative lookbehind, that is, make the substitution only when \r\n is not preceded by column3:
re.sub(r'(?<!column3)\r\n', r'temp\r\n', var)
For example:
>>> import re
>>>
>>> var = 'column1;column2;column3\r\nval1;val2;val3\r\n;val4;val5;val6\r\n'
>>> new_text = re.sub(r'(?<!column3)\r\n', r'temp\r\n', var)
>>> new_text
'column1;column2;column3\r\nval1;val2;val3temp\r\n;val4;val5;val6temp\r\n'
>>>
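If the line to skip is always the first (header) line, a plain string split may be enough and avoids hard-coding the name column3. This is a sketch of that alternative, not the lookbehind technique shown above; it assumes the header is everything up to the first \r\n.

```python
var = 'column1;column2;column3\r\nval1;val2;val3\r\n;val4;val5;val6\r\n'

# Split once at the first line break: header, the separator itself, the rest.
header, sep, body = var.partition('\r\n')

# Only the data rows get the "temp" suffix; the header is left untouched.
new_text = header + sep + body.replace('\r\n', 'temp\r\n')

print(repr(new_text))
```

For this input the result is the same string that the lookbehind version produces.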

Comparing strings with regex

I basically want to match strings like "something", "some,thing" and "some,one,thing", but not strings like ',thing', '_thing,' or 'some_thing'.
The pattern I want to match is: a string that begins with a letter, followed by any number of letters, commas or spaces.
Here's what I did:
import re
x=re.compile('^[a-zA-z][a-zA-z, ]*') #there's space in the 2nd expression here
stri='some_thing'
x.match(stri)
It gives me:
<_sre.SRE_Match object; span=(0, 4), match='some'>
My regex somewhat works, but it extracts the part of the string that matches. I want to compare the entire string against the pattern and get False if it does not match. How do I do this?

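You use [A-z], which matches more than you think: besides the letters, that range also covers the characters [, \, ], ^, _ and `.
If you want to match letters of both cases, you can use [a-z] with the case-insensitive flag. Adding $ at the end makes the pattern match the entire string: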
import re
x = re.compile('^[a-z][a-z, ]*$', re.IGNORECASE)
stri = 'some,thing'
if x.match(stri):
    print("Match")
else:
    print("No match")

The easiest way would be to compare the matched text to the original string. Note that match() returns None when the string does not start with a letter, so check for that before calling group().
import re
x = re.compile('^[a-zA-Z][a-zA-Z, ]*')
s = 'some_thing'
x.match(s).group(0) == s  # -> False
s = 'some thing'
x.match(s).group(0) == s  # -> True
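On Python 3.4 and later, re.fullmatch does the whole-string comparison directly and returns None when any part of the string fails to match. The sketch below assumes the corrected character class [a-zA-Z]; the helper name is_valid is made up for illustration.

```python
import re

# A letter first, then any number of letters, commas or spaces.
pattern = re.compile(r'[a-zA-Z][a-zA-Z, ]*')

def is_valid(text):
    """Return True only if the entire string matches the pattern."""
    return pattern.fullmatch(text) is not None

samples = ['something', 'some,thing', 'some,one,thing',
           ',thing', '_thing,', 'some_thing']
results = {text: is_valid(text) for text in samples}

for text, ok in results.items():
    print(repr(text), ok)
```

The first three samples come out as True and the last three as False, which is the behaviour asked for in the question.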

Finding out unknown matched words

I have a regex pattern:
import regex as re
re.sub(r'(.*)\bHello (.*) BGC$\b', "OTR", 'Hello People BGC')
This replaces the whole string with OTR, but how do I find out which characters the (.*) groups matched?
Using regex==2016.1.10, Python 3.5.1

Compile the pattern and then call match() and sub() separately:
>>> pattern = re.compile(r'^Hello (.*?) BGC$')
>>> s = 'Hello People BGC'
>>> pattern.match(s).group(1)
'People'
>>> pattern.sub("OTR", s)
'OTR'
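If the captured text is needed at the moment of substitution, another option is to pass a function as the replacement; it receives each match object and can record the groups before returning the new text. This is a sketch using the standard re module rather than the regex package from the question, and the names captured and replace are made up for illustration.

```python
import re

pattern = re.compile(r'^Hello (.*?) BGC$')
captured = []

def replace(match):
    # Remember what group 1 matched, then return the replacement text.
    captured.append(match.group(1))
    return 'OTR'

result = pattern.sub(replace, 'Hello People BGC')

print(result)    # OTR
print(captured)  # ['People']
```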

Also get the string before a numeric match using regular expressions in Python

I have a string from which I need to extract "PASS_MAX_DAYS 180" with re and then replace it with some other value using re.sub, but when I run the search I do not get the whole string:
>>> _file = '#\nPASS_MAX_DAYS 180\nPASS_MIN_DAYS 1\nPASS_WARN_AGE 8\n'
>>> re.findall(r'PASS_MAX_DAYS\s*\b([0-9]{1,2}|1[0-7][0-9]|180)\b', _file, re.M)
['180']
I am not sure where I am going wrong. Any suggestions, please?

Turn the capturing group into a non-capturing group, because re.findall returns only the text matched by the groups if the pattern contains any capturing groups.
r"PASS_MAX_DAYS\s*\b(?:[0-9]{1,2}|1[0-7][0-9]|180)\b"
Example:
>>> _file = '#\nPASS_MAX_DAYS 180\nPASS_MIN_DAYS 1\nPASS_WARN_AGE 8\n'
>>> re.findall(r'PASS_MAX_DAYS\s*\b(?:[0-9]{1,2}|1[0-7][0-9]|180)\b', _file, re.M)
['PASS_MAX_DAYS 180']
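For the replacement step mentioned in the question, a capturing group can keep the key while only the number is swapped. The sketch below is an assumption about what the asker wants: it accepts any run of digits rather than only 0 to 180, and the new value 90 is a made-up example.

```python
import re

_file = '#\nPASS_MAX_DAYS 180\nPASS_MIN_DAYS 1\nPASS_WARN_AGE 8\n'

# Group 1 keeps the key and the whitespace after it; only the digits change.
# \g<1> is used instead of \1 so the group reference is not read as \190.
new_file = re.sub(r'^(PASS_MAX_DAYS[ \t]+)\d+$', r'\g<1>90', _file, flags=re.M)

print(new_file)
```

With re.M, ^ and $ match at the start and end of each line, so the other settings in the text are left alone.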

Regex for matching this string

With Python (regex module), I am trying to substitute 'x' for each letter 'c' in those substrings of a text that are:
delimited by 'a', at the left, and 'b' at the right, and
with no more 'a's and 'b's in them.
Example:
cuacducucibcl -> cuaxduxuxibcl
How can I do this?
Thank you.

With the standard re module in Python, you can use a[^ab]+b to match a substring that starts with a, ends with b and has no occurrence of a or b in between, then supply a replacement function that takes care of replacing c:
>>> import re
>>> re.sub('a[^ab]+b', lambda m: m.group(0).replace('c', 'x'), 'cuacducucibcl')
'cuaxduxuxibcl'
See the documentation of re.sub for reference.
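To see how the same approach behaves on inputs with several delimited segments or with an unmatched 'a', here is a small sketch that wraps it in a function. The function name and the extra test strings are made up for illustration.

```python
import re

# An 'a', then one or more characters that are neither 'a' nor 'b', then a 'b'.
SEGMENT = re.compile(r'a[^ab]+b')

def replace_c_in_segments(text):
    """Replace every 'c' with 'x', but only inside a...b segments."""
    return SEGMENT.sub(lambda m: m.group(0).replace('c', 'x'), text)

print(replace_c_in_segments('cuacducucibcl'))  # cuaxduxuxibcl
print(replace_c_in_segments('cacbcaccbc'))     # caxbcaxxbc
print(replace_c_in_segments('acacb'))          # acaxb
```

In the last call the first 'a' is followed by another 'a' before any 'b', so only the segment that starts at the second 'a' is rewritten.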

Use the regex below to replace the matched c's with x. For this, you need to install the external regex module.
>>> import regex
>>> s = 'cuacducucibcl'
>>> regex.sub(r'((?:a|(?<!^)\G)[^abc\n]*)c', r'\1x', s)
'cuaxduxuxibcl'
