What does ' \ ' followed by a non-escape character do? [duplicate] - regex

I came across someone doing: grep -v "\#". Could someone quickly explain what happens when you precede a character that is not an escape character (like '#') with a backslash?

\# is a regular expression which matches the literal character #. In fact, the backslash is not needed, since the regular expression # is simpler and serves the same purpose.
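To check that claim quickly, here is a minimal Python sketch that mimics grep -v with the re module. It assumes Python's regex engine rather than grep's own, and the grep_v helper and the sample lines are made up for illustration.

```python
import re

# Hypothetical sample input, standing in for the lines of a file.
lines = ["# a comment", "value = 1", "x = 2  # trailing comment"]

def grep_v(pattern, lines):
    """Return the lines that do NOT match pattern, like grep -v."""
    regex = re.compile(pattern)
    return [line for line in lines if not regex.search(line)]

with_backslash = grep_v(r"\#", lines)    # escaped '#'
without_backslash = grep_v(r"#", lines)  # plain '#'

print(with_backslash)     # ['value = 1']
print(without_backslash)  # ['value = 1']
```

Both patterns drop every line containing a #, so the backslash makes no difference here.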

Related

How to exclude a substring in a regular expression? [duplicate]

There is a line of text:
Lorem ~Ipsum~ is simply ~dummy~ text ~of~ the printing...
To find the words enclosed in ~ ... ~ I use
re.search(r'~([^~]*)~', text)
Now suppose the delimiter has to be ~~ instead of ~.
([^\~]+) excludes the single character ~ from the text between the delimiters.
How do I write a regular expression that excludes a string of characters instead of just one?
That is, ~~Lor~em~~ should return Lor~em
The newline character must not be excluded, and the matched string cannot be empty.
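Use a non-greedy quantifier instead of a negated character set.
re.search(r'~~(.+?)~~', text, flags=re.DOTALL)
re.DOTALL makes . match newline characters. Using +? rather than *? requires at least one character between the delimiters, so the match cannot be empty.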
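Here is a small sketch of the non-greedy approach applied to several matches at once. It uses re.findall instead of re.search to collect every match, and the sample text is made up for illustration.

```python
import re

# Hypothetical sample text: one match contains a single '~',
# another spans a newline.
text = "~~Lor~em~~ Ipsum ~~multi\nline~~ text"

# .+? expands one character at a time until the next '~~' is found;
# re.DOTALL lets '.' cross the newline.
matches = re.findall(r"~~(.+?)~~", text, flags=re.DOTALL)
print(matches)  # ['Lor~em', 'multi\nline']

# With +? an empty pair of delimiters does not match.
empty = re.search(r"~~(.+?)~~", "~~~~", flags=re.DOTALL)
print(empty)  # None
```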

Understand the XPath expression [duplicate]

I want to understand the below XPath expression
fn:replace ($var,concat('^.*',fn:replace('.','(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))','\\$1')),'')
I understand
\ denotes an escape character
| denotes OR
. denotes any single character
But I am confused with the use of \$1 here. What is it referring to?
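$1 is a back-reference to the first capture group, that is, to whichever meta-character the parenthesized alternation matched. In the replacement string, \\ is an escaped literal backslash, so '\\$1' means "a backslash followed by the matched character".
The goal of the inner fn:replace is to replace any regex meta-character (., [, ], and so on) with an escaped version of that character. Here it turns '.' into '\.', so the outer call becomes fn:replace($var, '^.*\.', ''), which removes everything up to and including the last . in $var.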
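The same idea can be sketched in Python, where the back-reference is written \1 instead of $1. This is only an illustration of the technique, not a translation of XPath semantics; the function names escape_meta and strip_through_last are hypothetical.

```python
import re

# Same alternation of meta-characters as in the XPath expression.
META = r"(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))"

def escape_meta(s):
    # r"\\\1" is a literal backslash followed by group 1
    # (the XPath replacement string spells this '\\$1').
    return re.sub(META, r"\\\1", s)

def strip_through_last(var, delimiter):
    # Build '^.*' + escaped delimiter, then delete what it matches.
    return re.sub("^.*" + escape_meta(delimiter), "", var)

print(escape_meta("."))                          # \.
print(strip_through_last("archive.tar.gz", "."))  # gz
```

Because .* is greedy, the pattern consumes everything up to the last occurrence of the delimiter.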

Removing escaped unicode sequence in a text file [duplicate]

I have a text file with lots of Unicode escape sequences (for emoji, by the way), for instance
blablabla \uD83D\uDC4D\uD83C blablabla \uDFFC\uD83D\uDC4F\uD83C\uDFFD
I'd like to remove it all, and get
blablabla blablabla
Is there a regular expression that would clean these up, considering that I use Notepad++?
Thanks.
I would suggest: \\u[0-9A-F]{4}\s?
\\u escapes the backslash so that it is matched literally, followed by the literal u. [0-9A-F]{4} matches exactly four uppercase hexadecimal digits. If the text also contains two-digit sequences, you could extend it to: \\u([0-9A-F]{4}|[0-9A-F]{2})\s?
The \s? matches zero or one whitespace character, so you don't end up with multiple consecutive whitespace characters.
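To see what the suggested pattern does to the sample line, here is a minimal sketch using Python's re.sub. It assumes Python's regex engine rather than Notepad++'s, so treat it as a check of the pattern itself, not of the editor's behaviour.

```python
import re

# Raw string, so the \u sequences stay as literal backslash-u text.
text = r"blablabla \uD83D\uDC4D\uD83C blablabla \uDFFC\uD83D\uDC4F\uD83C\uDFFD"

# Remove each escape sequence plus at most one whitespace character after it.
cleaned = re.sub(r"\\u[0-9A-F]{4}\s?", "", text)

print(cleaned.strip())  # blablabla blablabla
```

The space after the second blablabla is not preceded by an escape sequence, so it survives as trailing whitespace; strip() removes it here.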

Sed syntax to extract all of text BEFORE last delimiter? [duplicate]

I am trying to get the syntax right so that I can make scanning-client-container-0.2.tar look like scanning-client-container
I am using the delimiter " - " like so:
sed -e 's/-[^*]*$//'
with the result scanning, which is cut off too early
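Put the delimiter itself in the negated character class. [^*] matches any character except *, including -, so your match starts at the first - in the string: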
sed 's/-[^-]*$//' <<< 'scanning-client-container-0.2.tar'
scanning-client-container
RegEx Details:
-: Match a -
[^-]*: Match 0 or more characters that are not -
$: Match end
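The difference between the two character classes can be reproduced with a short Python sketch. It assumes Python's re.sub behaves like sed's s/// for these simple patterns, and the variable names are made up for illustration.

```python
import re

name = "scanning-client-container-0.2.tar"

# [^*] allows '-', so the match begins at the first '-'.
too_early = re.sub(r"-[^*]*$", "", name)

# [^-] forbids '-', so only the last '-' and what follows can match.
fixed = re.sub(r"-[^-]*$", "", name)

print(too_early)  # scanning
print(fixed)      # scanning-client-container
```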

Add spacebar in regular expression [duplicate]

I use this regular expression for checking
public const string FullNameRegularExpression = @"^[a-zA-Z0-9._-]+$";
How to add "spacebar" in?
If you want to match a single space, use a literal space character (" "); a very complete example can be found in this reference.
Or, if you want to match any whitespace character (\n, \r, \f, \t, or a space), you can use \s.
Notice the added \s, with the hyphen kept as the last character of the class so that it is not read as a range:
public const string FullNameRegularExpression = @"^[a-zA-Z0-9._\s-]+$";
You may push a spacebar on your keyboard or add \s or \s+ or \s* to your regex ;-)
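For comparison, here is a minimal Python sketch of the two options: a literal space versus \s. The question is about C#, so this only illustrates the character classes themselves; the names SPACE_ONLY and ANY_WHITESPACE are hypothetical.

```python
import re

# Literal space only; the hyphen stays last so it is taken literally.
SPACE_ONLY = re.compile(r"^[a-zA-Z0-9._ -]+$")

# Any whitespace character (space, tab, newline, ...).
ANY_WHITESPACE = re.compile(r"^[a-zA-Z0-9._\s-]+$")

print(bool(SPACE_ONLY.match("John Smith")))       # True
print(bool(SPACE_ONLY.match("John\tSmith")))      # False
print(bool(ANY_WHITESPACE.match("John\tSmith")))  # True
print(bool(ANY_WHITESPACE.match("John,Smith")))   # False
```

A literal space is the stricter choice if tabs and newlines should be rejected in a full name.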