Regular Expression to remove all numbers and all dots - regex

I have this code in VB.NET :
MessageBox.Show(Regex.Replace("Example 4.12.0.12", "\d", ""))
It removes the numbers.
I also want to remove the dots,
so I tried
MessageBox.Show(Regex.Replace("Example 4.12.0.12", "\d\.", ""))
but it keeps some of the numbers.
How can I remove both (numbers and dots) from the string?
Thanks.

Try using a character group:
MessageBox.Show(Regex.Replace("Example 4.12.0.12", "[\d\.]", ""))
I'll elaborate since I inadvertently posted essentially the same answer as Steven.
Given the input "Example 4.12.0.12"
"\d" matches digits, so the replacement gives "Example ..."
"\d\." matches a digit followed by a dot, so the replacement gives "Example 112"
"[\d.]" matches anything that is a digit or a dot. As Steven said, it's not necessary to escape the dot inside the character group.

You need to create a character group using square brackets, like this:
MessageBox.Show(Regex.Replace("Example 4.12.0.12", "[\d.]", ""))
A character group means that any one of the characters listed in the group is considered a valid match. Notice that, within the character group, you don't need to escape the . character.
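The three patterns discussed in this thread can be compared side by side. The sketch below uses Python's re module rather than .NET's Regex class (an assumption made purely for illustration; for this input the patterns behave the same way in both engines).

```python
import re

text = "Example 4.12.0.12"

# Digits only: the three dots are left behind.
digits_only = re.sub(r"\d", "", text)

# A digit immediately followed by a dot: "4.", "2." and "0." are removed.
digit_then_dot = re.sub(r"\d\.", "", text)

# Character class: any single character that is a digit or a dot.
digit_or_dot = re.sub(r"[\d.]", "", text)

print(repr(digits_only))     # 'Example ...'
print(repr(digit_then_dot))  # 'Example 112'
print(repr(digit_or_dot))    # 'Example '
```

Note that the last result still ends with the space that preceded the version number; trimming it is a separate step.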

Related

How to extract a sentence that ends with a dot (.)?

I need to extract sentences that end with a dot '.', but not sentences that end in ' ...' (a blank and three dots).
Example:
I love you.
I love you too ...
I want to match the first sentence, not the second.
I imagine Python-style pseudocode like this:
for string in strings:
    # does the sentence end with a dot?
    checker1 = string.endswith('.')
    if checker1:
        # is it a single dot rather than a blank and three dots?
        checker2 = not string.endswith(' ...')
        if checker2:
            extract_all_strings()
        else:
            pass
    else:
        pass
but I can't work out the regular expression for it.
You can use the following regex:
[\w ]+\.(?!\.)
It matches one or more word characters or spaces followed by a dot, then uses a negative lookahead to make sure that dot is not followed by another dot.
You can use (?<! \.\.)\.$, see the demo.
Here you go with a very simple Regex:
[\w ]+\.$
Test the solution on Regex101.
[\w ] is a group of allowed characters, where \w stands for any character from [a-zA-Z0-9_] and the blank stands for the space itself.
[\w ]+ where + means that characters from the group described above may appear between one and unlimited times.
\. is the dot itself which must be escaped, otherwise the dot . matches any character.
$ stands for the end of a string.
Together, this ensures that only a sentence ending with exactly one dot is matched.
Another, less strict approach is to allow anything, as long as the second character from the end is not a dot and the last one is a dot (Regex101).
.+[^\.]\.$
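To compare the four patterns suggested above, here is a minimal sketch in Python (the helper name ends_with_single_dot is made up for this illustration). It runs each pattern against the two example sentences from the question.

```python
import re

patterns = [
    r"[\w ]+\.(?!\.)",   # negative lookahead after the dot
    r"(?<! \.\.)\.$",    # negative lookbehind before the final dot
    r"[\w ]+\.$",        # word characters and spaces, then one final dot
    r".+[^\.]\.$",       # anything, a non-dot, then the final dot
]
sentences = ["I love you.", "I love you too ..."]


def ends_with_single_dot(pattern, sentence):
    """Return True if the pattern finds a match anywhere in the sentence."""
    return re.search(pattern, sentence) is not None


results = {p: [ends_with_single_dot(p, s) for s in sentences] for p in patterns}
for pattern, outcome in results.items():
    print(pattern, outcome)  # every pattern prints [True, False]
```

All four accept the first sentence and reject the second; they differ only in how strict they are about the characters before the dot.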

how to trim empty spaces between a word and digits using RegEx

How can I trim the extra spaces between a string and the digits that follow it? This regular expression, (Version .*?\d$), returns the version number, but it leaves several spaces between the word "Version" and the number. I am trying to trim them so that only one space remains.
(App Version .*?\d$)
Example: App Version     1.1.0.11
Desired output: App Version 1.1.0.11
Any guidance or direction greatly appreciated.
The syntax might be slightly different depending on what you're coding in, but something along these lines should work:
var str = "App Version     1.1.0.11"
var rgx = /((?:\s*\w+)+)\s+([\d.]+)$/
console.log(str.replace(rgx, "$1 $2"))
The regex captures a repeating sequence of words at the beginning with ((?:\s*\w+)+), matches but does not capture the spaces in between with \s+, then captures the sequence of digits and . characters at the end with ([\d.]+).
If your regex engine supports positive lookbehind, (?<=...), then you might use:
(?<=App Version) + and replace with a single space.
It's possible that you have not only spaces but also tabs in between. The simplest regexp I can think of is to search for a word character \w followed by any number of tabs or spaces and then a digit \d, and to substitute it with the same word character, followed by just one space and the matched digit. (Note that \s already covers tabs, so [\s\t] is equivalent to \s.)
(\w)[\s\t]+(\d)
to be substituted by
$1 $2
See demo for details
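The capture-group substitution can be sketched in Python as follows. This is only an illustration under assumptions: the function name collapse_gap is made up, and the character class is written as \s+ because \s already includes tabs.

```python
import re


def collapse_gap(s):
    """Replace any run of whitespace between a word character and a digit
    with a single space, keeping both surrounding characters."""
    return re.sub(r"(\w)\s+(\d)", r"\1 \2", s)


spaces_only = collapse_gap("App Version     1.1.0.11")
with_tabs = collapse_gap("App Version \t\t 1.1.0.11")

print(spaces_only)  # App Version 1.1.0.11
print(with_tabs)    # App Version 1.1.0.11
```

Python writes the backreferences as \1 and \2 where the answer above uses $1 and $2.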
Regex: /(?<=\w ) +(?=\d)/
Replace with: '' (an empty string)
This matches one or more spaces if they are immediately preceded by a word character ([A-Za-z0-9_]) and a space, and immediately followed by a digit ([0-9]).
Effectively, spaces are removed only if there are two or more of them. The first space is never removed.
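Here is a small Python sketch of the lookaround version (the function name trim_extra_spaces is hypothetical). Because the lookbehind and lookahead consume nothing, the replacement is simply the empty string.

```python
import re


def trim_extra_spaces(s):
    """Delete the second and later spaces in a run of spaces that sits
    between a word character and a digit; the first space is kept."""
    return re.sub(r"(?<=\w ) +(?=\d)", "", s)


trimmed = trim_extra_spaces("App Version     1.1.0.11")
unchanged = trim_extra_spaces("App Version 1.1.0.11")

print(trimmed)    # App Version 1.1.0.11
print(unchanged)  # App Version 1.1.0.11
```

Unlike the \s-based pattern, this one only handles literal spaces, so tabs would be left in place.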

How to extract piece with '\' and spaces?

"This is a piece of 432432\5321 text".
The numbers can be of any length and may also contain letters. How can I get only the 432432\5321 part of this?
Here is a sample:
(\d+\\\d+)
A group of digits, followed by a backslash, followed by another group of digits. The surrounding parentheses form a capturing group.
Here is the fiddle: https://regex101.com/r/gI5rG4/2
EDIT:
I have missed that you also want letters. Then use \w instead of \d.
You can use the following example:
import re

# The backslash is escaped in the string literal; the text contains a single "\".
text = 'This is a piece of 432432\\5321 text'
print(re.findall(r'(\d+(?:\\\d+)+)', text))
It can handle input such as 111\222 as well as 111\222\333, etc.
Use \w for matching alphanumeric characters and \\ for matching the backslash:
(\w+\\\w+)
This would match inputs like 32432\5321 as well as those with letters in them, e.g. 32A1\BB1
Fiddle: https://regex101.com/r/yF2aX1/2
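Combining the two ideas above (\w for letters and digits, plus a repeated group for several backslashes) gives the following minimal Python sketch. The second sample string is invented for illustration.

```python
import re

# Raw strings keep each backslash as a literal character.
text = r"This is a piece of 432432\5321 text"
other = r"ids 111\222\333 and 32A1\BB1"

# Exactly one backslash between two runs of word characters.
single = re.findall(r"\w+\\\w+", text)

# One or more backslash-separated parts after the first run.
multiple = re.findall(r"\w+(?:\\\w+)+", other)

print(single)    # one item: 432432\5321
print(multiple)  # two items: 111\222\333 and 32A1\BB1
```

The inner group is non-capturing, so findall returns the whole match instead of only the last repeated part.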

regex match till a character from a second occurrence of a different character

My question is pretty similar to this question, and the answer there is almost fine. However, I need a regexp that matches not just from one character to another, but from the second occurrence of a character up to another character.
My purpose is to get password from uri, example:
http://mylogin:mypassword#mywebpage.com
So in fact I need the text between the second ":" and the "#".
You could give the following regex a go:
(?<=:)[^:]+?(?=#)
It matches any consecutive string not containing any : character, prefixed by a : and suffixed by a #.
Depending on your flavour of regex you might need something like:
:([^:]+?)#
This doesn't use lookarounds, so the match includes the : and the #, but the password will be in the first capturing group.
The ? makes the quantifier lazy, so the match stops at the first # if the URL contains further # characters; if it doesn't, the ? is optional. Please note that this will match any character other than : between the : and the #, even newlines and so on.
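Both variants can be tried in Python, as a sketch using the example URI from the question. It shows the difference between the full match and the first capturing group.

```python
import re

url = "http://mylogin:mypassword#mywebpage.com"

# Lookbehind/lookahead: the delimiters are checked but not consumed.
lookaround = re.search(r"(?<=:)[^:]+?(?=#)", url)

# No lookarounds: the delimiters are part of the match, the password is group 1.
captured = re.search(r":([^:]+?)#", url)

print(lookaround.group(0))  # mypassword
print(captured.group(0))    # :mypassword#
print(captured.group(1))    # mypassword
```

Because [^:] cannot cross a colon, neither pattern can start at the first colon after "http"; both end up matching from the second one.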
Here's an easy one that does not need look-aheads or look-behinds:
.*:.*:([^#]+)#
Explanation:
.*:.*: matches everything up to (and including) the second colon (:)
([^#]+) matches the longest possible series of non-# characters
# - matches the # character.
If you run this regex, the first capturing group (the expression between parentheses) will contain the password.
Here it is in action: http://regex101.com/r/fT6rI0
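As a rough sketch, the same pattern in Python could look like this (the helper name extract_password is made up for the example):

```python
import re


def extract_password(uri):
    """Return the text between the second ':' and the following '#',
    or None if the URI does not have that shape."""
    m = re.match(r".*:.*:([^#]+)#", uri)
    return m.group(1) if m else None


found = extract_password("http://mylogin:mypassword#mywebpage.com")
missing = extract_password("http://mywebpage.com")

print(found)    # mypassword
print(missing)  # None
```

The second call returns None because that URI contains only one colon, so the two .*: parts cannot both match.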

Is it possible to know what string replaces a wildcard in a regex comparison?

Suppose I check whether a string matches a regex that contains a wildcard. How can I programmatically extract the substring that takes the place of the wildcard?
Simple example: the regex is "[foo|bar].*\.txt" and, say, a matching string that is found is "foo123.txt". In this case the answer I want is "123", since it is the substring that replaces the wildcard. If the matching string is bar0123456789.txt, then the answer is 0123456789.
I use C#, but I wouldn't mind answers in other languages that I can also implement in C#.
Don't use square brackets if you want a group. Square brackets create a character class, which matches a single character from the listed set.
What you want for that is a non-capturing group:
(?:foo|bar).*\.txt
To get the text matched by the .* you need to put it into a capturing group. (The . is a special character that matches any character except newline characters by default, and * is a quantifier that repeats the preceding element 0 or more times.)
(?:foo|bar)(.*)\.txt
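Here is a minimal sketch of reading the captured part, written in Python for brevity (the function name wildcard_part is hypothetical). The last line also illustrates why the original square brackets were a problem.

```python
import re

pattern = re.compile(r"(?:foo|bar)(.*)\.txt")


def wildcard_part(name):
    """Return what the .* matched, or None if the name does not match."""
    m = pattern.fullmatch(name)
    return m.group(1) if m else None


first = wildcard_part("foo123.txt")
second = wildcard_part("bar0123456789.txt")
no_match = wildcard_part("baz.txt")

print(first)     # 123
print(second)    # 0123456789
print(no_match)  # None

# [foo|bar] is a character class: one of f, o, |, b, a, r. It accepts "rabbit.txt".
char_class_hit = re.fullmatch(r"[foo|bar].*\.txt", "rabbit.txt") is not None
print(char_class_hit)  # True
```

In C#, the equivalent lookup would be Regex.Match(input, pattern).Groups[1].Value.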