Ruby regex for extracting email addresses not detecting hyphens [duplicate] - regex

This question already has answers here:
Get final special character with a regular expression
(2 answers)
Closed 8 years ago.
I tried the regexes that others are using, but for some reason they are not working for me.
I have a string such as "testing-user@example.com", but the regex extracts only user@example.com and not the whole address.
Here's what I have:
regex = Regexp.new(/\b[a-zA-Z0-9._%+-,]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b/)
email = line.scan(regex)
Any help would be greatly appreciated.

The hyphen needs to be escaped at its current position inside the character class.
[a-zA-Z0-9._%+\-,]+
              ^
As written, (+-,) is a range that matches a single character between + and , so a literal hyphen is not part of the class.
Inside a character class the hyphen has a special meaning. You can place the hyphen as the first or last character of the class. In some regex implementations, you can also place it directly after a range. If you place the hyphen anywhere else, you need to precede it with a backslash in order to add it to your class.
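A minimal Ruby sketch of the difference, using only the part of the address before the separator. The variable names are illustrative and not from the original post.

```ruby
# The part of the address that the question's regex fails to capture whole.
local_part = "testing-user"

# Unescaped: `+-,` is read as a range from "+" to ",", so "-" is not in the class.
unescaped = /[a-zA-Z0-9._%+-,]+/
# Escaped: `\-` is a literal hyphen.
escaped = /[a-zA-Z0-9._%+\-,]+/

unescaped_result = local_part.scan(unescaped)
escaped_result = local_part.scan(escaped)

p unescaped_result  # prints ["testing", "user"]
p escaped_result    # prints ["testing-user"]
```

Moving the hyphen to the end of the class, as in [a-zA-Z0-9._%+,-], should have the same effect as escaping it.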

Related

RE2 Match from first character AFTER a character until FIRST space [duplicate]

This question already has answers here:
Regular expression to stop at first match
(9 answers)
How do I match everything after # until space?
(4 answers)
Regex everything after x until space
(1 answer)
Closed 28 days ago.
I spent an hour or so searching the internet to see whether this question already has an answer somewhere, but no joy. If it does, feel free to link it and close this one.
I am trying to match from AFTER a specific character until the FIRST space.
This is an example of the source string:
blabla/1.2.3 [other stuff I dont care about]
I just want 1.2.3
I have tried many different variants, and I won't clutter the question with all of them.
But the one I am most intrigued by is
\/.*\s
Apart from matching the / which I want to exclude, why does this match until the end of the line and not until the first space?
Other things I have tried
\/\b This just matches /
\/.*\b Matches almost everything until ]
\/.*\s? Again until end of line
\/.*(\s)? Ditto
\/.*\ Matches until the LAST whitespace excluding newline
And so on...
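One way to see what is happening, sketched in Ruby rather than RE2 (an assumption made for illustration; Ruby's engine is not RE2, but for this input greedy .*, \S and capture groups give the same result). .* is greedy: it runs to the end of the line and then gives back only as much as needed to reach the last whitespace. Matching non-whitespace characters with \S+ inside a capture group stops at the first space and keeps the / out of the result.

```ruby
line = "blabla/1.2.3 [other stuff I dont care about]"

# Greedy `.*` grabs as much as it can, so `\s` ends up on the LAST whitespace.
greedy = line[/\/.*\s/]

# `\S+` cannot cross a space; group 1 holds only the text after the slash.
version = line[/\/(\S+)/, 1]

p greedy   # prints "/1.2.3 [other stuff I dont care "
p version  # prints "1.2.3"
```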

RegEx to find count of special characters in String [duplicate]

This question already has answers here:
How to get the count of only special character in a string using Regex?
(6 answers)
Closed 2 years ago.
I need a regex that matches only if two or more occurrences of special characters exist in the given string.
1) abcd##qwer - Match
2) abcd#dsfsdg#fffj - Match
3) abcd#qwetg - No Match
4) acwexyz - No Match
5) abcd#ds#$%fsdg#fffj - Match
Can anyone help me with this?
Note: I need to use this regular expression in an existing tool, not in a programming language.
UPDATE after OP edit
The edited question adds a little complexity that calls for a different pattern entirely. The keys here are that (a) there is now a much smaller set of "special characters", (b) these characters must appear at least twice, and (c) they may appear in any position in the string.
To implement this, you would use something like:
(?:.*?[##$%].*?){2,}
This opens a non-capturing group,
which contains any number of characters (as few as possible), followed by
any character in the set ##$%,
followed by any number of characters (as few as possible).
The {2,} requires this group to occur at least twice in the string.
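As a rough check, the pattern can be run against the five sample strings from the question. This is a sketch in Ruby, which the question's tool may not use. The pattern is built from a single-quoted string so that Ruby does not treat #$ as interpolation, and the expected values are taken from the question's Match / No Match labels.

```ruby
# Pattern copied from the answer; single quotes avoid Ruby string interpolation.
pattern = Regexp.new('(?:.*?[##$%].*?){2,}')

# Sample strings from the question with the outcome the asker expects.
expected = {
  'abcd##qwer'          => true,
  'abcd#dsfsdg#fffj'    => true,
  'abcd#qwetg'          => false,
  'acwexyz'             => false,
  'abcd#ds#$%fsdg#fffj' => true
}

results = expected.keys.map { |s| [s, pattern.match?(s)] }.to_h

results.each { |s, matched| puts "#{s}: #{matched ? 'Match' : 'No Match'}" }
```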
Original answer
By "special characters", I assume you mean anything outside standard alphanumeric characters. You can use the pattern below in most flavors of Regex:
([^A-Za-z0-9])\1
This (a) creates a set of all characters not including alphanumeric characters and matches a character against it, then (b) checks to see if the same character appears adjacent.
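A short Ruby sketch of what this original pattern does and does not catch. The third test string is made up for illustration. Because of the backreference, only the same special character repeated back to back counts.

```ruby
adjacent_repeat = /([^A-Za-z0-9])\1/

same_adjacent = adjacent_repeat.match?('abcd##qwer')        # "##" repeats "#"
far_apart     = adjacent_repeat.match?('abcd#dsfsdg#fffj')  # the two "#" are not adjacent
different     = adjacent_repeat.match?('abcd#$qwer')        # "#" then "$" is not a repeat

p same_adjacent  # prints true
p far_apart      # prints false
p different      # prints false
```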

Regex to capture single backslash with single space after [duplicate]

This question already has answers here:
Check if string contains single backslashes with regex
(3 answers)
Closed 3 years ago.
I am having trouble figuring out this regex:
https://regex101.com/r/WtAYVa/2
It captures the first single backslash (\), but I want it to ignore a double backslash (\\), especially when there's a space after the \\.
If we wish to reject the double backslash and accept only the single one, we can add boundaries to the expression, for example start and end anchors, so that the whole input must be one backslash followed by one whitespace character:
^\\\s$
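A small Ruby sketch of how the anchored pattern behaves. The two test strings are assumptions, since the inputs from the linked demo are not shown here. Note that in Ruby ^ and $ anchor at line boundaries, which makes no difference for these single-line strings.

```ruby
single_then_space = /^\\\s$/

one_backslash   = '\\ '    # backslash + space (2 characters)
two_backslashes = '\\\\ '  # backslash + backslash + space (3 characters)

one_ok = single_then_space.match?(one_backslash)
two_ok = single_then_space.match?(two_backslashes)

p one_ok  # prints true
p two_ok  # prints false
```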

How to build a regular expression which prohibits hyphens from appearing at the start and end of a string? [duplicate]

This question already has answers here:
RegEx for allowing alphanumeric at the starting and hyphen thereafter
(4 answers)
Closed 5 years ago.
I want to build a regular expression which only matches [A-Za-z0-9\-] with an additional rule that hyphens (-) are not allowed to appear at the start and at the end.
For example:
my-site is matched.
m is matched.
mysite- is not matched.
-mysite is not matched.
Currently, I've come up with ^[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]+$.
But this doesn't match m.
How can I change my regular expression so that it fits my needs?
Use lookarounds:
^(?!-)[A-Za-z0-9-]*(?<!-)$
This works because lookarounds don't consume input, so the lookahead and the lookbehind can both assert on the same character.
Because of the *, the pattern also accepts the empty string; use + instead if at least one character is required.
Note that you don't need to escape the dash within the character class if it's the first or last character.
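A quick Ruby check of the pattern against the four examples from the question, plus the empty string. The + variant is an assumption about what the asker would want for empty input. Ruby's ^ and $ are line anchors, so this sketch assumes single-line input.

```ruby
star = /^(?!-)[A-Za-z0-9-]*(?<!-)$/  # pattern from the answer
plus = /^(?!-)[A-Za-z0-9-]+(?<!-)$/  # same, but requires at least one character

samples = ["my-site", "m", "mysite-", "-mysite", ""]

star_results = samples.map { |s| star.match?(s) }
plus_results = samples.map { |s| plus.match?(s) }

p star_results  # prints [true, true, false, false, true]
p plus_results  # prints [true, true, false, false, false]
```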

The use of ".*" in regex for password validation [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 6 years ago.
I came across this regex used for password validation:
(?=.*[a-z])(?=.*[A-Z])(?=.*[\d])(?=.*[^a-zA-Z\d])(?=\S+$).{8,}
There are only two things that are unclear to me about this regex:
What is .* used for, and why doesn't this regex work without it?
What is the difference or benefit of using [\d] instead of \d, given that the regex works just fine in both cases?
.* matches any sequence of characters: . matches any character (other than newline, which is not relevant here) and * matches zero or more of the preceding pattern. It is used in the lookaheads so that they can find a match anywhere in the password. Without it, each lookahead would test only the character at the current position, so a single character would have to be a lowercase letter, an uppercase letter, a digit and a non-alphanumeric character all at once, which is impossible. With .*, the password must contain at least one of each, but they can be anywhere in the password.
There's no difference between \d and [\d]. Whoever wrote this may have used the brackets out of habit, or perhaps to make it easier to add other characters to the character class later.
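A rough Ruby check of both points. The sample passwords are made up for illustration. The first pattern is the one from the question, the second drops every .*, and the third swaps [\d] for \d.

```ruby
with_dot_star    = /(?=.*[a-z])(?=.*[A-Z])(?=.*[\d])(?=.*[^a-zA-Z\d])(?=\S+$).{8,}/
# Without `.*`, each lookahead only inspects the character at the current position.
without_dot_star = /(?=[a-z])(?=[A-Z])(?=[\d])(?=[^a-zA-Z\d])(?=\S+$).{8,}/
with_plain_digit = /(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^a-zA-Z\d])(?=\S+$).{8,}/

strong = 'Passw0rd!'   # lowercase, uppercase, digit, symbol, 9 characters
weak   = 'password1'   # no uppercase letter, no symbol

strong_ok      = with_dot_star.match?(strong)
weak_ok        = with_dot_star.match?(weak)
no_dot_star_ok = without_dot_star.match?(strong)
plain_digit_ok = with_plain_digit.match?(strong)

p strong_ok       # prints true
p weak_ok         # prints false
p no_dot_star_ok  # prints false
p plain_digit_ok  # prints true
```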