I want to remove "-" but not " - " from a string.
For example: "01-Frozen - Madonna.mp3" becomes "01Frozen - Madonna.mp3".
I will then remove all digits using \d; I have seen some patterns for that.
Can anybody help?
Let's take the example you already specified: 01-Frozen - Madonna.mp3.
The pattern is this: <non space character><hyphen><non space character>
To match a space, the regex would be \s, which matches a single whitespace character. A nice aspect of regular expressions is that most shorthand classes have an opposite, usually denoted by the capital letter of the same identifier. Since in this case we don't want a space, we can use \S, which matches any single character that is not whitespace.
So the pattern now looks like: \S-\S.
If you try this, it won't work as expected: the match includes the characters on either side of the hyphen, so removing the match removes them too. We want only the hyphens that have non-space characters around them, without including those characters in the match.
Cases like these call for lookaheads and lookbehinds. These are groups that open with a question mark followed by one more identifier: = or ! for lookaheads, <= or <! for lookbehinds. They check what surrounds the current position without consuming any characters. You can read more about them here.
For this case, we use =, which ensures that the token following it (\S in our case) must be present but won't be part of the result. This is called a positive lookahead. So the regex looks like this:
/(?=\S)-(?=\S)/
[Edited]
Paraphrasing @jerry's comments:
Well, if you want it to work properly, you'll need a lookbehind for the left-hand side: /(?<=\S)-(?=\S)/. Though I would prefer negative lookarounds in this case, as it is more natural to say 'not preceded by' and 'not followed by' whitespace. Both variants:
Option 1:
/(?<=\S)-(?=\S)/
Option 2:
/(?<!\s)-(?!\s)/
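As a rough illustration of the two options above, here is a minimal Python sketch (Python is an assumption; the question does not name a language). It applies each pattern with re.sub to the example file name, and to a made-up input with a leading hyphen to show where the two options differ.

```python
import re

# Option 1: the hyphen must have a non-whitespace character on both sides.
OPTION_1 = re.compile(r"(?<=\S)-(?=\S)")
# Option 2: the hyphen must not have whitespace on either side
# (so it also matches at the very start or end of the string).
OPTION_2 = re.compile(r"(?<!\s)-(?!\s)")

name = "01-Frozen - Madonna.mp3"
print(OPTION_1.sub("", name))  # 01Frozen - Madonna.mp3
print(OPTION_2.sub("", name))  # 01Frozen - Madonna.mp3

# Hypothetical input with a leading hyphen: only Option 2 removes it.
edge = "-intro - mix"
print(OPTION_1.sub("", edge))  # -intro - mix
print(OPTION_2.sub("", edge))  # intro - mix
```

Which option fits depends on whether a hyphen at the very start or end of the string should count as removable.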
I need to form a regex that produces a match only if two or more special characters occur in the given string.
1) abcd##qwer - Match
2) abcd#dsfsdg#fffj - Match
3) abcd#qwetg - No Match
4) acwexyz - No Match
5) abcd#ds#$%fsdg#fffj - Match
Can anyone help me with this?
Note: I need to use this regular expression in an existing tool, not in a programming language.
UPDATE after OP edit
The edited OP introduces a small amount of additional complexity that necessitates a different pattern entirely. The keys here are that (a) there is now a significantly limited set of "special characters" and (b) that these characters must appear at least twice (c) in any position in the string.
To implement this, you would use something like:
(?:.*?[##$%].*?){2,}
Opens a non-capturing group,
which contains any number of characters (as few as possible), followed by
any one character in the set ##$%,
followed by any number of characters (as few as possible).
{2,} requires the group to match at least twice in the string.
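To check the breakdown above against the five examples from the question, here is a small Python sketch. Python is only used for illustration (the asker's tool is unknown), and the helper name has_two_specials is hypothetical.

```python
import re

# Character set copied from the answer; the repeated '#' is redundant but harmless.
TWO_OR_MORE = re.compile(r"(?:.*?[##$%].*?){2,}")

def has_two_specials(s: str) -> bool:
    """Return True if the string contains at least two characters from the set."""
    return TWO_OR_MORE.search(s) is not None

samples = [
    "abcd##qwer",
    "abcd#dsfsdg#fffj",
    "abcd#qwetg",
    "acwexyz",
    "abcd#ds#$%fsdg#fffj",
]
for s in samples:
    print(s, "-", "Match" if has_two_specials(s) else "No Match")
```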
Original answer
By "special characters", I assume you mean anything outside standard alphanumeric characters. You can use the pattern below in most flavors of Regex:
([^A-Za-z0-9])\1
This (a) creates a set of all characters not including alphanumeric characters and matches a character against it, then (b) checks to see if the same character appears adjacent.
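A short Python sketch of the backreference behaviour described above (Python is an assumption made for illustration). Note that this original pattern only finds the same special character twice in a row, so it does not match when the two occurrences are separated.

```python
import re

# Group 1 captures one non-alphanumeric character; \1 requires the same
# character again immediately afterwards.
DOUBLED = re.compile(r"([^A-Za-z0-9])\1")

print(bool(DOUBLED.search("abcd##qwer")))        # True: '#' is doubled
print(bool(DOUBLED.search("abcd#dsfsdg#fffj")))  # False: the two '#' are apart
```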
I need help with a regex that should match a fixed length pattern.
For example, the following regex allows for at most 1 ( and 1 ) in the matched pattern:
([^)(]*\(?[^)(]*\)?[^)(]*)
However, I cannot (and do not want to) use this solution because of the *: the text I have to scan is very large, and using * seems to really hurt performance.
I thus want to impose a match length limit, e.g. using {10,100}.
In other words, the regex should only match if
there is at most one set of parentheses inside the string
the total length of the match is bounded, i.e. not infinite (no *!)
This seems to be a solution to my problem, however I do not get it to work and I have trouble understanding it.
I tried to use the accepted answer and created this:
^(?=[^()]{5,10}$)[^()]*(?:[()][^()]*){0,2}$
which does not seem to really work as expected: https://regex101.com/r/XUiJZz/1
Also, please do not mark this question as a duplicate of another question if the answers there make use of the Kleene star operator; that won't help me.
Edit:
I know this is a possible solution, but I'm wondering if there is a better way to do it:
([^)(]{0,100}\(?[^)(]{0,100}\)?[^)(]{0,100})
I thus want to impose a match length limit, e.g. using {10,100}
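You may want to add anchors and a lookahead assertion to your regex:
^(?=.{10,100}$)[^)(]*(?:\(?[^)(]*\))?[^)(]*$
(?=.{10,100}$) is a lookahead condition that asserts the length of the string is between 10 and 100. The $ inside the lookahead matters: without it, the lookahead only requires at least 10 characters and places no upper limit on the length.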
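Here is a minimal Python sketch of the length-limiting lookahead idea (Python and the sample strings are assumptions for illustration). The lookahead in this sketch ends with $, so that both the lower and the upper bound are enforced against the whole string.

```python
import re

# Length check first (10 to 100 characters), then at most one set of parentheses.
BOUNDED = re.compile(r"^(?=.{10,100}$)[^)(]*(?:\(?[^)(]*\))?[^)(]*$")

def is_valid(s: str) -> bool:
    return BOUNDED.search(s) is not None

print(is_valid("hello (world) again"))  # True: 19 characters, one set
print(is_valid("short(x)"))             # False: only 8 characters
print(is_valid("a(b)c(d)efghijkl"))     # False: two sets of parentheses
print(is_valid("x" * 100))              # True: exactly at the upper bound
print(is_valid("x" * 101))              # False: one character too long
```

The lookahead consumes nothing, so the rest of the pattern still starts matching from the beginning of the string.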
Is [A] a regular expression that will match a string of characters which contains any number of occurrences of the letter A (and only the letter A, with no other characters or spaces) such as AAAA?
Anything in square brackets is a character class. This is complicated enough that it has its own Perl documentation page (in the link), so it's not a surprise it wasn't evident how it works.
A character class defines a set of possible characters; when pattern matching, a character class by itself matches one character from the input, no matter how many characters there are inside the square brackets.
/[A]/ # find one copy of 'A' anywhere in the string
/[abcd]/ # find one copy of any of 'a', 'b', 'c', or 'd' anywhere in the string
/[A-Z]/ # find any one uppercase ASCII letter somewhere in the string
If you want your class to match differently, you can add quantifiers:
/[A-Z]+/ # find one or more uppercase ASCII letters in a row
/[A]*/ # find zero or more 'A's in a row
The linked page will show you a lot of other options to specify sets of characters inside the square brackets. But the key is that one set of square brackets matches one character unless you add + (one or more of these) or * (zero or more of these).
No.
The regular expression pattern [A] can be simplified to just A. It will match any string that contains A. While that includes AAAA, it also includes ZAZ.
For starters, you will need to anchor the match.
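To make the difference concrete, here is a small Python sketch (Python is an assumption; the answers above use Perl-style notation). The anchored pattern A+ with re.fullmatch is one possible way to require "only the letter A"; it is not taken from the answers.

```python
import re

# Unanchored: [A] matches any string that contains an 'A' somewhere.
print(bool(re.search(r"[A]", "AAAA")))  # True
print(bool(re.search(r"[A]", "ZAZ")))   # True

# Anchored at both ends with a quantifier: the whole string must be A's.
# re.fullmatch behaves like wrapping the pattern in start and end anchors.
print(bool(re.fullmatch(r"A+", "AAAA")))  # True
print(bool(re.fullmatch(r"A+", "ZAZ")))   # False
print(bool(re.fullmatch(r"A+", "")))      # False: + needs at least one 'A'
```

If the empty string should also count as "any number of occurrences", A* could be used in place of A+.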
I want to build a regular expression which only matches [A-Za-z0-9\-] with an additional rule that hyphens (-) are not allowed to appear at the start and at the end.
For example:
my-site is matched.
m is matched.
mysite- is not matched.
-mysite is not matched.
Currently, I've come up with ^[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]+$.
But this doesn't match m.
How can I change my regular expression so that it fits my needs?
Use lookarounds:
^(?!-)[A-Za-z0-9-]*(?<!-)$
The reason this works is that lookarounds don't consume input, so the lookahead and the lookbehind can both assert on the same character.
Note that you don't need to escape the dash within the character class if it's the first or last character.
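The four cases from the question can be checked with a short Python sketch (Python is an assumption; the question does not name an engine, and the helper name is_valid_name is hypothetical).

```python
import re

# No hyphen right after the start, no hyphen right before the end.
NAME = re.compile(r"^(?!-)[A-Za-z0-9-]*(?<!-)$")

def is_valid_name(s: str) -> bool:
    return NAME.search(s) is not None

for candidate in ["my-site", "m", "mysite-", "-mysite"]:
    print(candidate, "->", is_valid_name(candidate))
```

One thing to be aware of: because the character class uses *, this pattern also accepts the empty string. Changing * to + would rule that out.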
Okay, this might not be tricky at all for some, but at the moment it is really messing with my head.
First of all, I don't know what engine I am dealing with, but it doesn't seem to distinguish uppercase.
I have a string, for example:
Circuit Ref
Service Type
A End Address
Z End Address
52GD J32SD41 O2AE EVC001
Evolve Internet
And I am only trying to extract the string "52GD J32SD41 O2AE EVC001". I have already tried quite a few combinations, like:
[0-9A-Z]{4}\s[0-9A-Z]+\s[0-9A-Z]+\s[0-9A-Z]+
[A-Z0-9]{4}\s\W+\s\W+\s\W+
[A-Z0-9]{4}\s[A-Z0-9\s]*[A-Z0-9\s]*[A-Z0-9\s]*
Nothing seems to work. I want to keep the expression fairly flexible, as the order of the letters and digits can change, but the pattern is mostly the same. Any nudge in the right direction will be greatly appreciated.
Thanks
This is a wild guess, but please try the following:
in front of the regex add (?-i) (Related question, regular-expressions.info, net page about regex)
enclose the regex in (?-i: ... )
enclose the regex in (?I: ... )
By the way, regarding the 2nd pattern that you tried: [A-Z0-9]{4}\s\W+\s\W+\s\W+.
It seems that you tried to use \W as an "uppercase word character", but that is not what it means.
\W means anything that is not \w, that is, any non-word character.
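The effect of the scoped (?-i: ... ) form can be sketched in Python, which supports it from version 3.6. This is only an illustration under assumptions: the asker's engine is unknown, and re.IGNORECASE is used here to simulate an engine that ignores case by default.

```python
import re

text = """Circuit Ref
Service Type
A End Address
Z End Address
52GD J32SD41 O2AE EVC001
Evolve Internet"""

# First pattern from the question.
PATTERN = r"[0-9A-Z]{4}\s[0-9A-Z]+\s[0-9A-Z]+\s[0-9A-Z]+"

# Case-insensitive matching: lowercase letters also satisfy [A-Z],
# so the first match starts in the wrong place.
loose = re.search(PATTERN, text, re.IGNORECASE)

# (?-i: ... ) switches case-insensitivity off for the enclosed part only.
strict = re.search(f"(?-i:{PATTERN})", text, re.IGNORECASE)

print(repr(loose.group()))
print(strict.group())  # 52GD J32SD41 O2AE EVC001

# \W matches non-word characters, not uppercase letters.
print(re.findall(r"\W", "O2AE -"))  # [' ', '-']
```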