Regex multiple replace with capture group and negated character class


I have a problem with a regex and I cannot figure out if what I'm doing is possible. I was trying to write a regex to replace some strings with the following code
String string = "address='21 Street' and country='United Kingdom'";
Pattern pattern = Pattern.compile(" (address|country)='[^']'");
String replacedString = pattern.matcher(string).replaceAll(" $1='call us'");
System.out.println(replacedString);
What I'm expecting is to print the string
address='call us' and country='call us'
I'm not going to end up implementing this with a regex, as there are other better ways, but I just want to know why this is not working :'(.
What confuses me is that the negated character class [^'] does not "work" and the regex doesn't replace anything.

You want [^']* and not [^']. The former matches any number of non-' characters (including none); the latter matches exactly one non-' character.

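You're missing a quantifier. By itself, a character class matches exactly one character in the input string, so you need to add a quantifier to make it match more than one character.
Try adding a + (one or more) or * (zero or more) after your character class:
Pattern pattern = Pattern.compile(" (address|country)='[^']*'");
Note that the pattern also begins with a literal space, so with the sample string exactly as shown, address at the very start is not matched; remove the leading space or replace it with \b if the input has no leading space.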
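For reference, here is a minimal sketch of the corrected replacement as a self-contained class. The class and method names (QuantifierFix, redact) are made up for illustration, and the leading space of the original pattern is swapped for \b as an assumption, so that a key at the very start of the string is also replaced.

```java
import java.util.regex.Pattern;

class QuantifierFix {
    // [^']* matches the whole quoted value (zero or more non-quote characters).
    // Assumption: \b stands in for the leading space used in the question, so
    // that "address" at position 0 of the input is matched as well.
    private static final Pattern KEY_VALUE =
            Pattern.compile("\\b(address|country)='[^']*'");

    static String redact(String input) {
        // $1 re-inserts the captured key name in front of the new value.
        return KEY_VALUE.matcher(input).replaceAll("$1='call us'");
    }

    public static void main(String[] args) {
        String string = "address='21 Street' and country='United Kingdom'";
        // prints: address='call us' and country='call us'
        System.out.println(redact(string));
    }
}
```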

Related

Regex: scrub punctuation except if inside a word?

I'm not great at regex but I have this for removing punctuation from a string.
let text = 'a user provided string'
let pattern = /(-?\d+(?:[.,]\d+)*)|[-.,()&$#![\]{}"']+/g;
text.replace(pattern, "$1");
I am looking for a way to modify this so that it keeps punctuation if inside a word e.g.
some-hypenated-words
a_snake_case
or.even.a.dot.word
should all keep the punctuation. How would I modify it for that?

One option is to change \d to \w, which extends the match to word characters (letters, digits, and underscore), and to add a hyphen to the character class in the capturing group.
In the replacement use group 1.
(\w+(?:[.,-]\w+)*)|[-.,()&$#![\]{}"']+
If you want to match multiple hyphens, commas or dots you could repeat the character class [.,-]+
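The same idea can be sketched in Java (the question uses JavaScript; this is a port, not the answerer's code). The names ScrubPunctuation and scrub are illustrative. One difference to be aware of: Java requires [ to be escaped inside a character class, whereas the JavaScript pattern above leaves it bare.

```java
import java.util.regex.Pattern;

class ScrubPunctuation {
    // Alternative 1 (group 1): a run of word characters, optionally joined
    // by '.', ',' or '-' to further runs of word characters.
    // Alternative 2: a run of punctuation that should be dropped.
    private static final Pattern PATTERN = Pattern.compile(
            "(\\w+(?:[.,-]\\w+)*)|[-.,()&$#!\\[\\]{}\"']+");

    static String scrub(String text) {
        // A word is replaced by itself. When the punctuation alternative
        // matched, group 1 did not participate and $1 expands to nothing.
        return PATTERN.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        // prints: Hello world some-hyphenated-words a_snake_case or.even.a.dot.word
        System.out.println(scrub(
                "Hello, world! (some-hyphenated-words) a_snake_case or.even.a.dot.word."));
    }
}
```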

REGEX - Get all groups of characters with their delimiter

I'm not very good with regex, so this is my problem.
I have a String that contains c#m#fc#fm# and I want to get all groups of characters with their # at the end.
Like this:
c#
m#
fc#
fm#
I have tried several regexes but never got what I wanted.
Thanks a lot for your help.

You can use [^#]+# and find all matches. Each match consumes one or more characters with the negated character class [^#]+ (any character except #) and then exactly one # at the end.
Also, if your string contains whitespace that you don't want in the matched text, add \s to the negated character class and use this regex:
[^#\s]+#
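A minimal Java sketch of the "find all matches" step, using the whitespace-excluding variant. The class and method names (DelimitedGroups, groups) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimitedGroups {
    // One or more characters that are neither '#' nor whitespace,
    // followed by a single '#'.
    private static final Pattern GROUP = Pattern.compile("[^#\\s]+#");

    static List<String> groups(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = GROUP.matcher(input);
        // find() advances to the next non-overlapping match on each call.
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints: [c#, m#, fc#, fm#]
        System.out.println(groups("c#m#fc#fm#"));
    }
}
```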

Regex to match words after dot until a whitespace occurs

Given the following string
span.a.b this.is.really.confusing
I need to return the matches a and b. I've been able to get close with the following regex:
(?<=\.)[\w]+
But it's also matching is, really, and confusing. When I include a positive lookahead I get even closer, but I'm still not there.
(?<=\.)[\w]+(?=\s) # matches b, confusing
How can I match words after a dot until a whitespace occurs?

How can I match words after a dot until a whitespace occurs?
NB: this is language-agnostic pseudo-code, but it should work.
regex = "^[^\s.]+\.(\S+).*"
targets = <extracted_group>.split(".")
Regex explanation:
"^": begins with
"[^\s.]+\.": 1 or more non-whitespace, non-period characters, followed by a literal period.
"(\S+)": group and capture all of the following non-whitespace characters
".*": matches 0 or more of any non-newline character
If the split function takes a regex instead of a string, you'll need to escape the '.' or use a character class.
NB: You can do it without the split, but I think that the split is more transparent.
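The pseudo-code above might look like this in Java. It is only a sketch: the names WordsAfterDot and wordsAfterFirstDot are illustrative, and returning an empty array when nothing matches is an assumption. Java's String.split takes a regex, so the dot is escaped there as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordsAfterDot {
    // Leading token up to the first dot, then capture the rest of the
    // first whitespace-delimited chunk.
    private static final Pattern FIRST_CHUNK =
            Pattern.compile("^[^\\s.]+\\.(\\S+).*");

    static String[] wordsAfterFirstDot(String input) {
        Matcher matcher = FIRST_CHUNK.matcher(input);
        if (!matcher.find()) {
            return new String[0]; // assumption: no match yields no words
        }
        // group(1) is e.g. "a.b"; split it on literal dots.
        return matcher.group(1).split("\\.");
    }

    public static void main(String[] args) {
        // prints: a, b
        System.out.println(String.join(", ",
                wordsAfterFirstDot("span.a.b this.is.really.confusing")));
    }
}
```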

I am not sure if this is good enough for all your possible cases, but it should work with the provided example:
\.([\w]+)\.([\w]+)\s
$1 = a, $2 = b

Regular expression for alphabet, underscore, hyphen, apostrophe only

I want a regular expression that accepts only letters, hyphens, apostrophes, and underscores.
I tried
/^[ A-Za-z-_']*$/
but it's not working. Please help.

Your regex is wrong. Try this:
/^[0-9A-Za-z_#'-]+$/
OR
/^[\w#'-]+$/
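A hyphen needs to be at the first or last position inside a character class to be treated literally without escaping. Also, if an empty string isn't allowed, use + (1 or more) instead of * (0 or more). Note that both patterns above also accept digits and #; remove 0-9 and # (and replace \w with A-Za-z_) if only letters, hyphens, apostrophes, and underscores should pass.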
Explanation:
^ assert position at start of the string
[\w#'-]+ match a single character present in the list below
Quantifier: Between one and unlimited times, as many times as possible
\w match any word character [a-zA-Z0-9_]
#'- a single character in the list #'- literally
$ assert position at end of the string

Move the hyphen to the end or the beginning of the character class, or escape it:
^[ A-Za-z_'-]*$
or
^[- A-Za-z_']*$
or
^[ A-Za-z\-_']*$
If you want all letters:
^[ \pL_'-]*$
or

When using a hyphen in a character class, be sure to place it at the end of the character class as a best practice.
The reason is that the hyphen signifies a range of characters inside a character class, and when it is at the end of the class it cannot create a range.
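To illustrate how hyphen placement changes the meaning, here is a small Java sketch (class and field names are invented for the example, and the space from the question's pattern is left out on purpose). With the hyphen last it is a literal; placed between ' and _ it forms a range from U+0027 to U+005F, which happens to include the digits.

```java
import java.util.regex.Pattern;

class HyphenPlacement {
    // Hyphen last: a literal hyphen. Accepts letters, '_', apostrophe, '-'.
    static final Pattern LITERAL_HYPHEN = Pattern.compile("^[A-Za-z_'-]+$");

    // Hyphen between the apostrophe and '_': an accidental range that
    // covers every character from U+0027 to U+005F, digits included.
    static final Pattern ACCIDENTAL_RANGE = Pattern.compile("^[A-Za-z'-_]+$");

    static boolean isValid(String s) {
        return LITERAL_HYPHEN.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("O'Neil-Smith_jr"));                   // true
        System.out.println(isValid("abc123"));                            // false
        System.out.println(ACCIDENTAL_RANGE.matcher("123").matches());    // true
    }
}
```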

My best bet would be :
/[A-Za-z-\'_#0-9]+/g

You can use the following (in Java):
String acceptHyphenApostropheUnderscoreRegEx = "^(\\p{Alpha}*+((['_-]+)\\p{Alpha})?)*+$";
If you want to have spaces and # also (as some have given above) try:
String acceptHyphenApostropheUnderscoreRegEx = "^(\\p{Alpha}*+((\\s|['#_-]+)\\p{Alpha})?)*+$";

Regex matching beginning AND end strings

This seems like it should be trivial, but I'm not so good with regular expressions, and this doesn't seem to be easy to Google.
I need a regex that starts with the string 'dbo.' and ends with the string '_fn'
So far as I am concerned, I don't care what characters are in between these two strings, so long as the beginning and end are correct.
This is to match functions in a SQL server database.
For example:
dbo.functionName_fn - Match
dbo._fn_functionName - No Match
dbo.functionName_fn_blah - No Match

If you're searching for hits within a larger text, you don't want to use ^ and $ as some other responders have said; those match the beginning and end of the text. Try this instead:
\bdbo\.\w+_fn\b
\b is a word boundary: it matches a position that is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one. This regex will find what you're looking for in any of these strings:
dbo.functionName_fn
foo dbo.functionName_fn bar
(dbo.functionName_fn)
...but not in this one:
foodbo.functionName_fnbar
\w+ matches one or more "word characters" (letters, digits, or _). If you need something more inclusive, you can try \S+ (one or more non-whitespace characters) or .+? (one or more of any characters except linefeeds, non-greedily). The non-greedy +? prevents it from accidentally matching something like dbo.func1_fn dbo.func2_fn as if it were just one hit.
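A short Java sketch of searching a larger text with the word-boundary pattern. FunctionNameFinder and find are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FunctionNameFinder {
    // \b on both sides keeps the match from starting or ending in the
    // middle of a longer word.
    private static final Pattern DBO_FN = Pattern.compile("\\bdbo\\.\\w+_fn\\b");

    static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = DBO_FN.matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // prints: [dbo.functionName_fn, dbo.other_fn]
        System.out.println(find("foo dbo.functionName_fn bar (dbo.other_fn)"));
        // prints: []
        System.out.println(find("foodbo.functionName_fnbar"));
    }
}
```

Because \w+ cannot cross whitespace, two adjacent names such as dbo.func1_fn dbo.func2_fn come back as two separate hits.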

^dbo\..*_fn$
This should work for you.

Well, the simple regex is this:
/^dbo\..*_fn$/
It would be better, however, to use the string manipulation functionality of whatever programming language you're using to slice off the first four and the last three characters of the string and check whether they're what you want.
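As a rough illustration of the string-manipulation route in Java, assuming each candidate name is already available as a separate string (PrefixSuffixCheck and isDboFunction are made-up names):

```java
class PrefixSuffixCheck {
    static boolean isDboFunction(String name) {
        // No regex needed: check the fixed prefix and the fixed suffix.
        return name.startsWith("dbo.") && name.endsWith("_fn");
    }

    public static void main(String[] args) {
        System.out.println(isDboFunction("dbo.functionName_fn"));      // true
        System.out.println(isDboFunction("dbo._fn_functionName"));     // false
        System.out.println(isDboFunction("dbo.functionName_fn_blah")); // false
    }
}
```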

\bdbo\..*fn
I was looking through a ton of java code for a specific library: car.csclh.server.isr.businesslogic.TypePlatform (although I only knew car and Platform at the time). Unfortunately, none of the other suggestions here worked for me, so I figured I'd post this.
Here's the regex I used to find it:
\bcar\..*Platform

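import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class StartsOrEndsWith {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        String part = Pattern.quote(scanner.nextLine().toLowerCase()); // quote so regex metacharacters in the input match literally
        String line = scanner.nextLine();
        String temp = "\\b" + part + "|" + part + "\\b"; // sequence at the start of a word, or at the end of a word
        Pattern pattern = Pattern.compile(temp);
        Matcher matcher = pattern.matcher(line.toLowerCase());
        System.out.println(matcher.find() ? "YES" : "NO");
    }
}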
If you need to determine if any of the words of this text start or end with the sequence, you can use this regex: \bsubstring|substring\b:
anythingsubstring
substringanything
anythingsubstringanything

The simplest thing that you can do is:
dbo.*_fn$
It matches dbo, followed by any characters, and requires the text to end with _fn.
If you know which character follows the final n (for example, a space), you can replace $ with that character.
