regex to duplicate a word and add in extra text - regex

I have a long list of words that I want to duplicate
Example
CallDateTime
WebDateTime
WavName
Dnis
Verified
Concern
ConcernCode
I'm trying to write a regex that copies each word, places the copy to the right, and adds some extra text:
's/(\t+)_(\w+)/\u\2, \u\1, \0/'
Well, that is not working. This is the expected output:
#CallDateTime = i.CallDateTime,
#WebDateTime = i.WebDateTime,
etc...
Obviously, replacing ^ with # and $ with , is easy, but I also want to copy the word with a regex.
I have seen this
((\w+)_(\w+))
Replace Pattern:
\3, \2, \1
But I don't understand that ..

Let's solve this with notepad++:
Find what: (\w+)
Replace with: #\1 = i.\1,
Explanation:
\w+ matches one or more word characters
(...) is a capturing group. You can reference it with \1 in the replacement part
replacement: A literal #, then the captured word, then a space, etc...

Searching for .+ and replacing it with #$0 = i.$0, should do the job.
https://regex101.com/r/WQXFy6/3
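For anyone trying the whole-line approach in JavaScript, here is a minimal sketch. One thing to watch: JavaScript replacement strings spell the whole match as $& rather than $0. The helper name duplicateLines is made up for illustration.

```javascript
// Hypothetical helper: turns every non-empty line "Name" into "#Name = i.Name,".
function duplicateLines(text) {
  // "." never matches a line break, so ".+" matches one line at a time.
  // "$&" inserts the whole match (what other engines call $0).
  return text.replace(/.+/g, "#$& = i.$&,");
}

const words = "CallDateTime\nWebDateTime\nWavName";
console.log(duplicateLines(words));
```

Because no capturing group is involved, this only works when each line holds exactly one word; with several words per line, the (\w+) version is the safer choice.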

Replace:
\b(\w+)\b
with
#\1 = i.\1,
JavaScript code:
var str = "CallDateTime\nWebDateTime\nWavName\nDnis\nVerified\nConcern\nConcernCode";
str = str.replace(/\b(\w+)\b/g, '#$1 = i.$1,');
console.log(str);

Related

Regex - Exclude some string after matches [duplicate]

I need to extract from a string a set of characters which are included between two delimiters, without returning the delimiters themselves.
A simple example should be helpful:
Target: extract the substring between square brackets, without returning the brackets themselves.
Base string: This is a test string [more or less]
If I use the following reg. ex.
\[.*?\]
The match is [more or less]. I need to get only more or less (without the brackets).
Is it possible to do it?
Easy done:
(?<=\[)(.*?)(?=\])
Technically that's using lookaheads and lookbehinds. See Lookahead and Lookbehind Zero-Width Assertions. The pattern consists of:
is preceded by a [ that is not captured (lookbehind);
a non-greedy captured group. It's non-greedy to stop at the first ]; and
is followed by a ] that is not captured (lookahead).
Alternatively you can just capture what's between the square brackets:
\[(.*?)\]
and return the first captured group instead of the entire match.
If you are using JavaScript, the solution provided by cletus, (?<=\[)(.*?)(?=\]) won't work because JavaScript doesn't support the lookbehind operator.
Edit: actually, since ES2018 JavaScript supports lookbehind. Just wrap the pattern in slashes to define a regex literal, like this:
var regex = /(?<=\[)(.*?)(?=\])/;
Old answer:
Solution:
var regex = /\[(.*?)\]/;
var strToMatch = "This is a test string [more or less]";
var matched = regex.exec(strToMatch);
It will return:
["[more or less]", "more or less"]
So, what you need is the second value. Use:
var matched = regex.exec(strToMatch)[1];
To return:
"more or less"
Here's a general example with obvious delimiters (X and Y):
(?<=X)(.*?)(?=Y)
Here it's used to find the string between X and Y.
You just need to 'capture' the bit between the brackets.
\[(.*?)\]
To capture you put it inside parentheses. You do not say which language this is using. In Perl for example, you would access this using the $1 variable.
my $string ='This is the match [more or less]';
$string =~ /\[(.*?)\]/;
print "match:$1\n";
Other languages will have different mechanisms. C#, for example, uses the Match collection class, I believe.
[^\[] Match any character that is not [.
+ Match 1 or more of the anything that is not [. Creates groups of these matches.
(?=\]) Positive lookahead ]. Matches a group ending with ] without including it in the result.
Done.
[^\[]+(?=\])
Proof.
http://regexr.com/3gobr
Similar to the solution proposed by null. But the additional \] is not required. As an additional note, it appears \ is not required to escape the [ after the ^. For readability, I would leave it in.
Does not work in the situation in which the delimiters are identical. "more or less" for example.
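To illustrate the identical-delimiter problem, here is a rough JavaScript sketch comparing the lookahead-only pattern with a plain capturing group, using double quotes as both delimiters. The function names and the sample sentence are invented for the example.

```javascript
// Hypothetical helper: capture group approach, works with identical delimiters.
function betweenQuotes(text) {
  // matchAll needs the g flag; m[1] is the text captured between the quotes.
  return Array.from(text.matchAll(/"(.*?)"/g), (m) => m[1]);
}

// Hypothetical helper: the lookahead-only pattern adapted to quotes.
function lookaheadOnly(text) {
  // Matches any run of non-quote characters that is followed by a quote,
  // so text outside the quotes is returned as well.
  return text.match(/[^"]+(?=")/g) || [];
}

const sample = 'say "more or less" and "less"';
console.log(betweenQuotes(sample)); // [ 'more or less', 'less' ]
console.log(lookaheadOnly(sample)); // [ 'say ', 'more or less', ' and ', 'less' ]
```

The lookahead version cannot tell an opening quote from a closing one, which is why the text outside the quotes leaks into its result.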
Most updated solution
If you are using JavaScript, the best solution that I came up with is using the match method instead of exec.
Then iterate over the matches and strip the delimiters by replacing each match with its first group, $1:
const text = "This is a test string [more or less], [more] and [less]";
const regex = /\[(.*?)\]/gi;
const resultMatchGroup = text.match(regex); // [ '[more or less]', '[more]', '[less]' ]
const desiredRes = resultMatchGroup.map(match => match.replace(regex, "$1"))
console.log("desiredRes", desiredRes); // [ 'more or less', 'more', 'less' ]
As you can see, this is useful for multiple delimiters in the text as well
PHP:
$string ='This is the match [more or less]';
preg_match('#\[(.*)\]#', $string, $match);
var_dump($match[1]);
This one specifically works for javascript's regular expression parser /[^[\]]+(?=])/g
just run this in the console
var regex = /[^[\]]+(?=])/g;
var str = "This is a test string [more or less]";
var match = regex.exec(str);
match;
To remove also the [] use:
\[.+\]
I had the same problem using regex with bash scripting.
I used a 2-step solution using pipes with grep -o applying
'\[(.*?)\]'
first, then
'\b.*\b'
Obviously not as efficient as the other answers, but an alternative.
I wanted to find a string between / and #, but # is sometimes optional. Here is the regex I use:
(?<=\/)([^#]+)(?=#*)
Here is how I got the text without '[' and ']' in C#:
var text = "This is a test string [more or less]";
// Getting only string between '[' and ']'
Regex regex = new Regex(@"\[(.+?)\]");
var matchGroups = regex.Matches(text);
for (int i = 0; i < matchGroups.Count; i++)
{
    Console.WriteLine(matchGroups[i].Groups[1]);
}
The output is:
more or less
If you need to extract the text without the brackets, you can use awk in bash:
echo " [hola mundo] " | awk -F'[][]' '{print $2}'
result:
hola mundo
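The same field-splitting idea can be sketched in JavaScript: split on either bracket and take the second field. The helper name firstBracketField is hypothetical, and the sketch assumes the input contains at least one bracketed section.

```javascript
// Hypothetical helper: returns the text between the first "[" and the next bracket.
function firstBracketField(text) {
  // Splitting on "[" or "]" mirrors awk's -F'[][]'.
  // awk's $2 (second field) is index 1 of the resulting array.
  const fields = text.split(/[\[\]]/);
  return fields[1];
}

console.log(firstBracketField(" [hola mundo] ")); // hola mundo
```

If the string has no brackets at all, the array has a single element and the function returns undefined, so callers may want to check for that.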

Regex Replace anything that does not match capturing group

I have a String like following: [Monster:Test]Maps=1,5,2,3[Monster:Test2]Maps=2-5
I need to replace the string of unnecessary text.
The only text I want to keep is the brackets including the text between the brackets. So only [Monster:Test] and [Monster:Test2] should be kept.
So my regex to find it is: \\[(.*)\\]
I don't understand how to replace anything that does not match my group.
How about using preg_match_all
$s = "[Monster:Test]Maps=1,5,2,3[Monster:Test2]Maps=2-5~";
preg_match_all("/\[[^\]]+\]/", $s, $m);
echo implode($m[0]);
Results into:
[Monster:Test][Monster:Test2]
Does this work as required?
Just join all matches and you should end up with what you want.
/\[[^\]]+\]/g;
matches the first [
matches anything that is not a ]
matches a ]
g flag for all matches
Implementation Example:
var string = "[Monster:Test]Maps=1,5,2,3[Monster:Test2]Maps=2-5";
var result = string.match(/\[[^\]]+\]/g).join("");
console.log(result);
* Although the example is javascript, you should be able to do this in any other language.

Regex: how to find a comma between two quotation marks

I need to find the comma between two quotation marks and replace it with a dot.
I'm trying with this
\".*?\"
but it finds everything between the quotation marks.
I need to transform something like this "100,21$" into this "100.21$"
In general, you can match quoted substrings first with a
"[^"]+"
and then replace the , with . in the matched block (in a callback, or a post-process method/function).
Alternatively, you might use capturing + backreferences:
"(\d*),(\d+\$)"
to replace with "$1.$2" (where $1 is the text captured with (\d*) and $2 is the value captured with (\d+\$)). See this demo.
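The first approach above (match the quoted substring, then fix it up in a callback) might look like this in JavaScript. It follows the question's example and turns the comma inside the quotes into a dot; the function name fixQuotedCommas is made up for illustration.

```javascript
// Hypothetical helper: replaces commas with dots, but only inside "quoted" parts.
function fixQuotedCommas(text) {
  // The outer regex finds each quoted block; the callback rewrites only that block.
  // A callback's return value is inserted literally, so the "$" in the data is safe.
  return text.replace(/"[^"]+"/g, (quoted) => quoted.replace(/,/g, "."));
}

const input = 'Here is an "example" of "100,21$" price, I am a string';
console.log(fixQuotedCommas(input));
// Here is an "example" of "100.21$" price, I am a string
```

Commas outside the quotes are left alone, which is the main reason to prefer the two-step form over a global replace.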
If your string is as simple as your example: "100,21$"
You can simply use this code:
str = str.replace(/,/g,'.');
If it's a little more complex, like:
Here is an "example" of "100,21$" price, I am a string
You can use this:
str = str.replace(/"(\d+),(\d+)(.*?)"/g,'"$1.$2$3"');
Regexr link if you want to test:
http://regexr.com/3dbo5
NB: I wrote the example, but you can use the regexp and the string pattern in other languages.

Regex to match alphanumerics, URL operators except forward slashes

I've been trying for the past couple of hours to get this regex right but unfortunately, I still can't get it. Tried searching through existing threads too but no dice. :(
I'd like a regex to match the following possible strings:
userprofile?id=123
profile
search?type=player&gender=male
someotherpage.htm
but not
userprofile/
helloworld/123
Basically, I'd like the regex to match alphanumerics, URL operators such as ?, = and & but not forward slashes. (i.e. As long as the string contains a forward slash, the regex should just return 0 matches.)
I've tried the following regexes but none seem to work:
([0-9a-z?=.]+)
(^[^\/]*$[0-9a-z?=.]+)
([0-9a-z?=.][^\/]+)
([0-9a-z?=.][\/$]+)
Any help will be greatly appreciated. Thank you so much!
The reason they all match is that your regexp matches part of the string and you've not told it that it needs to match the entire string. You need to make sure that it doesn't allow any other characters anywhere in the string, e.g.
^[0-9a-z&?=.]+$
Here's a small perl script to test it:
#!/usr/bin/perl
use strict;
use warnings;
# Candidate strings: the first four should match, the last two should not.
my @testlines = (
    "userprofile?id=123",
    "userprofile",
    "userprofile?type=player&gender=male",
    "userprofile.htm",
    "userprofile/",
    "userprofile/123",
);
foreach my $testline (@testlines) {
    if ($testline =~ /^[0-9a-z&?=.]+$/) {
        print "$testline matches\n";
    } else {
        print "$testline doesn't match - bad regexp, no cookie\n";
    }
}
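For comparison, here is a small JavaScript sketch of the same point: the character class is identical in both patterns, and only the ^ and $ anchors decide whether a partial match counts. The names allowed, unanchored, anchored and check are illustrative only.

```javascript
// Same character class as above, with and without anchors.
const allowed = "[0-9a-z&?=.]+";
const unanchored = new RegExp(allowed);           // may match part of the string
const anchored = new RegExp("^" + allowed + "$"); // must match the whole string

// Hypothetical helper: reports how each pattern treats the input.
function check(s) {
  return { partial: unanchored.test(s), whole: anchored.test(s) };
}

console.log(check("userprofile?id=123")); // { partial: true, whole: true }
console.log(check("userprofile/"));       // { partial: true, whole: false }
```

The unanchored pattern accepts userprofile/ because it is satisfied by the userprofile part alone.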
This should do the trick:
/^\w+(\.htm|\?\w+=\w*(&\w+=\w*)*)?$/i
To break this down:
^ // Anchor to start of string
\w+ // Match [a-z0-9_] (1 or more), to specify resource
( // Alternation group (i.e., a OR b)
\.htm // Match ".htm"
| // OR
\? // Match "?"
\w+=\w* // Match first term of query string (e.g., something=foo)
(&\w+=\w*)* // Match remaining terms of query string (zero or more)
)
? // Make alternation group optional
$ // Anchor to end of string
The i flag is for case-insensitivity.
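A quick way to sanity-check a pattern like this is to run it against the strings from the question. The sketch below does that in JavaScript. Note the leading ^: without it the pattern can match just the tail of a string such as helloworld/123. The names pagePattern and matchesPage are hypothetical.

```javascript
// Pattern anchored at both ends so the whole string has to fit.
const pagePattern = /^\w+(\.htm|\?\w+=\w*(&\w+=\w*)*)?$/i;

// Hypothetical helper: true when the string is an allowed page reference.
function matchesPage(s) {
  return pagePattern.test(s);
}

const shouldMatch = [
  "userprofile?id=123",
  "profile",
  "search?type=player&gender=male",
  "someotherpage.htm",
];
const shouldNotMatch = ["userprofile/", "helloworld/123"];

for (const s of shouldMatch.concat(shouldNotMatch)) {
  console.log(s, matchesPage(s));
}
```

This pattern is stricter than the character-class version: it only accepts .htm as an extension and requires well-formed key=value pairs after the ?.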
