How would I match all data between 2 symbols with Regex?


I'm trying to match everything from a dash (-) onward, including the dash itself, but only up to the first delimiter, which is a colon.
Example data:
Input:
bart23-testaccount#test.test:Test:Test:Test
Desired output:
bart23:Test:Test:Test
I've done some research and found the regex -(.*): but it's not fit for purpose.
I have thousands of lines in varying formats, but the goal is always the same: highlight all text between the - and the first : (which I will then delete). I will be using Notepad++.
I can answer any questions or make my post more specific if need be, it's kind of hard to explain.

In Notepad++ you can use regex find/replace. Look for:
^([^-]+)-[^:]+(:.*)$
which captures everything up to the first - in group 1, and everything after (and including) the first : in group 2, and replace with
\1\2

Using Notepad++, without any capture group:
Ctrl+H
Find what: -[^:]+
Replace with: LEAVE EMPTY
check Wrap around
check Regular expression
Replace all
Explanation:
- # a hyphen (by default, the first one in a line)
[^:]+ # 1 or more not colon
Result for given example:
bart23:Test:Test:Test
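
For anyone who wants to check the two patterns above outside Notepad++, here is a minimal JavaScript sketch. It assumes the sample line from the question; note that JavaScript writes group references as $1 and $2 rather than \1 and \2.

```javascript
// Sample line taken from the question.
const input = "bart23-testaccount#test.test:Test:Test:Test";

// Capture-group version: keep group 1 (before the first "-") and
// group 2 (from the first ":" onward), dropping what lies between.
const withGroups = input.replace(/^([^-]+)-[^:]+(:.*)$/, "$1$2");

// Version without groups: delete the first "-" and everything up to,
// but not including, the next ":".
const withoutGroups = input.replace(/-[^:]+/, "");

console.log(withGroups);
console.log(withoutGroups);
```

Both calls should leave bart23:Test:Test:Test for this input.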

Related

When replacing with regexes, how do I append digits to the end of a match group?

This is for Visual Studio code, but I'm not sure how relevant that is.
Here's my regex: cost:(\d\d)
I'm trying to multiply that number by 100. e.g., change cost:25 to cost:2500
But if I try this replacement, it thinks I'm asking for group #100, instead of group #1 and two zeros: cost:$100 changes cost:25 to cost:$100. If I put a space after the two zeros, it replaces it with this: cost:25 00. Is there a way to tell the regex engine where the group name stops?
I've tried replacing with cost:${1}00, but that doesn't work. It changes it to the literal text cost:${1}00.

On VS Code 1.51.1, this solution works for Replace (current file) and Replace in Files:
Use $0 to refer to the whole match:
Search: cost:\d\d
Replace: $000
As mentioned in the comment, these two methods only work for Replace (current file) but not Replace in Files:
Use $& to refer to the whole match:
Search: cost:\d\d
Replace: $&00
Use $1 to refer to the first capturing group:
Search: cost:(\d\d)
Replace: cost:$100
For the regular Replace, the replacement string parsing rule is the same as JavaScript. Since we don't have 10 or 100 capturing groups in the regex, the replacement string is parsed as $1 (whatever matched in group 1) followed by literal 00
However, for Replace in Files, the replacement only works when there is only 1 digit following the $1 (cost:$10). Once there are two or more digits, it's treated as a literal string replacement.
I played around with named capturing groups. While they are recognized in the search field, I can't refer to them in the replacement string. Common syntax such as ${group} and $<group> doesn't work.
After testing a bit more, I find some weird bugs with how Replace in Files is implemented.
This straight-forward method with named capturing group doesn't work:
Search: cost:(?<n>\d\d)
Replace: cost:$<n>00
But we can coax it to work by mixing named capturing group replacement with numbered group replacement:
Search: cost:(?<n>\d\d)()
Replace: cost:$2$<n>00
We can also apply this trick to get the numbered group replacement to work:
Search: cost:(\d\d)()
Replace: cost:$2$100
Seems that there is a bug when the replacement string only includes a single replacement token.
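
Since the regular Replace is said above to follow JavaScript's replacement-string rules, those rules can be checked directly in plain JavaScript. This is only a sketch of the language behaviour, not of VS Code's Replace in Files, which is the part that misbehaves.

```javascript
const line = "cost:25";

// Only one capturing group exists, so "$100" cannot mean group 10 or 100;
// it is read as "$1" followed by the literal text "00".
const numbered = line.replace(/cost:(\d\d)/, "cost:$100");

// "$&" stands for the whole match.
const wholeMatch = line.replace(/cost:\d\d/, "$&00");

// A named group is referenced as "$<name>" in the replacement string.
const named = line.replace(/cost:(?<n>\d\d)/, "cost:$<n>00");

console.log(numbered, wholeMatch, named);
```

All three variants should give cost:2500 in a current JavaScript engine.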

Instead of using groups, why not match the zero-width character using lookbehind?
(?<=cost:\d\d)
replacement:
00
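
A small JavaScript sketch of the same lookbehind idea, assuming an engine with lookbehind support (current Node.js and browsers have it). The function name appendZeros is made up for illustration.

```javascript
// Hypothetical helper: inserts "00" at every position that is
// immediately preceded by "cost:" and two digits.
function appendZeros(text) {
  // The pattern is zero-width, so nothing is removed, only inserted.
  return text.replace(/(?<=cost:\d\d)/g, "00");
}

const single = appendZeros("cost:25");
const several = appendZeros("cost:25, cost:30");
console.log(single, "|", several);
```

Because the match is empty, a value with three or more digits would also get "00" inserted after its second digit, so this only suits data where the cost always has exactly two digits.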

You have run into this bug discussed here: VSCode Regex Find/Replace In Files: can't get a numbered capturing group followed by numbers to work out
and the filed issue: https://github.com/microsoft/vscode/issues/102221
which was unfortunately closed with an "ask on SO" response despite having originated here. Although there are workarounds (also noted in the SO link above), you can still comment on the issue and ask for it to be reopened.
The weird thing is that if you add just one digit like cost:$10 it works just fine - that is why I consider this a bug.

Capture groups in MS Word regex

I am trying to remove the new line prior to "n=", replace with a space and contain the captured number in (), all these in MS Word's advanced find+replace, using wildcards.
Currently:
some preceding text
n=1,233,023
Desired result:
some preceding text (1,233,023)
I've been struggling with ^13n=(*{1,})
and replace with " (\1)" (without the quotes)
but it can't even match it.
Any help would be appreciated.
Thank you

MS Word has its own peculiar take on regular expressions. The following steps were successful for me (my copy is in Dutch, so please forgive any small translation errors):
Hit Ctrl+H to open Search And Replace.
Tick More and tick Use Wildcards
Now with this done we can search for:
^13(n=[,0-9]{1,})
^13 - Match newline.
( - Open capture group 1.
n= - Match "n=" literally.
[,0-9]{1,} - Match a digit or comma at least once.
) - Close capture group 1.
Replace by:
^s\1
^s\1 - A space followed by capture group 1.
As mentioned, I consider the flavor of regular expressions Word offers dodgy. Here you can read a bit more about its flaws too. I couldn't create capture groups within a capture group, nor was I able to create optional blocks of three consecutive digits and commas. Fortunately, for your case, matching a newline followed by a literal n= seemed to be enough.
Second to last note: because I'm Dutch, my local list separator is the semicolon. This also affects the occurrence indicators in this search-and-replace function, meaning I actually used: ^13(n=[,0-9]{1;})
And one last note: another pattern that worked for me was ^13(n=*^13), but since it gives no control over what sits between n= and the paragraph end, I would stick with my initial pattern. The * works here because it stands for any sequence of characters between n= and ^13.
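
For comparison, the same transformation can be sketched outside Word in JavaScript. This is not Word's wildcard syntax: \r?\n stands in for ^13, and the function name foldCounts is made up. Unlike the pattern above, the group here captures only the number, so the result has the (1,233,023) form asked for in the question.

```javascript
// Hypothetical helper: joins a line of the form "n=<number>" onto the
// previous line as " (<number>)".
function foldCounts(text) {
  // [0-9,]+ plays the role of [,0-9]{1,} in the Word pattern.
  return text.replace(/\r?\nn=([0-9,]+)/g, " ($1)");
}

const folded = foldCounts("some preceding text\nn=1,233,023");
console.log(folded);
```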

The wildcard search term should be
(^13)([a-z])(=)([,0-9]{1,})
and the replacement is a single space followed by (\4), that is, " (\4)" without the quotes.

RegEx: Find & Replace snake_case to UpperCamelCase/PascalCase Between Characters

I am using my IDE's Find & Replace (w/ RegEx) feature to find & replace the type parameter of arguments to go from snake_case to PascalCase (AKA UpperCamelCase). There are several files and lines throughout the project that need to be changed, and manually doing so is quite error prone and tedious (plus I am sure I am going to need the essential pattern again for future changes).
For example:
CURRENT: function find_all_by_name_and_status(_i_find_all_by_name_and_statusCriteria find_all_by_name_and_status_criteria) ...
Should be:
DESIRED: function find_all_by_name_and_status(IFindAllByNameAndStatusCriteria find_all_by_name_and_status_criteria) ...
The patterns I am using are the following:
FIND: (?<=\()_(.)(Criteria)*
REPLACE: \U$1\L
The replace pattern will work, as far as I can see, if the 1st found capture group is correct (the letter just after an "_").
The core pattern of _(.) finds the correct components to replace, however, it captures the other parts of the string as well. So, I added a positive lookbehind (?<=\() to start at the opening parentheses and an ending dummy capture for (Criteria)*. The entire pattern seems to cause the core pattern to only match once and not repeatedly. (?R) does not seem to help either.
P.S.
It looks like the (Criteria)* does not do anything either, but I figured that is the second problem to address after getting the core pattern to find all matches / repeat.
I feel like I am close to a solution, but not quite there yet. I, of course, could be VERY off base on the solution. Any help would be appreciated.

This expression,
(.*\()|(_)([a-z])([a-z]*)|(Criteria.*)
which is not really the best one, with a replacement along the lines of:
$1\U$3\L$4\E$5
might work here (the \E is only there for demonstration).
In this demo on the right panel, the expression is explained, if you might be interested.

This is working with Notepad++
Ctrl+H
Find what: (\(|\G)_(.[^\W_]*)(?=\w+Criteria)
Replace with: $1\u$2
check Match case
check Wrap around
check Regular expression
Replace all
Explanation:
(\(|\G) # group 1, opening parenthesis or restart from last match position
_ # underscore
(.[^\W_]*) # group 2, any one character followed by 0 or more alphanumeric characters
(?=\w+Criteria) # positive lookahead, make sure we have 1 or more word character and Criteria
Replacement:
$1 # content of group 1
\u$2 # content of group 2 with first character uppercased
Result for given example:
function find_all_by_name_and_status(IFindAllByNameAndStatusCriteria find_all_by_name_and_status_criteria) ...
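
If the change ever has to be scripted rather than done in an editor, the same idea can be sketched in JavaScript. JavaScript has neither \G nor the \u replacement escape, so this sketch uses a replacement callback instead. The function name pascalizeCriteriaTypes is made up, and the pattern assumes the type name directly follows "(" and ends in Criteria, as in the example.

```javascript
// Hypothetical helper: converts a snake_case type name that directly
// follows "(" and ends in "Criteria" to PascalCase.
function pascalizeCriteriaTypes(source) {
  // Outer pattern: the whole type name, starting at the "_" after "(".
  return source.replace(/(?<=\()_\w*?Criteria\b/g, (typeName) =>
    // Inner pattern: drop each "_" and uppercase the character after it.
    typeName.replace(/_([^\W_])/g, (_, ch) => ch.toUpperCase())
  );
}

const before =
  "function find_all_by_name_and_status(_i_find_all_by_name_and_statusCriteria find_all_by_name_and_status_criteria) ...";
const after = pascalizeCriteriaTypes(before);
console.log(after);
```

The function name and the lowercase parameter name are left alone because neither starts right after "(" with an underscore and ends in a capitalized Criteria.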

RegEx help for NotePad++

I need help with a regex that I just can't figure out. I need to search for broken hashtags that contain a space.
So the strings are, for example:
#ThisIsaHashtagWith Space
But the words "With Space" could also appear elsewhere, and I don't want to replace those.
So what matters is that the string starts with "#", then any characters, and then the words "With Space", which I want to replace with "WithSpace" to repair the hashtags.
I have a document with 10k of these broken hashtags and I've been trying the whole day without success.
I have tried on regex101.com
with the following regex:
^#+(?:.*?)+(With Space)
Even though I think it works on regex101.com, it doesn't in Notepad++.
Any help is appreciated.
Thanks a lot.
BR

In your current regex you match a #, then any characters, and then capture With Space in a capturing group.
You could change the capturing group to capture the first part of the match.
(#+.*?)With Space
Then you could use that group in the replacement:
$1WithSpace
As an alternative you could first match # followed by any character zero or more times, non-greedily (.*?), and then use \K to reset the starting point of the reported match.
Then match With Space.
#+(?:.*?)\KWith Space
In the replacement use WithSpace
If you want to match # one or more times, use the quantifier +. If the match should start at the beginning of the line, add the anchor ^ at the start of the regex.

Try using ^(#.+?)(With\s+Space) for your regex, as it also matches multiple spaces and tab characters. If you have multiple rows that you want to affect, use gmi for the flags. I just tried it with the following two strings, each on a separate line in Notepad++:
#blablaWith Space
#hello###$aWith Space
The replace with value is set to $1WithSpace and I've tried both replaceAll and replace one by one - seems to result in the following.
#blablaWithSpace
#hello###$aWithSpace
Feel free to comment with other strings you want replaced. Also be sure that you have selected the Regular expression search mode in NPP.
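
The capture-group approach can also be tried in JavaScript as a rough sketch. The function name repairHashtags is made up, and [ \t]+ is used as a narrower stand-in for \s+ so that a match can never run across a line break. JavaScript has no \K, so only the $1 variant carries over.

```javascript
// Hypothetical helper: repairs "With Space" only on lines that start
// with "#", leaving the same words elsewhere untouched.
function repairHashtags(text) {
  // g = every line, m = "^" matches at the start of each line.
  return text.replace(/^(#.*?)With[ \t]+Space/gm, "$1WithSpace");
}

const sample = "#blablaWith Space\nWith Space stays\n#hello###$aWith Space";
const repaired = repairHashtags(sample);
console.log(repaired);
```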

Try this: (#.*)( )
I tried this in Notepad++ and you should be able to just replace all with $1. Make sure you set the find mode to regular expressions first.
const str = "#ThisIsAHashtagWith Space";
console.log(str.replace(/(#.*)( )/g, "$1"));

Regex: Find multiple matching strings in all lines

I'm trying to match multiple strings in a single line using regex in Sublime Text 3.
I want to match all values and replace them with null.
Part of the string that I'm matching against:
"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false
List of strings that it should match:
"MyName"
50
192
200
false
Result after replacing:
"userName":null,"hiScore":null,"stuntPoints":null,"coins":null,"specialUser":null
Is there a way to do this without using sed or any other substitution method, but just by matching the wanted pattern in regex?

You can use this find pattern:
:(.*?)(,|$)
And this replace pattern:
:null\2
The first group matches any character (the dot) zero or more times (the asterisk); the question mark makes that quantifier lazy, so it matches as little as possible. The second group matches either a comma or the end of the string. In the replace pattern, I substitute the first group with null (as desired) and leave the text matched by the second group unchanged.

Here is an alternative to amaur's answer that doesn't put the comma in after the last substitution:
:\K(.*?)(?=,|$)
And this replacement pattern:
null
This works like amaur's but starts matching after the colon is found (using \K to reset the match starting point) and matches until a comma or the end of the line (using a positive lookahead).
I have tested and this works in Sublime Text 2 (so should work in Sublime Text 3)
Another slightly better alternative to this is:
(?<=:).+?(?=,|$)
which uses a positive lookbehind instead of resetting the regex starting point
Another good alternative (so far the most efficient here):
:\K[^,]*

This may help.
Find: (?<=:)[^,]*
Replace: null
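
Two of the patterns above carry over to JavaScript unchanged, which makes them easy to check against the sample from the question. This is only a sketch: like the originals, it assumes no value contains a comma.

```javascript
const fragment =
  '"userName":"MyName","hiScore":50,"stuntPoints":192,"coins":200,"specialUser":false';

// Lookbehind variant: everything after a ":" up to the next comma.
const viaLookbehind = fragment.replace(/(?<=:)[^,]*/g, "null");

// Capture-group variant: group 2 keeps the comma (or the empty end of string).
const viaGroups = fragment.replace(/:(.*?)(,|$)/g, ":null$2");

console.log(viaLookbehind);
console.log(viaGroups);
```

Both calls should produce the desired result line from the question. The \K variants are left out because JavaScript does not support \K.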
