Remove everything after second colon - regex

I need to remove everything from the colon following orange onward.
Example:
apple:orange:banana:grapes:
Becomes:
apple:orange
I've looked up a million different references for this and cannot find a solution.
Currently doing this in Notepad++ using the Find/Replace function.

Find what : (^[a-z]+:[a-z]+).*$
(^[a-z]+:[a-z]+) First capturing group. Match alphabetic characters at start of string, a colon, alphabetic characters.
.*$ Match anything up to the end of the string.
Replace with : \1
\1 Replace with captured group one.
You could of course make the expression more general:
Find what : (^[^:]+:[^:]+).*$
(^[^:]+:[^:]+) Capturing group. Match anything other than a colon at start of string, a colon, anything other than a colon.
.*$ Match anything up to end of string.
Replace with : \1
\1 Replace with captured group one.
As pointed out by revo in the comment below, you should disable the matches newline option when using the patterns above in Notepad++.
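To try the general pattern outside Notepad++, here is a minimal sketch using Python's re module. The function name keep_first_two_fields and the test strings are illustrative assumptions, not part of the answer above. The character classes also exclude \n so a match cannot run across lines, which roughly plays the role of disabling the matches newline option.

```python
import re

# Same idea as the general pattern above; \n is excluded from the classes
# (an assumption of this sketch) so a match never spans two lines.
PATTERN = re.compile(r"^([^:\n]+:[^:\n]+).*$", re.MULTILINE)

def keep_first_two_fields(text):
    """Drop everything after the second colon-separated field on each line."""
    return PATTERN.sub(r"\1", text)

print(keep_first_two_fields("apple:orange:banana:grapes:"))  # apple:orange
```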

If I understand you correctly, you can use the plugin ConyEdit to do this. You can use its command cc.dac <nth>/<regex>/[<mode>] [-options]. cc.dac means: delete after column.
For Example:
With ConyEdit running in the background, copy the text and the command line below, then paste:
apple:orange:banana:grapes:
cc.dac 2/:/ -d

How big is the file?
If it's a small file, you could write some simple code, something like the following Java snippet. Most programming languages support such operations.
import java.util.Arrays;
String input = "apple:orange:banana:grapes:";
String[] arrOfStr = input.split(":");
// Arrays have no indexOf(), so search through a List view instead.
int index = Arrays.asList(arrOfStr).indexOf("orange");
// Keep everything up to and including "orange".
String[] arrOfStrSub = Arrays.copyOfRange(arrOfStr, 0, index + 1);
String output = String.join(":", arrOfStrSub); // "apple:orange"
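The same split-and-rejoin idea can be sketched in Python, used here only to keep the added examples on this page in one language. The helper name truncate_after is hypothetical.

```python
def truncate_after(line, field, sep=":"):
    """Keep the fields up to and including the first occurrence of `field`."""
    parts = line.split(sep)
    # list.index raises ValueError when the field is absent; in that case
    # this sketch returns the line unchanged.
    try:
        end = parts.index(field) + 1
    except ValueError:
        return line
    return sep.join(parts[:end])

print(truncate_after("apple:orange:banana:grapes:", "orange"))  # apple:orange
```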

Related

Regex find replace to add function parameter

I'm trying to find and replace some function calls in a Python program. The idea is to add a boolean parameter to each call found in the project.
I looked for solutions on the internet because I don't know regex at all. It seems like a basic exercise for regex people, but still.
In my case I have this call in a lot of files :
myFunction("test")
My goal is to find this call and replace it with:
myFunction("test", false)
Could you help me write the regex ?
Try this command:
sed -re 's/(myFunction)[[:space:]]*\([[:space:]]*("test")[[:space:]]*\)/\1(\2, false)/' SOURCE_FILENAME
If you prefer to replace the existing source file with an updated one, then write -i SOURCE_FILENAME instead of SOURCE_FILENAME.
This works by defining a pattern to match the function call you would like to update:
myFunction (obviously) matches the text myFunction;
[[:space:]] matches any whitespace character, mainly spaces and tabs.
[[:space:]]* matches zero or more whitespace characters.
\( and \) match literal parentheses in your program text;
( and ) are regex metacharacters that match nothing themselves; they group a sub-pattern, so ("test") matches "test" and captures the matched text for later use.
Note that this pattern captures two things using ( and ). The ("test") is the second of these.
Now let us examine the overall structure of the Sed command 's/.../.../'. The s means "substitute," so 's/.../.../' is Sed's substitution command.
Between the first and second slashes comes the pattern we have just discussed. Between the second and third slashes comes the replacement text Sed uses to replace the matched part of any line of your program text that matches the pattern. Within the replacement text, the \1 and \2 are backreferences that place the text earlier captured using ( and ).
So, there it is. Not only have I helped you to write the regex but have shown you how the regex works so that, next time, you can write your own.
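For comparison, the same substitution can be sketched with Python's re module. This is a translation for illustration only: \s stands in for [[:space:]], and the names CALL and add_false_argument are assumptions of the sketch.

```python
import re

# Python translation of the sed pattern: \s stands in for [[:space:]].
CALL = re.compile(r'(myFunction)\s*\(\s*("test")\s*\)')

def add_false_argument(source):
    """Rewrite myFunction("test") calls as myFunction("test", false)."""
    return CALL.sub(r'\1(\2, false)', source)

print(add_false_argument('x = myFunction( "test" )'))
# x = myFunction("test", false)
```

As in the sed version, \1 and \2 in the replacement refer to the two captured groups, and any whitespace around the argument is normalized away.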
Refer to this:
import re
# Replace all whitespace characters with the digit "9":
txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)
You could use this regex to match and capture:
(myFunction\("test")(\))
Then use the replacement string below:
$1, false$2

Visual Studio Regex Find and Replace `new vtk[...]()` to `vtk[...].New()`

I am trying to replace all the places where we instantiate a VTK object like this: new vtk[...](); with this pattern: vtk[...].New().
The Find regex I'm using is: new vtk\w, but I don't know what the replacement regex should be. How would I do this?
For Example
... = new vtkPoints(); should turn into ... = vtkPoints.New();
Find new vtk\[(.+)\]\(\);
Replace with vtk[$1].New();
Explanation:
We must escape characters in the FIND string if they are regex operators. Thus, \( matches a literal (, \] matches a literal ], etc.
We capture the content using (.+), meaning at least one of any character (. matches any character), so that we can use it in the replace string.
In the REPLACE string, we use $1, which means the content of the first capture group.
Edit: if you want to support new vtk(); without anything inside the parentheses, replace (.+) with (.*), which means zero or more characters instead of at least one.
Edit 2: misread your question a bit, you need new vtk(\w+)\(\) with vtk$1.New()
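The pattern from Edit 2 can be checked quickly in Python before running it across a solution. Python writes the backreference as \1 where Visual Studio uses $1, and the function name to_factory_call is made up for this sketch.

```python
import re

def to_factory_call(source):
    """Turn `new vtkXxx()` into `vtkXxx.New()`."""
    return re.sub(r"new vtk(\w+)\(\)", r"vtk\1.New()", source)

print(to_factory_call("points = new vtkPoints();"))  # points = vtkPoints.New();
```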

Notepad++, replace the first and second comma to ":"

In Notepad++, I'd like to replace only the first and second comma (","), by ":".
Example :
blue,black,red -> blue:black:red (2 first commas replaced)
blue,black,red,yellow -> blue:black:red,yellow (third comma still here)
Thanks!
I believe you can do this by replacing this regex:
^([^,]*),([^,]*),(.*)$
With this:
$1:$2:$3
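For compatibility with cases where there are fewer than two commas, use the following. Note that the colons in the replacement are literal, so a line with fewer than two commas still gains them (a line containing only blue becomes ::blue):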
^(([^,]*),)?(([^,]*),)?(.*)$
$2:$4:$5
Something along this line,
^([^,]*),([^,]*),(.*)$
And replace with
$1:$2:$3
Or \1:\2:\3
Just two capturing groups is enough.
Regex:
^([^,]*),([^,]*),
Replacement string:
$1:$2:
Explanation:
^ Asserts that we are at the start.
([^,]*) Captures zero or more characters other than , and stores them in a group (i.e., group 1).
, Matches a literal , symbol.
([^,]*) Captures zero or more characters other than , and stores them in a group (i.e., group 2).
, Matches a literal , symbol.
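A minimal Python sketch of the two-group replacement, for checking the behaviour on the question's examples. The names FIRST_TWO and colonize are hypothetical, and \n is excluded from the character classes so that one match cannot run across two lines when the whole file is processed at once.

```python
import re

# Two capture groups, as explained above; \n is excluded (an assumption of
# this sketch) so that a match stays within a single line.
FIRST_TWO = re.compile(r"^([^,\n]*),([^,\n]*),", re.MULTILINE)

def colonize(text):
    """Replace the first two commas of each line with colons."""
    return FIRST_TWO.sub(r"\1:\2:", text)

print(colonize("blue,black,red"))         # blue:black:red
print(colonize("blue,black,red,yellow"))  # blue:black:red,yellow
```

In Python specifically, "blue,black,red,yellow".replace(",", ":", 2) gives the same result without a regex, since the third argument of str.replace caps the number of replacements.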
Well you can try to capture the parts in groups and then replace them as follows:
/^([^,]*),([^,]*),(.*)$/$1:$2:$3
How does it work: each line is matched such that the first part contains all data before the first comma, the second part in between the two commas and the third part all other characters (including commas).
This is simply replaced by joining the groups with colons.
A no-brainer; virtually "GREP 1-0-1". Not really an effort.
Just find
^([^,]+),([^,]+),
and replace with
\1:\2:
Click on the menu item: Search > Replace
In the dialog box that appears, set the following values...
Find what: ^([^,]+),([^,]+),
Replace with: $1:$2:
Search Mode: Regular expression

What REGEX pattern should I use to look for a specific string pattern and remove anything else that doesnt match?

I'm parsing through code using a Perl-REGEX parsing engine in my IDE and I want to grab any variables that look like
$hash->{ hash_key04}
and nuke the rest of the code.
So far my very basic regex doesn't do what I expected:
(.*)(\$hash\-\>\{[\w\s]+\})(.*)
(
\$
hash
\-\>
\{
[\w\s]+
\}
)
I know to use replace for this ($1, $2, etc.), but matching (.*) before and after the target string doesn't seem to capture all the rest of the code!
UPDATED:
I tried matching everything except null, but of course that's too greedy.
([^\0]*)
What expression in regex should I use to look only for the string pattern and remove the rest?
The problem is I want to be left with the list of $hash->{} strings after the replace runs in the IDE.
This is better approached from the other direction. Instead of trying to delete everything you don't want, what about extracting everything you do want?
my @vars = $src_text =~ /(\$hash->\{[\w\s]+\})/g;
Breaking down the regex:
/( # start of capture group
\$hash-> # prefix string with $ escaped
\{ # opening escaped delimiter
[\w\s]+ # any word characters or space
\} # closing escaped delimiter
)/gx; # match repeatedly, returning a list of captures (x permits these comments)
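The extract-instead-of-delete approach looks much the same in other languages. Here is a rough Python equivalent using re.findall; the sample source string is invented for the sketch.

```python
import re

HASH_VAR = re.compile(r"\$hash->\{[\w\s]+\}")

# Hypothetical input, made up for this sketch.
source = 'my $a = $hash->{ hash_key04}; print $hash->{name}, "\\n";'

# With no capture group in the pattern, findall returns the full matches.
found = HASH_VAR.findall(source)
print(found)  # ['$hash->{ hash_key04}', '$hash->{name}']
```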
Here is another way that might fit within your IDE better:
s/(\$hash->\{[\w\s]+\})|./$1/gs;
This regex tries to match one of your hash variables at each location, and if it fails, it deletes the next character and then tries again, which after running over the whole file will have deleted everything you don't want.
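A sketch of the same keep-or-drop substitution in Python, assuming re.DOTALL as the counterpart of Perl's s flag. The function name is hypothetical. A callable is used as the replacement so that the unmatched group is handled explicitly.

```python
import re

KEEP_OR_DROP = re.compile(r"(\$hash->\{[\w\s]+\})|.", re.DOTALL)

def strip_everything_else(source):
    """Keep only the $hash->{...} variables and delete every other character."""
    # Group 1 is None whenever the lone "." alternative matched, so that
    # character is replaced by the empty string.
    return KEEP_OR_DROP.sub(lambda m: m.group(1) or "", source)

print(strip_everything_else("x = $hash->{a}; y = $hash->{ b }\n"))
# $hash->{a}$hash->{ b }
```

Note that the surviving variables end up directly next to each other, because the separators between them are deleted along with everything else.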
Depends on your coding language. What you want is group 2 (the second set of characters in parentheses). In Perl that would be $2, in Vim it would be \2, etc.
It depends on the platform, but generally, replace the pattern with an empty string.
In javascript,
// prints "the la in ing"
console.log('the latest in testing'.replace(/test/g, ''));
In bash
$ echo 'the latest in testing' | sed 's/test//g'
the la in ing
In C#
Console.WriteLine(Regex.Replace("the latest in testing", "test", ""));
etc
By default the wildcard . won't match newlines. You can make it match them with a flag, which depends on the regex flavor and the language/API you're using. Or you can build a character set that matches everything yourself. Note that . inside a character set is a literal dot, so [.\n\r] will not do; use a pair of complementary classes instead:
[\s\S]* <- Matches any character, including newline and carriage return.
Combine this with capture groups to grab desired variables from your code and skip over lines which contain no capture group.
If you want help constructing the proper regex for your context you'll need to paste some input text and specify what the output should be.
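A short Python check of the newline behaviour described above. It uses re.DOTALL as the flag and [\s\S] as one explicit set that matches every character; both are specifics of Python's re module chosen for this sketch, and the sample text is invented.

```python
import re

text = "start\nmiddle\nend"

# Without a flag, . stops at the first newline, so there is no match.
no_flag = re.search(r"start.*end", text)
# re.DOTALL lets . match newlines as well.
with_flag = re.search(r"start.*end", text, re.DOTALL)
# A whitespace/non-whitespace pair covers every character without a flag.
with_class = re.search(r"start[\s\S]*end", text)

print(no_flag is None)             # True
print(with_flag.group() == text)   # True
print(with_class.group() == text)  # True
```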
I think you want to add a ^ to the beginning of the regex, s/^.*(PATTERN)(.*)$/$1/, so that it starts at the beginning of the line and goes to the end, removing anything except that pattern.

what can be the regex for the following string

I am doing this in groovy.
Input:
hip_abc_batch hip_ndnh_4_abc_copy_from_stgig abc_copy_from_stgig
hiv_daiv_batch hip_a_de_copy_from_staging abc_a_de_copy_from_staging
I want to get the last column, basically anything that starts with abc_.
I tried the following regex (it works for the second line but not the first):
\abc_.*\
but that gives me everything after abc_batch
I am looking for a regex that will fetch me anything that starts with abc_
but I cannot use \^abc_.*\ since the whole string does not start with abc_.
It sounds like you're looking for "words" (i.e., sequences that don't include spaces) that begin with abc_. You might try:
/\babc_.*\b/
The \b means (in some regular expression flavors) "word boundary."
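A quick Python check of the word-boundary idea on the question's input. One deliberate difference, which is an assumption of this sketch and not part of the answer: \S* (non-space characters) replaces .* so that the match stops at the end of the word. The approach works here because _ counts as a word character, so the abc_ inside hip_abc_batch has no boundary in front of it.

```python
import re

text = (
    "hip_abc_batch hip_ndnh_4_abc_copy_from_stgig abc_copy_from_stgig\n"
    "hiv_daiv_batch hip_a_de_copy_from_staging abc_a_de_copy_from_staging"
)

# \b only matches between a word character and a non-word character, so
# "abc_" preceded by "_" (as in hip_abc_batch) is skipped.
words = re.findall(r"\babc_\S*", text)
print(words)  # ['abc_copy_from_stgig', 'abc_a_de_copy_from_staging']
```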
Try this:
/\s(abc_.*)$/m
Here is a commented version so you can understand how it works:
\s # match one whitespace character
(abc_.*) # capture a string that starts with "abc_" and is followed
# by any character zero or more times
$ # match the end of the string
Since the regular expression has the "m" switch it will be a multi-line expression. This allows the $ to match the end of each line rather than the end of the entire string itself.
You don't need to trim the whitespace, as the capture group contains just the text. After a cursory scan of this tutorial, I believe this is the way to grab the value of a capture group using Groovy:
matcher = (yourString =~ /\s(abc_.*)$/m)
// this is how you would extract the value from
// the matcher object
matcher[0][1]
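The same multi-line pattern can be tried in Python, where re.MULTILINE plays the role of the m switch. This is only a sketch for checking the regex against the question's two lines, not a Groovy replacement.

```python
import re

text = (
    "hip_abc_batch hip_ndnh_4_abc_copy_from_stgig abc_copy_from_stgig\n"
    "hiv_daiv_batch hip_a_de_copy_from_staging abc_a_de_copy_from_staging"
)

# With a single capture group, findall returns just the captured text,
# so the leading whitespace matched by \s is not included.
last_columns = re.findall(r"\s(abc_.*)$", text, re.MULTILINE)
print(last_columns)  # ['abc_copy_from_stgig', 'abc_a_de_copy_from_staging']
```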
I think you are looking for this: \s(abc_[a-zA-Z_]*)$
If you are using Perl and you read all lines into one string, don't forget to set the m option on your regex (that stands for "Treat string as multiple lines").
Oh, and Regex Coach is your free friend.