Regular expressions: inserting a word and NOT replacing the found key - regex

I have a list of items, such as:
this_thing.ety
other-stuff.ety
34-pairings.ety
I want to do this:
At the beginning of every line, insert "images/"
so the result of search/replace with reg exp would yield:
images/this_thing.ety
images/other-stuff.ety
images/34-pairings.ety
I am using:
^.
as my anchor to find the beginning of each line, but everything I've tried to add "images/" has resulted in actually replacing that first character. I am using Notepad++, but can use anything.
I thought using ${foo} was on the right track but I'm missing something here.

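In a regex, ^. matches the beginning of a line plus one character. If you replace that match with 'images/', the first character, which was part of the match, is replaced too. Empty lines won't get 'images/' and stay unchanged, because they don't match ^.
Just use ^ on its own as the regex: it matches the position at the beginning of the line without consuming any character.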
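A minimal sketch of the same idea in Python rather than Notepad++ (the variable names are illustrative). With re.MULTILINE, ^ matches the empty position at the start of each line, so the replacement is inserted and nothing is consumed:

```python
import re

lines = "this_thing.ety\nother-stuff.ety\n34-pairings.ety"

# '^' is zero-width: with re.MULTILINE it matches the position at the
# start of every line, so the substitution inserts text and removes nothing.
prefixed = re.sub(r"^", "images/", lines, flags=re.MULTILINE)

print(prefixed)
```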

. matches any character, but only a single one. Use ^..*$ or ^.+$ (if your regex flavor supports +) so that every line containing at least one character is matched in full. As a substitution, it would look like this:
s/^(.+)$/images\/\1/
where \1 re-inserts the part captured by the parentheses. In tools that use the older basic syntax, where grouping parentheses must be escaped, try
s/^\(..*\)$/images\/\1/
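The capture-and-reinsert variant can be sketched in Python as well. The sample input below is an assumption chosen to include an empty line, which shows that .+ skips lines with no characters:

```python
import re

lines = "this_thing.ety\n\n34-pairings.ety"

# (.+) captures the whole non-empty line; \1 puts it back after the prefix.
# The empty middle line has no character for .+ to match, so it is skipped.
prefixed = re.sub(r"^(.+)$", r"images/\1", lines, flags=re.MULTILINE)

print(prefixed)
```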

Related

Regular expression matching space but at the end of line

I'm trying to replace multiple spaces with a single one, but not at the start of the line.
Example:
___abc___def__
___ghi___jkl__
should turn to
___abc_def__
___ghi_jkl__
Note that I've replaced space with underscore
A simple search using the following pattern:
([^\s])\s+
matches the space at the end of the first line up to the space at the beginning of the next one.
So, if I replace with \1_, I get the following:
___abc_def_ghi_jkl
That is absolutely not what I expect, and other regex engines, e.g., PowerGREP or the one in Visual Studio, don't behave that way.
If you want to match only horizontal spaces, use \h:
Find what: (?<=\S)\h+(?=\S)
Replace with: (a space)
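For comparison, here is a rough Python equivalent. Python's re module has no \h, so this sketch assumes [ \t] is an acceptable stand-in for "horizontal whitespace"; the underscores from the question are written as real spaces:

```python
import re

text = "   abc   def  \n   ghi   jkl  "

# (?<=\S) and (?=\S) require a non-whitespace character on both sides,
# so leading and trailing runs are left alone; [ \t] never matches "\n".
collapsed = re.sub(r"(?<=\S)[ \t]+(?=\S)", " ", text)

print(repr(collapsed))
```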
There are several possible interpretations of the question. For each of them the replacement will be a single space character.
If spaces is plural and means space characters but not tabs, then use a find string of (^ {2,})|( {2,}$).
If spaces is plural and should include tabs, then use a find string of (^[ \t]{2,})|([ \t]{2,}$).
If any leading or trailing spaces and tabs (one or more) are to be replaced with a space, then use a find string of (^[ \t]+)|([ \t]+$).
The general form of each of these is (^...)|(...$). The | means an alternation so either the preceding or the following bracketed expression can match. Hence the find what text can match either at the beginning or the end of a line. The ... varies depending on exactly what needs to be matched. Specifying [ \t] means only the two characters space and tab, whereas \s includes the line-end characters.
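As an illustration of the (^...)|(...$) form, here is a small Python sketch of the third variant. The sample text is made up, and re.MULTILINE is needed so that ^ and $ apply to each line:

```python
import re

text = "  abc  def\t\n\tghi  "

# Either branch may match: whitespace at a line start, or at a line end.
# [ \t] excludes "\n", so a match never crosses a line boundary.
edges = re.compile(r"(^[ \t]+)|([ \t]+$)", flags=re.MULTILINE)
trimmed = edges.sub(" ", text)

print(repr(trimmed))
```

The interior double space between abc and def is untouched because neither branch can match away from a line edge.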
Ok, so the intention was to replace this:
Hey diddle diddle, \n<br/>
The Cat and the fiddle,\n
with this:
Hey diddle diddle,\n<br/>
The Cat and the fiddle,\n
A slightly modified version of Toto's answer did the trick:
(?<=\S)\h+(?=\S)|\s+$
This finds any horizontal whitespace between non-whitespace characters, as well as trailing whitespace at the end of the line.

Regex to change all past a certain pattern to Uppercase

I have an xml file that has a value like
JOBNAME="JBDSR14353_Some_other_Descriptor"
I am looking for an expression that will go through the file and change all of the characters in the quotes to uppercase letters. Is there a regex that will search for JOBNAME="Anything within the quotes" and change the quoted text to uppercase? Or a command that will find JOBNAME= and change everything on that line to uppercase? I know that I can just search for JOBNAME=, use a VU command in Vim to uppercase the line, store that in a macro and run it, but I was wondering if there was a way to get this done with a regex.
Here's an alternative with :substitute, as you had originally intended. This works better than @Zach's solution with gU_ when there's other text in the line:
:%s/JOBNAME="[^"]\+"/\U&/g
"[^"]\+" matches the quoted text (the negated class [^"] cannot run past the closing quote, which handles multiple quoted strings in a line)
\U turns the remainder of the replacement uppercase
for simplicity, the entire match (&) is uppercased here, but one could have also used capture groups (\(...\)), or match limiting with \zs
You can use the :g command which executes a command on lines that match a pattern:
:g/JOBNAME/norm! gU_
This will execute gU_, which uppercases all letters on a line, on every line that matches JOBNAME.
If there are other things on the same line that you don't want to capitalize, here is a solution for only the words in quotes:
:g/JOBNAME/norm! f"gU;
f" goes to the next quote. gU uppercases the text covered by a motion. The motion used is ; which searches for the next " (it repeats the last f command).
To do this with substitution you can use the \U atom which makes everything after it uppercase.
:%s/JOBNAME="\zs.*\ze"/\U&
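\zs and \ze mark the start and end of the match and & is the whole match. This means that only the part between quotes is replaced. Note that .* is greedy, so if a line contains several quoted values the match extends to the last quote on that line.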
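Outside Vim, the same substitution can be sketched in Python. Python's replacement strings have no \U, so this example passes a function to re.sub instead. The surrounding XML element and the OWNER attribute are hypothetical, added only to show that other text on the line is left alone:

```python
import re

line = '<JOB JOBNAME="JBDSR14353_Some_other_Descriptor" OWNER="someone"/>'


def upper_jobname(text: str) -> str:
    # Group 1 keeps the JOBNAME=" prefix, group 2 is the quoted value.
    # [^"]+ stops at the closing quote, so later attributes are untouched.
    return re.sub(
        r'(JOBNAME=")([^"]+)',
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


print(upper_jobname(line))
```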

Notepad++ Replace all with an exception

I am attempting to edit a csv file, below is a sample line from this file.
|MIGRATE|;|10000|;|2ACC0003|;|30/09/13|;|Positive Adjmt.|;||;|MIGRATE|;|95004U
The beginning of the line |MIGRATE| needs to be modified without changing the second MIGRATE so the line would read
|MIGRATE|;|MIG_IN|;|10000|;|2ACC0003|;|30/09/13|;|Positive Adjmt.|;||;|MIGRATE|;|95004U
There are 7700 or so lines so if I am forced to do this manually I will probably cry a little.
Thanks in advance!
I'm not sure exactly what you're asking, but from what I can guess this might help: temporarily replace all the occurrences you don't want changed with another word, replace the rest with what you want, and then restore the temporary word.
It seems like you could just search for:
^\|MIGRATE\|
And replace with:
|MIGRATE|;|MIG_IN|
Make sure you've checked 'Regular expression' in the 'Search Mode' options.
Explanation: The ^ is a begin anchor; it will match the beginning of the line, ensuring that it does not match the second |MIGRATE|. The \ characters are required to escape the | characters since they normally have special meaning in regular expressions, and you want to match a literal |.
You can use beginning of line anchors:
Find:
^(\|MIGRATE\|)
Replace with:
$1;|MIG_IN|
regex101 demo
Just make sure that you are using the regular expression mode of the Search&Replace.
If you want to be a bit fancier, you can use a positive lookbehind:
Find:
(?<=^\|MIGRATE\|)
Replace with:
;|MIG_IN|
^ Will match only at the beginning of a line.
( ... ) is called a capture group, and will save the contents of the match in a variable you can use. In the first regex, I accessed the variable using $1 in the replacement. The first capture gets stored to $1, the second to $2, etc.
| is a special character meaning 'or' in regex (it matches one character or group of characters or another, e.g. a|b matches a or b). As such, you need to escape it with a backslash to make a regex match a literal |.
In my second regex, I used (?<= ... ) which is called a positive lookbehind. It makes sure that the part to be matched has what's inside before it. For instance, (?<=a)b matches a b only if it has an a before it. So that the b in ab matches but not in bb.
The website I linked also explains the details of the regex and you can try out some regex yourself!
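Both forms can be tried in Python as a quick check. This is only a sketch: Python writes the backreference as \1 rather than $1, and the sample line is the one from the question.

```python
import re

line = "|MIGRATE|;|10000|;|2ACC0003|;|30/09/13|;|Positive Adjmt.|;||;|MIGRATE|;|95004U"

# Capture-group form: match the leading |MIGRATE| and write it back via \1.
with_group = re.sub(r"^(\|MIGRATE\|)", r"\1;|MIG_IN|", line, flags=re.MULTILINE)

# Lookbehind form: match the empty position right after the leading
# |MIGRATE|, so the replacement is a pure insertion.
with_lookbehind = re.sub(r"(?<=^\|MIGRATE\|)", ";|MIG_IN|", line, flags=re.MULTILINE)

print(with_group)
print(with_lookbehind)
```

In both cases the second |MIGRATE| is left alone because it does not sit at the start of a line.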

Regex to match and copy up to but not including last occurrence of a particular value

In one regex ksh line I need to:
look for the occurrence of a particular string followed by any number of characters up to the last occurrence of a particular value (in this case a comma),
copy the stuff matched to the output, and then
insert a new value after the copied text and before the last occurrence of the particular value (in this case a comma)
So, if my input string looked like this:
SEARCH_STRING anything_else(foo,bar),
What I'd like to output is this:
SEARCH_STRING anything_else(foo,bar) INSERTED_VALUE,
So far, my sed expression looks like this (which only matches and copies everything up to the first occurrence of the comma, not up to the last):
sed -e 's/SEARCH_STRING [^,]\+/& INSERTED_VALUE/'
...which results in this:
SEARCH_STRING anything_else(foo INSERTED_VALUE,bar)
...which is not quite right. I know I need to use something like a negative look ahead - but can't quite get the syntax right. Any advice you could offer would be greatly appreciated, thanks. I also need to do the same replacement incidentally at the end of the line even if the comma isn't found as well please (although I appreciate that may require a separate question and expression). Thanks in advance for any advice offered....
Use the $ special character to match the end of the line, and the . special character to match the last character before that:
sed 's/\(SEARCH_STRING .*\)\(.\)$/\1 INSERTED_VALUE\2/'
You could replace the final dot in the match expression with a comma if you know that this is always going to be the character you want to insert before. If that last character varies, then using dot will match any such character. One downside, however, is that it also matches whitespace, so if your line has a few extra spaces after the comma, this expression inserts the value before the last space, not before the comma.
To insert before the last non-whitespace character, use this expression instead (\S and \s are GNU sed extensions):
sed 's/\(SEARCH_STRING .*\)\(\S\s*\)$/\1 INSERTED_VALUE\2/'
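The trailing-whitespace case can be checked with a short Python sketch. The input line, with two spaces after the comma, is an assumed example:

```python
import re

line = "SEARCH_STRING anything_else(foo,bar),  "

# Group 2 is the last non-whitespace character plus any trailing whitespace,
# so the insertion lands before the comma even with spaces after it.
result = re.sub(r"(SEARCH_STRING .*)(\S\s*)$", r"\1 INSERTED_VALUE\2", line)

print(repr(result))
```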
The simplest would be to use a lookahead, SEARCH_STRING .*(?=,), but sed does not support lookaheads. Instead you can do something like this:
sed -e 's/\(SEARCH_STRING .*\)\(,.*\)/\1 INSERTED_VALUE\2/'
Basically, we capture what comes before and after the last comma in two groups, and then piece it back together with INSERTED_VALUE in the middle.
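A sketch of both ideas in Python, whose re module does support lookahead. The variable names are illustrative only:

```python
import re

line = "SEARCH_STRING anything_else(foo,bar),"

# Greedy .* first runs to the end of the line, then backtracks to the
# LAST comma, which is why group 2 starts there and not at the first one.
by_groups = re.sub(r"(SEARCH_STRING .*)(,.*)", r"\1 INSERTED_VALUE\2", line)

# Lookahead version: the comma is asserted but not consumed,
# so it does not need to be written back.
by_lookahead = re.sub(r"(SEARCH_STRING .*)(?=,)", r"\1 INSERTED_VALUE", line)

print(by_groups)
print(by_lookahead)
```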

what can be the regex for the following string

I am doing this in groovy.
Input:
hip_abc_batch hip_ndnh_4_abc_copy_from_stgig abc_copy_from_stgig
hiv_daiv_batch hip_a_de_copy_from_staging abc_a_de_copy_from_staging
I want to get the last column, basically anything that starts with abc_.
I tried the following regex (it works for the second line but not the first):
\abc_.*\
but that gives me everything after abc_batch
I am looking for a regex that will fetch me anything that starts with abc_
but I can not use \^abc_.*\ since the whole string does not start with abc_
It sounds like you're looking for "words" (i.e., sequences that don't include spaces) that begin with abc_. You might try:
/\babc_.*\b/
The \b means (in some regular expression flavors) "word boundary."
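A small Python sketch of the word-boundary idea on the sample input. It swaps the answer's .* for \S* (an assumption on my part) so that each match stops at the next whitespace instead of running to the end of the line:

```python
import re

text = (
    "hip_abc_batch hip_ndnh_4_abc_copy_from_stgig abc_copy_from_stgig\n"
    "hiv_daiv_batch hip_a_de_copy_from_staging abc_a_de_copy_from_staging"
)

# "_" counts as a word character, so there is no \b inside "hip_abc_batch";
# only a word that actually begins with abc_ can match.
words = re.findall(r"\babc_\S*", text)

print(words)
```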
Try this:
/\s(abc_.*)$/m
Here is a commented version so you can understand how it works:
\s # match one whitespace character
(abc_.*) # capture a string that starts with "abc_" and is followed
# by any character zero or more times
$ # match the end of the string
Since the regular expression has the "m" switch it will be a multi-line expression. This allows the $ to match the end of each line rather than the end of the entire string itself.
You don't need to trim the whitespace, as the capture group contains just the text. In the match result, index 0 is the whole match and index 1 is the first (and only) capture group. After a cursory scan of this tutorial I believe this is the way to grab the value of a capture group using Groovy:
// Groovy slashy strings take no trailing flags, so the "m" switch
// is written inline as (?m)
matcher = (yourString =~ /(?m)\s(abc_.*)$/)
// matcher[0] is the first match; index 1 is its first capture group
matcher[0][1]
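For readers without Groovy at hand, roughly the same extraction can be sketched in Python, where re.MULTILINE plays the role of the "m" switch:

```python
import re

text = (
    "hip_abc_batch hip_ndnh_4_abc_copy_from_stgig abc_copy_from_stgig\n"
    "hiv_daiv_batch hip_a_de_copy_from_staging abc_a_de_copy_from_staging"
)

# With re.MULTILINE, $ matches at the end of each line. findall returns
# group 1 only, i.e. the text without the leading whitespace character.
last_columns = re.findall(r"\s(abc_.*)$", text, flags=re.MULTILINE)

print(last_columns)
```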
I think you are looking for this: \s(abc_[a-zA-Z_]*)$
If you are using Perl and you read all lines into one string, don't forget to set the m option on your regex (it stands for "Treat string as multiple lines").
Oh, and Regex Coach is your free friend.