sed regex match and replace any last digit - regex

I have many files containing the following IP address settings, and I want to replace the last digit of the IP address, but I am struggling to come up with the correct regex.
file1
IPADDR=10.30.2.26
NETMASK=255.255.0.0
GATEWAY=10.30.0.1
I want to change 10.30.2.26 to 10.30.2.27 using sed, but I am missing something.
I have many files to update, and the last digit could be anything.
I have tried sed 's/[^IPADDR].$/7/g' file1
How do I match anything between ^IPADDR{anything}$ ?

In your regex, [^IPADDR] is a character class that matches any single character except those listed between the brackets. I'm not sure that's what you want.
You can use an address instead to select lines starting with IPADDR (/^IPADDR/) and apply the substitution command only to them:
sed '/^IPADDR/s/[0-9]$/7/' file
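As a minimal sketch of the address form, assuming the three-line sample from the question and a hypothetical helper name set_last_digit: the /^IPADDR=/ address limits the substitution to that one line, so the NETMASK and GATEWAY lines, which also end in a digit, pass through unchanged.

```shell
# Hypothetical helper: replace the final digit on IPADDR lines only.
# $1 is the replacement digit; input is read from stdin.
set_last_digit() {
  sed "/^IPADDR=/s/[0-9]\$/$1/"
}

printf 'IPADDR=10.30.2.26\nNETMASK=255.255.0.0\nGATEWAY=10.30.0.1\n' |
  set_last_digit 7
```

To replace the whole last octet rather than a single digit, [0-9][0-9]*$ in place of [0-9]$ should work the same way.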

You may use the following command:
sed -r 's/(^IPADDR=[0-9.]+)([0-9]$)/\17/g' file
Prints:
IPADDR=10.30.2.27
NETMASK=255.255.0.0
GATEWAY=10.30.0.1

Related

regex in sed removing only the first occurrence from every line

I have the following file I would like to clean up
cat file.txt
MNS:N+ GYPA*01 or GYPA*M
MNS:M+ GYPA*02 or GYPA*N
MNS:Mc GYPA*08 or GYP*Mc
MNS:Vw GYPA*09 or GYPA*Vw
MNS:Mg GYPA*11 or GYPA*Mg
MNS:Vr GYPA*12 or GYPA*Vr
My desired output is:
MNS:N+ GYPA*01 or GYPA*M
MNS:M+ GYPA*02 or GYPA*N
MNS:Mc GYPA*08 or GYP*Mc
MNS:Vw GYPA*09 or GYPA*Vw
MNS:Mg GYPA*11 or GYPA*Mg
MNS:Vr GYPA*12 or GYPA*Vr
I would like to remove everything between ":" and the first occurrence of "or".
I tried sed 's/MNS:d*?or /MNS:/g', though it removes the second "or" as well.
I tried every option in https://www.geeksforgeeks.org/sed-command-in-linux-unix-with-examples/
to no avail. Should I create alias sed='perl -pe'? It seems that sed does not properly support regex.

perl should be more suitable here because we need lazy-match logic.
perl -pe 's|(:.*?or +)(.*)|:\2|' Input_file
By using .*?or we match only up to the nearest or after the colon, rather than up to the last or in the line.

This might work for you (GNU sed):
sed '/:.*\<or\>/{s/\<or\> */\n/;s/:.*\n/:/}' file
If a line contains : followed by the word or, substitute the first occurrence of the word or (and any spaces after it) with a unique delimiter (e.g. \n), then replace everything from : up to that delimiter with a bare :, so the colon itself is kept.
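The delimiter trick can be tried on a single line. This is a minimal sketch, assuming GNU sed and a made-up sample line (the helper name strip_to_first_or is hypothetical too). Here the second substitution writes the : back and the first one also consumes the spaces after or, so the output keeps the MNS: prefix.

```shell
# Requires GNU sed: \< \> word boundaries and \n in the replacement are GNU extensions.
# Hypothetical helper: drop everything between the first ":" and the first
# standalone word "or" (plus the spaces after it), keeping the ":" itself.
strip_to_first_or() {
  sed '/:.*\<or\>/{s/\<or\> */\n/;s/:.*\n/:/}'
}

# Assumed sample line; "orweqqwe" starts with "or" but is not the word "or".
printf 'MNS:2 orweqqwe or M+ GYPA*02 or GYPA*N\n' | strip_to_first_or
```

The word boundaries are what stop the match from ending inside orweqqwe.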

Wrt "I would like to remove everything between ":" and the first occurence of "or"": no, you wouldn't. The first occurrence of or in the 2nd line of the sample input is at the start of orweqqwe. The text immediately after : looks like it could be any set of characters, so couldn't it also contain a standalone or, e.g. MNS:2 or eqqwe or M+ GYPA*02 or GYPA*N?
Given that, and the fact that it's apparently a fixed number of characters to be removed on every line, it seems like this is what you should really be using:
$ sed 's/:.\{14\}/:/' file
MNS:N+ GYPA*01 or GYPA*M
MNS:M+ GYPA*02 or GYPA*N
MNS:Mc GYPA*08 or GYP*Mc
MNS:Vw GYPA*09 or GYPA*Vw
MNS:Mg GYPA*11 or GYPA*Mg
MNS:Vr GYPA*12 or GYPA*Vr

If the or is certain to occur exactly twice per line, as in the provided example, please try:
sed 's/\(MNS:\).\+ or \(.\+ or .*\)/\1\2/' file.txt
Result:
MNS:N+ GYPA*01 or GYPA*M
MNS:M+ GYPA*02 or GYPA*N
MNS:Mc GYPA*08 or GYP*Mc
MNS:Vw GYPA*09 or GYPA*Vw
MNS:Mg GYPA*11 or GYPA*Mg
MNS:Vr GYPA*12 or GYPA*Vr
Otherwise perl, which supports shortest (lazy) matching, is the better solution, as RavinderSingh13's answer shows.

ex supports lazy matching with \{-}:
ex -s '+%s/:\zs.\{-}or //g|wq' input_file
The pattern :\zs.\{-}or matches any character after the first : up to the first or.

How to cut a string till first numerical value appears using regex

I am trying to write a script that extracts the part of a string up to the first number.
For example, I have a file named typed-list-4.1.3.Final.jar and I want the output typed-list.jar
The files all have different names, but they all end with a version number and the .jar extension, so I was trying to use sed to cut from where the first number appears and then append .jar.
My files look like this:
log4j-slf4j-impl-2.8.2.jar, hibernate-core-5.0.12.Final.jar etc.
I tried a sed command like this, but it's not working:
sed -i 's/-[0-9]*$//g' test1.sh --- where test1.sh contains this string "typed-list-4.1.3.Final.jar"

How about:
sed 's/-\([0-9]\+\.\)\+[0-9]\+.*\.jar/.jar/' Input_file
Results for the provided inputs:
typed-list.jar
log4j-slf4j-impl.jar
hibernate-core.jar
The regex matches a substring that:
starts with a dash -
continues with one or more repetitions of digit(s) followed by a dot, then more digit(s)
may contain some other substring in between (such as Final)
ends with the extension .jar
Then the sed command replaces the matched substring with just the extension.
Hope this helps.
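The same idea can be written with POSIX-only syntax and run over the file names from the question. This is a sketch under assumptions: the helper name strip_jar_version is hypothetical, and the version is assumed to contain at least one dot (a name like foo-2.jar would be left untouched).

```shell
# Hypothetical helper: strip "-<version>[.qualifier]" from a jar file name.
# Uses only POSIX BRE syntax (\{1,\} instead of the GNU-only \+).
strip_jar_version() {
  sed 's/-[0-9][0-9]*\(\.[0-9][0-9]*\)\{1,\}.*\.jar$/.jar/'
}

for name in typed-list-4.1.3.Final.jar \
            log4j-slf4j-impl-2.8.2.jar \
            hibernate-core-5.0.12.Final.jar; do
  printf '%s\n' "$name" | strip_jar_version
done
```

Requiring a digit right after the dash is what keeps the dashes inside log4j-slf4j-impl from matching.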
Sed:
sed -E 's/(.*)-([[:digit:]]+\.){2}[[:digit:]]+.*(\.[^.]+)$/\1\3/' dat
log4j-slf4j-impl.jar
hibernate-core.jar
typed-list.jar
echo typed-list-4.1.3.Final.jar | awk 'sub(/-4.{10}/,"",$0)'
typed-list.jar

Property File with Sed regex - Ignore first character for match

I have a test property file with this in it:
-config.test=false
config.test=false
I'm trying to use sed to update the values of these properties whether or not they have the - in front of them. Originally I was using this, which worked:
sed -i -e "s/#*\(config.test\)\s*=\s*\(.*\)/\1=$(echo "true" | sed -e 's/[\/&]/\\&/g')/" $FILE_NAME
However, since I was basically ignoring all characters before the match, I found that when I had properties with keys that ended in the same value, it'd give me problems. Such as:
# The regex matches both of these
config.test=true
not.config.test=true
Is there a way to either ignore the first character for a match or ignore the initial - specifically?
EDIT:
Adding a little clarification in terms of what I'd want the regex to match:
config.test=false # Should match
-config.test=false # Should match
not.config.test=false # Should NOT match
sed -E 's/^(-?config\.test=).*/\1true/' file
? means zero or one repetition of the preceding expression, so the - can be present or absent when matching the regexp.
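A minimal sketch of that pattern wrapped in a function, run against the three cases from the question. The helper name set_config_test is hypothetical, and the new value is assumed not to contain /, & or a backslash, which are special in a sed replacement. It also tolerates whitespace before the =, as the original command did.

```shell
# Hypothetical helper: set config.test to $1, with or without a leading "-".
# Assumes $1 contains no "/", "&" or backslash (special in the replacement).
set_config_test() {
  sed -E "s/^(-?config\.test)[[:space:]]*=.*/\1=$1/"
}

printf 'config.test=false\n-config.test=false\nnot.config.test=false\n' |
  set_config_test true
```

The ^ anchor is what excludes not.config.test: the key has to start at the beginning of the line, optionally after a single -.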

I found a solution that matches keys of a specific length instead of ignoring the first character, using sed or awk. Sometimes the opposite approach achieves the same result more easily.
If you can only use sed, I have two workarounds, depending on your file.
If your file looks like this
$ cat file
config.test=false
-config.test=false
not.config.test=false
you can use this one-liner
sed 's/^\(.\{11,12\}=\)\(.*$\)/\1true/' file
sed looks at the beginning ^ of each line and groups, with \( ... \) for later back-referencing, any character . occurring 11 or 12 times \{11,12\} followed by a =.
This first group is put back into the line with the back reference \1.
The second group, which matches every character after the = to the end of the line \(.*$\), is dropped. In its place sed inserts your desired string true.
This also means that anything following the value, such as a trailing comment, is chopped off.
If you want to avoid this and your file looks like
$ cat file
config.test=false # Should match
-config.test=false # Should match
not.config.test=false # Should NOT match
you can use this one-liner
sed 's/^\(.\{11,12\}=\)\(false\)\(.*$\)/\1true\3/' file
This is like the previous example but uses three groups for back-referencing.
The content of the former group 2 is now in group 3, so nothing after the change from false to true is chopped off.
The new second group \(false\) is dropped and replaced by the string true.
If your file looks like the one in the previous example and you are allowed to use awk, you can try this
awk -F'=' 'length($1)<=12 {sub(/false/,"true")};{print}'
To me this looks much more self-explanatory, but the decision is yours.
Both sed examples invoke sed only once, which is always good.
The first sed command takes 39 characters to type and the second 50.
The awk command takes 52 characters to type.
Please tell me if this works for you or if you need another solution.

Regex command line change format of each line

I have a file that contains lines in a format similar to this...
/data/file.geojson?10,20,30,40
/data/file.geojson?bbox=-5.20751953125,49.05227025601607,3.0322265625,56.46249048388979
/data/file.geojson?bbox=-21.46728515625,45.99696161820381,19.2919921875,58.88194208135912
/data/file.geojson?bbox=-2.8482055664062496,54.38935426009769,-0.300750732421875,55.158473983815306
/data/file.geojson?bbox=-21.46728515625,45.99696161820381,19.2919921875,58.88194208135912
/data/file.geojson?bbox=-21.46728515625,45.99696161820381,19.2919921875,58.88194208135912
I've tried a combination of grep, sed, gawk, and |(pipes) to try and pattern match and then change the format to be more like this...
[10,40],[30,40],[30,20],[10,20],
[-5.20751953125,56.46249048388979],[3.0322265625,56.46249048388979].....
Hopefully you get the idea from the first line so I don't have to type out all the examples manually!
I've got the hang of regex to match the co-ordinates. In fact the input file is the result of extracting from apache access logs. It might be easier to read/understand answers if they just match positive integer numbers, I will then be able to slot in a more complicated pattern to match the right range.

To arrange the results the way you wish, it is important to be able to access the last four values on each line.
No pattern matching is required if you use awk. You can split the input strings by a set of delimiters and reassemble the resulting fields. 40 can be accessed as $(NF), 30 as $(NF-1) and so on.
awk -F'[?,=]' '
{printf "[%s,%s],[%s,%s],[%s,%s],[%s,%s]\n",
$(NF-3),$(NF),$(NF-1),$(NF),
$(NF-1),$(NF-2),$(NF-3),$(NF-2)
}' file
I'm using ?, , or = as the field delimiters. This makes it simple to access the columns of interest.
Output:
[10,40],[30,40],[30,20],[10,20]
[-5.20751953125,56.46249048388979],[3.0322265625,56.46249048388979],[3.0322265625,49.05227025601607],[-5.20751953125,49.05227025601607]
[-21.46728515625,58.88194208135912],[19.2919921875,58.88194208135912],[19.2919921875,45.99696161820381],[-21.46728515625,45.99696161820381]
[-2.8482055664062496,55.158473983815306],[-0.300750732421875,55.158473983815306],[-0.300750732421875,54.38935426009769],[-2.8482055664062496,54.38935426009769]
[-21.46728515625,58.88194208135912],[19.2919921875,58.88194208135912],[19.2919921875,45.99696161820381],[-21.46728515625,45.99696161820381]
[-21.46728515625,58.88194208135912],[19.2919921875,58.88194208135912],[19.2919921875,45.99696161820381],[-21.46728515625,45.99696161820381]
Btw, sed can also be used here:
sed -r 's/.*[?=]([^,]+),([^,]+),([^,]+),(.*)/[\1,\4],[\3,\4],[\3,\2],[\1,\2]/' file
The command captures each of the numbers at the end in a separate capturing group and reassembles them in the replacement part.
Not all versions of sed support the + quantifier. The most compatible version would look like this :)
sed 's/.*[?=]\([^,]\{1,\}\),\([^,]\{1,\}\),\([^,]\{1,\}\),\(.*\)/[\1,\4],[\3,\4],[\3,\2],[\1,\2]/' file
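The capture-and-reorder idea can be checked against the first two sample lines. This is only a sketch: it assumes a sed that accepts -E, uses a hypothetical function name bbox_to_corners, and restricts each group to the characters -, digits and . as a stand-in for the stricter coordinate pattern mentioned in the question.

```shell
# Hypothetical helper: turn "...?x1,y1,x2,y2" or "...?bbox=x1,y1,x2,y2"
# into the four corners [x1,y2],[x2,y2],[x2,y1],[x1,y1].
# Lines that do not end in four comma-separated numbers pass through unchanged.
bbox_to_corners() {
  sed -E 's/.*[?=]([-0-9.]+),([-0-9.]+),([-0-9.]+),([-0-9.]+)$/[\1,\4],[\3,\4],[\3,\2],[\1,\2]/'
}

printf '%s\n' \
  '/data/file.geojson?10,20,30,40' \
  '/data/file.geojson?bbox=-5.20751953125,49.05227025601607,3.0322265625,56.46249048388979' |
  bbox_to_corners
```

Because .* is greedy, [?=] binds to the last ? or = on the line, so the optional bbox= prefix needs no special handling.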
sed strips off items prior to numbers, then awk splits on comma and outputs in different order. Assuming data is in a file called "td.txt"
sed 's/^[^0-9-]*//' td.txt|awk -F, '{print "["$1","$4"],["$3","$4"],["$3","$2"],["$1","$2"],"}'
This might work for you (GNU sed):
sed -r 's/^.*\?[^-0-9]*([^,]*),([^,]*),([^,]*),([^,]*)/[\1,\4],[\3,\4],[\3,\2],[\1,\2]/' file
Or with more toothpicks:
sed 's/^.*?[^-0-9]*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\)/[\1,\4],[\3,\4],[\3,\2],[\1,\2]/' file
You can use the following to match:
(\/data\/file\.geojson\?(?:bbox=)?)([0-9.-]+),([0-9.-]+),([0-9.-]+),([0-9.-]+)
And replace with the following:
$1[$2,$3],[$4,$5]

using sed to copy lines and delete characters from the duplicates

I have a file that looks like this:
#"Afghanistan.png",
#"Albania.png",
#"Algeria.png",
#"American_Samoa.png",
I want it to look like this
#"Afghanistan.png",
#"Afghanistan",
#"Albania.png",
#"Albania",
#"Algeria.png",
#"Algeria",
#"American_Samoa.png",
#"American_Samoa",
I thought I could use sed to do this but I can't figure out how to store something in a buffer and then modify it.
Am I even using the right tool?
Thanks
You don't have to get tricky with regular expressions and replacement strings: use sed's p command to print the line intact, then modify the line and let it print implicitly
sed 'p; s/\.png//'
Glenn jackman's response is OK, but it also doubles the rows which do not match the expression.
This one, instead, doubles only the rows which matched the expression:
sed -n 'p; s/\.png//p'
Here, -n stands for "print nothing unless explicitly printed", and the p flag in s/\.png//p prints the line if a substitution was made, but not otherwise.
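The difference only shows up on input that contains a line without .png. A small sketch, with hypothetical helper names and an assumed two-line input, that runs the two commands above side by side:

```shell
# Hypothetical helpers wrapping the two commands above.
dup_all()      { sed 'p; s/\.png//'; }      # every line ends up printed twice
dup_matching() { sed -n 'p; s/\.png//p'; }  # second copy only if s/// changed the line

# Assumed input: one .png entry and one line without ".png".
sample='#"Albania.png",
#"README",'

printf '%s\n' "$sample" | dup_all
echo '---'
printf '%s\n' "$sample" | dup_matching
```

With this input the first command prints the README line twice, while the second prints it once.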
That is pretty easy to do with sed, and you do not even need to use the hold space (the sed auxiliary buffer). Given the input file below:
$ cat input
#"Afghanistan.png",
#"Albania.png",
#"Algeria.png",
#"American_Samoa.png",
you should use this command:
sed 's/#"\([^.]*\)\.png",/&\
#"\1",/' input
The result:
$ sed 's/#"\([^.]*\)\.png",/&\
#"\1",/' input
#"Afghanistan.png",
#"Afghanistan",
#"Albania.png",
#"Albania",
#"Algeria.png",
#"Algeria",
#"American_Samoa.png",
#"American_Samoa",
This command is just a substitution command (s///). It matches anything starting with #" followed by non-period chars ([^.]*) and then by .png",. It also captures the non-period chars before .png", using the group delimiters \( and \), so we can reuse what was matched by this group. So, this is the to-be-replaced regular expression:
#"\([^.]*\)\.png",
Next comes the replacement part of the command. The & just inserts everything that was matched by #"\([^.]*\)\.png", into the changed content. If it were the only element of the replacement part, nothing would change in the output. However, following the & there is a newline character, represented by the backslash \ followed by an actual newline, and on the new line we add the #" string followed by the content of the first group (\1) and then the string ",.
This is just a brief explanation of the command. Hope this helps. Also, note that you can use the \n string to represent newlines in some versions of sed (such as GNU sed). It would render a more concise and readable command:
sed 's/#"\([^.]*\)\.png",/&\n#"\1",/' input
I prefer this over Carles Sala's and Glenn Jackman's:
sed '/\.png/p;s/\.png//'
Could just say it's personal preference.
or one can combine both versions and apply the duplication only on lines matching the required pattern
sed -e '/^#".*\.png",/{p;s/\.png//;}' input