Replace non-unique occurrences with sed or another command

This is my first post here, and I am at beginner level. Is there a way I can solve this problem with sed (or any other means)? I want to manipulate a newly created file daily and replace some IP and port occurrences.
1) I want to replace the first occurrence of "5027,5028" with A3 and the second with A4.
2) I want to replace the first occurrence of "5026" with A1 and the second with A2.
PS. I have tried to simplify the example and left the preceding lines with version="y" or version="x", which could help distinguish the occurrences from each other. (The first x and y version pair is a primary connection and the other two are the secondary connection.)
Input file:
version="x"
commaSeparatedList="5027,5028"`
version="y"
commaSeparatedList="5026"
version="x"
commaSeparatedList="5027,5028"
version="y"
commaSeparatedList="5026"
Edited file:
version="1.4.1-12"
commaSeparatedList="A3"
version="1.3.0"
commaSeparatedList="A1"
version="1.4.1-12"
commaSeparatedList="A4"
version="1.3.0"
commaSeparatedList="A2"
Sorry, I had some editing horror for a few minutes; I hope it is easier to understand now. I am basically receiving this file on a system that is deployed nightly, and I want to edit the file with a cron job before the system starts, to make sure a connection works.

Do not bother trying to use sed for this. It can be done, but sed is the wrong tool.
Use awk instead. To replace the first occurrence of "5027,5028" with A3 and the second with A4:
awk '/5027,5028/ && count < 2 { if (count++) repl = "A4"; else repl = "A3";
     sub("5027,5028", repl) } 1' input
The second replacement is left as an exercise. It is basically the same thing: you can either run awk twice or add additional clauses to the above.
To overwrite the original file, use shell redirections:
awk ... input > tmpfile && mv tmpfile input
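For reference, here is a sketch that handles both replacements in a single pass, using the same counter trick for each pattern (assuming, as in the sample, that each pattern occurs at most twice):
awk '/5027,5028/ && c1 < 2 { sub("5027,5028", c1++ ? "A4" : "A3") }
     /5026/ && c2 < 2 { sub("5026", c2++ ? "A2" : "A1") }
     1' input > tmpfile && mv tmpfile input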

This might work for you (GNU sed):
sed '1,/5027,5028/s/5027,5028/A3/;s/5027,5028/A4/;1,/5026/s/5026/A1/;s/5026/A2/' file
The address 1,/5027,5028/ limits the first substitution to the range from line 1 through the first matching line, so only the first occurrence becomes A3; the unrestricted substitution that follows can then only match the later occurrence, which becomes A4. The same pairing handles 5026.

Bash replace substring after first colon

I am trying to build a connection string that requires pulling 3 IP addresses from another config file. When I get those values, I need to replace the port on each. I plan to replace each port using simple Bash find and replace, ${string/pattern/replacement}, but my problem is that I'm stuck on the best way to parse the pattern out of the IP.
Here is what I have so far:
myFile.config:
ip.1=ip-ip-1-address:1234:5678
ip.2=ip-ip-2-address:1234:5678
ip.3=ip-ip-3-address:1234:5678
Copying from another simple process, I found I can pull the value of each IP like this:
IP1=`grep "ip.1=" /path/to/conf/myFile.config | awk -F "=" '{print $2}'`
which gives me ip-ip-1-address:1234:5678. However, I need to replace 1234:5678 with, for example, 6543. I've been looking around and I found this awesome answer that detailed using Bash prefix substitution, but that relies on knowing the parameter. For example, I would have to do it this way:
test=${IP1##ip-ip-1-address:}
which results in $test being 1234:5678. That's fine, but maybe I don't know the IP address as the parameter, so I'm back to considering regex, unless there's a way for me to use * as the parameter or something, but I have been unsuccessful so far. For regex, I have tried a bunch of things such as test=${IP1/(?<=:).*/}.
Note that the ${IP1/(?<=:).*/} you tried uses the ${var/pattern/replacement} string manipulation syntax, which does not support regexes, only glob patterns.
You seem to want
x='ip.1=ip-ip-1-address:1234:5678'
echo "${x%%:*}:6543" # => ip.1=ip-ip-1-address:6543
The ${x%%:*} takes the value of x and removes the longest suffix starting at the first :, i.e., everything from the first colon (inclusive) to the end. Then :6543 is appended to the result with "${x%%:*}:6543".
To extract that value, you may also use
awk '/^ip\.1=/{sub("^[^:]+:", "");print}' myFile.config
The awk command finds lines starting with ip.1= and removes all text from the start of the line through the first colon (inclusive), printing only those values.
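If you need all three addresses rebuilt at once, here is a minimal loop sketch over the config file (assuming the key=value layout above; the replacement port 6543 is just the example value from the question):
while IFS='=' read -r key val; do
  case $key in
    ip.*) echo "${val%%:*}:6543" ;;   # val is e.g. ip-ip-1-address:1234:5678
  esac
done < /path/to/conf/myFile.config
This prints ip-ip-1-address:6543 and so on, one per line, ready to be joined into the connection string.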

Issues while processing zeroes found in CSV input file with Perl

Friends:
I have to process a CSV file, using Perl, and produce an Excel file as output, using the Excel::Writer::XLSX module. This is not homework but a real-life problem, where I cannot download whichever Perl version I want (I actually need to use Perl 5.6) or whichever Perl modules I want (I have a limited set of them). My OS is UNIX. I can also use (embedded in Perl) ksh and csh (with some limitations, as I have found so far). Please limit your answers to the tools I have available. Thanks in advance!
Even though I am not a Perl developer but come from other languages, I have already done my work. However, the customer is asking for extra processing, and that is where I am getting stuck.
1) The stones in the road I found come from two sides: from Perl's and from Excel's particular styles of processing data. I have already found a workaround for the Excel side, but, as mentioned in the subject, I have difficulties processing zeroes found in the CSV input file. To handle Excel, I am using the '0 form, which is the final data representation Excel seems to use with the # formatting style.
2) Scenario:
I need to catch standalone zeroes which might be present in any line / column / cell of the CSV input file and write them as such (as zeroes) to the Excel output file.
I will go directly to the point of my question to avoid losing your valuable time; I am providing more details after the question:
Research and question:
I tried to use Perl regexes to find standalone "0" fields and replace them with an arbitrary string, planning to change them back to "0" at the end of processing.
perl -p -i -e 's/\b0\b/string/g' myfile.csv
and
perl -i -ple 's/\b0\b/string/g' myfile.csv
These work, but only from the command line. They do not work when I call them from the Perl script as follows:
system("perl -i -ple 's/\b0\b/string/g' myfile.csv")
I do not know why... I have already tried using exec and eval instead of system, with the same results.
Note that I have a ton of regex that work perfectly with the same structure, such as the following:
system("perl -i -ple 's/input/output/g' myfile.csv")
I have also tried using backticks and qx//, without success. Note that qx// and backticks do not behave the same here, since qx// complains about the \b boundaries because of the forward slash delimiter.
I have tried using sed -i, but my system rejects -i as an invalid flag (I do not know if this happens on all UNIX systems, but it happens at least on the one at work; however, it accepts perl -i).
I have tried embedding awk (which works from the command line), in this way:
system `awk -F ',' -v OFS=',' '$1 == \"0\" { $1 = "string" }1' myfile.csv > myfile_copy.csv
But this works only for the first column (on the command line) and, besides the disadvantage of the extra copy file, Perl complains about the > redirection, interpreting it as "greater than"...
system(q#awk 'BEGIN{FS=OFS=",";split("1 2 3 4 5",A," ") } { for(i in A)sub(0,"string",$A[i] ) }1' myfile.csv#);
This awk works from the command line, but only on 5 columns, and it does not work from Perl using the q# # quoting.
All the combinations of exec and eval have also been tested without success.
I have also tried passing each of the awk components to system as separate comma-separated arguments, but did not find any valid way to pass the redirector (>), since Perl rejects it for the reason mentioned above.
Using another approach, I noticed that the "standalone zeroes" seem to be "swallowed" by the Text::CSV module, so I got rid of it and went back to a traditional loop over the CSV line by line with a split on commas, which preserves the zeroes. However, I then found the "mystery" of isdual in Perl, and because of my limited set of modules I cannot use Dumper. I also explored the binary guts of Perl scalars and tried $x ^ $x, which was deprecated in version 5.22 but is valid up to that version (and mine, as I said, is 5.6). This is useful to distinguish numbers from strings. However, while if( $x ^ $x ) returns TRUE for strings, if( !( $x ^ $x ) ) did not return TRUE when $x = 0. [UPDATE: I tried this in a dedicated Perl script, written just for this purpose, and it works. I believe my probably wrong conclusion ("not returning TRUE") was reached before I realised that Text::CSV was swallowing my zeroes. Doing new tests...]
I will appreciate very much your help!
MORE DETAILS ON MY REQUIREMENTS:
1) This is a dynamic report coming from a database, which is handed over to me and which I pick up programmatically from a folder. Dynamic means that it might have any number of tables, any number of columns in each table, any names as column headers, and any number of rows in each table.
2) I do not know, and cannot know, the column names, because they vary from report to report. So I cannot be guided by column names.
A sample input:
Alfa,Alfa1,Beta,Gamma,Delta,Delta1,Epsilon,Dseta,Heta,Zeta,Iota,Kappa
0,J5,alfa,0,111.33,124.45,0,0,456.85,234.56,798.43,330000.00
M1,0,X888,ZZ,222.44,111.33,12.24,45.67,0,234.56,0,975.33
3) Input Explanation
a) This is an example of a random report with 12 columns and 3 rows. The first row is the header.
b) I call "standalone zeroes" those "clean" zeroes which come in the CSV file, from the second row onwards, between commas, like 0, (if it is the first position in the row) or like ,0, in subsequent positions.
c) In the second row of the example you can read, from the beginning of the row, 0,J5,alfa,0, which in this particular case are "words" or "strings": 4 names (note that two of them are zeroes, which are required to be treated as strings). Thus we have a 4-name-column example (Alfa,Alfa1,Beta,Gamma are the headers for those columns, but only in this scenario). From that point onwards in the second row you can see floating point (*.00) numbers and, among them, 2 zeroes, which are numbers. Finally, in the third row, you can read M1,0,X888,ZZ, which are the names for the first 4 columns. Note, please, that the 4th column in the second row has 0 as its name, while the 4th column in the third row has ZZ as its name.
Summary: as a general picture, I have a table-report divided in 2 parts, from left to right: 4 columns for names, and 8 columns for numbers.
Always the first M columns are names and the last N columns are numbers.
- It is unknown what M is: how many columns devoted to words / strings I will receive.
- It is unknown what N is: how many columns devoted to numbers I will receive.
- It is KNOWN that, after the M columns end, N always starts, and this is constant for all the rows.
I have done some quick research on the Perl regex boundary ( \b ) and have not found any relevant information on whether or not it applies in Perl 5.6.
However, since you are using an old Perl version, try the traditional UNIX / Linux style (I mean, what Perl inherits from the shell), like this:
system("perl -i -ple 's/^0/string/g' myfile.csv");
The previous regex should do the work, making the change at the start of each line of your CSV file, if it matches.
Or, maybe better (if you have those "standalone" zeroes and want to avoid any unwanted change to some string with leading zeroes):
system("perl -i -ple 's/^0,/string,/g' myfile.csv");
[Note that I have added the comma after the zero and, of course, after the string.]
Note that the first regex should work; the second one is just a caveat, to be extra cautious.
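A side note worth checking (an educated guess, not verified on the poster's system): inside a Perl double-quoted string, \b is the backspace escape character, so the string handed to the shell by system("perl -i -ple 's/\b0\b/string/g' myfile.csv") contains literal backspace characters where the word boundaries should be. Doubling the backslashes preserves \b for the inner perl:
system("perl -i -ple 's/\\b0\\b/string/g' myfile.csv");
That would also explain why the same system() structure works fine for regexes without backslashes, such as s/input/output/g.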

How to use the sed command in shell scripting to replace a string in a txt file in one directory with values from another file?

I am very new to shell scripting and am trying to learn the sed command's functionality.
I have a file called configurations.txt with some variables defined in it, each initialised to a string value.
I am trying to replace strings in a file (values.txt) present in another directory with the values of the variables defined in configurations.txt.
Data present in configurations.txt:-
mem="cpu.memory=4G"
proc="cpu.processor=Intel"
Data present in the values.txt (present in /home/cpu/script):-
cpu.memory=1G
cpu.processor=Dell
I am trying to make a shell script called repl.sh. I don't have a lot of code in it for now, but here is what I've got:
#!/bin/bash
source /home/configurations.txt
sed <need some help here>
The expected output is that, after an appropriate regex is applied and I run the script with sh repl.sh, values.txt must contain the following data:
cpu.memory=4G
cpu.processor=Intel
These were originally 1G and Dell.
Would highly appreciate some quick help. Thanks
This question lacks any attempt at an abstract routine and reads like "help me do something concrete, please". Thus it's very unlikely that anyone will provide a full solution to the problem.
What you should do is try to split this task into a number of small pieces.
1) Iterate over configurations.txt and get the values from each line. To do that you need to get X and Y from a value="X=Y" string.
This regex could be helpful here: ([^=]+)=\"([^=]+)=([^=]+)\". It contains 3 matching groups. For example,
>> sed -r 's/([^=]+)=\"([^=]+)=([^=]+)\"/\1/' configurations.txt
mem
proc
>> sed -r 's/([^=]+)=\"([^=]+)=([^=]+)\"/\2/' configurations.txt
cpu.memory
cpu.processor
>> sed -r 's/([^=]+)=\"([^=]+)=([^=]+)\"/\3/' configurations.txt
4G
Intel
2) For each X and Y, find X=Z in values.txt and substitute it with X=Y.
For example, let's change cpu.memory value in values.txt with 4G:
>> X=cpu.memory; Y=4G; sed -r "s/(${X}=).*/\1${Y}/" values.txt
cpu.memory=4G
cpu.processor=Dell
Use the -i flag to make the changes in place.
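Putting the two steps together, a rough sketch of what repl.sh could look like (paths taken from the question; the parameter expansions are one possible way to split X and Y apart, and -r/-i assume GNU sed):
#!/bin/bash
while IFS= read -r line; do
  kv=${line#*=\"}     # drop the variable name and opening quote
  kv=${kv%\"}         # drop the closing quote, leaving e.g. cpu.memory=4G
  key=${kv%%=*}       # cpu.memory
  val=${kv#*=}        # 4G
  sed -i -r "s/(${key}=).*/\1${val}/" /home/cpu/script/values.txt
done < /home/configurations.txt
Note that the key is used as a regex, so its dot matches any character; that is harmless for data like this, but worth knowing.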
Here is an awk based answer:
$ cat config.txt
cpu.memory=4G
cpu.processor=Intel
$ cat values.txt
cpu.memory=1G
cpu.processor=Dell
cpu.speed=4GHz
$ awk -F= 'FNR==NR{a[$1]=$2; next;}; {if($1 in a){$2=a[$1]}}1' OFS== config.txt values.txt
cpu.memory=4G
cpu.processor=Intel
cpu.speed=4GHz
Explanation: first read config.txt and save its keys and values in memory; then read values.txt. If a particular key was defined in config.txt, use the saved value from memory (config.txt).

Using awk to match a column in a log file and print the entire line

I'm trying to write a script which will analyse a log file.
I want to give the user the option to enter a pattern and then print any line which matches this pattern in a specific column (the fifth one).
The following works from the terminal:
awk '$5 == "acpid:" {print $0}' filename
OK, so above I'm trying to match "acpid:". This works fine, but in the script I want to be able to allow multiple entries and search for them all. The problem is I'm messing up the variable in the script. This is what I have:
echo "enter any services you want details on, seperated by spaces"
read -a details
for i in ${details[@]}
do
echo $i
awk '$5 == "${i}" {print $0}' ${FILE}
done
Again, if I directly put in a matching expression instead of the variable, it works, so I guess my problem is here. Any tips would be great.
UPDATE
So I'm using the second option suggested (shown below) by @ghoti, as it matches my log file slightly better.
However, I'm not having any luck with multiple entries. I've added two lines to illustrate the results I'm getting: echo $i and echo "finish loop". As placed, they should tell me what input the loop is currently on and when I'm leaving the loop.
read -a details
re=""
for i in "${details[@]}"; do
re="$re${re:+|}$i"
echo $i
echo "finish loop"
done
awk -v re="$re" '$5 ~ re' "$FILE"
When I give read an input of either "acpid" or "init" separately, a perfect result is matched; however, when the input is "acpid init", the following is the output:
acpid init
finish loop
What I'm seeing from this is that read is taking both words as one entry, and awk is then searching for but not matching them (as would be expected). So why is the input not being taken as two separate entries? I had thought the -a option to read specified that words separated by a space would be placed into separate elements of the array. Perhaps I have not declared the array correctly?
Update update
OK, cancel the above update. Like a fool, I'd forgotten that I'd changed IFS to \n earlier in the script. I changed it back and bingo!!!
Many thanks again to @ghoti for his help!!
There are a few ways that you could do what you want.
One option might be to run through a for loop for each word, applying a separate call to awk for each and showing the results sequentially. For example, if you entered foo bar into the details array, you would get a list of foo matches followed by a list of bar matches:
read -a details
for i in "${details[#]}"; do
awk -v s="$i" '$5 == s' "$FILE"
done
The idea here is that we use awk's -v option to get each word into the script, rather than expanding the variable inside the quoted script. You should read about how bash deals with different kinds of quotes. (There are also a few Stack Overflow questions on the topic.)
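To see concretely why the single-quoted version failed, compare what awk actually receives in each case (a small illustration using the service name from the question):
i="acpid:"
awk '$5 == "${i}" {print $0}' "$FILE"   # awk compares column 5 to the literal text ${i}
awk -v s="$i" '$5 == s' "$FILE"         # awk compares column 5 to the value acpid: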
Another option might be to construct a regular expression that searches for all the words you're interested in, all at once. This has the benefit of using a single run of awk to search through $FILE:
read -a details
re=""
for i in "${details[#]}"; do
re="$re${re:+|}$i"
done
awk -v re="$re" '$5 ~ re' "$FILE"
The result will contain all the interesting lines from $FILE in the order in which they appear in $FILE, rather than ordered by the words you provided.
Note that this is a fairly rudimentary search, without word boundaries, so if you search for "foo bar babar", you may get results you don't want. You can play with the regex yourself, though. :)
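If the fields should match exactly, one way to tighten it is to anchor the whole of column five (a sketch along the same lines):
awk -v re="^($re)$" '$5 ~ re' "$FILE"
With the anchors in place, bar no longer matches a field like babar; only a field equal to one of the entered words passes.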
Does that answer your question?

Sed command find and replace in even lines of a file

Hi, I am new to this forum. I want to use sed to replace an expression on the even lines of a file. My problem is that I cannot think of how to save the changes in the original file (i.e., how to overwrite the file with the changes). I have tried:
sed -n 'n;p;' filename | sed 's/aaa/bbb/'
but this does not save the changes. I would appreciate your help on this.
Try:
sed -i '2~2 s/aaa/bbb/' filename
The -i option tells sed to work in place: rather than writing the edited version to stdout and leaving the original file untouched, it applies the changes to the file. The 2~2 portion is the address for the lines to which sed should apply the commands: 2~2 means edit only the even lines, 1~2 would edit only the odd lines, and 5~6 would start at line 5 and then edit every sixth line (5, 11, 17, ...).
@Mithrandir's answer is an excellent, correct and complete one.
I will just add that the m~n addressing method is a GNU sed extension that may not work everywhere. For example, not all Macs have GNU sed, and *BSD systems may not have it either.
So, if you have a file like the following one:
$ cat f
1 ab
2 ad
3 ab
4 ac
5 aa
6 da
7 aa
8 ad
9 aa
...here is a more universal solution:
$ sed '2,${s/a/#A#/g;n}' f
1 ab
2 #A#d
3 ab
4 #A#c
5 aa
6 d#A#
7 aa
8 #A#d
9 aa
What does it do? The address of the command is 2,$, which means it is applied to all lines from the second one (2) to the last one ($). The command is in fact two commands, treated as one because they are grouped by brackets ({ and }). The first is the substitution s/a/#A#/g. The second is the n command, which prints the current pattern space and then reads the next line into it. So each cycle outputs the substituted line followed by its untouched successor, and the following cycle starts at the line after that. Since the range starts at the 2nd line, the substitution is performed on each even line.
Of course, since you want to update the original file, you should call it with the -i flag. I would note that some of those non-GNU seds require a parameter to the -i flag: an extension to be appended to the file name. This names a generated backup file holding the old content. (So if you call, for example, sed -i.bkp s/a/b/ myfile.txt, the file myfile.txt is altered, but another file, called myfile.txt.bkp, is created with the old content of myfile.txt.) Since a) it is required in some places, b) it is accepted by GNU sed and c) it is good practice anyway (if something goes wrong, you can reuse the backup), I recommend using it:
$ ls
f
$ sed -i.bkp '2,${s/a/#A#/g;n}' f
$ ls
f f.bkp
Anyway, my answer is just a complement for some specific scenarios. I would use @Mithrandir's solution myself, not least because I am a Linux user :)
This might work for you:
sed -i 'n;s/aaa/bbb/' file
Use sed -i to edit the file in place.
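If your sed has no -i at all, the usual workaround is the portable even-line command above plus a temporary file (a sketch; tmpfile is an arbitrary name):
sed '2,${s/aaa/bbb/;n}' filename > tmpfile && mv tmpfile filename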