I'm trying to write a bash script that gets user input, checks a .txt for the line that contains that input then plugs that into a wget statement to commence a download.
In testing the functionality awk seems to print out every line, not just pattern matched lines.
chosen=DSC01985
awk -v c="$chosen" 'BEGIN {FS="/"; /c/}
{print $8, "found", c}
END{print " done"}' ./imgLink.txt
The above should read imgLink.txt, search for the pattern, and report that the pattern was found. Instead it prints the 8th field of every line in the file.
I have tried moving /c/ out of the BEGIN block, but to no avail.
What's going on here?
Example input:
https://xxxx/xxxx/xxxx/xxxx/xxx/DSC01533.jpg
https://xxxx/xxxx/xxxx/xxxx/xxx/DSC01536.jpg
https://xxxx/xxxx/xxxx/xxxx/xxx/DSC01543.jpg
https://xxxx/xxxx/xxxx/xxxx/xxx/DSC01558.jpg
https://xxxx/xxxx/xxxx/xxxx/xxx/DSC01565.jpg
etc.
Example output:
...
DSC02028.jpg found DSC01985
DSC02030.jpg found DSC01985
DSC02032.jpg found DSC01985
DSC02038.jpg found DSC01985
DSC02042.jpg found DSC01985
etc.
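You were close in your attempt. Two things go wrong: BEGIN runs before any input is read, so a pattern placed inside it does nothing useful, and /c/ matches the literal letter c rather than the contents of the variable c. The print block has no pattern of its own, so it runs for every line. You can't search for an awk variable with /var/; use the ~ operator instead. Could you please try the following, assuming the string you are looking for appears in the URL values (which you have currently xxxed out in your post):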
awk -v c="$chosen" -F'/' '$0 ~ c{print $NF " found " c}' Input_file
I'm not sure why you print done in your END block; you can add it back here if you need it. Also, $NF means the last field of the current line, so you can print it however you need.
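To tie this back to the original goal (feeding the matched link to wget), here is a minimal sketch. It assumes only the first matching line is wanted and that the input should be matched literally rather than as a regex. The sample URLs and the temporary file are hypothetical stand-ins for imgLink.txt, and the wget call is echoed rather than executed.

```shell
# Hypothetical sample data standing in for imgLink.txt
links=$(mktemp)
printf '%s\n' \
  'https://example.invalid/a/b/c/d/DSC01533.jpg' \
  'https://example.invalid/a/b/c/d/DSC01985.jpg' > "$links"

chosen=DSC01985

# index() does a literal substring test, so regex metacharacters in
# $chosen are not interpreted; "exit" stops at the first match.
url=$(awk -v c="$chosen" 'index($0, c) { print; exit }' "$links")

if [ -n "$url" ]; then
  # Print the command instead of running it; drop "echo" to download.
  echo wget "$url"
else
  echo "no link contains $chosen" >&2
fi
rm -f "$links"
```

Testing for an empty result before calling wget avoids starting a download with no URL when the input matches nothing.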
I have a csv file like the following:
entity_name,data_field_name,type
Unit,id
Track,id,LONG
The second row is missing a comma. I wonder if there might be some regex or awk like tool in order to append commas to the end of line in case there are missing commas in these rows?
Update
I know the requirements are a little vague. There might be several alternative ways to narrow down the requirements such as:
The header row should define the number of columns (and commas) that is valid for the whole file. The script should read the header row first and find out the correct number of columns.
The number of columns might be passed as an argument to the script.
The number of columns can be hardcoded into the script.
I didn't narrow down the requirements at first because I was ok with any of them. Of course, the first alternative is the best but I wasn't sure if this was easy to implement or not.
Thanks for all the great answers and comments. Next time, I will state acceptable alternative requirements explicitly.
You can use this awk command to fill up all rows starting from 2nd row with the empty cell values based on # of columns in the header row, in order to avoid hard-coding # of columns:
awk 'BEGIN{FS=OFS=","} NR==1{nc=NF} NF{$nc=$nc} 1' file
entity_name,data_field_name,type
Unit,id,
Track,id,LONG
Earlier solution:
awk 'BEGIN{FS=OFS=","} NR==1{nc=NF} {printf "%s", $0;
for (i=NF+1; i<=nc; i++) printf "%s", OFS; print ""}' file
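Both commands take the column count from the header row. A small sketch of the same idea on a row that is short by more than one column; the fourth input row (Solo) is made up for illustration:

```shell
csv=$(mktemp)
printf '%s\n' \
  'entity_name,data_field_name,type' \
  'Unit,id' \
  'Track,id,LONG' \
  'Solo' > "$csv"

# Assigning to field number nc on a short row makes awk create the
# missing fields as empty strings and rebuild the line using OFS.
padded=$(awk 'BEGIN { FS = OFS = "," }
              NR == 1 { nc = NF }
              NF && NF < nc { $nc = "" }
              1' "$csv")
printf '%s\n' "$padded"
rm -f "$csv"
```

With this input the short rows come out as Unit,id, and Solo,, while complete rows are left alone.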
I would use sed,
sed 's/^[^,]*,[^,]*$/&,/' file
Example:
$ echo 'Unit,id' | sed 's/^[^,]*,[^,]*$/&,/'
Unit,id,
$ echo 'Unit,id,bar' | sed 's/^[^,]*,[^,]*$/&,/'
Unit,id,bar
Try this:
$ awk -F, -v OFS=, 'NF==2{$2=$2","}1' file
Output:
entity_name,data_field_name,type
Unit,id,
Track,id,LONG
With another awk:
awk -F, 'NF==2{$3=""}1' OFS=, yourfile.csv
To balance all the awk solutions, here is a vim-only solution:
:v/,.*,/norm A,
Rationale:
/,.*,/ searches for 2 commas in a line
:v applies a global command to each line NOT matching the search
norm A, enters normal mode and appends a , to the end of the line
This MIGHT be all you need, depending on the info you haven't shared with us in your question:
$ awk -F, '{print $0 (NF<3?FS:"")}' file
entity_name,data_field_name,type
Unit,id,
Track,id,LONG
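If the column count is passed in rather than read from the header (the second alternative in the question's update), the same idea can be wrapped in a shell function. This is only a sketch; pad_csv and its argument order are hypothetical names chosen here.

```shell
# pad_csv COLUMNS FILE: append commas to rows with fewer than COLUMNS fields.
pad_csv() {
  awk -F, -v n="$1" '{
    line = $0
    # Empty lines (NF == 0) are passed through untouched.
    if (NF) for (i = NF; i < n; i++) line = line FS
    print line
  }' "$2"
}

csv=$(mktemp)
printf '%s\n' 'a,b' 'a,b,c,d' 'a' > "$csv"
out=$(pad_csv 4 "$csv")
printf '%s\n' "$out"
rm -f "$csv"
```

Appending to a copy of $0 instead of assigning to a field leaves the existing text of each row exactly as it was.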
I have a config file like this:
[whatever]
Do I need this? no!
[directive]
This lines I want
Very much text here
So interesting
[otherdirective]
I dont care about this one anymore
Now I want to match the lines in between [directive] and [otherdirective] without matching [directive] or [otherdirective].
Also if [otherdirective] is not found all lines till the end of file should be returned. The [...] might contain any number or letter.
Attempt
I tried this using sed like this:
sed -r '/\[directive\]/,/\[[[:alnum:]]+\]/!d'
The only problem with this attempt is that the first line is [directive] and the last line is [otherdirective].
I know how to pipe this again to truncate the first and last line but is there a sed solution to this?
You can use the range, as you were trying, and inside it use a negated empty regex, //!. An empty regular expression reuses the last regular expression matched, so it skips both edge lines:
sed -n '/\[directive\]/,/\[otherdirective\]/ { //! p }' infile
It yields:
This lines I want
Very much text here
So interesting
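The command above hardcodes [otherdirective] as the closing line. For the more general requirement in the question (stop at whatever section header comes next, or at end of file), an awk sketch along these lines may be easier to read. The function name section_body is hypothetical, and it assumes headers occupy a whole line and contain only ASCII letters and digits.

```shell
# section_body NAME FILE: print the lines that belong to section [NAME].
section_body() {
  awk -v name="$1" '
    # Any header line switches printing on or off and is never printed.
    /^\[[A-Za-z0-9]+\]$/ { inside = ($0 == "[" name "]"); next }
    inside
  ' "$2"
}

conf=$(mktemp)
printf '%s\n' \
  '[whatever]' 'Do I need this? no!' \
  '[directive]' 'This lines I want' 'Very much text here' 'So interesting' \
  '[otherdirective]' 'I dont care about this one anymore' > "$conf"

body=$(section_body directive "$conf")
tail_body=$(section_body otherdirective "$conf")   # last section runs to EOF
printf '%s\n' "$body"
rm -f "$conf"
```

Because the flag is only reset by another header, a section that is last in the file is printed through to the end.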
Here is a nice way with awk to get a section of data.
awk -v RS= '/\[directive\]/' file
[directive]
This lines I want
Very much text here
So interesting
Setting RS to the empty string (RS=) makes awk split the file into records at blank lines.
So when searching for [directive] it will print that whole record.
Normally a record is one line, but because RS (the record separator) is changed, it gives the block.
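This only helps if the sections are separated by blank lines, which is an assumption about the real file. Under that assumption, the header line can be dropped as well by splitting each record on newlines and printing from the second field on. A sketch with made-up sample data:

```shell
conf=$(mktemp)
printf '%s\n' \
  '[whatever]' 'Do I need this? no!' '' \
  '[directive]' 'This lines I want' 'Very much text here' 'So interesting' '' \
  '[otherdirective]' 'I dont care about this one anymore' > "$conf"

# RS= splits records at blank lines; -F with a newline makes every line
# of the record a field, so field 1 is the header and the rest is the body.
block=$(awk -v RS= -F'\n' '$1 == "[directive]" { for (i = 2; i <= NF; i++) print $i }' "$conf")
printf '%s\n' "$block"
rm -f "$conf"
```

Comparing the first field with == instead of a regex avoids having to escape the square brackets.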
Okay damn after more tries I found the solution or merely one solution:
sed -rn '/\[buildout\]/,/\[[[:alnum:]]+\]/{
/\[[[:alnum:]]+\]/d
p }'
Is this what you want?
\[directive\](.*?)\[
I am trying to filter out text between two patterns, I've seen a dozen examples but didn't manage to get exactly what I want:
Sample input:
START LEAVEMEBE text
data
START DELETEME text
data
more data
even more
START LEAVEMEBE text
data
more data
START DELETEME text
data
more
SOMETHING that doesn't start with START
# sometimes it starts with characters that needs to be escaped...
I want to stay with:
START LEAVEMEBE text
data
START LEAVEMEBE text
data
more data
SOMETHING that doesn't start with START
# sometimes it starts with characters that needs to be escaped...
I tried running sed with:
sed '/^START DELETEME/,/^[^ ]/d'
And got an inclusive removal. I tried adding "exclusions" (I'm not sure I really understand this syntax well):
sed '/^START DELETEME/,/^[^ ]/{/^[^ ]/!d}'
But my "START DELETEME" line is still there (yes, I can grep it out, but that's ugly :) ). Besides, it DOES remove the empty line in this sample as well, and I'd like empty lines to stay intact when they are my end pattern.
I am wondering if there is a way to do it with a single sed command.
I have an awk script that does this well:
BEGIN { flag = 0 }
{
    if ($0 ~ "^START DELETEME")
        flag = 1
    else if ($0 !~ "^ ")
        flag = 0
    if (flag != 1)
        print $0
}
But as you know "A is for awk which runs like a snail". It takes forever.
Thanks in advance.
Dave.
Using a loop in sed:
sed -n '/^START DELETEME/{:l n; /^[ ]/bl};p' input
GNU sed
sed '/LEAVEMEBE/,/DELETEME/!d;{/DELETEME/d}' file
I would stick with awk:
awk '
/LEAVE|SOMETHING/{flag=1}
/DELETE/{flag=0}
flag' file
But if you still prefer sed, here's another way:
sed -n '
/LEAVE/,/DELETE/{
/DELETE/b
p
}
' file
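For comparison, the asker's awk logic can be condensed to three pattern-action rules that also cope with two DELETEME blocks in a row and keep an empty line that ends a block. This is a sketch which assumes continuation lines start with a space, as the asker's own script does; the sample data is made up. If speed is the concern, running it under LC_ALL=C may help with some awk implementations, though that is worth measuring rather than assuming.

```shell
log=$(mktemp)
printf '%s\n' \
  'START LEAVEMEBE text' '  data' \
  'START DELETEME text' '  data' '  more data' \
  'START DELETEME again' '  even more' \
  'START LEAVEMEBE text' '  data' \
  'START DELETEME text' '  more' \
  '' \
  'SOMETHING else' > "$log"

kept=$(LC_ALL=C awk '
  /^START DELETEME/ { skip = 1; next }  # drop the header, start skipping
  !/^ /             { skip = 0 }        # any non-indented line ends the block
  !skip                                 # print everything not being skipped
' "$log")
printf '%s\n' "$kept"
rm -f "$log"
```

The order of the rules matters: the DELETEME rule must come first so that its header line is dropped before the second rule can reset the flag.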
I have a large log file from which I need to extract file names.
The file looks like this:
/path/to/loremIpsumDolor.sit /more/text/here/notAlways/theSame/here
/path/to/anotherFile.ext /more/text/here/differentText/here
.... about 10 million times
I need to extract the file names like this:
loremIpsumDolor.sit
anotherFile.ext
I figure my first strategy is to find/replace all /path/to/ with ''. But I'm stuck how to remove all characters after the space.
Can you help?
sed 's/ .*//' file
Nothing more is needed to remove everything after the space. The transformed output appears on standard output, of course.
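To get from there to the bare file name in the same pass, a second substitution can strip everything up to the last remaining slash. A sketch, assuming (as the sample suggests) that the path itself contains no spaces; the temporary file is a stand-in for the real log:

```shell
log=$(mktemp)
printf '%s\n' \
  '/path/to/loremIpsumDolor.sit /more/text/here/notAlways/theSame/here' \
  '/path/to/anotherFile.ext /more/text/here/differentText/here' > "$log"

# First drop everything from the first space on, then drop the leading
# directories (greedy .*/ removes up to the last slash that is left).
names=$(sed 's/ .*//; s|.*/||' "$log")
printf '%s\n' "$names"
rm -f "$log"
```

Using | as the delimiter in the second substitution avoids having to escape the slash.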
In theory, you could also use awk to grab the filename from each line as:
awk '{ print $1 }' input_file.log
That, of course, assumes that there are no spaces in any of the filenames. awk defaults to looking for whitespace as the field delimiters, so the above snippet would take the first "field" from your log file (your filename) for each line, and output it.
Pass it to cut:
cut '-d ' -f1 yourfile
A bash-only solution:
while read -r path otherstuff; do
    printf '%s\n' "${path##*/}"
done < filename