sed regular expressions address ranges - regex

I have a txt file that looks something like this
-----------------------------------
RUNNING PROCESSES
-----------------------------------
ftpd
kswapd
init
etc..
---------------------------------
HOSTNAME
--------------------------------
mypc.local.com
With sed I want to just get one section of this file. So just the RUNNING PROCESSES section, however I seem to be failing to get my regexp right to do so.
I got this far
sed -n '/^-./,/RUNNING PROCESSES/, /[[:space::]]/p' linux.txt | more
however it keeps complaining about
-e expression #1, char 26: unknown command: `,'
Can anybody help??

Did you mean:
sed -n '/RUNNING PROCESSES/,/HOSTNAME/p' linux.txt |
sed -e '/^[- ]/d' -e '/^$/d'
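For context on the error: a sed command accepts at most two addresses (addr1,addr2), so the third ,/.../ in the original attempt is parsed as a command, which is why sed reports an unknown command. The pipeline above also still prints the RUNNING PROCESSES and HOSTNAME heading lines. A minimal sketch of doing the whole extraction in a single sed call follows; the function name running_processes and the sample file are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print only the body of the RUNNING PROCESSES section.
running_processes() {
    # One range (two addresses), then drop headings, dashed rules and blank lines.
    sed -n '/RUNNING PROCESSES/,/HOSTNAME/{
        /RUNNING PROCESSES/d
        /HOSTNAME/d
        /^-/d
        /^$/d
        p
    }' "$1"
}

# Sample input modelled on the file in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
-----------------------------------
RUNNING PROCESSES
-----------------------------------
ftpd
kswapd
init
etc..
---------------------------------
HOSTNAME
--------------------------------
mypc.local.com
EOF

running_processes "$sample"
rm -f "$sample"
```

On the sample this should leave just the four process lines (ftpd, kswapd, init, etc..).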

I would probably prefer to use awk for that:
awk '/RUNNING PROCESSES/ {s=2}
/^---/ {s=s-1}
{if(s>0){print}}' linux.txt
That awk will give you:
RUNNING PROCESSES
-----------------------------------
ftpd
kswapd
init
etc..
You can then pipe that through sed '/^$/d' to filter out the blank lines.

Here is another variant of the accepted answer, without the extra call to a second sed process:
sed -n '/RUNNING PROCESSES/,/HOSTNAME/{s/RUNN.*\|HOSTNAME//;s/--*//;/^$/!p}' file

Sed : print all lines after match

I get my search results using sed:
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | cut -f 1 - | grep "pattern"
But it only shows the part that I cut. How can I print all lines after a match?
I'm using zcat so I cannot use awk.
Thanks.
Edit:
This is my log file:
[01/09/2015 00:00:47] INFO=54646486432154646 from=steve idfrom=55516654455457 to=jone idto=5552045646464 guid=100021623456461451463 num=6 text=hi my number is 0 811 22 1/12 status=new survstatus=new
My aim is to find all users that spam my site with their telephone numbers (using grep "pattern") then print all the lines to get all the information about each spam. The problem is there may be matches in INFO or id, so I use sed to get the text first.
Printing all lines after a match in sed:
$ sed -ne '/pattern/,$ p'
# alternatively, if you don't want to print the match:
$ sed -e '1,/pattern/ d'
Filtering lines when pattern matches between "text=" and "status=" can be done with a simple grep, no need for sed and cut:
$ grep 'text=.*pattern.* status='
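A small sketch of the two sed forms on made-up input, to show the difference between including and excluding the matching line. The sample lines and the literal MATCH are placeholders for the real data and pattern.

```shell
#!/bin/sh
# Hypothetical sample data; MATCH stands in for the real pattern.
lines=$(printf '%s\n' alpha beta MATCH gamma delta)

# From the first matching line through the end of input.
printf '%s\n' "$lines" | sed -n '/MATCH/,$p'

# Only what follows the first matching line.
printf '%s\n' "$lines" | sed '1,/MATCH/d'
```

The first command prints MATCH, gamma and delta; the second prints gamma and delta. One caveat: when the pattern is already on line 1, the range 1,/MATCH/ starts looking for its end on line 2, so more is deleted than intended. GNU sed accepts 0,/MATCH/ for that case.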
You can use awk
awk '/pattern/,EOF'
n.b. don't be fooled: EOF is just an uninitialized variable, and by default 0 (false). So that condition cannot be satisfied until the end of file.
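Since awk reads standard input, it can sit at the end of a zcat pipeline just like sed. As a sketch, here is the range form next to a flag-based form that some find easier to read; the variable name found and the sample lines are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; MATCH stands in for the real pattern.
lines=$(printf '%s\n' alpha beta MATCH gamma)

# Range form: the end condition is the unset (false) variable EOF,
# so the range stays open until the input runs out.
printf '%s\n' "$lines" | awk '/MATCH/,EOF'

# Flag form: set a flag on the first match, print every line while it is set.
printf '%s\n' "$lines" | awk '/MATCH/ {found = 1} found'
```

Both commands print MATCH and gamma.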
Perhaps this could be combined with all the previous answers using awk as well.
Maybe this is what you actually want? Find lines matching "pattern" and extract the field after text= up through just before status=?
zcat file* | sed -e '/pattern/s/.*text=\(.*\)status=[^/]*/\1/'
You are not revealing what pattern actually is -- if it's a variable, you cannot use single quotes around it.
Notice that \(.*\)status=[^/]* would match up through survstatus=new in your example. That is probably not what you want? There doesn't seem to be a status= followed by a slash anywhere -- you really should explain in more detail what you are actually trying to accomplish.
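The greedy match can be seen directly on the sample log line. The second command is only one possible fix, assuming the status= field is always preceded by a space as in the example.

```shell
#!/bin/sh
# Log line from the question, joined onto a single line.
line='[01/09/2015 00:00:47] INFO=54646486432154646 from=steve idfrom=55516654455457 to=jone idto=5552045646464 guid=100021623456461451463 num=6 text=hi my number is 0 811 22 1/12 status=new survstatus=new'

# Original expression: the greedy \(.*\) runs on to the last "status=".
printf '%s\n' "$line" | sed 's/.*text=\(.*\)status=[^/]*/\1/'

# Anchoring on " status=" (with the leading space) stops at the first field.
printf '%s\n' "$line" | sed 's/.*text=\(.*\) status=.*/\1/'
```

The first command prints "hi my number is 0 811 22 1/12 status=new surv", the second "hi my number is 0 811 22 1/12".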
Your question title says "all line after a match" so perhaps you want everything after text=? Then that's simply
sed 's/.*text=//'
i.e. replace up through text= with nothing, and keep the rest. (I trust you can figure out how to change the surrounding script into zcat file* | sed '/pattern/s/.*text=//' ... oops, maybe my trust failed.)
The seldom used branch command will do this for you. Until you match, use n for next then branch to beginning. After match, use n to skip the matching line, then a loop copying the remaining lines.
cat file | sed -n -e ':start; /pattern/b match;n; b start; :match n; :copy; p; n ; b copy'
Your pipeline is currently:
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | cut -f 1 - | grep "pattern"
Instead, change the last two segments of the pipeline to:
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | awk '$1 ~ "pattern" {print $0}'

How to cut a string from a string

My script gets this string for example:
/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
Let's say I don't know how long the string is before /importance.
I want a new variable that will keep only the /importance/lib1/lib2/lib3/file from the full string.
I tried to use sed 's/.*importance//' but it's giving me the path without the importance....
Here is the command in my code:
find <main_path> -name file | sed 's/.*importance//'
I am not familiar with regex, so I need your help please :)
Sorry, I got my question wrong:
I don't need the output /importance/lib1/lib2/lib3/file but /importance/lib1/lib2/lib3, with no /file in the output.
Can you help me?
I would use awk:
$ echo "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file" | awk -F"/importance/" '{print FS$2}'
/importance/lib1/lib2/lib3/file
Which is the same as:
$ awk -F"/importance/" '{print FS$2}' <<< "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
/importance/lib1/lib2/lib3/file
That is, we set the field separator to /importance/, so that the first field is what comes before it and the 2nd one is what comes after. To print /importance/ itself, we use FS!
All together, and to save it into a variable, use:
var=$(find <main_path> -name file | awk -F"/importance/" '{print FS$2}')
Update
I don't need the output /importance/lib1/lib2/lib3/file but
/importance/lib1/lib2/lib3 with no /file in the output.
Then you can use something like dirname to get the path without the name itself:
$ dirname $(awk -F"/importance/" '{print FS$2}' <<< "/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file")
/importance/lib1/lib2/lib3
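The awk and dirname steps can be wrapped together as a sketch like the one below. The function name importance_dir is made up for illustration, and the sketch assumes /importance/ occurs only once in the path (with more occurrences, $2 would stop at the next one).

```shell
#!/bin/sh
# Hypothetical helper: keep the directory part from /importance/ onwards.
importance_dir() {
    # Split on "/importance/", print the separator plus what follows it,
    # then let dirname drop the final path component.
    dirname "$(printf '%s\n' "$1" | awk -F'/importance/' '{print FS $2}')"
}

importance_dir '/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file'
# prints /importance/lib1/lib2/lib3
```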
Instead of replacing everything up to importance with nothing, replace it with /importance:
~$ echo $var
/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
~$ sed 's:.*importance:/importance:' <<< $var
/importance/lib1/lib2/lib3/file
As noted by @lurker, if importance can also appear inside other directory names, add the surrounding slashes to be safe:
~$ sed 's:.*/importance/:/importance/:' <<< "/dir1/dirimportance/importancedir/..../importance/lib1/lib2/lib3/file"
/importance/lib1/lib2/lib3/file
With GNU sed:
echo '/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file' | sed -E 's#.*(/importance.*)#\1#'
Output:
/importance/lib1/lib2/lib3/file
pure bash
kent$ a="/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
kent$ echo ${a/*\/importance/\/importance}
/importance/lib1/lib2/lib3/file
external tool: grep
kent$ grep -o '/importance/.*' <<<$a
/importance/lib1/lib2/lib3/file
I tried to use sed 's/.*importance//' but it's giving me the path without the importance....
You were very close. All you had to do was substitute /importance back in:
sed 's/.*\/importance/\/importance/'
However, I would use Bash's built-in parameter expansion, which avoids starting an external process.
The expansion ${foo##pattern} takes the shell variable foo and removes the longest match of the glob pattern from its left side. Because that also removes importance itself, put it back in front:
file_name="/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file"
file_name="/importance${file_name##*/importance}"
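A minimal sketch of the parameter-expansion route in plain POSIX shell, also covering the follow-up request to drop the trailing /file. The variable names rest and dir are chosen for illustration only.

```shell
#!/bin/sh
path='/dir1/dir2/dir3.../importance/lib1/lib2/lib3/file'

# "##*/importance" removes the longest prefix ending in "/importance";
# the marker is then put back in front of what remains.
rest="/importance${path##*/importance}"
echo "$rest"    # /importance/lib1/lib2/lib3/file

# "%/*" removes the shortest suffix starting with "/", i.e. the file name.
dir=${rest%/*}
echo "$dir"     # /importance/lib1/lib2/lib3
```

Both expansions are handled by the shell itself, so no sed, awk or dirname process is started.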
Removing the /file at the end, as you ask:
echo '<path>' | sed -r 's#.*(/importance.*)/[^/]*#\1#'
Input /dir1/dir2/dir3.../importance/lib1/lib2/lib3/file
Returns: /importance/lib1/lib2/lib3

SED: Number of returned lines

Given a file jungle.txt with the following text ...
A lion sleeps in the jungle
A lion sleeps tonight
A tiger awakens in the swamp
The parrot observes
Wimoweh, wimoweh, wimoweh, wimoweh
... one could perform a grep search ...
$ grep lion jungle.txt
... or a sed search ...
$ sed -n "/lion/p" jungle.txt
... to find occurrences of a pattern ("lion" in this case).
Is there some easy way to get a number of returned lines? Or at least to know that there was more than 1 found? As always, I've googled a lot first, but surprisingly found no answer.
Thanks!
grep can count matching lines:
grep -c 'lion' file
Output:
2
Syntax:
-c: Suppress normal output; instead print a count of matching lines for each input file. With the -v, --invert-match option (see below), count non-matching lines. (-c is specified by POSIX.)
This might work for you (GNU sed):
sed '/lion/!d' file | sed '$=;d'
or if you prefer:
sed -n '/lion/p' file | sed -n '$='
N.B. if the file is empty or the first sed command finds nothing the result of the second sed command is blank.
You can use awk
awk '/lion/ {a++} END {print a+0}' file
2
But I would say that the best solution is the one posted by Cyros using grep -c 'lion' file
Just pass the grep command output to the wc -l command to count the number of returned lines:
$ grep 'lion' file | wc -l
2
From wc --help
-l, --lines print the newline counts
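To tie the answers together, here is a sketch that counts matches and answers the "more than one?" part of the question. The helper name count_matches is an assumption for illustration; the file content is the sample from the question.

```shell
#!/bin/sh
# Sample file from the question.
jungle=$(mktemp)
trap 'rm -f "$jungle"' EXIT
cat > "$jungle" <<'EOF'
A lion sleeps in the jungle
A lion sleeps tonight
A tiger awakens in the swamp
The parrot observes
Wimoweh, wimoweh, wimoweh, wimoweh
EOF

# Hypothetical helper: number of lines in file $2 matching pattern $1.
count_matches() {
    # grep -c exits with status 1 when the count is 0, so ignore the status.
    grep -c -- "$1" "$2" || true
}

count_matches lion "$jungle"     # 2
count_matches zebra "$jungle"    # 0

# "More than one match?" as a shell test.
if [ "$(count_matches lion "$jungle")" -gt 1 ]; then
    echo "more than one match"
fi
```

Unlike the sed '$=' pipeline, which prints nothing when there is no match, grep -c always prints a number, which makes it easier to use in numeric comparisons.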

Regex with sed to parse archive name

I'd like to parse different kinds of Java archive with the sed command line tool.
Archives can have the following extensions:
.jar, .war, .ear, .esb
What I'd like to get is the name without the extension, e.g. for Foobar.jar I'd like to get Foobar.
This seems fairly simple, but I cannot come up with a solution that works and is also robust.
I tried something along the lines of sed s/\.+(jar|war|ear|esb)$//, but could not make it work.
You were nearly there:
sed -E 's/\.+(jar|war|ear|esb)$//' file
You just needed to add the -E flag so that sed interprets the expression as an extended regular expression and, of course, to quote the script and respect the sed 's/something/new/' syntax.
Test
$ cat a
aaa.jar
bb.war
hello.ear
buuu.esb
hello.txt
$ sed -E 's/\.+(jar|war|ear|esb)$//' a
aaa
bb
hello
buuu
hello.txt
Using sed:
s='Foobar.jar'
sed -r 's/\.(jar|war|ear|esb)$//' <<< "$s"
Foobar
Or, better, do it in Bash itself (this pattern covers .jar, .war and .ear, but not .esb):
echo "${s%.[jwe]ar}"
Foobar
If you do not use an option like -r or -E, you need to escape the | and the ( ), and you should also quote the script with ':
echo "test.jar" | sed 's/\.\(jar\|war\|ear\|esb\)$//'
test
The + is also not needed, since you normally have only one .
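As a sketch, the extended-regex form can be wrapped in a small function and tried against names that should and should not change. The function name strip_archive_ext and the test names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: strip a trailing .jar/.war/.ear/.esb from each name.
strip_archive_ext() {
    # -E selects extended regular expressions, so ( | ) need no backslashes.
    # The $ anchor keeps names like my.jar.bak untouched.
    printf '%s\n' "$@" | sed -E 's/\.(jar|war|ear|esb)$//'
}

strip_archive_ext Foobar.jar app.war lib.esb notes.txt my.jar.bak
```

This prints Foobar, app, lib, notes.txt and my.jar.bak, one per line.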
On traditional UNIX (tested with AIX/KSH):
File='Foobar.jar'
echo ${File%.*}
From a list containing only these kinds of files:
YourList | sed 's/\....$//'
From a list containing all kinds of files:
YourList | sed -n 's/\.[jew]ar$//p
t
s/\.esb$//p'

Filter apache log file using regular expression

I have a big apache log file and I need to filter that and leave only (in a new file) the log from a certain IP: 192.168.1.102
I try using this command:
sed -e "/^192.168.1.102/d" < input.txt > output.txt
But "/d" removes those entries, and I need to keep them.
Thanks.
What about using grep?
cat input.txt | grep -e "^192.168.1.102" > output.txt
EDIT: As noted in the comments below, escaping the dots in the regex is necessary to make it correct. Escaping in the regex is done with backslashes:
cat input.txt | grep -e "^192\.168\.1\.102" > output.txt
sed -n 's/^192\.168\.1\.102/&/p'
sed is faster than grep on my machines
I think using grep is the best solution but if you want to use sed you can do it like this:
sed -e '/^192\.168\.1\.102/b' -e 'd'
The b command will skip all following commands if the regex matches and the d command will thus delete the lines for which the regex did not match.
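For comparison, a sketch of three equivalent ways to keep only the matching lines, run on a made-up three-line log. The log lines are hypothetical, and the trailing space in the pattern is an added assumption (that the address is followed by a space, as in the usual Apache log layout) so that only the complete address matches.

```shell
#!/bin/sh
# Hypothetical three-line access log.
log=$(printf '%s\n' \
    '192.168.1.102 - - "GET /index.html"' \
    '192.168.1.7 - - "GET /about.html"' \
    '192.168.1.102 - - "GET /logo.png"')

# Print only matching lines.
printf '%s\n' "$log" | sed -n '/^192\.168\.1\.102 /p'

# Delete every line that does not match.
printf '%s\n' "$log" | sed '/^192\.168\.1\.102 /!d'

# grep equivalent.
printf '%s\n' "$log" | grep '^192\.168\.1\.102 '
```

Each command prints the first and third log lines.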