How to find angle bracket literals with grep


In the following example, the second match "<_dl_start_user>" was unexpected:
$ objdump -D /lib64/ld-linux-x86-64.so.2|grep -A5 '<_start>:'
0000003ba0400b30 <_start>:
3ba0400b30: 48 89 e7 mov %rsp,%rdi
3ba0400b33: e8 28 06 00 00 callq 3ba0401160 <_dl_start>
0000003ba0400b38 <_dl_start_user>:
3ba0400b38: 49 89 c4 mov %rax,%r12
How can I match exactly '<_start>:'?

You are matching <_start>: exactly. You're also seeing 5 lines of trailing context after the match because you specified -A5.
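To illustrate the point, here is a minimal sketch, assuming a shortened copy of the listing from the question stored in a variable named sample (a stand-in for real objdump output). Unescaped < and > are literal characters in grep patterns; it is \< and \> that act as word-boundary anchors in GNU grep. The helper print_function is hypothetical: it prints from the label line up to, but not including, the next label line, which may be closer to what was intended than a fixed -A5.

```shell
# Shortened stand-in for the objdump output in the question.
sample='0000003ba0400b30 <_start>:
3ba0400b30: 48 89 e7 mov %rsp,%rdi
3ba0400b33: e8 28 06 00 00 callq 3ba0401160 <_dl_start>
0000003ba0400b38 <_dl_start_user>:
3ba0400b38: 49 89 c4 mov %rax,%r12'

# Without -A5 only the matching line is printed; -F treats the pattern as a fixed string.
printf '%s\n' "$sample" | grep -F '<_start>:'

# Hypothetical helper: print one function, stopping at the next "<name>:" label line.
print_function() {
    awk -v label="$1" '
        index($0, label)                { printing = 1; print; next }  # the requested label
        printing && /^[0-9a-f]+ <.*>:$/ { exit }                       # next label: stop
        printing                        { print }                      # body of the function
    '
}

printf '%s\n' "$sample" | print_function '<_start>:'
```

With real data the input would come from objdump -D instead of the sample variable.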

Related

grep -v pattern and also remove 1 line before and 4 lines after [duplicate]

This question already has answers here:
How to exclude several lines around match with grep or similar tool?
(2 answers)
Closed 8 years ago.
I would like to grep a pattern and remove the line of the matching pattern and also 1 line before and 4 lines after the context. I tried:
grep -v -A 4 -B 1
Thanks in advance!
Example:
Rule: r1
Owner: Process explorer.exe Pid 1544
0x01ec350f 8b 45 a8 0f b6 00 8d 4d a8 ff 14 85 c8 7f ed 01 .E.....M........
0x01ec351f 84 c0 75 ec 8b 4d fc e8 ba f5 fe ff f7 85 b0 fd ..u..M..........
0x01ec352f ff ff 00 00 01 00 75 13 33 c0 50 50 50 68 48 28 ......u.3.PPPhH(
0x01ec353f eb 01 33 d2 8b cb e8 b0 57 ff ff f7 05 8c 9b ed ..3.....W.......
I would like to grep "explorer.exe" and remove the line and also 1 line before and 4 lines after.

awk
This awk one-liner should help:
awk 'NR==FNR{if(/explorer[.]exe/)d[++i]=NR;next}
{for(x=1;x<=i;x++)if(FNR>=d[x]-1&&FNR<=d[x]+4)next}7' file file
See this example:
kent$ cat f
foo
foo2
Rule: r1
Owner: Process explorer.exe Pid 1544
remove1
remove2
remove3
remove4
bar
bar2
kent$ awk 'NR==FNR{if(/explorer[.]exe/)d[++i]=NR;next}{for(x=1;x<=i;x++)if(FNR>=d[x]-1&&FNR<=d[x]+4)next}7' f f
foo
foo2
bar
bar2
vim
If vim is also an option for you, it could be a lot easier:
:g/Pattern/norm! k6dd
Note that the vim solution fails for the first match if the pattern is on the first line of the file, because there is no line above it to move to.
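As an alternative to reading the file twice, here is a single-pass sketch of the same idea. It is not the answer's method: the function name drop_context is hypothetical, and the window (1 line before, 4 lines after) is hard-coded. The script holds each line back by one so that it can still be discarded if the next line matches.

```shell
# Hypothetical helper: delete each matching line plus 1 line before and 4 lines after.
drop_context() {
    awk -v pat="$1" '
        $0 ~ pat { held = 0; skip = 4; next }  # forget the held line, drop the match, arm the skip
        skip > 0 { skip--; next }              # one of the 4 lines after a match
        held     { print prev }                # previous line is safe to print now
                 { prev = $0; held = 1 }       # hold the current line back by one
        END      { if (held) print prev }      # flush the last held line
    '
}

printf '%s\n' foo foo2 'Rule: r1' 'Owner: Process explorer.exe Pid 1544' \
    remove1 remove2 remove3 remove4 bar bar2 | drop_context 'explorer[.]exe'
```

Because the match rule runs before the skip rule, a second match inside the skipped window restarts the count of 4 lines.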

Regular expression for extracting a series of hex numbers from a file

I am examining an object dump of a file and I want to figure out all the possible addresses.
The approach I am using involves Perl and a regex to extract all words.
The format of the object file is like this:
00000000000044444 <function>
44448: 48 ca add ....
4444c: 48 ca 55 call ....
44450: 48 ca 8d 55 jmp..
I am trying to extract 48 ca 48 ca 55 48 ca 8d 55
Currently, I thought that the regex /(\s[0-9a-f][0-9a-f]\s)/g would help. However, it only extracts every other pair, i.e. 48, 8d, 55: it matches 48 and then can't match ca, because the preceding space character has already been consumed (at least that is my understanding).
I also tried /(\s[0-9a-f][0-9a-f]\s)|([0-9a-f][0-9a-f]\s)/g, but that matches things it shouldn't, like the dd in an add instruction.
Any help as to how I can extract only these pairs of hex digits delimited by a space?
Edit: I updated the question with a more realistic format of the file.
Thank you

Instead of \s, you just want the word boundary \b.
while (<DATA>) {
my @nums = m/\b([[:xdigit:]]{2})\b/g;
print "@nums\n";
}
__DATA__
00000000000044444 <function>
44448: 48 ca 8d 55
4444c: 48 ca 8d 55
44450: 48 ca 8d 55
Update
Given that you made your data more complicated, with instructions after the hex codes, I'd lean toward making your regex more restrictive, like so:
while (<DATA>) {
if (/^\s+\w+:((?:\s[[:xdigit:]]{2})+)\b/) {
my @nums = split ' ', $1;
print "@nums\n";
}
}
__DATA__
00000000000044444 <function>
44448: 48 ca add ....
4444c: 48 ca 55 call ....
44450: 48 ca 8d 55 jmp..
Outputs:
48 ca
48 ca 55
48 ca 8d 55
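The same restriction (only accept byte pairs that directly follow the address and colon) can also be sketched without Perl. This is a rough awk variant under the assumption that the input is whitespace-separated as in the sample; the function name extract_bytes is hypothetical. It walks the fields after the address and stops at the first one that is not exactly two hex digits, so the dd inside add is never considered.

```shell
# Hypothetical helper: print the leading run of two-digit hex fields on each "addr:" line.
extract_bytes() {
    awk '
        $1 ~ /^[0-9a-f]+:$/ {                  # only lines that start with "address:"
            out = ""
            for (i = 2; i <= NF && $i ~ /^[0-9a-f][0-9a-f]$/; i++)
                out = out (out == "" ? "" : " ") $i
            if (out != "") print out
        }
    '
}

printf '%s\n' '00000000000044444 <function>' \
    '44448: 48 ca add ....' \
    '4444c: 48 ca 55 call ....' \
    '44450: 48 ca 8d 55 jmp..' | extract_bytes
```

A mnemonic that itself consists of exactly two hex digits would be mistaken for a byte, so this is only as reliable as the input format.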

Try this example that uses your regex in positive lookahead to perform overlapping matching:
$\ = $/;
while(<DATA>){
print for m/(?=\s([0-9a-f][0-9a-f])\s)/g;
}
__DATA__
00000000000044444 <function>
44448: 48 ca 8d 55
4444c: 48 ca 8d 55
44450: 48 ca 8d 55

Try this:
(([0-9a-f]{2}\s){3}[0-9a-f]{2})$
[0-9a-f]{2} is a pair of hex digits.
Group those with a space three times, and then look for another pair of hex digits after that.
The $ anchors it to the end of the line.

You can use:
while(<DATA>) {
print m/(?:(?<=:)|\G)( [a-f0-9]{2})(?=\s)/g;
}
__DATA__
00000000000044444 <function>
44448: 48 ca add ....
4444c: 48 ca 55 call ....
44450: 48 ca 8d 55 jmp..
The pattern is built to force each byte either to directly follow the previous match (via \G) or to be preceded by a colon. (If that doesn't suffice, you can add [0-9a-f]{5} before the :.)

Finding an integer number after a leading t=

I have a string like this:
33 00 4b 46 ff ff 03 10 30 t=25562
I am only interested in the five digits at the very end after the t=
How can I extract these digits with a regular expression?
I tried grep t=....., but the output also includes everything up to and including the t= at the beginning, which I would like to drop.
After finding that five-digit number, I would like to divide it by 1000, so in the case above the result would be 25.562. Is this possible with grep and regular expressions?
Thanks for your help.

Using awk
echo '33 00 4b 46 ff ff 03 10 30 t=25562' | awk -F= '{print $2/1000}'
Output:
25.562
EDIT
As pointed out by @anubhava in a comment, the above assumes that = does not appear anywhere before t=. If that's not the case, use:
echo '33 00 4b 46 ff ff 03 10 30 t=25562' | awk -F' t=' '{print $2/1000}'

This should be OK. It only takes the value from the t= at the end of the line. The OP also mentioned that t= could appear as extra text in the middle of the line, but only the one at the end is wanted.
echo '33 00 4b 46 ff ff t=22 03 10 30 t=25562' | awk '{split($NF,a,"=");print a[2]/1000}'
25.562
Another variation
echo '33 00 4b 46 ff ff t=22 03 10 30 t=25562' | awk 'END {print $0/1000}' RS==
25.562
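One more variation, as a sketch rather than a recommendation: anchoring the regex to the end of the line makes the "last t=" requirement explicit. The function name last_t_div_1000 is hypothetical; match, RSTART and substr are standard awk.

```shell
# Hypothetical helper: take the digits of a trailing "t=NNN" and divide them by 1000.
last_t_div_1000() {
    awk 'match($0, /t=[0-9]+$/) { print substr($0, RSTART + 2) / 1000 }'
}

echo '33 00 4b 46 ff ff t=22 03 10 30 t=25562' | last_t_div_1000
```

Lines that do not end in t= followed by digits produce no output at all, which may or may not be what is wanted.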

How to remove non-ascii chars using sed

I want to remove non-ASCII characters from a file. I have already tried these regexes:
sed -e 's/[\d00-\d128]//g' # not working
cat /bin/mkdir | sed -e 's/[\x00-\x7F]//g' >/tmp/aa
but the output file still contains some non-ASCII characters.
[root@asssdsada ~]$ hexdump /tmp/aa |more
00 01 02 03 04 05 06 07 - 08 09 0A 0B 0C 0D 0E 0F 0123456789ABCDEF
00000000 45 4C 46 B0 F0 73 38 C0 - C0 BC BC FF FF 61 61 61 ELF..s8......aaa
00000010 A0 A0 50 E5 74 64 50 57 - 50 57 50 57 D4 D4 51 E5 ..P.tdPWPWPW..Q.
00000020 74 64 6C 69 62 36 34 6C - 64 6C 69 6E 75 78 78 38 tdlib64ldlinuxx8
00000030 36 36 34 73 6F 32 47 4E - 55 42 C8 C0 80 70 69 42 664so2GNUB...piB
00000040 44 47 BA E3 92 43 45 D5 - EC 46 E4 DE D8 71 58 B9 DG...CE..F...qX.
00000050 8D F1 EA D3 EF 4B 86 FC - A9 DA 79 ED 63 B5 51 92 .....K....y.c.Q.
00000060 BA 6C FC D1 69 78 30 ED - 74 F1 73 95 CC 85 D2 46 .l..ix0.t.s....F
00000070 A5 B4 6C 67 DA 4A E9 9A - 4B 58 77 A4 37 80 C0 4F ..lg.J..KXw.7..O
00000080 F3 E9 B2 77 65 97 74 F9 - A2 C0 F2 CC 4A 9C 58 A1 ...we.t.....J.X.

This doesn't seem to work with sed. Perhaps tr will do?
tr -d '\200-\377'
Or with the complement:
tr -cd '\000-\177'
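A small check of the complement form, as a sketch: the input string is made up, the function name strip_non_ascii is hypothetical, and LC_ALL=C is added here on the assumption that tr should treat the input as single bytes regardless of the current locale.

```shell
# Hypothetical helper: delete every byte outside the 7-bit ASCII range (octal 000-177).
strip_non_ascii() {
    LC_ALL=C tr -cd '\000-\177'
}

# \303\251 and \303\257 are the UTF-8 encodings of two accented letters.
printf 'caf\303\251 na\303\257ve\n' | strip_non_ascii
```

Note that this keeps ASCII control characters (including newline and ESC); it only removes bytes with the high bit set.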

Did you try this?
cat /bin/mkdir | tr -cd "[:print:]"
I think it solves the problem.
If only the text content interests you, you can also use
cat /bin/mkdir | strings

Do you know what encoding the file is currently using? If so, you can use iconv to convert it. It's a utility to convert from one character encoding to another. So if the original file is in UTF-8 and you want to convert to ASCII you can use the following:
iconv -f utf8 -t ascii <inputfile>
The file command on the input file might tell you the current encoding.
Interestingly, there's a command called enca which will do its best to determine the character encoding being used if you know the language of the contents of the file.
This other question might be the answer.

The solutions offered here did not work for me. Maybe my problem was different, but I needed to strip ANSI color codes and other escape sequences from otherwise pure ASCII text.
The following worked for me, however:
Stripping Escape Codes from ASCII Text
sed -E 's/\x1b\[[0-9]*;?[0-9]+m//g'
In context (BASH):
$ printf "\e[32;1mhello\e[0m\n"
hello
$ printf "\e[32;1mhello\e[0m\n" | cat -vet
^[[32;1mhello^[[0m$
$ printf "\e[32;1mhello\e[0m\n" | sed -E 's/\x1b\[[0-9]*;?[0-9]+m//g' | cat -vet
hello$

Try sed with the -i option, e.g.:
sed -i 's/[\d128-\d255]//g' MYFILE.txt
It should remove all non-ASCII characters from the file, editing it in place.

Convert hex stream to GIF

How do I create a .gif file from the following HEX stream:
0d 0a 0d 0a 47 49 46 38 39 61 01 00 01 00 80 ff 00 ff ff ff 00 00 00 2c 00 00 00 00 01 00 01 00 00 02 02 44 01 00 3b
3b is the GIF file terminator
I'm trying to do it following the guide at http://en.wikipedia.org/wiki/Graphics_Interchange_Format#Example_GIF_file
I'd like to implement this in either Perl or C/C++. Any language will do though.
Many thanks in advance,
Thanks guys for all the replies. I removed the leading '0d 0a 0d 0a'...
Here's what I have so far:
#!/usr/bin/perl
use strict;
use warnings;
open(IN,"<test.txt");
open(OUT,">test.gif");
my @lines=<IN>;
foreach my $line (@lines){
$line=~s/\n//g;
my @bytes=split(/ /,$line);
foreach my $byte (@bytes){
print OUT $byte;
}
}
close(OUT);
close(IN);

You can do it in the shell, using GNU echo:
$ /bin/echo -n -e "\x47\x49\x46\x38\x39\x61\x01\x00\x01\x00\x80\xff\x00\xff\xff\xff\x00\x00\x00\x2c\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02\x44\x01\x00\x3b" > foo.gif
$ identify foo.gif
foo.gif GIF 1x1 1x1+0+0 8-bit PseudoClass 2c 36B 0.000u 0:00.000
You can also use the xxd command which will "make a hexdump or do the reverse". Annoyingly, however, it seems very picky about the input format. Here's an example using xxd:
$ cat > mygif.hex <<END
0000000: 4749 4638 3961 0100 0100 80ff 00ff ffff
0000010: 0000 002c 0000 0000 0100 0100 0002 0244
0000020: 0100 3b0a
END
$ xxd -r < mygif.hex > mygif.gif
gvim has an interface to xxd. Use the "Tools → Convert To Hex" menu option (keyboard: :%!xxd) and then "Tools → Convert Back" (:%!xxd -r).
Emacs also has a built-in hex editor, which is accessed by M-x hexl-mode (see Editing Binary Files in the manual). It's also a little bit annoying, because you have to type C-M-x (i.e. Ctrl-Meta-X) before entering a character by its hex code.
Of course, it is very easy to write a simple C program to do the conversion:
#include <stdio.h>
int main(int argc, char **argv) {
unsigned int c;
while (1 == scanf("%x", &c))
putchar(c);
return 0;
}
usage:
$ gcc -Wall unhexify.c -o unhexify
$ echo "47 49 46 38 39 61 01 00 01 00 80 ff
00 ff ff ff 00 00 00 2c 00 00 00 00
01 00 01 00 00 02 02 44 01 00 3b" | ./unhexify > mygif.gif
Also: many answers here in this code golf question.

Open a new file for writing (in binary mode) with the .gif extension.
Read each pair of hex characters.
Convert the hex to a byte (char) value.
Write the byte to the opened file.
When finished, close the file.
If the hex data represents a GIF image, the file should contain it.
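The steps above can be sketched in plain shell. This is a minimal, slow (one printf per byte) illustration, not a definitive implementation: the function name unhexify is borrowed from the C answer, and the input is assumed to be whitespace-separated two-digit hex pairs. Each pair is converted to a three-digit octal number, which the outer printf then emits as a raw byte.

```shell
# Read whitespace-separated hex pairs from stdin and write the raw bytes to stdout.
unhexify() {
    for byte in $(cat); do
        # Inner printf: hex pair -> 3-digit octal; outer printf: octal escape -> raw byte.
        printf "\\$(printf '%03o' "0x$byte")"
    done
}

# The first six bytes of the stream from the question spell the GIF signature.
echo '47 49 46 38 39 61' | unhexify
```

With the file names from the question, usage would be: unhexify < test.txt > test.gif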

perl -ne'
BEGIN { binmode STDOUT }
s/\s//g;
print pack "H*", $_;
' file.hex > file.gif
Perl 5.14 or later:
perl -ne'
BEGIN { binmode STDOUT }
print pack "H*", s/\s//gr;
' file.hex > file.gif
(-n and print can be replaced with -p and $_ = if you want to golf, i.e. shorten the program.)

You may want to read the documentation for unpack (or hex) and pack. You may also find the perlio documentation useful for creating a raw file handle (so perl doesn't try to help you with things like encodings or line endings).
