Why did the string in allclear change after sprintf( command, "rm %s", newfile )? I think "command" has no relationship with "allclear".
(gdb) p allclear
$18 = "/home/river/Desktop/stage2/bin/config/02_allclear_12HD", '\000' <repeats 45 times>
(gdb) p &allclear
$19 = (char (*)[100]) 0xbfffea0c
(gdb) p &command
$20 = (char (*)[50]) 0xbfffe9da
(gdb) n
65 sprintf( command, "rm %s", newfile );
(gdb) p allclear
$21 = "/home/river/Desktop/stage2/bin/config/02_allclear_12HD", '\000' <repeats 45 times>
(gdb) n
66 if( argc < 1) return 1;
(gdb) p allclear
$22 = "001005/controlpage\000/stage2/bin/config/02_allclear_12HD", '\000' <repeats 45 times>
(gdb) p $allclear
$23 = void
(gdb) p &allclear
$24 = (char (*)[100]) 0xbfffea0c
(gdb) p newfile
$25 = "/home/river/Desktop/stage2/test_case/01_SES/SES001005/controlpage", '\000' <repeats 34 times>
(gdb) p &command
$26 = (char (*)[50]) 0xbfffe9da
Part of my code is:
char allclear[MAXPATHSIZE];
memset( allclear, 0, MAXPATHSIZE);
sprintf( allclear, "%s/config/02_allclear_12HD", curfilepathdir);
char command[MAXCOMMAMDSIZE];
memset( command, 0, MAXCOMMAMDSIZE);
sprintf( command, "rm %s", newfile );
From the GDB output, it's pretty clear that MAXCOMMANDSIZE is 50.
How long is "rm /home/river/Desktop/stage2/test_case/01_SES/SES001005/controlpage"?
You might want to read up on buffer overflows, and start using a safer variant of sprintf, namely snprintf.
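To make the arithmetic concrete, here is a small Python sketch that models the two buffers as one flat byte array, using the addresses and strings from the GDB session above. The layout (command immediately followed by allclear) is inferred from the printed addresses, and the variable names are only for illustration. It is a simulation of what an unbounded write does, not C code.

```python
# Addresses and sizes taken from the GDB output in the question.
COMMAND_ADDR = 0xBFFFE9DA
ALLCLEAR_ADDR = 0xBFFFEA0C
COMMAND_SIZE = 50
ALLCLEAR_SIZE = 100

newfile = "/home/river/Desktop/stage2/test_case/01_SES/SES001005/controlpage"
allclear = "/home/river/Desktop/stage2/bin/config/02_allclear_12HD"

# allclear starts exactly 50 bytes after command, i.e. right behind it.
gap = ALLCLEAR_ADDR - COMMAND_ADDR

# Model the stack region as zero-filled bytes: command first, then allclear.
memory = bytearray(gap + ALLCLEAR_SIZE)
memory[gap:gap + len(allclear)] = allclear.encode()

# What sprintf(command, "rm %s", newfile) writes: the text plus a NUL byte,
# with no regard for the size of the destination.
written = ("rm " + newfile).encode() + b"\0"
memory[0:len(written)] = written

overflow = len(written) - COMMAND_SIZE
corrupted = bytes(memory[gap:gap + ALLCLEAR_SIZE])

# What a bounded write such as snprintf(command, 50, ...) would store instead:
# at most 49 characters followed by the terminating NUL.
bounded = written[:COMMAND_SIZE - 1] + b"\0"

print("bytes needed:", len(written), "buffer size:", COMMAND_SIZE)
print("bytes spilled into allclear:", overflow)
print("allclear now starts with:", corrupted[:overflow])
```

The spilled bytes are the tail of the command string plus its terminator, which is why allclear begins with "001005/controlpage\000" in the second GDB print.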
I'm writing a function in x86-64 to convert a 1-byte value into a hexadecimal string representing the ASCII code for that byte. At the start of my function, I try to use
movb %dil, %r11b
to store the 1-byte value in the lowest byte of register %r11. However, when I examine this in gdb, %r11b is never set. Instead, the higher bytes of %r11 are getting set. This is what I get when using gdb:
Breakpoint 1, 0x00000000004011f0 in byte_as_hex ()
(gdb) print /x $r11b
$1 = 0x0
(gdb) print /x $r11
$2 = 0x246
(gdb) print /x $rdi
$3 = 0x48
(gdb) print /x $dil
$4 = 0x48
(gdb) stepi /* subq $8, %rsp */
0x00000000004011f4 in byte_as_hex ()
(gdb) print /x $r11b
$5 = 0x0
(gdb) print /x $r11
$6 = 0x246
(gdb) print /x $rdi
$7 = 0x48
(gdb) print /x $dil
$8 = 0x48
(gdb) stepi /* movb %dil, %r11b */
0x00000000004011f7 in byte_as_hex ()
(gdb) print /x $r11b
$9 = 0x0
(gdb) print /x $r11
$10 = 0x248
(gdb) print /x $rdi
$11 = 0x48
(gdb) print /x $dil
$12 = 0x48
(gdb) print /x $r11d
$13 = 0x248
(gdb) print /x $r11w
$14 = 0x248
(gdb) print /x $r11b
$15 = 0x0
I'm very confused because I specifically tried to movb from %dil into %r11b, but I still can't set the byte. Could anyone explain to me why this is happening? Thanks!
Multiple problems are at play here:
(Reported as GDB bug.) An undefined convenience variable (a GDB-local variable that starts with $), when printed with an explicit format specifier, is shown as 0 instead of the default void, which is displayed when format is not specified:
$ gdb /bin/true
Reading symbols from /bin/true...
(gdb) p $asdf
$1 = void <------ undefined, OK
(gdb) p/x $asdf
$2 = 0x0 <------ the problem
(gdb) set $asdf=4345
(gdb) p $asdf
$3 = 4345
(gdb) p/x $asdf
$4 = 0x10f9
(gdb)
Register values use the same syntax as convenience variables. So when you get a register name wrong, e.g. use r11b instead of GDB's r11l, you are actually referring to an undefined convenience variable. The same happens if you use the correct name in the wrong case, such as R11L.
GDB uses its own set of names for x86(_64) registers. Sometimes they differ from the names given e.g. in Intel manuals (e.g. ftag instead of Intel's FTW). In any case, the lowest bytes of the general purpose registers have the following names in GDB:
al
cl
dl
bl
spl
bpl
sil
dil
r8l
...
r15l
There are no aliases for them like e.g. r11b for r11l, so one must use the correct names.
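As a sanity check on the trace in the question, the change of $r11 from 0x246 to 0x248 is exactly what a one-byte move of 0x48 into the low byte would produce, so the movb itself appears to have worked. A minimal Python sketch of that arithmetic follows. The function names are hypothetical helpers, and the name list simply reproduces the pattern given above.

```python
def write_low_byte(reg64, byte):
    """Model a byte-sized move: replace bits 0-7 and keep the upper 56 bits."""
    return (reg64 & ~0xFF) | (byte & 0xFF)


def gdb_low_byte_names():
    """Low-byte register names as GDB spells them, per the list above."""
    legacy = ["al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil"]
    return legacy + [f"r{n}l" for n in range(8, 16)]


before = 0x246          # $r11 before the movb, from the question
dil = 0x48              # $dil, from the question
after = write_low_byte(before, dil)

print(hex(after))                       # value of $r11 after the move
print(hex(after & 0xFF))                # what "print /x $r11l" should show
print("r11l" in gdb_low_byte_names())   # the name GDB understands
print("r11b" in gdb_low_byte_names())   # the name used in the question
```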
I am having difficulty with a set of regexes meant to calculate the frequency of positive, negative, and zero integers in the data set in the sample code below. The negative integers are counted correctly, but I have had no such luck with the positive and zero counts.
#!/usr/bin/perl
use strict;
use warnings;
my ( $ctrP, $ctrN, $ctrZ ) = ( 0, 0, 0 );
while( my $num = <DATA> ) {
chomp($num);
## print "num=[$num]\n";
if ( $num =~ /^-\d+$/ ) {
$ctrN++;
}
elsif ( $num =~ /^[1-9]\d*$/ ) {
$ctrZ++;
}
else {
$ctrP++;
}
}
printf("freq(Z+):%8s\n", $ctrP );
printf("freq(Z-):%8s\n", $ctrN );
printf("freq(0):%9s\n", $ctrZ );
printf("Total:%11s\n", ($ctrP+$ctrN+$ctrZ) );
exit;
__DATA__
29
42
324
-511
32
354
0
-29
765
17
-32
You can use the numeric comparison operator <=>, which returns -1, 0, or 1 according to whether the first operand is less than, equal to, or greater than the second, respectively. If you compare each value to zero and add one to the result, you get 0, 1, or 2, which you can use as an index into an array of counters.
Like this:
use strict;
use warnings 'all';
my @counts;
++$counts[($_ <=> 0) + 1] while <DATA>;
my ($ctrN, $ctrZ, $ctrP) = @counts;
printf "freq(Z+): %4d\n", $ctrP;
printf "freq(Z-): %4d\n", $ctrN;
printf "freq(0): %4d\n", $ctrZ;
printf "Total: %4d\n", $ctrP + $ctrN + $ctrZ;
__DATA__
29
42
324
-511
32
354
0
-29
765
17
-32
output
freq(Z+): 7
freq(Z-): 3
freq(0): 1
Total: 11
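For comparison, the same indexing trick can be sketched in Python, which has no <=> operator. The expression (n > 0) - (n < 0) is a common stand-in that yields -1, 0, or 1. The function name is made up for this example.

```python
def sign_counts(values):
    """Count negative, zero and positive integers using a sign-based index."""
    counts = [0, 0, 0]  # index 0: negative, 1: zero, 2: positive
    for value in values:
        n = int(value)
        sign = (n > 0) - (n < 0)   # -1, 0 or 1, like Perl's  $n <=> 0
        counts[sign + 1] += 1
    return counts


data = "29 42 324 -511 32 354 0 -29 765 17 -32".split()
ctr_n, ctr_z, ctr_p = sign_counts(data)

print(f"freq(Z+): {ctr_p:4d}")
print(f"freq(Z-): {ctr_n:4d}")
print(f"freq(0):  {ctr_z:4d}")
print(f"Total:    {ctr_p + ctr_n + ctr_z:4d}")
```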
Swap the $ctrZ++; and $ctrP++; lines:
# ...................
elsif ( $num =~ /^[1-9]\d*$/ ) {
$ctrP++;
}
else {
$ctrZ++;
}
# ...................
I have the following file : extract_info.txt
ABC
PNG
CHNS
and to_extractfrom.txt from which I need to retrieve information:
ABC 123 234 TCHSL
NBV 234 23764 DHG
CHNS 123 347 CGJKS
CVS 233 4747 JSHGD
PNG 122 324 HGH
SJDH 373 3487 JHG
and I am running the following code
while read line
do
gene=$(echo $line | awk -F' ' '{print $1}')
app1=$(awk -v comp1="$gene" '(comp1==$1) {print $1 }' to_extractfrom.txt)
done < extract_info.txt
However, what I want is different: for each entry in extract_info.txt, find the matching line in to_extractfrom.txt and print the first column of the next line on the left and the first column of the previous line on the right of the matched name. For the entries in the first file, the output would be:
NBV ABC -
SJDH PNG CVS
CVS CHNS NBV
awk '
BEGIN {prev = "-"}
NR == FNR {extract[$1] = 1; next}
is_match {print $1, m1, m2; is_match = 0}
$1 in extract {is_match = 1; m1 = $1; m2 = prev}
{prev = $1}
' extract_info.txt to_extractfrom.txt
NBV ABC -
CVS CHNS NBV
SJDH PNG CVS
If you must have the output in the same order as the extract_info file, and you use GNU awk, you can do
gawk '
BEGIN {prev = "-"}
NR == FNR {extract[$1] = FNR; next}
is_match {output[m1] = $1 FS m1 FS m2; is_match = 0}
$1 in extract {is_match = 1; m1 = $1; m2 = prev}
{prev = $1}
END {
PROCINFO["sorted_in"] = "@val_num_asc"
for (key in extract) print output[key]
}
' extract_info.txt to_extractfrom.txt
NBV ABC -
SJDH PNG CVS
CVS CHNS NBV
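If it helps to see the lookup spelled out step by step, here is a rough Python sketch of the same idea. It is not a translation of the awk above. It assumes the first column of to_extractfrom.txt is unique, prints results in the order of extract_info.txt, and uses "-" for a missing neighbour on either side (the awk version prints nothing when the match is the very last line). The function name is made up.

```python
def neighbours(wanted, rows):
    """Return (next_name, name, prev_name) for each wanted name found in rows."""
    first_col = [row.split()[0] for row in rows if row.strip()]
    position = {name: i for i, name in enumerate(first_col)}
    result = []
    for name in wanted:
        if name not in position:
            continue  # names missing from the second file are skipped
        i = position[name]
        prev_name = first_col[i - 1] if i > 0 else "-"
        next_name = first_col[i + 1] if i + 1 < len(first_col) else "-"
        result.append((next_name, name, prev_name))
    return result


extract_info = ["ABC", "PNG", "CHNS"]
to_extractfrom = [
    "ABC 123 234 TCHSL",
    "NBV 234 23764 DHG",
    "CHNS 123 347 CGJKS",
    "CVS 233 4747 JSHGD",
    "PNG 122 324 HGH",
    "SJDH 373 3487 JHG",
]

for triple in neighbours(extract_info, to_extractfrom):
    print(" ".join(triple))
```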
I'm trying to repeat the block of lines below each OCCURS line the number of times indicated on that line. The block to repeat consists of the following lines whose level number (the number at the start of the line) is higher than that of the OCCURS line.
I mean, with this input:
01 PATIENT-TREATMENTS.
05 PATIENT-NAME PIC X(30).
05 PATIENT-SS-NUMBER PIC 9(9).
05 NUMBER-OF-TREATMENTS PIC 99 COMP-3.
05 TREATMENT-HISTORY OCCURS 2.
10 TREATMENT-DATE OCCURS 3.
15 TREATMENT-DAY PIC 99.
15 TREATMENT-MONTH PIC 99.
15 TREATMENT-YEAR PIC 9(4).
10 TREATING-PHYSICIAN PIC X(30).
10 TREATMENT-CODE PIC 99.
05 HELLO PIC X(9).
05 STACK OCCURS 2.
10 OVERFLOW PIC X(99).
This would be the output:
01 PATIENT-TREATMENTS.
05 PATIENT-NAME PIC X(30).
05 PATIENT-SS-NUMBER PIC 9(9).
05 NUMBER-OF-TREATMENTS PIC 99 COMP-3.
05 TREATMENT-HISTORY OCCURS 2.
10 TREATMENT-DATE OCCURS 3.
15 TREATMENT-DAY PIC 99.
15 TREATMENT-MONTH PIC 99.
15 TREATMENT-YEAR PIC 9(4).
15 TREATMENT-DAY PIC 99.
15 TREATMENT-MONTH PIC 99.
15 TREATMENT-YEAR PIC 9(4).
15 TREATMENT-DAY PIC 99.
15 TREATMENT-MONTH PIC 99.
15 TREATMENT-YEAR PIC 9(4).
10 TREATING-PHYSICIAN PIC X(30).
10 TREATMENT-CODE PIC 99.
15 TREATMENT-DAY PIC 99.
15 TREATMENT-MONTH PIC 99.
15 TREATMENT-YEAR PIC 9(4).
15 TREATMENT-DAY PIC 99.
15 TREATMENT-MONTH PIC 99.
15 TREATMENT-YEAR PIC 9(4).
15 TREATMENT-DAY PIC 99.
15 TREATMENT-MONTH PIC 99.
15 TREATMENT-YEAR PIC 9(4).
10 TREATING-PHYSICIAN PIC X(30).
10 TREATMENT-CODE PIC 99.
05 HELLO PIC X(9).
05 STACK OCCURS 2.
10 OVERFLOW PIC X(99).
10 OVERFLOW PIC X(99).
I tried it by this way:
tac input.txt |
awk '
BEGIN {
lbuff="";
n=0;
}{
if($0 ~ /^\s*$/) {next;}
if ($3 == "OCCURS") {
lev_oc=$1
len_oc=$4
lstart=0
for (x=1; x<n; x++) {
split(saved[x],saved_level," ")
if (saved_level[1] <= lev_oc) {
print saved[x]
lstart=x+1
}
}
for (i=1; i<=len_oc; i++) {
for (x=lstart; x<n; x++) {
print saved[x]
}
}
print $0
}else if ($0) {
saved[n]=$0
n++
}
}' | tac
But I don't get the result I'm trying to obtain. Is awk the best way to do it? Do you have any alternative?
I used perl for this because it's easy to make arbitrarily complex data structures:
#!/usr/bin/perl
use strict;
use warnings;
# read the file into an array of lines.
open my $f, '<', shift;
my @lines = <$f>;
close $f;
my @occurring;
my @occurs;
# iterate over the lines of the file
for (my $i = 0; $i < @lines; $i++) {
    # extract the "level", the first word of the line
    my $level = (split ' ', $lines[$i])[0];
    # if this line contains the OCCURS string,
    # push some info onto a stack.
    # This marks the start of something to be repeated
    if ($lines[$i] =~ /OCCURS (\d+)/) {
        push @occurring, [$1-1, $level, $i+1];
        next;
    }
    # if this line is at the same level as the level of the start of the
    # last seen item on the stack, mark the last line of the repeated text
    if (@occurring and $level eq $occurring[-1][1]) {
        push @occurs, [@{pop @occurring}, $i-1];
    }
}
# If there's anything open on the stack, it ends at the last line
while (@occurring) {
    push @occurs, [@{pop @occurring}, $#lines];
}
# handle all the lines to be repeated by appending them to the last
# line of the repetition
for (@occurs) {
    my $repeated = "";
    my ($count, undef, $start, $stop) = @$_;
    $repeated .= join "", @lines[$start..$stop] for (1..$count);
    $lines[$stop] .= $repeated;
}
print @lines;
For your reading pleasure, here's an awk translation.
BEGIN {
s = 0
f = 0
}
function stack2frame(lineno) {
f++
frame[f,"reps"] = stack[s,"reps"]
frame[f,"start"] = stack[s,"start"]
frame[f,"stop"] = lineno
s--
}
{
lines[NR] = $0
level = $1
}
# if this line contains the OCCURS string, push some info onto a stack.
# This marks the start of something to be repeated
$(NF-1) == "OCCURS" {
s++
stack[s,"reps"] = $NF-1
stack[s,"level"] = level
stack[s,"start"] = NR+1
next
}
# if this line is at the same level as the level of the start of the
# last seen item on the stack, mark the last line of the repeated text
level == stack[s,"level"] {
stack2frame(NR-1)
}
END {
# If there's anything open on the stack, it ends at the last line
while (s) {
stack2frame(NR)
}
# handle all the lines to be repeated by appending them to the last
# line of the repetition
for (i=1; i<=f; i++) {
repeated = ""
for (j=1; j <= frame[i,"reps"]; j++) {
for (k = frame[i,"start"]; k <= frame[i,"stop"]; k++) {
repeated = repeated ORS lines[k]
}
}
lines[frame[i,"stop"]] = lines[frame[i,"stop"]] repeated
}
for (i=1; i <= NR; i++)
print lines[i]
}
Here's a ruby solution:
#!/usr/bin/env ruby
# -*- coding: utf-8 -*-
stack = []
def unwind_frame(stack)
frame = stack.pop
_,occurs,data = *frame
with_each = stack==[] ? ->(l){ puts l} : ->(l){stack.last[2].push l}
occurs.times { data.each &with_each }
end
while gets
$_.chomp! "\n"
if m=$_.match(/OCCURS ([0-9]*)\.\s*$/)
puts $_
occurs=m[1].to_i
level = $_.to_i
stack.push([level,occurs,[]])
next
end
if stack==[]; puts $_; next; end
level = $_.to_i
if level > stack.last[0]
stack.last[2].push $_
next
end
while(stack!=[] && level <= stack.last[0])
unwind_frame(stack)
stack!=[] ? stack.last[2].push($_) : puts($_)
end
end
while(stack!=[])
unwind_frame(stack)
end
The result matches what you expected to get.
I would like to print only a '+' or '-' symbol depending on whether a string is found or not. Basically, I have two files:
Input file 1 (tab-delimited):
HPNK_00457
HPNK_00458
HPNK_00459
Input file 2 (tab-delimited):
HPNK_00457 AAA50325 1e-43 437 28 43 83 ATP-binding protein.
HPNK_00458 P25256 8e-43 429 28 43 82 RecName: Full=Tylosin resistance ATP-binding protein tlrC.
HPNK_00458 CAM96590 1e-42 429 27 42 87 ABC transporter ATP-binding protein [Streptomyces ambofaciens].
Desired output (tab-delimited, maintaining order of strings in file 1):
HPNK_00457 +
HPNK_00458 +
HPNK_00459 -
This is what I've been using up to now, but need to update:
while read vl; do grep "^$vl " file2 || printf -- "- -\n" ; done < file1
Thanks, trying to learn everyday here.
Here's one way using awk:
awk 'FNR==NR { a[$1]; next } { print $1, ($1 in a ? "+" : "-" ) }' file2 file1
Results:
HPNK_00457 +
HPNK_00458 +
HPNK_00459 -
You can use:
while read -r line
do
grep -q "$line" f2 && echo "$line +" || echo "$line -"
done < f1
grep -q prints nothing and just returns true if it has matched something. In that case we print the ID followed by +; otherwise, we print the ID followed by -.
It returns:
$ while read -r line; do grep -q "$line" f2 && echo "$line +" || echo "$line -"; done < f1
HPNK_00457 +
HPNK_00458 +
HPNK_00459 -
perl -lane'
BEGIN{ $, ="\t"; $x=shift; @h{ map /(\S+)/, <> } =(); @ARGV=$x }
print @F, exists $h{$F[0]} ? "+" : "-";
' file1 file2
output
HPNK_00457 +
HPNK_00458 +
HPNK_00459 -
Here's the algorithm:
Read file 2. For each line,
Get the first word
Store it in a hash.
Read file 1. For each line, chomp it, then
print $hash{$_}? '+' : '-'
I can write the code for you, but if you want to learn every day, writing it yourself will be a useful exercise.
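For reference, the steps above can be sketched in Python roughly as follows. This is only an illustration of the algorithm, with a set playing the role of the hash, a made-up function name, and inline sample data standing in for the two files.

```python
def mark_presence(ids, table_lines):
    """Return 'ID<TAB>+' or 'ID<TAB>-' for each ID, keeping the order of ids."""
    # Step 1: read file 2 and remember the first word of every line.
    seen = {line.split()[0] for line in table_lines if line.strip()}
    # Step 2: read file 1 and look each ID up in that set.
    result = []
    for raw in ids:
        ident = raw.strip()
        if ident:
            result.append(f"{ident}\t{'+' if ident in seen else '-'}")
    return result


file1 = ["HPNK_00457\n", "HPNK_00458\n", "HPNK_00459\n"]
file2 = [
    "HPNK_00457\tAAA50325\t1e-43\t437\t28\t43\t83\tATP-binding protein.\n",
    "HPNK_00458\tP25256\t8e-43\t429\t28\t43\t82\tRecName: Full=Tylosin resistance ATP-binding protein tlrC.\n",
    "HPNK_00458\tCAM96590\t1e-42\t429\t27\t42\t87\tABC transporter ATP-binding protein [Streptomyces ambofaciens].\n",
]

print("\n".join(mark_presence(file1, file2)))
```

With real files you would pass the open file objects instead of the lists, since iterating over a file yields its lines.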
This simple Perl script should do the work
#!/usr/local/bin/perl
## f1 and f2 are the 2 files containing your input data
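use strict;
use warnings;
open my $fh1, '<', 'f1' or die "Cannot open f1: $!";
open my $fh2, '<', 'f2' or die "Cannot open f2: $!";
my @file1data = <$fh1>;
my @file2data = <$fh2>;
foreach my $data (@file1data) {
    chomp($data);
    # look for the ID at the start of any line of file 2,
    # not only on the line with the same row number
    if (grep { /^\Q$data\E(?:\s|$)/ } @file2data) {
        print $data . " " . "+\n";
    }
    else {
        print $data . " " . "-\n";
    }
}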
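awk 'FNR==NR {a[$1]; next}   # file1: remember every ID
{b[$1]}                      # file2: remember every first column
END {
    # note: the order of "for (i in a)" is not guaranteed
    for (i in a)
        if (i in b) {print i, "+"}
        else {print i, "-"}
}' file1 file2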