I'd like to make GDB call a given function a large number of times automatically, say 100. Is there any command that will let me do that?
Save this example script into a file, say my_gdb_extensions:
define fcall_n_times
  set $count = $arg0
  set $i = 0
  while ($i < $count)
    call $arg1
    set $i = $i + 1
  end
end
You can find more about gdb extensions here.
$ gdb -x my_gdb_extensions <your_bin>
(gdb) start
(gdb) fcall_n_times 10 fact(3)
In this example, 10 is the number of times you want to call the function, and fact(3) is the call to make: the function fact with 3 as its argument.
I was wondering if we have any TACL experts out there who can help me answer what is probably a very basic question.
How do you pass multiple arguments into your routine?
This is what I have currently so far
[#CASE [#ARGUMENT / VALUE job_id/number /minimum [min_job], maximum [max_job]/
otherwise]
|1|#output Job Number = [job_id]
|otherwise|
#output Bad number - Must be a number between [min_job] & [max_job]
#return
]
I have been told you need to use a second #ARGUMENT statement, but I have had no luck getting it to work, and the PDF guides don't help too much.
Any ideas/answers would be great
Thanks.
The #CASE statement isn't required if your arguments are positional and of one type (i.e. you know what you are getting and in what order). In that case you can just use a sequence of #ARGUMENT statements to get the arguments.
In your example #ARGUMENT accepts either a number in a range or anything else - the OTHERWISE bit. The #CASE statement then tells you which of those two you got, 1 or 2.
#ARGUMENT can do data validation for you (you may recognize the output from some of the TACL routines that come with the operating system).
So you can write something like this:
SINK [#ARGUMENT / VALUE job_id/number /minimum [min_job], maximum [max_job]/]
The SINK just tosses away the expansion of the #ARGUMENT. You don't need it, since you only accept a number and fail otherwise.
I figured out a way but idk if it is the best way to do it.
It seems that for one an Argument statement needs to always be in a #CASE statement so all I basically did was mirror the above and just altered it for text rather than use integer.
If you know of any other/better ways let me know :)
I find it best to use CASE when you have multiple types of argument input to process. Here is a mock-up of how I would handle multiple argument types with the CASE expression, in the context you shared:
?TACL ROUTINE
#FRAME
#PUSH JOB_ID MIN_JOB MAX_JOB
#SETMANY MIN_JOB MAX_JOB , 1 3
[#DEF VALID_KEYWORDS TEXT |BODY| THISJOB THATJOB SOMEOTHERJOB]
[#CASE
[#ARGUMENT/VALUE JOB_ID/
NUMBER/MINIMUM [MIN_JOB],MAXIMUM [MAX_JOB]/
KEYWORD/WORDLIST [VALID_KEYWORDS]/
STRING
OTHERWISE
]
| 1 |
#OUTPUT VALID JOB NUMBER = [JOB_ID]
| 2 |
#OUTPUT VALID KEYWORD = [JOB_ID]
| 3 |
#OUTPUT VALID STRING = [JOB_ID]
| OTHERWISE |
#OUTPUT NOT A NUMBER, KEYWORD, OR A STRING
#OUTPUT MUST BE ONE OF:
#OUTPUT A NUMBER IN THE RANGE OF: [MIN_JOB] TO [MAX_JOB]
#OUTPUT A KEYWORD IN THIS LIST: [VALID_KEYWORDS]
#OUTPUT OR A STRING OF CHARACTERS
#RETURN
]
#OUTPUT
#OUTPUT NOW WE ARE USING ARGUMENT [JOB_ID] !!!
TIME
#UNFRAME
I have just started doing my first research project, and I have just begun programming (approximately 2 weeks ago). Excuse me if my questions are naive. I might be using python very inefficiently. I am eager to improve here.
I have experimental data that I want to analyse. My goal is to create a python script that takes the data as input, and that for output gives me graphs, where certain parameters contained in text files (within the experimental data folders) are plotted and fitted to certain equations. This script should be as generalizable as possible so that I can use it for other experiments.
I'm using the Anaconda, Python 2.7, package, which means I have access to various libraries/modules related to science and mathematics.
I am stuck at trying to use For and While loops (for the first time).
The data files are structured like this (I am using regex brackets here):
.../data/B_foo[1-7]/[1-6]/D_foo/E_foo/text.txt
What I want to do is to cycle through all 7 top directories and each of their 6 subdirectories (named 1,2,3...6). Within each of these 6 subdirectories there is a text file (always with the same filename, text.txt), which contains the data I want to access.
The 'text.txt' files are structured something like this:
1 91.146 4.571 0.064 1.393 939.134 14.765
2 88.171 5.760 0.454 0.029 25227.999 137.883
3 88.231 4.919 0.232 0.026 34994.013 247.058
4 ... ... ... ... ... ...
The table continues down. Every other row is empty. I want to extract information from 13 rows starting from the 8th line, and I'm only interested in the 2nd, 3rd and 5th columns. I want to put them into lists 'parameter_a' and 'parameter_b' and 'parameter_c', respectively. I want to do this from each of these 'text.txt' files (of which there is a total of 7*6 = 42), and append them to three large lists (each with a total of 7*6*13 = 546 items when everything is done).
This is my attempt:
First, I made a list, 'list_B_foo', containing the seven different 'B_foo' directories (this part of the script is not shown). Then I made this:
parameter_a = []
parameter_b = []
parameter_c = []
j = 7 # The script starts reading 'text.txt' after the j:th line.
k = 35 # The script stops reading 'text.txt' after the k:th line.
x = 0
while x < 7:
    for i in range(1, 7):
        path = str(list_B_foo[x]) + '/%s/D_foo/E_foo/text.txt' % i
        m = open(path, 'r')
        line = m.readlines()
        while j < k:
            line = line[j]
            info = line.split()
            print 'info:', info
            parameter_a.append(float(info[1]))
            parameter_b.append(float(info[2]))
            parameter_c.append(float(info[5]))
            j = j + 2
    x = x + 1
parameter_a_vect = np.array(parameter_a)
parameter_b_vect = np.array(parameter_b)
parameter_c_vect = np.array(parameter_c)
print 'a_vect:', parameter_a_vect
print 'b_vect:', parameter_b_vect
print 'c_vect:', parameter_c_vect
I have tried to fiddle around with indentation without getting it to work (receiving either syntax error or indentation errors). Currently, I get this output:
info: ['1', '90.647', '4.349', '0.252', '0.033', '93067.188', '196.142']
info: ['.']
Traceback (most recent call last):
File "script.py", line 104, in <module>
parameter_a.append(float(info[1]))
IndexError: list index out of range
I don't understand why I get the "list index out of range" message. If anyone knows why this is the case, I would be happy to hear you out.
How do I solve this problem? Is my approach completely wrong?
EDIT: I went for a pure while-loop solution, taking RebelWithoutAPulse and CamJohnson26's suggestions into account. This is how I solved it:
parameter_a = []
parameter_b = []
parameter_c = []
k = 35  # The script stops reading the file after the k:th line.
x = 0
while x < 7:
    y = 1
    while y < 7:
        j = 7  # Reset for every file; reading starts after the j:th line.
        path = str(list_B_foo[x]) + '/%s/pdata/999/dcon2dpeaks.txt' % y
        m = open(path, 'r')
        lines = m.readlines()
        m.close()
        while j < k:
            line = lines[j]
            info = line.split()
            parameter_a.append(float(info[1]))
            parameter_b.append(float(info[2]))
            parameter_c.append(float(info[5]))
            j = j + 2
        y = y + 1
    x = x + 1
Meta: I am not sure if I should accept the answer from the person who answered the quickest and helped me finish my task, or from the person whose answer I learned the most from. I am sure this is a common issue that I can resolve by reading the rules or going to Stack Exchange Meta. Until I've read up on the recommendations, I will hold off on marking either of your answers as accepted.
Welcome to Stack Overflow!
The error is due to a name collision that you have inadvertently created. Note the output before the exception occurs:
info: ['1', '90.647', '4.349', '0.252', '0.033', '93067.188', '196.142']
info: ['.']
Traceback (most recent call last):
...
info[1] cannot be evaluated: the list contains only '.', so it has no element at index 1 (Python lists are indexed from 0).
This happens in your nested loop,
while j < k:
where you rebind line, the very name that holds the lines you read just before:
line = m.readlines()
while j < k:
    line = line[j]
    info = line.split()
    ...
So what happens is that you read the lines of the file into the line list, and then, on the first run of the loop, you take one line from the list, assign it to line again, and continue with the loop. At this point line contains a string.
On the next run, indexing line reads the character at position j of that string instead of a row of the file, and the code malfunctions.
You could fix this with different naming.
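To make the mechanism concrete, here is a minimal sketch with made-up rows (the values and the index 4 are illustrative assumptions, not taken from your files). It shows how rebinding the name turns a list of rows into a single string, so that the next index picks out one character:

```python
# Stand-in for the result of m.readlines(); the rows are invented for illustration.
lines = ["1 90.647 4.349\n", "\n", "2 88.171 5.760\n"]

line = lines      # `line` is a list of strings, as right after readlines()
line = line[0]    # first pass: `line` is rebound to a single string

# Second pass: indexing now yields one character of that string, not a row.
char = line[4]
info = char.split()
print(type(line).__name__, repr(char), info)
```

Keeping the list under one name (for example lines) and the current row under another (line) avoids the collision.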
P.S. I would suggest using with ... as ... syntax while working with files, it is briefly described here - this is called a context manager and it takes care of opening and closing the files for you.
P.P.S. I would also suggest reading the naming conventions
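As a rough sketch of the with ... as ... suggestion combined with separate names for the list and the current row: the helper name read_columns and its default arguments are assumptions chosen to mirror the j = 7, k = 35, step 2 loop from the question, not part of the original script.

```python
def read_columns(path, first=7, stop=35, step=2, columns=(1, 2, 5)):
    """Return one list of floats per requested column.

    Rows are taken from line indices first, first + step, ... below stop.
    """
    with open(path) as handle:       # the file is closed automatically
        lines = handle.readlines()   # the list keeps its own name
    result = [[] for _ in columns]
    for line in lines[first:stop:step]:   # `line` is always a single row
        fields = line.split()
        for values, col in zip(result, columns):
            values.append(float(fields[col]))
    return result
```

A call such as a, b, c = read_columns(path) could then be used inside the directory loops, extending the three big lists with a, b and c.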
Looks like you are overwriting the list of lines with a single line of the file. You call line = m.readlines(), which sets line to a list of lines. You then set line = line[j], so now the line variable is no longer a list; it's a string such as
1 91.146 4.571 0.064 1.393 939.134 14.765
This pass through the loop works fine, but the next pass treats line as a sequence of characters and takes the single character at index j, which happens to be a period, and assigns it back to line. That explains why the info variable only has one element on the second pass through the loop.
To solve this, just use two variables instead of one. Call one lines and the other line.
lines = m.readlines()
while j < k:
    line = lines[j]
    info = line.split()
There may be other errors too, but that should get you started.
How do I make my program print the answers on separate lines + with what key the line corresponds to?
import string
import sys

def break_crypt(message):
    for key in range(1, 27):
        for character in message:
            if character in string.uppercase:
                old_ascii = ord(character)
                new_ascii = (old_ascii - key - 65) % 26 + 65
                new_char = chr(new_ascii)
                sys.stdout.write(new_char)
            elif character in string.lowercase:
                old_ascii = ord(character)
                new_ascii = (old_ascii - key - 97) % 26 + 97
                new_char = chr(new_ascii)
                sys.stdout.write(new_char)
            else:
                sys.stdout.write(character)
To start a new line, simply use "\n". For instance,
sys.stdout.write("a\nb")
will write a and b on different lines.
Use + to join one string to another:
sys.stdout.write("a" + variable + "b")
There are other, "more advanced" ways, such as
sys.stdout.write("a%sb" % variable)
or
sys.stdout.write("a{0}b".format(variable))
Also, if there is no particular reason to use sys.stdout.write in your code, don't use it.
This may help you:
https://docs.python.org/2/tutorial/introduction.html
If you simply add the following at the end of the outer loop, then it'll both print the key and go to the next line:
print '', key
Then the output will look like this:
Sghr hr z sdrs 1
Rfgq gq y rcqr 2
Qefp fp x qbpq 3
.
.
.
Uijt jt b uftu 25
This is a test 26
But I would really build the whole string for the current key in a string variable and then print it at once.
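A minimal sketch of that idea follows. The helper names shift_back and break_crypt_lines are made up for illustration, and the code is written to run under both Python 2 and Python 3. Each candidate plaintext is built as a whole string and printed once, with its key in front:

```python
def shift_back(message, key):
    """Return message with every ASCII letter shifted back by key places."""
    chars = []
    for ch in message:
        if "A" <= ch <= "Z":
            chars.append(chr((ord(ch) - key - 65) % 26 + 65))
        elif "a" <= ch <= "z":
            chars.append(chr((ord(ch) - key - 97) % 26 + 97))
        else:
            chars.append(ch)   # leave spaces and punctuation untouched
    return "".join(chars)


def break_crypt_lines(message):
    """Return one 'key: candidate' string per key from 1 to 26."""
    return ["%2d: %s" % (key, shift_back(message, key)) for key in range(1, 27)]


for row in break_crypt_lines("This is a test"):
    print(row)
```

Returning the strings instead of writing them character by character also makes it easy to filter or score the candidates later.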
Is it possible to set a complex breakpoint whose condition checks an argument passed to an outer function (frame)?
E.g.:
#0 sample::_processMessage (this=0xa5c8c0, data=0x7fffe5ae31db "\027w\270߸\023\032\212\v", line=0x7fffe4799db8 "224.4.2.197:60200", should_process=true) at sample.cpp:426
#1 0x00007ffff682f05d in sample::_process (this=0xa5c8c0, should_process=true, line=0x7fffe4799db8 "224.4.2.197:60200", data=0x7fffe5ae31db "\027w\270߸\023\032\212\v", sn=31824) at sample.cpp:390
#2 0x00007ffff6836744 in sample::drain (this=0xa5c8c0, force=true) at sample.cpp:2284
#3 0x00007ffff682ed81 in sample::process (this=0xa5c8c0, mdData=0x7fffe67914e0) at sample.cpp:354
Here I want to set a breakpoint at sample.cpp:356, which is in the function process (frame #3),
conditional on _process (frame #1) having sn == 31824 at the time the breakpoint is hit.
So the condition actually refers to the function _process, but I want to pause execution in the function process.
Thanks in advance
I don't know if it's possible to create conditional breakpoints that reference an outer frame, but you could use breakpoint commands to achieve a similar result.
Here's an example gdb session:
(gdb) break some-location
(gdb) commands
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>silent
>up
>if (sn != 31824)
>continue
>end
>end
Now every time gdb hits the breakpoint it will automatically move up a frame, check sn and continue if the value is not correct. This will not be any (or much) slower than a conditional breakpoint, and the only real downside is that it will print out a line each time the breakpoint is hit, even if gdb then continues.
The silent in the command list cuts down on some of the normal output that is produced when a breakpoint is hit; it can be removed to get a more verbose experience.
This can be accomplished with a gdb convenience function implemented in python:
import gdb

class CallerVar(gdb.Function):
    """Return the value of a calling function's variable.

    Usage: $_caller_var (NAME [, NUMBER-OF-FRAMES [, DEFAULT-VALUE]])

    Arguments:
      NAME: The name of the variable.
      NUMBER-OF-FRAMES: How many stack frames to traverse back from the currently
        selected frame to compare with.
        The default is 1.
      DEFAULT-VALUE: Return value if the variable can't be found.
        The default is 0.

    Returns:
      The value of the variable in the specified frame, DEFAULT-VALUE if the
      variable can't be found."""

    def __init__(self):
        super(CallerVar, self).__init__("_caller_var")

    def invoke(self, name, nframes=1, defvalue=0):
        if nframes < 0:
            raise ValueError("nframes must be >= 0")
        frame = gdb.selected_frame()
        while nframes > 0:
            frame = frame.older()
            if frame is None:
                return defvalue
            nframes = nframes - 1
        try:
            return frame.read_var(name.string())
        except:
            return defvalue

CallerVar()
It can be used like:
(gdb) b sample.cpp:356 if $_caller_var("sn",2)==31824
I would like to retrieve the coding amino-acid when there is certain pattern in a DNA sequence. For example, the pattern could be: ATAGTA. So, when having:
Input file:
>sequence1
ATGGCGCATAGTAATGC
>sequence2
ATGATAGTAATGCGCGC
The ideal output would be a table giving, for each amino-acid, the number of times it is coded by the pattern. Here in sequence1 the pattern codes for only one amino-acid, but in sequence2 it codes for two. I would like this tool to scale to thousands of sequences. I've been thinking about how to get this done, but the only approach I came up with is to replace all nucleotides outside the pattern, translate what remains, and summarize the coded amino-acids.
Please let me know if this task can be performed by an already available tool.
Thanks for your help. All the best, Bernardo
Edit (due to the confusion generated with my post):
Please forget the original post and sequence1 and sequence2 too.
Hi all, and sorry for the confusion. The input fasta file is a *.ffn file derived from a GenBank file using the 'FeatureExtract' tool (http://www.cbs.dtu.dk/services/FeatureExtract/download.php), so I can imagine the sequences are already in frame (+1) and there is no need to get amino-acids coded in a frame other than +1.
I would like to know which amino-acids the following sequences code for:
AGAGAG
GAGAGA
CTCTCT
TCTCTC
The unique strings I want to get coding amino-acids are repeats of three AG, GA, CT or TC, that is (AG)3, (GA)3, (CT)3 and (TC)3, respectively. I don't want the program to retrieve coding amino-acids for repeats of four or more.
Thanks again, Bernardo
Here's some code that should at least get you started. For example, you can run like:
./retrieve_coding_aa.pl file.fa ATAGTA
Contents of retrieve_coding_aa.pl:
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use Bio::SeqIO;
use Bio::Tools::CodonTable;
use Data::Dumper;

my $pattern = $ARGV[1];
my $fasta = Bio::SeqIO->new ( -file => $ARGV[0], -format => 'fasta');

while (my $seq = $fasta->next_seq ) {
    my $pos = 0;
    my %counts;
    for (split /($pattern)/ => $seq->seq) {
        if ($_ eq $pattern) {
            my $dist = $pos % 3;
            unless ($dist == 0) {
                my $num = 3 - $dist;
                s/.{$num}//;
                chop until length () % 3 == 0;
            }
            my $table = Bio::Tools::CodonTable->new();
            $counts{$_}++ for split (//, $table->translate($_));
        }
        $pos += length;
    }
    print $seq->display_id() . ":\n";
    map {
        print "$_ => $counts{$_}\n"
    }
    sort {
        $counts{$a} <=> $counts{$b}
    }
    keys %counts;
    print "\n";
}
Here are the results using the sample input:
sequence1:
S => 1
sequence2:
V => 1
I => 1
The Bio::Tools::CodonTable class also supports non-standard codon usage tables. You can change the table using the id pointer. For example:
$table = Bio::Tools::CodonTable->new( -id => 5 );
or:
$table->id(5);
For more information, including how to examine these tables, please see the documentation here: http://metacpan.org/pod/Bio::Tools::CodonTable
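For readers without BioPerl, the in-frame trimming idea can also be sketched in plain Python. This is only an illustration under stated assumptions: the function name coded_amino_acids is hypothetical, the table is the standard genetic code built by hand instead of taken from a library, matching is literal and non-overlapping, and only whole frame +1 codons lying inside a pattern hit are counted.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def coded_amino_acids(sequence, pattern):
    """Count amino acids coded by whole frame +1 codons inside each pattern hit."""
    counts = Counter()
    start = sequence.find(pattern)
    while start != -1:
        end = start + len(pattern)
        first = start + (-start) % 3          # first codon boundary in the hit
        for i in range(first, end - 2, 3):    # codons that fit entirely
            # Codons with unexpected letters (e.g. N) are counted as "X".
            counts[CODON_TABLE.get(sequence[i:i + 3], "X")] += 1
        start = sequence.find(pattern, end)   # non-overlapping search
    return counts


print(dict(coded_amino_acids("ATGGCGCATAGTAATGC", "ATAGTA")))
print(dict(coded_amino_acids("ATGATAGTAATGCGCGC", "ATAGTA")))
```

On the two sample sequences this gives the same counts as the results shown above: one S for sequence1, and one I plus one V for sequence2.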
I will stick to the first version of what you wanted, because the addendum only confused me even more. (Frame?)
I only found ATAGTA once in sequence2, but I assume you want the mirror image/reverse sequence as well, which would be ATGATA in this case. My script doesn't do that, so you would have to list both in the input sequences file, but that should be no problem, I would think.
I work with a file like yours, which I call "dna.txt", and an input sequences file called "input_seq.txt". The result file is a listing of patterns and their occurrences in the dna.txt file (including overlapping matches, but it can be set to non-overlapping as explained in the awk code).
input_seq.txt:
GC
ATA
ATAGTA
ATGATA
dna.txt:
>sequence1
ATGGCGCATAGTAATGC
>sequence2
ATGATAGTAATGCGCGC
results.txt:
GC,6
ATA,2
ATAGTA,2
ATGATA,1
The code is awk calling another awk (but one of them is simple). You have to run "./match_patterns.awk input_seq.txt" to get the results file generated.
match_patterns.awk:
#! /bin/awk -f
{return_value= system("awk -vsubval="$1" -f test.awk dna.txt")}
test.awk:
#! /bin/awk -f
{
    string = $0
    do {
        where = match(string, subval)
        # code is for overlapping matches (i.e ATA matches twice in ATATAC)
        # for non-overlapping replace +1 by +RLENGTH in following line
        if (RSTART != 0) { count++; string = substr(string, RSTART+1) }
    } while (RSTART != 0)
}
END { print subval "," count >> "results.txt" }
Files have to be all in the same directory.
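The overlapping versus non-overlapping switch described in the awk comment can be sketched in Python as follows. The function name count_matches is made up for illustration, and the pattern is matched as a literal string, whereas awk's match() treats it as a regular expression.

```python
def count_matches(text, pattern, overlapping=True):
    """Count occurrences of a literal pattern in text.

    With overlapping=True the search resumes one character after each hit
    (the "+1" case); otherwise it resumes after the whole hit (the
    "+RLENGTH" case).
    """
    count = 0
    step = 1 if overlapping else len(pattern)
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + step)
    return count


sequences = ["ATGGCGCATAGTAATGC", "ATGATAGTAATGCGCGC"]
for pattern in ["GC", "ATA", "ATAGTA", "ATGATA"]:
    total = sum(count_matches(seq, pattern) for seq in sequences)
    print("%s,%d" % (pattern, total))
```

Run on the two sample sequences, this produces the same totals as results.txt above.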
Good luck!