I'm trying to do something like this (assuming $input is something provided by the user):
LIST = pre1 pre2 pre3 pre4 pre5 pre6 pre7 pre8 pre9 pre10
START = 0
for prefix in $(LIST); do \
if $(input) == $(prefix) then
START = 1
endif \
if $(START) == 1 then \
if [ -f $(prefix)<file_name> ]; then <do_A>; else <do_B>; fi
endif \
done
My problem is with the two ifs above. I don't know how to compare the current item with a specific string while iterating over the list (if $(input) == $(prefix) then), and I don't know how to check whether a value is 1 or 0 (if $(START) == 1 then).
My intent is to use the same makefile for different directories that contain a file with the same name but a different prefix. Sometimes a directory contains multiple versions of the file, each with a different prefix, and I want to define a hierarchy of those prefixes (LIST in my example). When the user provides a version, the idea is to search for the most up-to-date version, starting from the one provided. For example, if the user provides pre4, I need to look for pre4 first; if it does not exist, I go on to pre5, and so on. In this example I never look for pre1, even if it exists in the current directory.
Does anyone have an idea how I can do that?
Thanks in advance.
If that is supposed to be a command in a Makefile, the syntax would have to be something like this:
LIST = pre1 pre2 pre3 pre4 pre5 pre6 pre7 pre8 pre9 pre10
input = somename
file_name = whatever
some_target:
	START=0; \
	for prefix in $(LIST); do \
		if test "$(input)" = "$$prefix"; then \
			START=1; \
		fi; \
		if test "$$START" = 1; then \
			if test -f "$$prefix$(file_name)"; then \
				<do_A>; \
			else \
				<do_B>; \
			fi; \
		fi; \
	done
You didn't tell us what <input> and <file_name> are supposed to be, so I assumed they are other make variables. A make recipe is essentially one long shell line, with commands separated by semicolons and lines continued with backslashes. make replaces $$ with a single $, which is why references to shell variables ($$prefix, $$START) need two dollars. START has to be a shell variable here: $(START) would be expanded by make once, before the shell runs, so an assignment inside the loop could never change it. Recipe lines must start with a tab.
Your make manual (type man make) has the whole story and is fun to read and understand. Become a make guru today! Be sure to understand the difference between a make variable and a shell variable.
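The search described in the question can also be tried outside make as a plain shell function, which makes the role of the shell flag easier to see. This is only a sketch: the function name first_existing_prefix is made up for illustration, and it assumes the goal is to stop at the first prefix, at or after the user's input, whose file exists.

```shell
# Hypothetical helper (not from the original post).
# usage: first_existing_prefix <input> <file_name> <prefix>...
# Prints the first prefix, at or after <input> in the list, for which
# the file "<prefix><file_name>" exists in the current directory.
first_existing_prefix() {
    input=$1
    file_name=$2
    shift 2
    start=0                       # shell flag, plays the role of START
    for prefix in "$@"; do
        if [ "$input" = "$prefix" ]; then
            start=1               # from here on, prefixes are candidates
        fi
        if [ "$start" = 1 ] && [ -f "$prefix$file_name" ]; then
            printf '%s\n' "$prefix"
            return 0
        fi
    done
    return 1                      # nothing found from <input> onwards
}
```

Inside a recipe, the same logic needs every shell $ doubled and every line joined with "; \".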
I am trying to use a macro to call the appropriate object based on the type.
#define DELEGATE_FUNC(FuncName, kind, paramPtr) \
if (kind == 1) { \
return PolicyObject1.##FuncName(paramPtr); \
} \
else { \
return PolicyObject2.##FuncName(paramPtr); \
} \
return 0; \
(PolicyObject1 & PolicyObject2 are two static objects.)
Now when using the macro, e.g.
DELEGATE_FUNC(ProcessPreCreate, 1, null_ptr);
It compiles fine in VS 2015, but LLVM gives the error "Pasting formed an invalid preprocessing token '.ProcessPreCreate'".
I searched and found a few posts, and I understood them up to a point: a double level of indirection is needed, e.g.
Why do I need double layer of indirection for macros?
However, I am unable to define those two layers of macros. Can anyone help?
(Please leave aside the discussion on design aspects)
Thanks
When the compiler reads your C++ file, one of the first steps is dividing it into tokens like identifier, string literal, number, punctuation, etc. The C preprocessor works on these tokens, not on text. The ## operator glues tokens together. So, for example, if you have
#define triple(foo) foo##3
Then triple(x) will get you the identifier x3, triple(12) will get you the integer 123, and triple(.) will get you the float .3.
However, what you have is .##FuncName, where FuncName is ProcessPreCreate. This creates the single token .ProcessPreCreate, which is not a valid C++ token. If you had typed PolicyObject1.ProcessPreCreate directly instead of through a macro, it would be tokenized into three tokens: PolicyObject1, ., and ProcessPreCreate. This is what your macro needs to produce in order to give valid C++ output.
To do that, simply get rid of the ##. It is unnecessary to glue the . to the FuncName, because they are separate tokens. To check this, you can put a space between a . and a member name; it will still compile just fine. Since they are separate tokens, they should not and cannot be glued together.
Delete the "##":
#define DELEGATE_FUNC(FuncName, kind, paramPtr) \
    if (kind == 1) { \
        return PolicyObject1.FuncName(paramPtr); \
    } \
    else { \
        return PolicyObject2.FuncName(paramPtr); \
    } \
    return 0;
I have already written something like this:
Check privileges for PDTA
${end}= Get Matching Xpath Count //*[@id="listForm:displayDataTable:tbody"]/tr
${start}= Set Variable 0
: FOR ${index} IN RANGE ${start} ${end}
\ ${status}= Run Keyword And Ignore Error Element Should Contain listForm:displayDataTable:${index} su ${index}
And the log output is:
As you can see, I want to get the number of the row in which the value 'su' is found. In this case the value is in row number 6. The variable ${end} equals the total number of rows in the table.
Does anyone know how to get that number? Maybe there is a keyword that could help me? Thanks in advance!
: FOR    ${index}    IN RANGE    ${start}    ${end}
\    ${Name}=    Get Text    listForm:displayDataTable:${index}
\    ${IsEqual}=    Run Keyword And Return Status    Should Be Equal    ${Name}    Su
\    ${RowNumber}=    Set Variable    ${index}
\    Run Keyword If    '${IsEqual}'=='True'    Run Keywords    Log    Rownumber is ${RowNumber}    AND    Exit For Loop
You can try this.
After the loop exits, the variable ${RowNumber} holds the number of the row that contains the text "Su".
As part of your FOR loop, I would add:
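Run Keyword If    '${status}' == 'PASS'    Log    ${index}
Note that Run Keyword And Ignore Error returns two values, the status and the message, so assign them to two variables (for example ${status} and ${message}) for this comparison to work. If you need to actually use the index, simply set a variable or append it to a list variable.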
I have a script which includes the following step as the first step in a series of filters of genomics data:
--option ~/folder$1/file$1 --option2 ~/folder$1/file$1 --indv NA12775 --options...
The script already uses a for-loop to go through folder/file indices 1-22. The option --indv takes a string, which is an identifier. I have a separate list file which is just a column of "indv" identifiers:
NA06984
NA06986
NA06989
NA06994
NA07000
I have many such lists and I am looking for a solution to automatically take a single identifier from my list file, run the filtering script for "indv X" and then take the next consecutive identifier and repeat. Something like "for line in ID-list, run filter-script"...
You can use xargs for this:
xargs -I {} ./myprogram --indv {} < indvlist.txt
A couple bash methods for doing this:
for indv in $(<list-of-indv-values.txt)
do
something something .... ${indv} .....
done
or
while read indv
do
something something ... ${indv} .....
done < list-of-indv-values.txt
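A slightly more defensive variant of the while-read loop, as a sketch. The function name for_each_id is made up for illustration, and it assumes the filter command accepts the identifier as "--indv ID" appended to its other arguments.

```shell
# Hypothetical wrapper (not from the original post).
# usage: for_each_id <list_file> <command> [args...]
# Runs "<command> [args...] --indv <id>" once per non-empty line of <list_file>.
for_each_id() {
    list_file=$1
    shift
    # IFS= and -r keep each line intact; the "|| [ -n ... ]" part also
    # handles a last line that has no trailing newline.
    while IFS= read -r indv || [ -n "$indv" ]; do
        if [ -z "$indv" ]; then
            continue              # skip blank lines
        fi
        # stdin is redirected so the command cannot swallow the ID list
        "$@" --indv "$indv" < /dev/null
    done < "$list_file"
}
```

For example, for_each_id indvlist.txt ./myprogram would call ./myprogram --indv NA06984, then ./myprogram --indv NA06986, and so on.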
I would like to retrieve the coding amino-acid when there is certain pattern in a DNA sequence. For example, the pattern could be: ATAGTA. So, when having:
Input file:
>sequence1
ATGGCGCATAGTAATGC
>sequence2
ATGATAGTAATGCGCGC
The ideal output would be a table giving, for each amino acid, the number of times it is coded by the pattern. Here, in sequence1 the pattern codes for only one amino acid, but in sequence2 it codes for two. I would like this tool to scale to thousands of sequences. I have been thinking about how to get this done, but the only approach I came up with is to replace all nucleotides outside the pattern, translate what remains, and summarize the coded amino acids.
Please let me know if this task can be performed by an already available tool.
Thanks for your help. All the best, Bernardo
Edit (due to the confusion generated with my post):
Please forget the original post and sequence1 and sequence2 too.
Hi all, and sorry for the confusion. The input fasta file is a *.ffn file derived from a GenBank file using the 'FeatureExtract' tool (http://www.cbs.dtu.dk/services/FeatureExtract/download.php), so I imagine the sequences are already in frame (+1) and there is no need to get amino acids coded in any frame other than +1.
I would like to know which amino acids the following sequences code for:
AGAGAG
GAGAGA
CTCTCT
TCTCTC
The only strings for which I want the coded amino acids are three repeats of AG, GA, CT or TC, that is (AG)3, (GA)3, (CT)3 and (TC)3, respectively. I don't want the program to retrieve coded amino acids for repeats of four or more.
Thanks again, Bernardo
Here's some code that should at least get you started. For example, you can run like:
./retrieve_coding_aa.pl file.fa ATAGTA
Contents of retrieve_coding_aa.pl:
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::CodonTable;

my $pattern = $ARGV[1];
my $fasta   = Bio::SeqIO->new( -file => $ARGV[0], -format => 'fasta' );
my $table   = Bio::Tools::CodonTable->new();

while ( my $seq = $fasta->next_seq ) {
    my $pos = 0;
    my %counts;
    for my $piece ( split /($pattern)/ => $seq->seq ) {
        # Remember the untrimmed length so $pos stays correct
        # when a sequence contains more than one match.
        my $len = length $piece;
        if ( $piece eq $pattern ) {
            # Trim the match down to the whole codons it covers in frame +1.
            my $dist = $pos % 3;
            unless ( $dist == 0 ) {
                my $num = 3 - $dist;
                $piece =~ s/.{$num}//;
                chop $piece until length($piece) % 3 == 0;
            }
            $counts{$_}++ for split //, $table->translate($piece);
        }
        $pos += $len;
    }
    print $seq->display_id() . ":\n";
    print "$_ => $counts{$_}\n"
        for sort { $counts{$a} <=> $counts{$b} } keys %counts;
    print "\n";
}
Here are the results using the sample input:
sequence1:
S => 1
sequence2:
V => 1
I => 1
The Bio::Tools::CodonTable class also supports non-standard codon usage tables. You can change the table using the id pointer. For example:
$table = Bio::Tools::CodonTable->new( -id => 5 );
or:
$table->id(5);
For more information, including how to examine these tables, please see the documentation here: http://metacpan.org/pod/Bio::Tools::CodonTable
I will stick to the first version of what you wanted, because the addendum only confused me even more. (Frame?)
I only found ATAGTA once in sequence2, but I assume you want the reversed sequence as well, which would be ATGATA in this case. My script doesn't do that, so you would have to list both in the input sequences file, but that should be no problem.
I work with a file like yours, which I call "dna.txt", and an input sequences file called "input_seq.txt". The result file lists each pattern and its number of occurrences in dna.txt (overlapping matches are counted, but this can be changed to non-overlapping as explained in the awk comments).
input_seq.txt:
GC
ATA
ATAGTA
ATGATA
dna.txt:
>sequence1
ATGGCGCATAGTAATGC
>sequence2
ATGATAGTAATGCGCGC
results.txt:
GC,6
ATA,2
ATAGTA,2
ATGATA,1
The code is one awk script calling another (one of them is simple). Run "./match_patterns.awk input_seq.txt" to generate the results file.
match_patterns.awk:
#! /bin/awk -f
{return_value= system("awk -vsubval="$1" -f test.awk dna.txt")}
test.awk:
#! /bin/awk -f
{string=$0
do
{
where = match(string, subval)
# code is for overlapping matches (i.e ATA matches twice in ATATAC)
# for non-overlapping replace +1 by +RLENGTH in following line
if (RSTART!=0){count++; string=substr(string,RSTART+1)}
}
while (RSTART != 0)
}
END{print subval","count >> "results.txt"}
Files have to be all in the same directory.
Good luck!
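The overlapping count can also be sketched as a single shell function wrapping awk, which avoids the second script and the shared results file. The function name count_overlapping is made up for illustration. Unlike test.awk, this sketch uses index(), so the pattern is treated as a literal string rather than a regular expression, which is an assumption that holds for plain nucleotide patterns.

```shell
# Hypothetical helper (not from the original post).
# usage: count_overlapping <pattern> <file>
# Prints "<pattern>,<count>", counting overlapping literal matches
# across all lines of <file>.
count_overlapping() {
    pattern=$1
    file=$2
    awk -v pat="$pattern" '
        {
            s = $0
            # after each hit, restart one character past its start,
            # so ATA is counted twice in ATATAC
            while ((i = index(s, pat)) > 0) {
                n++
                s = substr(s, i + 1)
            }
        }
        END { print pat "," (n + 0) }
    ' "$file"
}
```

For non-overlapping matches, advance by the pattern length instead: s = substr(s, i + length(pat)).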
I have a list of variables whose names all contain the string "test". How do I rename all of these variables to, for example, var1-var20, where 20 is the number of variables? The order is not important here. I tried installing the package "renvars" and did the following:
renvars *test* \ var1-var20
but this does not work. Any help is appreciated.
If you're using Stata 12, I think you should be able to just do:
rename (*test*) var#, addnumber
Check out this link (in particular Rule #18): http://www.stata.com/help.cgi?rename+group
To be of any more help, we'll need the error message and a description of how it fails. *test* should be a valid varlist, and if both varlists (left and right of \) contain the same number of variables, then it should work.
The following works for me.
* generate some variables that fit the description
clear
local i = 0
foreach pre in ho ak {
forvalues j = 1/10 {
local ++i
generate `pre'_icd`i' = ""
}
}
* rename variables that match pattern
renvars *icd* \ var1-var20
Maybe more variables match *icd* than you expect?