Why map is creating NULL in C++? - c++

I have a 2D vector of string as
vector<vector<string>> database
it has following data
database[0] -> A C D (database[0][0] -> A , database[0][1] -> C and so on)
database[1] -> B C E F
database[2] -> A B C E F
database[3] -> B E
database[4] -> A C F
I am counting the occurrence of each string (in this example each character A,B etc.) and saving it in a map map_c as
map<string,int> map_c ;
for(i=0 ; i<database.size() ; i++)
{
    for(j=0 ; j<database.at(i).size() ; j++)
    {
        if(map_c.find(database.at(i).at(j)) != map_c.end())
        {
            map_c[database[i][j]]++;
        }
        else
        {
            map_c[database[i][j]] = 1;
        }
    }
}
And printing the count of each string using the code
for(map<string,int>::iterator it = map_c.begin() ; it != map_c.end() ; it++ )
{
    cout << it->first << " -> " << it->second << endl;
}
Output ->
-> 1
A -> 3
B -> 3
C -> 4
D -> 1
E -> 3
F -> 3
Why has an empty (NULL) key been created with a count of 1?

The code you posted should not generate an empty (or whitespace-only) string by itself. Such a string most likely comes from your initial data.
Either way, your code to fill the map can be made much shorter:
map<string,int> map_c;
for (const auto& line : database)
    for (const auto& s : line)
        ++map_c[s];

Replace your if...else with just ++map_c[database[i][j]];. When a key is missing, operator[] inserts it with a value-initialized int, which is 0, so the first increment yields 1.
The unexpected entry in your output is because your database actually had that entry in it. If you are not sure why this entry is in your database, review the code that sets up your database (and/or post that code).
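For comparison, the same default-to-zero counting idea can be sketched in Python (used here only for illustration; the question itself is about C++). The `database` below copies the data from the question, and the extra `[""]` row is an assumption added to show how a stray empty string surfaces in the counts.

```python
from collections import defaultdict


def count_tokens(database):
    """Count every string in a list of lists; missing keys start at 0."""
    counts = defaultdict(int)  # int() == 0, like a value-initialized int in std::map
    for line in database:
        for s in line:
            counts[s] += 1
    return dict(counts)


# Data from the question; the trailing [""] is a hypothetical stray entry.
database = [
    ["A", "C", "D"],
    ["B", "C", "E", "F"],
    ["A", "B", "C", "E", "F"],
    ["B", "E"],
    ["A", "C", "F"],
    [""],
]

counts = count_tokens(database)
for key in sorted(counts):
    print(repr(key), "->", counts[key])
```

The empty string sorts first, which is why the unexpected entry appears at the top of the output in the question.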

Sorry, I finally got rid of the problem. The problem was in the input file: the last line of the file contained a single space character, and I had assumed the file ended before it. That space was causing the problem. I am sorry for the inconvenience, and thank you so much for your replies and time.
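How a trailing space turns into an empty key depends on how the file is parsed, which the question does not show. The Python sketch below assumes, purely for illustration, that each line is split on single spaces; `parse_lines` and the sample text are hypothetical.

```python
def parse_lines(text, skip_blank=False):
    """Split each line into tokens (an assumed parsing strategy)."""
    rows = []
    for line in text.split("\n"):
        if skip_blank:
            tokens = line.split()         # no argument: whitespace runs never yield empty tokens
            if tokens:
                rows.append(tokens)
        else:
            rows.append(line.split(" "))  # a line holding one space yields ['', '']
    return rows


sample = "A C D\nB E\n "                 # the last line contains only a space

naive = parse_lines(sample)
clean = parse_lines(sample, skip_blank=True)
print(naive)
print(clean)
```

Dropping blank lines and empty tokens at parse time keeps the counting loop itself unchanged.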

Syntax errors in main function - SML/NJ [deleting DO VAL, deleting VAL ID, SEMICOLON ID, deleting SEMICOLON END SEMICOLON]

Could someone point out why I am getting syntax errors in the main function, so that I can fix them? I am quite new to the language; I was introduced to it through this assignment, so I am lost as to how to refactor the code to avoid the syntax errors:
val IDs = [410021001,410021002,410021003,410021004,410021005,410021006,410021007,410021008,410021009,410021010];
val Names = ["Alan","Bob","Carrie","David","Ethan","Frank","Gary","Helen","Igor","Jeff"]: string list;
val HW1 = [90.0,85.0,90.0,117.0,85.0,90.0,117.0,117.0,117.0,117.0] : real list;
val HW2 = [84.5,49.0,110.5,85.0,56.0,65.0,65.0,59.5,50.0,50.0] : real list;
val HW3 = [117.0,117.0,117.0,0.0,65.0,117.0,50.0,51.0,75.0,75.0] : real list;
val Midterm = [60.0,57.0,6.0,44.0,72.0,43.0,54.0,75.0,53.0,75.0] : real list;
val Final = [66.0,64.0,62.0,55.0,66.0,75.0,75.0,75.0,75.0,75.0] : real list;
fun score(HW1, HW2, HW3, Midterm, Final) =
round(HW1 * 0.1 + HW2 * 0.1 + HW3 * 0.1 + Midterm * 0.3 + Final * 0.4);
fun letterGrade(score) =
if score >= 90 then "A+"
else if score >= 85 then "A"
else if score >= 80 then "A-"
else if score >= 77 then "B+"
else if score >= 73 then "B"
else if score >= 70 then "B-"
else if score >= 67 then "C+"
else if score >= 63 then "C"
else if score >= 60 then "C-"
else if score >= 50 then "D"
else "E";
val i = 0
val max = length(IDs)
fun main() =
while i < max do
var ind_score = score(HW1[i], HW2[i], HW3[i], Midterm[i], Final[i])
var grade = letterGrade(ind_score)
print(IDs[i], " ", Names[i], " ", ind_score, " ", grade)
i = i + 1
end
end
This is the error I am producing after running my programme, which shows that my errors start at this function:
[Image: terminal feedback]
Part 1 - Straightforward corrections
I'll go from the simplest to the most complex. I'll also provide a more functional implementation in the end, without the while loop.
The construct var does not exist in ML. You probably meant val ind_score = ...
Array indexing is not done by array[i]. You need (as with everything else) a function to do that. The function happens to be List.nth. So, everywhere you have HW1[i], you should have List.nth(HW1, i).
Most language constructs expect a single expression, so you usually cannot simply string commands as we do in imperative languages. Thus, there are some constructs missing after the do in your while.
Variables in functional languages are usually immutable by default, so you have to indicate when you want something to be mutable. In your while, you want i to be mutable, so it has to be declared and used as such: val i = ref 0. When using the value, you have to use the syntax !i to get the 'current' value of the variable (essentially, de-referencing it).
The function call syntax in ML does not use (). When you call a function like score(a, b, c, d) what you are doing is creating a tuple (a, b, c, d) and passing it as a single argument to the function score. This is an important distinction because you are actually passing a tuple to your print function, which does not work because print expects a single argument of type string. By the way, the string concatenation operator is ^.
If you do all these changes, you'll get to the following definition of main. It is quite ugly but we will fix that soon:
val i = ref 0 (* Note that i's type is "int ref". Think of it as a pointer to an integer *)
val max = length(IDs)
fun main() =
  while !i < max do (* Notice how annoying it is to always de-reference i and that the syntax to access a list element is not very convenient *)
    let (* The 'let in end' block is the way to define values that will be used later *)
      val ind_score = score(List.nth(HW1, !i), List.nth(HW2, !i), List.nth(HW3, !i), List.nth(Midterm, !i), List.nth(Final, !i))
      val grade = letterGrade(ind_score)
    in ( (* The parentheses here allow stringing together a series of "imperative" operations *)
      print(Int.toString(List.nth(IDs, !i)) ^ " " ^ List.nth(Names, !i) ^ " " ^ Int.toString(ind_score) ^ " " ^ grade ^ "\n");
      i := !i + 1 (* Note the syntax := to re-define the value of i *)
    )
    end;
Part 2 - Making it more functional
Functional language programs are typically structured differently from imperative programs. A lot of small functions, pattern matching and recursion are typical. The code below is an example of how you could improve your main function (it is by no means "optimal" in terms of style though). A clear advantage of this implementation is that you do not even need to worry about the length of the lists. All you need to know is what to do when they are empty and when they are not.
(* First, define how to print the information of a single student.
   Note that the function has several arguments, not a single argument that is a tuple *)
fun printStudent id name hw1 hw2 hw3 midterm final =
  let
    val ind_score = score (hw1, hw2, hw3, midterm, final)
    val grade = letterGrade ind_score
  in
    print(Int.toString(id) ^ " " ^ name ^ " " ^ Int.toString(ind_score) ^ " " ^ grade ^ "\n")
  end;
(* This uses pattern matching to disassemble the lists and print each element in order.
   The first clause matches an empty list in the first position (the others don't matter) and returns (). Think of () as None in Python.
   The second clause splits each list into its first element and the rest of the list (first::rest), prints the info about the student and recurses on the rest of the lists.
*)
fun printAllStudents (nil, _, _, _, _, _, _) = ()
  | printAllStudents (id::ids, name::names, hw1::hw1s, hw2::hw2s, hw3::hw3s, mid::midterms, final::finals) =
      (printStudent id name hw1 hw2 hw3 mid final;
       printAllStudents(ids, names, hw1s, hw2s, hw3s, midterms, finals));
printAllStudents(IDs, Names, HW1, HW2, HW3, Midterm, Final);
Note that it is a bit of a stretch to say that this implementation is more legible than the first one, even though it is slightly more generic. There is a way of improving it significantly though.
Part 3 - Using records
You may have noticed that there is a lot of repetition on the code above because we keep having to pass several lists and arguments. Also, if a new homework or test was added, several functions would have to be reworked. A way to avoid this is to use records, which work similarly to structs in C. The code below is a refactoring of the original code using a Student record. Note that, even though it has a slightly larger number of lines than your original code, it is (arguably) easier to understand and easier to update, if needed. The important part about records is that to access a field named field, you use an accessor function called #field:
(* Create a record type representing a student *)
type Student = {id:int, name:string, hw1:real, hw2:real, hw3:real, midterm:real, final:real};
(* Convenience function to construct a list of records from the individual lists of values *)
fun makeListStudents (nil, _, _, _, _, _, _) = nil (* if the input is empty, finish the list *)
| makeListStudents (id::ids, name::names, hw1::hw1s, hw2::hw2s, hw3::hw3s, mid::midterms, final::finals) = (* otherwise, add one record to the list and recurse *)
{id=id, name=name, hw1=hw1, hw2=hw2, hw3=hw3, midterm=mid, final=final} :: makeListStudents(ids, names, hw1s, hw2s, hw3s, midterms, finals);
val students = makeListStudents (IDs, Names, HW1, HW2, HW3, Midterm, Final);
fun score ({hw1, hw2, hw3, midterm, final, ...}: Student): int = (* Note the special pattern matching syntax *)
round(hw1 * 0.1 + hw2 * 0.1 + hw3 * 0.1 + midterm * 0.3 + final * 0.4);
fun letterGrade (score) =
if score >= 90 then "A+"
else if score >= 85 then "A"
else if score >= 80 then "A-"
else if score >= 77 then "B+"
else if score >= 73 then "B"
else if score >= 70 then "B-"
else if score >= 67 then "C+"
else if score >= 63 then "C"
else if score >= 60 then "C-"
else if score >= 50 then "D"
else "E";
(* Note how this function became more legible *)
fun printStudent (st: Student) =
let
val ind_score = score(st)
val grade = letterGrade(ind_score)
in
print(Int.toString(#id(st)) ^ " " ^ #name(st) ^ " " ^ Int.toString(ind_score) ^ " " ^ grade ^ "\n")
end;
(* Note how, now that we have everything in a single list, we can use map *)
fun printAllStudents (students) = map printStudent students;
printAllStudents(students);
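For readers more familiar with Python, the record-based design above can be sketched as follows. This only illustrates the same structure and is not SML: the `Student` dataclass stands in for the record type, `zip` plays the role of makeListStudents, and `format_student` is a hypothetical helper. Only the first two students from the question are included to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    id: int
    name: str
    hw1: float
    hw2: float
    hw3: float
    midterm: float
    final: float


def score(st):
    """Weighted total, rounded to the nearest integer."""
    return round(st.hw1 * 0.1 + st.hw2 * 0.1 + st.hw3 * 0.1
                 + st.midterm * 0.3 + st.final * 0.4)


def letter_grade(points):
    """Map a numeric score to a letter using the thresholds from the question."""
    thresholds = [(90, "A+"), (85, "A"), (80, "A-"), (77, "B+"), (73, "B"),
                  (70, "B-"), (67, "C+"), (63, "C"), (60, "C-"), (50, "D")]
    for minimum, grade in thresholds:
        if points >= minimum:
            return grade
    return "E"


def format_student(st):
    points = score(st)
    return f"{st.id} {st.name} {points} {letter_grade(points)}"


# First two entries of each list from the question.
ids = [410021001, 410021002]
names = ["Alan", "Bob"]
hw1 = [90.0, 85.0]
hw2 = [84.5, 49.0]
hw3 = [117.0, 117.0]
midterm = [60.0, 57.0]
final = [66.0, 64.0]

# zip walks the parallel lists together, one record per student.
students = [Student(*fields) for fields in zip(ids, names, hw1, hw2, hw3, midterm, final)]
lines = [format_student(st) for st in students]
for line in lines:
    print(line)
```

As with the record version, adding a new homework only means adding a field and touching `score`.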

Is there type casting in VDM++?

For example, I want to cast nat to seq of char in VDM++.
In the code below, I want operation setValueOfX to return "X is {value of q}", if q is of type nat and q < 3.
class A1
instance variables
private x : nat := 0;
operations
public setValueOfX : nat ==> seq of char
setValueOfX(q) ==
(
if is_nat(q) and q < 3
then
(
x := q;
-- The following line doesn't work
-- return "X is " ^ q;
)
else
return "Invalid value! Value of x must be more than 0 and less than 3.";
);
end A1
I've tried using ^ but I got the following error:
Error[207] : Rhs of '^' is not a sequence type
act : nat
exp : ( seq1 of # | [] )
No, you can't implicitly convert values like that in VDM.
If you just want to see the result in a textual form, you could look at the IO library, which lets you "print" a mixture of types to the console. Alternatively, if you must return a string, you could specify a function that converts a nat to a decimal string, and then return "Result is" ^ nat2str(q).
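The digit-by-digit logic that such a conversion function needs can be sketched in Python. This shows only the algorithm (repeated division by 10) and is not VDM++ syntax; `nat2str` is the name used in the answer above, while `set_value_message` is a hypothetical stand-in for the operation in the question.

```python
DIGITS = "0123456789"


def nat2str(n):
    """Convert a natural number to its decimal string, one digit at a time."""
    if n < 0:
        raise ValueError("nat2str expects a natural number")
    if n < 10:
        return DIGITS[n]
    # Convert everything but the last digit, then append the last digit.
    return nat2str(n // 10) + DIGITS[n % 10]


def set_value_message(q):
    """Mimic the string the question wants back for a valid value."""
    if 0 <= q < 3:
        return "X is " + nat2str(q)
    return "Invalid value!"


print(nat2str(2019))
print(set_value_message(2))
```

The recursive shape carries over naturally to a VDM function definition, since it needs only integer division, remainder and sequence concatenation.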

C++ to VBA (Excel)

So, basically, in Excel, I have 4 columns of data (all with strings) that I want to process, and want to have the results in another column, like this (nevermind the square brackets, they just represent cells):
Line Column1 Column2 Column3 Column4 Result
1: [a] [b] [k] [YES] [NO]
2: [a] [c] [l] [YES] [NO]
3: [b] [e] [] [YES] [NO]
4: [c] [e] [f] [NO] [NO]
5: [d] [h] [b] [NO] [NO]
6: [d] [] [w] [NO] [NO]
7: [e] [] [] [YES] [NO]
8: [j] [m] [] [YES] [YES]
9: [j] [] [] [YES] [YES]
10: [] [] [] [YES] [YES]
The process that I want the data to go through is this:
Assume that CheckingLine is the Line for which I currently want to calculate the value of Result, and that CurrentLine is any Line (except CheckingLine) that I am using to calculate the value of Result, at a given moment.
1. If Column4[CheckingLine] is "NO", Result is "NO" (simple enough, no help needed);
   Example: CheckingLine = 4 -> Column4[4] = "NO" -> Result = "NO";
2. Else, I want to make sure that all Lines that share a common value with CheckingLine (in any Column between 1 and 3) also have Column4 as "YES" (doing that would be simple enough even without VBA; in fact, I started by doing it in plain Excel and realised that it wasn't what I wanted). If that happens, Result is "YES";
   Example: CheckingLine = 8 -> Only shared value is "j" -> CurrentLine = 9 -> Column4[9] = "YES" -> Result = "YES";
3. Here's the tricky part: If one of those lines has any value (again, in any Column between 1 and 3) that IS NOT shared with CheckingLine, I want to do the whole process (restart at 1.), but checking the CurrentLine instead.
   Example: CheckingLine = 2, "a" is shared with Line 1, "c" is shared with Line 4 -> CurrentLine = 1 -> Column4[1] = "YES", but "b" and "k" are not shared with CheckingLine -> CheckingLine' = 1 -> "b" is shared with Line 5 -> Column4[5] = "NO" -> Result = "NO";
I have written the corresponding C++ code, which works. It could have been in any other language; C++ was just the one I was using at the moment. The code HAS NOT been optimized in any way, because its purpose was to be AS CLEAR about its functionality AS POSSIBLE. The table above is the actual result of running it:
#include <iostream>
#include <string>
#include <vector>
std::vector<std::string> column1, column2, column3, column4, contentVector;
unsigned int location, columnsSize;
void InsertInVector(std::string Content)
{
if(Content == "")
{
return;
}
for(unsigned int i = 0; i < contentVector.size(); i++)
{
if(contentVector[i] == Content)
{
return;
}
}
contentVector.push_back(Content);
}
std::string VerifyCurrentVector(unsigned int Start)
{
std::string result = "";
if(contentVector.size() == 0)
{
result = "YES";
}
else
{
unsigned int nextStart = contentVector.size();
for(unsigned int i = 0; i < columnsSize; i++)
{
if(i != location)
{
for(unsigned int j = Start; j < nextStart; j++)
{
if(column1[i] == contentVector[j])
{
InsertInVector(column2[i]);
InsertInVector(column3[i]);
}
else if(column2[i] == contentVector[j])
{
InsertInVector(column1[i]);
InsertInVector(column3[i]);
}
else if(column3[i] == contentVector[j])
{
InsertInVector(column1[i]);
InsertInVector(column2[i]);
}
}
}
}
if(nextStart == contentVector.size())
{
for(unsigned int i = 0; i < columnsSize; i++)
{
if(i != location)
{
for(unsigned int j = 0; j < nextStart; j++)
{
if(column1[i] == contentVector[j] || column2[i] ==
contentVector[j] || column3[i] == contentVector[j])
{
if(column4[i] == "NO")
{
result = "NO";
return result;
}
}
}
}
}
result = "YES";
}
else
{
result = VerifyCurrentVector(nextStart);
}
}
return result;
}
std::string VerifyCell(unsigned int Location)
{
std::string result = "";
location = Location - 1;
if(column4.size() < Location)
{
result = "Error";
}
else if(column4[location] == "NO")
{
result = "NO";
}
else
{
contentVector.clear();
InsertInVector(column1[location]);
InsertInVector(column2[location]);
InsertInVector(column3[location]);
result = VerifyCurrentVector(0);
}
return result;
}
void SetUpColumns(std::vector<std::string> &Column1, std::vector<std::string> &Column2,
std::vector<std::string> &Column3, std::vector<std::string> &Column4)
{
if(Column4.size() > Column1.size())
{
for(unsigned int i = Column1.size(); i < Column4.size(); i++)
{
Column1.push_back("");
}
}
if(Column4.size() > Column2.size())
{
for(unsigned int i = Column2.size(); i < Column4.size(); i++)
{
Column2.push_back("");
}
}
if(Column4.size() > Column3.size())
{
for(unsigned int i = Column3.size(); i < Column4.size(); i++)
{
Column3.push_back("");
}
}
column1 = Column1;
column2 = Column2;
column3 = Column3;
column4 = Column4;
columnsSize = Column4.size();
}
int main()
{
std::vector<std::string> Column1, Column2, Column3, Column4;
Column1.push_back("a");
Column1.push_back("a");
Column1.push_back("b");
Column1.push_back("c");
Column1.push_back("d");
Column1.push_back("d");
Column1.push_back("e");
Column1.push_back("j");
Column1.push_back("j");
Column2.push_back("b");
Column2.push_back("c");
Column2.push_back("e");
Column2.push_back("e");
Column2.push_back("h");
Column2.push_back("");
Column2.push_back("");
Column2.push_back("m");
Column3.push_back("k");
Column3.push_back("l");
Column3.push_back("");
Column3.push_back("f");
Column3.push_back("b");
Column3.push_back("w");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("NO");
Column4.push_back("NO");
Column4.push_back("NO");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
Column4.push_back("YES");
SetUpColumns(Column1, Column2, Column3, Column4);
std::cout << "Line\t" << "Column1\t" << "Column2\t" << "Column3\t" << "Column4\t" <<
std::endl;
for(unsigned int i = 0; i < Column4.size(); i++)
{
std::cout << i + 1 << ":\t" << "[" << column1[i] << "]\t[" << column2[i] <<
"]\t[" << column3[i] << "]\t[" << column4[i] << "]\t[" << VerifyCell(i + 1)
<< "]" << std::endl;
}
return 0;
}
So, after this lengthy explanation, what I want to know is this:
Is there any way to do this in Excel's VBA (or even better, in plain Excel without VBA)?
If not, how can I have my code (which I can easily translate to another C-like language and/or optimise) get the data from, and deliver the results to, Excel?
Is there any way to do this in Excel's VBA?
Yes, you can certainly do this with VBA; it is a complete and powerful programming language.
(or even better, in plain Excel without VBA)?
Nope. The calculation seems too complicated to fit with Excel formulae without any VBA code.
If not, how can I have my code (which I can easily translate to another C-like language and/or optimise) get the data from, and deliver the results to, Excel?
You can access Excel from C++ in many ways. Using ATL is one of them. Another, easier way would be to import/export your Excel file in CSV format, which is easy to parse and write from C++.
Also consider C#; it has complete COM interoperability for accessing Office components.
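To make the CSV route concrete, here is a minimal Python sketch (Python is used only for brevity; the same steps apply in C++). It assumes the sheet was exported as four comma-separated columns without a header row. The names `verify_row` and `add_results` are hypothetical, and the check is a compact restatement of the logic in the question's C++ code, not a line-by-line port.

```python
import csv
import io


def verify_row(rows, idx):
    """Return "YES" only if this row and every row linked to it by shared values is "YES"."""
    if rows[idx][3] == "NO":
        return "NO"
    seen = {v for v in rows[idx][:3] if v}       # values connected to this row so far
    changed = True
    while changed:                               # repeat until no new values are pulled in
        changed = False
        for i, row in enumerate(rows):
            if i == idx:
                continue
            cells = {v for v in row[:3] if v}    # empty cells never count as shared
            if cells & seen:
                if row[3] == "NO":
                    return "NO"
                if not cells <= seen:
                    seen |= cells
                    changed = True
    return "YES"


def add_results(csv_text):
    """Read four-column CSV text and return it with a fifth Result column appended."""
    rows = [(r + [""] * 4)[:4] for r in csv.reader(io.StringIO(csv_text)) if r]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for i, row in enumerate(rows):
        writer.writerow(row + [verify_row(rows, i)])
    return out.getvalue()


# The table from the question, as it might look after a CSV export.
SAMPLE = """a,b,k,YES
a,c,l,YES
b,e,,YES
c,e,f,NO
d,h,b,NO
d,,w,NO
e,,,YES
j,m,,YES
j,,,YES
,,,YES
"""

result = add_results(SAMPLE)
print(result, end="")
```

The resulting text can be saved to a file and opened in Excel again; reading from and writing to real files instead of strings only changes the two `io.StringIO` uses.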
OK, if you "whipped the code in a rush", then you'll love VBA. Next time, please try to ask a more specific question. Based on your code and comments, @MikeAscended, you're a relatively good programmer with a grasp of functions/recursion, variables/parameters, conditions, loops, data structures, etc. Re: "I have only touched VBA once in my life and ran away from it": my intent is to get you started and show you the syntax here, not necessarily to give you a working solution. I'm happy to answer any further specific questions you may have.
Strategy-wise,
I recommend plain VBA which is easy to use in Excel. Obviously your problem can be solved in many ways including formulas, however VBA is a powerful tool that any programmer will benefit from using.
Code-wise,
To start, open the editor from Excel by pressing [Alt-F11], or insert an ActiveX button in Design Mode and double-click it. To run a macro, press [Alt-F8], or click the green play button in the VBA editor.
One last note: if you want those line numbers in column 1 in Excel, then your data columns will shift one to the right (B onward). I'm assuming you'll use Excel's own row numbers, so that Column 1 is A; row 1 will still hold the titles, so your data starts on row 2.
Sub processResults_Col5()
    ' Run this script as Main()
    Dim rowCount As Long, i As Long   ' rowCount = columnsSize
    With Sheets(1)
        .Range("A2:D2") = Array("a", "b", "k", "YES")   ' row 1 holds the titles
        ' finish init here
        ' SetUpColumns is not necessary in Excel
        If .Cells(2, 1).Value <> "" Then   ' do not use .End(xlDown) if data is missing
            rowCount = .Cells(1, 1).End(xlDown).Row
            For i = 2 To rowCount   ' data starts on row 2
                .Cells(i, 5) = verifyCell(i, rowCount)
            Next i
        End If
    End With
End Sub

Function verifyCell(rowLocation As Long, size As Long, Optional wSh As Excel.Worksheet) As String
    ' the rest should be easy for you to figure out based on the C++ code
    If wSh Is Nothing Then Set wSh = ActiveSheet   ' let VBA capitalize names so you know you typed them correctly
    With wSh
        If size < rowLocation Then
            verifyCell = "Error"   ' assigning to the function name sets the return value
            'MsgBox "Error"   ' you can uncomment this line to see the error
        ElseIf .Cells(rowLocation, 4).Value = "NO" Then
            verifyCell = "NO"   ' set result
        Else
            Call InsertInVector(rowLocation)   ' CheckingLine
            ' edit the current rowLocation with For loops
            verifyCell = VerifyCurrentVector(0)   ' whatever you're doing here
        End If
    End With
End Function

Sub InsertInVector(rowLocation As Long)
    ' stub: fill in based on the C++ code
End Sub

Function VerifyCurrentVector(startIndex As Long) As String   ' a function returns a value
    ' stub: fill in based on the C++ code
End Function
Some tips:
Generally, Comment Your Code!
Generally, The first word/acronym of Variable and Object names should start in lowercase, then continue in camel-case. This helps distinguish them from library types.
In VBA always put [option explicit] in the beginning of every sheet/module, this requires you to [dim varName as Type] which will help debugging and make your code more explicit so it's easy to understand.
In VBA for numbers use type Long, learn early vs late-binding. If you're instantiating any object that requires a reference/library, always state it explicitly. This includes Excel.Worksheet, Excel.Workbook, etc. (eg. you may want your code in MS Access)
In Office, one of the first settings you're going to want to disable is the popup error window; also use Debug.Print and the Immediate window a few times.
Generally, as you know from C++, take your time and try to write correct code on your first try, as this will save you debugging time. Try not to rush, and keep coffee & healthy snacks on hand. Good luck and have fun :)

Extracting columns with a difference in aligned data

I have some aligned data (something bioinformatic related) as so:
reference_string = 'yearning'
string2 = 'learning'
string3 = 'aligning'
I need to extract only columns showing differences in relation to the reference data.
The output should show only positional information of the columns containing differences in relation to the reference string and the corresponding reference item.
1 2 3 4
y e a r
l
a l i g
My current code does most things okay except that it also reports columns with no difference.
string1 = 'yearning'
string2 = 'learning'
string3 = 'aligning'
string_list = [string1, string2]
reference = reference_string
diffs_top, diffs = [], []
all_diffs = set()
for s in string_list:
    diffs = []
    for i, c in enumerate(s):
        if s[i] != reference[i]:
            diffs.append(i)
            all_diffs.add(i)
    diffs_top.append(diffs)
for d in all_diffs:
    print str(int(d+1)),
print
for c in reference:
    print str(c),
print
for i, s in enumerate(string_list):
    for j, c in enumerate(s):
        if j in diffs_top[i]:
            print str(c),
        else:
            print str(' '),
    print
This code would give:
1 2 3 4
y e a r n i n g
l
a l i g
Any help appreciated.
EDIT: I have picked a section of real data to make the problem as clear as possible, along with my attempt at solving it thus far:
reference_string = 'MAHEWGPQRLAGGQPQAS'
string1 = 'MAQQWSLQRLAGRHPQDS'
string2 = 'MAQRWGAHRLTGGQLQDT'
string3 = 'MAQRWGPHALSGVQAQDA'
string_list = [string1, string2, string3]
reference = reference_string
diffs_top, diffs = [], []
all_diffs = set()
for s in string_list:
    diffs = []
    for i, c in enumerate(s):
        if s[i] != reference[i]:
            diffs.append(i)
            all_diffs.add(i)
    diffs_top.append(diffs)
#print diffs_top
#print all_diffs
for d in all_diffs:
    print str(int(d+1)), # retains natural positions of the reference residues
print
for d in all_diffs:
    for i, c in enumerate(reference):
        if i == d:
            print c,
print
The print out will be an output showing the position at which there is any difference to other non-reference strings and the corresponding reference letter.
3 4 6 7 8 9 11 13 14 15 17 18
H E G P Q R A G Q P A S
Then the next step is to write a code that will process non reference strings by printing out the difference with the reference (at that position). If there is no difference it will leave blank (' ').
Doing it manually the output will be:
3 4 6 7 8 9 11 13 14 15 17 18
H E G P Q R A G Q P A S
Q Q S L R H D
Q R A H T L D T
Q R H A S V A D A
My entire attempt to get to the solution above has been messy, to say the least:
reference_string = 'MAHEWGPQRLAGGQPQAS'
string1 = 'MAQQWSLQRLAGRHPQDS'
string2 = 'MAQRWGAHRLTGGQLQDT'
string3 = 'MAQRWGPHALSGVQAQDA'
string_list = [string1, string2, string3]
reference = reference_string
diffs_top, diffs = [], []
all_diffs = set()
for s in string_list:
    diffs = []
    for i, c in enumerate(s):
        if s[i] != reference[i]:
            diffs.append(i)
            all_diffs.add(i)
    diffs_top.append(diffs)
#print diffs_top
#print all_diffs
for d in all_diffs:
    print str(int(d+1)),
print
for d in all_diffs:
    for i, c in enumerate(reference):
        if i == d:
            print c,
print
# this is my attempt to look into non-reference strings
# to check for the difference with the reference, and print an output.
for d in all_diffs:
    for i, s in enumerate(string_list):
        for j, c in enumerate(s):
            if j == d:
                print c,
            else:
                print str(' '),
        print
Your code is working perfectly fine (as per your logic).
What is happening is that, for each string in string_list, the final loop prints only the characters whose positions are stored in diffs_top. Because string_list[0] is identical to the reference string, its entry in diffs_top is empty, so Python prints only blank spaces for that row.
1 2 3 4
y e a r n i n g #prints the reference string, because you've coded in that way
#prints blank as string_list[0] and reference string are the same
l
a l i g
The question here is how exactly do you define your difference for reference string.
Besides, I also found a more fundamental flaw in your implementation. If you run your code with string_list[1] as the reference string, you get this output:
1 2 3 4
l e a r n i n g
y
a l i g
Is this what you need? Please spend some time properly defining the difference for all cases, and then try to implement your code.
EDIT:
As per your updated requirements, replace the last block in your code with this:
for i, s in enumerate(string_list):
    for d in all_diffs:
        if d in diffs_top[i]:
            print s[d],
        else:
            print ' ',
    print
Cheers!
I think there is a general problem in your logic. If you need to extract only columns showing difference in relation to the reference data and string1 is the reference the output should be:
1 2 3 4
l
a l i g
So, 'yearning' shouldn't show any character because it has no difference to string1.
If you delete or comment out the following lines, you will get exactly what I expect is the right answer:
#for c in reference:
#    print str(c),
#print
Consider reviewing your logic if this solution is not what you actually want.
Update
Here is a shorter solution which solves your task:
from itertools import compress, izip_longest

def delta(reference, string):
    return ['' if a == b else b for a, b in izip_longest(reference, string)]

ref_string = 'MAHEWGPQRLAGGQPQAS'
strings = ['MAQQWSLQRLAGRHPQDS',
           'MAQRWGAHRLTGGQLQDT',
           'MAQRWGPHALSGVQAQDA']
delta_strings = [delta(ref_string, string) for string in strings]
selectors = [1 if any(tup) else 0 for tup in izip_longest(*delta_strings)]
indices = [str(i+1) for i in range(len(selectors))]
output_data = [indices, ref_string] + delta_strings
for line in output_data:
    print ''.join(x.rjust(3) for x in compress(line, selectors))
Explanation:
I defined a function delta(reference, string), which returns the delta between the string and the reference string. For example, delta("ABFF", "AECF") returns the list ['', 'E', 'C', ''].
The variable delta_strings holds the deltas between each string in the list strings and the reference string ref_string.
The variable selectors is a list containing only 1 and 0 values, where 0 marks the columns that should not be printed and 1 marks the columns that should.
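The code in this thread is Python 2 (print statements, izip_longest). A Python 3 sketch of the same column extraction might look like the following. It assumes all strings are already aligned to the same length as the reference; `diff_columns` and `render` are hypothetical names.

```python
def diff_columns(reference, strings):
    """Return (positions, reference letters, rows) for columns where any string differs."""
    columns = [i for i in range(len(reference))
               if any(s[i] != reference[i] for s in strings)]
    positions = [i + 1 for i in columns]                  # 1-based, as in the question
    ref_letters = "".join(reference[i] for i in columns)
    # Keep a character only where it differs from the reference; blank otherwise.
    rows = ["".join(s[i] if s[i] != reference[i] else " " for i in columns)
            for s in strings]
    return positions, ref_letters, rows


def render(reference, strings, width=3):
    """Format the result as right-aligned columns, one line per string."""
    positions, ref_letters, rows = diff_columns(reference, strings)
    lines = ["".join(str(p).rjust(width) for p in positions)]
    for text in [ref_letters] + rows:
        lines.append("".join(ch.rjust(width) for ch in text))
    return "\n".join(lines)


ref_string = "MAHEWGPQRLAGGQPQAS"
strings = ["MAQQWSLQRLAGRHPQDS",
           "MAQRWGAHRLTGGQLQDT",
           "MAQRWGPHALSGVQAQDA"]

positions, ref_letters, rows = diff_columns(ref_string, strings)
print(render(ref_string, strings))
```

Iterating over column indices in order also sidesteps any reliance on the iteration order of a set.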

Returning list in ANTLR for type checking, language java

I am working with ANTLR to support type checking, and I am stuck at one point. I will try to explain it with an example grammar. Suppose that I have the following:
@members {
private java.util.HashMap<String, String> mapping = new java.util.HashMap<String, String>();
}
var_dec
: type_specifiers d=dec_list? SEMICOLON
{
mapping.put($d.ids.get(0).toString(), $type_specifiers.type_name);
System.out.println("identext = " + $d.ids.get(0).toString() + " - " + $type_specifiers.type_name);
};
type_specifiers returns [String type_name]
: 'int' { $type_name = "int";}
| 'float' {$type_name = "float"; }
;
dec_list returns [List ids]
: ( a += ID brackets*) (COMMA ( a += ID brackets* ) )*
{$ids = $a;}
;
brackets : LBRACKET (ICONST | ID) RBRACKET;
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
LBRACKET : '[';
RBRACKET : ']';
In rule dec_list, you will see that I am returning List with ids. However, in var_dec when I try to put the first element of the list (I am using only get(0) just to see the return value from dec_list rule, I can iterate it later, that's not my point) into mapping I get a whole string like
[@4,6:6='a',<17>,1:6]
for an input
int a, b;
What I am trying to do is to get text of each ID, in this case a and b in the list of index 0 and 1, respectively.
Does anyone have any idea?
The += operator creates a List of Tokens, not just the text these Tokens match. You'll need to initialize the List in the @init{...} block of the rule and add the inner-text of the tokens yourself.
Also, you don't need to do this:
type_specifiers returns [String type_name]
: 'int' { $type_name = "int";}
| ...
;
simply access type_specifiers's text attribute from the rule you use it in and remove the returns statement, like this:
var_dec
: t=type_specifiers ... {System.out.println($t.text);}
;
type_specifiers
: 'int'
| ...
;
Try something like this:
grammar T;
var_dec
: type dec_list? ';'
{
System.out.println("type = " + $type.text);
System.out.println("ids = " + $dec_list.ids);
}
;
type
: Int
| Float
;
dec_list returns [List ids]
@init{$ids = new ArrayList();}
: a=ID {$ids.add($a.text);} (',' b=ID {$ids.add($b.text);})*
;
Int : 'int';
Float : 'float';
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
Space : ' ' {skip();};
which will print the following to the console:
type = int
ids = [a, b, foo]
If you run the following class:
import org.antlr.runtime.*;
public class Main {
public static void main(String[] args) throws Exception {
TLexer lexer = new TLexer(new ANTLRStringStream("int a, b, foo;"));
TParser parser = new TParser(new CommonTokenStream(lexer));
parser.var_dec();
}
}
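The difference between a list of tokens and a list of token texts can also be illustrated outside ANTLR. The Python sketch below uses a hypothetical `Token` class (it is not the ANTLR runtime API) whose string form imitates the output seen in the question, and then builds the id-to-type mapping from the token texts. The field values of the second token are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical stand-in for a lexer token; not the ANTLR runtime class."""
    index: int
    start: int
    stop: int
    text: str
    type: int
    line: int
    column: int

    def __str__(self):
        # Imitates the debug form printed in the question.
        return (f"[@{self.index},{self.start}:{self.stop}='{self.text}',"
                f"<{self.type}>,{self.line}:{self.column}]")


# What a += label collects: whole tokens, not strings.
tokens = [Token(4, 6, 6, "a", 17, 1, 6), Token(7, 9, 9, "b", 17, 1, 9)]

as_printed = str(tokens[0])          # what get(0).toString() showed in the question
ids = [tok.text for tok in tokens]   # what the type checker actually needs

# Map every declared id to its type, not only the first one.
mapping = {name: "int" for name in ids}
print(as_printed)
print(mapping)
```

In the grammar itself, the equivalent step is adding `$a.text` (rather than the token) to the list, as the dec_list rule in the answer does.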