How to use "PASCAL Annotation" files to train SVM with C++

I need to test my self-written person detector, which uses the HOG descriptor (Dalal's paper), on a scientific dataset (the INRIA person dataset). I have thousands of positive and negative images in my folder to train a Support Vector Machine (SVM). However, to label the images as positive (1.0) or negative (-1.0), I need to read the information from a text file that is provided with the dataset in a so-called "PASCAL Annotation" format.
My problem is that I don't know how to read this format efficiently. I'm using C++ and OpenCV. Does anyone know how to do this efficiently? Are there already code snippets for C++?
In the end I need a loop that goes through a file "Annotations.lst" where all the picture filenames are listed. The program loads the picture and the corresponding annotation file (picturename.txt) to see if this picture belongs to a positive or negative training data (or later when actually detecting: if the tested picture belongs to a detected person or not)
Thanks for your help!

Maybe it's not the best implementation, but it works fine; I hope it is still useful today! You will need a filesystem library to work with files (the code below assumes fs is an alias for a namespace such as boost::filesystem).
string line,value; //Line stores lines of the file and value stores characters of the line
int i=0; //Iterate through lines
int j=0; //Iterate through characters
int n=0; //Iterate through ,()-...
char v; //Stores variable value as a char to be able to make comparisions easily
vector <Rect> anotations; //Stores rectangles for each image
vector <int> bbValues; //Bounding box values (xmin,ymin,xmax,ymax)
fs::path anotationsFolder = "THE FOLDER PATH OF ANOTATIONS"; //Path of anotations folder
fs::path anotationsParsedFolder = "THE FOLDER PATH TO STORE PARSED ANOTATIONS"; //Path to store new anotations
fs::recursive_directory_iterator it(anotationsFolder); //Iterator over the files
fs::recursive_directory_iterator endit;
cout<<"Loading anotations from "<<anotationsFolder<<endl;
while((it != endit)) //Until end of folder
{
if((fs::is_regular_file(*it))) //Good practice
{
fs::path imagePath(it->path()); //Complete path of the image
cout<<"Reading anotations from"<<it->path().filename()<<endl;
ifstream inputFile; //Declare input file with image path
inputFile.open(imagePath.string().data(), std::ios_base::in);
i=0;
while (getline(inputFile,line)){ //Read lines one by one until end of file (testing eof() first would process a failed read)
if ((i>=17) && ((i-17)%7==0)){ //In line numbers 17,24,31,38... where the bounding box coordinates are
j=69;
v=line[j]; //Start from character num 69 corresponding to first value of Xmin
while (j<line.size()){ //Until end of line
if (v=='(' || v==',' || v==')' || v==' ' || v=='-'){ //if true, push back acumulated value unless previous value wasn't a symbol also
if (n==0){
bbValues.push_back(stoi(value)); //stoi converts string in to integer ("567"->567)
value.clear();
}
n++;
}
else{
value+=v; //Append new number
n=0;//Reset in order to know that a number has been read
}
j++;
v=line[j];//Read next character
}
Rect rect(bbValues[0],bbValues[1],bbValues[2]-bbValues[0],bbValues[3]-bbValues[1]); //Build a rectangle rect(xmin,ymin,xmax-xmin,ymax-ymin)
anotations.push_back(rect);
bbValues.clear();
}
i++;//Next line
}
inputFile.close();
cout<<"Writing..."<<endl;
//Save the anotations to a file
ofstream outputFile; //Declare file
fs::path outputPath(anotationsParsedFolder / it->path().filename());// Complete path of the file
outputFile.open(outputPath.string().data(), ios_base::trunc);
// Store anotations as x y width height
for (int i=0; i<anotations.size(); i++){
outputFile<<anotations[i].x<<" ";
outputFile<<anotations[i].y<<" ";
outputFile<<anotations[i].width<<" ";
outputFile<<anotations[i].height<<endl;
}
anotations.clear();
outputFile.close();
}
++it;//Next file in anotations folder
}
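The fixed character offset (69) and fixed line numbers above depend on every annotation file having exactly the same layout. A slightly more tolerant way to read one bounding-box line can be sketched as follows. This is only a sketch: it assumes the coordinate lines start with "Bounding box" and end with ": (xmin, ymin) - (xmax, ymax)", and the names Box and parseBoundingBox are made up for illustration, not part of the dataset tools or OpenCV.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper type for illustration (not an OpenCV type).
struct Box { int xmin, ymin, xmax, ymax; };

// Returns true and fills `box` if `line` looks like a bounding-box line.
// Assumption: the four coordinates follow the last ':' on the line.
bool parseBoundingBox(const std::string& line, Box& box) {
    // Only lines that start with "Bounding box" are considered.
    if (line.compare(0, 12, "Bounding box") != 0) return false;
    std::size_t colon = line.rfind(':');
    if (colon == std::string::npos) return false;
    // sscanf skips the whitespace and matches the literal punctuation.
    return std::sscanf(line.c_str() + colon + 1, " (%d, %d) - (%d, %d)",
                       &box.xmin, &box.ymin, &box.xmax, &box.ymax) == 4;
}
```

With such a helper, an image could be labelled positive (1.0) when at least one line of its annotation file parses successfully, and the rectangle would be built as Rect(xmin, ymin, xmax-xmin, ymax-ymin) exactly as in the code above.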

Related

C++: How to read multiple lines from file until a certain character, store them in array and move on to the next part of the file

I'm doing a hangman project for school, and one of the requirements is it needs to read the pictures of the hanging man from a text file. I have set up a text file with the '-' char which means the end of one picture and start of the next one.
I have this for loop set up to read the file until the delimiting character and store it in an array, but when testing I am getting incomplete pictures, cut off in certain places.
This is the code:
string s;
ifstream scenarijos("scenariji.txt");
for(int i = 0; i < 10; i++ ) {
getline(scenarijos, s, '-');
scenariji[i] = s;
}
For the record, scenariji is an array with type of string
And here is an example of the text file:
[image: example of the text file]
From your example input, it looks like '-' can be part of the input image (look at the "arms" of the hanged man). Unless you use some other, unique character to delimit the images, you won't be able to separate them.
If you know the dimensions of the images, you could read them without searching for the delimiter by reading a certain amount of bytes from the input file. Alternatively, you could define some more complex rules for image termination, e.g. when the '-' character is the only character in the line. For example:
ifstream scenarijos("scenariji.txt");
string scenariji[10];
for (int i = 0; i < 10; ++i) {
string& scenarij = scenariji[i];
while (scenarijos.good()) {
string s;
getline(scenarijos, s); // read line
if (!scenarijos.good() || s == "-")
break;
scenarij += s;
scenarij.push_back('\n'); // the trailing newline was removed by getline
}
}

Picking a random line from a text file

I need to write an 8 ball code that has eleven options to display and it needs to pull from a text file. I have it taking lines from the text file but sometimes it takes an empty line with no writing. And I need it to only take a line that has writing.
Here are that options it needs to draw from:
Yes, of course!
Without a doubt, yes.
You can count on it.
For sure!
Ask me later.
I'm not sure.
I can't tell you right now.
I'll tell you after my nap.
No way!
I don't think so.
Without a doubt, no.
The answer is clearly NO.
string line;
int random = 0;
int numOfLines = 0;
ifstream File("file.txt");
srand(time(0));
random = rand() % 50;
while (getline(File, line))
{
++numOfLines;
if (numOfLines == random)
{
cout << line;
}
}
}
IMHO, you need to either make the text lines all the same length, or use a database (table) of file positions.
Using File Positions
Minimally, create a std::vector<std::streampos>.
Next read the lines from the file, recording the file position of the beginning of the string:
std::vector<std::streampos> text_line_positions;
std::string text;
std::streampos file_position = 0;
while (std::getline(text_file, text))
{
text_line_positions.push_back(file_position);
// Read the start position of the next line.
file_position = text_file.tellg();
}
To read a line from a file, get the file position from the database, then seek to it.
std::string text_line;
std::streampos file_position = text_line_positions[5];
text_file.clear(); // Reset eofbit/failbit left over from reading to the end of the file.
text_file.seekg(file_position);
std::getline(text_file, text_line);
The expression, text_line_positions.size() will return the number of text lines in the file.
If File Fits In Memory
If the file fits in memory, you could use std::vector<std::string>:
std::string text_line;
std::vector<std::string> database;
while (getline(text_file, text_line))
{
database.push_back(text_line);
}
To print the 10th line from the file:
std::cout << "Line 10 from file: " << database[9] << std::endl;
The above techniques minimize the amount of reading from the file.

update records in file (c++)

I wanted to ask how can I append strings to the end of fixed number of lines (fixed position). I am trying and searching books and websites for my answer but I couldn't find what I am doing wrong.
My structure :
const int numberofdays=150 ;
const int numberofstudents=2;
struct students
{
char attendance[numberofdays]; int rollno;
char fullname[50],fathersname[50];
};
Creating a text file
ofstream datafile("data.txt", ios::out );
Then I take input from the user and save it to the file.
How I save my data to text files :
datafile <<attendance <<endl<< rollno <<endl<<
fullname <<endl<< fathersname <<endl ;
How it looks like in text files :
p // p for present - 1st line
1 // roll number
Monte Cristo // full name
Black Michael // Fathers name
a // a for absent - 5th line
2 // roll number
Johnson // full name
Nikolas // Fathers name
How I try to update the file. (updating attendance for everyday)
datafile.open("data.txt", ios::ate | ios::out | ios::in);
if (!datafile)
{
cerr <<"File couldn't be opened";
exit (1);
}
for (int i=1 ; i<=numberofstudents ; i++)
{
long int offset = ( (i-1) * sizeof(students) );
system("cls");
cout <<"\t\tPresent : p \n\t\t Absent : a"<<endl;
cout <<"\nRoll #"<<i<<" : ";
cin >> ch1;
if (ch1 != 'p')
ch1 = 'a';
datafile.seekp(offset);
datafile <<ch1;
datafile.seekg(0);
}
I just want to add (append) characters 'p' or 'a' to the first or fifth line, I tried every possible way but I am unable to do it.
What you are doing is fairly common, but as you say it is inefficient if the size of data grows. Two solutions are to have fixed size records and index files.
For fixed-size records, in the file write the exact bytes of your data structure rather than a variable length text. This would mean you don't have a text file any more, but a binary file. You can then calculate the position to seek to easily.
To create an index file, write two files at once: one is your variable-size record file, and in the other you write a binary value holding the offset of each record from the start of the data file. Since each index entry is a fixed size, you can seek to the entry, read it, then seek to that position in the data file. If the new record will fit, you can update it in place; otherwise fill the old record with blanks, put the updated record at the end of the data file, and update the index file to point to the new location. This is basically how early PC databases worked.
Fixed-size records are rather inflexible, and by the time you've implemented the index file system and tested it, nowadays you would probably use an in-process database instead.
I came up with my own (easy & inefficient) logic to copy every line (except the line I want to update) to the another file.
I made my text file to be created like this :
=p // p for present - 1st line
1 // roll number
Monte Cristo // full name
Black Michael // Fathers name
=a // a for absent - 5th line
2 // roll number
Johnson // full name
Nikolas // Fathers name
Then I made the following code to update 1st and 5th line :
ifstream datafile("data.txt", ios ::in);
ofstream tempfile("temp.txt" , ios ::out);
string data, ch1;
int i = 1; // Roll number shown in the prompt
while (getline(datafile,data))
{
if (data[0]=='=')
{
system("cls");
cout <<"\t\tPresent : p\n\t\tAbsent : a"<<endl;
cout <<"\nRoll #"<<i<<" : ";
cin >> ch1;
++i;
if (ch1 != "p")
ch1 = "a";
data=data+ch1; // Appending (updating) lines.
}
tempfile <<data <<endl; // If it was 1st or 5th line, it got updated
}
datafile.close(); tempfile.close();
remove("data.txt"); rename("temp.txt" , "data.txt");
But as you can see, this logic is inefficient. I will still wait for someone to inform me if I could somehow move my file pointer to exact location (1st and 5th line) and update them.
Cheers!

Logic for reading rows and columns from a text file (textparser) C++

I'm really stuck with this problem I'm having reading rows and columns from a text file. We're using text files that our prof gave us. I have the functionality running so that when the user inputs "numrows (file)", the number of rows in that file prints out.
However, every time I enter the text files, it's giving me 19 for both. The first text file only has 4 rows and the other one has 7. I know my logic is wrong, but I have no idea how to fix it.
Here's what I have for the numrows function:
int numrows(string line) {
ifstream ifs;
int i;
int row = 0;
int array [10] = {0};
while (ifs.good()) {
while (getline(ifs, line)) {
istringstream stream(line);
row = 0;
while(stream >>i) {
array[row] = i;
row++;
}
}
}
}
and here's the numcols:
int numcols(string line) {
int col = 0;
int i;
int arrayA[10] = {0};
ifstream ifs;
while (ifs.good()) {
istringstream streamA(line);
col = 0;
while (streamA >>i){
arrayA[col] = i;
col++;
}
}
}
edit: @chris yes, I wasn't sure what value to return as well. Here's my main:
int main() {
string fname, line;
ifstream ifs;
cout << "---- Enter a file name : ";
while (getline(cin, fname)) { // Ctrl-Z/D to quit!
// tries to open the file whose name is in string fname
ifs.open(fname.c_str());
if(fname.substr(0,8)=="numrows ") {
line.clear();
for (int i = 8; i<fname.length(); i++) {
line = line+fname[i];
}
cout << numrows (line) << endl;
ifs.close();
}
}
return 0;
}
This problem can be more easily solved by opening the text file as an ifstream, and then using its get() member function (std::istream::get) to process your input.
You can try for comparison against '\n' as the end of line character, and implement a pair of counters, one for columns on a line, the other for lines.
If you have variable length columns, you might want to store the values of (numColumns in a line) in a std::vector<int>, using myVector.push_back(numColumns) or similar.
Both links are to the cplusplus.com/reference section, which can provide a large amount of information about C++ and the STL.
Edited-in overview of possible workflow
You want one program, which will take a filename, and an 'operation', in this case "numrows" or "numcols". As such, your first steps are to find out the filename, and operation.
Your current implementation of this (in your question, after editing) won't work. Using cin should however be fine. Place this earlier in your main(), before opening a file.
Use substr like you have, or alternatively, search for a space character. Assume that the input after this is your filename, and the input in the first section is your operation. Store these values.
After this, try to open your file. If the file opens successfully, continue. If it won't open, then complain to the user for a bad input, and go back to the beginning, and ask again.
Once you have your file successfully open, check which type of calculation you want to run. Counting a number of rows is fairly easy - you can go through the file one character at a time, and count the number that are equal to '\n', the line-end character. Some files might use carriage-returns, line-feeds, etc - these have different characters, but are both a) unlikely to be what you have and b) easily looked up!
A number of columns is more complicated, because your rows might not all have the same number of columns. If your input is 1 25 21 abs 3k, do you want the value to be 5? If so, you can count the number of space characters on the line and add one. If instead, you want a value of 14 (each character and each space), then just count the characters based on the number of times you call get() before reaching a '\n' character. The use of a vector as explained below to store these values might be of interest.
Having calculated these two values (or value and set of values), you can output based on the value of your 'operation' variable. For example,
if (storedOperationName == "numcols") {
cout<< "The number of values in each column is " << numColsVal << endl;
}
If you have a vector of column values, you could output all of them, using
for (int pos = 0; pos < numColsVal.size(); pos++) {
cout<< numColsVal[pos] << " ";
}
Following all of this, you can return a value of 0 from your main(), or you can just end the program (C++ treats a missing return statement in main as a return of 0), or you can ask for another filename, and repeat until some other method is used to end the program.
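For the "count the values on each line" interpretation described above, a minimal sketch is shown below. It uses getline and a string stream rather than get(), which is a swapped-in technique; the function name columnsPerRow is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns, for each line of `in`, how many whitespace-separated values it has.
// The size of the returned vector is the number of rows.
std::vector<int> columnsPerRow(std::istream& in) {
    std::vector<int> counts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream tokens(line);
        std::string token;
        int n = 0;
        while (tokens >> token) ++n;  // count values of any kind, not only ints
        counts.push_back(n);
    }
    return counts;
}
```

Passing an opened std::ifstream gives the row count as counts.size() and the column count of row r as counts[r]; for the input line 1 25 21 abs 3k the count is 5.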
Further details
get() with no arguments returns the next character of an ifstream (as an int, so that end-of-file can be reported), using the example code format
std::ifstream myFileStream;
myFileStream.open("myFileName.txt");
int nextCharacter = myFileStream.get(); // You should, before this, implement a loop.
// A possible loop condition might include something like `while (myFileStream.good())`
// See the linked page on get()
if (nextCharacter == '\n')
{
// You have a line break here
}
You could use this type of structure, along with a pair of counters as described earlier, to count the number of characters on a line, and the number of lines before the EOF (end of file).
If you want to store the number of characters on a line, for each line, you could use
std::vector<int> charPerLine;
int numberOfCharactersOnThisLine = 0;
while (...)
{
numberOfCharactersOnThisLine = 0;
// Other parts of the loop here, including a numberOfCharactersOnThisLine++; statement
if (endOfLineCondition)
{
charPerLine.push_back(numberOfCharactersOnThisLine); // This stores the value in the vector
}
}
You should #include <vector> and either write std:: before each name, or put a using namespace std; statement near the top. People will advise against using namespaces like this, but it can be convenient (which is also a good reason to avoid it, sort of!)

Error copying and pasting data from a file to another

I am writing a code to merge multiple text files and output a single file.
There can be up to 22 input text files which contain 1400 lines each.
Each line has 8 bits of binary and the new line characters \n.
I am outputting a single file that has all 22 text files merged.
The problem is with my output file: after 1400 lines it appears that content from the previous file is still being placed into the output file (although the length of the previous file was 1400 lines). This extra content also begins to have an additional blank line between each row if opened by Microsoft Office or Sublime, however it is interpreted as a single line if opened by Notepad or Excel (a single cell in Excel).
Following is the picture of expected behaviour of the output file,
Here is a picture of abnormal behaviour. This starts when the first file finishes.
I know this data is from the first file still because the second file starts from 00000000
And here is the start of the second file,
And this abnormal behavior repeats every single time the files are switching.
My implementation to achieve this is as follows:
repeat:
if(user_input == 'y')
{
fstream data_out ("data.txt",fstream::out);
for(int i = 0; i<files_found; i++)
{
fstream data_in ((file_names[i].c_str()),fstream::in);
if(data_in.is_open())
{
data_in.seekg(0,data_in.end);
long size = data_in.tellg();
data_in.seekg(0,data_in.beg);
char * buffer = new char[size];
cout << size;
data_in.read(buffer,size);
data_out.write(buffer,size);
delete[] buffer;
}else
{
cout << "Unexpected error";
return 1;
}
data_in.close();
}
data_out.close();
}else if(user_input == 'n')
{
return 1;
}else
{
cout << "Input not recognised. Type y for Yes, and n for No";
cin >> user_input;
goto repeat;
}
Further information:
I have checked the size variable and it is as I expect, 14000.
8 bits, and a \ with n = 10 characters per line,
1400 rows x 10 = 14000.
Assuming reader of code to be experienced.
Sorry to bump this question, but I really like questions that are marked as answered. Joachim Pileborg's comment seems to have worked for you:
Also, instead of seeking and checking sizes and allocating memory, why not just do e.g. data_out << data_in.rdbuf();? This will copy the whole input file to the output. – Joachim Pileborg Jul 29 at 17:26
A reference http://www.cplusplus.com/reference/ios/ios/rdbuf/ and an example:
#include <fstream>
#include <string>
#include <vector>
int main(int argc, char** argv)
{
typedef std::vector<std::string> Filenames;
Filenames vecFilenames;
// Populate the list of file names
vecFilenames.push_back("Text1.txt");
vecFilenames.push_back("Text2.txt");
vecFilenames.push_back("Text3.txt");
// Merge the files into Output.txt
std::ofstream fpOutput("Output.txt");
for (Filenames::iterator it = vecFilenames.begin();
it != vecFilenames.end(); ++it)
{
std::ifstream fpInput(it->c_str());
fpOutput << fpInput.rdbuf();
fpInput.close();
}
fpOutput.close();
return 0;
}