Ifstream stuck on a word, creating an infinite loop - c++

I am writing code that gathers data from a txt file. To get to the next interesting number, I use a do-while loop. The first do-while loop works perfectly, but in the second one the ifstream myfile gets stuck on the word Pmax. I have no idea what the cause could be. =/
Here is the interesting part of the parser (I am not using XML even though it looks a bit like it):
ifstream myfile;
string comment;
const string filename = "data";
myfile.open(filename.c_str());
do{
myfile>>comment;
} while (comment != "</probScen>");
for (int i=0;i<numberScen;i++){
myfile>>comment;
double prov;
myfile>>prov;
probScen.push_back(prov);
}
do{
if(myfile.eof()){cout<<"EoF reached"<<endl;}
myfile>>comment;
} while (comment != "</Pmax>");
for (int i=0;i<H;i++){
myfile>>comment;
double prov;
myfile>>prov;
Pmax.push_back(prov);
}
And here is the part of the txt file I want to read:
<probScen> scenario s - happening probability </probScen>
1 1
<Pmax> hour h - max price for this hour </Pmax>
1 5
The first do-while loop handles the probScen fine, but myfile in the second do-while gets stuck on Pmax, thus creating an infinite loop. To be more precise, myfile reads every single word until /probScen, then 1, 1, Pmax but then does not move on anymore. The myfile.eof() never returns true.
Thank you in advance for your help!

The problem will occur as soon as numberScen is greater than 1 (one)!
First iteration:
for (int i = 0; i < numberScen; i++)
{
myfile>>comment; // consumes 1
double prov;
myfile>>prov; // consumes 1
probScen.push_back(prov);
}
Second iteration:
for (int i = 0; i < numberScen; i++)
{
myfile>>comment; // consumes <Pmax>
double prov;
myfile>>prov; // tries to parse 'hour' as a double, but fails!
// from now on, the fail bit is set
probScen.push_back(prov); // stores a meaningless value (0 since C++11, left unmodified before)
}
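Within the following do-while loop, as the fail bit is set, nothing is read at all, and so comment keeps the last value that was read successfully, "<Pmax>". That never equals "</Pmax>", so the loop never ends. myfile.eof() stays false because the stream failed on a parse error, not at the end of the file; testing the stream itself (for example while (myfile >> comment)) catches both cases.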

Related

C++ How to new line without chopping words

I am currently making a text game in C++. I am using a function that prints text one character at a time (to give a "narration" effect) and that also starts a new line according to a condition defined by the function.
Here is the function:
void smart_print(const std::string& str, int spacer)//str is the printed message. spacer is the amount of space you want at the beginning and at the end of the cmd window
{
int max = Console::BufferWidth - (spacer * 2);
int limit = max;
ut.spacer(5);//this prints 5 spaces
for (int i = 0; i != str.size(); ++i)//this loop prints one character of the string every 50 milliseconds. It also checks if the limit is exceeded. If so, print new line
{
if (limit < 0)
{
cout << endl;
ut.spacer(5);
limit = max;
}
limit--;
std::cout << str[i];
Sleep(50);
}
}
The problem with this function is that it chops words, because it starts a new line every time the "limit" variable is less than 0, regardless of whether a word is still incomplete.
I made a sort of scheme to try to figure out how it should work correctly, but I can't manage to "translate" it into code.
1) Analyze the string, and check how long the first word is
2) Count the characters and stop counting when there is a space
3) Calculate if it can print the word (by subtracting the number of letters from max)
4) If the limit is exceeded, go to a new line. Otherwise proceed to print the word one letter at a time
I really can't manage to write such a function. I hope someone can help me out :P
Thanks in advance.
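I think you should check whether the current character is whitespace, using the std::isspace function (from <cctype>), and only break the line there:
// inside your for loop
if (limit < 0 && isspace(static_cast<unsigned char>(str[i])))
{
cout << endl;
ut.spacer(5);
limit = max;
continue; // the line break replaces this space
}
limit--;
std::cout << str[i];
Sleep(50);
Note: I haven't tested the code, so I am not 100% sure it works correctly. Because it waits for the first space after the limit has been passed, a line can still run past the limit by up to one word.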
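Another way to follow the four-step scheme from the question is to decide per word, before printing it, whether it still fits on the current line. The sketch below only computes the wrapped text; the result could then be fed to the existing character-by-character printing loop. wrapWords is a hypothetical name, and the sketch assumes that words are separated by whitespace, that runs of whitespace may be collapsed to one space, and that a word longer than the width is simply placed on its own line.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns `text` with line breaks inserted between words so that no line
// is longer than `width` characters (unless a single word is longer).
std::string wrapWords(const std::string& text, std::size_t width) {
    std::istringstream words(text);
    std::string word;
    std::string result;
    std::size_t lineLength = 0;
    while (words >> word) {
        if (lineLength != 0) {
            if (lineLength + 1 + word.size() > width) {
                // The next word does not fit: start a new line.
                result += '\n';
                lineLength = 0;
            } else {
                // The next word fits: separate it with one space.
                result += ' ';
                ++lineLength;
            }
        }
        result += word;
        lineLength += word.size();
    }
    return result;
}
```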

implement striping algorithm C++

Hi, I am having trouble implementing a striping algorithm. I am also having a problem loading 30000 records into one vector; I tried this, but it is not working.
The program should declare variables to store ONE RECORD at a time. It should read a record and process it then read another record, and so on. Each process should ignore records that "belong" to another process. This can be done by keeping track of the record count and determining if the current record should be processed or ignored. For example, if there are 4 processes (numProcs = 4) process 0 should work on records 0, 4, 8, 12, ... (assuming we count from 0) and ignore all the other records in between.
Residence res;
int numProcs = 4;
int linesNum = 0;
int recCount = 0;
int count = 0;
while(count <= numProcs)
{
while(!residenceFile.eof())
{
++recCount;
//distancess.push_back(populate_distancesVector(res,foodbankData));
if(recCount % processIS == linesNum)
{
residenceFile >> res.x >>res.y;
distancess.push_back(populate_distancesVector(res,foodbankData));
}
++linesNum;
}
++count;
}
Updated code:
Residence res;
int numProcs = 1;
int recCount = 0;
while(!residenceFile.eof())
{
residenceFile >> res.x >>res.y;
//distancess.push_back(populate_distancesVector(res,foodbankData));
if ( recCount == processId)//process id
{
distancess.push_back(populate_distancesVector(res,foodbankData));
}
++recCount;
if(recCount == processId )
recCount = 0;
}
Updated pseudocode:
while(!residenceFile.eof())
{
residenceFile >> res.x >>res.y;
if ( recCount % numProcs == numLines)
{
distancess.push_back(populate_distancesVector(res,foodbankData));
}
else
++numLines
++recCount
}
You have tagged your post with MPI, but I don't see any place where you are checking a processor ID to see which record it should process.
Pseudocode for a solution to what I think you're asking:
While(there are more records){
If record count % numProcs == myID
ProcessRecord
else
Increment file stream pointer forward one record without processing
Increment Record Count
}
If you know the # of records you will be processing beforehand, then you can come up with a cleverer solution to move the filestream pointer ahead by numprocs records until that # is reached or surpassed.
A process that will act on records 0 and 4 must still read records 1, 2 and 3 (in order to get to 4).
Also, while(!residenceFile.eof()) isn't a good way to iterate through a file: the eof flag is only set once a read has failed, so the loop body runs one extra time after the last record. Do something like while(residenceFile >> res.x >>res.y) instead.
As for making a vector that contains 30,000 records, it sounds like a memory limitation. Are you sure you need that many in memory at once?
EDIT:
Look carefully at the updated code. If the process ID (processId) is zero, the process will act on the first record and no other; if it is something else, it will act on none of them, because recCount is reset to 0 just before it could equal processId at the test.
EDIT:
Alas, I do not know Arabic. I will try to explain clearly in English.
You must learn a simple technique, before you attempt a difficult technique. If you guess at the algorithm, you will fail.
First, write a loop that iterates {0,1,2,3,...} and prints out all of the numbers:
int i=0;
while(i<10)
{
cout << i << endl;
++i;
}
Understand this before going farther. Then write a loop that iterates the same way, but prints out only {0,4,8,...}:
int i=0;
while(i<10)
{
if(i%4==0)
cout << i << endl;
++i;
}
Understand this before going farther. Then write a loop that prints out only {1,5,9,...}. Then write a loop that reads the file, and reports on every record. Then combine that with the logic from the previous exercise, and report on only one record out of every four.
Start with something small and simple. Add complexity in small measures. Develop new techniques in isolation. Test every step. Never add to code that doesn't work. This is the way to write code that works.

My while loop exits prematurely

Thanks for reading this.
I am writing code to read a big data file, and I try to use a while loop to read it one piece at a time.
But when I write
while(TimeStep++)
it will exit at the first loop.
if I write,
while(TimeStep+=1)
it will be just fine.
Also, if I initialize
int TimeStep=-1;
it will exit at the first loop. But if I initialize
int TimeStep=0;
it will be fine. The magic of while() confuses me. Please help me understand the while loop.
Here is all my code.
//get diffusion of all the particles in the 256Coordinate.txt file and diffusion of a single particle.
using namespace std;
typedef vector<double> vec;
int ReadStructure(vec & Coordinate,int size,ifstream & TrajectoryFile){
double a;
for(int i=0;i<size*3;i++){
if(!(TrajectoryFile.eof())){
TrajectoryFile>>a;
Coordinate[i]=a;
}
}
//cout<<Coordinate[1]<<endl;
if(TrajectoryFile.eof()){
return 1;
} else {
return 0;
}
}
int main(){
int ContinueFlag=0,i,j,k;
double a,b,c;
vec Coordinate;
string filename= ("../256Coordinate.txt"); // a file that contains 256*5000*3 numbers
int size=256;
Coordinate.resize(size*3);
int TimeStep=0;
ifstream TrajectoryFile(filename.c_str());//open the .txt file and begin the read data
//TrajectoryFile>>a;
//cout<<a<<endl;
while(TimeStep+=1){//keep looping untils breaks.
ContinueFlag=ReadStructure(Coordinate,size,TrajectoryFile);//read the .txt file and store the values in the vector Coordinate[3*256]. Read 3*256 numbers at a time.
// cout<<"ContinueFlag= "<<ContinueFlag<<endl;
if(ContinueFlag==1) break;//if we reach the end of the file, exit.
// cout<<Coordinate[1]<<endl;
}
cout<<"total number of timesteps= "<<TimeStep-1<<endl;
}
The body of a while loop executes as long as the loop condition in
while(loop condition)
is true.
Any non-zero value is treated as true and 0 as false. So if you set TimeStep = 0 to start with, while(TimeStep++) tests the old value 0 before incrementing, and the loop body will not execute.
If you initialize as int TimeStep=-1;, TimeStep+=1 will set TimeStep = 0, which is equivalent to false, so the loop body will not execute.
If you do not know the loop termination condition beforehand, simply using
while (true)
is better than using such a TimeStep variable.
Try:
while(true){
ContinueFlag=ReadStructure(Coordinate,size,TrajectoryFile);
if(ContinueFlag==1)
break;
}
In C++ the integer value 0 is false; any other value, including a negative integer, is true. A while loop exits when its condition is false.
I think your main problem is not understanding the while loop, it's understanding the increment operator ++.
Let's work with an example:
int x = 5;
int y = x++;
Here, x will have a value of 6 (because you made ++), but which value will y have? Actually, it will be 5. This is a so-called 'postincrement' operator: see, you assign first, and increment later.
If you wrote this
int x = 5;
int y = (x += 1);
Then you would have x = 6 as before, but this time y = 6 also, so you first increment x and only then assign it to y.
This should make your while loop misunderstanding go away:
int TimeStep = 0;
while(TimeStep++)
Here, TimeStep ends up with the value 1, but while tests the old value (like y in the example above). The old value is 0, so while exits immediately.
int TimeStep = 0;
while(TimeStep+=1)
In this case the loop goes on because you first increment the TimeStep and then let while test if it's nonzero.
I would really suggest you write a simple loop. Why are you testing whether TimeStep is nonzero anyway? Just do it like this:
while(true) { // Infinite loop, until a break is encountered
TimeStep++;
// ... read the data here and break at the end of the file
}
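The difference between the two loop conditions can be checked directly. The two helper functions below are hypothetical names introduced only for this illustration; each returns exactly the value that the corresponding while condition would test.

```cpp
// Value tested by `while (t++)`: the value of t BEFORE the increment.
int postIncrementCondition(int& t) {
    return t++;
}

// Value tested by `while (t += 1)`: the value of t AFTER the increment.
int addAssignCondition(int& t) {
    return t += 1;
}
```

Starting from 0, the first form yields 0 (false, so the loop exits at once) while the second yields 1 (true). Starting from -1, the second form yields 0, which explains why the loop with TimeStep+=1 exits immediately in that case.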
The while loop expects a true/false value. If TimeStep = -1, then TimeStep += 1 adds 1 to TimeStep and yields 0, which is false, so the loop exits. If TimeStep = 0 and you add 1, the condition is true from then on, because every value != 0 is true. (With the postfix form TimeStep++ the loop tests the value before the increment.)
I think you may need to get a better understanding of boolean algebra.
Here's a link to a tutorial http://www.electronics-tutorials.ws/boolean/bool_7.html.
A while loop is based around a boolean expression. If the expression within the while loop parentheses is true, it will enter the loop and keep repeating until that expression evaluates to false.
When you use an integer as the condition, it is converted to a boolean: 0 represents false and every other value, including negative ones, represents true. That is why the result depends on whether the expression happens to evaluate to 0.
It looks like you want the loop to break when ContinueFlag==1. So just use that as the while loop parameter. An alternative way would be to just change that code to while (true).
Since you want ContinueFlag to be set at least once (so you know when to break) I would suggest using a do while loop which executes at least once and then repeats if the expression is true.
USE THIS:
do {
ContinueFlag=ReadStructure(Coordinate,size,TrajectoryFile);
TimeStep++; //This allows for TimeStep to increment
} while (ContinueFlag!=1); //It will loop while ContinueFlag!=1 which will cause
//the loop to end when ContinueFlag==1
This is a better way of writing your code (as opposed to while (true)). This allows you to easily see what the purpose of the loop is.

C++ fastest cin for reading stdin?

I've profiled a computationally-heavy C++ program on Linux using cachegrind. Surprisingly, it turns out the bottleneck of my program is not in any sorting or computational method ... it's in reading the input.
Here is a screenshot of cachegrind, in case I'm mis-interpreting the profiler results (see scanf()):
I hope I'm right in saying that scanf() is taking 80.92% of my running time.
I read input using cin >> int_variable_here, like so:
std::ios_base::sync_with_stdio (false); // Supposedly makes I/O faster
cin >> NumberOfCities;
cin >> NumberOfOldRoads;
Roads = new Road[NumberOfOldRoads];
for (int i = 0; i < NumberOfOldRoads; i++)
{
int cityA, cityB, length;
cin >> cityA;
//scanf("%d", &cityA); // scanf() and cin are both too slow
cin >> cityB;
//scanf("%d", &cityB);
cin >> length;
//scanf("%d", &length);
Roads[i] = Road(cityA, cityB, length);
}
If you don't spot any issues with this input reading code, could you please recommend a faster way to read input? I'm thinking of trying getline() (working on it while I wait for responses). My guess is getline() may run faster because it has to do less conversion and it parses the stream fewer times (just my guess, though I'd have to parse the strings as integers eventually too).
What I mean by "too slow" is, this is part of a larger homework assignment that gets timed out after a certain period of time (I believe it is 90 seconds). I'm pretty confident the bottleneck is here because I purposely commented out a major portion of the rest of my program and it still timed out. I don't know what test cases the instructor runs through my program, but it must be a huge input file. So, what can I use to read input fastest?
The input format is strict: 3 integers separated by one space for each line, for many lines:
Sample Input:
7 8 3
7 9 2
8 9 1
0 1 28
0 5 10
1 2 16
I need to make a Road out of the integers in each line.
Also please note that input is redirected to my program's standard input (myprogram < whatever_test_case.txt). I'm not reading a specific file. I just read from cin.
Update
Using Slava's method:
Input reading seems to be taking less time, but it's still timing out (may not be due to input reading anymore). Slava's method is implemented in the Road() ctor (2 down from main). So now it takes 22% of the time as opposed to 80%. I'm thinking of optimizing SortRoadsComparator() as it's called 26,000,000 times.
Comparator Code:
// The complexity is sort of required for the whole min() max(), based off assignment instructions
bool SortRoadsComparator(const Road& a, const Road& b)
{
if (a.Length > b.Length)
return false;
else if (b.Length > a.Length)
return true;
else
{
// Non-determinism case
return ( (min(a.CityA, a.CityB) < min(b.CityA, b.CityB)) ||
(
(min(a.CityA, a.CityB) == min(b.CityA, b.CityB)) && max(a.CityA, a.CityB) < max(b.CityA, b.CityB)
)
);
}
}
Using enhzflep's method
Considering solved
I'm going to consider this problem solved because the bottleneck is no longer in reading input. Slava's method was the fastest for me.
Streams are well known to be very slow. It is not a big surprise though: they need to handle localization, error conditions, etc. One possible solution would be to read the file line by line with std::getline( std::cin, str ) and convert the string to numbers with something like this:
std::vector<int> getNumbers( const std::string &str )
{
std::vector<int> res;
int value = 0;
bool gotValue = false;
for( int i = 0; i < str.length(); ++i ) {
if( str[i] == ' ' ) {
if( gotValue ) res.push_back( value );
value = 0;
gotValue = false;
continue;
}
value = value * 10 + str[i] - '0';
gotValue = true;
}
if( gotValue ) res.push_back( value );
return res;
}
I did not test this code; I wrote it to show the idea. I assume you do not expect to get anything in the input but spaces and numbers, so it does not validate the input.
To optimize sorting, first of all you should check whether you really need to sort the whole sequence. For the comparator I would write methods getMin() and getMax() and store those values in the object (so they are not calculated all the time):
bool SortRoadsComparator(const Road& a, const Road& b)
{
if( a.Length != b.Length ) return a.Length < b.Length;
if( a.getMin() != b.getMin() ) return a.getMin() < b.getMin();
return a.getMax() < b.getMax();
}
if I understood correctly how your current comparator works.
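As a rough illustration of the cached getMin()/getMax() idea, a Road could be laid out as below. The member names MinCity and MaxCity are assumptions made for this sketch (the question does not show the real class), and the real class would also need a default constructor to be used with new Road[n].

```cpp
#include <algorithm>

// Hypothetical layout of Road with the min/max city computed once.
struct Road {
    int CityA;
    int CityB;
    int Length;
    int MinCity;  // cached std::min(CityA, CityB)
    int MaxCity;  // cached std::max(CityA, CityB)

    Road(int a, int b, int length)
        : CityA(a), CityB(b), Length(length),
          MinCity(std::min(a, b)), MaxCity(std::max(a, b)) {}

    int getMin() const { return MinCity; }
    int getMax() const { return MaxCity; }
};

// Orders by length, then by smaller city, then by larger city.
bool SortRoadsComparator(const Road& a, const Road& b) {
    if (a.Length != b.Length) return a.Length < b.Length;
    if (a.getMin() != b.getMin()) return a.getMin() < b.getMin();
    return a.getMax() < b.getMax();
}
```

The comparator then only compares stored integers, which matters when it is called millions of times during the sort.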
As Slava says, streams (i.e cin) are absolute pigs in terms of performance (and executable file size)
Consider the following two approaches:
start = clock();
std::ios_base::sync_with_stdio (false); // Supposedly makes I/O faster
cin >> NumberOfCities >> NumberOfOldRoads;
Roads = new Road[NumberOfOldRoads];
for (int i = 0; i < NumberOfOldRoads; i++)
{
int cityA, cityB, length;
cin >> cityA >> cityB >> length;
Roads[i] = Road(cityA, cityB, length);
}
stop = clock();
printf ("time: %ld\n", (long)(stop-start)); // clock_t is not necessarily an int
and
start = clock();
fp = stdin;
fscanf(fp, "%d\n%d\n", &NumberOfCities, &NumberOfOldRoads);
Roads = new Road[NumberOfOldRoads];
for (int i = 0; i < NumberOfOldRoads; i++)
{
int cityA, cityB, length;
fscanf(fp, "%d %d %d\n", &cityA, &cityB, &length);
Roads[i] = Road(cityA, cityB, length);
}
stop = clock();
printf ("time: %ld\n", (long)(stop-start)); // clock_t is not necessarily an int
Running each way 5 times (with an input file of 1,000,000 entries + the first 2 'control' lines) gives us these results:
Using cin without the direction to not sync with stdio
8291, 8501, 8720, 8918, 7164 (avg 8318.8)
Using cin with the direction to not sync with stdio
4875, 4674, 4921, 4782, 5171 (avg 4884.6)
Using fscanf
1681, 1676, 1536, 1644, 1675 (avg 1642.4)
So, clearly, one can see that the sync_with_stdio(false) direction does help. One can also see that fscanf beats the pants off each approach with cin. In fact, the fscanf approach is nearly 3 times faster than the better of the cin approaches and a whopping 5 times faster than cin when not told to avoid syncing with stdio.
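inline void S( int &x ) { // x is taken by reference so the caller sees the result
int ch = getchar_unlocked(), sign = 1;
x=0;
while((ch<'0' || ch>'9') && ch!='-' && ch!=EOF) ch=getchar_unlocked(); // skip separators
if (ch==EOF) return; // no more input
if (ch=='-')
sign=-1 , ch=getchar_unlocked();
do
x = (x<<3) + (x<<1) + ch-'0'; // x*10 + digit
while((ch=getchar_unlocked())>='0' && ch<='9');
x*=sign;
}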
You can use this function for any integer type; just change the parameter type.
This will run considerably faster than the standard scanf. Note that getchar_unlocked() is a POSIX function, not standard C++, so it is not available on every platform.
If you want to save more time, the best thing is to use fread() and fwrite(), but in that case you have to parse the input yourself.
To save time you should use fread() to read a large chunk of data from the standard input stream in one call. That decreases the number of I/O calls, so you will see a large difference in time.
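The fread() idea could look roughly like the sketch below. It is not benchmarked and makes assumptions: readAll and parseInts are hypothetical names, the input contains only decimal integers and separators, and values fit in an int (overflow is not checked).

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Reads everything from `fp` into one buffer using large fread() calls.
std::string readAll(std::FILE* fp) {
    std::string data;
    char chunk[1 << 16];
    std::size_t n;
    while ((n = std::fread(chunk, 1, sizeof chunk, fp)) > 0) {
        data.append(chunk, n);
    }
    return data;
}

// Parses all decimal integers (with optional leading '-') found in `data`.
std::vector<int> parseInts(const std::string& data) {
    std::vector<int> values;
    const std::size_t n = data.size();
    std::size_t i = 0;
    while (i < n) {
        // Skip everything that cannot start a number.
        while (i < n && data[i] != '-' && (data[i] < '0' || data[i] > '9')) ++i;
        if (i == n) break;
        bool negative = false;
        if (data[i] == '-') {
            negative = true;
            ++i;
        }
        if (i == n || data[i] < '0' || data[i] > '9') continue;  // lone '-'
        int value = 0;
        while (i < n && data[i] >= '0' && data[i] <= '9') {
            value = value * 10 + (data[i] - '0');
            ++i;
        }
        values.push_back(negative ? -value : value);
    }
    return values;
}
```

In the program from the question this would be called as parseInts(readAll(stdin)), after which the first two values are the counts and every following group of three values describes one Road.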

Do I need more space?

I have code that is supposed to separate a string into 3 length sections:
ABCDEFG should be ABC DEF G
However, I have an extremely long string and I keep getting the
terminate called without an active exception
When I cut the length of the string down, it seems to work. Do I need more space? I thought when using a string I didn't have to worry about space.
int main ()
{
string code, default_Code, start_C;
default_Code = "TCAATGTAACGCGCTACCCGGAGCTCTGGGCCCAAATTTCATCCACT";
start_C = "AUG";
code = default_Code;
for (double j = 0; j < code.length(); j++) { //insert spacing here
code.insert(j += 3, 1, ' ');
}
cout << code;
return 0;
}
Think about the case when code.length() == 2. You're inserting a space at a position past the end of the string, which makes std::string::insert throw std::out_of_range. I'm not sure, but it should be okay with for(int j=0; j+3 < code.length(); j++).
This is some fairly confusing code. You are looping through a string and looping until you reach the end of the string. However, inside the loop you are not only modifying the string you are looping through, but you also change the loop variable when you say j += 3.
It happens to work for any string with a multiple of 3 letters, but you are not correctly handling other cases.
Here is a working example of the for loop that is a bit clearer about what it's doing:
// We skip 4 each time because we added a space.
for (int j = 3; j < code.length(); j += 4)
{
code.insert(j, 1, ' ');
}
You are using an extremely inefficient method for such an operation. Every time you insert a space, the whole remaining part of the string is moved forward, so the total number of operations is on the order of O(n^2).
You can instead do this transformation in a single O(n) pass by using a read-write approach:
// input string is assumed to be non-empty
std::string new_string((old_string.size()*4-1)/3, ' '); // final length, pre-filled with spaces
int writeptr = 0, count = 0;
for (int readptr=0,n=old_string.size(); readptr<n; readptr++) {
new_string[writeptr++] = old_string[readptr];
if (++count == 3 && readptr+1 < n) { // no separator after the last character
count = 0;
new_string[writeptr++] = ' ';
}
}
A similar algorithm can be written also to work "inplace" instead of creating a new string, simply you have to first enlarge the string and then work backward.
Note also that, while it's true that with a string you don't need to care about allocation and deallocation, there are still limits on the size of a string object (even if you are probably not hitting them... your version is so slow that it would take forever to get to that point on a modern computer).
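For completeness, the single-pass idea can also be written by appending to a new string, which avoids computing write positions by hand. This is a minimal sketch; groupByThree is a hypothetical name and the group size of 3 is hard-coded as in the question.

```cpp
#include <cstddef>
#include <string>

// Returns `input` with a space inserted after every third character,
// without a trailing space: "ABCDEFG" becomes "ABC DEF G".
std::string groupByThree(const std::string& input) {
    if (input.empty()) return input;
    std::string output;
    // One separator for every full group of three that is followed by more text.
    output.reserve(input.size() + (input.size() - 1) / 3);
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (i != 0 && i % 3 == 0) output += ' ';
        output += input[i];
    }
    return output;
}
```

Because the string is built once from left to right, nothing has to be shifted, and the loop variable is never modified inside the body.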