Cpp: Segmentation fault core dumped - c++

I am trying to write a lexer, when I try to copy isdigit buffer value in an array of char, I get this core dumped error although I have done the same thing with identifier without getting error.
#include<fstream>
#include<iostream>
#include<cctype>
#include <cstring>
#include<typeinfo>
using namespace std;
int isKeyword(char buffer[]){
char keywords[22][10] = {"break","case","char","const","continue","default", "switch",
"do","double","else","float","for","if","int","long","return","short",
"sizeof","struct","void","while","main"};
int i, flag = 0;
for(i = 0; i < 22; ++i){
if(strcmp(keywords[i], buffer) == 0)
{
flag = 1;
break;
}
}
return flag;
}
int isSymbol_Punct(char word)
{
int flag = 0;
char symbols_punct[] = {'<','>','!','+','-','*','/','%','=',';','(',')','{', '}','.'};
for(int x= 0; x< 15; ++x)
{
if(word==symbols_punct[x])
{
flag = 1;
break;
}
}
return flag;
}
int main()
{
char buffer[15],buffer1[15];
char identifier[30][10];
char number[30][10];
memset(&identifier[0], '\0', sizeof(identifier));
memset(&number[0], '\0', sizeof(number));
char word;
ifstream fin("program.txt");
if(!fin.is_open())
{
cout<<"Error while opening the file"<<endl;
}
int i,k,j,l=0;
while (!fin.eof())
{
word = fin.get();
if(isSymbol_Punct(word)==1)
{
cout<<"<"<<word<<", Symbol/Punctuation>"<<endl;
}
if(isalpha(word))
{
buffer[j++] = word;
// cout<<"buffer: "<<buffer<<endl;
}
else if((word == ' ' || word == '\n' || isSymbol_Punct(word)==1) && (j != 0))
{
buffer[j] = '\0';
j = 0;
if(isKeyword(buffer) == 1)
cout<<"<"<<buffer<<", keyword>"<<endl;
else
{
cout<<"<"<<buffer<<", identifier>"<<endl;
strcpy(identifier[i],buffer);
i++;
}
}
else if(isdigit(word))
{
buffer1[l++] = word;
cout<<"buffer: "<<buffer1<<endl;
}
else if((word == ' ' || word == '\n' || isSymbol_Punct(word)==1) && (l != 0))
{
buffer1[l] = '\0';
l = 0;
cout<<"<"<<buffer1<<", number>"<<endl;
// cout << "Type is: "<<typeid(buffer1).name() << endl;
strcpy(number[k],buffer1);
k++;
}
}
cout<<"Identifier Table"<<endl;
int z=0;
while(strcmp(identifier[z],"\0")!=0)
{
cout <<z<<"\t\t"<< identifier[z]<<endl;
z++;
}
// cout<<"Number Table"<<endl;
// int y=0;
// while(strcmp(number[y],"\0")!=0)
// {
// cout <<y<<"\t\t"<< number[y]<<endl;
// y++;
// }
}
I am getting this error when I copy buffer1 in number[k] using strcpy. I do not understand why it is not being copied. When i printed the type of buffer1 to see if strcpy is not generating error, I got A_15, I searched for it, but did not find any relevant information.

The reason is here (line 56):
int i,k,j,l=0;
You might think that this initializes i, j, k, and l to 0, but in fact it only initializes l to 0. i, j, and k are declared here, but not initialized to anything. As a result, they contain random garbage, so if you use them as array indices you are likely to end up overshooting the bounds of the array in question.
At that point, anything could happen—in other words, this is undefined behavior. One likely outcome, which is probably happening to you, is that your program tries to access memory that hasn't been assigned to it by the operating system, at which point it crashes (a segmentation fault).
To give a concrete demonstration of what I mean, consider the following program:
#include <iostream>
void print_var(std::string name, int v)
{
std::cout << name << ": " << v << "\n";
}
int main(void)
{
int i, j, k, l = 0;
print_var("i", i);
print_var("j", j);
print_var("k", k);
print_var("l", l);
return 0;
}
When I ran this, I got the following:
i: 32765
j: -113535829
k: 21934
l: 0
As you can see, i, j, and k all came out such that using them as indices into any of the arrays you declared would exceed their bounds. Unless you are very lucky, this will happen to you, too.
You can fix this by initializing each variable separately:
int i = 0;
int j = 0;
int k = 0;
int l = 0;
Initializing each on its own line makes the initializations easier to see, helping to prevent mistakes.
A few side notes:
I was able to spot this issue immediately because I have my development environment configured to flag lines that provoke compiler warnings. Using a variable before it's being initialized should provoke such a warning if you're using a reasonable compiler, so you can fix problems like this as you run into them. Your development environment may support the same feature (and if it doesn't, you might consider switching to something that does). If nothing else, you can turn on warnings during compilation (by passing -Wall -Wextra to your compiler or the like—check its documentation for the specifics).
Since you declared your indices as int, they are signed integers, which means they can hold negative values (as j did in my demonstration). If you try to index into an array using a negative index, you will end up dereferencing a pointer to a location "behind" the start of the array in memory, so you will be in trouble even with an index of -1 (remember that a C-style array is basically just a pointer to the start of the array). Also, int probably has only 32 bits in your environment, so if you're writing 64-bit code then it's possible to define arrays too large for an int to fully cover, even if you were to index into the array from the middle. For these sorts of reasons, it's generally a good idea to type raw array indices as std::size_t, which is always capable of representing the size of the largest possible array in your target environment, and also is unsigned.
You describe this as C++ code, but I don't see much C++ here aside from the I/O streams. C++ has a lot of amenities that can help you guard against bugs compared to C-style code (which has to be written with great care). For example, you could replace your C-style arrays here with instances of std::array, which has a member function at() that does subscripting with bounds checking; that would have thrown a helpful exception in this case instead of having your program segfault. Also, it doesn't seem like you have a particular need for fixed-size arrays in this case, so you may better off using std::vector; this will automatically grow to accommodate new elements, helping you avoid writing outside the vector's bounds. Both support range-based for loops, which save you from needing to deal with indices by hand at all. You might enjoy Bjarne's A Tour of C++, which gives a nice overview of idiomatic C++ and will make all the wooly reference material easier to parse. (And if you want to pick up some nice C habits, both K&R and Kernighan and Pike's The Practice of Programming can save you much pain and tears).

Some general hints that might help you to avoid your cause of crash totally by design:
As this is C++, you should really refer to established C++ data types and schemes here as far as possible. I know, that distinct stuff in terms of parser/lexer writing can become quite low-level but at least for the things you want to achieve here, you should really appreciate that. Avoid plain arrays as far as possible. Use std::vector of uint8_t and/or std::string for instance.
Similar to point 1 and a consequence: Always use checked bounds iterations! You don't need to try to be better than the optimizer of your compiler, at least not here! In general, one should always avoid to duplicate container size information. With the stated C++ containers, this information is always provided on data source side already. If not possible for very rare cases (?), use constants for that, directly declared at/within data source definition/initialization.
Give your variables meaningful names, declare them as local to their used places as possible.
isXXX-methods - at least your ones, should return boolean values. You never return something else than 0 or 1.
A personal recommendation that is a bit controversional to be a general rule: Use early returns and abort criteria! Even after the check for file reading issues, you proceed further.
Try to keep your functions smart and non-boilerplate! Use sub-routines for distinct sub-tasks!
Try to avoid using namespace that globally! Even without exotic building schemes like UnityBuilds, this can become error-prone as hell for huger projects at latest.
the arrays keywords and symbols_punct should be at least static const ones. The optimizer will easily be able to recognize that but it's rather a help for you for fast code understanding at least. Try to use classes here to compound the things that belong together in a readable, adaptive, easy modifiable and reusable way. Always keep in mind, that you might want to understand your own code some months later still, maybe even other developers.

Related

Getting sementation fault (core dumped)

Everything seems to run okay up until the return part of shuffle_array(), but I'm not sure what.
int * shuffle_array(int initialArray[], int userSize)
{
// Variables
int shuffledArray[userSize]; // Create new array for shuffled
srand(time(0));
for (int i = 0; i < userSize; i++) // Copy initial array into new array
{
shuffledArray[i] = initialArray[i];
}
for(int i = userSize - 1; i > 0; i--)
{
int randomPosition = (rand() % userSize;
temp = shuffledArray[i];
shuffledArray[i] = shuffledArray[randomPosition];
shuffledArray[randomPosition] = temp;
}
cout << "The numbers in the initial array are: ";
for (int i = 0; i < userSize; i++)
{
cout << initialArray[i] << " ";
}
cout << endl;
cout << "The numbers in the shuffled array are: ";
for (int i = 0; i < userSize; i++)
{
cout << shuffledArray[i] << " ";
}
cout << endl;
return shuffledArray;
}
Sorry if spacing is off here, not sure how to copy and past code into here, so I had to do it by hand.
EDIT: Should also mention that this is just a fraction of code, not the whole project I'm working on.
There are several issues of varying severity, and here's my best attempt at flagging them:
int shuffledArray[userSize];
This array has a variable length. I don't think that it's as bad as other users point out, but you should know that this isn't allowed by the C++ standard, so you can't expect it to work on every compiler that you try (GCC and Clang will let you do it, but MSVC won't, for instance).
srand(time(0));
This is most likely outside the scope of your assignment (you've probably been told "use rand/srand" as a simplification), but rand is actually a terrible random number generator compared to what else the C++ language offers. It is rather slow, it repeats quickly (calling rand() in sequence will eventually start returning the same sequence that it did before), it is easy to predict based on just a few samples, and it is not uniform (some values have a much higher probability of being returned than others). If you pursue C++, you should look into the <random> header (and, realistically, how to use it, because it's unfortunately not a shining example of simplicity).
Additionally, seeding with time(0) will give you sequences that change only once per second. This means that if you call shuffle_array twice quickly in succession, you're likely to get the same "random" order. (This is one reason that often people will call srand once, in main, instead.)
for(int i = userSize - 1; i > 0; i--)
By iterating to i > 0, you will never enter the loop with i == 0. This means that there's a chance that you'll never swap the zeroth element. (It could still be swapped by another iteration, depending on your luck, but this is clearly a bug.)
int randomPosition = (rand() % userSize);
You should know that this is biased: because the maximum value of rand() is likely not divisible by userSize, you are marginally more likely to get small values than large values. You can probably just read up the explanation and move on for the purposes of your assignment.
return shuffledArray;
This is a hard error: it is never legal to return storage that was allocated for a function. In this case, the memory for shuffledArray is allocated automatically at the beginning at the function, and importantly, it is deallocated automatically at the end: this means that your program will reuse it for other purposes. Reading from it is likely to return values that have been overwritten by some code, and writing to it is likely to overwrite memory that is currently used by other code, which can have catastrophic consequences.
Of course, I'm writing all of this assuming that you use the result of shuffle_array. If you don't use it, you should just not return it (although in this case, it's unlikely to be the reason that your program crashes).
Inside a function, it's fine to pass a pointer to automatic storage to another function, but it's never okay to return that. If you can't use std::vector (which is the best option here, IMO), you have three other options:
have shuffle_array accept a shuffledArray[] that is the same size as initialArray already, and return nothing;
have shuffle_array modify initialArray instead (the shuffling algorithm that you are using is in-place, meaning that you'll get correct results even if you don't copy the original input)
dynamically allocate the memory for shuffledArray using new, which will prevent it from being automatically reclaimed at the end of the function.
Option 3 requires you to use manual memory management, which is generally frowned upon these days. I think that option 1 or 2 are best. Option 1 would look like this:
void shuffle_array(int initialArray[], int shuffledArray[], int userSize) { ... }
where userSize is the size of both initialArray and shuffledArray. In this scenario, the caller needs to own the storage for shuffledArray.
You should NOT return a pointer to local variable. After the function returns, shuffledArray gets deallocated and you're left with a dangling pointer.
You cannot return a local array. The local array's memory is released when you return (did the compiler warn you about that). If you do not want to use std::vector then create yr result array using new
int *shuffledArray = new int[userSize];
your caller will have to delete[] it (not true with std::vector)
When you define any non static variables inside a function, those variables will reside in function's stack. Once you return from function, the function's stack is gone. In your program, you are trying to return a local array which will be gone once control is outside of shuffle_array().
To solve this, either you need to define the array globally (which I won't prefer because using global variables are dangerous) or use dynamic memory allocation for the array which will create space for the array in heap rather than allocating the space on the function's stack. You can use std::vectors also, if you are familiar with vectors.
To allocate memory dynamically, you have to use new as mentioned below.
int *shuffledArray[] = new int[userSize];
and once you completed using shuffledArray, you need to free the memory as below.
delete [] shuffledArray;
otherwise your program will leak memory.

Running Into Segmentation Fault When Attempting To Convert String To Double in C++

I'm working on a fairly straightforward priority scheduling program which takes a text file with lines in the format:
[N/S/n/s] number number
And I'm attempting to convert the numbers into a double format. I'm trying to do this with a stringstream (this is a class project which must run on a version of Linux without stod), using the example here as a reference: https://www.geeksforgeeks.org/converting-strings-numbers-cc/
The problem is, when I try and implement what I'd think should be a fairly simple few lines of code to do this, I'm met with "Segmentation Fault (Core Dumped)" which seems directly related to my attempt to actually send the stringstream to the double variable I've created. I've included my code so far (still obviously quite far from done) and have also indicated the last line I'm able to execute by outputting "made it to here." I'm very confused about this issue, and would appreciate any help available. Note that although I've posted my entire code for completion's sake, only a small part near the bottom is relevant to the question, which I've clearly indicated.
Code:
#include <iostream>
#include <stdio.h>
#include<string.h>
#include <stdlib.h>
#include<cstring>
#include<sstream>
#include<cstdlib>
#include <unistd.h>
#include<pthread.h>
#include<ctype.h>
#include <vector>
#include <sys/wait.h>
#include <fstream>
#include<ctype.h>
using namespace std;
struct Train{
public:
char trainDirection;
double trainPriority;
double trainTimeToLoad;
double trainTimeToCross;
};
void *trainFunction (void* t){cout << "placeholder function for "<< t <<endl;}
vector<string> split(string str, char c = ' '){
vector<string> result;
int start = 0;
int end = 3;
int loadCounter = 1;
int crossCounter = 1;
result.push_back(str.substr(start, 1));
start = 2;
while (str.at(end) != ' '){
end++;
loadCounter++;
}
result.push_back(str.substr(start, loadCounter));
start = end + 1;
end = start +1;
while(end < str.size()){
end++;
crossCounter++;
}
result.push_back(str.substr(start, crossCounter));
for(int i = 0; i < result.size(); i++){
cout << result[i] <<"|";
}
cout<<endl;
return result;
}
int main(int argc, char **argv){
//READ THE FILE
const char* file = argv[1];
cout << file <<endl;
ifstream fileInput (file);
string line;
char* tokenPointer;
int threadCount = 0;
int indexOfThread = 0;
while(getline(fileInput, line)){
threadCount++;
}
fileInput.clear();
fileInput.seekg(0, ios::beg);
//CREATE THREADS
pthread_t thread[threadCount];
while(getline(fileInput, line)){
vector<string> splitLine = split(line);
//create thread
struct Train *trainInstance;
stringstream directionStringStream(splitLine[0]);
char directionChar = 'x';
directionStringStream >> directionChar;
trainInstance->trainDirection = directionChar;
if(splitLine[0] == "N" || splitLine[0] == "S"){
trainInstance->trainPriority = 1;
}
else{
trainInstance->trainPriority = 0;
}
stringstream loadingTimeStringStream(splitLine[1]);
double doubleLT = 0;
cout << "made it to here" <<endl;
loadingTimeStringStream >> doubleLT; //THIS IS THE PROBLEM LINE
trainInstance->trainTimeToLoad = doubleLT;
stringstream crossingTimeStringStream(splitLine[2]);
double doubleCT = 0;
crossingTimeStringStream >> doubleCT;
trainInstance->trainTimeToCross = doubleCT;
pthread_create(&thread[indexOfThread], NULL, trainFunction,(void *) trainInstance);
indexOfThread++;
}
}
There are some mistakes in your code that lead to undefined behavior and that's the cause of your segmentation fault. Namely:
You don't check the number of arguments before using them
You do not return a value in trainFunction
You do not create a valid object for trainInstance to point to
The first two are somewhat obvious to solve, so I'll talk about the last one. Memory management in C++ is nuanced and the proper solution depends on your use case. Because Train objects are small it would be optimal to have them allocated as local variables. The tricky part here is to ensure they are not destroyed prematurely.
Simply changing the declaration to struct Train trainInstance; won't work because this struct will be destroyed at the end of the current loop iteration while the thread will still be potentially alive and try to access the struct.
To ensure that the Train objects are destroyed after the threads are done we must declare them before the thread array and make sure to join the threads before they go out of scope.
Train trainInstances[threadCount];
pthread_t thread[threadCount];
while(...) {
...
pthread_create(&thread[indexOfThread], nullptr, trainFunction,static_cast<void *>(&trainInstances[indexOfThread]));
}
// Join threads eventually
// Use trainInstances safely after all threads have joined
// trainInstances will be destroyed at the end of this scope
This is clean and works, but it isn't optimal because you might want the threads to outlive the trainInstances for some reason. Keeping them alive until the threads are destroyed is a waste of memory in that case. Depending on the number of objects it might not even be worth it to waste time trying to optimize the time of their destruction but you can do something like the following.
pthread_t thread[threadCount];
{
Train trainInstances[threadCount];
while(...) {
...
pthread_create(&thread[indexOfThread], nullptr, trainFunction,static_cast<void *>(&trainInstances[indexOfThread]));
}
// Have threads use some signalling mechanism to signify they are done
// and will never attempt to use their Train instance again
// Use trainInstances
} // trainInstances destroyed
// threads still alive
It is best to avoid pointers when dealing with threads that don't provide a C++ interface because it is a pain to deal with dynamic memory management when you can't simply pass smart pointers around by value. If you use a new statement, execution must always reach exactly one corresponding delete statement on the returned pointer. While this may sound trivial in some cases it is complicated because of potential exceptions and early return statements.
Finally, note the change of the pthread_create call to the following.
pthread_create(&thread[indexOfThread], nullptr, trainFunction,static_cast<void *>(&trainInstances[indexOfThread]));
There are two significant changes to the safety of this line.
Use of nullptr : NULL has an integral type and can silently be passed to non-pointer argument. Without named arguments, this is a problem because it is hard to spot the error without looking up the function signature and verifying the arguments one-by-one. nullptr is type-safe and will cause a compiler error whenever it is assigned to a non-pointer type without an explicit cast.
Use of static_cast : C-style casts are dangerous business. They will try a bunch of different casts and go with the first one that works, which might not be what you intended. Take a look at the following code.
// Has the generic interface required by pthreads
void* pthreadFunc(void*);
int main() {
int i;
pthreadFunc((void*)i);
}
Oops! It should have been (void*)(&i) to cast the address of i to void*. But the compiler won't throw an error because it can implicitly convert an integral value to a void* so it will simply cast the value of i to void* and pass that to the function with potentially disastrous effects. Using a static_cast will catch that error. static_cast<void*>(i) simply fails to compile, so we notice our mistake and change it to static_cast<void*>(&i)
You dereferenced trainInstance via -> operator without assigning valid buffer, so the system will try to write to weird place and cause segmentation fault.
You can allocate buffer like this:
struct Train *trainInstance = new struct Train;
struct is not needed here, but I used one because it is used in the original code.
Also don't forget to check the number of arguments before using argv[1].

Is it problematic to declare variables in the main loop of a OpenGL program? [duplicate]

Question #1: Is declaring a variable inside a loop a good practice or bad practice?
I've read the other threads about whether or not there is a performance issue (most said no), and that you should always declare variables as close to where they are going to be used. What I'm wondering is whether or not this should be avoided or if it's actually preferred.
Example:
for(int counter = 0; counter <= 10; counter++)
{
string someString = "testing";
cout << someString;
}
Question #2: Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
This is excellent practice.
By creating variables inside loops, you ensure their scope is restricted to inside the loop. It cannot be referenced nor called outside of the loop.
This way:
If the name of the variable is a bit "generic" (like "i"), there is no risk to mix it with another variable of same name somewhere later in your code (can also be mitigated using the -Wshadow warning instruction on GCC)
The compiler knows that the variable scope is limited to inside the loop, and therefore will issue a proper error message if the variable is by mistake referenced elsewhere.
Last but not least, some dedicated optimization can be performed more efficiently by the compiler (most importantly register allocation), since it knows that the variable cannot be used outside of the loop. For example, no need to store the result for later re-use.
In short, you are right to do it.
Note however that the variable is not supposed to retain its value between each loop. In such case, you may need to initialize it every time. You can also create a larger block, encompassing the loop, whose sole purpose is to declare variables which must retain their value from one loop to another. This typically includes the loop counter itself.
{
int i, retainValue;
for (i=0; i<N; i++)
{
int tmpValue;
/* tmpValue is uninitialized */
/* retainValue still has its previous value from previous loop */
/* Do some stuff here */
}
/* Here, retainValue is still valid; tmpValue no longer */
}
For question #2:
The variable is allocated once, when the function is called. In fact, from an allocation perspective, it is (nearly) the same as declaring the variable at the beginning of the function. The only difference is the scope: the variable cannot be used outside of the loop. It may even be possible that the variable is not allocated, just re-using some free slot (from other variable whose scope has ended).
With restricted and more precise scope come more accurate optimizations. But more importantly, it makes your code safer, with less states (i.e. variables) to worry about when reading other parts of the code.
This is true even outside of an if(){...} block. Typically, instead of :
int result;
(...)
result = f1();
if (result) then { (...) }
(...)
result = f2();
if (result) then { (...) }
it's safer to write :
(...)
{
int const result = f1();
if (result) then { (...) }
}
(...)
{
int const result = f2();
if (result) then { (...) }
}
The difference may seem minor, especially on such a small example.
But on a larger code base, it will help : now there is no risk to transport some result value from f1() to f2() block. Each result is strictly limited to its own scope, making its role more accurate. From a reviewer perspective, it's much nicer, since he has less long range state variables to worry about and track.
Even the compiler will help better : assuming that, in the future, after some erroneous change of code, result is not properly initialized with f2(). The second version will simply refuse to work, stating a clear error message at compile time (way better than run time). The first version will not spot anything, the result of f1() will simply be tested a second time, being confused for the result of f2().
Complementary information
The open-source tool CppCheck (a static analysis tool for C/C++ code) provides some excellent hints regarding optimal scope of variables.
In response to comment on allocation:
The above rule is true in C, but might not be for some C++ classes.
For standard types and structures, the size of variable is known at compilation time. There is no such thing as "construction" in C, so the space for the variable will simply be allocated into the stack (without any initialization), when the function is called. That's why there is a "zero" cost when declaring the variable inside a loop.
However, for C++ classes, there is this constructor thing which I know much less about. I guess allocation is probably not going to be the issue, since the compiler shall be clever enough to reuse the same space, but the initialization is likely to take place at each loop iteration.
Generally, it's a very good practice to keep it very close.
In some cases, there will be a consideration such as performance which justifies pulling the variable out of the loop.
In your example, the program creates and destroys the string each time. Some libraries use a small string optimization (SSO), so the dynamic allocation could be avoided in some cases.
Suppose you wanted to avoid those redundant creations/allocations, you would write it as:
for (int counter = 0; counter <= 10; counter++) {
// compiler can pull this out
const char testing[] = "testing";
cout << testing;
}
or you can pull the constant out:
const std::string testing = "testing";
for (int counter = 0; counter <= 10; counter++) {
cout << testing;
}
Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
It can reuse the space the variable consumes, and it can pull invariants out of your loop. In the case of the const char array (above) - that array could be pulled out. However, the constructor and destructor must be executed at each iteration in the case of an object (such as std::string). In the case of the std::string, that 'space' includes a pointer which contains the dynamic allocation representing the characters. So this:
for (int counter = 0; counter <= 10; counter++) {
string testing = "testing";
cout << testing;
}
would require redundant copying in each case, and dynamic allocation and free if the variable sits above the threshold for SSO character count (and SSO is implemented by your std library).
Doing this:
string testing;
for (int counter = 0; counter <= 10; counter++) {
testing = "testing";
cout << testing;
}
would still require a physical copy of the characters at each iteration, but the form could result in one dynamic allocation because you assign the string and the implementation should see there is no need to resize the string's backing allocation. Of course, you wouldn't do that in this example (because multiple superior alternatives have already been demonstrated), but you might consider it when the string or vector's content varies.
So what do you do with all those options (and more)? Keep it very close as a default -- until you understand the costs well and know when you should deviate.
I didn't post to answer JeremyRR's questions (as they have already been answered); instead, I posted merely to give a suggestion.
To JeremyRR, you could do this:
{
string someString = "testing";
for(int counter = 0; counter <= 10; counter++)
{
cout << someString;
}
// The variable is in scope.
}
// The variable is no longer in scope.
I don't know if you realize (I didn't when I first started programming), that brackets (as long they are in pairs) can be placed anywhere within the code, not just after "if", "for", "while", etc.
My code compiled in Microsoft Visual C++ 2010 Express, so I know it works; also, I have tried to to use the variable outside of the brackets that it was defined in and I received an error, so I know that the variable was "destroyed".
I don't know if it is bad practice to use this method, as a lot of unlabeled brackets could quickly make the code unreadable, but maybe some comments could clear things up.
For C++ it depends on what you are doing.
OK, it is stupid code but imagine
class myTimeEatingClass
{
public:
//constructor
myTimeEatingClass()
{
sleep(2000);
ms_usedTime+=2;
}
~myTimeEatingClass()
{
sleep(3000);
ms_usedTime+=3;
}
const unsigned int getTime() const
{
return ms_usedTime;
}
static unsigned int ms_usedTime;
};
myTimeEatingClass::ms_CreationTime=0;
myFunc()
{
for (int counter = 0; counter <= 10; counter++) {
myTimeEatingClass timeEater();
//do something
}
cout << "Creating class took " << timeEater.getTime() << "seconds at all" << endl;
}
myOtherFunc()
{
myTimeEatingClass timeEater();
for (int counter = 0; counter <= 10; counter++) {
//do something
}
cout << "Creating class took " << timeEater.getTime() << "seconds at all" << endl;
}
You will wait 55 seconds until you get the output of myFunc.
Just because each loop constructor and destructor together need 5 seconds to finish.
You will need 5 seconds until you get the output of myOtherFunc.
Of course, this is a crazy example.
But it illustrates that it might become a performance issue when each loop the same construction is done when the constructor and / or destructor needs some time.
Since your second question is more concrete, I'm going to address it first, and then take up your first question with the context given by the second. I wanted to give a more evidence-based answer than what's here already.
Question #2: Do most compilers realize that the variable has already
been declared and just skip that portion, or does it actually create a
spot for it in memory each time?
You can answer this question for yourself by stopping your compiler before the assembler is run and looking at the asm. (Use the -S flag if your compiler has a gcc-style interface, and -masm=intel if you want the syntax style I'm using here.)
In any case, with modern compilers (gcc 10.2, clang 11.0) for x86-64, they only reload the variable on each loop pass if you disable optimizations. Consider the following C++ program—for intuitive mapping to asm, I'm keeping things mostly C-style and using an integer instead of a string, although the same principles apply in the string case:
#include <iostream>
static constexpr std::size_t LEN = 10;
void fill_arr(int a[LEN])
{
/* *** */
for (std::size_t i = 0; i < LEN; ++i) {
const int t = 8;
a[i] = t;
}
/* *** */
}
int main(void)
{
int a[LEN];
fill_arr(a);
for (std::size_t i = 0; i < LEN; ++i) {
std::cout << a[i] << " ";
}
std::cout << "\n";
return 0;
}
We can compare this to a version with the following difference:
/* *** */
const int t = 8;
for (std::size_t i = 0; i < LEN; ++i) {
a[i] = t;
}
/* *** */
With optimization disabled, gcc 10.2 puts 8 on the stack on every pass of the loop for the declaration-in-loop version:
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
mov DWORD PTR -12[rbp], 8 ;✷
whereas it only does it once for the out-of-loop version:
mov DWORD PTR -12[rbp], 8 ;✷
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
Does this make a performance impact? I didn't see an appreciable difference in runtime between them with my CPU (Intel i7-7700K) until I pushed the number of iterations into the billions, and even then the average difference was less than 0.01s. It's only a single extra operation in the loop, after all. (For a string, the difference in in-loop operations is obviously a bit greater, but not dramatically so.)
What's more, the question is largely academic, because with an optimization level of -O1 or higher gcc outputs identical asm for both source files, as does clang. So, at least for simple cases like this, it's unlikely to make any performance impact either way. Of course, in a real-world program, you should always profile rather than make assumptions.
Question #1: Is declaring a variable inside a loop a good practice or
bad practice?
As with practically every question like this, it depends. If the declaration is inside a very tight loop and you're compiling without optimizations, say for debugging purposes, it's theoretically possible that moving it outside the loop would improve performance enough to be handy during your debugging efforts. If so, it might be sensible, at least while you're debugging. And although I don't think it's likely to make any difference in an optimized build, if you do observe one, you/your pair/your team can make a judgement call as to whether it's worth it.
At the same time, you have to consider not only how the compiler reads your code, but also how it comes off to humans, yourself included. I think you'll agree that a variable declared in the smallest scope possible is easier to keep track of. If it's outside the loop, it implies that it's needed outside the loop, which is confusing if that's not actually the case. In a big codebase, little confusions like this add up over time and become fatiguing after hours of work, and can lead to silly bugs. That can be much more costly than what you reap from a slight performance improvement, depending on the use case.
Once upon a time (pre C++98); the following would break:
{
for (int i=0; i<.; ++i) {std::string foo;}
for (int i=0; i<.; ++i) {std::string foo;}
}
with the warning that i was already declared (foo was fine as that's scoped within the {}). This is likely the WHY people would first argue it's bad. It stopped being true a long time ago though.
If you STILL have to support such an old compiler (some people are on Borland) then the answer is yes, a case could be made to put the i out the loop, because not doing so makes it makes it "harder" for people to put multiple loops in with the same variable, though honestly the compiler will still fail, which is all you want if there's going to be a problem.
If you no longer have to support such an old compiler, variables should be kept to the smallest scope you can get them so that you not only minimise the memory usage; but also make understanding the project easier. It's a bit like asking why don't you have all your variables global. Same argument applies, but the scopes just change a bit.
It's a very good practice, as all above answer provide very good theoretical aspect of the question let me give a glimpse of code, i was trying to solve DFS over GEEKSFORGEEKS, i encounter the optimization problem......
If you try to solve the code declaring the integer outside the loop will give you Optimization Error..
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
int flag=0;
int top=0;
while(!st.empty()){
top = st.top();
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
Now put integers inside the loop this will give you correct answer...
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
// int flag=0;
// int top=0;
while(!st.empty()){
int top = st.top();
int flag = 0;
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
this completely reflect what sir #justin was saying in 2nd comment....
try this here
https://practice.geeksforgeeks.org/problems/depth-first-traversal-for-a-graph/1. just give it a shot.... you will get it.Hope this help.
Chapter 4.8 Block Structure in K&R's The C Programming Language 2.Ed.:
An automatic variable declared and initialized in a
block is initialized each time the block is entered.
I might have missed seeing the relevant description in the book like:
An automatic variable declared and initialized in a
block is allocated only one time before the block is entered.
But a simple test can prove the assumption held:
#include <stdio.h>
int main(int argc, char *argv[]) {
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
int k;
printf("%p\n", &k);
}
}
return 0;
}
The two snippets below generate the same assembly.
// snippet 1
void test() {
int var;
while(1) var = 4;
}
// snippet 2
void test() {
while(1) int var = 4;
}
output:
test():
push rbp
mov rbp, rsp
.L2:
mov DWORD PTR [rbp-4], 4
jmp .L2
Link: https://godbolt.org/z/36hsM6Pen
So, until profiling opposes or computation extensive constructor is involved, keeping declation close to its usage should be the default approach.

100% of array correct in function, 75% of array correct in CALLING function - C

Note: i'm using the c++ compiler, hence why I can use pass by reference
i have a strange problem, and I don't really know what's going on.
Basically, I have a text file: http://pastebin.com/mCp6K3HB
and I'm reading the contents of the text file in to an array of atoms:
typedef struct{
char * name;
char * symbol;
int atomic_number;
double atomic_weight;
int electrons;
int neutrons;
int protons;
} atom;
this is my type definition of atom.
void set_up_temp(atom (&element_record)[DIM1])
{
char temp_array[826][20];
char temp2[128][20];
int i=0;
int j=0;
int ctr=0;
FILE *f=fopen("atoms.txt","r");
for (i = 0; f && !feof(f) && i < 827; i++ )
{
fgets(temp_array[i],sizeof(temp_array[0]),f);
}
for (j = 0; j < 128; j++)
{
element_record[j].name = temp_array[ctr];
element_record[j].symbol = temp_array[ctr+1];
element_record[j].atomic_number = atol(temp_array[ctr+2]);
element_record[j].atomic_weight = atol(temp_array[ctr+3]);
element_record[j].electrons = atol(temp_array[ctr+4]);
element_record[j].neutrons = atol(temp_array[ctr+5]);
element_record[j].protons = atol(temp_array[ctr+6]);
ctr = ctr + 7;
}
//Close the file to free up memory and prevent leaks
fclose(f);
} //AT THIS POINT THE DATA IS FINE
Here is the function I'm using to read the data. When i debug this function, and let it run right up to the end, I use the debugger to check it's contents, and the array has 100% correct data, that is, all elements are what they should be relative to the text file.
http://i.imgur.com/SEq9w7Q.png This image shows what I'm talking about. On the left, all the elements, 0, up to 127, are perfect.
Then, I go down to the function I'm calling it from.
atom myAtoms[118];
set_up_temp(myAtoms); //AT THIS POINT DATA IS FINE
region current_button_pressed; // NOW IT'S BROKEN
load_font_named("arial", "cour.ttf", 20);
panel p1 = load_panel("atomicpanel.txt");
panel p2 = load_panel("NumberPanel.txt");
As soon as ANYTHING is called, after i call set_up_temp, the elements 103 to 127 of my array turn in to jibberish. As more things get called, EVEN MORE of the array turns to jibberish. This is weird, I don't know what's happening... Does anyone have any idea? Thanks.
for (j = 0; j < 128; j++)
{
element_record[j].name = temp_array[ctr];
You are storing, and then returning, pointers into temp_array, which is on the stack. The moment you return from the function, all of temp_array becomes invalid -- it's undefined behavior to dereference any of those pointers after that point. "Undefined behavior" includes the possibility that you can still read elements 0 through 102 with no trouble, but 103 through 127 turn to gibberish, as you say. You need to allocate space for these strings that will live as long as the atom object. Since as you say you are using C++, the easiest fix is to change both char * members to std::string. (If you don't want to use std::string, the second easiest fix is to use strdup, but then you have to free that memory explicitly.)
This may not be the only bug in this code, but it's probably the one causing your immediate problem.
In case you're curious, the reason the high end of the data is getting corrupted is that on most (but not all) computers, including the one you're using, the stack grows downward, i.e. from high addresses to low. Arrays, however, always index from low addresses to high. So the high end of the memory area that used to be temp_array is the part that's closest to the stack pointer in the caller, and thus most likely to be overwritten by subsequent function calls.
Casual inspection yields this:
char temp_array[826][20];
...
for (i = 0; f && !feof(f) && i < 827; i++ )
Your code potentially allows i to become 826. Which means you're accessing the 827th element of temp_array. Which is one past the end. Oops.
Additionally, you are allocating an array of 118 atoms (atom myAtoms[118];) but you are setting 128 of them inside of set_up_temp in the for (j = 0; j < 128; j++) loop.
The moral of this story: Mind your indices and since you use C++ leverage things like std::vector and std::string and avoid playing with arrays directly.
Update
As Zack pointed out, you're returning pointers to stack-allocated variables which will go away when the set_up_temp function returns. Additionally, the fgets you use doesn't do what you think it does and it's HORRIBLE code to begin with. Please read the documentation for fgets and ask yourself what your code does.
You are allocating an array with space for 118 elements but the code sets 128 of them, thus overwriting whatever happens to live right after the array.
Also as other noted you're storing in the array pointers to data that is temporary to the function (a no-no).
My suggestion is to start by reading a good book about C++ before programming because otherwise you're making your life harder for no reason. C++ is not a language in which you can hope to make serious progress by experimentation.

Infix equation solver c++ while loop stack

Im creating an infix problem solver and it crashes in the final while loop to finish the last part a of the equations.
I call a final while loop in main to solve whats left on the stack and it hangs there and if i pop the last element from the stack it will leave the loop and return the wrong answer.
//
//
//
//
//
#include <iostream>
#include<stack>
#include<string>
#include <ctype.h>
#include <stdlib.h>
#include <stdio.h>
#include <sstream>
using namespace std;
#define size 30
int count=0;
int count2=0;
int total=0;
stack< string > prob;
char equ[size];
char temp[10];
string oper;
string k;
char t[10];
int j=0;
char y;
int solve(int f,int s, char o)
{
cout<<"f="<<f<<endl;
cout<<"s="<<s<<endl;
cout<<"o="<<o<<endl;
int a;
if (o== '*')//checks the operand stack for operator
{
cout << f << "*" << s << endl;
a= f*s;
}
if (o == '/')//checks the operand stack for operator
{
cout << f << "/" << s << endl;
if(s==0)
{
cout<<"Cant divide by 0"<<endl;
}
else
a= f/s;
}
if (o == '+')//checks the operand stack for operator
{
cout << f << "+" << s << endl;
a= f+s;
}
if (o == '-')//checks the operand stack for operator
{
cout << f << "-" << s << endl;
a= f-s;
}
return a;
}
int covnum()
{
int l,c;
k=prob.top();
for(int i=0;k[i]!='\n';i++)t[i]=k[i];
return l=atoi(t);
}
char covchar()
{
k=prob.top();
for(int i=0;k[i]!='\n';i++)t[i]=k[i];
return t[0];
}
void tostring(int a)
{
stringstream out;
out << a;
oper = out.str();
}
void charstack(char op)
{
oper=op;
prob.push(oper);
}
void numstack(char n[])
{
oper=n;
prob.push(oper);
}
void setprob()
{
int f,s;
char o;
char t;
int a;
int i;
t=covchar();
if(ispunct(t))
{
if(t=='(')
{
prob.pop();
}
if(t==')')
{
prob.pop();
}
else if(t=='+'||'-')
{
y=t;
prob.pop();
}
else if(t=='/'||'*')
{
y=t;
prob.pop();
}
}
cout<<"y="<<y<<endl;
i=covnum();
cout<<"i="<<i<<endl;
s=i;
prob.pop();
t=covchar();
cout<<"t="<<t<<endl;
if(ispunct(t))
{
o=t;
prob.pop();
}
i=covnum();
cout<<"i="<<i<<endl;
f=i;
prob.pop();
t=covchar();
if (t=='('||')')
{
prob.pop();
}
a=solve(f,s, o);
tostring(a);
prob.push(oper);
cout<<"A="<<prob.top()<<endl;
}
void postfix()
{
int a=0;
char k;
for(int i=0;equ[i]!='\0';i++)
{
if(isdigit(equ[i]))//checks array for number
{
temp[count]=equ[i];
count++;
}
if(ispunct(equ[i]))//checks array for operator
{
if(count>0)//if the int input is done convert it to a string and push to stack
{
numstack(temp);
count=0;//resets the counter
}
if(equ[i]==')')//if char equals the ')' then set up and solve that bracket
{
setprob();
i++;//pushes i to the next thing in the array
total++;
}
while(equ[i]==')')//if char equals the ')' then set up and solve that bracket
{
i++;
}
if(isdigit(equ[i]))//checks array for number
{
temp[count]=equ[i];
count++;
}
if(ispunct(equ[i]))
{
if(equ[i]==')')//if char equals the ')' then set up and solve that bracket
{
i++;
}
charstack(equ[i]);
}
if(isdigit(equ[i]))//checks array for number
{
temp[count]=equ[i];
count++;
}
}
}
}
int main()
{
int a=0;
char o;
int c=0;
cout<<"Enter Equation: ";
cin>>equ;
postfix();
while(!prob.empty())
{
setprob();
a=covnum();
cout<<a<<" <=="<<endl;
prob.pop();
cout<<prob.top()<<"<top before c"<<endl;
c=covnum();
a=solve(c,a,y);
}
cout<<"Final Awnser"<<a<<endl;
system ("PAUSE");
return 0;
}
Hope this isn't too harsh but it appears like the code is riddled with various problems. I'm not going to attempt to address all of them but, for starters, your immediate crashes deal with accessing aggregates out of bounds.
Example:
for(int i=0;k[i]!='\n';i++)
k is an instance of std::string. std::string isn't null-terminated. It keeps track of the string's length, so you should be do something like this instead:
for(int i=0;i<k.size();i++)
Those are the more simple kind of errors, but I also see some errors in the overall logic. For example, your tokenizer (postfix function) does not handle the case where the last part of the expression is an operand. I'm not sure if that's an allowed condition, but it's something an infix solver should handle (and I recommend renaming this function to something like tokenize as it's really confusing to have a function called 'postfix' for an infix solver).
Most of all, my advice to you is to make some general changes to your approach.
Learn the debugger. Can't stress this enough. You should be testing your code as you're writing it and using the debugger to trace through it and make sure that state variables are correctly set.
Don't use any global variables to solve this problem. It might be tempting to avoid passing things around everywhere, but you're going to make it harder to do #1 and you're also limiting the generality of your solution. That small time you saved by not passing variables is easily going to cost you much more time if you get things wrong. You can also look into making a class which stores some of these things as member variables which you can avoid passing in the class methods, but especially for temporary states like 'equ' which you don't even need after you tokenize it, just pass it into the necessary tokenize function and do away with it.
initialize your variables as soon as you can (ideally when they are first defined). I see a lot of obsolete C-style practices where you're declaring all your variables at the top of a scope. Try to limit the scope in which you use variables, and that'll make your code safer and easier to get correct. It ties in with avoiding globals (#2).
Prefer alternatives to macros when you can, and when you can't, use BIG_UGLY_NAMES for them to distinguish them from everything else. Using #define to create a preprocessor definition for 'size' actually prevents the code above using the string's 'size' method from working. That can and should be a simple integral constant or, better yet, you can simply use std::string for 'equ' (aside from making it not a file scope global).
Prefer standard C++ library headers when you can. <ctype.h> should be <cctype>, <stdlib.h> should be <cstdlib>, and <stdio.h> should be <stdio>. Mixing non-standard headers with .h extension and standard headers in the same compilation unit can cause problems in some compilers and you'll also miss out on some important things like namespace scoping and function overloading.
Finally, take your time with the solution and put some care and love into it. I realize that it's homework and you're under a deadline, but you'll be facing even tougher deadlines in the real world where this kind of coding just won't be acceptable. Name your identifiers properly, format your code legibly, document what your functions do (not just how each line of code works which is something you actually shouldn't be doing so much later as you understand the language better). Some coding TLC will take you a long way. Really think about how to design solutions to a problem (if we're taking a procedural approach, decompose the problem into procedures as general units of work and not a mere chopped up version of your overall logic). #2 will help with this.
** Example: rather than a function named 'postfix' which works with some global input string and manipulates some global stack and partially evaluates the expression, make it accept an input string and return* the individual tokens. Now it's a general function you can reuse anywhere and you also reduced it to a much easier problem to solve and test. Document it and name it that way as well, focusing on the usage and what it accepts and returns. For instance:
// Tokenize an input string. Returns the individual tokens as
// a collection of strings.
std::vector<std::string> tokenize(const std::string& input);
This is purely an example and it may or may not be the best one for this particular problem, but if you take a proper approach to designing procedures, the end result is that you should have built yourself a library of well-tested, reusable code that you can use again and again to make all your future projects that much easier. You'll also make it easier to decompose complex problems into a number of simpler problems to solve which will make everything easier and the whole coding and testing process much smoother.
I see a number of things which all likely contribute to the issue of it not working:
There are no error or bounds checking. I realize that this is homework and as such may have specific requirements/specifications which eliminate the need for some checks, but you still need some to ensure you are correctly parsing the input. What if you exceed the array size of equ/tmp/t? What if your stack is empty when you try to pop/top it?
There are a few if statements that look like else if (t == '+' || '-') which most likely doesn't do what you want them to. This expression is actually always true since '-' is non-zero and is converted to a true value. You probably want else if (t == '+' || t == '-').
As far as I can tell you seem to skip parsing or adding '(' to the stack which should make it impossible for you to actually evaluate the expression properly.
You have a while loop in the middle of postfix() which skips multiple ')' but doesn't do anything.
Your code is very hard to follow. Properly naming variables and functions and eliminating most of the globals (you don't actually need most of them) would help a great deal as would proper indentation and add a few spaces in expressions.
There are other minor issues not particularily worth mentioning. For example the covchar() and covnum() functions are much more complex than needed.
I've written a few postfix parsers over the years and I can't really follow what you are trying to do, which isn't to say the way you're trying is impossible but I would suggest re-examining at the base logic needed to parse the expression, particularly nested levels of brackets.