This simple block of code is behaving in an unexpected way.
#include <iostream>
using namespace std;
class Node
{
public:
char* data;
Node(char d)
{
data = &d;
}
};
int main()
{
Node NodeA = Node('c');
cout<<*(NodeA.data)<<endl;
return 0;
}
I was expecting to get 'c' as the output, but instead it outputs '}'.
I had the feeling that it must be related to assigning the "data" pointer to an anonymous variable which is the 'c'.
I found this question discussing a similar issue.
But as was mentioned in the top answer, the anonymous variable is only killed at the end of the expression if no pointer or reference has been bound to it by then. I believe that is not the case here, since I am binding the pointer ("data") to it, but somehow it still gets killed.
I want to know what is going on here that causes the unexpected output.
In your class:
Node(char d)
{
data = &d;
}
char d is a parameter of the constructor Node. The problem is that d lives only in the constructor's local scope on the program stack. It ceases to exist when the code returns from the constructor.
data now holds an address pointing somewhere in the program stack. If you try to read through it, you could read something else that was pushed on the stack later. If you write to this address you'll overwrite some other variable in your program. It could crash or just do something unexpected.
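One possible fix, sketched under the assumption that Node only needs to hold a single character: store the char by value instead of storing the address of the constructor parameter. The names NodeByValue and readData are hypothetical and exist only for this illustration.

```cpp
// Hypothetical variant of Node: it owns a copy of the character
// instead of pointing at the (short-lived) constructor parameter.
class NodeByValue
{
public:
    char data;  // stored by value, so it lives as long as the object does

    explicit NodeByValue(char d) : data(d) {}
};

// Illustration-only helper: reads the stored character back.
char readData(const NodeByValue& node)
{
    return node.data;
}
```

With this layout nothing refers to d after the constructor returns, so reading data from a NodeByValue object is well defined for the whole lifetime of that object.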
I am currently working on creating a small compiler using C++. I have defined the following objects:
struct ValueNode
{
std::string name;
int value;
};
struct StatementNode
{
StatementType type;
union
{
struct AssignmentStatement * assign_stmt;
struct PrintStatement * print_stmt;
struct IfStatement * if_stmt;
struct GotoStatement * goto_stmt;
};
struct StatementNode * next; // next statement in the list or NULL
};
I have defined a series of functions relating to different types of statements in the language. One of these functions is called parse_assign_stmt(). The segmentation fault I am experiencing happens in this function, immediately after attempting to assign a value to recently allocated memory. Here is that function:
struct StatementNode* parse_assign_stmt() {
//Object to be returned. Holds an object representing a statement
//made within the input program.
struct StatementNode* st = (struct StatementNode*)malloc(sizeof(struct StatementNode));
st->type = ASSIGN_STMT;
//First token should be an ID. Represents memory location we are assigning to.
Token tok = lexer->GetToken();
if(tok.token_type == ID) {
//Second token in an assignment should be an equal sign
Token tok2 = lexer->GetToken();
if (tok2.token_type == EQUAL) {
//This function reads the next token, makes sure it is of type NUM or ID, then creates and returns a ValueNode containing the relevant value.
struct ValueNode* rhs1 = parse_primary();
Token tok3 = lexer->GetToken();
//Assignment format for this logical branch: "x = 5;"
if(tok3.token_type == SEMICOLON) {
//first type
//Allocate memory for objects needed to build StatementNode st
struct AssignmentStatement* assign_stmt = (struct AssignmentStatement*)malloc(sizeof(struct AssignmentStatement));
struct ValueNode* lhs = (struct ValueNode*)malloc( sizeof(struct ValueNode));
printf("Name: %s, Value: %d\n", lhs->name.c_str(), lhs->value);
//PROBLEM ARISES HERE***
//lhs->name = tok.lexeme;
//return the proper structure
return st;
}
else if(tok3.token_type == PLUS || tok3.token_type == MINUS || tok3.token_type == DIV || tok3.token_type == MULT) {
//second type
//TODO
}
else {
printf("Syntax error. Semicolon or operator expected after first primary on RHS of assignment.");
exit(1);
}
}
else {
//not of proper form
printf("Syntax error. EQUAL expected after LHS of assignment.");
exit(1);
}
}
else {
//Not of proper form. Syntax error
printf("Syntax error. ID expected at beginning of assignment.");
exit(1);
}
}
Essentially, I'm allocating memory for a new ValueNode to create the variable lhs. I am printing out the name and value fields immediately to ensure that there isn't anything present. In my compiler output (I'm using g++, by the way), it's telling me that the name is (null) and the value is 0, which is expected. As soon as I uncomment the line
lhs->name = tok.lexeme;
I get a segmentation fault. At this point, I have no idea what could be going wrong. I'm creating the variable, using malloc to allocate memory to the location, making sure that there isn't anything stored there, and then immediately trying to write a value. And it always gives me a segmentation fault.
Here is the input program (.txt file) that is being fed to the program through stdin.
i;
{
i = 42 ;
print i;
}
I have tried using calloc() instead, since that should make sure that the memory is cleared before returning the pointer, but that didn't change anything. Any suggestions would be wonderful. Thank you!
If the problem arises in the line:
lhs->name = tok.lexeme;
then I'd warrant the problem lies with either lhs or tok.lexeme.
Since, prior to that, you appear to have confirmed that lhs is okay with:
printf("Name: %s, Value: %d\n", lhs->name.c_str(), lhs->value);
then the chances that it's an issue with the token structure skyrocket.
However, we shouldn't need to surmise, you should be able to load up the code into a good debugger (or even gdb, in a pinch(a)), set a breakpoint at the offending line, and actually look at the variables to see if they are as expected. Or, at a bare minimum, print out more stuff before trying to use it.
Aside: It's always been a bugbear of mine that the first course taught at university isn't Debugging 101. That's the first thing I taught my son once he started doing Python development.
(a) Pax ducks for cover :-)
After further investigation, I found that when allocating memory (using malloc()) for my ValueNode objects (and for some reason, only these ones), malloc() was returning a pointer to inaccessible memory. The error I received in gdb when trying to print my ValueNode structure was:
{name = <'error reading variable: Cannot access memory at address 0xfffffffe8>, value = 42}
Unfortunately, I was not able to find a way to allocate the memory for this object using malloc(). A workaround that I managed to make happen, however, was to create a constructor within the structure definition of ValueNode, then use "new" to create the object and allocate the memory for me, rather than trying to force malloc() to work. In retrospect, I probably should have used this simpler approach over malloc() to begin with.
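A likely explanation, which the post itself does not confirm: malloc() only reserves raw bytes and never runs constructors, so the std::string member of ValueNode is never constructed, and assigning to it is undefined behaviour. The sketch below shows the constructor-plus-new approach described above. The helper makeValueNode and the use of std::unique_ptr are assumptions added for illustration; they are not part of the original code.

```cpp
#include <memory>
#include <string>

struct ValueNode
{
    std::string name;
    int value;

    // With a constructor, `new` builds a valid std::string before any assignment.
    ValueNode(const std::string& n, int v) : name(n), value(v) {}
};

// Hypothetical helper standing in for the allocation inside parse_assign_stmt().
// std::unique_ptr is used here only so that the sketch does not leak.
std::unique_ptr<ValueNode> makeValueNode(const std::string& lexeme, int value)
{
    return std::unique_ptr<ValueNode>(new ValueNode(lexeme, value));
}
```

The same reasoning would apply to any struct allocated with malloc() that contains non-trivial members such as std::string.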
I am trying to create a tree structure using some handler functions that are called while reading a stream. I think the problem is that my variables are created in the function's scope and disappear when the function ends, leaving pointers that point to nothing.
I am not sure what approach to take to keep the objects in memory, whilst still allowing the tree to be scalable.
I have made a simplified version of the code: it compiles and runs but the parent-child relationships of the 'Segment' objects are all wrong.
class Segment
{
public:
Segment* parent;
list<Segment*> children;
string name;
};
void OpenSegment(Segment* p_segCurrentseg);
void CloseSegment(Segment* p_segCurrentseg);
int _tmain(int argc, _TCHAR* argv[])
{
Segment parent;
parent.name="parent";
Segment* p_segCurrentseg=&parent;
OpenSegment(p_segCurrentseg);
OpenSegment(p_segCurrentseg);
OpenSegment(p_segCurrentseg);
CloseSegment(p_segCurrentseg);
return 0;
}
void OpenSegment(Segment* p_segCurrentseg)
{
Segment child;
child.name="child";
p_segCurrentseg->children.push_front(&child);
child.parent=p_segCurrentseg;
p_segCurrentseg=&child;
}
void CloseSegment(Segment* p_segCurrentseg)
{
p_segCurrentseg=p_segCurrentseg->parent;
}
There are a couple of problems in your code.
You are passing p_segCurrentseg by value and assigning to another pointer. This has no effect on the variable in the calling function.
As you already suspected, you are trying to assign p_segCurrentseg to point to a variable that will be gone when you return from the function.
What you can do:
Pass p_segCurrentseg by reference to a pointer.
Create an object from the heap and assign p_segCurrentseg to point to it.
Here's my suggestion for OpenSegment:
void OpenSegment(Segment*& p_segCurrentseg)
{
Segment* child = new Segment;
child->name="child";
p_segCurrentseg->children.push_front(child);
child->parent=p_segCurrentseg;
p_segCurrentseg=child;
}
The problem is in the OpenSegment() method, particularly in these 3 lines:
Segment child;
child.name="child";
p_segCurrentseg->children.push_front(&child);
First, child is a local variable created on the stack. You then push the address of child into your list. When OpenSegment() returns, the address of child contains garbage, since the storage for local variables is deallocated.
The solution is to define child as a pointer to Segment, create it on the heap so it lives even after OpenSegment() returns. You have to make sure to deallocate its memory too. The proper place is to define a destructor for your Segment class. In it, iterate through the list (of children segments) and deallocate the memory for each child.
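The destructor described above could look roughly like the following sketch. It assumes that every child was created with new and is owned by exactly one parent; the deleted copy operations and the default member initializer are additions for safety, not part of the original class.

```cpp
#include <list>
#include <string>

class Segment
{
public:
    Segment* parent = nullptr;
    std::list<Segment*> children;
    std::string name;

    Segment() = default;
    // Copying would make two segments delete the same children, so forbid it.
    Segment(const Segment&) = delete;
    Segment& operator=(const Segment&) = delete;

    // Each segment owns its children: destroying it destroys the whole subtree.
    ~Segment()
    {
        for (Segment* child : children)
            delete child;
    }
};

// Creates the child on the heap and moves the caller's "current" pointer to it.
void OpenSegment(Segment*& current)
{
    Segment* child = new Segment;
    child->name = "child";
    child->parent = current;
    current->children.push_front(child);
    current = child;
}

// Moves the caller's "current" pointer back up to the parent.
void CloseSegment(Segment*& current)
{
    current = current->parent;
}
```

With this design the root segment can still live on the stack in main: when it goes out of scope, its destructor releases every heap-allocated descendant.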
Does anybody have any idea why this code prints a and not b?
I tested it and found that the value of mainArea.root->rightBro changes when I cout something, but why?
#include<iostream>
using namespace std;
struct triangle{
triangle *rightBro;
};
struct area{
triangle *root;
} mainArea;
void initialize(){
triangle root;
mainArea.root = &root;
}
int main()
{
initialize();
mainArea.root->rightBro = NULL ;
if (mainArea.root->rightBro == NULL) cout << "a" << endl;
if (mainArea.root->rightBro == NULL) cout << "b" << endl;
return 0;
}
You are storing a pointer to a local variable from within initialize. After the function returns that memory address is no longer valid to access through the pointer -- your program invokes undefined behavior (UB) when it dereferences mainArea.root inside main.
By definition, when UB is invoked anything can happen. What you see is some version of anything.
For practical programming purposes, please stop reading here. If you are curious why you are getting specifically this type of behavior, here's an explanation:
What happens in practice is that mainArea.root is left pointing to an "unused" address on the stack just after the stack frame for main. When you invoke operator<< a new stack frame is allocated, which overlaps the memory pointed to by mainArea.root. operator<<'s (stack-allocated) local variables overwrite the contents of that memory, which from the viewpoint of main results in seeing modified values.
This:
void initialize(){
triangle root;
mainArea.root = &root;
}
Is causing undefined behavior.
The variable triangle root; only lasts as long as the function is being executed. Once the function returns it no longer exists. Thus mainArea.root points at memory that can be re-used for anything.
Thus any use of mainArea.root after the function exits is undefined behavior. Meaning the application can do anything.
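One way to remove the undefined behaviour, sketched under the assumption that mainArea is meant to own its root triangle: allocate the triangle on the heap so that it outlives initialize(). Changing the member to std::unique_ptr and adding the default initializer for rightBro are choices made for this sketch, not something the question specifies.

```cpp
#include <memory>

struct triangle {
    triangle* rightBro = nullptr;
};

struct area {
    std::unique_ptr<triangle> root;  // the area owns its root triangle
} mainArea;

void initialize()
{
    // Heap object: it survives the return from initialize().
    mainArea.root.reset(new triangle);
}
```

After initialize() returns, mainArea.root still points at a live object, so both comparisons against NULL in main read a well-defined value.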
Is there anything worrying about code like this (getIDs() returns a pointer):
class Worker{
private:
int workerID;
int departID;
int supervisorID;
public:
Worker()
{
workerID=0;
departID=0;
supervisorID=0;
name="anonymous";
workerAddress="none";
}
void setIDs(int worker, int depart, int supervisor)
{
workerID=worker;
departID=depart;
supervisorID=supervisor;
}
int* getIDs()
{
int id[3];
id[0]=workerID;
id[1]=departID;
id[2]=supervisorID;
return id;
}
};
And then use it like this:
Worker obj;
obj.setIDs(11,22,33);
cout<<(*obj.getIDs())<<endl;
cout<<++(*obj.getIDs())<<endl;
cout<<++(++(*obj.getIDs()))<<endl;
I am wondering about that because the compiler shows:
Warning 1 warning C4172: returning address of local variable or temporary
Your int id[3] is allocated on the stack and gets destroyed when your int* getIDs() returns.
You're returning a pointer to a variable that gets destroyed immediately after getIDs() returns. The pointer then becomes dangling and is practically useless, as doing anything with it is undefined behaviour.
Suppose you defined your class like this:
class Worker{
private:
int IDs[3];
public:
// ...
int* getIDs() { return IDs; }
};
This partially solves your problem, as the pointer remains valid as long as the Worker object is in scope, but it's still bad practice. Example:
int* ptr;
while (true) {
Worker obj;
obj.setIDs(11,22,33);
ptr = obj.getIDs();
cout << *ptr; // ok, obj is still alive.
break;
} // obj gets destroyed here
cout << *ptr; // NOT ok, dereferencing a dangling pointer
A better way of solving this is to implement your custom operator << for your class. Something like this:
class Worker {
private:
int workerID;
int departID;
int supervisorID;
public:
// ...
friend ostream& operator<<(ostream& out, const Worker& w);
};
ostream& operator<<(ostream& out, const Worker& w)
{
out << w.workerID << "\n" << w.departID << "\n" << w.supervisorID;
return out;
}
Even if this worked, it wouldn't be good practice to do it this way in C++ unless there is some profound reason why you want pointers to int. Raw C-style arrays are more difficult to handle than, for instance, std::vectors, so use those, like
std::vector<int> getIDs(){
std::vector<int> id(3);
id[0]=workerID; id[1]=departID; id[2]=supervisorID;
return id;
}
If you're worried about the overhead: this is likely to be optimized away completely by modern compilers.
A local (also called automatic) variable is destroyed once you leave the function where it is defined. So your pointer will point to this destroyed location, and of course referencing such a location outside the function is incorrect and will cause undefined behaviour.
The basic problem here is that when you enter a function call, you get a new frame on your stack (where all your local variables will be kept). Anything that is not dynamically allocated (using new/malloc) in your function will exist in that stack frame, and it gets destroyed when your function returns.
Your function returns a pointer to the start of your 3-element-array which you declared in that stack frame that will go away. So, this is undefined behavior.
While you may get "lucky/unlucky" and still have your data around where the pointer points when you use it, you may also have the opposite happen with this code. Since the space is given up when the stack frame is destroyed, it can be reused, so another part of your code could well use the memory location where the three elements of that array are stored, which would mean they have completely different values by the time you dereference that pointer.
If you're lucky, your program would just seg-fault/crash so you knew you made a mistake.
Redesign your function to return a structure of 3 ints, a vector, or at the very least (and I don't recommend this), dynamically allocate the array contents with new so it persists after the function call (but you better delete it later or the gremlins will come and get you...).
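As one reading of "return a structure of 3 ints", the sketch below returns a std::array by value. The choice of std::array and the default member initializers are assumptions made for this illustration; a small named struct would work the same way.

```cpp
#include <array>

class Worker
{
private:
    int workerID = 0;
    int departID = 0;
    int supervisorID = 0;

public:
    void setIDs(int worker, int depart, int supervisor)
    {
        workerID = worker;
        departID = depart;
        supervisorID = supervisor;
    }

    // Returns a copy of the three IDs: no pointer into the stack or the object.
    std::array<int, 3> getIDs() const
    {
        return {{workerID, departID, supervisorID}};
    }
};
```

Because the caller receives its own copy, the result stays valid even after the Worker object is destroyed, and there is nothing to delete.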
Edit: My apologies, I completely misread the question. Shouldn't be answering StackOverflow before my coffee.
When you want to return an array, or a pointer rather, there are two routes.
One route: new
int* n = new int[3];
n[0] = 0;
// etc..
return n;
Since n is now a heap object, it is up to YOU to release it later with delete[]; if you don't, it will eventually cause memory leaks.
Now, route two is a somewhat easier method I find, but it's kind of riskier. It is where you pass an array in and copy the values in.
void copyIDs(int arr[3] /* or int* arr */)
{
arr[0] = workerID;
/* etc */
}
Now your array is populated, and there was no heap allocation, so no problem.
Edit: Returning a local variable as an address is bad. Why?
Given the function:
int* foo() {
int x = 5;
return &x; // Returns the address (in memory) of x
} // At this point, however, x is popped off the stack, so its address is undefined
// (Garbage)
// So here's our code calling it
int *x = foo(); // points to the garbage memory, might still contain the values we need
// But what if I go ahead and do this?
int bar[100]; // Pushed onto the stack
bool flag = true; // Pushed onto the stack
std::cout << *x << '\n'; // Is this guaranteed to be the value we expect?
Overall, it is too risky. Don't do it.
Hi, can someone tell me why the same problem occurs on both Linux and Windows:
#include <iostream>
using namespace std;
class A
{
private:
int _dmember;
public:
void func()
{
cout<<"Inside A!! "<<endl;
cout<<_dmember; // crash when reach here.
}
};
int main ()
{
A* a= NULL;
a->func(); // prints "Inside A!!!"
return 1;
}
Can someone tell me why this weird behavior occurs? I mean,
a->func() was not supposed to get inside func() at all...?
This is unwanted behavior;
why does it occur?
EDIT: Of course, A* a = NULL was intentional!! So for all who answered "this is undefined behavior" or "you should never try to call a function on a NULL pointer!!", come on... that was the point. And this behavior was explained correctly by some of you.
This is undefined behaviour. You must never call functions on a null pointer.
With that out of the way, let's answer the question I think you're asking: why do we get partway into the function anyway?
When you are invoking UB, the compiler is free to do anything, so it's allowed to emit code that works anyway. That's what happens on some (many?) systems in this particular case.
The reason that you're able to call the function on a null pointer successfully is that your compilers don't store the function "in" the object. Rather, the above code is interpreted somewhat like this:
class A {
int _dmember;
};
void A::func(A *this) {
cout << "Inside A!!" << endl;
cout << this->_dmember << endl;
}
int main() {
A *a = ...;
A::func(a);
}
So, you see there is nothing that actually prevents you from calling a function on a null pointer; it'll just invoke the body of the function, with the this pointer set to null. But as soon as the function tries to dereference the this pointer by accessing a field inside the class, the operating system steps in and kills your program for illegal memory access (called segmentation fault on Linux, access violation on Windows).
Nitpicker's corner: Virtual functions are a different story.
Undefined behavior because you are accessing a NULL pointer:
A* a= NULL;
a->func(); // is not defined by the language
Note that even if func() didn't try to access a member variable, the behavior still is undefined. For example, the following code could run without errors, but it is not correct:
void func()
{
cout<<"Inside A!! "<<endl;
}
EDIT: With my full respect, C++ doesn't suck!
What you need is a smart pointer, not a raw pointer. As my professor always says, if you don't know what you are doing in C/C++, it is better not to do it!
Use boost::scoped_ptr, and enjoy exception safety, automatic memory management, zero overhead and NULL checking:
struct test
{
int var;
void fun()
{
std::cout << var;
}
};
int main()
{
boost::scoped_ptr<test> p(NULL);
p->fun(); // Assertion will fail, Happy debugging :)
}
Dereferencing a null pointer is undefined behaviour.
Everything could happen, so don't do it.
You must check that the pointer is valid before dereferencing it. A this pointer can never legitimately be null, so a check inside the member function would not avoid the undefined behaviour.
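A minimal sketch of checking at the call site, before the member function is ever invoked. The helper tryFunc is hypothetical, and func() is changed here to return the member instead of printing it so that the example stays small.

```cpp
class A
{
    int _dmember = 0;

public:
    int func() const { return _dmember; }
};

// Hypothetical helper: calls func() only when the pointer is non-null.
// Returns false and leaves `out` untouched when given a null pointer.
bool tryFunc(const A* a, int& out)
{
    if (a == nullptr)
        return false;
    out = a->func();
    return true;
}
```

The check lives outside the class on purpose: by the time execution is inside func(), the call through a null pointer has already happened.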
Most compilers just pass the pointer to the object as the first parameter (the this pointer). If you don't go on to dereference the this pointer then you are not actually going to cause a crash. Your this pointer, inside the function, will simply be NULL.
As AraK pointed out, this is undefined behaviour, so your mileage may vary...
Aren't you supposed to allocate memory for your pointer? I just wonder what the intention is in calling a function on a NULL pointer. It's supposed to crash immediately. It doesn't crash on the line that doesn't touch A's member _dmember, but the moment you access it your function crashes because the object is simply not allocated. _dmember refers to undefined memory... That's why it crashes.
It's a null pointer; you simply can't define what should happen if we call a function on it.
Any pointer variable is supposed to point to some object.
Your declaration A * a = NULL;
does not point anywhere and so will not yield the results as it should.
You can however try this
A * a = NULL;
A b;
a=&b;
a->func();
this will yield the output.
Since there are no virtual functions in your class, it's easier here to think about what C code would be generated to represent this type. Approximately:
#include <stdio.h>
typedef struct
{
int d_;
} A;
FILE* print_a_to(A* a, FILE* dest)
{
return fprintf(dest, "Inside A!! \n") < 0 ||
fprintf(dest, "%d", a->d_) < 0 ?
NULL :
dest;
}
int main(int argc, char* argv[])
{
A* a = NULL;
return NULL == print_a_to(a, stdout) ?
-1 :
0;
}
Look at the second line of function print_a_to; see the dereferencing of pointer a? Per the first line of function main, you're passing NULL as the value of pointer a. Dereferencing a null pointer here is equivalent to calling a member function on your class that needs access to its member variables through a null pointer.
If I wasn't clear:
I am not deliberately trying to do the following:
A* a=NULL;
a->f();
I wrote that code just to check why it works, and of course I was disappointed. The reason is that I am debugging a very big program on Red Hat Linux through log files (meaning: printing entry to and exit from functions to logs, including important values).
On my way through the logs I hoped that, when I am at a specific stack of function calls, at least the instances executing those functions are alive. As I discovered, that is not necessarily so, which is disappointing because it makes debugging through log files even more difficult.
I hope you described the symptoms exactly as you saw them. I tried on both Windows and Linux. Linux gives a segmentation fault, and Windows displays the error dialog.
The address range around 0x0 is protected by Windows and Linux. Reading or writing in this memory area causes the OS to raise an exception. Your application can catch the exception. Most applications do not, and the OS default handling routine is to print some error message and terminate the program.
One may ask why the message "Inside A!! " is printed before termination. The answer is that, behind the scenes, the C++ compiler converts class methods into procedure calls. This step does not involve a pointer dereference. You can think of the result as looking like this:
void A_func(A* a)
{
cout<<"Inside A!! "<<endl;
cout<<a->_dmember; // crash when reach here.
}
A* a = NULL;
A_func(a);
The dereference of the NULL pointer happens in the second statement, so the first statement executes just fine.
The point is that, in practice, the -> operator on a class object (with no vtable) does not access memory through the pointer when it calls a member function.
a->foo()
is really shorthand for
A::foo(a)
where the first param gets transformed into the this pointer. It's when you try to dereference 'this' by referring to a member variable that things go bad.