There are two possible explanations for the problem: either I don't understand the C++ semantics or g++ doesn't.
I am writing a simple network game and have been building a library the game uses to communicate over the network. One class handles the connection between the applications. Another class implements the server functionality, so it has a method accept(), which is supposed to return a Connection object.
There are a few ways to return it. I have tried these three:
// 1. return by value
Connection accept() {
    ...
    return Connection(...);
}

// 2. return a pointer to a heap-allocated object
Connection* accept() {
    ...
    return new Connection(...);
}

// 3. return a reference to a heap-allocated object
Connection& accept() {
    ...
    Connection *temp = new Connection(...);
    return *temp;
}
All three were accepted by g++. The problem is that the third is somewhat faulty: when you use the internal state of the returned Connection object, things fail. I don't know what is wrong, because all the fields of the object look initialized. My problem is that whenever I use any function from the protocol buffers library, my program is terminated by a segmentation fault. The function below fails every time it calls into the protobuf library.
Annoucement Connection::receive() throw(EmptySocket) {
if(raw_input->GetErrno() != 0) throw EmptySocket();
CodedInputStream coded_input(raw_input);
google::protobuf::uint32 n;
coded_input.ReadVarint32(&n);
char *b;
int m;
coded_input.GetDirectBufferPointer((const void**)&b, &m);
Annoucement ann;
ann.ParseFromArray(b, n);
coded_input.Skip(n);
return ann;
}
I get this every time:
Program received signal SIGSEGV,
Segmentation fault. 0x08062106 in
google::protobuf::io::FileInputStream::CopyingFileInputStream::GetErrno
(this=0x20) at
/usr/include/google/protobuf/io/zero_copy_stream_impl.h:104
When I changed accept() to the second version, it finally worked (the first version is fine too, but I changed the design in the meantime).
Have you come across a problem similar to this one? Why is the third version of accept() wrong? How should I debug the program to find such a nasty bug? (I thought protobuf needed a fix, whereas the problem was not there at all.)
First, returning something allocated on the heap by reference is a sure recipe for a memory leak, so I would never suggest actually doing that.
The second case can still result in a leak unless the ownership semantics are very well specified. Have you considered using a smart pointer instead of a raw pointer?
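For example, here is a minimal sketch of how accept() could hand out ownership explicitly with std::unique_ptr (the Connection and Server bodies below are reduced stand-ins for illustration, not the library's real classes, and C++11 is assumed):

#include <memory>

// Reduced stand-ins for illustration only; the real classes live in the library.
struct Connection {
    explicit Connection(int fd) : fd(fd) {}
    int fd;
};

struct Server {
    // The caller receives sole ownership; the Connection is destroyed
    // automatically when the unique_ptr goes out of scope, so nothing leaks.
    std::unique_ptr<Connection> accept() {
        int fd = 42; // placeholder for the real accept(2) call
        return std::unique_ptr<Connection>(new Connection(fd));
    }
};

int main() {
    std::unique_ptr<Connection> conn = Server().accept(); // no manual delete needed
}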
As for why it doesn't work, it probably has to do with ownership semantics and not because you're returning by reference, but I can't see a problem in the posted code.
"How should I debug the program to find such a horrible bug?"
If you are on Linux try running under valgrind - that should pick up any memory scribbling going on.
You overlooked this=0x20 in the debugger output, which is obviously an invalid pointer: it tells you that raw_input is already bad by the time GetErrno() is called. That is the helpful part of the message you got after the segfault.
For general problems of this type, learn to use Valgrind’s memcheck, which gives you messages about where your program abused memory.
Meanwhile I suggest you make sure you understand pass by value vs pass by reference (both pointer and C++ reference) and know when constructors, copy constructors and destructors are called.
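If it helps, a tiny tracing class (a toy example, not taken from the question) makes it visible which constructors and destructors run for each return style:

#include <iostream>

// Toy class that reports every special member function call.
struct Tracer {
    Tracer()              { std::cout << "construct\n"; }
    Tracer(const Tracer&) { std::cout << "copy\n"; }
    ~Tracer()             { std::cout << "destroy\n"; }
};

Tracer byValue()      { return Tracer(); }        // copy may be elided
Tracer& byReference() { return *(new Tracer()); } // heap object nobody deletes

int main() {
    { Tracer t = byValue(); }               // construct (maybe copy), then destroy
    { Tracer& r = byReference(); (void)r; } // construct only: never destroyed, i.e. a leak
}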
In a Java application, I use JNI to call several C++ methods. One of the methods creates an object that has to persist after the method has finished and that is used in other method calls. To this end, I create the object with new and return the pointer to Java as a handle for later access (note: the Java class implements Closeable, and in its close method I call a method that deletes the object).
However, in rare cases, after approximately 50,000 calls, the C++ code throws a segmentation fault. Based on the contents of the log file, only a few lines of code are suspects (they lie between the last printed log message and the next one):
MyObject* handle = new MyObject(some_vector, shared_ptr1, shared_ptr2);
handles.insert(handle); // handles is a std::set
jlong handleId = (jlong) handle;
I'd like to know whether there is a possible issue here apart from the fact that I'm using old-style C pointers. Could multi-threading be a problem? Or could the pointer ID be truncated when converted to jlong?
I also want to note that, from previous experience, I'm aware the log is only a rough indicator of where a segmentation fault occurred. It may well have occurred later in the code, with the next log message simply not printed yet. However, reproducing this error takes 1-2 days, so I'd like to check whether these lines have a problem.
After removing the std::set from the code, the error did not occur anymore. Conclusion: std::set is not thread-safe, so concurrent access from multiple threads must be synchronized to avoid unrecoverable crashes.
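If the set really does need to be touched from several threads, the usual fix is a mutex around every access; here is a minimal sketch (the class and member names are illustrative, not from the original code):

#include <mutex>
#include <set>

struct MyObject; // the real class lives elsewhere; only pointers are stored here

class HandleRegistry {
public:
    void insert(MyObject* handle) {
        std::lock_guard<std::mutex> lock(mutex_); // serializes all access to the set
        handles_.insert(handle);
    }
    void erase(MyObject* handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        handles_.erase(handle);
    }
private:
    std::mutex mutex_;            // guards handles_ against concurrent access
    std::set<MyObject*> handles_;
};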
EDIT: I never figured this out. I refactored the code to be pretty much identical to a Boost sample and still had the problem. If anyone else has this problem, yours may be the more common case of shared_from_this() being called when no shared_ptr exists yet (or in the constructor). Otherwise, I recommend just rebuilding from the Boost asio samples.
I'm trying to do something that I think is pretty common, but I am having some issues.
I'm using Boost asio and trying to create a TCP server. I accept connections with async_accept and create shared pointers. I have a long-lived object (like a connection manager) that inserts the shared_ptr into a set. Here is a snippet:
std::shared_ptr<WebsocketClient> ptr = std::make_shared<WebsocketClient>(std::move(s));
directory.addPending(ptr);
ptr->onConnect(std::bind(&Directory::addClient, &directory, std::placeholders::_1));
ptr->onDisconnect(std::bind(&Directory::removeClient, &directory, std::placeholders::_1));
ptr->onMessage(std::bind(&Directory::onMessage, &directory, std::placeholders::_1, std::placeholders::_2));
ptr->start();
The Directory has std::set<std::shared_ptr<WebsocketClient>> pendingClients;
The function for adding a client is:
void Directory::addPending(std::shared_ptr<WebsocketClient> ptr){
std::cout << "Added pending client: " << ptr->getName() << std::endl;
pendingClients.insert(ptr);
}
Now, when the WebsocketClient starts, it tries to create a shared_ptr using shared_from_this(), then initiates an async_read_until("\r\n\r\n") and passes that shared_ptr to the lambda to keep ownership. It crashes on shared_from_this(), before actually invoking the asio function.
Call stack looks like this:
server.exe!WebsocketClient::start()
server.exe!Server::acceptConnection::__l2::<lambda>(boost::system::error_code ec)
server.exe!boost::asio::asio_handler_invoke<boost::asio::detail::binder1<void <lambda>(boost::system::error_code),boost::system::error_code> >(boost::asio::detail::binder1<void <lambda>(boost::system::error_code),boost::system::error_code> & function, ...)
server.exe!boost::asio::detail::win_iocp_socket_accept_op<boost::asio::basic_socket<boost::asio::ip::tcp,boost::asio::stream_socket_service<boost::asio::ip::tcp> >,boost::asio::ip::tcp,void <lambda>(boost::system::error_code) ::do_complete(boost::asio::detail::win_iocp_io_service * owner, boost::asio::detail::win_iocp_operation * base, const boost::system::error_code & result_ec, unsigned __int64 __formal) Line 142 C++
server.exe!boost::asio::detail::win_iocp_io_service::do_one(bool ec, boost::system::error_code &)
server.exe!boost::asio::detail::win_iocp_io_service::run(boost::system::error_code & ec)
server.exe!Server::run()
server.exe!main(int argc, char * * argv)
However, I get a bad_weak_ptr when I call shared_from_this. I thought that was thrown when no shared_ptr owned this object, but when I call the addPending, I insert "ptr" into a set, so there should still be a reference to it.
Any ideas? If you need more details please ask, and I'll provide them. This is my first post on StackOverflow, so let me know what I can improve.
You could be dealing with memory corruption. Whether that's the case or not, there are some troubleshooting steps you should definitely take:
Log the pointer value returned from make_shared, and again inside the member function just before calling shared_from_this. Check whether that pointer value exists in your running object table (which is effectively what that set<shared_ptr<...>> is)
Instrument constructor and destructor. If the shared_ptr count does actually hit zero, it'll call your destructor and the call stack will give you information on the problem.
If that doesn't help, the fact that you're using make_shared should be useful, because it guarantees that the metadata block is right next to the object.
Use memcpy to dump the raw bytes preceding your object at various times and watch for potential corruption.
Much of this logging will happen in a context that's exhibiting undefined behavior. If the compiler figures out that you're testing for something that's not supposed to be possible, it might actually remove the test. In that case, you can usually make the tests work anyway by precise use of #pragma to disable optimization just on your debug logging code -- you don't want to change optimization settings on the rest of the code, because that might change the way corruption manifests without actually fixing it.
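As a concrete illustration of the memcpy suggestion above (a sketch only; the helper name is made up), something like this can be called before and after suspicious operations and the outputs compared:

#include <cstdio>
#include <cstring>

// Hex-dump the 'count' bytes immediately before 'obj'. Reading outside the
// allocation is undefined behaviour; this is a last-resort diagnostic, subject
// to the optimizer caveat described above.
void dumpPrecedingBytes(const void* obj, std::size_t count) {
    unsigned char buf[64];
    if (count > sizeof(buf)) count = sizeof(buf);
    std::memcpy(buf, static_cast<const unsigned char*>(obj) - count, count);
    for (std::size_t i = 0; i < count; ++i)
        std::printf("%02x ", buf[i]);
    std::printf("\n");
}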
It is difficult to determine the cause of the problem without the code.
But which enable_shared_from_this do you use, Boost or std?
I see you use std::make_shared, so if WebsocketClient inherits from boost::enable_shared_from_this, that mismatch can cause the crash.
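For reference, here is a minimal sketch of the matching std:: pairing (the class body is cut down to the relevant part; the real WebsocketClient obviously does more):

#include <memory>

// The base class must come from the same library as the shared_ptr that owns
// the object: std::make_shared pairs with std::enable_shared_from_this.
class WebsocketClient : public std::enable_shared_from_this<WebsocketClient> {
public:
    void start() {
        // Valid only because a std::shared_ptr (created via std::make_shared
        // before start() was called) already owns *this.
        std::shared_ptr<WebsocketClient> self = shared_from_this();
        // ... hand 'self' to the async handler to keep the object alive ...
        (void)self;
    }
};

int main() {
    auto client = std::make_shared<WebsocketClient>();
    client->start(); // shared_from_this() succeeds here
}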
I come from Java, so this is pretty hard for me to understand. I am writing a client/server program to start learning C++.
ServerSocket server(30000);
while (true) {
ServerSocket new_sock;
server.accept(new_sock);
std::cout << "client connected...\n";
ClientConnectionThread *cct = new ClientConnectionThread(new_sock);
cct->start();
}
My problem occurs when I try to write to the socket in the ClientConnectionThread.
client_sock << someObj;
Exception was caught in cct: Could not write to socket.
My assumption is that after cct->start() the ServerSocket goes out of scope, is popped off the stack, and is automatically closed. To fix this I changed the code to:
ServerSocket server(30000);
while (true) {
ServerSocket *new_sock; // <-- changed to a pointer
server.accept(new_sock);
std::cout << "client connected...\n";
ClientConnectionThread *cct = new ClientConnectionThread(new_sock);
cct->start();
}
But the program didn't even enter the loop, with no error messages telling me why (I did of course change the necessary code to accept the pointer).
If it is not obvious what I am trying to do: I want to create a new thread on every client connection, to handle each client. The thread needs a reference to the socket to receive and send on, which is why I pass it to the CCT object.
If you need more code let me know.
Your first code does not work exactly because of what you said: the object is allocated on the stack, and once it goes out of scope it is destroyed, so the underlying socket is closed as a consequence.
If you want to keep the object "alive", you need to use pointers. You got that right, but missed an important point: you have to allocate the object! To do so, use operator new, like this:
ServerSocket *new_sock = new ServerSocket;
Now here's the catch: in Java your object gets deallocated automatically by the GC, but C++ has no garbage collector, so you need to do it by hand. Once you are done using the object, you need to delete it:
delete new_sock;
This can be tricky and can cause crashes and even memory leaks. If you want behaviour more like Java's GC, you can use a shared_ptr, which will deallocate the object automatically (it's not quite that simple, but you will easily find more about that on Google):
std::shared_ptr<ServerSocket> new_sock = std::shared_ptr<ServerSocket>(new ServerSocket);
server.accept(*new_sock);
(assuming you are compiling against C++11)
You could make your first version work if you passed a copy of the ServerSocket to your thread instead of a reference (if that is possible; the ServerSocket would need a proper copy constructor). The original ServerSocket still goes out of scope, as you pointed out, but that is no longer a problem because the copy remains valid.
If this is not an option for you go with the version Rogiel pointed out (and stick to resource handles like unique and shared pointer, those make your life a lot easier if you are used to GC :-) ).
I am working on a lock-free shared variable class, and I want to be able to generate a SIGSEGV fault to check whether my implementation works as I planned. I tried creating a function that modifies a pointer and reads it 100 times. I then call this function in both threads and let them run indefinitely within my program. This doesn't generate the error I want. How should I go about doing this?
edit
I don't handle segfaults at all, but they are generated in my program if I remove the locks. I want a lock-less design, so I created a shared variable class that uses CAS to remain lock-free. Is there a piece of code that will reliably generate segfaults, so that I can use it to test that my class fixes the problem?
#include <signal.h>
raise(SIGSEGV);
Will cause an appropriate signal to be raised.
malloc + mprotect + dereference pointer
This mprotect man page has an example.
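For reference, here is a minimal POSIX-only sketch in the spirit of that example: allocate a page, revoke all access to it, then touch it, which raises SIGSEGV deterministically:

#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
    long page = sysconf(_SC_PAGESIZE);
    void* p = NULL;
    if (posix_memalign(&p, page, page) != 0) return 1; // page-aligned allocation
    mprotect(p, page, PROT_NONE);                      // no read, no write allowed
    *static_cast<volatile char*>(p) = 1;               // SIGSEGV raised here
    return 0;
}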
Dereferencing a pointer to unallocated memory (at least on my system):
int *a;
*a = 0;
I am very sorry that I am not able to provide more details of my code, since I am taking over another project. The class structures are very complicated and I have been unable to reproduce the issue with a simple example.
Essentially, if I delete an object, all the statements in the destructor execute successfully, but as soon as the destructor finishes, the seg fault happens. Even if I make the destructor empty and do nothing in it, the seg fault still happens. The class does not have any base class.
My code looks like this:
ParallelSynthesizer* p = new ParallelSynthesizer(argc, argv);
p->synthesize();
delete p;
cout << "after deleting" << endl;
"after deleting" was not shown, as the seg fault happens before that. But the destructor of p is executed successfully.
[EDITED AFTER SOME COMMENTS] the "synthesize()" method does use multithreading, but it is very straightforward:
pthread_t threads[num_threads];
// makes the "params" array here. skipped.
for (int i=0; i<num_threads; i++) {
pthread_create(&threads[i], NULL, synthesizeThreadMethod, (void*)(params[i]));
}
for (int i=0; i<num_threads; i++) {
pthread_join(threads[i], NULL);
}
That is pretty much all there is in the synthesize() method, so I don't think the multithreading causes any issue.
I am using g++ on linux. Does anybody know the possible causes of this problem?
I apologize again for not being able to find an easy example that produces this error.
One possible cause is that another object tries to access p after it got deleted.
Update: You could try running your code through valgrind. It depends a little on how well you can isolate the problem beforehand. My guess so far would be that you do something bad inside your class (like constructing an object and passing p as a parameter to it).
It's hard to say based on what you've said, but it sounds like you've got some heap corruption.
This kind of problem is tricky to trace and it's virtually impossible for Stack Overflow readers to fix this for you given a large code base. I would recommend running a tool like valgrind, which will track memory accesses and give you a hint at where things went wrong.
I would guess the crash happens during operator delete(void*), which is invoked by delete p; right after the destructor.
There are lots of possible causes for messing up the heap in a way that could cause a crash. A common one would be that some code previously wrote to memory before or after a new-ed object. I would run the program under valgrind memcheck; it's a very useful tool specifically for tracing down this sort of error.
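A contrived example of that failure mode (not from the poster's code) shows why the crash surfaces at delete rather than at the bad write:

#include <cstring>

int main() {
    char* buf = new char[16];
    std::memset(buf, 0, 32); // overrun: writes 16 bytes past the end of the allocation
    delete[] buf;            // the corrupted heap metadata is typically detected here, not above
}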