Right now I have a C++ client application that uses mysql.h to connect to a MySQL database, and I have to perform some logic in case there is a disconnect. I'm wondering if this is the best way to reconnect to a MySQL database when my client gets disconnected.
bool MYSQL::Reconnect(const char *host, const char *user, const char *passwd, const char *db)
{
    bool out = false;
    pid_t command_pid = fork();
    if (command_pid == 0)
    {
        while(1)
        {
            sleep(1);
            if (mysql_real_connect(&m_mysql, host, user, passwd, db, 0, NULL, 0) == NULL )
            {
                fprintf(stderr, "Failed to connect to database: Error: %s\n",
                        mysql_error(&m_mysql));
            }
            else
            {
                m_connected = true;
                out = true;
                break;
            }
        }
        exit(0);
    }
    if (command_pid < 0)
        fprintf(stderr, "Could not fork process[reconnect]: %s\n", mysql_error(&m_mysql));
    return out;
}
Right now I take in all my parameters and perform a fork. The child process attempts to reconnect every second, with a sleep() call between attempts. Is this a good way to do this? Thanks
Sorry, but your code doesn't do what you think it does, Kaiser Wilhelm.
In essence, you're trying to treat a fork like a thread, which it is not.
When you fork a child, the parent process is completely cloned, including file and socket descriptors, which is how your program is connected to the MySQL database server. That is, both the parent and the child end up with their own copy of the same connection to the database server when you fork. I assume the parent only calls this Reconnect() method when it sees the connection drop, and stops using its copy of the now-defunct MySQL connection object, m_mysql. If so, the parent's copy of the connection is just as useless as the child's when you start the reconnect operation.
The thing is, the reverse is not also true: once the child manages to reconnect to the database server, the parent's connection object remains defunct. Nothing the child does propagates back up to the parent. After the fork, the two processes are completely independent, except insofar as they might try to access some I/O resource they initially shared. For example, if you called this Reconnect() while the connection was up and continued using the connection in the parent, the child's attempts to talk to the DB server on the same connection would confuse either mysqld or libmysqlclient, likely causing data corruption or a crash.
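To make the one-way nature of fork() concrete, here is a minimal sketch, assuming a POSIX system. It is not taken from the question's code: the function name is made up, and the plain int only stands in for connection state such as m_connected.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for m_connected: returns the value the parent
// sees after a forked child has modified its own copy.
int value_seen_by_parent_after_child_write() {
    int state = 0;                 // plays the role of m_connected = false
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                 // fork failed
    }
    if (pid == 0) {
        state = 1;                 // child "reconnects" in its own address space
        _exit(0);                  // leave without flushing the parent's stdio buffers
    }
    int status = 0;
    waitpid(pid, &status, 0);      // wait until the child has done its write and exited
    return state;                  // still 0: the child's change never arrives here
}
```

The parent gets back 0 even though the child set its copy to 1, which is the same reason `out` and `m_connected` in Reconnect() never change in the parent.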
As hinted above, one solution to this is to use threads instead of forking. Beware, however, of the many problems with using threads with the MySQL C API.
Given a choice, I'd rather use asynchronous I/O to do the background connection attempt within the application's main thread, but the MySQL C API doesn't allow that.
It seems you're trying to avoid blocking your main application thread while attempting the DB server reconnection. It may be that you can get away with doing it synchronously anyway by setting the connect timeout to 1 second, which is fine when the MySQL server is on the same machine or same LAN as the client. If you could tolerate your main thread blocking for up to a second for connection attempts to fail — worst case happening when the server is on a separate machine and it's physically disconnected or firewalled — this would probably be a cleaner solution than threads. The connection attempt can fail much quicker if the server machine is still running and the port isn't firewalled, such as when it is rebooting and the TCP/IP stack is [still] up.
As far as I can tell, this doesn't do what you intended.
Logical issues
Reconnect doesn't "perform some logic in case there is a disconnect" at all.
It attempts to connect over and over again until it succeeds, then stops. That's it. The state of the connection is never checked again. If the connection drops, this code knows nothing about it.
Technical issues
Also pay close attention to the technical issues that Warren raises.
Sure, it's perfectly OK. You might want to think about replacing the while ( 1 ) loop with something like
while ( NULL == mysql_real_connect( ... )) {
sleep( 1 );
...
}
which is the kind of idiom that one learns by practice, but your code works just fine as far as I can see. Don't forget to put a counter inside the while loop.
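A rough sketch of what "a counter inside the while loop" could look like, written as a generic helper. The name retry_with_limit and the callable `attempt` are assumptions for illustration; `attempt` stands for whatever wraps the real mysql_real_connect() call.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Calls `attempt` until it returns true or `max_attempts` is reached,
// sleeping `delay` between failed tries. Returns the number of attempts
// used on success, or 0 if every attempt failed.
// `attempt` is a placeholder for the code that tries to connect.
int retry_with_limit(const std::function<bool()>& attempt,
                     int max_attempts,
                     std::chrono::milliseconds delay) {
    for (int tries = 1; tries <= max_attempts; ++tries) {
        if (attempt()) {
            return tries;
        }
        if (tries < max_attempts) {
            std::this_thread::sleep_for(delay);  // no sleep after the final failure
        }
    }
    return 0;
}
```

With a bound like this the caller finds out that reconnecting failed instead of looping forever.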
Related
I am using cpp-httplib to retrieve some data from a server using long polling (that is, the client will issue a request to the server, and the server will just keep the connection open until the required data is available or a timeout is reached).
The program is running on my Raspberry Pi, which sits behind a router that does not have a static outgoing IP address. Every time the IP is reassigned (or at least close to that point in time), my program breaks: the thread currently performing the poll gets stuck forever in httplib::SSLClient::Get, in a blocking read() syscall. Neither server nor client timeouts have any effect, although a closed connection should make read() return 0 immediately, which is what I would have expected in this situation.
Inspecting the program with gdb shows the following:
(gdb) thread 2
(gdb) where
__libc_read (nbytes=5, buf=0x75608edb, fd=3) at ../sysdeps/unix/sysv/linux/read.c:26
__libc_read (fd=3, buf=0x75608edb, nbytes=5) at ../sysdeps/unix/sysv/linux/read.c:24
0x76d1862c in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.1
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I am not doing anything (as far as I know) that could accidentally overwrite return addresses.
For comparison, a 'healthy' stack trace during an SSLClient::Get can be found here.
The actual code is quite a lot, but here's a short version that shows the same behaviour:
#include <iostream>
#include <thread>
#define CPPHTTPLIB_OPENSSL_SUPPORT 1
#include "httplib.h"

// Long-polls `path` forever and prints each response body.
void poll(httplib::SSLClient* c, char* path) {
    while (true) {
        auto response = c->Get(path);
        if (response) {  // Get() yields an empty result when the request fails
            std::cout << response->body << std::endl;
        }
    }
}

int main(int argc, char* argv[]) {
    if (argc >= 3) {
        httplib::SSLClient client(argv[1], 443, 20);
        std::thread poll_thread(poll, &client, argv[2]);
        poll_thread.join();
    } else {
        std::cerr << "Usage: ./poll <host> <path>" << std::endl;
        return 1;
    }
}
I can think of some workarounds that might or might not work, but I'd really like to know why and how this is happening in the first place.
Just expanding on the keep_alive option I mentioned in the comment.
In the scenario you described, it seems possible that the underlying TCP socket connection was terminated in an unclean fashion. I.e., you say the IP address was reassigned.
Ideally, when a TCP connection is terminated, you want your code to exit any blocked read/poll operation. That is what happens for normal socket closures, e.g., when the remote process is killed or simply decides it is time to close. But if the IP address of your host changes, I'm not sure there will necessarily be a low-level TCP message that says, in effect, this connection is now closed. The consequence for your program is that it can still hold a local socket (the local TCP endpoint) and not realise the connection has dropped.
This is where something like keep_alive comes in. The idea is that the kernel will send keep-alive packets to keep testing whether the connection is still established; if these ever fail, it can close the local socket (and so your blocking read, or blocking select, will return with some sort of end-of-stream error).
Separately to keep_alive, you can also consider application heart-beat messages (e.g., websocket has ping/pong). In addition to ensuring the TCP connection remains established, it confirms whether the remote application is healthy.
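For reference, this is roughly how kernel-level keep-alive is switched on for a raw socket on Linux. It is only a sketch: enable_keepalive is a made-up helper, the TCP_KEEP* option names are Linux-specific, and how you get at the underlying descriptor from cpp-httplib depends on the library version and is not shown here.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Enables TCP keep-alive on `fd` using the Linux option names.
//   idle_secs:     idle time before the first probe is sent
//   interval_secs: time between probes
//   probe_count:   unanswered probes before the connection is dropped
bool enable_keepalive(int fd, int idle_secs, int interval_secs, int probe_count) {
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0) return false;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) != 0) return false;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs, sizeof(interval_secs)) != 0) return false;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probe_count, sizeof(probe_count)) != 0) return false;
    return true;
}
```

With values such as 30, 10 and 3, a dead peer would be noticed after roughly a minute of silence, at which point a blocked read() on the socket returns with an error.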
I am trying to create a concurrent C++ TCP server using threads. In particular I was wondering if I could use std::async to accept connections and serve each one in its own thread.
So far I have created a rough mockup but can't really tell if I am on the correct path.
void networking::TCP_Server::acceptConnection() {
std::string stringToSend{"This is a test string to be replied to"};
int new_fd = accept(listeningFD, nullptr, nullptr);
send(new_fd, stringToSend.c_str(), stringToSend.size(), 0);
sleep(3);
std::cout << ("End of thread");
}
///LISTEN FOR CONNECTIONS ON listeningFD
///CREATE A LIST OF FILE DESCRIPTORS FOR POLL fds[]
if (fds[i].fd == listeningFD) {
do {
std::cout << "New incoming connection - " << new_fd << "\n";
std::async(std::launch::async, acceptConnection);
} while (new_fd != -1);
} /* End of existing connection is readable */
} /* End of loop through pollable descriptors */
I am connecting to the server with two clients at the same time and would expect the loop to run through both new connections and create a thread for each one. As of now it behaves as if it ran in deferred mode: one connection gets accepted, and the other waits until the first finishes.
Any ideas?
(Pardon any mistakes in the code)
std::async returns a std::future, which the code doesn't store in a variable, so its destructor runs immediately. The destructor of a future returned by std::async blocks the calling thread until the task has finished.
You may like to use (detached) std::thread instead of std::async.
There are more scalable strategies for handling many clients. I highly recommend reading the old but instructive article The C10K problem.
You may also like to get familiar with the Asio C++ Library.
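To see the effect of the discarded future in isolation, here is a small self-contained sketch. The function peak_concurrency and its parameters are invented for this illustration and have nothing to do with sockets; each task just sleeps for 100 ms while a counter records how many tasks overlap.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Launches `n` tasks with std::async and returns the highest number of
// tasks that were running at the same moment.
int peak_concurrency(int n, bool keep_futures) {
    std::atomic<int> running{0};
    std::atomic<int> peak{0};
    auto task = [&running, &peak] {
        int now = ++running;
        int prev = peak.load();
        // Raise `peak` to `now` unless another task already recorded more.
        while (now > prev && !peak.compare_exchange_weak(prev, now)) {}
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        --running;
    };

    std::vector<std::future<void>> futures;
    for (int i = 0; i < n; ++i) {
        if (keep_futures) {
            // The future stays alive in the vector, so the loop moves on at once.
            futures.push_back(std::async(std::launch::async, task));
        } else {
            // The temporary future is destroyed at the end of this statement,
            // and its destructor waits for the task to finish.
            (void)std::async(std::launch::async, task);
        }
    }
    for (auto& f : futures) {
        f.get();
    }
    return peak.load();
}
```

With keep_futures set to false the tasks run strictly one after another, which matches the behaviour described in the question; with true they overlap. Storing the futures (or using std::thread) is what lets both clients be served at once.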
I am learning Winsock and trying to write some simple programs to get to know it. Following various tutorials, I managed to create a server that can handle and manage multiple connections, and a client. It works as it is supposed to, but:
I tried to write a loop that checks whether any of the clients has disconnected and, if so, closes its socket.
I managed to write something that checks whether a socket is disconnected, but it cannot handle two or more sockets connected at the same time.
Can anyone tell me how to write a working loop that checks every client for a disconnect and closes its socket? The goal is to enforce something like a maximum number of clients connected to the server at one time. Thanks in advance.
while (true)
{
    ConnectingSocket = accept (ListeningSocket, (SOCKADDR*)&addr, &addrlen);
    if (ConnectingSocket!=INVALID_SOCKET)
    {
        Connections[ConnectionsCounter] = ConnectingSocket;
        char *Name = new char[64];
        ZeroMemory (Name,64);
        sprintf (Name, "%i",ConnectionsCounter);
        send (Connections[ConnectionsCounter],Name,64,0);
        cout<<"New connection !\n";
        ConnectionsCounter++;
        char data;
        if (ConnectionsCounter>0)
        {
            for (int i=0;i<ConnectionsCounter;i++)
            {
                if (recv(Connections[i],&data,1, MSG_PEEK))
                {
                    closesocket(Connections[i]);
                    cout<<"Connection closed.\n";
                    ConnectionsCounter=ConnectionsCounter-1;
                }
            }
        }
    }
    Sleep(50);
}
It seems that you want to manage multiple connections using a single thread, right?
Briefly, socket communication has two modes, blocking and non-blocking. The default is blocking mode. Let's focus on your code:
for (int i=0;i<ConnectionsCounter;i++)
{
if (recv(Connections[i],&data,1, MSG_PEEK))
{
closesocket(Connections[i]);
cout<<"Connection closed.\n";
ConnectionsCounter=ConnectionsCounter-1;
}
}
In the code above you call recv, which blocks until the peer sends data or closes the link. Suppose you have two connections, Connections[0] and Connections[1]. While you are blocked in recv on Connections[0], you cannot notice that Connections[1] has disconnected. Only when Connections[0] sends a message or closes its socket does the loop continue, and you finally detect the disconnect on Connections[1], even though it may have happened 10 minutes ago.
To solve it, I suggest the book Network Programming for Microsoft Windows. There are several approaches, such as one thread per socket, asynchronous communication, non-blocking mode, and so on.
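As an illustration of the non-blocking approach, here is a small sketch using POSIX sockets, since that is what I can show compactly. The names PeerState and probe_peer are made up. Winsock has no MSG_DONTWAIT flag, so there you would put the socket into non-blocking mode with ioctlsocket(FIONBIO) and compare WSAGetLastError() against WSAEWOULDBLOCK instead of checking errno. Note also that recv returns 0 for an orderly close, so the test `if (recv(...))` in the loop above reacts to the wrong results.

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>

enum class PeerState { Open, Closed, Error };

// Peeks at the socket without blocking and without consuming any data.
PeerState probe_peer(int fd) {
    char byte;
    ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n > 0) {
        return PeerState::Open;    // data is waiting, so the peer is still there
    }
    if (n == 0) {
        return PeerState::Closed;  // orderly shutdown by the peer
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        return PeerState::Open;    // nothing to read right now, connection idle
    }
    return PeerState::Error;       // for example ECONNRESET
}
```

Because the call never blocks, a loop can probe every connection in turn and close only those that report Closed or Error.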
Forgot to point out the bug, pay attention here:
closesocket(Connections[i]);
cout<<"Connection closed.\n";
ConnectionsCounter=ConnectionsCounter-1;
Let me give an example to illustrate it. Say we have two connections with indexes 0 and 1, so ConnectionsCounter is 2. When Connections[0] disconnects, ConnectionsCounter changes from 2 to 1 and the loop exits. When a new client connects, you store its socket with Connections[ConnectionsCounter] = ConnectingSocket; where ConnectionsCounter is 1. That is the bug: the disconnected socket was at index 0, while index 1 is still used by a live connection, so you overwrite it.
Why not use a vector to store the sockets?
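A possible shape for that, as a sketch only: the handles are plain ints here (with Winsock they would be SOCKET values), and drop_closed and is_closed are names invented for this example. In real code is_closed would probe the socket and call closesocket on the dead ones.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Removes every handle for which `is_closed` returns true and reports
// how many were dropped. The remaining handles keep their relative order.
std::size_t drop_closed(std::vector<int>& connections,
                        const std::function<bool(int)>& is_closed) {
    auto first_dead = std::remove_if(connections.begin(), connections.end(), is_closed);
    std::size_t dropped = static_cast<std::size_t>(connections.end() - first_dead);
    connections.erase(first_dead, connections.end());
    return dropped;
}
```

New clients are then added with push_back, so a live connection can never be overwritten, and connections.size() replaces the separate counter.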
hope it helps~
I'm not usually the type to post a question, and more to search why something doesn't work first, but this time I did everything I could, and I just can't figure out what is wrong.
So here's the thing:
I'm currently programming an IRC Bot, and I'm using libircclient, a small C library to handle IRC connections. It's working pretty great, it does the job and is kinda easy to use, but ...
I'm connecting to two different servers, and so I'm using the custom networking loop, which uses the select function. On my personal computer, there's no problem with this loop, and everything works great.
But (Here's the problem), on my remote server, where the bot will be hosted, I can connect to one server but not the other.
I tried to debug everything I could. I even examined the libircclient sources to see how it works and put printfs where I could. I can see where the problem comes from, but I don't understand why it happens.
So here's the code for the server (the irc_session_t objects are encapsulated, but it should be fairly easy to understand. Feel free to ask for more information if you want):
// Connect the first session
first.connect();
// Connect the osu! session
second.connect();
// Initialize sockets sets
fd_set sockets, out_sockets;
// Initialize sockets count
int sockets_count;
// Initialize timeout struct
struct timeval timeout;
// Set running as true
running = true;
// While the server is running (Which means always)
while (running)
{
// First session has disconnected
if (!first.connected())
// Reconnect it
first.connect();
// Second session has disconnected
if (!second.connected())
// Reconnect it
second.connect();
// Reset timeout values
timeout.tv_sec = 1;
timeout.tv_usec = 0;
// Reset sockets count
sockets_count = 0;
// Reset sockets and out sockets
FD_ZERO(&sockets);
FD_ZERO(&out_sockets);
// Add sessions descriptors
irc_add_select_descriptors(first.session(), &sockets, &out_sockets, &sockets_count);
irc_add_select_descriptors(second.session(), &sockets, &out_sockets, &sockets_count);
// Select something. If it went wrong
int available = select(sockets_count + 1, &sockets, &out_sockets, NULL, &timeout);
// Error
if (available < 0)
// Error
Utils::throw_error("Server", "run", "Something went wrong when selecting a socket");
// We have a socket
if (available > 0)
{
// If there was something wrong when processing the first session
if (irc_process_select_descriptors(first.session(), &sockets, &out_sockets))
// Error
Utils::throw_error("Server", "run", Utils::string_format("Error with the first session: %s", first.get_error()));
// If there was something wrong when processing the second session
if (irc_process_select_descriptors(second.session(), &sockets, &out_sockets))
// Error
Utils::throw_error("Server", "run", Utils::string_format("Error with the second session: %s", second.get_error()));
}
The problem in this code is that this line:
irc_process_select_descriptors(second.session(), &sockets, &out_sockets)
always returns an error the first time it's called, and only for one server. The weird thing is that on my Windows computer it works perfectly, while on the Ubuntu server it just doesn't want to, and I just can't understand why.
I did some in-depth debug, and I saw that libircclient does this:
if (session->state == LIBIRC_STATE_CONNECTING && FD_ISSET(session->sock, out_set))
And this is where everything goes wrong. The session state is correctly set to LIBIRC_STATE_CONNECTING, but the second condition, FD_ISSET(session->sock, out_set), always returns false. It returns true for the first session, but never for the second.
The two servers are irc.twitch.tv:6667 and irc.ppy.sh:6667. The servers are correctly set, and the server passwords are correct too, since everything works fine on my personal computer.
Sorry for the very long post.
Thanks in advance !
Alright, after some hours of debugging, I finally found the problem.
When a session connects, it enters the LIBIRC_STATE_CONNECTING state, and irc_process_select_descriptors then checks this:
if (session->state == LIBIRC_STATE_CONNECTING && FD_ISSET(session->sock, out_set))
The problem is that select() modifies the socket sets: it removes every descriptor that is not ready.
So if the server hasn't sent any message before irc_process_select_descriptors is called, FD_ISSET returns 0, because select() decided that this socket was not ready.
I fixed it by just writing
if (session->state == LIBIRC_STATE_CONNECTING)
{
if(!FD_ISSET(session->sock, out_set))
return 0;
...
}
So it will make the program wait until the server has sent us anything.
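The way select() rewrites its sets can be reproduced without any IRC code. The following is only a sketch: it uses a pipe and the read set, and readable_after_select is a made-up helper, not part of libircclient. The write set that libircclient checks is treated the same way; for a socket in the middle of a non-blocking connect, it stays set only once the connect has finished.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Puts `fd` into a read set, calls select() with a zero timeout, and
// reports whether `fd` is still in the set afterwards.
bool readable_after_select(int fd) {
    fd_set read_set;
    FD_ZERO(&read_set);
    FD_SET(fd, &read_set);             // fd is in the set before the call
    struct timeval no_wait = {0, 0};   // poll, do not block
    int ready = select(fd + 1, &read_set, nullptr, nullptr, &no_wait);
    if (ready < 0) {
        return false;                  // select() itself failed
    }
    return FD_ISSET(fd, &read_set) != 0;  // cleared by select() unless fd is ready
}
```

For the read end of an empty pipe this returns false even though FD_SET was called just before; once a byte has been written to the pipe it returns true.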
Sorry for not having checked everything !
Since it seems that I can't find a solution to my original problem, I tried to do a little workaround. I'm simply trying to set a timeout to the connect() call of my TCP Socket.
I want the connect() to be blocking but not until the usual 75 seconds timeout, I want to define my own.
I have already tried select(), which worked for the timeout, but I couldn't get a connection (that was my initial problem, as described here).
So now I found another way to deal with it: just do a blocking connect() call but interrupt it with an alarm like this :
signal(SIGALRM, connect_alarm);
int secs = 5;
alarm(secs);
if (connect(m_Socket, (struct sockaddr *)&addr, sizeof(addr)) < 0 )
{
    if ( errno == EINTR )
    {
        debug_printf("Timeout");
        m_connectionStatus = STATUS_CLOSED;
        return ERR_TIMEOUT;
    }
    else
    {
        debug_printf("Other Err");
        m_connectionStatus = STATUS_CLOSED;
        return ERR_NET_SOCKET;
    }
}
with
static void connect_alarm(int signo)
{
    debug_printf("SignalHandler");
    return;
}
This is the solution I found on the Internet in a thread here on Stack Overflow. If I use this code the program starts the timer and then goes into the connect() call. After the 5 seconds the signal handler is fired (as seen on the console with the printf()), but after that the program still remains within the connect() function for 75 seconds. Every description says that connect_alarm() should interrupt the connect() function, but it seems it doesn't in my case. Is there any way to get the desired result for my problem?
signal is a massively under-specified interface and should be avoided in new code. On some versions of Linux, I believe it provides "BSD semantics", which means (among other things) that it sets SA_RESTART by default.
Use sigaction instead, do not specify SA_RESTART, and you should be good to go.
...
Well, except for the general fragility and unavoidable race conditions, that is. connect will return EINTR for any signal, not just SIGALRM. More troublesome, if the system happens to be under heavy load, it could take more than 5 seconds between the call to alarm and the call to connect, in which case you will miss the signal and block in connect forever.
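For what it's worth, the sigaction variant can be tried out without a network. The sketch below is an assumption-laden demo, not the asker's code: it blocks in read() on an empty pipe instead of connect(), and the names on_alarm and blocking_read_is_interrupted are invented. It uses a repeating 50 ms timer rather than alarm() so that a tick arriving before read() starts does not leave the call blocked, which is one way to paper over the race just described.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

static void on_alarm(int) { /* nothing to do: the interruption itself is the point */ }

// Blocks in read() on an empty pipe and reports whether SIGALRM ended
// the call with EINTR.
bool blocking_read_is_interrupted() {
    int fds[2];
    if (pipe(fds) != 0) return false;

    struct sigaction action = {};
    action.sa_handler = on_alarm;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;                     // deliberately no SA_RESTART
    struct sigaction previous;
    sigaction(SIGALRM, &action, &previous);

    // Repeating 50 ms timer: if the first tick fires before read() starts,
    // a later tick still interrupts it.
    struct itimerval timer = {};
    timer.it_value.tv_usec = 50000;
    timer.it_interval.tv_usec = 50000;
    setitimer(ITIMER_REAL, &timer, nullptr);

    char byte;
    ssize_t n = read(fds[0], &byte, 1);      // nobody writes, so only a signal ends this
    int saved_errno = errno;

    struct itimerval off = {};
    setitimer(ITIMER_REAL, &off, nullptr);   // disarm the timer
    sigaction(SIGALRM, &previous, nullptr);  // restore the old handler
    close(fds[0]);
    close(fds[1]);
    return n == -1 && saved_errno == EINTR;
}
```

If the handler were installed with SA_RESTART in sa_flags, the read() would be restarted after each tick and the function would never return, which is the behaviour reported in the question for connect().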
Your earlier attempt, using non-blocking sockets with connect and select, was a much better idea. I would suggest debugging that.
While it's relatively easy to set up the alarm(2) (less the pain of signal handling and system call interruptions), the more efficient way of timing out TCP connection attempts is the non-blocking connect, which also allows you to initiate multiple connections and wait on all of them, handling successes and failures one at a time.
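A minimal sketch of the non-blocking connect with select(), assuming a POSIX system. The helper name connect_with_timeout and its return convention are my own choices, and for brevity it does not retry select() when that call is itself interrupted by a signal.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>   // sockaddr_in, for callers building the address
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

// Returns 0 on success, -1 on failure with errno set (ETIMEDOUT when the
// timeout expires). The socket is left in non-blocking mode.
int connect_with_timeout(int fd, const struct sockaddr* addr, socklen_t len,
                         int timeout_secs) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) return -1;

    if (connect(fd, addr, len) == 0) return 0;  // connected immediately
    if (errno != EINPROGRESS) return -1;        // failed right away

    // The attempt is in progress: wait until the socket becomes writable.
    fd_set write_set;
    FD_ZERO(&write_set);
    FD_SET(fd, &write_set);
    struct timeval timeout = {timeout_secs, 0};
    int ready = select(fd + 1, nullptr, &write_set, nullptr, &timeout);
    if (ready == 0) {
        errno = ETIMEDOUT;                      // our own timeout expired
        return -1;
    }
    if (ready < 0) return -1;

    // Writable only means the attempt has finished; SO_ERROR says how.
    int err = 0;
    socklen_t err_len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &err_len) < 0) return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

The SO_ERROR check matters: a refused connection also makes the socket writable, so select() returning 1 is not by itself proof that the connection succeeded.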