How to write a self-replacing/updating binary? - c++

I am trying to write a C program that looks up a URL and, in case a new version of itself is available, updates itself.
The method I have tried:
Fork a new process to download the new binary, say BINARY.tmp. The code I am using to fork is:
int
forkout_cmd(char *cmdstr) {
pid_t pid;
char *cmd[4];
cmd[0] = "/bin/bash";
cmd[1] = "-c";
cmd[2] = cmdstr;
cmd[3] = NULL;
pid = vfork();
if( pid == -1 ) {
logmsg("Forking for upgradation failed.");
return -1;
}else if( pid == 0 ){
/* we are in child process */
execvp(cmd[0], cmd);
/* Reached only if execvp failed. After vfork() the child must not return, so exit at once. */
_exit(127);
}else{
/* wait for the child to complete */
wait(NULL);
}
return 0;
}
The new process then tries to overwrite the original BINARY.
For example, the routine that forks may do:
forkout_cmd("wget -O BINARY.tmp https://someurl.com/BINARY_LATEST; /bin/mv -f BINARY.tmp BINARY");
But the overwrite fails because the original binary is still executing and is therefore busy on disk. Can somebody suggest how to overcome this problem?
Thanks in advance.

Rename the currently running binary to something else, write the new binary, run it, then delete the renamed binary later.

I would save binary.tmp to the same directory as the executable, verify its checksum/signature (whatever it takes to be 100% sure no error occurred), and then atomically rename it to the executable's name.
Under Linux, this can be done while the program is running, no problem whatsoever (you are only changing the link, the underlying file persists while mappings to it are open, that is until the program is closed or restarted).
I would under no circumstances rename the original file or even overwrite it. This is unsafe and not necessary. You can do all "unsafe" operations that could fail on the temp file before touching the original. If anything goes wrong in the atomic rename, you still have the working original.
Then prompt the user to restart the program (if interactive) and done.
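The temp-file-then-rename approach above can be sketched as follows. This is a minimal sketch, not a complete updater: install_update is a hypothetical helper name, the 0755 mode is an assumption, and downloading and checksum verification are assumed to have already happened before it is called.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <sys/stat.h>

// Hypothetical helper: atomically replace `target` with the already verified
// file `tmp`. Both paths must be on the same filesystem, otherwise rename()
// fails instead of replacing the file.
bool install_update(const std::string& tmp, const std::string& target) {
    // Make the new file executable before it becomes visible under the final name.
    if (chmod(tmp.c_str(), 0755) != 0) {
        return false;
    }
    // rename() swaps the directory entry in one step. A process that is still
    // running the old binary keeps its mapping to the old inode.
    return std::rename(tmp.c_str(), target.c_str()) == 0;
}
```

If install_update returns false, the original executable has not been touched, so the program can simply keep running and retry later.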

Related

How to handle readlink() of "/proc/self/exe" when executable is replaced during execution?

In my C++ application, my application does an execv() in a fork()ed child process to use the same executable to process some work in a new child process with different arguments that communicates with pipes to the parent process. To get the pathname to self, I execute the following code on the Linux port (I have different code on Macintosh):
const size_t bufSize = PATH_MAX + 1;
char dirNameBuffer[bufSize];
// Read the symbolic link '/proc/self/exe'.
const char *linkName = "/proc/self/exe";
const int ret = int(readlink(linkName, dirNameBuffer, bufSize - 1));
However, if while the executable is running, I replace the executable with an updated version of the binary on disk, the readlink() string result is: "/usr/local/bin/myExecutable (deleted)"
I understand that my executable has been replaced by a newer version and that the original target of /proc/self/exe is gone. However, when I call execv() with that string, it fails with errno 2 (No such file or directory) because of the trailing " (deleted)" in the result.
I would like the execv() to either use the old executable for self, or the updated one. I could just detect the string ending with " (deleted)" and modify it to omit that and resolve to the updated executable, but that seems clumsy to me.
How can I execv() the current executable (or its replacement if that is easier) with a new set of arguments when the original executable has been replaced by an updated one during execution?
Instead of using readlink to discover the path to your own executable, you can directly call open on /proc/self/exe. Since the kernel already has an open fd to processes that are currently executing, this will give you an fd regardless of whether the path has been replaced with a new executable or not.
Next, you can use fexecve instead of execv which accepts an fd parameter instead of a filename parameter for the executable.
int fd = open("/proc/self/exe", O_RDONLY);
fexecve(fd, argv, envp);
Above code omits error handling for brevity.
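For reference, here is a sketch of the same idea with the error handling filled in. The function names open_self_exe and reexec_self are made up for illustration, and the use of O_CLOEXEC is my own choice (it is fine for a native binary, but should be dropped if the executable could be an interpreter script).

```cpp
#include <cassert>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Opens the image of the running process. This works even when the on-disk
// path has been replaced, because the kernel still holds the original file.
int open_self_exe() {
    return open("/proc/self/exe", O_RDONLY | O_CLOEXEC);
}

// Re-executes the current program with new arguments.
// Does not return on success; returns -1 if anything fails.
int reexec_self(char* const argv[], char* const envp[]) {
    int fd = open_self_exe();
    if (fd == -1) {
        std::perror("open /proc/self/exe");
        return -1;
    }
    fexecve(fd, argv, envp);
    // Only reached when fexecve failed.
    std::perror("fexecve");
    close(fd);
    return -1;
}
```

In the forked child you would call reexec_self(new_argv, environ) and _exit() if it returns.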
One solution is at executable startup (e.g. near the beginning of main()) to read the value of the link /proc/self/exe once and store it statically for future use:
static string savedBinary;
static bool initialized = false;
// To deal with issue of long running executable having its binary replaced
// with a newer one on disk, we compute the resolved binary once at startup.
if (!initialized) {
const size_t bufSize = PATH_MAX + 1;
char dirNameBuffer[bufSize];
// Read the symbolic link '/proc/self/exe'.
const char *linkName = "/proc/self/exe";
const ssize_t ret = readlink(linkName, dirNameBuffer, bufSize - 1);
// readlink() does not null-terminate, so build the string from the returned length.
if (ret > 0) {
savedBinary.assign(dirNameBuffer, ret);
}
// On at least Linux, if the executable is replaced, readlink() of
// "/proc/self/exe" gives "/usr/local/bin/flume (deleted)".
// Therefore, we just compute the binary location statically once at
// startup, before it can possibly be replaced, but we leave this code
// here as an extra precaution.
const string deleted(" (deleted)");
const size_t deletedSize = deleted.size();
const size_t pathSize = savedBinary.size();
if (pathSize > deletedSize) {
const size_t matchPos = pathSize - deletedSize;
if (0 == savedBinary.compare(matchPos, deletedSize, deleted)) {
// Deleted original binary. Issue warning, throw an exception, or exit.
// Or kludge the original path with: savedBinary.erase(matchPos);
}
}
initialized = true;
}
// Use savedBinary value.
In this way, it is very unlikely that the original executable would be replaced within the microseconds it takes main() to cache the path to its binary. Thus, a long-running application (e.g. hours or days) could get replaced on disk, but per the original question, it could fork() and execv() the updated binary, which perhaps has a bug fix. This has the added benefit of working across platforms, so the differing Macintosh code that reads the binary path could likewise be protected from binary replacement after startup.
Editor's note: readlink() does not null-terminate the string, so the return value must be used to delimit the result, as the code above does. Relying on the buffer happening to contain zeros would only work by accident.
The reason " (deleted)" appears in the symbolic link is that you have replaced the file containing the program text you are running with a different file, so the link to the original executable can never be valid again. Suppose you used this link to read the program's symbol table or to load data embedded in the binary, and the program on disk had changed: the table would be wrong and your program could even crash. The executable file of the running program no longer has a name (you deleted it), and the program you put in its place does not correspond to the binary you are executing.
When you unlink(2) a program that is being executed, the kernel marks that symlink in /proc, so that the program can:
detect that the binary has been deleted and is no longer accessible by name.
still gather some information about the last name it had (instead of the symlink being removed from the /proc tree).
You cannot write to a file that is being executed by the kernel, but nothing prevents you from removing it. The file continues to exist in the filesystem for as long as it is being executed, even though no name points to it (its space is deallocated once the process calls exit(2)). The kernel does not free its contents until the in-memory reference count of the inode drops to zero, which happens when all uses of (references to) the file are gone.
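The same "no name, but still alive" behaviour can be observed with an ordinary open file, which may make the point easier to see. This is only an illustration with a regular data file, not an executable; read_after_unlink is a made-up name.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Writes `data` to `path`, removes the name, then reads the data back
// through the file descriptor that is still open.
std::string read_after_unlink(const char* path, const std::string& data) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) {
        return "";
    }
    if (write(fd, data.data(), data.size()) != static_cast<ssize_t>(data.size())) {
        close(fd);
        unlink(path);
        return "";
    }
    unlink(path);  // the name is gone, but the inode still has one reference (fd)
    std::string out(data.size(), '\0');
    ssize_t n = pread(fd, &out[0], out.size(), 0);
    close(fd);     // last reference dropped; the kernel can free the space now
    if (n < 0) {
        return "";
    }
    out.resize(static_cast<size_t>(n));
    return out;
}
```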

CPP running external program, wait until it finishes and return the retcode

Okay, as part of my library I need a 'Worker' application to run an external program.
Normally I would do it with a call to:
system("");
But this time what is needed is:
Return code of that program
Application to work while the executed program is running
So a pseudocode would look like this in perfect implementation:
CTask::Run()
{
m_iReturnCode = -1;
ExecuteTask(m_strBinaryName);
while(Task_Executing)
{
HeartBeat();
}
return m_iReturnCode;
}
Just to clarify, i am running this on Unix platforms.
What are my options here, popen / fork ?
Anyone having a good solution already running and can shed a bit of light on this please?
Thanks for any input into this.
I am using a Linux system, Boost for threading, and a pipe to execute the command and get its result (if you do not know Boost, you certainly should have a look at it).
I found the hint to use a pipe here on Stack Overflow, but I am sorry, I do not remember the exact question any more.
I have not included the surrounding thread code. Just run the execute method in its own thread.
std::string execute()
{
std::string result;
// DO NOT INTERRUPT THREAD WHILE READING FROM PIPE
boost::this_thread::disable_interruption di;
// add echo of exit code to command to get the exit code
std::string command = mp_command + "; echo $?";
// open pipe, execute command and read input from pipe
FILE* pipe = popen(command.c_str(), "r");
if (!pipe)
{
mp_isValid = false;
return result;
}
char buffer[128];
// fgets returns NULL at end of file or on error, which ends the loop
while (fgets(buffer, sizeof(buffer), pipe) != NULL)
{
result += buffer;
}
// pclose blocks until the command has terminated; the stream must not be
// closed a second time, even if pclose reports an error
pclose(pipe);
return result;
}
You can create a fork with fork() (or clone() if you want threads), and then run the program using execve() or system() in one process, and continue running the original program in the other.
For the return code, you can get it even from a system() call:
ret = system("<your_command>");
printf("%d\n", WEXITSTATUS(ret));
There must be some sort of interprocess or interthread communication. In case you don't want to fork it or use threads, you can try using a shared file. Open the file for writing in the child task (called by system()) and when you are done, write some value (e.g. "finished") and close the file. In the parent task, heartbeat until you read "finished" from the shared file.
You can also do this by writing to a global variable instead of a shared file.
However, I would fork or thread it; using a shared file or global variable is error-prone, and I am not entirely sure it would work that way.
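A minimal sketch of the fork approach, assuming a POSIX system: the parent polls with waitpid() and WNOHANG so it can keep doing its heartbeat while the child runs. The name run_with_heartbeat and the 10 ms polling interval are arbitrary choices for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs the program argv[0] with arguments argv and calls heartbeat()
// repeatedly while it is running. Returns the child's exit code, or -1 if
// the child could not be started or was killed by a signal.
int run_with_heartbeat(char* const argv[], const std::function<void()>& heartbeat) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);  // only reached if exec failed
    }
    int status = 0;
    for (;;) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) {
            break;  // child has finished
        }
        if (r == -1 && errno != EINTR) {
            return -1;
        }
        heartbeat();
        usleep(10 * 1000);  // poll every 10 ms
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In terms of the pseudocode in the question, CTask::Run() would build argv from m_strBinaryName, pass HeartBeat as the callback, and store the result in m_iReturnCode.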

C++ run executable and pipe output to file

To start off with, I'm pretty new to C++.
I want to accomplish the following:
Execute the following: "SampleApp.exe -cf test.xml"
I need the shell to execute in hidden mode
I need the C++ application to wait until SampleApp is finished
If the SampleApp takes longer than X amount of time, then I need to terminate the process
I want to pipe SampleApp's output to a file (sample.log)
So far I have the following:
SHELLEXECUTEINFO lpExecInfo;
lpExecInfo.cbSize = sizeof(SHELLEXECUTEINFO);
lpExecInfo.lpFile = L"SampleApp.exe";
lpExecInfo.fMask = SEE_MASK_DOENVSUBST|SEE_MASK_NOCLOSEPROCESS;
lpExecInfo.hwnd = NULL;
lpExecInfo.lpVerb = L"open";
lpExecInfo.lpParameters = L"-cf test.xml";
lpExecInfo.lpDirectory = NULL;
lpExecInfo.nShow = SW_HIDE; // hide shell during execution
lpExecInfo.hInstApp = (HINSTANCE) SE_ERR_DDEFAIL;
ShellExecuteEx(&lpExecInfo);
// wait until the process is finished
if (lpExecInfo.hProcess != NULL)
{
::WaitForSingleObject(lpExecInfo.hProcess, INFINITE);
::CloseHandle(lpExecInfo.hProcess);
}
The above code achieves everything except piping output to a file.
However, that doesn't seem to be possible with ShellExecute.
It seems that I need to use CreateProcess instead.
I am hoping that someone with more C++ experience would be able to provide me with the CreateProcess equivalent of my code plus piping output. If not, at least confirm that what I am wanting to do is possible and point me in the right direction.
- Thanks
Unless you're feeling particularly masochistic or truly need to optimize this operation, use _popen to create the child process. That will return a FILE * from which you can read the child's output. Read from there, write to file, done.
FILE *child = _popen("child.exe", "r");
FILE *result = fopen("result.txt", "w");
// error checking omitted.
char buffer[1024];
while (fgets(buffer, sizeof(buffer), child))
fputs(buffer, result);
Doing this on your own (using the Windows API) is certainly possible and can even reduce overhead, but it's tremendously more work.
You're going to want to familiarize yourself with this code, as it's exactly what you want to do.
You will need to add some code to write to file in the ReadFromPipe function.

Linux fork/exec to application in same directory

Is there an exec variant that will use the current application directory to locate the target program?
I am using C++ and Qt to implement a "last ditch" error reporting system. Using Google Breakpad, I can create a minidump and direct execution to a handler. Because my application is in an unstable state, I just want to fork and start a separate error handling process using minimal dependencies. The error reporting application will be deployed in the same directory as the application executable.
I am quite unfamiliar with the fork and exec options, and am not finding an exec option that includes the current application directory in the search path. Here is what I have so far:
static bool dumpCallback(const char* /*dump_path*/,
const char* /*minidump_id*/,
void* /*context*/,
bool succeeded)
{
pid_t pid = fork();
if (pid == 0)
{
// This is what I would *like* to work.
const char* error_reporter_path = "error_reporter";
// This works, but requires hard-coding the entire path, which seems lame,
// and really isn't an option, given our deployment model.
//
// const char* error_reporter_path = "/path/to/app/error_reporter";
// This also works, but I don't like the dependency on QApplication at this
// point, since the application is unstable.
//
// const char* error_reporter_path =
// QString("%1/%2")
// .arg(QApplication::applicationDirPath())
// .arg("error_reporter").toLatin1().constData();
execlp(error_reporter_path,
error_reporter_path,
(char *) 0);
}
return succeeded;
}
Any other suggestions on best practices for using fork and exec would be appreciated as well; this is my first introduction to using them. I'm only concerned about Linux (Ubuntu, Fedora) at this point; I will work on handlers for other operating systems later.
What you asked for is actually quite easy:
{
pid_t pid = fork();
if (pid == 0)
{
const char* error_reporter_path = "./error_reporter";
execl(error_reporter_path,
error_reporter_path,
(char *) 0);
_exit(127);
}
else
return pid != -1;
}
but it doesn't do what you want. The current working directory is not necessarily the same thing as the directory containing the current executable -- in fact, under almost all circumstances, it won't be.
What I would recommend you do is make error_reporter_path a global variable, and initialize it at the very beginning of main, using your "option 2" code
QString("%1/%2")
.arg(QApplication::applicationDirPath())
.arg("error_reporter").toLatin1().constData();
The QString object (not just its constData) then has to live for the lifetime of the program, but that shouldn't be a problem. Note that you should be converting to UTF-8, not Latin1 (I guess QString uses wide characters?)
I think you have 2 choices:
Add '.' to $PATH.
Prepend the result of getcwd() to the executable name.
You should build the path to your helper executable at your program's startup, and save it somewhere (in a global or static variable). If you only need to run on Linux, you can do this by reading /proc/self/exe to get the location of your executable. Something like this:
// Locate helper binary next to the current binary.
char self_path[PATH_MAX];
// readlink() does not null-terminate, so keep the returned length.
ssize_t len = readlink("/proc/self/exe", self_path, sizeof(self_path));
if (len == -1) {
exit(1);
}
string helper_path(self_path, len);
size_t pos = helper_path.rfind('/');
if (pos == string::npos) {
exit(1);
}
helper_path.erase(pos + 1);
helper_path += "helper";
Excerpted from a full working example here: http://code.google.com/p/google-breakpad/source/browse/trunk/src/client/linux/minidump_writer/linux_dumper_unittest.cc#92
Never, ever, under any circumstances add "." to $PATH !!
If you prepend getcwd() to the executable name (argv[0]), you have to do it as the first thing in main, before anything has the chance to change the current working directory. Then you have to consider what to do about symbolic links in the resulting filename. And even after that, you can never be sure that argv[0] is set to the command used to execute your program.
Option 3:
Hardcode the full filename in your executable, but use the configure script to set the filename. (You are using a configure script, right?)
Option 4:
Don't call exec. You don't have to call exec after a fork. Just pretend you have just entered "main", and call "exit" when your error reporting has finished.

How to see if a subfile of a directory has changed

In Windows, is there an easy way to tell if a folder has a subfile that has changed?
I verified, and the last modified date on the folder does not get updated when a subfile changes.
Is there a registry entry I can set that will modify this behavior?
If it matters, I am using an NTFS volume.
I would ultimately like to have this ability from a C++ program.
Scanning an entire directory recursively will not work for me because the folder is much too large.
Update: I really need a way to do this without a process running while the change occurs. So installing a file system watcher is not optimal for me.
Update2: The archive bit will also not work because it has the same problem as the last modification date. The file's archive bit will be set, but the folders will not.
This article should help. Basically, you create one or more notification objects such as:
HANDLE dwChangeHandles[2];
dwChangeHandles[0] = FindFirstChangeNotification(
lpDir, // directory to watch
FALSE, // do not watch subtree
FILE_NOTIFY_CHANGE_FILE_NAME); // watch file name changes
if (dwChangeHandles[0] == INVALID_HANDLE_VALUE)
{
printf("\n ERROR: FindFirstChangeNotification function failed.\n");
ExitProcess(GetLastError());
}
// Watch the subtree for directory creation and deletion.
dwChangeHandles[1] = FindFirstChangeNotification(
lpDrive, // directory to watch
TRUE, // watch the subtree
FILE_NOTIFY_CHANGE_DIR_NAME); // watch dir name changes
if (dwChangeHandles[1] == INVALID_HANDLE_VALUE)
{
printf("\n ERROR: FindFirstChangeNotification function failed.\n");
ExitProcess(GetLastError());
}
and then you wait for a notification:
while (TRUE)
{
// Wait for notification.
printf("\nWaiting for notification...\n");
DWORD dwWaitStatus = WaitForMultipleObjects(2, dwChangeHandles,
FALSE, INFINITE);
switch (dwWaitStatus)
{
case WAIT_OBJECT_0:
// A file was created, renamed, or deleted in the directory.
// Restart the notification.
if ( FindNextChangeNotification(dwChangeHandles[0]) == FALSE )
{
printf("\n ERROR: FindNextChangeNotification function failed.\n");
ExitProcess(GetLastError());
}
break;
case WAIT_OBJECT_0 + 1:
// Restart the notification.
if (FindNextChangeNotification(dwChangeHandles[1]) == FALSE )
{
printf("\n ERROR: FindNextChangeNotification function failed.\n");
ExitProcess(GetLastError());
}
break;
case WAIT_TIMEOUT:
// A time-out occurred. This would happen if some value other
// than INFINITE is used in the Wait call and no changes occur.
// In a single-threaded environment, you might not want an
// INFINITE wait.
printf("\nNo changes in the time-out period.\n");
break;
default:
printf("\n ERROR: Unhandled dwWaitStatus.\n");
ExitProcess(GetLastError());
break;
}
}
}
This is perhaps overkill, but the IFS kit from MS or the FDDK from OSR might be an alternative. Create your own filesystem filter driver with simple monitoring of all changes to the filesystem.
ReadDirectoryChangesW
Some excellent sample code in this CodeProject article
If you can't run a process when the change occurs, then there's not much you can do except scan the filesystem, and check the modification date/time. This requires you to store each file's last date/time, though, and compare.
You can speed this up by using the archive bit (though it may mess up your backup software, so proceed carefully).
An archive bit is a file attribute present in many computer file systems, notably FAT, FAT32, and NTFS. The purpose of an archive bit is to track incremental changes to files for the purpose of backup, also called archiving.
As the archive bit is a binary bit, it is either 1 or 0, or in this case more frequently called set (1) and clear (0). The operating system sets the archive bit any time a file is created, moved, renamed, or otherwise modified in any way. The archive bit therefore represents one of two states: "changed" and "not changed" since the last backup.
Archive bits are not affected by simply reading a file. When a file is copied, the original file's archive bit is unaffected, however the copy's archive bit will be set at the time the copy is made.
So the process would be:
Clear the archive bit on all the files
Let the file system change over time
Scan all the files - any with the archive bit set have changed
This will eliminate the need for your program to keep state, and since you're only going over the directory entries (where the bit is stored) and they are clustered, it should be very, very fast.
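For comparison, the plain timestamp approach mentioned at the start of this answer (store each file's last write time, rescan later, compare) could look roughly like this. It is a sketch using C++17 std::filesystem rather than the Win32 API; Snapshot, take_snapshot and changed_since are names invented here, and deleted files are not reported.

```cpp
#include <cassert>
#include <filesystem>
#include <map>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Maps each regular file under a root directory to its last write time.
using Snapshot = std::map<std::string, fs::file_time_type>;

Snapshot take_snapshot(const fs::path& root) {
    Snapshot snap;
    for (const auto& entry : fs::recursive_directory_iterator(root)) {
        if (entry.is_regular_file()) {
            snap[entry.path().string()] = entry.last_write_time();
        }
    }
    return snap;
}

// Returns the paths that are new in `after` or whose timestamp differs
// from the one recorded in `before`.
std::vector<std::string> changed_since(const Snapshot& before, const Snapshot& after) {
    std::vector<std::string> changed;
    for (const auto& [path, mtime] : after) {
        auto it = before.find(path);
        if (it == before.end() || it->second != mtime) {
            changed.push_back(path);
        }
    }
    return changed;
}
```

The first snapshot would have to be saved somewhere (a file, for instance) between runs, which is exactly the state that the archive bit approach avoids keeping.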
If you can run a process during the changes, however, then you'll want to look at the FileSystemWatcher class. Here's an example of how you might use it.
It also exists in .NET (for future searchers of this type of problem)
Perhaps you can leave a process running on the machine watching for changes and creating a file for you to read later.
-Adam
Perhaps you can use the NTFS 5 Change Journal with DeviceIoControl as explained here
If you are not opposed to using .NET the FileSystemWatcher class will handle this for you fairly easily.
From the double post someone mentioned: WMI Event Sink
Still looking for a better answer though.
Nothing easy. If you have a running app, you can use the Win32 file change notification APIs (FindFirstChangeNotification), as suggested in the other answers. Warning: circa 2000, the Trend Micro real-time virus scanner would group the changes together, making it necessary to use really large buffers when requesting the file system change lists.
If you don't have a running app, you can turn on NTFS journaling and scan the journal for changes (http://msdn.microsoft.com/en-us/library/aa363798(VS.85).aspx), but this can be slower than scanning the whole directory when the number of changes is larger than the number of files.