Copying files from one directory to another directory using this code:
auto const copyOption = std::filesystem::copy_options::recursive | std::filesystem::copy_options::skip_symlinks;
std::filesystem::copy("/mnt/iso", "/mnt/usb", copyOption);
Copying big files can take a long time.
So how can I check when the copy has ended?
How to check whether std::filesystem::copy has ended?
A call to a function ends either by (1) returning, by (2) throwing, or by (3) terminating the program.
If the execution of the program proceeds to the next statement (or next sibling expression), then you know that the function call has ended by (1) returning. If the execution proceeds to unwind and eventually enters a catch block, then you know that the function has ended by (2) throwing. If the program no longer runs, then you know that the function has (3) terminated the program (std::filesystem::copy won't do this one directly, although it can happen if it throws without being caught). If none of those have happened, then you know that the function call hasn't ended yet.
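For example, here's a minimal sketch built from the snippet in the question; reaching the statement after the call tells you it returned, and entering the catch block tells you it threw:

#include <filesystem>
#include <iostream>

int main() {
    auto const copyOption = std::filesystem::copy_options::recursive
                          | std::filesystem::copy_options::skip_symlinks;
    try {
        std::filesystem::copy("/mnt/iso", "/mnt/usb", copyOption);
        // Reaching this line means the call ended by (1) returning.
        std::cout << "copy returned\n";
    } catch (std::filesystem::filesystem_error const& e) {
        // Entering this handler means the call ended by (2) throwing.
        std::cerr << "copy failed: " << e.what() << '\n';
    }
}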
For example, trying to unmount a USB key just after the copy finishes can take one or two minutes longer if you have big files.
There is no standard way in C++ to verify when data has physically been written onto a device.
You've tagged [ubuntu], so you may be interested in the fsync function from the POSIX standard. You cannot use it in conjunction with std::filesystem::copy; you'll need to use the other POSIX file-manipulation functions instead. fsync should guarantee that any writes have been passed on to the device by the kernel. From then on, you rely on the hardware.
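A minimal sketch of that approach, assuming Linux/POSIX; the copy_and_sync name and the buffer size are illustrative, not from any library:

#include <fcntl.h>
#include <unistd.h>

// Copy one file with POSIX calls and fsync the result, so the kernel
// has handed the data to the device before we report success.
// Error handling is trimmed to the essentials; partial writes are
// treated as failures for brevity.
bool copy_and_sync(const char* src, const char* dst) {
    int in = open(src, O_RDONLY);
    if (in < 0) return false;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) { close(in); return false; }

    char buf[65536];
    ssize_t n;
    bool ok = true;
    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (write(out, buf, n) != n) { ok = false; break; }
    }
    if (n < 0) ok = false;

    if (ok && fsync(out) != 0) ok = false;  // block until the device has the data
    close(in);
    close(out);
    return ok;
}

Once copy_and_sync returns true, unmounting the device should no longer stall on cached writes for that file.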
Related
Say I have two C++ functions foo1() and foo2(), and I want to minimize the likelihood that foo1() starts execution but foo2() is not called due to some external event. I don't mind if neither is called, but foo2() must execute if foo1() was called. Both functions can be called consecutively and do not throw exceptions.
Is there any benefit / drawback to wrapping the functions in an object and calling both in the destructor? Would things change if the application was multi-threaded (say the parent thread crashes)? Are there any other options for ensuring foo2() is called so long as foo1() is called?
I thought having them in a destructor might help with e.g. SIGINT, though I learned SIGINT will stop execution immediately, even in the middle of the destructor.
Edit:
To clarify: both foo1() and foo2() will be abstracted away, so I'm not concerned about someone else calling them in the wrong order. My concern is solely related to crashes, exceptions, or other interruptions during the execution of the application (e.g. someone pressing Ctrl-C and sending SIGINT, another thread crashing, etc.).
If another thread crashes (without a relevant signal handler -> the whole application exits), there is not much you can do to guarantee that your application does something - it's up to what the OS does. And there are ALWAYS cases where the system will kill your app without your actual knowledge (e.g. a bug that causes "all" memory to be used by your app and the OS "out of memory killer" killing your process).
The only time your destructor is guaranteed to be executed is if the object is constructed and a C++ exception is thrown. Signals and the like make no such guarantees, and continuing to execute [in the same thread] after, for example, SIGSEGV or SIGBUS is well into the "undefined" parts of the world. There's not much you can do about that, since the SEGV typically means "you tried to do something to memory that doesn't exist [or that you can't access in the way you tried, e.g. write to code-memory]", and the processor would have aborted the current instruction. Attempting to continue where you were will either lead to the same instruction being executed again, or the instruction being skipped [if you continue at the next instruction - and I'm ignoring the trouble of determining where that is for now]. And of course, there are situations where it's IMPOSSIBLE to continue even if you wanted to - say, for example, the stack pointer has been corrupted [restored from memory that was overwritten, etc.].
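To illustrate what that exception guarantee does and does not buy you, here is a minimal sketch of the wrapper-object idea from the question; FooPair is a hypothetical name and the function bodies are stand-ins:

#include <cstdio>

// Stand-ins for the question's foo1() and foo2().
void foo1() { std::puts("foo1"); }
void foo2() { std::puts("foo2"); }

// If construction completes, foo1() has run, and foo2() will run in the
// destructor on normal scope exit or on stack unwinding from a C++
// exception. Signals, std::abort and crashing threads are still not
// covered, per the caveats above.
struct FooPair {
    FooPair()  { foo1(); }
    ~FooPair() { foo2(); }
};

int main() {
    FooPair pair;
    // ... work that may throw ...
}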
In short, don't spend much time trying to come up with something that avoids these sorts of scenarios, because it's unlikely to work. Spend your time coming up with schemes where you don't need to know whether you completed something or not - for example transaction-based or "commit-based" programming (not sure if that's the right term, but basically you do some steps, then "commit" the stuff done so far, then do some further steps, etc. - only stuff that has been "committed" is sure to be complete; uncommitted work is discarded next time around). That way, something is either completely done or completely discarded, depending on whether it completed.
Separating "sensitive" and "not sensitive" parts of your application into separate processes can be another way to achieve some more safety.
Recently I ran into a problem at work where you have two functions: one opens a file descriptor (a local variable in the function) and passes it to another function, where it is used for reading or writing. Now, when one of the read/write operations fails, the function that was doing the read/write closes this file descriptor and returns.
The question is, whose responsibility is it to close the file descriptor, or let's say do the cleanup:
the function which created the fd
the function which experienced the error while read/write
Is there a design rule for these kinds of cases, let's say for creation and cleanup?
BTW, the problem was that both functions attempted to close the fd, which resulted in a crash on the second call to close.
There are two parts to this answer — the general design issue and the detailed mechanics for your situation.
General Design
Handling resources such as file descriptors correctly, and making sure they are released correctly, is an important design issue. There are multiple ways to manage the problem that work. There are some others that don't.
Your tags use C and C++; be aware that C++ has extra mechanisms available to it.
In C++, the RAII — Resource Acquisition Is Initialization — idiom is a great help. When you acquire a resource, you ensure that whatever acquires the resource initializes a value that will be properly destructed and will release the resource when destructed.
In both languages, it is generally best if the function responsible for allocating the resource also releases it. If a function opens a file, it should close it. If a function is given an open file, it should not close the file.
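A minimal sketch of the idiom for a POSIX file descriptor; the FdGuard name is illustrative:

#include <unistd.h>

// Owns a file descriptor and closes it exactly once, when the guard
// goes out of scope - whether by normal return or by exception.
class FdGuard {
public:
    explicit FdGuard(int fd) : fd_(fd) {}
    ~FdGuard() { if (fd_ >= 0) close(fd_); }

    FdGuard(const FdGuard&) = delete;            // sole owner: no copies
    FdGuard& operator=(const FdGuard&) = delete;

    int get() const { return fd_; }

private:
    int fd_;
};

A callee that receives guard.get() can read and write, but has no way to close the descriptor behind the owner's back, which is exactly the division of responsibility described above.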
In the comments, I wrote:
Generally, the function that opened the file should close it; the function that experienced the error should report the error, but not close the file. However, you can work it how you like as long as the contract is documented and enforced — the calling code needs to know when the called code closed the file to avoid double closes.
It would generally be a bad design for the called function to close the file sometimes (on error), but not other times (no error). If you must go that way, then it is crucial that the calling function is informed that the file is closed; the called function must return an error indication that tells the calling code that the file is no longer valid and should neither be used nor closed. As long as the information is relayed and handled, there isn't a problem — but the functions are harder to use.
Note that if a function is designed to return an opened resource (it is a function that's responsible for opening a file and making it available to the function that called it), then the responsibility for closing the file falls on the code that calls the opening function. That is a legitimate design; you just have to make sure that there is a function that knows how to close it, and that the calling code does close it.
Similar comments apply to memory allocation. If a function allocates memory, you must know when the memory will be freed, and ensure that it is freed. If it was allocated for the purposes of the current function and functions it calls, then the memory should be released before return. If it was allocated for use by the calling functions, then the responsibility for release transfers to the calling functions.
Detailed mechanics
Are you sure you're using file descriptors and not FILE * (file streams)? It's unlikely that closing a file descriptor twice would cause a crash (error, yes, but not a crash). OTOH, calling fclose() on an already closed file stream could cause problems.
In general, in C, you pass file descriptors, which are small integers, by value, so there isn't a way to tell the calling function that the file descriptor is no longer valid. In C++, you could pass them by reference, though it is not conventional to do so. Similarly with FILE *: they're usually passed by value, not by reference, so there isn't a way to tell the calling code that the file is no longer usable by modifying the value passed to the function.
You can invalidate a file descriptor by setting it to -1; that is never a valid file descriptor. Using 0 is a bad idea; it is equivalent to using standard input. You can invalidate a file stream by setting it to 0 (aka NULL). Passing the null pointer to functions that try to use the file stream will tend to cause crashes. Passing an invalid file descriptor typically won't cause crashes — the calls may fail with EBADF set in errno, but that's the limit of the damage, usually.
Using file descriptors, you will seldom get a crash because the file descriptor is no longer valid. Using file streams, all sorts of things can go wrong if you try using an invalid file stream pointer.
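A small sketch of the invalidate-on-close pattern described above; the helper names are illustrative:

#include <cstdio>
#include <unistd.h>

// Close and invalidate, so an accidental second close is a no-op.
void close_fd(int* fd) {
    if (*fd >= 0) {
        close(*fd);
        *fd = -1;        // -1 is never a valid descriptor
    }
}

void close_stream(FILE** fp) {
    if (*fp != nullptr) {
        fclose(*fp);
        *fp = nullptr;   // a null stream is easy to test for
    }
}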
In Ubuntu 14.04, I have a C++ API as a shared library which I am opening using dlopen, and then creating pointers to functions using dlsym. One of these functions CloseAPI releases the API from memory. Here is the syntax:
void* APIhandle = dlopen("Kinova.API.USBCommandLayerUbuntu.so", RTLD_NOW|RTLD_GLOBAL);
int (*CloseAPI)() = (int (*)()) dlsym(APIhandle, "CloseAPI");
If I ensure that during my code the CloseAPI function is always called before the main function returns, then everything seems fine, and I can run the program again the next time. However, if I press Ctrl-C and interrupt the program before it has had time to call CloseAPI, then the next time I run the program I get an error return whenever I call any of the API functions. I have no documentation saying what this error is, but my intuition is that there is some sort of lock on the library from the previous run of the program. The only thing that allows me to run the program again is to restart my machine. Logging in and out does not work.
So, my questions are:
1) If my library is a shared library, why am I getting this error when I would have thought a shared library can be loaded by more than one program simultaneously?
2) How can I resolve this issue if I am going to be expecting Ctrl-C to be happening often, without being able to call CloseAPI?
So if you use this API correctly, it requires you to do proper cleanup after use (which is not really user friendly).
First of all, if you really need to use Ctrl-C, allow the program to end properly on this signal: Is destructor called if SIGINT or SIGSTP issued?
Then use the technique of a stack object containing a resource pointer (to the CloseAPI function in this case). Make sure this object calls CloseAPI in its destructor (you may want to check whether CloseAPI was already called). See more in "Effective C++", Chapter 3: Resource Management.
That's it: even if you don't call CloseAPI yourself, the pointer container will do it for you.
P.S. You should consider doing this even if you're not going to use Ctrl-C. Imagine an exception occurs and your program has to stop: you should be sure you don't leave the OS in an undefined state.
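A minimal sketch combining the two ideas, assuming Linux (link with -ldl); ApiGuard is a hypothetical name, while the library path and the CloseAPI symbol come from the question:

#include <csignal>
#include <dlfcn.h>

// Converts Ctrl-C into a normal exit path instead of an abrupt kill.
static volatile std::sig_atomic_t gStop = 0;
static void onSigint(int) { gStop = 1; }

// Calls CloseAPI exactly once when the guard is destroyed.
struct ApiGuard {
    void* handle = nullptr;
    int (*closeFn)() = nullptr;

    ~ApiGuard() {
        if (closeFn) closeFn();
        if (handle) dlclose(handle);
    }
};

int main() {
    std::signal(SIGINT, onSigint);

    ApiGuard api;
    api.handle = dlopen("Kinova.API.USBCommandLayerUbuntu.so",
                        RTLD_NOW | RTLD_GLOBAL);
    if (!api.handle) return 1;
    api.closeFn = (int (*)()) dlsym(api.handle, "CloseAPI");

    while (!gStop) {
        // ... do work with the API ...
    }
    return 0;  // ApiGuard's destructor runs here and calls CloseAPI
}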
I am working with a program where my code calls a third party library which uses boost and shared_pointers to create a large and complex structure. This structure is created in a method that I call and at the end of the method I know that the program is finished.
For a large sample that I am handling, the processing code takes 30 minutes and the boost code called automatically at exit takes many hours. Exiting the program without releasing the memory, and without spending all that time, would be a perfectly acceptable outcome.
I tried
std::vector<T> *iddListV = new std::vector<T>(); // this WILL leak memory (T stands in for the element type)
with all the relevant structures added to the vector but this does not help.
I also tried calling exit(0); before reaching the end of the subroutine. This also causes the boost code to spend many hours trying to release pointers.
How do I get a C++ program (Microsoft C++ on Windows, if that matters) to abruptly exit without calling the boost destructors?
My constraints are I can call any function before the boost structure are allocated but cannot modify the code once it starts running.
_Exit quits without calling any destructors.
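A minimal sketch; std::_Exit lives in <cstdlib> since C++11:

#include <cstdio>
#include <cstdlib>

int main() {
    // ... build the large boost structure, run the 30-minute processing ...

    std::fflush(nullptr);  // flush stdio ourselves; nothing else will run
    std::_Exit(0);         // process ends here: no destructors, no atexit handlers
}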
If you're unconcerned about portability, you can call TerminateProcess(). But remember to take care that you are absolutely sure that your program is in a state which is ready to terminate. For example, if you terminate before I/O has had a chance to flush, then your file data and network streams may become invalid.
It is possible, in a portable manner, to do:
#include <exception>
...
std::terminate();
However, there's a big gotcha: at least on Linux, this may cause a core dump. (I'm really not sure what the behavior is on Windows.)
It should be noted that whether or not destructors are called is implementation-defined. Citing §15.5.1 P2:
In the situation where the search for a handler (15.3) encounters the outermost block of a function with a noexcept-specification that does not allow the exception (15.4), it is implementation-defined whether the stack is unwound, unwound partially, or not unwound at all before std::terminate() is called.
Additionally in §18.8.3.4 P1:
Remarks: Called by the implementation when exception handling must be abandoned for any of several reasons (15.5.1), in effect immediately after evaluating the throw-expression (18.8.3.1). May also be called directly by the program.
C++11 also defines the function std::quick_exit(int status), which can be used in a similar manner (presumably without a core dump). This function is available from <cstdlib>.
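A minimal sketch of that variant; handlers registered with std::at_quick_exit still run, while destructors are skipped:

#include <cstdio>
#include <cstdlib>

int main() {
    // Registered handlers run at quick_exit; destructors do not.
    std::at_quick_exit([] { std::fflush(nullptr); });

    // ... build the large boost structure, run the processing ...

    std::quick_exit(0);  // terminates without unwinding or static destructors
}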
I have been looking for a way to dynamically load functions into C++ for some time now, and I think I have finally figured it out. Here is the plan:
1) Pass the function as a string into C++ (via a socket connection, a file, or something).
2) Write the string into a file.
3) Have the C++ program compile the file and execute it. If there are any errors, catch them and return them.
4) Have the newly executed program pass the memory location of the new function to the currently running program.
5) Save the location of the function to a function pointer variable (the function will always have the same return type and arguments, so this simplifies the declaration of the pointer).
6) Run the new function through the function pointer.
The issue is that after step 4, I do not want to keep the new program running, since if I do this very often, many running programs will suck up threads. Is there some way to close the new program but preserve the memory location where the new function is stored? I do not want it being overwritten or made available to other programs while it is still in use.
If you have any suggestions for the other steps, that would be appreciated as well. There might be other libraries that do things similar to this, and it is fine to recommend them, but this is the approach I want to look into - if not for the accomplishment of it, then for the knowledge of how to do it.
Edit: I am aware of dynamically linked libraries. This is something I am largely looking into to gain a better understanding of how things work in C++.
I can't see how this can work. When you run the new program it'll be a separate process and so any addresses in its process space have no meaning in the original process.
And not just that, but the code you want to call doesn't even exist in the original process, so there's no way to call it in the original process.
As Nick says in his answer, you need either a DLL/shared library or you have to set up some form of interprocess communication so the original process can send data to the new process to be operated on by the function in question and then sent back to the original process.
How about a Dynamic Link Library?
These can be linked/unlinked/replaced at runtime.
Or, if you really want to communicated between processes, you could use a named pipe.
Edit: you can also create named shared memory.
For step 4: we can't directly pass a memory location (an address) from one process to another, because the two processes use different virtual memory spaces. One process can't use memory in another process.
So you need to create shared memory between the two processes and copy your function into that memory; then you can close the new process.
For shared memory on Windows, see Creating Named Shared Memory:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366551(v=vs.85).aspx
After that, you still need to allocate another memory region and copy the function into it again.
The idea is that normally allocated memory only has read/write protection; if you execute code in it, the CPU will generate an exception.
So on Windows you need to use VirtualAlloc to allocate the memory with the PAGE_EXECUTE_READWRITE flag (http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887(v=vs.85).aspx):
void* address = VirtualAlloc(NULL,
                             sizeof(emitcode),
                             MEM_COMMIT | MEM_RESERVE,
                             PAGE_EXECUTE_READWRITE);
After copying the function to address, you can call the function at that address, but you need to be very careful to keep the stack balanced.
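A minimal end-to-end sketch, assuming x86-64 Windows; the byte sequence is hand-written machine code for mov eax, 42; ret, standing in for the emitcode buffer above:

#include <windows.h>
#include <cstdio>
#include <cstring>

int main() {
    // x86-64 machine code: mov eax, 42; ret
    unsigned char emitcode[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    void* address = VirtualAlloc(NULL, sizeof(emitcode),
                                 MEM_COMMIT | MEM_RESERVE,
                                 PAGE_EXECUTE_READWRITE);
    if (address == NULL) return 1;

    memcpy(address, emitcode, sizeof(emitcode));

    // Treat the buffer as a function taking no arguments, returning int.
    int (*fn)() = reinterpret_cast<int (*)()>(address);
    std::printf("%d\n", fn());  // prints 42

    VirtualFree(address, 0, MEM_RELEASE);
    return 0;
}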
Dynamic libraries are best suited to your problem. Also, forget about launching a different process; that's another problem by itself. But to add to the post above: provided that you did the VirtualAlloc correctly, just call your function within the same "loader" process, and you shouldn't have to worry, since you will be running on the same stack.
The real problems are:
1 - Compiling the function you want to load, offline from the main program.
2 - Extract the relevant code from the binary produced by the compiler.
3 - Load the string.
1 and 2 require a deep understanding of the entire compiler suite, including compiler flag options, the linker, etc. ... not just the IDE's push buttons ...
If you are OK with 1 and 2, you should know why using std::string, or anything but a pure char *, is harmful here.
I could continue the entire story, but it definitely deserves its own book. Since this is the hacker/cracker way of doing things, I strongly recommend that the normal user use dynamic libraries; that is why they exist.
Usually we call this code injection ...
Basically, for the sake of security, any modern operating system forbids executing memory that was not marked executable when the initial loading was done, so we must fall back to OS-validated dynamic libraries.
That said, once you have valid compiled code, if you really want to achieve that effect, you must load your function into memory and then mark it as executable (clear the NX bit) in a system-specific way.
But let's be clear: your function must be position-independent code, and you get no help from the dynamic linker to resolve symbols ... that's the hard part of the job.