Asio sync-read random-access with exceptions, how many bytes were read? - c++

How can we know how many bytes were read when a synchronous read operation on a random-access device (for example random_access_file) throws an exception?
Is this not supported, and to know how many bytes were read, one is supposed to take the boost::system::error_code ec overload?
error_code ec;
size_t s = a.read_some_at(offset, buffers, ec);
offset += s; // must happen before unwinding, so the caller's offset stays accurate
if (ec) throw system_error(ec);
return s;

Short answer: Yes. Take the ec overload. This has been the case for partial-success operations in Asio.
Slightly longer (irrelevant) answer: when using e.g. C++20 coroutines you can have the error code and byte count returned as a tuple, which you might find more convenient:
auto [ec, s] = co_await a.async_read_some_at(offset, buffers, asio::as_tuple(asio::use_awaitable));
Pet peeve: as_tuple doesn't currently (last checked: 1.80.0) appear to work well with asio::use_future :(

Related

Why is getchar_unlocked() faster than alternatives?

I know how this code works, but I could not find out why it is faster than other I/O methods.
int read_int() {
    char c = getchar_unlocked();
    while (c < '0' || c > '9') c = getchar_unlocked();
    int ret = 0;
    while (c >= '0' && c <= '9') {
        ret = 10 * ret + c - 48;
        c = getchar_unlocked();
    }
    return ret;
}
scanf("%d\n", &x) has to parse the format string and lock the stream before and after the reading.
std::cin >> x might do locking too, and it might have to sync with stdin, and it might need to go through some abstraction layers.
With the above, you only do one type of input parsing (so no need to parse a format string and decide what to do based on that) and most importantly, you don't lock the stream.
Locking streams is mandated by POSIX, and glibc uses recursive mutexes to prevent multiple threads from accessing the stdin FILE simultaneously (which would corrupt it); the locking happens even in a single-threaded program.
These mutexes are quite expensive (your read_int should be several (fivish?) times faster than scanf("%d",&x)).
Regarding your implementation, apart from fixing the magic-number issue (write c - '0' instead of c - 48),
you should probably detect failures in getchar_unlocked too and report those failures through a separate channel -- e.g., by returning the parsed integer through a passed-in pointer and using the return value for error status.
If you want thread safety, you can still use getchar_unlocked to get a speedup compared to getchar, but you have to call flockfile(stdin); and funlockfile(stdin); at the beginning and end (respectively) of your read_int function.
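As a sketch of that thread-safe variant (assuming POSIX; like the original, it assumes well-formed input and does not handle EOF):

```cpp
#include <stdio.h>

// Thread-safe variant: take the FILE lock once per call, then use the
// cheap unlocked reads inside the critical section.
int read_int_locked()
{
    flockfile(stdin);                  // acquire stdin's lock once
    int c = getchar_unlocked();
    while (c < '0' || c > '9')         // skip non-digit characters
        c = getchar_unlocked();
    int ret = 0;
    while (c >= '0' && c <= '9') {
        ret = 10 * ret + (c - '0');
        c = getchar_unlocked();
    }
    funlockfile(stdin);                // release the lock
    return ret;
}
```

This keeps the per-character cost low while still serializing whole read_int calls against other threads that use stdin.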
Locking between threads is expensive, and getchar_unlocked is a non-locking I/O call.
https://discuss.codechef.com/questions/2667/getchar_unlocked

stl map.find perform differently in debug and release using vs2010

I am using an STL map to store flow information extracted from pcap files. When a packet arrives, I use map.find to check whether the flow the packet belongs to already exists. I have to call map.find twice, since a packet from A to B and a packet from B to A belong to the same flow.
struct FiveTuple
{
    unsigned short source_port;
    unsigned short dest_port;
    unsigned int   source_ip_addr;
    unsigned int   dest_ip_addr;
    unsigned char  transport_proto_type;
};
The FiveTuple identifies a flow. I use the FiveTuple as the key element in map.
The map is map<FiveTuple, Flow, FlowCmp>, where FlowCmp is a comparator struct that uses memcmp to decide whether one FiveTuple is less than another, like operator<.
To find whether the flow of the packet exists, I wrote code as follows where m is the name of the map and five_tuple is a FiveTuple with information extracted from the packet:
auto it = m.find(five_tuple);
if (it == m.end())
{
    // swap source and dest ip/port in five_tuple
    it = m.find(five_tuple);
    if (it == m.end())
    {
        // do something
    }
}
In the debug build in VS2010, the result is reasonable. When I switched to the release build, I found that instead of returning the right iterator, the second m.find gave me m.end() most of the time. I have verified that there are no initialization problems. How do I fix the release-build problem?
Seems like you are doing memcmp() on FiveTuple objects. That is undefined behaviour, because FiveTuple contains padding bytes whose values are indeterminate ("garbage"). These padding bytes differ between the debug build and the release build, so you get different results. You should rewrite FlowCmp so that it doesn't use memcmp().
This is a guess based on the limited information provided, but if you want to test it out try cout << sizeof(FiveTuple);. I bet you'll see that sizeof(FiveTuple) > sizeof(short) + sizeof(short) + sizeof(int) + sizeof(int) + sizeof(char). In other words there's garbage in your struct and you shouldn't use memcmp.
Of course memcmp is bad for another reason: it makes your code non-portable, because its behaviour depends on the endianness of your platform. That in itself is good enough reason not to use memcmp for this purpose.
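A padding-safe FlowCmp can instead compare the members explicitly; here is a sketch using std::tie (C++11), with the member names from the question:

```cpp
#include <tuple>

struct FiveTuple
{
    unsigned short source_port;
    unsigned short dest_port;
    unsigned int   source_ip_addr;
    unsigned int   dest_ip_addr;
    unsigned char  transport_proto_type;
};

struct FlowCmp
{
    bool operator()(const FiveTuple& a, const FiveTuple& b) const
    {
        // Member-wise lexicographic comparison: never reads padding bytes.
        return std::tie(a.source_ip_addr, a.dest_ip_addr,
                        a.source_port, a.dest_port, a.transport_proto_type)
             < std::tie(b.source_ip_addr, b.dest_ip_addr,
                        b.source_port, b.dest_port, b.transport_proto_type);
    }
};
```

std::tie builds tuples of references and compares them lexicographically, so no padding byte is ever inspected and the result is a valid strict weak ordering on every platform.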

Multithreading: do I need protect my variable in read-only method?

I have a few questions about using locks to protect my shared data structure. I am using C/C++/ObjC/ObjC++.
For example I have a counter class that used in multi-thread environment
class MyCounter {
private:
    int counter;
    std::mutex m;
public:
    int getCount() const {
        return counter;
    }
    void increase() {
        std::lock_guard<std::mutex> lk(m);
        counter++;
    }
};
Do I need to use std::lock_guard<std::mutex> lk(m); in getCount() method to make it thread-safe?
What happens if there are only two threads, a reader thread and a writer thread; do I have to protect the variable at all? Since only one thread modifies it, I think no lost update can happen.
If there are multiple writers/readers of a shared primitive-type variable (e.g. int), what disaster may happen if I only lock in the write method but not the read method? Does an 8-bit type make any difference compared to a 64-bit type?
Are any primitive types atomic by default? For example, is a write to a char always atomic? (I know this is true in Java, but I don't know about C++; I am using the LLVM compiler on a Mac, if the platform matters.)
Yes, unless you can guarantee that changes to the underlying variable counter are atomic, you need the mutex.
Classic example, say counter is a two-byte value that's incremented in (non-atomic) stages:
(a) add 1 to lower byte
if lower byte is 0:
(b) add 1 to upper byte
and the initial value is 255.
If another thread comes in anywhere between the lower-byte change (a) and the upper-byte change (b), it will read 0 rather than the correct 255 (pre-increment) or 256 (post-increment).
In terms of what data types are atomic, the latest C++ standard defines them in the <atomic> header.
If you don't have C++11 capabilities, then it's down to the implementation what types are atomic.
Yes, you would need to lock the read as well in this case.
There are several alternatives -- a lock is quite heavy here. Atomic operations are the most obvious (lock-free). There are also other approaches to locking in this design -- the read write lock is one example.
Yes, I believe that you do need to lock the read as well. But since you are using C++11 features, why don't you use std::atomic<int> counter; instead?
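A minimal sketch of that std::atomic variant (C++11):

```cpp
#include <atomic>

class MyCounter {
private:
    std::atomic<int> counter{0};
public:
    int getCount() const {
        return counter.load();   // atomic load: safe without a mutex
    }
    void increase() {
        ++counter;               // atomic read-modify-write: no lost updates
    }
};
```

Both methods are now safe to call concurrently from any number of readers and writers, with no mutex overhead.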
As a rule of thumb, you should lock the read too.
Reads and writes of an int are atomic on most architectures (and since int typically matches the machine's word size, you will almost never see a corrupted int).
Yet, the answer from #paxdiablo is correct, and will happen if you have someone doing this:
#pragma pack(push, 1)
struct MyObj
{
char a;
MyCounter cnt;
};
#pragma pack(pop)
In that specific case, cnt will not be aligned to a word boundary, and access to the int MyCounter::counter will (or at least might) be split into multiple operations on CPUs that support unaligned access (like x86). Thus, you could get this sequence of operations:
Thread A: [...] sets counter to 255 (counter is 0x000000FF)
          getCount() => CPU reads the low byte: 0xFF (255)
<interrupted here>
Thread B: increase() => counter is incremented to 256 (0x00000100)
<interrupted here>
Thread A: CPU reads the high bytes: 0x000001; concatenated: 0x000001FF, returns 511!
Now, let's say you never use unaligned access. Yet, if you are doing something like this:
ThreadA.cpp:
int g = clientCounter.getCount();
while (g > 0)
{
processFirstClient();
g = clientCounter.getCount();
}
ThreadB.cpp:
if (acceptClient()) clientCounter.increase();
The compiler is completely allowed to replace the loop in Thread A by this:
if (clientCounter.getCount())
while(true) processFirstClient();
Why? Because for each expression, the compiler evaluates its side effects. getCount() is so simple that the compiler deduces: it is a read of a single variable that is never modified anywhere in ThreadA.cpp, therefore it is constant. Because it is constant, the compiler simplifies accordingly.
If you add a mutex, the mutex code inserts a memory barrier telling the compiler "don't assume anything about memory after this barrier is crossed."
Thus, the "optimization" above cannot happen, since the value getCount() reads might have been modified.
Sure, you could have declared the member as volatile int counter instead, and the compiler would have avoided this optimization too -- but volatile only prevents the caching, it does not make the access atomic.
In the end, if you have to write a ton of code just to avoid a mutex, you're doing it wrong (and probably will get wrong results).
You can't guarantee that multiple threads won't modify your variable at the same time, and if that happens the variable may be garbled or the program might crash. To avoid such cases it is always better and safer to make the program thread-safe.
You can use the synchronization techniques available: mutexes, locks, or synchronization attributes (available in MS C++).

Dealing with size of stl containers

I'm rewriting a general-purpose library that I wrote before I learned the STL. It uses C-style arrays all the way. In many places there is code like this:
unsigned short maxbuffersize; // Maximum possible size of the buffer. Can be set by user.
unsigned short buffersize; // Current size of the buffer.
T *buffer; // The buffer itself.
The first thing I did was to change the code like this:
unsigned short maxbuffersize;
unsigned short buffersize;
std::vector<T> buffer;
And then:
typedef unsigned short BufferSize;
BufferSize maxbuffersize;
BufferSize buffersize;
std::vector<T> buffer;
And then I felt like I was doing a very bad thing and should reconsider my coding style. At first, BufferSize seemed like a very bad name for a type, but then all kinds of weird questions started popping up. How do I name the size type? Should I use my own type or adopt std::vector<T>::size_type? Should I cache the size of the container or call size() all the way? Should I allow the user to manually set the maximum size of the container, and if not, how do I check for overflow?
I know that there can't be one-size-fits-all approach therefore I'd like to hear the policies other coders and framework vendors use. The library I'm working on is cross-platform general purpose and is intended to be released into public domain and be used for decades. Thanks.
I think the default choice ought to be to get rid of both buffersize and maxbuffersize and use buffer.size() and buffer.capacity() throughout.
I would advise against caching the sizes unless you have very specific reasons to do so, backed with hard data from profiler runs. Caching would introduce extra complexity and the potential for the cache to get out of sync with the real thing.
Finally, in places where you feel bounds checking is warranted, you could use buffer.at(i). This will throw an exception if i is out of bounds.
In general I would advise using iterators to access your data. When you do this you often don't explicitly query the size of the container at all. It also decouples you from std::vector altogether, letting you simply switch to, for example, std::list if you later realize that better suits your needs.
When you use iterators the need for vector.size() in general greatly decreases.
(when you do need it use buffer.size() and buffer.capacity() as aix says).
For example:
typedef unsigned short BufferSize;
BufferSize maxbuffersize;
BufferSize buffersize;
std::vector<T> buffer;
for (unsigned short i = 0; i < maxbuffersize; ++i)
{
    //do something with buffer[i];
}
becomes
struct do_something
{
    void operator()(const T& t)
    {
        //do something with t
    }
};
std::vector<T> buffer(maxbuffersize);
std::for_each(buffer.begin(), buffer.end(), do_something());
which is a little bit cleaner.
Keeping the size is useful for many structures, but it's redundant for arrays/vectors, since the size is guaranteed to be the final index + 1. If you are worried about running past the end, an iterator-based approach such as the one mentioned above solves this, as well as most other issues regarding possible sizes for comparisons, etc.
It's pretty standard to define all of your types and their sizes in a header with the API that sets them for different platforms and compilers -- look at Windows with its definitions of LONG, ULONG, DWORD, etc. The old C convention is to preface them with a unique name or initials, such as MYAPI_SIZETYPE. It's wordy but avoids any cross-platform confusion or compiler issues.

Lock-Free Data Structures in C++ Compare and Swap Routine

In this paper: Lock-Free Data Structures (pdf) the following "Compare and Swap" fundamental is shown:
template <class T>
bool CAS(T* addr, T exp, T val)
{
if (*addr == exp)
{
*addr = val;
return true;
}
return false;
}
And then says
The entire procedure is atomic
But how is that so? Is it not possible that some other actor changes the value at addr between the if and the assignment? In that case, assuming all code uses this CAS primitive, the discrepancy would be discovered the next time something "expected" the value to be a certain way and it wasn't. However, that doesn't change the fact that the race could happen; is the operation still atomic in that case? And what about the other actor returning true even though its changes were overwritten by this actor? If that can't possibly happen, why not?
I want to believe the author, so what am I missing here? I am thinking it must be obvious. My apologies in advance if this seems trivial.
He is describing an atomic operation that is supplied by the implementation, "somehow." It is pseudo-code for something implemented in hardware.