Boost ASIO streambuf - c++

I am confused about the input sequence and output sequence in boost asio::streambuf classes.
According to the code examples (for sending data) in the documentation, it seems that the buffer representing the input sequence is used for writing to the socket, and the one representing the output sequence is used for reading.
Example -
boost::asio::streambuf b;
std::ostream os(&b);
os << "Hello, World!\n";
// try sending some data in input sequence
size_t n = sock.send(b.data());
b.consume(n); // sent data is removed from input sequence
Now, is there a nomenclature problem?

The nomenclature for boost::asio::streambuf is similar to that defined in the C++ standard and used across the standard library's stream classes, wherein data is written to an output stream and read from an input stream. For example, one could use std::cout.put() to write to the output stream, and std::cin.get() to read from the input stream.
When manually controlling the streambuf input and output sequences, the general lifecycle of data is as follows:
Buffers get allocated with prepare() for the output sequence.
After data has been written into the output sequence's buffers, the data will be commit()ed. This committed data is removed from the output sequence and appended to the input sequence from which it can be read.
Data is read from the input sequence's buffers obtained via data().
Once data has been read, it can then be removed from the input sequence by consume().
When using Boost.Asio operations that operate on streambuf, or stream objects that use a streambuf, such as std::ostream, the underlying input and output sequences will be properly managed. If a buffer is provided to an operation instead, such as passing prepare() to a read operation or data() to a write operation, then one must explicitly handle the commit() and consume().
Here is an annotated version of the example code which writes directly from a streambuf to a socket:
// The input and output sequence are empty.
boost::asio::streambuf b;
std::ostream os(&b);
// prepare() and write to the output sequence, then commit the written
// data to the input sequence. The output sequence is empty and
// input sequence contains "Hello, World!\n".
os << "Hello, World!\n";
// Read from the input sequence, writing to the socket. The input and
// output sequences remain unchanged.
size_t n = sock.send(b.data());
// Remove 'n' bytes from the input sequence. If the send operation sent
// the entire buffer, then the input sequence would be empty.
b.consume(n);
And here is the annotated example for reading from a socket directly into a streambuf. The annotations assume that the word "hello" has been received, but not yet read, on the socket:
boost::asio::streambuf b;
// prepare() 512 bytes for the output sequence. The input sequence
// is empty.
auto bufs = b.prepare(512);
// Read from the socket, writing into the output sequence. The
// input sequence is empty and the output sequence contains "hello".
size_t n = sock.receive(bufs);
// Remove 'n' (5) bytes from output sequence appending them to the
// input sequence. The input sequence contains "hello" and the
// output sequence has 507 bytes.
b.commit(n);
// The input and output sequence remain unchanged.
std::istream is(&b);
std::string s;
// Read from the input sequence and consume the read data. The string
// 's' contains "hello". The input sequence is empty, the output
// sequence remains unchanged.
is >> s;
Note how in the above examples, the stream objects handled committing and consuming the streambuf's output and input sequences. However, when the buffers themselves were used (i.e. data() and prepare()), the code needed to explicitly handle commits and consumes.

"Everything is relative"
Albert Einstein
The documentation says:
Characters written to the output sequence of a basic_streambuf object are appended to the input sequence of the same object.
From the point of view of the streambuf, it reads from its output sequence and writes into its input sequence, which might seem kind of inverted, but things make sense if you think of the streambuf as a pipe.
From the point of view of the user (anything that uses the streambuf, including sockets), you write into the output sequence of the streambuf and read from its input sequence, which seems more natural.
So yeah, the same way left and right are inverted depending on what you're facing, inputs and outputs are inverted depending on which side you look at them from.
"Don't believe every quote you read on the internet, because I totally didn't say that"
Albert Einstein
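The "pipe" picture can be tried out without Boost at all. What follows is only an analogy: it uses std::stringbuf as a stand-in for boost::asio::streambuf (it has no prepare()/commit()/data()/consume()), and the helper names through_pipe and demo_through_pipe are made up for illustration. It shows the user writing on the output side and reading the same bytes back on the input side of one buffer.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes `text` into one buffer through an ostream and
// reads the first whitespace-delimited token back through an istream.
std::string through_pipe(const std::string& text) {
    std::stringbuf buf;         // in | out by default: one buffer, two ends
    std::ostream writer(&buf);  // user side: write into the output sequence
    std::istream reader(&buf);  // user side: read from the input sequence

    writer << text;             // what was written becomes readable

    std::string token;
    reader >> token;            // stays empty if nothing could be extracted
    return token;
}

// Usage sketch: call this from main().
void demo_through_pipe() {
    assert(through_pipe("hello world") == "hello");
}
```

The writer never touches the "input" side and the reader never touches the "output" side, yet the data flows from one to the other, which is the sense in which the naming is chosen from the user's point of view.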

Related

How to 'read' from a (binary==true) boost::beast::websocket::stream<tcp::socket> into a buffer (boost::beast::flat_buffer?) so it is not escaped?

I am using boost::beast to read data from a websocket into a std::string. I am closely following the example websocket_sync_client.cpp in boost 1.71.0, with one change--the I/O is sent in binary, there is no text handler at the server end, only a binary stream. Hence, in the example, I added one line of code:
// Make the stream binary?? https://github.com/boostorg/beast/issues/1045
ws.binary(true);
Everything works as expected, I 'send' a message, then 'read' the response to my sent message into a std::string using boost::beast::buffers_to_string:
// =============================================================
// This buffer will hold the incoming message
beast::flat_buffer wbuffer;
// Read a message into our buffer
ws.read(wbuffer);
// =============================================================
// ==flat_buffer to std::string=================================
string rcvdS = beast::buffers_to_string(wbuffer.data());
std::cout << "<string_rcvdS>" << rcvdS << "</string_rcvdS>" << std::endl;
// ==flat_buffer to std::string=================================
This just about works as I expected, except there is some kind of escaping happening on the data of the (binary) stream.
There is no doubt some layer of boost logic (perhaps character traits?) that has enabled/caused all non-printable characters to be '\u????' escaped, human-readable text.
The binary data that is read contains many (intentional) non-printable ASCII control characters to delimit/organize chunks of data in the message:
I would rather not have the stream escaping these non-printable characters, since I will have to "undo" that effort anyway, if I cannot coerce the 'read' buffer into leaving the data as-is, raw. If I have to find another boost API to undo the escaping, that is just wasted processing that no doubt is detrimental to performance.
My question has to have a simple solution. How can I make the flat_buffer filled by ws.read (and the resulting 'rcvdS') contain truly raw, unescaped bytes of data? Is it possible, or is it necessary for me to simply choose a different buffer template/class, so that the escaping does not happen?
Here is a visual aid - showing expected vs. actual data:
Beast does not alter the contents of the message in any way. The only thing that binary() and text() do is set a flag in the message which the other end receives. Text messages are validated against the legal character set, while binary messages are not. Message data is never changed by Beast. buffers_to_string just transfers the bytes in the buffer to a std::string; it does not escape anything. So if the buffer contains a null, or let's say a ctrl+A, you will get a 0x00 and a 0x01 in the std::string respectively.
If your message is being encoded or translated, it isn't Beast that is doing it. Perhaps it is a consequence of writing the raw bytes to the std::cout? Or it could be whatever you are using to display those messages in the image you posted. I note that the code you provided does not match the image.
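One way to rule out the display layer is to look at the received bytes in hex instead of printing them raw. The following is a minimal sketch using only the standard library; to_hex and demo_to_hex are hypothetical helper names, and the input string stands in for whatever beast::buffers_to_string(wbuffer.data()) returned in the question.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: renders every byte of `data` as two lowercase hex
// digits separated by spaces, so control characters show up as e.g. "01".
std::string to_hex(const std::string& data) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char byte : data) {
        if (!out.empty()) {
            out += ' ';
        }
        out += digits[byte >> 4];
        out += digits[byte & 0x0F];
    }
    return out;
}

// Usage sketch: a std::string holds NUL and ctrl+A bytes unchanged.
void demo_to_hex() {
    const std::string raw("A\x00\x01", 3);  // 'A', 0x00, 0x01
    assert(raw.size() == 3);
    assert(to_hex(raw) == "41 00 01");
}
```

If the dump shows a single 01 byte where the terminal showed an escape sequence, the escaping came from the display. If it shows the bytes of a literal backslash-u sequence, the data was already escaped before it reached the client.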
If anyone else lands here, rest assured, it is your server end, not the client end that is escaping your data.

Why does std::cin's extractor operator wait for user input?

I know that this may be a stupid question but I am not sure how getting user input from the terminal actually works.
I understand input/output conceptually and I have no problem using them but I am lost when it comes to how these are actually implemented at a basic level.
As far as I know, all stream objects use some type of buffer. If you extract all the characters, you reach EOF. This is the part I am probably wrong about and would like to know more about. For instance, when we use std::cin's extraction operator, it waits for input. How does it differentiate between waiting for input and reaching EOF (nothing else to read)?
std::cin doesn't do anything special. Like all file input, it
emits a system level read (read in Unix, ReadFile in
Windows), for enough bytes to fill its buffer (usually something
well over 1K today). It is the system which detects that the
input is from a keyboard, and behaves differently: from a file,
the system will read as many bytes as are available, up to the
end of file or the number requested, and return immediately.
From the keyboard, the system will normally read the characters
into an internal buffer until enter, to allow editing (back
space, etc.), and only on enter will it pass this buffer back to
the caller (after having appended the new line marker).
EDIT:
Sort of as a summary of the elements mentioned in the
discussion: I'll take as an example what happens in a Unix
system (but Windows is basically the same, modulo the way it
reports the different information). The istream itself is
buffered. When you try to extract a character (an >>
operator, istream::get, etc.), the stream will return it from
its buffer. If there are no more characters left in the buffer,
it will make a read request to the system, with the address
and the size of its buffer. (On today's systems, I would be
surprised to see a buffer of less than 1K.) What the system
does with it will depend on what the file descriptor designates:
a file
The system will copy bytes from the current position in the
file, until it has filled the buffer or reached end of file. It
returns the number of bytes it copies (or -1, if there is an
error).
a keyboard
For keyboards, the system maintains an internal buffer of
its own, and reads line by line. This buffer will only be
considered "ready" when the user presses enter; before that, the
system will simply not return from the `read`. This allows the
system to implement line editing; e.g. processing things like
a backspace. When you hit enter, the system adds the (system
specific) new line sequence to the buffer, and returns with the
number of characters it has copied into the buffer. (Thus, not
0, since there is the new line.) This procedure can be modified
in two ways: both Unix and Windows have a special character
(control-D under Unix, control-Z under Windows) which tells the
system to return immediately from the read, with whatever the
buffer contains at the moment. If you're at the start of
a line, the buffer contains nothing, the `read` returns
0 characters read, and the stream treats it as an end of file.
And if the stream buffer size is less than the number of characters in
the line (because you've calmly typed in 100000 characters
without a new line), `read` will return the maximum that will
fit in the buffer, and the next `read` will return immediately
with the rest of the line (or the next n `read` will return
immediately, until the entire line has been read).
a pipe
The system will wait until there is at least one byte in the
pipe, or until there are no more processes left with the pipe
open for write. It will then copy whatever is available, up to
the number of characters requested, and return the number
copied (0 if the pipe is empty and the write side is closed).
If the read indicates an error, the stream will set badbit;
if the read returns 0 characters read, the stream will treat
it as end of file.
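From the program's side, all of this is visible only through the stream state. The sketch below (the names ReadResult, count_words and demo_count_words are made up for illustration) extracts words until extraction fails, then asks the stream whether it stopped because of end of file or because of a read error. It takes any std::istream, so it behaves the same for a file, a pipe, or std::cin.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

struct ReadResult {
    std::size_t words;  // number of whitespace-delimited tokens extracted
    bool hit_eof;       // the source reported "no more data"
    bool hit_error;     // the underlying read failed (badbit)
};

// Hypothetical helper: reads words until the stream stops delivering them.
ReadResult count_words(std::istream& in) {
    ReadResult result{0, false, false};
    std::string word;
    while (in >> word) {  // may block inside the system-level read
        ++result.words;
    }
    result.hit_eof = in.eof();
    result.hit_error = in.bad();
    return result;
}

// Usage sketch: an in-memory stream plays the role of a finite file.
void demo_count_words() {
    std::istringstream source("one two\nthree\n");
    const ReadResult r = count_words(source);
    assert(r.words == 3 && r.hit_eof && !r.hit_error);
}
```

Called as count_words(std::cin) on a terminal, the loop simply sits inside the extraction until a line is entered, and it ends only when the end-of-file key (control-D or control-Z, as described above) is typed at the start of a line.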
A std::istream might have a limited amount of data (a file on a hard drive, ...) or an unlimited amount of data (a sensor, ..., interactive user input). In addition, there might be a special character representing EOF.
An interactive input from a console/terminal is an infinite input. Unless a special character is entered (Linux: Ctrl-D), the stream never reaches EOF.
A non-interactive input from a pipe will (should/might) end in an EOF.

How does getline work with cin?

I feel like there are a lot of similar questions, so I'm really sorry if this is a duplicate. I couldn't find the answer to this specific question, though.
I am confused as to how getline works when cin is passed to it, because my understanding is that it should be calling cin each time it is called. When working with code that was in a book I'm reading though, getline is called several times yet only one input is sent. The cin object is not called from anywhere except for within these getline calls.
What's going on here? When getline is reached does the program simply stop in its tracks and wait for the input stream to pass a value including the desired delimiter? If this is the case, do the subsequent getline calls just not have to wait because the input stream already has data including their respective delimiters? I ran a couple tests that would suggest this could be the case.
Here is the code:
string firstName;
getline(cin,firstName,',');
string lastName;
getline(cin,lastName,',');
string job;
getline(cin,job,'\n');
cout<<firstName<<" "<<lastName<<" is a "<<job<<endl;
Sorry again if this is a stupid question, but I looked around and genuinely could not find the answer. Thanks in advance for any help that can be provided!
Clarification:
This code outputs "First Last is a Job" for the console input "First,Last,Job\n"
A call to a function using cin is not actually a request for user input (at least not directly). It is a request for characters from the standard input. In normal program operation (where standard input is not being redirected from a file or other source), standard input is stored in a buffer. If the standard input buffer is empty, and cin is requesting more characters, then your system will request input from the user via the terminal (i.e. the keyboard). This input which the terminal requests is generally line oriented. That is, it waits for you to press the Enter key, then sends all the data to be stored in the standard input buffer. If cin gets all the characters it needs before the input buffer is empty, the remaining characters stay in the buffer until the next request.
So, for example, when you make this call:
getline(cin,firstName,',');
and the input buffer is empty, let's say the user inputs this:
Benjamin, Lindley, Software Developer[Enter]
First, the following string is stored in the input buffer:
"Benjamin, Lindley, Software Developer\n"
Then getline causes "Benjamin," to be read from the input buffer (but discards the comma).
" Lindley, Software Developer\n"
remains in the buffer for any future operations with cin.
getline does not "call" cin at all. cin is an object. Objects contain data. The data in cin is the information needed by input functions to read the standard input stream. If you wanted to read from a file, for instance, you'd open the file and pass the file object to getline instead.
When getline is called, the program reads whatever is in the input buffer. If the input buffer already contains the delimiter then getline will return right away. Otherwise it will wait.
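The behaviour can be reproduced without a keyboard by handing getline a stream that already holds a full line. This is a small sketch, with Person, read_person and demo_read_person as made-up names; std::istringstream stands in for the already-filled input buffer of cin.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

struct Person {
    std::string first_name;
    std::string last_name;
    std::string job;
};

// Hypothetical helper: three getline calls, each consuming up to (and
// discarding) its delimiter. Returns false if any of the reads fails.
bool read_person(std::istream& in, Person& out) {
    return std::getline(in, out.first_name, ',')
        && std::getline(in, out.last_name, ',')
        && std::getline(in, out.job);  // default delimiter is '\n'
}

// Usage sketch: one "typed" line satisfies all three calls.
void demo_read_person() {
    std::istringstream input("First,Last,Job\nleftover");
    Person p;
    assert(read_person(input, p));
    assert(p.first_name == "First" && p.last_name == "Last" && p.job == "Job");

    std::string rest;
    std::getline(input, rest);  // whatever was not consumed is still there
    assert(rest == "leftover");
}
```

With std::cin in place of the string stream, only the first getline would have to wait for the user. The second and third find their delimiters in what is already buffered and return right away.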

fscanf multiple lines [c++]

I am reading in a file with multiple lines of data like this:
:100093000202C4C0E0E57FB40005D0E0020C03B463
:1000A3000105D0E0022803B40205D0E0027C03027C
:1000B30002E3C0E0E57FB40005D0E0020C0BB4011D
I am reading in values byte by byte and storing them in an array.
fscanf_s(in_file,"%c", &sc); // start code
fscanf_s(in_file,"%2X", &iByte_Count); // byte count
fscanf_s(in_file,"%4X", &iAddr); // 2 byte address
fscanf_s(in_file,"%2X", &iRec_Type); // record type
for(int i=0; i<iByte_Count; i++)
{
fscanf_s(in_file,"%2X", &iData[i]);
iArray[(iMaskedAddr/16)][iMaskedNumMove+3+i]=iData[i];
}
fscanf_s(in_file,"%2X", &iCkS);
This is working great except when I get to the end of the first line. I need this to repeat until I get to the end of the file but when I put this in a loop it craps out.
Can I force the position to the beginning of the next line?
I know I can use a stream and all that but I am dealing with this method.
Thanks for the help
My suggestion is to dump fscanf_s and use either fgets or std::getline.
That said, your issue is handling the newlines, and the next beginning of record token, the ':'.
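One method is to read single characters with fscanf_s("%c") until the ':' character is read or the end of file is reached. Note that fscanf_s requires the buffer size as an extra argument after each %c destination:
char start_of_record = '\0';
// Skip the newline (and anything else) up to the next ':'.
// The loop also stops when the read fails, e.g. at end of file.
while (fscanf_s(infile, "%c", &start_of_record, 1u) == 1
       && start_of_record != ':')
{
}
// If start_of_record is not ':' here, the file has ended.
// Now process the header....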
The data the OP is reading is a standard format for transmitting binary data, usually for downloading into Flash Memories and EPROMs.
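For comparison, here is a rough sketch of the std::getline route: read a whole line, then decode it in memory, so newlines never need special handling. It assumes the records follow the layout implied by the question's code (':' start code, 2-digit byte count, 4-digit address, 2-digit record type, data bytes, 2-digit checksum) and that the checksum is the usual one for this kind of hex file, where all bytes of a record sum to zero modulo 256. HexRecord, hex_byte, parse_record and demo_parse_record are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct HexRecord {
    unsigned byte_count = 0;
    unsigned address = 0;
    unsigned record_type = 0;
    std::vector<std::uint8_t> data;
    unsigned checksum = 0;
    bool checksum_ok = false;
};

// Converts the two hex digits starting at `pos` into a value 0..255.
static unsigned hex_byte(const std::string& line, std::size_t pos) {
    assert(pos + 2 <= line.size());
    return static_cast<unsigned>(std::stoul(line.substr(pos, 2), nullptr, 16));
}

// Parses one line (without its trailing newline). Returns false if the
// line is not shaped like a record; a bad checksum is reported separately.
bool parse_record(const std::string& line, HexRecord& rec) {
    // ':' + count(2) + address(4) + type(2) + checksum(2) = 11 characters.
    if (line.size() < 11 || line[0] != ':') return false;
    for (std::size_t i = 1; i < line.size(); ++i) {
        if (!std::isxdigit(static_cast<unsigned char>(line[i]))) return false;
    }
    rec.byte_count = hex_byte(line, 1);
    if (line.size() != 11 + 2 * rec.byte_count) return false;

    rec.address = (hex_byte(line, 3) << 8) | hex_byte(line, 5);
    rec.record_type = hex_byte(line, 7);

    unsigned sum = rec.byte_count + (rec.address >> 8)
                 + (rec.address & 0xFF) + rec.record_type;
    rec.data.clear();
    for (unsigned i = 0; i < rec.byte_count; ++i) {
        const unsigned value = hex_byte(line, 9 + 2 * i);
        rec.data.push_back(static_cast<std::uint8_t>(value));
        sum += value;
    }
    rec.checksum = hex_byte(line, 9 + 2 * rec.byte_count);
    rec.checksum_ok = ((sum + rec.checksum) & 0xFF) == 0;
    return true;
}

// Usage sketch with the first line from the question.
void demo_parse_record() {
    HexRecord rec;
    assert(parse_record(":100093000202C4C0E0E57FB40005D0E0020C03B463", rec));
    assert(rec.byte_count == 16 && rec.address == 0x0093);
    assert(rec.record_type == 0 && rec.checksum_ok);
}
```

In a real reader this would sit inside a loop of the form while (std::getline(file, line)), which moves to the next record on its own. A trailing '\r' from a Windows-style file would need to be stripped before calling parse_record.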
Your topic clearly states that you are using C++ so, if I may, I suggest you use the correct STL stream manipulators.
To read line-by-line, you can use ifstream::getline. But again, you are not reading the file line by line, you are reading it field by field. So, you should try using ifstream::read, which lets you choose the amount of bytes to read from the stream.
UPDATE:
While doing an unrelated search over the net, I found out about a library called IOF which may help you with this task. Check it out.

How to get boost::iostream to operate in a mode comparable to std::ios::binary?

I have the following question on boost::iostreams. If someone is familiar with writing filters, I would appreciate your advice / help.
I am writing a pair of multichar filters that work with boost::iostreams::filtering_stream as a data compressor and decompressor.
I started from writing a compressor, picked up some algorithm from lz-family and now am working on a decompressor.
In a couple of words, my compressor splits data into packets, which are encoded separately and then flushed to my file.
When I have to restore data from my file (in programming terms, receive a read(byte_count) request), I have to read a full packed block, buffer it, unpack it and only then return the requested number of bytes. I've implemented this logic, but right now I'm struggling with the following problem:
When my data is packed, any byte value can appear in the output file. And I have trouble reading a file that contains the symbol 0x1A (hex 1A, decimal 26) using boost::iostreams::read(...., size).
If I were using std::ifstream, for example, I would set std::ios::binary mode and then this symbol could be read without trouble.
Is there any way to achieve the same when implementing a boost::iostreams filter which uses the boost::iostreams::read routine to read a char sequence?
Some code here:
// Compression
// -----------
filtering_ostream out;
out.push(my_compressor());
out.push(file_sink("file.out"));
// Compress the 'file.in' to 'file.out'
std::ifstream stream("file.in");
out << stream.rdbuf();
// Decompression
// -------------
filtering_istream in;
in.push(my_decompressor());
in.push(file_source("file.out"));
std::string res;
while (in) {
std::string t;
// My decompressor wants to retrieve the full block from input (say, 4096 bytes)
// but instead retrieves 150 bytes because meets '1A' char in the char sequence
// That obviously happens because file should be read as a binary one, but
// how do I state that?
std::getline(in, t); // <--------- The error happens here
res += t;
}
Short answer for reading a file as binary: specify ios_base::binary when opening the file stream.
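To illustrate what the flag buys you with plain standard streams, here is a minimal round-trip sketch (write_bytes, read_bytes and demo_binary_roundtrip are made-up names, and the file name is arbitrary). The payload deliberately contains 0x1A, which the Windows runtime is known to treat as an end-of-file marker when a file is read in text mode, plus "\r\n" and a NUL byte. Whether the Boost.Iostreams file devices in the question need the same flag passed to them is an assumption to check against their documentation.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes `bytes` to `path` without any translation.
bool write_bytes(const std::string& path, const std::string& bytes) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    out.flush();
    return out.good();
}

// Hypothetical helper: reads the whole file back, byte for byte.
std::string read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Usage sketch: every byte, including 0x1A, survives the round trip.
void demo_binary_roundtrip() {
    const std::string path = "binary_roundtrip_demo.bin";
    // 'a' 'b' 0x1A 'c' 'd' '\r' '\n' '\0' 'e' 'f' (10 bytes)
    const std::string payload("ab\x1A" "cd\r\n\0ef", 10);
    assert(write_bytes(path, payload));
    assert(read_bytes(path) == payload);
    std::remove(path.c_str());
}
```

Note also that the loop in the question uses std::getline, which splits the data at every '\n' byte and drops it. For compressed data, reading fixed-size blocks with read() is the safer choice regardless of the open mode.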