How to read exactly one line? - c++

I have a Linux file descriptor (from socket), and I want to read one line.
How to do it in C++?

If you are reading from a TCP socket, you can't assume where the end of the line will arrive: a single read() may return part of a line or several lines at once.
Therefore you'll need something like this:
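std::string line;
char buf[1024];
ssize_t n;
while ((n = read(fd, buf, sizeof buf)) > 0) // stops on EOF (0) and on error (-1)
{
    // buf is not NUL-terminated, so append exactly the n bytes that were read
    line.append(buf, n);
    // the delimiter may straddle two reads, so search the accumulated data
    if (line.find("\n\n") != std::string::npos)
        break;
}
// line now holds everything up to the delimiter, plus any bytes that arrived after it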
Assuming you are using "\n\n" as a delimiter. (I didn't test that code snippet ;-) )
On a UDP socket, that is another story. The sender may send a packet containing a whole line. The receiver is guaranteed to receive the packet as a single unit, if it receives it at all, as UDP is not as reliable as TCP.

Pseudocode:
char newline = '\n';
file fd;
initialize(fd);
string line;
char c;
while( newline != (c = readchar(fd)) ) {
line.append(c);
}
Something like that.
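The pseudocode above can be sketched with plain read() calls as follows. This is a minimal sketch, not a definitive implementation: the function name readLineByByte is made up for illustration, and it assumes a blocking file descriptor. It issues one system call per byte, so it is simple rather than fast.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <unistd.h>

// Hypothetical helper: reads bytes from fd one at a time until '\n' or end of file.
// Returns true if a line was read, false on error or if end of file was
// reached before any byte arrived. The newline is consumed but not stored.
bool readLineByByte(int fd, std::string& line) {
    line.clear();
    char c;
    for (;;) {
        ssize_t n = read(fd, &c, 1);
        if (n == 1) {
            if (c == '\n') return true;
            line.push_back(c);
        } else if (n == 0) {
            return !line.empty();  // EOF: report a final unterminated line
        } else if (errno != EINTR) {
            return false;          // real error
        }
        // n == -1 with EINTR: interrupted by a signal, retry the read
    }
}
```

Because nothing is read ahead, the descriptor can be handed to other code after the call without losing buffered data.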

Here is tested, quite efficient code:
bool ReadLine (int fd, string* line) {
// We read-ahead, so we store in static buffer
// what we already read, but not yet returned by ReadLine.
static string buffer;
// Do the real reading from fd until buffer has '\n'.
string::iterator pos;
while ((pos = find (buffer.begin(), buffer.end(), '\n')) == buffer.end ()) {
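char buf [1024];
ssize_t n = read (fd, buf, sizeof buf);
if (n <= 0) { // error (-1) or end of file (0): return what we have so far
*line = buffer;
buffer = "";
return false;
}
buffer.append (buf, n); // append exactly n bytes, so embedded '\0' bytes survive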
}
// Split the buffer around '\n' found and return first part.
*line = string (buffer.begin(), pos);
buffer = string (pos + 1, buffer.end());
return true;
}
It's also useful to ignore SIGPIPE, so that writing to a socket or pipe whose peer has closed fails with EPIPE instead of killing the process (then handle the error as shown above):
signal (SIGPIPE, SIG_IGN);
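To see what ignoring SIGPIPE changes, here is a small sketch that writes to a pipe with no reader. The function name errnoFromWriteToClosedPipe is invented for this illustration. Without the signal() call the default action for SIGPIPE would terminate the process; with it, write() returns -1 and sets errno to EPIPE.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper: writes one byte to a pipe whose read end is already
// closed and returns the errno reported by write(), or 0 if the write
// unexpectedly succeeded.
int errnoFromWriteToClosedPipe() {
    signal(SIGPIPE, SIG_IGN);          // a broken pipe now surfaces as an error code
    int fds[2];
    if (pipe(fds) == -1) return errno;
    close(fds[0]);                     // no reader is left
    errno = 0;
    ssize_t n = write(fds[1], "x", 1);
    int err = (n == -1) ? errno : 0;   // save errno before close() can change it
    close(fds[1]);
    return err;
}
```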

Using C++ sockets library:
class LineSocket : public TcpSocket
{
public:
LineSocket(ISocketHandler& h) : TcpSocket(h) {
SetLineProtocol(); // enable OnLine callback
}
void OnLine(const std::string& line) {
std::cout << "Received line: " << line << std::endl;
// send reply here
{
Send( "Reply\n" );
}
}
};
And using the above class:
int main()
{
try
{
SocketHandler h;
LineSocket sock(h);
sock.Open( "remote.host.com", port );
h.Add(&sock);
while (h.GetCount())
{
h.Select();
}
}
catch (const Exception& e)
{
std::cerr << e.ToString() << std::endl;
}
}
The library takes care of all error handling.
Find the library using google or use this direct link: http://www.alhem.net/Sockets/

Related

fork/pipes and running multiple programs

I wrote an engine for the game "draughts" some time ago and now I want to write a program that communicates with two of those engines via some protocol. In the end I hope to have something similar to the UCI protocol, which is widely known among programmers of chess engines.
The engine is supposed to receive all commands via stdin and send its responses via stdout.
I've created a dummy engine to test this, with some debug output before the if-statement to see whether the engine receives anything at all.
int main(){
std::cerr<<"MainEngine"<<std::endl;
while (!std::cin.eof()) {
std::string current;
std::getline(std::cin, current);
std::cerr<<"FromMain:"<<current<<std::endl;
if (current == "init") {
initialize();
std::cout << "ready" << "\n";
} else if (current == "hashSize") {
std::string hash;
std::getline(std::cin, hash);
setHashSize(1u << std::stoi(hash));
} else if (current == "position") {
std::string position;
std::getline(std::cin, position);
} else if (current == "move") {
std::string move;
std::getline(std::cin, move);
}
}
return 0;
}
and here is my attempt at the communication-part using pipes
struct Interface {
enum State {
Idle, Ready, Searching
};
const int &engineRead1;
const int &engineRead2;
const int &engineWrite1;
const int &engineWrite2;
State oneState;
State twoState;
void initEngines();
void writeMessage(const int &pipe, const std::string &message);
void processInput(const int &readPipe);
Interface &operator<<(const std::string message);
};
void Interface::processInput(const int &readPipe) {
std::string message;
char c;
while ((read(readPipe, &c, sizeof(char))) != -1) {
if (c == '\n') {
break;
} else {
message += c;
}
}
if (message == "ready") {
std::cout << "ReadyEngine" << std::endl;
}
}
void Interface::writeMessage(const int &pipe, const std::string &message) {
write(pipe, (char *) &message.front(), sizeof(char) * message.size());
}
int main(int argl, const char **argc) {
int numEngines = 2;
int mainPipe[numEngines][2];
int enginePipe[numEngines][2];
for (auto i = 0; i < numEngines; ++i) {
pipe(mainPipe[i]);
pipe(enginePipe[i]);
auto pid = fork();
if (pid < 0) {
std::cerr << "Error" << std::endl;
exit(EXIT_FAILURE);
} else if (pid == 0) {
dup2(mainPipe[i][0], STDIN_FILENO);
close(enginePipe[i][1]);
dup2(enginePipe[i][1], STDOUT_FILENO);
close(mainPipe[i][0]);
execlp("./engine", "engine", NULL);
}
close(enginePipe[i][0]);
close(mainPipe[i][1]);
std::string message = "init\n";
Interface inter{enginePipe[0][0], enginePipe[1][0], mainPipe[0][1], mainPipe[1][1]};
inter.writeMessage(inter.engineWrite1, message);
inter.writeMessage(inter.engineWrite2, message);
int status;
for (int k = 0; k < numEngines; ++k) {
wait(&status);
}
}
}
I am creating two child processes, one for each engine. In this test I simply send "init\n" to each engine and would expect the child processes to print "FromMain:init". However, I am only getting the output "MainEngine" from one of the child processes.
This is my first attempt at using pipes and I don't know where I messed up. I would appreciate some tips/help on how to properly set up the communication part.
close(enginePipe[i][1]);
dup2(enginePipe[i][1], STDOUT_FILENO);
You're closing a pipe and then trying to dup it. This doesn't work.
close(enginePipe[i][0]);
close(mainPipe[i][1]);
std::string message = "init\n";
Interface inter{enginePipe[0][0], enginePipe[1][0], mainPipe[0][1], mainPipe[1][1]};
And you're closing these pipes then trying to use them too. And making inter with all of the pipes each iteration through, instead of each only once.
I'd advise you to do something simple with two processes and one pipe, before trying complicated things like this with three processes and four pipes.
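As a starting point for the simpler setup suggested above, here is a sketch with two processes and one pipe. The function name bytesSeenByChild and the idea of reporting the byte count through the exit status are choices made for this illustration only. The child calls dup2() before it closes the original descriptors, and both sides close the pipe ends they do not use so that end of file can be detected.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: sends `message` to a forked child over a single pipe.
// The child makes the pipe its stdin, counts the bytes it reads until end of
// file, and reports that count (capped at 255) as its exit status.
// Returns the child's count, or -1 if something failed.
int bytesSeenByChild(const std::string& message) {
    int fds[2];                          // fds[0] = read end, fds[1] = write end
    if (pipe(fds) == -1) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        // Child: duplicate first, close afterwards.
        if (fds[0] != STDIN_FILENO) {
            dup2(fds[0], STDIN_FILENO);
            close(fds[0]);
        }
        close(fds[1]);                   // otherwise the child never sees EOF
        char buf[256];
        int total = 0;
        ssize_t n;
        while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0) total += n;
        _exit(total > 255 ? 255 : total);
    }

    // Parent: keep only the write end.
    close(fds[0]);
    ssize_t written = write(fds[1], message.data(), message.size());
    close(fds[1]);                       // signals EOF to the child

    int status = 0;
    if (waitpid(pid, &status, 0) == -1) return -1;
    if (written != static_cast<ssize_t>(message.size())) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Once this works, the child branch can be replaced by an exec of the real engine, and a second pipe can be added for the opposite direction.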

Read system call taking forever on Linux

I'm writing a TCP server application in C++.
I'm trying to read a line one char at a time from a socket, but the read() system call never returns.
string buffered_reader::read_line() {
string str;
int i = 0;
char ch;
do {
int len = conn.read_from_conn((void*)&ch, 1);
if (len == -1)
throw string("Error reading from connection!");
if (len == 0) // the peer closed the connection before a newline arrived
break;
str += ch;
} while (ch != '\n');
return str;
}
And here is the read_from_conn() function
int connectionplusplus::read_from_conn(void *buffer, int buffer_len) {
return read(this->connfd, buffer, buffer_len);
}
The problem is that connfd wasn't initialized.

Is there a better way to search a file for a string?

I need to search a (non-text) file for the byte sequence "9µ}Æ" (or "\x39\xb5\x7d\xc6").
After 5 hours of searching online this is the best I could do. It works but I wanted to know if there is a better way:
char buffer;
int pos=in.tellg();
// search file for string
while(!in.eof()){
in.read(&buffer, 1);
pos=in.tellg();
if(buffer=='9'){
in.read(&buffer, 1);
pos=in.tellg();
if(buffer=='µ'){
in.read(&buffer, 1);
pos=in.tellg();
if(buffer=='}'){
in.read(&buffer, 1);
pos=in.tellg();
if(buffer=='Æ'){
cout << "found";
}
}
}
}
in.seekg((streampos) pos);
Note:
I can't use getline(). It's not a text file so there are probably not many line breaks.
Before I tried using a multi-character buffer and then copying the buffer to a C++ string, and then using string::find(). This didn't work because there are many '\0' characters throughout the file, so the sequence in the buffer would be cut very short when it was copied to the string.
Similar to what bames53 posted; I used a vector as a buffer:
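std::ifstream ifs("file.bin", std::ios::binary); // binary mode: no newline translation
ifs.seekg(0, std::ios::end);
std::streamsize f_size = ifs.tellg();
ifs.seekg(0, std::ios::beg);
std::vector<unsigned char> buffer(f_size);
ifs.read(reinterpret_cast<char *>(buffer.data()), f_size); // read() expects a char*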
std::vector<unsigned char> seq = {0x39, 0xb5, 0x7d, 0xc6};
bool found = std::search(buffer.begin(), buffer.end(), seq.begin(), seq.end()) != buffer.end();
If you don't mind loading the entire file into an in-memory array (or using mmap() to make it look like the file is in memory), you could then search for your character sequence in-memory, which is a bit easier to do:
// Works much like strstr(), except it looks for a binary sub-sequence rather than a string sub-sequence
const char * MemMem(const char * lookIn, int numLookInBytes, const char * lookFor, int numLookForBytes)
{
if (numLookForBytes == 0) return lookIn; // hmm, existential questions here
// Try every possible starting offset. Restarting the comparison at each offset
// keeps overlapping partial matches (e.g. "aab" inside "aaab") from being missed.
for (int i=0; i+numLookForBytes<=numLookInBytes; i++)
{
if (memcmp(&lookIn[i], lookFor, numLookForBytes) == 0) return &lookIn[i];
}
return NULL;
}
.... then you can just call the above function on the in-memory data array:
const char * ret = MemMem(theInMemoryArrayContainingFilesBytes, numBytesInFile, myShortSequence, 4);
if (ret != NULL) printf("Found it at offset %i\n", (int)(ret-theInMemoryArrayContainingFilesBytes));
else printf("It's not there.\n");
This program loads the entire file into memory and then uses std::search on it.
int main() {
std::string filedata;
{
std::ifstream fin("file.dat", std::ios::binary);
std::stringstream ss;
ss << fin.rdbuf();
filedata = ss.str();
}
std::string key = "\x39\xb5\x7d\xc6";
auto result = std::search(std::begin(filedata), std::end(filedata),
std::begin(key), std::end(key));
if (std::end(filedata) != result) {
std::cout << "found\n";
// result is an iterator pointing at '\x39'
}
}
const char delims[] = { '\x39', '\xb5', '\x7d', '\xc6' }; // char literals avoid a narrowing error where char is signed
char buffer[4];
const size_t delim_size = 4;
const size_t last_index = delim_size - 1;
for ( size_t i = 0; i < last_index; ++i )
{
if ( ! ( is.get( buffer[i] ) ) )
return false; // stream too short
}
while ( is.get(buffer[last_index]) )
{
if ( memcmp( buffer, delims, delim_size ) == 0 )
break; // you have arrived
memmove( buffer, buffer + 1, last_index ); // slide the window forward by one byte
}
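You are looking for 4 bytes, so you can compare them as one 32-bit integer (this version assumes a little-endian machine):
unsigned int delim = 0xc67db539; // bytes 39 b5 7d c6 as stored in little-endian order
unsigned int uibuffer;
char * buffer = reinterpret_cast<char *>(&uibuffer);
for ( size_t i = 0; i < 3; ++i )
{
if ( ! ( is.get( buffer[i] ) ) )
return false; // stream too short
}
while ( is.get(buffer[3]) )
{
if ( uibuffer == delim )
break; // you have arrived
uibuffer >>= 8; // drop the oldest byte
}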
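The integer comparison above depends on how the machine lays out the bytes of an unsigned int in memory. A possible byte-order-independent variant, sketched here under the assumption that the input is any std::istream, shifts each new byte into a 32-bit window arithmetically instead of through a char pointer. The name findPattern is made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans `is` for the byte sequence 39 b5 7d c6 and
// returns the offset of its first byte, or -1 if the stream ends first.
long findPattern(std::istream& is) {
    const std::uint32_t pattern = 0x39b57dc6u;  // bytes in stream order
    std::uint32_t window = 0;
    long count = 0;                             // bytes consumed so far
    char c;
    while (is.get(c)) {
        // Shift the oldest byte out at the top and the newest byte in at the bottom.
        window = (window << 8) | static_cast<unsigned char>(c);
        ++count;
        if (count >= 4 && window == pattern) return count - 4;
    }
    return -1;
}
```

Since the window is built with shifts rather than by reinterpreting memory, the same constant works on little-endian and big-endian hosts.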
Because you said you cannot search the entire file because of null terminator characters in the string, here's an alternative for you, which reads the entire file in and uses recursion to find the first occurrence of a string inside of the whole file.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
string readFile (const char *fileName) {
ifstream fi (fileName, ios::binary);
if (!fi) {
cerr << "ERROR: Cannot open file" << endl;
return string (); // empty string; constructing a string from NULL is undefined behaviour
}
return string ((istreambuf_iterator<char>(fi)), istreambuf_iterator<char>());
}
bool findFirstOccurrenceOf_r (const string &haystack, const char *needle, int haystack_pos, int needle_pos, int needle_len) {
if (needle_pos == needle_len)
return true;
if (haystack[haystack_pos] == needle[needle_pos])
return findFirstOccurrenceOf_r (haystack, needle, haystack_pos+1, needle_pos+1, needle_len);
return false;
}
int findFirstOccurrenceOf (const string &haystack, const char *needle, int length) {
int pos = -1;
// The cast keeps a haystack shorter than the needle from wrapping around,
// and <= lets a match end exactly at the last byte.
for (int i = 0; i <= (int) haystack.length() - length; i++) {
if (findFirstOccurrenceOf_r (haystack, needle, i, 0, length))
return i;
}
return pos;
}
int main () {
char str_to_find[4] = {'\x39', '\xB5', '\x7D', '\xC6'};
string contents = readFile ("input");
int pos = findFirstOccurrenceOf (contents, str_to_find, 4);
cout << pos << endl;
}
If the file is not too large, your best solution would be to load the whole file into memory, so you don't need to keep reading from the drive. If the file is too large to load at once, you would want to load it one chunk at a time. But if you do load in chunks, make sure you check the edges of the chunks: a chunk boundary may fall right in the middle of the string you're searching for.
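One way to handle the chunk edges is to carry the last needle-length-minus-one bytes of each chunk over into the next search. The following is a rough sketch of that idea, assuming the data comes from a std::istream opened in binary mode. The function name findInChunks and its parameters are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: searches `in` for `needle`, reading `chunkSize` bytes
// at a time. Returns the offset of the first match, or -1 if there is none.
long findInChunks(std::istream& in, const std::string& needle, std::size_t chunkSize) {
    if (needle.empty()) return 0;
    if (chunkSize == 0) return -1;
    std::string window;                  // carried-over tail plus current chunk
    std::vector<char> chunk(chunkSize);
    long windowStart = 0;                // stream offset of window[0]
    // A short final chunk makes read() fail, so also check gcount().
    while (in.read(chunk.data(), chunk.size()) || in.gcount() > 0) {
        window.append(chunk.data(), static_cast<std::size_t>(in.gcount()));
        std::string::iterator it =
            std::search(window.begin(), window.end(), needle.begin(), needle.end());
        if (it != window.end()) return windowStart + (it - window.begin());
        // Keep only the last needle.size()-1 bytes: a match could start
        // there and finish in the next chunk.
        if (window.size() >= needle.size()) {
            std::size_t drop = window.size() - (needle.size() - 1);
            window.erase(0, drop);
            windowStart += static_cast<long>(drop);
        }
    }
    return -1;
}
```

Memory use stays bounded by roughly one chunk plus the needle length, regardless of file size.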

Infinite read from socket

What is the right way to read chunked data (from an HTTP request) from a socket?
sf::TcpSocket socket;
socket.connect("0.0.0.0", 80);
std::string message = "GET /address HTTP/1.1\r\n";
socket.send(message.c_str(), message.size() + 1);
// Receive an answer from the server
char buffer[128];
std::size_t received = 0;
socket.receive(buffer, sizeof(buffer), received);
std::cout << "The server said: " << buffer << std::endl;
But the server sends infinite data and socket.receive doesn't return control. Is there a right way to read chunked data part by part? (The answer is chunked data.)
The right way to process HTTP requests is to use a higher-level library that manages the socket connections for you. In C++ one example would be pion-net; there are others too like Mongoose (which is C, but fine to use in C++).
Infinite data is theoretically possible, but practical implementations differ from protocol to protocol.
Approach 1 - Many protocols send the size in the first few bytes (e.g. 4 bytes), and you can read it in a while loop:
{
    unsigned char header[4];
    std::size_t got = 0, received = 0;
    // TCP may deliver fewer bytes than requested, so loop until all 4 have arrived
    // (sf::Socket::Done is the SFML 2 name of the success status)
    while (got < sizeof(header) &&
           socket.receive(header + got, sizeof(header) - got, received) == sf::Socket::Done)
        got += received;
    // then use the same kind of loop to read the amount of data the header announces,
    // allocating the buffer accordingly
}
Approach 2 - Or, as in your case, where you don't know the length (infinite):
{
    char *buffer = (char *)malloc(TCP_MAX_BUF_SIZE);
    std::size_t total = 0, received = 0;
    // store each read after the bytes already received; stop on error or disconnect
    while (total < TCP_MAX_BUF_SIZE &&
           socket.receive(buffer + total, TCP_MAX_BUF_SIZE - total, received) == sf::Socket::Done)
        total += received;
    // do something with your data
    free(buffer);
}
You will have to break at some point and process your data: dispatch it to another thread or release the memory.
If by "chunked data" you are referring to the Transfer-Encoding: chunked HTTP header, then you need to read each chunk and parse the chunk headers to know how much data to read in each chunk and to know when the last chunk has been received. You cannot just blindly call socket.receive(), as chunked data has a defined structure to it. Read RFC 2616 Section 3.6.1 for more details.
You need to do something more like the following (error handling omitted for brevity - DON'T omit it in your real code):
std::string ReadALine(sf::TcpSocket &socket)
{
std::string result;
// read from socket until a LF is encountered, then
// return everything up to, but not including, the
// LF, stripping off CR if one is also present...
return result;
}
void ReadHeaders(sf::TcpSocket &socket, std::vector<std::string> &headers)
{
std::string line;
do
{
line = ReadALine(socket);
if (line.empty()) return;
headers.push_back(line);
}
while (true);
}
std::string UpperCase(const std::string &s)
{
std::string result = s;
std::transform(result.begin(), result.end(), result.begin(), ::toupper); // for_each would discard the converted characters
return result;
}
std::string GetHeader(const std::vector<std::string> &headers, const std::string &s)
{
std::string prefix = UpperCase(s) + ":";
for (std::vector<std::string>::const_iterator iter = headers.begin(), end = headers.end(); iter != end; ++iter)
{
if (UpperCase(*iter).compare(0, prefix.length(), prefix) == 0)
return iter->substr(prefix.length());
}
return std::string();
}
sf::TcpSocket socket;
socket.connect("0.0.0.0", 80);
std::string message = "GET /address HTTP/1.1\r\nHost: localhost\r\n\r\n";
socket.send(message.c_str(), message.length());
std::vector<std::string> headers;
std::string statusLine = ReadALine(socket);
ReadHeaders(socket, headers);
// Refer to RFC 2616 Section 4.4 for details about how to properly
// read a response body in different situations...
int statusCode;
sscanf(statusLine.c_str(), "HTTP/%*d.%*d %d %*s", &statusCode);
if (
((statusCode / 100) != 1) &&
(statusCode != 204) &&
(statusCode != 304)
)
{
std::string header = GetHeader(headers, "Transfer-Encoding");
if (UpperCase(header).find("CHUNKED") != std::string::npos)
{
std::string line;
std::string extensions;
std::string::size_type pos;
std::size_t chunkSize;
do
{
line = ReadALine(socket);
pos = line.find(";");
if (pos != std::string::npos)
{
extensions = line.substr(pos+1);
line.resize(pos);
}
else
extensions.clear();
chunkSize = 0;
chunkSize = std::strtoul(line.c_str(), NULL, 16); // chunk sizes are hexadecimal
if (chunkSize == 0)
break;
// read exactly chunkSize bytes into someBuffer, calling socket.receive() in a loop until all of them have arrived...
ReadALine(socket);
// process extensions as needed...
// copy someBuffer into your real buffer...
}
while (true);
std::vector<std::string> trailer;
ReadHeaders(socket, trailer);
// merge trailer into main header...
}
else
{
header = GetHeader(headers, "Content-Length");
if (!header.empty())
{
uint64_t contentLength = 0;
contentLength = std::strtoull(header.c_str(), NULL, 10);
// read from socket until contentLength number of bytes have been read...
}
else
{
// read from socket until disconnected...
}
}
}
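To make the chunk structure concrete, here is a sketch that decodes a chunked body that has already been received into a string, so it can be tried without a socket. It is an illustration under simplifying assumptions, not a complete HTTP parser: the function name decodeChunked is made up, chunk extensions are skipped, and trailer headers are ignored.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: decodes an HTTP/1.1 chunked body held entirely in `raw`.
// Returns true and fills `body` on success, false if `raw` is truncated
// or malformed.
bool decodeChunked(const std::string& raw, std::string& body) {
    body.clear();
    std::size_t pos = 0;
    for (;;) {
        // Chunk header line: hex size, optional ";extensions", then CRLF.
        std::size_t eol = raw.find("\r\n", pos);
        if (eol == std::string::npos) return false;
        std::string sizeField = raw.substr(pos, eol - pos);
        sizeField = sizeField.substr(0, sizeField.find(';'));
        char* end = 0;
        unsigned long size = std::strtoul(sizeField.c_str(), &end, 16);
        if (end == sizeField.c_str()) return false;  // no hex digits at all
        pos = eol + 2;
        if (size == 0) return true;                  // last chunk reached
        // The chunk data must be present and followed by its own CRLF.
        std::size_t remaining = raw.size() - pos;
        if (size > remaining || remaining - size < 2) return false;
        if (raw.compare(pos + size, 2, "\r\n") != 0) return false;
        body.append(raw, pos, size);
        pos += size + 2;
    }
}
```

With a live socket the same steps apply, except that each line and each block of chunk data has to be read incrementally, as the outline above does with ReadALine().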

How to guarantee read() actually receives 100% of data sent by write() through named pipes

I've got the following two programs, one acting as a reader and the other as a writer. The writer seems to only send about 3/4 of the data correctly to be read by the reader. Is there any way to guarantee that all the data is being sent? I think I've got it set up so that it reads and writes reliably, but it still seems to miss 1/4 of the data.
Here's the source of the writer
#define pipe "/tmp/testPipe"
using namespace std;
queue<string> sproutFeed;
ssize_t r_write(int fd, char *buf, size_t size) {
char *bufp;
size_t bytestowrite;
ssize_t byteswritten;
size_t totalbytes;
for (bufp = buf, bytestowrite = size, totalbytes = 0;
bytestowrite > 0;
bufp += byteswritten, bytestowrite -= byteswritten) {
byteswritten = write(fd, bufp, bytestowrite);
if(errno == EPIPE)
{
signal(SIGPIPE,SIG_IGN);
}
if ((byteswritten) == -1 && (errno != EINTR))
return -1;
if (byteswritten == -1)
byteswritten = 0;
totalbytes += byteswritten;
}
return totalbytes;
}
void* sendData(void *thread_arg)
{
int fd, ret_val, count, numread;
string word;
char bufpipe[5];
ret_val = mkfifo(pipe, 0777); //make the sprout pipe
if (( ret_val == -1) && (errno != EEXIST))
{
perror("Error creating named pipe");
exit(1);
}
while(1)
{
if(!sproutFeed.empty())
{
string s;
s.clear();
s = sproutFeed.front();
int sizeOfData = s.length();
snprintf(bufpipe, 5, "%04d\0", sizeOfData);
char stringToSend[strlen(bufpipe) + sizeOfData +1];
bzero(stringToSend, sizeof(stringToSend));
strncpy(stringToSend,bufpipe, strlen(bufpipe));
strncat(stringToSend,s.c_str(),strlen(s.c_str()));
strncat(stringToSend, "\0", strlen("\0"));
int fullSize = strlen(stringToSend);
signal(SIGPIPE,SIG_IGN);
fd = open(pipe,O_WRONLY);
int numWrite = r_write(fd, stringToSend, strlen(stringToSend) );
cout << errno << endl;
if(errno == EPIPE)
{
signal(SIGPIPE,SIG_IGN);
}
if(numWrite != fullSize )
{
signal(SIGPIPE,SIG_IGN);
bzero(bufpipe, strlen(bufpipe));
bzero(stringToSend, strlen(stringToSend));
close(fd);
}
else
{
signal(SIGPIPE,SIG_IGN);
sproutFeed.pop();
close(fd);
bzero(bufpipe, strlen(bufpipe));
bzero(stringToSend, strlen(stringToSend));
}
}
else
{
if(usleep(.0002) == -1)
{
perror("sleeping error\n");
}
}
}
}
int main(int argc, char *argv[])
{
signal(SIGPIPE,SIG_IGN);
int x;
for(x = 0; x < 100; x++)
{
sproutFeed.push("All ships in the sea sink except for that blue one over there, that one never sinks. Most likley because it\'s blue and thats the mightiest colour of ship. Interesting huh?");
}
int rc, i , status;
pthread_t threads[1];
printf("Starting Threads...\n");
pthread_create(&threads[0], NULL, sendData, NULL);
rc = pthread_join(threads[0], (void **) &status);
}
Here's the source of the reader
#define pipe "/tmp/testPipe"
char dataString[50000];
using namespace std;
char *getSproutItem();
void* readItem(void *thread_arg)
{
while(1)
{
x++;
char *s = getSproutItem();
if(s != NULL)
{
cout << "READ IN: " << s << endl;
}
}
}
ssize_t r_read(int fd, char *buf, size_t size) {
ssize_t retval;
while (retval = read(fd, buf, size), retval == -1 && errno == EINTR) ;
return retval;
}
char * getSproutItem()
{
cout << "Getting item" << endl;
char stringSize[4];
bzero(stringSize, sizeof(stringSize));
int fd = open(pipe,O_RDONLY);
cout << "Reading" << endl;
int numread = r_read(fd,stringSize, sizeof(stringSize));
if(errno == EPIPE)
{
signal(SIGPIPE,SIG_IGN);
}
cout << "Read Complete" << endl;
if(numread > 1)
{
stringSize[numread] = '\0';
int length = atoi(stringSize);
char recievedString[length];
bzero(recievedString, sizeof(recievedString));
int numread1 = r_read(fd, recievedString, sizeof(recievedString));
if(errno == EPIPE)
{
signal(SIGPIPE,SIG_IGN);
}
if(numread1 > 1)
{
recievedString[numread1] = '\0';
cout << "DATA RECIEVED: " << recievedString << endl;
bzero(dataString, sizeof(dataString));
strncpy(dataString, recievedString, strlen(recievedString));
strncat(dataString, "\0", strlen("\0"));
close(fd);
return dataString;
}
else
{
return NULL;
}
}
else
{
return NULL;
}
close(fd);
}
int main(int argc, char *argv[])
{
int rc, i , status;
pthread_t threads[1];
printf("Starting Threads...\n");
pthread_create(&threads[0], NULL, readItem, NULL);
rc = pthread_join(threads[0], (void **) &status);
}
You are definitely using signals the wrong way. Threads are completely unnecessary here, at least in the code provided. The string calculations are just weird. Get this book and do not touch the keyboard until you have finished reading :)
The general method used to send data through named pipes is to tack on a header with the length of the payload. Then you read(fd, header, header_len); read(fd, data, data_len); Note that the latter read() needs to be done in a loop until data_len bytes have been read or EOF is reached. Note also that if you have multiple writers to a named pipe, writes of at most PIPE_BUF bytes are atomic, i.e. multiple writers will not cause partial messages in the kernel buffers.
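A sketch of that header-plus-payload scheme on the reading side might look like the following. It assumes the 4-digit decimal length prefix that the writer in the question produces with "%04d". The names readFully and readMessage are invented for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>
#include <unistd.h>

// Hypothetical helper: reads exactly `size` bytes unless end of file or an
// error comes first. Returns the number of bytes actually read, or -1 on error.
ssize_t readFully(int fd, char* buf, size_t size) {
    size_t total = 0;
    while (total < size) {
        ssize_t n = read(fd, buf + total, size - total);
        if (n == 0) break;                 // writer closed the pipe
        if (n == -1) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return -1;
        }
        total += static_cast<size_t>(n);
    }
    return static_cast<ssize_t>(total);
}

// Hypothetical helper: reads one message framed as a 4-digit decimal length
// followed by the payload. Returns false on end of file, error, or a
// malformed or truncated message.
bool readMessage(int fd, std::string& payload) {
    char header[5] = {0};
    if (readFully(fd, header, 4) != 4) return false;
    char* end = 0;
    long length = std::strtol(header, &end, 10);
    if (end != header + 4 || length < 0) return false;
    payload.resize(static_cast<size_t>(length));
    if (length == 0) return true;
    return readFully(fd, &payload[0], payload.size()) == length;
}
```

Keeping the FIFO open across messages, instead of opening and closing it for each one, also avoids losing data that is still buffered when a descriptor is closed.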
It's difficult to say what is going on here. Maybe you are getting an error returned from one of your system calls? Are you sure that you are successfully sending all of the data?
You also appear to have some invalid code here:
int length = atoi(stringSize);
char recievedString[length];
This is not valid standard C++, since you cannot create an array on the stack using a non-constant expression for the size (GCC accepts it as an extension). Maybe you are using different code in your real version?
Do you need to read the data in a loop? Sometimes a function will return a portion of the available data and require you to call it repeatedly until all of the data is gone.
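Some system calls in Unix can also fail with EINTR if they are interrupted by a signal, or with EAGAIN on a non-blocking descriptor when no data is ready. By the looks of things you are not handling these cases everywhere, for example around open().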
You are possibly getting bitten by POSIX thread signal handling semantics in your reader main thread.
The POSIX standard allows for a POSIX thread to receive the signal, not necessarily the thread you expect. Block signals where not wanted.
signal(SIGPIPE, SIG_IGN) is your friend. Add one to the reader's main().
POSIX thread handling semantics, putting the POS into POSIX. ( but it does make it easier to implement POSIX threads.)
Examine the pipe in /tmp with ls. Is it not empty?