How to send a message with "\0" to a Server via TCP - c++

I've managed to create a connection between my program (client) and a server.
Now I want to send the text "TOR\0". The server explicitly needs this exact data to start an operation.
The problem is that every time I define the text as
char a[]="TOR\0";
my client only sends three bytes, without the NUL at the end.
I can send five separate characters like {"T","O","R","\","0"}, but I know I need to send 4 bytes of data, and those four bytes need to be "T", "O", "R", "\0". Every time I try, my program only sends "T", "O", "R", because it stops reading at the NUL, which marks the end of the string.
I also tried hard-coding the length in the send function; I know I have to specify the length when sending, but fixing it at 4 doesn't work. I use this to check whether it worked:
Result = send( ConnectSocket, a, (int)strlen(a), 0 );
cout << Result << endl;
How do I attach the NUL at the end of the text without C++ cutting the message off at that exact point?

To you, the string "TOR\0" may look like 5 characters, but C++, in the same way as C, understands every \<character> sequence as one special character. So \0 is the single character NUL (code 0).
Here that notation is redundant, because every string literal in C/C++ already gets a terminating NUL. For example, "abc" is equivalent to {'a','b','c','\0'}, so "TOR\0" means {'T','O','R','\0','\0'}.
The strlen function returns the length of a string, defined as the number of characters from the beginning of the string up to the first NUL terminator. So strlen("TOR\0") == 3, because three characters precede the NUL.
If your intention is to send the literal characters as you wrote them, 'T', 'O', 'R', backslash and '0', you can do that in one of these ways:
char a[]={'T','O','R','\\','0','\0'};
char a[]="TOR\\0";
Please mind the \\, which C/C++ interprets as a single backslash.

You should be using sizeof a instead of strlen(a) if you're dealing with embedded NULs. For char a[] = "TOR"; that gives 4, which includes the terminating NUL you want to send.

Related

string size() returns 1 too large value in evaluation system

If I have a very simple piece of code like:
string myvariable;
getline(cin, myvariable);
cout << myvariable.size();
and if I run that program locally, it returns the appropriate value (exactly the number of characters in the given string, including spaces).
But if I upload the program to an evaluation system (something like a programming olympiad or spoj.com), the value of size() is 1 too large.
For example if myvariable value is "test", then:
locally:
size() == 4
in evaluation system:
size() == 5
I tried .length(), but the result is exactly the same.
What is the reason for that? Thank you for your answers!
After the discussion in the comments, it is clear that the issue involves different line-ending encodings from different operating systems: Windows uses \r\n, while Linux/Unix uses \n. The same content may be represented as
"Hello World!\n" // in a Linux file
or
"Hello World!\r\n" // in a Windows file
By default, getline uses \n as the delimiter, so when a Windows-encoded file is read on Linux/Unix it yields a size one greater, because the unwanted character \r (hex value 0D) is kept at the end of the string.
In order to fix this, you can either convert the line endings in your file, or trim the string after reading it in. For example:
string myvariable;
getline(cin, myvariable);
myvariable.erase(myvariable.find_last_not_of("\n\r") + 1);
See also How to trim an std::string? for more ways to trim a string for different types of whitespace.

How to detect the newline character with gzgetc in c/c++

I want to read an entire line of a file character by character using gzgetc and stop when the newline is encountered. I know there is a function to grab the entire line, but I would like to try it this way first. I tried:
int c;
do {
    c = gzgetc((gzFile) fp);
    cout << c;
} while (c != '\n');
The result was an infinite loop. I tried adding a (char) cast before c; still the same result. What am I doing wrong? The data file I am trying to read is base64-encoded, and I want to read each token separated by a space. Some of the lines are variable-length and mix encoded and unencoded data, which I have an algorithm for; I just need to know how to stop at the newline.
You also need to check for gzgetc() returning -1, which indicates an error or end of file, and exit the loop in that case. Your infinite loop is almost certainly caused by reaching the end of the file without seeing '\n': every further call keeps returning -1, so the condition c != '\n' never fails.

Only showing one character while printing in C++

This is my code:
auto text = new wchar_t[WCHAR_MAX];
GetWindowTextW(hEdit, text, WCHAR_MAX);
SetWindowTextW(hWnd, text);
printf_s((const char *)text);
While printing the text, it only outputs one character to the console.
It is a WinAPI GUI and a console running together. It sets the WinAPI window title successfully and gets the text successfully, but I have no idea why only one character is printed to the console...
You're performing a raw cast from a wide string to a narrow string. This conversion is never safe.
On Windows, wide strings are stored as two-byte units. In your case the high byte of the first character is 0, and x86 is little-endian, so that 0 byte directly follows the first character's low byte and the print stops after one character.

What exactly is the string "^A"?

I run my code on an online judgement. I log the string, key. Below is my code:
fprintf(stderr, "key=%s, and key.size()=%d\n", key.c_str(), key.size());
But the result is this:
key=^A, and key.size()=8
I want to know what ^A represents in ASCII. ^A looks like 2 characters rather than 8, but size() says it is 8. I view the result in vim, and the log file is encoded in UTF-8. Why?
Your viewer is electing to show you the bytes interpreted using a character encoding of its choosing and electing to show the resulting characters in caret notation.
Other viewers could make different choices on both counts or allow you to indicate what you want. For example, control picture characters (␁) instead of caret notation.
For a std::string, the buffer returned by c_str() is terminated by an additional \x00 byte following the actual value. You often use c_str() with functions that expect a \x00-terminated string; this applies to fprintf. In such cases, reading stops just before the first \x00 seen.
Your string contains several \x00 bytes, which of course contribute to size(), but fprintf stops right at the first one (and does not count it).
I have solved it myself. If you write a std::string containing "\x01\x00\x00\x00\x00end" to a file and open it with vim later, you will see ^A.
This is my test code:
string sss("\x01\x00\x00\x00\x00end", 8); // length-taking constructor keeps the embedded NULs
ofstream of("of.txt");
for (size_t i = 0; i < sss.size(); i++) {
    of.put(sss[i]);
}
of.close();
After I open the file "of.txt", I see "^A" at the start (vim renders the \x01 byte in caret notation).

How to detect a CRLF in a stream

I've got a stringstream with HTTP request content. As you know, an HTTP request ends with a CRLF break, but operator>> won't recognize the CRLF; it looks just like a normal end-of-file.
How can I detect this CRLF break?
EDIT:
All right, actually I'm using boost.iostreams, but I don't think that should make any difference.
char head[] = "GET / HTTP1.1\r\nConnection: close\r\nUser-Agent: Wget/1.12 (linux-gnu)\r\nHost: www.baidu.com\r\n\r\n";
io::stream<My_InOut> in(head, sizeof head);
string s;
while (in >> s) {
    char c = in.peek(); // check whether the next character is a normal break, so that 's' is a complete word
    switch (c) {
    case -1:
        // is it eof or an incomplete word?
        break;
    case 0x20: // a complete word
        break;
    case 0x0d:
    case 0x0a: // \r and \n should also indicate a complete word
        break;
    }
}
In this code, I assume that the request could be split into parts during transmission, so I want to recognize whether -1 stands for the actual end of the request or just a break in the middle of a word, meaning I need to read more to complete the request.
First of all, peek returns an int, not a char (at least std::istream::peek returns int; I don't know about boost). This distinction is important for recognizing -1 as the end of the file rather than a character with the value 0xFF.
Also be aware that i/o streams in text mode will transform the platform's line separator into '\n' (which, in C and C++, usually has the same value as a line feed, but need not). So if you run this on Windows, where the native line separator is CR+LF, you'll never see the CR; but if you run the same code on a Linux box, where the native separator is simply LF, you will.
So given your question:
How can I detect this CRLF break?
The answer is to open the stream in binary mode and check for the character values 0x0D followed by 0x0A.
That said, it's not unheard of for HTTP code to overlook that the network protocol requires CR+LF. If you want to abide by the "be liberal in what you accept" maxim, just watch for either CR or LF, and then skip the next character if it's the complement.