C++ String to byte - c++

I have a string like this:
std::string MyString = "\\xce\\xc6";
When I print its first character:
std::cout << MyString.c_str()[0] << std::endl;
the output is:
\
I want it to behave like this string instead:
std::string MyDesiredString = "\xce\xc6";
so that when I do:
std::cout << MyDesiredString.c_str()[0] << std::endl;
// OUTPUT: \xce (the whole byte)
So basically I want to take a string that spells out bytes as text and convert it to an array of real bytes.
I came up with a function like this:
// this is a pseudo code i'm sure it has a lot of bugs and may not even work
// just for example for what i think
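char str_to_bytes(const std::string& MyStr) { // MyStr length == 4 (\\xc6)
// keys are std::string so lookups compare text; char* keys would compare pointer addresses
static const std::map<std::string, char> MyMap = { {"\\xce", '\xce'}, {"\\xc6", '\xc6'} }; // and so on
return MyMap.at(MyStr); // throws std::out_of_range for an unknown escape
}
// if the provided string is "\\xc6" it should return the char '\xc6'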
But I believe there must be a better way to do it.
I have searched a lot but haven't found anything useful.
Thanks in advance.

Try something like this:
std::string teststr = "\\xce\\xc6";
std::string delimiter = "\\x";
size_t pos = 0;
std::string token;
std::string res;
while ((pos = teststr.find(delimiter)) != std::string::npos) {
token = teststr.substr(pos + delimiter.length(), 2);
res.push_back(static_cast<char>(std::stol(token, nullptr, 16)));
std::cout << std::stol(token, nullptr, 16) << std::endl;
teststr.erase(pos, delimiter.length() + 2); // erase takes (position, count), not (first, last)
}
std::cout << res << std::endl;
Take your string, split it at the \x markers that introduce each hex value, and parse the two hex digits that follow with the stol function, as Igor Tandetnik mentioned. You can then append those byte values to a string.
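The same idea can also be written as a single pass that never erases from the input. This is only a sketch: the name decode_hex_escapes is made up for illustration, and it assumes that every escape has exactly two hex digits and that anything that is not a well-formed escape should be copied through unchanged.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the answer above): turns every "\xNN" escape
// in `text` into the single byte 0xNN and copies all other characters as-is.
std::string decode_hex_escapes(const std::string& text) {
    // Converts one hex digit to its value, or returns -1 if it is not a hex digit.
    auto hex_value = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    };

    std::string bytes;
    std::size_t i = 0;
    while (i < text.size()) {
        // An escape needs four characters: '\', 'x' and two hex digits.
        if (i + 3 < text.size() && text[i] == '\\' && text[i + 1] == 'x') {
            const int hi = hex_value(text[i + 2]);
            const int lo = hex_value(text[i + 3]);
            if (hi >= 0 && lo >= 0) {
                bytes.push_back(static_cast<char>(hi * 16 + lo));
                i += 4;
                continue;
            }
        }
        bytes.push_back(text[i]);  // Not an escape: keep the character unchanged.
        ++i;
    }
    return bytes;
}
```

Calling decode_hex_escapes("\\xce\\xc6") should give a two-byte string equal to "\xce\xc6". Whether malformed escapes should be copied through or reported as errors depends on where the input comes from.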

Related

How to quickly find and substring multiple items from string in C++?

I'm rather new to C++ and I'm struggling with the following problem:
I'm parsing syslog messages from iptables. Every message looks like:
192.168.1.1:20200:Dec 11 15:20:36 SRC=192.168.1.5 DST=8.8.8.8 LEN=250
And I need to quickly (since new messages are coming very fast) parse the string to get SRC, DST and LEN.
If it was a simple program, I'd use std::string::find to find the index of the SRC substring, then in a loop add every following char to an array until I encounter whitespace. Then I'd do the same for DST and LEN.
For example,
std::string x = "15:30:20 SRC=192.168.1.1 DST=15.15.15.15 LEN=255";
std::string substr;
std::cout << "Original string: \"" << x << "\"" << std::endl;
// Below "magic number" 4 means length of "SRC=" string
// which is the same for "DST=" and "LEN="
// For SRC
auto npos = x.find("SRC");
if (npos != std::string::npos) {
substr = x.substr(npos + 4, x.find(" ", npos) - (npos+4));
std::cout << "SRC: " << substr << std::endl;
}
// For DST
npos = x.find("DST");
if (npos != std::string::npos) {
substr = x.substr(npos + 4, x.find(" ", npos) - (npos + 4));
std::cout << "DST: " << substr << std::endl;
}
// For LEN
npos = x.find("LEN");
if (npos != std::string::npos) {
substr = x.substr(npos + 4, x.find('\0', npos) - (npos + 4));
std::cout << "LEN: " << substr << std::endl;
}
However, in my situation, I need to do this really quickly, ideally in one iteration.
Could you please give me some advice on this?
If your format is fixed and verified (you can accept undefined behavior as soon as the input string doesn't contain exactly the expected characters), then you might squeeze out some performance by writing larger parts by hand and skip the string termination tests that will be part of all standard functions.
// buf_ptr will be updated to point to the first character after the " SRC=x.x.x.x" sequence
unsigned long GetSRC(const char*& buf_ptr)
{
// Don't search like this unless you have a trusted input format that's guaranteed to contain " SRC="!!!
while (*buf_ptr != ' ' ||
*(buf_ptr + 1) != 'S' ||
*(buf_ptr + 2) != 'R' ||
*(buf_ptr + 3) != 'C' ||
*(buf_ptr + 4) != '=')
{
++buf_ptr;
}
buf_ptr += 5;
char* next;
long part = std::strtol(buf_ptr, &next, 10);
// part is now the first number of the IP. Depending on your requirements you may want to extract the string instead
unsigned long result = (unsigned long)part << 24;
// Don't use 'next + 1' like this unless you have a trusted input format!!!
part = std::strtol(next + 1, &next, 10);
// part is now the second number of the IP. Depending on your requirements ...
result |= (unsigned long)part << 16;
part = std::strtol(next + 1, &next, 10);
// part is now the third number of the IP. Depending on your requirements ...
result |= (unsigned long)part << 8;
part = std::strtol(next + 1, &next, 10);
// part is now the fourth number of the IP. Depending on your requirements ...
result |= (unsigned long)part;
// update the buf_ptr so searching for the next information ( DST=x.x.x.x) starts at the end of the currently parsed parts
buf_ptr = next;
return result;
}
Usage:
const char* x_str = x.c_str();
unsigned long srcIP = GetSRC(x_str);
// now x_str will point to " DST=15.15.15.15 LEN=255" for further processing
std::cout << "SRC=" << (srcIP >> 24) << "." << ((srcIP >> 16) & 0xff) << "." << ((srcIP >> 8) & 0xff) << "." << (srcIP & 0xff) << std::endl;
Note I decided to write the whole extracted source IP into a single 32 bit unsigned. You can decide on a completely different storage model if you want.
Even if you can't be optimistic about your format, using a pointer that is updated whenever a part is processed and continuing with the remaining string instead of starting at 0 might be a good idea to improve performance.
Of course, I suppose your std::cout << ... lines are just for development testing, because otherwise all micro-optimization becomes useless anyway.
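For input that cannot be trusted, the "keep a position and continue from there" idea can be sketched with bounds-checked searches instead of raw pointer walks. The helper name next_field is hypothetical, and the sketch assumes C++17 (std::string_view) and values that end at the next space or at the end of the line.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: finds `key` (for example "SRC=") at or after `pos`,
// returns the text up to the next space, and advances `pos` past that value.
// Returns an empty view, and leaves `pos` unchanged, if the key is missing.
std::string_view next_field(std::string_view line, std::string_view key, std::size_t& pos) {
    const std::size_t start = line.find(key, pos);
    if (start == std::string_view::npos) return {};
    const std::size_t value_begin = start + key.size();
    std::size_t value_end = line.find(' ', value_begin);
    if (value_end == std::string_view::npos) value_end = line.size();
    pos = value_end;  // The next search resumes here instead of at index 0.
    return line.substr(value_begin, value_end - value_begin);
}
```

Calling it three times with "SRC=", "DST=" and "LEN=" and the same pos variable walks the line once from left to right. The returned views point into the original buffer, so they are only valid while that buffer is alive.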
"quickly, ideally in one iteration" - in reality, the speed of your program does not depend on the number of loops that are visible in your source code. Especially regex'es are a very good way to hide multiple nested loops.
Your solution is actually pretty good. It doesn't waste much time prior to finding "SRC", and doesn't search further than necessary to retrieve the IP address. Sure, when searching for "SRC" it has a false positive on the first "S" of "Sep", but that is solved by the next compare. If you know for certain that the first occurrence of "SRC" is somewhere in column 20, you might save just a tiny bit of speed by skipping those first 20 characters. (Check your logs, I can't tell)
You can use std::regex, e.g.:
std::string x = "15:30:20 SRC=192.168.1.1 DST=15.15.15.15 LEN=255";
std::regex const r(R"(SRC=(\S+) DST=(\S+) LEN=(\S+))");
std::smatch matches;
if(regex_search(x, matches, r)) {
std::cout << "SRC " << matches.str(1) << '\n';
std::cout << "DST " << matches.str(2) << '\n';
std::cout << "LEN " << matches.str(3) << '\n';
}
Note that matches.str(idx) creates a new string with the match. Using matches[idx] you can get the iterators to the sub-string without creating a new string.
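To illustrate the last remark, here is a sketch of reading a capture group through its iterators rather than through matches.str(idx). The helper name group_view is made up, and it assumes C++17 for std::string_view.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <string_view>

// Hypothetical helper: views capture group `idx` in place instead of copying it.
// The view is only valid while the searched string is alive and unmodified.
std::string_view group_view(const std::smatch& m, std::size_t idx) {
    const std::ssub_match& sub = m[idx];
    // An unmatched or empty group has nothing to point at.
    if (!sub.matched || sub.first == sub.second) return {};
    return std::string_view(&*sub.first, static_cast<std::size_t>(sub.length()));
}
```

After a successful regex_search(x, matches, r), group_view(matches, 1) refers to the SRC value inside x itself. This only avoids the copy of each match; the regex search is still likely to cost more than a hand-written find.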

C++: Separating a char* with '\t' delimiter

I've been fighting this problem for a while now, and can't seem to find a simple solution that doesn't involve parsing a char * by hand. I need to split my char* variable by '\t', and I've tried the following ways:
Method 1:
char *splitentry;
std::string ss;
splitentry = strtok(read_msg_.data(), "\\t");
while(splitentry != NULL)
{
std::cout << splitentry << std::endl;
splitentry = strtok(NULL, "\\t");
}
Using the input '\tthis\tis\ta\ttest'
results in this output:
his
is
a
es
Method 2:
std::string s(read_msg_.data());
boost::algorithm::split(strs, s, boost::is_any_of("\\t"));
for (int i = 0; i < strs.size(); i++)
std::cout << strs.at(i) << std::endl;
Which creates an identical output.
I've tried using boost::split_regex and used "\\t" as my regex value, but nothing gets split. Will I have to split it on my own, or am I going about this incorrectly?
I would try to make things a little simpler by sticking to std:: functions. (p.s. you never use this: std::string ss;)
Why not do something like this?
Method 1: std::istringstream
std::istringstream ss(read_msg_.data());
std::string line;
while( std::getline(ss,line,ss.widen('\t')) )
std::cout << line << std::endl;
Method 2: std::string::substr (my preferred method as it is lighter)
std::string data(read_msg_.data());
std::size_t SPLITSTART(0); // signifies the start of the cell
std::size_t SPLITEND(0); // signifies the end of the cell
while( SPLITEND != std::string::npos ) {
SPLITEND = data.find('\t',SPLITSTART);
// SPLITEND-SPLITSTART signifies the size of the string
std::cout << data.substr(SPLITSTART,SPLITEND-SPLITSTART) << std::endl;
SPLITSTART = SPLITEND+1;
}
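One detail worth spelling out: in C++ source, "\\t" is a two-character string (a backslash followed by the letter t), so strtok and boost::is_any_of treat both characters as delimiters, which would explain the missing 't' in "his" and "es". A real tab is written '\t'. The find-based method can be packaged as a small function; the name split is hypothetical, and the sketch assumes that empty fields should be kept.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on one delimiter character and keeps
// empty fields (for example the one before a leading tab).
std::vector<std::string> split(const std::string& text, char delimiter) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t end = text.find(delimiter, start);
        if (end == std::string::npos) {
            fields.push_back(text.substr(start));  // Last field: rest of the text.
            break;
        }
        fields.push_back(text.substr(start, end - start));
        start = end + 1;  // Continue right after the delimiter.
    }
    return fields;
}
```

With a real tab as the delimiter, split("\tthis\tis\ta\ttest", '\t') yields an empty first field followed by "this", "is", "a" and "test".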

Generate a random unicode string

In VS2010, this function below prints "stdout in error state", I'm unable to understand why. Any thoughts on what I'm doing wrong?
void printUnicodeChars()
{
const auto beg = 0x0030;
const auto end = 0x0039;
wchar_t uchars[end-beg+2];
for (auto i = beg; i <= end; i++) {
uchars[i-beg] = i; // I tried a static_cast<wchar_t>(i), still errors!
}
uchars[end+1] = L'\0';
std::wcout << uchars << std::endl;
if (!std::wcout) {
std::cerr << std::endl << "stdout in error state" << std::endl;
} else {
std::cerr << std::endl << "stdout is good" << std::endl;
}
}
Thanks to @0x499602D2, I found out I had an array out-of-bounds error in my function (uchars[end+1] should have been uchars[end-beg+1]). To be more clear, I wanted my function to construct a Unicode string whose characters are in the range [start, end]. This was my final version:
// Generate an unicode string of length 'len' whose characters are in range [start, end]
wchar_t* generateRandomUnicodeString(size_t len, size_t start, size_t end)
{
wchar_t* ustr = new wchar_t[len+1]; // +1 for '\0'
size_t intervalLength = end - start + 1; // +1 for inclusive range
srand(time(NULL));
for (size_t i = 0; i < len; i++) { // size_t avoids a signed/unsigned comparison with len
ustr[i] = static_cast<wchar_t>((rand() % intervalLength) + start);
}
ustr[len] = L'\0';
return ustr;
}
When this function is called as follows, it generates a Unicode string with 5 Cyrillic characters.
int main()
{
_setmode(_fileno(stdout), _O_U16TEXT);
wchar_t* output = generateRandomUnicodeString(5, 0x0400, 0x04FF);
wcout << "Random Unicode String = " << output << endl;
delete[] output;
return 0;
}
PS: This function, as weird and arbitrary as it may seem, serves a useful purpose for me: I need to generate sample strings for a unit test that checks whether Unicode strings are written to and retrieved from a database correctly. The database is the backend of a C++ application. In the past we have seen failures with strings that contain non-ASCII characters; we tracked that bug down and fixed it, and this random Unicode string logic serves to test that fix.
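For a unit test it can help to make the random strings reproducible and to avoid manual new[]/delete[]. The following is a sketch of one possible alternative using the <random> header, not the poster's method. The name random_wide_string is made up, and the sketch assumes that first <= last and that both fit in wchar_t on the target platform.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical alternative: returns `len` characters drawn uniformly from the
// inclusive range [first, last]. The caller supplies the engine so that a unit
// test can seed it and get the same strings on every run.
std::wstring random_wide_string(std::size_t len, wchar_t first, wchar_t last, std::mt19937& rng) {
    // uniform_int_distribution is not specified for wchar_t, so draw from a
    // wider integer type and convert each value back.
    std::uniform_int_distribution<unsigned long> pick(first, last);
    std::wstring result;
    result.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        result.push_back(static_cast<wchar_t>(pick(rng)));
    }
    return result;
}
```

A test could create std::mt19937 rng(42); once and call random_wide_string(5, 0x0400, 0x04FF, rng) to get five characters from the Cyrillic block. Returning std::wstring means the caller has nothing to delete.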

How to truncate a string [formatting] ? c++

I want to truncate a string in a cout,
string word = "Very long word";
int i = 1;
cout << word << " " << i;
I want to have as an output of the string a maximum of 8 letters
so in my case, I want to have
Very lon 1
instead of :
Very long word 1
I don't want to use the setw(8) function, since unfortunately it will not truncate my word to the size I want. I also don't want the 'word' string to change its value (I just want to show the user a part of the word, but keep it whole in my variable).
I know you already have a solution, but I thought this was worth mentioning: Yes, you can simply use string::substr, but it's a common practice to use an ellipsis to indicate that a string has been truncated.
If that's something you wanted to incorporate, you could just make a simple truncate function.
#include <iostream>
#include <string>
std::string truncate(std::string str, size_t width, bool show_ellipsis=true)
{
if (str.length() > width)
if (show_ellipsis)
return str.substr(0, width) + "...";
else
return str.substr(0, width);
return str;
}
int main()
{
std::string str = "Very long string";
int i = 1;
std::cout << truncate(str, 8) << "\t" << i << std::endl;
std::cout << truncate(str, 8, false) << "\t" << i << std::endl;
return 0;
}
The output would be:
Very lon... 1
Very lon 1
As Chris Olden mentioned above, using string::substr is a way to truncate a string. However, if you need another way to do that you could simply use string::resize and then add the ellipsis if the string has been truncated.
You may wonder what string::resize does. It changes the size of the string (the used memory, not the reserved capacity): if the new size n is smaller, every character beyond the first n is deleted. If the new size is greater, the string is expanded, but I think that aspect is straightforward.
Of course, I don't want to suggest a 'new best way' to do it, it's just another way to truncate a std::string.
If you adapt the Chris Olden truncate function, you get something like this:
#include <iostream>
#include <string>
std::string& truncate(std::string& str, size_t width, bool show_ellipsis=true) {
if (str.length() > width) {
if (show_ellipsis) {
str.resize(width);
return str.append("...");
}
else {
str.resize(width);
return str;
}
}
return str;
}
int main() {
std::string str = "Very long string";
int i = 1;
std::cout << truncate(str, 8) << "\t" << i << std::endl;
std::cout << truncate(str, 8, false) << "\t" << i << std::endl;
return 0;
}
Even though this method does basically the same, note that it takes and returns a reference, so it modifies the caller's string in place (the original text is lost) and the returned reference is only valid as long as that string is alive. If you don't want to take that risk, just remove the references and the function becomes:
std::string truncate(std::string str, size_t width, bool show_ellipsis=true) {
if (str.length() > width) {
if (show_ellipsis) {
str.resize(width);
return str + "...";
}
else {
str.resize(width);
return str;
}
}
return str;
}
I know it's a little bit late to post this answer. However it might come in handy for future visitors.

how to find number of elements in an array of strings in c++?

i have an array of string.
std::string str[10] = {"one","two"};
How to find how many strings are present inside the str[] array?? Is there any standard function?
There are ten strings in there despite the fact that you have only initialised two of them:
#include <iostream>
#include <string>
int main (void) {
std::string str[10] = {"one","two"};
std::cout << sizeof(str)/sizeof(*str) << std::endl;
std::cout << str[0] << std::endl;
std::cout << str[1] << std::endl;
std::cout << str[2] << std::endl;
std::cout << "===" << std::endl;
return 0;
}
The output is:
10
one
two
===
If you want to count the non-empty strings:
#include <iostream>
#include <string>
int main (void) {
std::string str[10] = {"one","two"};
size_t count = 0;
for (size_t i = 0; i < sizeof(str)/sizeof(*str); i++)
if (str[i] != "")
count++;
std::cout << count << std::endl;
return 0;
}
This outputs 2 as expected.
If you want to count all elements sizeof technique will work as others pointed out.
If you want to count all non-empty strings, this is one possible way by using the standard count_if function.
#include <algorithm>
#include <iostream>
#include <string>
bool IsNotEmpty( const std::string& str )
{
return !str.empty();
}
int main ()
{
std::string str[10] = {"one","two"};
auto result = std::count_if(str, str + 10, IsNotEmpty);
std::cout << result << std::endl; // it will print "2"
return 0;
}
I don't know that I would use an array of std::strings. If you're already using the STL, why not consider a vector or list? At least that way you could just figure it out with std::vector::size() instead of working ugly sizeof magic. Also, that sizeof magic won't work if the array is stored on the heap rather than the stack.
Just do this:
std::vector<std::string> strings(10);
strings[0] = "one";
strings[1] = "two";
std::cout << "Length = " << strings.size() << std::endl;
You can always use a countof macro to get the number of elements, but again, the memory was allocated for 10 elements and that's the count that you'll get.
The ideal way to do this is to let the compiler deduce the array size from the initialiser:
std::string str[] = {"one","two"};
int num_of_elements = sizeof( str ) / sizeof( str[ 0 ] );
The array then holds exactly two elements, so the element count is the number of strings.
You could do a binary search for not null/empty.
str[9] is empty
str[5] is empty
str[3] is not empty
str[4] is empty
You have 4 items.
I don't really feel like implementing the code, but this would be quite quick.
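The binary search described above maps onto std::partition_point from <algorithm>. This is a sketch only: the name count_filled is made up, and it relies on the same assumption as the answer, namely that every non-empty string comes before every empty one.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: counts the leading non-empty strings in [first, last).
// Assumes every non-empty string comes before every empty one; otherwise the
// result is meaningless.
std::size_t count_filled(const std::string* first, const std::string* last) {
    // partition_point binary-searches for the first element that fails the test.
    const std::string* boundary = std::partition_point(
        first, last, [](const std::string& s) { return !s.empty(); });
    return static_cast<std::size_t>(boundary - first);
}
```

For std::string str[10] with the first four entries set, count_filled(str, str + 10) returns 4. For an array of ten elements the gain over a linear scan is negligible; the approach only pays off for large arrays.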
Simply use this function for 1D string array:
template<typename String, std::size_t SIZE> // String can be 'string' or 'const string'
unsigned int NoOfStrings (String (&arr)[SIZE])
{
unsigned int count = 0;
while(count < SIZE && arr[count] != "")
count ++;
return count;
}
Usage:
std::string s1[10] = {"abc", "def" };
int i = NoOfStrings(s1); // i = 2
I am just wondering if we can write a template meta program for this ! (since everything is known at compile time)
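A compile-time count is possible if the array elements can be used in constant expressions. The sketch below swaps std::string for std::string_view (C++17), because std::string cannot be used this way before C++20. The names count_leading_filled and kWords are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts the leading non-empty entries. Being constexpr,
// it can be evaluated by the compiler when its argument is a constant.
template <std::size_t N>
constexpr std::size_t count_leading_filled(const std::array<std::string_view, N>& items) {
    std::size_t count = 0;
    while (count < N && !items[count].empty()) {
        ++count;
    }
    return count;
}

// Two initialised entries; the remaining eight are value-initialised to empty views.
constexpr std::array<std::string_view, 10> kWords = {"one", "two"};
static_assert(count_leading_filled(kWords) == 2, "evaluated by the compiler");
```

This is a plain constexpr function rather than template metaprogramming in the classic sense, but the count is still computed during compilation.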
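A simple way to do this is to use the empty() member function of std::string and stop at the first empty element or at the end of the array, whichever comes first, e.g.:
size_t stringArrSize(const std::string *stringArray, size_t capacity) {
size_t num = 0;
// the capacity check stops the loop from reading past the array when no element is empty
while (num < capacity && !stringArray[num].empty()) {
++num;
}
return num;
}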