extract sets of characters from boost::array<char, 100> - c++

I have a char array of length 100 that receives incoming bytes from a UDP socket. I can get a single character from the array like this:
boost::array<char, 100> recv_buffer_;
std::cout << "1st char of array:>" << recv_buffer_[0] << std::endl;
// or like this;
std::cout << "1st char of array:>" << recv_buffer_.at(0) << std::endl;
But I can't figure out how to extract a range of characters from this array. For example, if I receive "this is a test" in recv_buffer_, how can I extract the characters from index 2 to index 8, i.e. "is is a"? In Python you can extract a substring from a string by simply giving the start index and the (exclusive) end index:
>>my_string = "this is a test"
>>print my_string[2:9]
>>is is a
I am looking for a similar function for a boost array of char. I looked at the documentation of boost array, which mentions the use of "operator", but I have no idea how to use it.

You can use std::string's iterator-range constructor. The end pointer is exclusive, hence the + 1:
std::string my_string(recv_buffer_.data() + 2, recv_buffer_.data() + 8 + 1);

Like all standard containers, boost::array supports iterators.
To get a range you can use e.g. recv_buffer_.begin() + 2 as the start and recv_buffer_.begin() + 9 as the end. The end iterator points one past the last character you want, so index 8 is included.
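A minimal sketch of the iterator approach. It assumes std::array as a stand-in for boost::array (both expose begin() and end()), and the helper name slice is made up for illustration:

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: copy the characters at indices [first, last]
// (inclusive) out of a fixed-size char buffer.
// std::array is used here as a stand-in for boost::array.
// The caller must make sure that first <= last and last < N.
template <std::size_t N>
std::string slice(const std::array<char, N>& buf, std::size_t first, std::size_t last)
{
    // The iterator-pair constructor takes a half-open range, hence last + 1.
    return std::string(buf.begin() + first, buf.begin() + last + 1);
}
```

With a buffer holding "this is a test", slice(buf, 2, 8) should give the same "is is a" as the Python slice in the question.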

Related

How to get length of shellcode string containing NULL characters

I have a string of shellcode with NULL characters in it and I am unable to determine its length. I tried the std::string length() method, but it only counts up to the first NULL character and ignores everything after it.
Here is the sample code:
std::string shell_str = "\x55\x48\x89\x00\x00\x00\x00\xC3\x90";
std::cout << "shell : " << shell_str << std::endl;
std::cout << "shell length : " << shell_str.length() << std::endl;
Output :
shell : UH�
shell length : 3
But the length of the string is 9. I also tried copying it to a vector, but I still don't get the desired output.
The problem isn't with the calculation of the length of shell_str, the problem is with the string literal you use to initialize shell_str. The std::string constructor that takes only a const char* stops at the first null "terminator".
You need to use another std::string constructor, to explicitly tell it the actual length of the string:
std::string shell_str("\x55\x48\x89\x00\x00\x00\x00\xC3\x90", 9);
Also, since the "string" contains arbitrary data, you can't print it as a string either.
And if you want a "string" of arbitrary bytes I suggest you use std::vector<uint8_t> instead.
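One way to avoid counting the bytes by hand is to let the compiler deduce the size of the literal. This is only a sketch; the helper names from_literal and bytes_from_literal are hypothetical and not part of any library:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Build a std::string from a string literal, keeping embedded '\0' bytes.
// N is the array size of the literal, which includes the implicit
// terminator, so the payload is N - 1 bytes long.
template <std::size_t N>
std::string from_literal(const char (&lit)[N])
{
    return std::string(lit, N - 1);
}

// Same idea, but storing the payload as raw bytes.
template <std::size_t N>
std::vector<std::uint8_t> bytes_from_literal(const char (&lit)[N])
{
    return std::vector<std::uint8_t>(lit, lit + (N - 1));
}
```

Calling from_literal("\x55\x48\x89\x00\x00\x00\x00\xC3\x90").length() should then report 9 rather than 3.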

How to assign string a char array that starts from the middle of the array?

For example in the following code:
char name[20] = "James Johnson";
I want to assign all the characters after the white space, up to the end of the char array, to a string, so that the string ends up like the following (this is not how I want to initialize it, it just shows the idea):
string s = "Johnson";
Therefore, essentially, the string will only accept the last name. How can I do this?
I think you want something like this:
string s="";
for (int i = strlen(name) - 1; i >= 0; i--)
{
    if (name[i] == ' ') break;
    else s += name[i];
}
reverse(s.begin(), s.end());
You need to #include <algorithm> for reverse (and <cstring> for strlen).
There's always more than one way to do it - it depends on exactly what you're asking.
You could either:
search for the position of the first space, and then point a char* at one-past-that position (look up strchr in <cstring>)
split the string into a list of sub-strings, where your split character is a space (look up strtok or boost split)
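The first option could look roughly like this. The function name after_first_space is made up for the example, and falling back to the whole input when there is no space is an assumption about what you would want:

```cpp
#include <cstring>
#include <string>

// Return everything after the first space in a C string,
// or the whole input if it contains no space.
std::string after_first_space(const char* name)
{
    // strchr returns a pointer to the first ' ' or a null pointer if none.
    const char* space = std::strchr(name, ' ');
    return space ? std::string(space + 1) : std::string(name);
}
```

For char name[20] = "James Johnson"; this should yield "Johnson".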
std::string has a whole arsenal of functions for string manipulation, and I recommend you use those.
You can find the first whitespace character using std::string::find_first_of, and split the string from there:
char name[20] = "James Johnson";
// Convert whole name to string
std::string wholeName(name);
// Create a new string from the whole name starting from one character past the first whitespace
std::string lastName(wholeName, wholeName.find_first_of(' ') + 1);
std::cout << lastName << std::endl;
If you're worried about multiple names, you can also use std::string::find_last_of.
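A small sketch of the find_last_of variant, assuming that "last name" simply means everything after the last space. The function name last_word is hypothetical:

```cpp
#include <string>

// Return the part of full_name after its last space,
// or the whole string if it contains no space.
std::string last_word(const std::string& full_name)
{
    const std::string::size_type pos = full_name.find_last_of(' ');
    if (pos == std::string::npos) {
        return full_name;  // no space found
    }
    return full_name.substr(pos + 1);
}
```

With this, a middle name no longer ends up in the result.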
If you're worried about the names not being separated by a space, you could use std::string::find_first_not_of and search for letters of the alphabet. The example given in the link is:
std::string str ("look for non-alphabetic characters...");
std::size_t found = str.find_first_not_of("abcdefghijklmnopqrstuvwxyz ");
if (found!=std::string::npos)
{
std::cout << "The first non-alphabetic character is " << str[found];
std::cout << " at position " << found << '\n';
}

How could I copy data that contain '\0' character

I'm trying to copy data that contains '\0'. I'm using C++.
After searching without success, I decided to write my own function to copy data from one char* to another char*, but it doesn't return the expected result.
My attempt is the following:
#include <iostream>
#include <cstring>   // strlen
#include <cstdlib>   // system
char* my_strcpy( char* arr_out, const char* arr_in, int bloc )
{
    char* pc = arr_out;
    for (int i = 0; i < bloc; ++i)
    {
        *arr_out++ = *arr_in++;
    }
    *arr_out = '\0';
    return pc;
}
int main()
{
    char * out = new char[20];
    my_strcpy(out, "12345aa\0aaaaa AA", 20);
    std::cout << "output data: " << out << std::endl;
    std::cout << "the length of my output data: " << strlen(out) << std::endl;
    system("pause");
    return 0;
}
I don't understand what is wrong with my code.
Thank you for help in advance.
Your my_strcpy is working fine. When you write a char* to cout or calculate its length with strlen, they stop at the first \0, as per C string behaviour. By the way, you can use memcpy to copy a block of chars regardless of \0.
If you know the length of the 'string' then use memcpy. strcpy halts its copy when it meets a string terminator, the \0. memcpy does not: it copies the \0 and anything that follows.
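To make the difference concrete, here is a small sketch using the literal from the question. The function name copied_past_null is invented for the example; it only reports whether the copy behaved as described:

```cpp
#include <cstring>

// Copy a literal with an embedded '\0' using memcpy and check the result.
bool copied_past_null()
{
    // 16 characters plus the implicit terminator: sizeof(src) is 17.
    const char src[] = "12345aa\0aaaaa AA";
    char dst[sizeof(src)];

    // memcpy copies exactly the requested number of bytes.
    std::memcpy(dst, src, sizeof(src));

    // strlen stops at the embedded '\0' at index 7,
    // while memcmp compares all 17 bytes.
    return std::strlen(dst) == 7 && std::memcmp(dst, src, sizeof(src)) == 0;
}
```

So the data after the \0 is there; it is only strlen and operator<< on a char* that refuse to look past it.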
(Note: For any readers who are unaware that \0 is a single-character byte with value zero in string literals in C and C++, not to be confused with the \\0 expression that results in a two-byte sequence of an actual backslash followed by an actual zero in the string... I will direct you to Dr. Rebmu's explanation of how to split a string in C for further misinformation.)
C++ strings maintain their length independently of any embedded \0, and they copy their contents based on this length. The only catch is that the constructor taking a C-string and no length is guided by the null terminator as to what you wanted the length to be.
To override this, you can pass in a length explicitly. Make sure the length is accurate, though. You have 16 bytes of data, and 17 if you want the null terminator in the string literal to make it into your string as part of the data.
#include <iostream>
#include <string>
using namespace std;
int main() {
    string str ("12345aa\0aaaaa AA", 16);
    string str2 = str;
    cout << str << '\n';
    cout << str2 << '\n';
    return 0;
}
(Try not to hardcode such lengths if you can avoid it. Note that you didn't count it right, and when I corrected another answer here they got it wrong as well. It's error-prone.)
On my terminal that outputs:
12345aaaaaaa AA
12345aaaaaaa AA
But note that what you're doing here is actually streaming a 0 byte to stdout. I'm not sure how formalized the behavior of different terminal standards is for dealing with that. Things outside of the printable range can be used for all kinds of purposes depending on the kind of terminal you're running... positioning the cursor on the screen, changing the color, etc. I wouldn't write out strings with embedded zeros like that unless I knew what the semantics were going to be on the stream receiving them.
If what you're dealing with are bytes, consider using a std::vector<char> instead so as not to confuse the issue. Many libraries offer alternatives, such as Qt's QByteArray.
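A sketch of the std::vector<char> route, assuming you know the block length up front. The name copy_block is hypothetical:

```cpp
#include <cstddef>
#include <vector>

// Copy len bytes starting at data into a vector.
// The vector tracks its own size, so embedded '\0' bytes are ordinary data.
// The caller must make sure that data really points at len readable bytes.
std::vector<char> copy_block(const char* data, std::size_t len)
{
    return std::vector<char>(data, data + len);
}
```

For the literal in the question, copy_block("12345aa\0aaaaa AA", 16) keeps all 16 bytes, and size() tells you so without any strlen involved.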
Your function is fine (except that you should pass to it 17 instead of 20). If you need to output null characters, one way is to convert the data to std::string:
std::string outStr(out, out + 17);
std::cout<< "output data: "<< outStr << std::endl;
std::cout<< "the length of my output data: " << outStr.length() <<std::endl;
I don't understand what is wrong with my code.
my_strcpy(out,"12345aa\0aaaaa AA",20);
Your string contains the character '\', which is interpreted as the start of an escape sequence. If you want a literal backslash in the string, you have to double it:
my_strcpy(out,"12345aa\\0aaaaa AA",20);
Test
output data: 12345aa\0aaaaa AA
the length of my output data: 18
Your string is already terminated midway.
my_strcpy(out,"12345aa\0aaaaa AA",20);
Why do you intend to have \0 in the middle like that? Use some other delimiter if you so desire.
Otherwise, since std::cout and strlen interpret a \0 as a string terminator, you get surprises.
What I mean is: follow the convention, i.e. use '\0' only as the string terminator.

Insert values into a string without using sprintf or to_string

Currently I only know of two methods to insert values into a C++ string or C string.
The first method I know of is to use std::sprintf() and a C-string buffer (char array).
The second method is to use something like "value of i: " + to_string(value) + "\n".
However, the first one needs the creation of a buffer, which leads to more code if you just want to pass a string to a function. The second one produces long lines of code, where a string gets interrupted every time a value is inserted, which makes the code harder to read.
From Python I know the format() function, which is used like this:
"Value of i: {}\n".format(i)
The braces are replaced by the value in format, and further .format()'s can be appended.
I really like Python's approach on this, because the string stays readable, and no extra buffer needs to be created. Is there any similar way of doing this in C++?
The idiomatic way of formatting data in C++ is with output streams (std::ostream). If you want the formatted output to end up in a std::string, use an output string stream:
std::ostringstream res;
res << "Value of i: " << i << "\n";
Use the str() member function to obtain the resulting string:
std::string s = res.str();
This matches the approach of formatting data for output:
cout << "Value of i: " << i << "\n";
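Put together as a small function, the stream approach might look like this. The function name describe is made up for the example:

```cpp
#include <sstream>
#include <string>

// Format a value into a std::string using an output string stream.
std::string describe(int i)
{
    std::ostringstream res;
    res << "Value of i: " << i << "\n";
    return res.str();  // copy of the text accumulated so far
}
```

The result can be passed straight to a function expecting a std::string, e.g. log(describe(42)), where log stands for whatever function you are calling.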

Boost: How to locate the position of the iterator inside a huge text file?

I'm working on a program that uses boost::regex to match some patterns inside a huge text file (greater than 200 MB). The matches are working fine, but to build the output file I need to order the matches (just 2, but over all the text) in the sequence they are found in the text.
Well, in debug mode, during the cout loop, I can see inside the iterator it1 an m_base attribute holding an address that increases at each step of the loop. I think this m_base address is the address of the matched pattern in the text, but I could not verify it and I could not find a way to access this attribute to store the address.
I don't know if there is any way to retrieve the address of each matched pattern in the text, but I really need to get this information.
#include <boost/regex.hpp>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
#define FILENAME "File.txt"
int main() {
    int length;
    char * cMainBuf;
    ifstream is;
    is.open(FILENAME, ios::binary);
    is.seekg(0, ios::end);
    length = is.tellg();
    is.seekg(0, ios::beg);
    cMainBuf = new char[length + 1];
    memset(cMainBuf, '\0', length + 1);
    is.read(cMainBuf, length);
    is.close();
    string str = cMainBuf;
    delete[] cMainBuf;
    boost::regex reg("^(\\d{1,3}\\s[A-F]{99})");
    boost::regex rReg(reg);
    int const sub_matches[] = { 1 };
    boost::sregex_token_iterator it1(str.begin(), str.end(), rReg, sub_matches), it2;
    int sz = 0; // match counter
    while (it1 != it2)
    {
        cout << "#" << sz++ << "- " << *(it1++) << endl;
    }
    return 0;
}
@sln
Hi sln,
I'll answer your questions:
1. I removed all code that is not part of this issue, so some library references are left over;
2. Same as 1;
3. Because the file is in fact not a simple text file: it can contain any symbol, and that may affect the reading procedure, as I found out in the past;
4. Zeroing the buffer was necessary during the test period, since I could not store more than 1 MB in the buffer;
5. The iterator doesn't allow using char* to set the beginning and the end of the file, so it was necessary to change it to string;
6. The incoming regex will not be declared static. This is just a draft to show the problem, and the anchor is there to find the start of each line, not only the start of the string;
7. sub_matches was part of a test to see where the iterator was for a regex with 2 or more groups inside it;
8. sz is just a counter;
9. There is no cast possible from const std::_String_const_iterator<_Elem,_Traits,_Alloc> to long.
In fact all the code works fine and I can identify any pattern inside the text, but what I really need to know is the memory address of each matched pattern (in this case, the address of the iterator for each iteration). I found that m_base holds this address, but I have not been able to retrieve it so far.
I'll continue the analysis. If I find a solution for this problem I will post it here.
Edit: @Tchesko, I am deleting my original answer. I loaded boost::regex and tried it out with regex_search(). It's not the it1 iterator method you are using, but I think it comes down to just getting the results from the boost::smatch class, which is really a boost::match_results.
It has member functions to get the position and length of the match and sub-matches, so it's really all you need to find the offset into your big string. The reason you can't get to m_base is that it is a private member variable.
Use the methods position() and length(). See the sample below, which I ran, debugged and tested. I'm getting back up to speed with VS-2005 again. Boost does seem a little quirky, though. If I am going to use it, I want it to do Unicode, and that means I have to compile ICU. The Boost binaries I'm using are the downloaded 1.44. The latest is 1.46.1, so I might build it with VC++ 8 after I assess its viability with ICU.
Hey, let me know how it turns out. Good luck!
#include <boost/regex.hpp>
#include <cstdio>    // printf
#include <locale>
#include <iostream>
#include <string>
using namespace std;
int main()
{
std::locale::global(std::locale("German"));
std::string s = " Boris Schäling ";
boost::regex expr("(\\w+)\\s*(\\w+)");
boost::smatch what;
if (boost::regex_search(s, what, expr))
{
// These are from boost::match_results() class ..
int Smpos0 = what.position();
int Smlen0 = what.length();
int Smpos1 = what.position(1);
int Smlen1 = what.length(1);
int Smpos2 = what.position(2);
int Smlen2 = what.length(2);
printf ("Match Results\n--------------\n");
printf ("match start/end = %d - %d, length = %d\n", Smpos0, Smpos0 + Smlen0, Smlen0);
std::cout << " '" << what[0] << "'\n" << std::endl;
printf ("group1 start/end = %d - %d, length = %d\n", Smpos1, Smpos1 + Smlen1, Smlen1);
std::cout << " '" << what[1] << "'\n" << std::endl;
printf ("group2 start/end = %d - %d, length = %d\n", Smpos2, Smpos2 + Smlen2, Smlen2);
std::cout << " '" << what[2] << "'\n" << std::endl;
/*
This is the hard way, still m_base is a private member variable.
Without m_base, you can't get the root address of the buffer.
long Match_start = (long)(what[0].first._Myptr);
long Match_end = (long)(what[0].second._Myptr);
long Grp1_start = (long)(what[1].first._Myptr);
long Grp1_end = (long)(what[1].second._Myptr);
*/
}
}
/* Output:
Match Results
--------------
match start/end = 2 - 17, length = 15
'Boris Schäling'
group1 start/end = 2 - 7, length = 5
'Boris'
group2 start/end = 9 - 17, length = 8
'Schäling'
*/
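For the original question (an offset for every match, not just the first), the same position() call can be used while iterating. The sketch below uses std::regex and std::sregex_iterator so that it compiles without Boost; this assumes the Boost counterparts (boost::regex, boost::sregex_iterator) are used the same way. The function name match_offsets is made up:

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Collect the starting offset (relative to the beginning of text)
// of every match of pattern in text, in the order they occur.
std::vector<std::size_t> match_offsets(const std::string& text, const std::string& pattern)
{
    std::vector<std::size_t> offsets;
    const std::regex re(pattern);
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        // position() is the distance from text.begin() to the start of the match.
        offsets.push_back(static_cast<std::size_t>(it->position()));
    }
    return offsets;
}
```

If you really need an address rather than an offset, adding the offset to text.data() should give a pointer to the match, valid for as long as the string is not modified.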