Read CString from buffer with unknown length? - c++

Let's say I have a file. I read all the bytes into an unsigned char buffer. From there I'm trying to read a C string (null-terminated) without knowing its length.
I tried the following:
char* Stream::ReadCString()
{
char str[0x10000];
int len = 0;
char* pos = (char*)(this->buffer[this->position]);
while(*pos != 0)
str[len++] = *pos++;
this->position += len+ 1;
return str;
}
I thought I could fill in each char of the str array as I went along, checking each time whether I had reached the null terminator. This is not working. Any help?
this->buffer = array of bytes
this->position = position in the array
Are there any other methods to do this? I guess I could run it by the address of the actual buffer:
str[len++] = *(char*)(this->buffer[this->position++]) ?
Update:
My new function:
char* Stream::ReadCString()
{
this->AdvPosition(strlen((char*)&(this->buffer[this->position])) + 1);
return (char*)&(this->buffer[this->position]);
}
and calling it with:
printf( "String: %s\n", s.ReadCString()); //tried casting to char* as well just outputs blank string

Check this:
#include <cctype> // std::isalpha
#include <cstring>
#include <iostream>
class A
{
unsigned char buffer[4096];
int position;
public:
A() : position(0)
{
memset(buffer, 0, 4096);
char *pos = reinterpret_cast<char*>(&(this->buffer[50]));
strcpy(pos, "String");
pos = reinterpret_cast<char*>(&(this->buffer[100]));
strcpy(pos, "An other string");
}
const char *ReadString()
{
if (this->position != 4096)
{
while (std::isalpha(this->buffer[this->position]) == false && this->position != 4096)
this->position++;
if (this->position == 4096)
return 0;
void *tmp = &(this->buffer[this->position]);
char *str = static_cast<char *>(tmp);
this->position += strlen(str);
return (str);
}
return 0;
}
};
The reinterpret_casts are only for the initialization, since you are reading from a file.
int main()
{
A test;
// ReadString() returns 0 once no string is left, and streaming a null const char* is undefined behavior, so check before printing.
while (const char *s = test.ReadString())
std::cout << s << std::endl;
}
http://ideone.com/LcPdFD
Edit: I have changed the end of ReadString().

str is a local array. Once ReadCString() returns, str no longer exists, so dereferencing the returned pointer is undefined behavior (see "Undefined, unspecified and implementation-defined behavior"): it might or might not cause a noticeable problem.

Null termination is probably the best way to go as long as you're careful. The reason it's not working for you is most likely that you are returning a pointer to memory allocated on the stack. That memory stops being valid as soon as the function returns, so using the pointer afterwards is undefined behaviour. Instead, allocate your chars on the heap:
char* str = new char[0x10000];
and have the caller release the memory with delete[] when it no longer needs it. Remember to write the terminating '\0' into str after the loop as well, since the loop stops before copying it.

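It can be fixed with the following method. My updated function advanced the position first and then returned the address at the new position, which is just past the string. Computing the length first and returning the address of the old position fixes it.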
char* Stream::ReadCString()
{
u64 str_len = strlen((char*)&(this->buffer[this->position])) + 1;
this->AdvPosition(str_len);
return (char*)&(this->buffer[this->position - str_len]);
}
Hope this helps anyone.
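Both problems discussed in this thread (returning a pointer to a local array, and running past the end of the buffer when no terminator is present) can be avoided by searching for the terminator within the remaining bytes and returning a std::string copy. The following is only a minimal sketch under assumptions: StreamSketch, buffer_ and position_ are hypothetical stand-ins for the Stream class in the question, whose full definition is not shown.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Stream class.
class StreamSketch {
public:
    explicit StreamSketch(std::vector<unsigned char> bytes)
        : buffer_(std::move(bytes)), position_(0) {}

    // Reads the null-terminated string at the current position into out
    // and advances past its terminator. Returns false, leaving the
    // position untouched, if no terminator exists in the remaining bytes.
    bool ReadCString(std::string& out) {
        if (position_ >= buffer_.size()) return false;
        const unsigned char* start = buffer_.data() + position_;
        const std::size_t remaining = buffer_.size() - position_;
        // memchr never looks beyond `remaining` bytes, unlike strlen.
        const void* end = std::memchr(start, 0, remaining);
        if (end == nullptr) return false;
        const std::size_t len =
            static_cast<std::size_t>(static_cast<const unsigned char*>(end) - start);
        out.assign(reinterpret_cast<const char*>(start), len);
        position_ += len + 1;  // skip the string and its terminator
        return true;
    }

    std::size_t position() const { return position_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t position_;
};
```

Returning a copy costs an allocation per string, but the result stays valid even if the buffer is later changed or destroyed.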

Related

Is there anything wrong with how I use OpenSSL to calculate an HMAC-SHA1 hash value?

int computeHMACSHA1Hash(const char * unhashedcstr, char * hashedcstr, const char * key, int returncode)
{
string hashed;
size_t unhashlength = strlen(unhashedcstr);
char * nonconstunhashcstr = new char[unhashlength];
strcpy_s(nonconstunhashcstr, unhashlength + 1, unhashedcstr);
unsigned char* pixels = reinterpret_cast<unsigned char*>(nonconstunhashcstr);
returncode = 0;
HMAC_CTX* context = HMAC_CTX_new();
size_t unhashedstrlength = sizeof(unhashedcstr);
if (context != NULL)
{
if (HMAC_Init_ex(context, key, strlen(key), EVP_sha1(), NULL))
{
if (HMAC_Update(context, pixels, unhashedstrlength))
{
unsigned char hash[EVP_MAX_MD_SIZE];
unsigned int lengthOfHash = 0;
if (HMAC_Final(context, hash, &lengthOfHash))
{
std::stringstream ss;
for (unsigned int i = 0; i < lengthOfHash; ++i)
{
ss << std::hex << std::setw(2) << std::setfill('0') << (int)hash[i];
}
hashed = ss.str();
size_t outputSize = hashed.length() + 1; // +1 for null terminator
strcpy_s(hashedcstr, outputSize, hashed.c_str());
returncode = 0;
}
else
{
returncode = 7;
}
}
else
{
returncode = 6;
}
}
else
{
returncode = 5;
}
HMAC_CTX_free(context);
}
else
{
returncode = 4;
}
return returncode;
}
int main()
{
const char * unhashedcstr = "a=services&l=v1&p=open&k=SD58292829&i=20200918125249803&n=2124&t=1600404769&f={\"invoiceCode\": \"11111\",\"invoiceNo\": \"2222\",\"inTaxAmount\": \"\",\"exTaxAmount\": \"\"}";
char * hashedcstr = new char[100];
int returncode = 0;
const char * key = "SD886A11B0EE428F";
int result = computeHMACSHA1Hash(unhashedcstr, hashedcstr, key, returncode);
return 0;
}
I tried the code above to calculate the HMAC-SHA1 hash of some content, but when I compared the result with https://www.freeformatter.com/hmac-generator.html#before-output
it looks like I didn't do it right. I'm not sure what I did wrong, though. Any help would be appreciated.
My result was "d916b4c2d277319bbf18076c158f0cbcf6c3bc57", while the website https://www.freeformatter.com/hmac-generator.html#before-output gave "71482b292f2b2a47b3eca6dad5e7350566d60963". Even when I tried the string "a=services&l=v1&p=open&k=SD58292829&i=20200918125249803&n=2124&t=1600404769&f={"invoiceCode": "11111","invoiceNo": "2222","inTaxAmount": "","exTaxAmount": ""}" with the escape characters removed, the result was "09be98b6129c149e685ed57a1d19651a602cda0d", which still didn't match the correct one.
Is there anything wrong with my code?
Your hash is calculated over the bytes a=se, which are the first four bytes of the whole input string. Thus, you get d916b4c2d277319bbf18076c158f0cbcf6c3bc57 instead of the 09be98b6129c149e685ed57a1d19651a602cda0d that would correspond to the whole string.
The reason is this:
size_t unhashedstrlength = sizeof(unhashedcstr);
Here, sizeof(unhashedcstr) is the size of the unhashedcstr pointer itself (which is of type const char*), not the size of the null-terminated C-style string this unhashedcstr pointer is pointing to. You are compiling a 32-bit program, so the size of a pointer is 4 bytes. Thus, unhashedstrlength is 4.
To get the length of the C-style string, you can do this instead:
size_t unhashedstrlength = strlen(unhashedcstr);
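To make the difference concrete, here is a small sketch. The helper names pointerSize and textLength are made up for illustration and are not part of the code in the question.

```cpp
#include <cstddef>
#include <cstring>

// sizeof applied to a pointer yields the size of the pointer itself
// (typically 4 or 8 bytes), no matter how long the text it points to is.
std::size_t pointerSize(const char* s) { return sizeof(s); }

// strlen walks the characters until it finds the terminating '\0'
// and returns how many characters came before it.
std::size_t textLength(const char* s) { return std::strlen(s); }
```

For the 10-character text "a=services", textLength returns 10, while pointerSize returns sizeof(const char*) whatever the argument is.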
But just as a comment, in modern C++, you should avoid using raw pointers (such as const char*, char*, unsigned char*), C functions (like strlen(), strcpy_s()) and manual memory management (new / delete and new[] / delete[]). You should prefer to use std::string and/or std::vector<unsigned char> instead, wherever possible. When you need to pass a buffer's address to an API function, you can use std::string::data(), std::vector::data(), or more generally, std::data().
By the way, you currently leak memory: you dynamically allocate buffers using new[], but you never deallocate those (using delete[]). So that memory is released by the OS only after the program exits. This is called a memory leak.

longest palindromic substring. Error: AddressSanitizer, heap overflow

#include<string>
#include<cstring>
class Solution {
void shift_left(char* c, const short unsigned int bits) {
const unsigned short int size = sizeof(c);
memmove(c, c+bits, size - bits);
memset(c+size-bits, 0, bits);
}
public:
string longestPalindrome(string s) {
char* output = new char[s.length()];
output[0] = s[0];
string res = "";
char* n = output;
auto e = s.begin() + 1;
while(e != s.end()) {
char letter = *e;
char* c = n;
(*++n) = letter;
if((letter != *c) && (c == &output[0] || letter != (*--c)) ) {
++e;
continue;
}
while((++e) != s.end() && c != &output[0]) {
if((letter = *e) != (*--c)) {
const unsigned short int bits = c - output + 1;
shift_left(output, bits);
n -= bits;
break;
}
(*++n) = letter;
}
string temp(output);
res = temp.length() > res.length()? temp : res;
shift_left(output, 1);
--n;
}
return res;
}
};
The input is longestPalindrome("babad");
The program works fine and prints "bab" as the longest palindrome, but there's a heap overflow somewhere. An error like this appears:
Read of size 6 at ...memory address... thread T0
"babad" has size 5, and after going over this for an hour I don't see where the iteration ever exceeds 5.
There are 3 pointers here that iterate:
e, the current element of string s;
n, the pointer to the next char of output;
and c, a copy of n that decrements until it reaches &output[0].
Maybe it's something to do with memmove or memset, since I've never used them before.
I'm completely lost.
TL;DR: mixing char* and std::string is not a good idea if you don't understand exactly how each works.
If you want the length of a string, you can't write const unsigned short int size = sizeof(c); because sizeof returns the size of the pointer (commonly 4 on a 32-bit machine and 8 on a 64-bit machine). You must write this instead: const size_t size = strlen(c);
AddressSanitizer is right that you are (indirectly) trying to read memory that does not belong to you.
How does the std::string constructor that takes a char* work?
Answer: the char* is treated as a C-style string, which means it must be terminated by a null character '\0'.
More details: the constructor calls a strlen-like function, which looks roughly like this:
https://en.cppreference.com/w/cpp/string/byte/strlen
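// Simplified illustration of strlen, renamed so it does not clash with the one from <cstring>.
size_t my_strlen(const char *begin){
size_t k = 0;
while (*begin != '\0'){
++k;
++begin;
}
return k;
}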
If a C-style char* string does not contain '\0', this loop runs past the end of the buffer and reads memory that does not belong to you.
How to fix?
Answer (two options):
Do not mix char* and std::string.
Replace char* output = new char[s.length()]; with char* output = new char[s.length() + 1]; memset(output, 0, s.length() + 1);
Also, you must delete all memory that you allocate with new, so add delete[] output; before return res;
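As an illustration of the first option (not mixing char* with std::string), here is a sketch that works on std::string only. Note that it swaps in a different technique from the question's shifting buffer, namely the common "expand around the centre" approach, and that the names expand and longestPalindromeSketch are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Grows a palindrome outward from the centre (left, right) for as long
// as the characters match, and returns the length it reached.
static std::size_t expand(const std::string& s, std::ptrdiff_t left,
                          std::ptrdiff_t right) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(s.size());
    while (left >= 0 && right < n && s[left] == s[right]) {
        --left;
        ++right;
    }
    return static_cast<std::size_t>(right - left - 1);
}

// Returns the first longest palindromic substring of s.
std::string longestPalindromeSketch(const std::string& s) {
    std::size_t bestStart = 0;
    std::size_t bestLen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const std::ptrdiff_t c = static_cast<std::ptrdiff_t>(i);
        const std::size_t odd = expand(s, c, c);       // centre on s[i]
        const std::size_t even = expand(s, c, c + 1);  // centre between s[i] and s[i+1]
        const std::size_t len = std::max(odd, even);
        if (len > bestLen) {
            bestLen = len;
            bestStart = i - (len - 1) / 2;
        }
    }
    return s.substr(bestStart, bestLen);
}
```

No manual allocation is involved, so there is no terminator to forget and nothing to delete[].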

How to convert my string into array of chars

Here is the problem: when I try to convert it using strncpy_s, the array has some kind of "trash data" from memory at the end of it, even when I fill the buffer with "\0". How can I convert it cleanly?
typedef class Ryadok {
private:
int LengthOf = 0;
char text[20];
string* address;
public:
Ryadok(string strin) {
this->text[0] = '\0';
memset(text, '\0', sizeof(text));
strncpy_s(text, strin.c_str(), sizeof(text) - 1);
this->address = &strin;
for (int i = 0; i < sizeof(strin); i++) {
cout << this->text[i];
}
}
~Ryadok() {
}
}*cPtr;
int main()
{
Ryadok example("sdsdfsdf");
}
The idea is to use the c_str() function to convert the std::string to a C string. Then we can simply call strcpy() to copy the C string into a char array:
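std::string s = "Hello World!";
// A std::vector<char> is used because an array sized at run time (char cstr[s.size() + 1]) is not standard C++.
std::vector<char> cstr(s.size() + 1);
strcpy(cstr.data(), s.c_str()); // or pass &s[0]
std::cout << cstr.data() << '\n';
return 0;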
With strncpy_s, the last argument is the maximum number of chars to copy. The function stops at the null terminator of the source, so passing sizeof(text) - 1 for a shorter string does not by itself copy anything from beyond the end of the string.
The garbage more likely comes from the print loop: sizeof(strin) is the size of the std::string object itself (typically a few dozen bytes, depending on the implementation), not the number of characters in it. The loop therefore reads past the end of the 20-byte text array, which is undefined behavior; it can print whatever happens to follow in memory, or crash.
You are right, though, to copy the data pointed to by the return value of c_str(). That pointer refers to data owned by the std::string object and may be changed or invalidated by that object.
Here's a modified version of your code that should avoid the garbage:
typedef class Ryadok {
private:
int LengthOf = 0;
char text[20];
string* address;
public:
Ryadok(string strin) {
this->text[0] = '\0';
memset(text, '\0', sizeof(text));
if(strin.length()+1 <= sizeof(text)) {
strncpy_s(text, strin.c_str(), strin.length()+1);
} else {
//some error handling needed since our buffer is too small
}
this->address = &strin;
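// Print only the characters that were copied; sizeof(strin) is the size of the std::string object, not its length.
for (size_t i = 0; i < strlen(this->text); i++) {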
cout << this->text[i];
}
}
~Ryadok() {
}
}*cPtr;
int main()
{
Ryadok example("sdsdfsdf");
}
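For comparison, here is a portable sketch of the same copy that avoids the Microsoft-specific strncpy_s. The helper name copyToArray is made up for illustration; it truncates when the buffer is too small and reports that through its return value.

```cpp
#include <cstddef>
#include <string>

// Copies src into the fixed-size array dst and always null-terminates it.
// Returns true if the whole string fit, false if it had to be truncated.
template <std::size_t N>
bool copyToArray(const std::string& src, char (&dst)[N]) {
    static_assert(N > 0, "destination must have room for the terminator");
    // Leave one slot for the terminating '\0'.
    const std::size_t count = src.size() < N - 1 ? src.size() : N - 1;
    src.copy(dst, count);  // std::string::copy does not append '\0'
    dst[count] = '\0';
    return count == src.size();
}
```

After the call, the text can be printed with std::cout << dst, or iterated up to strlen(dst), which is the number of characters actually stored.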

Unable to allocate memory via pointer

I wrote this function with the intention of combining the character equivalent of argument 3 with argument 2, then allocating memory for argument 1 and returning the result through it. Based on debug statements inserted into the function, everything seems to be correct, but the memory appears to be freed on return. Why is this? Or am I missing something else?
I'm not accustomed to programming on a Mac and I can't get gdb to work, so I'm kind of flying blind.
Function
bool BraviaIpCtrl::setVolume(char *output, const char *input, unsigned short value)
{
bool success = false;
output = nullptr;
if(value <= 100)
{
int msgLen = 24;
output = new char[msgLen];
memset(output, 0, sizeof(*output));
std::string numbers = std::to_string(value).c_str();
size_t len = numbers.length();
memcpy(output, input, msgLen);
memcpy(output + (msgLen - 1) - len, numbers.c_str(), len);
success = true;
}
return success;
}
Test Function call
char* test = nullptr;
if(bc.setVolume(test, bc.bctl_volume_set, 43) && test != nullptr)
{
std::cout << *test << std::endl;
}
else
{
std::cout << "NOPE!!" << std::endl;
}
As @mailtreyak pointed out, you are passing a pointer to char by value:
a copy of the pointer output (let's say output_copy) is used within the function;
if you make output_copy point to different memory, your original pointer still points to whatever it pointed to before;
when the function returns, the modification you expected has not happened (which is correct behavior, because the caller's pointer was never modified).
Below is another way, using a pointer to pointer (char**):
bool BraviaIpCtrl::setVolume(char** output, const char* input, unsigned short value)
{
bool success = false;
*output = nullptr;
if (value <= 100)
{
int msgLen = 24;
*output = new char[msgLen];
memset(*output, 0, msgLen);
std::string numbers = std::to_string(value); // the digits of value, as in the original function
size_t len = numbers.length();
memcpy(*output, input, msgLen);
memcpy(*output + (msgLen - 1) - len, numbers.c_str(), len);
success = true;
}
return success;
}
and calling code:
char* test = nullptr;
if (bc.setVolume(&test, bc.bctl_volume_set, 43) && test != nullptr)
{
std::cout << *test << std::endl;
}
else
{
std::cout << "NOPE!!" << std::endl;
}
Please pay much attention to this error in your previous code:
int msgLen = 24;
output = new char[msgLen];
memset(output, 0, sizeof(*output));
should be instead:
int msgLen = 24;
output = new char[msgLen];
memset(output, 0, msgLen);
because you want to zero all 24 bytes, not just 1 byte (sizeof(*output) is sizeof(char), which is 1).
The problem is that you are passing the pointer variable to the function and, like any other variable, it is passed by value. The method setVolume therefore works on a local copy of the pointer test and assigns the new memory to that copy. The calling code has no way of seeing this change.
Why not change the method to return the address of the array instead?
char * BraviaIpCtrl::setVolume(const char *input, unsigned short value)
{
char* output = NULL;
if(value <= 100)
{
int msgLen = 24;
output = new char[msgLen];
memset(output, 0, msgLen); // zero the whole buffer, not just sizeof(*output) == 1 byte
std::string numbers = std::to_string(value);
size_t len = numbers.length();
memcpy(output, input, msgLen);
memcpy(output + (msgLen - 1) - len, numbers.c_str(), len);
}
return output;
}
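A further option is to avoid the raw allocation altogether and pass a std::string by reference, so the caller sees the result and nothing has to be freed. This is only a sketch under assumptions: setVolumeSketch is a made-up free function, and the layout it relies on (digits written right-aligned just before the final character of the template) is inferred from the memcpy offsets in the question, since the contents of bctl_volume_set are not shown.

```cpp
#include <string>

// Writes the decimal digits of value over the characters just before the
// last character of input and stores the result in output.
// Returns false, leaving output unchanged, if value is out of range or
// the template is too short.
bool setVolumeSketch(std::string& output, const std::string& input,
                     unsigned short value) {
    if (value > 100) return false;
    const std::string digits = std::to_string(value);
    if (input.size() < digits.size() + 1) return false;
    output = input;
    // Same position as output + (msgLen - 1) - len in the question.
    output.replace(output.size() - 1 - digits.size(), digits.size(), digits);
    return true;
}
```

Because output is a reference, the assignment inside the function changes the caller's string, which is exactly what the by-value char* parameter could not do.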

c++ remove non utf8

I am working on validating that a string is UTF-8.
I have found the function g_utf8_validate from GLib, which returns:
true/false
the location of the last valid data that was read from the string
Is there a possibility to go beyond this and also get the valid data after the non-UTF-8 portion? Example:
std::string invalid = "okdata\xa0\xa1morevalid";
Currently I am able to save "okdata", but I would like to get "okdatamorevalid".
Any ideas? Thank you.
You could keep calling g_utf8_validate on the remaining string (skipping the first byte every time) to find more valid sections:
#include <iostream>
#include <string>
#include <glib.h>
int main() {
char const *data = "okdata\xa0\xa1morevalid";
std::string s;
// Under the assumption that the string is null-terminated.
// Otherwise you'll have to know the length in advance, pass it to
// g_utf8_validate and reduce it by (pend - p) every iteration. The
// loop condition would then be remaining_size > 0 instead of *pend != '\0'.
for(char const *p = data, *pend = data; *pend != '\0'; p = pend + 1) {
g_utf8_validate(p, -1, &pend);
s.append(p, pend);
}
std::cout << s << std::endl; // prints "okdatamorevalid"
}
You can call it in a loop. Something like this:
std::string sanitize_utf8(const std::string &in) {
std::string result;
const char *ptr = in.data(), *end = ptr + in.size();
while (true) {
const char *ptr2;
g_utf8_validate(ptr, end - ptr, &ptr2);
result.append(ptr, ptr2);
if (ptr2 == end)
break;
ptr = ptr2 + 1;
}
return result;
}
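If GLib is not available, the same skip-one-byte-and-retry idea can be sketched with a hand-written check. This is an illustrative stand-in, not GLib's implementation: utf8SequenceLength and sanitizeUtf8Sketch are made-up names, and the byte ranges follow the well-formed UTF-8 table of the Unicode standard (rejecting overlong forms, surrogates, and code points above U+10FFFF), so edge cases may differ from g_utf8_validate.

```cpp
#include <cstddef>
#include <string>

// Returns the length (1 to 4) of the well-formed UTF-8 sequence that
// starts at p, or 0 if no well-formed sequence starts there.
// n is the number of bytes available at p.
static std::size_t utf8SequenceLength(const unsigned char* p, std::size_t n) {
    if (n == 0) return 0;
    const unsigned char b0 = p[0];
    if (b0 < 0x80) return 1;             // ASCII
    std::size_t len = 0;
    unsigned char lo = 0x80, hi = 0xBF;  // allowed range of the second byte
    if (b0 >= 0xC2 && b0 <= 0xDF) {
        len = 2;
    } else if (b0 >= 0xE0 && b0 <= 0xEF) {
        len = 3;
        if (b0 == 0xE0) lo = 0xA0;       // excludes overlong encodings
        if (b0 == 0xED) hi = 0x9F;       // excludes surrogates
    } else if (b0 >= 0xF0 && b0 <= 0xF4) {
        len = 4;
        if (b0 == 0xF0) lo = 0x90;       // excludes overlong encodings
        if (b0 == 0xF4) hi = 0x8F;       // excludes values above U+10FFFF
    } else {
        return 0;                        // stray continuation or invalid lead byte
    }
    if (n < len) return 0;               // sequence is cut off
    if (p[1] < lo || p[1] > hi) return 0;
    for (std::size_t i = 2; i < len; ++i)
        if (p[i] < 0x80 || p[i] > 0xBF) return 0;
    return len;
}

// Copies every well-formed sequence to the result and drops, one byte
// at a time, anything that does not start a well-formed sequence.
std::string sanitizeUtf8Sketch(const std::string& in) {
    std::string out;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(in.data());
    std::size_t i = 0;
    while (i < in.size()) {
        const std::size_t len = utf8SequenceLength(p + i, in.size() - i);
        if (len == 0) {
            ++i;  // skip one offending byte and try again
            continue;
        }
        out.append(in, i, len);
        i += len;
    }
    return out;
}
```

As with the GLib loops above, invalid bytes are silently dropped; a caller that needs to know whether anything was removed can compare the sizes of the input and the result.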