C++ Parsing and Storing any kind of data - c++

I'm working on a simple text-based command processor for a microcontroller project (C and C++). It implements an "explicit setter", so to speak: I store pointers to target memory locations and set them to the incoming data.
First, of course, I parse the data based on a predefined set of syntactic rules:
"12345" -> string (char*)
12345 -> unsigned int
12.45 -> float
-123456 -> int (signed)
This is the method I've come up with:
void* CommandInterpreter::ParseArgumentValue(std::string token, size_t* ptrSize = nullptr) {
    void* value = NULL;
    size_t size = 0;
    if(token.empty()) return NULL;
    try {
        if(std::count(token.begin(), token.end(), '.') == 1) {
            auto result = (float)std::atof(token.c_str());
            size = sizeof(float);
            value = malloc(size);
            if(value) memcpy(value, &result, size);
        } else if(token.size() >= 2 && token.front() == '"' && token.back() == '"') {
            // Keep the substring in a named std::string: calling c_str() on the
            // temporary returned by substr() would leave a dangling pointer.
            std::string result = token.substr(1, token.size()-2);
            size = result.size() + 1; // include the terminating '\0'
            value = malloc(size);
            if(value) memcpy(value, result.c_str(), size);
        } else if(token.front() == '-') {
            auto result = (int)std::atoi(token.c_str());
            size = sizeof(int);
            value = malloc(size);
            if(value) memcpy(value, &result, size);
        } else {
            auto result = (unsigned int)std::stoul(token);
            size = sizeof(unsigned int);
            value = malloc(size);
            if(value) memcpy(value, &result, size);
        }
    } catch(const std::invalid_argument&) {
        return NULL;
    } catch(const std::out_of_range&) {
        return NULL;
    }
    // ptrSize is optional: write through it only when the caller passed one.
    if(ptrSize != nullptr) *ptrSize = (value != NULL) ? size : 0;
    return value;
}
I know it's not pretty, and it borders on bad practice, but it works. However, the returned pointer has to be freed after I process the value.
The actual setter-part looks like this:
auto setterdef = this->mDefinitions2[tokens[1]];
void* value = ParseArgumentValue(tokens[2]);
memcpy(setterdef.TargetPtr, value, setterdef.TargetSize);
if(memcmp(setterdef.TargetPtr, value, setterdef.TargetSize) != 0)
throw "COULD_NOT_SET";
free(value);
rbuilder.SetName("OK");
return rbuilder.Get();
The setter part works well. However, I want to use the same function to parse an incoming parameter list and store it in an std::map<std::string, void*>, but the allocated memory stays around even after the map is destroyed. Right now I have a loop in the destructor of my CommandParameterList class that frees the pointers in the map, but that seems pretty odd to me.
My question is: how bad is this, and is there a better way to do it?
I know about std::any; however, as far as I know I cannot just memcpy X bytes out of it, and I'd need to know the specific type to std::any_cast it.
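One direction that might address the ownership concern is std::variant (C++17), which stores the value inline and cleans up after itself, while std::visit still allows a byte-wise copy into the target. The sketch below is only an illustration under assumptions: ParsedValue, parseToken and copyBytes are hypothetical names that are not part of the code above, and it assumes a toolchain with C++17 and exceptions enabled, which not every microcontroller toolchain provides.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <optional>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical value type: one alternative per syntactic rule.
using ParsedValue = std::variant<std::string, unsigned int, int, float>;

// Hypothetical parser following the rules from the question.
// Returns std::nullopt when the token cannot be converted.
std::optional<ParsedValue> parseToken(const std::string& token) {
    if (token.empty()) return std::nullopt;
    try {
        if (token.size() >= 2 && token.front() == '"' && token.back() == '"')
            return ParsedValue(token.substr(1, token.size() - 2));
        if (std::count(token.begin(), token.end(), '.') == 1)
            return ParsedValue(std::stof(token));
        if (token.front() == '-')
            return ParsedValue(std::stoi(token));
        return ParsedValue(static_cast<unsigned int>(std::stoul(token)));
    } catch (const std::invalid_argument&) {
        return std::nullopt;
    } catch (const std::out_of_range&) {
        return std::nullopt;
    }
}

// Copies the raw bytes of whichever alternative is held into target.
// Returns the number of bytes written, or 0 if target is too small.
std::size_t copyBytes(const ParsedValue& value, void* target, std::size_t targetSize) {
    return std::visit([&](const auto& v) -> std::size_t {
        using T = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<T, std::string>) {
            const std::size_t n = v.size() + 1;  // include the terminator
            if (n > targetSize) return 0;
            std::memcpy(target, v.c_str(), n);
            return n;
        } else {
            if (sizeof(T) > targetSize) return 0;
            std::memcpy(target, &v, sizeof(T));
            return sizeof(T);
        }
    }, value);
}
```

With something like this, a std::map<std::string, ParsedValue> would release its contents when the map is destroyed, so no manual loop in a destructor would be needed. Whether the extra code size is acceptable on the target is a separate question.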

Related

VirtualQueryEx unable to read memory, so can't dereference pointer (LPCVOID/uint8_t*)

I have a function to read the memory of an application
template<class T>
std::unordered_map<uint8_t, T> find_byte(T find) {
std::cout << "Searching for value\n";
std::unordered_map<uint8_t, T> mapping;
// Turn the T bytes to a vector to use nice C++ functions
std::vector<uint8_t> byte_check;
if constexpr (std::is_same_v<T, std::string>) {
byte_check = std::vector<uint8_t>(find.begin(), find.end());
}
else {
uint8_t* data = static_cast<uint8_t*>(static_cast<void*>(&find));
byte_check = std::vector<uint8_t>(data, data + sizeof(find));
}
MEMORY_BASIC_INFORMATION info;
for (uint8_t* addr = nullptr; VirtualQueryEx(m_proc, addr, &info, sizeof(info)) == sizeof(info); addr += info.RegionSize) {
if (info.State == MEM_COMMIT && (info.Type == MEM_MAPPED || info.Type == MEM_PRIVATE)) {
size_t read{};
std::vector<uint8_t> mem_chunk;
mem_chunk.resize(info.RegionSize);
if (ReadProcessMemory(m_proc, addr, mem_chunk.data(), mem_chunk.size(), &read)) {
mem_chunk.resize(read);
for (auto pos = mem_chunk.begin();
mem_chunk.end() != (pos = std::search(pos, mem_chunk.end(), byte_check.begin(), byte_check.end()));
pos++) {
uint8_t* int_addr_ptr = (addr + (pos - mem_chunk.begin()));
mapping[*int_addr_ptr] = find;
}
}
}
}
return mapping;
}
It compiles just fine; however, it crashes when it tries to dereference the int_addr_ptr pointer.
After stepping through with a debugger, I noticed that the address returned from VirtualQueryEx could not be read.
I assume the issue lies in how I dereference, but I don't know how to fix it.
I have tried:
auto lpcvoid = (addr + (pos - mem_chunk.begin()));
auto int_addr_ptr = reinterpret_cast<const uint8_t*>(lpcvoid);
but it yielded no results.
I want to note that if I return a map of <uint8_t*, T> it works fine, but I wanted to avoid the pointers.
When you query a memory region, you should start scanning through its content from its BaseAddress, not from the address that you actually queried, because VirtualQueryEx() may have to round the address down. info.RegionSize is relative to info.BaseAddress, so when calling ReadProcessMemory() with addr, mem_chunk.size() will be too many bytes to read if addr > info.BaseAddress; you would need to subtract that offset from the size being read.
In the std::search() loop, pos should be incremented by the size of find that is being searched for, not by 1 byte. And addr should again be info.BaseAddress.
But more importantly, why is int_addr_ptr being dereferenced at all? Its value represents a memory address in another process, not in your own process. So, you can't simply dereference it to read a value, doing so will try to read the value from your process. If you want to read the value stored at that address, you will have to use ReadProcessMemory() instead. But, what is mapping supposed to be keying off of exactly? Why are you dereferencing int_addr_ptr to read a byte for a mapping key? Shouldn't you be keying the actual addresses where each copy of find is found? And what is the point of storing copies of find in the mapping?
A simpler vector<void*> would make more sense, since the caller already knows the find value it is searching for, so no need to repeat it in the output.
As an added optimization, I would suggest getting rid of the byte_check vector altogether, as it is an unnecessary memory allocation. You are using it only to have iterators for use with std::search(). Raw pointers are also valid iterators.
With that said, try something more like this:
template<class T>
std::vector<void*> find_value(const T &find) {
std::cout << "Searching for value\n";
std::vector<void*> found;
std::vector<uint8_t> mem_chunk;
const uint8_t *find_begin, *find_end;
size_t find_size;
if constexpr (std::is_same_v<T, std::string>) {
find_begin = reinterpret_cast<const uint8_t*>(find.c_str());
find_size = find.size();
}
else {
find_begin = reinterpret_cast<const uint8_t*>(&find);
find_size = sizeof(find);
}
find_end = find_begin + find_size;
uint8_t *addr = reinterpret_cast<uint8_t*>(0), *base;
MEMORY_BASIC_INFORMATION info;
while (VirtualQueryEx(m_proc, addr, &info, sizeof(info))) {
base = static_cast<uint8_t*>(info.BaseAddress);
if (info.State == MEM_COMMIT && (info.Type == MEM_MAPPED || info.Type == MEM_PRIVATE)) {
mem_chunk.resize(info.RegionSize); // resize, not reserve: data() may only be written within size()
size_t read;
if (ReadProcessMemory(m_proc, info.BaseAddress, mem_chunk.data(), info.RegionSize, &read)) {
auto start = mem_chunk.data(),
end = start + read,
pos = start;
if (addr > base) {
pos += (addr - base);
}
while ((pos = std::search(pos, end, find_begin, find_end)) != end) {
found.push_back(base + (pos - start));
pos += find_size;
}
}
}
addr = base + info.RegionSize;
}
return found;
}

Clear out the STL list of pointers in C++

I have defined a list of pointers. How should I free all these pointers before clearing the list? What is the best approach to erase all list members? In the program below, is it required to free the memory allocated for the struct? See my inline comments.
struct MyStruct {
char *data;
int len;
};
typedef std::list<struct MyStruct *> myStruct_list;
myStruct_list l_list;
/* Prepare a list */
for( int i = 0; i < 10; i++) {
struct MyStruct *s = (MyStruct *)malloc(sizeof(struct MyStruct));
s->data = (char*)malloc(MAX_LEN);
get_random_data(s->data,size);
s->len = strlen(s->data);
l_list.push_back(s);
}
/* Delete all members from a list */
myStruct_list::iterator it;
for (it = l_list.begin(); it != l_list.end(); ++it) {
if (*it) {
free(*it); // --->> do I need to free (*it)->data ?????
}
}
l_list.clear();
I want to understand is there any memory leak in below program?
Yes you have it right here:
p = (char*)malloc(MAX_LEN);
p = (char *)buf;
You allocate memory and assign it to p, and on the next line you lose it. So:
You should not use malloc() in C++ programs unless you need to pass data that will be managed by C code.
You should use a dedicated data structure such as std::string to manage your data.
If you still need to allocate dynamic memory, use smart pointers.
How should I debug if there is any memory leak?
You should avoid creating them in the first place. For example, here is how you could write get_random_str (assuming you really have to allocate it using malloc):
using spchar = std::unique_ptr<char[], decltype(std::free) *>;
spchar get_random_str( int len )
{
spchar s( static_cast<char *>( malloc( len + 1 ) ), std::free );
static const char alphanum[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
for (int i = 0; i < len; ++i) {
s[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
s[len] = '\0';
return s;
}
Note, I did not compile this code, it's to show you the idea.
Update: looks like you think that this code:
p = (char *)buf;
would copy the string from buf to p, which is not the case. Instead, you make p point to the memory of buf, losing the old value that malloc() returned (hence creating a memory leak), and you assign that address of buf to data, which leads to UB when you call free() on it. What you need instead is:
strncpy( p, buf, MAX_LEN );
but even that is not necessary as you do not really need buf at all:
void myClass::fillData(void)
{
s = (MyStruct *)malloc(sizeof(struct MyStruct));
s->data = (char*)malloc(MAX_LEN);
get_random_str(s->data,950);
s->len = strlen(s->data);
l_list.push_back(s);
}
But this is more C code using some C++ syntax. You should get a newer compiler and a textbook if you really want to learn C++.
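For comparison, here is a rough sketch of what the same list might look like when the standard library manages the memory, following the suggestions above. Record, make_random_string and fill_list are hypothetical names standing in for MyStruct, get_random_str and fillData; the original program is not shown in full, so this is an assumption about its intent.

```cpp
#include <cstddef>
#include <cstdlib>
#include <list>
#include <string>

// Hypothetical replacement for MyStruct: std::string owns the characters
// and already knows its length, so no separate len field is needed.
struct Record {
    std::string data;
};

// Hypothetical stand-in for get_random_str.
std::string make_random_string(std::size_t len) {
    static const char alphanum[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    std::string s(len, '\0');
    for (char& c : s) {
        c = alphanum[std::rand() % (sizeof(alphanum) - 1)];
    }
    return s;
}

// Stores the elements by value; nothing has to be freed by hand.
std::list<Record> fill_list(std::size_t count, std::size_t len) {
    std::list<Record> records;
    for (std::size_t i = 0; i < count; ++i) {
        records.push_back(Record{make_random_string(len)});
    }
    return records;
}
```

Calling clear() on such a list, or simply letting it go out of scope, releases every string, so there is no cleanup loop to get wrong.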
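Delete the elements using a lambda function in for_each. Each element's data has to be freed before the element itself, since both were allocated with malloc:
std::for_each(l_list.begin(), l_list.end(), [](MyStruct* elem){ if(elem) { free(elem->data); free(elem); } });
Then clear the list: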
l_list.clear();

Implementing a simple ring buffer for float values

I am trying to implement a very simple ring buffer, for holding a stream of audio samples in the form of float values.
I want to be able to take a snapshot of the audio input at any one time. I don't need to pop or delete any values, just keep a moving buffer of the last n samples.
I'd like to ask if there are any potential issues with this implementation for my purposes.
class RingBuffer
{
public:
RingBuffer (int bufferSize) : bufferSize (bufferSize), count (0), head (0)
{
buffer = static_cast<float *> (malloc(bufferSize * sizeof(float)));
readBuffer = static_cast<float *> (malloc(bufferSize * sizeof(float)));
}
~RingBuffer ()
{
if (buffer != nullptr) free(buffer);
buffer = nullptr;
if (readBuffer != nullptr) free(readBuffer);
readBuffer = nullptr;
}
void push (float value)
{
if (count < bufferSize && head == 0)
{
buffer[count++] = value;
}
else if (count == bufferSize)
{
// reset head to beginning if reached the end
if (head >= bufferSize)
{
head = 0;
buffer[head] = value;
}
else
{
buffer[head++] = value;
}
}
}
/**
* Return a snapshot of the buffer as a continuous array
*/
const float* getSnapshot ()
{
// Set up read buffer as continuous stream
int writeIndex = 0;
for (int i = head; i < count; ++i)
{
readBuffer[writeIndex++] = buffer[i];
}
for (int i = 0; i < head; ++i)
{
readBuffer[writeIndex++] = buffer[i];
}
return readBuffer;
}
private:
int bufferSize, head, count;
float* buffer;
float* readBuffer;
};
Well, there are indeed several issues I can see. Sorry for the bad news :-/
Bugs
There is a bug here: buffer[head] = value;. You don't increment head, so the sample at this position will be lost (overwritten) when the next sample comes in.
In the constructor, you should initialize buffer and readBuffer to nullptr: if one of your mallocs failed, your destructor would try to free an invalid pointer.
Your 1st loop in getSnapshot is faulty: the end-point should be min(bufferSize,head+count) rather than count
Design issues
As pointed out by mathematician1975, you should allocate your arrays with new float[bufferSize], it's simpler and more readable than mallocs
You should hold each buffer using a std::unique_ptr, so that you would no longer need any destructor (and your code would be much safer)
As you are working on circular buffers, you should use modulo arithmetic, e.g. writeIndex = (writeIndex + 1) % bufferSize. Your code will be much simpler that way, especially in getSnapshot (one loop instead of two).
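To illustrate the points above, here is a minimal sketch of the same class using modulo arithmetic. It is one possible shape, not the only one: it swaps in std::vector<float> for the std::unique_ptr suggested above (either removes the need for a destructor), and getSnapshot returns a new vector instead of a pointer to an internal read buffer, which is an assumption about what the caller can tolerate.

```cpp
#include <cstddef>
#include <vector>

// Sketch of a fixed-capacity ring buffer that keeps the last N samples.
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity)
        : buffer(capacity), head(0), count(0) {}

    // Writes a sample at head and advances head with wrap-around.
    void push(float value) {
        if (buffer.empty()) return;
        buffer[head] = value;
        head = (head + 1) % buffer.size();
        if (count < buffer.size()) ++count;
    }

    // Returns the stored samples ordered from oldest to newest.
    std::vector<float> getSnapshot() const {
        std::vector<float> snapshot(count);
        // Once the buffer is full the oldest sample sits at head;
        // before that it sits at index 0.
        const std::size_t start = (count == buffer.size()) ? head : 0;
        for (std::size_t i = 0; i < count; ++i) {
            snapshot[i] = buffer[(start + i) % buffer.size()];
        }
        return snapshot;
    }

private:
    std::vector<float> buffer;
    std::size_t head;   // next write position
    std::size_t count;  // number of valid samples
};
```

If snapshots are taken on a real-time audio thread, allocating a vector per call may be undesirable; in that case the caller could pass in a preallocated destination instead.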

How to Copy Data from One Array to Another Without Names? (C++)

I'm working on an assignment right now and have run into a roadblock. The assignment is an array list in C++ that dynamically expands by a factor of 2 every time it runs out of room to store new elements (initially starts with room for 2 elements). Here is the code I'm working on (some of it is included in a separate .h file provided by the professor, I won't post everything in order to keep this compact).
#include "array_list.h"
//initial size to create storage array
static const unsigned int INIT_SIZE = 2;
//factor to increase storage by when it gets too small
static const unsigned int GROW_FACTOR = 2;
unsigned int growthTracker = 1;
array_list::array_list()
{
m_storage = new unsigned int[INIT_SIZE];
m_capacity = INIT_SIZE;
m_current = -1;
m_size = 0;
}
array_list::~array_list()
{
delete[] m_storage; // allocated with new[], so release with delete[]
}
void array_list::clear()
{
delete[] m_storage; // allocated with new[], so release with delete[]
m_storage = new unsigned int[INIT_SIZE];
m_capacity = INIT_SIZE;
m_current = -1;
m_size = 0;
}
unsigned int array_list::size() const
{
return m_size;
}
bool array_list::empty() const
{
bool A = 0;
if(m_size == 0)
{
A = 1;
}
return A;
}
void array_list::insert(const unsigned int val)
{
m_storage[m_size++] = val;
m_current = m_size;
}
void array_list::grow_and_copy()
{
if(m_size == m_capacity)
{
new unsigned int[INIT_SIZE * (GROW_FACTOR ^ growthTracker)];
growthTracker++;
m_capacity = m_capacity * 2;
}
m_storage[m_size++] = val;
}
Now, my problem is trying to figure out how to copy the values of the old, smaller array into the new, larger one. If I wasn't using dynamic unnamed arrays, this would be very easy to do with a loop, a simple case of "for a certain range, arrayA[i] = arrayB[i]." However, because the arrays are just defined as new unsigned int[], I'm not sure how to go about this. There are no names, so I can't figure out how to tell C++ which array to copy into which. And since the grow_and_copy could be called multiple times, I'm fairly sure I can't give them names, right? Because then I would end up with multiple arrays with the same name. Can anyone point me in the right direction here? Thanks so much.
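void array_list::growList(unsigned int increase = GROW_FACTOR)
{
    // temp keeps a handle on the old array while m_storage is re-pointed
    unsigned int* temp = m_storage;
    m_storage = new unsigned int[m_capacity * increase];
    // copy only the elements that are actually in use
    for (unsigned int i = 0; i < m_size; i++)
        m_storage[i] = temp[i];
    m_capacity *= increase;
    delete[] temp; // arrays allocated with new[] must be released with delete[]
}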
I don't know if there are other variables you want to change, but this should basically do what you are asking.
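To make the idea concrete: the pointer variable is what "names" a dynamically allocated array, so the old and the new array can be told apart by holding each in its own pointer. Below is a self-contained sketch under assumptions: IntList is a hypothetical class reduced to the members needed here (the professor's array_list.h is not shown), and std::copy is used in place of the hand-written loop.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical minimal list, reduced to the members needed to show growth.
class IntList {
public:
    IntList() : m_storage(new unsigned int[2]), m_capacity(2), m_size(0) {}
    ~IntList() { delete[] m_storage; }
    IntList(const IntList&) = delete;
    IntList& operator=(const IntList&) = delete;

    void insert(unsigned int val) {
        if (m_size == m_capacity) grow_and_copy();
        m_storage[m_size++] = val;
    }
    std::size_t size() const { return m_size; }
    std::size_t capacity() const { return m_capacity; }
    unsigned int at(std::size_t i) const { return m_storage[i]; }

private:
    void grow_and_copy() {
        const std::size_t newCapacity = m_capacity * 2;
        // The local pointer acts as the "name" of the new array.
        unsigned int* bigger = new unsigned int[newCapacity];
        std::copy(m_storage, m_storage + m_size, bigger);
        delete[] m_storage;   // release the old, smaller array
        m_storage = bigger;   // the member now refers to the new array
        m_capacity = newCapacity;
    }

    unsigned int* m_storage;
    std::size_t m_capacity;
    std::size_t m_size;
};
```

Because bigger is a local variable, calling grow_and_copy repeatedly never produces "multiple arrays with the same name": each call gets its own local pointer, and only m_storage survives the call.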

Return string read from buffer and function without dynamic allocation?

How would I go about returning a string built from a buffer within a function without dynamically allocating memory?
Currently I have this function to consider:
// Reads null-terminated string from buffer in instance of buffer class.
// uint16 :: unsigned short
// ubyte :: unsigned char
ubyte* Readstr( void ) {
ubyte* Result = new ubyte[]();
for( uint16 i = 0; i < ByteSize; i ++ ) {
Result[ i ] = Buffer[ ByteIndex ];
ByteIndex ++;
if ( Buffer[ ByteIndex - 1 ] == ubyte( 0 ) ) {
ByteIndex ++;
break;
};
};
return Result;
};
While I can return the built string, I can't do this without dynamic allocation. This becomes a problem if you consider the following usage:
// Instance of buffer class "Buffer" calling Readstr():
cout << Buffer.Readstr() << endl;
// or...
ubyte String[] = Buffer.String();
Usages similar to this call result in the same memory leak as the data is not being deleted via delete. I don't think there is a way around this, but I am not entirely sure if it's possible.
Personally, I'd recommend just returning std::string or std::vector<T>: this neatly avoids memory leaks, and std::string won't allocate memory for small strings (well, most implementations are going that way, but not all are quite there).
The alternative is to create a class which can hold a big enough array and return an object of that type:
struct buffer {
enum { maxsize = 16 };
ubyte buffer[maxsize];
};
If you want to get fancier and support bigger strings, which would then allocate memory, you'll need to deal a bit more with constructors, destructors, etc. (or just use std::vector<ubyte> and get over it).
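As a rough illustration of the std::string recommendation, the sketch below returns the string by value so the caller has nothing to delete. ByteReader is a hypothetical stand-in for the buffer class, which is not shown in the question; the member names Buffer and ByteIndex are borrowed from it, and the sketch assumes the terminator should be consumed exactly once.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using ubyte = unsigned char;

// Hypothetical reader class; the original buffer class is not shown.
class ByteReader {
public:
    explicit ByteReader(std::vector<ubyte> bytes)
        : Buffer(std::move(bytes)), ByteIndex(0) {}

    // Reads up to the next null terminator (or the end of the data) and
    // returns the characters by value; the terminator itself is consumed.
    std::string Readstr() {
        std::string result;
        while (ByteIndex < Buffer.size()) {
            const ubyte b = Buffer[ByteIndex++];
            if (b == 0) break;
            result.push_back(static_cast<char>(b));
        }
        return result;
    }

private:
    std::vector<ubyte> Buffer;
    std::size_t ByteIndex;
};
```

A call such as cout << reader.Readstr() << endl; then leaks nothing, because the temporary std::string is destroyed at the end of the statement.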
There are at least three ways you could reimplement the method to avoid a direct allocation with new.
The Good:
Use a std::vector (This will allocate heap memory):
std::vector<ubyte> Readstr()
{
std::vector<ubyte> Result;
for (uint16 i = 0; i < ByteSize; i++)
{
Result.push_back(Buffer[ByteIndex]);
ByteIndex++;
if (Buffer[ByteIndex - 1] == ubyte(0))
{
ByteIndex++;
break;
}
}
return Result;
}
The Bad:
Force the caller to provide an output buffer and possibly a size to avoid overflows (does not directly allocate memory):
ubyte* Readstr(ubyte* outputBuffer, size_t maxCount)
{
for (uint16 i = 0; i < ByteSize; i++)
{
if (i == maxCount)
break;
outputBuffer[i] = Buffer[ByteIndex];
ByteIndex++;
if (Buffer[ByteIndex - 1] == ubyte(0))
{
ByteIndex++;
break;
}
}
return outputBuffer;
}
The Ugly:
Use an internal static array and return a pointer to it:
ubyte* Readstr()
{
enum { MAX_SIZE = 2048 }; // Up to you to decide the max size...
static ubyte outputBuffer[MAX_SIZE];
for (uint16 i = 0; i < ByteSize; i++)
{
if (i == MAX_SIZE)
break;
outputBuffer[i] = Buffer[ByteIndex];
ByteIndex++;
if (Buffer[ByteIndex - 1] == ubyte(0))
{
ByteIndex++;
break;
}
}
return outputBuffer;
}
Be aware that this last option has several limitations, including the possibility of data races in a multithreaded application and the inability to call it inside a recursive function, among other subtle issues. Otherwise, it is probably the closest to what you are looking for, and it can be used safely if you take some precautions and make some assumptions about the calling code.