How to read x characters from position y in a char * buffer? - c++

I am reading through a buffer (char *) and I have a cursor that tracks my current position in the buffer. Is there a way to copy characters 7-64 out of the buffer, or is my best bet to just loop over the buffer from position x to position y?
The size of the destination buffer is computed dynamically by another function.
Initializing the array with that size fails with:
variable-sized object 'version' may not be initialized
Relevant code parts:
int32_t size = this->getObjectSizeForMarker(cursor, length, buffer);
cursor = cursor + 8; //advance cursor past marker and size
char version[size] = this->getObjectForSizeAndCursor(size, cursor, buffer);
-
char* FileReader::getObjectForSizeAndCursor(int32_t size, int cursor, char *buffer) {
char destination[size];
memcpy(destination, buffer + cursor, size);
}
-
int32_t FileReader::getObjectSizeForMarker(int cursor, int eof, char * buffer) {
//skip the marker and read next 4 bytes
cursor = cursor + 4; //skip marker and read 4
unsigned char *ptr = (unsigned char *)buffer + cursor;
int32_t objSize = (ptr[0] << 24) | (ptr[1] << 16) | (ptr[2] << 8) | ptr[3];
return objSize;
}

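Advance the buffer pointer by six (to reach the seventh character, at index 6), and then memcpy 64-7 (57) bytes. Note that this copies characters 7 through 63; if character 64 must be included as well, copy 58 bytes instead. For example: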
const char *buffer = "foo bar baz...";
char destination[SOME_MAX_LENGTH];
memcpy(destination, buffer + 6, 64-7);
You may want to terminate the destination array so that you can work with it using standard C string functions. Note that we're adding the null character at index 57 (the 58th byte), right after the 57 bytes that were copied over:
/* terminate the destination string at the 58th byte, if desired */
destination[64-7] = '\0';
If you need to work with a dynamically sized destination, use a pointer instead of an array:
const char *buffer = "foo bar baz...";
char *destination = NULL;
/* note we do not multiply by sizeof(char), which is unnecessary */
/* we should cast the result, if we're in C++ */
destination = (char *) malloc(58);
/* error checking */
if (!destination) {
fprintf(stderr, "ERROR: Could not allocate space for destination\n");
return EXIT_FAILURE;
}
/* copy bytes and terminate */
memcpy(destination, buffer + 6, 57);
*(destination + 57) = '\0';
...
/* don't forget to free malloc'ed variables at the end of your program, to prevent memory leaks */
free(destination);
Honestly, if you're in C++, you should probably be using the std::string class from the C++ strings library. Then you can call the substr method on your string instance to get the 57-character substring of interest. It would involve fewer headaches and less reinventing of the wheel.
But the above code should be useful for both C and C++ applications.
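For the dynamically sized case in the question, a std::string can own the copied bytes, which avoids both the variable-length array and the manual malloc/free. The following is a minimal sketch, not the asker's actual FileReader code: the function name copy_range and its parameters are hypothetical, and it assumes the caller knows the total length of the source buffer.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns `count` bytes of `buffer`, starting at
// byte offset `start`, as a std::string. The string owns its storage,
// so it can safely be returned from a function.
std::string copy_range(const char *buffer, std::size_t buffer_len,
                       std::size_t start, std::size_t count)
{
    // Refuse to read past the end of the source buffer.
    if (start > buffer_len || count > buffer_len - start) {
        throw std::out_of_range("copy_range: range exceeds buffer");
    }
    // This constructor copies exactly `count` bytes, embedded zeros included.
    return std::string(buffer + start, count);
}
```

With this, copy_range(buffer, length, 6, 58) would return characters 7 through 64 (counting from 1), and size() on the result gives the number of bytes copied.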

Related

Subsetting char array without copying it in C++

I have a long array of char (coming from a raster file via GDAL), all composed of 0 and 1. To compact the data, I want to convert it to an array of bits (thus dividing the size by 8), 4 bytes at a time, writing the result to a different file. This is what I have come up with by now:
uint32_t bytes2bits(char b[33]) {
b[32] = 0;
return strtoul(b,0,2);
}
const char data[36] = "00000000000000000000000010000000101"; // 101 is to be ignored
char word[33];
strncpy(word,data,32);
uint32_t byte = bytes2bits(word);
printf("Data: %d\n",byte); // 128
The code is working, and the result is going to be written in a separate file. What I'd like to know is: can I do that without copying the characters to a new array?
EDIT: I'm using a const variable here just to make a minimal, reproducible example. In my program it's a char *, which is continually changing value inside a loop.
Yes, you can, as long as you can modify the source string (in your example code you can't because it is a constant, but I assume in reality you have the string in writable memory):
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
// You would need to make sure that the `data` argument always has
// at least 33 characters in length (the null terminator at the end
// of the original string counts)
char temp = data[32];
data[32] = 0;
uint32_t byte = bytes2bits(data);
data[32] = temp;
printf("Data: %d\n",byte); // 128
}
In this example, because a char* buffer stores the long data, it is not necessary to copy each part into a temporary buffer in order to convert it to a long.
Just use a variable to step through the buffer in periods of 32 bytes each. strtoul needs a terminating 0 byte right after the 32nd byte of each period, so the character at that position is saved, overwritten with 0, and restored afterwards.
So your code would look like:
uint32_t bytes2bits(const char* b) {
return strtoul(b,0,2);
}
void compress (char* data) {
size_t dataLen = strlen(data);
const size_t periodLen = 32;
char* periodStr = data; // start of the current period
size_t periodPos = periodLen; // index just past the current period
char tmp;
uint32_t byte;
while(periodPos < dataLen)
{
tmp = data[periodPos]; // save the character that follows the period
data[periodPos] = 0; // terminate the period temporarily
byte = bytes2bits(periodStr);
printf("Data: %lu\n",(unsigned long)byte);
data[periodPos] = tmp; // restore the saved character
periodStr = data + periodPos; // move on to the next period
periodPos += periodLen;
}
if(*periodStr != 0) // last period, possibly shorter than 32 characters
{
byte = bytes2bits(periodStr);
printf("Data: %lu\n",(unsigned long)byte);
}
}
Please be careful with the last period, which could be shorter than 32 bytes.
const char data[36]
You are in violation of your contract with the compiler if you declare something as const and then modify it.
Generally speaking, the compiler won't let you modify it...so to even try to do so with a const declaration you'd have to cast it (but don't)
char *sneaky_ptr = (char*)data;
sneaky_ptr[0] = 'U'; /* the U is for "undefined behavior" */
See: Can we change the value of an object defined with const through pointers?
So if you wanted to do this, you'd have to be sure the data was legitimately non-const.
The right way to do this in modern C++ is by using std::string to hold your string and std::string_view to process parts of that string without copying it.
You can use string_view with the char array you have, though. It is commonly used as a modern replacement for the classic null-terminated const char* string.
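If the goal is only to avoid the copy, the digits can also be folded into an integer directly, which needs neither a temporary buffer nor a writable source. This is a sketch under assumptions: the name parse_bits is hypothetical, and the input is assumed to contain only '0' and '1' characters.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: interprets up to 32 characters starting at `bits`
// as a binary number, most significant digit first. It only reads the
// input, so it also works on const data and needs no null terminator.
std::uint32_t parse_bits(const char *bits, std::size_t count)
{
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count && i < 32; ++i) {
        // Shift in one bit per character; anything other than '1' counts as 0.
        value = (value << 1) | (bits[i] == '1' ? 1u : 0u);
    }
    return value;
}
```

Inside a loop, calling parse_bits(data + offset, 32) for each offset that is a multiple of 32 would walk the whole buffer without touching it.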

Converting binary data in string to char

So I have some data I convert from packet to string, in binary (datagram):
std::string Packet::packetToString()
{
//packing to one bitset
std::bitset<208> pak(std::string(std::bitset<2>(type).to_string() + std::bitset<64>(num1).to_string() + std::bitset<64>(num2).to_string() + std::bitset<64>(num3).to_string() + std::bitset<4>(state).to_string() + std::bitset<4>(id).to_string() + "000000"));
std::string temp;
std::bitset<8> tempBitset(0);
for (int i = pak.size() - 1; i >= 0; i--)
{
tempBitset[i % 8] = pak[i];
if (i % 8 == 0)
{
char t = static_cast<char> (tempBitset.to_ulong());
temp.push_back(t);
}
}
return temp;
}
Then I want to convert this string to char array (in this case char buffer[26];) and send it with SendTo("127.0.0.1", 1111, buffer, 26);
What's the problem:
Packet pak1(... data I input ... );
string packet;
packet = pak1.packetToString();
char buffer[26];
strcpy_s(buffer, packet.c_str());
Data sent with this array seems to be cut off wherever a 0x00 (NULL) byte appears. This is caused by c_str(), I guess. How can I deal with this? :)
strcpy() and strcpy_s() copy null terminated c-strings. So indeed, if there's any 0x00 char in the c_str() the copy will end.
Use std::copy() to copy the full data, regardless of any 0x00 bytes that you might encounter:
copy (packet.begin(), packet.end(), buffer); // assuming packet.size() <= 26
or with copy_n():
copy_n (packet.begin(), 26, buffer); // assuming that packet is at least 26 bytes
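Both calls above rely on the sizes matching. A bounds-checked variant might look like the following sketch; the function name fill_datagram is hypothetical, and zero-padding short packets is an assumption about what the receiver expects.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: copies at most `buffer_size` bytes of `packet`
// into `buffer`, embedded 0x00 bytes included, and zero-fills the rest.
// Returns the number of packet bytes actually copied.
std::size_t fill_datagram(const std::string &packet, char *buffer,
                          std::size_t buffer_size)
{
    const std::size_t n = std::min(packet.size(), buffer_size);
    std::copy_n(packet.data(), n, buffer);             // binary-safe copy
    std::fill(buffer + n, buffer + buffer_size, '\0'); // pad the remainder
    return n;
}
```

The return value lets the caller detect a packet that was longer than the buffer and therefore truncated.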

How is memset working in this snippet of code?

I think this snippet of code is enough to get the idea of what I'm doing.
I'm using getline to read input data from a text file that has lines that might look something like this: The cat is fat/And likes to sing
From searching around the internet I was able to get it working, but I'd like to better understand WHY it is working. My primary question is how the
memcpy(id, buffer, temp - buffer);
line is working. I read what memcpy() does but do not understand how the temp - buffer part is working.
So from my understanding I'm setting *temp to the '/' in that line. Then I'm copying the line up until the '/' into it. But how does the temp, which is at '/' minus the buffer (which is the whole line from getline) work out to just be The cat is fat?
Hopefully that made some sense.
#define MAX_SIZE 255
char buffer[MAX_SIZE + 1] = { 0 };
cin.getline(buffer, MAX_SIZE);
memset(id, 0, 256);
memset(title, 0, 256);
char* temp = strchr(buffer, '/');
memcpy(id, buffer, temp - buffer);
temp++;
strcpy(title, temp);
Also, if I can double dip, why would MAX_SIZE be defined at 255 but MAX_SIZE+1 is often used. Does this have to do with a delimiter or white space at the end of a line?
Thanks for the help.
In my opinion it is simply bad code. :)
I would write it like
const size_t MAX_SIZE = 256;
char buffer[MAX_SIZE] = {};
std::cin.getline( buffer, MAX_SIZE );
id[0] = '\0';
title[0] = '\0';
if ( char* temp = strchr( buffer, '/' ) )
{
std::memcpy( id, buffer, temp - buffer );
id[temp - buffer] = '\0';
std::strcpy( title, temp + 1 );
}
else
{
std::strcpy( id, buffer );
}
As for memcpy in this statement
memcpy(id, buffer, temp - buffer);
it copies temp - buffer bytes from buffer to id. Because id was previously set to all zeroes, after the memcpy it contains a zero-terminated string.
Your question concerns pointer-difference calculation, one of the operations that make up pointer arithmetic.
Most beginners don't have too much trouble grasping how pointer-addition works. Given this:
char buffer[256];
char *p = buffer + 10;
it is usually clear that p points to the slot at index 10 in the buffer char array. But you need to remember that the pointer type is important. The same construct you see above also works for more complicated data types:
struct Something
{
char name[128];
int ident;
int supervisor;
} people[64];
struct Something *p = people+10; // NOTE: same line, different types
Just as before, p points to the element at index 10 in the array, but note the arithmetic: the size of the underlying type is used to calculate the relevant memory offset. You don't need to do it yourself. No sizeof required here.
So why do you care? Because just like regular math, pointer math has certain properties, one of them being the following:
char buffer[256];
char *p = buffer+10; // p addresses the slot at index 10 in the array
size_t len = p-buffer; // len is the typed difference between p and buffer.
In this case, len will be 10, the same as the offset of p. So how does this relate to your question? Well...
char* temp = strchr(buffer, '/');
memcpy(id, buffer, temp - buffer);
The horrid nature of this code aside (if there is no '/' in the buffer array, temp is NULL and the ensuing memcpy will all but guarantee a segfault), this code finds the location in the string where '/' resides. Once it has that, the calculation temp - buffer uses pointer arithmetic (specifically pointer differencing) to calculate the distance between the address in temp and the address at the base of the array. The result is the element count, not including the slash itself. Therefore this code copies up to, but not including, the discovered slash into the id buffer. The rest of the id buffer retains the 0 values written by the memset, and therefore the string is terminated (which is way more work than you need to do, btw).
After that line, the remainder:
temp++;
strcpy(title, temp);
post-increments the temp pointer, which says "move to the next element in the array". Then the strcpy copies the remaining chars of the null-terminated buffer string into title. Worth noting this could have simply been:
strcpy(title, ++temp);
And likewise:
strcpy(title, temp+1);
which retains temp at the '/' position. In all of the above, the result in title will be the same: all chars after the slash, but not including it.
I hope that explains what is going on. Best of luck.
MAX_SIZE+1 is reserving space for the null terminator at the end of the string ('\0')
memcpy(id, buffer, temp - buffer)
This is copying (temp-buffer) bytes from buffer to id. Since strchr finds the '/' character in the input, temp is pointing inside buffer (assuming it's found). So for example assume buffer points to a location in memory:
buffer = 0x781230001
and the third byte is the '/', after strchr, you have
temp = 0x781230003
temp - buffer therefore is 2.
HOWEVER: If the '/' is not found, strchr returns NULL, so temp will be NULL and the code will most likely crash. You should check the result of strchr before doing the pointer arithmetic.
Here you find the position of the first / in buffer:
char* temp = strchr(buffer, '/');
Now temp points to the / in buffer. To copy the part of buffer before it, a pointer to the start and the length of the string are enough, and temp - buffer evaluates to that length.
=================================
The cat is fat/And likes to sing
=================================
^             ^
buffer        temp
|   length    | = temp - buffer
The end of a null-terminated string is marked by \0 (or simply 0). So if you need to store N chars, you need to allocate a buffer of size N+1.
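Putting the answers together, the split can be written with the NULL check included. This is a sketch only: split_at_slash is a hypothetical name, and std::string is used for the two halves instead of the id and title arrays from the question.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical helper: splits `line` at the first '/'.
// Returns {text before the slash, text after the slash}; if there is
// no slash, the whole line goes into the first part.
std::pair<std::string, std::string> split_at_slash(const char *line)
{
    const char *slash = std::strchr(line, '/');
    if (slash == nullptr) {
        return {std::string(line), std::string()};
    }
    // Pointer difference: number of chars between the start and the slash.
    const std::size_t length = static_cast<std::size_t>(slash - line);
    return {std::string(line, length), std::string(slash + 1)};
}
```

For the sample line, the pointer difference is 14, which is exactly the length of "The cat is fat".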

Dynamic memory allocation to char array

I have given the array size manually as below:
int main(int argc, char *argv[] )
{
char buffer[1024];
strcpy(buffer,argv[1]);
...
}
But if the data passed in the argument exceeds this size, it will cause problems.
Is this the correct way to allocate memory dynamically?
int main(int argc, char *argv[] )
{
int length;
char *buffer;
length = sizeof(argv[1]); //or strlen(argv[1])?
buffer = (char*)malloc(length*sizeof(char *));
...
}
sizeof(argv[1]) tells you the size of a char* pointer, not the length of the string. You want strlen instead:
if (argc < 2) {
printf("Error - insufficient arguments\n");
return 1;
}
length=strlen(argv[1]);
buffer = (char*)malloc(length+1); // cast required for C++ only
I've suggested a few other changes here:
You need to add an extra byte to buffer for the null terminator.
You should check that the user passed in an argv[1].
sizeof(char *) is incorrect when calculating the storage required for a string. A C string is an array of chars, so you need sizeof(char), which is guaranteed to be 1, so you don't need to multiply by it.
Alternatively, if you're running on a POSIX-compatible system, you could simplify things and use strdup instead:
buffer = strdup(argv[1]);
Finally, make sure to free this memory when you're finished with it
free(buffer);
The correct way is to use std::string and let C++ do the work for you
#include <string>
int main(int argc, char *argv[])
{
if (argc < 2) return 1; // no argument given
std::string buffer = argv[1];
}
but if you want to do it the hard way then this is correct
int main(int argc, char *argv[])
{
if (argc < 2) return 1; // no argument given
size_t length = strlen(argv[1]);
char* buffer = (char*)malloc(length + 1);
}
Don't forget to +1 for the null terminator used in C style strings.
In C++, you can do this to get your arguments in a nice data structure.
const std::vector<std::string> args(argv, argv + argc);
length = strlen(argv[1]); // not sizeof(argv[1])
and
//extra byte of space is to store Null character.
buffer = (char*)malloc((length+1) * sizeof(char));
Since sizeof(char) is always one, you can also use this:
buffer = (char*)malloc(length+1);
Firstly, if you use C++, I think it's better to use new instead of malloc.
Secondly, your malloc size is wrong. It should be buffer = (char*)malloc(sizeof(char) * length); because you are allocating a buffer of char, not of char*.
Thirdly, you must allocate 1 more byte for the end of your string and store '\0' there.
Finally, sizeof only gives the size of the type, not of a string; you must use strlen to get the string length.
You need to add an extra byte to hold the terminating null byte of the string, and measure the string with strlen rather than sizeof:
length=strlen(argv[1]) + 1;
Then it should be OK.
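The new-based approach mentioned in the answers above can be sketched with std::unique_ptr<char[]>, which releases the memory automatically. The function name duplicate_c_string is hypothetical and only serves as an illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: returns a heap-allocated copy of the C string `src`.
// The unique_ptr calls delete[] when it goes out of scope.
std::unique_ptr<char[]> duplicate_c_string(const char *src)
{
    const std::size_t length = std::strlen(src);        // chars, not sizeof
    std::unique_ptr<char[]> copy(new char[length + 1]); // +1 for '\0'
    std::memcpy(copy.get(), src, length + 1);           // copies the '\0' too
    return copy;
}
```

In main, after checking argc, this could be called as duplicate_c_string(argv[1]); no explicit free or delete[] is needed afterwards.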

Copy a multi-string into a buffer

I'm using a Windows API that returns a wide-char multi-string as its result. The result looks like this:
L"apple\0banana\0orange\0\0"
Is there any standard function or good performance solution to copy this structure to a buffer?
copy_wide_char_multi_string(dst, src); // dst and src are wchar_t arrays
I've never bothered to work with wide character strings so consider this a guideline.
You can implement an algorithm like the following:
const wchar_t * wide_string = L"something\0something else\0herp\0derp\0\0";
size_t size = 0;
while (true)
{
size_t i = wcslen(wide_string + size); // length of the current string
size += i + 1; // advance past the string and its null terminator
if (i == 0) break; // an empty string (2 nulls in a row) ends the list
}
// size now counts every character, including the final terminating null
This will give you the size of the data in the buffer. Once you have that, you can just use wmemcpy on it.
You seem to know the size of the original array already, so create another wchar_t array of the same size for the clone and simply use std::copy:
std::copy(original, original+size, cloned);
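A sketch that combines both answers: measure the multi-string first, then copy it in one step. The function name copy_multi_string is hypothetical, and the source is assumed to be properly terminated by an empty string (a double null).

```cpp
#include <cstddef>
#include <cwchar>
#include <vector>

// Hypothetical helper: copies a double-null-terminated multi-string
// into a vector, including every terminator.
std::vector<wchar_t> copy_multi_string(const wchar_t *src)
{
    std::size_t size = 0;
    for (;;) {
        const std::size_t len = std::wcslen(src + size);
        size += len + 1;     // step over the string and its terminator
        if (len == 0) break; // an empty string ends the list
    }
    return std::vector<wchar_t>(src, src + size);
}
```

Returning a std::vector means the destination is sized automatically, so the caller does not have to guess a buffer length up front.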