Reading data from a file; first packet is gibberish - c++

I am trying to read 1244 bytes at a time from a file. Essentially, the idea is to segment the 100KB worth of data into packets. So the approach I am taking is, assigning all the data to an array and then creating an array of pointers which will contain starting positions to each of my packets. The pointer array contains values [0, 1244, 2488, and so on].
It works perfectly fine, except my first assignment is gibberish. k[0] and o[0] both come up with garbage while the remaining 79 values seem to be fine. Can anyone assist?
I realize the first argument to the fread command should be a pointer, but this worked also. Also, I need the pointers to the starting of each of my packets because I am doing other function calls (omitted from code) that format the packet properly with the appropriate headers.
It's been a while since I coded in C/C++, so any optimizations you could suggest would be much appreciated.
int main(int argc, const char * argv[])
{
FILE *data;
int size; int i;
int paySize = 1244;
//int hdrSize = 256;
data = fopen("text2.dat","r");
//get data size
fseek(data, 0, SEEK_END);
size = ftell(data);
rewind (data);
char k[size]; //initializing memory location for all the data to be read in.
fread(k, 1, size, data); //reading in data
int temp = ceil(size/paySize);
char * o[temp]; //array of pointers to beginning of each packet.
int q = 0;
for (i = 0; i < size; i = i+paySize)
{
o[q] = &k[i];
q++;
}
cout << o[0] << endl; //this outputs gibberish!

cout << o[0] << endl;
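does not print one packet. Because o[0] is a char *, the stream treats it as a C string and prints characters starting at that address until it reaches a '\0' byte. k is never null-terminated, so the output can run through the whole buffer and beyond. To print the single character stored at this address use: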
cout << *o[0] << endl;
Here:
char k[size];
char * o[temp];
o[q] = &k[i];
you assign pointers to characters to o[], so dereferencing such a pointer results in a single char.
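A minimal sketch of one way to keep the packet boundaries explicit, assuming the whole file has already been read into a std::vector<char>. The names Packet, splitIntoPackets, and printPacket are hypothetical and do not come from the question. Each packet stores a start pointer plus a length, so nothing relies on a '\0' terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <vector>

// Hypothetical helper type: where a packet starts and how many bytes it holds.
struct Packet {
    const char* start;
    std::size_t length;
};

// Split `data` into chunks of at most `paySize` bytes.
// The last chunk is shorter when the size is not a multiple of paySize.
std::vector<Packet> splitIntoPackets(const std::vector<char>& data, std::size_t paySize) {
    assert(paySize > 0);  // a zero payload size would never advance the offset
    std::vector<Packet> packets;
    for (std::size_t offset = 0; offset < data.size(); offset += paySize) {
        const std::size_t remaining = data.size() - offset;
        packets.push_back({data.data() + offset, remaining < paySize ? remaining : paySize});
    }
    return packets;
}

// Print exactly one packet. write() takes a byte count, so no '\0' is needed.
void printPacket(std::ostream& out, const Packet& p) {
    out.write(p.start, static_cast<std::streamsize>(p.length));
}
```

With a 3000-byte buffer and a payload size of 1244, splitIntoPackets returns three packets of 1244, 1244, and 512 bytes. The pointers stay valid only as long as the vector is neither resized nor destroyed.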

Related

Why does iterating over a hex array stop early?

Problem and Code
I am working with code to take a screenshot on a Raspberry Pi. Using some magic from the VC handler, I can take a screenshot and store it in memory with calloc. I can use this to store the data in a file as a ppm image with the requisite header using:
void * image;
image = calloc(1, width * 3 * height);
// code to store data into *image
FILE *fp = fopen("myfile.ppm", "wb");
fprintf(fp, "P6\n%d %d\n255\n", width, height);
fwrite(image, width*3*height, 1, fp);
fclose(fp);
This successfully stores the data. I can access it and view it normally.
However, if I instead try to inspect the data which are being put into the file for debugging purposes by printing:
int cnt = 0;
std::string imstr = (char *)image;
for (int i=0; i<(width*3*height); i++) {
std::cout << (int)imstr[i] << " " << cnt << std::endl;
cnt += 1;
}
I segfault early. The numbers which are returned in the print make sense for the context (e.g. color values <255)
Example Numbers
In the case of a 1280 x 768 x 3 image, my cnt stops at 64231. The value it stops at doesn't seem to have any relation to the sizeof char or int.
I think I'm missing something obvious here, but I can't see it. Any suggestions?
Very probably there is at least one null character in (char *)image, so the std::string is shorter than width*3*height: when it is initialized this way, only the characters up to the first null character are used.
Use a std::array rather than a std::string initialized like that.
The way you are converting the image data to a std::string is wrong. If the image's raw data contains any 0x00 bytes then the std::string will be truncated, causing your loop to access out of bounds of the std::string. And if the image's raw data does not contain any 0x00 bytes then the std::string constructor will try to read past the bounds of the image's allocated memory.
You need to take the image's size into account when constructing the std::string, eg:
size_t cnt = 0;
std::string imstr(static_cast<char*>(image), width*3*height);
for (size_t i = 0; i < imstr.size(); ++i) {
std::cout << static_cast<int>(imstr[i]) << " " << cnt << std::endl;
++cnt;
}
Otherwise, simply don't convert the image to std::string at all. You can iterate the image's raw data directly instead, eg:
size_t cnt = 0, imsize = width*3*height;
char *imdata = static_cast<char*>(image);
for (size_t i = 0; i < imsize; ++i) {
std::cout << static_cast<int>(imdata[i]) << " " << cnt << std::endl;
++cnt;
}
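The difference between the two std::string constructors can be checked on a tiny buffer. This is only an illustrative sketch: lengthAsCString and lengthWithCount are made-up helper names, and the six-byte array mentioned afterwards stands in for the image data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds the string the way the question does: copying stops at the first '\0'.
// Only safe when the buffer really contains a '\0' somewhere.
std::size_t lengthAsCString(const char* raw) {
    assert(raw != nullptr);
    return std::string(raw).size();
}

// Builds the string with an explicit byte count: embedded '\0' bytes are kept.
std::size_t lengthWithCount(const char* raw, std::size_t count) {
    assert(raw != nullptr);
    return std::string(raw, count).size();
}
```

For the bytes {10, 20, 0, 30, 40, 0}, lengthAsCString reports 2 while lengthWithCount with a count of 6 reports 6, which is the same kind of truncation that cuts the loop in the question short.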

What's going on with my memory?

I have a function that allocated a buffer for the size of a file with
char *buffer = new char[size_of_file];
Then I loop over the buffer and copy some of the bytes into a subbuffer to work with smaller units of it.
char *subbuffer = new char[size+1];
for (int i =0; i < size; i++) {
subbuffer[i] = (buffer + cursor)[i];
}
Next I call a function and pass it this subbuffer, an arbitrary cursor for a location in the subbuffer, and the size of the text to be extracted.
wchar_t* FileReader::getStringForSizeAndCursor(int32_t size, int cursor, char *buffer) {
int wlen = size/2;
#if MARKUP_SIZEOFWCHAR == 4 // sizeof(wchar_t) == 4
uint32_t *dest = new uint32_t[wlen+1];
#else
uint16_t *dest = new uint16_t[wlen+1];
#endif
char *bcpy = new char[size];
memcpy(bcpy, (buffer + cursor), size+2);
unsigned char *ptr = (unsigned char *)bcpy; //need to be careful not to read outside the buffer
for(int i=0; i<wlen; i++) {
dest[i] = (ptr[0] << 8) + ptr[1];
ptr += 2;
}
//cout << "size:: " << size << " wlen:: " << wlen << " c:: " << c << "\n";
dest[wlen] = ('\0' << 8) + '\0';
return (wchar_t *)dest;
}
I store this in a value as the property of a struct whilst looping through the file.
My issue seems to be that when I free subbuffer and then start reading the title properties of my structs by looping over an array of struct pointers, my app segfaults. GDB tells me it finished normally, though, but a bunch of the records that I cout are missing.
I suspect this has to do with function scope or something. I thought the memcpy in getStringForSizeAndCursor would fix the segfault, since it copies the bytes out of subbuffer before I free it. Right now I would expect those copies to be cleaned up by my struct destructor, but either things are being destroyed before I expect or some memory is still pointing to the original subbuffer. If I let subbuffer leak I get back the data I expected, but this is not a solution.
The only definite error I can see in your question's code is the too small allocation of bcpy, where you allocate a buffer of size size and promptly copy size+2 bytes to the buffer. Since you're not using the extra 2 bytes in the code, just drop the +2 in the copy.
Besides that, I can only see one suspicious thing. You're doing:
char *subbuffer = new char[size+1];
and copying size bytes to the buffer. The allocation hints that you're allocating extra memory for a zero termination, but either it shouldn't be there at all (no +1) or you should allocate 2 bytes (since your function hints at a double-byte character set). Either way, I can't see you zero-terminating it, so using it as a zero-terminated string will probably break.
@Grizzly in the comments has a point too: allocating and handling memory for strings and wstrings is probably something you could "offload" to the STL with good results.
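As a rough sketch of what offloading the memory handling could look like, the function below decodes big-endian 16-bit units into a std::u16string that owns its own storage. The name decodeBigEndian16 is hypothetical, it returns std::u16string rather than the wchar_t * of the original, and it assumes that cursor and size stay inside the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for getStringForSizeAndCursor: decodes `size` bytes of
// big-endian 16-bit code units starting at buffer + cursor.
// The caller must guarantee that cursor + size does not exceed the buffer.
std::u16string decodeBigEndian16(const char* buffer, std::size_t cursor, std::size_t size) {
    assert(buffer != nullptr);
    const unsigned char* ptr = reinterpret_cast<const unsigned char*>(buffer + cursor);
    std::u16string result;
    result.reserve(size / 2);
    // Combine each pair of bytes; a trailing odd byte is ignored.
    for (std::size_t i = 0; i + 1 < size; i += 2) {
        result.push_back(static_cast<char16_t>((ptr[i] << 8) | ptr[i + 1]));
    }
    return result;
}
```

Because the returned string copies the decoded units, the source buffer can be freed right after the call, and there is no separate terminator or delete[] to keep track of.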

How to create a jagged string array in c++?

I want to create jagged character two dimensional array in c++.
int arrsize[3] = {10, 5, 2};
char** record;
record = (char**)malloc(3);
cout << endl << sizeof(record) << endl;
for (int i = 0; i < 3; i++)
{
record[i] = (char *)malloc(arrsize[i] * sizeof(char *));
cout << endl << sizeof(record[i]) << endl;
}
I want to use record[0] for the name (10 letters), record[1] for the marks (5 digits), and record[2] for the ID (2 digits). How can I implement this? I write the record array directly to a binary file. I don't want to use a struct or class.
In C++ it would look like this:
std::vector<std::string> record;
Why would you not use a struct when it is the sensible solution to your problem?
struct record {
char name[10];
char mark[5];
char id[2];
};
Then writing to a binary file becomes trivial:
record r = get_a_record();
write( fd, &r, sizeof r );
Notes:
You might want to allocate a bit of extra space for NUL terminators, but this depends on the format that you want to use in the file.
If you are writing to a binary file, why do you want to write mark and id as strings? Why not store an int (4 bytes, greater range of values) and an unsigned char (1 byte)?
If you insist on not using a user defined type (really, you should), then you can just create a single block of memory and use pointer arithmetic, but beware that the binary generated by the compiler will be the same, the only difference is that your code will be less maintainable:
char record[ 10+5+2 ];
// copy name to record
// copy mark to record+10
// copy id to record+15
write( fd, record, sizeof record);
Actually the right pattern “to malloc” is:
T * p = (T *) malloc(count * sizeof(T));
where T could be any type, including char *. So the right code for allocating memory in this case is like that:
int arrsize[3] = { 10, 5, 2 };
char** record;
record = (char**) malloc(3 * sizeof(char *));
cout << sizeof(record) << endl;
for (int i = 0; i < 3; ++i) {
record[i] = (char *) malloc(arrsize[i] * sizeof(char));
}
I removed the cout of sizeof(record[i]) because it always yields the size of one pointer to char (4 on my laptop). sizeof is evaluated at compile time and has no idea how much memory was allocated at run time for the block that record[i] (which is really just a pointer of type char *) points to.
malloc(3) allocates 3 bytes. Your jagged array would be an array containing pointers to character arrays. Each pointer usually takes 4 bytes (on a 32-bit machine), but more correctly sizeof(char*), so you should allocate using malloc(3 * sizeof(char*) ).
And then record[i] = (char*)malloc((arrsize[i]+1) * sizeof(char)), because a string is a char* and a character is a char, and because each C-style string is conventionally terminated with a '\0' character to indicate its length. You could do without it, but it would be harder to use for instance:
strcpy(record[0], name);
sprintf(record[1], "%0.2f", mark);
sprintf(record[2], "%d", id);
to fill in your record, because sprintf puts in a \0 at the end. I assumed mark was a floating-point number and id was an integer.
As regards writing all this to a file, if the file is binary why put everything in as strings in the first place?
Assuming you do, you could use something like:
ofstream f("myfile",ios_base::out|ios_base::binary);
for (int i=0; i<3; i++)
f.write(record[i], arrsize[i]);
f.close();
That being said, I second Anders' idea. If you use STL vectors and strings, you won't have to deal with ugly memory allocations, and your code will probably look cleaner as well.
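A possible sketch of the vector-and-string route, assuming fixed field widths of 10, 5, and 2 bytes as in the question. The function name packRecord and the choice to pad short fields with NUL bytes are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pads or truncates each field to its fixed width and concatenates the result.
// The returned string is the exact byte sequence to write to the binary file.
std::string packRecord(const std::vector<std::string>& fields,
                       const std::vector<std::size_t>& widths) {
    assert(fields.size() == widths.size());
    std::string packed;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        std::string field = fields[i].substr(0, widths[i]);  // truncate long fields
        field.resize(widths[i], '\0');                        // pad short fields
        packed += field;
    }
    return packed;
}
```

The packed bytes can then be written in one call, for example with write(packed.data(), packed.size()) on an ofstream opened with ios_base::binary, and no malloc or free is involved.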

sizeof continues to return 4 instead of actual size

#include <iostream>
using namespace std;
int main()
{
cout << "Do you need to encrypt or decrypt?" << endl;
string message;
getline(cin, message);
int letter2number;
for (int place = 1; place < sizeof(message); place++)
{
letter2number = static_cast<int>(message[place]);
cout << letter2number << endl;
}
}
Examples of problem: I type fifteen letters but only four integers are printed. I type seven letters but only four integers are printed.
The loop only occurs four times on my computer, not the number of characters in the string.
This is the only problem I am having with it, so if you see other errors, please don't tell me. (It is more fun that way.)
Thank you for your time.
sizeof returns the size in bytes of a type, or of the type of an expression. For you, that's a std::string, and for your implementation of std::string, that's four. (Probably a pointer to the buffer, internally.)
But you see, that buffer is only pointed to by the string, it has no effect on the size of the std::string itself. You want message.size() for that, which gives you the size of the string being pointed to by that buffer pointer.
As the string's contents change, what that buffer pointer points to changes, but the pointer itself is always the same size.
Consider the following:
struct foo
{
int bar;
};
At this point, sizeof(foo) is known; it's a compile-time constant. It's the size of an int along with any additional padding the compiler might add.
You can let bar take on any value you want, and the size stays the same because what bar's value is has nothing to do with the type and size of bar itself.
You want to use message.size() not sizeof(message).
sizeof just gives the number of bytes in the data type or expression. You want the number of characters stored in the string which is given by calling size()
Also indexing starts at 0, notice I changed from 1 to 0 below.
for (int place = 0; place < message.size(); place++)
{
letter2number = static_cast<int>(message[place]);
cout << letter2number << endl;
}
A pointer on a 32-bit x86 system is only 4 bytes (8 on x86-64), even if it points to the first element of a heap array that contains 100 elements.
Example:
char * p = new char[5000];
assert(sizeof(p) == 4); // 4 on a 32-bit build; 8 on a 64-bit build
Wrapping p in a class or struct will give you the same result assuming no padding.
class string
{
char * ptr;
//...
size_t size(); // return number of chars (until null) in buffer pointed to by ptr
};
sizeof(message) == sizeof(string) == sizeof(ptr) == 4; // size of the struct
message.size() == number of characters in the message...
sizeof returns the size of the type, not the number of characters the object holds. Use the length() (or size()) method to find the length of the string.
#include <iostream>
#include <string>
using namespace std;
int main()
{
cout << "Do you need to encrypt or decrypt?" << endl;
string message;
getline(cin, message);
int letter2number;
// size() is the number of characters; indexing starts at 0
for (size_t place = 0; place < message.size(); place++)
{
letter2number = static_cast<int>(message[place]);
cout << letter2number << endl;
}
return 0;
}
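To make the distinction concrete, here is a small sketch that collects the character codes by using size() as the loop bound. The helper name toCodes is invented for this example, and the codes quoted afterwards assume an ASCII-compatible execution character set.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the integer code of every character in `message`.
// The loop bound is message.size(), the number of stored characters,
// not sizeof(message), which depends only on the std::string type.
std::vector<int> toCodes(const std::string& message) {
    std::vector<int> codes;
    for (std::size_t place = 0; place < message.size(); ++place) {
        codes.push_back(static_cast<int>(message[place]));
    }
    assert(codes.size() == message.size());  // one code per character
    return codes;
}
```

toCodes("abc") yields 97, 98, 99 on an ASCII system, while sizeof gives the same value for a two-character string and a thousand-character string.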

C++ qsort array of pointers not sorting

I am trying to sort a buffer full of variable-length records alphabetically in C++. I previously asked how to implement this, and was told to sort an array of pointers to the records. I set up an array of pointers, but realized that each pointer points to the beginning of a record, but there is no way of it knowing when the record stops. When I try to print out the record pointed to by each pointer in the array, therefore, for each pointer, I get the entire buffer of all records, starting from the one pointed to. (For example, if the buffer holds "Helloworld", and there is a pointer at each letter, printing the array of pointers would produce "Helloworldelloworldlloworldloworldoworldworldorldrldldd".) Obviously, this is not what I want; also, the qsort does not seem to be working on the array of pointers either. When I debug, the memory spaces pointed to by the pointers seem to hold very odd characters that are definitely not part of the ascii character set and were not included in my input file. I am very confused. Below is my code; how can I do this without getting the odd results I get now? Thank you so much, bsg.
int _tmain(int argc, _TCHAR* argv[])
{
//allocate memory for the buffer
buff = (unsigned char *) malloc(2048);
realbuff = (unsigned char *) malloc(NUM_RECORDS * RECORD_SIZE);
fp = fopen("postings0.txt", "r");
if(fp)
{
fread(buff, 1, 2048, fp);
/*for(int i=0; i <30; i++)
cout << buff[i] <<endl;*/
int y=0;
//create a pointer to an array of unsigned char pointers
unsigned char *pointerarray[NUM_RECORDS];
//point the first pointer in the pointer array to the first record in the buffer
pointerarray[0] = &buff[0];
int recordcounter = 1;
//iterate through each character in the buffer;
//if the character is a line feed (denoting a new record),
// point the next pointer in the pointer array to the next
//character in the buffer (that is, the start of the next record)
for(int i=0;i <2048; i++)
{
if(buff[i] == char(10))
{
pointerarray[recordcounter] = &buff[i+1];
recordcounter++;
}
}
//the actual qsort (NUM_RECORDS is a constant declared above; omitted here)
qsort(pointerarray, NUM_RECORDS, sizeof(char*), comparator);
}
else
cout << "sorry";
cout << sizeof(pointerarray)/sizeof(char*);
for(int k=0; k < sizeof(pointerarray)/sizeof(char*);k++)
{
cout << pointerarray[k];
}
int comparator(const void * elem1, const void * elem2)
{
//iterate through the length of the first string
while(*firstString != char(10))
{
return(strcmp(firstString, secondString));
firstString++;
secondString++;
/
}
return 0;
}
I'm guessing the problem is in your comparator function (which doesn't compile as posted).
qsort gives a pointer to the array element to the comparator function. In your case that would be a pointer to the char* stored in the array.
The man page for qsort gives this example:
static int
cmpstringp(const void *p1, const void *p2)
{
/* The actual arguments to this function are "pointers to
pointers to char", but strcmp(3) arguments are "pointers
to char", hence the following cast plus dereference */
return strcmp(* (char * const *) p1, * (char * const *) p2);
}
int
main(int argc, char *argv[])
{
int j;
assert(argc > 1);
qsort(&argv[1], argc - 1, sizeof(char *), cmpstringp);
for (j = 1; j < argc; j++)
puts(argv[j]);
exit(EXIT_SUCCESS);
}
This question basically comes down to 'how do you know the length of your variable-length record.' There needs to be some way to tell, either from the record itself, or from some other data.
One way is to use pointer/length pairs to refer to records: a pointer to the beginning of the record and a length (int or size_t), which you store together in a struct. With C++ you can use std::pair, or with C define a little struct. You can then use qsort on an array of these.
In your case, you can tell the length by looking for a char(10), as you always use them to terminate your strings. You need a custom comparison (strcmp won't work -- it expects NUL terminators) that is aware of this.
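One way such a comparison might look is sketched below. The name compareRecords is hypothetical, and the sketch assumes that every record ends with a line feed or that the buffer ends with a '\0'. It also assumes that only the pointers that were actually filled in are passed to qsort, so the element count would be recordcounter rather than NUM_RECORDS.

```cpp
#include <cassert>
#include <cstdlib>

// Compares two records that end at '\n' (char(10)) or '\0'.
// qsort passes pointers to the array elements, i.e. pointers to the
// unsigned char* values, so each argument is dereferenced once.
int compareRecords(const void* elem1, const void* elem2) {
    const unsigned char* a = *static_cast<const unsigned char* const*>(elem1);
    const unsigned char* b = *static_cast<const unsigned char* const*>(elem2);
    assert(a != nullptr && b != nullptr);
    // Skip the common prefix, stopping at the end of the first record.
    while (*a != '\n' && *a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    const bool aEnds = (*a == '\n' || *a == '\0');
    const bool bEnds = (*b == '\n' || *b == '\0');
    if (aEnds && bEnds) return 0;  // same length, same content
    if (aEnds) return -1;          // the shorter record sorts first
    if (bEnds) return 1;
    return (*a < *b) ? -1 : 1;     // first differing byte decides
}
```

For a buffer holding "pear\napple\nfig\n" and three pointers to the record starts, qsort with this comparator orders them as apple, fig, pear. Printing a sorted record then also has to stop at the line feed instead of streaming the raw pointer.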