char pointer file I/O - c++

I am struggling to understand the behaviour of this program. The file test.txt is only 16 bytes, whereas the text pointed to by the structure's pointer p is longer than 16 bytes on its own, and the structure holds other int values too. How is the extra data stored in a file of only 16 bytes? The file is read into another structure, b, which nevertheless gives the same correct values as a.
int main()
{
string text("C:\\Users\\Chitra\\Desktop\\Capture.JPG");
string filepath("C:\\Users\\Chitra\\Desktop\\New folder\\Project1\\Project1\\test.txt");
fstream fout(filepath,ios::out|ios::binary);
fstream fin(filepath,ios::in|ios::binary);
struct block
{
int value;
int size;
const char* p;
int some;
};
block a;
a.value = 1457745;
a.size = text.length();
a.p = text.c_str();
a.some = 97877;
fout.write((char*)&a, sizeof(a));
fout.close();
block b;
fin.read((char*)&b, sizeof(b));
fin.seekg(0, ios::end);
cout << "file size " << fin.tellg();
fin.close();
cout << "\nsize a " << sizeof(a) << " size b " << sizeof(b);
cout << "\n"<<b.value << " " << b.size << " " << b.p << " " << b.some;
getchar();
return 0;
}

The file is only 16 bytes because when you write a.p to the file, you are writing the pointer value itself (four bytes in this build), which is just the address of the string in memory; you are not writing out the string data.
The reason that it appears to work is that you immediately read the same pointer value back into another pointer variable in the same process, so when you read the memory pointed to by b.p, it is the same as the memory pointed to by a.p, that is, the NUL-terminated string text.
If you created a second program which just did the file read, without declaring the string text, you should expect to encounter an error or at least, not see the text.
int main()
{
string filepath("C:\\Users\\Chitra\\Desktop\\New folder\\Project1\\Project1\\test.txt");
fstream fin(filepath,ios::in|ios::binary);
struct block
{
int value;
int size;
const char* p;
int some;
};
block b;
fin.read((char*)&b, sizeof(b));
fin.seekg(0, ios::end);
cout << "file size " << fin.tellg();
fin.close();
cout << "\n size b " << sizeof(b);
cout << "\n"<<b.value << " " << b.size << " " << b.p << " " << b.some;
getchar();
return 0;
}

As Sam Mikes said, block.p is a pointer and you only save its value (the value of a pointer is the address of the memory it points to). Because a.p still points to that memory, and the memory has not changed, b.p appears to work. Free the string's memory after writing, as in the code below, and see the difference.
*Note: sizeof a, sizeof b and sizeof(block) are all the same, so you can use any of them interchangeably.
#include <string>
#include<iostream>
#include <fstream>
using namespace std;
int main(){
string text("C:\\Users\\Chitra\\Desktop\\Capture.JPG");
string filepath("C:\\Users\\Chitra\\Desktop\\New folder\\Project1\\Project1\\test.txt");
fstream fout(filepath,ios::out|ios::binary);
fstream fin(filepath,ios::in|ios::binary);
struct block {
int value;
int size;
const char* p;
int some;
};
block a;
a.value =1457745;
a.size=text.length();
a.p=text.c_str();
a. some=97877;
fout.write((char*)&a,sizeof(a));
fout.close();
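// Release text's buffer so the saved address dangles. (Calling delete on a
// pointer returned by c_str() would itself be undefined behaviour.)
string().swap(text);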
block b;
fin.read((char*)&b,sizeof(b));
fin.seekg(0,ios::end);
cout<<"file size "<<fin.tellg();
fin.close();
cout<<"\nsize a "<<sizeof(a)<<" size b "<<sizeof(b)<<" size block "<<sizeof (block);
cout<<"\n"<<b.value<<" "<<b.size<<" "<<b.p<<" "<<b.some;
getchar();
return 0;
}

You can't just do...
fout.write(pointer, bytes);
...for any values of pointer and bytes, because the data doesn't exist in contiguous memory. You must write the data embedded in the block (without the pointer) and the pointed-to string data separately. That's easiest if you move the pointer to the end of the structure, then write:
struct block
{
int value;
int some;
int size;
const char* p;
};
fout.write((char*)&a, sizeof a - sizeof a.p);
// could also "offsetof(block, p)" for size - possibly more fragile
fout.write(a.p, a.size);
Then, to read the data back:
block b;
fin.read((char*)&b, sizeof b - sizeof b.p);
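char* buf = new char[b.size + 1]; // b.p is a const char*, so read into a writable buffer
fin.read(buf, b.size);
buf[b.size] = '\0'; // guarantee NUL termination (the last valid index is b.size)
b.p = buf;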
Sometime later you'll need to delete[] b.p; to deallocate the memory returned by the new[]. Alternatively, you could use another string:
block b;
fin.read((char*)&b, sizeof b - sizeof b.p);
std::string b_string(b.size, ' '); // initially b.size spaces
fin.read(&b_string[0], b.size); // overwrite with data from the file...
With the std::string, deallocation happens automatically when the destructor is invoked.
(It's actually best to use a C++ serialisation library with std::fstreams, such as Boost's.)

Related

How to read memcpy struct result via a pointer

I want to copy a struct's contents into memory via char* pc and then print it back, but here I get an exception (a read access violation):
struct af {
bool a;
uint8_t b;
uint16_t c;
};
int main() {
af t;
t.a = true;
t.b = 3;
t.c = 20;
char* pc = nullptr;
memcpy(&pc, &t, sizeof(t));
std::cout << "msg is " << pc << std::endl; // here the exception
return 0;
}
Then I want to recover the data from memory into another structure of the same type.
I did af* tt = (af*)(pc); then tried to access tt->a, but always got an exception.
You need to allocate memory before you can copy something into it. Also, pc is already the pointer, so you need not take its address again. Moreover, the byte representation is very likely to contain non-printable characters. To see the actual effect, the following copies from the buffer back to an af and prints its members (note that a cast is needed to prevent std::cout from interpreting the uint8_t as a character):
#include <iostream>
#include <cstring>
#include <cstdint> // uint8_t, uint16_t
struct af {
bool a;
uint8_t b;
uint16_t c;
};
int main() {
af t;
t.a = true;
t.b = 3;
t.c = 20;
char pc[sizeof(af)];
std::memcpy(pc, &t, sizeof(t)); // array pc decays to pointer to first element
for (std::size_t i = 0; i < sizeof(af); ++i) {
std::cout << i << " " << pc[i] << "\n";
}
af t2;
std::memcpy(&t2, pc,sizeof(t));
std::cout << t2.a << " " << static_cast<unsigned>(t2.b) << " " << t2.c;
}
Output:
0
1
2
3
1 3 20
Note that I replaced the output of pc with a loop that prints individual characters, because the binary representation might contain null terminators and pc is not a null terminated string. If you want it to be a null-terminated string, it must be of size sizeof(af) +1 and have a terminating '\0'.

Placement new and aligning for possible offset memory

I've been reading up on placement new, and I'm not sure if I'm "getting" it fully or not when it comes to proper alignment.
I've written the following test program to attempt to allocate some memory to an aligned spot:
#include <iostream>
#include <cstdint>
using namespace std;
unsigned char* mem = nullptr;
struct A
{
double d;
char c[5];
};
struct B
{
float f;
int a;
char c[2];
double d;
};
void InitMemory()
{
mem = new unsigned char[1024];
}
int main() {
// your code goes here
InitMemory();
//512 byte blocks to write structs A and B to, purposefully misaligned
unsigned char* memoryBlockForStructA = mem + 1;
unsigned char* memoryBlockForStructB = mem + 512;
unsigned char* firstAInMemory = (unsigned char*)(uintptr_t(memoryBlockForStructA) + uintptr_t(alignof(A) - 1) & ~uintptr_t(alignof(A) - 1));
A* firstA = new(firstAInMemory) A();
A* secondA = new(firstA + 1) A();
A* thirdA = new(firstA + 2) A();
cout << "Alignment of A Block: " << endl;
cout << "Memory Start: " << (void*)&(*memoryBlockForStructA) << endl;
cout << "Starting Address of firstA: " << (void*)&(*firstA) << endl;
cout << "Starting Address of secondA: " << (void*)&(*secondA) << endl;
cout << "Starting Address of thirdA: " << (void*)&(*thirdA) << endl;
cout << "Sizeof(A): " << sizeof(A) << endl << "Alignof(A): " << alignof(A) << endl;
return 0;
}
Output:
Alignment of A Block:
Memory Start: 0x563fe1239c21
Starting Address of firstA: 0x563fe1239c28
Starting Address of secondA: 0x563fe1239c38
Starting Address of thirdA: 0x563fe1239c48
Sizeof(A): 16
Alignof(A): 8
The output appears to be valid, but I still have some questions about it.
Some questions I have are:
Will fourthA, fifthA, etc... all be aligned as well?
Is there a simpler way of finding a properly aligned memory location?
In the case of struct B, it is set up to not be memory friendly. Do I need to reconstruct it so that the largest members are at the top of the struct, and the smallest members are at the bottom? Or will the compiler automatically pad everything so that its member d will not be misaligned?
Will fourthA, fifthA, etc... all be aligned as well?
Yes. The size of a type is always a multiple of its alignment, so objects placed sizeof(A) apart from a correctly aligned starting address stay aligned.
Is there a simpler way of finding a properly aligned memory location?
Yes, see alignas:
http://en.cppreference.com/w/cpp/language/alignas
or std::align:
http://en.cppreference.com/w/cpp/memory/align
as Dan M said.
In the case of struct B, it is set up to not be memory friendly. Do I need to reconstruct it so that the largest members are at the top of the struct, and the smallest members are at the bottom? Or will the compiler automatically pad everything so that its member d will not be misaligned?
The compiler inserts padding so that every member, including d, is properly aligned, so you do not have to reorder anything for correctness. Reorder the members yourself only if you want to reduce the space lost to padding.
The compiler will not reorder the members of a struct for you.
One reason is that raw data (coming from a file, the network, ...) is often interpreted directly as a struct, and two compilers reordering differently could break such code.
I hope my explanation is clear and that I did not make any mistakes.

How many values can be put into an Array in C++?

I wanted to read double values from a file into an array. I have about 128^3 values. My program worked just fine as long as I stayed at 128^2 values, but now I get a "segmentation fault" error, even though 128^3 ≈ 2,100,000 is far below the maximum of int. So how many values can you actually put into an array of doubles?
#include <iostream>
#include <fstream>
int LENGTH = 128;
int main(int argc, const char * argv[]) {
// insert code here...
const int arrLength = LENGTH*LENGTH*LENGTH;
std::string filename = "density.dat";
std::cout << "opening file" << std::endl;
std::ifstream infile(filename.c_str());
std::cout << "creating array with length " << arrLength << std::endl;
double* densdata[arrLength];
std::cout << "Array created"<< std::endl;
for(int i=0; i < arrLength; ++i){
double a;
infile >> a;
densdata[i] = &a;
std::cout << "read value: " << a << " at line " << (i+1) << std::endl;
}
return 0;
}
You are allocating the array on the stack, and stack size is limited (by default, stack limit tends to be in single-digit megabytes).
You have several options:
increase the size of the stack (ulimit -s on Unix);
allocate the array on the heap using new;
move to using std::vector.

C++ strlen(ch) and sizeof(ch) strlen

I have this code:
int main()
{
char ch[15];
cout<<strlen(ch)<<endl; //7
cout<<sizeof(ch)<<endl; //15
return 0;
}
Why does strlen(ch) give a different result even though ch is an empty char array?
Your code has undefined behavior because you are reading the uninitialized values of your array with strlen. If you want a determinate result from strlen you must initialize (or assign to) your array.
E.g.
char ch[15] = "Hello, world!";
or
char ch[15] = {};
sizeof gives the size of its operand; as the size of char is one by definition, the size of a char[15] will always be 15.
strlen gives the length of a null terminated string, which is the offset of the first char with value 0 in a given char array. For a call to strlen to be valid, the argument must actually point to a null terminated string.
ch is a local variable and local variables are not initialized. So your assumption that it is an empty string is not correct. It's filled with junk. It was just a coincidence that a \0 character was found after 7 junk characters and hence strlen returned 7.
You can do something like these to ensure an empty string-
char ch[15]={0};
ch[0]='\0';
strcpy(ch,"");
Here's a similar thread for more reading
Variable initialization in C++
The problem is in
strlen(ch);
strlen counts the number of chars until it hits the \0 character. Here, ch is uninitialized, so strlen could return anything.
As for the result from strlen, in your case you have an uninitialized char array, and so strlen only happens to yield 7: there happened to be a null character at index 7 (the eighth element), but this code could give a different result for strlen every time.
Always initialize strings, it's easy enough with an array: char str[15] = {0};
sizeof is an operator used to get the size of a variable or a data type, or the number of bytes occupied by an array, not the length of a C string; don't expect strlen and sizeof to be interchangeable, or even comparable in any useful way.
For instance:
int main()
{
char str[15] = "only 13 chars";
cout << "strlen: " << strlen(str) << endl;
cout << "sizeof: " << sizeof(str) << endl;
}
The output is:
strlen: 13
sizeof: 15
Returns the length of str.
The length of a C string is determined by the terminating
null-character: A C string is as long as the amount of characters
between the beginning of the string and the terminating null
character.
sizeof returns the number of bytes (15). Your array is filled with garbage, so strlen can return any number. A correct example is
int main()
{
char ch[15] = {0};
cout<<strlen(ch)<<endl; //0
cout<<sizeof(ch)<<endl; //15
return 0;
}
The difference between sizeof and strlen in C++:
1) sizeof is an operator, strlen is a function;
2) The result type of sizeof is std::size_t, an implementation-defined unsigned integer type declared in <cstddef> (it is not necessarily unsigned int); the result is the number of bytes the object or type occupies in memory;
3) sizeof can take a type as its operand, while strlen only accepts a char pointer (const char*), which must point to a string terminated by '\0';
sizeof can also take a function call expression as its operand (the function is not actually called), for instance:
short f() {return 100;}
std::cout << "sizeof(f()): " << sizeof(f()) << std::endl;
//The result will be sizeof(short), which is typically 2.
4) An array passed to sizeof does not decay to a pointer, whereas an array passed to strlen decays to a char pointer;
5) The result of strlen is calculated at run time, not at compile time; strlen gives the length of the content of a string (string, char array, char pointer) up to the '\0', not the size of the memory allocated for it. The result of sizeof is calculated at compile time, whether the operand is a type or a variable, which is why sizeof(x) can be used to set the dimension of an array:
char str[20]="0123456789";
int a=strlen(str); //a=10;
int b=sizeof(str); //while b=20;
7) If the operand of sizeof is a type, parentheses are mandatory, while if the operand is a variable, parentheses are optional, because sizeof is an operator, not a function;
8) When you use a structured type or variable as the operand, sizeof returns its real size; when you use a statically sized array, sizeof returns the size of the whole array in bytes. But sizeof cannot return the size of an array that is created dynamically or declared externally without a bound, because sizeof is a compile-time operator: applied to a pointer to such an array, it gives the size of the pointer.
Here is an example of sizeof and strlen:
#include <iostream>
#include <cstdlib>
#include <string>
#include <cstring>
short f1 ()
{
return 100;
}
int f2 ()
{
return 1000;
}
int main()
{
const char* char_star = "0123456789"; // a string literal must be bound to a const char*
// char_star is a char pointer, sizeof will return the pointer size allocated in memory: depends on your machine
std::cout << "sizeof(char_star):" << sizeof(char_star) << std::endl;
// *char_star is the first element of the string, it is a char, sizeof will return the char size allocated in memory: depends on your machine, normally is 1
std::cout << "sizeof(*char_star):" << sizeof(*char_star) << std::endl;
// char_star is a char pointer, strlen will return the real size of the string until '\0': 10
std::cout << "strlen(char_star):" << strlen(char_star) << std::endl;
std::cout << std::endl;
char char_array[] = "0123456789";
// char_array is a char array, sizeof will return the array size allocated in memory, with a '\0' at the end: 10 + 1
std::cout << "sizeof(char_array):" << sizeof(char_array) << std::endl;
// *char_array is the first element of the array, it is a char, sizeof will return the char size allocated in memory: depends on your machine, normally is 1
std::cout << "sizeof(*char_array):" << sizeof(*char_array) << std::endl;
// char_array is a char array, strlen will return the real size of the string until '\0': 10
std::cout << "strlen(char_array):" << strlen(char_array) << std::endl;
std::cout << std::endl;
char char_array_fixed[100] = "0123456789";
// char_array_fixed is a char array with fixed size, sizeof will return the array size allocated in memory: 100
std::cout << "sizeof(char_array_fixed):" << sizeof(char_array_fixed) << std::endl;
// *char_array_fixed is the first element of the array, it is a char, sizeof will return the char size allocated in memory: depends on your machine, normally is 1
std::cout << "sizeof(*char_array_fixed):" << sizeof(*char_array_fixed) << std::endl;
// *char_array_fixed is a char array with fixed size, strlen will return the real content size of the string until '\0': 10
std::cout << "strlen(char_array_fixed):" << strlen(char_array_fixed) << std::endl;
std::cout << std::endl;
int int_array[100] = {0,1,2,3,4,5,6,7,8,9};
// int_array is an int array with fixed size, sizeof will return the array size allocated in memory in bytes: 100 * sizeof(int), normally 400
std::cout << "sizeof(int_array):" << sizeof(int_array) << std::endl;
// *int_array is the first element of the array, it is an int, sizeof will return the int size allocated in memory: depends on your machine, normally is 4
std::cout << "sizeof(*int_array):" << sizeof(*int_array) << std::endl;
// int_array is an int array, not a char array, so passing it to strlen does not compile
//std::cout << "strlen(int_array):" << strlen(int_array) << std::endl;
std::cout << std::endl;
char char_array2[] = {'a', 'b', '3'};
// char_array2 is a char array, sizeof will return the array size allocated in memory: 3
std::cout << "sizeof(char_array2):" << sizeof(char_array2) << std::endl;
// *char_array2 is the first element of the array, it is a char, sizeof will return the char size allocated in memory: depends on your machine, normally is 1
std::cout << "sizeof(*char_array2):" << sizeof(*char_array2) << std::endl;
// char_array2 has no terminating '\0', so it is not a C string: calling strlen on it is undefined behaviour
//std::cout << "strlen(char_array2):" << strlen(char_array2) << std::endl;
std::cout << std::endl;
char char_array3[] = {"abc"};
// char_array3 is a char array, sizeof will return the array size allocated in memory, with a '\0' at the end : 3 + 1
std::cout << "sizeof(char_array3):" << sizeof(char_array3) << std::endl;
// *char_array3 is the first element of the array, it is a char, sizeof will return the char size allocated in memory: depends on your machine, normally is 1
std::cout << "sizeof(*char_array3):" << sizeof(*char_array3) << std::endl;
// *char_array3 is a char array, strlen will return the real content size of the string until '\0': 3
std::cout << "strlen(char_array3):" << strlen(char_array3) << std::endl;
std::cout << std::endl;
std::string str = {'a', 'b', '3', '\0', 'X'};
// str is a string, sizeof will return the size of the string object itself (string is a wrapper, can be considered as a special structure with a pointer to the real content): depends on your implementation, commonly 24 or 32 on 64-bit platforms
std::cout << "str:" << str << std::endl;
std::cout << "sizeof(str):" << sizeof(str) << std::endl;
// *str is not valid for a std::string, so this does not compile
//std::cout << "sizeof(*str):" << sizeof(*str) << std::endl;
// str is a string, strlen will return the real content size of the string until '\0': 3
std::cout << "strlen(str):" << strlen(str.c_str()) << std::endl;
std::cout << std::endl;
// sizeof is an operator: if the operand is a type, parentheses are mandatory
std::cout << "sizeof(int):" << sizeof(int) << std::endl;
// sizeof is an operator: if the operand is a variable, parentheses are optional
std::cout << "sizeof char_star:" << sizeof char_star << std::endl;
std::cout << "sizeof char_array:" << sizeof char_array << std::endl;
// sizeof can take a function call as its operand; the function is not actually called
std::cout << "sizeof(f1()): " << sizeof(f1()) << std::endl;
std::cout << "sizeof(f2()): " << sizeof(f2()) << std::endl;
}

What does new [size_t] + 1 return

The following code sample is from the book "C++ von A bis Z" (second edition, translation: C++ from A to Z), page 364. The sample is wrong.
// overload operator +=
#include <iostream>
#include <cstring>
using namespace std;
class String {
private:
char* buffer;
unsigned int len;
public:
String(const char* s="") {
// cout << "Constructor: " << s << "\n";
len = strlen(s);
buffer = new char [len+1];
strcpy(buffer, s);
}
~String() {
// cout << "Destructor: " << buffer << "\n";
delete [] buffer;
}
String(const String& s) {
// cout << "Copy_Constructor: " << s.get_buffer() << "\n";
len = s.len;
buffer = new char [len+1];
strcpy(buffer, s.buffer);
}
char* get_buffer() const {
return buffer;
}
// returning a reference is more efficient
// String& operator+=(const String& str1)
String operator+=(const String& str1) {
// cout << "Assignment_Operator +=: " << str1.get_buffer() << "\n";
String tmp(*this);
delete [] buffer;
len = tmp.len + str1.len;
// invalid pointer
// buffer = new char[len+1];
buffer = new char [len]+1;
strcpy(buffer, tmp.buffer);
strcat(buffer, str1.buffer);
// wrong return_type
// return *this;
return buffer;
}
};
int main(void) {
String string1("Adam");
String string2("Eva");
string1+=" und ";
string1.operator+=(string2);
cout << string1.get_buffer() << "\n";
return 0;
}
The lines with the comments are my "fixes". Now I want to know what "new char [len]+1" does. I think the following:
it allocates sizeof(char)*len bytes of memory from the heap
and assigns the WRONG address to the pointer buffer
but what is the wrong address: "first address of the new memory on heap + 1" or "first address of the new memory on heap + sizeof(char)*1"?
What happens?
Thanks
// edit
Thank you all! You helped me!
I just wanted to know, what this statement will return.
new char [len]+1;
The line itself is, of course, a typo from the author of the book.
Let's break it down:
new char[len];
returns a pointer to an array of char.
new char[len] + 1;
yields the address of the next char in memory, that is, of the second element.
It's basically cutting off the first character.
EDIT: As others have mentioned, this is most probably a typo, it should be new char[len+1]. I'm just explaining what the code does, but you should only use pointer arithmetic if you really know what you're doing. Trying to delete the returned pointer would be UB, as cHao pointed out. You'll also get UB if len == 1 and you attempt to work with the returned pointer.
If you add an integer i to a T*, this will advance the address by sizeof(T) * i bytes. So in this case, since new char[len] returns a char*, + 1 will indeed add sizeof(char) * 1 to it.
new Type[size] + 1 will allocate an array of size size and yield the address of the element with index 1, that is, the second element. Nothing special, just pointer arithmetic: new[] yields the address of the element with index 0, and adding 1 to that address yields the address of the element with index 1.
It simply returns the pointer to the second array item =)
Read about C pointers ;)
+sizeof(char)*1, but I fail to see why you would do that.
I think it's a typo, and it should be new char [len+1]. The +1 is for the string terminator character that must exist.