I am new to the forum. I have a strange problem. I have a simple code which reads unformatted data from a file using read() function. The code is given below.
int main () {
ifstream meshfile;
char buf[1000], ch;
memset(buf, 0, 1000);
meshfile.open ("sometextfile");
meshfile.read (buf, 1000);//38+62+(19*47) + 7);
cout << strlen(buf) << std::endl;
cout << buf << std::endl;
}
The code when run with the sample input file below gives 1006 as length of buf and prints additional characters for buf. Strangely, this happens only when bufsize is 1000 & 1000 characters are read. Changing the bufsize to > 1000 and reading 1000 chars does not produce this error. Could this be a coding problem?
The sample input file is
fdjgjdskgggggggggggggggggggggggggggggj bvjgdsv dsjkvgds gvdsj gvjdsgvjksdjkfgdsjkgfdsjgfsdjgfjkdsgfkjsdgjfgsdjfgdsjgfsdjgfjsdgfjsgfjsdgfjgsdjfgsdjfgdsjgfsdjgfsdjgfjsdgfjsdgfjsdg
fdjgjdskgggggggggggggggggggggggggggggj bvjgdsv dsjkvgds gvdsj gvjdsgvjksdjkfgdsjkgfdsjgfsdjgfjkdsgfkjsdgjfgsdjfgdsjgfsdjgfjsdgfjsgfjsdgfjgsdjfgsdjfgdsjgfsdjgfsdjgfjsdgfjsdgfjsdg
fdjgjdskgggggggggggggggggggggggggggggj bvjgdsv dsjkvgds gvdsj gvjdsgvjksdjkfgdsjkgfdsjgfsdjgfjkdsgfkjsdgjfgsdjfgdsjgfsdjgfjsdgfjsgfjsdgfjgsdjfgsdjfgdsjgfsdjgfsdjgfjsdgfjsdgfjsdg
fdjgjdskgggggggggggggggggggggggggggggj bvjgdsv dsjkvgds gvdsj gvjdsgvjksdjkfgdsjkgfdsjgfsdjgfjkdsgfkjsdgjfgsdjfgdsjgfsdjgfjsdgfjsgfjsdgfjgsdjfgsdjfgdsjgfsdjgfsdjgfjsdgfjsdgfjsdg
fdjgjdskgggggggggggggggggggggggggggggj bvjgdsv dsjkvgds gvdsj gvjdsgvjksdjkfgdsjkgfdsjgfsdjgfjkdsgfkjsdgjfgsdjfgdsjgfsdjgfjsdgfjsgfjsdgfjgsdjfgsdjfgdsjgfsdjgfsdjgfjsdgfjsdgfjsdg
fdjgjdskgggggggggggggggggggggggggggggj bvjgdsv dsjkvgds gvdsj gvjdsgvjksdjkfgdsjkgfdsjgfsdjgfjkdsgfkjsdgjfgsdjfgdsjgfsdjgfjsdgfjsgfjsdgfjgsdjfgsdjfgdsjgfsdjgfsdjgfjsdgfjsdgfjsdg
Your problem is the use of strlen.
It expects a string terminated by \0.
read doesn't add a \0 at the end of the buffer, so strlen reads beyond the edge of the buffer.
Make your buffer 1001 chars long, leaving room for a nul termination. Memset 1001 also. Without that nul terminator, strlen will not work.
Better yet, use <array> and do not use memset. A simple buf[1000] = 0 will do.
What's read from the file is not a '\0'-terminated string but raw data. You should check the value returned by stream.gcount(), which is the number of bytes actually read into the buffer, and use it in subsequent code.
size_t size = meshfile.read(buf, sizeof(buf)/sizeof(*buf)).gcount();
std::cout << size << std::endl;
I think what the guys here and I mean is:
1)read(..) does not add the null terminator.
2)You initialized your buffer to 0 using memset. When you read < 1000 characters, the array elements that come after the last character read in are all still 0 (effectively null, because they were initialized to that value), so even though read(..) did not add the null terminator, things did not break. This is why Jive Dadson asks you to resize your buffer to 1001 chars and initialize your array to 0, so that even if you read in 1000 characters, the 1001st character is still 0 and will act as the null terminator.
But relying on this is actually a bug. Consider what happens when you reuse the buffer and the last read returns fewer than BUF_SIZE characters: the leftover characters from the previous read are still in the buffer after the new data. Hence it is better to check the number of characters read in at each turn, as suggested by some of the posters.
3)When you read > 1000 characters, did you change to a larger buffer and did you initialize that larger buffer to 0? If so, naturally you won't have a problem, as the situation will be the same as above. If not, I'm curious how you can read > 1000 characters into a 1000 char buffer.
4)Basically your current method of reading data is unreliable. You should resize your buffer to have 1 more element than the max you intend to read. After each read, you should get the number of characters actually read in and set the null terminator if you intend to pass the buf as a string and perform operations such as copy, printf etc.
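A minimal sketch of point 4, assuming a hypothetical helper named readChunks (it is not from the original posts). It reads a stream in fixed-size chunks, asks gcount() how many bytes each read() actually delivered, and places the terminator right after them:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `in` in chunks of at most `chunkSize` bytes and
// returns each chunk as a std::string whose length comes from gcount().
std::vector<std::string> readChunks(std::istream& in, std::size_t chunkSize) {
    assert(chunkSize > 0);  // a zero-sized read would never reach end-of-file
    std::vector<std::string> chunks;
    std::vector<char> buf(chunkSize + 1);  // one extra element for the terminator

    // read() fails on a short read, but gcount() still reports the bytes it got.
    while (in.read(buf.data(), static_cast<std::streamsize>(chunkSize)) ||
           in.gcount() > 0) {
        const std::size_t n = static_cast<std::size_t>(in.gcount());
        buf[n] = '\0';  // terminate exactly after the data read this turn
        chunks.emplace_back(buf.data(), n);
    }
    return chunks;
}
```

Called on a std::istringstream holding "ABCC1ABCC2XXX" with a chunk size of 5, this should return three chunks, the last one only three characters long, with nothing left over from the previous read.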
I am new to C++ and am still figuring out file streams. I am trying to put a character array into a file that I will be viewing with a hex editor.
I have done different strings, but whenever I put in a null byte, the file ends.
ofstream stream;
char charArray[256];
for (int i = 0; i <= 255; i++)charArray[i] = i;
stream.open("C:\\testfile.myimg");
if (!stream.is_open()) exit(1);
stream << charArray << endl;
return 0;
I want to output bytes with ascending values, but if I start with the null byte, then C++ thinks the character array ends before it starts.
Instead of:
stream << charArray << endl;
use:
stream.write(charArray, sizeof(charArray));
stream.write("\n", 1); // this is the newline part of std::endl
stream.flush(); // this is the flush part of std::endl
The first one assumes that you are sending a null-terminated string (because you're passing a char* - see the end of the answer why). That's why when the code encounters the very first char with value 0, which is '\0' (the null-terminator), it stops.
On the other hand, the second approach uses an unformatted output write, which will not care about the values inside charArray - it will take it (as a pointer to its first element) and write sizeof(charArray) bytes starting from that pointer to the stream. This is safe since it's guaranteed that sizeof(char) == 1, thus sizeof(charArray) will yield 256.
What you need to consider is array decaying. It will work in this case (the sizeof thing), but it will not work if you simply pass a decayed pointer to the array's first element. Read more here: what's array decaying?
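To make the difference concrete, here is a small sketch that sends the same bytes through both paths. The helper names viaInsertion and viaWrite are made up for illustration, and a std::ostringstream stands in for the file so the result can be inspected:

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formatted output treats `data` as a null-terminated
// string and stops at the first '\0'.
std::string viaInsertion(const char* data) {
    std::ostringstream out;
    out << data;
    return out.str();
}

// Hypothetical helper: unformatted output copies exactly `size` bytes,
// whatever their values are.
std::string viaWrite(const char* data, std::size_t size) {
    std::ostringstream out;
    out.write(data, static_cast<std::streamsize>(size));
    return out.str();
}
```

With the 256-byte charArray from the question, viaInsertion should produce an empty string (the first byte is 0), while viaWrite(charArray, sizeof(charArray)) should produce all 256 bytes. When the target is a real file on Windows, it is probably also worth opening it with std::ios::binary, since text mode translates the byte 0x0A into a carriage return plus line feed.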
FILE* file_;
char buffer[5];
file_ = fopen("Data.txt", "a+");
while (!feof(file_))
{
fread(buffer, sizeof(buffer), 1, file_);
cout << buffer<<" ";
BM(buffer, pat);
}
Data.txt="ABCC1ABCC2XXX"
Output:
ABCC1m
ABCC2m
XXXC2m
How can I make buffer stop before it starts generating chars from previous buffer?(bolded font part)
Wanted output:
ABCC1
ABCC2
XXX
You have two problems. Firstly, you are not null-terminating buffer at all (which is why you are getting the m output). Secondly, you are not null-terminating buffer in the right place when there is a short read.
fread will tell you how many characters it has read, and you need to put the '\0' there. Edit: that previous description is not accurate. It tells you how many objects it has read, each of size arg2, and you read up to arg3 of them. You need to change the arguments to fread so that you are reading single characters, and as many of them as there is room for in the buffer. So:
FILE* file_;
char buffer[5+1];
file_ = fopen("Data.txt", "a+");
while (!feof(file_))
{
const size_t nchars = fread(buffer, 1, sizeof(buffer)-1, file_);
buffer[nchars] = '\0';
cout << buffer<<" ";
BM(buffer, pat);
}
Pedantic note: The second argument to fread could also be written as sizeof(buffer[0]) if buffer is of something other than char/signed char/unsigned char - but those three are defined to have a sizeof 1.
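A possible variation, sketched here as an assumption rather than as part of the answer above: loop on fread's return value instead of on feof. feof only becomes true after a read has already hit end-of-file, so a file whose length is an exact multiple of five would otherwise go through the loop body one extra time with zero characters. The helper name readGroups is hypothetical:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: reads `file` in groups of up to five characters and
// returns each group as a string.
std::vector<std::string> readGroups(std::FILE* file) {
    std::vector<std::string> groups;
    char buffer[5 + 1];  // five data characters plus room for the terminator
    std::size_t nchars;

    // fread returns the number of 1-byte objects read; 0 means nothing is left.
    while ((nchars = std::fread(buffer, 1, sizeof(buffer) - 1, file)) > 0) {
        buffer[nchars] = '\0';  // terminate right after the data just read
        groups.push_back(buffer);
    }
    return groups;
}
```

For a file containing ABCC1ABCC2XXX this should give ABCC1, ABCC2 and XXX, matching the wanted output.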
OK, I can't find a decent duplicate - maybe someone else will.
The line
fread(buffer, sizeof(buffer), 1, file_);
potentially fills your buffer completely. You need to keep the return value to know how much was actually read (with these arguments fread counts whole 5-byte objects, so it returns 1 or 0 rather than a byte count), but assuming your file contained at least five bytes, all five bytes of your buffer array are now initialized.
However, to print buffer as a regular C string, it needs a sixth byte, containing the null terminator.
For example, the C string "12345" is actually represented as the char array {'1', '2', '3', '4', '5', 0}.
You don't have room for a terminator in your buffer, and don't write one, so you can't treat it as a simple C string.
Your options are:
add a terminator manually, as in Martin Bonner's answer
don't add a terminator, but track the size - you can use the C++17
std::string_view bufstr(buffer, nchars);
to keep the pointer and size together (and you can print this normally)
stop using the old C I/O library entirely. The C++ I/O library admittedly doesn't have a much better way to read groups of five characters, but reading whole lines, for example, is much easier to do correctly.
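The second option could look roughly like this. It is a sketch that needs a C++17 compiler, and the helper name printWithoutTerminator is invented for the example; it writes to a std::ostringstream only so the result can be checked:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper: prints the first `nchars` characters of `buffer`
// without requiring a null terminator anywhere in the array.
std::string printWithoutTerminator(const char* buffer, std::size_t nchars) {
    std::string_view bufstr(buffer, nchars);  // pointer and size kept together
    std::ostringstream out;
    out << bufstr;  // operator<< writes exactly bufstr.size() characters
    return out.str();
}
```

With a five-byte buffer holding XXXC2 after a short read of three characters, passing nchars = 3 should print just XXX, because the view never looks past its own length.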
Can anyone explain to me why the following code is causing a segmentation fault? buff should be long enough to hold 128 characters.
int main () {
char buff[16384];
char buff2[128];
sprintf(buff2, "MinPer(PatternName_Equal27_xxxxxxx_MasterPatSetup.PatternName_Equal27_xxxxxxx__default_WFT___WFTRef.ActualT0Period.UserPeriod_2_1)" );
strcat(buff, buff2);
std::cout << buff2 << endl;
std::cout << buff << endl;
return 0;
}
You have two major problems:
Your sprintf is shoving 131 bytes (130 characters plus a NUL) into a 128 byte buffer, meaning three unrelated stack bytes are getting overwritten with garbage. You need a larger buffer, or a smaller initialization string.
You call strcat to append that 130-character string to a buffer with indeterminate contents (no NUL to indicate where the string being concatenated to ends). This is trivially fixable, by either zero-initializing all of buff (char buff[16384] = {0};) or by inserting the NUL in the first byte (which is all you really need) by adding buff[0] = '\0'; just before you strcat to it. Equivalently, you could replace strcat (which assumes a string to concatenate new data to exists in the destination) with strcpy (which ignores the existing contents of the destination) to avoid the problem.
Basically, your code is full of undefined behavior and buffer overruns. Given you're using C++, can I recommend just using std::string to avoid the hassle of C strings?
buff is uninitialized. It needs to contain a null terminated string so that strcat knows where to begin the concatenation. One way to do this is with strcpy:
strcpy(buff, ""); // initialize with empty null terminated string
strcat(buff, buff2); // add to it
strcat needs 'dest' to be a string ending with '\0'. So buff should be initialized manually.
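For reference, a hedged sketch of two ways around both problems. The helper names formatFits and appendText are made up here. The first uses snprintf, which never writes more than the destination size and returns the length the full text would have needed; the second sidesteps fixed buffers with std::string, as suggested above:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: copies `text` into `dest` without overrunning it.
// Returns false when the text had to be truncated to fit.
bool formatFits(char* dest, std::size_t destSize, const char* text) {
    // snprintf writes at most destSize - 1 characters plus the terminator and
    // returns the number of characters the complete text would have required.
    const int needed = std::snprintf(dest, destSize, "%s", text);
    return needed >= 0 && static_cast<std::size_t>(needed) < destSize;
}

// Hypothetical helper: std::string grows as needed and always knows where it
// ends, so there is no uninitialized destination to worry about.
std::string appendText(std::string buff, const std::string& extra) {
    buff += extra;
    return buff;
}
```

With a 128-byte destination and the 130-character string from the question, formatFits should report false and leave a properly terminated 127-character prefix in the buffer instead of overwriting the stack.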
I am trying to read 4 characters at a specific position from a file. The code is simple but the result is really confusing:
fstream dicomFile;
dicomFile.open(argv[1]);
dicomFile.seekg(128,ios::beg);
char * memblock = new char [4];
dicomFile.read(memblock,4);
cout<<"header is "<<memblock<<endl;
Ideally the result should be "DICM" but the actual result from the console was "DICM" plus weird characters, as shown in the picture. What's more, every time I run it, the characters are different. I suppose this may be something about ASCII and Unicode, I tried to change project property from Unicode to multibytes and then change back, no difference.
Does anyone know what's happening here and how do I solve it please? Thanks very much!
C style (char *) strings use the concept of null-terminators. This means strings are ended with a '\0' character in their last element. You are reading in exactly 4 characters into a 4 character buffer, which does not include a null character to end the string. C and C++ will happily run right off the end of your buffer in search of the null terminator that signifies the end of the string.
Quick fix is to create a block of length + 1, read in length data, then set str[length] = '\0'. In your case it would be as below.
char * memBlock = new char [5];
// populate memBlock with 4 characters
memBlock[ 4 ] = '\0';
A better solution is to use std::string instead of char * when working with strings in C++.
You could also initialize the buffer with zeros, putting null-terminators at every location.
char * memblock = new char [5](); // zeros, and one element longer
Fairly inefficient though.
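A sketch of the std::string route, under the assumption of a hypothetical helper called readAt (not from the original answers). It reads straight into a string and trims it to what gcount() reports, so no manual terminator or delete[] is needed:

```cpp
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads up to `count` bytes starting at byte `offset`
// and returns only the bytes that were actually read.
std::string readAt(std::istream& in, std::streamoff offset, std::size_t count) {
    std::string block(count, '\0');  // std::string manages its own storage
    in.seekg(offset, std::ios::beg);
    in.read(&block[0], static_cast<std::streamsize>(count));
    block.resize(static_cast<std::size_t>(in.gcount()));  // drop unread bytes
    return block;
}
```

For the file in the question, readAt(dicomFile, 128, 4) should yield the four characters DICM and nothing more; a shorter result would indicate that the file ended early or the seek failed.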
NoobQuestion:
I heard that filling a char array can be terminated early with the null char. How is this done?
I've searched every single google result out there but still have come up empty handed.
Do you mean something like this:
char test[11] = "helloworld";
std::cout << test << std::endl;
test[2] = 0;
std::cout << test;
This outputs
helloworld
he
?
That's a convention called "null-terminated string". If you have a block of memory which you treat as a char buffer and there's a null character within that buffer then the null-terminated string is whatever is contained starting with the beginning of the buffer and up to and including the null character.
const int bufferLength = 256;
char buffer[bufferLength] = "somestring"; //10 characters plus a null character put by the compiler - total 11 characters
Here the compiler will place a null character after the "somestring" (it does so even if you don't ask it to). So even though the buffer is of length 256, all the functions that work with null-terminated strings (like strlen()) will not read beyond the null character at position 10.
That is the "early termination" - whatever data is in the buffer beyond the null character it is ignored by any code designed to work with null-terminated strings. The last part is important - code could easily ignore the null character and then no "termination" would happen on null character.
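A tiny sketch of that idea, using an invented helper named terminateEarly: it writes a null character into the middle of a string and shows that string functions now stop there, even though the rest of the array is untouched.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: cuts a null-terminated string short by writing '\0'
// at index `pos`, then returns the length that strlen() now reports.
std::size_t terminateEarly(char* text, std::size_t pos) {
    if (pos < std::strlen(text)) {
        text[pos] = '\0';  // everything from here on is ignored by string code
    }
    return std::strlen(text);
}
```

Applied to char test[11] = "helloworld" with pos = 2, this should return 2 and make the array print as he, while test[3] still holds 'l' and sizeof(test) is still 11.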