Can ftell be used with non-ASCII characters? - c++

This line
Assert(pos == ftell(file));
is used in my code. When the file contains non-ASCII characters, this assertion fails.
What should I do?
To make it clear, here is the whole function updated:
int getTerminatedString(char * dest, int length)
{
char * rv = fgets(dest,length,file);
int len = -1;
if(rv)
{
len = strlen(rv);
pos += len;
assert(pos == ftell(file));
}
return len;
}
Thanks!

If you open the file in binary mode e.g. fopen("yourfile","rb"), ftell will give the offset in the file regardless of content.
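As a minimal sketch of the point above, the following reads a file with fgets in binary mode and checks the running count against ftell after each read. The helper name countBytesWithFgets and the 256-byte line buffer are assumptions made for illustration, and the strlen-based count only works if the file contains no zero bytes.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: reads the file line by line with fgets and checks
// that a running byte count matches ftell after every read.
// Returns the total number of bytes read, or -1 if the file cannot be opened.
long countBytesWithFgets(const char* path)
{
    std::FILE* file = std::fopen(path, "rb");   // binary mode: no newline translation
    if (!file)
        return -1;

    char line[256];
    long pos = 0;
    while (std::fgets(line, sizeof line, file)) {
        // strlen is only a valid byte count when the data contains no '\0'.
        pos += static_cast<long>(std::strlen(line));
        assert(pos == std::ftell(file));
    }
    std::fclose(file);
    return pos;
}
```

In text mode on Windows the same check can fail, because "\r\n" pairs are translated and the value returned by ftell need not match the number of characters delivered.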

Is the dest buffer large enough to contain all the characters AND a final terminating zero byte as well?
If the buffer pointed to by dest is too small, the program may overwrite memory it should not touch - that is one possible way to get the SIGABRT.

Related

Weird seek behaviour in C and C++ [duplicate]

I did a sample project to read a file into a buffer.
When I use the tellg() function it gives me a larger value than the
read function actually reads from the file. I think that there is a bug.
Here is my code:
EDIT:
void read_file (const char* name, int *size , char*& buffer)
{
ifstream file;
file.open(name,ios::in|ios::binary);
*size = 0;
if (file.is_open())
{
// get length of file
file.seekg(0,std::ios_base::end);
int length = *size = file.tellg();
file.seekg(0,std::ios_base::beg);
// allocate buffer in size of file
buffer = new char[length];
// read
file.read(buffer,length);
cout << file.gcount() << endl;
}
file.close();
}
main:
void main()
{
int size = 0;
char* buffer = NULL;
read_file("File.txt",&size,buffer);
for (int i = 0; i < size; i++)
cout << buffer[i];
cout << endl;
}
tellg does not report the size of the file, nor the offset
from the beginning in bytes. It reports a token value which can
later be used to seek to the same place, and nothing more.
(It's not even guaranteed that you can convert the type to an
integral type.)
At least according to the language specification: in practice,
on Unix systems, the value returned will be the offset in bytes
from the beginning of the file, and under Windows, it will be
the offset from the beginning of the file for files opened in
binary mode. For Windows (and most non-Unix systems), in text
mode, there is no direct and immediate mapping between what
tellg returns and the number of bytes you must read to get to
that position. Under Windows, all you can really count on is
that the value will be no less than the number of bytes you have
to read (and in most real cases, won't be too much greater,
although it can be up to two times more).
If it is important to know exactly how many bytes you can read,
the only way of reliably doing so is by reading. You should be
able to do this with something like:
#include <limits>
file.ignore( std::numeric_limits<std::streamsize>::max() );
std::streamsize length = file.gcount();
file.clear(); // Since ignore will have set eof.
file.seekg( 0, std::ios_base::beg );
Finally, two other remarks concerning your code:
First, the line:
*buffer = new char[length];
shouldn't compile: you have declared buffer to be a char*,
so *buffer has type char, and is not a pointer. Given what
you seem to be doing, you probably want to declare buffer as
a char**. But a much better solution would be to declare it
as a std::vector<char>& or a std::string&. (That way, you
don't have to return the size as well, and you won't leak memory
if there is an exception.)
Second, the loop condition at the end is wrong. If you really
want to read one character at a time,
while ( file.get( buffer[i] ) ) {
++ i;
}
should do the trick. A better solution would probably be to
read blocks of data:
while ( file.read( buffer + i, N ) || file.gcount() != 0 ) {
i += file.gcount();
}
or even:
file.read( buffer, size );
size = file.gcount();
EDIT: I just noticed a third error: if you fail to open the
file, you don't tell the caller. At the very least, you should
set the size to 0 (but some sort of more precise error
handling is probably better).
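Putting the remarks above together, a minimal sketch might look like the following. The function name readWholeFile and the choice of a bool return value are assumptions, not part of the original code; the size is obtained by reading, as described above, rather than from tellg.

```cpp
#include <cstddef>
#include <fstream>
#include <limits>
#include <string>
#include <vector>

// Hypothetical helper: loads a whole file into `buffer`.
// Returns false if the file cannot be opened, so the caller is told about it.
bool readWholeFile(const std::string& name, std::vector<char>& buffer)
{
    buffer.clear();
    std::ifstream file(name, std::ios::in | std::ios::binary);
    if (!file.is_open())
        return false;

    // Count the bytes by actually reading (and discarding) them.
    file.ignore(std::numeric_limits<std::streamsize>::max());
    const std::streamsize length = file.gcount();
    file.clear();                              // ignore() will have set eof
    file.seekg(0, std::ios_base::beg);

    buffer.resize(static_cast<std::size_t>(length));
    file.read(buffer.data(), length);
    // Keep only the bytes that were really read.
    buffer.resize(static_cast<std::size_t>(file.gcount()));
    return true;
}
```

Because the vector owns the memory, the caller does not need a separate size parameter and nothing leaks if an exception is thrown.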
In C++17 there are std::filesystem file_size methods and functions, so that can streamline the whole task.
std::filesystem::file_size - cppreference.com
std::filesystem::directory_entry::file_size - cppreference.com
With those functions/methods there's a chance that the file does not need to be opened at all and cached data is read instead (especially with the std::filesystem::directory_entry::file_size method)
Those functions also require only directory read permissions and not file read permission (as tellg() does)
void read_file (int *size, char* name,char* buffer)
*buffer = new char[length];
These lines do look like a bug: you create a char array and try to store its address in the single char buffer[0]. Then you read the file into buffer, which is still uninitialized.
You need to pass buffer by pointer:
void read_file (int *size, char* name,char** buffer)
*buffer = new char[length];
Or by reference, which is the c++ way and is less error prone:
void read_file (int *size, char* name,char*& buffer)
buffer = new char[length];
...
fseek(fptr, 0L, SEEK_END);
filesz = ftell(fptr);
will give the file size if the file was opened with fopen.
Using ifstream,
in.seekg(0,ifstream::end);
filesz = in.tellg();
would do the same.

Unexpected variable values reading from file (ESP32)

I am still learning Cpp, so please advise if I am misunderstanding here.
Using an ESP32, I am trying to read / write files to Flash / FFat. This is the method I have created which should read a file from flash and load it into PSRAM:
unsigned char* storage_read(char* path) {
File file = FFat.open(path);
if(!file) {
Serial.println("no file");
return 0x00;
}
int count = file.size();
unsigned char* buffer = (unsigned char*)ps_malloc(count);
Serial.printf("Bytes: %d\n", count);
Serial.printf("Count: %d\n", sizeof(buffer));
for (int i = 0; i < count; i++) {
buffer[i] = (unsigned char)file.read();
}
file.close();
return buffer;
}
The problem is that I get the contents of my b64 data file, with the addition of several extra bytes of data globbed on the end.
Calling the method with:
Serial.printf("Got: %s", storage_read("/frame/testframe-000.b64"));
I get the output:
Bytes: 684
Count: 4
Got: <myb64string> + <68B of garbage>
Why would sizeof not be returning the proper size?
What would be the proper way of loading this string into a buffer?
Why would sizeof not be returning the proper size?
That's because sizeof has a very specific (and not very intuitive) function. It is evaluated at compile time and yields the size of the type of the expression passed to it. Calling sizeof(buffer) returns the size, in bytes, of the type of the variable buffer. That type is unsigned char*, a 4-byte memory address on the ESP32. So that's what you get.
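The difference can be seen in a small portable sketch. The function names here are hypothetical, and std::malloc stands in for ps_malloc so that the example compiles off the ESP32.

```cpp
#include <cstddef>
#include <cstdlib>

// Returns sizeof applied to a pointer variable. The result is the size of
// the pointer type and does not depend on how many bytes were allocated.
std::size_t pointerSizeAfterAllocating(std::size_t count)
{
    unsigned char* buffer = static_cast<unsigned char*>(std::malloc(count));
    const std::size_t result = sizeof(buffer);   // size of unsigned char*, not of the allocation
    std::free(buffer);
    return result;
}

// For a true array the compiler knows the full size.
std::size_t arraySize()
{
    unsigned char fixed[684];
    return sizeof(fixed);                        // size of the whole array in bytes
}
```

The allocated size has to be tracked separately, which is what the count variable already does in the question.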
What would be the proper way of loading this string into a buffer?
What I noticed is that you're expecting to load string data from your file, but you don't explicitly terminate it with a zero byte. As you probably know, all C strings must be terminated with a zero byte. Data that you load from the file most likely doesn't have one (unless you took extra care to add it while saving). So when you read a string from a file sized N bytes, allocate a buffer of N+1 bytes, load the file into it and terminate it with a zero. Something like this:
unsigned char* storage_read(char* path) {
File file = FFat.open(path);
if(!file) {
Serial.println("no file");
return 0x00;
}
int count = file.size();
unsigned char* buffer = (unsigned char*)ps_malloc(count + 1); //< Updated
Serial.printf("Bytes: %d\n", count);
Serial.printf("Count: %d\n", sizeof(buffer));
for (int i = 0; i < count; i++) {
buffer[i] = (unsigned char)file.read();
}
buffer[count] = 0; //< Added
file.close();
return buffer;
}
And since you're returning a heap-allocated buffer from your function, take extra care to remember to release it in the caller when finished (the buffer comes from ps_malloc, so release it with free() rather than delete). This line in your code will leak the memory:
Serial.printf("Got: %s", storage_read("/frame/testframe-000.b64"));

error on tcp sending buffer of a Mat

I am trying to send a Mat image over TCP. First the Mat is converted to uchar and then to char format. The whole image in char format is then sent buffer by buffer, each 1024 bytes in size. The following is my code.
Mat decodeImg = imdecode(Mat(bufferFrame), 1);
uchar *transferImg = decodeImg.data;
char* charImg = (char*) transferImg;
int length = strlen(charImg);
int offset = 0;
while (true)
{
bzero(bufferSend, BUFFER_SIZE);
if (offset + BUFFER_SIZE <= length)
{
for (int i = 0; i < BUFFER_SIZE; i++)
{
bufferSend[i] = charImg[i + offset];
}
// memcpy(charImg+offset, bufferSend,BUFFER_SIZE);
if (send(sockfd, bufferSend, sizeof(bufferSend), 0) < 0)
{
printf("Send FIle Failed,total length is%d,failed offset is%d\n",
length,
offset);
break;
}
}
else
{
for (int i = 0; i < length - offset; i++)
{
bufferSend[i] = charImg[i + offset];
}
if (send(sockfd, bufferSend, sizeof(bufferSend), 0) < 0)
{
printf("Send FIle Failed,total length is%d,failed offset is%d\n",
length,
offset);
break;
}
break;
}
offset += BUFFER_SIZE;
}
The output of the code shows: send file failed, total length is 251035, failed offset is 182272.
I would really appreciate your help. Thank you in advance!
Pulling out the crystal ball here. This might be OP's problem, but if it isn't, this is certainly a problem that needs to be addressed.
Mat decodeImg = imdecode(Mat(bufferFrame), 1);
uchar *transferImg = decodeImg.data;
Get data. Not a bad idea if that's what you need to send.
char* charImg = (char*) transferImg;
Take the array of bytes from above and treat it as an array of characters.
int length = strlen(charImg);
And boom. Matrix data is not ASCII-formatted data, not a string, so it should not be treated like a string.
strlen counts data until it reaches a null character, a character with the numerical value 0, which does not exist in the normal alphanumeric world and thus can be used as a sentinel value to signal the end of a string. The count is the number of characters before the first null character in the string.
In this case we don't have a string. We have a blob of binary numbers, any one of which could be 0. There could be a null value anywhere. Could be right at the beginning. Could be a hundred bytes in. There might not be a null value until long after all of the valid image data has been read.
Anyway, strlen will almost certainly return the wrong value. Too few bytes and the receiver doesn't get all of the image data and I have no idea what it does. That code's not available to us. It probably gets upset and discards the result. Maybe it crashes. There's no way to know. If there is too much information, we also don't know what happens. Maybe it processes the file happily and ignores the extra crap that's sent. Maybe it crashes.
But what if it closes the TCP/IP connection when it has enough bytes? That leaves the sender trying to write a handful of unsent and unwanted bytes into a closed socket. send will fail and set the error code to socket closed.
Solution:
Get the right size of the data.
From what I read in the OpenCV documentation, a Mat provides Mat::elemSize, which gives the size in bytes of each element in the matrix, and Mat::size, which returns a Size object containing the rows and columns. Multiply rows * columns * elemSize and you should have the number of bytes to send.
EDIT
This looks to be a better way to get the size.
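To illustrate the rows * columns * elemSize calculation without depending on OpenCV or a live socket, here is a rough sketch. Everything in it is hypothetical: matrixByteCount takes plain numbers instead of a Mat, and ChunkSink is a callback standing in for send().

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Byte count of a dense rows x cols matrix whose elements are elemSize bytes each.
std::size_t matrixByteCount(std::size_t rows, std::size_t cols, std::size_t elemSize)
{
    return rows * cols * elemSize;
}

// Hypothetical sink: receives one chunk and returns false on failure.
using ChunkSink = std::function<bool(const unsigned char*, std::size_t)>;

// Passes `length` bytes to `sink` in pieces of at most `chunkSize` bytes.
// Returns the number of bytes handed over successfully.
std::size_t sendInChunks(const unsigned char* data, std::size_t length,
                         std::size_t chunkSize, const ChunkSink& sink)
{
    if (chunkSize == 0)
        return 0;                                 // avoid looping forever
    std::size_t offset = 0;
    while (offset < length) {
        // The last chunk may be shorter than chunkSize.
        const std::size_t n = std::min(chunkSize, length - offset);
        if (!sink(data + offset, n))
            break;
        offset += n;
    }
    return offset;
}
```

With a real socket the sink would wrap send(), and since send() may transmit fewer bytes than requested, its return value should be checked and the remainder resent. Note also that only the bytes of each chunk are passed on, unlike the original code, which always sends sizeof(bufferSend) bytes even for the shorter final piece.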

I have an issue with fgets in c++

I'm doing a small exercise to read a file which contains one long string and load this into an array of strings. So far I have:
char* data[11];
char buf[15];
int i = 0;
FILE* indata;
indata = fopen( "somefile.txt", "r" );
while( i < 11)
{
fgets(buf, 16, indata);
data[i] = buf;
i++;
}
fclose( indata );
somefile.txt: "aaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbaahhhhhbbbbdddddddddddddbbbbb"
etc..
This reads in 15 characters, adds that string to the array and gets the next 15. The problem is that every element of the array always equals the last string: if the last string is "ccccv", then data[0] = "ccccv", data[1] = "ccccv", data[2] = "ccccv" and so on.
Does anyone know why this is happening and whether there is a better way to do it? Thanks
Each pointer in data will point to the same memory area, which is buf.
You need to use strcpy + malloc.
Also it seems like you have a "minor" buffer overflow. buf holds 15 bytes, but fgets(buf, 16, indata) may write up to 16 (15 characters plus the terminating zero).
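Since the question is tagged C++, one alternative to strcpy + malloc is to let std::string make the copy. The sketch below swaps in that technique; the function name readPieces and its parameters are hypothetical. The buffer is sized one byte larger than the piece length, so the overflow noted above does not occur.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: splits a file into pieces of at most `pieceLength`
// characters, stopping after `maxPieces` pieces. Each piece owns its own storage.
std::vector<std::string> readPieces(const char* path, std::size_t maxPieces,
                                    int pieceLength = 15)
{
    std::vector<std::string> data;
    std::FILE* indata = std::fopen(path, "r");
    if (!indata)
        return data;

    // Room for pieceLength characters plus the terminating '\0'.
    std::vector<char> buf(static_cast<std::size_t>(pieceLength) + 1);
    while (data.size() < maxPieces &&
           std::fgets(buf.data(), static_cast<int>(buf.size()), indata)) {
        data.emplace_back(buf.data());   // copies the characters out of buf
    }
    std::fclose(indata);
    return data;
}
```

Because each std::string holds its own copy, later reads into buf no longer change the earlier entries, and the loop also stops cleanly when the file runs out before maxPieces is reached.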

Storing an image file into a buffer (gif,jpeg etc).

I'm trying to load an image file into a buffer in order to send it through a socket. The problem that I'm having is that the program creates a buffer with a valid size but it does not copy the whole file into the buffer. My code is as follows:
//imgload.cpp
#include <iostream>
#include <stdlib.h>
#include <stdio.h>
using namespace std;
int main(int argc,char *argv){
FILE *f = NULL;
char filename[80];
char *buffer = NULL;
long file_bytes = 0;
char c = '\0';
int i = 0;
printf("-Enter a file to open:");
gets(filename);
f = fopen(filename,"rb");
if (f == NULL){
printf("\nError opening file.\n");
}else{
fseek(f,0,SEEK_END);
file_bytes = ftell(f);
fseek(f,0,SEEK_SET);
buffer = new char[file_bytes+10];
}
if (buffer != NULL){
printf("-%d + 10 bytes allocated\n",file_bytes);
}else{
printf("-Could not allocate memory\n");
// Call exit?.
}
while (c != EOF){
c = fgetc(f);
buffer[i] = c;
i++;
}
c = '\0';
buffer[i-1] = '\0'; // helps remove randome characters in buffer when copying is finished..
i = 0;
printf("buffer size is now: %d\n",strlen(buffer));
//release buffer to os and cleanup....
return 0;
}
> output
c:\Users\Desktop>imgload
-Enter a file to open:img.gif
-3491 + 10 bytes allocated
buffer size is now: 9
c:\Users\Desktop>imgload
-Enter a file to open:img2.gif
-1261 + 10 bytes allocated
buffer size is now: 7
From the output I can see that it's allocating the correct size for each image, 3491 and 1261 bytes (I double-checked the file sizes through Windows and the sizes being allocated are correct), but the buffer sizes after supposedly copying are 9 and 7 bytes. Why is it not copying the entire data?
You are wrong. An image is binary data, not string data. So there are two errors:
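1) You can't detect end of file by storing the result of fgetc in a char and comparing it with the EOF constant. EOF is typically defined as -1, and once the result is truncated to a (signed) char, the byte 0xFF, which is valid in a binary file, compares equal to it. So keep the result in an int, or use the feof() function to check for end of file. Alternatively, compare the current position in the file with the file size (you already got it with ftell()).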
2) As file is binary it may contain \0 in middle. So you can't use string function to work with such data.
Also I see that you use C++. Why do you use classic C-style file handling? I think that using C++ features such as file streams, containers and iterators would simplify your program.
P.S. Your program will also have problems with really big files. If you ever need to work with them, replace the ftell/fseek functions with their 64-bit (long long int) equivalents. You'll also need to fix the type of the array counter. Another good idea is to read the file in blocks. Reading byte by byte is dramatically slower.
All this is unneeded and actually makes no sense:
c = '\0';
buffer[i-1] = '\0';
i = 0;
printf("buffer size is now: %d\n",strlen(buffer));
Don't use strlen for binary data. strlen stops at the first NUL (\0) byte. A binary file may contain many such bytes, so NUL can't be used.
-3491 + 10 bytes allocated /* There are 3491 bytes in the file. */
buffer size is now: 9 /* The first byte with the value 0. */
In conclusion, drop that part. You already have the size of the file.
You are reading a binary file like a text file. You can't check for EOF as this could be anywhere in the binary file.
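As a minimal sketch of the advice in these answers (C++ streams, block reads, no EOF or strlen checks on the data), something like the following could be used. The function name loadBinaryFile and the 4096-byte block size are assumptions chosen for illustration.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a binary file block by block into `buffer`.
// Returns false if the file cannot be opened.
bool loadBinaryFile(const std::string& filename, std::vector<unsigned char>& buffer)
{
    buffer.clear();
    std::ifstream in(filename, std::ios::binary);
    if (!in)
        return false;

    char block[4096];
    // read() reports failure on the final short block, so gcount() is
    // checked as well to keep those last bytes.
    while (in.read(block, sizeof block) || in.gcount() > 0) {
        buffer.insert(buffer.end(), block, block + in.gcount());
    }
    return true;
}
```

The number of bytes loaded is then buffer.size(), which stays correct even when the file contains 0x00 or 0xFF bytes.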