valgrind complains doing a very simple strtok in c - c++

Hi, I'm trying to tokenize a string after loading an entire file into a char[] with fread.
For some reason it does not always work, and valgrind complains even about this very small sample program.
Given an input file like test.txt:
first
second
And the following program
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/stat.h>
//returns the filesize in bytes
size_t fsize(const char* fname){
struct stat st ;
stat(fname,&st);
return st.st_size;
}
int main(int argc, char *argv[]){
FILE *fp = NULL;
if(NULL==(fp=fopen(argv[1],"r"))){
fprintf(stderr,"\t-> Error reading file:%s\n",argv[1]);
return 0;
}
char buffer[fsize(argv[1])];
fread(buffer,sizeof(char),fsize(argv[1]),fp);
char *str = strtok(buffer," \t\n");
while(NULL!=str){
fprintf(stderr,"token is:%s with strlen:%lu\n",str,strlen(str));
str = strtok(NULL," \t\n");
}
return 0;
}
compiling like
gcc test.c -std=c99 -ggdb
running like
./a.out test.txt
thanks

Your buffer size should be filesize + 1. The +1 is for the terminating null character.
size_t filesize = fsize(argv[1]);
char buffer[filesize + 1];
Also, fread does not put a \0 at the end of the data it reads, so you have to do it yourself, ideally at the position fread reports:
size_t nread = fread(buffer,sizeof(char),filesize,fp);
buffer[nread] = 0;

From this site:
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
int main(int argc, char* argv[])
{
std::string str = "The quick brown fox";
// construct a stream from the string
std::istringstream stream(str);
// use stream iterators to copy the stream to the vector
// as whitespace separated strings
std::istream_iterator<std::string> it(stream), end;
std::vector<std::string> results(it, end);
// results = ["The", "quick", "brown", "fox"]
}
So much easier than dealing with those nasty C strings that keep banging you on the head.
And you know what's great about using higher-level facilities? The code takes up less screen real estate and is easier to understand.

buffer is not null-terminated. You need to make it one byte larger than the size of the file, and you need to set the last byte to be \0.

Your buffer must be filesize + 1 and you will also need to set the terminating 0:
int size = fsize(argv[1]);
char buffer[size + 1];
buffer[size] ='\0';
Also, you should probably allocate the buffer on the heap instead of the stack...
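Putting the suggestions above together, here is a minimal sketch of a heap-allocated, null-terminated buffer. The helper names read_terminated and tokenize are made up for illustration, and the sketch assumes the file is small enough to fit in memory. It terminates the buffer at the count fread actually returns rather than at the size reported beforehand.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file into a heap buffer that is one
// byte larger than the file and ends with '\0'. Returns false on failure.
bool read_terminated(const char* path, std::vector<char>& buffer) {
    std::FILE* fp = std::fopen(path, "rb");
    if (fp == nullptr) return false;
    std::fseek(fp, 0, SEEK_END);
    long size = std::ftell(fp);
    std::rewind(fp);
    if (size < 0) {
        std::fclose(fp);
        return false;
    }
    // One extra byte for the terminator; every byte starts out as '\0'.
    buffer.assign(static_cast<std::size_t>(size) + 1, '\0');
    std::size_t got = std::fread(buffer.data(), 1, static_cast<std::size_t>(size), fp);
    std::fclose(fp);
    buffer[got] = '\0';  // terminate right after the bytes actually read
    return true;
}

// Hypothetical helper: splits the terminated buffer in place with strtok.
// strtok overwrites delimiters in the buffer, so the buffer must be writable.
std::vector<std::string> tokenize(std::vector<char>& buffer) {
    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buffer.data(), " \t\n"); tok != nullptr;
         tok = std::strtok(nullptr, " \t\n")) {
        tokens.push_back(tok);
    }
    return tokens;
}
```

Because strtok now always finds a terminator inside the buffer, it never reads past the end, which is the kind of out-of-bounds read valgrind was reporting.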

Your buffer is too small. Try this:
int fileSize = fsize(argv[1]);
char buffer[fileSize + 1];
buffer[fileSize] = 0;
right before your call to fread.

Related

Copy char* to another char[]

I've a simple question about string/char. I've tried to implement a basic system like this;
#include <stdio.h>
#include <string.h>
int main()
{
//I'll use 'char*' for socket receive buffer!
const char* input = "This is a test!";
char n[4];
strncpy(n, input, 4);
printf("%s\n%i\n", n, strlen(n));
return 0;
}
And I got this output:
Thisd0#
7
What's wrong? This is a simple as a for/while loop (IDK).
You still need to put a null terminator (\0) at the end; strncpy does not add one when the source is at least as long as the count.
char n[5] = { '\0' }; // Initializes the array to all \0
strncpy(n, input, 4);
Your n array needs to be 5 bytes big (4 characters + the null terminator). You're seeing garbage after "This" because printf and strlen keep reading until they happen to find a \0.
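As a small sketch of the same idea, the copy and the termination can be wrapped in one function so the terminator is never forgotten. The name copy_truncated is hypothetical, not a standard function.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most dst_size - 1 characters of src into dst
// and always writes a terminating '\0', unlike a bare strncpy.
void copy_truncated(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return;             // no room even for the terminator
    std::strncpy(dst, src, dst_size - 1);  // may leave dst unterminated
    dst[dst_size - 1] = '\0';              // so terminate explicitly
}
```

With char n[5], calling copy_truncated(n, sizeof n, input) leaves "This" in n, and strlen(n) is 4.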

fread storing random characters in buffer

I'm simply trying to read a file using fread and output the contents. It's partially working. It outputs everything correctly but it ends with a bunch of random characters.
#include <iostream>
using namespace std;
void ReadFile(char* filename,char*& buffer)
{
FILE *file = fopen(filename,"rb");
fseek(file,0,SEEK_END);
int size = ftell(file);
rewind(file);
buffer = new char[size];
memset(buffer,0,size);
int r = fread(buffer,1,size,file);
cout << buffer;
fclose(file);
}
int main()
{
char* buffer;
ReadFile("test.txt",buffer);
cin.get();
}
Let's say 'size' is 50 in this instance. for some reason, the size of buffer ends up 55 or 56 after fread is called. I emptied the buffer before using it and tried outputting it, everything is normal (it's empty). Right after the call to fread the buffer somehow gets bigger and is filled with random characters. I've opened the text file in a hex editor to ensure there isn't anything I'm not seeing but there isn't. The file is 50 bytes. fread returns the amount of bytes read, in this case returned to 'r', 'r' is what it should be. so where the mother eff are these bytes coming from?
simplified: fread returns correct amount of bytes read but the buffer somehow makes itself bigger after fread is called, then fills it with random characters. why?
I can't for the life of me figure out how this is happening.
Also, before anyone gives me an easy fix, I already know I could just do buffer[r] = '\0' and not have it output anymore random characters but I'd much rather know WHY this is happening.
cout's << operator on char* expects C strings, so you need to null-terminate your buffer:
int size = ftell(file)+1; // Leave space for null terminator
...
int r = fread(buffer,1,size-1,file);
buffer[r] = '\0';
cout << buffer;
The extra characters that you see are whatever happens to be in memory after the end of your buffer. operator << does not know where the data ends, so it continues printing until it finds the first '\0' byte.
You probably just forgot about null terminating the buffer. Instead, use cout.write and supply the length of the buffer:
Adding a bit of error handling (not enough, but a start), missing includes and using statements: http://coliru.stacked-crooked.com/view?id=8bc4f3b7111554c705de96450d806104-f674c1a6d04c632b71a62362c0ccfc51
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>
#include <cstring>
using namespace std;
void ReadFile(const char* filename,char*& buffer)
{
FILE *file = fopen(filename,"rb");
if (!file)
return;
fseek(file,0,SEEK_END);
int size = ftell(file);
rewind(file);
buffer = new char[size];
memset(buffer,0,size);
int r = fread(buffer,1,size,file);
cout.write(buffer, r);
fclose(file);
}
int main()
{
char* buffer;
ReadFile("test.txt",buffer);
cin.get();
}
cout prints a char* as a C string, so it keeps going until it finds a null character. That means the buffer needs a null terminator.
But appending a null terminator is not always a good solution. If your data is binary, it may itself contain zero bytes, and cout will stop at the first one and print only part of the data. That's why it is always safe to loop over the known length of the data instead (here r, the value returned by fread; strlen would have the same problem because it also stops at the first zero byte):
for (int i = 0; i < r; i++)
printf("%c", buffer[i]);
// Or write to a FILE *fp: for (int i = 0; i < r; i++) fprintf(fp, "%c", buffer[i]); Another good approach is to use fwrite.

Converting char to wide char

I am trying to convert a sequence of multibyte characters to the corresponding sequence of wide characters using the mbstowcs_s function, but I keep running into the following heap corruption problem. Can anyone tell me how to fix it?
Here is some sample code. When debugging, it is always the line delete wc_name that triggers the problem, although I know that line is probably not the real cause.
#include <Windows.h>
#include <iostream>
#include <string>
int main (int argc, char *argv[]) {
size_t returnValue; // The number of characters converted.
const size_t sizeInWords = 50; // The size of the wcstr buffer in words
const char* c_name = "nanana"; // The address of a sequence of characters
wchar_t *wc_name = new wchar_t(50);
errno_t err = mbstowcs_s(&returnValue, wc_name, sizeInWords,
c_name, strlen(c_name) );
wcout << wc_name << endl;
delete wc_name;
return 0;
}
wchar_t *wc_name = new wchar_t(50); should be wchar_t *wc_name = new wchar_t[50]; to allocate an array. And corresponding delete wc_name should be delete[] wc_name;. BTW, if you know the size of the array at compile time itself, there is no need for dynamic memory allocation. You can simply do wchar_t wc_name[50];.
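If the length is not known at compile time, a container avoids the new[]/delete[] pairing entirely. The sketch below is an assumption-laden alternative, not the asker's code: it uses the portable std::mbstowcs rather than the Microsoft-specific mbstowcs_s, relies on the current C locale for the conversion, and the name widen is made up.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: converts a multibyte string to a wide string.
// A wide string never needs more elements than the narrow one has bytes,
// so strlen(narrow) + 1 elements are always enough.
std::wstring widen(const char* narrow) {
    std::vector<wchar_t> wide(std::strlen(narrow) + 1);
    std::size_t converted = std::mbstowcs(wide.data(), narrow, wide.size());
    if (converted == static_cast<std::size_t>(-1)) {
        return std::wstring();  // invalid multibyte sequence
    }
    return std::wstring(wide.data(), converted);
}
```

The vector releases its storage automatically, so there is no delete to get wrong.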

How to convert string to char array in C++?

I would like to convert string to char array but not char*. I know how to convert string to char* (by using malloc or the way I posted it in my code) - but that's not what I want. I simply want to convert string to char[size] array. Is it possible?
#include <iostream>
#include <string>
#include <stdio.h>
using namespace std;
int main()
{
// char to string
char tab[4];
tab[0] = 'c';
tab[1] = 'a';
tab[2] = 't';
tab[3] = '\0';
string tmp(tab);
cout << tmp << "\n";
// string to char* - but thats not what I want
char *c = const_cast<char*>(tmp.c_str());
cout << c << "\n";
//string to char
char tab2[1024];
// ?
return 0;
}
Simplest way I can think of doing it is:
string temp = "cat";
char tab2[1024];
strcpy(tab2, temp.c_str());
For safety, you might prefer:
string temp = "cat";
char tab2[1024];
strncpy(tab2, temp.c_str(), sizeof(tab2));
tab2[sizeof(tab2) - 1] = 0;
or could be in this fashion:
string temp = "cat";
char * tab2 = new char [temp.length()+1];
strcpy (tab2, temp.c_str());
OK, I am surprised that no one has given a simple answer yet, so here is mine. There are two cases.
If a constant char array is good enough for you, go with:
const char *array = tmp.c_str();
If you need to modify the characters, so that a constant pointer is not OK, go with:
char *array = &tmp[0];
Both of these are plain assignments that point into the string's own storage (valid only while tmp is alive and not resized), and most of the time that is all you need. If you really need a separate copy, follow the other answers.
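cstr[str.copy(cstr, str.length())] = '\0'; // cstr must hold at least str.length() + 1 chars
cstr[str.copy(cstr, sizeof(cstr)-1)] = '\0'; // safe when cstr is an array: truncates to fit
std::string::copy never writes a null terminator (asking for length()+1 characters still copies only length() of them), so it has to be added by hand as above. It's better practice to avoid C functions in C++, so std::string::copy should be the choice instead of strcpy.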
The easiest way to do it would be this:
std::string myWord = "myWord";
char myArray[myWord.size()+1]; // +1 for the null terminator; a variable-length array is a compiler extension in C++ (e.g. GCC), not standard
strcpy(myArray, myWord.c_str());
Just copy the string into the array with strcpy.
Try it this way; it should work.
string line="hello world";
char * data = new char[line.size() + 1];
copy(line.begin(), line.end(), data);
data[line.size()] = '\0';
Try strcpy(), but as Fred said, this is C++, not C
You could use strcpy(), like so:
strcpy(tab2, tmp.c_str());
Watch out for buffer overflow.
If you don't know the size of the string beforehand, you can dynamically allocate an array:
auto tab2 = std::make_unique<char[]>(temp.size() + 1);
std::strcpy(tab2.get(), temp.c_str());
If you're using C++11 or above, I'd suggest using std::snprintf over std::strcpy or std::strncpy because of its safety (i.e., you determine how many characters can be written to your buffer) and because it null-terminates the string for you (so you don't have to worry about it). It would be like this:
#include <string>
#include <cstdio>
std::string tmp = "cat";
char tab2[1024];
std::snprintf(tab2, sizeof(tab2), "%s", tmp.c_str());
In C++17, you have this alternative:
#include <string>
#include <cstdio>
#include <iterator>
std::string tmp = "cat";
char tab2[1024];
std::snprintf(tab2, std::size(tab2), "%s", tmp.c_str());
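Well, I know this may be rather simple and naive, but I think it should work:
string n;
cin>> n;
char b[200];
size_t i = 0;
for (; i < n.size() && i < sizeof(b) - 1; i++) // n.size(), not sizeof(n): sizeof gives the size of the string object, not its length
{
b[i] = n[i];
cout<< b[i]<< " ";
}
b[i] = '\0';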

Access violation writing location when working with pointers to char

I am writing a very simple program that removes duplicate chars from a string. I ran it in Visual Studio and got this error:
Unhandled exception at 0x00d110d9 in inteviews.exe: 0xC0000005: Access violation writing location 0x00d27830.
I really don't see what the problem is. current cell gets the value of the next cell.
void remove(char *str, char a) {
while (*str != '\0') {
if (*(str+1) == a) {
remove(str + 1, a);
}
*str = *(str +1 );//HERE I GET THE ERROR
++str;
}
}
int _tmain(int argc, _TCHAR* argv[])
{
char *str = "abcad";
while (*str != '\0') {
remove(str,*str);
str++;
}
std::cout << str << std::endl;
return 0;
}
EDIT:
I already tried to change it to char str[] = "abcad" but I still get the same error.
You're attempting to modify a string literal. You can't do that.
char *str = "abcad";
That's a string literal. It's created in read-only memory therefore attempting to write to it is an access violation.
One problem is that you created a read-only string literal and attempted to modify it:
char *str = "abcad"; // String literals are read-only!
You could use a char array instead:
char str[] = "abcad";
There are all sorts of problems with your program. I began by trying to write them all down, but I feel that the code is irredeemable. It has indexing errors, parameter-passing errors, dubious recursion and so on.
The other answers that point out the error of trying to modify a read-only literal are correct. That is the cause of the error in the code you posted.
The main reason for your troubles, in my view, is that the code is harder to write when you only have a single buffer. You have tied yourself in knots trying to get around this limitation in your design, but with a second buffer to work with, the code is trivial.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
const char *input = "abcad";
char *output = malloc(strlen(input)+1);
const char *in = input;
char *out = output;
while (*in)
{
if (*in != input[0])
{
*out = *in;
out++;
}
in++;
}
*out = '\0';
printf("%s\n", output);
free(output);
return 0;
}
If you want to get really clever you can in fact manage happily with just a single buffer, so long as you keep two distinct pointers for iteration.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char str[] = "abcad";
char compare = str[0];
char *in = str;
char *out = str;
while (*in)
{
if (*in != compare)
{
*out = *in;
out++;
}
in++;
}
*out = '\0';
printf("%s\n", str);
return 0;
}
Note that we had to take a copy of first character in the buffer, the character being removed, since that may be modified by the iteration.
So now you are back where you started, with a single buffer. But now the code works and is easy to understand.
My answer is written in C, as per your tag, although your code is actually C++.
Since a string literal is stored in read-only memory, attempting to write to it is an access violation. What you can do is strcpy(dst, src) it into a character array and modify that copy.
#include <string.h>
int _tmain(int argc, _TCHAR* argv[])
{
const char *str = "abcad";
char str2[10];
strcpy(str2, str);
char *p = str2; // an array name cannot be incremented, so walk the copy with a pointer
while (*p != '\0') {
remove(p, *p);
p++;
}
}