Can't figure out what's wrong, I don't seem to be getting anything from fread.
port.h
#pragma once
#ifndef _PORT_
#define _PORT_
#include <string>
#ifndef UNICODE
typedef char chr;
typedef string str;
#else
typedef wchar_t chr;
typedef std::wstring str;
inline void fopen(FILE ** ptrFile, const wchar_t * _Filename,const wchar_t * _Mode)
{
_wfopen_s(ptrFile,_Filename,_Mode);
}
#endif
#endif
inside main()
File * f = new File(fname,FileOpenMode::Read);
chr *buffer;
buffer = (wchar_t*)malloc(f->_length*2);
for(int i=0;i<f->_length;i++)
{
buffer[i] = 0;
}
f->Read_Whole_File(buffer);
f->Close();
for(int i=0;i<f->_length;i++)
{
printf("%S",buffer[i]);
}
free(buffer);
inside file class
void Read_Whole_File(chr *&buffer)
{
//buffer = (char*)malloc(_length);
if(buffer == NULL)
{
_IsError = true;
return;
}
fseek(_file_pointer, 0, SEEK_SET);
int a = sizeof(chr);
fread(&buffer,_length ,sizeof(chr) , _file_pointer);
}
You're mixing pointers and references all over the place.
Your function only needs to take a pointer to the buffer:
void Read_Whole_File(char *buffer) { ... }
And you should pass that pointer as-is to fread(), don't take the address of the pointer:
size_t amount_read = fread(buffer, sizeof *buffer, _length, _file_pointer);
Also remember:
If you have a pointer ptr to some type, you can use sizeof *ptr and remove the need to repeat the type name.
If you know the length of the file already, pass it to the function so you don't need to figure it out twice.
In C, don't cast the return value of malloc().
Check for errors when doing memory allocation and I/O, things can fail.
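To make those points concrete, here is a minimal sketch of reading a whole file with error checks. The name read_whole_file and the use of std::vector<char> are my own assumptions, not part of the asker's File class; it only illustrates passing the buffer pointer itself to fread() and checking every return value.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical helper (not from the original File class): reads the whole
// file at `path` into `out`. Returns false if any step fails.
bool read_whole_file(const char* path, std::vector<char>& out) {
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return false;

    bool ok = false;
    if (std::fseek(fp, 0, SEEK_END) == 0) {
        const long size = std::ftell(fp);
        if (size >= 0 && std::fseek(fp, 0, SEEK_SET) == 0) {
            out.resize(static_cast<std::size_t>(size));
            // Pass the buffer pointer itself, never the address of the pointer.
            const std::size_t got =
                out.empty() ? 0 : std::fread(&out[0], 1, out.size(), fp);
            ok = (got == out.size());
        }
    }
    std::fclose(fp);
    return ok;
}
```

With an element size of 1, the value returned by fread() is the number of bytes actually read, so a short read is easy to detect.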
buffer is a reference to a chr *. Yet you're reading into &buffer which is a chr ** (whatever that is). Wrong.
You don't even need to pass a reference to buffer in Read_Whole_File, just use a regular pointer.
Aside from your original problem, look at this part of your code:
typedef char chr;
chr *buffer;
buffer = (wchar_t*)malloc(f->_length*2);
for(int i=0;i<f->_length;i++)
{
buffer[i] = 0;
}
Don't you think there is something wrong here? In case you cannot spot the errors, here is the list:
chr is a char, so buffer is a char *
you are using malloc. Are you coding in C or in C++? If it is C++, consider using new
the buffer you allocate is explicitly cast to a wchar_t *, but buffer is a char *
in the malloc call you allocate a block of size length*2 when you should be using length * sizeof(wchar_t). Don't make any assumptions about the size of a type (even writing sizeof(char) is no problem; it makes the intention explicit)
the for loop goes from 0 to length, but since buffer is defined as a buffer of char, only length bytes are initialized, whereas you allocated length*2 bytes, so half your buffer is still uninitialized
memset() exists to avoid this kind of for loop...
Please be a little more careful when coding!
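For illustration, a small sketch of the two usual ways to get a zeroed wide-character buffer. The function names are made up for this example; the point is that the size comes from sizeof rather than a hard-coded 2, and that the whole block is cleared.

```cpp
#include <cstdlib>
#include <cstring>
#include <vector>

// C style: the byte count is derived from the element type, and memset
// clears every byte of the allocation.
// Returns null on failure; the caller must std::free() the result.
wchar_t* alloc_zeroed_c(std::size_t count) {
    wchar_t* buf = static_cast<wchar_t*>(std::malloc(count * sizeof *buf));
    if (buf) std::memset(buf, 0, count * sizeof *buf);
    return buf;
}

// C++ style: the vector value-initializes every element to L'\0'
// and releases the memory automatically.
std::vector<wchar_t> alloc_zeroed_cpp(std::size_t count) {
    return std::vector<wchar_t>(count);
}
```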
First a couple of nits:
What's with the malloc and free? What's wrong with new and delete? You are obviously writing C++ here, so write C++.
Overloading fopen, fseek, fread in the case of wchar_t bothers me, immensely. Much better: Use templates, or define your own functions. Don't overload those names that belong to C.
This is not a nit. The following is almost certainly not doing what you want:
fread(&buffer,_length ,sizeof(chr) , _file_pointer);
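&buffer here is the address of the pointer itself (a chr **), not the address of the memory it points to, so fread writes over the pointer variable (and whatever follows it) instead of filling your buffer.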
#include <iostream>
#include <string.h>
using namespace std;
void newBuffer(char* outBuffer, size_t sz) {
outBuffer = new char[sz];
}
int main(void) {
const char* abcd = "ABCD";
char* foo;
foo = NULL;
size_t len = strlen(abcd);
cout<<"Checkpoint 1"<<endl;
newBuffer(foo, len);
cout<<"Checkpoint 2"<<endl;
cout<<"Checkpoint 2-A"<<endl;
memset(foo, '-', len);
cout<<"Checkpoint 3"<<endl;
strncpy(foo, abcd, len);
cout<<"Checkpoint 4"<<endl;
cout << foo << endl;
int hold;
cin>>hold;
return 0;
}
This program crashes between checkpoint 2-A and 3. What it tries to do is to set the char array foo to the char '-', but it fails because of some access issues. I do not understand why this happens. Thank you very much in advance!
Your newBuffer function should accept the first parameter by reference so that changes made to it inside the function are visible to the caller:
void newBuffer(char*& outBuffer, size_t sz) {
outBuffer = new char[sz];
}
As it is now, you assign the result of new char[sz] to the local variable outBuffer which is only a copy of the caller's foo variable, so when the function returns it's as if nothing ever happened (except you leaked memory).
Also you have a problem in that you are allocating the buffer to the size of the length of ABCD which is 4. That means you can hold up to 3 characters in that buffer because one is reserved for the NUL-terminator at the end. You need to add + 1 to the length somewhere (I would do it in the call to the function, not inside it, because newBuffer shouldn't be specialised for C-strings). strncpy only NUL-terminates the buffer if the source string is short enough, so in this case you are only lucky that there happens to be a 0 in memory after your buffer you allocated.
Also don't forget to delete[] foo in main after you're done with it (although it doesn't really matter for a program this size).
It fails because your newBuffer function doesn't actually work. The easiest way to fix it would be to change the declaration to void newBuffer (char *&outBuffer, size_t sz). As it's written, the address of the newly allocated memory doesn't actually get stored into main's foo because the pointer is passed by value.
You are passing the pointer by value. You would need to pass either a reference to the pointer, or the address of the pointer.
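A minimal sketch of the three variants side by side (the function names are invented for this example). Only the last two change the caller's pointer.

```cpp
#include <cstddef>

// By value: only the local copy `out` is changed; the caller's pointer
// keeps its old value.
void alloc_by_value(char* out, std::size_t n) {
    out = new char[n];
    delete[] out;  // released here only so that this sketch does not leak
}

// By reference: `out` is an alias for the caller's pointer.
void alloc_by_reference(char*& out, std::size_t n) {
    out = new char[n];
}

// By address: the caller passes &ptr and we write through it.
void alloc_by_pointer(char** out, std::size_t n) {
    *out = new char[n];
}
```

In all of the working variants the caller still owns the memory and has to delete[] it.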
That said, using the return value would be better in my view:
char* newBuffer(size_t sz) {
return new char[sz];
}
When written this way, the newBuffer function doesn't really seem worthwhile. You don't need it. You can use new directly and that would be clearer.
Of course, if you are using C++ then this is all rather pointless. You should be using string, smart pointers etc. You should not have any need to call new directly. Once you fix the bug you are talking about in this question you will come across the problem that your string is not null-terminated and that the buffer is too short to hold the string since you forgot to allocate space for the null-terminator. One of the nice things about C++ is that you can escape the horrors of string handling in C.
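As a rough sketch of what that looks like for the program in the question, assuming all it needs is a dash-filled buffer that is then overwritten with the source text (copy_with_placeholder is a made-up name):

```cpp
#include <cstring>
#include <string>

// std::string owns its storage and keeps it terminated, so there is no
// new/delete and no off-by-one for the terminator.
std::string copy_with_placeholder(const char* src) {
    std::string foo(std::strlen(src), '-');  // plays the role of memset(foo, '-', len)
    foo.assign(src);                         // plays the role of strncpy(foo, abcd, len)
    return foo;
}
```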
This is the scenario;
// I have created a buffer
void *buffer = operator new(100);
/* later some data from a different buffer is put into the buffer at this pointer
by a function in an external header so I don't know what it's putting in there */
cout << buffer;
I want to print out the data that was put into the buffer at this pointer to see what went in. I would like to just print it out as raw ASCII, I know there will be some non-printable characters in there but I also know some legible text was pushed there.
From what I have read on the Internet, cout can't print out uncast data such as a void, as opposed to an int or char. However, the compiler won't let me cast it on the fly using (char), for example. Should I create a separate variable that casts the value at the pointer and then cout that variable, or is there a way I can do this directly to save on another variable?
Do something like:
// C++11
std::array<char,100> buf;
// use std::vector<char> for a large or dynamic buffer size
// buf.data() will return a raw pointer suitable for functions
// expecting a void* or char*
// buf.size() returns the size of the buffer
for (char c : buf)
std::cout << (isprint(static_cast<unsigned char>(c)) ? c : '.'); // isprint needs a value in unsigned char range
// C++98
std::vector<char> buf(100);
// The expression `buf.empty() ? NULL : &buf[0]`
// evaluates to a pointer suitable for functions expecting void* or char*
// The following struct needs to have external linkage
struct print_transform {
char operator() (char c) { return isprint(static_cast<unsigned char>(c)) ? c : '.'; }
};
std::transform(buf.begin(), buf.end(),
std::ostream_iterator<char>(std::cout, ""),
print_transform());
Do this:
char* buffer = new char[100](); // value-initialized, so it starts out as a valid empty C string
std::cout << buffer;
// at some point
delete[] buffer;
void* you only need in certain circumstances, mostly for interop with C interfaces, but this is definitely not a circumstance requiring a void*, which essentially loses all type information.
You need to cast it to char*: reinterpret_cast<char*>(buffer). The problem is that void* represents anything, so only the pointer value is printed; when you cast it to char*, the contents of the memory are interpreted as a C-style string.
Note: use reinterpret_cast<> instead of the C-style (char *) to make your intent clear and avoid subtle-and-hard-to-find bugs later
Note: of course you might get a segfault instead, as if the data is indeed not a C-style string, memory not associated with the buffer might be accessed
Update: You could allocate the memory as a char* buffer to begin with, and that would solve your problem too: you could still call your 3rd party function (char* is implicitly convertible to void*, which I presume is the 3rd party function's parameter type) and you would not need the cast at all. Your best bet is to zero out the memory and restrict the 3rd party function to copying no more than 99 bytes into your buffer, to preserve the ending '\0' C-style string terminator.
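If you cannot guarantee a terminator, a safer route is to print a known number of bytes instead of treating the memory as a C string. A small sketch, with printable_dump being a name I made up:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the first `size` bytes at `data` as text, with every
// non-printable byte replaced by '.'. It never reads past `size`.
std::string printable_dump(const void* data, std::size_t size) {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::string out;
    out.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        out += std::isprint(bytes[i]) ? static_cast<char>(bytes[i]) : '.';
    }
    return out;
}
```

For the buffer in the question this would be used as std::cout << printable_dump(buffer, 100);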
If you want to go byte by byte you could use an unsigned char and iterate over it.
unsigned char* currByte = new unsigned char[100];
for(int i = 0; i < 100; ++i)
{
printf("| %02X |", currByte[i]);
}
It's not a very modern (or even very "C++") answer but it will print it as a hex value for you.
I've tried implementing a function like this, but unfortunately it doesn't work:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t wc[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
My main goal here is to be able to integrate normal char strings in a Unicode application. Any advice you guys can offer is greatly appreciated.
In your example, wc is a local variable which will be deallocated when the function call ends. This puts you into undefined behavior territory.
The simple fix is this:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t* wc = new wchar_t[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
Note that the calling code will then have to deallocate this memory, otherwise you will have a memory leak.
Use a std::wstring instead of a C99 variable length array. The current standard guarantees a contiguous buffer for std::basic_string. E.g.,
std::wstring wc( cSize, L'#' );
mbstowcs( &wc[0], c, cSize );
C++ does not support C99 variable length arrays, and so if you compiled your code as pure C++, it would not even compile.
With that change your function return type should also be std::wstring.
Remember to set relevant locale in main.
E.g., setlocale( LC_ALL, "" ).
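Putting those pieces together, a possible version that returns a std::wstring might look like the sketch below. The name widen is hypothetical, and I have used std::mbsrtowcs rather than mbstowcs because its null-destination form for querying the length is guaranteed by the standard. The conversion still depends on the current LC_CTYPE locale.

```cpp
#include <cwchar>
#include <stdexcept>
#include <string>

// Converts a multibyte string to std::wstring using the current locale.
std::wstring widen(const char* s) {
    std::mbstate_t state = std::mbstate_t();
    const char* src = s;
    // First pass: a null destination only counts the wide characters needed.
    const std::size_t n = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (n == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("invalid multibyte sequence");
    }
    std::wstring out(n, L'\0');
    state = std::mbstate_t();
    src = s;
    // Second pass: convert exactly n characters into the string's buffer.
    std::mbsrtowcs(&out[0], &src, n, &state);
    return out;
}
```

Because the string is sized to exactly n characters, no stray L'\0' ends up inside it.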
const char* text_char = "example of mbstowcs";
size_t length = strlen(text_char );
Example of usage "mbstowcs"
std::wstring text_wchar(length, L'#');
//#pragma warning (disable : 4996)
// Or add to the preprocessor: _CRT_SECURE_NO_WARNINGS
mbstowcs(&text_wchar[0], text_char , length);
Example of usage "mbstowcs_s"
Microsoft suggests using "mbstowcs_s" instead of "mbstowcs".
wchar_t text_wchar[30];
mbstowcs_s(&length, text_wchar, text_char, length);
You're returning the address of a local variable allocated on the stack. When your function returns, the storage for all local variables (such as wc) is deallocated and is subject to being immediately overwritten by something else.
To fix this, you can pass the size of the buffer to GetWC, but then you've got pretty much the same interface as mbstowcs itself. Or, you could allocate a new buffer inside GetWC and return a pointer to that, leaving it up to the caller to deallocate the buffer.
I did something like this. The first two zeros are there because I don't know which code page and flags this function wants from me (0 for the code page is CP_ACP, the system default ANSI code page). The general idea was to create a temporary char array and pass in the wide char array as the destination, and it works. The + 1 ensures that the null terminating character is converted too and ends up in the right place.
char tempFilePath[MAX_PATH] = "I want to convert this to wide chars";
int len = strlen(tempFilePath);
// Converts the path to wide characters
int needed = MultiByteToWideChar(0, 0, tempFilePath, len + 1, strDestPath, len + 1);
Andrew Shepherd's answer works well for me. I added a couple of fixes:
1. Remove the trailing L'\0', because it sometimes causes trouble.
2. Use mbstowcs_s.
std::wstring wtos(const std::string& value){
const size_t cSize = value.size() + 1;
std::wstring wc;
wc.resize(cSize);
size_t cSize1;
mbstowcs_s(&cSize1, &wc[0], cSize, value.c_str(), cSize);
wc.pop_back();
return wc;
}
The question has several problems, but so do some of the answers. The idea of returning a pointer to allocated memory "and leaving it up to the caller to de-allocate" is asking for trouble. As a rule the best pattern is always to allocate and de-allocate within the same function. For example, something like:
const size_t n = get_wcb_size(str) + 1; // + 1 leaves room for the terminator
wchar_t* buffer = new wchar_t[n];
mbstowcs(buffer, str, n);
...
delete[] buffer;
In general, this requires two functions, one the caller calls to find out how much memory to allocate and a second to initialize or fill the allocated memory.
Unfortunately, the basic idea of using a function to return a "new" object is problematic -- not inherently, but because of the C++ inheritance of C memory handling. Using C++ and STL's strings/wstrings/strstreams is a better solution, but I felt the memory allocation thing needed to be better addressed.
Your problem has nothing to do with encodings, it's a simple matter of understanding basic C++. You are returning a pointer to a local variable from your function, which will have gone out of scope by the time anyone can use it, thus creating undefined behaviour (i.e. a programming error).
Follow this Golden Rule: "If you are using naked char pointers, you're Doing It Wrong. (Except for when you aren't.)"
I've previously posted some code to do the conversion and communicating the input and output in C++ std::string and std::wstring objects.
auto Ascii_To_Wstring = [](int code)->std::wstring
{
if (code>255 || code<0 )
{
throw std::runtime_error("Incorrect ASCII code");
}
std::string s{ char(code) };
std::wstring w{ s.begin(),s.end() };
return w;
};
What is the proper way to initialize unsigned char*? I am currently doing this:
unsigned char* tempBuffer;
tempBuffer = "";
Or should I be using memset(tempBuffer, 0, sizeof(tempBuffer)); ?
To "properly" initialize a pointer (unsigned char * as in your example), you need to do just a simple
unsigned char *tempBuffer = NULL;
If you want to initialize an array of unsigned chars, you can do either of following things:
unsigned char *tempBuffer = new unsigned char[1024]();
// and do not forget to delete it later
delete[] tempBuffer;
or
unsigned char tempBuffer[1024] = {};
I would also recommend to take a look at std::vector<unsigned char>, which you can initialize like this:
std::vector<unsigned char> tempBuffer(1024, 0);
The second method will leave you with a null pointer. Note that you aren't declaring any space for a buffer here, you're declaring a pointer to a buffer that must be created elsewhere. If you initialize it to "", that will make the pointer point to a static buffer with exactly one byte—the null terminator. If you want a buffer you can write characters into later, use Fred's array suggestion or something like malloc.
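One more thing worth spelling out about the memset(tempBuffer, 0, sizeof(tempBuffer)) idea from the question: on a pointer, sizeof gives the size of the pointer, not of whatever it points to. A short sketch (function names are my own) showing the difference:

```cpp
#include <cstddef>
#include <cstring>

// The size has to be passed in explicitly; it cannot be recovered
// from the pointer alone.
void clear_buffer(unsigned char* buffer, std::size_t size) {
    std::memset(buffer, 0, size);
}

// sizeof applied to an array yields the size of the whole array...
std::size_t array_bytes() {
    unsigned char storage[1024];
    return sizeof(storage);
}

// ...but applied to a pointer it yields only the size of the pointer.
std::size_t pointer_bytes() {
    unsigned char* p = nullptr;
    return sizeof(p);
}
```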
As it's a pointer, you either want to initialize it to NULL first like this:
unsigned char* tempBuffer = NULL;
unsigned char* tempBuffer = 0;
or assign an address of a variable, like so:
unsigned char c = 'c';
unsigned char* tempBuffer = &c;
EDIT:
If you wish to assign a string, this can be done as follows:
unsigned char myString [] = "This is my string";
unsigned char* tmpBuffer = &myString[0];
If you know the size of the buffer at compile time:
unsigned char buffer[SIZE] = {0};
For dynamically allocated buffers (buffers allocated during run-time or on the heap):
1. Prefer the new operator:
unsigned char * buffer = 0; // Pointer to a buffer, buffer not allocated.
buffer = new unsigned char [runtime_size];
2. Many solutions to "initialize" or fill with a simple value:
std::fill(buffer, buffer + runtime_size, 0); // Prefer to use STL
memset(buffer, 0, runtime_size);
for (size_t i = 0; i < runtime_size; ++i) buffer[i] = 0; // Using a loop; index so that buffer itself is not advanced
3. The C language side provides allocation and initialization with one call.
However, the function does not call the object's constructors:
buffer = static_cast<unsigned char *>(calloc(runtime_size, sizeof(unsigned char))); // the cast is required in C++
Note that this also sets all bits in the buffer to zero; you don't get a choice in the initial value.
It depends on what you want to achieve (e.g. do you ever want to modify the string). See e.g. http://c-faq.com/charstring/index.html for more details.
Note that if you declare a pointer to a string literal, it should be const, i.e.:
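const unsigned char *tempBuffer = reinterpret_cast<const unsigned char *>(""); // C++ needs the cast; a literal is an array of const char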
If the plan is for it to be a buffer and you want to move it later to point to something, then initialise it to NULL until it really points somewhere to which you want to write, not an empty string.
unsigned char * tempBuffer = NULL;
std::vector< unsigned char > realBuffer( 1024 );
tempBuffer = &realBuffer[0]; // now it really points to writable memory
memcpy( tempBuffer, someStuff, someSizeThatFits );
The answer depends on what you intend to use the unsigned char for. A char is nothing but a small integer, which is 8 bits wide on virtually all mainstream implementations.
C happens to have some string support that fits well with char, but that doesn't limit the usage of char to strings.
The proper way to initialize a pointer depends on 1) its scope and 2) its intended use.
If the pointer is declared static, and/or declared at file scope, then ISO C/C++ guarantees that it is initialized to NULL. Programming style purists would still set it to NULL to keep their style consistent with local scope variables, but theoretically it is pointless to do so.
As for what to initialize it to... set it to NULL. Don't set it to point at "", because that will allocate a static dummy byte containing a null termination, which will become a tiny little static memory leak as soon as the pointer is assigned to something else.
One may question why you need to initialize it to anything at all in the first place. Just set it to something valid before using it. If you worry about using a pointer before giving it a valid value, you should get a proper static analyzer to find such simple bugs. Even most compilers will catch that bug and give you a warning.
I have the following data I need to add to a void buffer:
MyStruct somedata; // some struct containing ints or floats etc.
string somestring;
How do I do this?
This is my buffer allocation:
void *buffer = (void *)malloc(datasize);
How do I first add somedata to the buffer (which takes, let's say, 20 bytes), followed by the string, which is of variable size? I was thinking of reading the structs byte by byte and adding them to the buffer, but that feels stupid; there must be some easier way...?
Edit: I want this to be equivalent to fwrite( struct1 ); fwrite( struct2 ); called sequentially, but instead of writing to a file, I want to write to a void buffer.
Edit 2: Got it working, here's the code:
char *data = (char *)malloc(datasize);
unsigned int bufferoffset = 0;
for(...){
MyStruct somedata; // some POD struct containing ints or floats etc.
string somestring;
... stuff ...
// add to buffer:
memcpy(data+bufferoffset, &somedata, sizeof(MyStruct));
bufferoffset += sizeof(MyStruct);
memcpy(data+bufferoffset, somestring.c_str(), str_len);
bufferoffset += str_len;
}
Anything to fix?
memcpy(buffer, &somedata, sizeof(MyStruct));
strcpy(static_cast<char *>(buffer) + sizeof(MyStruct), somestring.c_str()); // no arithmetic on void *, so cast first
Which will copy the string as a c string.
In general you should avoid doing this for classes which have custom copy-constructors etc.
But if you have to and you know what you're doing, use memcpy
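Here is one way the whole thing could look with a growable buffer. This is a sketch under my own assumptions: MyStruct below is a stand-in for the asker's struct and must be trivially copyable, the string is stored with a length prefix so it can be read back, and the reader does no bounds checking.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's MyStruct (plain data only).
struct MyStruct {
    int id;
    float value;
};

// Appends `size` raw bytes to the end of `buf`.
void append_bytes(std::vector<char>& buf, const void* src, std::size_t size) {
    const char* p = static_cast<const char*>(src);
    buf.insert(buf.end(), p, p + size);
}

// Writes the struct, then the string length, then the string bytes.
void append_record(std::vector<char>& buf, const MyStruct& data,
                   const std::string& text) {
    const std::size_t len = text.size();
    append_bytes(buf, &data, sizeof data);
    append_bytes(buf, &len, sizeof len);
    append_bytes(buf, text.data(), len);
}

// Reads one record starting at `offset` and returns the offset just past it.
// Assumes the buffer really contains a complete record at that position.
std::size_t read_record(const std::vector<char>& buf, std::size_t offset,
                        MyStruct& data, std::string& text) {
    std::size_t len = 0;
    std::memcpy(&data, &buf[offset], sizeof data);
    offset += sizeof data;
    std::memcpy(&len, &buf[offset], sizeof len);
    offset += sizeof len;
    text.assign(buf.data() + offset, len);
    return offset + len;
}
```

The vector grows as needed, so there is no datasize to compute up front, and buf.data() can still be handed to anything that expects a void *.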
In C, I'd do something like this:
MyStruct somedata;
string somestring;
void *buffer = (void *)malloc(datasize);
memmove(buffer, &somedata, 20);
strcpy(buffer + 20, somestring);
But there's LOTS of bad smell in the first 3 lines of this C code:
MyStruct is either a typedef (why? I hate typedefs) or it should be struct MyStruct
string is either a typedef (why? I hate typedefs) or it should be struct string; also, identifiers that begin with "str" followed by a lowercase letter are reserved for future library use and should not be used by programmers
Casting the return value of malloc is redundant and may hide errors
Edit after noticing (thanks Newbie) operations on void *
char *buffer = malloc(datasize);
In C, void* and any other pointer type are assignment compatible in both directions, so there is no need to cast char * to void * when passing it to memmove() and friends.
memmove(buffer, &somedata, 20);