c++ cout uncasted memory (void)

This is the scenario:
// I have created a buffer
void *buffer = operator new(100);
/* later some data from a different buffer is put into the buffer at this pointer
by a function in an external header so I don't know what it's putting in there */
cout << buffer;
I want to print out the data that was put into the buffer at this pointer to see what went in. I would like to just print it out as raw ASCII. I know there will be some non-printable characters in there, but I also know some legible text was pushed there.
From what I have read on the Internet, cout can't print untyped data such as void, as opposed to an int or char. However, the compiler won't let me cast it on the fly using (char), for example. Should I create a separate variable that casts the value at the pointer and then cout that variable, or is there a way I can do this directly to save on another variable?

Do something like:
// C++11
std::array<char,100> buf;
// use std::vector<char> for a large or dynamic buffer size
// buf.data() will return a raw pointer suitable for functions
// expecting a void* or char*
// buf.size() returns the size of the buffer
for (char c : buf)
    // isprint requires a value representable as unsigned char
    std::cout << (isprint(static_cast<unsigned char>(c)) ? c : '.');
// C++98
std::vector<char> buf(100);
// The expression `buf.empty() ? NULL : &buf[0]`
// evaluates to a pointer suitable for functions expecting void* or char*
// The following struct needs to have external linkage
struct print_transform {
char operator() (char c) const { return isprint(static_cast<unsigned char>(c)) ? c : '.'; }
};
std::transform(buf.begin(), buf.end(),
std::ostream_iterator<char>(std::cout, ""),
print_transform());

Do this:
char* buffer = new char[100];
std::cout << buffer;
// at some point
delete[] buffer;
You only need void* in certain circumstances, mostly for interop with C interfaces. This is definitely not one of them, and void* essentially loses all type information.

You need to cast it to char*: reinterpret_cast<char*>(buffer). The problem is that void* can point to anything, so only the pointer value is printed; when you cast it to char*, the contents of the memory are interpreted as a C-style string.
Note: use reinterpret_cast<> instead of the C-style (char *) to make your intent clear and avoid subtle, hard-to-find bugs later.
Note: of course you might get a segfault instead: if the data is not actually a zero-terminated C-style string, memory beyond the buffer might be accessed.
Update: You could allocate the memory as a char* buffer to begin with, and that would solve your problem too: you could still call your 3rd party function (char* is implicitly convertible to void*, which I presume is the 3rd party function's parameter type) and you would not need to do any casting at all. Your best bet is to zero out the memory and make sure the 3rd party function copies no more than 99*sizeof(char) bytes into your buffer, to preserve the terminating '\0' of the C-style string.
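If the number of bytes written is known, the buffer can also be printed without relying on a terminating '\0' at all. The following is a minimal sketch under that assumption; printable is a hypothetical helper name, not something from the question or from any library.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the first `size` bytes at `data` as text,
// replacing every non-printable byte with '.'.
std::string printable(const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    // static_cast is sufficient to go from const void* to a byte pointer.
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::string out;
    out.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        // std::isprint expects a value representable as unsigned char.
        out += std::isprint(bytes[i]) ? static_cast<char>(bytes[i]) : '.';
    }
    return out;
}
```

With the buffer from the question this could be called as std::cout << printable(buffer, 100); because the length is passed explicitly, the loop never reads past the allocation.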

If you want to go byte by byte you could use an unsigned char and iterate over it.
unsigned char* currByte = new unsigned char[100];
for(int i = 0; i < 100; ++i)
{
printf("| %02X |", currByte[i]);
}
It's not a very modern (or even very "C++") answer but it will print it as a hex value for you.
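For comparison, roughly the same hex dump can be written with iostreams instead of printf. This is only a sketch; to_hex is a hypothetical helper name, and the separator (a single space) differs from the "| .. |" layout above.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats `size` bytes at `data` as two-digit
// uppercase hex values separated by single spaces.
std::string to_hex(const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');
    for (std::size_t i = 0; i < size; ++i) {
        if (i != 0) out << ' ';
        // Convert to int first: streaming an unsigned char prints a character.
        out << std::setw(2) << static_cast<int>(bytes[i]);
    }
    return out.str();
}
```

The cast to int matters: without it, operator<< would treat each byte as a character rather than a number.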

Related

how character pointer could be used to point a string in c++?

First of all, I am a beginner in C++. I was trying to learn about type casting in C++ with strings and character pointers. Is it possible to point to a string with a character pointer?
int main() {
string data="LetsTry";
cout<<(&data)<<"\n";
cout<<data<<"\n"<<"size "<<sizeof(data)<<"\n";
//char *ptr = static_cast<char*>(data);
//char *ptr=(char*)data;
char *ptr = reinterpret_cast<char*>(&data);
cout<<(ptr)<<"\n";
cout<<*ptr;
}
The above code yields outcome as below:
0x7ffea4a06150
LetsTry
size 32
`a���
`
My understanding is that ptr should output the address 0x7ffea4a06150.
Historically, in the C language, strings were just memory areas filled with characters. Consequently, when a string was passed to a function, it was passed as a pointer to its very first character, of type char * for mutable strings, or char const * if the function had no intent to modify the string's contents. Such strings were delimited with a zero character ((char)0, a.k.a. '\0') at the end, so for a string of length 3 you had to allocate at least four bytes of memory (three characters of the string itself plus the zero terminator); and if you only had a pointer to a string's start, to know the size of the string you had to iterate over it to find how far away the zero character was (the standard function strlen did this). Some standard functions accepted an extra parameter for the string size if you knew it in advance (those starting with strn or, more primitive and efficient, those starting with mem); others did not. To concatenate two strings you first had to allocate a buffer large enough to contain the result, and so on.
The standard functions that process char pointers can still be found in the C++ standard library, under the <cstring> header: https://en.cppreference.com/w/cpp/header/cstring, and std::string has the methods c_str() and data(), which return char pointers to its contents, should you need them.
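The zero-terminator convention described above can be sketched in a few lines. The names c_string_length and length_via_c_str are hypothetical and only illustrate what strlen does when it is handed the pointer returned by c_str().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical re-implementation of what std::strlen does: walk the
// characters until the zero terminator is found.
std::size_t c_string_length(const char* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// c_str() exposes the std::string contents as such a zero-terminated array.
std::size_t length_via_c_str(const std::string& s) {
    return c_string_length(s.c_str());
}
```

For a string without embedded zero characters, length_via_c_str(data) gives the same value as data.size(); the difference is that the C-style version has to scan the whole string to find out.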
When you write a program in C++, its main function has the header of int main(int argc, char *argv[]), where argv is the array of char pointers that contains any command-line arguments your program was run with.
Ineffective as it is, this scheme could still be regarded as an advantage over strings of limited capacity or plain fixed-size character arrays, for instance in mid-nineties, when Borland introduced the PChar type in Turbo Pascal and added a unit that exported Pascal implementations of functions from C's string.h.
std::string and const char* are different types. reinterpret_cast<char*>(&data) tells the compiler to treat the address of the std::string object as a char*, so cout prints the bytes of the object's internal representation, which is not what we want in this case.
So, assuming we have types A and B:
A a;
B b;
the following are conversions:
a = (A)b; // C style
// and
a = A(b);
// and
a = static_cast<A>(b); //c++ style
the following are bit reinterpretations:
a = *(A*)&b; //c style
// and
a = *reinterpret_cast<A*>(&b); //c++ style
Finally, this should work:
int main() {
string data = "LetsTry";
const char *ptr = data.c_str();
cout<< ptr << "\n";
}
Bit reinterpretation is sometimes used, for example when manipulating the bits of a floating-point number, but there are rules to follow, such as the strict aliasing rule (see "What is the strict aliasing rule?").
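For the floating-point case, one way to look at the bits while staying within the aliasing rules is to copy the bytes with std::memcpy rather than dereferencing a cast pointer. A minimal sketch, assuming a 32-bit IEEE 754 float; float_bits is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Returns the bit pattern of a single-precision float.
// std::memcpy copies the bytes without violating the strict aliasing rule,
// unlike *reinterpret_cast<std::uint32_t*>(&f).
std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    // Assumption of this sketch: the platform uses IEEE 754 floats.
    assert(std::numeric_limits<float>::is_iec559);
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers typically turn this memcpy into a plain register move, so the safe version usually costs nothing extra.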
Also note that cout << ptr << "\n"; is a special case: feeding a pointer to std::cout usually outputs the address the pointer holds, but std::cout treats char* specially and outputs the contents of the char array instead.
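The special treatment of char pointers can be seen by streaming the same pointer twice, once as it is and once cast to const void*. A small sketch; as_text and as_address are hypothetical helper names, and a std::ostringstream stands in for std::cout so the result can be inspected.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Streams `p` as text: operator<< has an overload for const char* that
// prints characters up to the zero terminator.
std::string as_text(const char* p) {
    assert(p != nullptr);
    std::ostringstream out;
    out << p;
    return out.str();
}

// Streams `p` as an address: after the cast to const void*, the generic
// pointer overload is selected and the pointer value itself is printed.
std::string as_address(const char* p) {
    std::ostringstream out;
    out << static_cast<const void*>(p);
    return out.str();
}
```

So when the address of the character data is what you want to see, cast the pointer to const void* before printing it.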
In C++, string is a class, and what you are doing is creating a string object. So, to use it as a char * you need to convert it using c_str().
You can refer below code:
std::string data = "LetsTry";
// declaring character array
char * cstr = new char [data.length()+1];
// copying the contents of the
// string to char array
std::strcpy (cstr, data.c_str());
Now you can use the char * to point to a copy of your data. Remember to delete[] cstr when you are done with it.
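If a writable copy is needed, a std::vector<char> can play the role of the new[]'d array and releases its memory automatically. This is a sketch of that alternative, not the approach from the answer above; mutable_copy is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns a writable, zero-terminated copy of `s`.
// The vector frees its memory automatically, so no delete[] is needed.
std::vector<char> mutable_copy(const std::string& s) {
    std::vector<char> copy(s.begin(), s.end());
    copy.push_back('\0');
    assert(copy.size() == s.size() + 1);
    return copy;
}
```

copy.data() (or &copy[0] before C++11) then yields a char * that can be modified freely without touching the original string.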

String to const char conversion using c_str() or toCharArray()?

I want to know more about programming and after a bit of googling I found how to convert a string to a const char.
String text1;
What I do not understand is why c_str() works,
const char *text2 = text1.c_str();
contrary to toCharArray()?
const char *text2 = text1.toCharArray();
or
const char text2 = text1.toCharArray();
The latter is more logical to me as I want to convert a string to a char, and then turn it into a const char. But that doesn't work because one is a string, the other is a char. The former, as I understand, converts the string to a C-type string and then turns it into a const char. Here, the string suddenly isn't an issue anymore oO
.
a) Why does it need a C-type string conversion and why does it work only then?
b) Why is the pointer needed?
c) Why does a simple toCharArray() not work?
.
Or do I do something terribly wrong?
Thanks heaps.
I am using PlatformIO with Arduino platform.
If you need to modify the returned c-style string in any way, or have it persist after you modify the original String, you should use toCharArray.
If you only need a null-terminated c-style string to pass as a read-only parameter to a function, use c_str.
Arduino reference for String.toCharArray()
Arduino reference for String.c_str()
The interface (and implementation) of toCharArray is shown below, from source
void toCharArray(char *buf, unsigned int bufsize, unsigned int index=0) const
{ getBytes((unsigned char *)buf, bufsize, index); }
So your first issue is that you're trying to use it incorrectly. toCharArray will COPY the underlying characters of your String into a buffer that you provide. This must be extra space that you have allocated, either in a buffer on the stack, or in some other writable area of memory. You would do it like this.
String str = "I am a string!";
char buf[5];
str.toCharArray(buf, 5);
// buf is now "I am\0"
// or you can start at a later index, here index 5
str.toCharArray(buf, 5, 5);
// buf is now "a st\0"
// we can also change characters in the buffer
buf[1] = 'X';
// buf is now "aXst\0"
// modifying the original String does not invalidate the buffer
str = "Je suis une chaine!";
// buf is still "aXst\0"
This allows you to copy a string partially, or at a later index, or anything you want. Most importantly, this array you copy into is mutable. We can change it, and since it's a copy, it doesn't affect the original String we copied it from. This flexibility comes with a cost. First, we have to have a large enough buffer, which may not be known at compile time, and takes up memory. Second, that copying takes time to do.
But what if we're calling a function that just wants to read a c-style string as input? It doesn't need to modify it at all?
That's where c_str() comes in. The String object has an underlying c-string type array (yes, null terminator and all). c_str() simply returns a const char* to this array. We make it const so that we don't accidentally change it. An object's underlying data should not be changed by random functions outside of its control.
This is the ENTIRE code for c_str():
const char* c_str() const { return buffer; }
You already know how to use it, but to illustrate a difference:
String str = "I am another string!";
const char* c = str.c_str();
// c[1] = 'X'; // error, cannot modify a const object
// modifying the original string may reallocate the underlying buffer
str = "Je suis une autre chaine!";
// c may now point to invalid memory, so it must not be dereferenced
Since c_str() simply returns the underlying data pointer, it's fast. But we don't want other functions to be allowed to modify this data, so it's const.
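The Arduino String class is not available in standard C++, but the copying behaviour of toCharArray can be approximated with std::string for experimenting on a desktop. A rough analogue, not the Arduino implementation; to_char_array is a hypothetical free function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical std::string analogue of Arduino's String::toCharArray:
// copies at most bufsize - 1 characters of `s`, starting at `index`,
// into `buf` and always zero-terminates the result.
void to_char_array(const std::string& s, char* buf, std::size_t bufsize,
                   std::size_t index = 0) {
    assert(buf != nullptr);
    if (bufsize == 0) return;  // no room even for the terminator
    std::size_t copied = 0;
    if (index < s.size()) {
        // std::string::copy does not append a terminator itself.
        copied = s.copy(buf, bufsize - 1, index);
    }
    buf[copied] = '\0';
}
```

As with toCharArray, the caller owns the destination buffer, so later changes to the source string leave the copy untouched.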

std::string.c_str() has different value than std::string?

I have been working with C++ strings and trying to load char * strings into std::string by using C functions such as strcpy(). Since strcpy() takes char * as a parameter, I have to cast it which goes something like this:
std::string destination;
unsigned char *source;
strcpy((char*)destination.c_str(), (char*)source);
The code works fine and when I run the program in a debugger, the value of *source is stored in destination, but for some odd reason it won't print out with the statement
std::cout << destination;
I noticed that if I use
std::cout << destination.c_str();
The value prints out correctly and all is well. Why does this happen? Is there a better method of copying an unsigned char* or char* into a std::string (stringstreams?) This seems to only happen when I specify the string as foo.c_str() in a copying operation.
Edit: To answer the question "why would you do this?", I am using strcpy() as a plain example. There are other times that it's more complex than assignment. For example, having to copy only X amount of string A into string B using strncpy() or passing a std::string to a function from a C library that takes a char * as a parameter for a buffer.
Here's what you want
std::string destination = source;
What you're doing is wrong on so many levels... you're writing over the inner representation of a std::string... I mean... not cool man... it's much more complex than that, arrays being resized, read-only memory... the works.
This is not a good idea at all for two reasons:
destination.c_str() returns a pointer to const; casting away its constness and writing through it is undefined behavior.
You haven't set the size of the string, meaning that it won't even necessarily have a large enough buffer to hold the data, which is likely to cause an access violation.
std::string has a constructor which allows it to be constructed from a const char* (an unsigned char* source, as in the question, needs a reinterpret_cast<const char*> first), so simply write:
std::string destination = source;
Well, what you are doing is undefined behavior. c_str() returns a const char * and is not meant to be written through. Why not use the defined constructor or assignment operator?
std::string defines an implicit conversion from const char* to std::string... so use that.
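Since the source pointer in the question is an unsigned char*, the conversion needs one explicit step. A minimal sketch, assuming the source data is zero-terminated; from_bytes is a hypothetical name.

```cpp
#include <cassert>
#include <string>

// Builds a std::string from a zero-terminated unsigned char buffer.
// There is no std::string constructor taking unsigned char*, so the
// pointer is reinterpreted as const char* first; reading unsigned char
// data through a char pointer is permitted.
std::string from_bytes(const unsigned char* source) {
    assert(source != nullptr);
    return std::string(reinterpret_cast<const char*>(source));
}
```

The string allocates and sizes its own storage here, so nothing is ever written through c_str().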
You decided to cast away an error as c_str() returns a const char*, i.e., it does not allow for writing to its underlying buffer. You did everything you could to get around that and it didn't work (you shouldn't be surprised at this).
c_str() returns a const char* for good reason. You have no idea if this pointer points to the string's underlying buffer. You have no idea if this pointer points to a memory block large enough to hold your new string. The library is using its interface to tell you exactly how the return value of c_str() should be used and you're ignoring that completely.
Do not do what you are doing!!!
I repeat!
DO NOT DO WHAT YOU ARE DOING!!!
That it seems to sort of work when you do some weird things is a consequence of how the string class was implemented. You are almost certainly writing in memory you shouldn't be and a bunch of other bogus stuff.
When you need to interact with a C function that writes to a buffer, there are two basic methods:
std::string read_from_sock(int sock) {
char buffer[1024] = "";
int recv = read(sock, buffer, 1024);
if (recv > 0) {
return std::string(buffer, buffer + recv);
}
return std::string();
}
Or you might try the peek method:
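std::string read_from_sock(int sock) {
    // recv(), not read(): only recv() accepts flags such as MSG_PEEK.
    // Peek at up to 1024 pending bytes without removing them from the socket.
    char probe[1024];
    int n = recv(sock, probe, sizeof probe, MSG_PEEK);
    if (n > 0) {
        std::vector<char> buf(n);
        n = recv(sock, &buf[0], buf.size(), 0);
        if (n > 0) {
            return std::string(buf.begin(), buf.begin() + n);
        }
    }
    return std::string();
}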
Of course, these are not very robust versions...but they illustrate the point.
First, you should note that the value returned by c_str is a const char* and must not be modified. In fact, it does not even have to point to the internal buffer of the string.
In response to your edit:
having to copy only X amount of string A into string B using strncpy()
If string A is a char array, and string B is std::string, and strlen(A) >= X, then you can do this:
B.assign(A, A + X);
passing a std::string to a function from a C library that takes a char * as a parameter for a buffer
If the parameter is actually const char *, you can use c_str() for that. But if it is just plain char *, and you are using a C++11 compliant compiler, then you can do the following:
c_function(&B[0]);
However, you need to ensure that there is room in the string for the data (the same as if you were using a plain C string), which you can do with a call to the resize() function. If the function writes an unspecified number of characters to the string as a null-terminated C string, then you will probably want to truncate the string afterward, like this:
B.resize(B.find('\0'));
The reason you can safely do this in a C++11 compiler and not a C++03 compiler is that in C++03, strings were not guaranteed by the standard to be contiguous, but in C++11, they are. If you want the guarantee in C++03, then you can use std::vector<char> instead.
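Put together, the resize, call, truncate sequence might look like the sketch below. c_function here is a made-up stand-in for the C library call (it takes an explicit capacity, which the real function may not), and call_c_function is a hypothetical wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Stand-in for a C library function that writes a zero-terminated
// string into a caller-supplied buffer (hypothetical, for illustration).
void c_function(char* out, std::size_t capacity) {
    std::snprintf(out, capacity, "%s", "hello from C");
}

// Lets c_function write directly into a std::string (C++11 or later).
std::string call_c_function(std::size_t capacity) {
    assert(capacity > 0);
    std::string b(capacity, '\0');  // make room, filled with '\0'
    c_function(&b[0], b.size());
    // Cut the string at the first terminator; keep it whole if none exists.
    const std::string::size_type end = b.find('\0');
    if (end != std::string::npos) b.resize(end);
    return b;
}
```

Checking the result of find() before calling resize() avoids passing std::string::npos to resize() when the function filled the whole buffer without writing a terminator.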

Proper Way To Initialize Unsigned Char*

What is the proper way to initialize unsigned char*? I am currently doing this:
unsigned char* tempBuffer;
tempBuffer = "";
Or should I be using memset(tempBuffer, 0, sizeof(tempBuffer)); ?
To "properly" initialize a pointer (unsigned char * as in your example), you need to do just a simple
unsigned char *tempBuffer = NULL;
If you want to initialize an array of unsigned chars, you can do either of following things:
unsigned char *tempBuffer = new unsigned char[1024]();
// and do not forget to delete it later
delete[] tempBuffer;
or
unsigned char tempBuffer[1024] = {};
I would also recommend to take a look at std::vector<unsigned char>, which you can initialize like this:
std::vector<unsigned char> tempBuffer(1024, 0);
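If a raw unsigned char* is still required, for example to hand to an API, the new[] form above can be wrapped in a smart pointer so that the delete[] is not forgotten. A minimal C++11 sketch; make_zeroed_buffer is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Allocates `size` bytes, all zero, owned by a smart pointer. The trailing
// () value-initializes the array, and unique_ptr<T[]> calls delete[] for us.
std::unique_ptr<unsigned char[]> make_zeroed_buffer(std::size_t size) {
    assert(size > 0);
    return std::unique_ptr<unsigned char[]>(new unsigned char[size]());
}
```

The raw pointer is available through get() and stays valid for as long as the unique_ptr is alive.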
The second method will leave you with a null pointer. Note that you aren't declaring any space for a buffer here, you're declaring a pointer to a buffer that must be created elsewhere. If you initialize it to "", that will make the pointer point to a static buffer with exactly one byte—the null terminator. If you want a buffer you can write characters into later, use Fred's array suggestion or something like malloc.
As it's a pointer, you either want to initialize it to NULL first like this:
unsigned char* tempBuffer = NULL;
// or, equivalently
unsigned char* tempBuffer = 0;
or assign an address of a variable, like so:
unsigned char c = 'c';
unsigned char* tempBuffer = &c;
EDIT:
If you wish to assign a string, this can be done as follows:
unsigned char myString [] = "This is my string";
unsigned char* tmpBuffer = &myString[0];
If you know the size of the buffer at compile time:
unsigned char buffer[SIZE] = {0};
For dynamically allocated buffers (buffers allocated during run-time or on the heap):
1. Prefer the new operator:
unsigned char * buffer = 0; // Pointer to a buffer, buffer not allocated.
buffer = new unsigned char [runtime_size];
2. Many solutions to "initialize" or fill with a simple value:
std::fill(buffer, buffer + runtime_size, 0); // Prefer to use STL
memset(buffer, 0, runtime_size);
for (size_t i = 0; i < runtime_size; ++i) buffer[i] = 0; // Using a loop; indexing keeps buffer pointing at the start
3. The C language side provides allocation and initialization with one call.
However, the function does not call the object's constructors:
buffer = static_cast<unsigned char *>(calloc(runtime_size, sizeof(unsigned char))); // release with free(), not delete[]
Note that this also sets all bits in the buffer to zero; you don't get a choice in the initial value.
It depends on what you want to achieve (e.g. do you ever want to modify the string). See e.g. http://c-faq.com/charstring/index.html for more details.
Note that if you declare a pointer to a string literal, it should be const; in C++ it also has to be a plain char pointer, because a string literal does not convert to unsigned char*, i.e.:
const char *tempBuffer = "";
If the plan is for it to be a buffer and you want to move it later to point to something, then initialise it to NULL until it really points somewhere to which you want to write, not an empty string.
unsigned char * tempBuffer = NULL;
std::vector< unsigned char > realBuffer( 1024 );
tempBuffer = &realBuffer[0]; // now it really points to writable memory
memcpy( tempBuffer, someStuff, someSizeThatFits );
The answer depends on what you intend to use the unsigned char for. A char is nothing but a small integer, 8 bits wide on nearly all implementations.
C happens to have some string support that fits well with char, but that doesn't limit the usage of char to strings.
The proper way to initialize a pointer depends on 1) its scope and 2) its intended use.
If the pointer is declared static, and/or declared at file scope, then ISO C/C++ guarantees that it is initialized to NULL. Programming style purists would still set it to NULL to keep their style consistent with local scope variables, but theoretically it is pointless to do so.
As for what to initialize it to... set it to NULL. Don't set it to point at "", because that will allocate a static dummy byte containing a null termination, which will become a tiny little static memory leak as soon as the pointer is assigned to something else.
One may question why you need to initialize it to anything at all in the first place. Just set it to something valid before using it. If you worry about using a pointer before giving it a valid value, you should get a proper static analyzer to find such simple bugs. Even most compilers will catch that bug and give you a warning.

malloc pointer function passing fread

Can't figure out what's wrong, I don't seem to be getting anything from fread.
port.h
#pragma once
#ifndef _PORT_
#define _PORT_
#include <string>
#ifndef UNICODE
typedef char chr;
typedef string str;
#else
typedef wchar_t chr;
typedef std::wstring str;
inline void fopen(FILE ** ptrFile, const wchar_t * _Filename,const wchar_t * _Mode)
{
_wfopen_s(ptrFile,_Filename,_Mode);
}
#endif
#endif
inside main()
File * f = new File(fname,FileOpenMode::Read);
chr *buffer;
buffer = (wchar_t*)malloc(f->_length*2);
for(int i=0;i<f->_length;i++)
{
buffer[i] = 0;
}
f->Read_Whole_File(buffer);
f->Close();
for(int i=0;i<f->_length;i++)
{
printf("%S",buffer[i]);
}
free(buffer);
inside file class
void Read_Whole_File(chr *&buffer)
{
//buffer = (char*)malloc(_length);
if(buffer == NULL)
{
_IsError = true;
return;
}
fseek(_file_pointer, 0, SEEK_SET);
int a = sizeof(chr);
fread(&buffer,_length ,sizeof(chr) , _file_pointer);
}
You're mixing pointers and references all over the place.
Your function only needs to take a pointer to the buffer:
void Read_Whole_File(char *buffer) { ... }
And you should pass that pointer as-is to fread(), don't take the address of the pointer:
size_t amount_read = fread(buffer, sizeof *buffer, _length, _file_pointer); // element size first, then count, so amount_read is the number of elements read
Also remember:
If you have a pointer ptr to some type, you can use sizeof *ptr and remove the need to repeat the type name.
If you know the length of the file already, pass it to the function so you don't need to figure it out twice.
In C, don't cast the return value of malloc().
Check for errors when doing memory allocation and I/O, things can fail.
buffer is a reference to a chr *. Yet you're reading into &buffer which is a chr ** (whatever that is). Wrong.
You don't even need to pass a reference to buffer in Read_Whole_File, just use a regular pointer.
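A stripped-down version of the read, with the buffer pointer passed straight to fread, might look like this. It is a sketch only: read_whole_file is written as a hypothetical free function taking an already opened FILE* and a known length, rather than as a member of the File class from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads up to `length` bytes from an open file into a
// vector and returns the bytes actually read.
std::vector<char> read_whole_file(std::FILE* file, std::size_t length) {
    assert(file != nullptr);
    std::vector<char> buffer(length);
    if (length == 0 || std::fseek(file, 0, SEEK_SET) != 0) {
        buffer.clear();
        return buffer;
    }
    // Pass the buffer pointer itself, not its address. With an element
    // size of 1, the return value is the number of bytes read.
    const std::size_t amount_read =
        std::fread(buffer.data(), sizeof buffer[0], buffer.size(), file);
    buffer.resize(amount_read);
    return buffer;
}
```

Returning the vector sidesteps the pointer-versus-reference question entirely, and the resize() records how much data actually arrived.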
aside from your original problem...
from your code:
typedef char chr;
chr *buffer;
buffer = (wchar_t*)malloc(f->_length*2);
for(int i=0;i<f->_length;i++)
{
buffer[i] = 0;
}
don't you think there is something wrong here ? in case you cannot spot the errors, here is the list:
chr is a char, so buffer is a char *
you are using malloc. are you coding in C or in C++ ? if it is C++, consider using new
the buffer you allocate is explicitly cast to a wchar_t * but buffer is a char *
in the malloc you are allocating a block of size length*2 when you should be using length * sizeof(wchar_t). don't make any assumptions about the size of a type (and even writing sizeof(char) is no problem, it makes the intent explicit)
the for loop goes from 0 to length, but since buffer is defined as a buffer of char, only length bytes are initialized, whereas you allocated length*2 bytes, so half your buffer is still uninitialized.
memset() has been defined to avoid this kind of for loop...
please be a little bit careful when coding !
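The points in the list above (size the block with sizeof, keep the pointer type consistent, zero the whole block with memset) can be combined as in the following sketch. make_wide_buffer is a hypothetical name, and the example keeps malloc only to mirror the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Allocates room for `length` wide characters and zeroes all of it.
// The caller releases the block with std::free.
wchar_t* make_wide_buffer(std::size_t length) {
    assert(length > 0);
    // Size the block from the element type instead of assuming 2 bytes.
    const std::size_t bytes = length * sizeof(wchar_t);
    wchar_t* buffer = static_cast<wchar_t*>(std::malloc(bytes));
    if (buffer != nullptr) {
        std::memset(buffer, 0, bytes);  // zero every byte of the block
    }
    return buffer;
}
```

In C++ code a std::vector<wchar_t>(length) would give the same zeroed storage without the manual free.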
First a couple of nits:
What's with the malloc and free? What's wrong with new and delete? You are obviously writing C++ here, so write C++.
Overloading fopen, fseek, fread in the case of wchar_t bothers me, immensely. Much better: Use templates, or define your own functions. Don't overload those names that belong to C.
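This is not a nit. The following is almost certainly not doing what you want:
fread(&buffer,_length ,sizeof(chr) , _file_pointer);
buffer here is a reference to a pointer, so &buffer is the address of the pointer variable itself, which is almost certainly only 32 or 64 bits wide. fread therefore writes over the pointer (and whatever follows it) instead of filling the memory it points to.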