Getting first byte in a char* buffer - c++

I have a char* buffer and I am interested in looking at the first byte in it. What is the most optimal way to go about this?
EDIT: Based on the negative votes, I should explain why I am asking. I am aware of the obvious methods, but in the code base I have been looking at, people do all kinds of crazy things to get the first byte, like making a copy of the buffer, or copying it to a stream and then doing a get.

Just use
char firstByte = buffer[0];

Or this:
char firstByte = *buffer;
For clarification, there's no difference between *buffer and buffer[0], since the latter is really just shorthand for *(buffer + 0*sizeof(char)), and any compiler is going to be smart enough to replace that with *(buffer+0) and then *buffer. So the choice is really whichever is clearest in the context you are using it, not how efficient each one is.

char buffer[] = {'h','e','l','l','o','\0'};
or:
const char *buffer = "hello";
or:
char buffer[6] = {'h','e','l','l','o','\0'};
and to get the first byte:
char firstChar = buffer[0];
or:
char firstChar = *buffer; // since the buffer pointer points to the first element in the array

If you're determined to micro-optimize, you should know that every compiler made in this millennium should produce exactly the same machine code for "c = *buffer" and "c = buffer[0]".

char first = someCharPtr[0];
or
char first = *someCharPtr;

Just as a clarification of what several people have mentioned--that:
buffer[0]
is equivalent to
*(buffer + 0*sizeof(char))
That's not technically true if you assume that's literal C code (i.e. not pseudo code), although that's what the compiler is doing for you.
Because of pointer arithmetic, when you add an integer to a pointer, it is automatically multiplied by sizeof(*pointer), so it should really be:
*(buffer + 0)
Although, since sizeof(char) is defined to be 1, it is actually equivalent in this case.
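To make the pointer-arithmetic point concrete, here is a minimal sketch. The helper names first_byte and stride_in_bytes are made up for illustration. It shows that buffer[0] and *buffer read the same byte, and that adding 1 to a pointer advances it by sizeof of the pointed-to type without any manual multiplication.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the first byte of a non-null buffer.
// buffer[0] is defined as *(buffer + 0), so this is the same as *buffer.
inline char first_byte(const char* buffer) {
    assert(buffer != nullptr);
    return buffer[0];
}

// Hypothetical helper: byte distance between p and p + 1.
// Pointer arithmetic is scaled by the element size automatically,
// so the result is sizeof(T), not 1 (unless T is a character type).
template <typename T>
std::size_t stride_in_bytes(const T* p) {
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(p + 1) -
        reinterpret_cast<const char*>(p));
}
```

For a char buffer the stride is 1, which is why the sizeof(char) version happens to give the same answer.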

char* c_ptr;
char first_char;
first_char = c_ptr[0];

Good for x86 platforms...
char firstByte;
__asm {
mov eax, buffer ; load the pointer value (32-bit MSVC inline assembly)
mov al, [eax] ; read the byte it points to
mov firstByte, al
}

Related

What is *(uint32_t *) &buffer[index]?

In some supposed-C++ code I found, I have buffer defined as const void *buffer; (it's arbitrary binary data that, I think, gets interpreted as a stream of 32-bit unsigned integers) and in many places, I have
*(uint32_t *) &buffer[index]
where index is some kind of integer (I think it was long or unsigned long and got swept up in my replacing those with int32_t and uint32_t when I was making the code work on a 64-bit system).
I recognize that this is taking the address of buffer (&buffer), casting it as a pointer to a uint32_t, and dereferencing that, at least based on this question... but then I'm confused by how the [index] part interacts with that or where I missed inserting the [index] part in between the steps I listed.
What, conceptually, is this doing? Is there some way I could define another variable to be a better type, with the casting there once, and then use that, rather than having this complicated expression throughout the code? Is this actually C++ or is this C99?
edit: The first couple of lines of the code are:
const void *buffer = data.bytes;
if (ntohl(*(int32_t *) buffer) != 'ttcf') {
return;
}
uint32_t ttf_count = ntohl(*(uint32_t *) &buffer[0x08]);
where data.bytes has type const void *. Before I was getting buffer from data.bytes, it was char *.
edit 2: Apparently, having const void *buffer work is not normal C (though it absolutely works in my situation), so if it makes more sense, assume it's const char *buffer.
Putting parentheses in place to make the order of operations more explicit:
*((uint32_t *) &(buffer[index]))
So you're treating buffer as an array, however because buffer is a void * you can't dereference it directly.
Assuming you want to treat this buffer as an array of uint32_t, what you want to do is this:
((uint32_t *)buffer)[index]
Which can also be written as:
*((uint32_t *)buffer + index)
EDIT:
If index is the byte offset in the buffer, that changes things. In that case, I'd recommend defining the buffer as const char * instead of const void *. That way, you can be sure the dereferencing of the array is working properly.
So to break down the expression:
*(uint32_t *) &buffer[index]
You're going index bytes into buffer: buffer[index]
Then taking the address of that byte: &buffer[index]
Then casting that address to a pointer to uint32_t: (uint32_t *) &buffer[index]
Then dereferencing that pointer to read a uint32_t value: *(uint32_t *) &buffer[index]
Lots of issues here! First of all, a void * cannot be dereferenced. buffer[index] is illegal in ISO C, although some compilers apparently have an extension that will treat it as (void)((char *)buffer)[index].
You suggest in comments that the code originally used char * - I recommend you leave it that way. Assuming buffer returns to being const char *:
if (ntohl(*(int32_t *) buffer) != 'ttcf') { return; }
The intent here is to pretend that the first four bytes of buffer contain an integer; read that integer, and compare it to 'ttcf'. The latter is a multi-character constant, the value of which is implementation-defined. It could represent four characters 't', 't', 'c', 'f', or 'f', 'c', 't', 't', or in fact anything else at all of type int.
A greater problem is that pretending a buffer contains an int when it did not actually get written via an expression of type int violates the strict aliasing rule. This is unfortunately a common technique in older code, but even since the first C standard it has caused undefined behaviour. If you use a compiler that performs type-based aliasing optimization it could wreck your code.
A way to write this code avoiding both of those problems is:
if ( memcmp(buffer, "ttcf", 4) ) { return; }
The later line uint32_t ttf_count = ntohl(*(uint32_t *) &buffer[0x08]); has similar issues. In this case there is no doubt that the best fix is:
uint32_t ttf_count;
memcpy(&ttf_count, buffer + 0x08, sizeof ttf_count);
ttf_count = ntohl(ttf_count);
As discussed in comments, you could make an inline function to keep this tidy. In my own code I do something like:
static inline uint32_t be_to_uint32(void const *ptr)
{
unsigned char const *p = ptr;
return p[0] * 0x1000000ul + p[1] * 0x10000ul + p[2] * 0x100 + p[3];
}
and a similar version le_to_uint32 that reads bytes in the opposite order; then I use whichever of those corresponds to the input format instead of using ntohl.
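For reference, a C++ sketch of both readers might look like the following. The name le_to_uint32 comes from the answer above, but its body here is my own guess at what "the opposite order" means, and I have used shifts rather than multiplications. Neither function depends on host byte order or alignment, and neither runs into the aliasing problem.

```cpp
#include <cassert>
#include <cstdint>

// Reads 4 bytes starting at ptr as a big-endian unsigned 32-bit value.
inline std::uint32_t be_to_uint32(const void* ptr) {
    assert(ptr != nullptr);
    const unsigned char* p = static_cast<const unsigned char*>(ptr);
    return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
           (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

// Same idea with the bytes taken in the opposite (little-endian) order.
inline std::uint32_t le_to_uint32(const void* ptr) {
    assert(ptr != nullptr);
    const unsigned char* p = static_cast<const unsigned char*>(ptr);
    return (std::uint32_t{p[3]} << 24) | (std::uint32_t{p[2]} << 16) |
           (std::uint32_t{p[1]} << 8) | std::uint32_t{p[0]};
}
```

With const char *buffer, the count from the question would then be read as be_to_uint32(buffer + 0x08), assuming the file stores it big-endian (which is what the ntohl call implies).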

Convert char* to uint8_t

I transfer messages through a CAN protocol.
To do so, the CAN message needs data of uint8_t type. So I need to convert my char* to uint8_t. After researching on this site, I produced this code:
char* bufferSlidePressure = ui->canDataModifiableTableWidget->item(6,3)->text().toUtf8().data();//My char*
/* Conversion */
uint8_t slidePressure [8];
sscanf(bufferSlidePressure,"%c",
&slidePressure[0]);
As you can see, my char* must fit in slidePressure[0].
My problem is that even though I get no errors during compilation, the data in slidePressure is totally incorrect. Indeed, I tested it with a char* containing 0 and I got unknown characters... So I think the problem must come from the conversion.
My data can be Bool, Uchar, Ushort and float.
Thanks for your help.
Is your string an integer? E.g. char* bufferSlidePressure = "123";?
If so, I would simply do:
uint8_t slidePressure = (uint8_t)atoi(bufferSlidePressure);
Or, if you need to put it in an array:
slidePressure[0] = (uint8_t)atoi(bufferSlidePressure);
Edit: Following your comment, if your data could be anything, I guess you would have to copy it into the buffer of the new data type. E.g. something like:
/* in case you'd expect a float*/
float slidePressure;
memcpy(&slidePressure, bufferSlidePressure, sizeof(float));
/* in case you'd expect a bool*/
bool isSlidePressure;
memcpy(&isSlidePressure, bufferSlidePressure, sizeof(bool));
/*same thing for uint8_t, etc */
/* in case you'd expect char buffer, just a byte to byte copy */
char * slidePressure = new char[ size ]; // or a stack buffer
memcpy(slidePressure, (const char*)bufferSlidePressure, size ); // no sizeof, since sizeof(char)=1
uint8_t is 8 bits of memory, and can store values from 0 to 255
char is probably 8 bits of memory
char * is probably 32 or 64 bits of memory containing the address of a different place in memory in which there is a char
First, make sure you don't try to put the memory address (the char *) into the uint8 - put what it points to in:
char from;
char * pfrom = &from;
uint8_t to;
to = *pfrom;
Then work out what you are really trying to do ... because this isn't quite making sense. For example, a float is probably 32 or 64 bits of memory. If you think there is a float somewhere in your char * data you have a lot of explaining to do before we can help :/
char * is a pointer, not a single character. It is possible that it points to the character you want.
uint8_t is unsigned but on most systems will be the same size as a char and you can simply cast the value.
You may need to manage the memory and lifetime of what your function returns. This could be done with vector< unsigned char> as the return type of your function rather than char *, especially if toUtf8() has to create the memory for the data.
Your question is totally ambiguous.
ui->canDataModifiableTableWidget->item(6,3)->text().toUtf8().data();
That is a lot of cascading calls. We have no idea what any of them do and whether they are yours or not. It looks dangerous.
A safer example, the C++ way (needs <string> and <sstream>):
const char* bufferSlidePressure = "123";
std::string buffer(bufferSlidePressure);
std::stringstream stream;
stream << buffer;
int n = 0;
// convert to int
if (!(stream >> n)){
//could not convert
}
Also, if Boost is available:
int n = boost::lexical_cast<int>(buffer);
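If the text is a decimal number that has to end up in a single uint8_t, a range-checked conversion may be safer than a bare atoi cast. This is only a sketch: parse_uint8 is a made-up helper name, and it assumes the input is plain decimal text with nothing after the digits.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: parses decimal text into out.
// Returns false (leaving out untouched) if the text is not a number,
// has trailing characters, or does not fit in 0..255.
inline bool parse_uint8(const char* text, std::uint8_t& out) {
    assert(text != nullptr);
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0') return false;  // no digits, or junk after them
    if (errno == ERANGE || value < 0 || value > 255) return false;
    out = static_cast<std::uint8_t>(value);
    return true;
}
```

Unlike atoi, this lets the caller tell "0" apart from text that could not be converted at all.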

Possible Buffer Overflow During String Concatenation

I am new to C++. My program is crashing and I am trying to find out why. At some point of the code, I generate a random number and I copy a file with the original filename followed by the number
char CopyPath[MAX_PATH];
SHGetFolderPath(NULL, CSIDL_MYMUSIC, NULL, 0, CopyPath);
int randomNumber = 101 + rand()%1000000000;
char randomBuffer[15];
itoa(randomNumber, randomBuffer, 10);
char computerName[MAX_COMPUTERNAME_LENGTH+1];
DWORD size = MAX_COMPUTERNAME_LENGTH;
if(!GetComputerName(computerName, &size))
strcat(computerName, "FAIL");
strcat(CopyPath,"\\");
strcat(CopyPath, computerName);
strcat(CopyPath, "-");
strcat(CopyPath, randomBuffer);
copyFile(oldpath, CopyPath);
I suspect the crash happens somewhere here. My question is: could the crash be caused by not initializing all the values of CopyPath? Should I declare it as
char CopyPath[MAX_PATH] = {'\0'};
Could this be the problem?
if(!GetComputerName(computerName, &size))
strcat(computerName, "FAIL");
This should be strcpy, as there's no valid string in computerName to append to.
Also, you probably should be calling SHGetFolderPathA since you are passing a buffer of char (and not TCHAR).
Prefer std::string over a C array for holding string data like this, as it provides proper copying and concatenation through the = and + operators.
Not sure what causes the crash in your case. My guess is that it is a buffer overrun problem. Did you account for the terminating \0 character in the MAX_PATH constant?
I believe you've understood from other comments that your code is not very good (at least due to style and possible buffer overruns).
Taking into account only your specific question: you are right, the problem is the uninitialized char array, which doesn't represent a C string because a C string has to be zero-terminated. As you probably know, strcat works on C strings. So changing from:
char CopyPath[MAX_PATH]; // this is not a C-string
to
char CopyPath[MAX_PATH] = {0}; // this is a C-string (empty though)
will fix this particular problem.
EDIT: this approach should be taken with any buffer that you are going to use as a strcat concatenation target, which in your case includes computerName.
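As a rough illustration of the std::string suggestion made earlier in this thread, the concatenation part could be written without any fixed-size buffers. The function name build_copy_path and its parameters are hypothetical; the folder and computer name would still come from the Windows calls in the question.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds "<folder>\<computer_name>-<random_number>".
// std::string grows as needed, so there is no buffer to overrun and no
// requirement that the pieces already be valid strcat targets.
inline std::string build_copy_path(const std::string& folder,
                                   const std::string& computer_name,
                                   int random_number) {
    assert(!folder.empty());
    return folder + "\\" + computer_name + "-" + std::to_string(random_number);
}
```

The result can be handed to an API that wants a C string via .c_str().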

Proper Way To Initialize Unsigned Char*

What is the proper way to initialize unsigned char*? I am currently doing this:
unsigned char* tempBuffer;
tempBuffer = "";
Or should I be using memset(tempBuffer, 0, sizeof(tempBuffer)); ?
To "properly" initialize a pointer (unsigned char * as in your example), you need to do just a simple
unsigned char *tempBuffer = NULL;
If you want to initialize an array of unsigned chars, you can do either of following things:
unsigned char *tempBuffer = new unsigned char[1024]();
// and do not forget to delete it later
delete[] tempBuffer;
or
unsigned char tempBuffer[1024] = {};
I would also recommend to take a look at std::vector<unsigned char>, which you can initialize like this:
std::vector<unsigned char> tempBuffer(1024, 0);
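Here is a small sketch that puts the heap options side by side and checks that each one really starts out zeroed. The helper names are made up for illustration, and I have wrapped the new[] version in std::unique_ptr (C++11) so the delete[] cannot be forgotten.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: true if every byte in [data, data + size) is zero.
inline bool all_zero(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    for (std::size_t i = 0; i < size; ++i) {
        if (data[i] != 0) return false;
    }
    return true;
}

// Heap buffer owned by a vector; every element starts at 0.
inline std::vector<unsigned char> make_zeroed_vector(std::size_t size) {
    return std::vector<unsigned char>(size, 0);
}

// Heap buffer from new[]; the trailing () value-initializes (zeroes) it.
inline std::unique_ptr<unsigned char[]> make_zeroed_heap_buffer(std::size_t size) {
    return std::unique_ptr<unsigned char[]>(new unsigned char[size]());
}
```

A stack array declared as unsigned char tempBuffer[1024] = {}; passes the same all_zero check.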
The second method does not give you a usable buffer: memset writes through tempBuffer, which at that point is an uninitialized pointer, so the call has undefined behavior (and sizeof(tempBuffer) is the size of the pointer, not of any buffer). Note that you aren't declaring any space for a buffer here; you're declaring a pointer to a buffer that must be created elsewhere. If you initialize it to "", that will make the pointer point to a static buffer with exactly one byte, the null terminator. If you want a buffer you can write characters into later, use Fred's array suggestion or something like malloc.
As it's a pointer, you either want to initialize it to NULL first like this:
unsigned char* tempBuffer = NULL;
unsigned char* tempBuffer = 0;
or assign an address of a variable, like so:
unsigned char c = 'c';
unsigned char* tempBuffer = &c;
EDIT:
If you wish to assign a string, this can be done as follows:
unsigned char myString [] = "This is my string";
unsigned char* tmpBuffer = &myString[0];
If you know the size of the buffer at compile time:
unsigned char buffer[SIZE] = {0};
For dynamically allocated buffers (buffers allocated during run-time or on the heap):
1.Prefer the new operator:
unsigned char * buffer = 0; // Pointer to a buffer, buffer not allocated.
buffer = new unsigned char [runtime_size];
2.Many solutions to "initialize" or fill with a simple value:
std::fill(buffer, buffer + runtime_size, 0); // Prefer to use STL
memset(buffer, 0, runtime_size);
for (std::size_t i = 0; i < runtime_size; ++i) buffer[i] = 0; // Using a loop; index instead of advancing buffer so the start of the allocation is not lost
3.The C language side provides allocation and initialization with one call.
However, the function does not call the object's constructors:
buffer = static_cast<unsigned char*>(calloc(runtime_size, sizeof(unsigned char))); // release with free(), not delete[]
Note that this also sets all bits in the buffer to zero; you don't get a choice in the initial value.
It depends on what you want to achieve (e.g. do you ever want to modify the string). See e.g. http://c-faq.com/charstring/index.html for more details.
Note that if you declare a pointer to a string literal, it should be const, i.e.:
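const unsigned char *tempBuffer = reinterpret_cast<const unsigned char *>(""); // C++ needs the cast; a literal is const char[], not unsigned char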
If the plan is for it to be a buffer and you want to move it later to point to something, then initialise it to NULL until it really points somewhere to which you want to write, not an empty string.
unsigned char * tempBuffer = NULL;
std::vector< unsigned char > realBuffer( 1024 );
tempBuffer = &realBuffer[0]; // now it really points to writable memory
memcpy( tempBuffer, someStuff, someSizeThatFits );
The answer depends on what you intend to use the unsigned char for. A char is nothing else but a small integer, which is 8 bits wide on nearly all implementations.
C happens to have some string support that fits well with char, but that doesn't limit the usage of char to strings.
The proper way to initialize a pointer depends on 1) its scope and 2) its intended use.
If the pointer is declared static, and/or declared at file scope, then ISO C/C++ guarantees that it is initialized to NULL. Programming style purists would still set it to NULL to keep their style consistent with local scope variables, but theoretically it is pointless to do so.
As for what to initialize it to... set it to NULL. Don't set it to point at "", because that will allocate a static dummy byte containing a null termination, which will become a tiny little static memory leak as soon as the pointer is assigned to something else.
One may question why you need to initialize it to anything at all in the first place. Just set it to something valid before using it. If you worry about using a pointer before giving it a valid value, you should get a proper static analyzer to find such simple bugs. Even most compilers will catch that bug and give you a warning.

How to copy a string into a char array in C++ without going over the buffer

I want to copy a string into a char array, and not overrun the buffer.
So if I have a char array of size 5, then I want to copy a maximum of 5 bytes from a string into it.
what's the code to do that?
This is exactly what std::string's copy function does.
#include <string>
#include <iostream>
int main()
{
    char test[5];
    std::string str( "Hello, world" );
    str.copy(test, 5);
    std::cout.write(test, 5);
    std::cout.put('\n');
    return 0;
}
If you need null termination you should do something like this:
str.copy(test, 4);
test[4] = '\0';
First of all, strncpy is almost certainly not what you want. strncpy was designed for a fairly specific purpose. It's in the standard library almost exclusively because it already exists, not because it's generally useful.
Probably the simplest way to do what you want is with something like:
sprintf(buffer, "%.4s", your_string.c_str());
Unlike strncpy, this guarantees that the result will be NUL terminated, but does not fill in extra data in the target if the source is shorter than specified (though the latter isn't a major issue when the target length is 5).
Use the strlcpy function if your implementation provides one. The function is not in the standard C library, yet it is rather widely accepted as a de-facto standard name for a "safe" limited-length copying function for zero-terminated strings.
If your implementation does not provide strlcpy function, implement one yourself. For example, something like this might work for you
char *my_strlcpy(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (n > 0)
    {
        char *pd;
        const char *ps;
        for (--n, pd = dst, ps = src; n > 0 && *ps != '\0'; --n, ++pd, ++ps)
            *pd = *ps;
        *pd = '\0';
    }
    return dst;
}
(Actually, the de-facto accepted strlcpy returns size_t, so you might prefer to implement the accepted specification instead of what I did above).
Beware of the answers that recommend using strncpy for that purpose. strncpy is not a safe limited-length string copying function and is not supposed to be used for that purpose. While you can force strncpy to "work" for that purpose, it is still akin to driving woodscrews with a hammer.
Update: Thought I would try to tie together some of the answers, answers which have convinced me that my own original knee-jerk strncpy response was poor.
First, as AndreyT noted in the comments to this question, truncation methods (snprintf, strlcpy, and strncpy) are often not a good solution. It's often better to check the size of the string (string.size()) against the buffer length and return/throw an error or resize the buffer.
If truncation is OK in your situation, IMHO, strlcpy is the best solution, being the fastest/least overhead method that ensures null termination. Unfortunately, it's not in many standard distributions and so is not portable. If you are doing a lot of these, it may be worth providing your own implementation; AndreyT gave an example. It runs in O(result length). Also, the reference specification returns the length of the source string (the length of the string it tried to create), so a return value greater than or equal to the buffer size tells you the source was truncated.
Other good solutions are sprintf and snprintf. They are standard, and so are portable and provide a safe null-terminated result. They have more overhead than strlcpy (parsing the format string specifier and variable argument list), but unless you are doing a lot of these you probably won't notice the difference. They also run in O(result length). snprintf is always safe, whereas sprintf may overflow if you get the format specifier wrong (as others have noted, the format string should be "%.<N>s" not "%<N>s"). sprintf returns the number of characters written; snprintf returns the number that would have been written had the buffer been large enough, which lets you detect truncation.
A special case solution is strncpy. It runs in O(buffer length), because if it reaches the end of the src it zeros out the remainder of the buffer. Only useful if you need to zero the tail of the buffer or are confident that destination and source string lengths are the same. Also note that it is not safe in that it doesn't necessarily null terminate the string. If the source is truncated, then null will not be appended, so call in sequence with a null assignment to ensure null termination: strncpy(buffer, str.c_str(), BUFFER_LAST); buffer[BUFFER_LAST] = '\0';
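To show the snprintf route with truncation detection, here is a minimal sketch. copy_with_snprintf is a made-up name, and it assumes the std::string holds no embedded NUL characters, since "%s" stops at the first one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: copies src into dst, where capacity is the full
// size of dst including room for the terminator. The result is always
// NUL-terminated. Returns true only if the whole string fit.
inline bool copy_with_snprintf(char* dst, std::size_t capacity,
                               const std::string& src) {
    assert(dst != nullptr && capacity > 0);
    // snprintf returns the length the full output would have had.
    const int needed = std::snprintf(dst, capacity, "%s", src.c_str());
    return needed >= 0 && static_cast<std::size_t>(needed) < capacity;
}
```

With a 5-byte buffer and "Hello, world" this leaves "Hell" in the buffer and returns false, so the caller can decide whether truncation is acceptable.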
Some nice libc versions provide non-standard but great replacement for strcpy(3)/strncpy(3) - strlcpy(3).
If yours doesn't, the source code is freely available from the OpenBSD repository.
void stringChange(string var){
char strArray[100];
strcpy(strArray, var.c_str());
}
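I guess this should work. It'll copy from the string to a char array, but note that strcpy does no bounds checking, so it overruns strArray if var holds more than 99 characters.
I think snprintf() is much safer and simplest:
snprintf ( buffer, 100, "The half of %d is %d", 60, 60/2 );
The null character is appended at the end automatically :)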
The most popular answer is fine but the null-termination is not generic. The generic way to null-terminate the char-buffer is:
std::string aString = "foo";
const size_t BUF_LEN = 5;
char buf[BUF_LEN];
size_t len = aString.copy(buf, BUF_LEN-1); // leave one char for the null-termination
buf[len] = '\0';
len is the number of chars copied so it's between 0 and BUF_LEN-1.
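The same pattern can be wrapped up so the buffer length is deduced from the array instead of being repeated by hand. This is just a sketch: copy_truncated is a made-up name, and it only works with real arrays, not with pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copies as much of src as fits into dst, always
// leaving room for (and writing) the terminating '\0'.
// Returns the number of characters copied, between 0 and N - 1.
template <std::size_t N>
std::size_t copy_truncated(const std::string& src, char (&dst)[N]) {
    static_assert(N > 0, "destination must hold at least the terminator");
    const std::size_t len = src.copy(dst, N - 1);
    assert(len < N);
    dst[len] = '\0';
    return len;
}
```

Comparing the return value with src.size() tells you whether anything was cut off.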
std::string my_string("something");
char* my_char_array = new char[5];
strncpy(my_char_array, my_string.c_str(), 4);
my_char_array[4] = '\0'; // my_char_array contains "some"
With strncpy, you can copy at most n characters from the source to the destination. However, note that if the source string is n or more characters long, the destination will not be null terminated; you must put the terminating null character into it yourself.
A char array with a length of 5 can contain at most a string of 4 characters, since the 5th must be the terminating null character. Hence in the above code, n = 4.
std::string str = "Your string";
char buffer[5];
strncpy(buffer, str.c_str(), sizeof(buffer));
buffer[sizeof(buffer)-1] = '\0';
The last line is required because strncpy isn't guaranteed to NUL terminate the string (there has been a discussion about the motivation yesterday).
If you used wide strings, instead of sizeof(buffer) you'd use sizeof(buffer)/sizeof(*buffer), or, even better, a macro like
#define ARRSIZE(arr) (sizeof(arr)/sizeof(*(arr)))
/* ... */
buffer[ARRSIZE(buffer)-1]='\0';
char mystring[101]; // a 100 character string plus terminator
const char *any_input = "Example";
int iterate = 0;
// check the bound first so we never read or write past 100 characters
while (iterate < 100 && any_input[iterate] != '\0') {
    mystring[iterate] = any_input[iterate];
    iterate++;
}
mystring[iterate] = '\0';
This is the basic efficient design.
If you always have a buffer of size 5, then you could do:
std::string s = "Your string";
char buffer[5]={s[0],s[1],s[2],s[3],'\0'};
Edit:
Of course, assuming that your std::string is large enough.