Is this allowed: memcpy(dest, src, 0) [duplicate] - c++

Is there a problem in passing 0 to memcpy():
memcpy(dest, src, 0)
Note that in my code I am using a variable instead of 0, which can sometimes evaluate to 0.

As one might expect from a sane interface, zero is a valid size, and results in nothing happening. It's specifically allowed by the specification of the various string handling functions (including memcpy) in C99 7.21.1/2:
Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. [...] On such a call, a function that locates a character finds no occurrence, a function that compares two character sequences returns zero, and a function that copies characters copies zero characters.

Yes, it's totally OK. The only restrictions on memcpy are that the memory areas must not overlap and that both pointers must be valid, even when the size is zero.
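To make the zero-size case concrete, here is a minimal sketch. The helper name copyPrefix is hypothetical and not part of the original question; it simply forwards a run-time count, which may be zero, to memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies the first `count` bytes of `src` into `dest`.
// `count` may be zero, in which case memcpy copies nothing. Both pointers
// must still be valid (non-null), even for a zero-length copy.
inline void copyPrefix(char* dest, const char* src, std::size_t count) {
    assert(dest != nullptr && src != nullptr);
    std::memcpy(dest, src, count);
}
```

Calling copyPrefix(dest, src, 0) leaves dest untouched, so no special case is needed around the call as long as the pointers themselves are valid.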


Char decimal value [duplicate]

Is it possible to get the decimal value of a char? I know I can use short(char), but that would waste memory, which is why I used char in the first place: I want only 8 bits of data, so I use char (similar to byte in C#). In my program, when I print it, it always shows some weird character corresponding to the decimal value; I want it to show the decimal value itself. So is there a C++ equivalent of C#'s char.GetNumericValue()?
You can use one of the integer types described in the link.
Then, if the only problem is printing, you can use std::cout with this syntax:
char a = 45;
cout << +a; // promotes a to a type printable as a number, regardless of type.
This works as long as the type provides a unary + operator with ordinary semantics. If you are defining a class that represents a number, to provide a unary + operator with canonical semantics, create an operator+() that simply returns *this either by value or by reference-to-const.
Further reading: the C++ FAQ entry print-char-or-ptr-as-number.
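As a sketch of the two usual ways to print an 8-bit value as a number, the following writes into a string stream so the result can be checked. The helper names toDecimal and toDecimalCast are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: formats an 8-bit value as its decimal number.
inline std::string toDecimal(std::uint8_t value) {
    std::ostringstream out;
    out << +value;                    // unary + promotes to int
    return out.str();
}

// Same idea with an explicit cast instead of unary +.
inline std::string toDecimalCast(char value) {
    std::ostringstream out;
    out << static_cast<int>(value);
    return out.str();
}

inline void demoToDecimal() {
    assert(toDecimal(45) == "45");
    assert(toDecimalCast(45) == "45");
}
```

Either form keeps the storage at one byte; only the value passed to the stream is widened.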

Character array initialization with the first element being null

I was recently faced with a line of code and four options:
char fullName[30] = {NULL};
A) First element is assigned a NULL character.
B) Every element of the array is assigned 0 ( Zeroes )
C) Every element of the array is assigned NULL
D) The array is empty.
The answer we selected was option C, as, while the array is only initialized with a single NULL, C++ populates the rest of the array with NULL.
However, our professor disagreed, stating that the answer is A, he said:
So the very first element is NULL, and when you display it, it's displaying the first element, which is NULL.
The quote shows the question in its entirety; there was no other information provided. I'm curious to which one is correct, and if someone could explain why said answer would be correct.
The question is ill-defined, but Option B seems like the most correct answer.
The result depends on how exactly NULL is defined, which depends on the compiler (more precisely, on the standard library implementation). If it's defined as nullptr, the code will not compile. (I don't think any major implementation does that, but still.)
Assuming NULL is not defined as nullptr, then it must be defined as an integer literal with value 0 (which is 0, or 0L, or something similar), which makes your code equivalent to char fullName[30] = {0};.
This fills the array with zeroes, so Option B is the right answer.
In general, when you initialize an array with a brace-enclosed list, every element is initialized with something. If you provide fewer initializers than the number of elements, the remaining elements are zeroed.
Regarding the remaining options:
Option C is unclear, because if the code compiles, then NULL is equivalent to 0, so option C can be considered equivalent to Option B.
Option A can be valid depending on how you interpret it. If it means that the remaining elements are uninitialized, then it's wrong. If it doesn't specify what happens to the remaining elements, then it's a valid answer.
Option D is outright wrong, because arrays can't be "empty".
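The zero-filling rule can be checked directly. This is a small sketch; allZero and demoArrayInit are hypothetical names, and the literal 0 stands in for a NULL that expands to an integer literal.

```cpp
#include <cassert>
#include <cstddef>

// Returns true if every element of the array is the null character.
template <std::size_t N>
bool allZero(const char (&arr)[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        if (arr[i] != '\0') return false;
    }
    return true;
}

inline void demoArrayInit() {
    char a[30] = {0};    // first element 0, remaining 29 zeroed as well
    char b[30] = {};     // every element value-initialised to 0
    char c[30] = {'x'};  // first element 'x', remaining 29 zeroed
    assert(allZero(a));
    assert(allZero(b));
    assert(c[0] == 'x' && c[1] == '\0' && c[29] == '\0');
}
```

The third array shows that the explicit initializer only affects the first element; everything after it is zero regardless.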
char fullName[30] = {NULL};
This is something that should never be written.
NULL is a macro that expands to a null pointer constant. A character - not a pointer - is being initialised here, so it makes no sense to use NULL.
It just so happens that some null pointer constants are also integer literals with value 0 (i.e. 0 or 0L for example), and if NULL expands to such literal, then the shown program is technically well-formed despite the abuse of NULL. What the macro expands to exactly is defined by the language implementation.
If NULL instead expands to a null pointer constant that is not an integer literal, such as nullptr (which is entirely possible), then the program is ill-formed.
NULL shouldn't be written in C++ at all, even to initialise pointers. It exists for backwards compatibility with C to make it easier to port C programs to C++.
Now, let us assume that NULL happens to expand to an integer literal on this particular implementation of C++.
Nothing in the example is assigned. Assignment is something that is done to a pre-existing object. Here, an array is being initialised.
The first element of the array is initialised with the zero literal. The rest of the elements are value initialised. Both result in the null character. As such, the entire array will be filled with null characters.
A simple and correct way to write the same is:
char fullName[30] = {};
B and C are equally close to being correct, except for wording regarding "assignment". They fail to mention value initialisation, but at least the outcome is the same. A is not wrong either, although it is not as complete because it fails to describe how the rest of the elements are initialised.
If "empty" is interpreted as "contains no elements", then D is incorrect because the array contains 30 elements. If it is interpreted as "contains the empty string", then D would be a correct answer.
You are almost correct.
The professor is incorrect. It is true that display finishes at the first NULL (when some approaches are used), but that says nothing about the values of the remainder of the array, which could be trivially examined regardless.
[dcl.init/17.5]: [..] the i-th array element is copy-initialized with x_i for each 1 ≤ i ≤ k, and value-initialized for each k < i ≤ n. [..]
However, none of the options is strictly correct and well-worded.
What happens is that NULL is used to initialise the first element, and the other elements are zero-initialised. The end result is effectively Option B.
Thing is, if NULL were defined as an expression of type std::nullptr_t on your platform (which it isn't, but it is permitted to be), the example wouldn't even compile!
NULL is a pointer, not a number. Historically it has been possible to mix and match the two things to some degree, but C++ has tried to tighten that up in recent years, and you should avoid blurring the line.
A better approach is:
char fullName[30] = {};
And the best approach is:
std::string fullName;
Apparently, your professor is right. Let's see how:
char someName[6] = "SAAD";
how the string name is represented in memory:
0 1 2 3 4  5
S A A D \0 \0
Array-based C string
The individual characters that make up the string are stored in the elements of the array. The string is terminated by a null character. Array elements after the null character are not part of the string, and their contents are irrelevant.
A "null string" is a string with a null character as its first character:
0  1 2 3 4 5
\0
Null C string
The length of a null string is 0.

C++ value returned even without a return statement [duplicate]

What are all the common undefined behaviours that a C++ programmer should know about?
Say, like:
a[i] = i++;
Pointer
Dereferencing a NULL pointer
Dereferencing a pointer returned by a "new" allocation of size zero
Using pointers to objects whose lifetime has ended (for instance, stack allocated objects or deleted objects)
Dereferencing a pointer that has not yet been definitely initialized
Performing pointer arithmetic that yields a result outside the boundaries (either above or below) of an array.
Dereferencing the pointer at a location beyond the end of an array.
Converting pointers to objects of incompatible types
Using memcpy to copy overlapping buffers.
Buffer overflows
Reading or writing to an object or array at an offset that is negative, or beyond the size of that object (stack/heap overflow)
Integer Overflows
Signed integer overflow
Evaluating an expression that is not mathematically defined
Shifting values by a negative amount, in either direction (right-shifting a negative value, by contrast, is implementation-defined before C++20)
Shifting values by an amount greater than or equal to the number of bits in the number (e.g. int64_t i = 1; i <<= 72 is undefined)
Types, Cast and Const
Casting a numeric value into a value that can't be represented by the target type (either directly or via static_cast)
Using an automatic variable before it has been definitely assigned (e.g., int i; i++; cout << i;)
Using the value of any object of type other than volatile or sig_atomic_t at the receipt of a signal
Attempting to modify a string literal or any other const object during its lifetime
Concatenating a narrow with a wide string literal during preprocessing
Function and Template
Not returning a value from a value-returning function (directly or by flowing off from a try-block)
Multiple different definitions for the same entity (class, template, enumeration, inline function, static member function, etc.)
Infinite recursion in the instantiation of templates
Calling a function using different parameters or linkage to the parameters and linkage that the function is defined as using.
OOP
Cascading destructions of objects with static storage duration
The result of assigning to partially overlapping objects
Recursively re-entering a function during the initialization of its static objects
Making virtual function calls to pure virtual functions of an object from its constructor or destructor
Referring to nonstatic members of objects that have not been constructed or have already been destructed
Source file and Preprocessing
A non-empty source file that doesn't end with a newline, or ends with a backslash (prior to C++11)
A backslash followed by a character that is not part of the specified escape codes in a character or string constant (this is implementation-defined in C++11).
Exceeding implementation limits (number of nested blocks, number of functions in a program, available stack space ...)
Preprocessor numeric values that can't be represented by a long int
Preprocessing directive on the left side of a function-like macro definition
Dynamically generating the defined token in a #if expression
To be classified
Calling exit during the destruction of a program with static storage duration
The order that function parameters are evaluated is unspecified behavior. (This won't make your program crash, explode, or order pizza... unlike undefined behavior.)
The only requirement is that all parameters must be fully evaluated before the function is called.
This:
// The simple obvious one.
callFunc(getA(),getB());
Can be equivalent to this:
int a = getA();
int b = getB();
callFunc(a,b);
Or this:
int b = getB();
int a = getA();
callFunc(a,b);
It can be either; it's up to the compiler. The result can matter, depending on the side effects.
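One way to observe this is to record the order in which the argument expressions run. In this sketch getA, getB and callFunc are stand-in implementations, and the callOrder log is a hypothetical addition for inspection only.

```cpp
#include <cassert>
#include <string>

// Each getter appends a letter to `callOrder` so the evaluation order
// can be inspected afterwards.
static std::string callOrder;

int getA() { callOrder += 'A'; return 1; }
int getB() { callOrder += 'B'; return 2; }
int callFunc(int a, int b) { return a * 10 + b; }

int demoEvaluationOrder() {
    callOrder.clear();
    int result = callFunc(getA(), getB());
    // Both getters ran before callFunc, but which ran first is unspecified.
    assert(callOrder == "AB" || callOrder == "BA");
    return result;  // each argument still receives its own value
}
```

The returned value is the same either way; only code that depends on the side effects of getA and getB (here, the log) can tell the two orders apart.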
The compiler is free to re-order the evaluation parts of an expression (assuming the meaning is unchanged).
From the original question:
a[i] = i++;
// This expression has three parts:
(a) a[i]
(b) i++
(c) Assign (b) to (a)
// (c) is guaranteed to happen after (a) and (b)
// But (a) and (b) can be done in either order.
// See n2521 Section 5.17
// (b) increments i but returns the original value.
// See n2521 Section 5.2.6
// Thus this expression can be written as:
int rhs = i++;
int& lhs = a[i];
lhs = rhs;
// or
int& lhs = a[i];
int rhs = i++;
lhs = rhs;
Double Checked locking.
And one easy mistake to make.
A* a = new A("plop");
// Looks simple enough.
// But this can be split into three parts.
(a) allocate Memory
(b) Call constructor
(c) Assign value to 'a'
// No problem here:
// The compiler is allowed to do this:
(a) allocate Memory
(c) Assign value to 'a'
(b) Call constructor.
// This is because the whole thing is between two sequence points.
// So what is the big deal.
// Simple Double checked lock. (I know there are many other problems with this).
if (a == NULL) // (Point B)
{
    Lock lock(mutex);
    if (a == NULL)
    {
        a = new A("Plop"); // (Point A).
    }
}
a->doStuff();
// Think of this situation.
// Thread 1: Reaches point A. Executes (a)(c)
// Thread 1: Is about to do (b) and gets unscheduled.
// Thread 2: Reaches point B. It can now skip the if block
// Remember (c) has been done thus 'a' is not NULL.
// But the memory has not been initialized.
// Thread 2 now executes doStuff() on an uninitialized variable.
// The solution to this problem is to move the assignment of 'a'
// To the other side of the sequence point.
if (a == NULL) // (Point B)
{
    Lock lock(mutex);
    if (a == NULL)
    {
        A* tmp = new A("Plop"); // (Point A).
        a = tmp;
    }
}
a->doStuff();
// Of course there are still other problems, because C++03 has no notion of
// threads. These were addressed in C++11 (std::atomic, std::call_once).
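For comparison, here is a sketch of double-checked locking written with C++11 atomics, which give the ordering guarantee the plain-pointer version lacks. The struct A, the globals and getInstance are all hypothetical stand-ins for the code above, not the original author's fix.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical class standing in for A from the text.
struct A {
    explicit A(const std::string& n) : name(n) {}
    std::string name;
};

static std::atomic<A*> instance{nullptr};
static std::mutex instanceMutex;

A* getInstance() {
    // Acquire load: a non-null pointer seen here implies the object it
    // points to is fully constructed.
    A* p = instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(instanceMutex);
        p = instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new A("Plop");  // intentionally never deleted in this sketch
            // Release store: publishes the pointer only after construction.
            instance.store(p, std::memory_order_release);
        }
    }
    return p;
}

void demoGetInstance() {
    assert(getInstance() == getInstance());  // same object every time
}
```

Where the object can simply live in a function, a function-local static is usually the simpler choice, since C++11 guarantees that its initialisation is thread-safe.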
My favourite is "Infinite recursion in the instantiation of templates" because I believe it's the only one where the undefined behaviour occurs at compile time.
Assigning to a constant after stripping constness using const_cast<>:
const int i = 10;
int *p = const_cast<int*>( &i );
*p = 1234; //Undefined
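By contrast, casting away const is fine when the object itself was never declared const. This sketch uses a hypothetical helper, setTo1234, to show the distinction.

```cpp
#include <cassert>

// Hypothetical helper that takes a pointer-to-const but writes through it.
// This is only valid when the pointed-to object was not declared const.
void setTo1234(const int* p) {
    *const_cast<int*>(p) = 1234;
}

int demoConstCast() {
    int mutableValue = 10;      // not a const object
    setTo1234(&mutableValue);   // fine: only the access path was const
    // const int fixed = 10;
    // setTo1234(&fixed);       // would be undefined behaviour
    assert(mutableValue == 1234);
    return mutableValue;
}
```

The rule of thumb is that const_cast may remove const from a pointer or reference, but it never makes a const object writable.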
Besides undefined behaviour, there is also the equally nasty implementation-defined behaviour.
Undefined behaviour occurs when a program does something the result of which is not specified by the standard.
Implementation-defined behaviour is an action by a program the result of which is not defined by the standard, but which the implementation is required to document. An example is "Multibyte character literals", from Stack Overflow question Is there a C compiler that fails to compile this?.
Implementation-defined behaviour only bites you when you start porting (but upgrading to a new version of the compiler is also porting!).
Variables may only be updated once in an expression (technically once between sequence points).
int i = 1;
i = ++i;
// Undefined in C++03: 'i' is modified twice without an intervening sequence point.
// (Well-defined since C++11, but still best avoided.)
A basic understanding of the various environmental limits. The full list is in section 5.2.4.1 of the C specification. Here are a few:
127 parameters in one function definition
127 arguments in one function call
127 parameters in one macro definition
127 arguments in one macro invocation
4095 characters in a logical source line
4095 characters in a character string literal or wide string literal (after concatenation)
65535 bytes in an object (in a hosted environment only)
15 nesting levels for #included files
1023 case labels for a switch statement (excluding those for any nested switch statements)
I was actually a bit surprised at the limit of 1023 case labels for a switch statement; I can foresee that being exceeded fairly easily by generated code such as lexers and parsers.
If these limits are exceeded, you have undefined behavior (crashes, security flaws, etc...).
Right, I know this is from the C specification, but C++ shares these basic supports.
Using memcpy to copy between overlapping memory regions. For example:
char a[256] = {};
memcpy(a, a, sizeof(a));
The behavior is undefined according to the C Standard, which is subsumed by the C++03 Standard.
7.21.2.1 The memcpy function
Synopsis
1 #include <string.h>
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
Description
2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
Returns
3 The memcpy function returns the value of s1.
7.21.2.2 The memmove function
Synopsis
1 #include <string.h>
void *memmove(void *s1, const void *s2, size_t n);
Description
2 The memmove function copies n characters from the object pointed to by s2 into the object pointed to by s1. Copying takes place as if the n characters from the object pointed to by s2 are first copied into a temporary array of n characters that does not overlap the objects pointed to by s1 and s2, and then the n characters from the temporary array are copied into the object pointed to by s1.
Returns
3 The memmove function returns the value of s1.
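When source and destination may overlap, memmove is the function to reach for. The sketch below shifts a buffer's contents towards its start; shiftLeft is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: moves bytes [shift, size) of `buf` to the front.
// Source and destination overlap, so memmove is used instead of memcpy.
void shiftLeft(char* buf, std::size_t size, std::size_t shift) {
    assert(shift <= size);
    std::memmove(buf, buf + shift, size - shift);
}
```

The trailing `shift` bytes are left as they were; a caller that wants them cleared has to overwrite them separately.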
The only type for which C++ guarantees a size is char. And the size is 1. The size of all other types is platform dependent.
Namespace-level objects in different compilation units should never depend on each other for initialization, because their relative initialization order is unspecified.
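A common way around this is to wrap the object in a function so that it is initialised on first use. The names configName and greeting below are hypothetical, and both objects are shown in one file only to keep the sketch short.

```cpp
#include <cassert>
#include <string>

// Hypothetical shared object. Wrapping it in a function makes it initialise
// on first use, whatever order the compilation units' globals are set up in.
const std::string& configName() {
    static const std::string name = "default";  // initialised on first call
    return name;
}

// A namespace-level object (imagine it in another file) can now depend on it.
static const std::string greeting = "hello, " + configName();

void demoFirstUse() {
    assert(greeting == "hello, default");
}
```

The same caution applies in reverse at shutdown: the order of destruction across compilation units is just as hard to rely on.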

Is strncpy() a specialization of memcpy()?

Just curious to know (as we use these functions often). I don't see any practical difference between strncpy() and memcpy(). Isn't it fair to say that, effectively,
char* strncpy(char *dst, const char *src, size_t size)
{
    return (char*)memcpy(dst, src, size);
}
Or am I missing any side effect? There is one similar earlier question, but couldn't find an exact answer.
There is a difference, see this part of the strncpy page you linked to (emphasis mine):
Copies the first num characters of source to destination. If the end of the source C string (which is signaled by a null-character) is found before num characters have been copied, destination is padded with zeros until a total of num characters have been written to it.
So if the string to be copied is shorter than the limit, strncpy pads with zeros, while memcpy reads beyond the end of the string (possibly invoking undefined behaviour).
No, they are not the same.
From the C Standard (ISO/IEC 9899:1999 (E))
7.21.2.4 The strncpy function
Description
2 The strncpy function copies not more than n characters (characters that follow a null character are not copied) from the array pointed to by s2 to the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
3 If the array pointed to by s2 is a string that is shorter than n characters, null characters are appended to the copy in the array pointed to by s1, until n characters in all have been written.
Returns
4 The strncpy function returns the value of s1.
7.21.2.1 The memcpy function
Description
2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
Returns
3 The memcpy function returns the value of s1.
when using memcpy() the source and destination buffers can overlap, while in strncpy() this must not happen.
According to the C standard, the behavior for overlapping buffers are undefined for both strncpy() and memcpy().
According to the C standard, the real difference between strncpy() and memcpy() is that if the source string is shorter than n characters, null characters are appended until n characters in all have been written.
memcpy() is more efficient, but less safe, since it doesn't check whether the source actually holds n bytes to copy to the target buffer.
No, strncpy() is not a specialization, since it will detect a '\0' character during the copy and stop, something memcpy() will not do.
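The difference is easy to see by pre-filling two buffers and copying the same short string into each. This is a sketch; demoStrncpyPadding and the '#' filler are made up for the illustration.

```cpp
#include <cassert>
#include <cstring>

void demoStrncpyPadding() {
    const char src[] = "hi";  // 3 bytes including the '\0'
    char viaStrncpy[6];
    char viaMemcpy[6];
    std::memset(viaStrncpy, '#', sizeof viaStrncpy);
    std::memset(viaMemcpy, '#', sizeof viaMemcpy);

    // strncpy stops reading at the '\0' and pads the rest with zeros.
    std::strncpy(viaStrncpy, src, sizeof viaStrncpy);
    // memcpy copies exactly the requested bytes; asking for more than
    // sizeof src here would read past the end of src.
    std::memcpy(viaMemcpy, src, sizeof src);

    assert(std::memcmp(viaStrncpy, "hi\0\0\0\0", 6) == 0);
    assert(std::memcmp(viaMemcpy, "hi\0###", 6) == 0);
}
```

Note the opposite edge case as well: if the source has n or more characters before its terminator, strncpy writes no null character at all, so the destination is not a valid C string.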
Adding on to what the others have said, the type of the src and dst pointers does not matter. That is, I can copy an integer into the same number of consecutive 1-byte characters like this:
int num = 5;
char arr[sizeof num];
memcpy(arr, &num, sizeof num);
Another difference is that memcpy does not look for any particular character (such as the null character that strncpy looks for). It blindly copies the requested number of bytes from source to destination.
You could potentially make strncpy faster by checking for a \0 and not copying past that point. So memcpy would always copy all the data, but strncpy would often be faster because of the check.
