Using sizeof in C

I have the following program, which is crashing. Does anybody know why?
/* writes a, b, c into dst
** dst must have enough space for the result
** assumes all 3 numbers are positive */
void concat3(char *dst, int a, int b, int c) {
    sprintf(dst, "%08x%08x%08x", a, b, c);
}
/* usage */
int main(void) {
    printf("The size of int is %d \n", sizeof(int));
    char n3[3 * sizeof(int) + 1];
    concat3(n3, 0xDEADFACE, 0xF00BA4, 42);
    printf("result is 0x%s\n", n3);
    return 0;
}

You're confusing the size of the binary data (which is what sizeof gives you) with the size of a textual representation in hexadecimal, which is what you're trying to store.
On most current systems, sizeof(int) evaluates to 4. Your buffer n3 is therefore 13 bytes (3 * 4 + 1 == 13).
Then you format three integers as 8-character hex strings, which requires 3 * 8 + 1 == 25 bytes (including the terminator) to store. The resulting buffer overflow causes the crash.
It should be obvious that the size of the data type int doesn't matter, when you're formatting it as text (and specifying the field width yourself!).

Try 3*2*sizeof(int)+1, where 2*sizeof(int) is the number of characters needed to print each byte of an int in hex. Of course, since you're using that %08x format and expecting fixed-width results, you really should be using uint32_t. By the way, your program is also incorrectly passing 0xDEADFACE as an int, which it probably doesn't fit in, thus entering the realm of implementation-defined conversion to a signed type.
Here is a version with those corrections:
#include <inttypes.h>
#include <stdio.h>

/* writes a, b, c into dst
** dst must have enough space for the result
** assumes all 3 numbers are positive */
void concat3(char *dst, uint32_t a, uint32_t b, uint32_t c) {
    sprintf(dst, "%08"PRIX32"%08"PRIX32"%08"PRIX32, a, b, c);
}

/* usage */
int main(void) {
    printf("The size of int is %zu \n", sizeof(int));
    char n3[25];
    concat3(n3, 0xDEADFACE, 0xF00BA4, 42);
    printf("result is 0x%s\n", n3);
    return 0;
}

I don't really understand what sizeof has to do with your code. In concat3, you're attempting to print a text representation of each provided integer as an 8-character hexadecimal string: the required buffer size should thus be 8 * 3 + 1 = 25, and sizeof(int) has nothing to do with it.
You seem to be mixing up the size occupied in memory by an int and the length of its textual representation (which in your case is easy to determine, as it's fixed by your sprintf format string).
On a side note : sprintf is a truly unsafe function that you should consider deprecated.

It is crashing because sizeof(int) is (most likely on your system) 4, meaning that n3 is 13 bytes long. You then try to write 8 + 8 + 8 = 24 characters to it, plus the terminating null.

Use snprintf instead of sprintf. Think of the kittens!
But seriously, you should not be creating interfaces that take a buffer pointer but no length information. concat3 should have a max-length parameter and use snprintf inside; the length to pass for n3 is sizeof(n3). See the sketch below.
It still won't work, but it won't crash either. The other answers explain how to get the functionality right.
(Oh, and don't use gets() either. Just because it is in the standard library doesn't mean it is good code.)
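A minimal sketch of what that snprintf-based interface could look like (the size parameter and the return value are my additions; the format string is the one from the question):

#include <stdio.h>

/* Like the question's concat3, but the caller passes the destination size
** and snprintf guarantees the buffer is never overrun. */
int concat3(char *dst, size_t dst_size, unsigned a, unsigned b, unsigned c) {
    /* returns the number of characters that would have been written, so a
    ** result >= dst_size tells the caller the output was truncated */
    return snprintf(dst, dst_size, "%08x%08x%08x", a, b, c);
}

Called as concat3(n3, sizeof(n3), ...), the worst case becomes a truncated string instead of a crash.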

Related

How to convert an arbitrary length unsigned int array to a base 10 string representation?

I am currently working on an arbitrary size integer library for learning purposes.
Each number is represented as uint32_t *number_segments.
I have functional arithmetic operations, and the ability to print the raw bits of my number.
However, I have struggled to find any information on how I could convert my arbitrarily long array of uint32 into the correct, and also arbitrarily long base 10 representation as a string.
Essentially I need a function along the lines of:
std::string uint32_array_to_string(uint32_t *n, size_t n_length);
Any pointers in the right direction would be greatly appreciated, thank you.
You do it the same way as with a single uint64_t, just on a larger scale (bringing this into modern C++ is left to the reader):
#include <stdint.h>

char * to_str(uint64_t x) {
    static char buf[23] = {0}; // leave space for a minus sign added by the caller
    char *p = &buf[22];
    do {
        *--p = '0' + (x % 10);
        x /= 10;
    } while (x > 0);
    return p;
}
The function fills the buffer from the end with the lowest digits, dividing the number by 10 at each step, and then returns a pointer to the first digit.
Now with bignums you can't use a static buffer; you have to adjust the buffer size to the size of your number. You probably want to return a std::string, and creating the digits in reverse and then copying them into a result string is the way to go. You also have to deal with negative numbers.
Since long division of a big number is expensive, you probably don't want to divide by 10 in the loop. Rather, divide by 1'000'000'000 and convert the remainder into 9 digits. That is the largest power of 10 for which the long division still divides by a single integer (bignum / integer, not bignum / bignum). You might only be able to use 10'000 if you can't use uint64_t in the division.
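A rough sketch of that repeated-division approach, assuming the limbs in number_segments are stored least-significant first (flip the inner loop if yours are big-endian):

#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

std::string uint32_array_to_string(const uint32_t *n, size_t n_length) {
    std::vector<uint32_t> limbs(n, n + n_length);
    const uint64_t base = 1000000000; // 10^9, largest power of 10 below 2^32
    std::string out;
    while (!limbs.empty()) {
        // one long division of the whole number by 10^9, most significant limb first
        uint64_t rem = 0;
        for (size_t i = limbs.size(); i-- > 0; ) {
            uint64_t cur = (rem << 32) | limbs[i];
            limbs[i] = static_cast<uint32_t>(cur / base);
            rem = cur % base;
        }
        while (!limbs.empty() && limbs.back() == 0)
            limbs.pop_back(); // drop limbs that became zero
        char chunk[16];
        if (limbs.empty()) // most significant chunk: no leading zeros
            std::snprintf(chunk, sizeof chunk, "%llu", (unsigned long long)rem);
        else               // inner chunks: zero-padded to exactly 9 digits
            std::snprintf(chunk, sizeof chunk, "%09llu", (unsigned long long)rem);
        out.insert(0, chunk);
    }
    return out.empty() ? "0" : out;
}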

Big File reading error in C++

I need to read a file in c++ that has this specific format:
10 5
1 2 3 4 1 5 1 5 2 1
All the values are separated by a space. The first 2 on the first line are the variables N and M respectively, and all the N values from the second line need to go into an array called S with the size of N. The code I have written has no problem with files like these, but it does not work with really big files containing millions of values, which I need it to handle. Here is the code:
int N, M;
FILE *read = fopen("file.in", "r");
fscanf(read, "%d %d ", &N, &M);
int S[N];
for (int i = 0; i < N; i++) {
    fscanf(read, "%d ", &S[i]);
}
What should I change?
There are multiple potential issues when getting in the range of millions of integers:
int is most often 32 bits; a 32-bit signed integer has a range of -2^31 to 2^31 - 1, and thus a maximum of 2,147,483,647. You should switch to a 64-bit integral type.
You are using int S[N], a Variable Length Array (VLA), which is not Standard C++ (it is Standard C99, but... there are discussions as to whether it was a good idea or not). The important detail, though, is that a VLA is stored on the stack: 1 million 32-bit ints is 4 MB, 2 million is 8 MB, etc... Check your default stack size; it is likely less than 8 MB, and thus you get a stack overflow (you're on the right site for help!).
So, let's switch to C++ and do away with those issues:
#include <cstdint> // for int64_t
#include <fstream>
#include <vector>

int main(int argc, char* argv[]) {
    std::ifstream stream("data.txt");

    int64_t n = 0, m = 0;
    stream >> n >> m;

    std::vector<int> data;
    for (int64_t c = 0; c != n; ++c) {
        int i = 0;
        stream >> i;
        data.push_back(i);
    }

    // do your best :)
}
First of all, we use int64_t from <cstdint> to do away with the integer overflow issue. Second, we use a stream (input file stream: ifstream) to avoid having to learn what is the format associated with each and every integral type (it's a pain). Third, we use a vector to store the data we read, and do away with the stack overflow issue.
You are using variable-sized arrays. This is not standard and not supported by all compilers. If your compiler supports it and you go into the millions, you'll run out of stack space (a stack overflow).
Alternatively, you could define S as a vector, with std::vector<int> S(N);
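For instance, a minimal change to the question's loop, keeping its fscanf reading and only swapping the VLA for a vector:

#include <vector>

std::vector<int> S(N);
for (int i = 0; i < N; i++) {
    fscanf(read, "%d ", &S[i]); // vector storage is contiguous, so &S[i] is fine
}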

How can I store 2 numbers in a 1-byte char?

My question is in the title; if that's not possible, how could I get away with using only 4 bits to represent an integer?
EDIT: really my question is how. I am aware that there are 1-byte data types in a language like C, but how could I use something like a char to store two integers?
In C or C++ you can use a struct with bit-fields to allocate the required number of bits to a variable, as given below:
#include <stdio.h>

struct packed {
    unsigned char a:4, b:4;
};

int main() {
    struct packed p;
    p.a = 10;
    p.b = 20;
    printf("p.a %d p.b %d size %zu\n", p.a, p.b, sizeof(struct packed));
    return 0;
}
The output is p.a 10 p.b 4 size 1, showing that p takes only 1 byte to store, and that numbers with more than 4 bits (larger than 15) get truncated, so 20 (0x14) becomes 4. This is simpler to use than the manual bitshifting and masking used in the other answer, but it is probably not any faster.
You can store two 4-bit numbers in one byte (call it b which is an unsigned char).
Using hex it is easy to see: in b = 0xAE the two numbers are A and E.
Use a mask to isolate them:
a = (b & 0xF0) >> 4
and
e = b & 0x0F
You can easily define functions to set/get both numbers in the proper portion of the byte.
Note: if the 4-bit numbers need to have a sign, things can become a tad more complicated since the sign must be extended correctly when packing/unpacking.
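For example, a small sketch of such set/get helpers (the names pack_nibbles, high_nibble and low_nibble are mine):

#include <stdio.h>

/* store a in the high half of the byte, e in the low half */
unsigned char pack_nibbles(unsigned a, unsigned e) {
    return (unsigned char)(((a & 0x0F) << 4) | (e & 0x0F));
}

unsigned high_nibble(unsigned char b) { return (b & 0xF0) >> 4; }
unsigned low_nibble(unsigned char b)  { return b & 0x0F; }

int main(void) {
    unsigned char b = pack_nibbles(0xA, 0xE); /* b == 0xAE */
    printf("%X %X\n", high_nibble(b), low_nibble(b)); /* prints "A E" */
    return 0;
}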

Splitting a char array into a sequence of ints and floats

I'm writing a program in C++ that listens to a stream of TCP messages from another program that gives tracking data from a webcam. I have the socket connected and I'm getting all the information in, but I'm having difficulty splitting it up into the data I want.
Here's the format of the data coming in:
8 byte header:
4 character string,
integer
32 byte message:
integer,
float,
float,
float,
float,
float
This is all being stuck into a char array called buffer. I need to be able to parse out the different bytes into the primitives I need. I have tried making smaller sub-arrays, such as headerString, filled by looping through and copying the first 4 elements of the buffer array, and I do get the correct header ('CCV ') printed out. But when I try the same thing with the next four elements (to get the integer) and print it out, I get weird ASCII characters. I've tried converting the headerInt array to an integer with the atoi method from stdlib.h, but it always prints out zero.
I've already done this in Python using the excellent unpack method; is there any alternative in C++?
Any help greatly appreciated,
Jordan
Links
CCV packet structure
Python unpack method
The buffer only contains the raw image of what you read over the network. You'll have to convert the bytes in the buffer to whatever format you want. The string is easy:
std::string s(buffer + sOffset, 4);
(Assuming, of course, that the internal character encoding is the same as in the file, probably an extension of ASCII.)
The others are more complicated, and depend on the format of the external data. From the description of the header, I gather that the integers are four bytes, but that still doesn't tell me anything about their representation. Depending on the case, either:
int getInt(unsigned char* buffer, int offset)
{
return (buffer[offset ] << 24)
| (buffer[offset + 1] << 16)
| (buffer[offset + 2] << 8)
| (buffer[offset + 3] );
}
or
int getInt(unsigned char* buffer, int offset)
{
return (buffer[offset + 3] << 24)
| (buffer[offset + 2] << 16)
| (buffer[offset + 1] << 8)
| (buffer[offset ] );
}
will probably do the trick. (Other four-byte representations of integers are possible, but they are exceedingly rare. Similarly, the conversion of the unsigned result of the shifts and ors into an int is implementation-defined, but in practice, the above will work almost everywhere.)
The only hint you give concerning the representation of the floats is in the message format: 32 bytes, minus a 4-byte integer, leaves 28 bytes for 5 floats; but 28 isn't divisible by 5, so I cannot even guess at the length of the floats (except that there must be some padding in there somewhere). Converting floating point can be more or less complicated if the external format isn't exactly like the internal format.
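If the external floats do turn out to be IEEE 754 single precision (common, but an assumption here), a sketch building on getInt above:

#include <cstdint>
#include <cstring>

// assumes the wire format is IEEE 754 single precision and that the host's
// float is too; getInt (above) assembles the bytes in the right order
float getFloat(unsigned char* buffer, int offset)
{
    uint32_t bits = static_cast<uint32_t>(getInt(buffer, offset));
    float f;
    std::memcpy(&f, &bits, sizeof f); // bit-for-bit copy, avoids aliasing problems
    return f;
}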
Something like this may work:
struct Header {
    char string[4];
    int integers[2];
    float floats[5];
};

Header* header = (Header*)buffer;
You should check that sizeof(Header) == 32.
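With C++11 you can make that size check fail at compile time instead of at run time:

static_assert(sizeof(Header) == 32, "Header doesn't match the wire format (padding?)");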

format specifier for short integer

I'm not using the format specifiers in C correctly. A few lines of code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char dest[] = "stack";
    unsigned short val = 500;
    char c = 'a';
    char* final = (char*) malloc(strlen(dest) + 6);
    snprintf(final, strlen(dest)+6, "%c%c%hd%c%c%s", c, c, val, c, c, dest);
    printf("%s\n", final);
    return 0;
}
What I want is to copy:
final[0] = a random char
final[1] = a random char
final[2] and final[3] = the two bytes of the short
final[4] = another char ...
My problem is that I want to copy the two bytes of the short int into 2 bytes of the final array.
thanks.
I'm confused - the problem is that you are saying strlen(dest)+6, which limits the length of the final string to 10 chars (plus a null terminator). If you say strlen(dest)+8, there will be enough space for the full string.
Update
Even though a short may be only 2 bytes in size, when it is printed as text each digit takes up one byte. That means it can require up to 5 bytes of space to write a short to a string, if you are writing a number of 10000 or above.
Now, if you write the short to a string as a hexadecimal number using the %x format specifier, it will take up no more than 4 characters (0xFFFF is the largest unsigned short).
You need to allocate space for 13 characters, not 11. Don't forget the terminating null.
When formatted, the number (500) takes up three characters, not two. So your snprintf should be given a length of strlen(dest) + 8 (four chars, three digits, and the terminator); then also fix your malloc call to match. If you want to compute the length of the number at run time, you can use something like strlen(itoa(val)) (note that itoa is non-standard). Also, don't forget the '\0' at the end; strlen does not count it, so you must add 1 yourself.
The simple answer is that you only allocated strlen(dest) + 6 characters when in reality you need 8 extra characters beyond dest: 2 chars + 3 chars for your number + 2 chars after + the terminator = 8, so with dest (5 chars) that's 13 chars, while you allocated only 11.
Unsigned shorts can take up to 5 characters, right? (0 - 65535)
Seems like you'd need to allocate 5 characters for your unsigned short to cover all of the values.
Which would point to using this:
char* final = (char*) malloc(strlen(dest) + 10);
You lose one byte because you think the short variable takes 2 bytes, but as text it takes three: one for each digit character ('5', '0', '0'). You also need a '\0' terminator (+1 byte).
==> You need strlen(dest) + 8
Use 8 instead of 6 on:
char* final = (char*) malloc(strlen(dest) + 6);
and
snprintf(final, strlen(dest)+6, "%c%c%hd%c%c%s", c, c, val, c, c, dest);
Seems like the primary misunderstanding is that a "2-byte" short can't be represented on-screen as 2 1-byte characters.
First, leave enough room:
char* final = (char*) malloc(strlen(dest) + 9);
The entire range of possible values of a 1-byte character is not printable. If you want to display this on screen and have it be readable, you'll have to encode the 2-byte short as 4 hex characters, such as:
/* as hex, 4 characters */
snprintf(final, strlen(dest) + 9, "%c%c%4x%c%c%s", c, c, val, c, c, dest);
If you are writing this to a file, that's OK, and you might try the following:
/* print raw bytes: upper byte, then lower byte */
snprintf(final, strlen(dest) + 9, "%c%c%c%c%c%c%s", c, c, (val >> 8) & 0xFF, val & 0xFF, c, c, dest);
But that won't make sense to a human looking at it, and is sensitive to endianness. I'd strongly recommend against it.