Simplifying my program to convert numbers from one base to another - c++

I'm taking a beginner C++ course. I received an assignment telling me to write a program that converts an arbitrary number from any base between binary and hex to another base between binary and hex. I was asked to use separate functions to convert to and from base 10. It was to help us get used to using arrays. (We already covered passing by reference previously in class.) I already turned this in, but I'm pretty sure this wasn't how I was meant to do it:
#include <iostream>
#include <conio.h>
#include <cstring>
#include <cmath>
using std::cout;
using std::cin;
using std::endl;
int to_dec(char value[], int starting_base);
char* from_dec(int value, int ending_base);
int main() {
char value[30];
int starting_base;
int ending_base;
cout << "This program converts from one base to another, so long as the bases are" << endl
<< "between 2 and 16." << endl
<< endl;
input_numbers:
cout << "Enter the number, then starting base, then ending base:" << endl;
cin >> value >> starting_base >> ending_base;
if (starting_base < 2 || starting_base > 16 || ending_base < 2 || ending_base > 16) {
cout << "Invalid base(s). ";
goto input_numbers;
}
for (int i=0; value[i]; i++) value[i] = toupper(value[i]);
cout << "Base " << ending_base << ": " << from_dec(to_dec(value, starting_base), ending_base) << endl
<< "Press any key to exit.";
getch();
return 0;
}
int to_dec(char value[], int starting_base) {
char hex[16] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};
long int return_value = 0;
unsigned short int digit = 0;
for (short int pos = strlen(value)-1; pos > -1; pos--) {
for (int i=0; i<starting_base; i++) {
if (hex[i] == value[pos]) {
return_value+=i*pow((float)starting_base, digit++);
break;
}
}
}
return return_value;
}
char* from_dec(int value, int ending_base) {
char hex[16] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};
char *return_value = (char *)malloc(30);
unsigned short int digit = (int)ceil(log10((double)(value+1))/log10((double)ending_base));
return_value[digit] = 0;
for (; value != 0; value/=ending_base) return_value[--digit] = hex[value%ending_base];
return return_value;
}
I'm pretty sure this is more advanced than it was meant to be. How do you think I was supposed to do it?
I'm essentially looking for two kinds of answers:
Examples of what a simple solution like the one my teacher probably expected would be.
Suggestions on how to improve the code.

I don't think you need the inner loop:
for (int i=0; i<starting_base; i++) {
What is its purpose?
Rather, you should get the character at value[ pos ] and convert it to an integer. The conversion depends on base, so it may be better to do it in a separate function.
You are defining char hex[ 16 ] twice, once in each function. It may be better to do it in only one place.
EDIT 1:
Since this is "homework" tagged, I cannot give you the full answer. However, here is an example of how to_dec() is supposed to work. (Ideally, you should have constructed this!)
Input:
char * value = 3012,
int base = 4,
Math:
Number = 3 * 4^3 + 0 * 4^2 + 1 * 4^1 + 2 * 4^0 = 192 + 0 + 4 + 2 = 198
Expected working of the loop:
x = 0
x = 4x + 3 = 3
x = 4x + 0 = 12
x = 4x + 1 = 49
x = 4x + 2 = 198
return x;
EDIT 2:
Fair enough! So, here is some more :-)
Here is a code sketch. Not compiled or tested though. This is direct translation of the example I provided earlier.
unsigned
to_dec( char * inputString, unsigned base )
{
unsigned rv = 0; // return value
unsigned c; // character converted to integer
for( char * p = inputString; *p; ++p ) // p iterates through the string
{
c = *p - hex[0]; // digit value; this works for '0'-'9' only, letters need a lookup in hex
rv = base * rv + c;
}
return rv;
}
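The accumulation x = base * x + digit from the worked example can be combined with a separate character-to-digit function, as suggested above. The following is only a sketch under a few assumptions: digit_value and to_value are hypothetical names (not from the original program), the character set is ASCII-like so that A-F are contiguous, and overflow of the result is not checked.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical helper: maps one character to its digit value, or -1 if it is not a hex digit.
int digit_value(char ch) {
    unsigned char u = static_cast<unsigned char>(ch);
    if (std::isdigit(u)) return ch - '0';
    char up = static_cast<char>(std::toupper(u));
    if (up >= 'A' && up <= 'F') return up - 'A' + 10;  // assumes A-F are contiguous (true for ASCII)
    return -1;
}

// Horner-style accumulation: result = base * result + digit, most significant digit first.
// Assumes 2 <= base <= 16; overflow of unsigned is not detected.
unsigned to_value(const std::string& text, unsigned base) {
    if (text.empty()) throw std::invalid_argument("empty input");
    unsigned result = 0;
    for (char ch : text) {
        int d = digit_value(ch);
        if (d < 0 || static_cast<unsigned>(d) >= base)
            throw std::invalid_argument("digit out of range for base");
        result = base * result + static_cast<unsigned>(d);
    }
    return result;
}
```

With the example input, to_value("3012", 4) walks through 3, 12, 49 and ends at 198, the same sequence as the hand calculation.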

I would stay away from GOTO statements unless they are absolutely necessary. GOTO statements are easy to use but will lead to 'spaghetti code'.
Try using a loop instead. Something along the lines of this:
bool base_is_invalid = true;
while ( base_is_invalid ) {
cout << "Enter the number, then starting base, then ending base:" << endl;
cin >> value >> starting_base >> ending_base;
if (starting_base < 2 || starting_base > 16 || ending_base < 2 || ending_base > 16)
cout << "Invalid number. ";
else
base_is_invalid = false;
}

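You can initialize arrays with string literals. In C++ the array needs room for the terminating \0, so 16 digits take 17 elements (C accepts char const hex[16] and silently drops the terminator, but C++ rejects it as having too many initializers):
char const hex[17] = "0123456789ABCDEF";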
Or just use a pointer to the string literal for the same effect:
char const* hex = "0123456789ABCDEF";

to_dec() looks too complicated; here is my shot at it:
int to_dec(char* value, int starting_base)
{
int return_value = 0;
// walk from the most significant digit to the least significant one
for (char* cur = value; *cur != '\0'; cur++) {
// assuming chars are ascii/utf: 0-9=48-57, A-F=65-70
// faster than loop
int inval = *cur - 48;
if (inval > 9) {
inval = *cur - 55;
if (inval < 10 || inval > 15) {
// throw input error (also catches the characters between '9' and 'A')
}
}
if (inval < 0) {
// throw input error
}
if (inval >= starting_base) {
// throw input error
}
// now the simple calc
return_value *= starting_base;
return_value += inval;
}
return return_value;
}

For the initial conversion from ASCII to an integer, you can also use a lookup table (just as you are using a lookup table to do the conversion the other way around), which is much faster than searching through the array for every digit.
int to_dec(char value[], int starting_base)
{
signed char asc2BaseTab[] = {0,1,2,3,4,5,6,7,8,9,-1,-1,-1,-1,-1,-1,-1,10,11,12,13,14,15, //0-9 and A-F (big caps)
-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, //unused ascii chars
10,11,12,13,14,15}; //a-f (small caps)
int number=0;
for(int srcIdx = 0; value[srcIdx] != '\0'; ++srcIdx) //most significant digit first
{
number *= starting_base;
char asciiDigit = value[srcIdx];
if(asciiDigit<'0' || asciiDigit>'f')
{
//display input error
}
signed char digit = asc2BaseTab[asciiDigit - '0'];
if(digit == -1 || digit >= starting_base)
{
//display input error
}
number += digit;
}
return number;
}
P.S. Apologies if there are some compile errors in this; I couldn't test it, but the logic is sound.

In your description of the assignment as given it says:
"I was asked to use separate functions to convert to and from base 10."
If that is really what the teacher meant and wanted, which is doubtful, your code doesn't do that:
int to_dec(char value[], int starting_base)
is returning an int which is a binary number. :-) Which in my opinion does make more sense.
Did the teacher even notice that?

C and C++ are different languages with different styles of programming. It is better not to mix them. (Where C and C++ differ)
If you are trying to use C++, then:
Use std::string instead of char* or char[].
int to_dec(string value, int starting_base);
string from_dec(int value, int ending_base);
Avoid malloc; if you must allocate manually in C++, use new/delete. In practice, though, types such as std::string manage their memory automatically: it is freed as soon as the variable goes out of scope (unless you are dealing with raw pointers). And raw pointers are the last thing you want to deal with.
We don't need here any lookup tables, just a magic string.
string hex = "0123456789ABCDEF";//The index of the letter is its decimal value. A is 10, F is 15.
//usage
char c = 'B';
int value = hex.find( c );//works only with uppercase;
The refactored to_dec can be like that.
int to_dec(string value, int starting_base) {
string hex = "0123456789ABCDEF";
int result = 0;
for (int power = 0; power < value.size(); ++power) {
result += hex.find( value.at(value.size()-power-1) ) * pow((float)starting_base, power);
}
return result;
}
And there is a more elegant algorithm to convert from base 10 to any other
See there for example. You have the opportunity to code it yourself :)

In your from_dec function, you're converting the digits from left to right. An alternative is to convert from right to left. That is,
std::string from_dec(int n, int base)
{
std::string result;
bool is_negative = n < 0;
if (is_negative)
{
n = - n;
}
do // do-while, so that n == 0 yields "0" rather than an empty string
{
result = DIGITS[n % base] + result;
n /= base;
} while (n != 0);
if (is_negative)
{
result = '-' + result;
}
return result;
}
This way, you won't need the log function.
(BTW, to_dec and from_dec are inaccurate names. Your computer doesn't store numbers in base 10.)
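For reference, here is a self-contained sketch of the same right-to-left idea. The names from_value and the exact contents of DIGITS are assumptions (the snippet above does not show how DIGITS is defined), and the base is assumed to lie between 2 and 16.

```cpp
#include <string>

// Digit table; the index of each character is its value (assumed definition).
const char DIGITS[] = "0123456789ABCDEF";

// Builds the representation right to left; assumes 2 <= base <= 16.
std::string from_value(int n, int base) {
    // Work with an unsigned magnitude so that the most negative int does not overflow on negation.
    unsigned magnitude = n < 0 ? 0u - static_cast<unsigned>(n) : static_cast<unsigned>(n);
    std::string result;
    do {
        result.insert(result.begin(), DIGITS[magnitude % base]);
        magnitude /= base;
    } while (magnitude != 0);   // do-while so that 0 yields "0", not ""
    if (n < 0) result.insert(result.begin(), '-');
    return result;
}
```

For example, from_value(198, 4) produces the remainders 2, 1, 0, 3 in that order and therefore returns "3012".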

Got this question on an interview once and brainfarted and spun wheels for a while. Go figure. Anyway, a couple years later I'm going through Math and Physics for Programmers to brush up for positions that are more math intensive than what I've been doing. CH1 "assignment" has
// Write a function ConvertBase(Number, Base1, Base2) which takes a
// string or array representing an integer in Base1 and converts it
// into base Base2, returning the new string.
So, I took an approach mentioned above: I convert string in arbitrary base to UINT64, then I convert UINT64 back to arbitrary base:
CString ConvertBase(const CString& strNumber, int base1, int base2)
{
return ValueToBaseString(BaseStringToValue(strNumber, base1), base2);
}
Each of the subfunctions has a recursive solution. Here's one for example:
UINT64 BaseStringToValue(const CString& strNumber, int base)
{
if (strNumber.IsEmpty())
{
return 0;
}
CString outDigit = strNumber.Right(1);
UINT64 output = DigitToInt(outDigit[0]);
CString strRemaining = strNumber.Left(strNumber.GetLength() - 1);
UINT64 val = BaseStringToValue(strRemaining, base);
output += val * base;
return output;
}
I find the other one slightly harder to grasp mentally, but it works roughly the same way.
I also implemented DigitToInt and IntToDigit, which work just like they sound. You can take some neat shortcuts there, by the way: if you realize that chars are ints, you don't need huge switch statements:
int DigitToInt(wchar_t cDigit)
{
cDigit = toupper(cDigit);
if (cDigit >= '0' && cDigit <= '9')
{
return cDigit - '0';
}
return cDigit - 'A' + 10;
}
and unit tests are really your friend here:
typedef struct
{
CString number;
int base1;
int base2;
CString answer;
} Input;
Input input[] =
{
{ "345678", 10, 16, "5464E"},
{ "FAE211", 16, 8, "76561021" },
{ "FAE211", 16, 2, "111110101110001000010001"},
{ "110110111", 2, 10, "439" }
};
(snip)
for (int i = 0 ; i < sizeof(input) / sizeof(input[0]) ; i++)
{
CString result = ConvertBase(input[i].number, input[i].base1, input[i].base2);
printf("%S in base %d is %S in base %d (%S expected - %s)\n", (const WCHAR*)input[i].number,
input[i].base1,
(const WCHAR*) result,
input[i].base2,
(const WCHAR*) input[i].answer,
result == input[i].answer ? "CORRECT" : "WRONG");
}
And here's the output:
345678 in base 10 is 5464E in base 16 (5464E expected - CORRECT)
FAE211 in base 16 is 76561021 in base 8 (76561021 expected - CORRECT)
FAE211 in base 16 is 111110101110001000010001 in base 2 (111110101110001000010001 expected - CORRECT)
110110111 in base 2 is 439 in base 10 (439 expected - CORRECT)
Now I took some shortcuts in coding by using CString types, etc. I was giving no consideration to efficiency or performance, I just wanted to solve the algorithm with easiest coding possible.
It can help to understand how these algorithms are recursive if you write them like so: Say you want to determine the "value" of the "string" B4A3, which is in base 13. You know it's 3 + 13(A) + 13(13)(4) + 13(13)(13)(B) Another way to write that is: 0+3+13(A+13(4+13(B))) - and voila! Recursion.
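The nested form 0+3+13(A+13(4+13(B))) translates almost literally into code. Below is a rough sketch of both directions in standard C++ rather than CString; it is not the code from this answer, the function names are hypothetical, input is assumed to be valid, and an ASCII-like character set is assumed.

```cpp
#include <cstdint>
#include <string>

// Hypothetical digit helper: assumes a valid alphanumeric digit in an ASCII-like charset.
int digit_to_int(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    return c - 'A' + 10;
}

// value("B4A3") = 3 + 13 * value("B4A"), with value("") = 0.
std::uint64_t base_string_to_value(const std::string& s, int base) {
    if (s.empty()) return 0;
    std::uint64_t last = static_cast<std::uint64_t>(digit_to_int(s.back()));
    return last + static_cast<std::uint64_t>(base)
                      * base_string_to_value(s.substr(0, s.size() - 1), base);
}

// The inverse: representation(v) = representation(v / base) followed by the digit v % base.
std::string value_to_base_string(std::uint64_t v, int base) {
    const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    std::uint64_t b = static_cast<std::uint64_t>(base);
    std::string head = v >= b ? value_to_base_string(v / b, base) : std::string();
    return head + digits[v % b];
}
```

For the example string, base_string_to_value("B4A3", 13) evaluates 3 + 13 * (10 + 13 * (4 + 13 * 11)) = 24976.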

Apart from the things already mentioned, I would suggest using the new operator instead of malloc. The advantages of new are that it also calls constructors (irrelevant here since you're using a POD type, but important when it comes to objects such as std::string or your own custom classes) and that you can overload the new operator to suit your specific needs (which is irrelevant here, too :p). But don't use malloc for PODs and new for classes, since mixing them is considered bad style.
But okay, you got yourself some heap memory in from_dec... but where is it freed again? Basic rule: memory that you malloc (or calloc etc) must be passed to free at some point. The same rule applies to the new-operator, just that the release-operator is called delete. Note that for arrays, you need new[] and delete[]. DON'T ever allocate with new and release with delete[] or the other way around, since the memory won't be released correctly.
Nothing evil will happen if your toy program doesn't release the memory... I guess your PC has enough RAM to cope with it, and when you shut down your program, the OS releases the memory anyway. But not all programs are (a) that tiny and (b) shut down often.
Also I'd avoid conio.h, since this is not portable. You're not using the most complicated IO, so the standard headers (iostream etc) should do.
Likewise, I think most programmers using modern languages follow the rule "Only use goto if other solutions are really crippled or tons of more work". This is a situation that can be easily solved by using loops, as shown by emceefly. In your program the goto is easy to handle, but you won't be writing such small programs forever, will you? ;)
I, for example, was presented with some legacy code recently.. 2000 lines of goto-littered code, yay! Trying to follow the code's logical flow was almost impossible ("Oh, jump ahead 200 lines, great... who needs context anyway"), even harder was to rewrite the damn thing.
So okay, your goto doesn't hurt here, but where's the benefit? 2-3 lines shorter? Doesn't really matter overall (if you're paid by lines of code, this could also be a major disadvantage ;)). Personally I find the loop version more readable and clean.
As you see, most of the points here can be ignored easily for your program, since it's a toy program. But when you think of larger programs, they make more sense (hopefully) ;)

Related

Conversion from Integer to BCD

I want to convert an integer (whose maximum value can reach 99999999) to BCD and store it in an array of 4 characters.
Like for example:
Input is : 12345 (Integer)
Output should be = "00012345" in BCD which is stored in to array of 4 characters.
Here 0x00 0x01 0x23 0x45 stored in BCD format.
I tried it in the manner below, but it didn't work:
int decNum = 12345;
long aux;
aux = (long)decNum;
cout<<" aux = "<<aux<<endl;
char* str = (char*)& aux;
char output[4];
int len = 0;
int i = 3;
while (len < 8)
{
cout <<"str: " << len << " " << (int)str[len] << endl;
unsigned char temp = str[len]%10;
len++;
cout <<"str: " << len << " " << (int)str[len] << endl;
output[i] = ((str[len]) << 4) | temp;
i--;
len++;
}
Any help will be appreciated
str points actually to a long (probably 4 bytes), but the iteration accesses 8 bytes.
The operation str[len]%10 looks as if you are expecting digits, but there is only binary data. In addition I suspect that i gets negative.
First, don't use C-style casts (like (long)a or (char*)). They are a bad smell. Instead, learn and use C++ style casts (like static_cast<long>(a)), because they point out where you are doing things that are dangerous, instead of just silently working and causing undefined behavior.
char* str = (char*)& aux; gives you a pointer to the bytes of aux -- it is actually char* str = reinterpret_cast<char*>(&aux);. It does not give you a traditional string with digits in it. sizeof(char) is 1, and sizeof(long) is 4 on many platforms (for example 32-bit systems and 64-bit Windows), in which case there are only 4 valid bytes in your aux variable. You proceed to try to read 8 of them.
I doubt this is doing what you want it to do. If you want to print out a number into a string, you will have to run actual code, not just reinterpret bits in memory.
std::string s; std::stringstream ss; ss << aux; ss >> s; will create a std::string with the base-10 digits of aux in it.
Then you can look at the characters in s to build your BCD.
This is far from the fastest method, but it at least is close to your original approach.
First of all, sorry about the C code; I was deceived since this started as a C question. Porting it to C++ should not really be a big deal.
If you really want the result in a char array, I would do something like the following code. I find it useful to leave the result in little-endian format so that I can just cast it to an int for printing, although that is not strictly necessary:
#include <stdio.h>
typedef struct
{
char value[4];
} BCD_Number;
BCD_Number bin2bcd(int bin_number);
int main(int args, char **argv)
{
BCD_Number bcd_result;
bcd_result = bin2bcd(12345678);
/* Assuming an int is 4 bytes */
printf("result=0x%08x\n", *((int *)bcd_result.value));
}
BCD_Number bin2bcd(int bin_number)
{
BCD_Number bcd_number;
for(int i = 0; i < sizeof(bcd_number.value); i++)
{
bcd_number.value[i] = bin_number % 10;
bin_number /= 10;
bcd_number.value[i] |= bin_number % 10 << 4;
bin_number /= 10;
}
return bcd_number;
}
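If the bytes should come out in the order the question shows (0x00 0x01 0x23 0x45 for 12345), the same digit-peeling loop can fill the array from the back. This is only a sketch: to_bcd is a hypothetical name, and the input is assumed to be in the range 0 to 99999999.

```cpp
#include <array>
#include <cstdint>

// Packs a value in the range 0..99999999 into 4 bytes, two decimal digits per byte,
// most significant pair first (12345 -> 0x00 0x01 0x23 0x45).
std::array<std::uint8_t, 4> to_bcd(std::uint32_t n) {
    std::array<std::uint8_t, 4> out{};
    for (int i = 3; i >= 0; --i) {
        std::uint8_t low = static_cast<std::uint8_t>(n % 10);   // units digit of this pair
        n /= 10;
        std::uint8_t high = static_cast<std::uint8_t>(n % 10);  // tens digit of this pair
        n /= 10;
        out[i] = static_cast<std::uint8_t>((high << 4) | low);
    }
    return out;
}
```

Using std::uint8_t avoids the sign problems that plain char can cause when the high nibble is 8 or 9.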

Converting a string of numbers to any form of an int

As a part of a larger program, I must convert a string of numbers to an integer(eventually a float). Unfortunately I am not allowed to use casting, or atoi.
I thought a simple operation along the lines of this:
void power10combiner(string deciValue){
int result;
int MaxIndex=strlen(deciValue);
for(int i=0; MaxIndex>i;i++)
{
result+=(deciValue[i] * 10**(MaxIndex-i));
}
}
would work. How do I convert a char to a int? I suppose I could use ASCII conversions, but I wouldn't be able to add chars to ints anyways(assuming that the conversion method is to have an enormous if statement that returns the different numerical value behind each ASCII number).
There are plenty of ways to do this, and there are some optimization and corrections that can be done to your function.
1) You are not returning any value from your function, so the return type is now int.
2) You can optimize this function by passing a const reference.
Now for the examples.
Using std::stringstream to do the conversion.
int power10combiner(const string& deciValue)
{
int result;
std::stringstream ss;
ss << deciValue.c_str();
ss >> result;
return result;
}
Without using std::stringstream to do the conversion.
int power10combiner(const string& deciValue)
{
int result = 0;
for (int pos = 0; deciValue[pos] != '\0'; pos++)
result = result*10 + (deciValue[pos] - '0');
return result;
}
EDITED by suggestion, and added a bit of explanation.
int base = 1;
int len = strlen(deciValue);
int result = 0;
for (int i = (len-1); i >= 0; i--) { // Loop right to left. Is this off by one? Too tired to check.
result += (int(deciValue[i] - '0') * base); // '0' means "where 0 is" in the character set. We are doing the conversion int() because it will try to multiply it as a character value otherwise; we must cast it to int.
base *= 10; // This raises the base...it's exponential but simple and uses no outside means
}
This assumes the string is only numbers. Please comment if you need more clarification.
You can parse a string iteratively into an integer by simply implementing the place-value system, for any number base. Assuming your string is null-terminated and the number unsigned:
unsigned int parse(const char * s, unsigned int base)
{
unsigned int result = 0;
for ( ; *s; ++s)
{
result *= base;
result += *s - '0'; // see note
}
return result;
}
As written, this only works for number bases up to 10 using the numerals 0, ..., 9, which are guaranteed to be arranged in order in your execution character set. If you need larger number bases or more liberal sets of symbols, you need to replace *s - '0' in the indicated line by a suitable lookup mechanism that determines the digit value of your input character.
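One possible lookup mechanism is a string of symbols in which the position of each character is its digit value. The sketch below assumes the symbols 0-9 and a-z (case-insensitive), which covers bases up to 36; digit_of and parse_any_base are hypothetical names, and overflow is not checked.

```cpp
#include <cctype>
#include <cstring>

// Returns the digit value of c, or -1 if c is not one of the 36 symbols below.
int digit_of(char c) {
    static const char symbols[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    if (c == '\0') return -1;  // strchr would otherwise match the terminator
    char lower = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    const char* hit = std::strchr(symbols, lower);
    return hit ? static_cast<int>(hit - symbols) : -1;
}

// Same place-value loop; stops at the first character that is not a digit in this base.
unsigned int parse_any_base(const char* s, unsigned int base) {
    unsigned int result = 0;
    for (; *s; ++s) {
        int d = digit_of(*s);
        if (d < 0 || static_cast<unsigned int>(d) >= base) break;
        result = result * base + static_cast<unsigned int>(d);
    }
    return result;
}
```

Because the lookup does not depend on letters being contiguous in the character set, it makes no assumption beyond the contents of the symbols string.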
I would use std::stringstream, but nobody has posted a solution using strtol yet, so here is one. Note that it doesn't handle out-of-range errors. To detect them, set errno to 0 before the call and compare it to ERANGE afterwards (this is standard C behaviour, not specific to unix/linux).
BTW, there are strtod/strtof/strtold functions for floating-point numbers.
#include <iostream>
#include <cstdlib>
#include <string>
int power10combiner(const std::string& deciValue){
const char* str = deciValue.c_str();
char* end; // the pointer to the first incorrect character if there is such
// strtol/strtoll accept the desired base as their third argument
long int res = strtol(str, &end, 10);
if (deciValue.empty() || *end != '\0') {
// handle error somehow, for example by throwing an exception
}
return res;
}
int main()
{
std::string s = "100";
std::cout << power10combiner(s) << std::endl;
}
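The out-of-range check mentioned above could be added roughly like this. It is a sketch, not a drop-in replacement: parse_int is a hypothetical name, and reporting failure through a bool return value is just one of several reasonable choices.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical wrapper: returns true and stores the value only if the whole string is a valid int.
bool parse_int(const std::string& text, int& out) {
    if (text.empty()) return false;
    errno = 0;                       // strtol only sets errno on failure, so reset it first
    char* end = nullptr;
    long value = std::strtol(text.c_str(), &end, 10);
    if (end == text.c_str() || *end != '\0') return false;   // no digits, or trailing junk
    if (errno == ERANGE) return false;                        // does not fit in long
    if (value < INT_MIN || value > INT_MAX) return false;     // fits in long but not in int
    out = static_cast<int>(value);
    return true;
}
```

The extra INT_MIN/INT_MAX comparison matters on platforms where long is wider than int, because strtol would report no error there for values that the int return type cannot hold.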

c++ stringstream is too slow, how to speed up? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Fastest way to read numerical values from text file in C++ (double in this case)
#include <ctime>
#include <cstdlib>
#include <string>
#include <sstream>
#include <iostream>
#include <limits>
using namespace std;
static const double NAN_D = numeric_limits<double>::quiet_NaN();
void die(const char *msg, const char *info)
{
cerr << "** error: " << msg << " \"" << info << '\"';
exit(1);
}
double str2dou1(const string &str)
{
if (str.empty() || str[0]=='?') return NAN_D;
const char *c_str = str.c_str();
char *err;
double x = strtod(c_str, &err);
if (*err != 0) die("unrecognized numeric data", c_str);
return x;
}
static istringstream string_to_type_stream;
double str2dou2(const string &str)
{
if (str.empty() || str[0]=='?') return NAN_D;
string_to_type_stream.clear();
string_to_type_stream.str(str);
double x = 0.0;
if ((string_to_type_stream >> x).fail())
die("unrecognized numeric data", str.c_str());
return x;
}
int main()
{
string str("12345.6789");
clock_t tStart, tEnd;
cout << "strtod: ";
tStart=clock();
for (int i=0; i<1000000; ++i)
double x = str2dou1(str);
tEnd=clock();
cout << tEnd-tStart << endl;
cout << "sstream: ";
tStart=clock();
for (int i=0; i<1000000; ++i)
double x = str2dou2(str);
tEnd=clock();
cout << tEnd-tStart << endl;
return 0;
}
strtod: 405
sstream: 1389
update: remove underscores, env: win7+vc10
C/C++ text to number formatting is very slow. Streams are horribly slow but even C number parsing is slow because it's quite difficult to get it correct down to the last precision bit.
In a production application where reading speed was important and where data was known to have at most three decimal digits and no scientific notation I got a vast improvement by hand-coding a floating parsing function handling only sign, integer part and any number of decimals (by "vast" I mean 10x faster compared to strtod).
If you don't need exponent and the precision of this function is enough this is the code of a parser similar to the one I wrote back then. On my PC it's now 6.8 times faster than strtod and 22.6 times faster than sstream.
double parseFloat(const std::string& input)
{
const char *p = input.c_str();
if (!*p || *p == '?')
return NAN_D;
int s = 1;
while (*p == ' ') p++;
if (*p == '-') {
s = -1; p++;
}
double acc = 0;
while (*p >= '0' && *p <= '9')
acc = acc * 10 + *p++ - '0';
if (*p == '.') {
double k = 0.1;
p++;
while (*p >= '0' && *p <= '9') {
acc += (*p++ - '0') * k;
k *= 0.1;
}
}
if (*p) die("Invalid numeric format", input.c_str());
return s * acc;
}
String streams are slow. Very slow. If you are writing anything performance critical that acts on large data sets (say, loading assets after a level change during a game), do not use string streams. I recommend using the old school C library parsing functions for performance, although I cannot say how they compare to something like boost spirit.
However, compared to C library functions, string streams are very elegant, readable and reliable, so if what you are doing is not performance critical I recommend sticking to streams.
In general, if you need speed, consider this library:
http://www.fastformat.org/
(I'm not sure if it contains functions for converting strings or streams to other types, though, so it may not answer your current example).
For the record, please note you're comparing apples to oranges here. strtod() is a simple function that has a single purpose (converting strings to double), while stringstream is a much more complex formatting mechanism, which is far from being optimized to that specific purpose. A fairer comparison would be comparing stringstream to the sprintf/sscanf line of functions, which would be slower than strtod() but still faster than stringstream. I'm not exactly sure what makes stringstream's design slower than sprintf/sscanf, but it seems like that's the case.
Have you considered using lexical_cast from boost?
http://www.boost.org/doc/libs/1_46_1/libs/conversion/lexical_cast.htm
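Edit: btw, the clear() is not redundant: str() replaces the buffer but does not reset eofbit, which the previous extraction sets when it reaches the end of the string.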
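Whether the clear() call in str2dou2 matters can be checked directly. In the sketch below, parse_double and its reset_flags parameter are hypothetical names used only for this experiment. After an extraction consumes the whole buffer, eofbit stays set, and str() leaves the state flags untouched, so the next extraction fails unless clear() is called first.

```cpp
#include <sstream>
#include <string>

// Reuses one istringstream for many conversions; returns false if parsing fails.
bool parse_double(std::istringstream& stream, const std::string& text,
                  double& out, bool reset_flags) {
    if (reset_flags) stream.clear();  // drop eofbit/failbit left by the previous use
    stream.str(text);                 // replaces the buffer but leaves the state flags alone
    return static_cast<bool>(stream >> out);
}
```

Calling it twice on the same stream with reset_flags set to false makes the second call fail even for well-formed input, which is why the reused static stream in the question needs the clear().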

Return the result of sum of character arrays

Recently in an interview I was asked to write a function which takes two character arrays (representing integers) as input and returns the output character array.
Function Signature:
char* find_sum(char* a, char* b)
How would one approach this?
Example scenario:
find_sum("12345","32142") = "44487"
Note:
The number of digits can be many(1-100).
You can add huge numbers using the char array approach. However, you need to free the returned char* after every use (it is allocated with malloc), or wrap it in some smart pointer.
char* find_sum(char* a, char* b) {
int lenA = strlen(a), lenB = strlen(b);
int max = lenA > lenB ? lenA : lenB; // Get the max for allocation
char* res = (char*)malloc (max+2);
memset(res, '0', max +1); // set the result to all zeros
res[max+1] = '\0';
int i=lenA - 1, j = lenB - 1, k = max;
for (; i >= 0 || j >=0; --i, --j, --k) {
int sum = 0;
if (i >= 0 && j>=0)
sum = a[i] - '0' + b[j] - '0' + res[k] - '0' ; // add using carry
else if (j >= 0)
sum = b[j] - '0' + res[k] - '0' ; // add the carry with remaining
else if (i >= 0)
sum = a[i] - '0' + res[k] - '0' ;
res[k] = sum % 10 + '0';
res[k-1] = sum / 10 + '0';
}
return res;
}
int main() {
printf (" sum = %s ", find_sum("12345432409240242342342342234234234", "9934563424242424242423442424234"));
return 0;
}
Note: The precondition for the function is the input char arrays should contain only numbers.
The most obvious answer is to use something like atoi and sprintf internally to convert the numbers to integers, do the sum and return the response as a char*. However, the important thing here is not what the interviewer is asking but why.
In my experience, the interviewer is probably not wanting you to write a hum-dinger of a solution that covers all angles. What they most likely want to get to is what the most common approach would be, and what are the likely limitations of such a function. I.e.:
What happens if your input numbers aren't integers? (e.g. 13.245, 2.3E+7)
What happens if your 'numbers' aren't numbers at all?
What happens if your input integers are really big? (i.e. ~2^31)
How could you detect an error and how would you report it.
How would you allocate memory for the resultant string?
What would the memory allocation imply for the calling code?
What is the efficiency of the function and how could you make it more efficient?
In this way, the interviewer wants to probe your experience of critiquing approaches to problem solving. Naturally, there are many ways of solving this problem. Some of the approaches have side-effects but in certain contexts, these side effects (i.e. integer overflow) may not be greatly important.
Coding is often a trade-off between a comprehensive solution and what can be produced quickly (and therefore less expensively). These questions allow the interviewer to get a feel for your understanding of quality - that is, can you design something that is fit for purpose, robust and yet does not take too long to put together - and also your experience of having to identify / resolve common bugs.
You did not mention anything about not using any external command.
We can do this easily on machines that have the bc command. You can add any number of digits:
$ echo "99999999999999999999999999999999+1" | bc
100000000000000000000000000000000
$
We call this bc from the C program. We need to construct the right command line as
echo "n1+n2" | bc
and then use popen to read its result. Below is the function to do that. The code omits most error checking.
char* find_sum(char* a, char* b) {
int l1 = strlen(a),l2 = strlen(b);
int cmdLen = l1 + l2 + 30; // 30 to accomodate echo,bc and stuff.
char *cmd = malloc(cmdLen);
snprintf(cmd,cmdLen,"echo \"%s+%s\"|bc",a,b);
FILE *fp = popen(cmd, "r");
int max = (l1 > l2) ? l1:l2;
max += 2; // one for additional digit, one for null.
char *result = malloc(max);
fgets(result, max, fp);
return result;
}
The answer is probably that you have to ask what is returned? Is this a memory allocated string that should be freed by the user or is this a static memory location that is overwritten the next time the function is called?
char* find_sum(char* a, char* b) {
static char buf[MAX_STRING];
...
return buf;
}
or
char* find_sum(char* a, char* b) {
char *buf = malloc(MAX_STRING*sizeof(char));
...
return buf;
}
Giving this answer shows the interviewer that you know more about C than just how to write an algorithm. (As a side note: it also shows why a language like Java shines in these situations, as the garbage collector takes care of freeing the buffer.)
Just remember how you did addition in the second grade on the paper.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *sum(char *a,char *b);
int main()
{
char a[] = "100";
char b[] = "300";
char *c;
c = sum(a,b);
printf("%s",c);
}
char *sum(char *a,char *b)
{
int x,y,z,z2,zLen;
char *result;
x = atoi(a);
y = atoi(b);
z = x + y;
z2 = z;
/* Determine the length of the string now! */
for(zLen = 1; z > 0 || z < 0; zLen++)
z/=10;
result = (char *)malloc(zLen*sizeof(char)+1);
sprintf(result,"%d",z2); /* sprintf already writes the terminating \0 */
return result;
}
Quick and dirty implementation. Note that I'm not freeing the memory, which is not "ideal". You will earn extra brownie points for mentioning that there are no error checks here and no freeing of memory, which is far from ideal in practical situations.
Several of the answers mention the use of atoi & itoa functions.
atoi returns int. Your numbers may not fit into an integer data type.
You may try to alleviate the problem (not completely though) using atol, which returns a long int, or atoll, which returns a long long int.
Also, itoa is not a standard library function, and hence may not be available on all systems.
Here's another approach. Note that I don't like the prototype for find_sum. I'd very much prefer it to be
char *find_sum(char *dst, size_t len, const char *a, const char *b);
letting the caller be responsible for managing resources.
a and b are strings composed of 1 or more digits (and digits only); the result should be freed by caller. Calling find_sum with invalid inputs causes UB :-)
char *find_sum(char *a, char *b) {
char *res;
int alen, blen, rlen;
int carry;
alen = strlen(a);
blen = strlen(b);
rlen = 1 + ((alen > blen) ? alen : blen);
res = malloc(1 + rlen);
if (res) {
int oldlen = rlen;
res[rlen] = 0;
carry = 0;
while (rlen) {
int tmp;
if (alen && blen) tmp = a[--alen] - '0' + b[--blen] - '0';
else if (alen) tmp = a[--alen] - '0';
else if (blen) tmp = b[--blen] - '0';
else tmp = 0;
tmp += carry;
res[--rlen] = '0' + tmp % 10;
carry = tmp / 10;
}
if (res[0] == '0') memmove(res, res+1, oldlen);
}
return res;
}
There's a working version of the function at ideone ( http://ideone.com/O2jrx ).
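For comparison, the caller-managed prototype preferred above could be implemented along these lines. This is a sketch, not the code behind the ideone link: the return convention (nullptr when the buffer is too small) is an assumption, and the buffer check is conservative in that it always reserves room for a final carry.

```cpp
#include <cstddef>
#include <cstring>

// Writes the decimal sum of a and b into dst (capacity len, including the terminator).
// Returns dst on success, or nullptr if the buffer is too small.
// Assumes a and b contain only the characters '0'..'9'.
char* find_sum(char* dst, std::size_t len, const char* a, const char* b) {
    std::size_t alen = std::strlen(a), blen = std::strlen(b);
    std::size_t rlen = 1 + (alen > blen ? alen : blen);   // room for a final carry
    if (len < rlen + 1) return nullptr;
    dst[rlen] = '\0';
    int carry = 0;
    for (std::size_t pos = rlen; pos > 0; --pos) {
        int tmp = carry;
        if (alen) tmp += a[--alen] - '0';
        if (blen) tmp += b[--blen] - '0';
        dst[pos - 1] = static_cast<char>('0' + tmp % 10);
        carry = tmp / 10;
    }
    // Drop the leading zero when there was no final carry (but keep a lone "0").
    if (dst[0] == '0' && rlen > 1) std::memmove(dst, dst + 1, rlen);
    return dst;
}
```

With this shape the caller can use a stack buffer, for example char buf[102] for two 100-digit inputs, and nothing needs to be freed.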
itoa(atoi(a) + atoi(b), t, 10); if you want to be lazy, where t is a char[MAX_NUMBER_OF_DIGITS].
The real question regards the output array, as mentioned by other users.

How to convert an int to a binary string representation in C++

I have an int that I want to store as a binary string representation. How can this be done?
Try this:
#include <bitset>
#include <iostream>
int main()
{
std::bitset<32> x(23456);
std::cout << x << "\n";
// If you don't want a variable just create a temporary.
std::cout << std::bitset<32>(23456) << "\n";
}
I have an int that I want to first convert to a binary number.
What exactly does that mean? There is no type "binary number". Well, an int is already represented in binary form internally unless you're using a very strange computer, but that's an implementation detail -- conceptually, it is just an integral number.
Each time you print a number to the screen, it must be converted to a string of characters. It just so happens that most I/O systems chose a decimal representation for this process so that humans have an easier time. But there is nothing inherently decimal about int.
Anyway, to generate a base b representation of an integral number x, simply follow this algorithm:
1. Initialize s with the empty string.
2. m = x % b
3. x = x / b
4. Convert m into a digit, d.
5. Append d on s.
6. If x is not zero, goto step 2.
7. Reverse s.
Step 4 is easy if b <= 10 and your computer uses a character encoding where the digits 0-9 are contiguous, because then it's simply d = '0' + m. Otherwise, you need a lookup table.
Steps 5 and 7 can be simplified to append d on the left of s if you know ahead of time how much space you will need and start from the right end in the string.
In the case of b == 2 (i.e. binary representation), step 2 can be simplified to m = x & 1, and step 3 can be simplified to x = x >> 1.
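For an arbitrary base, the seven steps could be sketched roughly as follows. The function name to_base, the restriction to bases 2 through 16, and the uppercase lookup table are assumptions of mine, not something the algorithm requires:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: base-b representation of x, assuming 2 <= b <= 16.
std::string to_base(unsigned x, unsigned b)
{
    static const char digits[] = "0123456789ABCDEF"; // lookup table for step 4
    std::string s;                      // step 1
    do {
        unsigned m = x % b;             // step 2
        x /= b;                         // step 3
        s.push_back(digits[m]);         // steps 4 and 5
    } while (x != 0);                   // step 6
    std::reverse(s.begin(), s.end());   // step 7
    return s;
}
```

Using do/while rather than while means an input of 0 still yields the single digit "0" instead of an empty string.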
Solution with reverse:
#include <string>
#include <algorithm>
std::string binary(unsigned x)
{
std::string s;
do
{
s.push_back('0' + (x & 1));
} while (x >>= 1);
std::reverse(s.begin(), s.end());
return s;
}
Solution without reverse:
#include <string>
std::string binary(unsigned x)
{
// Warning: this breaks for numbers with more than 64 bits
char buffer[64];
char* p = buffer + 64;
do
{
*--p = '0' + (x & 1);
} while (x >>= 1);
return std::string(p, buffer + 64);
}
AND the number with 100000..., then 010000..., 0010000..., etc. Each time, if the result is 0, put a '0' in a char array, otherwise put a '1'.
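// const, so that the array size below is a constant expression (no VLA)
const int numberOfBits = sizeof(int) * 8;
char binary[numberOfBits + 1];
int decimal = 29;
for(int i = 0; i < numberOfBits; ++i) {
// Test the bits from most significant to least significant.
// Building the mask from numberOfBits avoids hard-coding a 32-bit int.
if ((decimal & (1u << (numberOfBits - 1 - i))) == 0) {
binary[i] = '0';
} else {
binary[i] = '1';
}
}
binary[numberOfBits] = '\0';
std::string binaryString(binary); // requires #include <string>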
http://www.phanderson.com/printer/bin_disp.html is a good example.
The basic principle of a simple approach:
Loop until the # is 0
& (bitwise and) the # with 1. Print the result (1 or 0) to the end of string buffer.
Shift the # by 1 bit using >>=.
Repeat loop
Print reversed string buffer
To avoid reversing the string or needing to limit yourself to #s fitting the buffer string length, you can:
Compute ceiling(log2(N)) - say L
Compute mask = 2^L
Loop until mask == 0:
& (bitwise and) the mask with the #. Print the result (1 or 0).
number &= (mask-1)
mask >>= 1 (divide by 2)
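A rough sketch of the mask-based variant follows. Note one deliberate substitution: instead of computing the logarithm, it finds the highest power of two not exceeding the number by shifting, which avoids floating point and never produces a leading zero. The name binary_msb_first is hypothetical.

```cpp
#include <string>

// Hypothetical helper: binary digits of n, most significant first, no reversal.
std::string binary_msb_first(unsigned n)
{
    // Find the highest power of two that does not exceed n (1 when n == 0).
    unsigned mask = 1;
    while ((n >> 1) >= mask) mask <<= 1;

    std::string s;
    for (; mask != 0; mask >>= 1)
        s.push_back((n & mask) ? '1' : '0');
    return s;
}
```

Because each bit is tested against the mask directly, clearing the already printed bits of the number is not needed here.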
I assume this is related to your other question on extensible hashing.
First define some mnemonics for your bits:
const int FIRST_BIT = 0x1;
const int SECOND_BIT = 0x2;
const int THIRD_BIT = 0x4;
Then you have your number you want to convert to a bit string:
int x = someValue;
You can check if a bit is set by using the bitwise & operator.
if(x & FIRST_BIT)
{
// The first bit is set.
}
You can then keep a std::string and append '1' to it if a bit is set, or '0' if it is not. Depending on the order you want the string in, you can start with the last bit and move to the first, or go from first to last.
You can refactor this into a loop and use it for arbitrarily sized numbers by calculating the mnemonic bit values above with current_bit_value <<= 1 after each iteration.
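As a loop, that idea might look like the sketch below. The function name bits_lsb_first and the bit_count parameter are assumptions of mine; bit_count must not exceed the number of bits in unsigned.

```cpp
#include <string>

// Hypothetical helper: bit string of x, least significant bit first.
std::string bits_lsb_first(unsigned x, int bit_count)
{
    std::string s;
    unsigned current_bit_value = 1;       // plays the role of FIRST_BIT
    for (int i = 0; i < bit_count; ++i) {
        s += (x & current_bit_value) ? '1' : '0';
        current_bit_value <<= 1;          // SECOND_BIT, THIRD_BIT, ...
    }
    return s;
}
```

This produces the first-to-last order; reverse the string, or start from the highest bit and shift right, if you want the usual most-significant-first notation.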
There isn't a direct function, but you can just walk along the bits of the int (hint: see >>) and insert a '1' or '0' in the string.
Sounds like a standard interview / homework type question
Use sprintf function to store the formatted output in the string variable, instead of printf for directly printing. Note, however, that these functions only work with C strings, and not C++ strings.
There's a small header only library you can use for this here.
Example:
std::cout << ConvertInteger<Uint32>::ToBinaryString(21);
// Displays "10101"
auto x = ConvertInteger<Int8>::ToBinaryString(21, true);
std::cout << x << "\n"; // displays "00010101"
auto y = ConvertInteger<Uint8>::ToBinaryString(21, true, "0b"); // renamed: x is already declared above
std::cout << y << "\n"; // displays "0b00010101"
Solution without reverse, no additional copy, and with 0-padding:
#include <iostream>
#include <string>
template <short WIDTH>
std::string binary( unsigned x )
{
std::string buffer( WIDTH, '0' );
char *p = &buffer[ WIDTH ];
do {
--p;
if (x & 1) *p = '1';
}
while (x >>= 1);
return buffer;
}
int main()
{
std::cout << "'" << binary<32>(0xf0f0f0f0) << "'" << std::endl;
return 0;
}
This is my best implementation for converting integers (of any type) to a std::string. You can remove the template if you only need it for a single integer type. I think it strikes a good balance between the safety of C++ and the cryptic nature of C. Make sure to include the needed headers.
template<typename T>
std::string bstring(T n){
std::string s;
for(int m = sizeof(n) * 8;m--;){
s.push_back('0'+((n >> m) & 1));
}
return s;
}
Use it like so,
std::cout << bstring<size_t>(371) << '\n';
This is the output on my computer (it depends on the width of size_t, so it may differ on yours):
0000000000000000000000000000000000000000000000000000000101110011
Note that the entire binary string is produced, including the leading zeros, so the length of the string equals the size of size_t in bits.
Let's try a signed integer (a negative number):
std::cout << bstring<signed int>(-1) << '\n';
This is the output on my computer (as stated, it may differ on yours):
11111111111111111111111111111111
Note that the string is now shorter, which shows that on this machine signed int occupies fewer bits than size_t. As you can see, my computer uses two's complement to represent negative integers. This also shows why (unsigned short)(-1) > (signed int)1: converting -1 to an unsigned type yields the all-ones bit pattern, which is that type's maximum value.
Here is a non-template version for signed integers only; use it if you only intend to convert int values to strings.
std::string bstring(int n){
std::string s;
for(int m = sizeof(n) * 8;m--;){
s.push_back('0'+((n >> m) & 1));
}
return s;
}
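If portability matters, one possible variation is to shift an unsigned copy of the value and use CHAR_BIT instead of a hard-coded 8. Right-shifting a negative signed value was implementation-defined before C++20, so the cast sidesteps that. The name bstring_portable is hypothetical and this is only a sketch:

```cpp
#include <climits>
#include <string>
#include <type_traits>

// Hypothetical variation of bstring: works on the unsigned counterpart of T.
template <typename T>
std::string bstring_portable(T n)
{
    using U = typename std::make_unsigned<T>::type;
    U u = static_cast<U>(n);                      // well-defined modular conversion
    std::string s;
    for (int m = sizeof(T) * CHAR_BIT; m--; ) {   // most significant bit first
        s.push_back(static_cast<char>('0' + ((u >> m) & 1)));
    }
    return s;
}
```

The string length is still the width of T in bits, so bstring_portable<signed char>(-1) gives eight '1' characters on a machine with 8-bit bytes.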