atof and non-null terminated character array - c++

using namespace std;
int main(int argc, char *argv[]) {
char c[] = {'0','.','5'};
//char c[] = "0.5";
float f = atof(c);
cout << f*10;
if(c[3] != '\0')
{
cout << "YES";
}
}
OUTPUT: 5YES
Does atof work with non-null terminated character arrays too? If so, how does it know where to stop?

Does atof work with non-null terminated character arrays too?
No, it doesn't. std::atof requires a null-terminated string as input. Failing to satisfy this precondition is undefined behavior.
Undefined Behavior means that anything could happen, including the program seeming to work fine. What is happening here is that by chance you have a byte in memory right after the last element of your array which cannot be interpreted as part of the representation of a floating-point number, which is why your implementation of std::atof stops. But that's something that cannot be relied upon.
You should fix your program this way:
char c[] = {'0', '.', '5', '\0'};
// ^^^^
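If you want to check for the terminator without reading past the array, a minimal sketch could look like this. has_terminator is a hypothetical helper name, not something from the standard library, and it only works where the array size is still known (that is, before the array has decayed to a pointer):

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of the original answer): reports whether a
// char array of known size contains a '\0' somewhere inside its own bounds.
// std::memchr never reads beyond the N bytes it is given.
template <std::size_t N>
bool has_terminator(const char (&arr)[N]) {
    return std::memchr(arr, '\0', N) != nullptr;
}
```

For the array in the question this returns false, and for the fixed four-element array it returns true.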

No, atof does not work with non-null terminated arrays: it only stops when it happens to find a zero byte (or another character that cannot be part of a number) somewhere after the end of the array that you pass in. Passing an array without termination is undefined behavior, because it leads the function to read past the end of the array. In your example, the function has likely accessed bytes that you have allocated to f (although there is no certainty there, because f does not need to follow c[] in memory).
char c[] = {'0','.','5'};
char d[] = {'6','7','8'};
float f = atof(c); // << Undefined behavior!!!
float g = atof(d); // << Undefined behavior!!!
cout << f*10;
The above prints 5.678, pointing out the fact that a read past the end of the array has been made.

No... atof() requires a null terminated string.
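If you have a string to convert that is not null terminated, you could try copying it into a target buffer, taking each char for as long as it is a valid part of a number. Something to the effect of...
char buff[64] = { 0 };
// input_string is the unterminated source; this assumes that a non-numeric
// character follows the number somewhere inside that buffer
for( size_t i = 0; i < sizeof( buff )-1; i++ )
{
    char input = input_string[i];
    // cast to unsigned char: isdigit is undefined for negative values
    if( isdigit( (unsigned char)input ) || input == '-' || input == '.' )
        buff[i] = input;
    else
        break;
}
double result = atof( buff );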
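If the length of the unterminated data is known, a variant that never looks beyond that length is possible. This is only a sketch: parse_double is a made-up name, and it swaps the manual copy loop for a std::string copy plus std::strtod:

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: converts the first `len` bytes of `data` to a double.
// `data` does not need to be null terminated; the std::string copy supplies
// the terminator that std::strtod needs.
double parse_double(const char* data, std::size_t len) {
    const std::string copy(data, len);   // copies exactly len bytes
    return std::strtod(copy.c_str(), nullptr);
}
```

For the array in the question you would call it as parse_double(c, sizeof c).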

From the description of the atof() function on MSDN (this probably applies to other implementations too):
The function stops reading the input string at the first character that it cannot recognize as part of a number. This character may be the null character ('\0' or L'\0') terminating the string.
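To see where parsing stops on a properly terminated string, std::strtod can report the first character it did not consume. A small sketch, with consumed_chars as an illustrative name of my own:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: returns how many characters of a null-terminated
// string std::strtod consumed as part of the number.
std::size_t consumed_chars(const char* text) {
    char* end = nullptr;
    // The parsed value is not needed here, only the end position.
    static_cast<void>(std::strtod(text, &end));
    return static_cast<std::size_t>(end - text);
}
```

With the input "0.5abc" the function stops at the 'a', so three characters are consumed.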

It must either be 0 terminated or the text must contain characters that do not belong to the number.

std::string already terminates its character data with a null character (that is what c_str() gives you)!
So why not
std::string number = "7.6";
double temp = ::atof(number.c_str());
You can also do it with a std::stringstream or boost::lexical_cast:
http://www.boost.org/doc/libs/1_53_0/doc/html/boost_lexical_cast.html
http://www.cplusplus.com/reference/sstream/stringstream/
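A rough sketch of the stream approach, assuming you want to know whether the conversion succeeded (parse_with_stream is a hypothetical name):

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parses a double with a stream instead of atof.
// Returns false when no number could be read, which atof cannot report.
bool parse_with_stream(const std::string& text, double& out) {
    std::istringstream in(text);
    return static_cast<bool>(in >> out);
}
```

Unlike atof, this lets the caller tell a failed conversion apart from a genuine 0.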

Since C++11, we have std::stof. Replacing atof with std::stof makes this easier to handle.
I made a handy wrapper for the case where you always pass a char array of known size.
#include <fmt/core.h>
#include <cstddef>
#include <string>
#include <type_traits>
#include <iostream>
// SFINAE fallback
template<typename T, typename =
std::enable_if< std::is_pointer<T>::value >
>
float charArrayToFloat(const T arr){ // Fall back for user friendly compiler errors
static_assert(false == std::is_pointer<T>::value, "`charArrayToFloat()` doesn't allow conversion from pointer!");
return -1;
}
// Valid for both null or non-null-terminated char array
template<size_t sz>
float charArrayToFloat(const char(&arr)[sz]){
// It doesn't matter whether it's null terminated or not
std::string str(arr, sz);
return std::stof(str);
}
int main() {
char number[4] = {'0','.','4','2'};
float ret = charArrayToFloat(number);
fmt::print("The answer is {}. ", ret);
return 0;
}
Output: The answer is 0.42.
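One thing to keep in mind with std::stof is that it throws when the text is not a number, where atof would just return 0. If exceptions are not wanted, something along these lines could be used (try_stof is a made-up name for this sketch):

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: wraps std::stof so that a failed conversion is
// reported through the return value instead of an exception.
bool try_stof(const std::string& text, float& out) {
    try {
        out = std::stof(text);
        return true;
    } catch (const std::invalid_argument&) {
        return false;   // no conversion could be performed
    } catch (const std::out_of_range&) {
        return false;   // value does not fit in a float
    }
}
```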

Does atof work with non-null terminated character arrays too?
No, this function expects a pointer to a null-terminated string. Passing anything else, for example a pointer to a non-null-terminated character array, is undefined behavior.
Undefined behavior means anything [1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
So the output that you're seeing is a result of undefined behavior. As I said, don't rely on the output of a program that has UB. The program may just crash.
So the first step to make the program correct is to remove the UB. Then and only then can you start reasoning about the output of the program.
[1] For a more technically accurate definition of undefined behavior, see a language reference, where it is stated that there are no restrictions on the behavior of the program.

Related

Getting Extra characters at the end when Creating std::string from char*

I have just started learning C++. Now I am learning about arrays, so I am trying out different examples. One such example is given below:
int main()
{
const char *ptr1 = "Anya";
char arr[] = {'A','n','y','a'};
std::string name1(ptr1); //this works
std::cout << name1 << std::endl;
std::string name2(arr);
std::cout << name2 << std::endl; //this prints extra characters at the end?
return 0;
}
In the above example, the last cout statement prints some extra characters at the end. How can I prevent this from happening in the above code, and what is wrong with the code, so that I don't make the same mistake in the future?
char arr[] = {'A','n','y','a'}; is not null terminated, so constructing the string reads the array out of bounds, which in turn makes your program have undefined behavior (and it could therefore do anything).
Either make it null terminated:
char arr[] = {'A','n','y','a','\0'};
Or, create the string from iterators:
#include <iostream>
#include <iterator>
#include <string>
int main() {
char arr[] = {'A', 'n', 'y', 'a'};
std::string name2(std::begin(arr), std::end(arr));
std::cout << name2 << '\n'; // now prints "Anya"
}
Or create it with the constructor taking the length as an argument:
std::string name2(arr, sizeof arr); // `sizeof arr` is here 4
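If the same code has to accept both terminated and unterminated arrays, a small helper can stop at the first '\0' when there is one and otherwise take the whole array. This is just a sketch, and from_array is a name I made up:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from a char array of known size.
// It stops at the first '\0' if the array contains one and otherwise uses
// all N characters, so it never reads outside the array.
template <std::size_t N>
std::string from_array(const char (&arr)[N]) {
    return std::string(arr, std::find(arr, arr + N, '\0'));
}
```

Both char arr[] = {'A','n','y','a'}; and char arr[] = "Anya"; then give a string of length 4.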
The problem is that you're constructing a std::string from a non-null-terminated array, as explained below.
When you wrote:
char arr[] = {'A','n','y','a'}; //not null terminated
The above statement creates an array that is not null terminated.
Next when you wrote:
std::string name2(arr); //undefined behavior
There are 2 important things to note about the above statement:
1. arr decays to a char* due to type decay.
2. This char* is passed as an argument to a std::string constructor that has a parameter of type const char*. Essentially, the above statement creates a std::string object from a non-null-terminated array.
But note that whenever we create a std::string from a const char*, the array to which the pointer points must be null terminated. Otherwise the result is undefined behavior.
Undefined behavior means anything [1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
For example, the same program may give the expected output in one environment and not in another. So as I said, don't rely on the output of a program that has UB.
Solution
You can solve this by making your array null terminated as shown below.
char arr[] = {'A','n','y','a','\0'}; //arr is null terminated
// char arr[] = "Anya"; //this is also null terminated
[1] For a more technically accurate definition of undefined behavior, see a language reference, where it is stated that there are no restrictions on the behavior of the program.

strstr not working in C++ 4.7 on codeforces

On an online compiler this program gives the correct output for the input "ABACABA", but on the Codeforces tests it only prints the last line. While debugging I found that the pointer u is set to address 0 after strstr() is called. I don't understand why the function works on the other online compiler but not on Codeforces.
EDIT: Thanks to @Jeremy Friesner, I found out that it is actually strncpy that is not working properly, because the custom test case compiler now gives wrong output for 'str'. I still don't know why it behaves differently on two different compilers or what changes I should make.
#include<iostream>
#include<stdio.h>
#include<string>
#include<string.h>
#include<stdlib.h>
using namespace std;
int main()
{
char *s;
int length=20;
s = (char *) malloc(length*(sizeof(char)));
char c;
int count=0;
while((c=getchar())>='A')
{
if(c<='Z')
{
//cout<<count;
if(length>=count)
{
s = (char *) realloc(s,(length+=10)*sizeof(char));
}
s[count++]=c;
//printf("%p\n",s);
}
else
{
break;
}
}
char *u=s;
int o=1;
//printf("%p\n",s);
while(u)
{
char *str = (char *) malloc(o*sizeof(char));
str = strncpy(str,s,o);
//cout<<str<<endl;
char *t;
u = strstr(s+1,str);
//printf("u %p\n",u);
t=u;
int ct=0;
char *p;
while(t)
{
ct++;
p=t;
t = strstr(t+o,str);
}
ct=ct+1;
//cout<<"here"<<endl;
if(p==(s+count-o))
{
cout<<o<<" "<<ct<<endl;
}
//cout<<ct<<endl;
o++;
}
cout<<count<<" "<<1;
}
As noted in the comments, a principal problem was that you were not null-terminating the string after you read it in, which leads to odd results. Specifically, it leads to you invoking undefined behaviour, which is always a bad thing. The memory allocated by malloc() and the extra memory allocated by realloc() is not guaranteed to be zeroed.
You can fix the problem by adding:
s[count] = '\0';
just before:
char *u = s;
Strictly, you should also check the return values of both malloc() and realloc(). Also, you should not use the idiom:
x = realloc(x, newsize);
If the realloc() fails, you've lost your pointer to the original data, so you've leaked memory. The safe way to work is:
void *space = realloc(x, newsize);
if (space == 0)
…report error etc…
x = space;
x_size = newsize;
There may be other problems; I've not scrutinized the code for every possible issue.
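Putting the two points together (always terminate, and never assign the result of realloc() straight back), a sketch of an append routine might look like the following. The name append_char, the doubling growth policy and the initial capacity of 16 are my own assumptions, not taken from the question:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: appends one character to a malloc-allocated buffer
// and keeps the buffer null terminated after every call.
// buf must be nullptr or come from malloc/realloc; len excludes the '\0'.
bool append_char(char*& buf, std::size_t& len, std::size_t& cap, char c) {
    if (len + 2 > cap) {                       // room for c and the '\0'
        const std::size_t new_cap = (cap == 0) ? 16 : cap * 2;
        void* space = std::realloc(buf, new_cap);
        if (space == nullptr) {
            return false;                      // buf is still valid here
        }
        buf = static_cast<char*>(space);
        cap = new_cap;
    }
    buf[len++] = c;
    buf[len] = '\0';
    return true;
}
```

Because the terminator is rewritten on every call, the buffer can be handed to strstr() or strncpy() at any point.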
You never put null-termination after the characters you put into s, therefore s does not contain a string. So it causes undefined behaviour to pass it to a function that expects a string, such as strncpy.
Another big problem is your usage of strncpy.
int o=1;
while(u)
{
char *str = (char *) malloc(o*sizeof(char));
str = strncpy(str,s,o);
u = strstr(s+1,str);
The strncpy function does not create a string if strlen(s) >= o: it copies o characters and writes no terminator. In this case, the strstr function will just read off the end of the buffer, causing undefined behaviour. (Exactly what happens will depend on your compiler and on what junk was in this piece of memory).
You need to put a null-terminated string into str. Either manually add a null-terminator:
assert(o > 0);
strncpy(str, s, o-1);
str[o-1] = 0;
or use a different function:
snprintf(str, o, "%s", s);
You have to keep in mind that a string is a series of characters followed by a null terminator. Whenever you work with functions that expect strings, it's up to you to make sure that the null terminator is present.
Also be careful with lines like strstr(t+o,str);. If o > strlen(t) this causes undefined behaviour. You've got to do the checking yourself that you do not go outside the bounds of the string.
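If the goal is to search for the first o characters of s, the buffer needs one extra byte for the terminator. A minimal sketch of that, with copy_prefix as a hypothetical helper name:

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a newly malloc'ed, null-terminated copy of at
// most n leading characters of the null-terminated string s.
// The caller frees the result; nullptr means the allocation failed.
char* copy_prefix(const char* s, std::size_t n) {
    const std::size_t available = std::strlen(s);
    const std::size_t count = (n < available) ? n : available;
    char* out = static_cast<char*>(std::malloc(count + 1)); // +1 for '\0'
    if (out == nullptr) {
        return nullptr;
    }
    std::memcpy(out, s, count);
    out[count] = '\0';
    return out;
}
```

The result is always a proper string, so it is safe to pass to strstr(). Remember to free() it on each loop iteration.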

Is char and int interchangeable for function arguments in C?

I wrote some code to verify that a serial number is alphanumeric in C using isalnum. I wrote the code assuming the isalnum input is a char. Everything worked. However, after reviewing isalnum later, I see that it wants its input as an int. Is my code okay the way it is, or should I change it?
If I do need to change, what would be the proper way? Should I just declare an int and set it to the char and pass that to isalnum? Is this considered bad programming practice?
Thanks in advance.
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
bool VerifySerialNumber( char *serialNumber ) {
int num;
char* charPtr = serialNumber;
if( strlen( serialNumber ) < 10 ) {
printf("The entered serial number seems incorrect.");
printf("It's less than 10 characters.\n");
return false;
}
while( *charPtr != '\0' ) {
if( !isalnum(*charPtr) ) {
return false;
}
*charPtr++;
}
return true;
}
int main() {
char* str1 = "abcdABCD1234";
char* str2 = "abcdef##";
char* str3 = "abcdABCD1234$#";
bool result;
result = VerifySerialNumber( str1 );
printf("str= %s, result=%d\n\n", str1, result);
result = VerifySerialNumber( str2 );
printf("str= %s, result=%d\n\n", str2, result);
result = VerifySerialNumber( str3 );
printf("str= %s, result=%d\n\n", str3, result);
return 0;
}
Output:
str= abcdABCD1234, result=1
The entered serial number seems incorrect.It's less than 10 characters.
str= abcdef##, result=0
str= abcdABCD1234$#, result=0
You don't need to change it. The compiler will implicitly convert your char to an int before passing it to isalnum. Functions like isalnum take int arguments because functions like fgetc return int values, which allows for special values like EOF to exist.
Update: As others have mentioned, be careful with negative values of your char. Your version of the C library might be implemented carefully so that negative values are handled without causing any run-time errors. For example, glibc (the GNU implementation of the standard C library) appears to handle negative numbers by adding 128 to the int argument.* However, you won't always be able to count on having isalnum (or any of the other <ctype.h> functions) quietly handle negative numbers, so getting in the habit of not checking would be a very bad idea.
* Technically, it's not adding 128 to the argument itself, but rather it appears to be using the argument as an index into an array, starting at index 128, such that passing in, say, -57 would result in an access to index 71 of the array. The result is the same, though, since array[-57+128] and (array+128)[-57] point to the same location.
Usually it is fine to pass a char value to a function that takes an int. It will be converted to the int with the same value. This isn't a bad practice.
However, there is a specific problem with isalnum and the other C functions for character classification and conversion. Here it is, from the ISO/IEC 9899:TC2 7.4/1 (emphasis mine):
In all cases the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of the
macro EOF. If the argument has any other value, the behavior is
undefined.
So, if char is a signed type (this is implementation-dependent), and if you encounter a char with negative value, then it will be converted to an int with negative value before passing it to the function. Negative numbers are not representable as unsigned char. The numbers representable as unsigned char are 0 to UCHAR_MAX. So you have undefined behavior if you pass in any negative value other than whatever EOF happens to be.
For this reason, you should write your code like this in C:
if( !isalnum((unsigned char)*charPtr) )
or in C++ you might prefer:
if( !isalnum(static_cast<unsigned char>(*charPtr)) )
The point is worth learning because at first encounter it seems absurd: do not pass a char to the character functions.
Alternatively, in C++ there is a two-argument version of isalnum in the header <locale>. This function (and its friends) do take a char as input, so you don't have to worry about negative values. You will be astonished to learn that the second argument is a locale ;-)
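Both variants side by side, as a rough C++ sketch. The function names are made up for illustration, and using the classic "C" locale for the second one is an assumption:

```cpp
#include <cctype>
#include <locale>

// Hypothetical helper: checks that every character of a null-terminated
// string is alphanumeric, using the cast recommended above.
bool all_alnum(const char* s) {
    for (; *s != '\0'; ++s) {
        if (!std::isalnum(static_cast<unsigned char>(*s))) {
            return false;
        }
    }
    return true;
}

// Same check with the two-argument overload from <locale>, which accepts a
// plain char. The classic "C" locale is an assumption made for this sketch.
bool all_alnum_locale(const char* s) {
    const std::locale& loc = std::locale::classic();
    for (; *s != '\0'; ++s) {
        if (!std::isalnum(*s, loc)) {
            return false;
        }
    }
    return true;
}
```

Either version is safe to call on strings that contain bytes above 127, which is exactly the case where passing a plain char to the one-argument isalnum goes wrong.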

Reason for such unusual output in c++

I cannot understand the unusual behavior of this code output.
It prints:
hellooo
monusonuka
Code is here:
#include <iostream>
#include <cstdio>
using namespace std;
int main()
{
printf(" hellooo \n");
char name[7]="sonuka";
char name1[4]={'m','o','n','u'};
printf("%s",name1);
system("pause");
return 0;
}
Your name1 array is not terminated with a zero character ('\0'). The printf function prints characters until it finds a zero. In your case it goes past the end of the array. What happens is undefined behaviour. A likely outcome is that other variables or garbage is printed to the screen until eventually a \0 somewhere else in memory is hit, but anything could happen including your program crashing.
name1 must be null-terminated; otherwise printf will print as many bytes as it finds until it hits a \0.
It must be
char name1[5]={'m','o','n','u', '\0'};
What you have is undefined behaviour: printf reads the memory that follows the memory allocated for name1.
In this case, it seems like your compiler has placed the memory for name after name1, which is why they are both printed (name is correctly null-terminated, as all string literals are).
name1 is not null-terminated, so printf just keeps printing chars until a \0 is reached.
printf("%s",name1);
The %s conversion specifier requires the argument to be a pointer to a C string.
char name1[4]={'m','o','n','u'};
is not a C string because the array is not null terminated. Violating the requirement of the conversion specifier invokes undefined behavior, and this is why you get this unexpected result.
You're trying to print a char array as a string with printf. Try this code:
int pointer=0;
while(pointer < 4){
printf("%c",name1[pointer]);
pointer++;
}
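Another option is to give %s an explicit precision, which limits how many characters are read. The sketch below uses snprintf so the result can be inspected; format_bounded is a hypothetical name:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats exactly len characters of data into out.
// With an explicit precision, %s stops after len characters, so data does
// not need a terminating '\0' as long as len does not exceed its size.
int format_bounded(char* out, std::size_t out_size,
                   const char* data, std::size_t len) {
    return std::snprintf(out, out_size, "%.*s", static_cast<int>(len), data);
}
```

The same format string works with printf directly: printf("%.*s", (int)sizeof name1, name1); prints only the four characters of name1.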

Weird output when printing char pointer from struct

Here is the code:
#include<iostream>
struct element{
char *ch;
int j;
element* next;
};
int main(){
char x='a';
element*e = new element;
e->ch= &x;
std::cout<<e->ch; // cout can print char* , in this case I think it's printing 4 bytes dereferenced
}
Am I seeing some undefined behavior? Can anyone help me understand what's going on?
You have to dereference the pointer to print the single char: std::cout << *e->ch << std::endl
Otherwise you are invoking the << overload for char *, which does something entirely different (it expects a pointer to a null-terminated array of characters).
Edit: In answer to your question about UB: The output operation performs an invalid pointer dereferencing (by assuming that e->ch points to a longer array of characters), which triggers the undefined behaviour.
It will print 'a' followed by garbage until it finds a terminating 0. You are confusing a char type (single character) with char*, which is a C-style string that needs to be null-terminated.
Note that you might not actually see 'a' being printed out because it might be followed by a backspace character. As a matter of fact, if you compile this with g++ on Linux it will likely be followed by a backspace.
Is this a null question.
Strings in C and C++ end in a null character. x is a character but not a string. You have tried to make it one.
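Two well-defined ways to output the single character behind such a pointer, sketched with a string stream so the result can be checked. The function names are illustrative only:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: writes the single character that p points to.
// Dereferencing selects the operator<< overload for char, so no terminator
// is needed.
std::string show_char(const char* p) {
    std::ostringstream os;
    os << *p;
    return os.str();
}

// Alternative: build a one-character string with an explicit length.
std::string show_as_string(const char* p) {
    return std::string(p, 1);
}
```

With std::cout the first variant is simply std::cout << *e->ch;.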