I am not sure why I am getting output for this? - c++

I was learning some string handling in C++ and was doing some hit and trial on a piece of code, and surprisingly got output for the given code.
#include<bits/stdc++.h>
using namespace std;
int main(){
char str[12]={'\67','a','v','i'};
cout<<str;
return 0;
}
Surprisingly, I get 7avi printed.
But if I replace '\67' with '\68', the following error is shown on Repl.it (https://repl.it/languages/cpp):
#include<bits/stdc++.h>
using namespace std;
int main(){
char str[12]={'\68','a','v','i'};
cout<<str;
return 0;
}
main.cpp:6:19: warning: multi-character character constant [-Wmultichar]
char str[12]={'\68','a','v','i'};
^
main.cpp:6:19: error: constant expression evaluates to 1592 which cannot
be narrowed to type 'char' [-Wc++11-narrowing]
char str[12]={'\68','a','v','i'};
^~~~~
main.cpp:6:19: note: insert an explicit cast to silence this issue
char str[12]={'\68','a','v','i'};
^~~~~
static_cast<char>( )
main.cpp:6:19: warning: implicit conversion from 'int' to 'char' changes
value from 1592 to 56 [-Wconstant-conversion]
char str[12]={'\68','a','v','i'};
~^~~~~
2 warnings and 1 error generated.
compiler exit status 1
Could someone please explain this behavior?

The \nnn notation, where the digits nnn are between 0 and 7, is octal (base 8) notation, so 68 in \68 is not a valid octal number (the octal number one more than 67 is 70). The compiler therefore interprets '\68' as the escape '\6' (character 6 in octal) followed by an extra '8' character inside the same literal, i.e. a multi-character constant, whose int value cannot be narrowed to a char. You could store it in a "wide character":
wchar_t str[12]={'\68','a','v','i'};
But, there is no operator<< overload to display an array of wchar_t, so your cout << str line will match the void* overload and just display the memory address of the first element in the array, rather than any of the characters themselves.
You can fix that using:
wcout << str;
Separately, I recommend putting a newline after your output too. Without it, your output may be overwritten by the console prompt before you can see it, though that doesn't happen in the online REPL you're using. It should look like:
wcout << str << '\n';
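If you only wanted a plain char array, a minimal sketch using the next valid octal escape after \67 (that is, \70, decimal 56, the ASCII code for '8') could look like this:
#include <iostream>

int main() {
    // '\70' is octal 70 = decimal 56, which is the ASCII code for '8'
    char str[12] = {'\70', 'a', 'v', 'i'};
    std::cout << str << '\n';   // prints "8avi"
    return 0;
}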

I think you're trying to type in an ASCII character code using either octal or hex (octal usually begins with a 0, hex with 0x). Just don't put the ASCII code in quotes; instead, put the code straight into the array, like so:
char str[12] = {68, 'a', 'v', 'i'}; //decimal
char str[12] = {0x44, 'a', 'v', 'i'}; //hex
char str[12] = {0104, 'a', 'v', 'i'}; //octal
Side Note
Please don't use <bits/stdc++.h>. It's not standardized (see here for a more detailed explanation). Instead, include <iostream> for cout and the other requisite headers for your other needs.
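Putting both suggestions together, a minimal sketch (the decimal initializer is shown; the hex and octal forms work the same way):
#include <iostream>

int main() {
    char str[12] = {68, 'a', 'v', 'i'};   // 68 is the decimal ASCII code for 'D'
    std::cout << str << '\n';             // prints "Davi"
    return 0;
}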

Related

How to get single characters from unicode string and compare, print them?

I am processing unicode strings in C with libunistring. Can't use another library. My goal is to read a single character from the unicode string at its index position, print it, and compare it to a fixed value. This should be really simple, but well ...
Here's my try (complete C program):
/* This file must be UTF-8 encoded in order to work */
#include <locale.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unitypes.h>
#include <uniconv.h>
#include <unistdio.h>
#include <unistr.h>
#include <uniwidth.h>
int cmpchr(const char *label, const uint32_t charExpected, const uint32_t charActual) {
    int result = u32_cmp(&charExpected, &charActual, 1);
    if (result == 0) {
        printf("%s is recognized as '%lc', good!\n", label, charExpected);
    } else {
        printf("%s is NOT recognized as '%lc'.\n", label, charExpected);
    }
    return result;
}

int main() {
    setlocale(LC_ALL, ""); /* switch from default "C" encoding to system encoding */
    const char *enc = locale_charset();
    printf("Current locale charset: %s (should be UTF-8)\n\n", enc);

    const char *buf = "foo 楽あり bébé";
    const uint32_t *mbcs = u32_strconv_from_locale(buf);
    printf("%s\n", u32_strconv_to_locale(mbcs));

    uint32_t c0 = mbcs[0];
    uint32_t c5 = mbcs[5];
    uint32_t cLast = mbcs[u32_strlen(mbcs) - 1];
    printf(" - char 0: %lc\n", c0);
    printf(" - char 5: %lc\n", c5);
    printf(" - last : %lc\n", cLast);

    /* When this file is UTF-8-encoded, I'm passing a UTF-8 character
     * as a uint32_t, which should be wrong! */
    cmpchr("Char 0", 'f', c0);
    cmpchr("Char 5", 'あ', c5);
    cmpchr("Last char", 'é', cLast);

    return 0;
}
In order to run this program:
Save the program to a UTF-8 encoded file called ustridx.c
sudo apt-get install libunistring-dev
gcc -o ustridx.o -W -Wall -O -c ustridx.c ; gcc -o ustridx -lunistring ustridx.o
Make sure the terminal is set to a UTF-8 locale (locale)
Run it with ./ustridx
Output:
Current locale charset: UTF-8 (should be UTF-8)
foo 楽あり bébé
- char 0: f
- char 5: あ
- last : é
Char 0 is recognized as 'f', good!
Char 5 is NOT recognized as '�����'.
Last char is NOT recognized as '쎩'.
The desired behavior is that char 5 and last char are recognized correctly, and printed correctly in the last two lines of the output.
'あ' and 'é' are not portable character literals: the standard only guarantees support for characters from the basic source character set and escape sequences in character literals.
GCC, however, emits a warning (see godbolt): warning: multi-character character constant. That warning is normally about character constants such as 'abc', which contain several characters; it applies here because characters like あ are encoded as multiple bytes in UTF-8, so the constant effectively contains several chars. According to cppreference, the value of such a literal is implementation-defined, so you can't rely on it being the corresponding Unicode code point, and GCC specifically doesn't make it the code point, as seen here.
Since C11 you can use UTF-32 character literals such as U'あ' which results in a char32_t value of the Unicode code point of the character. Although by my reading the standard doesn't allow using characters such as あ in literals, the examples on cppreference seem to suggest that it is common for compilers to allow this.
A standard-compliant portable solution is using Unicode escape sequences for the character literal, like U'\u3042' for あ, but this is hardly different from using an integer constant such as 0x3042.
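As a small illustration (written as C++11 here for brevity; C11 offers the same U'' literals via <uchar.h>), the UTF-32 literal's value is the code point, which is also what u32_strconv_from_locale stores in each uint32_t element:
#include <iostream>

int main() {
    char32_t a = U'\u3042';   // あ as a UTF-32 literal: its value is the code point
    char32_t e = U'\u00E9';   // é

    // Prints 3042 and e9, the Unicode code points, so comparing such values
    // against the uint32_t elements produced by libunistring is meaningful.
    std::cout << std::hex
              << static_cast<unsigned long>(a) << '\n'
              << static_cast<unsigned long>(e) << '\n';
    return 0;
}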
From libunistring's documentation:
Compares S1 and S2, each of length N, lexicographically. Returns a
negative value if S1 compares smaller than S2, a positive value if
S1 compares larger than S2, or 0 if they compare equal.
The comparison in the if statement was wrong; that was the reason for the mismatch. Of course, this reveals other, unrelated issues that also need to be fixed, but that's the reason for the puzzling result of the comparison.

Extract character from QString and compare

I am trying to compare a specific character in a QString, but getting odd results:
My QString named strModified contains: "[y]£trainstrip+[height]£trainstrip+8"
I convert the string to a standard string:
std::string stdstr = strModified.toStdString();
I can see in the debugger that 'stdstr' contains the correct contents, but when I attempt to extract a character:
char cCheck = stdstr.c_str()[3];
I get something completely different: I expected to see '£', but instead I get -62. I realise that '£' is outside the ASCII character set and has a code of 156.
But what is it returning?
I've modified the original code to simplify, now:
const QChar cCheck = strModified.at(intClB + 1);
if ( cCheck == mccAttrMacroDelimited ) {
...
}
Where mccAttrMacroDelimited is defined as:
const QChar clsXMLnode::mccAttrMacroDelimiter = '£';
In the debugger when looking at both definitions of what should be the same value, I get:
cCheck: -93 '£'
mccAttrMacroDelimiter: -93 with what looks like a Chinese character
The comparison fails... what is going on?
I've gone through my code changing all QChar references to unsigned char; now I get a warning:
large integer implicitly truncated to unsigned type [-Woverflow]
on:
const unsigned char clsXMLnode::mcucAttrMacroDelimiter = '£';
Again, why? According to a Google search, this may be a bogus message.
I am happy to say that this has fixed the problem. The solution: declare the check character as unsigned char and use:
const char cCheck = strModified.at(intClB + 1).toLatin1();
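For illustration, a minimal Qt sketch along those lines (the string literal and the index 3 are hypothetical stand-ins for the question's strModified and intClB + 1):
#include <QChar>
#include <QString>
#include <iostream>

int main() {
    const QString strModified = QString::fromUtf8("[y]" "\xc2\xa3" "trainstrip");

    // Comparing Unicode code points avoids the signed-char pitfall entirely
    std::cout << (strModified.at(3).unicode() == 0x00A3) << '\n';   // prints 1, '£' is U+00A3

    // The fix from the question: fold the QChar down to a Latin-1 char
    const char cCheck = strModified.at(3).toLatin1();
    std::cout << (cCheck == '\xa3') << '\n';                        // prints 1
    return 0;
}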
I think because '£' is not in the ASCII table, you will get weird behavior from char. The compiler in Xcode does not even let me compile:
char c = '£'; error-> character too large for enclosing literal type
You could use Unicode, since '£' can be found in the Unicode character table:
£ : U+00A3 | Dec: 163
The answer to this question heavily inspired the code I wrote to extract the decimal value for '£'.
#include <iostream>
#include <codecvt>
#include <locale>
#include <string>

using namespace std;

// This takes the character at [index], prints its Unicode decimal value,
// and returns the code point so that two positions can be compared.
char32_t foo(std::string const & utf8str, int index)
{
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    std::u32string utf32str = conv.from_bytes(utf8str);
    char32_t u = utf32str[index];
    cout << u << endl;
    return u;
}

int main(int argc, const char * argv[]) {
    string r = "[y]£trainstrip+[height]£trainstrip+8";
    // Compare the characters at indices 3 and 23, since they are the same
    cout << (foo(r, 3) == foo(r, 23)) << endl;
    return 0;
}
You can use a for loop to get all of the characters in the string if you want. Hopefully this helps
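A sketch of such a loop, reusing the same wstring_convert approach as above (the input literal is just an example and assumes a UTF-8 execution character set):
#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

int main() {
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    std::u32string s = conv.from_bytes("[y]\u00a3trainstrip");
    for (char32_t c : s)
        std::cout << static_cast<unsigned long>(c) << ' ';   // decimal code point of each character
    std::cout << '\n';
    return 0;
}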

How to convert an ASCII char to its ASCII int value?

I would like to convert a char to its ASCII int value.
I could fill an array with all possible values and compare to that, but it doesn't seems right to me. I would like something like
char mychar = "k"
public int ASCItranslate(char c)
return c
ASCItranslate(k) // >> Should return 107 as that is the ASCII value of 'k'.
The point is atoi() won't work here as it is for readable numbers only.
It won't do anything with spaces (ASCII 32).
Just do this:
int(k)
You're just converting the char to an int directly here, no need for a function call.
A char is already a number. It doesn't require any conversion, since ASCII is just a mapping from numbers to character representations.
You could use it directly as a number if you wish, or cast it.
In C++, you could also use static_cast<int>(k) to make the conversion explicit.
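For example, a minimal sketch of both forms:
#include <iostream>

int main() {
    char mychar = 'k';
    std::cout << int(mychar) << '\n';               // prints 107
    std::cout << static_cast<int>(mychar) << '\n';  // prints 107, with an explicit cast
    return 0;
}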
Do this:
char mychar = 'k';
//and then
int k = (int)mychar;
To convert from an ASCII character to its ASCII value:
char c='A';
cout<<int(c);
To convert from an ASCII value to its ASCII character:
int a=67;
cout<<char(a);
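A combined, runnable sketch of both directions:
#include <iostream>
using namespace std;

int main() {
    char c = 'A';
    cout << int(c) << '\n';    // character to ASCII value: prints 65
    int a = 67;
    cout << char(a) << '\n';   // ASCII value to character: prints C
    return 0;
}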
#include <iostream>

char mychar = 'k';

int ASCIItranslate(char ch) {
    return ch;
}

int main() {
    std::cout << ASCIItranslate(mychar);
    return 0;
}
That's your original code with the various syntax errors fixed. Assuming you're using a compiler that uses ASCII (which is pretty much every one these days), it works. Why do you think it's wrong?

invalid conversion from ‘char*’ to ‘char’

I have a
int main (int argc, char *argv[])
and one of the arguments I'm passing in is a char. It gives the error message in the title when I go to compile.
How would I go about fixing this?
Regards
Paul
When you pass command line parameters, they are all passed as strings, regardless of what types they may represent. If you pass "10" on the command line, you are actually passing the character array
{ '1', '0', '\0' }
not the integer 10.
If the parameter you want consists of a single character, you can always just take the first character:
char timer_unit = argv[2][0];
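A self-contained sketch with a bounds check added (the argument position 2 and the name timer_unit are just carried over from the snippet above):
#include <cstdio>

int main(int argc, char *argv[]) {
    if (argc > 2) {
        char timer_unit = argv[2][0];   // first character of the third command-line argument
        std::printf("timer_unit = %c\n", timer_unit);
    } else {
        std::printf("expected at least two arguments\n");
    }
    return 0;
}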
If you only ever want the first character from the parameter, the following will extract it from the string:
char timer_unit = argv[2][0];
The issue is that argv[2] is a char* (C-string) not char.
You are probably not passing in what you think (though this should come from the command line). Please show the complete error message and code, but it looks like you need to deal with the second argument as char *argv[], instead of char argv[] -- that is, as a list of character arrays, as opposed to a single character array.
Everything stays strings when you pass them in to your program as arguments, even if they are single characters. For example, if your program was called "myprog" and you had this at the command line:
myprog arg1 53 c a "hey there!"
Then what you get in the program is the following:
printf("%d\n", argc);
for(int i = 0; i < argc; i++)
{
printf("%s\n", argv[0]);
}
The output of that would be:
6
myprog
arg1
53
c
a
hey there!
The point being that everything on the command line turns into null-terminated strings, even single characters. If you wanted to get the char 'c' from the command line, you'd need to do this:
char value = argv[3][0];
not
char value = argv[3]; // Error!
Even the value of "53" doesn't turn into an int. You can't do:
int number = argv[2]; // Error!
argv[2] is { '5', '3', '\0' }. You have to do this:
int number = atoi(argv[2]); // Converts from a string to an int
I hope this is clear.
Edit: btw, everything above is just as valid for C (hence the printf statements). It works EXACTLY the same in C++.
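For reference, here is that example folded into a self-contained program (the argument positions match the example command line above):
#include <cstdio>
#include <cstdlib>

int main(int argc, char *argv[]) {
    std::printf("%d\n", argc);
    for (int i = 0; i < argc; i++) {
        std::printf("%s\n", argv[i]);      // every argument arrives as a null-terminated string
    }

    if (argc > 3) {
        char value = argv[3][0];           // 'c' in the example command line
        int number = std::atoi(argv[2]);   // "53" converted to the int 53
        std::printf("value=%c number=%d\n", value, number);
    }
    return 0;
}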

Searching for Junk Characters in a String

Friends
I want to integrate the following code into the main application code. The junk characters that come populated in the output string cause the application to crash (core dump).
The following code snippet doesn't work:
void stringCheck(char*);

int main()
{
    char some_str[] = "Common Application FE LBS Serverr is down";
    stringCheck(some_str);
}

void stringCheck(char *newString)
{
    for (int i = 0; i < strlen(newString); i++)
    {
        if ((int)newString[i] > 128)
        {
            TRACE(" JUNK Characters in Application Error message FROM DCE IS = " << (char)newString[i] << "++++++" << (int)newString[i]);
        }
    }
}
Can someone please show me a better approach to finding junk characters in a string?
Many thanks
Your char is probably signed. Cast it to unsigned char instead, so that it doesn't become a negative integer when converted to int:
if ((unsigned char)newString[i] >128)
Depending on your needs, isprint might do a better job, checking for a printable character, including space:
if (!isprint((unsigned char)newString[i]))
...
Note that you have to cast to unsigned char: isprint expects values representable as unsigned char (i.e. between 0 and UCHAR_MAX) or EOF.
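A minimal sketch of the isprint approach (TRACE is replaced with printf so the sketch stays self-contained, and the trailing junk bytes in the test string are made up):
#include <cctype>
#include <cstdio>
#include <cstring>

void stringCheck(const char *newString)
{
    for (std::size_t i = 0; i < std::strlen(newString); i++)
    {
        unsigned char c = static_cast<unsigned char>(newString[i]);
        if (!std::isprint(c))
        {
            std::printf("JUNK character at index %zu: code %u\n", i, static_cast<unsigned>(c));
        }
    }
}

int main()
{
    // hypothetical input: a normal message with two junk bytes appended
    stringCheck("Common Application FE LBS Server is down" "\x01" "\xff");
    return 0;
}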