NetBeans doesn't output certain ASCII characters - C++

I'm trying to print extended-ASCII characters, but in the output I only see a little box. For example, character 179 in code page 437 is the box-drawing character │, but it doesn't print. Instead, it prints a box.
My code:
#include <iostream>

int main(int argc, char** argv) {
    // Box-drawing codes from code page 437
    int a[] {179, 180, 191, 192, 193, 194, 195, 196, 197, 217, 218, 32};
    char b = a[2];   // 191 = ┐ in code page 437
    std::cout << b;
    return 0;
}
How can I resolve this problem?
Note, when I use this code, the output prints the characters correctly:
std::cout << "┐"
But if I use the ASCII character, it prints a box instead.
Edit: To add... even when I output the characters to Notepad, I get the same result.

The problem seems to be an encoding issue: by default, NetBeans does not interpret program output as UTF-8.
Take a look at this tutorial to change NetBeans' default encoding.
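If changing the IDE encoding is not an option, a workaround is to print the box-drawing characters as UTF-8 string literals instead of single code-page-437 bytes. A minimal sketch (mine, not from the original post; it assumes the source file is saved as UTF-8 and the console displays UTF-8):

#include <iostream>

int main() {
    // UTF-8 literals for the same code page 437 box-drawing codes as above
    const char* box[] {"│", "┤", "┐", "└", "┴", "┬", "├", "─", "┼", "┘", "┌", " "};
    std::cout << box[2];   // prints ┐, matching a[2] = 191 in the question
    return 0;
}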

Related

Is there a proper way to receive input from console in UTF-8 encoding?

When getting input from std::cin on Windows, the input is apparently always in the windows-1252 encoding (the default for the host machine in my case) despite all the configuration changes made, which apparently only affect the output. Is there a proper way to capture input on Windows in UTF-8 encoding?
For instance, let's check out this program:
#include <iostream>
#include <locale>
#include <string>

int main(int argc, char* argv[])
{
    std::cin.imbue(std::locale("es_ES.UTF-8"));
    std::cout.imbue(std::locale("es_ES.UTF-8"));
    std::cout << "ñeñeñe> ";
    std::string in;
    std::getline(std::cin, in);
    std::cout << in;
}
I've compiled it using Visual Studio 2022 on a Windows machine with a Spanish locale. The source code is in UTF-8. When executing the resulting program (in a Windows PowerShell session, after running chcp 65001 to set the default encoding to UTF-8), I see the following:
PS C:\> .\test_program.exe
ñeñeñe> ñeñeñe
e e e
The first "ñeñeñe" is correct: it display correctly the "ñ" caracter to the output console. So far, so good. The user input is echoed back to the console correctly: another good point. But! when it turns to send back the encoded string to the ouput, the "ñ" caracter is substituted by an empty space.
When debugging this program, I see that the variable "in" have captured the input in an encoding that it is not utf-8: for the "ñ" it use only one character, whereas in utf-8 that caracter must consume two. The conclusion is that the input is not affect for the chcp command. Is something I doing wrong?
UPDATE
Somebody asked me to show what happens when switching to wcout/wcin:
std::wcout << u"ñeñeñe> ";
std::wstring in;
std::getline(std::wcin, in);
std::wcout << in;
Behaviour:
PS C:\> .\test.exe
0,000,7FF,6D1,B76,E30ñeñeñe
e e e
Another try (setting the string to L"ñeñeñe"):
ñeñeñe> ñeñeñe
e e e
Leaving it as is:
std::wcout << "ñeñeñe> ";
Result is:
eee>
This is the closest to the solution I've found so far:
#include <fcntl.h>
#include <io.h>
#include <iostream>
#include <string>

int main(int argc, char* argv[])
{
    _setmode(_fileno(stdout), _O_WTEXT);   // wide (UTF-16) mode for stdout
    _setmode(_fileno(stdin), _O_WTEXT);    // and the same for stdin
    std::wcout << L"ñeñeñe";
    std::wstring in;
    std::getline(std::wcin, in);
    std::wcout << in;
    return 0;
}
The solution depicted here goes in the right direction. One caveat: stdin and stdout should both be in the same mode, because the console echo rewrites the input. The remaining problem is having to write the string literals with \uXXXX codes; I am still figuring out how to overcome that, perhaps with #defines to keep the text literals readable.
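One approach that sidesteps the \uXXXX problem entirely (a sketch of mine, not from the original post; it assumes Windows, with the source saved and compiled as UTF-8, e.g. /utf-8 on MSVC): read the console in its native UTF-16 with ReadConsoleW, then convert to UTF-8 with WideCharToMultiByte so the rest of the program can keep using narrow strings.

#include <windows.h>
#include <iostream>
#include <string>

int main()
{
    SetConsoleOutputCP(CP_UTF8);   // narrow output is now interpreted as UTF-8
    std::cout << "ñeñeñe> ";

    // Read the input as UTF-16, which is what the console natively delivers.
    wchar_t wbuf[256];
    DWORD nread = 0;
    ReadConsoleW(GetStdHandle(STD_INPUT_HANDLE), wbuf, 255, &nread, nullptr);

    // Convert UTF-16 to UTF-8, so "ñ" occupies its expected two bytes.
    int len = WideCharToMultiByte(CP_UTF8, 0, wbuf, (int)nread, nullptr, 0, nullptr, nullptr);
    std::string in(len, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wbuf, (int)nread, &in[0], len, nullptr, nullptr);

    std::cout << in;   // echoes back with "ñ" intact (includes the trailing \r\n)
    return 0;
}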

Proper way to convert HEX to ASCII read from a file C++

In CODE 1 below, hex read from a file and stored in a string won't convert to ASCII when printed out.
#include <iostream>
#include <sstream>
#include <fstream>

int main(int argc, char** argv)
{
    // CODE 1
    std::ifstream input("C:\\test.txt"); // test.txt contains \x48\x83\xEC\x28\x48\x83
    std::stringstream sstr;
    input >> sstr.rdbuf();
    std::string test = sstr.str();
    std::cout << "\nString from file: " << test;
    //char* lol = new char[test.size()];
    //memcpy(lol, test.data(), test.size());
    ////////////////////////////////////////////////////////
    // CODE 2
    std::string test_2 = "\x48\x83\xEC\x28\x48\x83";
    std::cout << "\n\nHardcoded string: " << test_2 << "\n";
    // Prints as ASCII "H(H", which I want my CODE 1 to do.
}
In my CODE 2 sample, the same hex is used and it prints as ASCII. Why is it not the same for CODE 1?
Okay, it looks like there is some confusion. First, I have to ask if you're SURE you know what is in your file.
That is, does it contain, oh, it looks like about 20 characters:
\
x
4
8
et cetera?
Or does it contain a hex 48 (one byte), a hex 83 (one byte), and so on, for a total of six bytes?
I bet it's the first. I bet your file is about 20 characters long and literally contains the string that's getting printed.
And if so, then the code is doing exactly what you told it to do: it's reading a line of text and writing it back out. If you want it to actually interpret the escapes like the compiler does, then you're going to have to do the steps yourself.
Now, if it actually contains the hex characters (but I bet it doesn't), then that's a little different problem, and we'll have to look at that. But I think you just have a string of characters that includes \x in it. And reading / writing that isn't going to automatically do some magic for you.
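One quick way to settle it (a minimal sketch, not from the original answer; it assumes the same C:\test.txt path): dump the raw bytes of the file and see whether they start with 5C 78 ('\' 'x') or with 48 83.

#include <cstdio>
#include <fstream>

int main()
{
    std::ifstream input("C:\\test.txt", std::ios::binary);
    char c;
    while (input.get(c))
        std::printf("%02X ", static_cast<unsigned char>(c)); // raw byte values
    return 0;
}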
When you read from a file, the backslash characters are not escaped. Your test string from the file is literally an array of chars: {'\\', 'x', '4', '8', ...}
Whereas in your hardcoded string literal, "\x48\x83\xEC\x28\x48\x83", the hex escapes are processed by the compiler.
If you really want to store your data as a text file as a series of "backslash x NN" sequences, you'll need to convert after you read from file. Here's a hacked up loop that would do it for you.
std::string test = sstr.str();
char temp[3] = {};   // two hex digits plus a terminating null (needs <cctype> and <cstdlib>)
size_t t = 0;
std::string corrected;
for (char c : test)
{
    if (isxdigit(static_cast<unsigned char>(c)))
    {
        temp[t] = c;
        t++;
        if (t == 2)
        {
            t = 0;
            unsigned char uc = (unsigned char)strtoul(temp, nullptr, 16);
            corrected += (char)uc;
        }
    }
}
Alternatively, you can split the returned string on "\x", convert each token from a hex string to an int, and finally cast it to char.
These resources will be helpful:
strtok And convert
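A minimal sketch of that split-and-convert idea (mine, not the answerer's; it assumes the input consists of well-formed \xNN sequences throughout):

#include <iostream>
#include <string>

int main()
{
    std::string test = "\\x48\\x83\\xEC\\x28\\x48\\x83"; // as read from the file
    std::string decoded;
    for (std::size_t pos = test.find("\\x"); pos != std::string::npos;
         pos = test.find("\\x", pos + 2))
    {
        // std::stoi with base 16 parses the two digits after "\x"
        decoded += static_cast<char>(std::stoi(test.substr(pos + 2, 2), nullptr, 16));
    }
    std::cout << decoded << '\n'; // prints "H(H" (plus the unprintable bytes)
    return 0;
}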

how to detect non-ascii characters in C++ Windows?

I'm simply trying to detect non-ASCII characters in my C++ program on Windows.
Using something like isascii() or:
bool is_printable_ascii = (ch & ~0x7f) == 0 &&
                          (isprint(ch) || isspace(ch));
does not work because non-ASCII characters are getting mapped to ASCII characters before or while getchar() is doing its thing. For example, if I have some code like:
#include <cctype>
#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
    int c;
    c = getchar();
    cout << isascii(c) << endl;
    cout << c << endl;
    printf("0x%x\n", c);
    cout << (char)c;
    return 0;
}
and input a 😁 (because I am so happy right now), the output is
1
63
0x3f
?
Furthermore, if I feed the program something outside the extended-ASCII range (code page 437), like 'Ĥ', the output is
1
72
0x48
H
This works with similar inputs such as Ĭ or ō (they become I and o). So this seems algorithmic and not just mojibake or something. A quick check in Python (via the same terminal) with a program like
i = input()
print(ord(i))
gives me the expected actual code point instead of the ASCII-mapped one (so it's not the code page or the terminal?). This makes me believe getchar() or the C++ compilers (tested with the VS compiler and g++) are doing something funky. I have also tried using cin and many other alternatives. Note that I've tried this on Linux and cannot reproduce the issue, which makes me inclined to believe it is something to do with Windows (10 Pro). Can anyone explain what is going on here?
Try replacing getchar() with getwchar(); I think you're right that it's a Windows-only problem.
I think the problem is that getchar() is expecting input as a char type, which is 8 bits and only supports ASCII. getwchar() supports the wchar_t type, which allows for other text encodings. "😁" isn't ASCII, and from this page: https://learn.microsoft.com/en-us/windows/win32/learnwin32/working-with-strings , it seems Windows encodes extended characters like this in UTF-16. The 0x3f you're seeing is the ASCII '?', which suggests the console input was converted to the narrow character set and "😁" was replaced by the substitution character.
Okay, I have solved this. I was not aware of translation modes.
_setmode(_fileno(stdin), _O_WTEXT);
Was the solution. The link below essentially explains that there are translation modes and I think phase 5 (character-set mapping) explains what happened.
https://en.cppreference.com/w/cpp/language/translation_phases
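For completeness, a minimal sketch of that fix (mine, assuming Windows with MSVC; _setmode lives in <io.h> and _O_WTEXT in <fcntl.h>):

#include <cstdio>
#include <cwchar>
#include <fcntl.h>
#include <io.h>

int main()
{
    _setmode(_fileno(stdin), _O_WTEXT);   // stdin now delivers UTF-16 code units
    wint_t c = getwchar();
    wprintf(L"0x%x\n", c);                // 'Ĥ' now prints 0x124 instead of 0x48
    return 0;
}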

C++ error too big for character

Here is where I get the error.
To explain: I want to print the → character, which according to http://www.endmemo.com/unicode/unicodeconverter.php has the code 2192. But I may be using the wrong code; if so, what is the right way to print →?
#include <windows.h>
#include <tchar.h>
#include <iostream>
using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
    UINT oldcp = GetConsoleOutputCP();
    SetConsoleOutputCP(CP_UTF8);
    cout << "\x2192" << endl;   // error: too big for character
    SetConsoleOutputCP(oldcp);
    return 0;
}
A char on your platform is 8 bits. Your code fragment "\x2192" tries to put 16 bits into it, which will not fit, so you get the warning.
You possibly meant several characters, like "\x21\x92" or "\x92\x21"? That creates a valid string with two chars (plus the terminating 0). You may still need to adjust it to have the proper value if the comments are correct.
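If you want to stay with the Windows console approach from the question, a sketch (mine, assuming SetConsoleOutputCP(CP_UTF8) behaves as documented) is to keep the UTF-8 code page but emit the three UTF-8 bytes of U+2192 rather than the code point value:

#include <windows.h>
#include <iostream>

int main()
{
    UINT oldcp = GetConsoleOutputCP();
    SetConsoleOutputCP(CP_UTF8);                // console now expects UTF-8
    std::cout << "\xE2\x86\x92" << std::endl;   // UTF-8 encoding of U+2192 (→)
    SetConsoleOutputCP(oldcp);
    return 0;
}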
From the use of _tmain and SetConsoleOutputCP I guess you are mostly interested in Windows. I'm afraid I don't know much about that; hopefully someone who knows more about that specific case will chime in. However, this program generates the output you're looking for in a quick test I tried here with a UTF-8 terminal. Here's the program:
#include <iostream>

int main(void)
{
    std::cout << "\xE2\x86\x92" << std::endl;
    return 0;
}
And example output:
$ make example && ./example
c++ example.cpp -o example
→
I just directly output the UTF-8 encoding of the → character.
Equivalently (at least for clang):
#include <iostream>

int main(void)
{
    std::cout << "→" << std::endl;
    return 0;
}

C++ printf: newline (\n) from commandline argument

How do I print a format string passed as an argument?
example.cpp:
#include <cstdio>

int main(int ac, char* av[])
{
    printf(av[1], "anything");
    return 0;
}
try:
example.exe "print this\non newline"
output is:
print this\non newline
instead I want:
print this
on newline
No, do not do that! That is a very severe vulnerability. You should never accept format strings as input. If you would like to print a newline whenever you see a "\n", a better approach would be:
#include <iostream>

int main(int argc, char* argv[])
{
    if (argc != 2) {
        std::cerr << "Exactly one parameter required!" << std::endl;
        return 1;
    }
    int idx = 0;
    const char* str = argv[1];
    while (str[idx] != '\0') {
        if ((str[idx] == '\\') && (str[idx+1] == 'n')) {
            std::cout << std::endl;
            idx += 2;
        } else {
            std::cout << str[idx];
            idx++;
        }
    }
    return 0;
}
Or, if you are including the Boost C++ Libraries in your project, you can use the boost::replace_all function to replace instances of "\\n" with "\n", as suggested by Pukku.
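For reference, a minimal sketch of that Boost approach (mine; it assumes Boost.StringAlgorithm is available and only handles the \n escape):

#include <boost/algorithm/string/replace.hpp>
#include <cstdio>
#include <string>

int main(int argc, char* argv[])
{
    if (argc > 1) {
        std::string s = argv[1];
        boost::replace_all(s, "\\n", "\n");   // turn the two-character "\n" into a real newline
        std::printf("%s", s.c_str());
    }
    return 0;
}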
At least if I understand correctly, your question is really about converting the "\n" escape sequence into a newline character. That conversion happens at compile time, so if (for example) you enter "\n" on the command line, it gets printed out as "\n" instead of being converted to a newline character.
I wrote some code years ago to convert escape sequences when you want that done at run time. Please don't pass the result as the first argument to printf, though. If you want to print a string entered by the user, use fputs or the "%s" conversion format:
#include <cstdio>

// translate() is the escape-sequence converter mentioned above (not shown here)
int main(int argc, char **argv) {
    if (argc > 1)
        printf("%s", translate(argv[1]));
    return 0;
}
You can't do that, because \n and the like are parsed by the C compiler. In the generated code, the actual numerical value is written.
What this means is that your input string will have to actually contain the character value 10 (or 13, or both) to be considered a newline, because the C library functions do not handle these escape sequences; the C compiler does it for them.
Alternatively, you can just replace every instance of "\\n" with "\n" in your string before sending it to printf.
Passing user arguments directly to printf enables an exploit called a "format string attack".
See Wikipedia and Much more details
There's no way to automatically have the string contain a newline. You'll have to do some kind of string replacement on your own before you use the parameter.
It is only the compiler that converts \n and the like to the actual character value when it finds the sequence in a string literal.
If you want to do it for a string that you get from somewhere else, you need to manipulate the string directly and replace the two characters "\n" with a CR/LF, etc.
If you do that, don't forget that "\\" becomes '\' too.
And please never ever use raw char* buffers in C++; there is a nice std::string class that's safer and more elegant.
You can also let the shell insert the newline before the program ever sees it:
example.exe "print this$(echo -e "\n ")on newline"
(This relies on a shell that supports $(...) command substitution and echo -e, such as bash; it won't work in cmd.exe.)