Space vs null character - c++

In C++, when we need to print a single space, we may do the following:
cout << ' ';
Or we can even use a converted ASCII code for space:
cout << static_cast<char>(32); //ASCII code 32 maps to a single space
I noticed that printing a null character also causes a single space to be printed:
cout << static_cast<char>(0); //ASCII code 0 maps to a null character
So my question is: is it universal to all C++ compilers that printing static_cast<char>(0) will always appear as a single space on the display?
If it is universal, does it also apply to text files when I use a file output stream?

No. With every compiler, what gets written is a null character (code 0), not a space. It seems that the font you use renders the null character as a space. For example, in the old days, DOS had a different glyph (an almost filled rectangle) for the null character.
In any case, you really should not output null characters instead of spaces!
As for the text file part: open the output file in a hex editor to see the actual bytes written. You will see the difference there!

On my computer, this code
#include <iostream>
int main() {
std::cout << "Hello" << static_cast<char>(0) << "world\n";
}
outputs this:
Helloworld
So no, it clearly doesn’t work.

Related

Printing unicode Characters in C++

I'm trying to print an interface using these characters:
"╣║╗╝╚╔╩╦╠═╬"
But when I try to print it, I get something like this:
"ôöæËÈ"
interface.txt
unsigned char* tabuleiroImportado() {
std::ifstream TABULEIRO;
TABULEIRO.open("tabuleiro.txt");
unsigned char tabu[36][256];
for (unsigned char i = 0; i < 36; i++) {
TABULEIRO >> tabu[i];
std::cout << tabu[i] << std::endl;
}
return *tabu;
}
I'm using this function to import the interface.
Just like every other kind of data that lives in your computer, text must be represented by a sequence of bytes. Each byte can have just 256 possible values.
All the carbon-based life forms that live on the third planet from the sun use all sorts of different alphabets with all sorts of characters, whose total number is much more than 256.
A single byte by itself cannot, therefore, express all characters. The simplest way of handling this is to pick just 256 (or fewer) characters at a time, assign each of them one of the possible byte values, and call the result your "character set".
Such is, apparently, your "tabuleiro.txt" file: its contents must be using some particular character set which includes the characters you expect to see there.
Your screen display, however, uses a different character set, hence the same values show different characters.
However, it's probably more complicated than that: modern operating systems and modern terminals employ multi-byte character sequences, where a single character can be represented by a specific sequence of more than one byte. It's fairly likely that your terminal screen is based on a multi-byte Unicode encoding.
In summary, you need to do three things:
1. Figure out which character set your file uses.
2. Figure out which character set your terminal display uses.
3. Write the code to properly translate one to the other.
It goes without saying that no one else can tell you which character set your file uses or which one your terminal display uses. That's something you'll need to figure out yourself, and without knowing both, you can't do step 3.
To print the Unicode characters, you can put the Unicode value with the prefix \u.
If the console does not support Unicode, then you cannot get the correct result.
Example:
#include <iostream>
int main() {
std::cout << "Character: \u2563" << std::endl;
std::cout << "Character: \u2551" << std::endl;
std::cout << "Character: \u2560" << std::endl;
}
Output:
Character: ╣
Character: ║
Character: ╠
Another approach is to use an unsigned char in the same way as a char: assign the character's numeric code to a variable (called a here, but any name works) and print it.
I ran into similar garbled output while making a game engine for cmd, and this worked for me in C++17 with GNU GCC, as of 2021 to 2022.

How to print out ≠ to terminal using g++?

I want to print ≠ in the terminal. I tried
cout << '\u2248' << endl;
cout << '\U00002248' << endl;
cout << '≠' << endl;
which gives
14846344
14846344
14846368
I tried replacing the single quotes with double quotes, which gives
Ôëê
Ôëê
Ôëá
How can it be done? I'm curious what the explanation for the output I'm getting is? I'm running Netbeans 9 but have tested directly from command line with g++ too. I think this should be possible because echo ≠ produces the correct output in the Windows command prompt.
In C++, as in plain C, by default we can only rely on ASCII characters.
A char variable holds just 8 bits (1 byte), so a single char can encode at most 2^8 = 256 different symbols.
Single quotes (like 'a') denote a single char value, so only one ASCII character can be placed inside them. Your character is not part of the ASCII table, so we need to change the encoding.
To just print (not store or process) your character, you should use another encoding such as UTF-8. You can do it programmatically:
std::setlocale(LC_ALL, /*some system-specific locale name, probably */ "en_US.UTF-8");
std::cout << "\u2260" << std::endl;
Or you can do it via command-line options to g++ (such as -finput-charset=UTF-16).
As you can see, I'm using double quotes to print non-ASCII symbols to the console.

fprintf always write to the end of the file even when I do rewind(fileptr) before, c++

I want to append a file and update some of its lines at the same time.
After appending as I desired, say I want to change only the first line, here is what I tried:
outputptr = fopen(outputName.c_str(), "ar+b");
cout << ftell(outputptr) << " ";
rewind(outputptr);
cout << ftell(outputptr) << "\n";
fprintf(outputptr, "abc");
But that code does not replace the first three letters with abc; instead it appends abc to the end of the file. The two cout statements printed 60 and 0 in this case, so the position was in fact moved to the beginning.
How do I go to any line of a given file and modify only that line?
The definition of 'a' in the mode field says:
(I've cut it down to the bits that are relevant for this question; it says some other stuff too.)
... Repositioning operations (fseek, fsetpos, rewind) affects the next
input operations, but output operations move the position back to the
end of file. ...
You probably want "r+b".
http://www.cplusplus.com/reference/cstdio/fopen/

Decimal values of Extended ASCII characters

I wrote a function to test if a string consists only of letters, and it works well:
bool is_all_letters(const char* src) {
while (*src) {
// A-Z, a-z
if ((*src>64 && *src<91) || (*src>96 && *src<123)) {
*src++;
}
else {
return false;
}
}
return true;
}
My next step was to include “Extended ASCII Codes”, I thought it was going to be really easy but that’s where I ran into trouble. For example:
std::cout << (unsigned int)'A'; // 65 <-- decimal ascii value
std::cout << (unsigned int)'ñ'; // 4294967281 <-- what?
I thought that the decimal value for ‘ñ’ was going to be 164 as listed on the ASCII chart at www.asciitable.com.
My goal is to restrict user input to only letters in ISO 8859-1 (latin 1). I’ve only worked with single byte characters and would like to avoid multi-byte characters if possible.
I am guessing that I can compare the unsigned int values above, i.e.: 4294967281, but it does not feel right to me and besides, I don’t know if that large integer is VC 8.0 representation of 'ñ' and changes from compiler to compiler.
Please advise
UPDATE - Per some suggestions made by Christophe, I ran the following code:
locale loc("spanish") ;
cout<<loc.name() << endl; // Spanish_Spain.1252
for (int i = 0; i < 255; i++) {
cout << i << " " << isalpha(i, loc)<< " " << (isprint(i,loc) ? (char)(i):'?') << endl;
}
It does return Spanish_Spain.1252 but unfortunately, the loop iterations print the same data as the default C locale (using VC++ 8 / VS 2005).
Christophe shows different (desired) results as you can see in his screen shots below, but he uses a much newer version of VC++.
The code chart you found on the internet is actually Windows OEM code page 437, which was never endorsed as a standard. Although it is sometimes called "extended ASCII", that description is highly misleading. (See the Wikipedia article Extended ASCII: "The use of the term is sometimes criticized, because it can be mistakenly interpreted that the ASCII standard has been updated to include more than 128 characters or that the term unambiguously identifies a single encoding, both of which are untrue.")
You can find the history of OEM437 on Wikipedia, in various versions.
What was endorsed as a standard 8-bit encoding is ISO-8859-1, which later became the first 256 code points in Unicode. (It's one of a series of 8-bit encodings designed for use in different parts of the world; ISO-8859-1 is intended for the Americas and Western Europe.) So that's what you will find in most computers produced in this century in those regions, although more and more operating systems have been moving to full Unicode support.
The value you see for (unsigned int)'ñ' is the result of casting the ISO-8859-1 code 0xF1 from a (signed) char (that is, -15) to an unsigned int. Had you cast it to an int, you would have seen -15.
I thought that the decimal value for ‘ñ’ was going to be 164 as listed on the ASCII chart at www.asciitable.com.
Asciitable.com appears to give the code for the old IBM437 DOS character set (still used in the Windows command prompt), in which ñ is indeed 164. But that's just one of hundreds of “extended ASCII” variants.
The value 4294967281 = 0xFFFFFFF1 you got is a sign-extension of the (signed) char value 0xF1, which is how ñ is encoded in ISO-8859-1 and close variants like Windows-1252.
To start with, you're trying to reinvent std::isalpha. But you'll need to pass it the ISO-8859-1 locale; IIRC, by default it just checks ASCII.
The behavior you see is because char is signed (because you didn't compile with /J, which is the smart thing to do when you use more than just ASCII - VC++ defaults to signed char).
There is already plenty of information here. However, I'd like to propose some ideas to address your initial problem, namely the categorisation of extended character sets.
For this, I suggest using <locale> (country-specific topics), and especially the locale-aware forms of isalpha(), isspace(), isprint(), and so on.
Here is a little piece of code to help you find out which chars count as letters in your local alphabet:
std::locale::global(std::locale("")); // sets the environment default locale currently in place
std::cout << std::locale().name() << std::endl; // display name of current locale
std::locale loc ; // use a copy of the active global locale (you could use another)
for (int i = 0; i < 255; i++) {
cout << i << " " << isalpha(i, loc)<< " " << (isprint(i,loc) ? (char)(i):'?') << endl;
}
This will print out the character codes from 0 to 254, each followed by an indicator of whether it is a letter according to the locale settings, and the character itself if it's printable.
For example, on my PC, I get:
And all the accented chars, as well as ñ, and greek letters are considered as alpha, whereas £ and mathematical symbols are considered as non alpha printable.

Padding with spaces in char array

I am writing a program that writes the header for a specific image format. This format requires a 256-character header before any of the raw image data that follows. But I am having trouble padding the header with spaces to make it 256 characters long.
Below is an abstraction of my problem:
char pad[256];
sprintf( pad, "header info:%s=%f", "scale", 2.3);
cout<<pad<<"data here"<<endl;
The output is:
header info:scale=2.300000data here
However, the result I expect is like:
header info:scale=2.300000 data here
where "data here" appears after 256 characters from the beginning of the file.
How can I change the program to pad empty spaces in the character array?
Do this:
cout << setw(256) << left << pad << "data here" <<endl;
You may need #include <iomanip>.
BTW, in your "real code" you should use snprintf to ensure there is no chance of a buffer overflow, assuming that your %s is going to get an argument that's worked out at runtime. (Or, preferably, replace the sprintf with a stringstream.)
Maybe this can help you: the - flag in the format left-aligns the text. For example, the following left-aligns the float text within a fixed width of 20. The default padding character, a space, is used.
sprintf( pad, "header info:%s=%-20f", "scale", 2.3);