Spaces instead of ASCII characters - C++

#include <iostream>
int main() {
for(int i=0;i<18;i++)
std::cout << (char)i << '\n';
}
OUTPUT is:
But where are other characters?

The characters you are expecting to see are not ASCII. In ASCII, the codes below 32 signify what is called control characters, which were originally intended to control functions on teletype printers. Many of them don't apply to modern terminals, so your terminal just picked some characters (or got them from some other encoding), such as the faces and the card suits, to use for those codes. Some of the ASCII control characters are still applicable though.
7 is called the bell character ('\a'); you may have heard a beep.
8 is a backspace ('\b').
std::cout << "abc" << (char)8 << "def"; // where's the c?
9 is a horizontal tab ('\t'), so that's invisible, but you'll probably notice it if you print visible characters before and after it.
std::cout << "before" << (char)9 << "after";
10 is a line feed, a.k.a. newline ('\n')
13 is a carriage return ('\r').
std::cout << "hello" << (char)13 << "world"; // where's the hello?
Your results may vary depending upon which terminal you use.

Related

How can I remove a newline from inside a string in C++?

I am trying to take text input from the user and compare it to a list of values in a text file. The values are this:
That line at the end is the cursor, not a straight line, but it doesn't matter. Anyway, I sort by word and produce the values, then check the values. A semicolon is the separator between words. All the data is basic, to get the code working first. The important thing is that all the pieces of data have newlines after them. No matter what I try, I can't get rid of the newlines completely. Looking at the ASCII values shows why: my efforts remove only the newline, but not the carriage return. This is fine most of the time, but when comparing values they won't be the same, because the one with the carriage return is treated as longer. Here are the important parts of the code:
int pos = 0;
while (pos != std::string::npos)
{
std::string look = lookContents.substr(pos+1, lookContents.find("\n", pos + 1) - pos);
//look.erase(std::remove(look.begin(), look.end(), '\n'), look.end());
//##
for (int i = 0; i < look.length(); i++)
{
std::cout << (int)(look[i]) << " ";
}
std::cout << std::endl;
std::cout << look << ", " << words[1] << std::endl;
std::cout << look.compare(0,3,words[1]) << std::endl;
std::cout << pos << std::endl;
//##
//std::cout << look << std::endl;
if (look == words[1])
{
std::cout << pos << std::endl;
break;
}
pos = lookContents.find("\n", pos + 1);
}
Everything between the //## markers is just error checking. Here's what it outputs when I type look b:2
As you can see, the values have the ASCII 10 and 13 at the end, which is what is used to create newlines. 13 is carriage return and 10 is newline. The last one has its 10 removed earlier in the code so the code doesn't do an extra loop on an empty substring. My efforts to remove the newline, including the commented-out erase call, either only remove the 13, or remove both the 10 and 13 but corrupt later data like this:
Also, you can see that using cout to print look and words[1] at the same time causes look to just not exist for some reason. Printing it by itself works fine though. I realise I could fix this by just using that compare function in the code to check all but the last characters, but this feels like a temporary fix. Any solutions?
My efforts remove only the new line, but not the carriage return
The newline and carriage return are both control characters.
To remove all the control characters from the string, you can use std::remove_if along with std::iscntrl:
#include <cctype>
#include <algorithm>
//...
lookContents.erase(std::remove_if(lookContents.begin(), lookContents.end(),
[&](char ch)
{ return std::iscntrl(static_cast<unsigned char>(ch));}),
lookContents.end());
Once you have all the control characters removed, then you can process the string without having to check for them.

Print (and store) high ASCII character (╔) in C++ in console

I'm making a small program in C++ and I would like to have this character stored in a variable: ╔. However, I can only do it in a string, and if I use the ' notation it just shows this: �.
Is there anything I can do?
BTW, I use:
Linux (Mint)
Visual Studio Code (integrated terminal)
The console shows the characters correctly if I use the " notation, so probably it's not a problem with the console itself.
You can use the hex notation:
char border = '\xcd';
Small program:
#include <iostream>
using std::cout;
using std::cin;
int main()
{
cout << "border: \xcd\n";
const char corner = '\xc9';
cout << "Upper left corner: " << corner << "\n";
cout << "Paused. Press ENTER to continue.\n";
cin.ignore(100000, '\n');
return 0;
}
There are many charts that show extended ASCII encodings. Use the hexadecimal value for the character that you need.
Wikipedia, for example, has a chart of the DOS extended ASCII table.

Why do I obtain this strange character?

Why does my C++ program create the strange character shown below in the pictures? The picture on the left with the black background is from the terminal. The picture on the right with the white background is from the output file. Before, it was a "\v" now it changes to some sort of astrological symbol or symbol to denote males. 0_o This makes no sense to me. What am I missing? How can I have my program output just a backslash v?
Please see my code below:
// SplitActivitiesFoo.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <vector>
#include <fstream>
using namespace std;
int main()
{
string s = "foo:bar-this-is-more_text#\venus \"some more text here to read.\"";
vector<string> first_part;
fstream outfile;
outfile.open("out.foobar");
for (int i = 0; i < s.size(); ++i){
cout << "s[" << i << "]: " << s[i] << endl;
outfile << s[i] << endl;
}
return 0;
}
Also, assume that I do not want to modify my string 's' in this case. I want to be able to parse each character of the string and work around the strange character somehow. This is because in the actual program the string will be read in from a file and parsed, then sent to another function. I guess I could figure out a way to programmatically add backslashes...
How can I have my program output just a backslash v?
If you want a backslash, then you need to escape it: "#\\venus".
This is required because a backslash denotes that the next character should be interpreted as something special (note that you were already using this when you wanted double-quotes). So the compiler has no way of knowing you actually wanted a backslash unless you tell it.
A literal backslash character therefore has the syntax \\. This is the case in both string literals ("\\") and character literals ('\\').
Why does my C++ program create the strange character shown below in the picture?
Your string contains the \v control character (vertical tab), and the way it's displayed is dependent on your terminal and font. It looks like your terminal is using symbols from the traditional MSDOS code page.
I found an image for you here, which shows exactly that symbol for the vertical tab (vt) entry at value 11 (0x0b):
Also, assume that I do not want to modify my string 's' in this case. I want to be able to parse each character of the string and work around the strange character somehow.
Well, I just saw you add the above part to your question. Now you're in difficult territory. Because your string literal does not actually contain the character v or any backslashes. It only appears that way in code. As already said, the compiler has interpreted those characters and substituted them for you.
If you insist on printing v instead of a vertical tab for some crazy reason that is hopefully not related to an XY Problem, then you can construct a lookup-table for every character and then replace undesirables with something else:
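char lookup[256];
std::iota( lookup, lookup + 256, 0 ); // Using iota from <numeric>
lookup['\v'] = 'v';
for (int i = 0; i < s.size(); ++i)
{
// Index through unsigned char so that negative char values cannot index out of bounds
unsigned char index = static_cast<unsigned char>(s[i]);
cout << "s[" << i << "]: " << lookup[index] << endl;
outfile << lookup[index] << endl;
}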
Now, this won't print the backslashes. To escape the other control characters as well, check out std::iscntrl. It's locale-dependent, but you could utilise it. Or just use something naive like:
const char *lookup[256] = { 0 };
lookup['\f'] = "\\f";
lookup['\n'] = "\\n";
lookup['\r'] = "\\r";
lookup['\t'] = "\\t";
lookup['\v'] = "\\v";
lookup['\"'] = "\\\"";
// Maybe add other controls such as 0x0E => "\\x0e" ...
for (int i = 0; i < s.size(); ++i)
{
// A null entry means "no replacement, print the character as it is"
const char * x = lookup[static_cast<unsigned char>(s[i])];
if( x ) {
cout << "s[" << i << "]: " << x << endl;
outfile << x << endl;
} else {
cout << "s[" << i << "]: " << s[i] << endl;
outfile << s[i] << endl;
}
}
Be aware there is no way to correctly reconstruct the escaped string as it originally appeared in code, because there are multiple ways to escape a character, and even ordinary characters can be written as escapes.
Most likely the terminal that you are using cannot render the vertical tab character "\v", and prints something else instead. On my terminal it prints:
foo:bar-this-is-more_text#
enus "some more text here to read."
To print the "\v", change your code to:
string s = "foo:bar-this-is-more_text#\\venus \"some more text here to read.\"";
What am I missing? How can I have my program output just a backslash v?
You are escaping the letter v. To print backslash and v, escape the backslash.
That is, print double backslash and a v.
\\v

Trying to output everything inside an exe file

I'm trying to output the plaintext contents of this .exe file. It's got plaintext stuff in it like "Changing the code in this way will not affect the quality of the resulting optimized code." and all the other stuff Microsoft puts into .exe files. When I run the following code I get the output of M Z E followed by a heart and a diamond. What am I doing wrong?
ifstream file;
char inputCharacter;
file.open("test.exe", ios::binary);
while ((inputCharacter = file.get()) != EOF)
{
cout << inputCharacter << "\n";
}
file.close();
I would use something like std::isprint to make sure the character is printable and not some weird control code before printing it.
Something like this:
#include <cctype>
#include <fstream>
#include <iostream>
int main()
{
std::ifstream file("test.exe", std::ios::binary);
char c;
while(file.get(c)) // don't loop on EOF
{
if(std::isprint(static_cast<unsigned char>(c))) // check if printable; the cast keeps negative char values out of isprint
std::cout << c;
}
}
You have opened the stream in binary mode, which is good for the intended purpose. However, you print every byte as it is: some of these characters are not printable, which gives weird output.
Potential solutions:
If you want to print the content of an exe, you'll get more non-printable chars than printable ones. So one approach could be to print the hex value instead:
while ( file.get(inputCharacter ) )
{
cout << setw(2) << setfill('0') << hex << (int)(inputCharacter&0xff) << "\n";
}
Or you could use the debugger approach of displaying the hex value, and then display the char if it's printable or '.' if not:
while (file.get(inputCharacter)) {
cout << setw(2) << setfill('0') << hex << (int)(inputCharacter&0xff)<<" ";
if (isprint(inputCharacter & 0xff))
cout << inputCharacter << "\n";
else cout << ".\n";
}
Well, for the sake of ergonomics, if the file is a real exe, you'd better opt for displaying several chars on each line ;-)
A binary file is a collection of bytes. A byte has a range of values 0..255. The printable characters that can be safely "printed" form a much narrower range. Assuming the most basic ASCII encoding:
32..63
64..95
96..126
plus, maybe, some higher than 128, if your codepage has them
see ascii table.
Every character that falls out of that range may, at least:
print out as invisible
print out as some weird trash
be in fact a control character that will change settings of your terminal
Some terminals support "end of text" character and will simply stop printing any text afterwards. Maybe you hit that.
I'd say, if you are interested only in text, then print only the printable characters and ignore the others. Or, if you want everything, then maybe write them out in hex form instead?
This worked:
ifstream file;
char inputCharacter;
string Result;
file.open("test.exe", ios::binary);
while (file.get(inputCharacter))
{
if ((inputCharacter > 31) && (inputCharacter < 127))
Result += inputCharacter;
}
cout << Result << endl;
cout << "These are the ascii characters in the exe file" << endl;
file.close();

Why is the fstream::tellg() return value enlarged by the number of newlines in the input text file when the file is formatted for Windows (\r\n)?

The program opens the input file and prints the current read/write position several times.
If the file is formatted with '\n' for newlines, the values are as expected: 0, 1, 2, 3.
On the other hand, if the newline is '\r\n', it appears that after some reading, the current position returned by all tellg() calls is offset by the number of newlines in the file. The output is: 0, 5, 6, 7.
All returned values are increased by 4, which is the number of newlines in the example input file.
#include <fstream>
#include <iostream>
#include <iomanip>
using std::cout;
using std::setw;
using std::endl;
int main()
{
std::fstream ioff("su9.txt");
if(!ioff) return -1;
int c = 0;
cout << setw(30) << std::left << " Before any operation " << ioff.tellg() << endl;
c = ioff.get();
cout << setw(30) << std::left << " After first 'get' " << ioff.tellg() << " Character read: " << (char)c << endl;
c = ioff.get();
cout << setw(30) << std::left << " After second 'get' " << ioff.tellg() << " Character read: " << (char)c << endl;
c = ioff.get();
cout << setw(30) << std::left << " Third 'get' " << ioff.tellg() << "\t\tCharacter read: " << (char)c << endl;
return 0;
}
Input file is 5 lines long (has 4 newlines), with a content:
-------------------------------------------
abcd
efgh
ijkl
--------------------------------------------
output (\n):
Before any operation 0
After first 'get' 1 Character read: a
After second 'get' 2 Character read: b
Third 'get' 3 Character read: c
output (\r\n):
Before any operation 0
After first 'get' 5 Character read: a
After second 'get' 6 Character read: b
Third 'get' 7 Character read: c
Notice that the character values are read correctly.
The first, and most obvious question, is why do you expect any
particular values when the results of tellg are converted to
an integral type. The only defined use of the results of
tellg is as a later argument to seekg; they have no defined
numerical significance whatsoever.
Having said that: in Unix and Windows implementations, they will
practically always correspond to the byte offset of the
physical position in the file. Which means that they will have
some significance if the file is opened in binary mode; under
Windows, for example, text mode (the default) maps the two
character sequence 0x0D, 0x0A in the file to the single
character '\n', and treats the single character 0x1A as if it
had encountered end of file. (Binary and text mode are
identical under Unix, so things often seem to work there even
when they aren't guaranteed.)
I might add that I cannot reproduce your results with MSC++.
Not that that means anything; as I said, the only requirements
for tellg is that the returned value can be used in a seekg to
return to the same place. (Another issue might be how you
created the files. Might one of them start with a UTF-8
encoding of a BOM, for example, and the other not?)