I want to convert a series of 32-bit integer values into a sequence of printable 8-bit character values. Mapping the 32-bit integers to printable 8-bit character values should result in a clear ASCII art image.
I can convert an integer to ASCII:
#include <iostream>
using namespace std;

int main() {
    char ascii;
    int numeric;
    cout << "Enter Number ";
    cin >> numeric;
    cout << "The ASCII value of " << numeric << " is " << (char) numeric << "\n\n" << endl;
    return 0;
}
Also I need to open the text file that my numbers are saved into:
// reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
    string line;
    ifstream myfile ("1.txt");
    if (myfile.is_open())
    {
        // loop on getline itself; testing myfile.good() before reading
        // repeats the last line once the stream hits end-of-file
        while (getline (myfile, line))
        {
            cout << line << endl;
        }
        myfile.close();
    }
    else
        cout << "Unable to open file";
    return 0;
}
But my problem is that I cannot open this text file, print the ASCII art on the screen, and also write a copy of it to "Output.txt".
Inside my text file is just:
757935403 544999979 175906848 538976380
757795452 170601773 170601727
which, after converting to ASCII, needs to look like this ASCII art picture:
+---+
|   |
|   |
+---+
and I need this also in my Output.txt.
Please advise how I can write this program.
First of all, you cannot convert a 32-bit integer to an 8-bit ASCII character without losing information. As far as I can guess, you should extract 4 ASCII chars from each 32-bit integer.
If your input file is non-binary (meaning the integer values are human-readable and separated by some delimiter), the first thing you should do is create another file/stream and write these values to it in binary mode (in this mode there is no delimiter, and the resulting file/stream is not human-readable).
Now read the chars one by one from this new file/stream (opened in binary mode), and write them to your final output file in non-binary mode.
IF YOU WANT TO DO IT WITHOUT SEVERAL FILE IN/OUTS:
Read all your integer values into an array, then point a char pointer at its starting memory location, and write out the contents of that char array one by one.
int* myIntArray;                        // keep the size of it somewhere
char* myCharArray = (char*)myIntArray;  // size of myCharArray is 4 times that of myIntArray
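A runnable sketch of that idea (a minimal example, assuming a little-endian machine; a std::vector stands in for the array, since its storage is contiguous):
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    // Two sample values from the question; a real program would read them from the file.
    std::vector<int> values = {757935403, 544999979};
    // Point a char pointer at the vector's storage: 4 chars per int.
    const char* myCharArray = (const char*)values.data();
    for (std::size_t i = 0; i < values.size() * sizeof(int); ++i)
        std::putchar(myCharArray[i]);  // prints "+---+", a newline, "| " ...
    return 0;
}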
Having converted those numbers into hex, you get this:
2d2d2d2b → '-' '-' '-' '+'
207c0a2b → ' ' '|' LF '+'
0a7c2020 → LF '|' ' ' ' '
2020207C → ' ' ' ' ' ' '|'
etc., etc.
So basically, for some reason, the input file contains the characters to output stored as integers, which is completely endian-unsafe.
Your least-worst bet is to read in each integer, cast it to an array of chars, and output those 4 chars.
If you're using Unix, I'd suggest using tee to send your output to 2 files if you can; otherwise output once to stdout, then output again to whatever file handle you've opened for Output.txt.
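Putting it together, here is a minimal sketch of the whole task. It assumes the filenames from the question ("1.txt" and "Output.txt") and a little-endian machine, which the sample data implies (the '+' of "+---" sits in the low byte of the first value):
#include <cstdint>
#include <fstream>
#include <iostream>

int main() {
    std::ifstream in("1.txt");                          // text mode: parse the decimal numbers
    std::ofstream out("Output.txt", std::ios::binary);  // binary mode: raw bytes out
    if (!in || !out) {
        std::cerr << "Unable to open file\n";
        return 1;
    }
    std::uint32_t value;
    while (in >> value) {
        // Reinterpret the integer's storage as its 4 raw bytes.
        const char* bytes = reinterpret_cast<const char*>(&value);
        std::cout.write(bytes, sizeof value);  // show the ASCII art on screen
        out.write(bytes, sizeof value);        // and keep a copy in Output.txt
    }
    return 0;
}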
In my code below, CODE 1 reads hex from a file and stores it in a string, but it doesn't convert to ASCII when printed out.
#include <iostream>
#include <sstream>
#include <fstream>

int main(int argc, char** argv)
{
    // CODE 1
    std::ifstream input("C:\\test.txt"); // The test.txt contains \x48\x83\xEC\x28\x48\x83
    std::stringstream sstr;
    input >> sstr.rdbuf();
    std::string test = sstr.str();
    std::cout << "\nString from file: " << test;

    //char* lol = new char[test.size()];
    //memcpy(lol, test.data(), test.size());

    ////////////////////////////////////////////////////////

    // CODE 2
    std::string test_2 = "\x48\x83\xEC\x28\x48\x83";
    std::cout << "\n\nHardcoded string: " << test_2 << "\n";
    // Prints as ASCII "H(H", which I want my CODE 1 to do.
}
In my CODE 2 sample the same hex is used, and it prints as ASCII. Why is it not the same for CODE 1?
Okay, it looks like there is some confusion. First, I have to ask if you're SURE you know what is in your file.
That is, does it contain, oh, it looks like about 20 characters:
\
x
4
8
et cetera?
Or does it contain a hex 48 (one byte), a hex 83 (one byte), and so on, for a total of 6 characters?
I bet it's the first. I bet your file is about 20 characters long and literally contains the string that's getting printed.
And if so, then the code is doing what you expect. It's reading a line of text and writing it back out. If you want it to actually interpret it like the compiler does, then you're going to have to do the steps yourself.
Now, if it actually contains the hex characters (but I bet it doesn't), then that's a little different problem, and we'll have to look at that. But I think you just have a string of characters that includes \x in it. And reading / writing that isn't going to automatically do some magic for you.
When you read from a file, backslash characters are not interpreted as escape sequences. Your test string from the file is literally an array of chars: {'\\', 'x', '4', '8', ... }
Whereas in your hardcoded string literal, "\x48\x83\xEC\x28\x48\x83", the hex escapes are fully processed by the compiler.
If you really want to store your data as a text file containing a series of "backslash x NN" sequences, you'll need to convert after you read from the file. Here's a hacked-up loop that would do it for you.
std::string test = sstr.str();
char temp[3] = {};  // third char stays '\0', so temp is always a valid C string
size_t t = 0;
std::string corrected;
for (char c : test)
{
    if (isxdigit(c))  // collect hex digits, skipping the '\' and 'x' characters
    {
        temp[t] = c;
        t++;
        if (t == 2)   // two hex digits make one byte
        {
            t = 0;
            unsigned char uc = (unsigned char)strtoul(temp, nullptr, 16);
            corrected += (char)uc;
        }
    }
}
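With that conversion in place, printing corrected should give the same "H(H" output as the hardcoded literal in CODE 2: 0x48 is 'H', 0x28 is '(', and the 0x83 and 0xEC bytes are non-printable.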
You can split the returned string on the \x separators, then convert each piece from string to int, and finally cast it to char.
These resources may be helpful:
strtok and convert
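A minimal sketch of the strtok approach, assuming the file's contents have been copied into a writable char buffer (strtok modifies its argument); the buffer below stands in for what CODE 1 reads from test.txt:
#include <cstdio>
#include <cstdlib>
#include <cstring>

int main() {
    // Literal backslashes, exactly as they appear when read from the file.
    char buffer[] = "\\x48\\x83\\xEC\\x28\\x48\\x83";
    // '\\' and 'x' are the delimiters, so each token is a two-digit hex number.
    for (char* tok = std::strtok(buffer, "\\x"); tok != nullptr;
         tok = std::strtok(nullptr, "\\x")) {
        std::putchar((char)std::strtol(tok, nullptr, 16)); // prints "H(H" plus unprintables
    }
    return 0;
}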
I have a text file which I am adding tags to in order to make it XML readable. In order for our reader to recognize it as valid, each line must at least be wrapped in tags. My issue arises because this is actually a Syriac translation dictionary and so there are many non-standard characters (the actual Syriac words). The most straight-forward way I see to accomplish what I need is to simply prepend and append each line with the needed tags, in place, without necessarily accessing or modifying the rest of the line. Any other options would also be greatly appreciated.
ifstream in_file;
string file_name;
string line;
string line2;
string pre_text;
string post_text;
int num = 1;

pre_text = "<entry n=\"";
post_text = "</entry>";
file_name = "D:/TEI/dictionary1.txt";

in_file.open(file_name.c_str());

if (in_file.is_open()){
    while (getline(in_file, line)){
        line2 = pre_text + to_string(num) + "\">" + line + post_text;
        cout << line2;
        num++;
    }
}
The file in question may be downloaded here.
You are using std::string which, by default, deals with ASCII encoded text, and you are opening your file in "text translation mode". The first thing you need to do is open the file in binary mode so that it doesn't perform translation on individual char values:
in_file.open(file_name.c_str(), std::ios::binary);
or in C++11
in_file.open(file_name, std::ios::binary);
The next thing is to stop using std::string for storing the text from the file. You will need to use a string type that recognizes the character encoding you are using, together with the appropriate character type.
As it turns out, std::string is actually an alias for std::basic_string<char>. In C++11 several new unicode character types were introduced, in C++03 there was wchar_t which supports "wide" characters (more than 8 bits). There is a standard alias for basic_strings of wchar_ts: std::wstring.
Start with the following simple test:
#include <iostream>
#include <fstream>
#include <string>

int main() {
    std::string file_name = "D:/TEI/dictionary1.txt";
    std::wifstream in_file(file_name, std::ios::binary);
    if (!in_file.is_open()) {
        // "L" prefix indicates a wide string literal
        std::wcerr << L"file open failed\n";
        return 1;
    }
    std::wstring line1;
    std::getline(in_file, line1);
    std::wcout << L"line1 = " << line1 << L"\n";
}
Note how cout etc also become prefixed with w...
The standard ASCII character set contains 128 characters numbered 0 through 127. In ASCII, \n and \r are represented by the 7-bit values 10 and 13 respectively.
Your text file appears to be UTF-8 encoded. UTF-8 uses a variable number of 8-bit units per character: code points 0–127 take 1 byte, 128–2047 take 2 bytes, 2048–65535 take 3 bytes, and so on.
A byte with the highest bit (2^7) clear is a plain single-byte ASCII value. A lead byte beginning with the bits 110 starts a two-byte sequence, 1110 a three-byte sequence, and 11110 a four-byte sequence; each continuation byte begins with the bits 10 and carries six payload bits. So the byte sequence { 0xC4, 0x80 } represents (0b00100 << 6) | 0b000000, i.e. (wchar_t)256.
You can read and write UTF-8 through char streams and char storage, but only as opaque byte sequences. The moment you need to understand the values, you generally have to resort to wchar_t, uint16_t, uint32_t, etc.
If you are working with Microsoft's toolset (noting the "D:/" path), you may need to look into TCHAR (https://msdn.microsoft.com/en-us/library/c426s321.aspx)
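That said, since UTF-8 passes through char streams untouched as long as you treat it as opaque bytes (see above), the tagging task itself can be done without decoding anything. A minimal sketch, assuming the input path from the question and a made-up output path:
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream in_file("D:/TEI/dictionary1.txt", std::ios::binary);
    std::ofstream out_file("D:/TEI/dictionary1.xml", std::ios::binary); // hypothetical output name
    if (!in_file || !out_file) {
        std::cerr << "file open failed\n";
        return 1;
    }
    std::string line;
    int num = 1;
    while (std::getline(in_file, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();  // binary mode keeps the CR of Windows CR+LF line endings
        // The Syriac bytes in line are copied through untouched.
        out_file << "<entry n=\"" << num++ << "\">" << line << "</entry>\n";
    }
    return 0;
}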
Problem:
Split the binary I/O from the example code into two: one program that converts an ordinary text file into binary and one program that reads binary and converts into text. Test these programs by comparing a text file with what you get by converting it to binary and back.
Example code:
#include "std_lib_facilities.h"
int main(){
cout <<"Please enter input file name.\n";
string name;
cin >> name;
// open file to read, with no byte interpretation
ifstream ifs(name.c_str(), ios_base::binary);
if(!ifs) error("Can't open input file: ", name);
cout << "Please enter output file name.\n";
cin >> name;
// open file to write
ofstream ofs(name.c_str(), ios_base::binary);
if(!ofs) error("Can't open output file: ", name);
vector<int> v;
// read from binary file
int i;
while(ifs.read(as_bytes(i), sizeof(int))) v.push_back(i);
// do something with v
// write to binary file
for(int i = 0; i < v.size(); ++i) ofs.write(as_bytes(v[i]), sizeof(int));
return 0;
}
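For reference, as_bytes() from std_lib_facilities.h is essentially a one-line cast; a sketch of its usual definition:
// Treat any object's storage as a sequence of chars,
// for use with the read()/write() binary I/O members.
template<class T>
char* as_bytes(T& i) {
    void* addr = &i;                  // take the object's address
    return static_cast<char*>(addr);  // and view it as raw bytes
}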
Here is my code; instead of reading and writing int values, I tried it with strings:
#include "std_lib_facilities.h"
void textToBinary(string, string);
//--------------------------------------------------------------------------------
int main(){
const string info("This program converts text to binary files.\n");
cout << info;
const string testFile("test.txt");
const string binaryFile("binary.bin");
textToBinary(testFile, binaryFile);
getchar();
return 0;
}
//--------------------------------------------------------------------------------
void textToBinary(string ftest, string fbinary){
// open text file to read
ifstream ift(ftest);
if(!ift) error("Can't open input file: ", ftest);
// copy contents in vector
vector<string>textFile;
string line;
while (getline(ift,line)) textFile.push_back(line);
// open binary file to write
ofstream fb(fbinary, ios::binary);
if(!fb) error("Can't open output file: ", fbinary);
// convert text to binary, by writing the vector contents
for(size_t i = 0; i < textFile.size(); ++i){ fb.write(textFile[i].c_str(), textFile[i].length()); fb <<'\n';}
cout << "Conversion done!\n";
}
Note:
My text file contains Lorem Ipsum, no digits or special punctuation. After I write the text using binary mode, there is a perfect character interpretation and the source text file looks exactly like the destination. (My attention goes to the fact that when using binary mode and write(as_bytes(), sizeof()), the content of the text file is translated perfectly and there are no mistakes.)
Question:
What should the binary file look like after I use binary mode (no char interpretation) and write(as_bytes(), sizeof()) when writing?
In both Unix-land and Windows a file is primarily just a sequence of bytes.
With the Windows NTFS file system (which is default) you can have more than one sequence of bytes in the same file, but there is always one main sequence which is the one that ordinary tools see. To ordinary tools every file appears as just a single sequence of bytes.
Text mode and binary mode in C++ concern whether the basic i/o machinery should translate to and from an external convention. In Unix-land there is no difference. In Windows text mode translates newlines from internal single byte C convention (namely ASCII linefeed, '\n'), to external double byte Windows convention (namely ASCII carriage return '\r' + linefeed '\n'), and vice versa. Also, on input in Windows, encountering a single byte value 26, a "control Z", is or can be interpreted as end of file.
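A quick sketch that makes the difference visible on Windows (the file names are made up; on Unix the two files come out byte-identical):
#include <fstream>

int main() {
    std::ofstream text_out("text_mode.txt");                 // text mode is the default
    std::ofstream bin_out("bin_mode.txt", std::ios::binary); // no translation
    text_out << "a\n"; // on Windows this writes 3 bytes: 'a', CR, LF
    bin_out  << "a\n"; // this writes 2 bytes everywhere: 'a', LF
    return 0;
}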
Regarding the literal question,
“The question is in what format are they written in the binary file, shouldn't they be written in non-interpreted form, i.e. raw bytes?”
the text is written as raw bytes in both cases. The difference is only in how newlines are translated to the external convention for newlines. Since your text 1) doesn't contain any newlines, there's no difference. Edit: not shown in your code except by scrolling it sideways, there's a fb << '\n' that outputs a newline to the file opened in binary mode; if this produces the same bytes as in the original text file, then there is no effective translation, which implies you're not doing this on Windows.
About the extra streams for Windows files, they're used e.g. for Windows (file) Explorer's custom file properties, and they're accessible e.g. via a bug in the Windows command interpreter, like this:
C:\my\forums\so\0306>echo This is the main stream >x.txt
C:\my\forums\so\0306>dir | find "x"
04-Jul-15 08:36 PM 26 x.txt
C:\my\forums\so\0306>echo This is a second byte stream, he he >x.txt:2nd
C:\my\forums\so\0306>dir | find "x"
04-Jul-15 08:37 PM 26 x.txt
C:\my\forums\so\0306>type x.txt
This is the main stream
C:\my\forums\so\0306>type x.txt:2nd
The filename, directory name, or volume label syntax is incorrect.
C:\my\forums\so\0306>find /v "" <x.txt:2nd
This is a second byte stream, he he
C:\my\forums\so\0306>_
I just couldn't resist posting an example. :)
1) You state that “My text file contains Lorem Ipsum, no digits or special punctuation”, which indicates no newlines.
This is driving me insane. I'm a beginner/intermediate C++er and I need to do something that seems simple. I have a string with A LOT of hex characters in it. They were inputted from a txt file. The string looks like this
07FF3901FF030302FF3f0007FF3901FF030302FF3f00.... etc for a while
How can I easily write these hex values into a .dat file? Every time I try, it gets written as text, not as hex values. I already tried writing a for loop to insert "\x" before every byte, but it is still written as text.
Any help would be appreciated :)
Note: Obviously, if I can even do this, then I don't know that much about c++ so try not to use things WAY over my head. Or at least explain it a bit. Pweeeez:)
You should be clear about the difference between chars (ASCII text) and the hex values they represent.
Assume x.txt reads, as ASCII text: "FE"
In binary, x.txt is 0x4645 (0100 0110 0100 0101), since in ASCII 'F' = 0x46 and 'E' = 0x45.
Notice that everything in a computer is stored in binary code.
What you want in x.dat is the binary code 0xFE (1111 1110).
So, you should convert the ASCII text into the proper hex values and then write those into x.dat.
The sample code:
#include <iostream>
#include <cstdio>
using namespace std;

char s[] = "FE";
char r;

int cal(char c) // calculate the corresponding hex value of ASCII char c
{
    if (c <= '9' && c >= '0') return c - '0';
    if (c <= 'f' && c >= 'a') return c - 'a' + 10;
    if (c <= 'F' && c >= 'A') return c - 'A' + 10;
    return -1; // not a hex digit
}

void print2(char c) // print the binary code of char c
{
    for (int i = 7; i >= 0; i--)
        if ((1 << i) & c) cout << 1;
        else cout << 0;
}

int main()
{
    freopen("x.dat", "w", stdout); // everything written to stdout now goes into x.dat
    r = cal(s[0]) * 16 + cal(s[1]);
    //print2(r); // the binary code of r is "1111 1110"
    cout << r;  // open x.dat with any hex editor and you will see the single byte 0xFE
    freopen("CON", "w", stdout); // back to the console ("CON" is Windows-specific)
    cout << 1;  // you can see '1' on stdout again
}
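The same idea extended to the long string from the question, using std::stoi and an ofstream instead of freopen (the file name out.dat is made up):
#include <fstream>
#include <string>

int main() {
    // A shortened piece of the hex string from the question.
    std::string hex = "07FF3901FF030302FF3f00";
    std::ofstream out("out.dat", std::ios::binary);
    for (std::string::size_type i = 0; i + 1 < hex.size(); i += 2) {
        // stoi with base 16 turns "07" into 0x07, "FF" into 0xFF, and so on.
        out.put((char)std::stoi(hex.substr(i, 2), nullptr, 16));
    }
    return 0;
}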
I'm trying to load a certificate and a png file into a char* in C++:
char certPath[] = "./user.pem";
char dataPath[] = "./test.png";
char *certificate = loadFile(certPath);
char *datafile = loadFile(dataPath);
And this is my loadFile() method:
char* loadFile(char* filename) {
    cout << endl << "Loading file: " << filename << endl;
    char *contents;
    ifstream file(filename, ios::in | ios::binary | ios::ate);
    if (file.is_open())
    {
        int size = file.tellg();
        contents = new char[size];
        file.seekg(0, ios::beg);
        file.read(contents, size);
        file.clear();
        file.close();
    }
    printf("contents: %s\n", contents);
    cout << endl << "finished loading " << filename << endl;
    return contents;
}
This is the output which it produces:
Loading file: ./user.pem
contents: -----BEGIN CERTIFICATE-----
MIID+TCCAuGgAwIBAgIJAJhxZybSGGMgMA0GCSqGSIb3DQEBBQUAMIGSMQswCQYD
VQQGEwJBVDEPMA0GA1UECAwGU3R5cmlhMQ0wCwYDVQQHDARHcmF6MQowCAYDVQQK
DAEvMQowCAYDVQQLDAEvMR0wGwYDVQQDDBRDaHJpc3RvZiBTdHJvbWJlcmdlcjEs
MCoGCSqGSIb3DQEJARYdc3Ryb21iZXJnZXJAc3R1ZGVudC50dWdyYXouYXQwHhcN
MTIwMjE0MjEwMzA4WhcNMTMwMjEzMjEwMzA4WjCBkjELMAkGA1UEBhMCQVQxDzAN
BgNVBAgMBlN0eXJpYTENMAsGA1UEBwwER3JhejEKMAgGA1UECgwBLzEKMAgGA1UE
CwwBLzEdMBsGA1UEAwwUQ2hyaXN0b2YgU3Ryb21iZXJnZXIxLDAqBgkqhkiG9w0B
CQEWHXN0cm9tYmVyZ2VyQHN0dWRlbnQudHVncmF6LmF0MIIBIjANBgkqhkiG9w0B
AQEFAAOCAQ8AMIIBCgKCAQEA15ISaiXMSTVnmGtEF+bbhmVQk+4voU1pUZlOMVBj
QKjfPgCtgrmRaY8L+d6Pu61urFE1QrsfNJdDJRYs87Cc1eZgkvOXz0fSE2DHVNE2
i9YdFR8ea5niU5ATFZwiDIEhfCAcXWcEHWtZBB4yYYISsBkFxq6UBniGV+p7XOtE
aAtriBP0PZ4KUo+arJLStbwt4f9tBeytKowaKVNGlOpBgj7TG4bw8yA7Avdx8s+k
sReSxYteo0o9clIqISdKL0pRdzXP0Zrix54mBIfsxojfCW2SvqvLLLxtJlRKriQj
JfBc4koS6yAoktx7CvzcepGQk65ZGl0TNlteG4FJqy5yBQIDAQABo1AwTjAdBgNV
HQ4EFgQU1/g63xTix2Vs0zv2d3wVX9FGvVQwHwYDVR0jBBgwFoAU1/g63xTix2Vs
0zv2d3wVX9FGvVQwDAYDVR0TBAUwAwEB/zANBgkqhkiG9w0BAQUFAAOCAQEAHyvI
0L+ibesg45qUxx2OQb37HA9aRpR3wYpt6d5Rd1x2pfqumrKeV/42XWodZJSkU3sH
EX8V2xKwNoUBsPb/q54S9suCHwE33XtWjLvJyR9v2wd2HjNRYdGF9XoYdpsOpcAk
/kaZ2pExzLAPDg5pTsqY9dpCFWnyccZUO1CLEeljinOZ4raIj7d6EryWsn+u5pbs
WB12EFaoNCybQ6j5+TIcRs5xdGpVD6qMkm7HUnBn6mtz8Q7qVj9sqo5us4UBRWY8
ie9X494oW59nRuLiZ8dOPGuOXsuCILY44/3eyDh6yvW7G+wrp3eZ7L7eLRSI3+lm
mxqSJNq8Yi6ArfcB+Q==
-----END CERTIFICATE-----
finished loading ./user.pem
Loading file: ./test.png
First the content of the certificate should appear and then the content of the image. The certificate works but when I try to load the image it is really strange. Nothing works anymore. Even a simple cout or printf doesn't show up on the console but the program doesn't crash...
Any suggestions what's wrong?
Your error is that there are \0 bytes near the beginning of the PNG data, and printf("%s", ...) treats the first one as the end of the string.
EDIT:
Change:
printf("contents: %s\n", contents);
To:
std::cout.write( contents, size );
std::cout.flush();
You have to move size into the correct scope as well of course.
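Putting both fixes together, a sketch of a corrected loadFile (returning the size through an out-parameter is an assumption of this sketch):
#include <fstream>
#include <iostream>

char* loadFile(const char* filename, std::streamsize& size) {
    std::ifstream file(filename, std::ios::in | std::ios::binary | std::ios::ate);
    if (!file.is_open()) {
        size = 0;
        return nullptr;
    }
    size = file.tellg();             // ios::ate opened the file at the end, so this is its size
    char* contents = new char[size];
    file.seekg(0, std::ios::beg);
    file.read(contents, size);

    std::cout.write(contents, size); // length-based output: safe for '\0' bytes
    std::cout.flush();
    return contents;                 // caller owns the buffer (delete[] it)
}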
A PNG image is binary data, so it will contain non-printable characters, including \0 bytes. If so, it will not be printed correctly by any string print function, be it printf or std::cout <<.
However, you can print the hexadecimal values of non-printable character:
//write it inside the if-block
for(int i = 0; i < size; ++i)
    std::cout << std::hex << (int)(unsigned char)contents[i]; // cast via unsigned char so bytes above 0x7F don't sign-extend
It would print the hexadecimal value of each character.
You can test if a given character is printable or not, using isprint() function.
You can't print the contents of a PNG file to the console; it's a binary file, unlike a certificate file, which contains the certificate Base64-encoded and thus is a regular text file.
A printable file (i.e. text) contains only bytes representing printable ASCII characters (0x20 to 0x7E) and uses ASCII formatting characters (CR, LF, etc.) in a predictable way. Furthermore, it doesn't contain a 0x00 byte, which is used in C/C++ to mark the end of a string. A binary file may contain any byte in any order.
So, two things will happen when you try to print it: a) it'll stop at the first 0x00 byte found; b) every byte containing a non-ASCII character will be printed as a special char (if it's in the code page active for the console), or nothing at all, and bytes that contain ASCII formatting chars will be "executed" as if they were actual formatting in a text file.
The result: either you won't see anything at all or just a few strange chars mixed with random line feeds, tabs & etc.
To have what you expect, the first thing is to define exactly what it is. Do you want to see the PNG contents MIME-encoded? Then you should use a MIME-encoding routine (like this). Or do you want to print the hex value of each byte? Then you need to do std::cout << std::hex << byte (as Nawaz suggested) or printf("%02x", byte) for each byte in a loop.
Also, you should open the certificate file as a text file, not binary. Otherwise you'd get two undesired effects: no LF normalization (for instance, on Windows the EOL is marked by CR+LF, while on Unix/Linux it's just LF) and no handling of the EOF char.