I am working on a game and want to hide all my .tga files.
I concatenate the contents of all my files into a single file so that players cannot read them.
I want my program to load a picture by creating a temporary .tga file from
the saved content.
To do that, I am trying to recreate a .tga file from the content of the original one.
More precisely, I read a .tga file as text and then write it back out.
Even though Notepad++ reports the original file and the new file as identical, the new file cannot be opened as a .tga file. Windows reports file sizes that differ by 1 byte.
Can you explain what I am doing wrong?
Or maybe suggest a better way to hide my files?
Regards
More precisely, I read a .tga file as text and then write it back out
Herein may lie your problem: You have to read and write the .tga file as a binary file. Otherwise, any occurrence of the byte sequence 0x0D 0x0A (CR LF, Windows line ending) may be replaced with a single 0x0A (LF, Unix line ending) or vice versa, or 0x1A (DOS end of file) may be stripped or appended. Depending on the code you are using, you may also end up stripping any 0x00 (NUL) bytes.
I tried to read and write a .tga file as a binary file with my C++ program, but the generated file was still corrupted. The code is below.
std::string name = "my_picture.tga";
std::ifstream FileIn(name, std::ios_base::binary);
std::vector<char> listChar;
bool stopp = false;
if (FileIn) {
while (!(stopp))
{
char xin;
FileIn.read(reinterpret_cast<char*>(&xin), sizeof(char));
listChar.push_back(xin);
if (FileIn.eof()) stopp = true;
}
FileIn.close();
}
std::ofstream FileOut(".\\test.tga", std::ios_base::binary);
bool isCarierReturn = false;
for (char xout : listChar) {
isCarierReturn = xout == '\r';
if (!isCarierReturn) FileOut.write(reinterpret_cast<const char*>(&xout), sizeof(char));
}
FileOut.close();
I compared the original file and the new one in a hex viewer, and the files are indeed different.
The difference is a mismatch in line endings: where the original file has just 0x0A ('\n'), the new file has the byte sequence 0x0D 0x0A ('\r' and '\n'). For some other pictures the generated file was incomplete, and the break was always just before a 0x1A value (as @Christoph Lipka said).
I managed to write the right sequence by testing whether the char is a carriage return that is followed by a line feed; in that case the char is not written, so only the 0x0D byte is skipped. See below:
std::ofstream FileOut(".\\test.tga", std::ios_base::binary);
bool isCarrierReturn = false;
char xout_p1 = '\0';
if (listChar.size() >= 1) xout_p1 = listChar.at(0);
for (unsigned i(0); i < listChar.size(); i++) {
char xout = xout_p1;
if (i < listChar.size() - 1) xout_p1 = listChar.at(i + 1);
else xout_p1 = '\0';
isCarrierReturn = xout == '\r' && xout_p1 == '\n';
if (!isCarrierReturn) FileOut.write(reinterpret_cast<const char*>(&xout), sizeof(char));
}
FileOut.close();
The incomplete file reading is solved by reading the file as a binary file.
It works.
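For comparison, here is a minimal sketch of a byte-for-byte round trip that relies only on binary mode, with no special handling of '\r'. The function names `read_all_bytes` and `write_all_bytes` are made up for this illustration. Reading through stream-buffer iterators also avoids appending one extra char after the final failed read, which a loop that tests `eof()` only after `push_back` would do.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read a whole file into memory as raw bytes (binary mode: no newline
// translation, no special treatment of 0x1A or 0x00).
std::vector<char> read_all_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Write the bytes back unchanged; returns false if the stream reports an error.
bool write_all_bytes(const std::string& path, const std::vector<char>& bytes) {
    std::ofstream out(path, std::ios::binary);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

Something like `write_all_bytes("test.tga", read_all_bytes("my_picture.tga"))` should then produce a copy of the same size as the original.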
I am writing a program that inserts text into a file every time it is called. I don't want to rewrite the entire file, and I want the new text to be inserted on a new line. Here is my test code:
void writeFile()
{
FILE *pFile;
char* data = "hahaha";
int data_size = 7;
int count = 1;
pFile = fopen("textfile.bin","a+");
if (pFile!=NULL)
{
fwrite (data, data_size, count, pFile);
fclose (pFile);
}
}
The first time it was called, everything worked fine: a new file was created and the data was written successfully. But when I called it again, expecting the new data to be appended, I got weird strings in the file, something like: 慨慨慨栀桡桡a.
I am not really familiar with C++ I/O functions. Can someone tell me what I did wrong? Also, any suggestion for appending text to the next line?
I think you are running into a code set issue, and the program you're using to look at the file you write expects to find UTF-16 data in the file.
I base this on an analysis of the string you quote:
慨慨慨栀桡桡a
When that (UTF-8) data is converted to Unicode values, I get:
0xE6 0x85 0xA8 = U+6168
0xE6 0x85 0xA8 = U+6168
0xE6 0x85 0xA8 = U+6168
0xE6 0xA0 0x80 = U+6800
0xE6 0xA1 0xA1 = U+6861
0xE6 0xA1 0xA1 = U+6861
0x61 = U+0061
0x0A = U+000A
The Unicode value U+6168 is represented in little-endian as the bytes 0x68 0x61, and the ASCII code for h is 104 (0x68) and for a is 97 (0x61). So, the data is probably written correctly, but the interpretation of the data that is written is incorrect.
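The pairing can be checked with a short sketch. It assumes the viewer treats the file as little-endian UTF-16; the helper name `as_utf16le_units` is made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Pair raw bytes as little-endian 16-bit code units, the way a viewer
// expecting UTF-16LE would. A trailing odd byte is ignored.
std::vector<std::uint16_t> as_utf16le_units(const std::string& bytes) {
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const auto lo = static_cast<unsigned char>(bytes[i]);
        const auto hi = static_cast<unsigned char>(bytes[i + 1]);
        units.push_back(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return units;
}
```

Two calls of writeFile() leave the 14 bytes "hahaha\0hahaha\0" in the file. Paired this way they become 0x6168, 0x6168, 0x6168, 0x6800, 0x6861, 0x6861, 0x0061, which are the code points listed above.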
As I noted in a comment:
If you want lines in the file, you'll need to put them there (by adding newlines to the data that is written), because fwrite() won't output any newlines unless they are in the data it is given to write. You have written a null byte to the file (because you used data_size = 7), which means the file is not really a text file (text files don't contain null bytes). What happens next depends on the code set you're using.
The trailing single-byte codes in the output appear because the second null byte isn't visible in what's pasted on this page, and the trailing U+000A was added by the echo in the command line I used for the analysis (where utf8-unicode is a program I wrote):
echo "慨慨慨栀桡桡a" | utf8-unicode
Change your code to this:
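const char* data = "hahaha\n"; // the newline ends each entry; strlen() below excludes the terminating null (needs <cstring>)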
pFile = fopen("textfile.bin","a+");
if (pFile!=NULL)
{
fwrite (data, sizeof(char), strlen(data), pFile);
fclose (pFile);
}
When reading and parsing a line of a CSV file, I need to handle the nul character that appears as the value of some row fields. This is complicated by the fact that the CSV file is sometimes in windows-1250 encoding, sometimes in UTF-8, and sometimes in UTF-16. Because of this, I started down one path and only later ran into the nul char problem -- see below.
Details: I need to clean CSV files from a third party into the form common to our data extractor (that is, the utility works as a filter, converting one CSV form into another).
My initial approach was to open the CSV file in binary mode and check whether the first bytes form a BOM. I know all the given Unicode files start with a BOM. If there is no BOM, I know that the file is in windows-1250 encoding.
The converted CSV file should use the windows-1250 encoding. So, after checking the input file, I open it using the related mode, like this:
// Open the file in binary mode first to see whether BOM is there or not.
FILE * fh{ nullptr };
errno_t err = fopen_s(&fh, fnameIn.string().c_str(), "rb"); // const fs::path & fnameIn
assert(err == 0);
vector<unsigned char> buf(4, 0); // unsigned: a plain (signed) char never compares equal to 0xEF, 0xFE, ...
fread(&buf[0], 1, 3, fh);
::fclose(fh);
// Set the isUnicode flag and open the file according to that.
string mode{ "r" }; // init
bool isUnicode = false; // pessimistic init
if (buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) // UTF-8 BOM
{
mode += ", ccs=UTF-8";
isUnicode = true;
}
else if ((buf[0] == 0xFE && buf[1] == 0xFF) // UTF-16 BE BOM
|| (buf[0] == 0xFF && buf[1] == 0xFE)) // UTF-16 LE BOM
{
mode += ", ccs=UNICODE";
isUnicode = true;
}
// Open in the suitable mode.
err = fopen_s(&fh, fnameIn.string().c_str(), mode.c_str());
assert(err == 0);
After the successful open, each input line is read either via fgets or via fgetws, depending on whether Unicode was detected. The idea was then to convert the buffer content from Unicode to 1250 if Unicode was detected earlier, or to leave the buffer in 1250 otherwise. The s variable should contain the string in the windows-1250 encoding. ATL::CW2A(buf, 1250) is used when conversion is needed:
const int bufsize = 4096;
wchar_t buf[bufsize];
// Read the line from the input according to the isUnicode flag.
while (isUnicode ? (fgetws(buf, bufsize, fh) != NULL)
: (fgets(reinterpret_cast<char*>(buf), bufsize, fh) != NULL))
{
// If the input is in Unicode, convert the buffer content
// to the string in cp1250. Otherwise, do not touch it.
string s;
if (isUnicode) s = ATL::CW2A(buf, 1250);
else s = reinterpret_cast<char*>(buf);
...
// Now processing the characters of the `s` to form the output file
}
It worked fine... until a file appeared with a nul character used as a value in a row. The problem is that when the s variable is assigned, the nul cuts off the rest of the line. In the observed case, it happened with a file that used the 1250 encoding, but it can probably also happen in the UTF-encoded files.
How to solve the problem?
The NUL character problem is solved by using either C++ or Windows functions. In this case, the easiest solution is MultiByteToWideChar which will accept an explicit string length, precisely so it doesn't stop on NUL.
Consider the following code
FILE * pOutFile;
unsigned char uid;
pOutFile = fopen("OutFile.bin","w") ; // open a file to write
uid = 0x0A;
fprintf (pOutFile,"%c",uid); // Trying to print 0x0A in to the file
But what I get in the file is
0x0D 0x0A
Where is this 0x0D coming from? Am I missing something? What must I do to prevent this?
Corrected: uidl was a typo.
Windows text files want new lines to be represented by two consecutive chars: 0x0D and 0x0A.
In C, a new line is represented by a single char: 0x0A.
Thus, on Windows, in C, you have two ways to open a file: text mode or binary mode.
In binary mode, when you write a LineFeed (0x0A) char, a single byte (0x0A) is appended to the file.
In text mode, whenever you write a LineFeed (0x0A) char, two bytes (0x0D and 0x0A) are appended to the file.
The solution is to open the file in binary mode, using "wb".
Because you have opened the file in "w" mode it is in TEXT mode, which means \n's (aka 0x0a) are translated into \r\n (carriage return and line feed).
If you only want 0x0a written to the file open it in binary mode ("wb").
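One way to see the effect is to write a single byte and then measure the file. The sketch below uses a hypothetical helper, `write_byte_and_measure`, which is not part of the original code.

```cpp
#include <cstdio>

// Write a single byte to `path` using the given fopen mode and return the
// resulting file size in bytes, or -1 on failure.
long write_byte_and_measure(const char* path, const char* mode, unsigned char value) {
    std::FILE* f = std::fopen(path, mode);
    if (f == nullptr) return -1;
    std::fputc(value, f);
    std::fclose(f);

    f = std::fopen(path, "rb");  // reopen in binary mode to count raw bytes
    if (f == nullptr) return -1;
    std::fseek(f, 0, SEEK_END);
    const long size = std::ftell(f);
    std::fclose(f);
    return size;
}
```

With mode "wb" the size should be 1 for the value 0x0A. With mode "w" on Windows it should be 2, as described above, while on POSIX systems both modes give 1.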
Actually, none of those are the issue...
FILE * pOutFile;
unsigned char uid;
pOutFile = fopen("OutFile.bin","wb") ; // open a file to write (in binary, not text mode)
uid = 0x0A; //changed from uidl = 0x0A (which didnt set uid)
fprintf (pOutFile, "%c", uid); // Trying to print 0x0A in to the file
What I changed: you were setting uidl and NOT uid, which is what you printed.
You could always do the following:
fprintf(pOutFile, "%c", 0x0A); or
fprintf(pOutFile, "%c", '\n'); or
fprintf(pOutFile, "\n");
if you wanted (the last option is probably your best).
I also opened your file in wb mode.
I need to read bytes from a JPG file in C++, so I wrote this code:
ifstream in("1.jpg", ios::binary);
while(!in.eof()){
char ch = in.get();
}
As you know, a JPG file can contain 256 different byte values, so the number of times each one occurs can be stored in an array. The problem is that the code I wrote seems to read the chars as Unicode, so I end up with 9256 different chars. How can I read 1.jpg without it being treated as Unicode?
The get function reads unformatted data from the file; it just returns the char it read as an int. Are you seeing data read from the file that differs from the actual data in the file? If you are, there could be a problem elsewhere in the code, and you should provide more of it.
Alternatively you could read chunks of unformatted data using read.
int main()
{
std::ifstream in("1.jpg", std::ios::binary);
char buffer[1024];
while (in)
{
in.read(buffer, sizeof(buffer));
if (in.gcount() > 0)
{
// in.gcount() bytes were read into buffer on this pass;
// process them here.
}
}
}
I have a binary .dat file that I want to convert into an ASCII (.txt) file using C++, but I am very new to C++ programming. I have just opened my two files, myBinaryfile and myTxtFile, but I don't know how to read data from the .dat file or how to write that data into the new .txt file. I want to write C++ code that takes a binary .dat file as input and converts it to ASCII text in an output file. If this is possible, please help me write this code. Thanks.
Sorry for asking the same question again, but I still haven't solved my problem, so I will explain it more clearly. I have a text file called “A.txt” that I want to convert into a binary file (B.dat), and I also want the reverse process. Two questions:
1. how to convert “A.txt” into “B.dat” in c++
2. how to convert “B.dat” into “C.txt” in c++ (need convert result of the 1st output again into new ascii file)
My text file looks like this (no header):
1st line: 1234.123 543.213 67543.210 1234.67 12.000
2nd line: 4234.423 843.200 60543.232 5634.60 72.012
It has more than 1000 lines in the same style (5 columns per line).
Since I have no experience in C++, I am struggling here and need your help. Many thanks.
All files are just a stream of bytes. You can open files in binary mode or in text mode. The latter simply means that the file may get extra newline handling.
If you want your text file to contain only safe human readable characters you could do something like base64 encode your binary data before saving it in the text file.
Very easy:
1. Create the target or destination file (a.k.a. open).
2. Open the source file in binary mode, which prevents the OS from translating the content.
3. Read an octet (byte) from the source file; unsigned char is a good variable type for this.
4. Write the octet to the destination using your favorite conversion: hex, decimal, etc.
5. Repeat at 3 until the read fails.
6. Close all files.
Research these keywords: ifstream, ofstream, hex modifier, dec modifier, istream::read, ostream::write.
There are utilities and applications that already perform this operation. On the *nix and Cygwin side, try od (octal dump) and redirect the output to a file.
There is also the debug utility on MS-DOS systems.
A popular format is:
AAAAAA bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb cccccccccccccccc
where:
AAAAAA -- Offset from beginning of file in hexadecimal or decimal.
bb -- Hex value of byte using ASCII text.
c -- Character representation of byte, '.' if the value is not printable.
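As a rough sketch of this layout, one line of such a dump could be built as follows. It assumes 16 bytes per line and a hexadecimal offset. The function name `format_dump_line` is made up for this illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <string>

// Format up to 16 bytes as one dump line: "AAAAAA bb bb ... cccc".
std::string format_dump_line(std::size_t offset, const unsigned char* data, std::size_t count) {
    char field[32];
    std::snprintf(field, sizeof(field), "%06zX", offset);
    std::string line(field);
    std::string chars;
    for (std::size_t i = 0; i < 16; ++i) {
        if (i < count) {
            std::snprintf(field, sizeof(field), " %02X", static_cast<unsigned>(data[i]));
            line += field;
            // Printable bytes are shown as themselves, everything else as '.'.
            chars += std::isprint(data[i]) ? static_cast<char>(data[i]) : '.';
        } else {
            line += "   ";  // pad short final lines so the text column aligns
        }
    }
    return line + " " + chars;
}
```

Calling it once per 16-byte block, with the running offset, gives the whole dump.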
Please edit your post to provide more details, including an example layout for the target file.
Edit:
A complex example (not tested):
#include <iostream>
#include <fstream>
#include <cstdio>
#include <cstdlib>
using namespace std;
const unsigned int READ_BUFFER_SIZE = 1024 * 1024;
const unsigned int WRITE_BUFFER_SIZE = 2 * READ_BUFFER_SIZE;
unsigned char read_buffer[READ_BUFFER_SIZE];
unsigned char write_buffer[WRITE_BUFFER_SIZE];
int main(void)
{
int program_status = EXIT_FAILURE;
static const char hex_chars[] = "0123456789ABCDEF";
do
{
ifstream srce_file("binary.dat", ios::binary);
if (!srce_file)
{
cerr << "Error opening input file." << endl;
break;
}
ofstream dest_file("binary.txt");
if (!dest_file)
{
cerr << "Error creating output file." << endl;
break; // do not continue without an output file
}
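// Read a block; keep looping while any bytes arrived, so that the
// final, partially filled block is converted too.
while (srce_file.read(reinterpret_cast<char*>(&read_buffer[0]), READ_BUFFER_SIZE), srce_file.gcount() > 0)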
{
// Get the number of bytes actually read.
const unsigned int bytes_read = srce_file.gcount();
// Define the index and byte variables outside
// of the loop to maybe save some execution time.
unsigned int i = 0;
unsigned char byte = 0;
// For each byte that was read:
for (i = 0; i < bytes_read; ++i)
{
// Get source, binary value.
byte = read_buffer[i];
// Convert the Most Significant nibble to an
// ASCII character using a lookup table.
// Write the character into the output buffer.
write_buffer[i * 2 + 0] = hex_chars[(byte >> 4)]; // high nibble: shift by 4 bits, not 8
// Convert the Least Significant nibble to an
// ASCII character and put into output buffer.
write_buffer[i * 2 + 1] = hex_chars[byte & 0x0f];
}
// Write the output buffer to the output, text, file.
dest_file.write(reinterpret_cast<const char*>(&write_buffer[0]), 2 * bytes_read);
// Flush the contents of the stream buffer as a precaution.
dest_file.flush();
}
dest_file.flush();
dest_file.close();
srce_file.close();
program_status = EXIT_SUCCESS;
} while (false);
return program_status;
}
The above program reads 1MB chunks from the binary file, converts to ASCII hex into an output buffer, then writes the chunk to the text file.
I think you are misunderstanding something: the difference between a binary file and a text file lies in the interpretation of the contents.
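To illustrate with the five-column data from the question, here is a sketch that stores each number as the raw bytes of a double and reads them back. It assumes the values fit in a double and that the .dat file is read on the same kind of machine that wrote it. The function names are made up, and the output always uses three decimals, so 1234.67 comes back as 1234.670.

```cpp
#include <cstddef>
#include <iomanip>
#include <istream>
#include <ostream>

// Text -> binary: parse whitespace-separated numbers and store each one as
// the raw bytes of a double. Returns how many values were written.
std::size_t text_to_binary(std::istream& text_in, std::ostream& bin_out) {
    std::size_t count = 0;
    double value = 0.0;
    while (text_in >> value) {
        bin_out.write(reinterpret_cast<const char*>(&value), sizeof value);
        ++count;
    }
    return count;
}

// Binary -> text: read raw doubles back and print `columns` values per line
// with three decimals. Returns how many values were read.
std::size_t binary_to_text(std::istream& bin_in, std::ostream& text_out, std::size_t columns = 5) {
    std::size_t count = 0;
    double value = 0.0;
    text_out << std::fixed << std::setprecision(3);
    while (bin_in.read(reinterpret_cast<char*>(&value), sizeof value)) {
        ++count;
        text_out << value << (count % columns == 0 ? '\n' : ' ');
    }
    return count;
}
```

For real files, "A.txt" would be opened as a plain `std::ifstream`, and "B.dat" with `std::ios::binary` for both writing and reading.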