Ofstream adds extra characters to my output

I'm making an enciphering/deciphering program using the XTEA algorithm. The encipher/decipher functions work fine, but when I encipher a file and then decipher it, I get some extra characters at the end of the file:
--- Original file ---
QwertY
--- Encrypted file ---
»¦æŸS#±
--- Deciphered from encrypted ---
QwertY ß*tÞÇ
I have no idea why the " ß*tÞÇ" appears at the end.
I will post some of my code, but not all of it, since it would be too long. The encipher/decipher function takes 64 bits of data and a 128-bit key, and enciphers/deciphers the data to the same block size, which is again 64 bits (similar functions here). The result can then be written to a new file.
long data[2]; // 64bits
ZeroMemory(data, sizeof(long)*2);
char password[16];
ZeroMemory(password, sizeof(char)*16);
long *key;
if(argc > 1)
{
string originalpath = argv[1];
string finalpath;
string eextension = "XTEA";
string extension = GetFileExtension(originalpath);
bool encipherfile = 1;
if(extension.compare(eextension) == 0) // If extensions are equal, dont encipher file
{
encipherfile = 0;
finalpath = originalpath;
finalpath.erase(finalpath.length()-5, finalpath.length());
}
ifstream in(originalpath, ios::binary);
ofstream out(finalpath, ios::binary);
cout << "Password:" << endl;
cin.get(password,sizeof(password));
key = reinterpret_cast<long *>(password);
while(!in.eof())
{
ZeroMemory(data, sizeof(long)*2);
in.read(reinterpret_cast<char*>(&data), sizeof(long)*2); // Read 64bits from file
if(encipherfile == 1)
{
encipher(data, key);
out.write(reinterpret_cast<char*>(&data), sizeof(data));
continue;
}
if(encipherfile == 0)
{
decipher(data, key);
out.write(reinterpret_cast<char*>(&data), sizeof(data));
}
}

Check for eof immediately after your read, and if you get eof break out of the loop.
If you may have partial reads (i.e. it is possible to read fewer than all of the requested bytes), then you need also to call gcount to find out how many bytes you actually read, thus:
cin.read( ... );
if( cin.eof() )
{
streamsize bytesRead = cin.gcount();
if( bytesRead > 0 )
{
// process those bytes
}
break;
}

C++ Text File Input [closed]

This is a relatively simple question but I can't seem to find an answer. I need to read every character from a text file excluding spaces.
I currently have:
fstream inFile(fileName, ios::in);
char ch;
while (!inFile.eof()){
ch = inFile.get();
This is working for all letters and numbers but not special characters. What's an alternative I can use to read everything but spaces?

Assuming the file is ASCII and contains no NUL characters, the following method could be used.
size_t ReadAllChars(char const* fileName, char **ppDestination)
{
//Check inputs
if(!fileName || !ppDestination)
{
//Handle errors;
return 0;
}
//open file for reading
FILE *pFile = fopen(fileName, "rb");
//check file successfully opened
if(!pFile)
{
//Handle error
return 0;
}
//Seek to end of file (to get file length)
if(_fseeki64(pFile, 0, SEEK_END))
{
//Handle error
fclose(pFile);
return 0;
}
//Get file length (_ftelli64 returns -1 on failure)
__int64 rawLength = _ftelli64(pFile);
if(rawLength < 0)
{
//Handle error
fclose(pFile);
return 0;
}
size_t fileLength = (size_t)rawLength;
//Seek back to start of file
if(_fseeki64(pFile, 0, SEEK_SET))
{
//Handle error
fclose(pFile);
return 0;
}
//Allocate memory to store entire contents of file
char *pRawSource = (char*)malloc(fileLength);
//Check that allocation succeeded
if(!pRawSource)
{
//Handle error
fclose(pFile);
return 0;
}
//Read entire file
if(fread(pRawSource, 1, fileLength, pFile) != fileLength)
{
//Handle error
fclose(pFile);
free(pRawSource);
return 0;
}
//Close file
fclose(pFile);
//count spaces
size_t spaceCount = 0;
for(size_t i = 0; i < fileLength; i++)
{
if(pRawSource[i] == ' ')
++spaceCount;
}
//allocate space for file contents not including spaces (plus a null terminator)
size_t resultLength = fileLength - spaceCount;
char *pResult = (char*)malloc(resultLength + 1);
//Check allocation succeeded
if(!pResult)
{
//Handle error
free(pRawSource);
return 0;
}
//Null terminate result
pResult[resultLength] = '\0';
//copy all characters except space into pResult
char *pNextTarget = pResult;
for(size_t i = 0; i < fileLength; i++)
{
if(pRawSource[i] != ' ')
{
*pNextTarget = pRawSource[i];
++pNextTarget;
}
}
//Free temporary buffer
free(pRawSource);
*ppDestination = pResult;
return resultLength;
}

You should open the file in binary mode.

One of the simpler approaches is to check the ASCII value of each character as you iterate.
If the ASCII value of the character is 0x20 (decimal 32, the code for SPACE), skip it with "continue"; otherwise just print it.

Assuming you are using the default C++ locale, you could read the input into a std::string and let std::istream& operator>>(std::istream&, std::string&) together with std::skipws do the work for you. Note that this skips all whitespace, not only spaces.
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <utility>
int main(int, char* argv[])
{
const char *filename = /* filename */;
std::ifstream in{filename};
if (in.fail()) {
std::cerr << "Fails to open " << filename << std::endl;
return 1;
}
/*
* Actually, you can skip this line, because the default behavior of
* std::fstream and other stream is to skip all the white space before input.
*/
in >> std::skipws;
std::vector<std::string> stringv;
// reserve to speed up, you can replace the new_cap with your guess
stringv.reserve(10);
std::string str;
/*
* while std::skipws tells the stream to skip all the white space before input,
* std::ifstream& operator >> (std::ifstream&, std::string&) will stop when a space is read.
*/
while(in >> str)
stringv.push_back(std::move(str));
}
Edit:
I haven't tested this program yet, so there might be some compilation errors, but I am fairly sure that this method works.
Using !in.eof() tests whether the end of the file has been reached, but it doesn't test whether the extraction succeeded, which means you can end up processing invalid data. in >> str fixes this, because after the extraction the value of !in.fail() indicates whether the extraction from the stream succeeded.
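Here is a small sketch of the difference between the two loop styles, using std::istringstream in place of a file so it is easy to try. Both function names are made up for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Anti-pattern: eof() is tested before the read, so the result of a failed
// final extraction is still pushed into the vector.
std::vector<std::string> readWithEofLoop(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> words;
    while (!in.eof())
    {
        std::string word;
        in >> word;             // may fail and leave `word` empty
        words.push_back(word);  // pushed whether or not the read succeeded
    }
    return words;
}

// Preferred: the extraction itself is the loop condition.
std::vector<std::string> readWithExtractionLoop(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string word;
    while (in >> word)
        words.push_back(word);
    return words;
}
```

For input that ends in a newline, such as "one two\n", the first function returns three entries (the last one empty) and the second returns two. The extra entry appears because eof is only set once a read actually runs into the end of the input.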

Cannot Read Binary files in byte mode in C++

I am trying to read a binary file's data. Sadly, opening files in C++ is a lot different than in Python for this, since Python has a byte mode and it seems C++ does not.
for (auto p = directory_iterator(path); p != directory_iterator(); p++) {
if (!is_directory(p->path()))
byte tmpdata;
std::ifstream tmpreader;
tmpreader.open(desfile, std::ios_base::binary);
int currentByte = tmpreader.get();
while (currentByte >= 0)
{
//std::cout << "Does this get Called?" << std::endl;
int currentByte = tmpreader.get();
tmpdata = currentByte;
}
tmpreader.close()
}
else
{
continue;
}
Basically, I want a clone of Python's way of opening a file in 'rb' mode, to get the actual byte data of the entire contents (which is not readable, as it has nonprintable chars, even for C++). Most of it probably can't be converted to signed chars, because it contains zlib-compressed data that I need to feed into my DLL to decompress.
I do know that in Python I can do something like this:
file_object = open('[file here]', 'rb')
It turns out that replacing the C++ code above with the code below helps. However, fopen is marked as deprecated, but I don't care.
What the code above did not do was work, because I was not reading from the buffer data. I realized later that fopen, fseek, fread, and fclose were the functions I needed for read-bytes mode ('rb').
for (auto p = directory_iterator(path); p != directory_iterator(); p++) {
if (!is_directory(p->path()))
{
std::string desfile = p->path().filename().string();
byte tmpdata;
unsigned char* data2;
FILE *fp = fopen("data.d", "rb");
fseek(fp, 0, SEEK_END); // GO TO END OF FILE
size_t size = ftell(fp);
fseek(fp, 0, SEEK_SET); // GO BACK TO START
data2 = new unsigned char[size];
tmpdata = fread(data2, 1, size, fp);
fclose(fp);
}
else
{
continue;
}

int currentByte = tmpreader.get();
while (currentByte >= 0)
{
//std::cout << "Does this get Called?" << std::endl;
int currentByte = tmpreader.get();
//^ here!
You are declaring a second variable hiding the outer one. However, this inner one is only valid within the while loop's body, so the while condition checks the outer variable which is not modified any more. Rather do it this way:
int currentByte;
while ((currentByte = tmpreader.get()) >= 0)
{
// use currentByte here; the loop ends once get() returns EOF (a negative value)
}
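If the goal is simply the equivalent of Python's open(path, 'rb').read(), here is a minimal sketch that loads a whole file into a byte vector. The function name readAllBytes is made up for illustration, and returning an empty vector when the file cannot be opened is an assumption made for brevity.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads the whole file at `path` as raw bytes, with no newline translation.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> readAllBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    std::vector<unsigned char> bytes;
    if (!in)
        return bytes;
    // istreambuf_iterator reads unformatted characters, so no byte is skipped.
    bytes.assign(std::istreambuf_iterator<char>(in),
                 std::istreambuf_iterator<char>());
    return bytes;
}
```

The vector's data() and size() can then be handed to a decompression routine that expects a pointer and a length.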

Tellg and seekg and file read not working

I am trying to read 64000 bytes at a time from a file in binary mode into a buffer, until the end of the file. My problem is tellg() returns position in hexadecimal value. How do I make it return a decimal value?
My if conditions are not working: it is reading more than 64000 bytes, and when I am relocating my pos and size_stream (size_stream = size_stream - 63999; pos = pos + 63999;), it is pointing to wrong positions each time.
How do I read 64000 bytes from file into buffer in binary mode at once till the end of file?
Any help would be appreciated
std::fstream fin(file, std::ios::in | std::ios::binary | std::ios::ate);
if (fin.good())
{
fin.seekg(0, fin.end);
int size_stream = (unsigned int)fin.tellg(); fin.seekg(0, fin.beg);
int pos = (unsigned int)fin.tellg();
//........................<sending the file in blocks
while (true)
{
if (size_stream > 64000)
{
fin.read(buf, 63999);
buf[64000] = '\0';
CString strText(buf);
SendFileContent(userKey,
(LPCTSTR)strText);
size_stream = size_stream - 63999;
pos = pos + 63999;
fin.seekg(pos, std::ios::beg);
}
else
{
fin.read(buf, size_stream);
buf[size_stream] = '\0';
CString strText(buf);
SendFileContent(userKey,
(LPCTSTR)strText); break;
}
}

My problem is tellg() returns position in hexadecimal value
No, it doesn't. It returns an integer value. You can display the value in hex, but it is not returned in hex.
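To illustrate, here is a small sketch in which the same position is printed in both bases. The function names streamSize and formatPosition are made up for illustration.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Returns the total size of a seekable stream, restoring the read position.
std::streamoff streamSize(std::istream& in)
{
    const std::istream::pos_type start = in.tellg();
    in.seekg(0, std::ios::end);
    const std::streamoff size = static_cast<std::streamoff>(in.tellg());
    in.seekg(start);
    return size;
}

// Formats one integer value in decimal and in hexadecimal.
// The base is a property of the output formatting, not of the value.
std::string formatPosition(long long pos)
{
    std::ostringstream text;
    text << std::dec << pos << " (0x" << std::hex << pos << ")";
    return text.str();
}
```

For a 64000-byte stream, formatPosition(streamSize(in)) yields "64000 (0xfa00)". If a debugger shows the value as 0xfa00, that is only its display setting.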
when I am relocating my pos and size_stream(size_stream = size_stream - 63999; pos = pos + 63999;), it is pointing to wrong positions each time.
You shouldn't be seeking in the first place. After performing a read, leave the file position where it is. The next read will pick up where the previous read left off.
How do I read 64000 bytes from file into buffer in binary mode at once till the end of file?
Do something more like this instead:
std::ifstream fin(file, std::ios::binary);
if (fin)
{
unsigned char buf[64000];
std::streamsize numRead;
do
{
numRead = fin.readsome(reinterpret_cast<char*>(buf), sizeof(buf));
if ((!fin) || (numRead < 1)) break;
// DO NOT send binary data using `LPTSTR` string conversions.
// Binary data needs to be sent *as-is* instead.
//
SendFileContent(userKey, buf, numRead);
}
while (true);
}
Or this:
std::ifstream fin(file, std::ios::binary);
if (fin)
{
unsigned char buf[64000];
std::streamsize numRead;
do
{
if (!fin.read(reinterpret_cast<char*>(buf), sizeof(buf)))
{
if (!fin.eof()) break;
}
numRead = fin.gcount();
if (numRead < 1) break;
// DO NOT send binary data using `LPTSTR` string conversions.
// Binary data needs to be sent *as-is* instead.
//
SendFileContent(userKey, buf, numRead);
}
while (true);
}

FTP client -- CR characters in gif is removed after get

I'm creating an FTP client.
I'm getting a gif from the server, but after that the gif is corrupted.
When I change the file extension to look at the diff, I see that the
CR/LF characters are gone.
How could this be? I made sure to use image mode.
Here's my read code in TCP socket.
string TCPSocket::long_read()
{
pollfd ufds;
ufds.fd = sd;
ufds.events = POLLIN;
ufds.revents = 0;
ssize_t bytesRead = 0;
string result;
char* buf = new char[LONGBUFLEN];
do {
bzero(buf, LONGBUFLEN);
bytesRead = ::read(sd, buf, LONGBUFLEN);
if (bytesRead == 0) {
break;
}
if (bytesRead > 0) {
result = result + string(buf, bytesRead);
}
} while (poll(&ufds, 1, 1000) > 0);
return result;
}
Here's my get code in main.cpp
else if (command == command::GET) {
string filename;
cin >> filename;
string dataHost;
int dataPort;
if (enterPassiveMode(dataHost, dataPort)) {
dataSocket = new TCPSocket(dataHost.c_str(), dataPort);
if (fork() == 0) {
string result = dataSocket->long_read();
size_t length = result.size();
char* resultArr = new char[length];
memcpy(resultArr, result.data(), length);
// mode_t mode = S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
FILE* file = fopen(filename.c_str(), "w+b");
if (file) {
fwrite(resultArr, length, 1, file);
fclose(file);
}
else {
cout << "open failed";
}
break;
}
else {
writeAndImmediateRead(rfc959::TYPE_I);
controlSocket->write(rfc959::RETRIVE(filename));
string result = controlSocket->read();
cout << result;
int reply = Parser::firstDigit(result);
// I'll remove incomplete local file if request fails
if (reply != rfc959::POSITIVE_PRELIMINARY_REPLY) {
remove(filename.c_str());
continue;
}
wait(NULL);
cout << controlSocket->long_read();
}
}
}
EDIT
I did make sure to use Binary mode. And when I transferred a text file(though of a smaller size), it doesn't have this problem. Here's the output:
EDIT 2
Output from Wireshark showing Request: TYPE I and Response: Opening BINARY mode

By default, FTP servers and clients perform data transfers in "ASCII mode", which means that any CRLF sequence is translated on-the-fly to the host's native line ending (e.g. just a bare LF on Unix machines). This behavior is mandated by RFC 959; see Section 3.1.1.1.
To transfer your data as binary, and avoid the ASCII mode translation, your FTP client will want to send the TYPE command first, e.g.:
TYPE I
Your .gif file should then be transferred as is, with no replacements/transformations on any CRLF sequences.
Hope this helps!
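To see why ASCII mode corrupts a binary file, here is a simplified illustration of the translation a Unix receiver applies to incoming data. It is not taken from any real FTP client, and the function name asciiModeToUnix is made up.

```cpp
#include <cstddef>
#include <string>

// Illustration only: mimics what an ASCII-mode (TYPE A) transfer does to
// incoming data on a Unix host, where each CRLF pair becomes a bare LF.
std::string asciiModeToUnix(const std::string& wire)
{
    std::string local;
    for (std::size_t i = 0; i < wire.size(); ++i)
    {
        if (wire[i] == '\r' && i + 1 < wire.size() && wire[i + 1] == '\n')
            continue;  // drop the CR of a CRLF pair
        local.push_back(wire[i]);
    }
    return local;
}
```

Any 0x0D 0x0A byte pair that happens to occur inside image data loses its 0x0D, so the file comes out shorter and corrupted. With TYPE I in effect, no such translation should take place.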

RLE: encode by two symbols

I've created an RLE encoding function, which encodes sequences like "A1A1B7B7B7B7" to strings like "#A12#B74".
void encode(const char *input_path, const char *output_path)
{ // Begin of SBDLib::SBIMask::encode
std::fstream input(input_path, std::ios_base::in | std::ios_base::binary);
std::fstream output(output_path, std::ios_base::out | std::ios_base::binary);
int size = 0; // Set size variable
input.seekg(0, std::ios::end); // Move to EOF
size = input.tellg(); // Tell position
input.seekg(0); // Move to the beginning
int i = 1; // Create encoding counter
int counter = 0; // Create color counter
int cbyte1, cbyte2; // Create current color bytes
int pbyte1 = 0x0; int pbyte2 = 0x0; // Create previous color bytes
while (((cbyte1 = input.get()) != EOF && (cbyte2 = input.get()) != EOF)
|| input.tellg() >= size)
{ // Begin of while
// If current bytes are not equal to previous bytes
// or cursor is at the end of the input file, write
// binary data to file; don't do it if previous bytes
// were not set from 0x0 to any other integer.
if (((cbyte1 != pbyte1 || cbyte2 != pbyte2)
|| (input.tellg() == size))
&& (pbyte1 != 0x0 && pbyte2 != 0x0))
{ // Begin of main if
output << SEPARATOR; // Write separator to file
output.write(reinterpret_cast<const char*>(&pbyte1), 1);
output.write(reinterpret_cast<const char*>(&pbyte2), 1);
output << std::hex << counter; // Write separator, bytes and count
counter = 1; // Reset counter
} // End of main if
else counter++; // Increment counter
pbyte1 = cbyte1; pbyte2 = cbyte2; // Set previous bytes
} // End of main while
} // End of encode
However, the function is not as fast as I need. This is the second version of the function; I've already improved it to make it faster, but it is still too slow. Do you have any ideas on how to improve it? I'm out of ideas.

Depending on the size of the data you are reading, it might be a good idea not to read single characters but a chunk of data from your input file at once. This might be a lot faster than accessing the input file on the disk for each input character.
Pseudo code example:
char dataArray[100];
// read() fetches raw bytes; get() with a buffer would stop at the first newline
while( input.read( dataArray, sizeof(dataArray) ) || input.gcount() > 0 )
{
process( dataArray, input.gcount() ); // process a block of data, not a single character
}
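Taking that idea one step further, the whole input can be loaded into memory first and encoded in a single pass, with no tellg() call per iteration. This is only a sketch: the name encodePairs is made up, the '#' separator is assumed from the question's example output because its SEPARATOR constant is not shown, and a trailing odd byte is simply ignored.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Assumed separator: the question's SEPARATOR constant is not shown,
// so '#' is taken from its example output.
constexpr char kSeparator = '#';

// Run-length encodes `data` as two-byte symbols: the separator, the two
// bytes, then the run length in hexadecimal (as in the question's code).
// A trailing odd byte is ignored.
std::string encodePairs(const std::string& data)
{
    std::ostringstream out;
    out << std::hex;
    const std::size_t pairs = data.size() / 2;
    std::size_t i = 0;
    while (i < pairs)
    {
        const char b1 = data[2 * i];
        const char b2 = data[2 * i + 1];
        std::size_t run = 1;
        // Extend the run while the next pair equals the current one.
        while (i + run < pairs &&
               data[2 * (i + run)] == b1 &&
               data[2 * (i + run) + 1] == b2)
        {
            ++run;
        }
        out << kSeparator << b1 << b2 << run;
        i += run;
    }
    return out.str();
}
```

With the question's example, encodePairs("A1A1B7B7B7B7") gives "#A12#B74". The input string itself can be filled with one large read() call on a stream opened in binary mode.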
