C - Cannot read all characters when reading a file - c++

I am currently working on a lossless compression algorithm using the Huffman technique.
I managed to compress the desired file, and save the compressed data to a file.
However, I am unable to correctly read what is recorded in this file. Preferably, I would like to store the contents of this file in a std::string.
Here is the content of this file:
00000L,1LP10LURD100LVRj1LLRQRER.Rm1LlRr1LiRe1000LpRdRn100Lv100LC1LARF1LIRNRbRcRa100Lo100Lh1Lx1LMRSRf1LgRqRs1LuRt
X6*ÃWØ¿¸u÷üwµS™’ð‚<)âóUO_mÁ9Õö/ë‰ÍÌ Ï-,SÁúÚâuçëðÒì`WVwÿƒüšÎ뉊?Âgÿ­PÞuâ[CßTø¸CJŸy™“Þ¸Ý{+1sü <Ï~÷øà·\#¾¯à禡ú±Õö/Þüºû"í+ª•tÊæ+Ó¸Ð÷Õ>.'¦º¾Åü
úá‡
lÀ•¥¸Äq/?03òØ2'>÷?>9»ŸtY®Ùyù„‰u®'^~¿Û‚sŸ–öŽ(wß°/ì–~+K*•O´ ÿV:âyšö¨oãúü:ÿhrkã[‹7çjëĶ†KßW›˜iSêj£ÓúÆÉ×ûoÆÉï\l ÜKª‘Úɬ®b]T«ÏÖ42+4­Ô2µ“í«Ç7•’Ðä×Æ·Ø÷+ìÊþ¸˜¡sü!xSC—7ëoÿø=_bÿÔÕG§ÃIöÕÐÊV¥:ÅÅ?]Yß„ËsòÖx×™øÞíàæÍÓ+˜¯~7Æ´
puQäöÕA}ÿ².Õ {vÓ+˜¯¾ÍÌ ŽÞ¹úb+
ç·ñd³óÿSUŸ
/ˆ®Œ|/õ}‹ñT`»áúüi|EuÄæçMSs’âóUOrßUSí
ïFòH{Tû8ð¹C«ì_ׯ?_†—ÄWÅæªöV’ŽÜc`>ö0Á‹þÚ8¡ß~ÃH;ÜÙ¸dY¿;þ8-\`ÈœûÜüÞþSS™Zäî#d÷®7Bûo}åʪ¿ŽŽ(wß°Òö?õø0¡¿¾¯àÕ׉hñÇ7·™­Û‚rþ[ª%«KÅ’ý_býCxBí럿‡
löàœ„õÄææ÷îªÃÜ0ëðaxSˆ¬C´êÿm¨³]¯À¦W1^§T_XË®â6OF÷l4Ž;¦¿æ{÷»y—úØ«äý°sºâ³JÝB7ƶ…öñ«&ksóâóUOB ‘9÷¹ùÅf•º†ë^¹,«Ì÷ïCIÏÖ0Áÿ¯Á¿žšç壂{é„Eë"í¡–VK{åSÈ-Êjs"+
*}œx_“Þ¸Þ[·[ŸjÃBµN[êª}¡}’ÏÖ4<…^+PÞ‡16kø¼Õ^‡&ÁÓO.UUülñªY/dÅÏð€þ7O·?X×.„4Ÿm]2¹Š÷’œO©ªOúš¨ôøiSìãÂülžõÆòؽºÜûPÞªrßUSí?Õ~lÜ­#_ÿ­%›ŠÃIÏÖ,ïèÁwÂü4©¯ö¨oéZÁœSøªÿA¸—XõÂòO$Ìþ¸Äq>Ú¹¿[ÞY¬‹µòÑÄrÄÙ¯r†…öáæl ™Ÿ{Ÿ›&²¹Œº‹þÙÄíæ_ëBõb}çÃb8"ZW^Òº©Vɬ®bp¨±í¿Ê¥Sí³¯pȳ~vÎü ï–çç\LP¹þÛBûpÆó7\LP¹þ‚<)dÖW1d¾ë싶‹5Ûo3
Here is the code I wrote to read its contents:
int main(){
int number_of_lines = 0;
std::string line;
std::ifstream myfile("my_file.txt.huff");
while (std::getline(myfile, line)){
++number_of_lines;
std::cout << "line number: " << number_of_lines << " content: " << line << std::endl;
}
}
I also tried via this way:
int main(){
FILE *find = fopen("my_file.txt.huff", "r");
int ca;
while(EOF != (ca=fgetc(find)))
std::cout << (char)ca;
std::cout << std::endl;
}
Here is the console rendering for the first code given:
line number : 1 content : 00000L,1LP10LURD100LVRj1LLRQRER.Rm1LlRr1LiRe1000LpRdRn100Lv100LC1LARF1LIRNRbRcRa100Lo100Lh1Lx1LMRSRf1LgRqRs1LuRt
line number : 2 content :
line number : 3 content : X6*├WÏ┐©Øu¸³wÁSÖÆ­é<)Ô¾UO_m┴9ı÷/Ùë═╠ ¤-,S┴·┌ÔuþÙ­Êý`WVw â³
The problem repeats every time I try to read this file: the methods I use never read it in its entirety.
Why does this problem occur?
thank you in advance
N.B.: I tested, without success, the solution provided by Cillié Malan in this post. I'm having trouble converting from a std::wstringstream to a std::string correctly.

Here is a short example that opens the file in binary mode and reads the entire file into a std::vector<uint8_t> (a std::array<uint8_t, N> would also work, but only when the size is known at compile time). You open the file positioned at the end, get the number of bytes with .tellg(), .seekg() back to the beginning, create the vector with that number of bytes and then .read() the file into the vector.
The following takes the filename as the first argument and outputs the content in hex format (for large files, change the output before testing to limit what is dumped to stdout)
#include <cstdint>
#include <iostream>
#include <fstream>
#include <vector>
int main (int argc, char **argv) {
if (argc < 2) { /* validate filename given as argument */
std::cerr << "error: insufficient arguments\n"
"usage: " << argv[0] << " filename.bin\n";
return 1;
}
/* open file in binary mode, position at-the-end */
std::ifstream f (argv[1], std::ios::binary | std::ios::ate);
if (!f.is_open()) /* validate file open for reading */
return 1;
size_t nbytes = f.tellg(); /* get number of bytes in file */
f.seekg (0); /* rewind */
std::vector<uint8_t> arr(nbytes); /* declare vector with adequate storage */
f.read(reinterpret_cast<char*>(arr.data()), nbytes); /* read file into vector (data() is safe even for an empty file) */
if (f.bad() || f.fail()) /* validate read */
return 1;
for (auto& i : arr) /* output results (limit for larger files) */
std::cout << std::hex << std::showbase << static_cast<uint32_t>(i) << " ";
std::cout.put ('\n');
}
Look things over and let me know if you have further questions. There are several ways to approach this.
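Since the question asks for the contents in a std::string, here is a minimal sketch of the same idea with a string as the container. The helper name read_file_bytes is made up for illustration. A likely cause of the truncated output in the question is that both attempts open the file in text mode, where (on Windows) a 0x1A byte is treated as end-of-file and line endings are translated; opening with std::ios::binary avoids both.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: read every byte of `path` into a std::string.
std::string read_file_bytes(const std::string& path) {
    // std::ios::binary disables newline translation and, on Windows,
    // the text-mode treatment of byte 0x1A as end-of-file.
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);
    // istreambuf_iterator copies raw bytes without skipping whitespace.
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A std::string can hold embedded '\0' bytes, so use size() rather than strlen() on the result. Printing it to a console will still look garbled, because the bytes are not text.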

Related

C++ ofstream Binary Mode - Written file still looks like plain text

I have an assignment that wants plain text data to be read in from a file, and then outputted to a separate binary file. With that being said, I expect to see that the contents of the binary file not to be intelligible for human reading. However, when I open the binary file the contents are still appearing as plain text. I am setting the mode like this _file.open(OUTFILE, std::ios::binary). I can't seem to figure out what I'm missing. I've followed other examples with different methods of implementation, but there's obviously something I'm missing.
For the purpose of posting, I created a slimmed down test case to demonstrate what I'm attempting.
Thanks in advance, help is greatly appreciated!
Input File: test.txt
Hello World
main.cpp
#include <iostream>
#include <fstream>
using namespace std;
#define INFILE "test.txt"
#define OUTFILE "binary-output.dat"
int main(int argc, char* argv[]) {
char* text = nullptr;
int nbytes = 0;
// open text file
fstream input(INFILE, std::ios::in);
if (!input) {
throw "\n***Failed to open file " + string(INFILE) + " ***\n";
}
// copy from file into memory
input.seekg(0, std::ios::end);
nbytes = (int)input.tellg() + 1;
text = new char[nbytes];
input.seekg(ios::beg);
int i = 0;
input >> noskipws;
while (input.good()) {
input >> text[i++];
}
text[nbytes - 1] = '\0';
cout << "\n" << nbytes - 1 << " bytes copied from file " << INFILE << " into memory (null byte added)\n";
if (!text) {
throw "\n***No data stored***\n";
} else {
// open binary file for writing
ofstream _file;
_file.open(OUTFILE, std::ios::binary);
if (!_file.is_open()) {
throw "\n***Failed to open file***\n";
} else {
// write data into the binary file and close the file
for (size_t i = 0U; i <= strlen(text); ++i) {
_file << text[i];
}
_file.close();
}
}
}
As stated here, std::ios::binary isn't actually going to write binary for you. It only disables text-mode translation: characters such as \n are written as-is instead of being converted to the platform's line ending (for example \r\n on Windows).
You can convert text to binary by using <bitset>, like this:
#include <iostream>
#include <vector>
#include <bitset>
#include <string>
int main() {
std::string str = "String in plain text";
std::vector<std::bitset<8>> binary; // A vector of binaries
for (unsigned long i = 0; i < str.length(); ++i) {
std::bitset<8> bs4(str[i]);
binary.push_back(bs4);
}
return 0;
}
And then write to your file.
In simplest terms, the flag std::ios::binary means:
Do not make any adjustments to my output to aid in readability or conformance to operating system standards. Write exactly what I send.
In your case, you are writing readable text and the file contains exactly what you sent.
You could also write bytes that are unintelligible when viewed as text. In that case, your file would be unintelligible when viewed as text.
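To make the difference concrete, here is a small sketch that writes the same number twice: once formatted with operator<< and once as raw bytes with write(). The helper names are hypothetical, and both files are opened with std::ios::binary to show that the flag is not what decides whether the output is readable.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: write `value` as decimal text using operator<<.
void write_as_text(const std::string& path, std::uint32_t value) {
    std::ofstream out(path, std::ios::binary);
    out << value;   // formatted output: emits the digit characters
}

// Hypothetical helper: write the in-memory bytes of `value`.
void write_as_bytes(const std::string& path, std::uint32_t value) {
    std::ofstream out(path, std::ios::binary);
    // unformatted output: sizeof(value) raw bytes, in the machine's byte order
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}
```

For the value 1234567890, the first file holds ten readable digit characters and the second holds four bytes that look like noise in a text editor.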

C++ File has 32K ints separated by newline. I need to create 8 smaller files to each hold 4096 ints

I have a file of ints separated by a newline delimiter.
324872
27
256230
0
45767
276143
4
258283
189
153812
214521
The file size is 32768 lines. I need to break it into 8 smaller files of 4096 lines. I use fstream to stream the original file into a char buffer:
std::string fileOfInts(".txt");
char *buffer = new char[BUFFER_SIZE];
std::ifstream inputFromOrigin("origin.txt");
int fileIndex = 0;
while (inputFromOrigin)
{
inputFromOrigin.read(buffer, BUFFER_SIZE);
size_t count = inputFromOrigin.gcount();
if (!count)
break;
std::ofstream createRunSizeFile;
createRunSizeFile.open("fileOfInts" + std::to_string(fileIndex) + fileOfInts);
int value;
if (createRunSizeFile) {
for (size_t i = 0, bufferSize = sizeof(buffer); i < bufferSize; i += sizeof(int)) {
value = (int)buffer[i];
createRunSizeFile << value << std::endl;
}
}
createRunSizeFile.close();
fileIndex++;
}
inputFromOrigin.close();
delete[] buffer;
But when I extract the ints from the char buffer it only reads two digits at a time and places those two digits in a single file so I end up with 54 files containing one int each:
32
UPDATE:
When I change the for loop that assigns values from the buffer to this:
for (int i = 0; i < BUFFER_SIZE; i++)
I get my 4096 unique lines per file but instead of 8 files with the same values passed in I get 53 files with two digits per line:
10
49
57
57
54
48
50
How can I parse the char buffer to put 4096 unique ints into each file?
UPDATE - Solution:
For others that may have this challenge in the future, here is how I adapted David's solution to my existing code:
int fileIndex = 1;
int lineIndex = 0;
std::string line = "";
std::ofstream createRunSizeFile;
// loop through origin file line by line
while (getline(inputFromOrigin, line)) {
// when the line count is 0 or 4096, start a new file
if (lineIndex % RUN == 0) {
lineIndex = 0;
// close the previous run size file, if one is open
if (createRunSizeFile.is_open()) {
createRunSizeFile.close();
}
// open new run size file and increment file counter
createRunSizeFile.open("fileOfInts" + std::to_string(fileIndex++) + fileOfInts);
if (!createRunSizeFile.good()) {
std::cerr << "Error: Run Size File Failed to Open" << std::endl;
return 1;
}
}
// assign line from origin to the run size file
createRunSizeFile << line << std::endl;
lineIndex++;
}
inputFromOrigin.close();
I made the false assumption that it would be easier to extract the ints from a char buffer than just going line by line. This solution does exactly what I need it to do now.
Rather than using .read, simply use getline to read a line containing an integer into a string. Then keep a line counter: every 4096 lines, close the current output file, build the next output filename from a numeric suffix, open it and reset the counter. Repeat until you run out of lines to read.
You can either #define a constant for the number of lines per sub-file or declare one, then declare your counters (below fileno is just used as the sub-file suffix), declare a string to use as a buffer to hold the line read from input and then your two files -- opening the input file:
#include <iostream>
#include <fstream>
#include <string>
#define NLINES 4096 /* constant no. of lines for output subfiles */
int main (int argc, char **argv) {
if (argc < 2) { /* validate at least 1 argument for filename */
std::cerr << "usage: " << argv[0] << " filename\n";
return 1;
}
size_t n = 0, /* line counter */
fileno = 1; /* output file suffix */
std::string s {}; /* string to use as buffer */
std::ifstream f (argv[1]); /* open input file stream */
std::ofstream subf; /* output file stream */
if (!f.good()) { /* validate input file stream state good */
std::cerr << "error: input file open failed.\n";
return 1;
}
To split the file, simply read each line into s and check whether your line counter modulo NLINES is zero. If so, create your next output filename, reset your line counter to zero, close the output file if it is open, open the output file using the new filename and validate that it is open. After that, it is simply a matter of writing the string to your output file and incrementing your line counter, e.g.
while (getline (f, s)) { /* read each line from input file into s */
if (n % NLINES == 0) { /* if 0 or 4096 */
/* create output filename "subfile_X" */
std::string fname = { "subfile_" + std::to_string(fileno++) };
n = 0; /* reset line count 0 */
if (subf.is_open()) /* if output file open - close it */
subf.close();
subf.open (fname); /* open new output file */
if (!subf.good()) { /* validate output file stream state good */
std::cerr << "error: file open failed '" << fname << "'.\n";
return 1;
}
}
subf << s << '\n'; /* write s to output file */
n++; /* increment line count */
}
That is really all you need. Sewing the parts together would give the complete program:
#include <iostream>
#include <fstream>
#include <string>
#define NLINES 4096 /* constant no. of lines for output subfiles */
int main (int argc, char **argv) {
if (argc < 2) { /* validate at least 1 argument for filename */
std::cerr << "usage: " << argv[0] << " filename\n";
return 1;
}
size_t n = 0, /* line counter */
fileno = 1; /* output file suffix */
std::string s {}; /* string to use as buffer */
std::ifstream f (argv[1]); /* open input file stream */
std::ofstream subf; /* output file stream */
if (!f.good()) { /* validate input file stream state good */
std::cerr << "error: input file open failed.\n";
return 1;
}
while (getline (f, s)) { /* read each line from input file into s */
if (n % NLINES == 0) { /* if 0 or 4096 */
/* create output filename "subfile_X" */
std::string fname = { "subfile_" + std::to_string(fileno++) };
n = 0; /* reset line count 0 */
if (subf.is_open()) /* if output file open - close it */
subf.close();
subf.open (fname); /* open new output file */
if (!subf.good()) { /* validate output file stream state good */
std::cerr << "error: file open failed '" << fname << "'.\n";
return 1;
}
}
subf << s << '\n'; /* write s to output file */
n++; /* increment line count */
}
}
Example Input File with 32k Integers
$ wc -l < dat/32kint.txt
32768
Example Use
Not very exciting:
$ ./bin/filesplit dat/32kint.txt
Resulting Subfiles
$ for i in subfile*; do printf "%s - " "$i"; wc -l < "$i"; done
subfile_1 - 4096
subfile_2 - 4096
subfile_3 - 4096
subfile_4 - 4096
subfile_5 - 4096
subfile_6 - 4096
subfile_7 - 4096
subfile_8 - 4096
Eight files of 4096 lines each. Look things over and let me know if you have any questions.
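As for why the buffer attempt in the question printed values like 10, 49 and 57: casting a char to int yields its character code, so '\n' becomes 10, '1' becomes 49 and '9' becomes 57. The digits have to be parsed, not cast. If you do want to work from a text buffer, a minimal sketch (parse_ints is a hypothetical helper name) could use formatted extraction:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse whitespace-separated integers from a text buffer.
std::vector<int> parse_ints(const std::string& buffer) {
    std::istringstream in(buffer);
    std::vector<int> values;
    int v;
    while (in >> v)          // formatted extraction converts digit runs to ints
        values.push_back(v);
    return values;
}
```

Note that a fixed-size read() can cut a number in half at the buffer boundary, which is one more reason the line-by-line approach is simpler here.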

Print unknown file format to screen

I got a test file from a robot that I have to program using a C++ program I'm developing. I wanted to use this file to see how the robot saves the coordinates of points. My program can currently calculate the coordinates; now I have to generate the robot code.
Therefore I wanted to have a look at the file. But it seems that the file is written in binary mode. So my first idea was to open the file in binary mode and print the content to the screen. This is the code I'm using:
//#include "stdafx.h"
#include <iostream> // std::cout
#include <fstream> // std::ifstream
#include <Windows.h>
int main () {
std::ifstream is ("Test.PRG", std::ifstream::binary);
if (is) {
// get length of file:
is.seekg (0, is.end);
int length = is.tellg();
is.seekg (0, is.beg);
char * buffer = new char [length];
std::cout << "Reading " << length << " characters... ";
// read data as a block:
is.read (buffer,length);
if (is)
std::cout << "all characters read successfully.";
else
std::cout << "error: only " << is.gcount() << " could be read";
is.close();
// ...buffer contains the entire file...
for(int i=0; i<length; i++)
{
std::cout << (double) buffer[i] << std::endl;
}
delete[] buffer;
}
Sleep(10000);
return 0;
}
But with this code I just can't see what is written in the file. I also tried conversions other than (double): char, int and float. Now I don't know what else I could do. Is there a way to read this file and convert it to ASCII? I'm also adding the link for the file here, so you can have a look at it.
Download link for file
Here's a picture of the beginning of your file that I took with HexFiend:
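A similar hex view can be produced from the buffer in the question's own program. The following is only a sketch of a classic hex dump layout (offset, 16 hex bytes, printable ASCII); the function name hex_dump is an assumption and is not taken from any tool.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: format bytes as rows of 16 hex values plus printable ASCII.
std::string hex_dump(const std::vector<unsigned char>& bytes) {
    std::string out;
    char cell[32];
    for (std::size_t row = 0; row < bytes.size(); row += 16) {
        std::snprintf(cell, sizeof cell, "%08zx ", row);   // offset column
        out += cell;
        std::string ascii;
        for (std::size_t i = row; i < row + 16; ++i) {
            if (i < bytes.size()) {
                std::snprintf(cell, sizeof cell, " %02x",
                              static_cast<unsigned>(bytes[i]));
                out += cell;
                // show printable ASCII as-is, everything else as '.'
                ascii += (bytes[i] >= 0x20 && bytes[i] < 0x7f)
                             ? static_cast<char>(bytes[i]) : '.';
            } else {
                out += "   ";                              // pad a short last row
            }
        }
        out += "  |" + ascii + "|\n";
    }
    return out;
}
```

Reading the file into a std::vector<unsigned char> in binary mode and printing hex_dump(bytes) shows both the raw values and any embedded text side by side, which is usually the quickest way to guess at an unknown format.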

How to show contents of the file in C++

I have some code here
https://github.com/Fallauthy/Projects/blob/master/cPlusPlusProjects/bazaPracownikow/bazaPracownikow/bazaPracownikow/main.cpp
And I have no idea how to show the contents of my file. I mean, I know how, but the output doesn't match what I have in the file (see the link): it shows up on the next line. This code is responsible for loading the file:
while (!baseFile.eof()) {
// load the file contents into a variable
std::string buffer;
baseFile >> buffer;
// print out
loadLineFromBase += buffer;
loadLineFromBase += " \n";
}
std::cout << loadLineFromBase << std::endl;
Unless I see all your code, all I can do is give you a sample in return. I don't know what you're trying to do, but it seems that in this case you're looking for this:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string Display = "";
ofstream FileOut;
ifstream FileInput;
FileOut.open("C:\\Example.txt");
FileOut << "This is some example text that will be written to the file!";
FileOut.close();
FileInput.open("C:\\Example.txt");
if (!FileInput)
{
cout << "Error File not Found: " << endl;
return 1;
}
string Line;
while (getline(FileInput, Line)) // stops when no further line can be read
{
Display += Line + "\n"; // keep every line, not only the last one
}
FileInput.close();
cout << Display;
return 0;
}
Simply put, if you're working with a text document, use getline().
getline() takes two arguments: the first is, in this case, your ifstream object (what you're using to open the file); the second is the string in which the contents are stored.
Using the method I outlined above you'll be able to read the entire file contents.
And please, next time, as was said above, outline your problem in more depth. If you provide us with all of your code we may be able to assist you better!
Your snippet of code automatically adds a newline to every string read from the input file, even if those were originally words separated by spaces. You probably want to keep the structure of the original file, so it's better to read one line at a time and, unless you need it for some other use, print it out in the same loop.
std::string buffer;
// read every line of baseFile till EOF
while ( std::getline(baseFile, buffer) ) {
std::cout << buffer << '\n';
}

File Input in C++

I've been searching the internet for a while, but all I can find for file input in C++ is when you know the filename.
I'm trying to write a program to perform an addition of 2 numbers that are greater than 0 from a file, but without using scanf or cin. I want to load the file into memory, but all of the code I can find in regards to this situation requires knowledge of the filename. The file is formatted with 2 integers on a single line, separated by a space, and there are multiple lines of integers. The program will output the sum of the two numbers. I can easily do this with scanf, but if I were given a massive file, I would want to load it into memory (save mapping for later).
Loading the file into memory is giving me trouble, because I do not know the filename, nor how to find out, unless the user inputs the name of the file (not going to happen). I want the program to be executed like so, but using the most raw, and basic forms of C++ IO:
./myprog < boatloadofnumbers
How would I start my program to take the whole "boatloadofnumbers" as a file, so I can use more basic functions like read()? (also, what is the above method called? passing input?)
int main(){
int a,b;
while (scanf("%i,%i",&a,&b)>-1){
printf("%i\n",(a+b));
} //endwhile
return 0;
} //endmain
When the program is called as you state, then the content of boatloadofnumbers can be read from std::cin.
This method is called input redirection and is done by the shell, not your program.
With input redirection the shell opens the file and connects it to the program's standard input, where reads are buffered. That's quite a fast way to stream a file a single time through a computation.
It's not entirely clear how you're going to read a file when you don't know the filename. Presumably you don't know the filename at compile-time. That's okay, you can get this from the command-line at runtime, like this:
./myprog boatloadofnumbers
Then your filename is in argv[1] and you can access it using a std::ifstream.
If you're being given the input directly on stdin via redirection (such as ./myprog < boatloadofnumbers) you don't need a filename at all, you can just use std::cin.
The following main() will deal with both of these situations:
int main(int argc, char* argv[])
{
if (argc == 2)
{
std::cerr << "Reading from file " << argv[1] << std::endl;
std::ifstream ifs(argv[1]);
if (ifs)
{
sum_lines(ifs);
}
else
{
std::cerr << "Could not read from " << argv[1] << std::endl;
}
}
else
{
std::cerr << "Reading from stdin" << std::endl;
sum_lines(std::cin);
}
}
A sample sum_lines() may look a bit like this:
void sum_lines(std::istream& is)
{
int first = 0, second = 0;
std::string line = "";
while (std::getline(is, line))
{
std::istringstream iss(line);
if (iss >> first >> second)
{
std::cout << first << " + " << second << " = " << first + second << std::endl;
}
else
{
std::cerr << "Could not parse [" << line << "]" << std::endl;
}
}
}
This doesn't care from where the input comes, so you can easily inject a std::istringstream for unit-testing. Also, this doesn't read the whole file into memory, just one line at a time, so it should deal with averybigboatloadofnumbers.
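To illustrate the unit-testing point, here is a sketch of a variant of sum_lines() that also takes the output stream. That extra parameter is an assumption added here so the result can be captured; it is not part of the function above.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Variant of sum_lines that also takes the output stream (an assumption made
// here so the result can be captured in a test).
void sum_lines(std::istream& is, std::ostream& os) {
    int first = 0, second = 0;
    std::string line;
    while (std::getline(is, line)) {
        std::istringstream iss(line);   // parse each line on its own
        if (iss >> first >> second)
            os << first << " + " << second << " = " << first + second << '\n';
        else
            os << "Could not parse [" << line << "]\n";
    }
}
```

A test can then pass a std::istringstream holding a few lines as the input and a std::ostringstream as the output, and compare the captured text with the expected sums. The real program would call sum_lines(std::cin, std::cout).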
With shell redirection, your program can read from the standard input, which may be desirable. However, it may also be desirable to read from a file. It's easy to support both:
cat data | ./prog
./prog < data
./prog -f data
The first two are similar, and the contents of the file data are available from the program's standard input; the third line simply passes a command-line argument. Here's how we support this:
#include <cstdio>
#include <cstring>
void process_input(std::FILE * fp)
{
char buf[4];
std::fread(buf, 4, 1, fp);
// ...
}
int main(int argc, char * argv[])
{
std::FILE * fp = stdin; // already open!
if (argc >= 3 && 0 == std::strcmp(argv[1], "-f"))
{
fp = std::fopen(argv[2], "rb");
if (!fp)
{
std::fprintf(stderr, "Could not open file %s.\n", argv[2]);
return 1;
}
}
process_input(fp);
if (fp != stdin) { std::fclose(fp); }
}
Equivalently, you can achieve something similar with iostreams, though it's a bit more roundabout to get a single reference that works for both cases:
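#include <fstream>
#include <iostream>
int main(int argc, char * argv[])
{
std::ifstream ifp;
if ( /* as before */ )
{
ifp.open(argv[2], std::ios::binary);
if (!ifp) { /* error and die */ }
}
// a default-constructed ifstream tests true, so check is_open() instead;
// the common type of a file stream and std::cin is std::istream
std::istream & infile = ifp.is_open() ? static_cast<std::istream &>(ifp) : std::cin;
process_input(infile);
}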