I'm trying to determine how big a file I'm reading is, in bytes, so I used fseek to jump to the end, and it triggered the error: file.exe has triggered a breakpoint.
Here's the code:
FileUtils.cpp:
#include "FileUtils.h"
namespace impact {
std::string read_file(const char* filepath)
{
FILE* file = fopen(filepath, "rt");
fseek(file, 0, SEEK_END);
unsigned long length = ftell(file);
char* data = new char[length + 1];
memset(data, 0, length + 1);
fseek(file, 0 ,SEEK_SET);
fread(data, 1, length, file);
fclose(file);
std::string result(data);
delete[] data;
return result;
}
}
FileUtils.h:
#pragma once
#include <stdio.h>
#include <string>
#include <fstream>
namespace impact {
std::string read_file(const char* filepath);
}
If more info is required, just ask; I would be more than happy to provide it!
You are doing this in the C way, C++ has much better (in my opinion) ways of handling files.
Your error looks like it may be caused because the file didn't open correctly (you need to check if file != nullptr).
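A minimal sketch of the original C-style approach with the missing checks added. The name read_file_checked and the choice to throw std::runtime_error are assumptions made for illustration, not part of the original code. It also opens the file in binary mode ("rb"), on the assumption that you want the size reported by ftell to match the number of bytes fread delivers.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: same idea as read_file, but every C call is checked.
std::string read_file_checked(const char* filepath)
{
    // "rb" keeps the byte count from ftell consistent with what fread delivers.
    std::FILE* file = std::fopen(filepath, "rb");
    if (file == nullptr)
        throw std::runtime_error(std::string("could not open ") + filepath);

    std::string result;
    bool ok = std::fseek(file, 0, SEEK_END) == 0;
    const long length = ok ? std::ftell(file) : -1L;   // ftell returns -1L on failure
    ok = ok && length >= 0 && std::fseek(file, 0, SEEK_SET) == 0;
    if (ok) {
        result.resize(static_cast<std::size_t>(length));
        const std::size_t got =
            length > 0 ? std::fread(&result[0], 1, result.size(), file) : 0;
        result.resize(got);                            // keep only what was actually read
    }
    std::fclose(file);
    if (!ok)
        throw std::runtime_error(std::string("could not determine size of ") + filepath);
    return result;
}
```

Reading straight into the std::string also removes the new[]/delete[] pair, so nothing leaks if an exception is thrown.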
To do this in C++17 you should use the standard library's filesystem facilities (std::filesystem, from the <filesystem> header).
(Note: some pre-C++17 compilers ship the same interface as <experimental/filesystem>, in the std::experimental::filesystem namespace.)
Example:
std::string read_file(const std::filesystem::path& filepath) {
auto f_size = std::filesystem::file_size(filepath);
...
}
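One possible way to use the size once you have it, as a sketch only: the name read_file_sized and the binary open mode are assumptions, and the elided body of the example above may have been intended differently.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical use of file_size: size the string up front, then read the bytes in one call.
std::string read_file_sized(const std::filesystem::path& filepath)
{
    // file_size throws std::filesystem::filesystem_error if the file is missing.
    const std::uintmax_t f_size = std::filesystem::file_size(filepath);

    std::ifstream file(filepath, std::ios::binary);
    if (!file)
        throw std::runtime_error("File failed to open");

    std::string result(static_cast<std::size_t>(f_size), '\0');
    file.read(&result[0], static_cast<std::streamsize>(result.size()));
    result.resize(static_cast<std::size_t>(file.gcount())); // trim on a short read
    return result;
}
```

The size is queried by path here rather than by seeking, so no fseek/ftell pair is involved at all.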
Additionally, to read a file in C++ you do not need to know its size. You can use streams (the example needs <filesystem>, <fstream>, <sstream> and <stdexcept>):
std::string read_file(const std::filesystem::path& filepath) {
std::ifstream file(filepath); // Open the file
// Throw if failed to open the file
if (!file) throw std::runtime_error("File failed to open");
std::stringstream data; // Create the buffer
data << file.rdbuf(); // Read into the buffer the internal buffer of the file
return data.str(); // Convert the stringstream to string and return it
}
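As you can see, the C++ way of doing it is much shorter and easier to debug. Note that streams do not throw by default; the example throws its own std::runtime_error when the file fails to open, so the failure carries a description instead of surfacing later as a crash.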
I'm trying to write a string into a file, but I don't know how to do it. I have tried using wstring instead of string in my randomString() function, among other things, just to write a string to a file.
The if condition checks whether the file was created and, if so, writes to it.
fopen is used to open the file, the path1 variable is the path to my file, and "w" means write.
randomString() is a function that returns a string.
char buffer[100] = { randomString() };
FILE* file;
file = fopen(path1, "w");
if (file) {
fwrite(buffer, sizeof(char), sizeof(buffer), file);
fclose(file);
}
return;
If you're using a new-ish compiler you can try std::filesystem
#include <iostream>
#include <fstream>
#include <filesystem>
namespace fs = std::filesystem;
int main() {
auto directoryToWriteTo = fs::current_path(); // returns a fs::path object
std::ofstream fileStream(directoryToWriteTo.string() + "/nameOfYourFile.txt");
if(fileStream.is_open())
fileStream << "Whatever string you want to write to a file\n";
}
std::filesystem isn't needed for writing to one file, but it does make things like iterating over all files in a directory easy, as adapted from the cppreference.com examples.
#include <fstream>
#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;
int main() {
auto directoryToTraverse = fs::current_path();
for(auto& p: fs::directory_iterator(directoryToTraverse)){
if(fs::is_regular_file(p)){
std::ofstream tmpStream(p.path(), std::ios_base::app); // open file in append mode
tmpStream << "Append a string to each regular file in your directory\n";
}
}
}
It also lets you change file permissions programmatically in standard C++.
If you are just trying to print out to a file the contents of a string, using fwrite, then you need to do something like the following
std::string str = randomString();
if ( file ) {
fwrite( str.c_str(), sizeof( char ), str.size(), file );
fclose( file );
}
fwrite expects its first parameter to be a const void *, which a std::string object is not; a const char * converts to it implicitly, and calling .c_str() on the string gives you one. The second parameter is the size of each element; a std::string stores char, so sizeof( char ) gives that size (always 1). The third parameter is the count (the number of characters to write), which str.size() provides.
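Put together with the fopen call from the question, the whole operation might look like the sketch below. The function name write_string, the bool return value and the "wb" mode are assumptions for illustration; the question itself used "w".

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: write the characters of str to path, replacing any existing file.
// Returns true only if every character was written and the file closed cleanly.
bool write_string(const char* path, const std::string& str)
{
    std::FILE* file = std::fopen(path, "wb");
    if (file == nullptr)
        return false;

    // fwrite returns the number of items written; fewer than requested means an error.
    const std::size_t written = std::fwrite(str.data(), sizeof(char), str.size(), file);
    const bool closed = std::fclose(file) == 0;
    return written == str.size() && closed;
}
```

Because the count comes from str.size(), only the string's characters end up in the file, not the unused remainder of a fixed 100-byte buffer.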
Below is a code snippet that I used to write a string, prefixed with its length, to a binary file, but the code does not work as expected. After writing, I opened the output file and the string was there, but when I read the string back from the file, it is only partially read. It seems that after reading the string length, the file pointer seeks further than it should, and so the first 8 characters of the string are missed!
#include <iostream>
#include <string>
FILE* file = nullptr;
bool open(std::string mode)
{
errno_t err = fopen_s(&file, "test.code", mode.c_str());
if (err == 0) return true;
return false;
}
void close()
{
std::fflush(file);
std::fclose(file);
file = nullptr;
}
int main()
{
open("wb"); // open file in write binary mode
std::string str = "blablaablablaa";
auto sz = str.size();
fwrite(&sz, sizeof sz, 1, file); // first write size of string
fwrite(str.c_str(), sizeof(char), sz, file); // second write the string
close(); // flush the file and close it
open("rb"); // open file in read binary mode
std::string retrived_str = "";
sz = -1;
fread(&sz, sizeof(size_t), 1, file); // it has the right value (i.e 14) but it seems it seeks 8 bytes more!
retrived_str.resize(sz);
fread(&retrived_str, sizeof(char), sz, file); // it missed the first 8 char
close(); // flush the file and close it
std::cout << retrived_str << std::endl;
return 0;
}
PS: I removed the checks from the code to make it more readable.
You're clobbering the retrived_str object itself with the file contents, rather than reading the file contents into the buffer controlled by retrived_str.
fread(&retrived_str[0], 1, sz, file);
Or, if you're using C++17 with its non-const std::string::data method:
fread(retrived_str.data(), 1, sz, file);
Change
fread(&retrived_str, sizeof(char), sz, file); // it missed the first 8 char
To
fread((void*)( retrived_str.data()), sizeof(char), sz, file); // set the data rather than the object
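For reference, a sketch of the full round trip with the checks put back in. The helper names write_prefixed and read_prefixed are hypothetical, and storing the length as a fixed-width std::uint64_t (rather than size_t, as in the question) is an assumption made so the prefix has the same width on every platform; byte order is still the machine's own.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: write the length as a native-endian std::uint64_t, then the characters.
bool write_prefixed(std::FILE* file, const std::string& str)
{
    const std::uint64_t sz = str.size();
    return std::fwrite(&sz, sizeof sz, 1, file) == 1 &&
           std::fwrite(str.data(), 1, str.size(), file) == str.size();
}

// Hypothetical helper: read one length-prefixed string; returns false on EOF or a short read.
bool read_prefixed(std::FILE* file, std::string& out)
{
    std::uint64_t sz = 0;
    if (std::fread(&sz, sizeof sz, 1, file) != 1)
        return false;
    out.resize(static_cast<std::size_t>(sz));
    // Read into the string's character buffer, never into the std::string object itself.
    return sz == 0 || std::fread(&out[0], 1, out.size(), file) == out.size();
}
```

A corrupt length prefix would make resize attempt a huge allocation, so real code would likely want an upper bound on sz before resizing.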
I'm trying to read a file whose path contains Cyrillic characters, and I get ifstream.is_open() == false.
This is my code:
std::string ReadFile(const std::string &path) {
std::string newLine, fileContent;
std::ifstream in(path.c_str(), std::ios::in);
if (!in.is_open()) {
return std::string("isn't opened");
}
while (in.good()) {
getline(in, newLine);
fileContent += newLine;
}
in.close();
return fileContent;
}
int main() {
std::string path = "C:\\test\\документ.txt";
std::string content = ReadFile(path);
std::cout << content << std::endl;
return 0;
}
The specified file exists.
I tried to find a solution on Google, but found nothing.
Here are the links I looked at:
I don't need wstring
The same as previous
no answer here
is not about C++
has no answer too
P.S. I need to get the file's content in a string, not in a wstring.
These are the encoding settings of my IDE (CLion 2017.1).
You'll need an up-to-date compiler or Boost. std::filesystem::path can handle these names, but it's new in the C++17 standard. Your compiler may still have it as std::experimental::filesystem::path, or else you'd use the third-party boost::filesystem::path. The interfaces are pretty comparable as the Boost version served as the inspiration.
std::string is defined as std::basic_string<char>, so your Cyrillic characters are not stored as intended. At the least, try using std::wstring to store your file path; you can still read the file's contents into a std::string.
First of all, set your project settings to use UTF-8 encoding instead of windows-1251. Until the standard library gets really good (not any time soon), you basically cannot rely on it if you want to deal with I/O properly. To make an input stream read from files on Windows, you need to write your own custom input stream buffer that opens files using 2-byte wide chars, or rely on a third-party implementation of such routines. Here is an incomplete (but sufficient for your example) implementation:
// assuming that usual Windows SDK macros such as _UNICODE, WIN32_LEAN_AND_MEAN are defined above
#include <Windows.h>
#include <string>
#include <iostream>
#include <system_error>
#include <memory>
#include <utility>
#include <cstdlib>
#include <cstdio>
static_assert(2 == sizeof(wchar_t), "wchar_t size must be 2 bytes");
using namespace ::std;
class MyStreamBuf final: public streambuf
{
#pragma region Fields
private: ::HANDLE const m_file_handle;
private: char m_buffer; // typically buffer should be much bigger
#pragma endregion
public: explicit
MyStreamBuf(wchar_t const * psz_file_path)
: m_file_handle(::CreateFileW(psz_file_path, FILE_GENERIC_READ, FILE_SHARE_READ, nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL))
, m_buffer{}
{
if(INVALID_HANDLE_VALUE == m_file_handle)
{
auto const error_code{::GetLastError()};
throw(system_error(static_cast< int >(error_code), system_category(), "::CreateFileW call failed"));
}
}
public:
~MyStreamBuf(void)
{
auto const closed{::CloseHandle(m_file_handle)};
if(FALSE == closed)
{
auto const error_code{::GetLastError()};
//throw(::std::system_error(static_cast< int >(error_code), system_category(), "::CloseHandle call failed"));
// throwing in destructor is kinda wrong
// but if CloseHandle returned false then our program is in inconsistent state
// and must be terminated anyway
(void) error_code; // not used
abort();
}
}
private: auto
underflow(void) -> int_type override
{
::DWORD bytes_count_to_read{1};
::DWORD read_bytes_count{};
{
auto const succeeded{::ReadFile(m_file_handle, addressof(m_buffer), bytes_count_to_read, addressof(read_bytes_count), nullptr)};
if(FALSE == succeeded)
{
auto const error_code{::GetLastError()};
setg(nullptr, nullptr, nullptr);
throw(system_error(static_cast< int >(error_code), system_category(), "::ReadFile call failed"));
}
}
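if(0 == read_bytes_count)
{
setg(nullptr, nullptr, nullptr);
return(traits_type::eof());
}
setg(addressof(m_buffer), addressof(m_buffer), addressof(m_buffer) + 1);
// convert through to_int_type so that a byte such as 0xFF is not mistaken for EOF
return(traits_type::to_int_type(m_buffer));
}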
};
string
MyReadFile(wchar_t const * psz_file_path)
{
MyStreamBuf buffer(psz_file_path); // owned here, so it is destroyed (and the handle closed) on return
istream in(addressof(buffer)); // note that we create normal stream
string new_line;
string file_content;
while(in.good())
{
getline(in, new_line);
file_content += new_line;
}
return(::std::move(file_content));
}
int
main(void)
{
string content = MyReadFile(L"C:\\test\\документ.txt"); // note that path is a wide string
cout << content << endl;
return 0;
}
Change your code to use wstring and save your file using a Unicode encoding (a non-UTF-8 one: UCS-2, UTF-16 or something like that). MSVC has a non-standard overload for exactly this reason, so that it can handle non-ASCII characters in filenames:
std::string ReadFile(const std::wstring &path)
{
std::string newLine, fileContent;
std::ifstream in(path.c_str(), std::ios::in);
if (!in)
return std::string("isn't opened");
while (getline(in, newLine))
fileContent += newLine;
return fileContent;
}
int main()
{
std::wstring path = L"C:\\test\\документ.txt";
std::string content = ReadFile(path);
std::cout << content << std::endl;
}
Also, note corrected ReadFile code.
char buffer[1001];
for(;!gzeof(m_fHandle);){
gzread(m_fHandle, buffer, 1000);
The file I'm handling is more than 1GB.
Do I load the entire file into the buffer, or should I malloc and allocate the size?
Or should I load it line by line? The file has a "\n" demarcating the EOL. If so, how do I do that when handling a gz file in C++?
The zlib approach would be:
You can just call gzread repeatedly with a limited buffer size. If you can be sure that the maximum line length is at most, e.g., BUFLEN:
#include <zlib.h>
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <string>
static const unsigned BUFLEN = 1024;
void error(const char* const msg)
{
std::cerr << msg << "\n";
exit(255);
}
void process(gzFile in)
{
char buf[BUFLEN];
char* offset = buf;
for (;;) {
int err, len = sizeof(buf)-(offset-buf);
if (len == 0) error("Buffer too small for input line lengths");
len = gzread(in, offset, len);
if (len == 0) break;
if (len < 0) error(gzerror(in, &err));
char* cur = buf;
char* end = offset+len;
for (char* eol; (cur<end) && (eol = std::find(cur, end, '\n')) < end; cur = eol + 1)
{
std::cout << std::string(cur, eol) << "\n";
}
// any trailing data in [eol, end) now is a partial line
offset = std::copy(cur, end, buf);
}
// BIG CATCH: don't forget about trailing data without eol :)
std::cout << std::string(buf, offset);
if (gzclose(in) != Z_OK) error("failed gzclose");
}
int main()
{
process(gzopen("test.gz", "rb"));
}
If you cannot know the maximum line size, I'd suggest abstracting it a bit more and deriving from std::basic_streambuf overriding underflow so you can use std::getline with an istream based on this buffer.
UPDATE Since you're new to C++, implementing your own streambuf is likely not a good idea. I recommend using a c++ library (instead of zlib).
E.g. Boost Iostream allows you to simply do this:
#include <boost/iostreams/device/file.hpp>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
namespace io = boost::iostreams;
int main()
{
io::filtering_istream in;
in.push(io::gzip_decompressor());
in.push(io::file_source("my_file.txt", std::ios_base::in | std::ios_base::binary)); // gzip data is binary
// read from in using std::istream interface
std::string line;
while (std::getline(in, line, '\n'))
{
process(line); // your code :)
}
}
You say this is a gzfile. That implies a binary format where '\n' is not valid for EOL (there is no concept of EOL with binary files.)
That said, in practice you have a couple choices for buffer size. Loading the entire file into memory will certainly be easier for you as a developer to work with the data. However, this is a costly solution in terms of memory consumed for the task.
If memory is a concern then you need to work on the data in pieces. There is probably an optimal amount of data to try to fetch at a time and a lot of that will depend on the hardware architecture of the machine you have all the way from the CPU through cache lines, memory bus, SATA bus, and even the drives that hold the file itself.
If this is just a onesy-twosy kind of problem you're solving and you're running this on a modern computer, 1GB is probably ok to keep in memory. Just new a uint8_t[] the size of the file and read the whole thing in then parse the data.
Otherwise, you need to integrate your parsing of the file with the reading of the file.
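One way to integrate parsing with reading, sketched under assumptions: the class below (LineSplitter is a made-up name) accepts chunks of any size, for instance whatever each gzread call from the question returns, and hands back the lines completed so far while carrying any partial line over to the next chunk. It does not call zlib itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: feed it chunks of any size, get back the complete lines.
class LineSplitter
{
public:
    // Appends a chunk and returns every line completed by it (without the '\n').
    std::vector<std::string> feed(const char* data, std::size_t size)
    {
        std::vector<std::string> lines;
        pending_.append(data, size);
        std::size_t start = 0;
        for (std::size_t eol;
             (eol = pending_.find('\n', start)) != std::string::npos;
             start = eol + 1)
            lines.push_back(pending_.substr(start, eol - start));
        pending_.erase(0, start);   // keep only the unfinished tail
        return lines;
    }

    // Returns whatever is left after the last chunk: a final line with no '\n'.
    std::string finish()
    {
        std::string tail;
        tail.swap(pending_);
        return tail;
    }

private:
    std::string pending_;
};
```

Memory use stays bounded by the chunk size plus the longest line, regardless of how large the file is.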
I'm trying to join two big files in C++ (like the UNIX cat command: cat file1 file2 > final).
I don't know how to do it, because every method I try is very slow (for example, copying the second file into the first one line by line).
What is the best method to do that?
Sorry for being so brief; my English is not too good.
If you're using std::fstream, then don't. It's intended primarily for formatted input/output, and char-level operations for it are slower than you'd expect. Instead, use std::filebuf directly. This is in addition to suggestions in other answers, specifically, using the larger buffer size.
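A sketch of what using std::filebuf directly could look like. The function name append_file and the 64 KB block size are assumptions; whether this actually beats the stream-based versions depends on the implementation, so measure before relying on it.

```cpp
#include <fstream>
#include <ios>
#include <vector>

// Hypothetical helper: append the bytes of src to the end of dst using std::filebuf only.
bool append_file(const char* dst, const char* src)
{
    std::filebuf in, out;
    if (!in.open(src, std::ios_base::in | std::ios_base::binary) ||
        !out.open(dst, std::ios_base::out | std::ios_base::app | std::ios_base::binary))
        return false;

    std::vector<char> block(64 * 1024);   // block size is a guess, tune it for your system
    for (;;) {
        const std::streamsize got =
            in.sgetn(block.data(), static_cast<std::streamsize>(block.size()));
        if (got <= 0)
            break;                                   // end of input
        if (out.sputn(block.data(), got) != got)
            return false;                            // short write
    }
    return out.close() != nullptr;                   // close() flushes; null means failure
}
```

Calling it once per input file, in order, on the same destination gives the effect of cat file1 file2 > final, assuming the destination starts out empty.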
Use binary-mode in the standard streams to do the job, don't deal with it as formatted data.
This is a demo if you want to transfer the data in blocks:
#include <fstream>
#include <vector>
std::size_t fileSize(std::ifstream& file)
{
std::size_t size;
file.seekg(0, std::ios::end);
size = file.tellg();
file.seekg(0, std::ios::beg);
return size;
}
int main()
{
// 1MB! choose a convenient buffer size.
const std::size_t blockSize = 1024 * 1024;
std::vector<char> data(blockSize);
std::ifstream first("first.txt", std::ios::binary),
second("second.txt", std::ios::binary);
std::ofstream result("result.txt", std::ios::binary);
std::size_t firstSize = fileSize(first);
std::size_t secondSize = fileSize(second);
for(std::size_t block = 0; block < firstSize/blockSize; block++)
{
first.read(&data[0], blockSize);
result.write(&data[0], blockSize);
}
std::size_t firstFilerestOfData = firstSize%blockSize;
if(firstFilerestOfData != 0)
{
first.read(&data[0], firstFilerestOfData);
result.write(&data[0], firstFilerestOfData);
}
for(std::size_t block = 0; block < secondSize/blockSize; block++)
{
second.read(&data[0], blockSize);
result.write(&data[0], blockSize);
}
std::size_t secondFilerestOfData = secondSize%blockSize;
if(secondFilerestOfData != 0)
{
second.read(&data[0], secondFilerestOfData);
result.write(&data[0], secondFilerestOfData);
}
first.close();
second.close();
result.close();
return 0;
}
Using plain old C++:
#include <fstream>
std::ifstream file1("x", std::ios_base::in | std::ios_base::binary);
std::ofstream file2("y", std::ios_base::app | std::ios_base::binary);
file2 << file1.rdbuf();
The Boost headers claim that copy() is optimized in some cases, though I'm not sure if this counts:
#include <boost/iostreams/copy.hpp>
// The following four overloads of copy_impl() optimize
// copying in the case that one or both of the two devices
// models Direct (see
// http://www.boost.org/libs/iostreams/doc/index.html?path=4.1.1.4)
boost::iostreams::copy(file1, file2);
update:
The Boost copy function is compatible with a wide variety of types, so this can be combined with Pavel Minaev's suggestion of using std::filebuf like so:
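std::vector<char> buf1(64 * 1024), buf2(64 * 1024); // needs <vector>
std::filebuf file1, file2;
// setbuf is protected; pubsetbuf is the public entry point. Call it before open(),
// because its effect on an already-open file is implementation-defined.
file1.pubsetbuf(buf1.data(), buf1.size());
file2.pubsetbuf(buf2.data(), buf2.size());
file1.open("x", std::ios_base::in | std::ios_base::binary);
file2.open("y", std::ios_base::app | std::ios_base::binary);
boost::iostreams::copy(file1, file2);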
Of course the actual optimal buffer size depends on many variables, 64k is just a wild guess.
As an alternative, which may or may not be faster depending on your file size and the memory on the machine, you can read the whole second file into memory and write it out in one go. If memory is tight, you can make the buffer smaller and loop over f2.read, grabbing the data in chunks and writing each one to f1.
#include <fstream>
#include <iostream>
using namespace std;
int main(int argc, char *argv[])
{
ofstream f1("test.txt", ios_base::app | ios_base::binary);
ifstream f2("test2.txt", ios_base::binary); // binary, so tellg() matches the bytes read
f2.seekg(0,ifstream::end);
unsigned long size = f2.tellg();
f2.seekg(0);
char *contents = new char[size];
f2.read(contents, size);
f1.write(contents, size);
delete[] contents;
f1.close();
f2.close();
return 0;
}