How to read data from a file from within a function - c++

I want to make my code more efficient, specifically the reading of data from a text file. Here is a snapshot of what it looks like now:
values V(name);
V.population = read_value(find_line_number(name, find_in_map(pop, mapping)));
V.net_growth = read_value(find_line_number(name, find_in_map(ngr, mapping)));
... // and so on
Basically, the read_value function creates an ifstream object, opens the file, reads one line of data, and closes the file connection. This happens many times. What I want to do is to open the file once, read every line that is needed into the struct, and then close the file connection.
Here is the function that creates the values struct, with its parameters:
static values create_struct(std::string name, std::map<std::string, int> mapping) {
values V(name);
V.population = read_value(find_line_number(name, find_in_map(pop, mapping)), file);
V.net_growth = read_value(find_line_number(name, find_in_map(ngr, mapping)), file);
// more values here
return V;
}
The function that calls create_struct is shown below:
void initialize_data(string name) {
// read the appropriate data from file into a struct
value_container = Utility::create_struct(name, this->mapping);
}
I am thinking of instead defining the ifstream object in the function initialize_data. Given what is shown about my program, would that be the best place to create the file object, open the connection, read the values, and then close the connection? Also, would I need to pass the ifstream object into create_struct, and if so, by value, by reference, or by pointer?

The short answer is to create your ifstream object first and pass it by reference to your parser. Remember to seek the stream back to the beginning before you leave your function, or when you start to read.
The RAII thing to do would be to create a wrapper object that automatically does this when it goes out of scope.
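class ifStreamRef {
public:
    ifStreamRef(std::ifstream& _in) : mStream(_in) {}
    // clear() first: seekg fails while failbit is still set from a read past the end
    ~ifStreamRef() { mStream.clear(); mStream.seekg(0); }
    std::ifstream& mStream;
};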
Then you create a wrapper instance when entering a method that will read the fstream.
void read_value(std::ifstream& input, ...){
ifStreamRef autoRewind(input);
}
Or, since the Ctor can do the conversion...
void read_value(ifStreamRef streamRef, ...) {
    streamRef.mStream.getline(...);
}
std::ifstream itself follows RAII, so it will close() the stream for you when your stream goes out of scope.
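As a minimal sketch of the short answer, assuming a file that holds one numeric value per line: the caller opens the file once, passes the stream by reference, and lets the std::ifstream destructor close it. The names Values, read_line_at, read_values and load_values, and the line-number parameters, are hypothetical stand-ins for the asker's values, read_value and find_line_number.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>  // std::istringstream, for trying this without a file
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the asker's struct.
struct Values {
    double population = 0.0;
    double net_growth = 0.0;
};

// Rewinds the stream, then returns the value on the given 0-based line.
double read_line_at(std::istream& in, std::size_t line_number) {
    in.clear();   // clear eof/fail bits left by an earlier read
    in.seekg(0);  // start again from the beginning
    std::string line;
    for (std::size_t i = 0; i <= line_number; ++i) {
        if (!std::getline(in, line)) {
            throw std::out_of_range("line number past end of input");
        }
    }
    return std::stod(line);
}

// Takes the stream by reference: streams cannot be copied.
Values read_values(std::istream& in, std::size_t pop_line, std::size_t ngr_line) {
    Values v;
    v.population = read_line_at(in, pop_line);
    v.net_growth = read_line_at(in, ngr_line);
    return v;
}

// Opens the file once; the ifstream destructor closes it on return or on an exception.
Values load_values(const std::string& path, std::size_t pop_line, std::size_t ngr_line) {
    std::ifstream file(path);
    if (!file) {
        throw std::runtime_error("failed to open " + path);
    }
    return read_values(file, pop_line, ngr_line);
}
```

Because read_values takes a std::istream&, the same code can be exercised with a std::istringstream instead of a real file.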
The long answer is that you should read up on dependency injection. Don't create dependencies inside of objects/functions that can be shared. There are lots of videos and documents on dependency injection and dependency inversion.
Basically, construct the objects that your objects depend on and pass them in as parameters.
The injection now relies on the interface of the objects that you pass in. So if you change your ifStreamRef class to act as an interface:
class ifStreamRef {
public:
    ifStreamRef(std::ifstream& _in) : mStream(_in) {}
    ~ifStreamRef() { mStream.clear(); mStream.seekg(0); }
    // Returns the next line, or "" on error or end of file.
    std::string getLine() {
        std::string line;
        if (!std::getline(mStream, line)) return "";
        return line;
    }
    bool eof() { return mStream.eof(); }
    std::ifstream& mStream;
};
Then later on you can change the internal implementation to take a std::vector<std::string>& instead of an ifstream, without changing the code that calls getLine() and eof():
class ifStreamRef {
public:
    ifStreamRef(std::vector<std::string>& _in) : mStream(_in), mCursor(0) {}
    ~ifStreamRef() {}
    // Returns the next stored line, or "" once all lines are consumed.
    std::string getLine() {
        return eof() ? std::string() : mStream[mCursor++];
    }
    bool eof() { return mCursor >= mStream.size(); }
    std::vector<std::string>& mStream;
    size_t mCursor;
};
I have oversimplified a few things.
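One way to make "act as an interface" concrete is an abstract base class with two implementations. This is only a sketch under my own assumptions: LineSource, StreamLineSource, VectorLineSource and count_lines are names invented here, and a single next() call replaces the getLine()/eof() pair so that an empty line cannot be mistaken for an error.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, for trying this without a file
#include <string>
#include <vector>

// Hypothetical interface: callers depend on this, not on std::ifstream.
class LineSource {
public:
    virtual ~LineSource() = default;
    // Stores the next line in `line`; returns false when no line is left.
    virtual bool next(std::string& line) = 0;
};

// Reads lines from any input stream (std::ifstream, std::istringstream, ...).
class StreamLineSource : public LineSource {
public:
    explicit StreamLineSource(std::istream& in) : in_(in) {}
    bool next(std::string& line) override {
        return static_cast<bool>(std::getline(in_, line));
    }
private:
    std::istream& in_;
};

// Serves lines from memory, for example in a unit test.
class VectorLineSource : public LineSource {
public:
    explicit VectorLineSource(const std::vector<std::string>& lines) : lines_(lines) {}
    bool next(std::string& line) override {
        if (cursor_ >= lines_.size()) return false;
        line = lines_[cursor_++];
        return true;
    }
private:
    const std::vector<std::string>& lines_;
    std::size_t cursor_ = 0;
};

// Example consumer: works with either implementation.
std::size_t count_lines(LineSource& source) {
    std::size_t n = 0;
    for (std::string line; source.next(line);) ++n;
    return n;
}
```

The parser then takes a LineSource&, and the decision between file and in-memory data is made by whoever constructs it.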

Related

Path to file or stream?

I was discussing with a co-worker and I thought this would be a good question to put here on SO.
When designing an API, when should your functions accept file paths and when should they accept streams? Are there any guidelines?
void do_something(const std::filesystem::path &file_path);
void do_something(std::istream &stream);
path:
callee is responsible for checking that the file exists and is accessible.
difficult to unit test: you need a file on disk to test it.
stream:
caller is responsible for checking that the file exists and is accessible, which means more repetitive boilerplate code.
easier to unit test: you can just pass a stream object.
I guess one could add a function to the library to "help" open the file, something of the sorts:
std::ifstream open_input(const std::filesystem::path &file)
{
std::ifstream stream(file);
if (not stream) {
throw std::invalid_argument("failed to open file: " + file.string());
}
return stream;
}
You stated yourself that you could add the "helper" function in order to preserve the istream interface. This is also the better solution in terms of testability and adheres to the single responsibility principle (SRP).
Your helper function has one responsibility (create stream from file) and your actual function another (it "does something" :)).
I'd add that it depends on the context of what do_something actually does. For example, if it is a facade for different ways of accessing an underlying functionality, then it would make sense for that interface to take the actual path. You would still have a separate helper function and a do_something function, both used from the facade.
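A rough sketch of that split, assuming (purely for illustration) that the real work is counting lines, and using std::string instead of std::filesystem::path so that it also builds before C++17:

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>  // std::istringstream, for trying this without a file
#include <stdexcept>
#include <string>

// Core logic, written against std::istream.
// Counting lines is only a placeholder for whatever do_something really does.
std::size_t do_something(std::istream& stream) {
    std::size_t lines = 0;
    for (std::string line; std::getline(stream, line);) ++lines;
    return lines;
}

// Thin convenience overload: its only job is to open the file and report failure.
std::size_t do_something(const std::string& file_path) {
    std::ifstream stream(file_path);
    if (!stream) {
        throw std::invalid_argument("failed to open file: " + file_path);
    }
    return do_something(stream);
}
```

Unit tests target the stream overload with a std::istringstream; only the thin path overload needs a real file.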
You may have your cake and eat it:
#include <fstream>
#include <sstream>
#include <type_traits>
#include <utility>
//
// some boilerplate to allow use of a polymorphic temporary
template<class Stream, std::enable_if_t<std::is_base_of<std::istream, Stream>::value> * = nullptr>
struct stream_holder
{
stream_holder(Stream stream) : stream_(std::move(stream)) {}
operator std::istream&() && { return stream_; }
operator std::istream&() & { return stream_; }
private:
Stream stream_;
};
// helper function
template<class Stream, std::enable_if_t<std::is_base_of<std::istream, Stream>::value> * = nullptr>
auto with_this(Stream&& stream)
{
return stream_holder<std::decay_t<Stream>>(std::forward<Stream>(stream));
}
// express logic in terms of stream
void do_something(std::istream& stream_ref);
// utility functions to create various types of stream
std::ifstream file_stream();
std::stringstream string_stream();
int main()
{
// * composability with succinct syntax
// * lifetime automatically managed
// * no repetitive boilerplate
do_something(with_this(file_stream()));
do_something(with_this(string_stream()));
}

Expressing "read-only, no modification of position" for std::ifstream

In my code, I want to identify some properties about the contents of a file, before deciding how to read the file. (That is, I search for a keyword, if found, it's going to be read with foo(std::ifstream&), else with bar(std::ifstream&)).
I implemented the method that searches for the keyword as
bool containsKeyword(std::ifstream& file, const char* keyword)
{
for ( std::string line; std::getline(file, line); )
{
if ( line == keyword )
{
return true;
}
}
return false;
}
This modifies the position of the file stream (it ends up either at the end, if the keyword isn't found, or just after the keyword). However, I want the position to be reset after the search. This can be done with a ScopeGuard:
class FilePositionScopeGuard
{
private:
std::ifstream& file;
using FilePosition = decltype(std::declval<std::ifstream>().tellg());
FilePosition initial_position;
public:
FilePositionScopeGuard(std::ifstream& file_)
:
file(file_),
initial_position(file.tellg())
{
}
~FilePositionScopeGuard()
{
file.clear();
file.seekg(initial_position);
}
};
Now we add this to the method:
bool containsKeyword(std::ifstream& file, const char* keyword)
{
FilePositionScopeGuard guard(file);
for ( std::string line; std::getline(file, line); )
{
...
That's nice, because with exactly one additional line in the method, we get the behaviour of not modifying the std::ifstream no matter how the method is exited (one of the returns or an exception).
However, the method bool containsKeyword(std::ifstream&, const char*); does not express the constness. How can I adjust my method to express (at the level of the interface) that the method will not alter the current state?
You could change the signature to take a position-guarded file:
bool containsKeyword(const FilePositionScopeGuard &, const char *);
This allows the caller to pass an ifstream per the current signature (constructing a temporary guard for that operation), or to make their own guard and use it for several operations.
You'll need to make the ifstream member publicly accessible.
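A sketch of how that signature could look. It assumes the guard is generalised to std::istream (the question uses std::ifstream) and is renamed PositionGuard here to keep it apart from the original class. The constructor is deliberately not explicit, so a caller can still pass a plain stream.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, for trying this without a file
#include <string>

// Restores the read position of the stream when the guard is destroyed.
class PositionGuard {
public:
    PositionGuard(std::istream& s) : stream(s), initial_position(s.tellg()) {}
    ~PositionGuard() {
        stream.clear();                  // drop eof/fail so that seekg works
        stream.seekg(initial_position);  // restore the read position
    }

    std::istream& stream;  // public, so functions taking a guard can read from it

private:
    std::istream::pos_type initial_position;
};

// The parameter type documents that the position is restored afterwards.
bool containsKeyword(const PositionGuard& guarded, const char* keyword) {
    for (std::string line; std::getline(guarded.stream, line);) {
        if (line == keyword) return true;
    }
    return false;
}
```

Calling containsKeyword(file, "key") builds a temporary guard that rewinds at the end of the full expression. A caller who creates a PositionGuard explicitly can run several searches and have the position restored once, when that guard goes out of scope.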
Do it with a text comment: // the method does read from file but resets the read pointer.
Do not expect a user of the API to be a monkey at a keyboard. Specifically, don't mark the ifstream argument as const while casting the constness away inside the method. It does make a difference in a multithreaded program.

Read the name of an istream

I have something like this:
istream ifs("/path/to/my/file.ppm", ios::binary);
To check the file extension, I need to get the name of the file.
I'm using my own read function:
... readPPM(std::istream& is) {}
Is it possible to get /path/to/my/file.ppm as a string from the istream& variable?
You almost certainly actually used
std::ifstream ifs(...);
// ^
However, even so, the stream doesn't retain the name used to open it: there is rarely a need for doing so, and it would be a wasted resource for most applications. That is, if you need the name later, you'll need to retain it yourself. Also, not all streams have a name. For example, an std::istringstream doesn't have a name.
If you can't pass the stream's name separate from the stream, you can attach the name, e.g., using the pword() member:
int name_index() {
static int rc = std::ios_base::xalloc(); // get an index to be used for the name
return rc;
}
// ...
std::string name("/path/to/my/file.ppm");
std::ifstream ifs(name, std::ios::binary);
ifs.pword(name_index()) = const_cast<char*>(name.c_str());
// ...
char const* stream_name = static_cast<char*>(ifs.pword(name_index()));
The stream won't maintain the pointer in any shape or form, i.e., with the above setup the name needs to outlive the ifs object. If necessary the objects stored with pword() can be maintained using the various callbacks but doing so is non-trivial.
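If the name can travel alongside the stream, a small struct avoids pword() altogether. The following is a sketch under that assumption. NamedInput, open_named and extension_of are hypothetical names, not part of any library.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical bundle: the stream does not remember its name, so keep both together.
struct NamedInput {
    std::string name;
    std::ifstream stream;
};

// Returns the text after the last '.', or "" if the file name has no extension.
std::string extension_of(const std::string& path) {
    const std::size_t slash = path.find_last_of("/\\");
    const std::size_t dot = path.find_last_of('.');
    if (dot == std::string::npos) return "";
    if (slash != std::string::npos && dot < slash) return "";  // the dot is in a directory name
    return path.substr(dot + 1);
}

// Opens the file in binary mode and keeps its name next to the stream.
NamedInput open_named(const std::string& path) {
    NamedInput input;
    input.name = path;
    input.stream.open(path, std::ios::binary);
    return input;  // std::ifstream is movable, so the struct is too
}
```

A reader such as readPPM could then take a NamedInput& and call extension_of(input.name) before reading from input.stream. Unlike the pword() approach, the string is owned by the struct, so there is no lifetime to track separately.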

C++ copy_if lambda capturing std::string

This is a follow up question from here: C++ - Developing own version of std::count_if?
I have the following function:
// vector for storing the file names that contains sound
std::vector<std::string> FilesContainingSound;
void ContainsSound(const std::unique_ptr<Signal>& s)
{
// Open the Wav file
Wav waveFile = Wav("Samples/" + s->filename_);
// Copy the signal that contains the sufficient energy
std::copy_if(waveFile.Signal().begin(), waveFile.Signal().end(),
FilesContainingSound.begin(), [] (const Signal& s) {
// If the energy bin > threshold then store the
// file name inside FilesContaining
});
}
But to me, I only need to capture the string "filename" inside of the lambda expression, because I'll only be working with this. I just need access to the waveFile.Signal() in order to do the analysis.
Anyone have any suggestions?
EDIT:
std::vector<std::string> FilesContainingSound;
std::copy_if(w.Signal().begin(), w.Signal().end(),
FilesContainingSound.begin(), [&] (const std::unique_ptr<Signal>& file) {
// If the energy bin > threshold then store the
// file name inside FilesContaining
});
You seem to be getting different levels of abstraction confused here. If you're going to work with file names, then you basically want something on this order:
std::vector<std::string> input_files;
std::vector<std::string> files_that_contain_sound;
bool file_contains_sound(std::string const &filename) {
Wav waveFile = Wav("Samples/" + filename);
return binned_energy_greater(waveFile, threshold);
}
std::copy_if(input_files.begin(), input_files.end(),
std::back_inserter(files_that_contain_sound),
file_contains_sound);
For the moment I've put the file_contains_sound in a separate function simply to make its type clear -- since you're dealing with file names, it must take a file name as a string, and return a bool indicating whether that file name is one of the group you want in your result set.
In reality, you almost never really want to implement that as an actual function though--you usually want it to be an object of some class that overloads operator() (and a lambda is an easy way to generate a class like that). The type involved must remain the same though: it still needs to take a file name (string) as a parameter, and return a bool to indicate whether that file name is one you want in your result set. Everything dealing with what's inside the file will happen inside of that function (or something it calls).
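As a sketch of the lambda form: the predicate still takes a file name and returns a bool, and anything else it needs is captured. Here energy_of is a hypothetical callback standing in for "open the Wav file and measure its energy", since the Wav class is not shown, and files_with_sound is a name made up for this example.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

// Returns the names of the files whose energy exceeds the threshold.
std::vector<std::string> files_with_sound(
    const std::vector<std::string>& input_files,
    double threshold,
    const std::function<double(const std::string&)>& energy_of) {
    std::vector<std::string> result;
    std::copy_if(input_files.begin(), input_files.end(),
                 std::back_inserter(result),  // grows the vector as names are copied
                 // Takes a file name, returns bool; threshold and energy_of are captured.
                 [threshold, &energy_of](const std::string& filename) {
                     return energy_of(filename) > threshold;
                 });
    return result;
}
```

Note the std::back_inserter: writing through begin() of an empty vector, as in the question, would be undefined behaviour.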

Using same variable in two functions

I have two functions, read() and write(). I read a file in the read() function and store a line from the header in a variable. Now I want the write() function to write that same line to a new file. But how can I use the same variable or information from the other function? What is the way to do this?
Here is some info about the code:
After including necessary files, it says this
HX_INIT_CLASS(HxCluster,HxVertexSet);
The name of the class is HxCluster, and it would be great if someone could tell me why it is not defined in the simple way: class class_name {};
Then I have many functions, two of which are read() and write(). They both take only one argument, which is the file to be read or the file to be written to, respectively. I don't know if posting the code for them will help here.
If I understood you well, this is just what in C++ the structures/classes/objects are for. For example:
class FileLineWriter
{
public:
FileLineWriter();
void read(istream& inputfile);
void write(ostream& outputfile);
private:
string line_of_text;
};
void FileLineWriter::read(istream& s)
{
// s >> this->line_of_text; // possible, but probably will not do what you think
getline(s, this->line_of_text);
}
void FileLineWriter::write(ostream& s)
{
s << this->line_of_text;
}
...
FileLineWriter writer;
writer.read(firstfile);
writer.write(secondfile);
Note that the above is NOT working code. It is just a sample. You will have to fix any typos, add the missing namespaces and headers, add stream opening/closing/error handling, etc.
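For reference, one way the sample could be completed is sketched below. It is not the answerer's final code: the bool return value and the trailing newline are choices made here, and opening the actual files is left to the caller.

```cpp
#include <istream>
#include <ostream>
#include <sstream>  // std::istringstream / std::ostringstream, for trying this without files
#include <string>

class FileLineWriter {
public:
    // Reads the first line of the input and remembers it.
    // Returns false if nothing could be read.
    bool read(std::istream& input) {
        return static_cast<bool>(std::getline(input, line_of_text));
    }

    // Writes the remembered line, followed by a newline.
    void write(std::ostream& output) const {
        output << line_of_text << '\n';
    }

private:
    std::string line_of_text;  // shared state between read() and write()
};
```

With real files, the caller would construct a std::ifstream and a std::ofstream, check that both opened, and pass them to read() and write() in turn.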
You return the variable from read and pass it as a parameter to write. Something like this
std::string read()
{
std::string header = ...
return header;
}
void write(std::string header)
{
...
}
std::string header = read();
write(header);
Passing information between functions is a basic C++ skill to learn.
If I have understood this right then I would suggest that you save the info on the variable to a string or an int depending on what kind of info it is.
I would also recommend to always include some code for us to be able to give you some more help
You can either make write take an argument, void write(std::string text), or store the string you read in a global variable, std::string text, at the top of your .cpp file: assign text = ... in your read function (replace ... with ifstream or whatever you use) and then write text in your write function.
Sure,
Use pointers!
int main(){
char* line = static_cast<char*>(malloc(100*sizeof(char))); // C++ needs the cast
read_function (line);
write_function (line);
free(line); // release the buffer when done
}
void read_function(char* line){
.... read a line
strcpy (line, the_line_you_read_from_file);
}
void write_function (char* line){
fprintf (fp,"%s", line);
}