I have a visitor pattern implemented and it seems to be working fine but I don't see how to do some housekeeping work at the very start and the very end.
There is no guarantee of when the various overloaded visit() methods will be called, so I can't tell which call is the first and which is the last.
Basically I'm using the visitor to save/load settings to/from disk. The problem is that (on loading) I need to clear some stuff out before I do any of the other loading steps. I did put in a static variable and method to initialize things and do this clearing, which should ensure that it happens only once at the very start, but a person could load things multiple times. So at the end of the reading I'd like to reset the static variable (so they can read in again without the old junk still being there). I can't simply put the reset into the destructor (or a method called by the destructor) because the concrete visitor objects are created/destroyed n times, once for each grouping of settings.
I guess I need to yoke it with another pattern, but I am not seeing how.
Following up on my comment above.
You could have a class
class VisitorState {
public:
    VisitorState() {
        // stuff to be done on loading
    }
    ~VisitorState() {
        // stuff to be done when done
    }
private:
    // state info you might want to keep around
};
and then modify your Visitor interface to have methods that include the VisitorState
someReturn visit(VisitorState &state,....)
The VisitorState must be allocated (new'ed) when the file is requested to be loaded and kept around, associated with the file being visited... It must be deallocated (delete'd) when the processing of the file ends.
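A minimal sketch of how that can fit together, here with the VisitorState as a stack object whose lifetime spans one whole load operation (equivalent to the new/delete pairing described above); SettingsVisitor, loadSettings and the integer "group" are just placeholders:
class VisitorState {
public:
    VisitorState()  { /* one-time setup before any visit() runs */ }
    ~VisitorState() { /* one-time teardown after the last visit() */ }
};

// Hypothetical visitor whose visit() calls all receive the shared state.
class SettingsVisitor {
public:
    void visit(VisitorState& state, int group) {  // "group" stands in for a settings group
        (void)state; (void)group;                 // per-group load logic goes here
    }
};

void loadSettings() {
    VisitorState state;                       // constructor clears out the old data once
    SettingsVisitor visitor;
    for (int group = 0; group < 3; ++group)   // stand-in for "visit each settings group"
        visitor.visit(state, group);
}                                             // destructor runs exactly once, after the last visit()
This way repeated loads each get a fresh VisitorState, so there is no static flag to reset.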
Related
I'm creating a file format where I'd like to write an explicit message into the file indicating that the writer ran to completion. I've had problems in the past with generating files where the generating program crashed and the file was truncated without me realizing, since without an explicit marker there's no way for reading programs to detect a file is incomplete.
So I have a class that is used for writing these files. Now usually if you have an "open" operation and a "close" operation you want to use RAII, so I would put the code that writes the end-of-file marker in the destructor. This way the user can't forget. But in a situation where writing doesn't complete because an exception is thrown, the destructor will still be run -- in which case we don't want to write the message, so readers will know the file is incomplete.
This seems like something that could happen any time there's a "commit" sort of operation. You want RAII so you can't forget to commit, but you also don't want to commit when an exception occurs. The temptation here is to use std::uncaught_exceptions, but I think that's a code smell.
What's the usual solution to this? Just require that people remember? I'm concerned this will be a stumbling block every time someone tries to use my API.
One way to tackle this problem is to implement a simple framing system where you can define a header that is only filled in completely at the end of the write. Include a SHA256 hash to make the header useful for verifying the contents of the file. This is usually a lot more convenient than having to read bytes at the end of a file.
In terms of implementation you write out a header with some fields deliberately zeroed out, write the contents of the payload while feeding that data through your hashing method, and then seek back to the header and re-write that with the final values. The file starts out in an obviously invalid state and ends up valid only if everything ran to completion.
You could wrap up all of this in a stream handle that handles the implementation details so as far as the calling code is concerned it's just opening a regular file. Your reading version would throw an exception if the header is incomplete or invalid.
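A rough sketch of that write path, assuming a simple fixed-size header; the struct layout is only illustrative and the toy rolling checksum stands in for a real SHA-256 digest:
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical fixed-size header; the checksum field stands in for a real SHA-256 digest.
struct FileHeader {
    char     magic[4];      // e.g. "MYF1"
    uint64_t payloadSize;   // 0 until the write completes
    uint64_t checksum;      // 0 until the write completes
};

void writeFramedFile(const std::string& path, const std::vector<char>& payload) {
    std::ofstream out(path, std::ios::binary);

    FileHeader hdr{};                        // zeroed header => file is "invalid" if we crash
    std::memcpy(hdr.magic, "MYF1", 4);
    out.write(reinterpret_cast<const char*>(&hdr), sizeof hdr);

    uint64_t sum = 0;                        // toy rolling checksum, stand-in for SHA-256
    for (char c : payload) sum = sum * 131 + static_cast<unsigned char>(c);
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));

    hdr.payloadSize = payload.size();        // now fill in the real values...
    hdr.checksum    = sum;
    out.seekp(0);                            // ...and overwrite the header in place
    out.write(reinterpret_cast<const char*>(&hdr), sizeof hdr);
}
A reader can then reject any file whose header still has the zeroed fields or whose checksum doesn't match.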
For your example, it seems like RAII would work fine if you add a commit method which the user of your class calls when they are done writing to a file.
#include <string>

class MyFileFormat {
public:
    MyFileFormat() : committed_(false) {}
    ~MyFileFormat() {
        if (committed_) {
            // write the completion footer (I hope this doesn't throw!)
        }
        // close the underlying stream...
    }
    bool open(const std::string& path) {
        committed_ = false;
        // open the underlying stream...
        return true;
    }
    bool commit() {
        committed_ = true;
        return true;
    }
private:
    bool committed_;
};
The onus is on the user to call commit when they're done, but at least you can be sure that resources get closed.
For a more general pattern for cases like this, take a look at ScopeGuards.
ScopeGuards would move the responsibility for cleanup out of your class, and can be used to specify an arbitrary "cleanup" callback in the event that the ScopeGuard goes out of scope and is destroyed before being explicitly dismissed. In your case, you might extend the idea to support callbacks for both failure-cleanup (e.g. close file handles) and success-cleanup (e.g. write the completion footer and close file handles).
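A minimal sketch of such an extended guard, with separate success and failure callbacks; CommitGuard and the callback contents are made up for illustration:
#include <functional>
#include <utility>

// Minimal two-callback scope guard: the failure callback runs unless
// commit() was called, in which case the success callback runs instead.
class CommitGuard {
public:
    CommitGuard(std::function<void()> onSuccess, std::function<void()> onFailure)
        : onSuccess_(std::move(onSuccess)), onFailure_(std::move(onFailure)) {}

    ~CommitGuard() {
        if (committed_) onSuccess_();   // e.g. write the completion footer, then close
        else            onFailure_();   // e.g. just close (or delete the temp file)
    }

    void commit() { committed_ = true; }

    CommitGuard(const CommitGuard&) = delete;
    CommitGuard& operator=(const CommitGuard&) = delete;

private:
    std::function<void()> onSuccess_;
    std::function<void()> onFailure_;
    bool committed_ = false;
};

// Usage sketch: writeBody() is a placeholder for the real payload writing.
// void saveFile(MyFileFormat& f) {
//     CommitGuard guard([&]{ f.commit(); }, [&]{ /* leave file incomplete */ });
//     writeBody(f);
//     guard.commit();   // reached only if writeBody() didn't throw
// }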
I've handled situations like that by writing to a temporary file. Even if you're appending to a file, append to a temporary copy of the file.
In your destructor, you can check std::uncaught_exception() to decide whether your temporary file should be moved to its intended location.
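A small sketch of that destructor check; note that std::uncaught_exception() is deprecated since C++17, so this compares the std::uncaught_exceptions() count captured at construction instead. SafeWriter and the ".tmp" suffix are placeholders:
#include <cstdio>
#include <exception>
#include <string>

// Sketch of the temporary-file idea: write to "<path>.tmp" and only rename it
// into place if the destructor runs without a new in-flight exception.
class SafeWriter {
public:
    explicit SafeWriter(std::string path)
        : path_(std::move(path)),
          exceptionsAtStart_(std::uncaught_exceptions()) {}

    ~SafeWriter() {
        if (std::uncaught_exceptions() == exceptionsAtStart_) {
            // Completed normally: move the temp file into place.
            std::rename((path_ + ".tmp").c_str(), path_.c_str());
        }
        // Otherwise leave (or remove) the temp file; the real file is untouched.
    }

    std::string tempPath() const { return path_ + ".tmp"; }

private:
    std::string path_;
    int exceptionsAtStart_;
};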
My question is all about tips and tricks. I'm currently working on a project where I have one very big (~1 GB) file with data. First, I need to extract the data. This extraction takes 10 minutes. Then I do calculations, where each calculation depends on the previous one. Let's call them calculation1, calculation2 and so on. Assuming that I've done the extraction part right, I currently face two problems:
Every time I launch the program it runs for at least 10 minutes, which I cannot avoid, so I have to plan my debugging around it.
Every subsequent calculation takes more time.
Thinking about the first problem, I assumed that some sort of database might help, if a database is faster than reading the file, which I doubt.
The second problem might be overcome if I split my big program into smaller programs, each of which does: read file, do stuff, write file. Then each stage can always read the file from the previous one, which helps debugging. But it introduces a lot of wasted code for file I/O.
I think both problems could be solved by a strategy like this: write and test the extract module, then launch it and let it extract all the data into RAM. Then write calculation1 and launch it to somehow grab the data directly from the extract module's RAM, and so on with every subsequent calculation. So my questions are:
Are there tips and tricks to minimize loads from files?
Are there ways to share RAM and objects between programs?
By the way, I'm writing this task in Perl because I need it quickly, but I'll rewrite it in C++ or C# later, so any language-specific or language-agnostic answers are welcome.
Thank you!
[EDIT]
The data file does not change; it is like a big immutable source of knowledge. And it is not exactly 1 GB, and it does not take 10 minutes to read. I just wanted to say that the file is big and the time to read it is considerable. On my machine, reading and parsing the 1 GB file into the right objects takes about a minute, which is still pretty bad.
[/EDIT]
On my current system Perl copies the whole 1 GB file into memory in 2 seconds, so I believe your problem is not reading the file but parsing it.
So the straightforward solution I can think of is to preparse it by, for instance, converting your data into actual source code. I mean, you can prepare your data and hardcode it in your script directly (using another file, of course).
However, if reading is an actual problem (which I doubt), you can use a database that stores the data in memory. It will be faster anyway, simply because the database reads the data once at startup, and you don't restart your database as often as your program.
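Since you mention rewriting in C++ later, here is a minimal sketch of the same preparsing idea in C++ terms: parse the text once, dump the parsed records to a compact binary cache, and reload that cache on later runs. Record, saveCache and loadCache are made-up names, and the raw binary dump only works for flat, trivially copyable records:
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Placeholder for whatever a parsed record looks like.
struct Record {
    double value;
    std::int32_t id;
};

// Dump already-parsed records to a binary cache (done once, after the slow parse).
void saveCache(const std::string& path, const std::vector<Record>& records) {
    std::ofstream out(path, std::ios::binary);
    std::uint64_t n = records.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
    out.write(reinterpret_cast<const char*>(records.data()),
              static_cast<std::streamsize>(n * sizeof(Record)));
}

// Later runs load the cache instead of re-parsing the 1 GB text file.
std::vector<Record> loadCache(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::uint64_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof n);
    std::vector<Record> records(n);
    in.read(reinterpret_cast<char*>(records.data()),
            static_cast<std::streamsize>(n * sizeof(Record)));
    return records;
}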
The idea for solving this type of problem can be as follows:
Go for 3 programs:
Reader
Analyzer
Writer
and exchange data between them using shared memory.
For that big file I guess you have a considerable amount of data of one object type, which you can store in a circular buffer in shared memory (I recommend using boost::interprocess).
Reader will continuously read data from the input file and store it in shared memory.
In the meantime, once enough data has been read to start the calculations, the Analyzer will begin processing it and store the results in a second circular buffer in shared memory.
Once there are some results in the second shared memory buffer, the Writer will read them and store them in the final output file.
You need to make sure all the processes are synchronized properly so that they do their jobs simultaneously and you don't lose data (i.e. the data is not overwritten before it is processed or saved into the final file).
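As a rough illustration of the shared-memory exchange (not a complete solution; a real version would use a proper ring buffer plus condition variables for flow control), a sketch using boost::interprocess might look like this; the segment and object names are arbitrary:
#include <cstddef>
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>

namespace bip = boost::interprocess;

// Hypothetical fixed-capacity buffer living in shared memory.
struct SharedBuffer {
    bip::interprocess_mutex mutex;
    std::size_t count = 0;
    double data[1024];
};

// Reader process: create the segment and push parsed values into it.
void readerProcess() {
    bip::shared_memory_object::remove("DataSegment");
    bip::managed_shared_memory segment(bip::create_only, "DataSegment", 65536);
    SharedBuffer* buf = segment.construct<SharedBuffer>("Buffer")();

    for (int i = 0; i < 100; ++i) {                       // stand-in for "parse next record"
        bip::scoped_lock<bip::interprocess_mutex> lock(buf->mutex);
        if (buf->count < 1024) buf->data[buf->count++] = i;
    }
}

// Analyzer process: open the same segment and consume what is there.
void analyzerProcess() {
    bip::managed_shared_memory segment(bip::open_only, "DataSegment");
    SharedBuffer* buf = segment.find<SharedBuffer>("Buffer").first;

    bip::scoped_lock<bip::interprocess_mutex> lock(buf->mutex);
    // ... run calculation1 on buf->data[0 .. buf->count) ...
}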
I like the answer doqtor gives, but to prevent data from being overwritten, a nice helper class to enable and disable critical sections of code within a thread will do the trick.
// Note: "sealed" is specific to the Visual Studio compiler;
// standard C++11 "final" is equivalent and is used here.
// CRITICAL_SECTION is defined in Windows.h - if on another OS,
// look for a similar structure.
#include <Windows.h>

class BlockThread final {
private:
    CRITICAL_SECTION* m_pCriticalSection;
public:
    explicit BlockThread( CRITICAL_SECTION& criticalSection );
    ~BlockThread();
private:
    BlockThread( const BlockThread& c );              // not implemented
    BlockThread& operator=( const BlockThread& c );   // not implemented
};

BlockThread::BlockThread( CRITICAL_SECTION& criticalSection ) {
    m_pCriticalSection = &criticalSection;
    EnterCriticalSection( m_pCriticalSection );   // acquire the lock on construction
}

BlockThread::~BlockThread() {
    LeaveCriticalSection( m_pCriticalSection );   // release it when leaving scope
}
A class such as this allows you to block other threads while you are within the bounds of a critical section where shared memory is being used. If another thread currently owns the critical section, the calling thread is blocked until that thread finishes its work and its BlockThread goes out of scope.
Using this class from another class is fairly simple: in the .cpp file of the class in which you want to block a thread, create a static CRITICAL_SECTION variable and call the API function to initialize it. Then you can use the BlockThread class to lock that critical section.
SomeClass.cpp
#include "SomeClass.h"
#include "BlockThread.h"
static CRITICAL_SECTION s_criticalSection;
SomeClass::SomeClass {
// Do This First
InitializeCriticalSection( &s_criticalSection );
// Class Stuff Here
}
SomeClass::~SomeClass() {
// Class Stuff Here
// Do This Last
DeleteCriticalSection( &s_criticalSection );
}
// To Use The BlockThread
SomeClass::anyFunction() {
// When Your Condition Is Met & You Know This Is Critical
// Call This Before The Critical Computation Code.
BlockThread blockThread( s_criticalSection );
}
And that is about it: when the SomeClass object is destroyed, its destructor cleans up the static CRITICAL_SECTION, and when anyFunction() returns, the BlockThread goes out of scope and its destructor releases the critical section. Now the shared memory can be used safely. You would usually want to use this class when you are traversing containers to add, insert, find or access elements while that data is shared between threads.
As for the 3 different threads working in memory on the same data set, a good approach is to have 3 or 4 buffers, each about 4 MB in size, and have them work in a rotating order. Buff1 gets data, then Buff2 gets data; while Buff2 is being filled, Buff1 is either parsing the data it holds or passing it off to be stored for computation; then Buff1 waits until Buff3 or Buff4 is done, depending on how many buffers you have. Then the process starts again. This is the same principle used with sound buffers when streaming audio files, or when sending batches of triangles to a graphics card. In other words, it is a batch type of process.
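A toy sketch of that rotating-buffer principle, using standard std::mutex and std::condition_variable rather than the Windows CRITICAL_SECTION above; the buffer count, sizes and function names are arbitrary choices:
#include <array>
#include <condition_variable>
#include <mutex>
#include <vector>

// The reader fills buffers in order while the parser consumes the previously filled one.
struct BufferRing {
    static constexpr std::size_t kCount = 3;
    std::array<std::vector<char>, kCount> buffers;
    std::array<bool, kCount> full{};          // true once a buffer is ready to parse
    std::mutex mutex;
    std::condition_variable cv;
};

// Reader side: wait until slot i is free, fill it, mark it full.
void produce(BufferRing& ring, std::size_t i, std::vector<char> data) {
    std::unique_lock<std::mutex> lock(ring.mutex);
    ring.cv.wait(lock, [&] { return !ring.full[i]; });
    ring.buffers[i] = std::move(data);
    ring.full[i] = true;
    ring.cv.notify_all();
}

// Parser side: wait until slot i is full, take the data, mark it free again.
std::vector<char> consume(BufferRing& ring, std::size_t i) {
    std::unique_lock<std::mutex> lock(ring.mutex);
    ring.cv.wait(lock, [&] { return ring.full[i]; });
    std::vector<char> data = std::move(ring.buffers[i]);
    ring.full[i] = false;
    ring.cv.notify_all();
    return data;
}
Both sides simply advance their slot index with i = (i + 1) % BufferRing::kCount.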
I'm in a situation where I think that two implementations are correct, and I don't know which one to choose.
I have an application simulating card readers. It has a GUI where you choose which serial port and speed to use, and a play and a stop button.
I'm looking for the best implementation for reader construction.
I have a SimulatorCore class that lives as long as my application.
SimulatorCore instantiates the Reader class, and it will be possible to simulate multiple readers on multiple serial ports.
Two possibilities:
My Reader is a pointer (dynamic instantiation): I instantiate it when the play button is hit and delete it when the stop button is hit.
My Reader is an object (static instantiation): I instantiate it in the SimulatorCore constructor, then add Reader.init() and Reader.cleanup() methods to my Reader class and call these when play and stop are hit.
Personally I see the functional side, and I clearly want to use a pointer so that no reader is instantiated when no reader is being simulated.
Someone told me that I should use static instantiation (reason: for safety, and because "it's bad to use pointers when you have the choice not to use them").
I'm not familiar with them, but I think I could also use smart pointers.
Code samples: 1st solution:
class SimulatorCore
{
public:
    void play() { reader = new Reader(); }
    void stop() { delete reader; reader = nullptr; }
private:
    Reader *reader;
};
Code samples: 2nd solution:
class SimulatorCore
{
public:
    void play() { reader.init(); }
    void stop() { reader.cleanup(); }
private:
    Reader reader;
};
The code is untested; I've just written it for illustration.
What is the best solution? Why?
You can easily use shared_ptr/unique_ptr:
class SimulatorCore
{
public:
    void play() { _reader = std::make_shared<Reader>(); }
    void stop() { _reader = nullptr; }
private:
    std::shared_ptr<Reader> _reader;
};
That will solve your problem the right way, I guess.
Dynamic allocation brings some problems; for example, with exceptions there can be a memory leak if an exception is thrown between play() and stop(), so stop() is never called. Or you can simply forget to call stop() somewhere before the destruction of SimulatorCore, which is quite possible in a big program.
If you have never tried smart pointers, this is a good chance to start.
You should generally avoid performing dynamic allocation with new yourself, so if you were going to go with the 1st solution, you should use smart pointers instead.
However, the main question here is a question of logic. A real card reader exists in an idle state until it is being used. In the 2nd solution, what do init and cleanup do? Do they simply setup the card reader into an idle state or do they start simulating actually having a card being read? If it's the first case, I suggest that this behaviour should be in the constructor and destructor of Reader, and then creating a Reader object denotes bringing a card reader into existence. If it's the second case, then I'd say the 2nd solution is pretty much correct, just that the functions are badly named.
What seems most logical to me is something more like this:
class SimulatorCore
{
public:
    void play() { reader.start(); }
    void stop() { reader.stop(); }
private:
    Reader reader;
};
Yes, all I've done is change the function names for Reader. However, the functions now are not responsible for initialising or cleaning up the reader - that responsibility is in the hands of Reader's constructor and destructor. Instead, start and stop begin and end simulation of the Reader. A single Reader instance can then enter and exit this simulation mode multiple times in its lifetime.
If you later want to extend this idea to multiple Readers, you can just change the member to:
std::vector<Reader> readers;
However, I cannot know for certain that this is what you want because I don't know the logic of your program. Hopefully this will give you some ideas though.
Again, whatever you decide to do, you should avoid using new to allocate your Readers and then also avoid using raw pointers to refer to those Readers. Use smart pointers and their corresponding make_... functions to dynamically allocate those objects.
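For completeness, a minimal sketch of the first solution written that way, with std::unique_ptr and std::make_unique (no manual new/delete):
#include <memory>

class Reader { /* ... */ };

class SimulatorCore
{
public:
    void play() { reader_ = std::make_unique<Reader>(); }
    void stop() { reader_.reset(); }   // destroys the Reader, no manual delete needed
private:
    std::unique_ptr<Reader> reader_;
};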
It clearly depends on how your whole program is organized, but in general I think I would prefer the static approach, because of responsibility considerations:
Suppose you have a separate class that handles serial communication. That class will send and receive messages and dispatch them to the reader class. A message may arrive at any time. The difference of the dynamic and static approaches is:
With the dynamic approach, the serial class must test whether the reader actually exists before dispatching a message, or the reader has to register and unregister itself with the serial class.
With the static approach, the reader class can decide for itself, if it is able to process the message at the moment, or not.
So I think the static approach is a bit easier and more straightforward.
However, if there is a chance that you will have to implement other, different reader classes in the future, the dynamic approach will make this extension easier, because the appropriate class can easily be instantiated at runtime.
So the dynamic approach offers more flexibility.
I have a singleton class for logging purposes in my Qt project. In every class except the singleton one, there is a pointer to the singleton object and a signal connected to a writing slot in the singleton object. Whichever class wants to write log info just emits that signal. The signals are queued, so it's thread-safe.
Please critique this approach from OOP point of view, thanks.
=============================================================================================
Edit 1:
Thank you all for your replies; listening to opposing opinions is always a great way to learn.
Let me explain more about my approach and what I did in my code so far:
Exactly as MikeMB pointed out, the singleton class has a static function like get_instance() that returns a reference to that singleton. I stored it in a local pointer in each class's constructor, so it will be destroyed after the constructor returns. It is convenient for checking whether I got a null pointer and makes the code more readable. I don't like something like this:
if(mySingletonClass::getInstance() == NULL) { ... }
connect(getInstance(), SIGNAL(write(QString)), this, SLOT(write(QString)));
because it is more expensive than this:
QPointer<mySingletonClass> singletonInstance = mySingletonClass::getInstance();
if(singletonInstance.isNull()) { ... }
connect(singletonInstance, SIGNAL(write(QString)), this, SLOT(write(QString)));
Calling a function twice is more expensive than creating a local variable from ASM's point of view because of push, pop and return address calculation.
Here is my singleton class:
class CSuperLog : public QObject
{
    Q_OBJECT
public:
    // This static function creates the instance on the first call
    // and returns the instance it just created;
    // on later calls it only returns the existing instance.
    static QPointer<CSuperLog> getInstance(void);
    ~CSuperLog();
public slots:
    void writingLog(QString aline);
private:
    static bool ready;
    static bool instanceFlag;
    static bool initSuccess;
    static QPointer<CSuperLog> ptrInstance;
    QTextStream * stream;
    QFile * oFile;
    QString logFile;
    explicit CSuperLog(QObject *parent = 0);
};
I call getInstance() at the beginning of main() to make sure it is ready immediately for every other class whenever they need to log important information.
MikeMB:
Your approach puts a middle man in between, which makes the path of the logging info much longer, because signals in Qt are always queued unless you make a direct connection. The reason I can't make a direct connection here is that it would make the class non-thread-safe, since I use threads in the other classes. Yes, someone will say you can use a mutex, but a mutex also creates a queue when more than one thread competes for the same resource. Why don't you use the existing mechanism in Qt instead of making your own?
Thank you for all of your posts!
=========================================================
Edit 2:
To Marcel Blanck:
I like your approach as well because you considered resource competition.
I need signals and slots in almost every class, so I need QObject, and this is why I chose Qt.
There should be only one instance for one static object, if I didn't get it wrong.
Using semaphores is the same as using signals/slots in Qt; both generate a message queue.
There are always pros and cons regarding software design patterns and application performance. Adding more layers in between makes your code more flexible, but decreases performance significantly on lower-end hardware, making your application depend on more powerful hardware; that's why most modern OSes are written in pure C and ASM. How to balance them is a real challenge.
Could you please explain a little bit more about your static Logger factory approach? Thanks.
I do not like singletons very much because it is always unclean to use them. I have even read job descriptions that say "Knowledge of design patterns, while knowing that Singleton isn't one to use". Singletons lead to dependency hell, and if you ever want to change to a completely different logging approach (maybe for testing or production) while not destroying the old one, you need to change a lot.
Another problem with the approach is the usage of signals. Yes, you get thread safety for free and do not interrupt code execution so much, but...
Every object you log from needs to be a QObject.
If you are hunting crashes, your last log lines will not be printed, because the logger had no time to do it before the program crashed.
I would print directly. Maybe you can have a static Logger factory that returns a logger, so you can have one logger object in every thread (the memory impact will still be very small). Or you have one that is thread-safe using semaphores and has a static interface. In both cases the logger should be used via an interface to be more flexible later.
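A minimal sketch of what that could look like: a small logger interface, one mutex-protected file logger that flushes on every write, and a static access point. ILogger, FileLogger and "app.log" are placeholder names:
#include <fstream>
#include <mutex>
#include <string>

// Minimal logger interface so the concrete implementation can be swapped later.
class ILogger {
public:
    virtual ~ILogger() = default;
    virtual void write(const std::string& line) = 0;
};

// One concrete logger; it flushes on every write so a crash doesn't eat the last lines.
class FileLogger : public ILogger {
public:
    explicit FileLogger(const std::string& path) : out_(path, std::ios::app) {}
    void write(const std::string& line) override {
        std::lock_guard<std::mutex> lock(mutex_);
        out_ << line << '\n';
        out_.flush();
    }
private:
    std::ofstream out_;
    std::mutex mutex_;
};

// Static access point handing out one shared, thread-safe logger instance.
struct LoggerFactory {
    static ILogger& instance() {
        static FileLogger logger("app.log");   // "app.log" is a placeholder path
        return logger;
    }
};

// Usage from any thread, no QObject or signal needed:
//   LoggerFactory::instance().write("something happened");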
Also make sure that your approach prints directly. Even printf writes to a buffer before anything is printed, and you need to flush it every time, or under bad circumstances you might never find the crashes you are hunting for.
Just my 2 cents.
I would consider separating the fact that a logger should be unique, and how the other classes get an instance of the logger class.
Creating and obtaining an instance of the logger could be handled in some sort of factory that internally encapsulates its construction and makes only one instance if need be.
Then, the way that the other classes get an instance of the logger could be handled via Dependency injection or by a static method defined on the aforementioned factory. Using dependency injection, you create the logger first, then inject it into the other classes once created.
A singleton usually has a static function like get_instance() that returns a reference to that singleton, so you don't need to store a pointer to the singleton in every object.
Furthermore it makes no sense, to let each object connect its log signal to the logging slot of the logging object itself, because that makes each and every class in your project dependent on your logging class. Instead, let a class just emit the signal with the log information and establish the connection somewhere central on a higher level (e.g. when setting up your system in the main function). So your other classes don't have to know who is listening (if at all) and you can easily modify or replace your logging class and mechanism.
Btw.: There are already pretty advanced logging libraries out there, so you should find out if you can use one of them or at least, how they are used and adapt that concept to your needs.
==========================
EDIT 1 (response to EDIT 1 of QtFan):
Sorry, apparently I misunderstood you: I thought the pointer would be a class member and not only a local variable in the constructor, which is of course fine.
Let me also clarify what I meant by making the connection on a higher level:
This was solely aimed at where you make the connection - i.e. where you put the line
connect(getInstance(), SIGNAL(write(QString)), this, SLOT(write(QString)));
I was suggesting to put this somewhere outside the class e.g. into the main function. So the pseudo code would look something like this:
void main() {
create Thread1
create Thread2
create Thread3
create Logger
connect Thread1 signal to Logger slot
connect Thread2 signal to Logger slot
connect Thread3 signal to Logger slot
run Thread1
run Thread2
run Thread3
}
This has the advantage that your classes don't have to be aware of the kind of logger you are using and whether there is only one or multiple or no one at all. I think the whole idea about signals and slots is that the emitting object doesn't need to know where its signals are processed and the receiving class doesn't have to know where the signals are coming from.
Of course, this is only feasible, if you don't create your objects / threads dynamically during the program's run time. It also doesn't work, if you want to log during the creation of your objects.
While writing a new VST plugin using VSTGUI I'm really struggling with how to use the library, and most progress is made by guessing and debugging afterwards (because there really is no documentation besides the million lines of code and ygrabit, which states little more than the obvious).
So far it's going well, but my last contribution to the project involved threads, which made the design a little more problematic. Specifically, I'm working on a set of text labels in a container (doing non-atomic operations), and these may (and obviously do) get destructed without my knowledge when a user closes the window.
Even adding checks right before changing elements might still be a problem. So I actually need to control the lifetime of these objects (which is fine), except that when they are shown in a CViewContainer, it automatically assumes ownership.
I have no idea how to write the backbone of the editor, so I used a program called VSTGUIBuilder for this, and appended (and basically rewrote) what I needed. However, since all 'views' you can work with require either a parent or a system window, you cannot instantiate any views/controls before reaching the AEffEditor::Open() function, which is called whenever your window is popped up.
And the AEffEditor::close() method is called whenever the window is closed. Now, the vstguibuilder put a
delete frame;
inside the AEffEditor::close() method, which suggests you rebuild and dispose of all resources on every open and close. Can this really be true? And if it is, is there any way I can protect my container's contents (which, for the record, is a vector<CTextLabel *>) from getting deleted mid-function? It's no problem to dispose of it afterwards; I'm just worried about segfaults while changing it.
Using mutexes and the like is really the last resort (since the call is coming from the host); I don't want to hang the host if my code faults and never releases the lock.
Edit:
I ended up finding a solution which is not very elegant, but works safely. Here's the code in the worker function:
while (bLock) {
    Sleep(0);
}
bLock = true;
if (msgs.empty()) {
    bLock = false;   // don't keep the queue locked when there is nothing to do
    return;
}
/*
    Prevent someone deleting our lines in close().
    We create a copy of the container to be 100% sure,
    and increase the reference count so we can safely
    work with our own container; we 'forget' the lines
    afterwards, so they will be deleted if needed.
    This ensures that close AND open can be called
    while we are working with the lines.
*/
bDeleteLock = true;
// The copy constructor works as expected here: since we are
// working with pointers, we still reference the same content.
auto copy_lines = lines;
for (auto line : copy_lines) {
    line->remember();
}
bDeleteLock = false;
...
for (auto line : copy_lines) {
    line->forget();
}
cont->setDirty();
bLock is another 'mutex' that protects a message queue, which this function prints out. bDeleteLock protects the process of copying the line container and 'remembering' the lines, and is released immediately afterwards. Both are declared as volatile bools; shouldn't that be enough? Here's the close() method, by the way:
void CConsole::Close() {
    // locking lines while copying them over into a container we can work with
    while (bDeleteLock)
        Sleep(0);
    // waiting for bLock is not needed because the queue won't get deleted.
    if (!visible) // if we are not visible it's our responsibility to remove the view
        delete cont;
    lines.clear();
}
Ahh, VSTGUI, that brings back some dark memories. ;) But seriously, yes, you will probably have to use a mutex to prevent the host from hanging. Having to instantiate everything when the window reopens seems kind of silly, but you can see many plugins do just that.
One potential workaround is to use a shared memory segment for cached view data, and then pass a reference to the location back to your plugin.
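If you do go the mutex route suggested above, a minimal sketch with a std::mutex guarding the label container could look like the following; Label and Console stand in for the real VSTGUI types, and the lock is held only for the short update/clear. Note that volatile bools give neither the atomicity nor the ordering guarantees needed here:
#include <mutex>
#include <vector>

struct Label { /* stand-in for CTextLabel */ };

class Console {
public:
    void workerUpdate() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (Label* line : lines_) {         // safe: close() cannot run concurrently
            (void)line;                      // ... update the label text here ...
        }
    }

    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        lines_.clear();                      // the worker is never mid-iteration here
    }

private:
    std::mutex mutex_;
    std::vector<Label*> lines_;
};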