Acquisition accumulator (software design) - C++

I need some help on software design. Let's say I have a camera that gets acquisitions, sends them to a filter, and displays the images one at a time.
Now, what I want is to wait for two images and after that send the two images to the filter and both of them to the screen.
I thought of two options and I'm wondering which one to choose:
In my Acquisitioner (or whatever) class, should I put a queue that waits for two images before sending them to the Filterer class?
Or should I put an Accumulator class between the Acquisitioner and the Filterer?
Both would work in the end, but which one do you think would be better?
Thanks!

To give a direct answer, I would implement the Accumulator policy in a separate object. Here's the why:
Working on similar designs in the past, I found it very helpful to think of the different 'actors' in this model as sources and sinks. A source object is capable of producing or outputting an image to the attached sink object. Filters or accumulators in this system are designed as pipes -- in other words, they implement the interfaces of both a sink and a source. Once you come up with a mechanism for connecting generic sources, pipes, and sinks, it's very easy to implement an accumulation policy as a pipe which, for every nth image received, holds on to it if n is odd and outputs both images if n is even.
Once you have this system, it will be trivial for you to change out sources (image file readers, movie decoders, camera capture interfaces), sinks (image file or movie encoders, display viewers, etc), and pipes (filters, accumulators, encoders, multiplexers) without disrupting the rest of your code.
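A minimal sketch of the source/pipe/sink idea in C++ (all names here are illustrative, not a fixed API):

#include <optional>

struct Image { /* pixel data */ };

// A sink accepts images pushed into it.
struct ImageSink {
    virtual void Consume(const Image& img) = 0;
    virtual ~ImageSink() = default;
};

// A source pushes images to one attached sink.
class ImageSource {
public:
    void Attach(ImageSink* sink) { m_sink = sink; }
protected:
    void Emit(const Image& img) { if (m_sink) m_sink->Consume(img); }
private:
    ImageSink* m_sink = nullptr;
};

// A pipe is both. The accumulator holds every odd image and emits the
// pair as soon as the even one arrives.
class Accumulator : public ImageSink, public ImageSource {
public:
    void Consume(const Image& img) override {
        if (!m_pending) {
            m_pending = img;   // odd image: hold on to it
        } else {
            Emit(*m_pending);  // even image: flush both downstream
            Emit(img);
            m_pending.reset();
        }
    }
private:
    std::optional<Image> m_pending;
};

Wiring is then just camera.Attach(&accumulator); accumulator.Attach(&filter); and so on.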

It depends. But if all your queue does is wait for a second image to arrive, I reckon you could simply implement it right in the Acquisitioner.
On the other hand, if you want to incorporate additional functionality there, then the added modularity and all the benefits that come hand in hand with it wouldn't hurt one tiny bit.
I don't think it matters all that much in this particular case.
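For concreteness, the in-Acquisitioner option could look roughly like this (Image, Filterer, and the method names are assumptions, following the question's naming):

#include <vector>

struct Image { /* pixel data */ };
struct Filterer { void Process(const Image& a, const Image& b) { /* ... */ } };

class Acquisitioner {
public:
    explicit Acquisitioner(Filterer& f) : m_filterer(f) {}

    // Called for every frame coming off the camera.
    void OnImageAcquired(const Image& img) {
        m_buffer.push_back(img);
        if (m_buffer.size() == 2) {  // wait until a pair has accumulated
            m_filterer.Process(m_buffer[0], m_buffer[1]);
            m_buffer.clear();
        }
    }

private:
    std::vector<Image> m_buffer;
    Filterer& m_filterer;
};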

Style transfer on a large image (in chunks?)

I am looking into various style transfer models and I noticed that they all have limited resolution (when running on a Pixel 3, for example, I couldn't go beyond 1,024x1,024 without going OOM).
I've noticed a few apps (e.g. this app) that appear to do style transfer on images up to ~10MP. These apps also show a progress bar, which I guess means they don't just call a single TensorFlow "run" method on the entire image, as otherwise they wouldn't know how much had been processed.
I would guess they use some sort of tiling, but naively splitting the image into 256x256 tiles produces an inconsistent style (not just at the borders).
As this seems like an obvious problem, I tried to find publications about it, but I couldn't find any. Am I missing something?
Thanks!
I would guess people split the model into multiple ones (for VGG it is easy to do manually, e.g. by layers) and then use Keras's model.summary() (or benchmarks) to estimate the relative time each step takes, and use that to drive the progress bar. Such a split probably also saves memory, as TensorFlow Lite might not be clever enough to reuse the memory storing intermediate activations from lower layers once they are no longer needed.

How to update component data and inform systems in ECS framework?

I'm developing my own game engine with ECS framework.
This is the ECS framework I use:
Entity: Just an ID which connects its components.
Component: A struct which stores pure data, no methods at all (So I can write .xsd to describe components and auto-generate C++ struct code).
System: Handle game logic.
EventDispatcher: Dispatches events to subscribers (Systems)
But I'm confused about how systems should update the members of components and inform other systems.
For example, I have a TransformComponent like this:
struct TransformComponent
{
    Vec3 m_Position;
    Float m_fScale;
    Quaternion m_Quaternion;
};
Obviously, if any member of a renderable entity's TransformComponent changes, the RenderSystem should update the shader uniform "worldMatrix" before rendering the next frame.
So if I do "comp->m_Position = ..." in a system, how should the RenderSystem notice the change to the TransformComponent? I have come up with 3 solutions:
Send an UpdateEvent after updating the members, and handle the event in the related Systems. This is ugly because every time a system modifies the component data, it must send an event like this:
{
    ...;
    TransformComponent* comp = componentManager.GetComponent<TransformComponent>(entityId);
    comp->m_Position = ...;
    comp->m_Quaternion = ...;
    eventDispatcher.Send<TransformUpdateEvent>(...);
    ...;
}
Make the members private and, for each component class, write a corresponding system with set/get methods (wrapping the event sending in the setters). This would produce a lot of cumbersome code.
Change nothing, but add a "Movable" component. The RenderSystem then iterates over all renderable entities with a "Movable" component in its Update() method. This may not solve other, similar problems, and I'm not sure about the performance.
I can't come up with an elegant way to solve this problem. Should I change my design?
I think that in this case, the simplest method will be the best: just keep a pointer to the Transform component in the components that read/write it.
I don't think that using events (or some other indirection, like observers) solves any real problem here.
The Transform component is very simple - it's not something that will change during development. Abstracting access to it will actually make the code more complex and harder to maintain.
Transform is a component that will be changed frequently for many objects - maybe even most of your objects will update it each frame. Sending an event on every change has a cost - probably much higher than simply copying a matrix/vector/quaternion from one location to another.
I think that using events, or some other abstraction, won't solve other problems either, such as multiple components updating the same Transform component, or components using outdated transform data.
Typically, renderers just copy all the matrices of the rendered objects each frame. There's no point in caching them in the rendering system.
Components like Transform are used often. Making them overly complex can cause problems in many different parts of an engine, while the simplest solution, a pointer, gives you greater freedom.
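For illustration, the pointer approach can be as plain as this (the component names are hypothetical; Vec3 and TransformComponent are from the question):

// Components that read/write the transform hold a pointer to the
// shared TransformComponent instance, owned by the component manager.
struct RenderComponent {
    TransformComponent* m_Transform = nullptr;
    // ... mesh, material, etc.
};

struct PhysicsComponent {
    TransformComponent* m_Transform = nullptr;  // same instance as above
    Vec3 m_Velocity;
};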
BTW, there's also a very simple way to make sure that the RenderComponent reads the transform after it has been updated (e.g. by the PhysicsComponent) - you can split the work into two steps:
Update() in which systems may modify components, and
PostUpdate() in which systems can only read data from components
For example, PhysicsSystem::Update() might copy transform data into the respective TransformComponent components, and then RenderSystem::PostUpdate() could just read from the TransformComponent without the risk of using outdated data.
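A minimal sketch of that two-phase tick (the System base class and TickAll are assumptions, not part of the question's framework):

#include <vector>

class System {
public:
    virtual ~System() = default;
    virtual void Update(float dt) {}      // systems may write components here
    virtual void PostUpdate(float dt) {}  // read-only phase by convention
};

void TickAll(std::vector<System*>& systems, float dt)
{
    // Phase 1: e.g. PhysicsSystem::Update() writes TransformComponents.
    for (System* s : systems) s->Update(dt);
    // Phase 2: e.g. RenderSystem::PostUpdate() reads them safely.
    for (System* s : systems) s->PostUpdate(dt);
}

Note that nothing enforces the read-only convention in PostUpdate(); it's a discipline the systems agree on.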
I think there are many things to consider here. I'll go in parts, discussing your solutions first.
About your solution 1: consider that you could do the same with a boolean, or by assigning an empty component acting as a tag. Often, using events in an ECS overcomplicates your system architecture. At least, I tend to avoid it, especially in smaller projects. Remember that a component acting as a tag can be thought of as, basically, an event.
Your solution 2 follows up on what we discussed in 1, but it reveals a problem with this general approach. If you are updating your TransformComponent in several Systems, you can't know whether the TransformComponent has really changed until the last System has updated it, because one System could have moved it in one direction, and another one could have moved it back, leaving it as it was at the beginning of your tick. You could solve this by updating your TransformComponent just once, in one single System...
Which looks like your solution 3, but maybe the other way around. You could update the MovableComponent in several Systems, and later in your ECS pipeline have a single System read your MovableComponent and write into your TransformComponent. In this case, it's important that only one System is allowed to write to the TransformComponents. At that point, having a boolean indicating whether it has moved would do the job perfectly; see the sketch below.
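A sketch of that single-writer idea (MovableComponent's fields and ApplyMovement are hypothetical names; Vec3 is assumed to support + and value initialization):

// Many systems accumulate movement into MovableComponent; only
// ApplyMovement() ever writes the TransformComponent.
struct MovableComponent {
    Vec3 m_DeltaPosition{};  // accumulated by any number of systems
    bool m_Moved = false;    // the boolean flag mentioned above
};

void ApplyMovement(MovableComponent& mov, TransformComponent& transform)
{
    if (!mov.m_Moved)
        return;  // nothing to fold in this tick
    transform.m_Position = transform.m_Position + mov.m_DeltaPosition;
    mov.m_DeltaPosition = Vec3{};
    mov.m_Moved = false;
}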
Up to here, we have traded performance (because we avoid some processing in the RenderSystem when the TransformComponent has not changed) for memory (because we are duplicating the content of the TransformComponent in some way).
Another way to achieve the same, without adding events, booleans, or components, is to do everything in the RenderSystem. Basically, in each RenderComponent you could keep a copy (or a hash) of the TransformComponent from the last time you updated it, and then compare them. If they're not the same, render it and update your copy:
// In your RenderSystem's update loop (renderables is an illustrative
// view over the (RenderComponent, TransformComponent) pairs)...
for (auto& [renderComponent, transformComponent] : renderables) {
    // Unchanged since the last render: skip this entity.
    if (renderComponent.lastTransformUpdate == transformComponent) {
        continue;
    }
    renderComponent.lastTransformUpdate = transformComponent;
    render(renderComponent);
}
This last one would be my preferred solution, but it also depends on the characteristics of your system and your concerns. As always, don't try to optimize for performance blindly: first measure, then compare.

Windows Media Foundation: IMFSourceReader::SetCurrentMediaType execution time issue

I'm currently working on retrieving image data from a video capturing device.
It is important for me to have raw output data in a rather specific format, and I need a continuous data stream, so I figured I'd use the IMFSourceReader. I pretty much understand how it works. For the whole pipeline to work, I checked the output formats of the camera and looked at the available Media Foundation Transforms (MFTs).
The critical function here is IMFSourceReader::SetCurrentMediaType. I'd like to elaborate on one critical behavior I discovered. If I just call the function with the parameters of my desired output format, it changes some parameters, like fps or resolution, though the call succeeds. If I first call the function with a native media type that has my desired parameters but a wrong subtype (like MJPG or something), and then call it again with my desired parameters and the correct subtype, the call succeeds and I end up with my correct parameters. I suspect this only works if fitting MFTs (decoders) are available.
So far I've pretty much beaten WMF into giving me what I want. The problem now is that the second call to IMFSourceReader::SetCurrentMediaType takes a long time. The duration depends heavily on the camera used, varying from 0.5s to 10s. To be honest, I don't really know why it takes so long, but I think the calculation of the correct transformation paths and/or the initialization of the transformations is the problem. I noticed an excessive amount of loading and unloading of the same DLLs (ntasn1.dll, ncrypt.dll, igd10iumd32.dll), but loading them once myself didn't change anything.
So does anybody know this issue and have a quick fix for it?
Or does anybody know a workaround to:
Get raw image data via Media Foundation without the use of IMFSourceReader?
Select and load the transformations myself, to support the source reader call?
You basically described the way the Source Reader is supposed to work in the first place. The underlying media source has its own media types, and the reader can supply a conversion whenever it needs to fit the requested media type to the closest native one.
Video capture devices tend to expose many [native] media types (I have a webcam that enumerates 475 of them!), so if format fitting does not go well, the source reader might take some time trying one conversion or another.
Note that you can disable the source reader's conversions by applying certain attributes like MF_READWRITE_DISABLE_CONVERTERS, in which case the inability to set a video format directly on the source results in a failure.
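For reference, disabling the converters looks roughly like this (error handling trimmed; CreateRawReader is an illustrative name):

#include <mfapi.h>
#include <mfreadwrite.h>

HRESULT CreateRawReader(IMFMediaSource* pSource, IMFSourceReader** ppReader)
{
    IMFAttributes* pAttrs = nullptr;
    HRESULT hr = MFCreateAttributes(&pAttrs, 1);
    if (FAILED(hr)) return hr;

    // With converters disabled, SetCurrentMediaType only succeeds for
    // formats the device exposes natively.
    hr = pAttrs->SetUINT32(MF_READWRITE_DISABLE_CONVERTERS, TRUE);
    if (SUCCEEDED(hr))
        hr = MFCreateSourceReaderFromMediaSource(pSource, pAttrs, ppReader);

    pAttrs->Release();
    return hr;
}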
You can also read data in the device's original format and decode/convert/process it yourself by feeding the data into one MFT or a chain of them. Typically, when you set the respective format on the source reader, the source reader manages the MFTs for you. If you prefer, however, you can do it yourself too. Unfortunately, you cannot build a chain of MFTs for the source reader to manage: either you leave it to the source reader completely, or you set the native media type, read the data in the original format from the reader, and then manage the MFTs on your side using IMFTransform::ProcessInput, IMFTransform::ProcessOutput, and friends. This is not as easy as the source reader, but it is doable; a rough sketch follows.
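A rough sketch of the manual route, assuming the MFT's input/output types have already been negotiated with SetInputType/SetOutputType and that the MFT allocates its own output samples (check MFT_OUTPUT_STREAM_PROVIDES_SAMPLES via GetOutputStreamInfo first; otherwise you must supply one yourself):

#include <mftransform.h>

HRESULT ProcessOneSample(IMFTransform* pMft, IMFSample* pIn, IMFSample** ppOut)
{
    HRESULT hr = pMft->ProcessInput(0, pIn, 0);
    if (FAILED(hr)) return hr;

    MFT_OUTPUT_DATA_BUFFER out = {};
    DWORD status = 0;
    hr = pMft->ProcessOutput(0, 1, &out, &status);
    if (SUCCEEDED(hr))
        *ppOut = out.pSample;  // caller releases
    // MF_E_TRANSFORM_NEED_MORE_INPUT just means: feed more input first.
    return hr;
}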
Since VuVirt doesn't want to write an answer, I'd like to add one for him and for everybody who has the same issue.
Under some conditions, the IMFSourceReader::SetCurrentMediaType call takes a long time when the target format is some kind of RGB and is not natively available. To get rid of the delay, I adjusted my image pipeline to be able to interpret YUV (YUY2). I still have no idea why this is the case, but it is a working workaround for me. I don't know of any alternative to speed the call up.
Additional hint: I've found that there are usually several IMFTransforms for decoding many natively available formats to YUY2. So, if you are able to use YUY2, you are safe. NV12 is another working alternative, and there are probably more.
Thanks for your answer anyway.

What's the difference between pub and mult in core.async? & a sample use case?

I've been using core.async for some time but have avoided pub and mult, since I can't really grasp a useful use case from their documentation.
Specifically what's the purpose of the topic-fn and how would you use it in practice?
Or maybe you can map a theoretical explanation onto the following fictive approach. I think this could help a lot in seeing how it works in practice (if it's applicable at all).
Fictive approach explained:
There would be several different views representing the state. To let them act on and respond to state changes, I would like to have several channels (at the application level) which are - for example - dedicated to state changes and user inputs (like key presses).
Each of the views should be able to sub(scribe) to this application channel so they can react independently to changes. Each of the views should also be able to put something on the state channel (but not the user-input channel).
Channels in core.async are single put, single take; that is to say, any message going in is given to only one taker. This doesn't work well in broadcast situations where many go blocks each need a copy of every message put onto a channel - then you need something else. This is what mult is useful for; mult could probably also have been called "broadcast".
Pub is then mult plus multimethods. topic-fn is a function applied to each input item; its output decides the topic of the message, and the message is then broadcast only to those subscribers listening to that topic.
More information is in the notes from my talk at the last Conj, available here: https://github.com/halgari/clojure-conj-2013-core.async-examples/blob/master/src/clojure_conj_talk/core.clj#L398
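To make that concrete for the fictive approach above, a minimal Clojure sketch (the channel and topic names are made up):

;; One application-level input channel, routed by a topic-fn (:topic here).
(require '[clojure.core.async :as a])

(def app-ch (a/chan))
(def app-pub (a/pub app-ch :topic))  ; topic-fn = :topic

;; Each view subscribes to the topics it cares about.
(def view-state-ch (a/chan))
(a/sub app-pub :state-change view-state-ch)

(def view-input-ch (a/chan))
(a/sub app-pub :user-input view-input-ch)

;; A put on app-ch is delivered only to subscribers of the matching topic.
(a/put! app-ch {:topic :state-change :new-state {:count 1}})
(a/put! app-ch {:topic :user-input :key :enter})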

How should I link a data Class to my GUI code (to display attributes of object, in C++)?

I have a class (in C++), call it Data, that has thousands of instances (objects) when the code runs. I have a widget (in Qt), call it DataWidget, that displays attributes of the objects. To build the widget rapidly, I simply wrote the object attributes to a file and had the widget parse the file for the attributes - this approach works, but isn't scalable or pretty.
To be more clear my requirements are:
1 - DataWidget should be able to display multiple, different, Data object's attributes at a time
2 - DataWidget should be able to display thousands of Data objects per second
3 - DataWidget should be run along side the code that generates new Data objects
4 - each Data object needs to be permanently saved to file/database
Currently, the GUI and the DataWidget are created, then the experiment runs and generates thousands of Data objects (periodically writing some of them to file). After the experiment runs, the DataWidget displays the last Data object written to file (they are written as XML files).
With my current file approach I can satisfy (1) by grabbing more than one file after the experiment runs. Since the experiment isn't tied to the DataWidget, there is no concurrency, so I can't do (3) until I add a signal that informs the DataWidget that a new file exists.
I haven't moved forward with this approach for two reasons:
Firstly, even though the files aren't immediately written to disk, I can't imagine that this method is scalable unless I implement a caching system - but that seems like reinventing the wheel. Secondly, Data is a wrapper for a graph data structure, and I'm using GraphML (via the Boost Graph Library, i.e. write_graphml()) to write the structure to XML files; reading the structure back in with Boost's read_graphml() requires me to read the file back into a Data object... which means the experiment portion of the program encodes the object into XML, writes the XML to a file (hopefully in memory and not to disk), and then the DataWidget reads the XML from the file and decodes it back into an object!
It seems to me like I should be using a database, which would handle all the caching etc. Moreover, it seems like I should be able to skip the file/database step and pass the Data to the DataWidget inside the program (perhaps a reference to a list of Data). Yet I also want to save the Data to file, so the file/database step isn't entirely pointless - I'm just using it in the wrong way at the wrong time.
What is the better approach given my requirements?
Are there any general resources and/or guidelines for handling and displaying data like this?
I see you're using Qt. This is good because Qt 4.0 and later includes a powerful model/view framework. And I think this is what you want.
Model/View
Basically, have your Data class inherit from and implement QAbstractItemModel, or a different Qt model class, depending on the kind of model you want. Then set your view widget (most likely a QListView) to use Data as its model.
There are lots of examples at their site and this solution scales nicely with large data sets.
Added: This model test code from labs.trolltech.com comes in really handy:
http://labs.trolltech.com/page/Projects/Itemview/Modeltest
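One way this can look is a model over the collection of Data objects rather than Data itself - a minimal read-only sketch using QAbstractListModel (the Data fields shown are placeholders):

#include <QAbstractListModel>
#include <QString>
#include <QVector>

struct Data { QString name; double value; };  // placeholder attributes

class DataModel : public QAbstractListModel {
public:
    explicit DataModel(QObject* parent = nullptr) : QAbstractListModel(parent) {}

    int rowCount(const QModelIndex& parent = QModelIndex()) const override {
        return parent.isValid() ? 0 : m_items.size();
    }

    QVariant data(const QModelIndex& index, int role = Qt::DisplayRole) const override {
        if (!index.isValid() || role != Qt::DisplayRole)
            return {};
        const Data& d = m_items[index.row()];
        return QString("%1: %2").arg(d.name).arg(d.value);
    }

    // Appending through the model keeps attached views in sync.
    void append(const Data& d) {
        beginInsertRows(QModelIndex(), m_items.size(), m_items.size());
        m_items.push_back(d);
        endInsertRows();
    }

private:
    QVector<Data> m_items;
};

Then view->setModel(&model); on a QListView is enough to display it.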
It seems to me like I should be using a database, which would handle all the caching etc. Moreover, it seems like I should be able to skip the file/database step and pass the Data to the DataWidget inside the program (perhaps a reference to a list of Data). Yet I also want to save the Data to file, so the file/database step isn't entirely pointless - I'm just using it in the wrong way at the wrong time.
If you need to display that much rapidly changing data, an intermediate file or database will slow things down and likely become the bottleneck. I think the widget should read the newly generated data directly from memory. This doesn't prevent you from also storing the data in a file or database; that can be done in a separate thread/process.
If all the data items fit in memory, I'd put them in a vector/list and pass a reference to that to the DataWidget. When it's time to save them, pass a reference to your serializing method. Then your experiment just populates the data structure for the other parts of the program to use, as in the sketch below.
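If that data structure is shared with a writer thread (the experiment) while the widget reads it, a small lock-guarded store is a reasonable sketch (DataStore and its methods are illustrative names):

#include <mutex>
#include <vector>

struct Data { /* attributes */ };

class DataStore {
public:
    void Append(Data d) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_items.push_back(std::move(d));
    }

    // Copy out a snapshot for display or for the serializing thread.
    std::vector<Data> Snapshot() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_items;
    }

private:
    mutable std::mutex m_mutex;
    std::vector<Data> m_items;
};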