Spatial transformation of volumes - xtk

I have 2 NIfTI image volumes that I would like to display together using XTK. Both of them were already converted to NRRD using MeVisLab's relevant ITK modules. Both volumes were acquired in the same MR session, but they differ in spatial resolution and field of view (even in orientation), so it is important that XTK takes the "space directions" and "space origin" fields of the NRRD into account to display them correctly in terms of relative spatial position. In practice, though, this does not seem to happen.
I already read a question and answer on fixing the "space origin" part, but I am still running into problems with the "space directions". My latest attempts were to modify the transform of the volumes after loading them, but this does not seem to have any effect on the displayed volumes. I could do this successfully with a TRK fiber file, but changing a volume's transform yields no effect. So the question is: how do I correctly load a (NRRD) volume while taking into account its full spatial transform to patient/scanner space, so that multiple loaded volumes are matched up correctly?
Thanks in advance for any help!

Sadly, the scan orientation is not taken into account at the moment. This should happen soon, since it is very important.
As a workaround, it is possible to transform each slice of a volume using the transform methods. You basically loop through the volume's children and apply the transform to each one.
If you are willing to contribute this feature to the parser, that would be great! Also, the NII format is now supported, so you don't have to convert to the (much slower) NRRD.

Related

AWS image processing

I am working on a project where I need to take a picture of a surface using my phone and then analyze the surface for defects and marks.
I want to take the image and then send it to the cloud for analysis.
Does AWS Rekognition provide a service that can analyze the defects I want to study?
Or would I need to write custom code using OpenCV or something similar?
While Amazon Rekognition can detect faces and objects, it has no idea what is meant by a "defect".
Imagine if you had 10 people lined up and showed them a picture, asking them if they could see a defect. Would they all agree? They'd probably ask you what you mean by a defect and how bad something has to look before it could be considered a defect.
Similarly, you would need to train a system on what is a valid defect and what is not a defect.
This is a good use case for Amazon SageMaker. You would need to provide LOTS of sample images of defects and not-defects. They should be shot from many different angles in many different lighting situations, similar to the images you would want to test.
It would then build a model that could be used for detecting 'defects' in supplied images. You could even put the model into an AWS DeepLens unit to do the processing locally.
Please note, however, that you need to provide a large number of images (hundreds is good, thousands is better) to be able to train it to correctly detect 'defects'.

Object recognition of a set of objects

In a computer vision project, the image I want to process can be partitioned into "zones" containing multiple products of the same kind.
Provided that I can retrieve image information of all the possible kinds of product, I need to detect which kind is present in each zone, without the need to detect the position of each single product. In summary, I need to recognize "sets of products".
As additional info, the products do not have a rigid shape, they are not all oriented the same way, and the luminosity changes (so I am basically searching for shape-, orientation- and luminosity-invariant approaches).
The reliable information I can exploit is that the products' logos - or parts of them - are often visible, and the products are quite colorful.
I would like to know about possible approaches that exploit the fact that I know the partition into zones, as well as approaches that do not.

Debugging of image processing code

What kind of debugging is available for image processing/computer vision/computer graphics applications in C++? What do you use to track errors/partial results of your method?
What I have found so far is just one tool for online and one for offline debugging:
bmd: attaches to a running process and enables you to view a block of memory as an image
imdebug: enables printf-style debugging of images
Both are quite outdated and not really what I would expect.
What would seem useful for offline debugging is some kind of image logging: let's say a set of commands which let you write images together with text (probably in the form of HTML, maybe hierarchical), easy to switch off at both compile and run time, and as unobtrusive as possible.
The output could look like this (output from our simple tool):
http://tsh.plankton.tk/htmldebug/d8egf100-RF-SVM-RBF_AC-LINEAR_DB.html
Are you aware of some code that goes in this direction?
I would be grateful for any hints.
Coming from a ray tracing perspective, maybe some of those visual methods are also useful to you (it is one of my plans to write a short paper about such techniques):
Surface Normal Visualization. Helps to find surface discontinuities. (no image handy, the look is very much reminiscent of normal maps)
color <- rgb (normal.x+0.5, normal.y+0.5, normal.z+0.5)
Distance Visualization. Helps to find surface discontinuities and errors in finding a nearest point. (image taken from an abandoned ray tracer of mine)
color <- (intersection.z-min)/range, ...
Bounding Volume Traversal Visualization. Helps to visualize a bounding volume hierarchy or other hierarchical structures, and to see the traversal hotspots, like a code profiler (e.g. for Kd-trees). (tbp of http://ompf.org/forum coined the term Kd-vision.)
color <- number_of_traversal_steps/f
Bounding Box Visualization (image from picogen or so, some years ago). Helps to verify the partitioning.
color <- const
Stereo. Maybe useful in your case for the real stereographic appearance. I must admit I never used this for debugging, but when I think about it, it could prove really useful when implementing new types of 3D primitives and trees (image from gladius, which was an attempt to unify realtime and non-realtime ray tracing).
You just render two images from slightly shifted positions, focusing on some point.
Hit-or-not visualization. May help to find epsilon errors. (image taken from metatrace)
if (hit) color = const_a;
else color = const_b
Some hybrid of several techniques.
Linear interpolation: lerp(debug_a, debug_b)
Interlacing: if(y%2==0) debug_a else debug_b
Any combination of ideas, for example the color-tone from Bounding Box Visualization, but with actual scene-intersection and lighting applied
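Pulling a few of these together, here is a minimal C++ sketch of such a debug-shading switch; the Vec3/Hit types and the scaling parameters are made up for illustration:

    // Minimal debug-shading switch combining several of the techniques
    // above. Vec3, Hit, and DebugMode are hypothetical types.
    enum class DebugMode { Normals, Distance, TraversalSteps, HitOrNot };

    struct Vec3 { float x, y, z; };
    struct Hit  { bool found; Vec3 normal; float z; int traversalSteps; };

    Vec3 debugColor(DebugMode mode, const Hit &hit,
                    float zMin, float zRange, float stepScale) {
        switch (mode) {
        case DebugMode::Normals:          // surface-normal visualization
            return { hit.normal.x * 0.5f + 0.5f,
                     hit.normal.y * 0.5f + 0.5f,
                     hit.normal.z * 0.5f + 0.5f };
        case DebugMode::Distance: {       // distance visualization
            float d = (hit.z - zMin) / zRange;
            return { d, d, d };
        }
        case DebugMode::TraversalSteps: { // traversal hotspots, Kd-vision style
            float t = hit.traversalSteps * stepScale;
            return { t, 0.0f, 0.0f };
        }
        case DebugMode::HitOrNot:         // epsilon errors
            return hit.found ? Vec3{ 0, 1, 0 } : Vec3{ 1, 0, 0 };
        }
        return { 0, 0, 0 };
    }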
You may find some more glitches and debugging imagery on http://phresnel.org , http://phresnel.deviantart.com , http://picogen.deviantart.com , and maybe http://greenhybrid.deviantart.com (an old account).
Generally, I prefer to dump the byte array of the currently processed image as raw data triplets and run ImageMagick to create a numbered PNG from it, e.g. img01.png. In this way I can trace the algorithms very easily. ImageMagick is run from within the program using a system call. This makes it possible to debug without using any external libraries for image formats.
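A minimal sketch of that approach, assuming an 8-bit interleaved RGB buffer (the convert invocation with -size/-depth and the rgb: prefix is standard ImageMagick):

    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    // Dump an 8-bit interleaved RGB buffer to disk and let ImageMagick
    // turn it into a numbered PNG. No image-format library needed.
    void dumpImage(const std::vector<unsigned char> &rgb,
                   int width, int height, int index) {
        char rawName[64], cmd[256];
        std::snprintf(rawName, sizeof(rawName), "img%02d.raw", index);

        std::FILE *f = std::fopen(rawName, "wb");
        if (!f) return;
        std::fwrite(rgb.data(), 1, rgb.size(), f);
        std::fclose(f);

        // convert reads raw RGB triplets when given the geometry and depth
        std::snprintf(cmd, sizeof(cmd),
                      "convert -size %dx%d -depth 8 rgb:%s img%02d.png",
                      width, height, rawName, index);
        std::system(cmd);
    }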
Another option, if you are using Qt, is to work with QImage and call img.save("img01.png") from time to time, the way printf is used for debugging.
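For example, a raw buffer can be wrapped without copying and saved in one step (a sketch, assuming 8-bit RGB data):

    #include <QImage>

    // Wrap an existing 8-bit RGB buffer and dump it, printf-style.
    void dumpQt(const unsigned char *rgb, int width, int height, int index) {
        QImage img(rgb, width, height, width * 3, QImage::Format_RGB888);
        img.save(QString("img%1.png").arg(index, 2, 10, QChar('0')));
    }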
It's a bit primitive compared to what you are looking for, but I have done what you suggested in your OP using standard logging and by writing image files. Typically, the logging and signal-export processes and staging live in unit tests.
Signals are given identifiers (often the input filename), which may be augmented (often with the process name or stage).
For development of processors, it's quite handy.
Adding HTML for messages would be simple. In that context, you could produce viewable HTML output easily - you would not need to generate any HTML yourself, just use HTML template files and insert the messages.
I would just do it myself (as I've done multiple times already for multiple signal types) if you get no good referrals.
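A bare-bones version of such an HTML image log could look like this (a sketch; the class and file layout are made up):

    #include <fstream>
    #include <string>

    // Bare-bones HTML image log: append messages plus <img> tags that
    // reference image files written elsewhere (e.g. via the raw-dump
    // trick above). Open the log in a browser to review a run.
    class HtmlLog {
    public:
        explicit HtmlLog(const std::string &path) : out(path) {
            out << "<html><body>\n";
        }
        ~HtmlLog() { out << "</body></html>\n"; }

        void message(const std::string &text) {
            out << "<p>" << text << "</p>\n";
        }
        void image(const std::string &caption, const std::string &file) {
            out << "<p>" << caption << "<br><img src=\"" << file
                << "\"></p>\n";
        }
    private:
        std::ofstream out;
    };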
In Qt Creator you can watch image modifications while stepping through the code in the normal C++ debugger; see e.g. http://labs.qt.nokia.com/2010/04/22/peek-and-poke-vol-3/

Scaling of different images uniformly

I am working on an age-estimation project and am stuck on the following problem:
I have a database of images of different people, and for each individual there are pictures taken at different ages. The problem I am facing is that the pictures of any given person were not taken from the same camera distance, so the estimation algorithm does not work on these sets of pictures. I need to construct a new database from the current one in which all the photographs appear to be taken from the same camera distance. I have not been able to find such a scaling method. Simply zooming in and out of the pictures does not solve this problem, as the face becomes smaller or bigger, which is not desired. Kindly help me solve this problem!
If you want to somehow correct the camera distance automatically, that is a 3D transformation and not just a scaling issue. A change in camera distance implies a change in perspective.
The general problem you are facing is one of Image Registration - there are many different algorithms and approaches depending on your specific problem complexity, resources, and quality requirements.
You can research toolkits such as OpenCV, VXL, and others to see if there is something appropriate for your needs.
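As a rough OpenCV sketch of feature-based registration (the calls are standard OpenCV, but treat this as a starting point, not a face-specific solution):

    #include <opencv2/opencv.hpp>

    // Roughly align one image onto another by matching ORB features and
    // estimating a similarity transform (rotation, scale, translation).
    cv::Mat alignToReference(const cv::Mat &image, const cv::Mat &reference) {
        cv::Ptr<cv::ORB> orb = cv::ORB::create();
        std::vector<cv::KeyPoint> kp1, kp2;
        cv::Mat d1, d2;
        orb->detectAndCompute(image, cv::noArray(), kp1, d1);
        orb->detectAndCompute(reference, cv::noArray(), kp2, d2);

        cv::BFMatcher matcher(cv::NORM_HAMMING, true);  // cross-checked matches
        std::vector<cv::DMatch> matches;
        matcher.match(d1, d2, matches);

        std::vector<cv::Point2f> src, dst;
        for (const cv::DMatch &m : matches) {
            src.push_back(kp1[m.queryIdx].pt);
            dst.push_back(kp2[m.trainIdx].pt);
        }
        // RANSAC discards the inevitable bad matches
        cv::Mat t = cv::estimateAffinePartial2D(src, dst, cv::noArray(),
                                                cv::RANSAC);
        cv::Mat aligned;
        cv::warpAffine(image, aligned, t, reference.size());
        return aligned;
    }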

Photoshop Undo System

The question probably applies to drawing systems in general. I was wondering how the undo functionality is implemented in PS. Does the program take snapshots of the canvas before each operation? If so, wouldn't this lead to huge memory requirements? I've looked into the Command pattern, but I can't quite see how this would be applied to drawing.
Regards,
Menno
It's called the command pattern. It's simple to implement and useful for any sort of editor.
Photoshop applies stacked transformations to the original image. One operation, one command. It simply unapplies the transformation when you undo. So it just keeps the original and latest versions, but I guess it might cache the last few versions for performance.
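A minimal sketch of that pattern in C++ (all names hypothetical):

    #include <memory>
    #include <vector>

    struct Image;  // whatever your canvas representation is

    // Each operation knows how to apply and unapply itself;
    // undo just walks back down the stack.
    struct Command {
        virtual ~Command() = default;
        virtual void apply(Image &img) = 0;
        virtual void unapply(Image &img) = 0;  // inverse of apply
    };

    class Editor {
    public:
        void execute(std::unique_ptr<Command> cmd, Image &img) {
            cmd->apply(img);
            history.push_back(std::move(cmd));
        }
        void undo(Image &img) {
            if (history.empty()) return;
            history.back()->unapply(img);
            history.pop_back();
        }
    private:
        std::vector<std::unique_ptr<Command>> history;
    };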
Since some operations are non-reversible and, as you say, snapshotting the entire image every time would be out of the question, the only other alternative I can see is a stack of deltas. A delta is the set of masks containing the modified pixels prior to the operation. Of course, many operations may be reversible, so their deltas could be optimised.
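A delta can be as simple as the affected rectangle plus the pixels it used to contain; a sketch with made-up types:

    #include <cstdint>
    #include <vector>

    // A delta stores the region an operation touched and the pixels that
    // were there before, so undo is just "blit the old pixels back".
    struct Delta {
        int x, y, width, height;            // affected rectangle
        std::vector<std::uint32_t> oldPixels;  // previous contents, row-major
    };

    // Before running an operation over (x, y, w, h), snapshot that region.
    Delta captureDelta(const std::uint32_t *canvas, int canvasWidth,
                       int x, int y, int w, int h) {
        Delta d{ x, y, w, h, {} };
        d.oldPixels.reserve(static_cast<std::size_t>(w) * h);
        for (int row = y; row < y + h; ++row)
            for (int col = x; col < x + w; ++col)
                d.oldPixels.push_back(canvas[row * canvasWidth + col]);
        return d;
    }

    // Undo: copy the saved pixels back into place.
    void applyDelta(std::uint32_t *canvas, int canvasWidth, const Delta &d) {
        std::size_t i = 0;
        for (int row = d.y; row < d.y + d.height; ++row)
            for (int col = d.x; col < d.x + d.width; ++col)
                canvas[row * canvasWidth + col] = d.oldPixels[i++];
    }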
I'm not sure how Adobe Photoshop implements undo, but the Paint node within Apple Shake compositing application is pretty easy to explain:
Each stroke is stored as a series of points, along with some information like stroke color, brush size, etc.
When you draw a stroke, the changes are made on the current image.
Every x strokes (10, I think) the current image is cached into memory.
When you undo, it redraws the last ~9 strokes on the previously cached image.
There are two problems with this:
When you undo more than 10 times, it has to recalculate the whole image. With thousands of strokes this can cause a several-second pause.
With Shake, you save the setup file containing the stroke information - not the actual pixel values. That means you have to recalculate the whole image whenever you reopen the Paint node or render the image (not nearly as big a problem as the undo thing, however).
Well, there is a third problem: Shake is horribly buggy and poorly implemented in many areas, the Paint node being one of them - so I'm not sure how good an implementation this is, but I can't imagine Photoshop being too dissimilar (albeit far better optimised).
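The scheme described above boils down to something like this sketch (the types and replay bookkeeping are mine, not Shake's):

    #include <cstddef>
    #include <vector>

    struct Stroke { /* points, color, brush size, ... */ };
    struct Image  { std::vector<unsigned> pixels; };
    void draw(Image &img, const Stroke &s) { /* rasterize the stroke */ }

    // Keep every stroke plus a snapshot of the image every N strokes.
    // Undo drops the last stroke and replays from the nearest snapshot.
    class PaintHistory {
    public:
        void addStroke(Image &current, const Stroke &s) {
            draw(current, s);
            strokes.push_back(s);
            if (strokes.size() % N == 0)
                snapshots.push_back(current);   // cache every N strokes
        }
        void undo(Image &current, const Image &blank) {
            if (strokes.empty()) return;
            strokes.pop_back();
            // discard snapshots that now lie past the end of the stroke list
            while (!snapshots.empty() && snapshots.size() * N > strokes.size())
                snapshots.pop_back();
            current = snapshots.empty() ? blank : snapshots.back();
            for (std::size_t i = snapshots.size() * N; i < strokes.size(); ++i)
                draw(current, strokes[i]);      // replay since the snapshot
        }
    private:
        static const std::size_t N = 10;
        std::vector<Stroke> strokes;
        std::vector<Image> snapshots;
    };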
The easiest way I've found to solve this problem, though I don't know how Adobe tackles it, is to use a persistent data structure, like so:
You think of an image as a collection of image tiles, say 64x64 pixels each, and they get garbage collected or reference counted (ex: using shared_ptr in C++).
Now when the user makes changes to an image tile, you create a new version of that tile while shallow copying the unmodified tiles:
Everything except the modified tiles is shallow copied upon such a change. And when you do it that way, your entire undo system boils down to this:
before user operation:
store current image in undo stack
on undo/redo:
swap image at top of undo stack with current image
And it becomes super easy like that without requiring the entire image to be stored over and over in each undo entry. As a bonus when users copy and paste layers, it barely takes any more memory unless/until they make changes to that pasted layer. It basically provides you an instancing system for images. As yet another bonus, when a user creates a transparent layer that's, say, 2000x2000 pixels but they only paint a little bit of the image, like say just 100x100 pixels, that also barely takes any memory because the empty/transparent tiles don't have to store any pixels, only a couple of null pointers. It also speeds up compositing with such mostly-transparent layers, because you don't have to alpha blend the empty image tiles and can just skip over them. It also speeds up image filters in those cases as well since they can likewise just skip over the empty tiles.
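Here is a condensed C++ sketch of that tile grid, using shared_ptr as suggested above; the tile size, names, and single-level undo are simplifications:

    #include <array>
    #include <cstdint>
    #include <memory>
    #include <utility>
    #include <vector>

    // One 64x64 pixel tile; a tile is never modified while it is shared.
    struct Tile {
        std::array<std::uint32_t, 64 * 64> pixels{};
    };

    // An image is a grid of shared tile pointers. Copying an Image copies
    // pointers, not pixels, so snapshots are cheap (structural sharing).
    struct Image {
        int tilesX = 0, tilesY = 0;
        std::vector<std::shared_ptr<Tile>> tiles;  // null = empty/transparent

        // Copy-on-write: clone a tile only if a snapshot still shares it.
        void setPixel(int x, int y, std::uint32_t color) {
            std::shared_ptr<Tile> &t = tiles[(y / 64) * tilesX + (x / 64)];
            if (!t)
                t = std::make_shared<Tile>();    // was transparent
            else if (t.use_count() > 1)
                t = std::make_shared<Tile>(*t);  // deep copy this tile only
            t->pixels[(y % 64) * 64 + (x % 64)] = color;
        }
    };

    // The undo system from the pseudocode above.
    struct UndoStack {
        std::vector<Image> stack;
        void beforeOperation(const Image &current) { stack.push_back(current); }
        void undoRedo(Image &current) {   // undo and redo are the same swap
            if (!stack.empty())
                std::swap(current, stack.back());
        }
    };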
As for PS actions, that's a bit of a different approach. There you might use some scripting to indicate what actions to perform, but you can couple it with the above to efficiently cache only modified portions of the image. The whole point of this approach is to avoid having to deep copy the entirety of the image over and over and blow up memory usage to cache previous states of an image for undoing without having to fiddle with writing separate undo/redo logic for all kinds of different operations that could occur.
Photoshop uses the History feature to track actions. It also serves as undo, since you can go back in history to any point. You can set the size of the history in preferences.
I also suggest you look into Adobe Version Cue as a tool for retrospective undo or versioning; it's built into the suite for that sole purpose. http://en.wikipedia.org/wiki/Adobe_Version_Cue