Photoshop Undo System - drawing

The question probably applies to drawing systems in general. I was wondering how the undo functionality is implemented in PS. Does the program take snapshots of the canvas before each operation? If so, wouldn't this lead to huge memory requirements? I've looked into the Command pattern, but I can't quite see how this would be applied to drawing.
Regards,
Menno

It's called the command pattern. It's simple to implement and useful for any sort of editor.
Photoshop applies stacked transformations upon the original image: one operation, one command. It simply unapplies the transformation when you undo. So it just keeps the original and latest versions, but I guess it might cache the last few versions for performance.
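A minimal sketch of the command pattern for an editor, assuming a reversible operation (the `Brighten` command and the flat pixel array are made up for illustration; this is not Photoshop's actual API):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// One operation, one command: each command knows how to apply and unapply itself.
struct Command {
    virtual ~Command() = default;
    virtual void apply(std::vector<int>& pixels) = 0;
    virtual void unapply(std::vector<int>& pixels) = 0;
};

// Example reversible operation: add a constant to every pixel.
struct Brighten : Command {
    int amount;
    explicit Brighten(int a) : amount(a) {}
    void apply(std::vector<int>& pixels) override {
        for (int& p : pixels) p += amount;
    }
    void unapply(std::vector<int>& pixels) override {
        for (int& p : pixels) p -= amount;
    }
};

// The editor keeps a stack of executed commands; undo pops and unapplies.
struct Editor {
    std::vector<int> pixels;
    std::vector<std::unique_ptr<Command>> undoStack;

    void run(std::unique_ptr<Command> cmd) {
        cmd->apply(pixels);
        undoStack.push_back(std::move(cmd));
    }
    void undo() {
        if (undoStack.empty()) return;
        undoStack.back()->unapply(pixels);
        undoStack.pop_back();
    }
};
```

Note that this only works as-is for operations that are mathematically reversible; lossy operations (e.g. clamping, blurring) need the delta approach described below in this thread.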

Since some operations will be non-reversible, and as you say snapshotting the entire image every time would be out of the question, the only other alternative I can see is a stack of deltas. A delta is the set of masks containing the modified pixels prior to the operation. Of course, many operations are reversible, so their deltas could be optimised away.
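A sketch of that delta idea: before an operation touches a region, save just those pixels, and undo by writing them back. A flat array and a one-dimensional span stand in for a real image and mask here; all names are mine:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A delta records the region an operation is about to modify,
// plus a copy of the pixels in that region before the change.
struct Delta {
    std::size_t offset;          // start of the modified span
    std::vector<int> oldPixels;  // pixel values prior to the operation
};

struct Image {
    std::vector<int> pixels;
    std::vector<Delta> undoStack;

    // Called before an operation: snapshot only the affected span.
    void saveDelta(std::size_t offset, std::size_t length) {
        Delta d;
        d.offset = offset;
        d.oldPixels.assign(pixels.begin() + offset, pixels.begin() + offset + length);
        undoStack.push_back(std::move(d));
    }

    // Undo: write the saved pixels back over the modified region.
    void undo() {
        if (undoStack.empty()) return;
        const Delta& d = undoStack.back();
        std::copy(d.oldPixels.begin(), d.oldPixels.end(), pixels.begin() + d.offset);
        undoStack.pop_back();
    }
};
```

Memory cost is then proportional to how much each operation actually changed, not to the image size.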

I'm not sure how Adobe Photoshop implements undo, but the Paint node in Apple's Shake compositing application is pretty easy to explain:
Each stroke is stored as a series of points, along with some information like stroke color, brush size, etc.
When you draw a stroke, the changes are made on the current image.
Every x strokes (10, I think) the current image is cached into memory.
When you undo, it redraws the last ~9 strokes on the previous cached image.
There are two problems with this:
When you undo more than 10 times, it has to recalculate the whole image. With thousands of strokes this can cause a pause of several seconds.
With Shake, you save the setup file containing the stroke information - not the actual pixel values. This means you have to recalculate the whole image whenever you reopen the Paint node or render the image (not nearly as big a problem as the undo thing, however).
Well, there is a third problem, that being that Shake is horribly buggy and poorly implemented in many areas, the Paint node being one of them - so I'm not sure how good an implementation this is, but I can't imagine Photoshop being too dissimilar (albeit far better optimised).

The easiest way I've found to solve this problem, though I don't know how Adobe tackles it, is to use a persistent data structure, like so:
You think of an image as a collection of image tiles, say 64x64 pixels each, and they get garbage collected or reference counted (ex: using shared_ptr in C++).
Now when the user makes changes to an image tile, you create a new version of the image while shallow copying the unmodified tiles:
Everything except the modified tiles is shallow copied on such a change. And when you do it that way, your entire undo system boils down to this:
before user operation:
store current image in undo stack
on undo/redo:
swap image at top of undo stack with current image
And it becomes super easy like that, without requiring the entire image to be stored over and over in each undo entry. As a bonus, when users copy and paste layers, it barely takes any more memory unless/until they make changes to the pasted layer. It basically provides you an instancing system for images.
As yet another bonus, when a user creates a transparent layer that's, say, 2000x2000 pixels but only paints a small part of it, say 100x100 pixels, that also barely takes any memory, because the empty/transparent tiles don't have to store any pixels, only a couple of null pointers. It also speeds up compositing with such mostly-transparent layers, because you can skip alpha blending the empty tiles entirely, and it speeds up image filters in those cases for the same reason: they can likewise just skip over the empty tiles.
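A sketch of this tile-sharing idea using shared_ptr, as suggested above. Tiny vectors of a few ints stand in for 64x64 pixel blocks, and the redo half of the stack is omitted for brevity; all names are mine:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Tile = std::vector<int>;                     // stands in for a 64x64 pixel block
using Image = std::vector<std::shared_ptr<Tile>>;  // an image is a list of tile pointers

// Copy-on-write: writing to a tile clones only that tile;
// every other tile pointer is shared with the previous version.
Image writeTile(const Image& img, std::size_t tileIndex, const Tile& newData) {
    Image next = img;  // shallow copy: just copies the shared_ptrs
    next[tileIndex] = std::make_shared<Tile>(newData);
    return next;
}

// The undo system from the pseudocode above: store the (cheap)
// image value before each operation, restore it on undo.
struct UndoSystem {
    Image current;
    std::vector<Image> undoStack;

    void beforeOperation() { undoStack.push_back(current); }
    void undo() {
        if (undoStack.empty()) return;
        current = undoStack.back();  // a real app would push `current` to a redo stack first
        undoStack.pop_back();
    }
};
```

Each undo entry is just a vector of pointers, so storing a previous state costs almost nothing unless tiles were actually modified.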
As for PS actions, that's a bit of a different approach. There you might use some scripting to indicate what actions to perform, but you can couple it with the above to efficiently cache only the modified portions of the image. The whole point of this approach is to avoid deep copying the entire image over and over and blowing up memory usage just to cache previous states for undo, without having to fiddle with writing separate undo/redo logic for all kinds of different operations.

Photoshop uses History to track your actions. This also serves as undo, since you can go back in the history to any point. You can set the size of the history in Preferences.
I also suggest you look into Adobe Version Cue as a tool for retrospective undo and versioning; it's built into the suite for that sole purpose. http://en.wikipedia.org/wiki/Adobe_Version_Cue

Related

Apply one Direct2D effect chain to multiple rectangles

I adopted code which implements a simple image editor. It applies an effect chain to one rectangular region within an image. Now the desired enhancement is to apply that same effect chain with exactly the same parameters to multiple rectangular regions within the same image.
Is there a way to have one effect chain instance work on multiple rectangles? Or do I have to create duplicates of the effect chain, one for each rectangle? One effect chain seems more efficient.
I'm not knowledgeable in D2D and I don't intend to delve too deeply into it, just for this one-off solution. I'm looking for pointers in the right direction and perhaps the appropriate functions to use. My first naive attempt was to put the rectangles into the effect chain via ID2D1Effect::SetInput(index, image) while incrementing the index, but that didn't work at all. image instances were created by some complicated input bitmap shuffling and have the rectangles of interest baked into them somehow. The original code used SetInput(0, image) with just one image instance and it worked. Unfortunately, I can't share code here, both for legal and for practical reasons (too much, too spread out).

I'm trying to use this method to detect moving objects. Can someone advise me on this?

I want to ask what kinds of problems there would be if I use this method to extract the foreground.
The precondition for using this method is that it runs on a fixed camera, so there won't be any movement of the camera position.
What I'm trying to do is below:
read one frame from the camera and set this frame as the background image. This is done periodically.
periodically subtract the frames that are read afterwards from the background image above. Then only the moving things will be colored differently from the areas that are the same as the background image.
then isolate the moving object by using grayscale, binarization and thresholding.
iterate the above processes.
If I do this, would the probability of successfully detecting the moving object be high? If not... could you tell me why?
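The scheme described above can be sketched with plain grayscale arrays (no camera input here; the frames and the threshold value are made up for illustration):

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

using Frame = std::vector<int>;  // grayscale pixel values 0..255

// Steps 2+3: absolute difference against the background, then threshold.
// Pixels that differ from the background by more than `thresh`
// are marked foreground (1); everything else is background (0).
Frame extractForeground(const Frame& background, const Frame& frame, int thresh) {
    Frame mask(frame.size(), 0);
    for (std::size_t i = 0; i < frame.size(); ++i)
        mask[i] = (std::abs(frame[i] - background[i]) > thresh) ? 1 : 0;
    return mask;
}
```

The answers below point out why this simple version struggles in practice (illumination changes, noisy thresholds, background initialization).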
If you consider illumination changes (gradual or sudden) in the scene, you will see that your method does not work.
There are more robust solutions for these problems. One of them (maybe the best) is the Gaussian Mixture Model applied to background subtraction.
You can use BackgroundSubtractorMOG2 (an implementation of GMM) in the OpenCV library.
Your scheme is quite adequate for cases where the camera is fixed and the background is stationary. Indoor and man-controlled scenes are more appropriate for this approach than outdoor and natural scenes. I've contributed to a detection system that worked basically on the same principles you suggested, but of course the details are crucial. A few remarks based on my experience:
Your initialization step can cause very slow convergence to a normal state. If you set the background from the first frame, then pieces of background revealed from behind moving objects will be considered objects. A better approach is to take the median of the first N frames.
Simple subtraction may not be enough in cases of changing light condition etc. You may find a similarity criterion better for your application.
Simple thresholding on the difference image may not be enough. A simple approach is to dilate the foreground, for the sake of not updating the background on pixels that were accidentally misclassified as background.
Your step 4 is unclear. I assumed that you mean you update the background only in those places that were identified as background in the last frame. Note that with such a simple approach, pixels that are actually background may be stuck forever with a "foreground" labeling, as you don't update the background under them. There are many possible solutions to this.
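The median-initialization remark above can be sketched like this: for each pixel, take the median of its values over the first N frames, so a moving object that covers a pixel in only a minority of those frames does not end up in the background (plain grayscale arrays again; all names mine):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Frame = std::vector<int>;  // grayscale pixel values 0..255

// Per-pixel median over the first N frames: robust to moving objects
// that occlude a pixel in only a few of those frames.
Frame medianBackground(const std::vector<Frame>& frames) {
    Frame bg(frames[0].size(), 0);
    for (std::size_t i = 0; i < bg.size(); ++i) {
        std::vector<int> samples;
        for (const Frame& f : frames) samples.push_back(f[i]);
        // nth_element partially sorts just enough to place the median.
        std::nth_element(samples.begin(),
                         samples.begin() + samples.size() / 2,
                         samples.end());
        bg[i] = samples[samples.size() / 2];
    }
    return bg;
}
```

A pixel briefly covered by a passing object keeps its true background value, because the object's values are outvoted in the median.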
There are many ways to solve this problem, and it will really depend on the input images as to which method will be the most appropriate. It may be worth doing some reading on the topic
The method you are suggesting may work, but it's a slightly non-standard approach to this problem. My main concern would be that subtracting several images from the background could lead to saturation, and then you may lose some detail of the motion. It may be better to take the difference between consecutive images, and then apply the binarization / thresholding to those images.
Another (more complex) approach which has worked for me in the past is to take subregions of the image and then cross-correlate them with the new image. The peak in this correlation can be used to identify the direction of movement - it's a useful approach if more than one thing is moving.
It may also be possible to use a combination of the two approaches above for example.
Subtract second image from the first background.
Threshold etc to find the ROI where movement is occurring
Use a pattern matching approach to track subsequent movement focussed on the ROI detected above.
The best approach will depend on your application, but there are lots of papers on this topic.

after effects remove still background

I have a movie in After Effects that doesn't have a KEY color background, but a still background.
I want to detect the motion of 2 persons walking in front of this still background and bring them to the front, so I can create effects on the background.
Is that possible to do? Which effect do I use?
This works with a technique called difference matting. It never works well, even if you have a solid clean plate. Your best bet is to Rotobrush or just mask them out manually. If you want to try difference matting, you can set the footage layer with the people in it to the Difference blending mode and put a clean plate below it.
You can use the Rotobrush, which allows you to separate elements from a static background. It works better if:
you have a clean background
the videos are good quality
the object to be cut out is as big as possible
This method works well with a locked-down camera. Depending upon how active the people are and how still the background is, you can remove the people by cutting and assembling pieces of the background to create a background layer without the people... Once you have a clean plate, use a difference matte to subtract the background from your clip, which will leave anything that moves (read: people). This will work well if the hues/values of the background and the people are different.
Doing this in conjunction with animated masks gives good results and can give you a jump on your rotobrushing to improve your edges.
If all else fails, the Mocha plug-in works well, as long as you break your masks up into lots of parts to follow the planes of your actors.
I have never found this to be a single-tool solution.
It is not impossible, but I don't think it would give you the best result.
Anyway, you should mask them out manually, but not keyframe by keyframe; there is a technique for it. To set the mask keyframes, first set 3 keyframes: at the start, middle and end of your footage.
After masking those three, do the same between the first keyframe and the middle one.
That means between the first keyframe and the middle keyframe, you should set a key.
Do the same for the middle and end keys.
This technique could save you time and make masking less error-prone.
By the way, you can use Mocha Pro as well for tracking the persons.

Tiling a large QGraphicsItem in a QGraphicsView

I am currently using a QGraphicsItem that I am loading a pixmap into to display some raster data. I am currently not doing any tiling or anything of the sort, but I have overridden my QGraphicsItem so that I can implement features like zooming under the mouse, tracking which pixel I am hovering over, etc.
My files coming off the disk are 1-2GB in size, and I would like to figure out a more optimal way of displaying them. For starters, it seems like I couldn't display them all at once even if I wanted to, because the QImage that I am using (QPixmap->QImage->QGraphicsItem) seems to fail at any pixel index over 32,xxx (16-bit).
So how should I implement tiling here if I want to keep using a single QGraphicsItem? I don't think I want to use multiple QGraphicsItems to hold the displayed data plus the neighboring data "about" to be displayed. That would require me to scale them all when the person mouses over and tries to scale a single tile, causing me to also have to reposition everything, right? I guess this will also require having some knowledge about what data to exactly get from the file.
I am however open to ideas. I also suppose it would be nice to do this in some kind of threaded way, that way the user can keep panning the image or zooming even if all the tiles are not loaded yet.
I looked at the 40000 chip demo, but I am not sure that is what I am after - it looks like it basically still displays all of the chips like you normally would in a scene, just overrode the paint method to supply less level of detail...or did I miss something about that demo?
It's not too surprising that there would be difficulty handling images that size. Qt just isn't designed for it and there are possibly other contributing factors due to the particular OS and perhaps the way memory is managed.
You very clearly need (or at least, should use) a tiling mechanism. Your main issue is that you need a way to access your image data that does not involve using a QImage (or QPixmap) to load the entire thing and provide access to that image data since it has already been determined that this fails.
You would either need to find a method (library) that can load the entire image into memory and allow you to pull regions of image data out of it, or load only a specific region from the file on disk. You would also need the ability to resize very large regions to lower resolution sections when trying to "zoom" out on any part of the image. Unfortunately, I have never done image processing like this so am unfamiliar with what library options are available, Qt likely won't be able to help you directly with this.
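One small piece of such a tiling mechanism can be sketched in plain C++: given the viewport rectangle in image coordinates, compute which fixed-size tiles intersect it, and load pixel data only for those (the tile size and all names are mine, not part of the Qt API):

```cpp
#include <cassert>
#include <vector>

struct TileIndex { int col, row; };

// Which tiles of size `tile` x `tile` does a viewport rectangle
// (x, y, w, h) touch? Only these need pixel data in memory;
// the rest of the image can stay on disk.
std::vector<TileIndex> visibleTiles(int x, int y, int w, int h, int tile) {
    std::vector<TileIndex> out;
    int firstCol = x / tile, lastCol = (x + w - 1) / tile;
    int firstRow = y / tile, lastRow = (y + h - 1) / tile;
    for (int r = firstRow; r <= lastRow; ++r)
        for (int c = firstCol; c <= lastCol; ++c)
            out.push_back({c, r});
    return out;
}
```

In a QGraphicsItem, you would call something like this from the paint routine using the exposed rectangle, fetch the missing tiles (ideally on a worker thread, as you suggest), and draw placeholders until they arrive.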
One option you might explore however is using an image editing package to break your large image up into more manageable chunks. Then perhaps a QGraphicsView solution similar to the chip demo would work.

Improving the camshift algorithm in OpenCV

I am using camshift algorithm of opencv for object tracking. The input is being taken from a webcam and the object is tracked between successive frames. How can I make the tracking stronger? If I move the object at a rapid rate, tracking fails. Also when the object is not in the frame there are false detections. How do I improve this ?
Object tracking is an active research area in computer vision. There are numerous algorithms to do it, and none of them work 100% of the time.
If you need to track in real time, then you need something simple and fast. I am assuming that you have a way of segmenting a moving object from the background. Then you can compute a representation of the object, such as a color histogram, and compare it to the object you find in the next frame. You should also check that the object has not moved too far between frames. If you want to try more advanced motion tracking, then you should look up the Kalman filter.
Determining that an object is not in the frame is also a big problem. First, what kinds of objects are you trying to track? People? Cars? Dogs? You can build an object classifier, which would tell you whether or not the moving object in the frame is your object of interest, as opposed to noise or some other kind of object. A classifier can be something very simple, such as a constraint on size, or it can be very complicated. In the latter case you need to learn about features that can be computed, classification algorithms, such as support vector machines, and you would need to collect training images to train it.
In short, a reliable tracker is not an easy thing to build.
Suppose you find the object in the first two frames. From that information, you can extrapolate where you'd expect the object in the third frame. Instead of using a generic find-the-object algorithm, you can use a slower, more sophisticated (and thus hopefully more reliable) algorithm by limiting it to check in the vicinity that the extrapolation predicts. It may not be exactly where you expect (perhaps the velocity vector is changing), but you should certainly be able to reduce the area that's checked.
This should help reduce the number of times some other part of the frame is misidentified as the object (because you're looking at a smaller portion of the frame and because you're using a better feature detector).
Update the extrapolations based on what you find and iterate for the next frame.
If the object goes out of frame, you fall back to your generic feature detector, as you do with the first two frames, and try again to get a "lock" when the object returns to the view.
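The extrapolation step above can be sketched as simple constant-velocity prediction: assume the object keeps the velocity it had between the last two frames, and search only a window around the predicted point (all names are mine, not OpenCV's):

```cpp
#include <cassert>

struct Point { int x, y; };

// Constant-velocity prediction: position in frame 3
// extrapolated from the positions in frames 1 and 2.
Point predictNext(Point prev, Point curr) {
    return {curr.x + (curr.x - prev.x), curr.y + (curr.y - prev.y)};
}

// A search window centred on the prediction, so the slower,
// more reliable detector only examines a small part of the frame.
struct Window { int x, y, halfSize; };

Window searchWindow(Point predicted, int halfSize) {
    return {predicted.x, predicted.y, halfSize};
}
```

A Kalman filter, mentioned in another answer, is essentially a principled version of this same idea that also weighs how much to trust the prediction versus the measurement.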
Also, if you can, throw as much light into the physical scene as possible. If the scene is dim, the webcam will use a longer exposure time, leading to more motion blur on moving objects. Motion blur can make it very hard for the feature detectors (though it can give you information about direction and speed).
I've found that if you expand the border of the search window in camshift, it makes the algorithm a bit more adaptive to fast-moving objects, although it can introduce some irregularities. Try just making the border of your window 10% bigger and see what happens.