Style transfer on large image. (in chunks?) - computer-vision

I am looking into various style transfer models and I noted that they all have limited resolution (when running on Pixel 3, for example, I couldn't go beyond 1,024x1,024, OOM otherwise).
I've noticed a few apps (eg this app) which appear to be doing style transfer for up to ~10MP images, these apps also show progress bar which I guess means that they don't just call a single tensorflow "run" method for entire image as otherwise they won't know how much was processed.
I would guess they are using some sort of tiling, but naively splitting the image into 256x256 produces inconsistent style (not just on the borders).
As this seems like an obvious problem I tried to find any publications about this, but I couldn't find any. Am I missing something?
Thanks!

I would guess people split the model into multiple ones (for VGG it is easy to do manually, eg. via layers) and then use model_summary Keras function (or benchmarks) to estimate relative time it takes for each step and thus guide progress bar. Such separation probably also saves memory as tensorflow lite might not be clever enough to reuse memory storing intermediate activations from lower layers once they are not needed.

Related

Reduce a Caffe network model

I'd like to use Caffe to extract image features. However, it takes too long to process an image, so I'm looking for ways to optimize for speed.
One thing I noticed is that the network definition I'm using has four extra layers on top the one from which I'm reading a result (and there are no feedback signals, so they should be safe to delete).
I tried to delete them from the definition file but it had no effect at all. I guess I might need to remove the corresponding part of the file that contains pre-trained weights, too. That is, however, a binary file (a protobuffer) so editing it is not that easy.
Do you think that removing the four layers might have a profound effect of the net performance?
If so then how do I get familiar with the file contents so that I could edit it and how do I know which parts to remove?
first, I don't think removing the binary weights will have any effect.
Second, you can do it easily using the python interface: see this tutorial.
Last but not least, have you tried running caffe time to measure the performance of your net? this may help you identify the bottlenecks of your computations.
PS,
You might find this thread relevant as well.
Caffemodel stores data as key-value pair. Caffe only copies weight for those layers (in train.prototxt) having exactly same name as caffemodel. Hence I don't think removing binary weights will work. If you want to change network structure, just modify train.prototxt and deploy.txt.
If you insist to remove weights from binary file, follow this caffe example.
And to make sure you delete right part, this visualizing tool should help.
I would retrain on a smaller input size, change strides, etc. However if you want to reduce file size, I'd suggest quantizing the weights https://github.com/yuanyuanli85/CaffeModelCompression and then using something like lzma compression (xz for unix). We do this so we can deploy to mobile devices. 8 bit weights compress nicely.

Testing ZXing - some discoveries

Just to start off, thanks for making ZXing freely available.
I have been working in the barcode field since '98, and have a fair bit of experience decoding barcodes from images.
I consult now, and one of my products in a program that tests barcode decoders. This is an extremely CPU intensive test, with literally millions of images being tested per symbology. I test with different amounts of blur, different angles, different brightness and contrast, different sizes (pixels per module), curvature, perspective distortion, ripple, varying illumination ... the whole shebang. I am testing the C++ implementation on a Linux machine.
I have come across a few issues that may interest the developers:
1) Aztec has a bug that crashes sometimes. In AztecDecoder.cpp, in Decoder::correctBits, sometimes numECCCodewords takes on a negative value. Bad things happen after that. I was able to patch with a simple test and a "throw" to complete my testing.
2) Aztec has a huge memory leak. In no time at all, my program is taking over one GB of RAM, and the number steadily increases over time. It gets bad enough that the OS starts bogging down and reboots after a while. I don't seem to have this problem with other symbologies.
3) You don't seem to make the version number available in the code. I know, you should keep track of which version you downloaded. Life isn't always like that, sometimes you inherit code, and you had nothing to do with downloading it. Even a simple #define in a header file would suffice. I like to display the version number of the decoder in my test program. I get this value directly from the decoder so that if I upgrade decoders, the reported version automatically changes. Just a nice thing to have.
4) Aztec appears to be extremely weak. I know, it's only alpha, but it hardly ever decodes. I am not meaning to pick on aztec, but it gave me the most issues.
5) The entire UPC family misdecodes (returns incorrect information for a correctly encoded barcode) extremely frequently. You may want to put some protection in there. ITF does a fair bit as well. All well-known weaknesses.
6) Aztec misdecodes as well. This should pretty well never happen with a Reed-Solomon error correction system. I haven't taken the time to look into this yet.
Those were the major problems, now for comments:
1) There is no support for barcodes at angles (omnidirectional decoding). This is supported by all major commercial packages.
2) Decoding appears to be weak overall in terms of handling blur, and low pixels per module. Yes, I realize that it is a free package, but just stating the weak points.
That's about all that I found now. I'll update when I have more information.

How to generate high resolution thumbnails for only some aliases?

I'm working on a photography website in Django.
Because the site is “responsive”, we pre-generate numerous sizes of each image using set aliases. In particular, 7 images with a variety of widths starting with 960 to 3,840 pixels wide in 480px increments. These images will be used when a photo is shown full-screen (as in, not in a list-view).
The site is also built for HiDPI/Retina displays/devices. So, we'd like to use the setting: THUMBNAIL_HIGH_RESOLUTION to automatically prepare the #2x versions of some of the aliases, but most notably, NOT for the aliases used to create the range of 7 images used for the full-screen images above.
As this project is meant to show off the work of a photographer, we're using rather high quality settings, so each image starts out at roughly 3840x2160 pixels in size and, through our pre-generation becomes approximately 50MBs of JPGs. Unfortunately nearly 50% of this is pure waste, because we only use the #2x versions on an image when we show lists or collections of images on a page. These are generally only 300px/600px wide and are relatively tiny compared to our "full screen" image sets.
We've considered disabling THUMBNAIL_HIGH_RESOLUTION and just creating new aliases for the #2x versions, but it isn't clear how to generate the correct filenames with an alias.
So, how can we pre-generate HiDPI/Retina images with the standard #2x (or _2x) infix for just SOME of our aliases?
UPDATE: This is now a feature of easy_thumbnails! In aliases you can use HIGH_RESOLUTION: False to disable their creation, or HIGH_RESOLUTION: True to force them. Thanks #ChrisSmiley!
In easy-thumbnails-1.3, the #2x infix currently is hard-coded, but in the next version, users may choose another infix through a configuration setting. Have a look at this pull request for details.
To answer your second question, currently it is not possible to generate Retina thumbnails only for certain entries. easy-thumbnails has a all-or-none policy, but this in theory, could be changed.

How do I write a Perl script to filter out digital pictures that have been doctored?

Last night before going to bed, I browsed through the Scalar Data section of Learning Perl again and came across the following sentence:
the ability to have any character in a string means you can create, scan, and manipulate raw binary data as strings.
An idea immediately hit me that I could actually let Perl scan the pictures that I have stored on my hard disk to check if they contain the string Adobe. It seems by doing so, I can tell which of them have been photoshopped. So I tried to implement the idea and came up with the following code:
#!perl
use autodie;
use strict;
use warnings;
{
local $/="\n\n";
my $dir = 'f:/TestPix/';
my #pix = glob "$dir/*";
foreach my $file (#pix) {
open my $pic,'<', "$file";
while(<$pic>) {
if (/Adobe/) {
print "$file\n";
}
}
}
}
Excitingly, the code seems to be really working and it does the job of filtering out the pictures that have been photoshopped. But problem is many pictures are edited by other utilities. I think I'm kind of stuck there. Do we have some simple but universal method to tell if a digital picture has been edited or not, something like
if (!= /the origianl format/) {...}
Or do we simply have to add more conditions? like
if (/Adobe/|/ACDSee/|/some other picture editors/)
Any ideas on this? Or am I oversimplifying due to my miserably limited programming knowledge?
Thanks, as always, for any guidance.
Your best bet in Perl is probably ExifTool. This gives you access to whatever non-image information is embedded into the image. However, as other people said, it's possible to strip this information out, of course.
I'm not going to say there is absolutely no way to detect alterations in an image, but the problem is extremely difficult.
The only person I know of who claims to have an answer is Dr. Neal Krawetz, who claims that digitally altered parts of an image will have different compression error rates from the original portions. He claims that re-saving a JPEG at different quality levels will highlight these differences.
I have not found this to be the case, in my investigations, but perhaps you might have better results.
No. There is no functional distinction between a perfectly edited image, and one which was the way it is from the start - it's all just a bag of pixels in the end, after all, and any other metadata you can remove or forge all you want.
The name of the graphics program used to edit the image is not part of the image data itself but of something called meta data - which may be stored in the image file but, as others have noted, is neither required (so some programs may not store it, some may allow you an option of not storing it) nor reliable - if you forged an image, you might have forged the meta data as well.
So the answer to your question is "no, there's no way to universally tell if the pic was edited or not, although some image editing software may write its signature into the image file and it'll be left there by carelessness of the editing person.
If you're inclined to learn more about image processing in Perl, you could take a look at some of the excellent modules CPAN has to offer:
Image::Magick - read, manipulate and write of a large number of image file formats
GD - create colour drawings using a large number of graphics primitives, and emit the drawings in various formats.
GD::Graph - create charts
GD::Graph3d - create 3D Graphs with GD and GD::Graph
However, there are other utilities available for identifying various image formats. It's more of a question for Super User, but for various unix distros you can use file to identify many different types of files, and for MacOSX, Graphic Converter has never let me down. (It was even able to open the bizarre multi-file X-ray of my cat's shattered pelvis that I got on a disc from the vet.)
How would you know what the original format was? I'm pretty sure there's no guaranteed way to tell if an image has been modified.
I can just open the file (with my favourite programming language and filesystem API) and just write whatever I want into that file willy-nilly. As long as I don't screw something up with the file format, you'd never know it happened.
Heck, I could print the image out and then scan it back in; how would you tell it from an original?
As other's have stated, there is no way to know if the image was doctored. I'm guessing what you basically want to know is the difference between a realistic photograph and one that has been enhanced or modified.
There's always the option of running some extremely complex image recognition algorithm that would analyze every pixel in your image and do some very complicated stuff to determine if the image was doctored or not. This solution would probably involve AI which would examine millions of photos that are both doctored and those that are not and learn from them. However, this is more of a theoretical solution and isn't very practical... you would probably only see it in movies. It would be extremely complex to develop and probably take years. And even if you did get something like this to work, it probably still wouldn't be 100% correct all the time. I'm guessing AI technology still isn't at that level and could take a while until it is.
A not-commonly-known feature of exiftool allows you to recognize the originating software through an analysis of the JPEG quantization tables (not relying on image metadata). It recognizes tables written by many applications. Note that some cameras may use the same quantization tables as some applications, so this isn't a 100% solution, but it is worth looking into. Here is an example of exiftool run on two images, the first was edited by photoshop.
> exiftool -jpegdigest a.jpg b.jpg
======== a.jpg
JPEG Digest : Adobe Photoshop, Quality 10
======== b.jpg
JPEG Digest : Canon EOS 30D/40D/50D/300D, Normal
2 image files read
This will work even if the metadata has been removed.
There is existing software out there which uses various techniques (compression artifacting, comparison to signature profiles in a database of cameras, etc.) to analyze the actual image data for evidence of alteration. If you have access to such software and the software available to you provides an API for external access to these analysis functions, then there's a decent chance that a Perl module exists which will interface with that API and, if no such module exists, it could probably be created rather quickly.
In theory, it would also be possible to implement the image analysis code directly in native Perl, but I'm not aware of anyone having done so and I expect that you'd be better off writing something that low-level and processor-intensive in a fully-compiled language (e.g., C/C++) rather than in Perl.
http://www.impulseadventure.com/photo/jpeg-snoop.html
is a tool that does the job almost good
If there has been any cloning , there is a variation in the pixel density..or concentration which sometimes shows up.. upon manual inspection
a Photoshop cloned area will have even pixel density(my meaning is variation of Pixels wrt a scanned image)

How do I append a large amount of rich content (images, formatting) quickly to a control without using tons of CPU?

I am using wxWidgets and Visual C++ to create functionality similar to using Unix "tail -f" with rich formatting (colors, fonts, images) in a GUI. I am targeting both wxMSW and wxMAC.
The obvious answer is to use wxTextCtrl with wxTE_RICH, using calls to wxTextCtrl::SetDefaultStyle() and wxTextCtrl::WriteText().
However, on my 3ghz workstation, compiled in release mode, I am unable to keep tailing a log that grows on average of 1 ms per line, eventually falling behind. For each line, I am incurring:
Two calls to SetDefaultStyle()
Two calls two WriteText()
A call to Freeze() and Thaw() the widget
When running this, my CPU goes to 100% on one core using wxMSW after filling up roughly 20,000 lines. The program is visibly slower once it reaches a certain threshold, falling further behind.
I am open to using other controls (wxListCtrl, wxRichTextCtrl, etc).
Have you considered limiting the amount of lines in the view? When we had a similar issue, we just made sure never more than 10,000 lines are in the view. If more lines come in at the bottom we remove lines at the top. This was not using WxWidgets, it was using a native Cocoa UI on Mac, but the issue is the same. If a styled text view (with colors, formatting and pretty printing) grows to large, appending more data at the bottom becomes pretty slow.
Sounds like the control you are using is simply not built for the amount of data you are throwing at it. I would consider building a custom control. Here's some things you could take into account:
When a new line comes in, you don't need to re-render the previous lines... they don't change and the layout won't change due to the new data.
Try to only keep the visible portion plus a few screens of look-back in memory at a time. This would make it a little lighter... but you will have to do your own scroll management if you want the user to be able to scroll back further than your look-back and make it all appear to be seamless.
Don't necessarily update one line at a time. When there is new data, grab it all and update. If you get 10 lines really quickly, and you update the screen all at once, you might save on some of the overhead of doing it line by line.
Hope this helps.
Derive from wxVListBox. From the docs:
wxVListBox is a listbox-like control with the following two main differences from a regular listbox: it can have an arbitrarily huge number of items because it doesn't store them itself but uses OnDrawItem() callback to draw them (so it is a Virtual listbox) and its items can have variable height as determined by OnMeasureItem() (so it is also a listbox with the lines of Variable height).