cr2 extract images with a specific focal length - c++

I have some cr2 files.
I would like to make a C++ program, or write a script, that separates the different cr2 files with different focal lengths, and puts them in separate directories.
How can I do that ?
I have access to the Canon EDSDK and LibRaw but i am not sure where i can find the focal length information.
I also have a little utility "ExifTool" that can read the metadata - but i don't know how to get it to do something i want - and the gui version seems to crash when reading 5000 files over the network...
Please can somebody give me a suggestion ?

The EXIF specification can be found here:
http://www.exif.org/specifications.html
You will need to read the EXIF data, and find the entry which contains a "tag" of 37386 or 0x920A. This is followed by a "RATIONAL" number, which is essentially two unsigned integers forming a fraction, for example 400/20 = 20 mm, as is 80/5 or 20/1. A 14.5mm lens would have to be (at least) stored as 29/2, but could be stored as 145/10 or 1450/100 - or a large number of other variants.
Of course, if you use for example ExifTool, you can easily do this with a script, or use it's Perl binding to write a script in Perl.
There is also a C++ interface to use with exiftool:
http://owl.phy.queensu.ca/~phil/cpp_exiftool/
Using the TagInfo that you get back from ImageInfo(), it should be possible to find the FocalLength and move the files accordingly.
I'm not going to write the code for you, but the above information should be able to give you an idea.

Related

Reduce a Caffe network model

I'd like to use Caffe to extract image features. However, it takes too long to process an image, so I'm looking for ways to optimize for speed.
One thing I noticed is that the network definition I'm using has four extra layers on top the one from which I'm reading a result (and there are no feedback signals, so they should be safe to delete).
I tried to delete them from the definition file but it had no effect at all. I guess I might need to remove the corresponding part of the file that contains pre-trained weights, too. That is, however, a binary file (a protobuffer) so editing it is not that easy.
Do you think that removing the four layers might have a profound effect of the net performance?
If so then how do I get familiar with the file contents so that I could edit it and how do I know which parts to remove?
first, I don't think removing the binary weights will have any effect.
Second, you can do it easily using the python interface: see this tutorial.
Last but not least, have you tried running caffe time to measure the performance of your net? this may help you identify the bottlenecks of your computations.
PS,
You might find this thread relevant as well.
Caffemodel stores data as key-value pair. Caffe only copies weight for those layers (in train.prototxt) having exactly same name as caffemodel. Hence I don't think removing binary weights will work. If you want to change network structure, just modify train.prototxt and deploy.txt.
If you insist to remove weights from binary file, follow this caffe example.
And to make sure you delete right part, this visualizing tool should help.
I would retrain on a smaller input size, change strides, etc. However if you want to reduce file size, I'd suggest quantizing the weights https://github.com/yuanyuanli85/CaffeModelCompression and then using something like lzma compression (xz for unix). We do this so we can deploy to mobile devices. 8 bit weights compress nicely.

How to read/restore big data file (SEGY format) with C/C++?

I am working on a project which needs to deal with large seismic data of SEGY format (from several GB to TB). This data represents the 3D underground structure.
Data structure is like:
1st tract, 2,3,5,3,5,....,6
2nd tract, 5,6,5,3,2,....,3
3rd tract, 7,4,5,3,1,....,8
...
What I want to ask is, in order to read and deal with the data fast, do I have to convert the data into another form? Or it's better to read from the original SEGY file? And is there any existing C package to do that?
If you need to access it multiple times and
if you need to access it randomly and
if you need to access it fast
then load it to a database once.
Do not reinvent the wheel.
When dealing of data of that size, you may not want to convert it into another form unless you have to - though some software does do just that. I found a list of free geophysics software on Wikipedia that look promising; many are open source and read/write SEGY files.
Since you are a newbie to programming, you may want to consider if the Python library segpy suits your needs rather than a C/C++ option.
Several GB is rathe medium, if we are toking about poststack.
You may use segy and convert on the fly, you may invent your own format. It depends whot you needed to do. Without changing segy format it's enough to createing indexes to traces. If segy is saved as inlines - it's faster access throug inlines, although crossline access is not very bad.
If it is 3d seismic, the best way to have the same quick access to all inlines/crosslines is to have own format - based od beans, e.g 8x8 traces - loading all beans and selecting tarces access time may be very quick - 2-3 secends. Or you may use SSD disk, or 2,5x RAM as your SEGY.
To quickly access timeslices you have 2 ways - 3D beans or second file stored as timeslices (the quickes way). I did same kind of that 10 years ago - access time to 12 GB SEGY was acceptable - 2-3 seconds in all 3 directions.
SEGY in database? Wow ... ;)
The answer depends upon the type of data you need to extract from the SEG-Y file.
If you need to extract only the headers (Text header, Binary header, Extended Textual File headers and Trace headers) then they can be easily extracted from the SEG-Y file by opening the file as binary and extracting relevant information from the respective locations as mentioned in the data exchange formats (rev2). The extraction might depend upon the type of data (Post-stack or Pre-stack). Also some headers might require conversions from one format to another (e.g Text Headers are mostly encoded in EBCDIC format). The complete details about the byte locations and encoding formats can be read from the above documentation
The extraction of trace data is a bit tricky and depends upon various factors like the encoding, whether the no. of trace samples is mentioned in the trace headers, etc. A careful reading of the documentation and getting to know about the type of SEG data you are working on will surely make this task a lot easier.
Since you are working with the extracted data, I would recommend to use already existing libraries (segpy: one of the best python library I came across). There are also numerous free available SEG-Y readers, a very nice list has already been mentioned by Daniel Waechter; you can choose any one of them that suits your requirements and the type file format supported.
I recently tried to do something same using C++ (Although it has only been tested on post-stack data). The project can be found here.

How generate random word from real languages

How I can generate random word from real language?
Anybody know any API from internet with this functional?
For example I send http-request to 'ht_tp://www.any...api.com/getword?lang=en' and I get responce 'Town'. Or 'Fast'. Or 'Received'... For example I send http-request to 'ht_tp://www.any...api.com/getword?lang=ru' and I get responce 'Ходить'. Or 'Шапка'. Or 'Отправлено'... Any form (noun, adjective, verb etc...) of the words of the any language.
I find resource 'http://www.randomlists.com/random-words'. But this is not JSON format, only English, and don't any warranty work in long time.
Please any ideas.
See this answer : https://stackoverflow.com/questions/824422/can-i-get-an-english-dictionary-word-list-somewhere Download a word dictionary, stick in the databse and fetch a random record or read a random line from the file each time. This way you don't depend on 3rd party API and you can extend it in all the languages you can find words for.
You can download the OpenOffice dictionaries. They come as extension (oxt), which is nothing different than a ZIP file. You may open them with 7zip or alike. Within you will find lots of files, interesting for you are the *.dic files. They will also contain resolutions or number words.
When you encounter something like abandon/LdS get rid of the /LdS this is used for hunspell.
Take these *.dic files use their name as key, put them into a database and pick a random word from there for a given language code.
Update
Older, but easier to access, the archived hunspell dictionaries from OpenOffice.
This question can be viewed in two ways and therefore I give two answers:
To collect words, I would run a spider on websites with known language (Wikipedia is a good starting point) and strip HTML tags.
To generate words from a real language is trickier. Using statistics from the collected words, it is possible to use Markow chains that produces statistically real words. I have tried letter by letter generation, and that works poorly. It is probably a better approach to use syllable construction instead.

Quick way to output a picture in C++

I'm coding a physical simulation on 2d array and I'm now thinking that I could benefit from having a graphical output. My system is an array of cells (up to 2048*2048 of them) taking binary values, until now I used a prompt or text file output of '+' and '-' but it's not efficient for 2048*2048 lattice and maybe outputting in an image would be quicker and neater. Still, I've never done that. Ideally a library allowing me to write blue and red pixels/cell while parsing my lattice would get the job done. Are there some pre-existing not too long tools for doing it in c++?
Edit: I think that I just found what I was looking for: png++
After no more than 10 lines of coding I got the following output:
All I was asking for! Thank you for the suggestions ;)
You can easily get away without using an external imaging library by outputting a very simple format such as PGM or PBM. Refer to the wikipedia page on Netbpm for more details, but you're essentially outputting all the values as either ASCII or binary numbers, then any image viewer or editor that supports PGM (many of which do) can open and display them. Even if you don't have an editor, something like ImageMagick can easily convert it to a PNG or any other more accessible format.
I've used this technique in the past to quickly visualize 2D data, as you're intending to.
C++ does not have native support for graphics. You need an additional C++ library.
Personally, I suggest you to use Qt, which is free, powerful and cross-platform.

Simple data visualization from data to create/place circles/spere on grid

all I want is to create circles (from data) on a grid for specific sets of data with different colours. These might be objects I have created to be placed on grid or from the program itself. I was using POVRAY but it is massively complicated and I don't have the time. Unless anyone has a tutorial on how to read data from files and extract all the numbers and used successfully in .pov files.
There are several programs/environments (not C++) that can do this directly. One is gnuplot; another, more robust tool is R. Although with R there is a bit more of a learning curve to really get moving.
Have you considered GNUplot?
It has a simple syntax, so you can just convert your data file into an input file for gnuplot.