I am looking for a library or method to decode a QR code (or potentially another form of 2D barcode) and to determine the camera's position and orientation from it. This seems like it should be doable, but I am not entirely sure.
Does anyone know what the best route for this is? Or if it is even possible?
zxing is the open-source, Google-hosted Java library for 2D barcodes, including QR codes.
see com.google.zxing.ResultMetadataType.ORIENTATION (optional metadata returned in a hashtable from com.google.zxing.Result.getResultMetadata()):
Denotes the likely approximate orientation of the barcode in the image. This value is given as degrees rotated clockwise from the normal, upright orientation. For example a 1D barcode which was found by reading top-to-bottom would be said to have orientation "90". This key maps to an Integer whose value is in the range [0,360).
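Note that ORIENTATION only gives you the in-plane rotation of the code. To recover full camera position and orientation you also need the detected points in the image (zxing exposes its detected position points via com.google.zxing.Result.getResultPoints()) plus the code's physical size, and then you solve a perspective-n-point problem, e.g. with OpenCV's solvePnP. A rough C++ sketch, assuming a calibrated camera and four detected corner points with known positions on the code:

// Sketch: camera pose from a QR code's corner points via solvePnP.
// Assumes cameraMatrix/distCoeffs from a prior calibration and four
// image-space corners supplied by whatever barcode library you use.
#include <opencv2/opencv.hpp>
#include <vector>

void poseFromQr(const std::vector<cv::Point2f> &imageCorners, // 4 corners, same order as below
                float sideLength,                             // physical code size, e.g. meters
                const cv::Mat &cameraMatrix, const cv::Mat &distCoeffs)
{
    float s = sideLength;
    // Corner coordinates in the QR code's own frame (the code lies in z = 0)
    std::vector<cv::Point3f> objectCorners = {
        {0, 0, 0}, {s, 0, 0}, {s, s, 0}, {0, s, 0}
    };

    cv::Mat rvec, tvec;
    cv::solvePnP(objectCorners, imageCorners, cameraMatrix, distCoeffs, rvec, tvec);

    // rvec/tvec place the code in camera coordinates; invert to get the
    // camera's position expressed in the QR code's frame.
    cv::Mat R;
    cv::Rodrigues(rvec, R);
    cv::Mat cameraPosition = -R.t() * tvec;
}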
Many Android apps make heavy use of QR codes - if I were you I'd do some research using Android as one of the keywords, and maybe add "android" as a tag on this question (or post an Android-specific version of it).
P.S. Since Android's code is, IIRC, open source and available from Google, if the QR logic is in core Android you'd be able to access it.
I'm using C++ with DirectShow to capture multiple images from the camera, much like OpenCV's camera capture does.
The camera I'm using has an AoI (area of interest) control with which I can move the x and y offsets to shift the camera view. I tried searching the web, but I couldn't find anything.
Is there a way to control those values using DirectShow? There seems to be a way to change values such as gain and so on, but there's no mention of the AoI control.
This totally depends on the source DirectShow filter that came with your camera; you should check its documentation or view its property page with GraphEdit.
If you are using a standard filter, try switching to the one that came with your hardware, since the lack of results probably means there is no support for this option in the standard source filter.
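For the standard properties, querying the capture filter for IAMVideoProcAmp is how things like gain get set; a vendor AoI control, if it exists at all, would be exposed through a custom COM interface that you query for in the same way. A rough sketch of the standard-property route (the SetGain helper here is just illustrative):

#include <dshow.h>
#pragma comment(lib, "strmiids.lib")

// Illustrative helper: set gain on a capture source filter through the
// standard IAMVideoProcAmp interface. Vendor-specific controls such as AoI
// would be reached by QueryInterface'ing for the vendor's own interface.
HRESULT SetGain(IBaseFilter *pCaptureFilter, long gain)
{
    IAMVideoProcAmp *pProcAmp = NULL;
    HRESULT hr = pCaptureFilter->QueryInterface(IID_IAMVideoProcAmp,
                                                (void **)&pProcAmp);
    if (FAILED(hr))
        return hr; // the filter doesn't expose the standard interface

    long lMin, lMax, lStep, lDefault, lFlags;
    hr = pProcAmp->GetRange(VideoProcAmp_Gain, &lMin, &lMax, &lStep,
                            &lDefault, &lFlags);
    if (SUCCEEDED(hr))
        hr = pProcAmp->Set(VideoProcAmp_Gain, gain, VideoProcAmp_Flags_Manual);

    pProcAmp->Release();
    return hr;
}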
I'm trying to implement a program that will take a scanned (possibly rotated) document like an ID card, detect its type based on two or more image templates, and normalize it (de-rotate it and resize it so it matches the template). Everything will be scanned, so luckily perspective is not a problem.
I have already tried a number of approaches with no success:
I tried using OpenCV's features2d to detect the template and findHomography to normalize it, but it fails extremely often. If I take a template, change it a little (different data/photo on the ID card) and rotate it ~40 degrees, it usually fails no matter what combination of detectors, descriptors, and matchers I use (a sketch of the pipeline I tried is shown below).
I also tried http://manpages.ubuntu.com/manpages/gutsy/man1/unpaper.1.html, which is a de-rotation tool, and then tried to do normal matching, but unpaper doesn't work well with rotation angles greater than 20 degrees.
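Roughly, the features2d + findHomography pipeline I tried looks like this (a sketch using ORB descriptors and a RANSAC-filtered homography; the exact detector/descriptor/matcher combinations varied):

// Sketch of the attempted pipeline: ORB keypoints matched between template
// and scan, RANSAC homography, then warp the scan into template coordinates.
#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat normalizeToTemplate(const cv::Mat &templ, const cv::Mat &scan)
{
    cv::Ptr<cv::ORB> orb = cv::ORB::create(2000);
    std::vector<cv::KeyPoint> kpT, kpS;
    cv::Mat descT, descS;
    orb->detectAndCompute(templ, cv::noArray(), kpT, descT);
    orb->detectAndCompute(scan, cv::noArray(), kpS, descS);

    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(descT, descS, matches);

    std::vector<cv::Point2f> ptsT, ptsS;
    for (const cv::DMatch &m : matches) {
        ptsT.push_back(kpT[m.queryIdx].pt);
        ptsS.push_back(kpS[m.trainIdx].pt);
    }

    // RANSAC is supposed to reject the outlier matches caused by the
    // differing photo/data on each card - in practice this is where it broke.
    cv::Mat H = cv::findHomography(ptsS, ptsT, cv::RANSAC, 3.0);

    cv::Mat normalized;
    cv::warpPerspective(scan, normalized, H, templ.size());
    return normalized; // scan de-rotated and resized into template coordinates
}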
If there's a ready solution it would be really great; a commercial library (preferably C/C++, or a command-line tool) is also an option. I hate to admit it, but I fail miserably when I try to understand computer vision papers, so unfortunately linking them won't help me.
Thank you very much for your help!
I recently saw the virtual mirror concept on YouTube; I tried it out and researched it. It seems the creators used augmented reality so that people can see the output on their screens. While researching I found out that usually a pattern is identified, onto which a 3D image is superimposed.
Question 1: How are they able to superimpose the jewellery and track the person's face without identifying any pattern?
I also checked various libraries that I could use to make a similar program. It seems to me that a lot of people are using Android phones and iPhones and making apps that use augmented reality.
Question 2: Is there any way I can use C++ to make a program that uses augmented reality?
Oh, and the most important thing, the link to the application is provided below:
http://www.boutiqueaccessories.com.au/virtual-mirror/w1/i1001664/
Do try it out. It's a good experience. :D
I'm not able to actually try the live demo, but the linked video suggests that they either use some simplified pattern recognition (get the person's outline), or they simply track you based on the initial image (with your position/texture being determined by the outline being shown).
Following the video, it's easy to see that there's no real/advanced AR behind this. The images are simply overlaid or hidden (e.g. when it loses track of one ear because you're looking to the side) and they're not transformed (no perspective correction or resizing happening). They definitely seem to track the head (or features like the ears, neck, etc.); depending on your background and surroundings that's actually a rather trivial task.
Question 2: Sure! There are lots of premade toolkits out there, but you could just as well use a general image processing library such as OpenCV to do the math. Augmented reality usually uses some kind of pattern (e.g. a card or page with a known pattern) to determine the correct position and transformation for the contents to be added to the image. There are also approaches that use the device's orientation and perspective changes in camera images to determine depth/position (I really like this demo).
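To make that pattern-based approach concrete, here is a rough sketch of classic marker-based AR in C++ using OpenCV's contrib aruco module (API as in OpenCV 4.x contrib; newer releases moved this behind cv::aruco::ArucoDetector, and the hard-coded camera intrinsics below are placeholders for real calibration values):

#include <opencv2/opencv.hpp>
#include <opencv2/aruco.hpp>
#include <vector>

int main()
{
    cv::VideoCapture cam(0);

    // Placeholder intrinsics; in practice these come from camera calibration
    cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) << 800, 0, 320,
                                                        0, 800, 240,
                                                        0, 0, 1);
    cv::Mat distCoeffs = cv::Mat::zeros(5, 1, CV_64F);

    cv::Ptr<cv::aruco::Dictionary> dict =
        cv::aruco::getPredefinedDictionary(cv::aruco::DICT_4X4_50);

    cv::Mat frame;
    while (cam.read(frame)) {
        std::vector<int> ids;
        std::vector<std::vector<cv::Point2f>> corners;
        cv::aruco::detectMarkers(frame, dict, corners, ids);

        if (!ids.empty()) {
            // Recover each marker's 3D pose; this is the transformation you
            // would use to render content "attached" to the marker.
            std::vector<cv::Vec3d> rvecs, tvecs;
            cv::aruco::estimatePoseSingleMarkers(corners, 0.05f, cameraMatrix,
                                                 distCoeffs, rvecs, tvecs);
            cv::drawFrameAxes(frame, cameraMatrix, distCoeffs,
                              rvecs[0], tvecs[0], 0.05f);
        }
        cv::imshow("ar", frame);
        if (cv::waitKey(1) == 27) break; // Esc to quit
    }
    return 0;
}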
I have to process a lot of scanned IDs and I need to extract photos from them for further processing.
Here's a fictional example:
The problem is that the scans are not perfectly aligned (rotated up to 10 degrees). So I need to find their position, rotate them and cut out the photo. This turned out to be a lot harder than I originally thought.
I checked OpenCV, and the only thing I found was rectangle detection, but it didn't give me good results: the detected rectangle doesn't always match well enough on my samples. Also, its image matching works only for non-rotated images, since it's just a brute-force comparison.
So I thought about using ARToolKit (an augmented reality lib) because I know it's able to locate a given marker in an image very precisely. But it seems that the markers have to be very simple, so I can't use a constant part of the document for this purpose (please correct me if I'm wrong). Also, I found it super hard to compile on Ubuntu 11.10.
OCR - I haven't tried this one yet, and before I start my research I'd be thankful for any suggestions on what to look for.
I'm looking for a C (preferably) or C++ solution. Python is an option too.
If you don't find another ideal solution, one method I ended up using for OCR preprocessing in the past was to convert the source images to PPM and use unpaper in Ubuntu. You can attempt to deskew the image based on whichever sides you specify as having clearly-defined edges, and there is an option to bypass the filters that would normally be applied to black and white text. You probably don't want those for images.
Example for images skewed no more than 15 degrees, using the bottom and right edges to detect rotation:
unpaper -n -dn bottom,right -dr 15 input.ppm output.ppm
unpaper was written in C, if the source is any help to you.
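Once unpaper has deskewed the scan, plain (rotation-free) template matching is usually enough to find the photo region; a minimal OpenCV sketch (locatePhoto is just an illustrative name):

// Find the best match of a photo-sized template in a deskewed scan.
#include <opencv2/opencv.hpp>

cv::Rect locatePhoto(const cv::Mat &deskewedScan, const cv::Mat &photoTemplate)
{
    cv::Mat response;
    cv::matchTemplate(deskewedScan, photoTemplate, response, cv::TM_CCOEFF_NORMED);

    double maxVal;
    cv::Point maxLoc;
    cv::minMaxLoc(response, nullptr, &maxVal, nullptr, &maxLoc);

    // maxVal near 1.0 means a confident match; threshold as needed
    return cv::Rect(maxLoc, photoTemplate.size());
}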
Is there a very basic color/shape detection mechanism through which one could detect a specific color or shape in a webcam feed? I want to use the color or shape as a symbolic marker for an AR application.
The ideal case would be NFT (natural feature tracking), but I am not much of a coder and have no experience with OpenCV (though I have read a lot about it in previous discussions here). So far I have only worked with the SLARToolkit, and that offers only basic b/w marker detection.
And the more easily usable NFT libraries are, well, not freeware. :/
Any guidance on integrating the above-mentioned detection routines into a .NET/Flash environment would be of great help.
Color detection is very easy: take your video-stream images and convert them to binary images by treating each pixel's RGB value as a vector (e.g. RGB = [0, 255, 0] = green) and marking pixels within a given distance of the target vector as positive hits. This is one of the easiest forms of computer vision, and a couple of early CV-based PS2 games involved detecting brightly colored props.
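In OpenCV terms the same idea looks like the sketch below (just to illustrate the math; from .NET you would go through a wrapper such as Emgu CV, and detectColor is an illustrative name):

// Pixels whose color vector lies within maxDist of the target color become
// white (positive hits), everything else black. Note OpenCV stores BGR.
#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat detectColor(const cv::Mat &frameBGR, cv::Scalar targetBGR, double maxDist)
{
    cv::Mat diff;
    cv::absdiff(frameBGR, targetBGR, diff);   // per-channel |pixel - target|
    diff.convertTo(diff, CV_32F);

    std::vector<cv::Mat> ch;
    cv::split(diff.mul(diff), ch);            // squared differences per channel

    cv::Mat dist, mask;
    cv::sqrt(ch[0] + ch[1] + ch[2], dist);    // Euclidean distance per pixel
    cv::threshold(dist, mask, maxDist, 255, cv::THRESH_BINARY_INV);
    mask.convertTo(mask, CV_8U);
    return mask;                              // binary image of positive hits
}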
This is my favorite paper on shape recognition - if you want to detect simple 2D outlines on flat surfaces, this is a great technique.
I'm neither a .NET nor a Flash programmer, so I can't offer any help there.