I am using AIR to do some augmented reality using fiducial marker tracking. I am using FLARToolkit and it works fine, except that the frame rate drops to ridiculous lows in certain lighting conditions. This is because Flash only uses the CPU for processing, and every frame it is applying filters, adjusting thresholds, and analyzing the pixels to find the marker pattern. Without any hardware acceleration, it can get really slow.
I did some searching and it looks like the fastest and most stable tracking library is Studierstube ( http://handheldar.icg.tugraz.at/stbtracker.php and http://studierstube.icg.tugraz.at/download.php ). Unfortunately, I am not a C++ developer. But it seems that the tracking is insanely fast using this tracker (especially since it isn't all CPU processing like Flash is).
So my plan is to build (or rather have someone build) a small C++ program that leverages this tracker and sends the marker position data every frame (I only need 30 FPS) to my Flash client application, which displays the video and some augmented reality experiences. I believe this would be done through a socket server or something, right? Is this possible, and fairly easy for someone who is a decent C++ developer? I would ask them directly, but I am still searching for such a person.
Maybe this link will be helpful:
http://www.adobe.com/devnet/air/flex/quickstart/articles/interacting_with_native_process.html
As others have said here, it can be done with NativeProcess...
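For illustration, here is a hypothetical sketch of the native half of that approach: a tiny C++ loop that runs the tracker once per frame and writes one line of marker data to stdout, which the AIR side can read from the NativeProcess standardOutput stream. The Pose struct and detectMarker() are placeholders standing in for whatever the real tracking library (Studierstube or otherwise) actually exposes.

```cpp
#include <cstdio>
#include <chrono>
#include <thread>

struct Pose { int id; float x, y, rotation; bool found; };

// Placeholder: replace this stub with real calls into the tracking library.
Pose detectMarker() { return Pose{0, 0.0f, 0.0f, 0.0f, false}; }

int main()
{
    for (;;)
    {
        Pose p = detectMarker();
        if (p.found)
        {
            // One whitespace-separated line per frame; easy to split in ActionScript.
            std::printf("%d %.3f %.3f %.3f\n", p.id, p.x, p.y, p.rotation);
            std::fflush(stdout);   // don't let buffering delay the per-frame data
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(33));   // ~30 FPS
    }
}
```

On the AIR side, the NativeProcess standardOutput ProgressEvent handler would simply read each line and split it into the id and pose values.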
I am writing an application for Windows that runs a CUDA accelerated HDR algorithm. I've set up an external image signal processor device that presents as a UVC device, and delivers 60 frames per second to the Windows machine over USB 3.0.
Every "even" frame is a more underexposed frame, and every "odd" frame is a more overexposed frame, which allows my CUDA code perform a modified Mertens exposure fusion algorithm to generate a high quality, high dynamic range image.
Very abstract example of Mertens exposure fusion algorithm here
My only problem is that I don't know how to tell when I'm missing frames, since the only camera API I have interfaced with on Windows (Media Foundation) doesn't make it obvious when a frame I grab with IMFSourceReader::ReadSample isn't the frame received immediately after the last one I grabbed.
Is there any way that I can guarantee that I am not missing frames, or at least easily and reliably detect when I have, using a Windows available API like Media Foundation or DirectShow?
It wouldn't be such a big deal to miss a frame and then have to purposefully "skip" the next frame in order to grab the next overexposed or underexposed frame to pair with the last frame grabbed, but I would need to know how many frames were actually missed since the last grab.
Thanks!
DirectShow has the IAMDroppedFrames::GetNumDropped method, and chances are the same information can be retrieved through Media Foundation as well (I have never tried; it is possibly obtainable with a method similar to this).
The GetNumDropped method retrieves the total number of frames that the filter has dropped since it started streaming.
However, I would question its reliability. The reason is that with both of these APIs, the attribute that is more or less reliable is the frame's time stamp. Capture devices can reduce the frame rate for several reasons, both external (such as low light) and internal (such as slow, blocking processing downstream in the pipeline). This makes it hard to distinguish between odd and even frames, but the time stamp remains accurate, and you can apply frame-rate math to convert time stamps to frame indices.
In your scenario, however, I would rather detect large gaps in frame times to identify a possible loss of continuity, and from there run an algorithm that compares the exposure of the next few consecutive frames to get back in sync with the under-/over-exposure alternation. That sounds like a more reliable way out.
After all, this exposure pattern is quite likely to be specific to the hardware you are using.
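As a rough illustration of that frame-rate math (not taken from either API's documentation), one could convert each sample's 100-nanosecond time stamp into a nominal frame index at 60 fps and compare it with the previous index; kFps, the rounding, and the global state are assumptions made for the sketch.

```cpp
// Hypothetical helper: turn ReadSample time stamps (100 ns units) into frame
// indices at a nominal 60 fps and count how many frames were skipped.
const double kFps = 60.0;
long long g_lastIndex = -1;

void OnFrameTimestamp(long long timestamp100ns)
{
    long long index = static_cast<long long>(timestamp100ns * kFps / 10000000.0 + 0.5);
    if (g_lastIndex >= 0 && index > g_lastIndex + 1)
    {
        long long missed = index - g_lastIndex - 1;   // frames lost since the last grab
        // An odd number of missed frames means the next frame repeats the
        // exposure class of the last one grabbed; an even number keeps the
        // under/over alternation intact.
        (void)missed;
    }
    g_lastIndex = index;
}
```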
Normally, MFSampleExtension_Discontinuity exists for exactly this. Check it on each sample you get from IMFSourceReader::ReadSample.
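A minimal sketch of that check, assuming a source reader has already been configured for the video stream; error handling is stripped down and the re-sync logic is only hinted at in comments.

```cpp
#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>

#pragma comment(lib, "mfreadwrite")
#pragma comment(lib, "mfplat")
#pragma comment(lib, "mfuuid")

void ReadLoop(IMFSourceReader* pReader)
{
    for (;;)
    {
        DWORD streamIndex = 0, flags = 0;
        LONGLONG timestamp = 0;            // 100-nanosecond units
        IMFSample* pSample = nullptr;

        HRESULT hr = pReader->ReadSample(
            MF_SOURCE_READER_FIRST_VIDEO_STREAM,
            0, &streamIndex, &flags, &timestamp, &pSample);
        if (FAILED(hr) || (flags & MF_SOURCE_READERF_ENDOFSTREAM))
            break;

        if (pSample)
        {
            UINT32 discontinuity = 0;
            // Set to TRUE by the pipeline when one or more samples were lost
            // before this one.
            if (SUCCEEDED(pSample->GetUINT32(MFSampleExtension_Discontinuity,
                                             &discontinuity)) && discontinuity)
            {
                // A gap was flagged; use the timestamp delta and the nominal
                // frame duration (1/60 s here) to estimate how many frames
                // were lost, then re-sync the under/over-exposed pairing.
            }
            pSample->Release();
        }
    }
}
```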
I am working on a project that requires me to detect and track a human in a live video from a webcam connected to a Beagleboard xm.
I have completed this task using OpenCV in the pixel domain. The results on the board are very accurate but extremely slow. Many people have suggested that I leave the pixel domain and do the same task on H.264/MPEG-4 compressed video, as this would greatly reduce the computational overhead.
I have read many research papers but have failed to find any software platform or library that I can use to analyze and process H.264 compressed video.
I would be thankful if someone could suggest a library for H.264 compressed-video analysis and guide me further.
Thanks and Regards.
I'm not sure how practical this really is (I've never tried to do it), but my guess is that what they're referring to is looking for a block of macro-blocks that all have (nearly) identical motion vectors.
For example, let's assume you have a camera that's not panning, and the picture shows a car driving across the screen. Looking at the motion vectors, you should have a (roughly) car-shaped bunch of macro-blocks that all have similar motion vectors (denoting the motion of the car). Then, rather than look at the entire picture for your object of interest, you can look at that block in isolation and try to identify it. Likewise, if the camera was panning with the car, you'd have a car-shaped block with small motion vectors, and most of the background would have similar motion vectors in the opposite direction of the car's movement.
Note, however, that this is likely to be imprecise at best. Just for example, let's assume our mythical car is driving in front of a brick building, with its headlights illuminating some of the bricks. In this case, a brick in one picture might (easily) not point back at the same brick in the previous picture, but instead point at whichever brick in the previous picture happened to be illuminated about the same. The bricks are enough alike that the closest match will depend more on illumination than on the brick itself.
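To make the idea concrete, here is a toy sketch (not tied to any particular decoder) that groups neighbouring macro-blocks whose motion vectors differ by less than a tolerance, using a simple flood fill; the MV struct and the row-major grid of per-macroblock vectors are assumptions made purely for illustration.

```cpp
#include <vector>
#include <queue>
#include <cmath>

struct MV { float dx, dy; };   // one motion vector per macro-block

// Label connected regions of macro-blocks with (nearly) identical motion.
std::vector<int> labelRegions(const std::vector<MV>& mv, int cols, int rows,
                              float tolerance)
{
    std::vector<int> label(mv.size(), -1);
    int next = 0;
    for (int start = 0; start < static_cast<int>(mv.size()); ++start)
    {
        if (label[start] != -1) continue;
        label[start] = next;
        std::queue<int> q;
        q.push(start);
        while (!q.empty())
        {
            int i = q.front(); q.pop();
            int x = i % cols, y = i / cols;
            const int nx[4] = { x - 1, x + 1, x, x };
            const int ny[4] = { y, y, y - 1, y + 1 };
            for (int k = 0; k < 4; ++k)
            {
                if (nx[k] < 0 || nx[k] >= cols || ny[k] < 0 || ny[k] >= rows) continue;
                int j = ny[k] * cols + nx[k];
                if (label[j] != -1) continue;
                // Grow the region only if the neighbour moves about the same way.
                if (std::hypot(mv[j].dx - mv[i].dx, mv[j].dy - mv[i].dy) < tolerance)
                {
                    label[j] = next;
                    q.push(j);
                }
            }
        }
        ++next;
    }
    return label;   // blocks sharing a label form one candidate moving object
}
```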
You may eventually be able to parse the H.264 stream and determine that it contains an object, but this will not be the kind of "object tracking" you're looking for. OpenCV is excellent software, and this is exactly what it does best. Have you considered scaling the video down to a smaller resolution for easier analysis by OpenCV?
I think you are highly overestimating the computing power of this $45 computer. Object recognition and tracking is VERY hard, computationally speaking. I would start by seeing how many frames per second your board can track and optimize from there. Start looking at where your bottlenecks are; you may be better off processing raw video instead of having to decode H.264 video first. Then again, raw video takes a LOT of RAM, and processing through it takes a LOT of CPU.
Minimize the overhead from decoding video, and minimize RAM overhead by scaling down the video before analysis, but in the end you're asking a LOT from a 1 GHz, 32-bit ARM processor.
FFmpeg is a very old library that is not well supported nowadays. It has very limited capabilities for processing and object tracking in H.264 compressed video, and most of its commands are outdated.
The best thing would be to study H.264 thoroughly and then try to implement your own API in a language like Java or C#.
I'm trying to build a screen-flashing application that flashes the screen according to the music (which will be specific frequencies, such as healing frequencies, etc.).
I have already made the player and know how I will make the screen flash, but I need the screen to flash very quickly in step with the music; for example, if the music speeds up, the flashing should get faster. I understand that I would achieve this with an FFT or DSP (since I only need to know when the level at some frequency, let's say 20 Hz, rises, in order to change the color and trigger the flash).
But I've found that I understand NOTHING about it, much less how to implement it in my application.
Can somebody help me learn either of them? My email is sismetic_chaos#hotmail.com. I really need help; I've been stuck for about 3 days, not coding or doing anything at all, just trying to understand, but I don't.
PS: My application is written in C++ and Qt.
PS: Thanks for taking the time to read this and for the willingness to help.
Edit: Thanks to everyone for the answers. The problem is by no means solved yet, but I appreciate all the answers; I didn't think I would get so many answers and so much info. Thanks to you all.
This is a difficult problem, requiring more than an FFT. I'll briefly describe how I implemented beat detection when I was writing software for professional DJ equipment.
First of all, you'll need to cut down the amount of data you're dealing with, since there are only two or three beats per second, but tens of thousands of samples. You'll also need to look at different frequency ranges, since some types of music carry the tempo in the bassline, and others in percussion or other instruments. So pass the signal through several band-pass filters (I chose 8 filters, each covering one octave, from low bass to high treble), and then downsample each band by averaging the power over a few hundred samples.
Every few seconds, you'll have a thousand or so samples in each band. Your next tool is an autocorrelation, to identify repetitive patterns in the music. The peaks of the autocorrelation tell you what the beat is more or less likely to be; but you'll need to make up some heuristics to compare all the frequency bands to find a beat that you can be confident in, and to avoid misleading syncopations. If you can manage that, then you'll have a reasonable guess at the tempo, but no idea of the phase (i.e. exactly when to flash the screen).
Now you can look at a smoothed version of the audio data for peaks, some of which are likely to correspond to beats. Initially, look for the strongest peak over the course of a few seconds and take that as a downbeat. In conjunction with the tempo you estimated in the first stage, you can predict when the next beat is due, measure where you actually saw something like a beat, and adjust your estimate to more closely match the data. You can also maintain a confidence level based on how well the predicted beats match the measured peaks; if that drops too low, restart the beat detection from scratch.
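As a rough sketch of the autocorrelation stage described above (assuming you already have the downsampled power envelope of one band, at envelopeRate samples per second), something like this finds the most self-similar lag and converts it to BPM; the tempo limits and the normalization are arbitrary choices, not the exact code I used.

```cpp
#include <vector>
#include <cstddef>

// Estimate tempo from one band's power envelope via brute-force autocorrelation.
double estimateBpm(const std::vector<double>& envelope, double envelopeRate)
{
    // Only consider tempos between roughly 60 and 180 BPM.
    const int minLag = static_cast<int>(envelopeRate * 60.0 / 180.0);
    const int maxLag = static_cast<int>(envelopeRate * 60.0 / 60.0);

    int bestLag = minLag;
    double bestScore = -1.0;
    for (int lag = minLag; lag <= maxLag && lag < static_cast<int>(envelope.size()); ++lag)
    {
        double score = 0.0;
        for (std::size_t i = 0; i + lag < envelope.size(); ++i)
            score += envelope[i] * envelope[i + lag];
        score /= static_cast<double>(envelope.size() - lag);   // normalize for lag length
        if (score > bestScore) { bestScore = score; bestLag = lag; }
    }
    return 60.0 * envelopeRate / bestLag;   // lag in envelope samples -> beats per minute
}
```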
There are a lot of fiddly details to this, and it took me some weeks to get it working nicely. It is a difficult problem.
Or for a simple visualisation effect, you could simply detect peaks and flash the screen for each one; it will probably look good enough.
The output of an FFT will give you the frequency spectrum of an audio sample, but extracting the tempo from the FFT output is probably not the way you want to go.
One thing you can do is to use peak detection to identify the volume "spikes" that typically occur on the "down-beats" of the music. If you can identify the down-beats, then you can use a resource like bpmdatabase.com to find the tempo of the song. The tempo will tell you how fast to flash and the peaks you detected will tell you when to start flashing. Have your app monitor your flashes to make sure that they generally occur at the same time as a peak (if the two start to diverge, then the tempo may have changed mid-song).
That may sound straightforward, but this is actually a very non-trivial thing to implement. You might want to read this SO question for more information. There are some quality links in the answers there.
If I'm completely mis-interpreting what you are trying to do and you need to do FFTs for something different, then you might want to look at using one of the existing FFT libraries to do the heavy lifting for you. Some examples are FFTW and KissFFT.
It sounds like maybe you're trying to get your visualizer to flash the screen in time with the music somehow. I don't think calculating the FFT is going to help you here. At any given instant, there will be many simultaneous frequency components, all over the audio spectrum (roughly 20 Hz to 20 kHz). But you're likely to be a lot more interested in the musical tempo (beats per minute -- more like 5 Hz or below), and that's not going to show up anywhere in an FFT of the raw audio signal.
You probably need something much simpler -- some sort of real-time peak detection. Whenever you see a peak greater than some threshold above the average volume, make your screen flash.
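A very small sketch of that idea, assuming you receive the audio in blocks of samples; the threshold and smoothing factors are guesses to tune by ear, and the flash callback stands in for whatever your Qt code does to flash the screen.

```cpp
#include <vector>
#include <functional>

// Call once per audio block; invokes `flash` when the block's power spikes
// well above the recent average.
void processBlock(const std::vector<float>& samples, double& runningAvg,
                  const std::function<void()>& flash)
{
    double power = 0.0;
    for (float s : samples)
        power += static_cast<double>(s) * s;
    power /= samples.size();

    const double kThreshold = 2.5;   // peak must be 2.5x the recent average
    if (runningAvg > 0.0 && power > kThreshold * runningAvg)
        flash();

    // Exponential moving average keeps the "recent average" cheap to maintain.
    runningAvg = 0.95 * runningAvg + 0.05 * power;
}
```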
Of course, more complicated visualizations might well take advantage of the FFT, but not the one you're describing.
My recommendation would be to find a library that does this for you. Unless you have a lot of mathematics to back you up, I think you will be wasting a ton of your time trying to learn FFTs when all you really want is some sort of 'bass hits per minute' number that you can adjust your graphics to.
Check out this similar post:
here
It took me about three weeks to understand the mathematics behind FFTs and then another week to write something in Matlab using those concepts. If you are discouraged after three days, don't try and roll your own.
I hope this is helpful advice and not discouraging.
-Brian J. Stinar-
As previous answers have noted, an FFT is probably not the tool you need in order to solve your problem, which requires tempo detection rather than spectral analysis.
For an example of what can be done using an FFT, and of how a particular FFT implementation was integrated into a Qt application, take a look at this blog post describing the spectrum analyzer demo I developed. Code for the demo ships with Qt itself, in the demos/spectrum directory.
I'm experimenting with developing a tool for remote OpenGL rendering, in C++. The basic idea is:
The client issues OpenGL commands like it's a normal app
Those commands are actually sent over the network to an external server
The server performs the rendering using some off-screen technique
Once done, the server transmits a single frame over the network to the client
The client renders the frame on screen.
Loop.
I know I shouldn't start worrying about optimization if I don't have a finished product yet, but I'm pretty sure that is going to be very slow, and the bottleneck is probably going to be the single frame transmission over the network, even if those computers are connected in the same LAN.
I'm thinking about using some kind of video streaming library. That way, the frames would be transmitted using proper compression algorithms, making the process faster.
Am I on the right path with this? Is it right to use a video streaming library here? If you think so, what's a good library for this task (in C or C++, preferably C++)?
Thank you for your help!
You have two solutions.
Solution 1
Run the app remotely
Intercept the openGL calls
Forward them on the network
Issue the OpenGL calls locally
-> Complicated, especially when dealing with buffers and textures; the real OpenGL code is executed locally, which may not be what's wanted, but that's up to you. What's more, it's transparent to the remote app (no source modification, no rebuild). Almost no network communication.
Solution 2 : what you described, with the pros and cons.
If you go for Solution 2, don't bother about speed for now. You will have enough challenges with openGL as it is, trust me.
Begin with a synchronous mode: render, fetch, send, render, fetch, send.
Then move to an asynchronous mode: render, begin the fetch, render, end the fetch, begin the send, render, and so on (a rough sketch of that asynchronous fetch is below).
It will be hard enough, I think
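A rough sketch of the asynchronous fetch step, using two pixel-buffer objects (PBOs) so glReadPixels returns immediately and the actual readback overlaps with rendering of the next frame. It assumes an OpenGL context is current and a loader such as GLEW has been initialized; renderFrame(), sendFrame(), width, height and running are placeholders for your own code.

```cpp
#include <GL/glew.h>
#include <cstddef>

extern void renderFrame();                                   // placeholder: draw into the off-screen target
extern void sendFrame(const void* rgb, std::size_t bytes);   // placeholder: ship the pixels over the network
extern int width, height;
extern bool running;

void captureLoop()
{
    GLuint pbo[2];
    glGenBuffers(2, pbo);
    const std::size_t frameBytes = static_cast<std::size_t>(width) * height * 3;
    for (int i = 0; i < 2; ++i) {
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i]);
        glBufferData(GL_PIXEL_PACK_BUFFER, frameBytes, nullptr, GL_STREAM_READ);
    }

    int index = 0;
    while (running) {
        renderFrame();

        // Kick off an asynchronous read of the just-rendered frame into one PBO.
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[index]);
        glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, nullptr);

        // Map the other PBO, which was filled while the previous frame rendered.
        int prev = 1 - index;
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[prev]);
        if (void* pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY)) {
            sendFrame(pixels, frameBytes);
            glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
        }
        glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
        index = prev;
    }
    glDeleteBuffers(2, pbo);
}
```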
Depending on the resolution you need to support and the speed of your LAN it may be possible to stream the data uncompressed.
A 24-bit 1280x1024 frame is roughly 31 Mbit, so over gigabit Ethernet that gives a theoretical ceiling of about 32 uncompressed frames per second.
If that is not enough, adding simple RLE compression yourself is fairly straightforward.
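For example, a minimal byte-oriented run-length encoder along those lines might look like this; the (count, value) pair format is just one arbitrary choice, and the matching decoder is equally trivial.

```cpp
#include <vector>
#include <cstdint>
#include <cstddef>

// Encode each run of identical bytes as (count, value), with count capped at 255.
// Framebuffer data with large flat areas compresses well; noisy data may grow.
std::vector<uint8_t> rleEncode(const uint8_t* data, std::size_t size)
{
    std::vector<uint8_t> out;
    std::size_t i = 0;
    while (i < size)
    {
        uint8_t value = data[i];
        std::size_t run = 1;
        while (i + run < size && data[i + run] == value && run < 255)
            ++run;
        out.push_back(static_cast<uint8_t>(run));
        out.push_back(value);
        i += run;
    }
    return out;
}
```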
Imagine having to spend $ on both machines to provide them with proper graphics processing power. You could avoid this and simplify the client development if you centralize all the graphics related tasks on one single machine. The job of the client would be only to send/receive/display data, and the server could focus on processing the graphics (OpenGL) and sending the data (as frames) back to the client.
The bottleneck you referred to depends on a couple of things on your side: the size of the images and the frame rate you need to send/receive/display them.
These are some interesting topics I've read, and hopefully they will shed some light on the subject:
Video streaming using c++
How do I stream video and play it?
Does anyone know a reason why my programs could be causing my speakers to output some soft static? The programs themselves don't have a single element that outputs sound to anything, yet when I run a few of my programs I can hear a static coming from my speakers. It even gets louder when I run certain programs. Moving the speakers around doesn't help, so it must be coming from inside the computer.
I'm not sure what other details to put down since this seems very odd. They are OpenGL programs written in C++ with MS Visual C++.
Edit: It seems that swapping the framebuffers inside an infinite loop is what makes the noise; when I stop swapping, I get silence...
:)
You will be surprised to know that the speaker input is picking up static from the hard disk. When you do something memory/disk intensive (like swapping framebuffers) that makes the hard disk work hard, the sound appears.
I had the same problem some years back and solved it too, but I'm sorry to say I don't remember how I did it.
Hope the diagnosis helps in remedying the problem.
UPDATE: I remembered. If you are using Windows, go to volume control and mute all the external inputs/outputs like CD input etc. Just keep the two basic ones.
Computers consume a varying amount of power when executing code. This fluctuation in current acts like an RF transmitter and can be picked up by audio equipment, where it is essentially "decoded" much like an AM-modulated signal. Since the execution usually does not produce a recognizable signal, it sounds like white noise. A good example of audio equipment picking up an RF signal is holding a (GSM) cell phone close to an audio amplifier while receiving a call: you will most likely hear a characteristic pumping buzz from the phone's transmitter.
Go here to learn more about electromagnetic compatibility. There are multiple ways a signal can couple into your audio. Since you mentioned a power cord as the source, it was most likely magnetic inductive coupling.
Since you say you don't touch sound in your programs, I doubt it's your code doing this. Does it occur if you run any other graphics-intensive programs? Also, what happens if you mute various channels in the mixer (sndvol32.exe on 32-bit windows)?
Not knowing anything else I'd venture a guess that it could be related to the fan on your graphics card. If your programs cause the fan to turn on and it's either close to your sound card or the fan's power line crosses an audio cable, it could cause some static. Try moving any audio cables as far as possible from the fan and power cables and see what happens.
It could also be picking up static from a number of other sources, and I wouldn't say it's necessarily unusual. If non-graphics-intensive programs cause this as well, it could be hard-disk access, or even certain frequencies of CPU/power usage being picked up on an audio line like an antenna. You can also try to reduce the number of loops in your audio wires and see if it helps, but no guarantees.
Crappy audio hardware on motherboards is often to blame, especially the kind that ends up in office PCs. The interior of a PC case is full of electrical noise; if that couples into the audio hardware, you'll hear it.
Solution: Get a pair of headphones with a volume control on the cord. Turn the volume on the headphones down, and turn the volume on the PC up full. This will increase the signal level relative to the noise level in most cases.
Most electronic devices give off some kind of electromagnetic interference. Your speakers or sound hardware may be picking up something as simple as the signaling on your video cable or the graphics card itself. Cheap speakers and poorly-protected audio devices tend to be fairly sensitive to this kind of radiation, in my experience.
There is interference on your motherboard that is leaking onto your sound bus.
This is usually because of the quality of your motherboard, or its age. Also, the layout of the equipment inside your computer (close together, overlapping) often creates interesting EM fields. My old laptop did this more and more easily as it got older.
So as things are winding up or down you'll hear it.
Try to see if it happens on a different computer. Try different computers of different ages and different configurations (external soundcard, or a physical sound card, etc).
Hope that helps.