I'm going to use openFrameworks to develop a music visualizer, that is, music-driven graphics.
I can use OpenGL for the graphics, but the problem is the audio processing, a field I have no experience in. My idea was to extract musical features such as pitch, beat, and volume and use them to control the graphics, but I don't know how to start. I learned the FFT in math, but I don't know what to do once I have the spectrum of a piece of music. How can I extract those features after the FFT? More generally, how should I approach the music processing part?
I would begin by playing with ofxFFT and building intuition from there. Do you have a set piece of music you will be using? What characteristics are you looking for? You should probably start by looking at specific frequency ranges, such as high, low, and mid. Depending on the type of sound or music you are playing, you will likely find that what it actually detects does not always match what you think it should detect. The wave is a series of values that you can perform operations on. You could detect the rate at which it changes, how many times it rises above or dips below a range, etc.
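As a rough illustration of "performing operations on the values", here is a minimal sketch in plain C++. It does not use the ofxFFT API; it assumes you already have one block of samples and one magnitude spectrum as `std::vector<float>`. All function names, the `BandLevels` struct, and the 250 Hz / 4000 Hz band limits are assumptions chosen for the example, not anything taken from a library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root-mean-square level of one block of samples: a simple "volume" measure.
float rmsLevel(const std::vector<float>& samples) {
    assert(!samples.empty());
    double sum = 0.0;
    for (float s : samples) sum += static_cast<double>(s) * s;
    return static_cast<float>(std::sqrt(sum / samples.size()));
}

// Number of sign changes in the block. More crossings in the same amount of
// time suggest more high-frequency content.
std::size_t zeroCrossings(const std::vector<float>& samples) {
    std::size_t count = 0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        const bool prevNegative = samples[i - 1] < 0.0f;
        const bool currNegative = samples[i] < 0.0f;
        if (prevNegative != currNegative) ++count;
    }
    return count;
}

// Summed magnitudes for three frequency ranges (hypothetical struct).
struct BandLevels { float low; float mid; float high; };

// magnitudes[k] is bin k of an fftSize-point FFT, centred on
// k * sampleRateHz / fftSize. The band limits are arbitrary example values.
BandLevels bandLevels(const std::vector<float>& magnitudes, float sampleRateHz,
                      std::size_t fftSize, float lowMaxHz = 250.0f,
                      float midMaxHz = 4000.0f) {
    assert(fftSize > 0 && sampleRateHz > 0.0f);
    BandLevels levels{0.0f, 0.0f, 0.0f};
    for (std::size_t k = 0; k < magnitudes.size(); ++k) {
        const float hz = static_cast<float>(k) * sampleRateHz /
                         static_cast<float>(fftSize);
        if (hz < lowMaxHz) levels.low += magnitudes[k];
        else if (hz < midMaxHz) levels.mid += magnitudes[k];
        else levels.high += magnitudes[k];
    }
    return levels;
}
```

Calling these once per audio block gives a handful of numbers per frame (volume, "brightness", low/mid/high levels) that can drive sizes, colours, or positions in the graphics. A sudden jump in the low band from one block to the next is a crude starting point for beat detection.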
I have a project that uses sound sensors, but the output is not in frequency units. How do I write the code so that the output is a frequency (Hz)? I am using the Arduino IDE. Thanks for your attention.
To get the spectrum of an audio signal, you have to transform it from the time domain to the frequency domain. This is usually done with the Fast Fourier Transform (FFT).
https://www.rationalacoustics.com/files/FFT_Fundamentals.pdf
https://en.wikipedia.org/wiki/Fast_Fourier_transform
This is a very common task in the Arduino community. A web search will turn up more tutorials and other resources than you can read.
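To make the last step concrete (turning a spectrum into a number in Hz), here is a minimal sketch in portable C++. It uses a plain, slow DFT rather than an FFT library so that it stays self-contained; on a real board you would probably swap in an FFT library and feed it the sensor readings. The function name and the assumption that the samples were taken at a known, constant rate are mine. The key relationship is that bin `k` of an `n`-point transform corresponds to `k * sampleRate / n` Hz.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the frequency in Hz of the strongest component in a block of
// samples taken at a constant rate of sampleRateHz (hypothetical helper).
double dominantFrequencyHz(const std::vector<double>& samples,
                           double sampleRateHz) {
    const std::size_t n = samples.size();
    assert(n >= 4 && sampleRateHz > 0.0);
    const double pi = std::acos(-1.0);

    std::size_t bestBin = 1;
    double bestPower = -1.0;
    // Skip bin 0 (the DC offset). For real input, bins above n/2 only
    // mirror the lower ones, so they carry no extra information.
    for (std::size_t k = 1; k <= n / 2; ++k) {
        double re = 0.0;
        double im = 0.0;
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = 2.0 * pi * k * t / n;
            re += samples[t] * std::cos(angle);
            im -= samples[t] * std::sin(angle);
        }
        const double power = re * re + im * im;
        if (power > bestPower) {
            bestPower = power;
            bestBin = k;
        }
    }
    // Convert the bin index to Hz.
    return bestBin * sampleRateHz / n;
}
```

The frequency resolution is `sampleRateHz / n`, so 64 samples at 8000 Hz can only distinguish frequencies 125 Hz apart, and nothing above half the sample rate can be measured at all.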
I've already loaded the .wav audio into a buffer with XAudio2 (Windows 8.1), and to play it I just have to use:
//start consuming audio in the source voice
/* IXAudio2SourceVoice* */ g_source->Start();
//play the sound
g_source->SubmitSourceBuffer(buffer.xaBuffer());
How can I get the frequency value at a given time with XAudio2?
The question does not make much sense as asked: a .wav file contains a great many frequencies. It is the blend of them that makes it sound like music to your ears instead of an artificially generated tone, and that blend is constantly changing.
A signal processing step is required to convert the samples in the .wav file from the time domain to the frequency domain. This is generally known as spectrum analysis, and the Fast Fourier Transform (FFT) is the standard technique.
A random Google hit on "xaudio2 fft" produced this code sample. No idea how good it is, but something to play with to get the lay of the land. You'll find more about it in this gamedev question.
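Independently of the linked sample, the time-domain to frequency-domain step can be sketched in plain C++ with no XAudio2 calls at all. The sketch assumes the .wav data is already available as mono 16-bit PCM in a `std::vector`; the function names `fft` and `spectrumAt` are made up for this example. It cuts out the block of samples that starts at the requested time, applies a Hann window, and runs a textbook recursive radix-2 FFT.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

using Complex = std::complex<double>;

// In-place recursive radix-2 FFT; the size must be a power of two.
void fft(std::vector<Complex>& a) {
    const std::size_t n = a.size();
    if (n <= 1) return;
    assert((n & (n - 1)) == 0);
    std::vector<Complex> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = a[2 * i];
        odd[i] = a[2 * i + 1];
    }
    fft(even);
    fft(odd);
    const double pi = std::acos(-1.0);
    for (std::size_t k = 0; k < n / 2; ++k) {
        const Complex t = std::polar(1.0, -2.0 * pi * k / n) * odd[k];
        a[k] = even[k] + t;
        a[k + n / 2] = even[k] - t;
    }
}

// Magnitude spectrum of the fftSize samples starting at timeSeconds.
// Element k of the result corresponds to k * sampleRateHz / fftSize Hz.
// Assumes mono 16-bit PCM (an assumption of this sketch).
std::vector<double> spectrumAt(const std::vector<std::int16_t>& pcm,
                               double sampleRateHz, double timeSeconds,
                               std::size_t fftSize) {
    assert(fftSize >= 2 && timeSeconds >= 0.0);
    const std::size_t start =
        static_cast<std::size_t>(timeSeconds * sampleRateHz);
    assert(start + fftSize <= pcm.size());
    const double pi = std::acos(-1.0);

    std::vector<Complex> block(fftSize);
    for (std::size_t i = 0; i < fftSize; ++i) {
        // The Hann window reduces leakage caused by cutting the block out.
        const double hann = 0.5 - 0.5 * std::cos(2.0 * pi * i / (fftSize - 1));
        block[i] = Complex(hann * pcm[start + i] / 32768.0, 0.0);
    }
    fft(block);

    std::vector<double> magnitudes(fftSize / 2);
    for (std::size_t k = 0; k < fftSize / 2; ++k)
        magnitudes[k] = std::abs(block[k]);
    return magnitudes;
}
```

The result is a whole spectrum for a short slice of time rather than a single "frequency value", which is the point made above: at any moment the music is a blend of many frequencies.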
I am working on a project that requires me to detect and track a human in a live video from a webcam connected to a Beagleboard xm.
I have completed this task using OpenCV in the pixel domain. The results on the board are very accurate but extremely slow. Many people have suggested that I leave the pixel domain and do the same task on H.264/MPEG-4 compressed video, as it would greatly reduce the computational overhead.
I have read many research papers but failed to discover any software platform or a library that I can use to analyze and process h.264 compressed videos.
I would be thankful if someone could suggest a library for H.264 compressed video analysis and guide me further.
Thanks and Regards.
I'm not sure how practical this really is (I've never tried to do it), but my guess would be that what they're referring to would be looking for a block of macro-blocks that all have (nearly) identical motion vectors.
For example, let's assume you have a camera that's not panning, and the picture shows a car driving across the screen. Looking at the motion vectors, you should have a (roughly) car-shaped bunch of macro-blocks that all have similar motion vectors (denoting the motion of the car). Then, rather than look at the entire picture for your object of interest, you can look at that block in isolation and try to identify it. Likewise, if the camera was panning with the car, you'd have a car-shaped block with small motion vectors, and most of the background would have similar motion vectors in the opposite direction of the car's movement.
Note, however, that this is likely to be imprecise at best. Just for example, let's assume our mythical car is driving in front of a brick building, with its headlights illuminating some of the bricks. In this case, a brick in one picture might (easily) not point back at the same brick in the previous picture, but instead point at the brick in the previous picture that happened to be illuminated about the same. The bricks are enough alike that the closest match will depend more on illumination than on the brick itself.
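The grouping idea described above can be sketched as follows. This is only an illustration of the clustering step: it assumes the per-macroblock motion vectors have already been extracted from the stream by some decoder, which is the hard part and is not shown. The `MotionVector` and `Regions` structs, the function name, and the tolerance parameter are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// One motion vector per macroblock (hypothetical layout, row-major grid).
struct MotionVector { int dx; int dy; };

// labels[i] is 0 for a static macroblock, or 1..count for a moving region.
struct Regions { std::vector<int> labels; int count; };

// Groups 4-connected macroblocks that are moving and whose vectors differ
// from their neighbour's by at most `tolerance` in each component.
Regions labelMovingRegions(const std::vector<MotionVector>& mv, int cols,
                           int rows, int tolerance) {
    assert(cols > 0 && rows > 0);
    assert(mv.size() == static_cast<std::size_t>(cols) * rows);
    Regions out{std::vector<int>(mv.size(), 0), 0};

    auto moving = [&](int i) { return mv[i].dx != 0 || mv[i].dy != 0; };
    auto similar = [&](int a, int b) {
        return std::abs(mv[a].dx - mv[b].dx) <= tolerance &&
               std::abs(mv[a].dy - mv[b].dy) <= tolerance;
    };

    for (int seed = 0; seed < cols * rows; ++seed) {
        if (!moving(seed) || out.labels[seed] != 0) continue;
        const int label = ++out.count;
        std::vector<int> stack{seed};
        out.labels[seed] = label;
        // Flood fill from the seed over similar, moving neighbours.
        while (!stack.empty()) {
            const int cur = stack.back();
            stack.pop_back();
            const int x = cur % cols;
            const int y = cur / cols;
            const int nx[4] = {x - 1, x + 1, x, x};
            const int ny[4] = {y, y, y - 1, y + 1};
            for (int d = 0; d < 4; ++d) {
                if (nx[d] < 0 || nx[d] >= cols || ny[d] < 0 || ny[d] >= rows)
                    continue;
                const int nb = ny[d] * cols + nx[d];
                if (out.labels[nb] == 0 && moving(nb) && similar(cur, nb)) {
                    out.labels[nb] = label;
                    stack.push_back(nb);
                }
            }
        }
    }
    return out;
}
```

Each labelled region is a candidate object; the bounding box of a region could then be handed to a pixel-domain detector so that only that part of the frame is examined. For a panning camera you would first subtract the dominant background vector, since this sketch treats a zero vector as "static".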
You may eventually be able to parse the H.264 stream and determine that it contains an object, but this will not be the "object tracking" you're looking for. OpenCV is excellent software, and this is what it does best. Have you considered scaling the video down to a smaller resolution for easier analysis by OpenCV?
I think you are greatly overestimating the computing power of this $45 computer. Object recognition and tracking is VERY hard, computationally speaking. I would start by measuring how many frames per second your board can track and optimize from there. Look for where your bottlenecks are; you may be better off processing raw video instead of having to decode H.264 video first. Then again, raw video takes a LOT of RAM, and processing it takes a LOT of CPU.
Minimize the overhead of decoding video and minimize RAM use by scaling down the video before analysis, but in the end you're asking a LOT from a 1 GHz, 32-bit ARM processor.
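To show why scaling down helps, here is a minimal sketch of halving a frame in plain C++. It assumes an 8-bit grayscale frame stored row by row with even width and height; the function name is made up for the example. In OpenCV itself you would normally reach for `cv::resize` or `cv::pyrDown` instead of writing this by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Halves the width and height of an 8-bit grayscale frame by averaging
// each 2x2 block of pixels (rounded to nearest).
std::vector<std::uint8_t> downscaleByTwo(const std::vector<std::uint8_t>& src,
                                         int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    const int outW = width / 2;
    const int outH = height / 2;
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(outW) * outH);
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            // Index of the top-left pixel of the 2x2 source block.
            const std::size_t i =
                static_cast<std::size_t>(2 * y) * width + 2 * x;
            const int sum = src[i] + src[i + 1] +
                            src[i + width] + src[i + width + 1];
            dst[static_cast<std::size_t>(y) * outW + x] =
                static_cast<std::uint8_t>((sum + 2) / 4);
        }
    }
    return dst;
}
```

Each halving leaves a quarter of the pixels, so a 640x480 frame becomes 320x240 and any per-pixel analysis has roughly a quarter of the work to do, at the cost of losing small details.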
FFmpeg is primarily a decoding and encoding toolkit, and it has very limited capabilities for analysis and object tracking in H.264 compressed video. It is a long-running project that is still actively maintained, but many of the command examples found online are outdated.
The best approach would be to study H.264 thoroughly and then try to implement your own API in a language like Java or C#.
I am using AIR to do some augmented reality using fiducial marker tracking. I am using FLARToolkit and it works fine, except that the frame rate drops to ridiculous lows in certain lighting conditions. This is because Flash only uses the CPU for processing, and on every frame it is applying filters, adjusting thresholds, and analyzing the pixels to find the marker pattern. Without any hardware acceleration, it can get really slow.
I did some searching and it looks like the fastest and most stable tracking library is Studierstube ( http://handheldar.icg.tugraz.at/stbtracker.php and http://studierstube.icg.tugraz.at/download.php ). Unfortunately, I am not a C++ developer. But it seems that the tracking is insanely fast using this tracker (especially since it isn't all CPU processing like Flash is).
So my plan is to build (or rather have someone build) a small C++ program that leverages this tracker and then sends the marker position data every frame (I only need 30 FPS) to my Flash client application, which displays the video and some augmented reality experiences. I believe this would be done through a socket server or something similar, right? Is this possible, and fairly easy for a decent C++ developer? I would ask him or her, but I am still searching for such a person.
Maybe this link will be helpful?
http://www.adobe.com/devnet/air/flex/quickstart/articles/interacting_with_native_process.html
As some have said here, it can be done with NativeProcess.
I am making a simple game for fun and learning, using SFML for the 2D stuff. The game is rather simple. I am loath to say it is a HoG (hidden object game), but I guess that gets my point across quickly. Basically, I am using SFML to load and display 2D still art and capture mouse events.
Anyway, I would like to add video clips to my project. All the art is rendered. For example, if my image is of a park with a fountain, I would like to have a looping video of the water running so the image has some life even though it is just a still.
All I need is the ability to play videos in the window, preferably in a way that is compatible with SFML, but I am still in the planning stage, so I can swap to something else if needed. The project will have a set resolution (not scalable), and I just want to load the videos and play them at a certain pixel location in x,y. So if I have a 1200x720 image, I play a 100x100 pixel video on loop at a certain location to make the water of the fountain move.
I am thinking I can just load 2D sprites on top of the video, matching the background image, to do simple masking. Some formats, like QuickTime, can embed an alpha channel directly into the video, and if that is supported, awesome, but some planning in the set design should mean it is not really needed. If it were supported, though, it would open up more options in set design.
I am pretty good with video as I am a 3D animator by profession, new to programming as a learning hobby. So the format and container of the video is not really an issue though I have been working with OGV a lot recently.
What I see it needing is:
Load multiple videos at once
Play without any borders or anything
Play at specific locations in a window
Loop seamlessly
Allow z-depth so I can place sprites on top of it
Does anyone know where I would start looking into this? It seems like something that could possibly be a library I could use, preferably an open source one, as this is just a for-fun project, nothing commercial.
Thanks in advance for any ideas you may have.