Audio Recording/Mixer Software - c++

As a guitarist I have always wanted to develop my own recording, mixing software. I have some experience in Direct Sound, Windows Multimedia (waveOutOpen, etc). I realise that this will be a complex project, but is purely for my own use and learning, i.e. no deadlines! I intend to use C++ but as yet am unsure as the best SDK/API to use. I want the software to be extensible as I may wish to add effects in the future. A few prerequisites...
To run on Windows XP
Minimal latency
VU meter (on all tracks)
This caused me to shy away from Direct Sound as there doesn't appear to be a way to read audio data from the primary buffer.
Overdubbing (i.e. record a new track whilst playing existing tracks).
Include a metronome
My initial thoughts are to use WMM and use the waveOutWrite function to play audio data. I guess this is essentially an audio streaming player. To try and keep things simpler, I will hard-code the sample rate to 16-bit, 44.1kHZ (the best sampling rate my sound card supports). What I need are some ideas, guidance on an overall architecture.
For example, assume my tempo is 60 BPM and time signature is 4/4. I want the metronome to play a click at the start of every bar/measure. Now assume that I have recorded a rhythm track. Upon playback I need to orchestrate (pun intended) what data is sent to the primary sound buffer. I may also, at some point, want to add instruments, drums (mainly). Again, I need to know how to send the correct audio data, at the correct time to the primary audio buffer. I appreciate timing is key here. What I am unsure of is how to grab correct data from individual tracks to send to the primary sound buffer.
My initial thoughts are to have a timing thread which periodically asks each track, "I need data to cover N milliseconds of play". Where N depends upon the primary buffer size.
I appreciate that this is a complex question, I just need some guidance as to how I might approach some of the above problems.
An additional question is WMM or DirectSound better suited for my needs. Maybe even ASIO? However, the main question is how, using a streaming mechanism, do I gather the correct track data (from multiple tracks) to send to a primary buffer, and keep minimal latency?
Any help is appreciated,
Many thanks
Karl
Thanks for the responses. However, my main question is how to time all of this, to ensure that each track writes appropriate data to the primary buffer, at the correct time. I am of course open to (free) libraries that will help me achieve my main goals.

As you intend to support XP (which I would not recommend, as even the extended support will end next year) you really have no choice but to use ASIO. The appropriate SDK can be downloaded from Steinberg. In Windows Vista and above WASAPI Exclusive Mode might be a better option due to wider availability, however the documentation is severely lacking IMO. In any case, you should have a look at PortAudio which helps wrap these APIs (and unlike Juce is free.
Neither WMM nor DirectSound nor XAudio 2 will be able to achieve sufficiently low latencies for realtime monitoring. Low-latency APIs usually periodically call a callback for each block of data.
As every callback processes a given number of samples, you can calculate the time from the sample rate and a sample counter (simply accumulate across callback calls). Tip: do not accumulate with floating point. That way lies madness. Use a 64 bit sample counter, as the smallest increment is always 1./sampleRate.
Effectively your callback function would (for each track) call a getSamples(size_t n, float* out) (or similar) method and sum up the results (i.e. mix them). Each individual track could would then have an integrated sample time to compute what is currently required. For periodic things (infinite waves, loops, metronomes) you can easily calculate the number of samples per period and have a modulo counter. That would lead to rounded periods, but as mentioned before, floating point accumulators are a no-no, they can work ok for periodic signals though.
In the case of the metronome example you might have a waveform "click.wav" with n samples and a period of m samples. Your counter periodically goes from 0 to m-1 and as long as the counter is less than n you play the corresponding sample of your waveform. For example a simple metronome that plays a click each beat could look something like this:
class Metronome
{
std::vector<float> waveform;
size_t counter, period;
public:
Metronome(std::vector<float> const & waveform, float bpm, float sampleRate) : waveform(waveform), counter(0)
{
float secondsPerBeat = 60.f/bpm; // bpm/60 = bps
float samplesPerBeat = sampleRate * secondsPerBeat;
period = (size_t)round(samplesPerBeat);
}
void getSamples(size_t n, float* out)
{
while(n--)
{
*out++ = counter < waveform.size() ? waveform[counter] : 0.f;
counter += 1;
counter -= counter >= period ? period : 0;
}
}
};
Furthermore you could check the internet for VST/AU Plugin programming tutorials, as these have the same "problem" of determining time from the number of samples.

As you've discovered, you are entering a world of pain. If you're really building audio software for Windows XP and expect low latency, you'll definitely want to avoid any audio API provided by the operating system, and do as almost all commercial software does and use ASIO. Whilst things got better, ASIO isn't going anyway any time soon.
To ease you pain considerably, I would recommend having a look at Juce, which is a cross-platform framework for building both audio host software and plugins. It's been used to build many commercial products.
They've got many of the really nasty architectural hazards covered, and it comes with examples of both host applications and plug-ins to play with.

Related

C++ playing audio live from byte array

I am using C++ and have the sample rate, number of channels, and bit depth for my audio. I also have a char array containing the audio that I want to play. I am look for something along the lines of, sending a quarter of a second (or some other short amount of audio) to be played, then sending some more, etc. Is this possible, and if it is how would it be done.
Thanks for any help.
I've done this before with the library OpenAL.
This would require a pretty involved answer and hopefully the OpenAL documentation can walk you through it all, but here is the source example which I wrote that plays audio streaming in from a mumble server in nodejs.
You may need to ask a more specific question to get a better answer as this is a fairly large topic. It may also help to list other technologies you may be using such as target operating system(s) and if you are using any libraries already. Many desktop and game engines already have api's for playing simple sounds and using OpenAL may be much more complex than you really need.
But, briefly, the steps of the solution are:
Enumerate devices
Capture a device
Stream data to device
enqueue audio to buffer alSourceQueueBuffers
play queued buffer alSourcePlay

zwave/cpp - GetValue from Thermostat (Temperature and Humidity)

I know I'm asking a dumb question, but I'm quite of a zwave/openzwave beginner, so I wanted to get some help on that.
My zwave network is already up, and I have two nodes:
the key itself to control the other nodes
a sensor for temperature and humidity (the ST814, from Everspring)
Now, I want to display the temperature and the humidity in my console, but I'm not really understanding how it works. From what I understood, I need to configure the auto-report of my sensor (doc is here, see page 6), and get the notifications every X minutes, but I'm not sure.
Does someone already did that or know how to do it?
Thank you a lot,
Maxime
Imagine there's a room full of people from Sweden, and they're all talking to each other in Swedish. Even though you can hear what they're saying, it doesn't mean anything to you because you don't speak Swedish. If you had the ability to speak Swedish, you would understand exactly what was going on.
Now imagine there's a network full of devices and a controller that all speak Z-Wave. Sensors are reporting temperature and humidity at regular intervals to the controller. But, even though you can hear what they're saying, it doesn't mean anything to you because you don't speak Z-Wave.
OpenZWave is a library that enables you to understand and speak Z-Wave. You can use it to create software that listens to the conversations, decides what action to take and even barks out orders in Z-Wave to devices (e.g., motion detection -> call the police). OpenZWave comes with sample applications that show you how to construct your own home automation software using the OpenZWave library. You can also use a software package such as Domoticz, HomeSeer, OpenHAB or SmartThings. These applications provide a broad set of home automation features and functionality so you don't have to program them yourself.
To use the least amount of battery, a device such as the ST814 spends most of its time sleeping. At user-defined regular intervals (for example, every hour), the device wakes up, reports the temperature and humidity to the controller and checks to make sure there are no other commands or requests waiting for it. Then it goes back to sleep. You determine how often the device wakes up and can set it according to the instructions you referenced.
If you want to intercept the temperature and humidity report from the ST814 to the controller and output it to the console with OpenZwave, you need to write some code or use someone else's program. The latter is easier, but might not enable you to do exactly what you want to do. Using OpenZWave is harder, but provides the capability to do just about anything you want to do.

real time audio processing in C++

I want to produce software that reads raw audio from an external audio interface (Focusrite Scarlett 2i2) and processes it in C++ before returning it to the interface for playback. I currently run Windows 8 and was wondering how to do this with minimum latency?
I've spent a while looking into (boost) ASIO but the documentation seems fairly poor. I've also been considering OpenCL but I've been told it would most likely have higher latency. Ideally I'd like to be able to just access the Focusrite driver directly.
I'm sorry that this is such an open question but I've been having some trouble finding educational materiel on Audio Programming, other than just manipulating the audio when provided by a third party plug in design suite such as RackAFX. I'd also be grateful if anyone could recommend some reading on low level stuff like this.
You can get very low latency by communicating directly with the Focuswrite ASIO driver (this is totally different than boost ASIO). To work with this you'll need to register and download the ASIO SDK from Steinberg. Within the API download there is a Visual C++ sample project called hostsample which is a good starting point and there is pretty good documentation about the buffering process that is used by ASIO.
ASIO uses double buffering. Your application is able to choose a buffer size within the limits of the driver. For each input channel and each output channel, 2 buffers of that size are created. While the driver is playing from and recording to one set of buffers your program is reading from and writing to the other set. If your program was performing a simple loopback then it would have access to the input 1 buffer period after it was recorded, would write directly to the output buffer which would be played out on the next period so there would be 2 buffer periods of latency. You'll need to experiment to find the smallest buffer size you can tolerate without glitches and this will give you the lowest latency. And of course the signal processing code will need to be optimized well enough to keep up. A 64 sample (1.3 ms # 48kHz) is not unheard of.

decode Infrared remote control code using c or c++

Please let me know how to decode infrared remote control code using c or c++.
I suppose your question is vague because you absolutely don't know where to start.
If you don't have a infrared receiver, here is a blog is used some time ago to learn how to build one. I hope you have some talents in electronics. Otherwise, it's time to learn ! :D
If you managed to build it (or have another known-working receiver), you can then take a look at LIRC, an open-source project which is compatible with the suggested device.
Well, I don't know about C++, but here it is in C#:
void ReadRemote()
{
while(!Console.KeyAvailable)
Console.Write(DecoderClass.ReadIR(IRType.TVRemoteControl));
}
Different manufacturers use different encoding schemes, different timing, and even different modulation frequencies. A typical IR receiver sensor performs the demodulation in hardware so that it outputs a logic pulse sequence rather than an OOK modulated signal. The receiver needs to match the transmitter modulation frequency. If you want to receive multiple modulation frequencies, you would have to use a simple IR photo-diode and provide your own OOK carrier detector circuitry for each modulation frequency of interest, or use multiple receivers.
Once you have the demodulates pulse sequence, it is a simple matter of decoding it according to the specific manufacturers encoding scheme. I have only ever done this for a Sony SIRC remote to use it as a robot controller. Where the sensor triggered an interrupt and the pulse timing was latched on each interrupt. If your device has input capture and compare timer hardware you could reduce the overhead and increase accuracy (although this is probably not necessary).
There are lots of resources on the subject for different protocols and manufacturers here
There are some applicable standards, but no manufacture has to adhere to them, and there is more that one. RC-5 and RECS 80 are two such standards that are fairly common.
Either buy some hardware that does it and make C or c++ api calls (lirc for example). or if you are interested in decoding the protocols and doing it with software instead of hardware then you want to use a microcontroller.
It is fun and easy and I recommend it as a first microcontroller project. There are a handful of popular protocols, most are a variation on a couple of themes. Basically you get a receiver which you can get from radio shack, although the amount of what used to be radio components at radio shack (maybe that is why they are changing the name to the shack) are much fewer and buried in a small drawer in the back of the store, soon to be gone. Anyway get a receiver there or digikey or mouser, a microcontroller, count the number of clock ticks between rising and falling edges (the receiver has removed the ~40Khz carrier frequency) compare those times to the expected protocol and determine the ones from zeros. Transmitting with just a microcontroller and an ir led can be done but is a little trickier because you need to do the modulation. A single and gate outside the microcontroller along with the modulation clock and/or a timer generated clock is much easier, but requires extra hardware.
The multi protocol receivers are likely not to be as good as a specific protocol receiver, in the same way that gcc is not very good at any one platform, but pretty good across all the platforms. Building your own is fun, educational, and generally results in a better receiver, although you wont be able to match the price of the mass produced products. Part of the problem also has to do with the carrier frequencies vary and ideally you want to choose the receiver that matches the carrier frequency for the protocol you are using. It makes a difference, for example instead of having to be 8 feet away you can be 30 feet away. Also it may work at 3 feet away but not one foot or three inches away. That sort of thing. cable and dish and other universal remotes generate all of the protocols so you are free to pick and choose depending on what your project is.

Running background services on a PocketPC

I've recently bought myself a new cellphone, running Windows Mobile 6.1 Professional. And of course I am currently looking into doing some coding for it, on a hobby basis. My plan is to have a service running as a DLL, loaded by Services.exe. This needs to gather som data, and do som processing at regular intervals (every 5-10 minutes).
Since I need to run this at regular intervals, it is a bit of a problem for me, that the system typically goes to sleep (suspend) after a short period of inactivity by the user.
I have been reading all the documentation I could find on MSDN, and MSDN blogs about this subject, and it seems to me, that there are three possible solutions to this problem:
Keep the system in an "Always On"-state, by calling SystemIdleTimerReset periodically. This seems a bit excessive, and is therefore out of the question.
Have the system periodically waken up with CeRunAppAtTime, and enter the unattended state, to do my processing.
Use the unattended state instead of going into a full suspend. This would be transparent to the user, but the system would never go into sleep.
The second approach seems to be preferred, however, this would require an executable to be called by the system on wake up, with the only task of notifying my service that it should commence processing. This seems a bit unnecessary and I would like to avoid this extra executable. I could of course move all my processing into this extra executable, but I would like to use some of the facilities provided when running as a service, and also not have a program pop up (even if its in the background) whenever processing starts.
At first glance, the third approach seems to have the same basic problem as the first. However, I have read on some of the MSDN blogs, that it might be possible to actually conserve battery consumption with this approach, instead of going in and out of suspend mode often (The arguments for this was that the nature of the WM platform is to have a very little battery consumption, when the system is idle. And that going in and out of suspend require quite a bit of processing).
So I guess my questions are as following:
Which approach would you recommend in my situation? With respect to keeping a minimum battery consumption, and a nice clean implementation.
In the case of approach number two, is it possible to eliminate the need for a notifying executable? Either through alternative API functions, or existing generic applications on the platform?
In the case of approach number three, do you know of any information/statistics relevant to the claim, that it is possible to extend the battery lifetime when using unattended mode over going into suspend. E.g. how often do you need to pull the system out of suspend, before unattended mode is to be preferred.
Implementation specific (bonus) question: Is it necessary to regularly call SystemIdleTimerReset to stay in unattended mode?
And finally, if you think I have prematurely eliminated approach number one, please tell me why.
Please include in your response whether you base your response on knowledge, or are merely guessing (the latter is also very welcome!).
Please leave a comment, if you think I need to clarify any parts of this question.
CERunAppAtTime is a much-misunderstood API (largely because of the terrible name). It doesn't have to run an app. It can simply set a named system event (see the description of the pwszAppName parameter in the MSDN docs). If you care to know when it has fired (to lat your app put the device to sleep again when it's done processing) simply have a worker thread that is doing a WaitForSingleObject on that same named event.
Unattended state is often used for devices that need to keep an app running continuously (like an MP3 player) but conserve power by shutting down the backlight (probably the single most power consuming subsystem).
Obviously unattended mode uses significantly more powr than suspend, becasue in suspend the only power draw is for RAM self-refresh. In unattended mode the processor is stuill powered and running (and several peripherals may be too - depends on how the OEM defined their unattended mode).
SystemIdleTimerReset simply prevents the power manager from putting the device into low-power mode due to inactivity. This mode, whether suspended, unattended, flight or other, is defined by the OEM. Use it sparingly because when your do it impacts the power consumption of the device. Doing it in unattended mode is especially problematic from a user perspective because they might think the device is off (it looks that way) but now their battery life has gone south.
I had a whole long post detailing how you shouldn't expect to be able to get acceptable battery life because WM is not designed to support what you're trying to do, but -- you could signal your service on wakeup, do your processing, then use the methods in this post to put the device back to sleep immediately. You should be able to keep the ratio of on-time-to-sleep-time very low this way -- but as you say, I'm only guessing.
See also:
Power-Efficient Apps (MSDN)
Power To The People (Developers 1, Developers 2, Devices)
Power-Efficient WM Apps (blog post)