I use GStreamer to play audio and regularly require a timestamp of where I am in the file.
If I adjust the rate of play, whether by issuing a seek command specifying a new playback rate or by using a plugin like “pitch” to adjust its “tempo” property, all timings go out of the window: GStreamer adjusts the reported length of the audio and its current position to factor in the speed it is playing at. So what was at, say, 18 seconds is now reported at 14 seconds, for example.
I have also tried stopping the audio and starting afresh with the new settings, and issuing the seek with a rate of 1.00 as well as with the tempo rate; neither has worked. I have run out of ideas for the moment, hence this plea to SO.
Example code
#Slow down rate of file play
def OnSlow(self, evt):
    media_state = self.media_get_state()
    self.rate = Gpitch.get_property("tempo")
    if self.rate > 0.2:
        self.rate = self.rate - 0.10
        Gpitch.set_property("tempo", self.rate)
        r = "%.2f" % (self.rate)
        self.ma3.SetLabel("Speed: " + r)
        if media_state == Gst.State.PLAYING or media_state == Gst.State.PAUSED:
            self.timer.Stop()  # momentarily stop updating the screen
            seek_event = Gst.Event.new_seek(self.rate, Gst.Format.TIME,
                                            Gst.SeekFlags.FLUSH,
                                            Gst.SeekType.NONE, 0,
                                            Gst.SeekType.NONE, -1)
            Gplayer.send_event(seek_event)
            time.sleep(0.1)
            self.timer.Start()  # restart updating the screen
I have tried multiplying the duration and the current position by the adjustment, in an attempt to pull or push the timestamps back to where they would be if the file were playing at normal speed, but to no avail.
I have been pulling my hair out over this, and the real kick in the teeth is that if I perform the same task using VLC as my audio engine it works, but I have to alter the pitch separately. The whole reason for moving over to GStreamer was that the pitch plugin tracks the “tempo” component, and yet if I cannot get accurate and consistent timestamps the project is dead in the water.
My question: has anybody a) come across this issue and b) mastered it?
The answer, for anyone who finds themselves in a similar predicament, seems to lie with a fundamental tussle between the soundtouch pitch plugin and GStreamer's rate of play.
Audacity even makes a note of it in their User Manual.
The only way I eventually found around the problem was to ditch the pitch plugin from the pipeline entirely, as just having it in there was enough to mess things up.
Instead, I used the ladspa-am-pitchshift-1433-so-ampitchshift plugin to adjust the pitch of the audio and left GStreamer to vary the rate of play using normal seek commands with an altered rate, giving slower and faster playback.
In this way the timestamps remain consistent, but the pitch has to be adjusted manually, although that can be semi-automated by picking from a list of predefined pitch values for given rates of play.
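For illustration, a pipeline along those lines might be built something like the sketch below. The LADSPA element name is the one mentioned above, but the control-property name ("pitch-shift") and the table of predefined pitch values are assumptions; verify them with gst-inspect-1.0 on your system.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# filesrc -> decodebin -> audioconvert -> LADSPA pitch shifter -> sink
pipeline = Gst.parse_launch(
    "filesrc location=audio.ogg ! decodebin ! audioconvert ! "
    "ladspa-am-pitchshift-1433-so-ampitchshift name=shifter ! "
    "audioconvert ! autoaudiosink")
shifter = pipeline.get_by_name("shifter")

# Semi-automated pitch correction: a predefined pitch value for each play rate
pitch_for_rate = {0.80: 1.25, 0.90: 1.11, 1.00: 1.00, 1.10: 0.91, 1.20: 0.83}
rate = 0.80
shifter.set_property("pitch-shift", pitch_for_rate[rate])  # property name assumed
pipeline.set_state(Gst.State.PLAYING)
The rate itself is then changed with a normal seek, as described in the note below.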
I trust that this saves someone else 2 days of head scratching.
Additional Note:
Even though GStreamer works in nanoseconds, and one could be forgiven for thinking that omitting the Gst.SeekFlags.ACCURATE flag when performing a seek wouldn't make that much difference, one would be very much mistaken.
I have seen the position GStreamer reports differ by up to 10 seconds when the seek did not use the ACCURATE flag.
So forewarned is forearmed.
(Note that using this flag makes the seek take longer, but at least it gives consistent results.)
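As a concrete sketch (assuming the playback element is the Gplayer from the code above), a rate-changing seek that keeps position reports consistent would combine FLUSH with ACCURATE:
def seek_with_rate(element, rate, position_ns):
    # Flushing, accurate seek to position_ns at the given play rate.
    # ACCURATE makes the seek slower, but the reported position stays consistent.
    event = Gst.Event.new_seek(rate, Gst.Format.TIME,
                               Gst.SeekFlags.FLUSH | Gst.SeekFlags.ACCURATE,
                               Gst.SeekType.SET, position_ns,
                               Gst.SeekType.NONE, -1)
    return element.send_event(event)
For example, seek_with_rate(Gplayer, 0.8, Gplayer.query_position(Gst.Format.TIME)[1]) restarts playback from the current position at 80% speed.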
Related
I'm going to make a converter to H.265 with ffmpeg, based on this documentation: http://www.ffmpeg.org/doxygen/trunk/transcoding_8c-example.html
I want to add progress information, but I have no idea what number I can use to show it, for example as a percentage.
Please help. :)
What about offering several variants, selectable with an argument?
I think the time elapsed and the estimated time remaining are more informative than a percentage - for example, so you can leave the machine (or the window) to work and come back to check on it later.
Also, the current frame rate of the conversion is informative; it gives hints for adjusting the bitrate etc. if it's too slow.
So you can measure how long the encoding has taken so far and estimate the processing frame rate and how much remains.
ffmpeg itself displays the current time or current frame of the processed video, along with the video's duration.
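As a sketch of the arithmetic only (in Python for brevity; in the C transcoding example the media time processed so far would come from the last packet's pts converted via the stream's time_base, and the total from the input duration), the numbers above can be derived like this:
import time

def report_progress(start_wall_time, processed_seconds, total_seconds, frames_done):
    # Percentage done, processing frame rate, and a rough estimate of time left.
    fraction = processed_seconds / total_seconds if total_seconds > 0 else 0.0
    elapsed = time.time() - start_wall_time
    fps = frames_done / elapsed if elapsed > 0 else 0.0
    remaining = elapsed * (1.0 - fraction) / fraction if fraction > 0 else float("inf")
    print("%5.1f%% done, %6.1f fps, about %.0f s left" % (fraction * 100.0, fps, remaining))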
I want to measure how much time it takes MF to process my video samples.
I've tried using the sample time as a unique sample identifier, but discovered that the pipeline adjusts the value so it drifts away (not fast, 0-1 ticks of 100 nanoseconds per frame, but even off-by-one is enough to make the value worthless as a unique ID).
I've tried putting a custom value in the sample attributes; that works OK on Windows 10 with the nVidia encoder but fails on Windows 7 with the Microsoft encoder: the output frame doesn't contain my value, so apparently the encoder drops all attributes from the samples. I tried the built-in MFSampleExtension_DeviceTimestamp attribute too, with the same result; the value is lost in the pipeline.
Is there any other way to match input samples with output samples? Manually counted sequence numbers are too fragile IMO, as the framework is heavily multithreaded.
You could write a wrapper MFT around the Microsoft encoder on Windows 7: record the sample times and additional attributes in a queue in IMFTransform::ProcessInput, then in IMFTransform::ProcessOutput look up the attributes by sample time and set them on the output sample. Would that work for you?
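A sketch of that bookkeeping, written in Python only to keep it short (in a real wrapper MFT this logic would sit in ProcessInput/ProcessOutput, and the drift tolerance is an assumption to absorb the small sample-time adjustments described above):
from collections import deque

pending = deque()  # (input sample time, attributes to carry across), in arrival order

def on_process_input(sample_time, attributes):
    pending.append((sample_time, attributes))

def on_process_output(output_sample_time, tolerance=2):
    # Match the output sample to the oldest queued input whose time is close enough.
    while pending:
        in_time, attrs = pending[0]
        if abs(output_sample_time - in_time) <= tolerance:
            pending.popleft()
            return attrs              # re-apply these to the output sample
        if in_time < output_sample_time - tolerance:
            pending.popleft()         # stale entry: that input never produced output
            continue
        break
    return None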
I'm trying to make an application in tkinter that has a number of buttons you can assign sounds to and play them later. The click of the button itself only calls the play() method, so loading of the sound is done beforehand.
I tried making a volume control with sliders (tk.Scale) and noticed there is no audible difference between most volume values until I get very close to zero (bear in mind that the slider resolution is 0.01, from 0.0 to 0.1).
At around 0.02 I think I notice the sound volume is significantly lower and if I get to zero, the sound is muted. Please note that this happens if I move the slider while no sounds are playing.
The interesting thing is, if I try playing a sound that is long enough to let me move the slider while it's playing, I can notice the difference right away, but if the sound stops playing and I try playing it again, it goes to the "default" volume again.
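For reference, the kind of wiring described above might look something like this minimal sketch (file name, layout and slider range are assumptions; Python 3 module names are used):
import tkinter as tk
import pygame

pygame.mixer.pre_init(frequency=44100, size=-16, channels=1, buffer=512)
pygame.mixer.init()
sound = pygame.mixer.Sound("sound.wav")

root = tk.Tk()

def on_volume(value):
    # tk.Scale passes the value as a string; set_volume affects current and future playback
    sound.set_volume(float(value))

tk.Scale(root, from_=0.0, to=1.0, resolution=0.01,
         orient="horizontal", command=on_volume).pack()
tk.Button(root, text="Play", command=sound.play).pack()
root.mainloop()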
Since I divided my application into multiple scripts according to what they do (recording sound, playing sound, GUI), I thought the problem might be that I hadn't initialized the pygame mixer directly, but rather from an imported module, so I made a new Python script and typed in this code:
import pygame
import time
pygame.mixer.pre_init(frequency=44100, size=-16, channels=1, buffer=512)
pygame.mixer.init()
sound1=pygame.mixer.Sound("sound.wav")
sound1.set_volume(1.0)
print sound1.get_volume()
sound1.play()
time.sleep(sound1.get_length())
sound1.set_volume(0.5)
print sound1.get_volume()
sound1.play()
time.sleep(sound1.get_length())
sound1.set_volume(0.08)
print sound1.get_volume()
sound1.play()
time.sleep(sound1.get_length())
The output is the following: 1.0, 0.5 and 0.078125 (one below the other), confirming that the volume has indeed been set (properly, I hope).
The only time I can notice a difference is in the third case, and even that is not very noticeable. I want the volume change to be linear, and this is far from it.
I tried the same thing with a channel:
sound1=pygame.mixer.Sound("sound.wav")
channel=pygame.mixer.find_channel(True)
channel.set_volume(1.0)
channel.play(sound1)
time.sleep(sound1.get_length()/2)
channel.set_volume(0.5)
print "Volume set"
time.sleep(sound1.get_length()/2)
No luck, the same thing happens here too.
I spent all day googling "pygame mixer volume problem" "pygame mixer volume set problem" and similar phrases, but no luck. Hopefully someone here can be of help, considering my diploma depends on a python method. :)
Thanks in advance.
I found the answer (thank you Gummbum from PyGame IRC).
The problem is not in Python or Pygame itself, but rather in Windows. It seems sound enhancements were somehow fiddling with the sound my script (or any other Pygame script, for that matter) was playing.
I'm on Windows 10 and this is how I did it:
Right click on the speaker icon in the taskbar
Select Playback Devices
Select Speakers and Properties
Go to Enhancements tab and uncheck Equalizer and Loudness Equalization
That's it.
Music on a Raspberry Pi:
Using Pygame to play music on my Raspberry Pi, I found the volume far too low at settings from 0.0 to 1.0. Then I tried setting the value higher, up to 10.0 (pygame.mixer.music.set_volume(vol)), and it works great!
Maybe you need to change the file format to MP3. When I copied this code, used an alarm sound with an .mp3 extension and ran it in Spyder (Anaconda) with Python 3.8, it worked. There might be two solutions:
Change your Python version to 3.8
Convert the .wav file to .mp3
I am not sure whether it will work, but in these situations it might work at your end.
When using EDSDK version 3.4.0 to take a photo with the Rebel T6i it can take anywhere from 2 to 30 seconds after calling EdsSendCommand(camera, kEdsCameraCommand_TakePicture, 0); for the corresponding kEdsObjectEvent_DirItemCreated to be received, signalling that the image is ready to download from the camera. Note that the camera itself takes the photo and the flash goes off almost instantly after sending the TakePicture command - it is only the kEdsObjectEvent_DirItemCreated event that is delayed for seemingly random, large amounts of time.
The delays become much longer and more frequent when connecting to a second Rebel T6i, even when only taking photos with one of the cameras. This occurs even when the two cameras are run from separate applications.
We're hoping to use both of these cameras as part of an installation that requires us to be able to download each photo from the camera within at most 5 seconds of EdsSendCommand(camera, kEdsCameraCommand_TakePicture, 0) being called.
If anyone has any ideas on why this large delay might be occurring or any other suggestions on how to fix it, we'd greatly appreciate it!
Note: We're building 64-bit at the moment, but are attempting to get a 32-bit build working to see if that improves anything.
EDSDK v3.4.0
OS X 10.12.1
64-bit
Rebel T6i
Not using live view will fix the problem. You also need to download the image directly to the computer instead of saving it to the SD card first. If ANY other camera that is using live view is plugged in, you will unfortunately continue to have the above problem.
As a guitarist I have always wanted to develop my own recording and mixing software. I have some experience with DirectSound and the Windows Multimedia API (waveOutOpen, etc.). I realise that this will be a complex project, but it is purely for my own use and learning, i.e. no deadlines! I intend to use C++ but am as yet unsure of the best SDK/API to use. I want the software to be extensible as I may wish to add effects in the future. A few prerequisites...
To run on Windows XP
Minimal latency
VU meter (on all tracks)
This caused me to shy away from Direct Sound as there doesn't appear to be a way to read audio data from the primary buffer.
Overdubbing (i.e. record a new track whilst playing existing tracks).
Include a metronome
My initial thoughts are to use WMM and the waveOutWrite function to play audio data. I guess this is essentially an audio streaming player. To keep things simpler, I will hard-code the format to 16-bit samples at 44.1 kHz (the best my sound card supports). What I need are some ideas and guidance on an overall architecture.
For example, assume my tempo is 60 BPM and time signature is 4/4. I want the metronome to play a click at the start of every bar/measure. Now assume that I have recorded a rhythm track. Upon playback I need to orchestrate (pun intended) what data is sent to the primary sound buffer. I may also, at some point, want to add instruments, drums (mainly). Again, I need to know how to send the correct audio data, at the correct time to the primary audio buffer. I appreciate timing is key here. What I am unsure of is how to grab correct data from individual tracks to send to the primary sound buffer.
My initial thoughts are to have a timing thread which periodically asks each track, "I need data to cover N milliseconds of play", where N depends upon the primary buffer size.
I appreciate that this is a complex question, I just need some guidance as to how I might approach some of the above problems.
An additional question: is WMM or DirectSound better suited to my needs? Maybe even ASIO? However, the main question is how, using a streaming mechanism, I gather the correct track data (from multiple tracks) to send to a primary buffer while keeping latency minimal.
Any help is appreciated,
Many thanks
Karl
Thanks for the responses. However, my main question is how to time all of this, to ensure that each track writes appropriate data to the primary buffer, at the correct time. I am of course open to (free) libraries that will help me achieve my main goals.
As you intend to support XP (which I would not recommend, as even its extended support ends next year) you really have no choice but to use ASIO. The appropriate SDK can be downloaded from Steinberg. On Windows Vista and above, WASAPI Exclusive Mode might be a better option due to wider availability, although the documentation is severely lacking IMO. In any case, you should have a look at PortAudio, which wraps these APIs (and, unlike Juce, is free).
Neither WMM nor DirectSound nor XAudio 2 will be able to achieve sufficiently low latencies for realtime monitoring. Low-latency APIs usually periodically call a callback for each block of data.
As every callback processes a given number of samples, you can calculate the time from the sample rate and a sample counter (simply accumulate across callback calls). Tip: do not accumulate with floating point. That way lies madness. Use a 64 bit sample counter, as the smallest increment is always 1./sampleRate.
Effectively your callback function would, for each track, call a getSamples(size_t n, float* out) (or similar) method and sum up the results (i.e. mix them). Each individual track would then keep its own integrated sample time to compute what is currently required. For periodic things (infinite waves, loops, metronomes) you can easily calculate the number of samples per period and keep a modulo counter. That leads to rounded periods but, as mentioned before, floating-point accumulators are a no-no (they can work OK for periodic signals, though).
In the case of the metronome example you might have a waveform "click.wav" with n samples and a period of m samples. Your counter periodically goes from 0 to m-1 and as long as the counter is less than n you play the corresponding sample of your waveform. For example a simple metronome that plays a click each beat could look something like this:
#include <cmath>
#include <cstddef>
#include <vector>

class Metronome
{
    std::vector<float> waveform;
    size_t counter, period;

public:
    Metronome(std::vector<float> const & waveform, float bpm, float sampleRate)
        : waveform(waveform), counter(0)
    {
        float secondsPerBeat = 60.f / bpm;                 // 60 / bpm = seconds per beat
        float samplesPerBeat = sampleRate * secondsPerBeat;
        period = (size_t)round(samplesPerBeat);
    }

    void getSamples(size_t n, float* out)
    {
        while (n--)
        {
            // Play the click while inside the waveform, silence for the rest of the beat
            *out++ = counter < waveform.size() ? waveform[counter] : 0.f;
            counter += 1;
            counter -= counter >= period ? period : 0;
        }
    }
};
Furthermore you could check the internet for VST/AU Plugin programming tutorials, as these have the same "problem" of determining time from the number of samples.
As you've discovered, you are entering a world of pain. If you're really building audio software for Windows XP and expect low latency, you'll definitely want to avoid any audio API provided by the operating system and do as almost all commercial software does: use ASIO. Whilst things have got better, ASIO isn't going away any time soon.
To ease your pain considerably, I would recommend having a look at Juce, which is a cross-platform framework for building both audio host software and plugins. It's been used to build many commercial products.
It has many of the really nasty architectural hazards covered, and it comes with examples of both host applications and plug-ins to play with.