How to use openenc to encode opus in ogg extension - file-conversion

Musicbee uses opusenc to encode files into opus. I would like to encode my music files into opus in an ogg extention.
The command line arguments look like this:
--bitrate 256 --vbr --ignorelength - [outputfile]
I've tried --bitrate 256 --vbr --ignorelength - [outputfile].ogg and --bitrate 256 --vbr --ignorelength - ogg
I've looked at the documentation and I don't know how I would do this.

To encode a PCM Wave file foo.wav as an Ogg Opus file foo.ogg at 256 kb/s VBR:
opusenc --bitrate 256 foo.wav foo.ogg
VBR is the default so the --vbr option is not necessary.
You can use --ignorelength if the length in the Wave file header is wrong and you want it to assume that the audio continues to the end of the file with nothing after it, but by default it will do that if the length appears to be wrong so this should not normally be necessary.
Options such as --title and --artist may be used if you want to add tags. See opusenc --help for the available options.
If you want to convert a whole directory, use a for loop.
bash: for f in *.wav; do opusenc --bitrate 256 "$f" "${f%.wav}.ogg"; done
Windows cmd: for %F in (*.wav) do opusenc --bitrate 256 "%F" "%~nF.ogg"
In addition to Wave files, opusenc will also accept AIFF and raw PCM input, and can optionally be built to accept FLAC and Ogg/FLAC input. If you need to convert from some other format then try FFmpeg.

Related

cImg save_video ignores fps

I am using cImg to generate images in a CImgList.
when I save that stack to video by calling the save_video method on the stack, it ignores the fps and the output seems to be always 25fps. I opened the file in different players (VLC, windows movie,...) and its always 25fps.
cImg is using ffmpeg to create the video. I'm not specifying any codec so i assume the default mpeg2 is used (based on what VLC tells me).
I also have no specfic settings for cImg.
I have a fixed amount of images of 500 and it always produces around 20 seconds which is 25fps.
What do I need to do to output it to for example 60fps?
I fixed it but not sure if it's a bug in cImg or just me not knowing how to use it properly but from what I found is that cImg doesn't pass the --framerate parameter to ffmpeg, only the -r parameter.
I updated the code to include the framerate parameter and it does seem to work. This is the updated cImg code in the save_ffmpeg_external function:
cimg_snprintf(command,command._width,
"\"%s\" -framerate %u -v -8 -y -i \"%s_%%6d.ppm\" -pix_fmt yuv420p -vcodec %s -b %uk -r %u \"%s\"",
cimg::ffmpeg_path(),
fps,
CImg<charT>::string(filename_tmp)._system_strescape().data(),
_codec,bitrate,fps,
CImg<charT>::string(filename)._system_strescape().data());

Google Cloud Speech to text returning empty result or error

Working hard for 4 days now to fix the google cloud speech to text api to work, but still see no light at the end of the tunnel. Searched on the net a lot, read the documentations a lot but see no result.
Our site is bbsradio.com, we are trying to auto extract transcript from our mp3 files using google speech-to-text api. Code is written on PHP and almost exact copy of this: https://github.com/GoogleCloudPlatform/php-docs-samples/blob/master/speech/src/transcribe_async.php
I see process is completed and its reached out here "$operation->pollUntilComplete();" but its not showing it was successful at "if ($operation->operationSucceeded()) {" and its not returning any error either at $operation->getError().
I am converting the mp3 to raw file like this: ffmpeg -y -loglevel panic -i /public_html/sites/default/files/show-archives/audio-clips-9-23-2020/911freefall2020-05-24.mp3 -f s16le -acodec pcm_s16le -vn -ac 1 -ar 16000 -map_metadata -1 /home/mp3_to_raw/911freefall2020-05-24.raw
While tried with FLAC format as well, not worked. I tested converted FLAC file using windows media player, I can listen conversation clearly. I checked the files its Hz 16000, channel = 1 and its 16 bit. I see file is uploaded in cloud storage. Checked this:
https://cloud.google.com/speech-to-text/docs/troubleshooting and
https://cloud.google.com/speech-to-text/docs/best-practices
There are lot of discussion and documentation, seems nothing is helpful at this moment. If some one can really help me out to find out the issue, it will be really really really great!
TLDR; convert from MP3 to a 1-channel FLAC file with the same sample rate as your MP3 file.
Long explanation:
Since you're using MP3 files as your process input, probably you MP3 compression artifacts might be hurting you when you resample to to 16KHz (you cannot hear this, but the algoritm will).
To confirm this theory:
Execute ffprobe -hide_banner filename.mp3 it will output something like this:
Metadata:
...
Duration: 00:02:12.21, start: 0.025057, bitrate: 320 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 320 kb/s
Metadata:
encoder : LAME3.99r
In this case, the sample rate is OK for Google-Spech-Api. Just transcode the file without changing the sample rate (remove the -ar 16000 from your ffmpeg command)
You might get into trouble if the original MP3 bitrate is low. 320kb/s seems safe (unless the recording has a lot of noise).
Take into account that voice recoded under 64kb/s (ISDN line quality) can be understood only by humans if there is some noise.
At last I found the solution and reason of the issue. Actually getting empty results is a bug of the php api code. What you need to do:
Replace this:
$operation->pollUntilComplete();
by this:
while(!$operation->isDone()){
$operation->pollUntilComplete();
}
Read this: enter link description here

Remux mp4 file containing data stream

I’m developing an app that needs to clone an MP4 video file with all the streams using FFmpeg C++ API and have successfully made it work based on the FFmpeg remuxing example.
This works great for video and audio streams, but when the video includes a data stream (actually a QuickTime Time Code according to MediaInfo) I get this error.
Output #0, mp4, to 'C:\Users\user\Desktop\shortOut.mp4':
Stream #0:0: Video: hevc (Main 10) (hev1 / 0x31766568), yuv420p10le(tv,progressive), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 1208 kb/s
Stream #0:1: Audio: mp3 (mp4a / 0x6134706D), 48000 Hz, stereo, s16p, 32s
Stream #0:2: Data: none (tmcd / 0x64636D74), 0 kb/s
[mp4 # 0000000071edf600] Could not find tag for codec none in stream #2, codec not currently supported in container
I’ve found this happens in the call to avformat_write_header().
It makes sense that if FFmpeg doesn’t know the codec it can’t write to the header about it, but I found out that using the ffmpeg command line I can make it to work perfectly using the copy command for the stream, something like:
ffmpeg -i input.mp4 -c:v copy -c:a copy -c:a copy output.mp4
I have been analyzing ffmpeg.c implementation to try to understand how they do a stream copy, but it’s been very painful following along the huge pipeline.
What would be a proper way to remux a data stream of this type with FFmpeg C++ API? Any tip or pointers?

Gstreamer multifilesink wav files splitting

I have problem with recording streams using gstreamer.
I have to write audio and video separately and cut in when signal arrived. I have correctly working video, but still have problems with wav files.
Even simple pipeline in gst-launch don't work correctly. I have wave file and I am trying to split it using multifilesink:
gst-launch filesrc location=test.wav ! multifilesink location=test2%d.wav next-file=4 max-file-size=512000
But final wav files are corrupted while the same pipeline with ts files is working ok:
gst-launch-1.0 filesrc location=test.ts ! multifilesink location=test2%d.ts next-file=4 max-file-size=2000000
multifilesink doesn't know anything about the data it splits up, so it won't take care of adding headers to each of the files it writes.
The reason why your .ts files work is because it was designed to be a streaming format where each separate packet will be treated independently. Therefore, one can just 'tune in' to the stream whenever one likes. The decoder will simply look for the next packet header it finds and start decoding there (for Details have a look at MPEG TS' wiki page.
The WAV file format however was designed as pure file (and not as a streaming) format. Therefore, there's only one header at the start of the file. When you split that file up into multiple files, these headers are missing (the file contains only raw PCM data then).
To work around that issue, you can...
manually copy the .wav header from the first file to all the other ones
use programs that support PCM files and either work directly with them or convert the files (you'll have to set the channel count, sample rate and bitrate manually when opening those files though).
use another, stream oriented file format like .mp3 which comes from the same family of codecs as .ts and also uses a separate 4-byte header for each frame (Keep in mind though that MP3 is a lossy file format).
An example pipeline would be:
gst-launch filesrc location=test.wav ! wavparse ! lame ! multifilesink location=test%d.mp3 next-file=4 max-file-size=100000
If you're willing to use some scripting as well and split the task up into different gst-launch calls, I can offer you another possible way to solve your little problem:
The following script is a Linux bash script. You should be able to translate that to Windows batch script (or a C or python app if you want):
#!/bin/bash -e
# First write the buffer stream to .buff files (annotated using GStreamer's GDP format)
gst-launch -e filesrc location=test.wav ! wavparse ! gdppay ! multifilesink next-file=4 max-file-size=1000000 location=foo%05d.buff
# use the following instead for any other source (e.g. internet radio streams)
#gst-launch -e uridecodebin uri=http://url.to/stream ! gdppay ! multifilesink next-file=4 max-file-size=1000000 location=foo%05d.buff
# After we're done, convert each of the resulting files to proper .wav files with headers
for file in *.buff; do
tgtFile="$(echo "$file"|sed 's/.buff$/.wav/')"
gst-launch-0.10 filesrc "location=$file" ! gdpdepay ! wavenc ! filesink "location=$tgtFile"
done
# Uncomment the following line to remove the .buff files here, but to avoid accidentally
# deleting stuff we haven't properly converted if something went wrong, I'm not gonna do that now.
#rm *.buff
Now to what the script does:
First we're gonna use multifilesink to create a set of .buff files, each under 1MB of size (gdppay will annotate each buffer with its caps; the -e flag of gst-launch will cause it to trigger an EOS if the process gets killed prematurely which is useful if you're reading and decoding an internet stream)
The second gst-launch invocation within the for loop takes one of the .buff files, parses the GDP headers using gdpdepay (and strips them), adds a WAV header and writes the result to a .wav file.
Hope this is a solution you can live with, because I doubt there's a way to do it with one single gst-launch run.

Saving H.264 RTP stream without re-encoding?

My C++ application receives a H.264 RTP video stream.
Right now it decodes the stream, saves it into a YUV file and later I use ffmpeg to re-ecode the file into something suitable to watch on a Windows PC (eg. Mpeg4 AVI).
Shouldn't it be possible to save the H.264 stream into a AVI (or similar) container without having to decode and re-encode it ? That would require some H.264 decoder on the PC to watch, but it should be much more efficient.
How could that be done ? Are there any libraries supporting that ?
using ffmpeg is correct but the answers posted so far dont look right to me.
the correct switch should be:
-vcodec copy
Your program could pipe the rtp itself through ffmpeg - even invoking it using popen3().
It seems that you need to use an intermediate SDP file - I speculate that you can specify a file you created as a named pipe or with tmpfile() which your application writes to - using the file as an intermediary.
The command-line would be something like:
int p[3];
const char* const out_fmt = "avi";
const char* cmd[] = {"ffmpeg","-f",,"-i",temp_sdp_filename,"-vcodec","copy","-f",out_fmt,"-",NULL};
if(-1 == popen3(p,cmd)) ...
// write the rtp that you receive to p[STDIN_FILENO]
// read the avi from p[STDOUT_FILENO]
// read any messages and error text from p[STDERR_FILENO]
I believe that in this circumstance ffmpeg is clever enough to repackage the container (rtp stream vs AVI) without transcoding the video and audio (this is the -vcodec copy switch); therefore, you'd have no loss of quality and it'd be blazingly fast.