What is the gstreamer caps syntax? - gstreamer

What is the syntax for caps, specifying media capabilities, in gstreamer? Caps are strings that specify the type of media allowed and look like "audio/x-raw-int,..." but I haven't been able to find good documentation on exactly what is allowed in a caps string.

The syntax is:
Note that the type is not a MIME type, however much it may look like one.
You can find out which caps properties an element supports by using gst-inspect. It will provide "pad templates" for the element's pads, which specify the ranges of caps supported.
The GStreamer plugin writer's guide also contains a list of defined types which describes properties for common audio, video and image formats.

In Java, for gstreamer-java
final Element videofilter = ElementFactory.make("capsfilter", "flt");
videofilter.setCaps(Caps.fromString("video/x-raw-yuv, width=720, height=576"
+ ", bpp=32, depth=32, framerate=25/1"));
In C, say you want a videoscale caps filter:
GstElement *videoscale_capsfilter;
GstCaps* videoscalecaps;
videoscale = gst_element_factory_make ("videoscale", "videoscale");
g_assert (videoscale);
videoscale_capsfilter = gst_element_factory_make ("capsfilter", "videoscale_capsfilter");
g_assert (videoscale_capsfilter);
Then set the caps on the capsfilter (videoscalecaps has to be created first, for example with gst_caps_from_string()):
g_object_set( G_OBJECT ( videoscale_capsfilter ), "caps", videoscalecaps, NULL );
Then add the elements to the bin and link them in the same order you used when building the media pipeline with gst-launch:
/* Add Elements to the Bin */
gst_bin_add_many (GST_BIN (pipeline),source ,demux ,decoder ,videoscale ,videoscale_capsfilter ,ffmpegcolorspace ,ffmpegcolorspace_capsfilter,autovideosink,NULL);
/* Link confirmation */
if (!gst_element_link_many (demux, decoder, videoscale, videoscale_capsfilter, ffmpegcolorspace, ffmpegcolorspace_capsfilter, autovideosink, NULL)) {
    g_warning ("Main pipeline link Fail...");
}
/* Dynamic Pad Creation */
if (!g_signal_connect (source, "pad-added", G_CALLBACK (on_pad_added), demux)) {
    g_warning ("Linking Fail...");
}

Here is the format as far as I understand it:
caps = <caps_name>, <field_name>=<field_value>[; <caps>]
<caps_name> = image/jpeg etc
<field_name> = width etc
<field_value> = <fixed_field_value>|<ranged_field_value>|<multi_field_value>
<fixed_field_value> = 800 etc
<ranged_field_value> = [<lower_value>, <upper_value>]
<multi_field_value> = {<fixed_field_value>, <fixed_field_value>, <fixed_field_value>, ...}
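To make the grammar above concrete, here is a minimal pure-Python sketch that splits a caps string into structures and classifies each field value as fixed, ranged, or multi-valued. It is only an illustration of the informal grammar in this answer, not GStreamer's real parser: the function names are made up for this example, all values are kept as strings, and type annotations such as (int) or quoted strings are not interpreted.

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring separators nested inside [...] or {...}."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "[{":
            depth += 1
        elif ch in "]}":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_value(text):
    """Classify a field value as fixed, range, or list (values stay strings)."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        return ("range", split_top_level(text[1:-1], ","))
    if text.startswith("{") and text.endswith("}"):
        return ("list", split_top_level(text[1:-1], ","))
    return ("fixed", text)


def parse_caps(caps_string):
    """Return a list of (caps_name, {field_name: parsed_value}) tuples."""
    structures = []
    for structure in split_top_level(caps_string, ";"):
        # The first comma-separated item is the caps name, the rest are fields.
        name, *fields = split_top_level(structure, ",")
        parsed = {}
        for field in fields:
            key, _, value = field.partition("=")
            parsed[key.strip()] = parse_value(value)
        structures.append((name, parsed))
    return structures


example = "image/jpeg, width=[1, 800], height={480, 576}; video/x-raw-yuv, width=720"
for caps_name, caps_fields in parse_caps(example):
    print(caps_name, caps_fields)
```

In a real application you would hand the string to gst_caps_from_string() (or the binding's equivalent) and let GStreamer do the parsing; the sketch only shows how the pieces of the grammar nest.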

I see you are after audio.
I'll just give you the long version; you can drop or change the parts you don't need. The syntax differs between GStreamer 0.10 and GStreamer 1.0, though, so I'll give both:
for GStreamer 0.10:
for GStreamer 1.0:
As you can see, with 1.0 the separate 0.10 fields are combined into a single audio format value. S16LE means signed + int + 16-bit width + little endian (endianness=1234).
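As a rough illustration of that mapping, the following Python sketch combines 0.10-style fields into a 1.0-style format name and then into a caps string. The helper names are hypothetical, the rate and channel values are arbitrary example numbers, and only the common integer and float naming pattern is covered.

```python
def format_from_010_fields(signed, width, endianness, is_float=False):
    """Combine 0.10-style fields into a 1.0-style format name such as S16LE."""
    if endianness not in (1234, 4321):
        raise ValueError("endianness must be 1234 (little) or 4321 (big)")
    if is_float:
        prefix = "F"  # float formats are expected to use width 32 or 64
    else:
        prefix = "S" if signed else "U"
    if width == 8:
        # 8-bit samples have no byte order, so there is no LE/BE suffix.
        return f"{prefix}{width}"
    suffix = "LE" if endianness == 1234 else "BE"
    return f"{prefix}{width}{suffix}"


def raw_audio_caps(fmt, rate, channels):
    """Build a 1.0-style caps string; rate and channels are caller-chosen."""
    return f"audio/x-raw, format={fmt}, rate={rate}, channels={channels}"


# Example values only: signed 16-bit little-endian, 44.1 kHz, stereo.
fmt = format_from_010_fields(signed=True, width=16, endianness=1234)
print(raw_audio_caps(fmt, rate=44100, channels=2))
```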

This is how I use it in Python. HTH:
caps = gst.Caps("video/x-raw-yuv,format=(fourcc)AYUV,width=704,height=480")
capsFilter = gst.element_factory_make("capsfilter")
capsFilter.props.caps = caps

I'm not sure this helps, since your question is about syntax, but the "list of defined types" may be useful.

A partial answer, which I'm sure you've worked out already:
Formally, caps are not represented by strings but by a GstCaps object containing an array of GstStructures. See the documentation here.
Perhaps if we work out a definitive answer here we could submit a documentation patch for the function gst_caps_from_string().

from x264enc source code: https://github.com/GStreamer/gst-plugins-ugly/blob/master/ext/x264/gstx264enc.c#L693-L704
static GstStaticPadTemplate src_factory = GST_STATIC_PAD_TEMPLATE ("src",
    GST_PAD_SRC,
    GST_PAD_ALWAYS,
    GST_STATIC_CAPS ("video/x-h264, "
        "framerate = (fraction) [0/1, MAX], "
        "width = (int) [ 1, MAX ], " "height = (int) [ 1, MAX ], "
        "stream-format = (string) { avc, byte-stream }, "
        "alignment = (string) au, "
        "profile = (string) { high-4:4:4, high-4:2:2, high-10, high, main,"
        " baseline, constrained-baseline, high-4:4:4-intra, high-4:2:2-intra,"
        " high-10-intra }")
    );
For these caps, the output of gst-inspect looks like this:
Pad Templates:
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-h264
              framerate: [ 0/1, 2147483647/1 ]
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
          stream-format: { (string)avc, (string)byte-stream }
              alignment: au
                profile: { (string)high-4:4:4, (string)high-4:2:2, (string)high-10, (string)high, (string)main, (string)baseline, (string)constrained-baseline, (string)high-4:4:4-intra, (string)high-4:2:2-intra, (string)high-10-intra }


Swift3 AudioToolbox: PCM playback how to AudioQueueAllocateBuffer?

I am following https://github.com/AlesTsurko/LearningCoreAudioWithSwift2.0/tree/master/CH05_Player
to play back a tone, but it is written for Swift 2.
The question "Get microphone input using Audio Queue in Swift 3" resolved many of the issues, but it covers recording.
I am stuck at allocating a buffer for the audio queue:
var ringBuffers = [AudioQueueBufferRef](repeating:nil, count:3)
AudioQueueAllocateBuffer(inQueue!, bufferSize, &ringBuffers[0])
It gives an error
main.swift:152:29: Expression type '[AudioQueueBufferRef]' is ambiguous without more context
main.swift:153:20: Cannot pass immutable value as inout argument: implicit conversion from 'AudioQueueBufferRef' to 'AudioQueueBufferRef?' requires a temporary
--After Spads' answer--
var ringBuffers = [AudioQueueBufferRef?](repeating:nil, count:3)
let status = AudioQueueAllocateBuffer(inQueue!, bufferSize, &ringBuffers[0])
This now fails at runtime with:
vm_map failed: 0x4 ((os/kern) invalid argument)
The audio description I have used is:
inFormat = AudioStreamBasicDescription(
    mSampleRate: Double(sampleRate),
    mFormatID: kAudioFormatLinearPCM,
    mFormatFlags: kLinearPCMFormatFlagIsBigEndian | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked,
    mBytesPerPacket: UInt32(numChannels * MemoryLayout<UInt16>.size),
    mFramesPerPacket: 1,
    mBytesPerFrame: UInt32(numChannels * MemoryLayout<UInt16>.size),
    mChannelsPerFrame: UInt32(numChannels),
    mBitsPerChannel: UInt32(8 * (MemoryLayout<UInt16>.size)),
    mReserved: UInt32(0)
)
AudioQueueNewOutput(&inFormat, AQOutputCallback, &player, nil, nil, 0, &inQueue)
Shouldn't the array hold optional values, that is AudioQueueBufferRef? rather than AudioQueueBufferRef:
var ringBuffers = [AudioQueueBufferRef?](repeating:nil, count:3)
AudioQueueAllocateBuffer(inQueue!, bufferSize, &ringBuffers[0])

I get an error when reading a real-time audio buffer at an 8 kHz sample rate

With the following code I get an error:
var engine = AVAudioEngine()
let input = engine.inputNode!
let bus = 0
let mixer = AVAudioMixerNode()
engine.connect(input, to: mixer, format: input.outputFormat(forBus: 0))
//pcmFormatFloat64 -- pcmFormatFloat32
let fmt = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 8000, channels: 1, interleaved: true)
mixer.installTap(onBus: bus, bufferSize: 512, format: fmt) { (buffer, time) -> Void in
    // 8kHz buffers!
}
try! engine.start()
ERROR : kAudioUnitErr_TooManyFramesToProcess : inFramesToProcess=1024, mMaxFramesPerSlice=768
With a sample rate of 44.1 kHz everything is fine, but not with 8 kHz.
What is wrong with this code?
I encountered the same error message at 8kHz and fixed it by setting the preferred sampling rate and capture duration for my audio session instance:
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setPreferredSampleRate(sampleRate)
try audioSession.setPreferredIOBufferDuration(TimeInterval(processingInterval))

List device-names available for video capture from ksvideosrc in gstreamer 1.0

I am trying to query a list of available video capture devices (webcams) on Windows using GStreamer 1.0 in C++.
I am using ksvideosrc as the source and I am able to capture the video input, but I can't query a list of available devices (and their caps).
In GStreamer 0.10 this was possible through GstPropertyProbe, which was removed in GStreamer 1.0. The documentation suggests using GstDeviceMonitor, but I have had no luck with that either.
Has anyone succeeded in acquiring a list of device names? Or can you suggest another way of retrieving the available device names and their caps?
You can use GstDeviceMonitor and the gst_device_monitor_get_devices() function.
First, create the GstDeviceMonitor with gst_device_monitor_new().
Second, start the monitor with gst_device_monitor_start(pMonitor).
Third, get the device list with gst_device_monitor_get_devices(pMonitor).
The code would look like this:
GstDeviceMonitor* monitor = gst_device_monitor_new();
if (!gst_device_monitor_start(monitor)) {
    printf("WARNING: Monitor couldn't be started!\n");
}
GList* devices = gst_device_monitor_get_devices(monitor);
Although I haven't figured out how to enumerate the device names, I've come up with a workaround to at least get the available ksvideosrc device indexes. Below is the code in Python, but you should be able to port it to C++ fairly easily, thanks to the GObject introspection bindings.
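import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

def get_ksvideosrc_device_indexes():
    device_index = 0
    video_src = Gst.ElementFactory.make('ksvideosrc')
    state_change_code = None
    while True:
        # Reset the element before probing the next index.
        video_src.set_state(Gst.State.NULL)
        video_src.set_property('device-index', device_index)
        state_change_code = video_src.set_state(Gst.State.READY)
        if state_change_code != Gst.StateChangeReturn.SUCCESS:
            # No device at this index, so stop probing.
            video_src.set_state(Gst.State.NULL)
            break
        device_index += 1
    return range(device_index)

if __name__ == '__main__':
    print(list(get_ksvideosrc_device_indexes()))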
Note that the video source device-name property is None as of GStreamer version on Windows for the ksvideosrc.
It's very late, but for the future...
The Gst.DeviceMonitor can be used to enumerate devices, and register an addition or removal of a device.
Here's how to get device names in C# with GStreamer 1.14
static class Devices
{
    public static void Run(string[] args)
    {
        Application.Init(ref args);
        var devmon = new DeviceMonitor();
        // to show only cameras
        // var caps = new Caps("video/x-raw");
        // var filtId = devmon.AddFilter("Video/Source", caps);
        var bus = devmon.Bus;
        // Register the handler so add/remove messages are delivered.
        bus.AddWatch(OnBusMessage);
        if (!devmon.Start())
        {
            "Device monitor cannot start".PrintErr();
            return;
        }
        Console.WriteLine("Video devices count = " + devmon.Devices.Length);
        foreach (var dev in devmon.Devices)
            DumpDevice(dev);
        var loop = new GLib.MainLoop();
        loop.Run();
    }

    static void DumpDevice(Device d)
    {
        Console.WriteLine($"{d.DeviceClass} : {d.DisplayName} : {d.Name} ");
    }

    static bool OnBusMessage(Bus bus, Message message)
    {
        switch (message.Type)
        {
            case MessageType.DeviceAdded:
                var added = message.ParseDeviceAdded();
                Console.WriteLine("Device added: ");
                DumpDevice(added);
                break;
            case MessageType.DeviceRemoved:
                var removed = message.ParseDeviceRemoved();
                Console.WriteLine("Device removed: ");
                DumpDevice(removed);
                break;
        }
        return true;
    }
}

How to test gstreamer plugin multiple video format support

I use v0.10.34 of GStreamer core, plugins-base, etc.
I am developing a simple filter that modifies the Y component, but I would like to handle all video formats without using a colorspace converter between the video decoder and my filter.
For now I use the following command to test my filter:
../../Build/lin64_release/bin/gst-launch-0.10 -v filesrc location=input.mp4 ! decodebin2 name=dec ! ffmpegcolorspace ! myfilter silent=1 ! tee name=t \
t. ! queue ! filesink location=test.yuv \
t. ! queue ! ffmpegcolorspace ! ximagesink
My 1st question is: how can I force/set specific caps (a video format) as the input of my filter?
And my 2nd question is: why do I get a connection with the I420 format if I use only YUY2 and UYVY in the templates for my src and sink pads?
Any ideas and good URLs on these topics are welcome.
For the 1st question, it looks like it is the responsibility of the _set_caps function to accept or reject a connection with given caps. To implement that, I used an array of supported caps (defined as GstStaticCaps), and in my _set_caps function I check whether the caps I receive intersect with the GstStaticCaps I used as templates.
static gboolean
gst_myfilter_set_caps (GstPad * pad, GstCaps * caps)
{
  Gstmyfilter *filter;
  GstVideoFormat format;
  int i, w, h;
  gboolean isSupported;

  filter = GST_NGPTVSTUB (gst_pad_get_parent (pad));
  if (!gst_video_format_parse_caps (caps, &format, &w, &h)) {
    if (filter->silent == FALSE) {
      g_print ("Unable to get video format from caps\n");
    }
    return FALSE;
  }
  /* Accept the caps only if they intersect one of the supported templates. */
  isSupported = FALSE;
  for (i = 0; i < G_N_ELEMENTS (gst_myfilter_video_format_caps); i++) {
    if (gst_caps_can_intersect (caps, gst_static_caps_get (&gst_myfilter_video_format_caps[i]))) {
      isSupported = TRUE;
      break;
    }
  }
  if (!isSupported) {
    if (filter->silent == FALSE) {
      g_print ("that caps is not supported\n");
    }
    return FALSE;
  }
  /* The rest of the function is not shown in the original post. */
  return TRUE;
}
As for the 2nd question, how to test support for multiple colorspaces and formats: one solution is to put a colorspace converter and a caps filter specifying the format before the filter, like below.
... ! ffmpegcolorspace ! video/x-raw-yuv,format=\(fourcc\)YUY2 ! myfilter ! ....

gstreamer : audiosink to output stream of integers representing volume levels

I need a gstreamer audio sink that outputs integers that
represent volume level of an audio stream. The sampling rate
need not be the same as the incoming audio stream, it can be much
lower, ex.: one value per second would be sufficient.
Does such a sink exist ?
It seems that this one could be modified to do this :
But if something already exists I'd prefer to avoid writing one !
There is indeed such an element. It's not a sink, though I don't think you need it to be for that task anyway :)
It is called level (http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good-plugins/html/gst-plugins-good-plugins-level.html), and as you can see there is an "interval" property that you can tweak.
We use this element in our video editor to draw waveforms; here is a simplified script:
import sys
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst
from gi.repository import GLib

mainloop = GLib.MainLoop()

def _messageCb(bus, message):
    s = message.get_structure()
    # The level element posts element messages whose structure is named "level".
    if s and s.get_name() == "level":
        p = s.get_value("rms")
        if p:
            st = s.get_value("stream-time")
            print("rms = " + str(p) + "; stream-time = " + str(st))
    if message.type == Gst.MessageType.EOS:
        mainloop.quit()
    elif message.type == Gst.MessageType.ERROR:
        mainloop.quit()

if __name__ == "__main__":
    Gst.init(None)
    pipeline = Gst.parse_launch("uridecodebin name=decode uri=" + sys.argv[1] + " ! audioconvert ! level name=wavelevel interval=10000000 post-messages=true ! fakesink qos=false name=faked")
    faked = pipeline.get_by_name("faked")
    bus = pipeline.get_bus()
    # Emit "message" signals from the bus so the callback gets called.
    bus.add_signal_watch()
    bus.connect("message", _messageCb)
    pipeline.set_state(Gst.State.PLAYING)
    mainloop.run()
    pipeline.set_state(Gst.State.NULL)
May I inquire about your use case ?
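Since the question asks for integers, note that the rms values posted by level are in dB, one per channel, with 0 dB as full scale. A minimal sketch of mapping them onto a 0 to 100 integer scale could look like this; the function names and the -60 dB floor are assumptions chosen for illustration, not part of the element's API.

```python
def rms_db_to_percent(rms_db, floor_db=-60.0):
    """Map an RMS level in dB (0 dB = full scale) to an integer from 0 to 100."""
    if rms_db <= floor_db:
        # Treat anything at or below the (arbitrary) floor as silence.
        return 0
    # Convert dB to linear amplitude: amplitude = 10 ** (dB / 20).
    linear = 10 ** (rms_db / 20.0)
    return min(100, round(linear * 100))


def channels_to_percent(rms_values):
    """Convert the per-channel dB list from one level message to integers."""
    return [rms_db_to_percent(value) for value in rms_values]


# Example per-channel readings in dB, as they might appear in the "rms" field.
print(channels_to_percent([-20.0, -6.0]))
```

With interval set to one second (1000000000 ns) on the level element, calling a function like this from the message callback would give roughly one integer per channel per second.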