I have a simple program that reads an MP4 file:
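#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/videoio/videoio.hpp>

int main()
{
    cv::VideoCapture file("test.mp4");
    if (!file.isOpened())
        return 1; // the file could not be opened

    cv::Mat frame;
    // read() returns false once no more frames are available
    while (file.read(frame))
    {
        cv::imshow("Preview", frame);
        cv::waitKey(42); // wait about 42 ms between frames
    }
    file.release();
    return 0;
}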
This works fine. But when I integrate this code into another project I'm working on, the frame is displayed with the wrong aspect ratio.
Correct (Side by Side):
Wrong (Only show one side and aspect ratio is wrong):
I'm running Windows with VS2019. I have removed all other code from my existing project and left only the code above. The only differences I can think of are the include and linker settings. I use ceres, glog, d3d11, realsense2, VTK, pcl, eigen3, and OpenXR in the project. Does any of them affect how OpenCV behaves? Or what else might be the problem?
I've already tried setting the frame width and height on the VideoCapture, and it doesn't help.
I've tested both OpenCV 4.1 and 4.6.
When I read frame.cols and frame.rows, I get the correct resolution.
UPDATE
The metadata of the file I'm trying to read is below. It is a side-by-side 3D video. It also displays correctly in players such as VLC.
mediainfo test.mp4
General
Complete name : test.mp4
Format : MPEG-4
Format profile : Base Media
Codec ID : isom (isom/iso2/avc1/mp41)
File size : 181 MiB
Duration : 31 s 339 ms
Overall bit rate : 48.4 Mb/s
Writing application : Lavf58.20.100
Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L5.1
Format settings : CABAC / 1 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 1 frame
Format settings, GOP : M=1, N=30
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 31 s 317 ms
Bit rate : 48.0 Mb/s
Width : 3 840 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Original display aspect ratio : 3.556
Frame rate mode : Variable
Frame rate : 60.000 FPS
Minimum frame rate : 59.920 FPS
Maximum frame rate : 60.080 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.193
Stream size : 179 MiB (99%)
Title : SStar Video
Codec configuration box : avcC
Audio
ID : 2
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 31 s 339 ms
Bit rate mode : Constant
Bit rate : 140 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 534 KiB (0%)
Title : SStar Audio
Default : Yes
Alternate group : 1
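For reference, the arithmetic behind the two aspect-ratio fields in the metadata above can be sketched as follows. This only illustrates what the metadata implies and is not a diagnosis of the display problem. The names Ratio, sampleAspectRatio and displayWidth are made up for this example, and std::gcd needs C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// A simple rational number; hypothetical helper type for this sketch.
struct Ratio {
    std::int64_t num;
    std::int64_t den;
};

// Reduce a ratio to lowest terms.
Ratio reduce(Ratio r) {
    assert(r.den != 0);
    const std::int64_t g = std::gcd(r.num, r.den);
    return Ratio{r.num / g, r.den / g};
}

// Pixel (sample) aspect ratio implied by the stored frame size and the
// display aspect ratio: SAR = DAR * height / width.
Ratio sampleAspectRatio(int width, int height, Ratio dar) {
    return reduce(Ratio{dar.num * height, dar.den * width});
}

// Width at which the frame would be shown if the height is kept and
// the pixels are scaled horizontally by the sample aspect ratio.
int displayWidth(int width, Ratio sar) {
    return static_cast<int>(width * sar.num / sar.den);
}
```

With the values above (3840x1080 stored, 16:9 flagged), sampleAspectRatio(3840, 1080, Ratio{16, 9}) gives 1:2, so a player that honours the flag would show the frame 1920 pixels wide, while code that ignores it shows the raw 3840x1080 (3.556:1) image.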
What I hope will be a quick question: does anybody know how to specify the output format of a v4l2convert plugin in GStreamer? More specifically, the stride alignment or the bytes per line; I don't mind which.
To give the full details: I'm playing around with an embedded video processing platform and wish to connect multiple outputs to a single input simultaneously using a GStreamer tee element. The problem is that different outputs need different stride alignments / bytes per line.
I can set the stride alignment on the v4l2src plugin I'm using to read the input device and can get a combination that "works" for all outputs. But under the hood GStreamer is being helpful and instantiating a buffer copy to perform the re-alignment using memcpy, and thus my CPU utilisation goes through the roof.
My proposed solution is to use a hardware DMA loopback device (V4L2 Mem2Mem), controlled by a v4l2convert plugin, to provide a simple, low-CPU-load way of realigning the data.
I've tried this system with several pipelines and have been monitoring it using v4l2-ctl, and it appears to be able to do what I want. If I change the stride-align of the initial v4l2src plugin, I can see GStreamer change the format of the data written into the Mem2Mem device to match. However, the capture / read format always remains at N BytesPerPixel x NumberOfPixelsPerLine bytes per line.
Format Video Capture Multiplanar:
Width/Height : 1920/1080
Pixel Format : 'NV16' (Y/CbCr 4:2:2)
Field : None
Number of planes : 1
Flags :
Colorspace : SMPTE 170M
Transfer Function : Rec. 709
YCbCr/HSV Encoding: ITU-R 601
Quantization : Limited Range
Plane 0 :
Bytes per Line : 1920
Size Image : 4147200
Format Video Output Multiplanar:
Width/Height : 1920/1080
Pixel Format : 'NV16' (Y/CbCr 4:2:2)
Field : None
Number of planes : 1
Flags :
Colorspace : SMPTE 170M
Transfer Function : Rec. 709
YCbCr/HSV Encoding: ITU-R 601
Quantization : Limited Range
Plane 0 :
Bytes per Line : 2048
Size Image : 4423680
Vs
Format Video Capture Multiplanar:
Width/Height : 1920/1080
Pixel Format : 'NV16' (Y/CbCr 4:2:2)
Field : None
Number of planes : 1
Flags :
Colorspace : SMPTE 170M
Transfer Function : Rec. 709
YCbCr/HSV Encoding: ITU-R 601
Quantization : Limited Range
Plane 0 :
Bytes per Line : 1920
Size Image : 4147200
Format Video Output Multiplanar:
Width/Height : 1920/1080
Pixel Format : 'NV16' (Y/CbCr 4:2:2)
Field : None
Number of planes : 1
Flags :
Colorspace : SMPTE 170M
Transfer Function : Rec. 709
YCbCr/HSV Encoding: ITU-R 601
Quantization : Limited Range
Plane 0 :
Bytes per Line : 1920
Size Image : 4147200
Is there a way for me to change the v4l2convert's capture / source format properties from within my GStreamer pipeline declaration? gst-inspect-1.0 doesn't show any property for v4l2convert equivalent to the stride-align property of v4l2src. And caps filters like video/x-raw don't appear to be able to provide what I need (I accept that this may be wrong, as I'm very much a noob in this respect).
The best I've found is the extra-controls property, but I can find very little documentation on it, and what I have found appears to suggest it's used for setting the V4L2 device's "physical" controls rather than things like the format information, so I'm probably barking up the wrong tree there anyway.
My test pipeline is:
v4l2src name=videosrc device=/dev/video0 ! video/x-raw, width=1920, height=1080, format=NV16, framerate=30/1 ! queue ! v4l2convert device=/dev/video2 disable-passthrough=true capture-io-mode=4 output-io-mode=4 import-buffer-alignment=true ! queue ! kmssink sync=false fullscreen-overlay=true
If it helps, what I want to be able to do is provide video at 1920x1080 with 1920 bytes per line (GStreamer does this quite happily for me), but set the v4l2convert's capture / source side to 1920x1080 with 2048 bytes per line, as the problem sink device needs a stride alignment of 256.
Thanks
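For reference, the Bytes per Line and Size Image values in the v4l2-ctl dumps above are consistent with the following arithmetic. This is a minimal sketch, assuming that a stride alignment of 256 means "round each line up to the next multiple of 256 bytes" and that the single-plane NV16 buffer holds a luma plane followed by an interleaved CbCr plane with the same stride. The function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Round a line length up to the next multiple of `alignment` bytes.
std::uint32_t alignedBytesPerLine(std::uint32_t widthBytes,
                                  std::uint32_t alignment) {
    assert(alignment != 0);
    return (widthBytes + alignment - 1) / alignment * alignment;
}

// Buffer size for single-plane NV16 (4:2:2): a full-resolution luma
// plane plus an interleaved CbCr plane of the same size, both using
// the same bytes-per-line value.
std::uint32_t nv16SizeImage(std::uint32_t bytesPerLine,
                            std::uint32_t height) {
    return bytesPerLine * height * 2;
}
```

For a 1920-pixel line at one byte per luma sample, alignedBytesPerLine(1920, 256) gives 2048, and nv16SizeImage gives 4147200 bytes for a 1920-byte stride and 4423680 bytes for a 2048-byte stride, which matches the two dumps.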
ALREADY WORKING:
I receive video from an embedded video source (just a device) over the LAN and save it to a ".h264" file by appending each "encodedPacket" to the file (C++). This works fine; I can play the file using VLC.
TASK:
How can I save image files periodically (for example, 5 per second)? Any format will do, but I would prefer JPG.
File info:
Video
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Baseline@L3.1
Format settings : 1 Ref Frames
Format settings, CABAC : No
Format settings, RefFrames : 1 frame
Width : 640 pixels
Height : 480 pixels
Display aspect ratio : 4:3
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
ffmpeg is your friend:
https://trac.ffmpeg.org/wiki/Create%20a%20thumbnail%20image%20every%20X%20seconds%20of%20the%20video
I'd try something like: ffmpeg -i input.h264 -vf fps=5 out%d.jpg
If your input is a network stream you can do something like ffmpeg -i tcp://local_hostname:port?listen
https://trac.ffmpeg.org/wiki/StreamingGuide
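If the frames are already being decoded in your own C++ code, a rough equivalent of picking 5 frames per second can also be done by hand. This is a minimal sketch, assuming each decoded frame carries a timestamp in milliseconds. PeriodicSampler is a hypothetical helper, not part of ffmpeg, and it only decides which frames to keep; decoding and JPEG encoding are left out.

```cpp
#include <cassert>
#include <cstdint>

// Decides which frames to keep so that roughly one frame is saved
// per interval (e.g. 200 ms for 5 frames per second).
class PeriodicSampler {
public:
    explicit PeriodicSampler(std::int64_t intervalMs)
        : intervalMs_(intervalMs) {
        assert(intervalMs > 0);
    }

    // Returns true if the frame with this timestamp should be saved.
    bool shouldSave(std::int64_t timestampMs) {
        if (!started_) {
            started_ = true;
            nextDueMs_ = timestampMs;  // always keep the first frame
        }
        if (timestampMs < nextDueMs_) {
            return false;
        }
        // Advance on a fixed grid so that jitter does not accumulate.
        while (nextDueMs_ <= timestampMs) {
            nextDueMs_ += intervalMs_;
        }
        return true;
    }

private:
    std::int64_t intervalMs_;
    std::int64_t nextDueMs_ = 0;
    bool started_ = false;
};
```

A PeriodicSampler(200) fed with frames 40 ms apart keeps the frames at 0, 200, 400 ms and so on, which is 5 frames for each second of input.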
So, I have been trying to get my Raspberry Pi 2 to capture an H264 stream with OpenCV from my Logitech C920 for quite some time now. I have been scouring the internet for info, but with no luck.
A short system description:
Raspberry Pi 2, running Raspbian, Kernel 3.18
Logitech HD Pro Webcam c920
OpenCV 2.4.11
boneCV - Credits to Derek Molloy (https://github.com/derekmolloy/boneCV)
libx264 and FFMPEG (built with x264 support)
libv4l-dev, v4l-utils, qv4l2, v4l2ucp
I know OpenCV forces the format to BGR24 (MJPG). This is specified in cap_libv4l.cpp. It looks like this (line 692 onwards):
/* libv4l will convert from any format to V4L2_PIX_FMT_BGR24 */
CLEAR (capture->form);
capture->form.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
capture->form.fmt.pix.pixelformat = V4L2_PIX_FMT_BGR24;
capture->form.fmt.pix.field = V4L2_FIELD_ANY;
capture->form.fmt.pix.width = capture->width;
capture->form.fmt.pix.height = capture->height;
I can set the pixel format manually with v4l2-ctl --set-fmt-video:
pi@raspberrypi ~/boneCV$ v4l2-ctl --set-fmt-video=width=1920,height=1080,pixelformat=H264
pi@raspberrypi ~/boneCV$ v4l2-ctl --get-fmt-video
Format Video Capture:
Width/Height : 1920/1080
Pixel Format : 'H264'
Field : None
Bytes per Line: 3840
Size Image : 4147200
Colorspace : SRGB
And if I now run "./boneCV", a very simple capture program that captures a picture and does Canny edge detection (I'll add the code at the end), I get this:
pi@raspberrypi ~/boneCV$ ./boneCV
pi@raspberrypi ~/boneCV$ v4l2-ctl --get-fmt-video
Format Video Capture:
Width/Height : 1920/1080
Pixel Format : 'MJPG'
Field : None
Bytes per Line: 0
Size Image : 4147200
Colorspace : SRGB
As you can see, the "Pixel Format" and the "Bytes per Line" change. The "Field" stays at None and the "Colorspace" stays at SRGB.
Then I tried replacing every "V4L2_PIX_FMT_BGR24" with "V4L2_PIX_FMT_H264" in cap_libv4l.cpp and rebuilt OpenCV. When I then ran "./boneCV", my two .png images were just black with one or two stripes of white.
To find out whether the problem is in libv4l or in OpenCV, I ran the "./capture" program that comes with Derek Molloy's boneCV. It uses libv4l directly and captures an H264 video stream with no problems at all. I then have to use "./raw2mpg4" to be able to watch it. The .mp4 file is 1920x1080 at 30 fps with no glitches. After this I checked "v4l2-ctl --get-fmt-video" again and got this:
pi@raspberrypi ~/boneCV$ v4l2-ctl --get-fmt-video
Format Video Capture:
Width/Height : 1920/1080
Pixel Format : 'H264'
Field : None
Bytes per Line: 3840
Size Image : 4147200
Colorspace : SRGB
Exactly the same as when I set everything manually.
I have come to the conclusion that if I want OpenCV to be able to capture raw H264 streams I'll have to change cap_libv4l.cpp, but I have no idea how. I think it may be because of the difference in bits per frame and/or colorspace.
Does anybody know how to do this, or how to make a workaround so that I can still use OpenCV's "VideoCapture" class?
I know a lot of Raspberry Pi and BeagleBone Black users would be ever so grateful if there were a solution to this problem.
I have tried to cover everything that I think is relevant. If there is anything more I could provide to paint a better picture, please say so.
Here are some links to the mentioned scripts and programs:
(Edit: I tried to post the links to each of the programs, but I didn't have enough reputation. Go to Derek Molloy's GitHub page and you'll find boneCV there.)
And no, I cannot use "CV_FOURCC('H','2','6','4');" because this is not implemented for Linux yet.
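For reference, the pixel formats shown by v4l2-ctl ('H264', 'MJPG') are 32-bit FourCC codes. The sketch below mirrors the packing used by the v4l2_fourcc macro in <linux/videodev2.h>, where the first character goes into the lowest byte. The names fourcc and fourccToString are made up for this example, and it does not touch the capture code itself.

```cpp
#include <cstdint>
#include <string>

// Pack four characters into a 32-bit code, first character in the
// lowest byte (the same layout as the v4l2_fourcc macro).
constexpr std::uint32_t fourcc(char a, char b, char c, char d) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(a)) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24);
}

// Turn a packed code back into its four-character name, e.g. for
// logging the format a driver actually selected.
std::string fourccToString(std::uint32_t code) {
    std::string name(4, ' ');
    for (int i = 0; i < 4; ++i) {
        name[i] = static_cast<char>((code >> (8 * i)) & 0xFF);
    }
    return name;
}
```

Printing fourccToString on the pixelformat field returned by the driver is a quick way to confirm which format was really negotiated, for example 'MJPG' instead of the requested 'H264'.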
I've got a web camera (made in China) that can stream MJPEG and H.264. I want to save the stream with GStreamer. I can do it with MJPEG, but not with H.264. I know that the camera provides an H.264 stream: the developer's programs show it, and if I save video with those programs I can see that it is H.264.
General
Complete name : C:\RecordFiles\20140319\IPCAM1\20140319_232127_141.avi
Format : AVI
Format/Info : Audio Video Interleave
File size : 4.11 MiB
Duration : 7s 280ms
Overall bit rate : 4 740 Kbps
Video
ID : 0
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Baseline@L3.1
Format settings, CABAC : No
Format settings, ReFrames : 1 frame
Codec ID : h264
Duration : 7s 280ms
Bit rate : 4 733 Kbps
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate : 25.000 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.205
Stream size : 4.11 MiB (100%)
Audio
ID : 1
Format : PCM
Format settings, Endianness : Little
Format settings, Sign : Signed
Codec ID : 1
Duration : 7s 280ms
Bit rate mode : Constant
Bit rate : 128 Kbps
Channel count : 1 channel
Sampling rate : 8 000 Hz
Bit depth : 16 bits
Stream size : 114 KiB (3%)
Alignment : Aligned on interleaves
That is all the data I have. If somebody can help me tune GStreamer, I will be glad.
This works
gst-launch souphttpsrc location="http://shonlinecam1.dyndns.org:81/videostream.cgi?loginuse=user&loginpas=123" \
! jpegparse ! jpegdec \
! x264enc bitrate=512 key-int-max=45 speed-preset=superfast threads=1 \
! video/x-h264,stream-format=avc,alignment=au,profile=constrained-baseline \
! h264parse ! fakesink
This doesn't work
gst-launch souphttpsrc \
is-live=true \
location="http://shonlinecam1.dyndns.org:81/livestream.cgi?user=user&pwd=123&streamid=0" \
! h264parse ! decodebin2 ! fakesink
All links are real, please help.
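One thing that may be worth checking is whether the bytes returned by livestream.cgi are a bare H.264 byte stream at all. This is a minimal sketch, assuming the first bytes of the HTTP response body have been dumped into a buffer. It looks for an Annex B start code (00 00 01, which also covers 00 00 00 01) and reports the type of the first NAL unit. The function name is made up for this example, and a missing start code would only hint at some other framing; it is not a confirmed diagnosis.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the nal_unit_type (0..31) of the first NAL unit that follows
// an Annex B start code, or -1 if no start code is found.
// Common values: 5 = IDR slice, 7 = SPS, 8 = PPS.
int firstNalUnitType(const std::vector<std::uint8_t>& data) {
    for (std::size_t i = 0; i + 3 < data.size(); ++i) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            // The low five bits of the NAL header byte hold the type.
            return data[i + 3] & 0x1F;
        }
    }
    return -1;
}
```

A stream that a parser can lock onto would normally yield 7 (SPS) or another valid type within the first bytes; a result of -1 over a reasonably large prefix suggests the payload is not plain Annex B data.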
I'm working on a custom Windows DirectShow source filter based on CSource and CSourceStream for each pin. There are two pins - video output and audio output. Both pins work fine when individually rendered in graphedit and similar tools such as Graph Studio with correct time stamps, frame rates and sound. I'm rendering the video to the Video Mixing Renderer (VMR7 or VMR9).
However, when I render both pins, the video plays back too fast while the audio still sounds correct. The video plays back approximately 50% too fast, but I think this is limited by the speed of decoding.
The timestamps on the samples are the same in both cases. If I render the audio stream to a null renderer (the one in qedit.dll) then the video stream plays back at the correct frame rate. The filter is a 32 bit filter running on a Win7 x64 system.
When I added support for IMediaSeeking, I found that the seek bar for the audio stream behaved quite bizarrely. However, the problem also happens without IMediaSeeking support.
Any suggestions for what could be causing this or suggestions for further investigation?
The output types from the audio and video pin are pasted below:
Mediatyp: Video
Subtype: RGB24
Format: Type VideoInfo
Video Size: 1024 x 576 Pixel, 24 Bit
Image Size: 1769472 Bytes
Compression: RGB
Source: width 0, height 0
Target: width 0, height 0
Bitrate: 0 bits/sec.
Errorrate: 0 bits/sec.
Avg. display time: 41708 µsec.
Mediatyp: Video
Subtype: RGB32
Format: Type VideoInfo
Video Size: 1024 x 576 Pixel, 32 Bit
Image Size: 2359296 Bytes
Compression: RGB
Source: width 0, height 0
Target: width 0, height 0
Bitrate: 0 bits/sec.
Errorrate: 0 bits/sec.
Avg. display time: 41708 µsec.
Majortyp: Audio
Subtype: PCM audio
Sample Size: 3
Type WaveFormatEx
Wave Format: Unknown
Channels: 1
Samples/sec.: 48000
Avg. bytes/sec.:144000
Block align: 3
Bits/sample: 24
I realised the problem straight after posting the question. A case of debugging by framing the question correctly.
The audio stream had completely bogus time stamps. The audio and video streams played back fine individually but did not sync with each other at all when played together.
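The answer does not say what the bogus stamps looked like, so the following is only a sketch of one common way to derive consistent stamps for both pins, using the media types posted in the question (48000 samples/sec, block align 3, 41708 µs per video frame). It assumes both streams start at time zero and that stamps are in DirectShow's REFERENCE_TIME units of 100 ns. The function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// DirectShow REFERENCE_TIME counts 100-nanosecond units.
constexpr std::int64_t kUnitsPerSecond = 10000000;

// Start time of an audio buffer, derived from the number of PCM bytes
// delivered before it rather than from a wall clock.
std::int64_t audioStartTime(std::int64_t bytesSoFar, int blockAlign,
                            int samplesPerSec) {
    assert(blockAlign > 0 && samplesPerSec > 0);
    const std::int64_t frames = bytesSoFar / blockAlign;
    return frames * kUnitsPerSecond / samplesPerSec;
}

// Start time of video frame `frameIndex`, given the average time per
// frame in REFERENCE_TIME units (41708 us corresponds to 417080).
std::int64_t videoStartTime(std::int64_t frameIndex,
                            std::int64_t avgTimePerFrame) {
    return frameIndex * avgTimePerFrame;
}
```

With these numbers, 144000 bytes of audio end exactly at one second (10000000 units), and video frame 24 starts at 10009920 units, so both pins advance on the same time base.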