GstCUDA - Example: cudamux

From RidgeRun Developer Connection
Revision as of 20:23, 23 January 2018 by Dgarbanzo (talk | contribs)

Problems running the pipelines shown on this page?
Please see our GStreamer Debugging guide for help.





This page gives a usage example of the cudamux element.


Introduction

GstCUDA ships one basic example CUDA algorithm library for the cudamux element. It works out of the box, which makes it a good starting point for training and for taking the first steps with GstCUDA. The goal is to show in detail how to use the cudamux element and to provide working source code for a CUDA algorithm library written for it. The example library is built automatically, and you can find it under the following path:

  • $GstCUDA_DIR/tests/examples/cudamux_algorithms/mixer/mixer.so

The mixer.so CUDA algorithm library implements a very basic algorithm that receives two YUV I420 images as inputs and mixes them on the GPU, producing an output image that is the average of the two input images. All image processing is done on the GPU. This algorithm is for example and demonstration purposes only: it shows the capability of GstCUDA to execute an algorithm on the GPU that goes through each pixel of the incoming images, processes it, and generates an output image.
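The averaging described above can be illustrated on the CPU with a few sample values. This is only a sketch of the arithmetic, not the mixer.so source: the real library runs as a CUDA kernel over the I420 planes, the function name mix_samples is hypothetical, and the truncating integer division is an assumption about rounding.

```shell
# CPU-side illustration of "output = average of the two inputs".
# mix_samples is a hypothetical helper, not part of GstCUDA.
# $1 and $2: space-separated 8-bit sample values, same count in each.
mix_samples() {
    a_list=$1
    out=""
    for b in $2; do
        a=${a_list%% *}        # first remaining sample of input A
        a_list=${a_list#* }    # drop that sample from the list
        # Assumption: integer average, truncated toward zero.
        out="$out$(( (a + b) / 2 )) "
    done
    printf '%s\n' "${out% }"
}

mix_samples "16 128 235" "235 128 16"   # prints: 125 128 125
```

In an I420 frame the same per-sample operation would apply to the Y, U, and V planes alike, since each plane is just an array of 8-bit samples.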

Below you will find a set of test pipelines with their respective performance stats for the mixer CUDA algorithm library.

Note: To get the best performance on the Tegra platform, you must execute the jetson_clocks.sh script, which sets the Tegra to high-performance mode. All the performance stats reported on this page come from tests done after running the jetson_clocks.sh script. Execute command: sudo ~/jetson_clocks.sh


Mixer CUDA library algorithm

2x 720p 30fps videotestsrc sources mixed to display (in-place=true)

Example pipeline

gst-launch-1.0 cudamux name=cuda in-place=true location=./mixer.so videotestsrc pattern=ball is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=30/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=1280,height=720,format=I420,framerate=30/1" ! queue ! cuda.sink_0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=30/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=1280,height=720,format=I420,framerate=30/1" ! queue ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0
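The two 2x 720p pipelines on this page differ only in the in-place value and in an extra nvvidconv placed after cudamux when in-place=false. As a readability aid, the sketch below factors the repeated caps strings into variables and wraps the command in a function. The function name run_mixer_720p and the GST_LAUNCH override are hypothetical conveniences, not part of GstCUDA or GStreamer.

```shell
# Sketch: rebuild the 2x 720p videotestsrc pipelines from this page.
# GST_LAUNCH is a hypothetical override hook; set it to "echo" to preview
# the arguments instead of launching the pipeline.
run_mixer_720p() {
    in_place=${1:-true}    # "true" or "false", passed to cudamux in-place
    raw_caps="video/x-raw,width=1280,height=720,format=I420,framerate=30/1"
    nvmm_caps="video/x-raw(memory:NVMM),width=1280,height=720,format=I420,framerate=30/1"

    # The in-place=false pipeline on this page adds nvvidconv after cudamux.
    if [ "$in_place" = "false" ]; then
        set -- nvvidconv !
    else
        set --
    fi

    "${GST_LAUNCH:-gst-launch-1.0}" \
        cudamux name=cuda in-place="$in_place" location=./mixer.so \
        videotestsrc pattern=ball is-live=true ! "$raw_caps" ! nvvidconv ! "$nvmm_caps" ! queue ! cuda.sink_0 \
        videotestsrc is-live=true ! "$raw_caps" ! nvvidconv ! "$nvmm_caps" ! queue ! cuda.sink_1 \
        cuda. ! "$@" perf print-arm-load=true ! nvoverlaysink --gst-debug=0
}
```

Calling run_mixer_720p true or run_mixer_720p false from the directory that contains mixer.so should launch the corresponding pipeline. Running it with GST_LAUNCH=echo set prints the element chain without starting GStreamer.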

Performance stats

GST-PERF INFO -->  Timestamp: 1:04:41.509928234; Bps: 776; fps: 30.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:04:42.543376206; Bps: 751; fps: 30.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:04:43.576504024; Bps: 751; fps: 30.0; CPU: 25;  
GST-PERF INFO -->  Timestamp: 1:04:44.576590298; Bps: 776; fps: 30.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:04:45.609834937; Bps: 751; fps: 30.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:04:46.609851679; Bps: 776; fps: 30.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:04:47.609865921; Bps: 776; fps: 30.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:04:48.643290399; Bps: 751; fps: 30.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:04:49.676644357; Bps: 751; fps: 30.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:04:50.709835243; Bps: 751; fps: 30.0; CPU: 24;
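To compare runs, it can help to reduce the GST-PERF lines to averages. The sketch below assumes the output has exactly the field layout shown above (fps: and CPU: labels, each followed by a value ending in a semicolon). The function name perf_summary is hypothetical.

```shell
# Summarize GST-PERF lines read from standard input.
# perf_summary is a hypothetical helper; it assumes the
# "fps: <value>; CPU: <value>;" layout shown on this page.
perf_summary() {
    awk '/GST-PERF INFO/ {
        for (i = 1; i < NF; i++) {
            if ($i == "fps:") { v = $(i + 1); sub(/;/, "", v); fps += v }
            if ($i == "CPU:") { v = $(i + 1); sub(/;/, "", v); cpu += v }
        }
        n++
    }
    END {
        if (n) printf "samples=%d avg_fps=%.2f avg_cpu=%.1f\n", n, fps / n, cpu / n
    }'
}
```

For example, with the stats saved to a file (perf.log is a hypothetical name), perf_summary < perf.log prints one line with the sample count, the mean fps, and the mean CPU load.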


2x 720p 30fps videotestsrc sources mixed to display (in-place=false)

Example pipeline

gst-launch-1.0 cudamux name=cuda in-place=false location=./mixer.so videotestsrc pattern=ball is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=30/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=1280,height=720,format=I420,framerate=30/1" ! queue ! cuda.sink_0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=30/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=1280,height=720,format=I420,framerate=30/1" ! queue ! cuda.sink_1 cuda. ! nvvidconv ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 21:48:33.279751940; Bps: 763; fps: 32.48; CPU: 25;  
GST-PERF INFO -->  Timestamp: 21:48:34.279839845; Bps: 776; fps: 30.0; CPU: 25;  
GST-PERF INFO -->  Timestamp: 21:48:35.311903301; Bps: 751; fps: 30.3; CPU: 25;  
GST-PERF INFO -->  Timestamp: 21:48:36.312767089; Bps: 776; fps: 30.0; CPU: 25;  
GST-PERF INFO -->  Timestamp: 21:48:37.344449777; Bps: 752; fps: 30.6; CPU: 26;  
GST-PERF INFO -->  Timestamp: 21:48:38.344565041; Bps: 776; fps: 30.0; CPU: 25;  
GST-PERF INFO -->  Timestamp: 21:48:39.346374558; Bps: 775; fps: 29.97; CPU: 25;  
GST-PERF INFO -->  Timestamp: 21:48:40.378383451; Bps: 751; fps: 30.3; CPU: 25;  
GST-PERF INFO -->  Timestamp: 21:48:41.411104423; Bps: 751; fps: 30.3; CPU: 25;  
GST-PERF INFO -->  Timestamp: 21:48:42.411187827; Bps: 776; fps: 30.0; CPU: 24;


1080p 60fps camera stream + 1080p 60fps videotestsrc sources mixed to display (in-place=true)

Example pipeline

gst-launch-1.0 cudamux name=cuda in-place=true location=./mixer.so nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! queue ! cuda.sink_0 videotestsrc pattern=ball is-live=true ! "video/x-raw,width=640,height=480,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! queue ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 1:07:52.306338304; Bps: 763; fps: 60.3; CPU: 25;  
GST-PERF INFO -->  Timestamp: 1:07:53.306388795; Bps: 776; fps: 60.0; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:07:54.322402666; Bps: 763; fps: 60.3; CPU: 28;  
GST-PERF INFO -->  Timestamp: 1:07:55.322446073; Bps: 776; fps: 60.0; CPU: 28;  
GST-PERF INFO -->  Timestamp: 1:07:56.322744529; Bps: 776; fps: 60.0; CPU: 27;  
GST-PERF INFO -->  Timestamp: 1:07:57.340534213; Bps: 763; fps: 59.98; CPU: 25;  
GST-PERF INFO -->  Timestamp: 1:07:58.356490895; Bps: 764; fps: 60.9; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:07:59.372619293; Bps: 763; fps: 60.3; CPU: 25;  
GST-PERF INFO -->  Timestamp: 1:08:00.389454663; Bps: 763; fps: 60.3; CPU: 24;  
GST-PERF INFO -->  Timestamp: 1:08:01.406149096; Bps: 763; fps: 60.3; CPU: 22;


1080p 60fps camera stream + 1080p 60fps videotestsrc sources mixed to display (in-place=false)

Example pipeline

gst-launch-1.0 cudamux name=cuda in-place=false location=./mixer.so nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! queue ! cuda.sink_0 videotestsrc pattern=ball is-live=true ! "video/x-raw,width=640,height=480,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! queue ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvvidconv ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 22:04:56.593436486; Bps: 763; fps: 60.3; CPU: 33;  
GST-PERF INFO -->  Timestamp: 22:04:57.593838787; Bps: 776; fps: 60.0; CPU: 33;  
GST-PERF INFO -->  Timestamp: 22:04:58.609787695; Bps: 764; fps: 60.9; CPU: 32;  
GST-PERF INFO -->  Timestamp: 22:04:59.609907812; Bps: 776; fps: 60.0; CPU: 34;  
GST-PERF INFO -->  Timestamp: 22:05:00.609942878; Bps: 776; fps: 60.0; CPU: 32;  
GST-PERF INFO -->  Timestamp: 22:05:01.626993130; Bps: 763; fps: 59.98; CPU: 33;  
GST-PERF INFO -->  Timestamp: 22:05:02.627490381; Bps: 776; fps: 60.0; CPU: 35;  
GST-PERF INFO -->  Timestamp: 22:05:03.643151065; Bps: 764; fps: 60.9; CPU: 35;  
GST-PERF INFO -->  Timestamp: 22:05:04.659751376; Bps: 763; fps: 60.3; CPU: 36;  
GST-PERF INFO -->  Timestamp: 22:05:05.660301596; Bps: 776; fps: 60.0; CPU: 36;


2x 1080p 60fps camera streams mixed to display (in-place=true)

Note: We use only one camera source, which is split into two branches by a tee element. One of the input branches is rotated by 180 degrees, so the resulting image is a combination of the input image and the same image rotated.

Example pipeline

gst-launch-1.0 cudamux name=cuda in-place=true location=./mixer.so nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! tee name=t t.src_0 ! nvvidconv ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! queue ! cuda.sink_0 t.src_1 ! nvvidconv flip-method=2 ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! queue ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 1:09:00.057598052; Bps: 776; fps: 60.0; CPU: 7;  
GST-PERF INFO -->  Timestamp: 1:09:01.058253501; Bps: 776; fps: 60.0; CPU: 15;  
GST-PERF INFO -->  Timestamp: 1:09:02.073875844; Bps: 764; fps: 60.9; CPU: 13;  
GST-PERF INFO -->  Timestamp: 1:09:03.073878080; Bps: 776; fps: 60.0; CPU: 14;  
GST-PERF INFO -->  Timestamp: 1:09:04.074694842; Bps: 776; fps: 60.0; CPU: 13;  
GST-PERF INFO -->  Timestamp: 1:09:05.090132409; Bps: 764; fps: 60.9; CPU: 13;  
GST-PERF INFO -->  Timestamp: 1:09:06.090189398; Bps: 776; fps: 60.0; CPU: 12;  
GST-PERF INFO -->  Timestamp: 1:09:07.091822777; Bps: 775; fps: 59.94; CPU: 15;  
GST-PERF INFO -->  Timestamp: 1:09:08.108071701; Bps: 763; fps: 60.3; CPU: 12;  
GST-PERF INFO -->  Timestamp: 1:09:09.123598348; Bps: 764; fps: 60.9; CPU: 14;


2x 1080p 60fps camera streams mixed to display (in-place=false)

Note: We use only one camera source, which is split into two branches by a tee element. One of the input branches is rotated by 180 degrees, so the resulting image is a combination of the input image and the same image rotated.

Example pipeline

gst-launch-1.0 cudamux name=cuda in-place=false location=./mixer.so nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! tee name=t t.src_0 ! nvvidconv ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! queue ! cuda.sink_0 t.src_1 ! nvvidconv flip-method=2 ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! queue ! cuda.sink_1 cuda. ! nvvidconv ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 22:29:36.540394267; Bps: 776; fps: 60.0; CPU: 26;  
GST-PERF INFO -->  Timestamp: 22:29:37.557019712; Bps: 763; fps: 60.3; CPU: 27;  
GST-PERF INFO -->  Timestamp: 22:29:38.573266100; Bps: 763; fps: 60.3; CPU: 27;  
GST-PERF INFO -->  Timestamp: 22:29:39.589838109; Bps: 763; fps: 60.3; CPU: 26;  
GST-PERF INFO -->  Timestamp: 22:29:40.589912895; Bps: 776; fps: 60.0; CPU: 25;  
GST-PERF INFO -->  Timestamp: 22:29:41.590099450; Bps: 776; fps: 60.0; CPU: 26;  
GST-PERF INFO -->  Timestamp: 22:29:42.590126059; Bps: 776; fps: 60.0; CPU: 26;  
GST-PERF INFO -->  Timestamp: 22:29:43.590282250; Bps: 776; fps: 60.0; CPU: 26;  
GST-PERF INFO -->  Timestamp: 22:29:44.606635721; Bps: 763; fps: 60.3; CPU: 26;  
GST-PERF INFO -->  Timestamp: 22:29:45.606758423; Bps: 776; fps: 60.0; CPU: 25;


2x 4K 60fps camera streams mixed to display (in-place=true)

Example pipeline

gst-launch-1.0 cudamux name=cuda in-place=true location=./mixer.so nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! tee name=t t.src_0 ! nvvidconv ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! queue ! cuda.sink_0 t.src_1 ! nvvidconv flip-method=2 ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! queue ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 1:09:53.580481344; Bps: 776; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 1:09:54.596964159; Bps: 763; fps: 60.3; CPU: 18;  
GST-PERF INFO -->  Timestamp: 1:09:55.612936985; Bps: 764; fps: 60.9; CPU: 17;  
GST-PERF INFO -->  Timestamp: 1:09:56.613082828; Bps: 776; fps: 60.0; CPU: 20;  
GST-PERF INFO -->  Timestamp: 1:09:57.622503818; Bps: 769; fps: 59.46; CPU: 18;  
GST-PERF INFO -->  Timestamp: 1:09:58.634003692; Bps: 767; fps: 59.34; CPU: 18;  
GST-PERF INFO -->  Timestamp: 1:09:59.646157364; Bps: 766; fps: 59.28; CPU: 16;  
GST-PERF INFO -->  Timestamp: 1:10:00.657026108; Bps: 768; fps: 59.40; CPU: 15;  
GST-PERF INFO -->  Timestamp: 1:10:01.667696004; Bps: 768; fps: 59.40; CPU: 18;  
GST-PERF INFO -->  Timestamp: 1:10:02.680801291; Bps: 766; fps: 59.23; CPU: 17;


2x 4K 60fps camera streams mixed to fakesink (in-place=false)

Note: Due to limitations of the nvoverlaysink element, we use a fakesink instead to demonstrate that cudamux is capable of handling up to 2x 4K@60fps streams.

Example pipeline

gst-launch-1.0 cudamux name=cuda in-place=false location=./mixer.so nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! tee name=t t.src_0 ! nvvidconv ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! queue ! cuda.sink_0 t.src_1 ! nvvidconv flip-method=2 ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! queue ! cuda.sink_1 cuda. ! perf print-arm-load=true ! fakesink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 22:36:03.257382576; Bps: 12318415; fps: 59.40; CPU: 28;  
GST-PERF INFO -->  Timestamp: 22:36:04.269501023; Bps: 12294071; fps: 59.28; CPU: 28;  
GST-PERF INFO -->  Timestamp: 22:36:05.280928748; Bps: 12306231; fps: 59.34; CPU: 29;  
GST-PERF INFO -->  Timestamp: 22:36:06.292548867; Bps: 12306231; fps: 59.34; CPU: 30;  
GST-PERF INFO -->  Timestamp: 22:36:07.304494555; Bps: 12306231; fps: 59.34; CPU: 33;  
GST-PERF INFO -->  Timestamp: 22:36:08.314887239; Bps: 12318415; fps: 59.40; CPU: 32;  
GST-PERF INFO -->  Timestamp: 22:36:09.327312506; Bps: 12294071; fps: 59.28; CPU: 31;  
GST-PERF INFO -->  Timestamp: 22:36:10.340253028; Bps: 12294071; fps: 59.28; CPU: 30;  
GST-PERF INFO -->  Timestamp: 22:36:11.351945386; Bps: 12306231; fps: 59.34; CPU: 33;  
GST-PERF INFO -->  Timestamp: 22:36:12.361623391; Bps: 12330624; fps: 59.46; CPU: 31;
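For a sense of scale, the nominal I420 payload behind the 2x 4K 60fps case can be estimated from the caps alone. I420 stores a full-resolution Y plane plus U and V planes subsampled 2x2, so a frame takes width * height * 3 / 2 bytes. The sketch below assumes no row padding, uses hypothetical function names, and is not an attempt to reproduce the Bps column in the stats above.

```shell
# Nominal I420 frame size in bytes: Y plane plus 2x2-subsampled U and V
# planes, i.e. width * height * 3 / 2 (assumption: no row padding).
i420_frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

# Nominal input payload in bytes per second.
# Arguments: width height fps number_of_streams
i420_rate_bytes() {
    frame=$(i420_frame_bytes "$1" "$2")
    echo $(( frame * $3 * $4 ))
}

i420_frame_bytes 3840 2160        # prints 12441600
i420_rate_bytes 3840 2160 60 2    # prints 1492992000
```

Under these assumptions, the two 4K inputs together amount to roughly 1.5 GB of pixel data per second for the algorithm to read.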



