GstCUDA - Example - cudafilter: Unified memory allocator








This page provides a set of test pipelines that use the cudafilter element in unified memory allocator mode.


Problems running the pipelines shown on this page?
Please see our GStreamer Debugging guide for help.

The perf element can be downloaded from this repository. Alternatively, it can be removed from the pipelines without any issues.
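As a minimal sketch of dropping the perf stage when it is not installed, the script below prints the 720p videotestsrc pipeline with or without perf. The helper name build_pipeline is hypothetical and introduced only for illustration. The script assumes that gst-inspect-1.0 returns a non-zero exit status when an element is missing, and it uses autovideosink as the sink (replace it with nvoverlaysink on Tegra).

```shell
#!/bin/sh
# Hypothetical helper: print the 720p test pipeline.
# $1: "yes" to include the perf stage, anything else to omit it.
build_pipeline() {
    caps='video/x-raw,width=1280,height=720,format=I420,framerate=60/1'
    filter='cudafilter in-place=true location=./gray-scale-filter.so'
    if [ "$1" = "yes" ]; then
        measure=' ! perf print-arm-load=true'
    else
        measure=''
    fi
    printf 'videotestsrc is-live=true ! "%s" ! %s%s ! autovideosink\n' \
        "$caps" "$filter" "$measure"
}

# Assumption: gst-inspect-1.0 fails (non-zero status) when perf is not installed.
if gst-inspect-1.0 perf >/dev/null 2>&1; then
    have_perf=yes
else
    have_perf=no
fi

build_pipeline "$have_perf"
```

The printed line can be pasted after gst-launch-1.0. Without perf no statistics are printed, but the rest of the pipeline is unchanged.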

Gray-scale filter CUDA library algorithm

Run these examples from the directory that contains the algorithm binaries:

cd $GstCUDA_DIR/tests/examples/cudafilter_algorithms/gray-scale-filter/
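Because the pipelines load the algorithm through a relative path (location=./gray-scale-filter.so), a quick check that the library is present can save a confusing failure. The sketch below is one way to do that. The helper name have_algorithm is hypothetical, and the fallback to the current directory when GstCUDA_DIR is unset is an assumption made only so the script runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper: report whether an algorithm library exists.
# $1: directory to look in, $2: library file name.
have_algorithm() {
    if [ -f "$1/$2" ]; then
        echo "found $1/$2"
    else
        echo "missing $1/$2" >&2
        return 1
    fi
}

# Assumption: fall back to "." when GstCUDA_DIR is not set.
algo_dir="${GstCUDA_DIR:-.}/tests/examples/cudafilter_algorithms/gray-scale-filter"
have_algorithm "$algo_dir" gray-scale-filter.so ||
    echo "check GstCUDA_DIR and that the example library exists" >&2
```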

1080p 60fps capture to display (in-place=true)

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=1920,height=1080,format=I420,framerate=60/1" ! cudafilter in-place=true location=./gray-scale-filter.so ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 2:32:18.349217617; Bps: 3107292; fps: 60.93; CPU: 22;  
GST-PERF INFO -->  Timestamp: 2:32:19.353131270; Bps: 3101096; fps: 60.81; CPU: 22;  
GST-PERF INFO -->  Timestamp: 2:32:20.354075771; Bps: 3110400; fps: 61.0; CPU: 21;  
GST-PERF INFO -->  Timestamp: 2:32:21.355957035; Bps: 3107292; fps: 60.93; CPU: 22;  
GST-PERF INFO -->  Timestamp: 2:32:22.358532255; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 2:32:23.360376511; Bps: 3107292; fps: 60.93; CPU: 22;  
GST-PERF INFO -->  Timestamp: 2:32:24.364154593; Bps: 3101096; fps: 60.81; CPU: 21;  
GST-PERF INFO -->  Timestamp: 2:32:25.364880141; Bps: 3110400; fps: 61.0; CPU: 22;  
GST-PERF INFO -->  Timestamp: 2:32:26.367493736; Bps: 3104191; fps: 60.87; CPU: 22;  
GST-PERF INFO -->  Timestamp: 2:32:27.369399230; Bps: 3107292; fps: 60.93; CPU: 22;
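To summarize a block of performance stats like the one above, the fps values can be averaged with a short awk filter. This is only a sketch: the function name avg_fps is hypothetical, and it assumes each log line carries a standalone "fps:" label followed by the value, as in the logs on this page. Comma decimal separators, which appear in the x86 logs, are converted to dots.

```shell
#!/bin/sh
# Hypothetical helper: read perf log lines on stdin, print the mean fps.
avg_fps() {
    awk '/fps:/ {
        for (i = 1; i < NF; i++) {
            # Match only the exact "fps:" label, not "mean_fps:".
            if ($i == "fps:") {
                v = $(i + 1)
                sub(/;$/, "", v)    # drop the trailing semicolon
                gsub(/,/, ".", v)   # accept comma decimal separators
                sum += v
                n++
            }
        }
    } END { if (n) printf "%.2f\n", sum / n }'
}

# Two lines taken from the stats above.
avg_fps <<'EOF'
GST-PERF INFO -->  Timestamp: 2:32:18.349217617; Bps: 3107292; fps: 60.93; CPU: 22;
GST-PERF INFO -->  Timestamp: 2:32:19.353131270; Bps: 3101096; fps: 60.81; CPU: 22;
EOF
# prints 60.87
```

In practice the pipeline output would be saved to a file and passed on stdin, for example avg_fps < perf.log, where perf.log is a placeholder name.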

4K 60fps capture to fakesink (in-place=true)

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=true location=./gray-scale-filter.so ! perf print-arm-load=true ! fakesink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:06:50.662612586; Bps: 12441600; fps: 60.0; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:06:51.676336187; Bps: 12281934; fps: 60.21; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:06:52.689126572; Bps: 12294071; fps: 60.27; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:06:53.705126913; Bps: 12245669; fps: 60.3; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:06:54.720984078; Bps: 12257733; fps: 60.9; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:06:55.730950180; Bps: 12330624; fps: 60.45; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:06:56.745949024; Bps: 12269822; fps: 60.15; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:06:57.758307229; Bps: 12294071; fps: 60.27; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:06:58.761227074; Bps: 12416766; fps: 59.88; CPU: 27;  
GST-PERF INFO -->  Timestamp: 3:06:59.769052216; Bps: 12355114; fps: 59.58; CPU: 30;

4K 60fps capture to display (in-place=true)

In this case, the framerate drops to about 37 fps instead of the expected 60 fps. This is caused by limitations in system performance: the pipeline does not use NVMM memory buffers, which ensure optimal performance on Tegra boards.
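One way to read the drop is as a per-frame time budget: at 60 fps each frame must clear the pipeline in about 16.67 ms, while a measured rate near 37.84 fps corresponds to roughly 26.43 ms per frame. The sketch below only illustrates that arithmetic and does not identify which stage consumes the extra time. The function name frame_time_ms is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a frame rate into milliseconds per frame.
frame_time_ms() {
    awk -v fps="$1" 'BEGIN { printf "%.2f\n", 1000 / fps }'
}

frame_time_ms 60      # target rate, prints 16.67
frame_time_ms 37.84   # first sample of the stats for this case, prints 26.43
```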

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=true location=./gray-scale-filter.so ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 0:06:00.632191378; Bps: 12392031; fps: 37.84; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:06:01.654330618; Bps: 12173776; fps: 37.18; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:06:02.657927359; Bps: 12404386; fps: 36.88; CPU: 39;  
GST-PERF INFO -->  Timestamp: 0:06:03.664068820; Bps: 12367395; fps: 37.77; CPU: 37;  
GST-PERF INFO -->  Timestamp: 0:06:04.667427195; Bps: 12404386; fps: 37.88; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:06:05.683748562; Bps: 12245669; fps: 37.40; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:06:06.702499911; Bps: 12221611; fps: 37.32; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:06:07.727256551; Bps: 12150000; fps: 38.8; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:06:08.750347186; Bps: 12161876; fps: 38.12; CPU: 36;  
GST-PERF INFO -->  Timestamp: 0:06:09.760670528; Bps: 12318415; fps: 37.62; CPU: 34;

720p 60fps videotestsrc to display (in-place=true)

Example pipeline

gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=60/1" ! cudafilter in-place=true location=./gray-scale-filter.so ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:07:49.322893142; Bps: 795; fps: 60.3; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:07:50.322903974; Bps: 808; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:07:51.339511816; Bps: 795; fps: 60.3; CPU: 15;  
GST-PERF INFO -->  Timestamp: 3:07:52.356234918; Bps: 795; fps: 60.3; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:07:53.372854218; Bps: 795; fps: 60.3; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:07:54.372943539; Bps: 808; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:07:55.389524768; Bps: 795; fps: 60.3; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:07:56.389544142; Bps: 808; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:07:57.389562059; Bps: 808; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:07:58.406250630; Bps: 795; fps: 60.3; CPU: 16;

Example pipeline for x86

gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=60/1" ! cudafilter in-place=true location=./gray-scale-filter.so ! perf print-arm-load=true ! autovideosink

Performance stats

INFO:
perf: perf0; timestamp: 6:56:07.973102245; bps: 597196800,000; mean_bps: 0,000; fps: 58,872; mean_fps: 58,872
INFO:
perf: perf0; timestamp: 6:56:09.007449869; bps: 652492800,000; mean_bps: 652492800,000; fps: 58,974; mean_fps: 58,923
INFO:
perf: perf0; timestamp: 6:56:10.007783428; bps: 674611200,000; mean_bps: 663552000,000; fps: 60,980; mean_fps: 59,609
INFO:
perf: perf0; timestamp: 6:56:11.024421488; bps: 663552000,000; mean_bps: 663552000,000; fps: 60,002; mean_fps: 59,707
INFO:
perf: perf0; timestamp: 6:56:12.040240924; bps: 663552000,000; mean_bps: 663552000,000; fps: 60,050; mean_fps: 59,776
INFO:
perf: perf0; timestamp: 6:56:13.056260073; bps: 663552000,000; mean_bps: 663552000,000; fps: 60,038; mean_fps: 59,819
INFO:
perf: perf0; timestamp: 6:56:14.057459656; bps: 663552000,000; mean_bps: 663552000,000; fps: 59,928; mean_fps: 59,835
INFO:
perf: perf0; timestamp: 6:56:15.072867023; bps: 652492800,000; mean_bps: 661972114,286; fps: 60,074; mean_fps: 59,865
INFO:
perf: perf0; timestamp: 6:56:16.073571754; bps: 674611200,000; mean_bps: 663552000,000; fps: 59,958; mean_fps: 59,875
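The steady bps value in the x86 log above (663552000) appears consistent with the raw data rate of the negotiated caps: a 1280x720 I420 frame holds 1280 x 720 x 3/2 bytes, and 60 such frames per second at 8 bits per byte give that figure. The sketch below reproduces the arithmetic. The function names are hypothetical, and the formula is exact only for even widths and heights.

```shell
#!/bin/sh
# Bytes in one I420 frame: a full-resolution Y plane plus two
# quarter-size chroma planes (exact for even width and height).
# $1: width, $2: height.
i420_frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

# Bits per second of a raw I420 stream.
# $1: width, $2: height, $3: frames per second.
i420_bits_per_second() {
    echo $(( $1 * $2 * 3 / 2 * 8 * $3 ))
}

i420_frame_bytes 1280 720           # prints 1382400
i420_bits_per_second 1280 720 60    # prints 663552000
```

The same helper gives 3110400 bytes for a 1920x1080 frame and 12441600 bytes for a 3840x2160 frame, which shows how much more data each 4K frame carries through the filter.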

1080p 60fps capture to display (in-place=false)

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=1920,height=1080,format=I420,framerate=60/1" ! cudafilter in-place=false location=./gray-scale-filter.so ! nvvidconv ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:08:28.245104278; Bps: 804; fps: 60.75; CPU: 23;  
GST-PERF INFO -->  Timestamp: 3:08:29.246617908; Bps: 807; fps: 60.93; CPU: 24;  
GST-PERF INFO -->  Timestamp: 3:08:30.248102788; Bps: 807; fps: 60.93; CPU: 24;  
GST-PERF INFO -->  Timestamp: 3:08:31.251689462; Bps: 805; fps: 60.81; CPU: 24;  
GST-PERF INFO -->  Timestamp: 3:08:32.254615156; Bps: 806; fps: 60.87; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:08:33.254989427; Bps: 808; fps: 61.0; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:08:34.257540647; Bps: 806; fps: 60.87; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:08:35.265103361; Bps: 802; fps: 60.57; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:08:36.278168395; Bps: 797; fps: 61.20; CPU: 24;  
GST-PERF INFO -->  Timestamp: 3:08:37.286638388; Bps: 801; fps: 60.51; CPU: 26;

4K 60fps capture to fakesink (in-place=false)

In this case, the framerate drops to about 39 fps instead of the expected 60 fps. This is caused by limitations in system performance: the pipeline does not use NVMM memory buffers, which ensure optimal performance on Tegra boards. In addition, in-place mode is disabled, which lowers system performance slightly.

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=false location=./gray-scale-filter.so ! perf print-arm-load=true ! fakesink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 0:24:24.345566910; Bps: 12416766; fps: 39.92; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:24:25.348160788; Bps: 12416766; fps: 39.92; CPU: 26;  
GST-PERF INFO -->  Timestamp: 0:24:26.349754616; Bps: 12429170; fps: 38.96; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:24:27.367986536; Bps: 12221611; fps: 39.29; CPU: 24;  
GST-PERF INFO -->  Timestamp: 0:24:28.385009396; Bps: 12233628; fps: 39.33; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:24:29.395061749; Bps: 12318415; fps: 38.61; CPU: 24;  
GST-PERF INFO -->  Timestamp: 0:24:30.401018589; Bps: 12379701; fps: 39.80; CPU: 26;  
GST-PERF INFO -->  Timestamp: 0:24:31.422597951; Bps: 12185700; fps: 40.15; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:24:32.435818006; Bps: 12281934; fps: 39.48; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:24:33.443020885; Bps: 12355114; fps: 39.72; CPU: 25;

720p 60fps videotestsrc to display (in-place=false)

Example pipeline

gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=60/1" ! cudafilter in-place=false location=./gray-scale-filter.so ! nvvidconv ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:12:26.643385389; Bps: 795; fps: 60.3; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:12:27.643428963; Bps: 808; fps: 60.0; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:12:28.643445299; Bps: 808; fps: 60.0; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:12:29.643520696; Bps: 808; fps: 60.0; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:12:30.660181959; Bps: 795; fps: 60.3; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:12:31.676592653; Bps: 795; fps: 60.3; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:12:32.693588599; Bps: 795; fps: 60.3; CPU: 18;  
GST-PERF INFO -->  Timestamp: 3:12:33.693620925; Bps: 808; fps: 60.0; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:12:34.710218231; Bps: 795; fps: 60.3; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:12:35.710295816; Bps: 808; fps: 60.0; CPU: 19;


Example pipeline for x86

gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=60/1" ! cudafilter in-place=false location=./gray-scale-filter.so ! perf print-arm-load=true ! autovideosink

Performance stats

INFO:
perf: perf0; timestamp: 7:54:09.068688778; bps: 663552000,000; mean_bps: 663552000,000; fps: 59,863; mean_fps: 59,374; cpu: 14; 
INFO:
perf: perf0; timestamp: 7:54:10.070051453; bps: 663552000,000; mean_bps: 663552000,000; fps: 59,918; mean_fps: 59,556; cpu: 11; 
INFO:
perf: perf0; timestamp: 7:54:11.071697931; bps: 663552000,000; mean_bps: 663552000,000; fps: 59,901; mean_fps: 59,642; cpu: 12; 
INFO:
perf: perf0; timestamp: 7:54:12.073209851; bps: 663552000,000; mean_bps: 663552000,000; fps: 59,909; mean_fps: 59,696; cpu: 12; 
INFO:
perf: perf0; timestamp: 7:54:13.086875389; bps: 663552000,000; mean_bps: 663552000,000; fps: 60,178; mean_fps: 59,776; cpu: 13; 
INFO:
perf: perf0; timestamp: 7:54:14.089058612; bps: 663552000,000; mean_bps: 663552000,000; fps: 59,869; mean_fps: 59,789; cpu: 15; 
INFO:
perf: perf0; timestamp: 7:54:15.105509121; bps: 663552000,000; mean_bps: 663552000,000; fps: 60,013; mean_fps: 59,817; cpu: 14; 
INFO:
perf: perf0; timestamp: 7:54:16.106221087; bps: 674611200,000; mean_bps: 664934400,000; fps: 59,957; mean_fps: 59,833; cpu: 16; 
INFO:
perf: perf0; timestamp: 7:54:17.118832050; bps: 663552000,000; mean_bps: 664780800,000; fps: 60,240; mean_fps: 59,874; cpu: 18; 
INFO:
perf: perf0; timestamp: 7:54:18.122389505; bps: 663552000,000; mean_bps: 664657920,000; fps: 59,787; mean_fps: 59,866; cpu: 13;

HW accelerated memcpy CUDA library algorithm

Run these examples from the directory that contains the algorithm binaries:

cd $GstCUDA_DIR/tests/examples/cudafilter_algorithms/memcpy/

1080p 60fps capture to display (in-place=true)

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=1920,height=1080,format=I420,framerate=60/1" ! cudafilter in-place=true location=./memcpy.so ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:13:37.922427243; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:13:38.924511734; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:13:39.926912158; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:13:40.929106907; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:13:41.931554414; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:13:42.933716664; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:13:43.935918185; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:13:44.938165434; Bps: 3104191; fps: 60.87; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:13:45.940054511; Bps: 3107292; fps: 60.93; CPU: 22;  
GST-PERF INFO -->  Timestamp: 3:13:46.942710661; Bps: 3104191; fps: 60.87; CPU: 22;


4K 60fps capture to fakesink (in-place=true)

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=true location=./memcpy.so ! perf print-arm-load=true ! fakesink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:14:28.821086827; Bps: 12269822; fps: 60.15; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:14:29.836487648; Bps: 12257733; fps: 60.9; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:14:30.846254381; Bps: 12330624; fps: 60.45; CPU: 27;  
GST-PERF INFO -->  Timestamp: 3:14:31.847040300; Bps: 12441600; fps: 60.0; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:14:32.862722733; Bps: 12257733; fps: 60.9; CPU: 27;  
GST-PERF INFO -->  Timestamp: 3:14:33.875866242; Bps: 12281934; fps: 60.21; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:14:34.891156076; Bps: 12257733; fps: 60.9; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:14:35.897627334; Bps: 12367395; fps: 61.63; CPU: 27;  
GST-PERF INFO -->  Timestamp: 3:14:36.902003620; Bps: 12392031; fps: 59.76; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:14:37.914072041; Bps: 12294071; fps: 60.27; CPU: 26;


4K 60fps capture to display (in-place=true)

In this case, the framerate drops to about 38 fps instead of the expected 60 fps. This is caused by limitations in system performance: the pipeline does not use NVMM memory buffers, which ensure optimal performance on Tegra boards.

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=true location=./memcpy.so ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 0:33:20.988503700; Bps: 12379701; fps: 37.81; CPU: 43;  
GST-PERF INFO -->  Timestamp: 0:33:21.993059346; Bps: 12392031; fps: 37.84; CPU: 36;  
GST-PERF INFO -->  Timestamp: 0:33:23.018329951; Bps: 12138146; fps: 38.4; CPU: 34;  
GST-PERF INFO -->  Timestamp: 0:33:24.023946167; Bps: 12379701; fps: 37.81; CPU: 36;  
GST-PERF INFO -->  Timestamp: 0:33:25.027543481; Bps: 12404386; fps: 37.88; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:33:26.033257510; Bps: 12379701; fps: 37.81; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:33:27.037511334; Bps: 12392031; fps: 37.84; CPU: 35;  
GST-PERF INFO -->  Timestamp: 0:33:28.052662740; Bps: 12257733; fps: 37.43; CPU: 36;  
GST-PERF INFO -->  Timestamp: 0:33:29.068566801; Bps: 12257733; fps: 38.42; CPU: 37;  
GST-PERF INFO -->  Timestamp: 0:33:30.088606375; Bps: 12197647; fps: 38.23; CPU: 36;


720p 60fps videotestsrc to display (in-place=true)

Example pipeline

gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=60/1" ! cudafilter in-place=true location=./memcpy.so ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:14:59.262126230; Bps: 808; fps: 60.0; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:15:00.262127266; Bps: 808; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:15:01.278617033; Bps: 795; fps: 60.3; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:15:02.278643434; Bps: 808; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:15:03.295377938; Bps: 795; fps: 60.3; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:15:04.295400484; Bps: 808; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:15:05.312018531; Bps: 795; fps: 60.3; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:15:06.328620641; Bps: 795; fps: 60.3; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:15:07.328731832; Bps: 808; fps: 60.0; CPU: 16;  
GST-PERF INFO -->  Timestamp: 3:15:08.328821201; Bps: 808; fps: 60.0; CPU: 16;

Example pipeline for x86

gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=60/1" ! cudafilter in-place=true location=./memcpy.so ! perf print-arm-load=true ! autovideosink

Performance stats

INFO:
perf: perf0; timestamp: 8:27:44.132774137; bps: 663552000,000; mean_bps: 663552000,000; fps: 62,822; mean_fps: 60,172; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:45.154653806; bps: 663552000,000; mean_bps: 663552000,000; fps: 58,715; mean_fps: 59,687; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:46.159348555; bps: 663552000,000; mean_bps: 663552000,000; fps: 61,710; mean_fps: 60,193; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:47.173018374; bps: 674611200,000; mean_bps: 666316800,000; fps: 60,177; mean_fps: 60,190; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:48.189706152; bps: 663552000,000; mean_bps: 665763840,000; fps: 59,999; mean_fps: 60,158; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:49.200836614; bps: 663552000,000; mean_bps: 665395200,000; fps: 60,329; mean_fps: 60,182; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:50.220311419; bps: 663552000,000; mean_bps: 665131885,714; fps: 59,835; mean_fps: 60,139; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:51.231456474; bps: 652492800,000; mean_bps: 663552000,000; fps: 60,328; mean_fps: 60,160; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:52.234158908; bps: 674611200,000; mean_bps: 664780800,000; fps: 59,838; mean_fps: 60,128; cpu: 100; 
INFO:
perf: perf0; timestamp: 8:27:53.249469880; bps: 663552000,000; mean_bps: 664657920,000; fps: 59,095; mean_fps: 60,034; cpu: 100;

1080p 60fps capture to display (in-place=false)

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=1920,height=1080,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=1920,height=1080,format=I420,framerate=60/1" ! cudafilter in-place=false location=./memcpy.so ! nvvidconv ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:15:36.964483800; Bps: 807; fps: 60.93; CPU: 24;  
GST-PERF INFO -->  Timestamp: 3:15:37.967952916; Bps: 805; fps: 60.81; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:15:38.968547123; Bps: 808; fps: 61.0; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:15:39.972391546; Bps: 805; fps: 60.81; CPU: 26;  
GST-PERF INFO -->  Timestamp: 3:15:40.976603100; Bps: 804; fps: 60.75; CPU: 27;  
GST-PERF INFO -->  Timestamp: 3:15:41.992211737; Bps: 796; fps: 61.8; CPU: 25;  
GST-PERF INFO -->  Timestamp: 3:15:43.008196618; Bps: 796; fps: 61.8; CPU: 23;  
GST-PERF INFO -->  Timestamp: 3:15:44.011540684; Bps: 805; fps: 60.81; CPU: 23;  
GST-PERF INFO -->  Timestamp: 3:15:45.015317452; Bps: 805; fps: 60.81; CPU: 23;  
GST-PERF INFO -->  Timestamp: 3:15:46.018784589; Bps: 805; fps: 60.81; CPU: 23;

4K 60fps capture to fakesink (in-place=false)

In this case, the framerate drops to about 39 fps instead of the expected 60 fps. This is caused by limitations in system performance: the pipeline does not use NVMM memory buffers, which ensure optimal performance on Tegra boards. In addition, in-place mode is disabled, which lowers system performance slightly.

Example pipeline

gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=0 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=false location=./memcpy.so ! perf print-arm-load=true ! fakesink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 0:38:01.653790309; Bps: 12392031; fps: 39.84; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:38:02.656695518; Bps: 12416766; fps: 39.92; CPU: 26;  
GST-PERF INFO -->  Timestamp: 0:38:03.674419491; Bps: 12233628; fps: 39.33; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:38:04.677275408; Bps: 12416766; fps: 38.92; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:38:05.693357318; Bps: 12245669; fps: 38.38; CPU: 24;  
GST-PERF INFO -->  Timestamp: 0:38:06.703775461; Bps: 12318415; fps: 38.61; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:38:07.721455693; Bps: 12233628; fps: 38.34; CPU: 26;  
GST-PERF INFO -->  Timestamp: 0:38:08.725218415; Bps: 12404386; fps: 37.88; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:38:09.736927114; Bps: 12306231; fps: 37.58; CPU: 25;  
GST-PERF INFO -->  Timestamp: 0:38:10.753877957; Bps: 12245669; fps: 38.38; CPU: 24;

720p 60fps videotestsrc to display (in-place=false)

Example pipeline

gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=60/1" ! cudafilter in-place=false location=./memcpy.so ! nvvidconv ! perf print-arm-load=true ! nvoverlaysink --gst-debug=0

Performance stats

GST-PERF INFO -->  Timestamp: 3:16:52.475903528; Bps: 808; fps: 60.0; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:16:53.492648143; Bps: 795; fps: 60.3; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:16:54.509168491; Bps: 795; fps: 60.3; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:16:55.509256876; Bps: 808; fps: 60.0; CPU: 19;  
GST-PERF INFO -->  Timestamp: 3:16:56.509305211; Bps: 808; fps: 60.0; CPU: 18;  
GST-PERF INFO -->  Timestamp: 3:16:57.509339014; Bps: 808; fps: 60.0; CPU: 18;  
GST-PERF INFO -->  Timestamp: 3:16:58.525940767; Bps: 795; fps: 60.3; CPU: 18;  
GST-PERF INFO -->  Timestamp: 3:16:59.526004309; Bps: 808; fps: 60.0; CPU: 21;  
GST-PERF INFO -->  Timestamp: 3:17:00.542556219; Bps: 795; fps: 60.3; CPU: 18;  
GST-PERF INFO -->  Timestamp: 3:17:01.559235315; Bps: 795; fps: 60.3; CPU: 18;

Example pipeline for x86

gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,width=1280,height=720,format=I420,framerate=60/1" ! cudafilter in-place=false location=./memcpy.so ! perf print-arm-load=true ! autovideosink

Performance stats

INFO:
perf: perf0; timestamp: 8:31:22.849334728; bps: 663552000,000; mean_bps: 663552000,000; fps: 60,131; mean_fps: 59,505; cpu: 16; 
INFO:
perf: perf0; timestamp: 8:31:23.849352943; bps: 663552000,000; mean_bps: 663552000,000; fps: 59,999; mean_fps: 59,669; cpu: 22; 
INFO:
perf: perf0; timestamp: 8:31:24.853056761; bps: 663552000,000; mean_bps: 663552000,000; fps: 59,779; mean_fps: 59,697; cpu: 15; 
INFO:
perf: perf0; timestamp: 8:31:25.867663407; bps: 663552000,000; mean_bps: 663552000,000; fps: 60,122; mean_fps: 59,782; cpu: 15; 
INFO:
perf: perf0; timestamp: 8:31:26.867694330; bps: 674611200,000; mean_bps: 665763840,000; fps: 59,998; mean_fps: 59,818; cpu: 12; 
INFO:
perf: perf0; timestamp: 8:31:27.868941538; bps: 663552000,000; mean_bps: 665395200,000; fps: 59,925; mean_fps: 59,833; cpu: 12; 
INFO:
perf: perf0; timestamp: 8:31:28.869524624; bps: 663552000,000; mean_bps: 665131885,714; fps: 59,965; mean_fps: 59,850; cpu: 13; 
INFO:
perf: perf0; timestamp: 8:31:29.884438289; bps: 663552000,000; mean_bps: 664934400,000; fps: 60,104; mean_fps: 59,878; cpu: 13; 
INFO:
perf: perf0; timestamp: 8:31:30.886387151; bps: 663552000,000; mean_bps: 664780800,000; fps: 59,883; mean_fps: 59,878; cpu: 13; 
INFO:
perf: perf0; timestamp: 8:31:31.902888980; bps: 663552000,000; mean_bps: 664657920,000; fps: 60,010; mean_fps: 59,890; cpu: 14;


