Difference between revisions of "GstCUDA - Performance Profiling"
(Created page with "{{GstCUDA Page | Example 3: cudadebayer| Home| This page shows GstCUDA performance profiling. __TOC__ ==Description== Ri...") |
|||
Line 7: | Line 7: | ||
__TOC__ | __TOC__ | ||
− | = | + | = Glass to glass latency = |
− | + | This page reports glass-to-glass latency measurements for simple GstCUDA capture-and-display pipelines on a TX2. It covers all GstCUDA (cudafilter and cudamux) configurations and use cases. | 
+ | All measurements were taken with the TX2 in high-performance mode, which was set by running the following commands: | ||
+ | <pre> | ||
+ | sudo nvpmodel -m 0 # Reboot afterwards so the change takes effect. | ||
+ | reboot | ||
+ | sudo ~/jetson_clocks | ||
+ | </pre> | ||
− | + | = Jetpack 3.3 - IMX274 camera 4K@60fps glass to glass latency = | |
+ | == Simple Capture to Display pipeline (without GstCUDA) == | ||
+ | Use this measurement as the baseline when comparing the glass-to-glass latency of the GstCUDA pipelines below. | ||
+ | * '''''Glass to Glass latency = 112.2042693 ms''''' | ||
+ | Test pipeline: | ||
+ | <pre> | ||
+ | gst-launch-1.0 -v nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | ||
+ | </pre> | ||
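To put the baseline figure in context, it can help to express the latency in frame periods at the 60 fps capture rate. The following is a minimal sketch, not part of GstCUDA; the function and constant names are hypothetical and chosen only for illustration.

```python
# Express a measured glass-to-glass latency as a number of frame periods.
# Names below are illustrative assumptions, not part of any GstCUDA tool.

FRAME_RATE_FPS = 60                # capture rate used by the test pipeline
BASELINE_LATENCY_MS = 112.2042693  # measured latency reported above


def latency_in_frames(latency_ms, fps=FRAME_RATE_FPS):
    """Return how many frame periods fit in the given latency."""
    frame_period_ms = 1000.0 / fps  # one frame lasts about 16.67 ms at 60 fps
    return latency_ms / frame_period_ms


baseline_frames = latency_in_frames(BASELINE_LATENCY_MS)
print(f"{BASELINE_LATENCY_MS:.1f} ms is about {baseline_frames:.2f} frames at {FRAME_RATE_FPS} fps")
```

The same helper can be applied to any of the latencies reported on this page to compare them in units of frames instead of milliseconds.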
− | + | == Cudafilter == | |
+ | === NVMM Direct Handling === | ||
+ | ==== In-place:True ==== | ||
+ | * '''''Glass to Glass latency = 178.9331237 ms''''' | ||
+ | Test pipeline: | ||
+ | <pre> | ||
+ | gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=NV12,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=true location=/home/nvidia/gst-cuda/tests/examples/cudafilter_algorithms/gray-scale-filter/gray-scale-filter.so ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | ||
+ | </pre> | ||
+ | ==== In-place:False ==== | ||
+ | * '''''Glass to Glass latency = 230.3850304 ms''''' | ||
+ | Test pipeline: | ||
+ | <pre> | ||
+ | gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=NV12,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=false location=/home/nvidia/gst-cuda/tests/examples/cudafilter_algorithms/gray-scale-filter/gray-scale-filter.so ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | ||
+ | </pre> | ||
+ | === Unified Memory Allocator === | ||
+ | ==== In-place:True ==== | ||
+ | * '''''Glass to Glass latency = 188.1192285 ms''''' | ||
+ | Test pipeline: | ||
+ | <pre> | ||
+ | gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=true location=/home/nvidia/gst-cuda/tests/examples/cudafilter_algorithms/gray-scale-filter/gray-scale-filter.so ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | ||
+ | </pre> | ||
+ | ==== In-place:False ==== | ||
+ | * '''''Glass to Glass latency = 306.2578894 ms''''' | ||
+ | Test pipeline: | ||
+ | <pre> | ||
+ | gst-launch-1.0 nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! cudafilter in-place=false location=/home/nvidia/gst-cuda/tests/examples/cudafilter_algorithms/gray-scale-filter/gray-scale-filter.so ! nvvidconv ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | ||
+ | </pre> | ||
− | + | == Cudamux == | |
− | == | + | === NVMM Direct Handling === |
− | + | ==== In-place:True ==== | |
− | + | * '''''Glass to Glass latency = 145.5713375 ms''''' | |
− | < | + | Test pipeline: |
− | + | <pre> | |
− | + | gst-launch-1.0 -v cudamux name=cuda in-place=true location=/home/nvidia/gst-cuda/tests/examples/cudamux_algorithms/mixer/mixer.so nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=NV12,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! queue max-size-buffers=3 leaky=2 ! cuda.sink_0 nvcamerasrc queue-size=10 sensor-id=2 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=NV12,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! queue max-size-buffers=3 leaky=2 ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | |
− | + | </pre> | |
− | + | ==== In-place:False ==== | |
− | + | * '''''Glass to Glass latency = 332.9231919 ms''''' | |
− | </ | + | Test pipeline: |
+ | <pre> | ||
+ | gst-launch-1.0 -v cudamux name=cuda in-place=false location=/home/nvidia/gst-cuda/tests/examples/cudamux_algorithms/mixer/mixer.so nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=NV12,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! queue max-size-buffers=3 leaky=2 ! cuda.sink_0 nvcamerasrc queue-size=10 sensor-id=2 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=NV12,framerate=60/1" ! nvvidconv ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! queue max-size-buffers=3 leaky=2 ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | ||
+ | </pre> | ||
+ | === Unified Memory Allocator === | ||
+ | ==== In-place:True ==== | ||
+ | * '''''Glass to Glass latency = 136.4211149 ms''''' | ||
+ | Test pipeline: | ||
+ | <pre> | ||
+ | gst-launch-1.0 -v cudamux name=cuda in-place=true location=/home/nvidia/gst-cuda/tests/examples/cudamux_algorithms/mixer/mixer.so nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! queue max-size-buffers=3 leaky=2 ! cuda.sink_0 nvcamerasrc queue-size=10 sensor-id=2 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! queue max-size-buffers=3 leaky=2 ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | ||
+ | </pre> | ||
+ | ==== In-place:False ==== | ||
+ | * '''''Glass to Glass latency = 197.1957698 ms''''' | ||
+ | Test pipeline: | ||
+ | <pre> | ||
+ | gst-launch-1.0 -v cudamux name=cuda in-place=false location=/home/nvidia/gst-cuda/tests/examples/cudamux_algorithms/mixer/mixer.so nvcamerasrc queue-size=10 sensor-id=1 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! queue max-size-buffers=3 leaky=2 ! cuda.sink_0 nvcamerasrc queue-size=10 sensor-id=2 fpsRange='60 60' ! "video/x-raw(memory:NVMM),width=3840,height=2160,format=I420,framerate=60/1" ! nvvidconv ! "video/x-raw,width=3840,height=2160,format=I420,framerate=60/1" ! queue max-size-buffers=3 leaky=2 ! cuda.sink_1 cuda. ! perf print-arm-load=true ! nvoverlaysink enable-last-sample=false | ||
+ | </pre> | ||
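The results above can be summarized as the latency each configuration adds over the capture-to-display baseline. The sketch below only restates the measured values from this page; the dictionary labels and helper name are hypothetical. Note that the cudamux pipelines capture from two cameras while the baseline uses one, so the comparison is indicative rather than exact.

```python
# Summarize the added latency of each GstCUDA configuration over the baseline.
# All numbers are the measurements reported on this page; labels are illustrative.

BASELINE_MS = 112.2042693  # simple capture-to-display pipeline, no GstCUDA

LATENCIES_MS = {
    "cudafilter, NVMM Direct Handling, in-place=true": 178.9331237,
    "cudafilter, NVMM Direct Handling, in-place=false": 230.3850304,
    "cudafilter, Unified Memory Allocator, in-place=true": 188.1192285,
    "cudafilter, Unified Memory Allocator, in-place=false": 306.2578894,
    "cudamux, NVMM Direct Handling, in-place=true": 145.5713375,
    "cudamux, NVMM Direct Handling, in-place=false": 332.9231919,
    "cudamux, Unified Memory Allocator, in-place=true": 136.4211149,
    "cudamux, Unified Memory Allocator, in-place=false": 197.1957698,
}


def overhead_ms(latency_ms, baseline_ms=BASELINE_MS):
    """Latency added on top of the baseline pipeline, in milliseconds."""
    return latency_ms - baseline_ms


# Print configurations from lowest to highest latency.
for name, latency in sorted(LATENCIES_MS.items(), key=lambda item: item[1]):
    added = overhead_ms(latency)
    percent = 100.0 * added / BASELINE_MS
    print(f"{name}: {latency:.1f} ms (+{added:.1f} ms, +{percent:.0f}%)")

lowest = min(LATENCIES_MS, key=LATENCIES_MS.get)
highest = max(LATENCIES_MS, key=LATENCIES_MS.get)
```

In every measured pair, the in-place=true variant has lower latency than its in-place=false counterpart.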
|keywords=GstCUDA add-ons,GstCUDA framework}} | |keywords=GstCUDA add-ons,GstCUDA framework}} |