
From RidgeRun Developer Connection
<noinclude>
 
{{CUDA ISP for NVIDIA Jetson/Head|previous=|next=|metakeywords=|metadescription=}}
 
</noinclude>
 
  
{{DISPLAYTITLE:CUDA ISP for NVIDIA Jetson: Performance|noerror}}
 
 
= Library API performance =
 
 
To measure the CUDA ISP API performance, we built a simple example that iterates over the apply methods and records performance metrics for each iteration: the duration of each apply method, the CPU and GPU usage while the code runs, and the CPU RAM and GPU RAM usage. We ran it on a Jetson Nano, Jetson Xavier NX, Jetson Xavier AGX, and Jetson Orin, using three buffer sizes:
 
* A minimum 2x2 case, to test the maximum speeds that the apply methods could achieve
 
* A medium 1920x1080 case, to illustrate the changes in performance as the buffer size increases
 
* A maximum 3840x2160 case, to test performance on large buffers
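
The measurement loop described above can be sketched as follows. This is only an illustration in Python, not the benchmark that produced the tables: <code>time_apply</code> and <code>fake_apply</code> are hypothetical names, and <code>fake_apply</code> is a CPU-only stand-in for a real CUDA ISP apply call.

```python
import statistics
import time


def time_apply(apply_fn, iterations=100):
    """Return the mean duration of apply_fn in microseconds."""
    durations_us = []
    for _ in range(iterations):
        start = time.perf_counter()
        apply_fn()
        durations_us.append((time.perf_counter() - start) * 1e6)
    return statistics.mean(durations_us)


def fake_apply():
    # Hypothetical stand-in for an apply method; it only burns CPU time.
    sum(range(1000))


if __name__ == "__main__":
    print(f"mean duration: {time_apply(fake_apply):.1f} us")
```

When timing real GPU work this way, the timer should only be stopped once the GPU has finished the operation. CUDA kernel launches are asynchronous, so a call that returns early would only show its launch overhead.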
 
 
== Jetson Nano ==
 
 
=== Processing Time ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Processing time (in microseconds, averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| cudashift || 136 || 135 || 147
 
|-
 
| cudadebayer || 68 || 53 || 55
 
|-
 
| cudawhitebalancer || 317 || 5071 || 18903
 
|-
 
| cudacolorspaceconverter || 55 || 55 || 57
 
|-
 
|}
 
</center>
 
 
=== CPU and CPU RAM usage ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| CPU usage (%)|| 0.797500 || 0.836478 || 0.819940
 
|-
 
| CPU RAM usage (kB) || 147071 || 146295 || 147580
 
|-
 
|}
 
</center>
 
 
=== GPU and GPU RAM usage ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| GPU usage (%)|| 0.0 || 25.12 || 94.6
 
|-
 
| GPU RAM usage (kB) || 91967 || 91733 || 116833
 
|-
 
|}
 
</center>
 
 
== Jetson Xavier NX ==
 
 
=== Processing Time ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Processing time (in microseconds, averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| cudashift || 93 || 93 || 93
 
|-
 
| cudadebayer || 39 || 39 || 31
 
|-
 
| cudawhitebalancer || 375 || 1360 || 4249
 
|-
 
| cudacolorspaceconverter || 33 || 35 || 34
 
|-
 
|}
 
</center>
 
 
=== CPU and CPU RAM usage ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| CPU usage (%)|| 0.482488 || 0.523657 || 0.477216
 
|-
 
| CPU RAM usage (kB) || 171679 || 173539 || 171987
 
|-
 
|}
 
</center>
 
 
=== GPU and GPU RAM usage ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| GPU usage (%)|| 0.85 || 5.48 || 17.91
 
|-
 
| GPU RAM usage (kB) || 98719 || 100387 || 106288
 
|-
 
|}
 
</center>
 
 
== Jetson Xavier AGX ==
 
 
=== Processing Time ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Processing time (in microseconds, averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| cudashift || 129 || 135 || 131
 
|-
 
| cudadebayer || 54 || 48 || 39
 
|-
 
| cudawhitebalancer || 667 || 4844 || 8091
 
|-
 
| cudacolorspaceconverter || 38 || 45 || 52
 
|-
 
|}
 
</center>
 
 
=== CPU and CPU RAM usage ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| CPU usage (%)|| 0.409836 || 0.491435 || 0.458062
 
|-
 
| CPU RAM usage (kB) || 172066 || 173613 || 173477
 
|-
 
|}
 
</center>
 
 
=== GPU and GPU RAM usage ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| GPU usage (%)|| || ||
 
|-
 
| GPU RAM usage (kB) || 101984 || 105247 || 107641
 
|-
 
|}
 
</center>
 
 
== Jetson Orin ==
 
 
=== Processing Time ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Processing time (in microseconds, averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| cudashift || || ||
 
|-
 
| cudadebayer || || ||
 
|-
 
| cudawhitebalancer || || ||
 
|-
 
| cudacolorspaceconverter || || ||
 
|-
 
|}
 
</center>
 
 
=== CPU and CPU RAM usage ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| CPU usage (%)|| || ||
 
|-
 
| CPU RAM usage (kB) || || ||
 
|-
 
|}
 
</center>
 
 
=== GPU and GPU RAM usage ===
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Averaged over 100 iterations) !! 2x2 Buffers !! 1080p Buffers !! 4K Buffers
 
|-
 
| GPU usage (%)|| || ||
 
|-
 
| GPU RAM usage (kB) || || ||
 
|-
 
|}
 
</center>
 
 
= GStreamer elements performance =
 
To measure the performance, we have used two of our GStreamer tools: [[GstShark]] and [https://github.com/RidgeRun/gst-perf GstPerf].
 
 
For testing purposes, take into account the following points:
 
* Maximum performance mode enabled: all cores and Jetson clocks enabled.
 
* JetPack 4.6
 
 
== '''Jetson Xavier AGX''' ==
 
For all elements, the processing time and FPS were measured with a 1920x1200 input image coming from a camera sensor.
 
 
The following pipeline was used to test the cudadebayer and cudaawb elements with an RGB image as output.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! cudaawb ! 'video/x-raw, format=RGB' ! fakesink
 
</source>
 
The following pipeline was used to test the cudadebayer and cudaawb elements with an I420 image as output.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! cudaawb ! 'video/x-raw, format=I420' ! fakesink
 
</source>
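
The pipelines above enable the <code>proctime</code> tracer, which reports per-element processing times in the debug log. The following Python sketch shows one way to average those times per element. It assumes each trace line contains a fragment of the form <code>proctime, element=(string)NAME, time=(string)H:MM:SS.NNNNNNNNN</code>; check this against the log produced by your GstShark version. The element name and times in the usage example are made up for illustration, and <code>average_proctime</code> is a hypothetical helper, not part of GstShark.

```python
import re
from collections import defaultdict

# Assumed shape of a proctime trace entry inside a GST_DEBUG log line.
LINE_RE = re.compile(
    r"proctime, element=\(string\)(?P<element>[^,]+), "
    r"time=\(string\)(?P<h>\d+):(?P<m>\d+):(?P<s>\d+(?:\.\d+)?)"
)


def average_proctime(lines):
    """Return {element name: mean processing time in seconds}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # Skip lines that are not proctime entries.
        seconds = (
            int(match["h"]) * 3600 + int(match["m"]) * 60 + float(match["s"])
        )
        totals[match["element"]] += seconds
        counts[match["element"]] += 1
    return {name: totals[name] / counts[name] for name in totals}


if __name__ == "__main__":
    # Illustrative lines only; these are not measured values.
    sample = [
        "TRACE GST_TRACER :0:: proctime, element=(string)cudadebayer0, "
        "time=(string)0:00:00.002000000;",
        "TRACE GST_TRACER :0:: proctime, element=(string)cudadebayer0, "
        "time=(string)0:00:00.004000000;",
    ]
    print(average_proctime(sample))
```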
 
The results obtained:
 
 
 
<center>
 
{| class="wikitable"
 
|-
 
! colspan=5 | Xavier AGX
 
|-
 
! Element !! colspan=2|cudadebayer !! colspan=2|cudaawb
 
|-
 
! Output !! RGB !! I420 !! RGB !! I420
 
|-
 
| FPS || 539 || 458 || 752 || 473
 
|-
 
| Processing time (seconds) || 0.001854 || 0.002183 || 0.001329 || 0.002111
 
|}
 
</center>
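
In the table above, the FPS and processing time rows appear to be reciprocals of each other (FPS ≈ 1 / processing time). The short Python check below uses the Xavier AGX values from the table. The reciprocal relation is an observation about these numbers, not a documented definition of how FPS was obtained, and <code>fps_from_proctime</code> is a hypothetical helper.

```python
# (FPS, processing time in seconds) taken from the Xavier AGX table.
AGX_RESULTS = {
    ("cudadebayer", "RGB"): (539, 0.001854),
    ("cudadebayer", "I420"): (458, 0.002183),
    ("cudaawb", "RGB"): (752, 0.001329),
    ("cudaawb", "I420"): (473, 0.002111),
}


def fps_from_proctime(seconds):
    """Frames per second implied by a per-frame processing time."""
    return 1.0 / seconds


if __name__ == "__main__":
    for (element, output), (fps, proctime) in AGX_RESULTS.items():
        implied = fps_from_proctime(proctime)
        print(f"{element} {output}: table {fps} FPS, implied {implied:.1f} FPS")
```

The same check is a quick way to spot transcription errors in the other tables: a processing time that is off by a factor of ten implies an FPS far from the listed one.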
 
 
=='''Jetson Xavier NX'''==
 
For all elements, the processing time and FPS were measured with a 4K input image coming from a camera sensor.
 
 
The following pipeline was used to measure the processing time and FPS for the cudashift element.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, format=rggb' ! cudashift shift=5 ! fakesink
 
</source>
 
 
The following pipeline was used to test the cudadebayer and cudaawb elements with an RGB image as output.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! cudaawb ! fakesink
 
</source>
 
The following pipeline was used to test the cudadebayer and cudaawb elements with an I420 image as output.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! cudaawb ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
 
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! colspan=7 | Xavier NX
 
|-
 
! Element !! colspan=2|cudashift !! colspan=2|cudadebayer !! colspan=2|cudaawb
 
|-
 
! colspan=1|Output !! colspan=2|bayer 8 !! RGB !! I420 !! RGB !! I420
 
|-
 
| FPS || colspan=2|396 || 228 || 187 || 370 || 202
 
|-
 
| Processing time (seconds) || colspan=2|0.002522 || 0.004389 || 0.005353 || 0.002698 || 0.004952
 
|}
 
</center>
 
 
=='''Jetson Nano'''==
 
For all elements, the processing time and FPS were measured with a 4K input image coming from a camera sensor.
 
 
The following pipeline was used to measure the processing time and FPS for the cudashift element.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, format=rggb' ! cudashift shift=0 ! fakesink
 
</source>
 
The following pipeline was used to test the cudadebayer and cudaawb elements with an RGB image as output.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! cudaawb ! fakesink
 
</source>
 
The following pipeline was used to test the cudadebayer and cudaawb elements with an I420 image as output.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! cudaawb ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! colspan=7 | Nano
 
|-
 
! Element !! colspan=2|cudashift !! colspan=2|cudadebayer !! colspan=2|cudaawb
 
|-
 
! colspan=1|Output !! colspan=2|bayer 8 !! RGB !! I420 !! RGB !! I420
 
|-
 
| FPS || colspan=2|92 || 51 || 36 || 91 || 38
 
|-
 
| Processing time (seconds) || colspan=2|0.01088 || 0.01948 || 0.02769 || 0.01096 || 0.02605
 
|}
 
</center>
 
 
== '''More cameras''' ==
 
This section shows the performance results for the elements running at the same time on more than one camera on a Jetson Xavier AGX. For all the tests with an RGB output image, the following pipeline was used to measure the processing time and FPS of the cudadebayer and cudaawb elements, with a 1920x1200 input image coming from multiple camera sensors.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src device=/dev/video0 io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! cudaawb ! 'video/x-raw, format=RGB' ! fakesink
 
</source>
 
 
Likewise, for all the tests with an I420 output image, the following pipeline was used to measure the processing time and FPS of the cudadebayer and cudaawb elements, with a 1920x1200 input image coming from multiple camera sensors.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src device=/dev/video1 io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! cudaawb ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! colspan=9 | cudadebayer
 
|-
 
! Output !! colspan=4|RGB !! colspan=4|I420
 
|-
 
! Number of cameras !! Two !! Three !! Four !! Five !! Two !! Three !! Four !! Five
 
|-
 
| FPS || 412 || 429 || 385 || 494 || 464 || 402 || 320 || 332
 
|-
 
| Processing time (seconds) || 0.002426 || 0.002354 || 0.002597 || 0.002025 || 0.002154 || 0.002486 || 0.003128 || 0.003011
 
|}
 
</center>
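
The multi-camera runs use the same pipeline with a different <code>device</code> per camera. The Python sketch below builds one command per camera and can start them concurrently. It assumes that each camera runs in its own <code>gst-launch-1.0</code> process and that the cameras are exposed as <code>/dev/video0</code>, <code>/dev/video1</code>, and so on. The source only shows the per-camera pipeline, so this launch strategy and the helper names <code>build_command</code> and <code>launch_cameras</code> are assumptions.

```python
import os
import subprocess

INPUT_CAPS = "video/x-bayer, bpp=10, width=1920, height=1200, format=grbg"


def build_command(camera_index, output_format):
    """Return the gst-launch-1.0 argv for one camera (RGB or I420 output)."""
    return [
        "gst-launch-1.0", "-ve",
        "rrv4l2src", f"device=/dev/video{camera_index}", "io-mode=userptr",
        "!", INPUT_CAPS,
        "!", "cudadebayer",
        "!", "cudaawb",
        "!", f"video/x-raw, format={output_format}",
        "!", "fakesink",
    ]


def launch_cameras(camera_count, output_format):
    """Start one traced pipeline per camera and return the process handles."""
    env = dict(os.environ, GST_DEBUG="GST_TRACER:7", GST_TRACERS="proctime")
    return [
        subprocess.Popen(build_command(index, output_format), env=env)
        for index in range(camera_count)
    ]


if __name__ == "__main__":
    # Only print the commands; launch_cameras() needs the actual hardware.
    for index in range(2):
        print(" ".join(build_command(index, "RGB")))
```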
 
 
== '''Jetson Nano''' ==
 
In the following sections you will see the performance for each of the elements.
 
=== '''cudashift element''' ===
 
The following pipeline was used to measure the processing time and FPS for the cudashift element with an input image with 4K resolution coming from a camera sensor. 
 
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, format=rggb' ! cudashift shift=0 ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Nano
 
|-
 
| FPS || 92
 
|-
 
| Processing time (seconds) || 0.01088
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Cudashiftproctimenano.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
=== cudadebayer element ===
 
'''RGB Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudadebayer element with an input image with 4K resolution coming from a camera sensor to an RGB output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Nano
 
|-
 
| FPS || 51
 
|-
 
| Processing time (seconds) || 0.01948
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Debayergbproctimenano.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
'''I420 Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudadebayer element with an input image with 4K resolution coming from a camera sensor to an I420 output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Nano
 
|-
 
| FPS || 36
 
|-
 
| Processing time (seconds) || 0.02769
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Debayer1420proctimenano.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
=== '''cudaawb element''' ===
 
'''RGB Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudaawb element with an input image with 4K resolution coming from a camera sensor to an RGB output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! cudaawb ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Nano
 
|-
 
| FPS || 91
 
|-
 
| Processing time (seconds) || 0.01096
 
|}
 
</center>
 
<br>
 
<br><br>
 
[[File:Awbrgbproctimenano.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
'''I420 Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudaawb element with an input image with 4K resolution coming from a camera sensor to an I420 output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! cudaawb ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Nano
 
|-
 
| FPS || 38
 
|-
 
| Processing time (seconds) || 0.02605
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Awbi420proctimenano.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
 
 
== '''Jetson Xavier NX''' ==
 
 
In the following sections you will see the performance for each of the elements.
 
=== '''cudashift element''' ===
 
The following pipeline was used to measure the processing time and FPS for the cudashift element with an input image with 4K resolution coming from a camera sensor. 
 
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, format=rggb' ! cudashift shift=5 ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier NX
 
|-
 
| FPS || 396
 
|-
 
| Processing time (seconds) || 0.002522
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Shift4kproctimenx.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
=== '''cudadebayer element''' ===
 
'''RGB Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudadebayer element with an input image with 4K resolution coming from a camera sensor to an RGB output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier NX
 
|-
 
| FPS || 228
 
|-
 
| Processing time (seconds) || 0.004389
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Debayer4krgbproctimenx.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
'''I420 Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudadebayer element with an input image with 4K resolution coming from a camera sensor to an I420 output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier NX
 
|-
 
| FPS || 187
 
|-
 
| Processing time (seconds) || 0.005353
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Debayer4ki420proctimenx.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
=== '''cudaawb element''' ===
 
'''RGB Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudaawb element with an input image with 4K resolution coming from a camera sensor to an RGB output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! cudaawb ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier NX
 
|-
 
| FPS || 370
 
|-
 
| Processing time (seconds) || 0.002698
 
|}
 
</center>
 
<br>
 
<br><br>
 
[[File:Awb4krgbproctimenx.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
'''I420 Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudaawb element with an input image with 4K resolution coming from a camera sensor to an I420 output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve v4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=3840, height=2160' ! cudadebayer ! cudaawb ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier NX
 
|-
 
| FPS || 202
 
|-
 
| Processing time (seconds) || 0.004952
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Awb4ki420proctimenx.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
== '''Jetson Xavier AGX''' ==
 
In the following sections you will see the performance of each of the elements.
 
 
=== cudadebayer ===
 
'''RGB Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudadebayer element with an input image with 1920x1200 resolution coming from a camera sensor to an RGB output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! 'video/x-raw, format=RGB' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier AGX
 
|-
 
| FPS || 539
 
|-
 
| Processing time (seconds) || 0.001854
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Debayer1920x1200rgbproctimev1.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
'''I420 Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudadebayer element with an input image with 1920x1200 resolution coming from a camera sensor to an I420 output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier AGX
 
|-
 
| FPS || 458
 
|-
 
| Processing time (seconds) || 0.002183
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Debayer1920x1200i420proctime.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
=== cudaawb ===
 
'''RGB Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudaawb element with an input image with 1920x1200 resolution coming from a camera sensor to an RGB output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! cudaawb ! 'video/x-raw, format=RGB' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier AGX
 
|-
 
| FPS || 752
 
|-
 
| Processing time (seconds) || 0.001329
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Debayer1920x1200rgbproctime.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
'''I420 Output'''
 
 
The following pipeline was used to measure the processing time and FPS for the cudaawb element with an input image with 1920x1200 resolution coming from a camera sensor to an I420 output image.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! cudaawb ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
The results obtained:
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! Jetson Xavier AGX
 
|-
 
| FPS || 473
 
|-
 
| Processing time (seconds) || 0.002111
 
|}
 
</center>
 
 
<br>
 
<br><br>
 
[[File:Awb1920x1200i420proctimeagx.png|1000px|frameless|center|CUDA ISP library ]]
 
<br>
 
<br><br>
 
 
== More cameras ==
 
This section shows the performance results for the elements running at the same time on more than one camera on a Jetson Xavier AGX. For all the tests with an RGB output image, the following pipeline was used to measure the processing time and FPS of the cudadebayer and cudaawb elements, with a 1920x1200 input image coming from multiple camera sensors.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src device=/dev/video0 io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! cudaawb ! 'video/x-raw, format=RGB' ! fakesink
 
</source>
 
 
Likewise, for all the tests with an I420 output image, the following pipeline was used to measure the processing time and FPS of the cudadebayer and cudaawb elements, with a 1920x1200 input image coming from multiple camera sensors.
 
<source lang=bash>
 
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 -ve rrv4l2src device=/dev/video1 io-mode=userptr ! 'video/x-bayer, bpp=10, width=1920, height=1200, format=grbg' ! cudadebayer ! cudaawb ! 'video/x-raw, format=I420' ! fakesink
 
</source>
 
 
=== Two cameras ===
 
 
'''RGB Output'''
 
 
The results obtained:
 
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! cudadebayer !! cudaawb
 
|-
 
| FPS || 412 || 397
 
|-
 
| Processing time (seconds) || 0.002426 || 0.002521
 
|}
 
</center>
 
 
'''I420 Output'''
 
 
The results obtained:
 
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! cudadebayer !! cudaawb
 
|-
 
| FPS || 464 || 429
 
|-
 
| Processing time (seconds) || 0.002154 || 0.002330
 
|}
 
</center>
 
 
=== Three cameras ===
 
'''RGB Output'''
 
 
The results obtained:
 
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! cudadebayer !! cudaawb
 
|-
 
| FPS || 429 || 374
 
|-
 
| Processing time (seconds) || 0.002354 || 0.002672
 
|}
 
</center>
 
 
'''I420 Output'''
 
 
The results obtained:
 
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! cudadebayer !! cudaawb
 
|-
 
| FPS || 402 || 450
 
|-
 
| Processing time (seconds) || 0.002486 || 0.002220
 
|}
 
</center>
 
 
=== Four cameras ===
 
 
'''RGB Output'''
 
 
The results obtained:
 
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! cudadebayer !! cudaawb
 
|-
 
| FPS || 385 || 689
 
|-
 
| Processing time (seconds) || 0.002597 || 0.001450
 
|}
 
</center>
 
 
'''I420 Output'''
 
 
The results obtained:
 
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! cudadebayer !! cudaawb
 
|-
 
| FPS || 320 || 289
 
|-
 
| Processing time (seconds) || 0.003128 || 0.003459
 
|}
 
</center>
 
 
=== Five cameras ===
 
 
'''RGB Output'''
 
 
The results obtained:
 
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! cudadebayer !! cudaawb
 
|-
 
| FPS || 494 || 347
 
|-
 
| Processing time (seconds) || 0.002025 || 0.002883
 
|}
 
</center>
 
 
'''I420 Output'''
 
 
The results obtained:
 
 
<center>
 
{| class="wikitable"
 
|-
 
! Measurement (Average) !! cudadebayer !! cudaawb
 
|-
 
| FPS || 332 || 296
 
|-
 
| Processing time (seconds) || 0.003011 || 0.003375
 
|}
 
</center>
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
<noinclude>
 
{{CUDA ISP for NVIDIA Jetson/Foot||}}
 
</noinclude>
 

Latest revision as of 08:17, 28 March 2023