CUDA Accelerated GStreamer Camera Undistort/Performance/Xavier

From RidgeRun Developer Connection

Latest revision as of 03:22, 14 April 2023








The performance of the cuda-undistort element depends mainly on the input image resolution.

The following sections show the cuda-undistort measurements (FPS and Latency) for multiple image resolutions, as well as the impact of changing the distortion model.

Platform Setup

The performance measurements were taken on the AGX Xavier in 30W All mode, which can be activated with:

sudo nvpmodel -m 3

The following sections compare running the platform at maximum frequency (with jetson_clocks) against base mode. Maximum frequency can be set as follows:

sudo /usr/bin/jetson_clocks
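When switching back and forth between base mode and maximum frequency, it can help to wrap the two commands above in small helpers. The following is a minimal sketch, not part of the original measurement setup: the function names `run`, `enter_max_clocks` and `leave_max_clocks` and the `DRY_RUN` variable are hypothetical, and it assumes the `--store`, `--restore` and `--show` options of jetson_clocks and the `-q` query option of nvpmodel are available on the installed JetPack release.

```shell
# Print the command instead of running it when DRY_RUN=1 (useful off-target).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: select 30W All mode, save the current clock
# settings, then lock the clocks at their maximum frequency.
enter_max_clocks() {
    run sudo nvpmodel -m 3          # 30W All mode, as used in the measurements
    run sudo nvpmodel -q            # query the active power mode
    run sudo jetson_clocks --store  # save current settings for later restore
    run sudo jetson_clocks          # lock clocks at maximum
    run sudo jetson_clocks --show   # print the resulting clock settings
}

# Hypothetical helper: return to the settings saved by enter_max_clocks.
leave_max_clocks() {
    run sudo jetson_clocks --restore
}
```

Running `DRY_RUN=1 enter_max_clocks` only prints the commands, which is a quick way to review them before touching the board.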

Framerate

The following charts show the average frames per second (FPS) for multiple image resolutions, with and without running the jetson_clocks script.

Undistort FPS on multiple resolution images using Fisheye model with and without jetson_clocks.sh
Undistort FPS on multiple resolution images using Brown-Conrady model with and without jetson_clocks.sh

Pipeline structure

The general structure of the pipeline used for the framerate measurements is shown below, for the Fisheye model.

CAMERA_MATRIX="{\"fx\":9.5211633874478218e+02, \"fy\":9.4946222068253201e+02, \"cx\":6.8041416457132573e+02, \"cy\":3.1446117133659988e+02}"
DISTORTION_PARAMETERS="{\"k1\":3.8939572818197948e-01, \"k2\":-5.5685725182648649e-01, \"k3\":2.3785352925072494e+00, \"k4\":-1.2037220289124213e+00}"

INPUT=image_1.jpg

gst-launch-1.0 filesrc location=$INPUT \
! nvjpegdec ! imagefreeze ! nvvidconv \
! cudaundistort distortion-model=fisheye \
camera-matrix="$CAMERA_MATRIX" distortion-parameters="$DISTORTION_PARAMETERS" \
! perf print-cpu-load=true ! fakesink
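The perf element in the pipeline above reports the framerate periodically, so an average over a run has to be computed from its log. The following is a minimal sketch of one way to do that. It assumes each perf report contains a field of the form `fps: <number>` separated from the other fields by semicolons; the exact output format may differ between gst-perf versions. The function name `average_fps` and the file name `perf.log` are hypothetical.

```shell
# Average the "fps:" samples found in perf element output read from stdin.
# Assumption: fields are separated by ";" and the instantaneous framerate
# appears as "fps: <number>". The separate "mean_fps:" field is not matched.
average_fps() {
    awk '
        {
            n = split($0, fields, ";")
            for (i = 1; i <= n; i++) {
                gsub(/^[ \t]+/, "", fields[i])      # trim leading whitespace
                if (fields[i] ~ /^fps: /) {
                    sub(/^fps: /, "", fields[i])    # keep only the number
                    sum += fields[i]
                    count++
                }
            }
        }
        END {
            if (count > 0)
                printf "%.3f\n", sum / count
            else
                print "no samples"
        }
    '
}
```

With the pipeline output captured to a file, the average would be obtained with `average_fps < perf.log`.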

Latency

For the purpose of this cuda-undistort performance evaluation, Latency is measured as the time difference between the src pad of the element immediately upstream of the undistort and the src pad of the undistort itself, effectively measuring the time between the element's input and output pads. For multiple inputs, the largest time difference is taken.

These latency measurements were taken using the GstShark interlatency tracer.

The charts below show the latency of the cuda-undistort element for both distortion models and multiple resolutions, with and without running the jetson_clocks script.

Latency on multiple resolution images using Fisheye model with and without jetson_clocks.sh
Latency on multiple resolution images using Brown-Conrady model with and without jetson_clocks.sh

Pipeline structure

The general structure of the pipeline used for the latency measurements is shown below, for the Fisheye model.

CAMERA_MATRIX="{\"fx\":9.5211633874478218e+02, \"fy\":9.4946222068253201e+02, \"cx\":6.8041416457132573e+02, \"cy\":3.1446117133659988e+02}"
DISTORTION_PARAMETERS="{\"k1\":3.8939572818197948e-01, \"k2\":-5.5685725182648649e-01, \"k3\":2.3785352925072494e+00, \"k4\":-1.2037220289124213e+00}"

INPUT=image_1.jpg

GST_DEBUG="3,GST_TRACER:7" GST_TRACERS="interlatency" GST_SHARK_CTF_DISABLE=1 \
gst-launch-1.0 filesrc location=$INPUT \
! nvjpegdec ! imagefreeze ! nvvidconv \
! cudaundistort distortion-model=fisheye \
camera-matrix="$CAMERA_MATRIX" distortion-parameters="$DISTORTION_PARAMETERS" \
! perf print-cpu-load=true ! fakesink
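The interlatency tracer reports the time from the source element to each pad, so the latency of a single element is the difference between the values at its src pad and at the src pad of the element before it. The following is a minimal sketch of that subtraction using per-pad averages. It assumes tracer lines carry `to_pad=(string)<name>,` and `time=(string)H:MM:SS.nnnnnnnnn;` fields; the function name `element_latency_ms`, the file name `trace.log` and the pad names used in the usage note are assumptions that should be checked against the actual log.

```shell
# Print the mean latency (in ms) added between two pads, computed from
# interlatency tracer lines read on stdin.
#   $1: upstream pad name, $2: downstream pad name
element_latency_ms() {
    awk -v up="$1" -v down="$2" '
        # Convert the "time=(string)H:MM:SS.nnnnnnnnn;" field to milliseconds.
        function to_ms(line,    t, hms) {
            t = line
            sub(/.*time=\(string\)/, "", t)
            sub(/;.*/, "", t)
            split(t, hms, ":")
            return (hms[1] * 3600 + hms[2] * 60 + hms[3]) * 1000
        }
        index($0, "interlatency") == 0 { next }   # skip unrelated log lines
        index($0, "to_pad=(string)" up ",")   { up_sum += to_ms($0); up_n++ }
        index($0, "to_pad=(string)" down ",") { down_sum += to_ms($0); down_n++ }
        END {
            if (up_n && down_n)
                printf "%.3f\n", down_sum / down_n - up_sum / up_n
        }
    '
}
```

With the tracer output captured to a file, a call such as `element_latency_ms nvvconv0_src cudaundistort0_src < trace.log` would report the mean time spent in the undistort element. Note that this averages over the run, whereas the evaluation above takes the largest difference when there are multiple inputs.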

