CUDA ISP for NVIDIA Jetson: Performance
Library API performance
To measure the CUDA ISP API performance, we built a simple example (provided upon request) that iterates over the `Apply` method of each algorithm and records performance metrics for each iteration. We measured the duration of each algorithm's `Apply` method, as well as the CPU, CPU RAM, GPU, and GPU RAM usage of the complete processing pipeline iterating at 30 fps. We ran the experiments with both 1080p and 4K buffers on the Jetson Nano, Jetson Xavier NX, Jetson Xavier AGX, and Jetson AGX Orin.
- We measured the duration of each `Apply` method separately using the `chrono` library.
- We used the `sys/times.h` header to obtain the CPU usage.
- We read the `/proc/self/status` file to obtain the CPU RAM usage.
- We used jtop to measure GPU usage on the Jetson Nano and Jetson Xavier NX, and jetson-stats to measure GPU usage on the Jetson Xavier AGX and the Jetson AGX Orin.
- We used `cudaMemGetInfo` from CUDA to measure GPU RAM usage.
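The CPU-side measurements listed above can be sketched as follows. This is a minimal illustration, not the benchmark used for the results: the helper names (`AverageMicroseconds`, `CpuUsageSampler`, `ReadVmRssKb`) are hypothetical, the choice of the `VmRSS` field as "CPU RAM usage" is an assumption, and the CPU percentage is expressed relative to a single core, which may differ from how the reported figures were normalized. The GPU measurements (jtop, jetson-stats, `cudaMemGetInfo`) are left out so that the sketch builds without the CUDA toolkit.

```cpp
#include <sys/times.h>

#include <cassert>
#include <chrono>
#include <fstream>
#include <sstream>
#include <string>

// Average wall-clock time of one call to `fn`, in microseconds, using std::chrono.
// In a real benchmark `fn` would wrap one algorithm's Apply call.
template <typename Fn>
double AverageMicroseconds(Fn&& fn, int iterations) {
  assert(iterations > 0);
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) fn();
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::micro> total = stop - start;
  return total.count() / iterations;
}

// Process CPU usage between two samples, based on times() from <sys/times.h>.
// Assumption: the result is a percentage of ONE core (user + system time
// divided by elapsed wall time), not normalized by the number of cores.
class CpuUsageSampler {
 public:
  CpuUsageSampler() { last_wall_ = times(&last_cpu_); }

  double SamplePercent() {
    struct tms now;
    const clock_t wall = times(&now);
    const clock_t busy = (now.tms_utime - last_cpu_.tms_utime) +
                         (now.tms_stime - last_cpu_.tms_stime);
    const clock_t elapsed = wall - last_wall_;
    last_cpu_ = now;
    last_wall_ = wall;
    if (elapsed <= 0) return 0.0;  // Samples too close together to measure.
    return 100.0 * static_cast<double>(busy) / static_cast<double>(elapsed);
  }

 private:
  struct tms last_cpu_;
  clock_t last_wall_;
};

// Resident memory of the current process in kB, parsed from /proc/self/status.
// Assumption: VmRSS is the field of interest. Returns -1 if it is not found.
long ReadVmRssKb() {
  std::ifstream status("/proc/self/status");
  std::string line;
  while (std::getline(status, line)) {
    if (line.rfind("VmRSS:", 0) == 0) {
      std::istringstream fields(line.substr(6));
      long kb = -1;
      fields >> kb;
      return kb;
    }
  }
  return -1;
}
```

A caller would wrap each `Apply` invocation in a lambda passed to `AverageMicroseconds`, and call `SamplePercent` and `ReadVmRssKb` periodically while the pipeline runs.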
The following table summarizes CUDA ISP's performance results:
Platform and buffer size | Jetson AGX Orin, 1080p | Jetson AGX Orin, 4K | Jetson Xavier AGX, 1080p | Jetson Xavier AGX, 4K | Jetson Xavier NX, 1080p | Jetson Xavier NX, 4K | Jetson Nano, 1080p | Jetson Nano, 4K |
---|---|---|---|---|---|---|---|---|
Processing time by algorithm in microseconds (equivalent frames per second in parentheses) | ||||||||
CudaShift | 60 (16.7K) | 51 (19.6K) | 135 (7.4K) | 131 (7.6K) | 93 (10.8K) | 93 (10.8K) | 135 (7.4K) | 147 (6.8K) |
CudaDebayer | 22 (45.5K) | 20 (50.0K) | 48 (20.8K) | 39 (25.6K) | 39 (25.6K) | 31 (32.3K) | 53 (18.9K) | 55 (18.2K) |
CudaWhiteBalancer | 4056 (247) | 5966 (168) | 4844 (206) | 8091 (124) | 1360 (735) | 4249 (235) | 5071 (197) | 18903 (53) |
CudaColorSpaceConverter | 20 (50.0K) | 17 (58.8K) | 45 (22.2K) | 52 (19.2K) | 35 (28.6K) | 34 (29.4K) | 55 (18.2K) | 57 (17.5K) |
Resource consumption profile | ||||||||
CPU usage (%) | 0.211 | 0.129 | 0.491 | 0.458 | 0.524 | 0.477 | 0.836 | 0.820 |
CPU RAM (MB) | 160.3 | 157.6 | 173.6 | 173.5 | 173.5 | 172.0 | 146.3 | 147.6 |
GPU usage (%) | 20.68 | 27.06 | 13.22 | 50.16 | 5.48 | 17.91 | 25.12 | 94.60 |
GPU RAM (MB) | 86.7 | 135.9 | 105.2 | 107.6 | 100.4 | 106.3 | 91.7 | 116.8 |
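The values in parentheses in the processing-time rows match 1,000,000 divided by the time in microseconds, that is, the rate each algorithm would reach if it ran back to back. The sketch below reproduces that arithmetic and compares the summed stage times against the frame period at 30 fps. The function names are hypothetical, and summing the stages assumes they run strictly one after another with no extra synchronization or copy overhead, which the table does not confirm.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Rate a single stage would sustain if run back to back:
// 1,000,000 microseconds per second divided by the stage time.
double EquivalentFps(double stage_us) {
  assert(stage_us > 0.0);
  return 1e6 / stage_us;
}

// Time available to process one frame at the given frame rate, in microseconds.
double FrameBudgetUs(double fps) {
  assert(fps > 0.0);
  return 1e6 / fps;
}

// Total time of a pipeline whose stages run sequentially (an assumption).
double PipelineTimeUs(const std::vector<double>& stages_us) {
  return std::accumulate(stages_us.begin(), stages_us.end(), 0.0);
}

// True if the summed stage times fit inside one frame period.
bool FitsFrameBudget(const std::vector<double>& stages_us, double fps) {
  return PipelineTimeUs(stages_us) <= FrameBudgetUs(fps);
}
```

For example, the Jetson Nano 4K column gives 147 + 55 + 18903 + 57 = 19162 microseconds, which is below the 33,333 microsecond frame period at 30 fps, so `FitsFrameBudget({147, 55, 18903, 57}, 30.0)` returns `true` under the sequential-execution assumption.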