CUDA ISP for NVIDIA Jetson: Performance

From RidgeRun Developer Connection

Library API performance

To measure the performance of the CUDA ISP API, we built a simple example (provided upon request) that iterates over the apply methods and records performance metrics for each iteration. The duration of each apply method was measured separately. CPU and GPU usage, as well as CPU RAM and GPU RAM usage, were measured for the complete processing pipeline running at 30 fps. All measurements were taken with 1080p and 4K buffers on a Jetson Nano, Jetson Xavier NX, Jetson Xavier AGX, and Jetson AGX Orin.
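The benchmark itself is not shown here, so the following is only a minimal sketch of the measurement pattern described above (call a function repeatedly and record the duration of each call), written in Python for illustration. The names `time_calls_us` and `stand_in_apply` are hypothetical and are not part of the CUDA ISP API; the stand-in workload merely takes the place of an apply method. Note that for asynchronous GPU work a wall-clock timer only captures the full cost if the call blocks until the work completes; whether the apply methods do so is not stated in the source.

```python
import statistics
import time


def time_calls_us(fn, iterations=100, warmup=5):
    """Return the wall-clock duration of each fn() call, in microseconds."""
    # Warm-up calls are executed but not recorded.
    for _ in range(warmup):
        fn()
    durations = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        durations.append((time.perf_counter_ns() - start) / 1000)
    return durations


def stand_in_apply():
    """Hypothetical placeholder for an apply method (CPU-only workload)."""
    sum(range(1000))


if __name__ == "__main__":
    samples = time_calls_us(stand_in_apply)
    print(f"mean {statistics.mean(samples):.1f} us, max {max(samples):.1f} us")
```

Recording every iteration rather than a single total makes it possible to report a mean and to spot outliers.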

The following table summarizes CUDA ISP's performance results:

Platform                 Jetson AGX Orin         Jetson Xavier AGX       Jetson Xavier NX        Jetson Nano
Buffer size              1080p       4K          1080p       4K          1080p       4K          1080p       4K

Processing time by algorithm (microseconds)
CudaShift                60 (16.7K)  51 (19.6K)  135 (7.4K)  131 (7.6K)  93 (10.8K)  93 (10.8K)  135 (7.4K)  147 (6.8K)
CudaDebayer              22          20          48          39          39          31          53          55
CudaWhiteBalancer        4056        5966        4844        8091        1360        4249        5071        18903
CudaColorSpaceConverter  20          17          45          52          35          34          55          57

Resource consumption profile
CPU usage (%)            0.211       0.129       0.491       0.458       0.524       0.477       0.836       0.820
CPU RAM (MB)             160.3       157.6       173.6       173.5       173.5       172.0       146.3       147.6
GPU usage (%)            20.68       27.06       13.22       50.16       5.48        17.91       25.12       94.60
GPU RAM (MB)             86.7        135.9       105.2       107.6       100.4       106.3       91.7        116.8
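
The processing times can be related to the 30 fps target with some simple arithmetic. The sketch below, in Python and for illustration only, converts a per-call duration in microseconds into calls per second and sums the four per-algorithm times for comparison with the per-frame time budget. The parenthesized values in the CudaShift row are consistent with this conversion (for example, 1,000,000 / 60 is roughly 16.7K), although the source does not label them. The function and variable names are hypothetical, and summing the stages assumes that they run back to back with no other per-frame cost.

```python
# Per-call durations in microseconds, copied from the table
# (Jetson Nano, 4K buffers). All names here are illustrative only.
NANO_4K_US = {
    "CudaShift": 147,
    "CudaDebayer": 55,
    "CudaWhiteBalancer": 18903,
    "CudaColorSpaceConverter": 57,
}


def calls_per_second(duration_us):
    """Convert a per-call duration in microseconds to calls per second."""
    return 1_000_000 / duration_us


def frame_budget_us(fps):
    """Time available per frame, in microseconds, at the given frame rate."""
    return 1_000_000 / fps


def pipeline_time_us(stage_times_us):
    """Total per-frame time, assuming the stages run sequentially."""
    return sum(stage_times_us.values())


if __name__ == "__main__":
    total = pipeline_time_us(NANO_4K_US)
    budget = frame_budget_us(30)
    shift_rate = calls_per_second(NANO_4K_US["CudaShift"])
    print(f"CudaShift: {shift_rate:.0f} calls/s")
    print(f"Pipeline: {total} us of a {budget:.0f} us budget "
          f"({100 * total / budget:.1f}%)")
```

Under these assumptions, the four stages on the Jetson Nano at 4K add up to 19162 microseconds, which fits within the roughly 33333 microseconds available per frame at 30 fps, with CudaWhiteBalancer accounting for nearly all of it.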


