Library API performance
To measure the performance of the CUDA ISP API, we built a simple example that iterates over the apply methods and records performance metrics for each iteration. For each run we recorded the duration of every apply method, the CPU and GPU usage while the example was running, and the CPU RAM and GPU RAM usage. The measurements were taken on a Jetson Nano, Jetson Xavier NX, Jetson Xavier AGX, and Jetson Orin, using three buffer sizes:
- A minimum 2x2 case, to test the maximum speeds that the apply methods could achieve
- A medium 1920x1080 case, to illustrate the changes in performance as the buffer size increases
- A maximum 3840x2160 case, to test performance on large buffers
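The measurement loop described above can be sketched as follows. This is a minimal illustration, not the benchmark that produced the numbers on this page: `placeholder_apply`, `benchmark`, and `run_all` are hypothetical names, the stand-in stage runs on the CPU, and one byte per pixel is assumed. When timing a real GPU stage, the timed call would have to block until the GPU work finishes, because CUDA kernel launches are asynchronous.

```python
import time
from statistics import mean

# Buffer sizes from the list above, as (width, height) in pixels.
BUFFER_SIZES = {"2x2": (2, 2), "1080p": (1920, 1080), "4K": (3840, 2160)}


def benchmark(apply_fn, buffer, iterations=100):
    """Return the mean wall-clock duration of apply_fn(buffer) in microseconds."""
    durations_us = []
    for _ in range(iterations):
        start_ns = time.perf_counter_ns()
        apply_fn(buffer)  # a GPU stage must block here until its work is done
        durations_us.append((time.perf_counter_ns() - start_ns) / 1_000)
    return mean(durations_us)


def placeholder_apply(buffer):
    """Hypothetical stand-in for an ISP stage: copies the buffer in reverse."""
    return buffer[::-1]


def run_all(stages, iterations=100):
    """Benchmark every stage on every buffer size (one byte per pixel assumed)."""
    results = {}
    for size_name, (width, height) in BUFFER_SIZES.items():
        buffer = bytes(width * height)
        for stage_name, apply_fn in stages.items():
            results[(stage_name, size_name)] = benchmark(apply_fn, buffer, iterations)
    return results


if __name__ == "__main__":
    for key, mean_us in run_all({"placeholder": placeholder_apply}).items():
        print(key, f"{mean_us:.1f} us")
```

Averaging over many iterations, as done here with 100, smooths out scheduling jitter, which matters most for the 2x2 case where a single call takes only tens of microseconds.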
| Platform and buffer size | Jetson Orin, 1080p | Jetson Orin, 4K | Jetson Xavier AGX, 1080p | Jetson Xavier AGX, 4K | Jetson Xavier NX, 1080p | Jetson Xavier NX, 4K | Jetson Nano, 1080p | Jetson Nano, 4K |
|---|---|---|---|---|---|---|---|---|
| Processing time by algorithm (µs) | | | | | | | | |
| CudaShift | | | 135 | 131 | 93 | 93 | 135 | 147 |
| CudaDebayer | | | 48 | 39 | 39 | 31 | 53 | 55 |
| CudaWhiteBalancer | | | 4844 | 8091 | 1360 | 4249 | 5071 | 18903 |
| CudaColorSpaceConverter | | | 45 | 52 | 35 | 34 | 55 | 57 |
| Resource consumption profile | | | | | | | | |
| CPU usage (%) | | | 0.491435 | 0.458062 | 0.523657 | 0.477216 | 0.836478 | 0.819940 |
| CPU RAM (kB) | | | 173613 | 173477 | 173539 | 171987 | 146295 | 147580 |
| GPU usage (%) | | | | | 5.48 | 17.91 | 25.12 | 94.6 |
| GPU RAM (kB) | | | 105247 | 107641 | 100387 | 106288 | 91733 | 116833 |
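In the summary above, CudaWhiteBalancer is the only stage whose processing time changes substantially between 1080p and 4K. A short calculation makes the scaling explicit by comparing the 4K/1080p time ratio on each platform with the 4x increase in pixel count. The timings are copied from the table; the function name `scaling_ratios` is only illustrative.

```python
# Pixel counts of the two buffer sizes in the summary table.
PIXELS_1080P = 1920 * 1080
PIXELS_4K = 3840 * 2160

# CudaWhiteBalancer processing time in microseconds: (1080p, 4K).
WHITE_BALANCER_US = {
    "Jetson Xavier AGX": (4844, 8091),
    "Jetson Xavier NX": (1360, 4249),
    "Jetson Nano": (5071, 18903),
}


def scaling_ratios(timings):
    """Return the 4K/1080p time ratio for each platform."""
    return {platform: t_4k / t_1080p for platform, (t_1080p, t_4k) in timings.items()}


pixel_ratio = PIXELS_4K / PIXELS_1080P  # 4.0

if __name__ == "__main__":
    for platform, ratio in scaling_ratios(WHITE_BALANCER_US).items():
        print(f"{platform}: time x{ratio:.2f} for pixels x{pixel_ratio:.0f}")
```

A ratio close to 4 means the time grows roughly in proportion to the number of pixels; a smaller ratio means part of the cost does not depend on buffer size.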
Jetson Nano
Processing Time
| Processing time (in microseconds, averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| cudashift | 136 | 135 | 147 |
| cudadebayer | 68 | 53 | 55 |
| cudawhitebalancer | 317 | 5071 | 18903 |
| cudacolorspaceconverter | 55 | 55 | 57 |
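As a rough reading of the Jetson Nano timings above, the per-stage times can be added up to estimate a per-frame budget. This sketch assumes that all four stages run once per frame, one after another, and that nothing else (capture, copies, display) adds time; none of this is stated by the measurements themselves, so the resulting frame rates are upper bounds for illustration only.

```python
# Jetson Nano processing time per stage in microseconds, from the table above.
NANO_STAGE_US = {
    "1080p": {
        "cudashift": 135,
        "cudadebayer": 53,
        "cudawhitebalancer": 5071,
        "cudacolorspaceconverter": 55,
    },
    "4K": {
        "cudashift": 147,
        "cudadebayer": 55,
        "cudawhitebalancer": 18903,
        "cudacolorspaceconverter": 57,
    },
}


def frame_time_us(stage_times):
    """Total time for one frame if the stages run strictly in sequence."""
    return sum(stage_times.values())


def max_frame_rate(stage_times):
    """Upper bound on frames per second under the sequential assumption."""
    return 1_000_000 / frame_time_us(stage_times)


if __name__ == "__main__":
    for size, stage_times in NANO_STAGE_US.items():
        total = frame_time_us(stage_times)
        print(f"{size}: {total} us per frame, at most {max_frame_rate(stage_times):.1f} fps")
```

Under these assumptions the white balancer accounts for almost all of the frame time at both resolutions.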
CPU and CPU RAM usage
| Measurement (Averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| CPU usage (%) | 0.797500 | 0.836478 | 0.819940 |
| CPU RAM usage (kB) | 147071 | 146295 | 147580 |
GPU and GPU RAM usage
| Measurement (Averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| GPU usage (%) | 0.0 | 25.12 | 94.6 |
| GPU RAM usage (kB) | 91967 | 91733 | 116833 |
Jetson Xavier NX
Processing Time
| Processing time (in microseconds, averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| cudashift | 93 | 93 | 93 |
| cudadebayer | 39 | 39 | 31 |
| cudawhitebalancer | 375 | 1360 | 4249 |
| cudacolorspaceconverter | 33 | 35 | 34 |
CPU and CPU RAM usage
| Measurement (Averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| CPU usage (%) | 0.482488 | 0.523657 | 0.477216 |
| CPU RAM usage (kB) | 171679 | 173539 | 171987 |
GPU and GPU RAM usage
| Measurement (Averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| GPU usage (%) | 0.85 | 5.48 | 17.91 |
| GPU RAM usage (kB) | 98719 | 100387 | 106288 |
Jetson Xavier AGX
Processing Time
| Processing time (in microseconds, averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| cudashift | 129 | 135 | 131 |
| cudadebayer | 54 | 48 | 39 |
| cudawhitebalancer | 667 | 4844 | 8091 |
| cudacolorspaceconverter | 38 | 45 | 52 |
CPU and CPU RAM usage
| Measurement (Averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| CPU usage (%) | 0.409836 | 0.491435 | 0.458062 |
| CPU RAM usage (kB) | 172066 | 173613 | 173477 |
GPU and GPU RAM usage
| Measurement (Averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| GPU usage (%) | | | |
| GPU RAM usage (kB) | 101984 | 105247 | 107641 |
Jetson Orin
Processing Time
| Processing time (in microseconds, averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| cudashift | | | |
| cudadebayer | | | |
| cudawhitebalancer | | | |
| cudacolorspaceconverter | | | |
CPU and CPU RAM usage
| Measurement (Averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| CPU usage (%) | | | |
| CPU RAM usage (kB) | | | |
GPU and GPU RAM usage
| Measurement (Averaged over 100 iterations) | 2x2 Buffers | 1080p Buffers | 4K Buffers |
|---|---|---|---|
| GPU usage (%) | | | |
| GPU RAM usage (kB) | | | |