Difference between revisions of "GstCUDA - Features and Limitations"

From RidgeRun Developer Connection
(Key Features)
 
(21 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{GstCUDA Page |  
+
{{GstCUDA/Head|previous=|next=Supported Platforms|metakeywords=GstCuda Features, GstCuda Key Features, GstCUDA project, NVMM direct mapping mode, Unified memory allocator mode, NVMM, Unified memory}}
[[GstCUDA|Home]]|
 
[[GstCUDA - Supported Platforms|Supported Platforms]]|
 
 
 
This page describes the GstCUDA features and limitations.
 
  
 
__TOC__
 
__TOC__
Line 9: Line 5:
 
==GstCUDA project general characteristics==
 
==GstCUDA project general characteristics==
  
The GstCUDA project exposes the following general characteristics:
+
GstCUDA characteristics:
 
 
* CUDA algorithm easy integration into GStreamer pipelines.
 
* Complexity abstraction of both CUDA and GStreamer.
 
* Optimal performance assurance for GStreamer/CUDA applications on Tegra platforms.
 
 
 
  
 +
* Easy CUDA algorithm integration into GStreamer pipelines.
 +
* Complexity abstraction of both CUDA and GStreamer - allowing the developer to focus on the CUDA algorithm.
 +
* Optimal performance assurance for GStreamer/CUDA applications on Jetson platforms.
 +
* Support for PC systems that have NVIDIA GPUs. (x86 architecture)
  
 
==Key Features==
 
==Key Features==
  
The GstCUDA project exposes the following general key features:
+
GstCUDA key features:
  
* Offers a framework allowing users to develop custom GStreamer elements that could execute any CUDA algorithm. The framework consists in a series of base classes that abstracts the complexity of GStreamer and CUDA integration. Also, GstCUDA provides a set of quick prototyping elements.
+
* Offers a framework allowing users to develop custom GStreamer elements that can execute any CUDA algorithm. The framework consists of a series of base classes that abstract the complexity of GStreamer and CUDA integration.
* Guarantees zero memory copy interface between CUDA and GStreamer on Tegra X1/X2 platforms.
+
* Zero memory copy interface between CUDA and GStreamer on Jetson family platforms (TX1, TX2, Xavier, Nano and Orin).
 
* GstCUDA supports two modes of memory handling:
 
* GstCUDA supports two modes of memory handling:
**'''''NVMM direct mapping mode''''': use the GstCUDA API's to directly handle NVMM memory buffers type. This method guarantee the best possible performance on the Tegra platforms.
+
**'''''NVMM direct mapping mode''''': uses the GstCUDA APIs to directly handle NVMM memory buffers. This method provides the best possible performance on the Tegra platforms.
**'''''Unified memory allocator mode''''': avoids the use of NVMM memory buffers by providing a memory allocator that directly pass the buffer to the GPU, guaranteeing zero memory copies and maintaining an excellent performance. This mode has a lower performance in comparison with the ''"Unified memory allocator mode"''. Nevertheless, this mode allows to support V4L2 and user-space buffers.
+
**'''''Unified memory allocator mode''''': avoids the use of NVMM memory buffers by providing a memory allocator that passes the buffer directly to the GPU, which avoids memory copies while maintaining excellent performance. This mode has lower performance than the ''"NVMM direct mapping mode"''. The Unified memory allocator is used in conjunction with V4L2 and user-space buffers.
**Both modes are supported for Jetpack 3.0, however NVMM direct mapping mode is not supported for Jetpack 3.1.
+
**The two memory handling modes allow GstCUDA to support NVMM buffers, V4L2 buffers and user-space buffers.  
**This two memory handling modes allows GstCUDA to support NVMM buffers, V4L2 buffers and user-space buffers.
+
* Supports heavy CUDA algorithms and large amounts of data to be processed on the GPU without performance being affected due to copies or memory conversions. Handles up to  2x 4K 60fps streams simultaneously with ''"NVMM direct mapping mode"'' and 2x 4K 40fps streams simultaneously with ''"Unified memory allocator mode"''.
* Provides the necessary APIs to directly handle NVMM buffers type to achieve the best possible performance on the Tegra platforms.
+
* Provides a set of quick-prototyping video filter GStreamer elements, with different input/output combinations, that allow video frames to be processed by the GPU using a custom CUDA library algorithm. These elements execute the CUDA algorithm from a custom CUDA library that is loaded dynamically at run time; the library is passed to the GstCUDA element by setting an element property. The user can choose among the provided elements to find the one that best matches the project requirements. This is ideal for quick prototyping because the CUDA algorithm is separated from the GStreamer element: the user can modify the CUDA algorithm, recompile the custom CUDA library and run the GStreamer pipeline again to test it. Run-time linking allows the CUDA algorithm to be swapped out or updated without rebuilding any of the GStreamer source.
* Supports heavy CUDA algorithms and large amounts of data to be processed on the GPU without performance being affected due to copies or memory conversions. Could handle 2x 4K 60fps streams simultaneously with ''"NVMM direct mapping mode"'' and 2x 4K 40fps streams simultaneously with ''"Unified memory allocator mode"''.
+
* Provides integrated add-on elements, each consisting of a complete shared library that executes a specific CUDA algorithm. These add-on elements are based on the GstCUDA framework and show the potential of the framework for building a final product.
* Provides a set of video filter GStreamer elements, with different input/output pads combinations, that allows video frames to be processed by the GPU using a custom CUDA library algorithm. Those elements executes the CUDA algorithm from a custom CUDA library (XXX.so file) loaded dynamically during run-time, passed trough an element's property. The user can choose between the different provided elements, to find the one that best adjust to its project requirements. It is ideal for quick prototyping, because the CUDA algorithm is separated from the GStreamer element, so the user could make modifications to the CUDA algorithm, recompile the custom CUDA library and run the GStreamer pipeline again to test it.
 
* Provides ad-ons, that consists in full complete and terminated elements that executes a specific CUDA algorithm that is integrated into the element code. Those ad-ons elements are based on the GstCUDA framework, and clearly shows the potential of this framework being used to generate a final product.
 
 
 
  
 +
==Limitations==
  
==Limitations==
 
 
The current release exposes the following limitations and known bugs:
 
The current release exposes the following limitations and known bugs:
  
* It only supports Tegra X1/X2 platforms. There are plans in the future to extend support for PC systems that has Nvidia's GPUs.
+
* <del>It only supports the whole NVIDIA Jetson family platforms (TX1, TX2, Xavier and Nano). There are plans in the future to extend support for PC systems that have NVIDIA GPUs.</del>
 +
{{Ambox
 +
|type=notice
 +
|small=left
 +
|issue='''Note:''' the new release supports PC systems that have NVIDIA GPUs (x86 architecture).
 +
|style=width:unset;
 +
}}
 
* There are plans in the future to extend support for EGL memory type buffers being directly accepted as inputs.
 
* There are plans in the future to extend support for EGL memory type buffers being directly accepted as inputs.
* Both ''NVMM direct mapping mode'' and ''Unified memory allocator mode'' modes are supported for Jetpack 3.0, however ''NVMM direct mapping mode'' is not supported for Jetpack 3.1.
 
  
}}
+
{{GstCUDA/Foot|previous=|next=Supported Platforms}}

Latest revision as of 09:36, 24 April 2023







GstCUDA project general characteristics

GstCUDA characteristics:

  • Easy integration of CUDA algorithms into GStreamer pipelines.
  • Abstracts the complexity of both CUDA and GStreamer, allowing the developer to focus on the CUDA algorithm.
  • Optimized performance for GStreamer/CUDA applications on Jetson platforms.
  • Support for PC systems (x86 architecture) that have NVIDIA GPUs.

Key Features

GstCUDA key features:

  • Offers a framework allowing users to develop custom GStreamer elements that can execute any CUDA algorithm. The framework consists of a series of base classes that abstract the complexity of GStreamer and CUDA integration.
  • Zero memory copy interface between CUDA and GStreamer on Jetson family platforms (TX1, TX2, Xavier, Nano and Orin).
  • GstCUDA supports two modes of memory handling:
    • NVMM direct mapping mode: uses the GstCUDA APIs to directly handle NVMM memory buffers. This method provides the best possible performance on the Tegra platforms.
    • Unified memory allocator mode: avoids the use of NVMM memory buffers by providing a memory allocator that passes the buffer directly to the GPU, which avoids memory copies while maintaining excellent performance. This mode has lower performance than the "NVMM direct mapping mode". The Unified memory allocator is used in conjunction with V4L2 and user-space buffers.
    • The two memory handling modes allow GstCUDA to support NVMM buffers, V4L2 buffers and user-space buffers.
  • Supports heavy CUDA algorithms and large amounts of data to be processed on the GPU without performance being affected due to copies or memory conversions. Handles up to 2x 4K 60fps streams simultaneously with "NVMM direct mapping mode" and 2x 4K 40fps streams simultaneously with "Unified memory allocator mode".
  • Provides a set of quick-prototyping video filter GStreamer elements, with different input/output combinations, that allow video frames to be processed by the GPU using a custom CUDA library algorithm. These elements execute the CUDA algorithm from a custom CUDA library that is loaded dynamically at run time; the library is passed to the GstCUDA element by setting an element property. The user can choose among the provided elements to find the one that best matches the project requirements. This is ideal for quick prototyping because the CUDA algorithm is separated from the GStreamer element: the user can modify the CUDA algorithm, recompile the custom CUDA library and run the GStreamer pipeline again to test it. Run-time linking allows the CUDA algorithm to be swapped out or updated without rebuilding any of the GStreamer source.
  • Provides integrated add-on elements, each consisting of a complete shared library that executes a specific CUDA algorithm. These add-on elements are based on the GstCUDA framework and show the potential of the framework for building a final product.
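
To give a sense of scale for the stream figures quoted in the list above, the following rough sketch estimates the raw data rate of two simultaneous 4K streams. It is only back-of-the-envelope arithmetic, not part of GstCUDA. The resolution (3840x2160 for "4K") and the pixel format (I420, 12 bits per pixel) are assumptions made for illustration; the page does not state either, and the function name is hypothetical.

```python
# Rough estimate of the raw video data rate for N simultaneous streams.
# Assumptions (not stated on this page): "4K" means 3840x2160 and frames
# use the I420 pixel format, which needs 12 bits per pixel.
WIDTH = 3840
HEIGHT = 2160
BITS_PER_PIXEL = 12


def stream_bytes_per_second(fps, streams=2):
    """Return the raw bytes per second for `streams` streams at `fps`."""
    frame_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
    return frame_bytes * fps * streams


if __name__ == "__main__":
    # Frame rates taken from the feature list: 60 fps (NVMM direct mapping
    # mode) and 40 fps (Unified memory allocator mode), two streams each.
    for label, fps in (("NVMM direct mapping", 60), ("Unified memory allocator", 40)):
        rate = stream_bytes_per_second(fps)
        print(f"{label}: 2x 4K @ {fps} fps = {rate / 1e9:.2f} GB/s")
```

Under these assumptions the two cases come to roughly 1.49 GB/s and 1.00 GB/s of raw frame data. Every additional copy of a frame between GStreamer and CUDA would add the same amount of memory traffic again, which is why avoiding copies matters at these rates.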

Limitations

The current release has the following limitations and known bugs:

  • Earlier releases supported only the NVIDIA Jetson family platforms (TX1, TX2, Xavier and Nano). Note: the new release also supports PC systems that have NVIDIA GPUs (x86 architecture).
  • EGL memory type buffers are not yet accepted directly as inputs; there are plans to add this support in the future.

