GstCUDA - cudamux

From RidgeRun Developer Connection







This page describes in detail the cudamux element of the GstCUDA plugin.


NOTE: This element is under development. It will be ready for the next GstCUDA version release.


Description

Cudamux is a GStreamer video filter element with multiple input pads and a single output pad that allows video frames to be processed by the GPU using a custom CUDA library algorithm. Users develop their own CUDA processing library and pass it to cudamux, which executes it on the GPU: upstream frames arriving on each input pad are handed to the GPU, and the processed frames are passed downstream to the next element in the GStreamer pipeline.


This element executes the CUDA algorithm from a custom CUDA library (an XXX.so file) that is loaded dynamically at run time and specified through an element property (location). Because the CUDA algorithm is separate from the GStreamer element, the developer can modify the CUDA algorithm, recompile the custom CUDA library, and run the GStreamer pipeline again to test the changes. This cycle can be repeated as many times as needed to debug a custom CUDA algorithm, which makes cudamux well suited to quick prototyping and adaptable to many project requirements.


One key feature of this element is its ability to load the CUDA algorithm that processes the incoming frames on the GPU from an externally compiled custom CUDA library. This keeps the GStreamer element separate from the CUDA algorithm, so the developer does not have to worry about the GStreamer-CUDA interface or complex memory handling; cudamux takes care of both. Instead, the developer can focus on developing the custom CUDA algorithm and test any change made while debugging by recompiling the CUDA library and running the GStreamer pipeline again, without having to modify, recompile, or reinstall the GstCUDA plugin. This considerably shortens the prototyping stage and therefore helps reduce a project's time to market.
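
The general idea of loading an algorithm from a shared library at run time can be illustrated with a short Python sketch. This is not how cudamux is implemented (the element's internals are not described on this page); it only shows the pattern of opening a .so file by path and looking up a symbol by name. The helper name load_symbol is hypothetical, and the C library's abs() stands in for a CUDA algorithm entry point so that the sketch runs without any CUDA library.

```python
import ctypes
import ctypes.util


def load_symbol(library_path, symbol_name):
    """Open a shared library at run time and return one exported symbol.

    Hypothetical helper for illustration only; cudamux's own loader and the
    symbol names it expects are not documented here.
    """
    # On Linux this maps to dlopen(); a path of None opens the running process.
    library = ctypes.CDLL(library_path)
    # Equivalent to dlsym(); raises AttributeError if the symbol is missing.
    return getattr(library, symbol_name)


if __name__ == "__main__":
    # Stand-in demo: load abs() from the C library instead of a CUDA library.
    c_abs = load_symbol(ctypes.util.find_library("c"), "abs")
    print(c_abs(-3))
```

Because the library is resolved only when the loader runs, rebuilding the .so file and rerunning is enough to pick up a changed algorithm; nothing that performs the loading has to be rebuilt.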


Another important feature of cudamux is its multiple-input/single-output pad topology, which makes the element flexible and adaptable to many project requirements. The element has one "Always" source pad and multiple "On request" sink pads. The user is responsible for requesting as many sink pads as the custom CUDA algorithm has inputs. Because this element is intended for quick prototyping, it does not detect a mismatch between the number of requested sink pads and the number of inputs required by the custom CUDA algorithm. Cudamux builds an array of inputs from the requested sink pads and passes it to the custom CUDA algorithm, following the template expected by the custom CUDA library. It is therefore very important that the user match the number of requested sink pads to the number of inputs defined in the custom CUDA library to avoid errors.
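
As a rough illustration of keeping the pad count in step with the algorithm's inputs, the Python sketch below builds a gst-launch-1.0 pipeline description with one requested sink pad per input. The pad names (sink_0, sink_1, ...) and the location property come from the element inspect output on this page. Everything else is an assumption made for the example: the helper name build_cudamux_pipeline, the library path, the use of videotestsrc and nvvidconv to produce NVMM I420 frames, and fakesink as the downstream element. The resulting pipeline has not been verified against a released cudamux.

```python
# Caps taken from the cudamux sink pad template (see the inspect output).
NVMM_CAPS = "video/x-raw(memory:NVMM),format=I420"


def build_cudamux_pipeline(library_path, num_inputs):
    """Return a gst-launch-1.0 description with `num_inputs` cudamux sink pads.

    Hypothetical helper; source and sink elements are placeholders.
    """
    if num_inputs < 1:
        raise ValueError("at least one input is required")
    # The muxer instance; 'location' points at the custom CUDA library.
    parts = [f"cudamux name=mux location={library_path} ! fakesink"]
    for index in range(num_inputs):
        # Linking to mux.sink_<index> requests one pad from the 'sink_%u' template.
        parts.append(
            f"videotestsrc ! nvvidconv ! '{NVMM_CAPS}' ! mux.sink_{index}"
        )
    return " ".join(parts)


if __name__ == "__main__":
    # Two inputs, e.g. for a hypothetical stereo algorithm library.
    print("gst-launch-1.0 " + build_cudamux_pipeline("/path/to/libalgorithm.so", 2))
```

Deriving the branches from a single num_inputs value keeps the number of requested pads tied to one place, which is useful here because cudamux itself does not report a mismatch.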


With its multiple-input/single-output (MISO) topology, cudamux is well suited to quick prototyping projects that need to interface GStreamer with a CUDA algorithm requiring several inputs and one output, for example image stitching, stereoscopic (3D) vision, high-dynamic-range imaging (HDRI), and picture-in-picture overlays.


Cudamux can be viewed as a generic multiple-input/single-output video filter element that executes any custom CUDA algorithm provided by the user. This allows the user to develop several CUDA algorithms in parallel and test them with the same cudamux element, simply by changing the element property that specifies which CUDA library to load during pipeline execution.


Key features

  • Multiple-input/single-output pad filter element topology.
  • Dynamic loading of an externally compiled CUDA library that contains the CUDA algorithm to be executed on the GPU to process the incoming frames.
  • Independence between the GStreamer element and the CUDA algorithm.
  • Generic GStreamer element that can execute custom CUDA algorithms.
  • Adaptability to many project requirements.
  • Ideal for quick prototyping and for reducing a project's time to market.
  • High performance, thanks to a zero-memory-copy interface between CUDA and GStreamer.
  • Direct handling of NVMM memory type buffers.


Documentation

Cudamux documentation.


Element inspect

$ gst-inspect-1.0 cudamux
Factory Details:
  Rank                     none (0)
  Long-name                cudamux
  Klass                    Muxer
  Description              Allows frames to be processed by the GPU using a custom CUDA library algorithm.
                           Multiple input single output topology filter element.
  Author                   Diego Chaverri <diego.chaverri@ridgerun.com>
                           Daniel Garbanzo <daniel.garbanzo@ridgerun.com>
                           Enrique Ramirez <enrique.ramirez@ridgerun.com>
                           Michael Gruner <michael.gruner@ridgerun.com>

Plugin Details:
  Name                     cuda
  Description              Allows frames to be processed by the GPU using a custom CUDA library algorithm
  Filename                 /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstcuda.so
  Version                  0.3.1.1
  License                  Proprietary
  Source module            gst-cuda
  Source release date      2018-01-10 17:43 (UTC)
  Binary package           GStreamer CUDA Plug-in
  Origin URL               Unknown package origin

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstAggregator
                         +----GstCudaBaseMiso
                               +----GstCudaMux

Pad Templates:
  SINK template: 'sink_%u'
    Availability: On request
      Has request_new_pad() function: gst_aggregator_request_new_pad
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw(memory:NVMM)
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]


Element Flags:
  no flags set

Element Implementation:
  Has change_state() function: gst_aggregator_change_state

Element has no clocking capabilities.
Element has no URI handling capabilities.

Pads:
  SRC: 'src'
    Pad Template: 'src'

Element Properties:
  name                : The name of the object
                        flags: readable, writable
                        String. Default: "cudamux0"
  parent              : The parent of the object
                        flags: readable, writable
                        Object of type "GstObject"
  latency             : Additional latency in live mode to allow upstream to take longer to produce buffers for the current position (in nanoseconds)
                        flags: readable, writable
                        Integer64. Range: 0 - 9223372036854775807 Default: 0
  start-time-selection: Decides which start time is output
                        flags: readable, writable
                        Enum "GstAggregatorStartTimeSelection" Default: 0, "zero"
                           (0): zero             - Start at 0 running time (default)
                           (1): first            - Start at first observed input running time
                           (2): set              - Set start time with start-time property
  start-time          : Start time to use if start-time-selection=set
                        flags: readable, writable
                        Unsigned Integer64. Range: 0 - 18446744073709551615 Default: 18446744073709551615
  location            : Location of the CUDA algorithm library to load
                        flags: readable, writable
                        String. Default: null
  in-place            : Use in-place transform mode configuration
                        flags: readable, writable
                        Boolean. Default: false

