GstCUDA - cudamux
This page describes in detail the cudamux element of the GstCUDA plugin.
NOTE: This element is under development. It will be ready for the next GstCUDA version release.
Description
Cudamux is a GStreamer video filter element with multiple input pads and a single output pad. It lets video frames be processed on the GPU by a custom CUDA library algorithm. Users develop their own CUDA processing library and pass it to cudamux, which executes it on the GPU: the frames arriving from upstream on each input pad are handed to the GPU, and the modified frames are pushed downstream to the next element in the GStreamer pipeline.
This element executes the CUDA algorithm from a custom CUDA library (XXX.so file) that is loaded dynamically at run time and is passed through an element property. Because the CUDA algorithm is separate from the GStreamer element, the developer can modify the CUDA algorithm, recompile the custom CUDA library and run the GStreamer pipeline again to test the changes. This process can be repeated as many times as needed to debug a custom CUDA algorithm. This makes cudamux well suited to quick prototyping, because it offers flexibility and adaptability to many project requirements.
One key feature of this element is its ability to load the CUDA algorithm that processes the incoming frames on the GPU from an externally compiled custom CUDA library. This keeps the GStreamer element separate from the CUDA algorithm, so the developer does not have to worry about the GStreamer-CUDA interface or complex memory handling, because cudamux takes care of both. Instead, the developer can focus on developing the custom CUDA algorithm and test any change made during debugging by recompiling the CUDA library and running the GStreamer pipeline again, without having to modify, recompile and reinstall the GstCUDA plugin. This feature helps reduce time to market because it considerably accelerates the prototyping stage.
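The edit, recompile and rerun cycle described above can be sketched as a small dry-run helper. This is only an illustration under stated assumptions: the file name stitch.cu is hypothetical, the nvcc line is a generic shared-library build (a real GstCUDA algorithm library may need additional include paths and link flags), and the test pipeline assumes a Jetson-style setup where nvvidconv produces NVMM buffers. Only the location property is taken from the element documentation.

```python
import shlex
from pathlib import Path
from typing import List


def iteration_commands(cuda_source: str) -> List[List[str]]:
    """Build (but do not run) the two commands of one edit-compile-test cycle."""
    library = Path(cuda_source).with_suffix(".so")

    # Assumption: a plain shared-object build with nvcc is enough.
    compile_cmd = ["nvcc", "--shared", "-Xcompiler", "-fPIC",
                   "-o", str(library), cuda_source]

    # Assumption: one test source converted to NVMM I420 feeds sink_0.
    # The plugin itself is never rebuilt; only 'location' points at the
    # freshly compiled library.
    run_cmd = [
        "gst-launch-1.0",
        "cudamux", "name=mux", f"location={library}", "!", "fakesink",
        "videotestsrc", "!", "nvvidconv", "!",
        "video/x-raw(memory:NVMM),format=I420", "!", "mux.sink_0",
    ]
    return [compile_cmd, run_cmd]


if __name__ == "__main__":
    # Print shell-safe command lines instead of executing them.
    for cmd in iteration_commands("stitch.cu"):
        print(shlex.join(cmd))
```

The helper only prints the commands, so it can be tried on a machine without CUDA or GStreamer installed; on a target board the same two commands would be run after each change to the algorithm source.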
Another crucial feature of cudamux is its multiple input/single output pad topology, which makes the element flexible and adaptable to many project requirements. The element has one "Always" source pad and multiple "On request" sink pads. The user is responsible for requesting as many sink pads as the custom CUDA algorithm has inputs. Because this element is intended for quick prototyping, it does not detect a mismatch between the number of requested sink pads and the number of inputs required by the custom CUDA algorithm. The cudamux element builds an array of inputs from the requested sink pads and passes it to the custom CUDA algorithm, according to the template expected by the custom CUDA library. For this reason, the user must make sure that the number of requested sink pads matches the number of inputs defined in the custom CUDA library, otherwise an error will occur.
With its multiple input/single output (MISO) topology, cudamux is a good option for quick prototyping projects that need to interface GStreamer with a CUDA algorithm that takes several inputs and produces one output, for example image stitching, stereoscopic vision (3D vision), high-dynamic-range imaging (HDRI), picture-in-picture overlays, etc.
The cudamux element can be viewed as a generic multiple input/single output video filter element that executes any custom CUDA algorithm provided by the user. This allows the user to develop different CUDA algorithms in parallel and test them with the same cudamux element, just by changing the element property that specifies which CUDA library is loaded during pipeline execution.
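Since the element does not check the pad count itself, a small wrapper can do it before the pipeline is launched. The sketch below builds a gst-launch-1.0 pipeline description in which each mux.sink_N reference requests one pad from the 'sink_%u' template. The function name, the expected_inputs argument, the library path and the videotestsrc/nvvidconv sources are illustrative assumptions, not part of GstCUDA.

```python
from typing import Optional


def cudamux_pipeline(library: str, num_inputs: int,
                     expected_inputs: Optional[int] = None) -> str:
    """Return a gst-launch-1.0 description with num_inputs request sink pads."""
    if num_inputs < 1:
        raise ValueError("cudamux needs at least one sink pad")
    # cudamux will not report this mismatch, so check it up front
    # (expected_inputs is whatever the custom library was written for).
    if expected_inputs is not None and num_inputs != expected_inputs:
        raise ValueError(
            f"library expects {expected_inputs} inputs, "
            f"but {num_inputs} sink pads would be requested")

    parts = [f"cudamux name=mux location={library} ! fakesink"]
    for index in range(num_inputs):
        # Referencing mux.sink_<index> makes gst-launch request that pad.
        # Assumption: nvvidconv is available to produce NVMM I420 buffers.
        parts.append(
            "videotestsrc ! nvvidconv ! "
            "'video/x-raw(memory:NVMM),format=I420' ! "
            f"mux.sink_{index}")
    return " ".join(parts)


if __name__ == "__main__":
    print("gst-launch-1.0 " + cudamux_pipeline("./libstitch.so", 2,
                                               expected_inputs=2))
```

Testing a different algorithm with the same element then only means passing another library path (and, if needed, another input count) to the helper.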
Key features
- Multiple inputs/single output pads filter element topology.
- Dynamic loading of an externally compiled CUDA library that contains the CUDA algorithm to be executed on the GPU to process the incoming frames.
- Independence between the GStreamer element and CUDA algorithm.
- Generic GStreamer element that could execute custom CUDA algorithms.
- Adaptability to many project requirements.
- Ideal for quick prototyping and reducing time to market of project development.
- High performance, due to the zero-memory-copy interface between CUDA and GStreamer.
- Direct handling of NVMM memory type buffers.
Documentation
Element inspect
$ gst-inspect-1.0 cudamux
Factory Details:
  Rank                     none (0)
  Long-name                cudamux
  Klass                    Muxer
  Description              Allows frames to be processed by the GPU using a custom CUDA library algorithm.
                           Multiple input single output topology filter element.
  Author                   Diego Chaverri <diego.chaverri@ridgerun.com>
                           Daniel Garbanzo <daniel.garbanzo@ridgerun.com>
                           Enrique Ramirez <enrique.ramirez@ridgerun.com>
                           Michael Gruner <michael.gruner@ridgerun.com>

Plugin Details:
  Name                     cuda
  Description              Allows frames to be processed by the GPU using a custom CUDA library algorithm
  Filename                 /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstcuda.so
  Version                  0.3.1.1
  License                  Proprietary
  Source module            gst-cuda
  Source release date      2018-01-10 17:43 (UTC)
  Binary package           GStreamer CUDA Plug-in
  Origin URL               Unknown package origin

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstAggregator
                         +----GstCudaBaseMiso
                               +----GstCudaMux

Pad Templates:
  SINK template: 'sink_%u'
    Availability: On request
      Has request_new_pad() function: gst_aggregator_request_new_pad
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw(memory:NVMM)
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]


Element Flags:
  no flags set

Element Implementation:
  Has change_state() function: gst_aggregator_change_state

Element has no clocking capabilities.
Element has no URI handling capabilities.

Pads:
  SRC: 'src'
    Pad Template: 'src'

Element Properties:
  name                : The name of the object
                        flags: readable, writable
                        String. Default: "cudamux0"
  parent              : The parent of the object
                        flags: readable, writable
                        Object of type "GstObject"
  latency             : Additional latency in live mode to allow upstream to take longer to produce buffers for the current position (in nanoseconds)
                        flags: readable, writable
                        Integer64. Range: 0 - 9223372036854775807 Default: 0
  start-time-selection: Decides which start time is output
                        flags: readable, writable
                        Enum "GstAggregatorStartTimeSelection" Default: 0, "zero"
                           (0): zero             - Start at 0 running time (default)
                           (1): first            - Start at first observed input running time
                           (2): set              - Set start time with start-time property
  start-time          : Start time to use if start-time-selection=set
                        flags: readable, writable
                        Unsigned Integer64. Range: 0 - 18446744073709551615 Default: 18446744073709551615
  location            : Location of the CUDA algorithm library to load
                        flags: readable, writable
                        String. Default: null
  in-place            : Use in-place transform mode configuration
                        flags: readable, writable
                        Boolean. Default: false