Revision as of 12:52, 26 December 2018

The Intel® Movidius™ Neural Compute SDK (Intel® Movidius™ NCSDK) enables deployment of deep neural networks (DNNs) on compatible devices such as the Intel® Movidius™ Neural Compute Stick. The NCSDK includes a set of software tools to compile, profile, and validate DNNs, as well as C/C++ and Python APIs for application development.

To use the NCSDK with Gst-Inference, configure R2Inference with the --enable-ncsdk flag and set the property backend=ncsdk on the Gst-Inference plugins.

Installation

You can install the NCSDK directly on a Linux system, in a Docker container, on a virtual machine, or in a Python virtual environment. All of these installation paths are documented in the official installation guide.

We also provide an installation guide with troubleshooting tips on the Intel Movidius Installation wiki page.

Generating a graph

When you use the ncsdk backend, you need a compiled NCS graph file. You can generate this file from TensorFlow's protobuf and weights files, or from Caffe's prototxt and caffemodel files. mvNCCompile is a tool included with the NCSDK installation that compiles a network and produces a graph file compatible with the NCAPI and with the Gst-Inference plugins that use the ncsdk backend.

For example, given a Caffe model (bvlc_googlenet.caffemodel) and a network description (deploy.prototxt), where -w selects the weights file and -s the number of shaves:

mvNCCompile -w bvlc_googlenet.caffemodel -s 12 deploy.prototxt

This command outputs the graph and output_expected.npy files, which can be used later with the googlenet plugin.
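
As a quick sanity check after compiling, a small shell helper can confirm that both files exist before a plugin is pointed at them. This is only a sketch: the function name check_ncsdk_graph is hypothetical (it is not part of the NCSDK), and it assumes mvNCCompile wrote graph and output_expected.npy to the directory passed as the first argument (the current directory by default).

```shell
#!/bin/sh
# Hypothetical helper (not part of the NCSDK): verify that the files
# mvNCCompile is expected to produce exist and are not empty.
check_ncsdk_graph() {
    dir="${1:-.}"
    for f in graph output_expected.npy; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            return 1
        fi
    done
    echo "ok: $dir/graph"
}

# Example: check_ncsdk_graph /path/to/compile/dir
```

The helper returns a non-zero status when a file is missing, so it can gate a later gst-launch-1.0 call in a script.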

Options

You can find the full documentation of the C API here and the Python API here. Gst-Inference uses only the C API and R2Inference takes care of devices, graphs, models and fifos. Because of this, we will only take a look at the options that you can change when using the C API through R2Inference.

The following syntax is used to change backend options on Gst-Inference plugins:

backend::<property>

For example, to change the NCSDK API log level of the googlenet plugin, run the pipeline like this:

gst-launch-1.0 \
googlenet name=net model-location=/root/r2inference/examples/r2i/ncsdk/graph_googlenet backend=ncsdk backend::log-level=1 \
videotestsrc ! tee name=t \
t. ! queue ! videoconvert ! videoscale ! net.sink_model \
t. ! queue ! net.sink_bypass \
net.src_bypass ! fakesink

The backend::log-level=1 section of the pipeline sets the NC_RW_LOG_LEVEL option of the NCSDK C API to 1.

To learn more about the NCSDK C API options, please check the NCSDK wiki page on the R2Inference subwiki.
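
The backend::<property> syntax is easy to generate from a script. The sketch below only builds and prints an element description; it does not launch a pipeline. The function name ncsdk_element and its argument order are hypothetical, and log-level is the only writable option taken from this page.

```shell
#!/bin/sh
# Hypothetical helper: print a Gst-Inference element description that
# selects the ncsdk backend and forwards backend options.
# Usage: ncsdk_element <element> <graph-path> [property=value ...]
ncsdk_element() {
    element="$1"
    graph="$2"
    shift 2
    desc="$element name=net model-location=$graph backend=ncsdk"
    for opt in "$@"; do
        # Each extra argument becomes backend::<property>=<value>
        desc="$desc backend::$opt"
    done
    echo "$desc"
}

# Example (prints the description; pass it to gst-launch-1.0 yourself):
# ncsdk_element googlenet graph_googlenet log-level=1
```

Keeping the description in one place makes it harder to forget backend=ncsdk when backend options are added or removed.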

Device Options

All device options are read-only.

Property	C API Counterpart	Value	Description
thermal-throttling-level	NC_RO_THERMAL_THROTTLING_LEVEL	Integer (0,1,2)	0: No limit reached. 1: Lower temperature guard threshold reached. 2: Upper temperature guard threshold reached.
device-state	NC_RO_DEVICE_STATE	Integer (0,1,2,3)	The current state of the device: 0: CREATED: The struct has been initialized. 1: OPENED: The device communication has been opened. 2: CLOSED: Communication with the device has been closed. 3: DESTROYED: The device handler has been freed.
current-memory-used	NC_RO_DEVICE_CURRENT_MEMORY_USED	Integer	Current memory used on the device.
memory-size	NC_RO_DEVICE_MEMORY_SIZE	Integer	Total memory available on the device.
max-fifo-num	NC_RO_DEVICE_MAX_FIFO_NUM	Integer	Max number of fifos.
allocated-fifo-num	NC_RO_DEVICE_ALLOCATED_FIFO_NUM	Integer	Number of fifos currently allocated.
max-graph-num	NC_RO_DEVICE_MAX_GRAPH_NUM	Integer	Max number of graphs.
allocated-graph-num	NC_RO_ALLOCATED_GRAPH_NUM	Integer	Number of graphs currently allocated.
option-class-limit	NC_RO_DEVICE_OPTION_CLASS_LIMIT	Integer	Highest option class supported.
device-name	NC_RO_DEVICE_NAME	String	Device name.

Fifo Options

Most of the R/W options on the FIFO can only be modified between creation and allocation. R2Inference does both in a single method (Engine->Start()), so these options cannot be written. R2Inference also fixes those options to match its own implementation, so they are not exposed on the plugin.

Global Options

Pay special attention to the log level enumeration, because its ordering is counterintuitive: 1 is the most verbose level, 4 is the least verbose, and 0 is the default.

Property	C API Counterpart	Value	Description
log-level	NC_RW_LOG_LEVEL	Integer	NCSDK debug log level from ncLogLevel_t enum 0: NC_LOG_DEBUG: Debug, warning, error and fatal. 1: NC_LOG_INFO: Info, debug, warning, error and fatal. 2: NC_LOG_WARN: Warning, error and fatal. 3: NC_LOG_ERROR: Error and fatal. 4: NC_LOG_FATAL: Fatal only.
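
Because the numbering is easy to misread, a small lookup can make scripts that set backend::log-level more self-documenting. This is a sketch only: the function name ncsdk_log_level_name is hypothetical, and the mapping is copied from the table above rather than read from the NCSDK headers.

```shell
#!/bin/sh
# Hypothetical helper: translate a backend::log-level value into the
# ncLogLevel_t name listed in the table above.
ncsdk_log_level_name() {
    case "$1" in
        0) echo NC_LOG_DEBUG ;;
        1) echo NC_LOG_INFO ;;
        2) echo NC_LOG_WARN ;;
        3) echo NC_LOG_ERROR ;;
        4) echo NC_LOG_FATAL ;;
        *) echo "unknown log level: $1" >&2; return 1 ;;
    esac
}

# Example: ncsdk_log_level_name 1
```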

Graph Options

Property	C API Counterpart	Value	Description
graph-state	NC_RO_GRAPH_STATE	Integer	The current state of the graph from ncGraphState_t enum 0: NC_GRAPH_CREATED: The struct has been initialized. 1: NC_GRAPH_ALLOCATED: The graph has been allocated. 2: NC_GRAPH_WAITING_FOR_BUFFERS: The graph is waiting for input. 3: NC_GRAPH_RUNNING: The graph is currently running an inference.
graph-input-count	NC_RO_GRAPH_INPUT_TENSOR_DESCRIPTORS	Integer	Array of graph inputs. Returns the size of the array instead of the array itself.
graph-output-count	NC_RO_GRAPH_OUTPUT_TENSOR_DESCRIPTORS	Integer	Array of graph outputs. Returns the size of the array instead of the array itself.
graph-debug-info	NC_RO_GRAPH_DEBUG_INFO	String	Debug information.
graph-name	NC_RO_GRAPH_NAME	String	Graph name.
graph-option-class-limit	NC_RO_GRAPH_OPTION_CLASS_LIMIT	Integer	The highest option class supported.
graph-version	NC_RO_GRAPH_VERSION	String	The version ([major, minor]) of the compiled graph.

NCSDK Tools

The NCSDK installation includes some useful tools to analyze, optimize, and compile models. They are summarized here; for examples and a more complete description, please check the NCSDK wiki page on the R2Inference subwiki.

mvNCCheck: Checks the validity of a Caffe or TensorFlow model on a neural compute device. The check runs an inference both on the device and in software, then compares the results to determine whether the network passes or fails.
mvNCCompile: Compiles a network and weights files from Caffe or TensorFlow models into a graph file that is compatible with the NCAPI.
mvNCProfile: Compiles a network, runs it on a connected neural compute device, and outputs profiling information to the terminal and to an HTML file. The profiling data contains the per-layer performance and the execution time of the model. The HTML version of the report also contains a graphical representation of the model.

Difference between revisions of "GstInference/Supported backends/NCSDK"

@@ Line 8: / Line 8: @@
 The NCSDK Intel® Movidius™ Neural Compute SDK (Intel® Movidius™ NCSDK) enables deployment of deep neural networks on compatible devices such as the Intel® Movidius™ Neural Compute Stick. The NCSDK includes a set of software tools to compile, profile, and validate DNNs (Deep Neural Networks) as well as APIs on C/C++ and Python for application development.
-The NCSDK has two general usages:
+To use the ncsdk on Gst-Inference be sure to run the R2Inference configure with the flag <code> --enable-ncsdk </code> and use the property <code> backend=ncsdk </code> on the Gst-Inference plugins.
-*Profiling, tuning, and compiling a DNN models.
-*Prototyping user applications, that run accelerated with a neural compute device hardware, using the NCAPI.
 =Installation=
@@ Line 19: / Line 16: @@
 We also provide an installation guide with troubleshooting on the [[Intel_Movidius_NCSDK_Installation | Intel Movidius Installation wiki page]]
-=Tools=
+=Generating a graph=
-==mvNCCheck==
-Checks the validity of a Caffe or TensorFlow model on a neural compute device. The check is done by running an inference on both the device and in software and then comparing the results to determine a if the network passes or fails. This tool works best with image classification networks. You can check all the available options on the [https://movidius.github.io/ncsdk/tools/check.html official documentation].
-For example lets test the googlenet caffe model downloaded by the [https://github.com/movidius/ncappzoo ncappzoo repo]:
-<syntaxhighlight lang=bash>
+When you use the ncsdk backeng you will need a compiled ncs graph file. You can obtain this file from tensorflow's protobuff and weights filer; or caffe's prototxt and caffemodel files. mvNCCompile is a tool included with the ncsdk installation that
-mvNCCheck -w bvlc_googlenet.caffemodel -i ../../data/images/nps_electric_guitar.png -s 12 -id 546  deploy.prototxt -S 255 -M 110
+compiles a network and produces a graph file that is compatible with the NCAPI and the Gst-Inference plugins using the ncsdk backend.
-</syntaxhighlight>
-* -w indicates the weights file
-* -i the input image
-* -s the number of shaves
-* -id the expected label id for the input image (you can find the id for any imagenet model [https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a here])
-* -S is the scaling sice
-* -M is the substracted mean after scaling
-Most of these parameters are available from the model documentation. The command produces the following result:
-<syntaxhighlight>
-lob generated
-USB: Transferring Data...
-USB: Myriad Execution Finished
-USB: Myriad Connection Closing.
-USB: Myriad Connection Closed.
-Result:  (1000,)
-) 546 0.99609
-) 402 0.0038853
-) 420 8.9228e-05
-) 327 0.0
-) 339 0.0
-Expected:  (1000,)
-) 546 0.99609
-) 402 0.0039177
-) 420 9.0837e-05
-) 889 1.2875e-05
-) 486 5.3644e-06
-------------------------------------------------------------
- Obtained values
-------------------------------------------------------------
- Obtained Min Pixel Accuracy: 0.0032552085031056777% (max allowed=2%), Pass
- Obtained Average Pixel Accuracy: 7.264380030846951e-06% (max allowed=1%), Pass
- Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
- Obtained Pixel-wise L2 error: 0.00011369892179413199% (max allowed=1%), Pass
- Obtained Global Sum Difference: 7.236003875732422e-05
-------------------------------------------------------------
-</syntaxhighlight>
-==mvNCCompile==
-Compiles a network and weights files from Caffe or TensorFlow models into a graph file that is compatible with the NCAPI.
 For example, giving a caffe model (bvlc_googlenet.caffemodel) and a network description (deploy.prototxt):
@@ Line 76: / Line 25: @@
 mvNCCompile -w bvlc_googlenet.caffemodel -s 12 deploy.prototxt
 </syntaxhighlight>
-This command will output the '''graph''' and '''output_expected.npy''' files, that will be used later on the API
+This command will output the '''graph''' and '''output_expected.npy''' files, that can be used later with the googlenet plugin.
-==mvNCProfile==
-Compiles a network, runs it on a connected neural compute device, and outputs profiling info on the terminal and on an HTML file. The profiling data contains layer performance and execution time of the model. The html version of the report also contains a graphical representation of the model.
-For example, to profile the googlenet network:
-<syntaxhighlight lang=bash>
-mvNCProfile deploy.prototxt -s 12
-</syntaxhighlight>
-The output looks like:
-<syntaxhighlight lang=bash>
-mvNCProfile v02.00, Copyright @ Intel Corporation 2017
-****** WARNING: using empty weights ******
-Layer  inception_3b/1x1  forced to im2col_v2, because its output is used in concat
-/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
-Blob generated
-USB: Transferring Data...
-Time to Execute :  115.95  ms
-USB: Myriad Execution Finished
-Time to Execute :  98.03  ms
-USB: Myriad Execution Finished
-USB: Myriad Connection Closing.
-USB: Myriad Connection Closed.
-Network Summary
-Detailed Per Layer Profile
-                                               Bandwidth       time
-#    Name                           MFLOPs      (MB/s)         (ms)
-=======================================================================
-    data                            0.0        55877.1        0.005
-    conv1/7x7_s2                  236.0         2453.0        5.745
-    pool1/3x3_s2                    1.8         1346.8        1.137
-    pool1/norm1                     0.0          711.3        0.538
-    conv2/3x3_reduce               25.7          471.6        0.828
-    conv2/3x3                     693.6          305.9       11.957
-    conv2/norm2                     0.0          771.6        1.488
-    pool2/3x3_s2                    1.4         1403.3        0.818
-    inception_3a/1x1               19.3          554.6        0.560
-    inception_3a/3x3_reduce        28.9          458.3        0.703
-   inception_3a/3x3              173.4          319.2        4.716
-   inception_3a/5x5_reduce         4.8         1035.8        0.283
-   inception_3a/5x5               20.1          716.0        0.872
-   inception_3a/pool               1.4          648.5        0.443
-   inception_3a/pool_proj          9.6          657.0        0.455
-   inception_3b/1x1               51.4          446.0        0.999
-   inception_3b/3x3_reduce        51.4          445.1        1.001
-   inception_3b/3x3              346.8          261.0        8.228
-   inception_3b/5x5_reduce        12.8          879.9        0.453
-   inception_3b/5x5              120.4          536.8        2.510
-   inception_3b/pool               1.8          678.7        0.564
-   inception_3b/pool_proj         25.7          631.2        0.656
-   pool3/3x3_s2                    0.8         1213.8        0.591
-   inception_4a/1x1               36.1          364.0        0.977
-   inception_4a/3x3_reduce        18.1          490.3        0.545
-   inception_4a/3x3               70.4          306.0        2.187
-   inception_4a/5x5_reduce         3.0          763.2        0.254
-   inception_4a/5x5                7.5          455.1        0.414
-   inception_4a/pool               0.8          604.6        0.297
-   inception_4a/pool_proj         12.0          613.0        0.389
-   inception_4b/1x1               32.1          349.6        0.995
-   inception_4b/3x3_reduce        22.5          385.6        0.780
-   inception_4b/3x3               88.5          280.9        2.888
-   inception_4b/5x5_reduce         4.8          576.7        0.373
-   inception_4b/5x5               15.1          339.7        0.885
-   inception_4b/pool               0.9          617.8        0.310
-   inception_4b/pool_proj         12.8          579.5        0.438
-   inception_4c/1x1               25.7          415.5        0.762
-   inception_4c/3x3_reduce        25.7          410.3        0.771
-   inception_4c/3x3              115.6          288.2        3.462
-   inception_4c/5x5_reduce         4.8          574.7        0.374
-   inception_4c/5x5               15.1          339.7        0.885
-   inception_4c/pool               0.9          615.3        0.311
-   inception_4c/pool_proj         12.8          577.3        0.440
-   inception_4d/1x1               22.5          382.9        0.786
-   inception_4d/3x3_reduce        28.9          489.2        0.679
-   inception_4d/3x3              146.3          402.9        2.981
-   inception_4d/5x5_reduce         6.4          728.9        0.305
-   inception_4d/5x5               20.1          408.5        0.979
-   inception_4d/pool               0.9          629.5        0.304
-   inception_4d/pool_proj         12.8          630.8        0.403
-   inception_4e/1x1               53.0          297.7        1.531
-   inception_4e/3x3_reduce        33.1          277.0        1.294
-   inception_4e/3x3              180.6          290.3        4.902
-   inception_4e/5x5_reduce         6.6          492.8        0.466
-   inception_4e/5x5               40.1          378.6        1.322
-   inception_4e/pool               0.9          633.0        0.312
-   inception_4e/pool_proj         26.5          446.8        0.731
-   pool4/3x3_s2                    0.4         1245.4        0.250
-   inception_5a/1x1               20.9          616.4        0.786
-   inception_5a/3x3_reduce        13.0          569.7        0.582
-   inception_5a/3x3               45.2          570.7        1.786
-   inception_5a/5x5_reduce         2.6          329.2        0.391
-   inception_5a/5x5               10.0          459.6        0.601
-   inception_5a/pool               0.4          531.7        0.146
-   inception_5a/pool_proj         10.4          514.9        0.546
-   inception_5b/1x1               31.3          607.0        1.133
-   inception_5b/3x3_reduce        15.7          612.0        0.625
-   inception_5b/3x3               65.0          606.1        2.366
-   inception_5b/5x5_reduce         3.9          375.0        0.410
-   inception_5b/5x5               15.1          475.0        0.866
-   inception_5b/pool               0.4          531.7        0.146
-   inception_5b/pool_proj         10.4          513.7        0.547
-   pool5/7x7_s1                    0.1          405.5        0.236
-   loss3/classifier                0.0         2559.7        0.764
-   prob                            0.0           10.0        0.192
----------------------------------------------------------------------------------------------
-                                                                                                                                                          Total inference time                   93.66
----------------------------------------------------------------------------------------------
-Generating Profile Report 'output_report.html'...
-</syntaxhighlight>
 =Options=
@@ Line 210: / Line 48: @@
 The <code>backend::log-level=1</code> section of the pipeline sets the <code>NC_RW_LOG_LEVEL</code> option of the NCSDK C API to <code>1</code>.
+To learn more about the NCSDK C API option, please check the [[R2Inference/Supported_backends/NCSDK| NCSDK wiki page]] on the R2Inference subwiki.
 ==Device Options==
@@ Line 355: / Line 195: @@
 |}
+= NCSDK Tools=
+The NCSDK installation include some useful tools to analyze, optimize and compile models. We will mention these tools here, but if you want some examples and a more complete description please check the [[R2Inference/Supported_backends/NCSDK| NCSDK wiki page]] on the R2Inference subwiki.
+* '''mvNCCheck''': Checks the validity of a Caffe or TensorFlow model on a neural compute device. The check is done by running an inference on both the device and in software and then comparing the results to determine a if the network passes or fails.
+* '''mvNCCompile''': Compiles a network and weights files from Caffe or TensorFlow models into a graph file that is compatible with the NCAPI.
+* '''mvNCProfile''': Compiles a network, runs it on a connected neural compute device, and outputs profiling info on the terminal and on an HTML file. The profiling data contains layer performance and execution time of the model. The html version of the report also contains a graphical representation of the model.
 <noinclude>
 {{GstInference/Foot|Supported backends|Example pipelines}}
 </noinclude>