Difference between revisions of "GstInference/Supported backends/NCSDK"
Revision as of 11:25, 19 December 2018
Make sure you also check GstInference's companion project: R2Inference
The Intel® Movidius™ Neural Compute SDK (Intel® Movidius™ NCSDK) enables deployment of deep neural networks on compatible devices such as the Intel® Movidius™ Neural Compute Stick. The NCSDK includes a set of software tools to compile, profile, and validate DNNs (Deep Neural Networks), as well as C/C++ and Python APIs for application development.
The NCSDK has two general usages:
- Profiling, tuning, and compiling DNN models.
- Prototyping user applications that run accelerated on neural compute device hardware, using the NCAPI.
Installation
You can install the NCSDK directly on a system running Linux, in a Docker container, on a virtual machine, or in a Python virtual environment. All the possible installation paths are documented in the official installation guide.
Tools
mvNCCheck
Checks the validity of a Caffe or TensorFlow model on a neural compute device. The check runs an inference both on the device and in software, then compares the results to determine whether the network passes or fails. This tool works best with image classification networks. You can check all the available options in the official documentation.
For example, let's test the GoogLeNet Caffe model downloaded by the ncappzoo repo:
mvNCCheck -w bvlc_googlenet.caffemodel -i ../../data/images/nps_electric_guitar.png -s 12 -id 546 deploy.prototxt -S 255 -M 110
- -w indicates the weights file
- -i the input image
- -s the number of shaves
- -id the expected label id for the input image (you can find the id for any imagenet model here)
- -S is the scaling size
- -M is the mean subtracted after scaling
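The arithmetic implied by -S and -M can be sketched as follows. This is only an illustration, not the tool's actual code: it assumes the input pixels start out normalized to the range 0 to 1, and the function name `preprocess` is hypothetical.

```python
def preprocess(pixels, scale=255.0, mean=110.0):
    """Scale each pixel first, then subtract the mean (the order described for -S and -M).

    Assumption: `pixels` holds values already normalized to [0, 1].
    """
    return [p * scale - mean for p in pixels]


# Values taken from the example command: -S 255 -M 110
print(preprocess([0.0, 0.5, 1.0]))
```

With -S 255 and -M 110, a normalized input range of 0 to 1 maps to the range -110 to 145.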
Most of these parameters are available from the model documentation. The command produces the following result:
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (1000,)
1) 546 0.99609
2) 402 0.0038853
3) 420 8.9228e-05
4) 327 0.0
5) 339 0.0
Expected: (1000,)
1) 546 0.99609
2) 402 0.0039177
3) 420 9.0837e-05
4) 889 1.2875e-05
5) 486 5.3644e-06
------------------------------------------------------------
Obtained values
------------------------------------------------------------
Obtained Min Pixel Accuracy: 0.0032552085031056777% (max allowed=2%), Pass
Obtained Average Pixel Accuracy: 7.264380030846951e-06% (max allowed=1%), Pass
Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
Obtained Pixel-wise L2 error: 0.00011369892179413199% (max allowed=1%), Pass
Obtained Global Sum Difference: 7.236003875732422e-05
------------------------------------------------------------
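When mvNCCheck runs as part of a script, the verdict lines above can be collected automatically. The following is a minimal sketch that parses the "Obtained ..." lines in the format shown in this output. The helper names are hypothetical, and the "Fail" keyword is an assumption, since only "Pass" appears in the sample.

```python
import re

# Matches lines such as:
#   Obtained Min Pixel Accuracy: 0.0032% (max allowed=2%), Pass
# "Fail" is assumed to be the counterpart of "Pass".
CHECK_LINE = re.compile(
    r"^Obtained (?P<name>.+?): (?P<value>[0-9.eE+-]+)% "
    r"\(max allowed=(?P<limit>[0-9.]+)%\), (?P<verdict>Pass|Fail)$"
)


def parse_check_metrics(output):
    """Return {metric name: (value, limit, passed)} for every verdict line."""
    metrics = {}
    for line in output.splitlines():
        match = CHECK_LINE.match(line.strip())
        if match:
            metrics[match.group("name")] = (
                float(match.group("value")),
                float(match.group("limit")),
                match.group("verdict") == "Pass",
            )
    return metrics


def all_passed(metrics):
    """True only if at least one metric was found and every one passed."""
    return bool(metrics) and all(passed for _, _, passed in metrics.values())


SAMPLE = """\
Obtained Min Pixel Accuracy: 0.0032552085031056777% (max allowed=2%), Pass
Obtained Average Pixel Accuracy: 7.264380030846951e-06% (max allowed=1%), Pass
Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
Obtained Pixel-wise L2 error: 0.00011369892179413199% (max allowed=1%), Pass
Obtained Global Sum Difference: 7.236003875732422e-05
"""

metrics = parse_check_metrics(SAMPLE)
print(sorted(metrics), all_passed(metrics))
```

The "Global Sum Difference" line carries no threshold or verdict, so the sketch deliberately skips it.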
mvNCCompile
Compiles a network and weights files from Caffe or TensorFlow models into a graph file that is compatible with the NCAPI.
For example, given a Caffe model (bvlc_googlenet.caffemodel) and a network description (deploy.prototxt):
mvNCCompile -w bvlc_googlenet.caffemodel -s 12 deploy.prototxt
This command outputs the graph and output_expected.npy files, which are used later with the API.
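If the compile step needs to run from a Python script, the command above could be wrapped as in the sketch below. It uses only the flags shown in the example (-w and -s); the function names are hypothetical, and the wrapper assumes mvNCCompile is on the PATH.

```python
import shutil
import subprocess


def build_compile_cmd(prototxt, weights, shaves=12):
    """Build the argument list for the mvNCCompile invocation shown above."""
    return ["mvNCCompile", "-w", weights, "-s", str(shaves), prototxt]


def compile_graph(prototxt, weights, shaves=12):
    """Run mvNCCompile; raises if the tool is missing or exits with an error."""
    if shutil.which("mvNCCompile") is None:
        raise FileNotFoundError("mvNCCompile not found on PATH; is the NCSDK installed?")
    return subprocess.run(build_compile_cmd(prototxt, weights, shaves), check=True)


print(" ".join(build_compile_cmd("deploy.prototxt", "bvlc_googlenet.caffemodel")))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file paths.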
mvNCProfile
Compiles a network, runs it on a connected neural compute device, and outputs profiling info to the terminal and to an HTML file. The profiling data contains the per-layer performance and the execution time of the model. For example, to profile the GoogLeNet network:
mvNCProfile deploy.prototxt -s 12
The output looks like:
mvNCProfile v02.00, Copyright @ Intel Corporation 2017
****** WARNING: using empty weights ******
Layer inception_3b/1x1 forced to im2col_v2, because its output is used in concat
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
Time to Execute : 115.95 ms
USB: Myriad Execution Finished
Time to Execute : 98.03 ms
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Network Summary
Detailed Per Layer Profile
Bandwidth time
# Name MFLOPs (MB/s) (ms)
=======================================================================
0 data 0.0 55877.1 0.005
1 conv1/7x7_s2 236.0 2453.0 5.745
2 pool1/3x3_s2 1.8 1346.8 1.137
3 pool1/norm1 0.0 711.3 0.538
4 conv2/3x3_reduce 25.7 471.6 0.828
5 conv2/3x3 693.6 305.9 11.957
6 conv2/norm2 0.0 771.6 1.488
7 pool2/3x3_s2 1.4 1403.3 0.818
8 inception_3a/1x1 19.3 554.6 0.560
9 inception_3a/3x3_reduce 28.9 458.3 0.703
10 inception_3a/3x3 173.4 319.2 4.716
11 inception_3a/5x5_reduce 4.8 1035.8 0.283
12 inception_3a/5x5 20.1 716.0 0.872
13 inception_3a/pool 1.4 648.5 0.443
14 inception_3a/pool_proj 9.6 657.0 0.455
15 inception_3b/1x1 51.4 446.0 0.999
16 inception_3b/3x3_reduce 51.4 445.1 1.001
17 inception_3b/3x3 346.8 261.0 8.228
18 inception_3b/5x5_reduce 12.8 879.9 0.453
19 inception_3b/5x5 120.4 536.8 2.510
20 inception_3b/pool 1.8 678.7 0.564
21 inception_3b/pool_proj 25.7 631.2 0.656
22 pool3/3x3_s2 0.8 1213.8 0.591
23 inception_4a/1x1 36.1 364.0 0.977
24 inception_4a/3x3_reduce 18.1 490.3 0.545
25 inception_4a/3x3 70.4 306.0 2.187
26 inception_4a/5x5_reduce 3.0 763.2 0.254
27 inception_4a/5x5 7.5 455.1 0.414
28 inception_4a/pool 0.8 604.6 0.297
29 inception_4a/pool_proj 12.0 613.0 0.389
30 inception_4b/1x1 32.1 349.6 0.995
31 inception_4b/3x3_reduce 22.5 385.6 0.780
32 inception_4b/3x3 88.5 280.9 2.888
33 inception_4b/5x5_reduce 4.8 576.7 0.373
34 inception_4b/5x5 15.1 339.7 0.885
35 inception_4b/pool 0.9 617.8 0.310
36 inception_4b/pool_proj 12.8 579.5 0.438
37 inception_4c/1x1 25.7 415.5 0.762
38 inception_4c/3x3_reduce 25.7 410.3 0.771
39 inception_4c/3x3 115.6 288.2 3.462
40 inception_4c/5x5_reduce 4.8 574.7 0.374
41 inception_4c/5x5 15.1 339.7 0.885
42 inception_4c/pool 0.9 615.3 0.311
43 inception_4c/pool_proj 12.8 577.3 0.440
44 inception_4d/1x1 22.5 382.9 0.786
45 inception_4d/3x3_reduce 28.9 489.2 0.679
46 inception_4d/3x3 146.3 402.9 2.981
47 inception_4d/5x5_reduce 6.4 728.9 0.305
48 inception_4d/5x5 20.1 408.5 0.979
49 inception_4d/pool 0.9 629.5 0.304
50 inception_4d/pool_proj 12.8 630.8 0.403
51 inception_4e/1x1 53.0 297.7 1.531
52 inception_4e/3x3_reduce 33.1 277.0 1.294
53 inception_4e/3x3 180.6 290.3 4.902
54 inception_4e/5x5_reduce 6.6 492.8 0.466
55 inception_4e/5x5 40.1 378.6 1.322
56 inception_4e/pool 0.9 633.0 0.312
57 inception_4e/pool_proj 26.5 446.8 0.731
58 pool4/3x3_s2 0.4 1245.4 0.250
59 inception_5a/1x1 20.9 616.4 0.786
60 inception_5a/3x3_reduce 13.0 569.7 0.582
61 inception_5a/3x3 45.2 570.7 1.786
62 inception_5a/5x5_reduce 2.6 329.2 0.391
63 inception_5a/5x5 10.0 459.6 0.601
64 inception_5a/pool 0.4 531.7 0.146
65 inception_5a/pool_proj 10.4 514.9 0.546
66 inception_5b/1x1 31.3 607.0 1.133
67 inception_5b/3x3_reduce 15.7 612.0 0.625
68 inception_5b/3x3 65.0 606.1 2.366
69 inception_5b/5x5_reduce 3.9 375.0 0.410
70 inception_5b/5x5 15.1 475.0 0.866
71 inception_5b/pool 0.4 531.7 0.146
72 inception_5b/pool_proj 10.4 513.7 0.547
73 pool5/7x7_s1 0.1 405.5 0.236
74 loss3/classifier 0.0 2559.7 0.764
75 prob 0.0 10.0 0.192
---------------------------------------------------------------------------------------------
Total inference time 93.66
---------------------------------------------------------------------------------------------
Generating Profile Report 'output_report.html'...
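To spot the layers that dominate inference time, the per-layer table can be parsed from the terminal output. The sketch below assumes each row has the five whitespace-separated columns shown above (index, name, MFLOPs, bandwidth in MB/s, time in ms); the function names are hypothetical.

```python
import re

# One table row: index, layer name, MFLOPs, bandwidth (MB/s), time (ms)
ROW = re.compile(r"^(\d+)\s+(\S+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)$")


def parse_layers(report):
    """Extract the per-layer rows from mvNCProfile terminal output."""
    layers = []
    for line in report.splitlines():
        match = ROW.match(line.strip())
        if match:
            layers.append({
                "index": int(match.group(1)),
                "name": match.group(2),
                "mflops": float(match.group(3)),
                "bandwidth_mbs": float(match.group(4)),
                "time_ms": float(match.group(5)),
            })
    return layers


def slowest(layers, count=3):
    """Return the `count` layers with the largest execution time."""
    return sorted(layers, key=lambda layer: layer["time_ms"], reverse=True)[:count]


SAMPLE = """\
# Name MFLOPs (MB/s) (ms)
=======================================================================
0 data 0.0 55877.1 0.005
1 conv1/7x7_s2 236.0 2453.0 5.745
5 conv2/3x3 693.6 305.9 11.957
17 inception_3b/3x3 346.8 261.0 8.228
Total inference time 93.66
"""

for layer in slowest(parse_layers(SAMPLE), count=2):
    print(layer["name"], layer["time_ms"])
```

Header, separator, and total lines do not match the row pattern, so they are ignored.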
API