Compiling neon (NervanaSystems) deep learning library source

Nervana Systems' neon is an open source, Python-based language and set of libraries for developing deep learning models. Neon is super fast, super powerful, and super easy to use!

github: https://github.com/NervanaSystems/neon

I built it from source on my machine with the following setup:

  • OS: Ubuntu 16.04
  • GPU: CUDA 8.0
  • NVIDIA SMI: 375.20

Build Source:

  • $ git clone https://github.com/NervanaSystems/neon
  • $ cd neon
  • $ make

How to run:

  • $ pwd
    • Note: Make sure you are in the neon folder
  • neon$ . .venv/bin/activate
    • This opens a Python virtualenv session for you
    • You can run Jupyter Notebook from here:
      • $ jupyter notebook
  • To close the active neon virtualenv session:
    • neon$ deactivate
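
To confirm that the virtualenv is really active before starting a notebook, a small check like the one below can help. This is only a sketch: the function name in_virtualenv is my own, and it relies on the standard sys.prefix / sys.base_prefix attributes (plus sys.real_prefix, which older virtualenv releases set).

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv make sys.base_prefix differ from sys.prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Run inside the activated session, the prefix should point into the neon folder.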

Here are a few problems I ran into, along with their solutions:

PROBLEM [1]

c++: error: unrecognized command line option ‘-Wdate-time’
c++: error: unrecognized command line option ‘-fstack-protector-strong’
c++: error: unrecognized command line option ‘-Wdate-time’
c++: error: unrecognized command line option ‘-fstack-protector-strong’
error: command ‘c++’ failed with exit status 1


Command "/home/avkash/toolkit/neon/.venv2/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-P52OJy/pycuda/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-71rowS-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/avkash/toolkit/neon/.venv2/include/site/python2.7/pycuda" failed with error code 1 in /tmp/pip-build-P52OJy/pycuda/
Makefile:116: recipe for target ‘.venv2/bin/activate’ failed
make: *** [.venv2/bin/activate] Error 1

SOLUTION:

  • Switch gcc/g++ to version 5.0 or above; the compiler that produced the errors above does not recognize these flags.
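
Before running make, the compiler version can be checked with a few lines of Python. This is a rough sketch, not part of the neon build: the helper names are my own, the minimum of 5 comes from the solution above, and it assumes gcc is on the PATH and answers to the standard -dumpversion flag.

```python
import re
import subprocess


def parse_major_version(version_text):
    """Extract the major version number from text such as '5.4.0'."""
    match = re.search(r"(\d+)(?:\.\d+)*", version_text)
    if match is None:
        raise ValueError("no version number found in %r" % version_text)
    return int(match.group(1))


def compiler_is_new_enough(version_text, minimum_major=5):
    """Return True when the major version is at least minimum_major."""
    return parse_major_version(version_text) >= minimum_major


def installed_gcc_version(compiler="gcc"):
    """Ask the compiler for its version; return None if it cannot be run."""
    try:
        out = subprocess.check_output([compiler, "-dumpversion"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.decode().strip()


if __name__ == "__main__":
    version = installed_gcc_version()
    if version is None:
        print("gcc not found")
    elif compiler_is_new_enough(version):
        print("gcc", version, "should be fine")
    else:
        print("gcc", version, "is too old, switch to 5.0 or above")
```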

PROBLEM [2]

unable to execute ‘clang++’: No such file or directory
unable to execute ‘clang++’: No such file or directory
unable to execute ‘clang++’: No such file or directory


Command "/home/avkash/toolkit/neon/.venv2/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-U3oScE-build/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-le8M0S-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/avkash/toolkit/neon/.venv2/include/site/python2.7/nervana-aeon" failed with error code 1 in /tmp/pip-U3oScE-build/
Makefile:157: recipe for target ‘aeon_install’ failed

SOLUTION:

You need to install clang to solve this problem:
$ sudo apt-get install clang

PROBLEM [3]

loader/src/util.hpp:24:10: fatal error: ‘sox.h’ file not found
#include <sox.h>

SOLUTION:

You need to install the sox libraries:

  • $ sudo apt-get install libsox-fmt-all libsox-dev sox
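
Problems 2 and 3 can be caught before the build starts. The sketch below (written for Python 3, since it uses shutil.which) looks for clang++ on the PATH and for sox.h in a couple of include folders. The function name and the list of include folders are my assumptions; adjust them if your headers live elsewhere.

```python
import os
import shutil


def missing_prerequisites(include_dirs=("/usr/include", "/usr/local/include")):
    """Return a list of build prerequisites that could not be found."""
    missing = []
    # Problem 2: the aeon install step calls clang++.
    if shutil.which("clang++") is None:
        missing.append("clang++ (sudo apt-get install clang)")
    # Problem 3: the loader includes <sox.h>.
    has_sox = any(os.path.isfile(os.path.join(d, "sox.h")) for d in include_dirs)
    if not has_sox:
        missing.append("sox.h (sudo apt-get install libsox-dev)")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        for item in problems:
            print("missing:", item)
    else:
        print("clang++ and sox.h found")
```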

Successful Build:

neon$ make

Installed /home/avkash/toolkit/neon
Processing dependencies for neon==1.7.0
Finished processing dependencies for neon==1.7.0

…..
make[1]: Entering directory ‘/home/avkash/toolkit/neon/loader’
make[1]: ‘bin/loader.so’ is up to date.
make[1]: Leaving directory ‘/home/avkash/toolkit/neon/loader’

Have fun!!!

 

DS101: Distribution types and the class of problems they apply to

The “binomial” and “multinomial” distributions work only with CLASSIFICATION problems. All other distributions are for REGRESSION problems.

Here is a generic table of common distribution types:

Distribution    Type
Binomial        Classification
Multinomial     Classification
Bernoulli       Regression
Gaussian        Regression
Poisson         Regression
Gamma           Regression
Tweedie         Regression
Laplace         Regression
Quantile        Regression
Huber           Regression

So if your response column is numeric and you select the binomial or multinomial distribution, you will get an error. The response column must be an enum (categorical) for these two distributions. You can convert a numeric column to ENUM first and then run the algorithm to build a classification model.
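
The rule above can be sketched in plain Python. This is an illustration only, not the API of any machine learning library: problem_type, check_response and to_enum are made-up helper names, and "enum" is modelled simply as a list of string labels.

```python
CLASSIFICATION_DISTRIBUTIONS = {"binomial", "multinomial"}


def problem_type(distribution):
    """Map a distribution name to the class of problem it applies to."""
    if distribution.lower() in CLASSIFICATION_DISTRIBUTIONS:
        return "classification"
    return "regression"


def is_numeric(values):
    """True when every value is an int or float (booleans excluded)."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def check_response(distribution, response_values):
    """Raise ValueError when a classification distribution gets a numeric response."""
    if problem_type(distribution) == "classification" and is_numeric(response_values):
        raise ValueError(
            "%s needs an enum response column; convert it first" % distribution
        )


def to_enum(response_values):
    """Turn numeric labels into categorical (string) levels."""
    return [str(v) for v in response_values]


if __name__ == "__main__":
    labels = [0, 1, 1, 0]
    try:
        check_response("binomial", labels)
    except ValueError as err:
        print("error:", err)
    check_response("binomial", to_enum(labels))  # passes after conversion
    print("gaussian is a", problem_type("gaussian"), "distribution")
```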

Hot reads for this week in machine learning and deep learning

December – 27th – 31st December

December – 19th – 26th December

December – 12th – 18th December

December – 5th – 11th December

November – 28th – 4th December

MIT Tech Review  Python Image Research dd

November – 21st – 27th

November – 14th – 20th

Shoot like an artist – Using imagination, artificial intelligence, Tensorflow (& GPU)

After I got OpenCV, MXNet and TensorFlow working with CUDA, I looked for a TensorFlow implementation of the “A Neural Algorithm of Artistic Style” research paper. I found the TensorFlow implementation by Anish and started from there.

Why Tensorflow:

  • TensorFlow supports automatic differentiation and has a clean API
  • The steps of the research paper translate directly into code
  • It supports GPU (CUDA), so I can get work done faster (time is $$)

 

Prerequisites:

  • Ubuntu 16.04
  • Python 2.7
  • Tensorflow with GPU support

Commands:

  • Command for help:
    • $ python neural_style.py --help
  • Basic command:
    • $ python neural_style.py --content your_content.jpg --style your_style.jpg --output output_file_name.png --iteration 500
  • You can also feed a previous output back in as the style image to get improved results:
    • $ python neural_style.py --content your_content.jpg --style your_previous_output.png --output new_output_file_name.png --iteration 500

A few things I found:

  • If both the content and style images are PNG files, you may get the following error:
    • ValueError: Dimensions must be equal, but are 4 and 3
  • To solve it, use JPG files for both the content and style images.
  • If your machine is low on memory, shrink both the content and style images to under 480×480.
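
My guess is that the "4 and 3" mismatch comes from a PNG carrying an alpha channel (4 channels) where 3 color channels are expected; that is an assumption on my part, not something the error message states. The sketch below reads the color type from a PNG header to tell whether a file declares alpha, and works out a target size that fits within 480×480 while keeping the aspect ratio. Both helper names are my own.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_alpha(header_bytes):
    """Return True if the PNG header declares an alpha channel.

    Needs the first 26 bytes of the file: the 8-byte signature, the IHDR
    chunk length and type (4 bytes each), then width, height, bit depth
    and color type.
    """
    if len(header_bytes) < 26 or header_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG header")
    if header_bytes[12:16] != b"IHDR":
        raise ValueError("IHDR chunk missing")
    _width, _height, _depth, color_type = struct.unpack(">IIBB", header_bytes[16:26])
    # Color type 4 is grayscale with alpha, 6 is RGB with alpha.
    return color_type in (4, 6)


def fit_within(width, height, max_side=480):
    """Scale (width, height) down to fit max_side x max_side, keeping the ratio."""
    scale = min(1.0, float(max_side) / max(width, height))
    return max(1, int(round(width * scale))), max(1, int(round(height * scale)))


if __name__ == "__main__":
    # Build a fake RGBA header instead of reading a real file.
    header = PNG_SIGNATURE + struct.pack(">I4sIIBB", 13, b"IHDR", 1920, 1080, 8, 6)
    print("has alpha:", png_has_alpha(header))
    print("target size:", fit_within(1920, 1080))
```

For a real image, pass the first 26 bytes read from the file opened in binary mode.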

Example:

Top left: after 500 iterations; top right: after 2000 iterations; bottom: after 3500 iterations.

 

Tensorflow with CUDA/cuDNN on Ubuntu 16.04

Environment:

  • OS: Ubuntu 16.04
  • Python 2.7
  • CUDA 8.0.27
  • cuDNN v5.1
  • Note: For TensorFlow with GPU support, both NVIDIA’s CUDA Toolkit (>= 7.0) and cuDNN (>= v3) need to be installed.

GPU verification:

$ nvidia-smi
Tue Nov 22 04:28:59 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 370.28                 Driver Version: 370.28                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520           Off  | 0000:00:03.0     Off |                  N/A |
| N/A   43C    P0     1W / 125W |      0MiB /  4036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

CUDA Toolkit verification:

$ cat /usr/local/cuda/version.txt
CUDA Version 8.0.27
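
The same check can be scripted. Here is a small sketch that parses the text of version.txt and compares it with the minimum of 7.0 mentioned in the note above. The helper names are mine, and it assumes the file keeps the "CUDA Version x.y.z" wording shown here.

```python
import re


def parse_cuda_version(text):
    """Turn 'CUDA Version 8.0.27' into the tuple (8, 0, 27)."""
    match = re.search(r"CUDA Version (\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no CUDA version found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum=(7, 0)):
    """Compare version tuples, padding the shorter one with zeros."""
    padded = version + (0,) * (len(minimum) - len(version))
    return padded >= minimum


if __name__ == "__main__":
    sample = "CUDA Version 8.0.27"
    version = parse_cuda_version(sample)
    print(version, "ok" if meets_minimum(version) else "too old")
```

To check the real installation, read /usr/local/cuda/version.txt and pass its contents to parse_cuda_version.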

CuDNN Verification:

Download cudnn-8.0-linux-x64-v5.1.tgz from the NVIDIA developer site.

  • $ tar -xvzf cudnn-8.0-linux-x64-v5.1.tgz
  • Note: the archive extracts into a local cuda folder:
  • cuda
    • include/
      • cudnn.h
    • lib64/
      • libcudnn.so -> libcudnn.so.5*
      • libcudnn.so.5 -> libcudnn.so.5.1.5*
      • libcudnn.so.5.1.5*

You just need to copy the cuDNN cudnn.h and lib64 files into the CUDA toolkit at /usr/local/cuda as below:

sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
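
After copying, it is worth confirming that the files landed where expected. The sketch below only checks that cudnn.h and at least one libcudnn* file exist under the toolkit folder; the function name and the default path argument are my own choices, with /usr/local/cuda taken from the commands above.

```python
import glob
import os


def missing_cudnn_files(cuda_home="/usr/local/cuda"):
    """Return the cuDNN paths (or patterns) that are absent under cuda_home."""
    missing = []
    header = os.path.join(cuda_home, "include", "cudnn.h")
    if not os.path.isfile(header):
        missing.append(header)
    # Any libcudnn.so, libcudnn.so.5, libcudnn.so.5.1.5 ... counts as present.
    lib_pattern = os.path.join(cuda_home, "lib64", "libcudnn*")
    if not glob.glob(lib_pattern):
        missing.append(lib_pattern)
    return missing


if __name__ == "__main__":
    problems = missing_cudnn_files()
    if problems:
        for path in problems:
            print("missing:", path)
    else:
        print("cuDNN header and libraries found")
```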

Adding the CUDA binaries to the path:

export PATH=${PATH}:/usr/local/cuda/bin

Tensorflow Install:

$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0-cp27-none-linux_x86_64.whl
$ sudo pip install --upgrade $TF_BINARY_URL

Tensorflow Verification:

>>> import tensorflow as tf

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
>>>
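
If you save that start-up log to a file, a few lines of Python can confirm that every CUDA library was opened. This is just a sketch: the function names are mine, the list of expected libraries is copied from the log above, and the pattern assumes the log keeps the "successfully opened CUDA library ... locally" wording.

```python
import re

EXPECTED_LIBRARIES = (
    "libcublas.so",
    "libcudnn.so",
    "libcufft.so",
    "libcuda.so.1",
    "libcurand.so",
)


def opened_cuda_libraries(log_text):
    """Collect the library names reported as successfully opened."""
    pattern = r"successfully opened CUDA library (\S+) locally"
    return set(re.findall(pattern, log_text))


def unopened_libraries(log_text, expected=EXPECTED_LIBRARIES):
    """Return the expected libraries that the log does not mention."""
    opened = opened_cuda_libraries(log_text)
    return [name for name in expected if name not in opened]


if __name__ == "__main__":
    sample = (
        "I tensorflow/stream_executor/dso_loader.cc:111] "
        "successfully opened CUDA library libcublas.so locally\n"
        "I tensorflow/stream_executor/dso_loader.cc:111] "
        "successfully opened CUDA library libcudnn.so locally\n"
    )
    print("not opened:", unopened_libraries(sample))
```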

Have fun !!

Building opencv with Nvidia CUDA 8.0 extensions on Ubuntu 16.04

Note: To build OpenCV without CUDA, you just need to follow this blog post:

https://aichamp.wordpress.com/2016/11/11/compiling-opencv-in-ubuntu-16-04-with-gcc-4-9/

Here are the OpenCV libs without CUDA/GPU:

$ pkg-config --libs opencv

-L/usr/local/lib -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_cvv -lopencv_dnn -lopencv_dpm -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_surface_matching -lopencv_tracking -lopencv_datasets -lopencv_text -lopencv_face -lopencv_xfeatures2d -lopencv_shape -lopencv_video -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core

Get NVIDIA CUDA installed first.

Getting Source:

Get opencv-master and opencv_contrib from git and keep them in the same path.

Build Process (Step 1) – Config Build:

  • $ mkdir build;cd build
  • cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D WITH_TBB=ON -D WITH_V4L=ON -D WITH_QT=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D OPENCV_EXTRA_MODULES_PATH=/mnt/avkash/opencv_contrib/modules -D BUILD_EXAMPLES=ON -DCMAKE_C_COMPILER=gcc-4.9 -DCMAKE_CXX_COMPILER=g++-4.9 -DCUDA_CUDA_LIBRARY=/usr/local/cuda -DWITH_CUDA=ON -DENABLE_FAST_MATH=1 -D CUDA_FAST_MATH=1 -D WITH_CUBLAS=1 -D WITH_OPENGL=ON  ..

Note: If the CUDA configuration is correct, the output of the above step will include the following:

-- CUDA detected: 8.0
-- CUDA NVCC target flags: -gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-D_FORCE_INLINES

Error if the CUDA path is not set correctly:

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_CUDA_LIBRARY (ADVANCED)

Verify CUDA path (-DCUDA_CUDA_LIBRARY=/usr/local/cuda)

Build process Step 2 – Making code:

  • Make sure you are in opencv-master/build folder and step-1 was successful.
  • $ make -j8

Note: It will take about 1-2 hours and about 10GB space so make sure you have enough patience and space 🙂

Possible Errors:

[One] If your build fails with an “opencv_cudaimgproc.dir” error

make[2]: *** [modules/cudaimgproc/CMakeFiles/cuda_compile.dir/src/cuda/cuda_compile_generated_gftt.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs….
Scanning dependencies of target opencv_cudafeatures2d
[ 72%] Building CXX object modules/cudafeatures2d/CMakeFiles/opencv_cudafeatures2d.dir/src/orb.cpp.o
[ 72%] Building CXX object modules/cudafeatures2d/CMakeFiles/opencv_cudafeatures2d.dir/src/feature2d_async.cpp.o
[ 72%] Building CXX object modules/cudafeatures2d/CMakeFiles/opencv_cudafeatures2d.dir/src/brute_force_matcher.cpp.o
[ 72%] Building CXX object modules/cudafeatures2d/CMakeFiles/opencv_cudafeatures2d.dir/src/fast.cpp.o
[ 72%] Linking CXX shared library ../../lib/libopencv_cudafeatures2d.so
[ 72%] Built target opencv_cudafeatures2d
Scanning dependencies of target opencv_test_cudafeatures2d
Scanning dependencies of target opencv_perf_cudafeatures2d
[ 72%] Building CXX object modules/cudafeatures2d/CMakeFiles/opencv_test_cudafeatures2d.dir/test/test_features2d.cpp.o
[ 72%] Building CXX object modules/cudafeatures2d/CMakeFiles/opencv_test_cudafeatures2d.dir/test/test_main.cpp.o
[ 72%] Building CXX object modules/cudafeatures2d/CMakeFiles/opencv_perf_cudafeatures2d.dir/perf/perf_features2d.cpp.o
[ 72%] Building CXX object modules/cudafeatures2d/CMakeFiles/opencv_perf_cudafeatures2d.dir/perf/perf_main.cpp.o
[ 72%] Linking CXX executable ../../bin/opencv_perf_cudafeatures2d
CMakeFiles/Makefile2:3590: recipe for target ‘modules/cudaimgproc/CMakeFiles/opencv_cudaimgproc.dir/all’ failed
make[1]: *** [modules/cudaimgproc/CMakeFiles/opencv_cudaimgproc.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs….
[ 72%] Built target opencv_perf_cudafeatures2d
[ 72%] Linking CXX executable ../../bin/opencv_test_cudafeatures2d
[ 72%] Built target opencv_test_cudafeatures2d
Makefile:160: recipe for target ‘all’ failed
make: *** [all] Error 2

Solution:

  • Visit: https://github.com/opencv/opencv/issues/6632
  • Steps:
    • $ git clone https://github.com/thrust/thrust.git
    • $ sudo cp -r thrust/thrust /usr/local/cuda/include
    • This replaces the thrust headers in the cuda/include folder with the updated ones from git
  • Rebuild the code:
    • $ make -j8

[Two] fatal error: stdlib.h: No such file or directory

You may get the following error at the cmake step:

/usr/include/c++/6/cstdlib:75:25: fatal error: stdlib.h: No such file or directory
#include_next <stdlib.h>
^
compilation terminated.

Solution: Add the following parameter to the cmake command:

-DENABLE_PRECOMPILED_HEADERS=OFF

The updated cmake command is:

  • cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D WITH_TBB=ON -D WITH_V4L=ON -D WITH_QT=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D OPENCV_EXTRA_MODULES_PATH=/mnt/avkash/opencv_contrib/modules -D BUILD_EXAMPLES=ON -DCMAKE_C_COMPILER=gcc-4.9 -DCMAKE_CXX_COMPILER=g++-4.9 -DCUDA_CUDA_LIBRARY=/usr/local/cuda -DWITH_CUDA=ON -DENABLE_FAST_MATH=1 -D CUDA_FAST_MATH=1 -D WITH_CUBLAS=1 -D WITH_OPENGL=ON  -DENABLE_PRECOMPILED_HEADERS=OFF ..
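
With this many flags it is easy to lose one when editing the command by hand. One option is to keep the options in a dictionary and generate the command line from it. The sketch below is only an illustration: cmake_command is a made-up helper, and the dictionary holds a subset of the flags used above.

```python
def cmake_command(options, source_dir=".."):
    """Build a cmake argument list from a dict of option names and values."""
    args = ["cmake"]
    for name in sorted(options):
        # -DNAME=VALUE is the standard way to set a cmake cache entry.
        args.append("-D%s=%s" % (name, options[name]))
    args.append(source_dir)
    return args


if __name__ == "__main__":
    options = {
        "CMAKE_BUILD_TYPE": "RELEASE",
        "CMAKE_INSTALL_PREFIX": "/usr/local",
        "WITH_CUDA": "ON",
        "CUDA_FAST_MATH": "1",
        "WITH_CUBLAS": "1",
    }
    # The fix for error [Two] becomes a one-line change.
    options["ENABLE_PRECOMPILED_HEADERS"] = "OFF"
    print(" ".join(cmake_command(options)))
```

The resulting list can be passed to subprocess.check_call from inside the build folder.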

Success Story:

[100%] Built target tutorial_imageSegmentation
[100%] Linking CXX executable ../../bin/cpp-tutorial-pnp_registration
[100%] Built target cpp-tutorial-pnp_registration
[100%] Linking CXX executable ../../bin/cpp-example-stitching_detailed
[100%] Built target example_stitching_detailed
[100%] Linking CXX shared module ../../lib/cv2.so
[100%] Built target opencv_python2

Build installer:

opencv-master/build$ sudo make install

Test OpenCV with CUDA in Python:

  • >>> import cv2
  • >>> print(cv2.__version__)
  • >>> print(cv2.cuda)
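
A slightly stronger test is to ask OpenCV how many CUDA devices it can see. The sketch below assumes the Python bindings expose cv2.cuda.getCudaEnabledDeviceCount, which I believe is the case for CUDA-enabled builds but have not verified for every version; it returns None instead of failing when cv2 or that function is not available. The wrapper name cuda_device_count is my own.

```python
def cuda_device_count():
    """Return the number of CUDA devices OpenCV sees, or None if unknown."""
    try:
        import cv2
    except ImportError:
        return None
    cuda_module = getattr(cv2, "cuda", None)
    counter = getattr(cuda_module, "getCudaEnabledDeviceCount", None)
    if counter is None:
        return None
    return counter()


if __name__ == "__main__":
    count = cuda_device_count()
    if count is None:
        print("could not query OpenCV for CUDA devices")
    else:
        print("CUDA devices visible to OpenCV:", count)
```

A count of 0 would suggest that OpenCV was built without CUDA support or that no usable GPU was found.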