Flight Plan: Automatic Recommendation System Using Machine Learning

This BoosterPack was created and authored by: Carla Margalef Bentabol

DAIR BoosterPacks are free, curated packages of cloud-based tools and resources about a specific emerging technology, created by experienced Canadian businesses that have built products or services using that technology and are willing to share their expertise.

Overview

For software developers who need an Automatic Recommendation System, the Sample Solution demonstrates how a Collaborative Filtering model is used to provide recommendations to users based on their preferences. Unlike non-machine-learning solutions, this Sample Solution automates useful, personalized recommendations by leveraging users' past preferences and their similarities to other users.

Please see the Movie Recommender: Sample Solution page for more information on how the Sample Solution works.

The Sample Solution showcases TensorFlow and TensorRT technologies described in subsequent sections.

Tech Spotlight: TensorFlow

TensorFlow is a computational framework for machine learning, created by Google. It provides a software library for data processing, model building, model training, and performance evaluation. It contains toolkits for constructing models at different levels of abstraction and can run on a variety of devices, including CPUs and GPUs. TensorFlow is widely used in both industry and research to experiment with, develop, and train deep neural networks.

Resources

The table below provides a non-comprehensive list of links to useful introductory resources.

Resource | Summary
TensorFlow Site | TensorFlow page, with a general introduction and links to many resources.
MIT Deep Learning Basics: Introduction and Overview with TensorFlow | MIT Deep Learning lecture blog post covering deep learning concepts and TensorFlow.

Tutorials

The table below provides a non-comprehensive list of links to tutorials the author has found to be most useful.

Tutorial | Content Summary
TensorFlow Tutorials | Official TensorFlow Tutorials page with many examples.
Intro to TensorFlow for Deep Learning | Free Udacity course about building deep learning applications in TensorFlow.
Practical Machine Learning Tutorial with Python Introduction | Tutorial that provides an introduction to Machine Learning and implementations in Python and TensorFlow.

Documentation

Please see the table below for a set of documentation resources for TensorFlow.

Document | Summary
TensorFlow API Documentation | Official API documentation.
TensorFlow Developer Guide | Official developer guide.

Support

Support resources for TensorFlow are described in the community support page.

Best Practices

Abstraction levels

TensorFlow supports several levels of abstraction, and you can implement models from low-level operations. However, the high-level APIs usually provide better out-of-the-box performance. When available, it is usually best to use the provided models, layers, estimators, operators, training functions, and data handling methods.
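
As an illustration of the high-level route, the following minimal sketch builds and trains a small classifier with the Keras API bundled with TensorFlow 2.x. The layer sizes, the random toy data, and the helper name `build_model` are assumptions made for this example and are not taken from the Sample Solution.

```python
import numpy as np
import tensorflow as tf


def build_model(num_features: int) -> tf.keras.Model:
    """Build a small binary classifier from provided Keras layers (hypothetical sizes)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(num_features,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # compile() wires up a provided optimizer and loss, so no custom training loop is needed.
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model


# Toy data, random and purely illustrative.
rng = np.random.default_rng(0)
features = rng.random((32, 4)).astype("float32")
labels = (features.sum(axis=1) > 2.0).astype("float32")

model = build_model(num_features=4)
model.fit(features, labels, epochs=1, batch_size=8, verbose=0)
predictions = model.predict(features, verbose=0)
```

The same model could be written with low-level operations and a hand-written training loop, but the provided layers and `fit` cover this common case with far less code.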

CPU and GPU

TensorFlow can run on both CPU and GPU. GPUs generally offer better performance for deep learning workloads, so when one is available, it is recommended to train and run inference on the GPU rather than the CPU.
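
One way to check which devices are available is sketched below, assuming TensorFlow 2.x. The matrix sizes and the variable names are illustrative only. TensorFlow normally places operations on a GPU automatically when one is visible, so the explicit `tf.device` block is shown here just to make the placement visible.

```python
import tensorflow as tf

# List the GPUs TensorFlow can see; the list is empty on a CPU-only machine.
gpus = tf.config.list_physical_devices("GPU")
device_name = "/GPU:0" if gpus else "/CPU:0"

# Pin a small computation to the chosen device.
with tf.device(device_name):
    a = tf.random.uniform((256, 256))
    b = tf.random.uniform((256, 256))
    product = tf.matmul(a, b)

print(f"Ran matmul on {device_name}; visible GPUs: {len(gpus)}")
```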

Tips and Traps

Experimentation vs. production

TensorFlow provides several debugging, visualization, and summarization capabilities. They are very useful and typically recommended during learning or early development. However, they add overhead and can slow down the whole system. Remember to disable them, or update your implementation, before moving to production or running performance tests.

Checkpoints

Real-world models might take a long time to train. To avoid losing a partially trained model in the event of an unexpected error, to test the progress in another environment, or to stop the training early for any reason, it is useful to use TensorFlow’s checkpoint capabilities and regularly save the model while training.
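
A minimal sketch of periodic checkpointing is shown below, assuming TensorFlow 2.x and its `tf.train.Checkpoint` and `tf.train.CheckpointManager` classes. The single-variable "model", the toy loss, the save interval, and the temporary directory are all assumptions made to keep the example short. A real project would track its model and optimizer and write to a persistent location.

```python
import tempfile

import tensorflow as tf

# Hypothetical trainable state: one weight, plus a step counter used when resuming.
weight = tf.Variable(0.0)
step = tf.Variable(0, dtype=tf.int64)

checkpoint = tf.train.Checkpoint(weight=weight, step=step)
checkpoint_dir = tempfile.mkdtemp()  # use a persistent directory in practice
manager = tf.train.CheckpointManager(checkpoint, checkpoint_dir, max_to_keep=3)

# Resume from the latest checkpoint if one exists (there is none on the first run).
checkpoint.restore(manager.latest_checkpoint)

for _ in range(10):
    # Toy "training" step: gradient descent on (weight - 3)^2.
    with tf.GradientTape() as tape:
        loss = (weight - 3.0) ** 2
    gradient = tape.gradient(loss, weight)
    weight.assign_sub(0.1 * gradient)
    step.assign_add(1)

    # Save every 5 steps so an interruption loses at most 5 steps of work.
    if int(step.numpy()) % 5 == 0:
        manager.save(checkpoint_number=int(step.numpy()))
```

Because `max_to_keep` bounds the number of retained checkpoints, long training runs do not fill the disk with old snapshots.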

Flexibility

TensorFlow provides a great degree of flexibility, allowing you to implement many types of custom models, layers, operations, and losses. Expert users benefit from the flexibility for advanced use cases. However, TensorFlow also provides out-of-the-box functionality that will work for most common cases. It is advised to use the provided models, layers, operations, and losses whenever possible, as they are well tested, efficient, and highly compatible with TensorRT.

Tech Spotlight: TensorRT

TensorRT is a platform from NVIDIA for deep learning inference. It includes a deep learning inference optimizer and runtime that provide low latency and high throughput for deep learning models. Models can be developed and trained in many different deep learning frameworks (such as TensorFlow) and then, with TensorRT, optimized, calibrated for lower precision, and deployed for production. TensorRT is built on CUDA, NVIDIA’s parallel programming model, allowing you to leverage and maximize GPU utilization.
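
To give a rough intuition for what "calibrated for lower precision" means, the sketch below quantizes floating-point values to 8-bit integers using a scale derived from sample data. This is a conceptual illustration in plain Python and does not use the TensorRT API. The max-absolute-value rule, the function names, and the sample values are assumptions, and TensorRT's own calibration is more sophisticated.

```python
def calibrate_scale(samples, max_level=127):
    """Pick a scale so the largest observed magnitude maps to max_level."""
    max_abs = max(abs(value) for value in samples)
    return max_abs / max_level if max_abs > 0 else 1.0


def quantize(value, scale, max_level=127):
    """Map a float to a signed 8-bit integer level, clamping out-of-range values."""
    level = round(value / scale)
    return max(-max_level, min(max_level, level))


def dequantize(level, scale):
    """Map an integer level back to an approximate float."""
    return level * scale


# Hypothetical activations observed while running calibration data through a layer.
calibration_activations = [-1.5, -0.2, 0.0, 0.7, 2.54]

scale = calibrate_scale(calibration_activations)
quantized = [quantize(value, scale) for value in calibration_activations]
restored = [dequantize(level, scale) for level in quantized]
```

The restored values differ from the originals by at most half a quantization step, which is the precision traded for faster integer arithmetic.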

Resources

The table below provides a non-comprehensive list of links to useful introductory resources.

Resource | Summary
NVIDIA TensorRT Site | The TensorRT main page, with a general introduction and links to many resources.
Introduction to TensorRT | NVIDIA Webinar (video) introducing TensorRT.
What is CUDA? Parallel Programming for GPUs | Overview of CUDA. Introduction and different uses (including TensorRT).

Tutorials

The table below provides a non-comprehensive list of links to tutorials the author has found to be most useful.

Tutorial | Content Summary
How to Speed Up Deep Learning Inference Using TensorRT | Tutorial from NVIDIA developer blog to learn how to deploy a deep learning application onto a GPU using TensorRT.
Speed up TensorFlow Inference on GPUs with TensorRT | Blog post with a general overview of the integration workflow of TensorRT and TensorFlow with code examples.
Trying out TensorRT on Jetson TX2 | Tutorial on how to optimize a deep learning model trained with Caffe (deep learning framework) and run inferences in Jetson TX2 (NVIDIA GPU).

Documentation

Please see the table below for a set of documentation resources for TensorRT.

Document | Summary
TensorRT Inference Library Documentation | Official developer guide and API documentation.
TensorRT Inference Server Documentation | Official documentation for the Inference Server.

Support

Support resources for TensorRT are described in the support section of the documentation.

Best Practices

A comprehensive list of best practices can be found in the “Best Practices” section of the official TensorRT documentation.

Tips and Traps

Versions

Different versions of TensorRT are compatible only with specific versions of CUDA and TensorFlow. Furthermore, CUDA versions are compatible with specific versions of GPU drivers. Make sure to check out the TensorRT Compatibility Matrix and CUDA toolkit notes to avoid incompatibility issues. TensorRT provides Docker containers pre-packaged with the appropriate versions. See the Movie Recommender: Sample Solution page for example installation instructions.

Python vs C++ API

TensorRT provides both Python and C++ APIs. Initially, most of the documentation and tutorials were written for C++. If you have trouble finding tutorials and support online for the Python API, it can help to look at the C++ solutions, as the APIs are mostly equivalent and easy to translate.

Tech Spotlight: Multilayer Perceptron for Collaborative Filtering

Collaborative Filtering is a widely used approach to implement recommender systems. Collaborative filtering methods are based on users’ behaviours, activities, or preferences and predict what users will like based on their similarity to other users.
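
The idea of predicting preferences from similar users can be sketched with a simple similarity-based approach, shown below in plain Python. This is not the neural model used in the Sample Solution. The ratings, the user and movie names, and the choice of cosine similarity are assumptions made for illustration.

```python
import math

# Hypothetical ratings: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "alice": {"Alien": 5, "Heat": 4, "Up": 1},
    "bob": {"Alien": 4, "Heat": 5, "Jaws": 5},
    "carol": {"Up": 5, "Coco": 4, "Heat": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[movie] * b[movie] for movie in shared)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=1):
    """Score unseen movies by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in ratings[user]:
                scores[movie] = scores.get(movie, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


recommendations = recommend("alice", ratings)
```

Here "alice" rates movies much like "bob" does, so the movie that "bob" rated highly and "alice" has not seen ranks first.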

A Multilayer Perceptron is a type of neural network that contains an input layer to receive the data, an output layer to make a prediction given the input, and in between them, an arbitrary number of hidden layers that represent non-linear functions that, combined, can learn complex problems.
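
The structure described above can be sketched as a forward pass in plain Python. The weights, biases, layer sizes, and activation choices below are hand-picked assumptions for illustration. In practice they are learned during training and the layers come from a framework such as TensorFlow.

```python
import math


def relu(x):
    """Non-linear activation used in the hidden layer."""
    return max(0.0, x)


def sigmoid(x):
    """Squash the output into (0, 1) so it can be read as a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer; weights[j] holds the input weights of unit j."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


# Hypothetical parameters: 2 inputs -> 2 hidden units -> 1 output.
hidden_weights = [[0.5, -0.5], [1.0, 1.0]]
hidden_biases = [0.0, -1.0]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]


def forward(inputs):
    """Input layer -> hidden layer (ReLU) -> output layer (sigmoid)."""
    hidden = dense(inputs, hidden_weights, hidden_biases, relu)
    return dense(hidden, output_weights, output_biases, sigmoid)[0]


prediction = forward([1.0, 1.0])
```

Adding more hidden layers amounts to chaining further `dense` calls between the input and the output layer.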

In the paper “Neural Collaborative Filtering” (He et al. 2017), the authors propose a deep learning framework for Collaborative Filtering. One of the models that the authors evaluate is a Multilayer Perceptron. The problem is presented as a classification problem, and the model is trained using movies watched and rated by users as positive examples and unwatched movies as negative examples.
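
The construction of positive and negative examples can be sketched as follows. The function name, the toy data, the fixed random seed, and the number of negatives sampled per positive are assumptions for illustration. They are not taken from the paper or from the Sample Solution.

```python
import random


def build_training_examples(watched, all_movies, negatives_per_positive=1, seed=0):
    """Return (user, movie, label) triples: label 1 for watched, 0 for sampled unwatched."""
    rng = random.Random(seed)
    examples = []
    for user, seen in watched.items():
        unseen = sorted(all_movies - seen)
        for movie in sorted(seen):
            examples.append((user, movie, 1))
            # Sample unwatched movies for this user as negative examples.
            count = min(negatives_per_positive, len(unseen))
            for negative in rng.sample(unseen, count):
                examples.append((user, negative, 0))
    return examples


# Hypothetical interaction data: user -> set of watched movie ids.
watched = {"u1": {"m1", "m2"}, "u2": {"m3"}}
all_movies = {"m1", "m2", "m3", "m4"}

examples = build_training_examples(watched, all_movies)
```

Each triple can then be fed to a binary classifier that takes a user and a movie as input and predicts the label.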

Resources

Please see the sections below for resources to learn more about multilayer perceptrons for collaborative filtering.

Tutorials

The table below provides a non-comprehensive list of links to tutorials the author has found to be most useful.

Tutorial | Content Summary
A Beginner’s Guide to Multilayer Perceptrons | Blog post with an introduction to multilayer perceptrons.
Introduction to Recommender Systems | Blog post with an introduction to recommender systems, focusing on collaborative filtering.
Various Implementations of Collaborative Filtering | Blog post showcasing different types of collaborative filtering implementations.
Stanford Lecture Notes, Chapter 9: Recommender Systems 1 | Stanford lecture notes on recommendation systems.
Google Machine Learning Crash Course: Recommendation | Self-study course by Google for recommendation systems.

Documentation

Please see the table below for a set of documentation resources.

Document | Summary
Neural Collaborative Filtering (He et al. 2017) | Scientific paper proposing a collaborative filtering framework based on neural networks.

Best Practices

General Best Practices for Machine Learning projects apply. There are many online resources regarding this topic, such as: