The Multi-Armed Bandit Problem and Its Solutions
The multi-armed bandit problem is a classic example of the exploration versus exploitation dilemma. This post introduces the bandit problem and how to solve it using different exploration strategies. The algorithms are implemented for Bernoulli bandits in lilianweng/multi-armed-bandit.
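To make the dilemma concrete, here is a minimal sketch of one common exploration strategy, epsilon-greedy, on a Bernoulli bandit. It is not taken from the lilianweng/multi-armed-bandit repository; the function name, the arm probabilities, and the parameter values are all assumptions chosen for illustration.

```python
import random


def run_epsilon_greedy(probs, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy.

    probs: hypothetical true reward probability of each arm (unknown to the agent).
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated reward so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Bernoulli reward: 1 with probability probs[arm], otherwise 0.
        reward = 1 if rng.random() < probs[arm] else 0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


# Three arms with assumed success probabilities 0.2, 0.5 and 0.8.
estimates, counts, total_reward = run_epsilon_greedy([0.2, 0.5, 0.8])
```

A larger epsilon spends more pulls on exploration and learns every arm's estimate faster, while a smaller epsilon exploits the current best guess more often at the risk of settling on a suboptimal arm.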
Object Recognition for Dummies Part 3: R-CNN and Fast/Faster/Mask R-CNN and YOLO
In Part 3, we examine five object recognition models: R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN, and YOLO. These models are closely related, and the newer versions are considerably faster than the older ones.
Object Recognition for Dummies Part 2: CNN, DPM and Overfeat
Part 2 introduces several classic convolutional neural network architectures for image classification (AlexNet, VGG, ResNet), as well as the DPM (Deformable Parts Model) and Overfeat models for object recognition.
Object Recognition for Dummies Part 1: Gradient Vector, HOG, and SS
In this series of posts on “Object Recognition for Dummies”, we will go through several basic concepts, algorithms, and popular deep learning models for image processing and object detection. Hopefully, it will be a good read for people who have no experience in this field but want to learn more. Part 1 introduces the concept of gradient vectors, the HOG (Histogram of Oriented Gradients) algorithm, and Selective Search for image segmentation.
Learning Word Embedding
A word embedding is a dense representation of a word in the form of a numeric vector. It can be learned using a variety of language models. Word embeddings can reveal many hidden relationships between words. For example, vector(“cat”) - vector(“kitten”) is similar to vector(“dog”) - vector(“puppy”). This post introduces several models for learning word embeddings and explains how their loss functions are designed for that purpose.
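The vector-difference example can be sketched as follows. The vectors here are hand-made 3-dimensional toys, not learned embeddings, and were chosen so that one coordinate loosely encodes "adult versus young". Real embeddings have far more dimensions, and the relationship holds only approximately.

```python
import math

# Hand-made toy vectors (an assumption for illustration, NOT learned embeddings).
# Coordinates loosely mean: [cat-like, dog-like, adult-ness].
toy = {
    "cat":    [0.9, 0.1, 0.8],
    "kitten": [0.9, 0.1, 0.2],
    "dog":    [0.1, 0.9, 0.8],
    "puppy":  [0.1, 0.9, 0.2],
}


def sub(u, v):
    """Element-wise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Compare the two "adult minus young" offsets.
cat_offset = sub(toy["cat"], toy["kitten"])
dog_offset = sub(toy["dog"], toy["puppy"])
offset_similarity = cosine(cat_offset, dog_offset)

# For contrast: the raw "cat" and "dog" vectors are less aligned than the offsets.
word_similarity = cosine(toy["cat"], toy["dog"])
```

In this toy setup the two offsets point in the same direction, so their cosine similarity is higher than that of the raw "cat" and "dog" vectors.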
Anatomize Deep Learning with Information Theory
This post is a summary of Prof Naftali Tishby’s recent talk on “Information Theory in Deep Learning”. The talk showed how to apply information theory to study the growth and transformation of deep neural networks during training.
From GAN to WGAN
This post explains the maths behind a generative adversarial network (GAN) model and why it is hard to train. Wasserstein GAN is intended to improve GAN training by adopting a smooth metric for measuring the distance between two probability distributions.
How to Explain the Prediction of a Machine Learning Model?
This post reviews some research in model interpretability, covering two aspects: (i) interpretable models with model-specific interpretation methods and (ii) approaches to explaining black-box models. I included an open discussion on explainable artificial intelligence at the end.
Predict Stock Prices Using RNN: Part 2
This post continues the tutorial on how to build a recurrent neural network using TensorFlow to predict stock market prices. Part 2 attempts to predict the prices of multiple stocks using embeddings. The full working code is available at github.com/lilianweng/stock-rnn.
Predict Stock Prices Using RNN: Part 1
This post is a tutorial on how to build a recurrent neural network using TensorFlow to predict stock market prices. Part 1 focuses on predicting the S&P 500 index. The full working code is available at github.com/lilianweng/stock-rnn.
An Overview of Deep Learning for Curious People
Starting earlier this year, I developed a strong curiosity about deep learning and spent some time reading about the field. To document what I’ve learned and to provide some interesting pointers to people with similar interests, I wrote this overview of deep learning models and their applications.