Anatomize Deep Learning with Information Theory

Professor Naftali Tishby passed away in 2021. I hope this post can introduce his idea of the information bottleneck to more people. Recently I watched the talk “Information Theory in Deep Learning” by Prof. Naftali Tishby and found it very interesting. He presented how to apply information theory to study the growth and transformation of deep neural networks during training. Using the Information Bottleneck (IB) method, he proposed a new learning bound for deep neural networks (DNNs), as traditional learning theory fails due to the exponentially large number of parameters....

Date: September 28, 2017 | Estimated Reading Time: 9 min | Author: Lilian Weng

From GAN to WGAN

[Updated on 2018-09-30: thanks to Yoonju, we have this post translated into Korean!] [Updated on 2019-04-18: this post is also available on arXiv.] Generative adversarial networks (GANs) have shown great results in many generative tasks that replicate rich real-world content such as images, human language, and music. The approach is inspired by game theory: two models, a generator and a critic, compete with each other while making each other stronger at the same time....

Date: August 20, 2017 | Estimated Reading Time: 21 min | Author: Lilian Weng

How to Explain the Prediction of a Machine Learning Model?

Machine learning models have started penetrating critical areas like health care, justice systems, and the financial industry. Figuring out how the models make decisions, and making sure the decision-making process is aligned with ethical requirements and legal regulations, has therefore become a necessity. Meanwhile, the rapid growth of deep learning models further raises the need to interpret complicated models. People are eager to apply the power of AI fully to key aspects of everyday life....

Date: August 1, 2017 | Estimated Reading Time: 18 min | Author: Lilian Weng

Predict Stock Prices Using RNN: Part 2

In this Part 2 tutorial, I would like to continue the topic of stock price prediction and to endow the recurrent neural network that I built in Part 1 with the capability of responding to multiple stocks. In order to distinguish the patterns associated with different price sequences, I use stock symbol embedding vectors as part of the input. Dataset: During the search, I found this library for querying Yahoo!...

Date: July 22, 2017 | Estimated Reading Time: 9 min | Author: Lilian Weng

Predict Stock Prices Using RNN: Part 1

This is a tutorial on how to build a recurrent neural network using TensorFlow to predict stock market prices. The full working code is available at github.com/lilianweng/stock-rnn. If you don’t know what a recurrent neural network or an LSTM cell is, feel free to check my previous post. One thing I would like to emphasize: because my motivation for writing this post is more about demonstrating how to build and train an RNN model in TensorFlow and less about solving the stock prediction problem, I didn’t try hard to improve the prediction outcomes....

Date: July 8, 2017 | Estimated Reading Time: 12 min | Author: Lilian Weng

An Overview of Deep Learning for Curious People

(This post originated from my talk for the WiMLDS x Fintech meetup hosted by Affirm.) I believe many of you have watched or heard of the games between AlphaGo and professional Go player Lee Sedol in 2016. Lee holds the highest rank of nine dan and has won many world championships. No doubt, he is one of the best Go players in the world, but he lost this series 1-4 to AlphaGo....

Date: June 21, 2017 | Estimated Reading Time: 12 min | Author: Lilian Weng