COURSE OUTLINE
Session 1
NLP problems and naive Bayes
Natural language processing: defining the problems. From syntactic to semantic problems. The text classification problem and the naive Bayes classifier. Tf-idf weights.
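A minimal sketch of the session's two building blocks, tf-idf features feeding a naive Bayes classifier, using scikit-learn; the toy documents and labels are invented for illustration:

    # Toy text classification: tf-idf features + naive Bayes (scikit-learn).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB

    docs = ["good film, great acting", "boring plot, bad acting",
            "great plot and a good cast", "bad film, boring and dull"]
    labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative (made-up labels)

    vectorizer = TfidfVectorizer()        # tf-idf weights instead of raw counts
    X = vectorizer.fit_transform(docs)    # documents -> sparse tf-idf matrix
    clf = MultinomialNB().fit(X, labels)  # naive Bayes over the weighted features

    print(clf.predict(vectorizer.transform(["a great film"])))  # expect [1]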
Session 2
Extending naive Bayes
Can we remove the naive Bayes assumptions? From classification to clustering. From clustering to topic modeling: probabilistic latent semantic analysis (pLSA).
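Since pLSA is fit with EM, a compact numpy sketch may help fix the idea; the toy count matrix, topic count, and iteration budget are all illustrative choices, not course material:

    # EM for probabilistic latent semantic analysis (pLSA) on counts n[d, w].
    import numpy as np

    rng = np.random.default_rng(0)
    n = rng.integers(0, 5, size=(8, 12)).astype(float)  # toy doc-term counts
    D, W, Z = n.shape[0], n.shape[1], 2                 # docs, words, topics

    theta = rng.random((D, Z)); theta /= theta.sum(1, keepdims=True)  # p(z|d)
    phi = rng.random((Z, W)); phi /= phi.sum(1, keepdims=True)        # p(w|z)

    for _ in range(100):
        # E-step: p(z|d,w) proportional to p(z|d) * p(w|z)
        p = theta[:, :, None] * phi[None, :, :]   # shape (D, Z, W)
        p /= p.sum(1, keepdims=True) + 1e-12
        # M-step: re-estimate p(w|z) and p(z|d) from expected counts
        nz = n[:, None, :] * p                    # expected counts n(d, z, w)
        phi = nz.sum(0); phi /= phi.sum(1, keepdims=True)
        theta = nz.sum(2); theta /= theta.sum(1, keepdims=True)

    print(phi.round(2))  # topic-word distributions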
Session 3
Topic modeling
Regularised pLSA: additive regularisation of topic models (ARTM). Bayesian pLSA: latent Dirichlet allocation (LDA). LDA extensions: additional dependencies and/or additional information.
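A small LDA run with gensim (toy tokenized corpus invented for illustration; gensim 4.x API assumed):

    # Latent Dirichlet allocation with gensim on a toy tokenized corpus.
    from gensim import corpora
    from gensim.models import LdaModel

    texts = [["cat", "dog", "pet"], ["dog", "bone", "pet"],
             ["stock", "market", "price"], ["market", "trade", "price"]]
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(t) for t in texts]  # bag-of-words representation

    lda = LdaModel(corpus, num_topics=2, id2word=dictionary,
                   passes=20, random_state=0)
    for topic_id, words in lda.print_topics():
        print(topic_id, words)  # top words per topic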
Session 4
Practical session
Construct different topic models.
Session 5
Language modeling
Language modeling: the problem and its importance. Language modeling with n-grams.
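A bare-bones bigram language model with add-one (Laplace) smoothing; the tiny corpus is invented:

    # Bigram language model with add-one smoothing over a toy corpus.
    from collections import Counter

    corpus = "the cat sat on the mat . the dog sat on the rug .".split()
    vocab = set(corpus)
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))

    def p_bigram(w2, w1):
        # p(w2 | w1) with Laplace smoothing: (c(w1,w2) + 1) / (c(w1) + |V|)
        return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + len(vocab))

    print(p_bigram("cat", "the"))  # p(cat | the)
    print(p_bigram("dog", "the"))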
Session 6
Neural language models and word2vec
From language modeling to word embeddings: word2vec architectures.
Session 7
Other word embeddings
Word2vec variations: GloVe and FastText. Extensions of word embeddings. Examples.
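A minimal gensim sketch covering Session 6's word2vec (skip-gram) alongside FastText; the toy sentences are invented and gensim 4.x parameter names are assumed:

    # Training word2vec (skip-gram) and FastText embeddings with gensim.
    from gensim.models import Word2Vec, FastText

    sentences = [["the", "cat", "sat", "on", "the", "mat"],
                 ["the", "dog", "sat", "on", "the", "rug"]]

    w2v = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)  # sg=1: skip-gram
    ft = FastText(sentences, vector_size=50, window=2, min_count=1)         # subword n-grams

    print(w2v.wv.most_similar("cat", topn=2))
    print(ft.wv["doggish"][:5])  # FastText embeds out-of-vocabulary words via subwords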
Session 8
Basic RNNs for NLP
Reminder: recurrent neural networks. RNNs for language modeling. RNNs for sentiment analysis.
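A skeletal PyTorch RNN sentiment classifier; the dimensions, vocabulary size, and random inputs are placeholders, not the course's model:

    # Minimal RNN (LSTM) sentiment classifier in PyTorch.
    import torch
    import torch.nn as nn

    class RNNClassifier(nn.Module):
        def __init__(self, vocab_size, emb_dim=64, hidden=128, num_classes=2):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
            self.out = nn.Linear(hidden, num_classes)

        def forward(self, token_ids):          # token_ids: (batch, seq_len)
            _, (h, _) = self.rnn(self.emb(token_ids))
            return self.out(h[-1])             # logits from the last hidden state

    model = RNNClassifier(vocab_size=1000)
    logits = model(torch.randint(0, 1000, (4, 20)))  # batch of 4 fake sequences
    print(logits.shape)                              # torch.Size([4, 2])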
Session 9
Practical session
Improve sentiment analysis with different deep architectures.
Session 10
Character-based models
Breaking words down into characters. Character-based models. Convolutional networks in NLP.
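One common way to set up a character-level CNN text model: embed characters, run a 1-D convolution, max-pool over time (a sketch with made-up dimensions, not a specific model from the session):

    # Character-level CNN text encoder in PyTorch: embed chars, convolve, max-pool.
    import torch
    import torch.nn as nn

    class CharCNN(nn.Module):
        def __init__(self, n_chars=128, emb_dim=16, n_filters=32, kernel=5, num_classes=2):
            super().__init__()
            self.emb = nn.Embedding(n_chars, emb_dim)
            self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=kernel)
            self.out = nn.Linear(n_filters, num_classes)

        def forward(self, char_ids):                        # (batch, seq_len)
            x = self.emb(char_ids).transpose(1, 2)          # -> (batch, emb_dim, seq_len)
            x = torch.relu(self.conv(x)).max(dim=2).values  # max-pool over time
            return self.out(x)

    model = CharCNN()
    print(model(torch.randint(0, 128, (4, 40))).shape)  # torch.Size([4, 2])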
Session 11
Encoder-decoder architectures
The idea of encoder-decoder architectures. Examples, including Show and Tell for image captioning.
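The encoder-decoder idea in a few lines of PyTorch: a GRU encoder compresses the source sequence into a vector that initializes a GRU decoder (a schematic text-to-text sketch, not Show and Tell itself, which uses a CNN image encoder):

    # Schematic sequence-to-sequence encoder-decoder with GRUs in PyTorch.
    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, emb=32, hidden=64):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, emb)
            self.tgt_emb = nn.Embedding(tgt_vocab, emb)
            self.encoder = nn.GRU(emb, hidden, batch_first=True)
            self.decoder = nn.GRU(emb, hidden, batch_first=True)
            self.out = nn.Linear(hidden, tgt_vocab)

        def forward(self, src, tgt):
            _, h = self.encoder(self.src_emb(src))       # h: summary of the source
            dec, _ = self.decoder(self.tgt_emb(tgt), h)  # decoder conditioned on h
            return self.out(dec)                         # logits per target position

    model = Seq2Seq(src_vocab=100, tgt_vocab=120)
    logits = model(torch.randint(0, 100, (2, 7)), torch.randint(0, 120, (2, 5)))
    print(logits.shape)  # torch.Size([2, 5, 120])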
Session 12
Neural networks with attention
Attention in deep learning: Show, Attend and Tell. Encoder-decoder with attention: machine translation.
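The core attention computation written out in numpy: similarity scores between a decoder query and encoder states, a softmax, and a weighted context vector (the scaled dot-product variant, one of several scoring functions):

    # Dot-product attention: weights over encoder states, then a context vector.
    import numpy as np

    def attention(query, keys, values):
        # query: (d,), keys/values: (seq_len, d)
        scores = keys @ query / np.sqrt(query.shape[0])  # similarity scores
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                         # softmax over positions
        return weights @ values, weights                 # context vector, weights

    states = np.random.randn(6, 8)  # 6 encoder states of dimension 8
    context, w = attention(np.random.randn(8), states, states)
    print(w.round(2), context.shape)  # weights sum to 1; context is (8,)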
Session 13
Practical session
Build your own machine translation model.
Session 14
Hierarchical encoder-decoder
Dialogue and conversational models. The hierarchical encoder-decoder architecture.
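A skeleton of the hierarchical encoder-decoder (HRED) idea for dialogue: an utterance-level RNN encodes each turn, a context-level RNN runs over the turn vectors, and its state conditions the reply decoder (schematic, with made-up dimensions):

    # Schematic hierarchical encoder-decoder (HRED) for dialogue in PyTorch.
    import torch
    import torch.nn as nn

    class HRED(nn.Module):
        def __init__(self, vocab=100, emb=32, hidden=64):
            super().__init__()
            self.emb = nn.Embedding(vocab, emb)
            self.utt_enc = nn.GRU(emb, hidden, batch_first=True)     # encodes one turn
            self.ctx_enc = nn.GRU(hidden, hidden, batch_first=True)  # runs over turns
            self.decoder = nn.GRU(emb, hidden, batch_first=True)
            self.out = nn.Linear(hidden, vocab)

        def forward(self, turns, reply):
            # turns: list of (batch, len_i) token tensors; reply: (batch, reply_len)
            utt_vecs = [self.utt_enc(self.emb(t))[1][-1] for t in turns]
            _, ctx = self.ctx_enc(torch.stack(utt_vecs, dim=1))  # dialogue-level state
            dec, _ = self.decoder(self.emb(reply), ctx)          # decode given context
            return self.out(dec)

    model = HRED()
    turns = [torch.randint(0, 100, (2, 5)), torch.randint(0, 100, (2, 7))]
    print(model(turns, torch.randint(0, 100, (2, 4))).shape)  # torch.Size([2, 4, 100])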
Session 15
Practical session
Build your own chatbot.