COURSE OUTLINE

Session 1

NLP problems and naive Bayes

Natural language processing: defining the problems. From syntactic to semantic problems. The text classification problem and the naive Bayes classifier. Tf-idf weights.
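
To make the weighting scheme concrete, here is a minimal from-scratch sketch of tf-idf (the toy corpus, the tokenisation, and the plain log(N/df) idf variant are illustrative choices, not prescribed by the course):

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
tokenised = [d.split() for d in docs]
N = len(tokenised)

# Document frequency: number of documents containing each term.
df = Counter(term for doc in tokenised for term in set(doc))

def tf_idf(doc):
    """tf-idf weights for every term of one tokenised document."""
    tf = Counter(doc)
    return {t: (n / len(doc)) * math.log(N / df[t]) for t, n in tf.items()}

print(tf_idf(tokenised[0]))
```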

Session 2

Extending naive Bayes

Can we remove the naive Bayes assumptions? From classification to clustering. From clustering to topic modeling: probabilistic latent semantic analysis.
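
As a preview of how such a model is fitted, a minimal NumPy sketch of the EM algorithm for pLSA (the toy counts, random initialisation, and 50 iterations are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_words, n_topics = 6, 12, 2

# Toy document-term count matrix n(d, w); in practice this comes from a corpus.
ndw = rng.integers(0, 5, size=(n_docs, n_words)).astype(float)

# Random initialisation of phi = p(w|t) and theta = p(t|d).
phi = rng.random((n_topics, n_words)); phi /= phi.sum(axis=1, keepdims=True)
theta = rng.random((n_docs, n_topics)); theta /= theta.sum(axis=1, keepdims=True)

for _ in range(50):
    # E-step: responsibilities p(t|d,w) proportional to p(t|d) * p(w|t).
    p = theta[:, None, :] * phi.T[None, :, :]          # shape (d, w, t)
    p /= p.sum(axis=2, keepdims=True)
    # M-step: re-estimate phi and theta from expected counts n(d,w) * p(t|d,w).
    nt = ndw[:, :, None] * p
    phi = nt.sum(axis=0).T
    phi /= phi.sum(axis=1, keepdims=True)
    theta = nt.sum(axis=1)
    theta /= theta.sum(axis=1, keepdims=True)
```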

Session 3

Topic modeling

Regularised pLSA: additive regularisation of topic models (ARTM). Bayesian pLSA: latent Dirichlet allocation (LDA). LDA extensions: additional dependencies and/or additional information.
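
One way to fit an LDA model in a few lines is via the gensim library (an assumption here, as are the toy corpus and hyperparameters):

```python
from gensim import corpora
from gensim.models import LdaModel

texts = [["cat", "dog", "pet"], ["stock", "market", "trade"],
         ["dog", "pet", "vet"], ["market", "stock", "price"]]

# Map words to integer ids and represent each document as a bag of words.
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

lda = LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10, random_state=0)
for topic_id, words in lda.print_topics():
    print(topic_id, words)
```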

Session 4

Practical session

Construct different topic models.

Session 5

Language modeling

Language modeling: the problem and its importance. Language modeling with n-grams.
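
A maximum-likelihood bigram model fits in a few lines; this sketch (toy corpus, no smoothing) illustrates the idea:

```python
from collections import Counter

corpus = "the cat sat on the mat the cat lay on the rug".split()

# Bigram counts and counts of each word appearing as a context.
bigrams = Counter(zip(corpus, corpus[1:]))
contexts = Counter(corpus[:-1])

def p(word, prev):
    """Maximum-likelihood bigram probability p(word | prev)."""
    return bigrams[(prev, word)] / contexts[prev] if contexts[prev] else 0.0

print(p("cat", "the"))  # 0.5: two of the four occurrences of "the" precede "cat"
```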

Session 6

Neural language models and word2vec

From language modeling to word embeddings: word2vec architectures.

Session 7

Other word embeddings

Beyond word2vec: GloVe and FastText. Extensions of word embeddings. Examples.
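
A minimal sketch of training skip-gram embeddings with gensim (assumed here; gensim's FastText class has a nearly identical interface), on a toy corpus:

```python
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "log"]]

# sg=1 selects the skip-gram architecture; sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)
print(model.wv.most_similar("cat", topn=3))
```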

Session 8

Basic RNNs for NLP

Reminder: recurrent neural networks. RNNs for language modeling. RNNs for sentiment analysis.
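
A minimal PyTorch sketch of an RNN language model (PyTorch and all sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class RNNLanguageModel(nn.Module):
    """Predict the next token at each position from an RNN over the prefix."""
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.RNN(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):               # tokens: (batch, seq_len)
        h, _ = self.rnn(self.embed(tokens))  # (batch, seq_len, hidden_dim)
        return self.out(h)                   # logits over the next token

model = RNNLanguageModel(vocab_size=100)
logits = model(torch.randint(0, 100, (2, 7)))  # batch of 2 sequences of length 7
print(logits.shape)  # torch.Size([2, 7, 100])
```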

Session 9

Practical session

Improve sentiment analysis with different deep architectures.

Session 10

Character-based models

Breaking words down into characters. Character-based models. Convolutional networks in NLP.
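
A minimal PyTorch sketch of a character-level convolutional classifier (the character vocabulary, filter counts, and single convolutional layer are illustrative assumptions):

```python
import torch
import torch.nn as nn

class CharCNN(nn.Module):
    """Classify a text from character embeddings via 1-D convolution and max-pooling."""
    def __init__(self, n_chars=128, emb_dim=16, n_filters=32, kernel=5, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel)
        self.fc = nn.Linear(n_filters, n_classes)

    def forward(self, chars):                 # chars: (batch, seq_len)
        x = self.embed(chars).transpose(1, 2) # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x)).max(dim=2).values  # global max pool
        return self.fc(x)

model = CharCNN()
print(model(torch.randint(0, 128, (4, 40))).shape)  # torch.Size([4, 2])
```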

Session 11

Encoder-decoder architectures

The idea of encoder-decoder architectures. Examples: Show and Tell for image captioning.
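
A minimal PyTorch sketch of the encoder-decoder idea (GRUs, teacher-forced decoding, and all sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encode a source sequence into a fixed vector; decode the target from it."""
    def __init__(self, src_vocab, tgt_vocab, emb=32, hidden=64):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, emb)
        self.tgt_embed = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        _, state = self.encoder(self.src_embed(src))     # final state summarises src
        h, _ = self.decoder(self.tgt_embed(tgt), state)  # condition decoding on it
        return self.out(h)                               # logits per target position

model = Seq2Seq(src_vocab=50, tgt_vocab=60)
logits = model(torch.randint(0, 50, (2, 9)), torch.randint(0, 60, (2, 7)))
print(logits.shape)  # torch.Size([2, 7, 60])
```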

Session 12

Neural networks with attention

Attention in deep learning: Show, Attend and Tell. Encoder-decoder with attention: machine translation.
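
The core of the mechanism is a weighted average of encoder states; a minimal PyTorch sketch of scaled dot-product attention for a single query (the scaling and toy shapes are illustrative assumptions):

```python
import torch

def attention(query, keys, values):
    """Dot-product attention: weight values by softmaxed query-key scores."""
    scores = keys @ query / keys.shape[-1] ** 0.5  # one score per encoder state
    weights = torch.softmax(scores, dim=0)
    return weights @ values                        # weighted average of values

keys = values = torch.randn(5, 8)  # 5 encoder states of dimension 8
query = torch.randn(8)             # current decoder state
print(attention(query, keys, values).shape)  # torch.Size([8])
```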

Session 13

Practical session

Build your own machine translation model.

Session 14

Hierarchical encoder-decoder

Dialogue and conversational models. The hierarchical encoder-decoder architecture.
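
A minimal PyTorch sketch of the hierarchical idea: an utterance-level RNN feeds a context-level RNN over the dialogue turns, whose state conditions the reply decoder (GRUs and all sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class HRED(nn.Module):
    """Hierarchical encoder-decoder: a context RNN over utterance encodings."""
    def __init__(self, vocab, emb=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.utt_enc = nn.GRU(emb, hidden, batch_first=True)
        self.ctx_enc = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, turns, reply):
        # Encode each turn into a vector, then run the context RNN over the turns.
        utt = torch.stack([self.utt_enc(self.embed(t))[1][-1] for t in turns], dim=1)
        _, ctx = self.ctx_enc(utt)                    # dialogue-level state
        h, _ = self.decoder(self.embed(reply), ctx)   # decode the reply from it
        return self.out(h)

model = HRED(vocab=80)
turns = [torch.randint(0, 80, (2, 6)) for _ in range(3)]   # 3 turns, batch of 2
print(model(turns, torch.randint(0, 80, (2, 5))).shape)    # torch.Size([2, 5, 80])
```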

Session 15

Practical session

Build your own chat bot.
