TEXT MINING AND TRANSLATION
SERGEY NIKOLENKO

HARBOUR.SPACE

We offer innovative university degrees taught in English by industry leaders from around the world, aimed at giving our students meaningful and creatively satisfying top-level professional careers. We think the future is bright if you make it so.

Ten years ago, machine learning went through a revolution. Although neural networks are among the oldest tools in artificial intelligence, nobody could train deep architectures efficiently until the mid-2000s. After the breakthrough results of the groups of Geoffrey Hinton and Yoshua Bengio, however, deep neural architectures quickly outperformed the state of the art in image processing, speech recognition, and natural language processing; by now they basically define the modern state of machine learning in many different domains, from face recognition and self-driving cars to playing Go. In the course, we will see what makes modern neural networks so powerful, learn to train them properly, go through the most important architectures, and, best of all, learn to implement all of these ideas in code through standard libraries such as TensorFlow and Keras.
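As a taste of what this looks like in practice, here is a minimal Keras sketch of a small feed-forward classifier. The layer sizes and the MNIST-style 784-dimensional input are made up purely for illustration, not taken from the course materials.

from tensorflow import keras
from tensorflow.keras import layers

# a small fully connected classifier (toy sizes, for illustration only)
model = keras.Sequential([
    keras.Input(shape=(784,)),               # e.g. flattened 28x28 images
    layers.Dense(128, activation="relu"),    # one hidden layer
    layers.Dense(10, activation="softmax"),  # 10-class output
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, epochs=5)  # assuming MNIST-style arrays x_train, y_train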

ABOUT SERGEY

Sergey Nikolenko is a computer scientist with wide experience in machine learning and data analysis, algorithm design and analysis, theoretical computer science, and algebra. He graduated from St. Petersburg State University in 2005, majoring in algebra (Chevalley groups), and earned his Ph.D. at the Steklov Mathematical Institute in St. Petersburg in 2009 in theoretical computer science (circuit complexity and theoretical cryptography). Since then, Dr. Nikolenko has focused on machine learning and probabilistic modeling, producing theoretical results and working on practical projects for industry. He is currently employed at the Steklov Mathematical Institute and the Higher School of Economics in St. Petersburg. Dr. Nikolenko has more than 100 publications, including papers in top computer science journals and conferences, as well as several books.

His research interests include:

• Machine learning: probabilistic graphical models, recommender systems, topic modeling

• Algorithms for networking: competitive analysis, FIB optimization

• Bioinformatics: processing mass-spectrometry data, genome assembly

• Proof theory, automated reasoning, computational complexity, circuit complexity

• Algebra (Chevalley groups), algebraic geometry (motives)

WHAT YOU WILL LEARN

• Understand the main problems of natural language processing

• Be able to construct topic models by using standard libraries

• Understand and be able to use different forms of word embeddings (see the sketch after this list)

• Learn the structure and composition of encoder-decoder architectures and be able to construct such models in practice
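As a first look at the word-embedding outcome above, here is a minimal gensim Word2Vec sketch (gensim 4.x API). The two toy sentences and all parameter values are made up for illustration.

from gensim.models import Word2Vec

# toy corpus: a list of tokenized sentences (made up for illustration)
sentences = [["neural", "networks", "learn", "representations"],
             ["word", "embeddings", "capture", "word", "meaning"]]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=1)
vec = model.wv["embeddings"]           # a 100-dimensional vector for one word
# model.wv.most_similar("embeddings")  # nearest neighbours in embedding space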

SKILLS:

- Machine learning

- Algorithms for networking

- Bioinformatics

- Mathematical modeling

- Python


DATE: 8 Jan - 26 Jan, 2018

DURATION: 3 Weeks

LECTURES: 3 Hours per day

LANGUAGE: English

LOCATION: Barcelona, Harbour.Space Campus

COURSE TYPE: Offline


COURSE OUTLINE

Session 1

NLP problems and naive Bayes

Natural language processing: defining the problems. From syntactic to semantic problems. The text classification problem and the naive Bayesian classifier. Tf-idf weights.
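To make the first session concrete, here is a minimal scikit-learn sketch of a tf-idf-weighted naive Bayes text classifier. The four tiny documents and their sentiment labels are made up for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# toy training data (made up): 1 = positive, 0 = negative
docs = ["good movie", "terrible plot", "great acting", "boring film"]
labels = [1, 0, 1, 0]

# tf-idf features feeding a multinomial naive Bayes classifier
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(docs, labels)
print(clf.predict(["a great movie"]))  # expected [1] on this toy data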

Session 2

Extending naive Bayes

Can we remove the naive Bayes assumptions? From classification to clustering. From clustering to topic modeling: probabilistic latent semantic analysis.
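The step from clustering to pLSA boils down to a short EM loop. Below is a toy numpy sketch of pLSA, not an optimized implementation: it builds the full posterior tensor in memory, so it only makes sense for small matrices. The count matrix and the number of topics are whatever you supply.

import numpy as np

def plsa(counts, num_topics, iters=100, seed=0):
    """Toy EM for pLSA; counts is a (docs x words) matrix of word counts."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # p(z|d): document-topic distributions; p(w|z): topic-word distributions
    theta = rng.random((n_docs, num_topics))
    theta /= theta.sum(axis=1, keepdims=True)
    phi = rng.random((num_topics, n_words))
    phi /= phi.sum(axis=1, keepdims=True)
    for _ in range(iters):
        # E-step: posterior p(z|d,w), shape (docs, words, topics)
        post = theta[:, None, :] * phi.T[None, :, :]
        post /= post.sum(axis=2, keepdims=True) + 1e-12
        # M-step: re-estimate phi and theta from expected counts
        expected = counts[:, :, None] * post
        phi = expected.sum(axis=0).T
        phi /= phi.sum(axis=1, keepdims=True)
        theta = expected.sum(axis=1)
        theta /= theta.sum(axis=1, keepdims=True)
    return theta, phi

# toy usage: 4 documents over 6 word types, 2 topics
N = np.array([[2, 1, 0, 0, 0, 0], [1, 2, 1, 0, 0, 0],
              [0, 0, 0, 2, 1, 1], [0, 0, 1, 1, 2, 2]])
theta, phi = plsa(N, num_topics=2)
print(theta.round(2))  # each document's mixture over the two topics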

Session 3

Topic modeling

Regularised pLSA: additive regularisation of topic models (ARTM). Bayesian pLSA: latent Dirichlet allocation (LDA). LDA extensions: additional dependencies and/or additional information.
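For instance, a minimal LDA run with scikit-learn might look like this. The toy documents and the choice of two topics are made up for illustration.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["cats and dogs are pets", "dogs bark at cats",
        "stocks and bonds are assets", "markets trade stocks"]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)  # bag-of-words counts
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

words = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [words[i] for i in topic.argsort()[-4:]]
    print(f"topic {k}: {top}")  # most probable words per topic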

Session 4

Practical session

Construct different topic models using standard libraries.
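Another standard library that fits the practical session is gensim; here is a minimal sketch (gensim 4.x API) with a made-up tokenized corpus.

from gensim import corpora, models

# toy tokenized corpus (made up for illustration)
texts = [["cats", "dogs", "pets"], ["dogs", "bark", "cats"],
         ["stocks", "bonds", "assets"], ["markets", "trade", "stocks"]]

dictionary = corpora.Dictionary(texts)             # word <-> id mapping
corpus = [dictionary.doc2bow(t) for t in texts]    # bag-of-words vectors
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
print(lda.print_topics())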

TEXT MINING AND TRANSLATION

Natural language processing is one of the most challenging parts of artificial intelligence. It encompasses many different problems, from well-defined classification tasks to rather vague ones that involve text generation. In the course, we will go over some of the most common NLP problems, including text classification, topic modeling, and sentiment analysis. But we will pay the most attention to modern deep learning approaches that use word embeddings and/or character-based models. We will consider encoder-decoder architectures and architectures with attention, specifically in application to machine translation and similar problems, as sketched below.
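To give a flavour of that material, here is a minimal Keras sketch of an attention-based encoder-decoder for translation. The vocabulary sizes and dimensions are made up, and a real model would add masking, teacher forcing with shifted targets, and beam-search decoding.

from tensorflow import keras
from tensorflow.keras import layers

src_vocab, tgt_vocab, dim = 5000, 5000, 128  # toy sizes (made up)

# encoder: embed the source sentence and run an LSTM over it
enc_in = keras.Input(shape=(None,), name="source_tokens")
enc_emb = layers.Embedding(src_vocab, dim)(enc_in)
enc_seq, h, c = layers.LSTM(dim, return_sequences=True, return_state=True)(enc_emb)

# decoder: embed the (shifted) target sentence, start from the encoder state
dec_in = keras.Input(shape=(None,), name="target_tokens")
dec_emb = layers.Embedding(tgt_vocab, dim)(dec_in)
dec_seq = layers.LSTM(dim, return_sequences=True)(dec_emb, initial_state=[h, c])

# dot-product attention: each decoder state attends over all encoder states
context = layers.Attention()([dec_seq, enc_seq])
merged = layers.Concatenate()([dec_seq, context])
logits = layers.Dense(tgt_vocab, activation="softmax")(merged)

model = keras.Model([enc_in, dec_in], logits)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")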