TEXT MINING
SERGEY KHOROSHENKIKH

The field of Natural Language Processing (NLP) has gained increased attention in recent years, driven by impressive algorithmic advances in Deep Learning and significant progress in hardware.
Text Mining is a subset of NLP focused on unsupervised and semi-supervised algorithms for text analysis.
The course covers the main algorithms and concepts of Text Mining, including both “classical” methods from the Information Retrieval domain (such as TF-IDF and topic modeling) and modern Deep Learning architectures.

Sergey Khoroshenkikh is a senior software engineer with 5 years of experience in applied machine learning and data analysis. He graduated from the Moscow Institute of Physics and Technology in 2015 and is now pursuing a PhD there in the area of random geometric graphs.

He currently works in the R&D department at Yandex, developing large-scale machine learning solutions for web advertising, which is the company’s main source of income at present.

Students will learn:

- What types of problems can be solved with Text Mining

- Which algorithms are used for various Text Mining problems

- How to use practical tools for Text Mining

SKILLS:

- Python programming language

- Calculus and optimisation

- Probability

- Linear algebra


DATE: 18 May - 5 Jun, 2020

DURATION: 3 Weeks

LECTURES: 3 Hours per day

LANGUAGE: English

LOCATION: Barcelona, Harbour.Space Campus

COURSE TYPE: Offline


COURSE OUTLINE

Session 1

Introduction

NLP pipeline with spaCy.
TF-IDF.
Text analysis with scikit-learn.
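
As a rough illustration of the TF-IDF topic listed above, here is a minimal sketch using only the Python standard library. It assumes whitespace tokenization, term frequency normalized by document length, and the plain `log(N / df)` form of inverse document frequency. The function name `tf_idf` and the sample documents are hypothetical and not taken from the course materials. scikit-learn's `TfidfVectorizer` applies a smoothed IDF and L2 normalization by default, so its numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the bird flew",
]
weights = tf_idf(docs)
```

A term that occurs in every document (here "the") gets weight zero, while a term confined to one document (here "cat") gets the highest IDF.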

Session 2

Language models and text classification

Language models: definition, algorithms, and evaluation.
Text classification: algorithms and feature engineering.
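
To make the "definition, algorithms, and evaluation" of language models more concrete, the following is a small sketch of a bigram model with add-one (Laplace) smoothing, evaluated by perplexity. It is one simple textbook variant chosen for illustration, not necessarily the approach taught in the session. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """P(word | prev) with add-one smoothing over the observed vocabulary."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def perplexity(sentence, unigrams, bigrams):
    """Perplexity of one sentence: exp of the average negative log-probability."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    log_prob = sum(
        math.log(bigram_prob(prev, word, unigrams, bigrams))
        for prev, word in zip(tokens, tokens[1:])
    )
    return math.exp(-log_prob / (len(tokens) - 1))


# Hypothetical toy corpus.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
unigrams, bigrams = train_bigram(corpus)
seen = perplexity("the cat sat", unigrams, bigrams)
shuffled = perplexity("sat cat the", unigrams, bigrams)
```

Lower perplexity means the model finds the sentence less surprising, so a word order seen in training scores lower than a shuffled one.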

Session 3

Topic modeling

Non-negative matrix factorization (NMF).
Latent Semantic Indexing (LSI).
Latent Dirichlet Allocation (LDA).

Session 4

Word vectors

Distributional hypothesis.
Word2Vec algorithm.
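
The distributional hypothesis says that words occurring in similar contexts tend to have similar meanings. The sketch below illustrates it with sparse co-occurrence counts and cosine similarity. This is a count-based stand-in, not Word2Vec itself, which instead learns dense vectors by training a model to predict context words. The window size, function names, and toy corpus are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases cats",
    "stocks fell sharply today",
]
vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_stocks = cosine(vectors["cat"], vectors["stocks"])
```

In this corpus "cat" and "dog" share most of their contexts and score high, while "cat" and "stocks" share none and score zero.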

HARBOUR.SPACE 

Harbour.Space is a university created by entrepreneurs for entrepreneurs. We focus on meeting the demands of the future, while traditional education providers are too often stuck in the past.

We’re one of the few European institutions dedicated entirely to technology, design and entrepreneurship, and our interdisciplinary courses are taught by some of today’s leading professionals. Our aim is not only to equip students with the knowledge to take on the real world, but also to nurture and shape tomorrow’s tech superstars.

