PYTHON FOR DATA SCIENTISTS
MAXIM MUSIN
HARBOUR.SPACE
We offer innovative university degrees taught in English by industry leaders from around the world, aimed at giving our students meaningful and creatively satisfying top-level professional futures. We think the future is bright if you make it so.
The course will cover the core Python toolkit for data analysis: pandas, NumPy, SciPy, and scikit-learn, along with advanced techniques for applying them. It will also include basic integrations of Python with external libraries such as XGBoost, TensorFlow, and PyTorch, together with data wrangling and some hyperparameter optimization methods. Jupyter notebook usage and tricks will be taught as an organic part of the course. By the end of the module, everyone is expected to be able to build a simple data wrangling system.
ABOUT MAXIM
Maxim Musin comes from a background in statistics, advanced multidimensional probability, and random processes. Over his career in these fields he has developed his skills and gathered experience working in both academia and the private sector.
His academic experience includes teaching probability and statistics at MSU and MIPT as a member of the Faculty of Innovation and High Technology (FIHT), which at the time was among the few places worldwide equipped for advanced study of statistics. During his time there, he produced several notable projects with his students, particularly on the stochastic convergence of neural networks. His course on applied modern statistics became mandatory for the data analysis division of the FIHT MIPT Master's programme.
After his time as a professor, he began working on industry tasks and noticed the difference between the problems posed in academia and in the private sector, and the different set of skills industry requires to solve these practical tasks. Since then, he has worked primarily on extracting value from data analysis and machine learning, which requires broad, in-depth knowledge of the current state-of-the-art methods in the field, along with the ability to implement them quickly and efficiently.
WHAT YOU WILL LEARN
We will work with the core packages (Jupyter, pandas, NumPy, SciPy) in more detail, so that students will have no trouble with data wrangling in the future, particularly with merging several data sources into one. For scikit-learn, we will consider custom modifications of every pipeline step. Students will be introduced to the usual problems of setting up a Python environment for data analysis, and they will gain basic experience with XGBoost, TensorFlow, and PyTorch. Students will also be shown examples of useful system applications, such as AutoML and hyperparameter optimization.
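Merging several data sources into one, as mentioned above, typically comes down to a pandas join. A minimal sketch, using made-up illustrative tables (the column names and data are assumptions, not course material):

```python
import pandas as pd

# Two hypothetical data sources that share a "customer_id" key column.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [10.0, 25.5, 4.0, 7.2],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "city": ["Barcelona", "Moscow", "Berlin"],
})

# A left join keeps every order row and attaches the matching
# customer attributes; rows without a match would get NaN.
merged = orders.merge(customers, on="customer_id", how="left")
print(merged)
```

The `how` argument ("left", "right", "inner", "outer") controls which keys survive the merge, which is usually the first decision to make when combining sources.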
By the end of the course, students are expected to be familiar with standard Python for data analysis, with Jupyter, and with the common packages that work alongside these Python tools.
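The custom pipeline-step modifications mentioned above can be sketched with a user-defined scikit-learn transformer. `ClipOutliers` below is a hypothetical example step (not from the course materials), written the standard way: inherit from `BaseEstimator` and `TransformerMixin` and implement `fit`/`transform`:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

class ClipOutliers(BaseEstimator, TransformerMixin):
    """Custom step: clip each feature to percentiles learned in fit()."""

    def __init__(self, low=1.0, high=99.0):
        self.low = low
        self.high = high

    def fit(self, X, y=None):
        # Learn per-feature clipping bounds from the training data.
        self.low_ = np.percentile(X, self.low, axis=0)
        self.high_ = np.percentile(X, self.high, axis=0)
        return self

    def transform(self, X):
        return np.clip(X, self.low_, self.high_)

# Synthetic data: the label depends only on the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

# The custom step composes with built-in estimators in a Pipeline.
pipe = Pipeline([
    ("clip", ClipOutliers()),
    ("clf", LogisticRegression()),
])
pipe.fit(X, y)
print(pipe.score(X, y))
```

Because the step follows the estimator interface, it also works inside cross-validation and hyperparameter searches, where its `low`/`high` parameters can be tuned like any other.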
SKILLS:
-Data Analysis
-Machine Learning
-Big Data
-Development
-Programming Languages
-Web
DATE: 14 Oct - 1 Nov, 2019
DURATION: 3 Weeks
LECTURES: 3 Hours per day
LANGUAGE: English
LOCATION: Barcelona, Harbour.Space Campus
COURSE TYPE: Offline
COURSE OUTLINE
Session 1
Introduction to pandas, NumPy, SciPy, and Jupyter; extended tricks, Jupyter magic commands and techniques
Session 2
Data manipulation
Session 3
Data visualization
Session 4
scikit-learn: classifiers, regressors, pre- and post-processors, cross-validation, pipelines; custom classifiers, preprocessors, and postprocessors
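The cross-validation and hyperparameter-optimization topics in Session 4 fit together in scikit-learn's `GridSearchCV`, which cross-validates every candidate setting. A minimal sketch on the built-in Iris dataset (the parameter grid here is an illustrative assumption, not the course's actual exercise):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Try each max_depth with 5-fold cross-validation and keep the best.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [1, 2, 3, 4]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

AutoML tools automate essentially this loop at a larger scale, searching over models and preprocessing steps as well as parameters.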