PYTHON FOR DATA SCIENTISTS
MAXIM MUSIN
HARBOUR.SPACE
We offer innovative university degrees taught in English by industry leaders from around the world, aimed at giving our students meaningful and creatively satisfying top-level professional futures. We think the future is bright if you make it so.
The course will cover the core Python toolkit for data analysis: pandas, NumPy, SciPy, and scikit-learn, along with advanced techniques for applying them. It will also include basic integrations of Python with external libraries such as XGBoost, TensorFlow, and PyTorch, together with data wrangling and some hyperparameter optimization methods. Jupyter notebook usage and tricks will be taught as an organic part of the course. By the end of the module, everyone is expected to be able to build a simple data wrangling system.
ABOUT MAXIM
Maxim Musin comes from a background in statistics, advanced multidimensional probability, and random processes. Over his career in these fields he has developed his skills and gathered experience working in both academia and the private sector.
His academic experience includes teaching probability and statistics at MSU and MIPT as a member of the Faculty of Innovation and High Technology (FIHT), which at the time was among the few places worldwide equipped for advanced study of statistics. During his time there, he produced several notable projects with his students, particularly on the stochastic convergence of neural networks. His course on applied modern statistics became mandatory for the data analysis division of the FIHT MIPT Master's programme.
After his time as a professor, he began working on industry tasks and noticed the difference between the problems posed in academia and in the private sector, and the different set of skills industry requires to solve these practical tasks. Since then, he has worked primarily on extracting value from data analysis and machine learning, which requires broad, in-depth knowledge of the current state-of-the-art methods in the field, along with the ability to implement them quickly and efficiently.
WHAT YOU WILL LEARN
We will work with the core packages (Jupyter, pandas, NumPy, SciPy) in more detail, so that students will have no trouble with data wrangling in the future, particularly with merging several data sources into one. For scikit-learn, we will consider custom modifications of every pipeline step. Students will be introduced to the usual problems of setting up a Python environment for data analysis, and they will gain basic experience with XGBoost, TensorFlow, and PyTorch. Students will also be shown examples of useful system applications, such as AutoML and hyperparameter optimization.
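Merging several data sources into one, as mentioned above, typically comes down to a pandas join. A minimal sketch, using made-up illustrative tables (the column names and data are assumptions, not course material):

```python
import pandas as pd

# Two hypothetical data sources that share a "customer_id" key column.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [10.0, 25.5, 4.0, 7.2],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "city": ["Barcelona", "Moscow", "Berlin"],
})

# A left join keeps every order row and attaches the matching
# customer attributes; rows without a match would get NaN.
merged = orders.merge(customers, on="customer_id", how="left")
print(merged)
```

The `how` argument ("left", "right", "inner", "outer") controls which keys survive the merge, which is usually the first decision to make when combining sources.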
By the end of the course, students are expected to be familiar with standard Python for data analysis, with Jupyter, and with the common packages that work alongside these Python tools.
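The custom pipeline-step modifications mentioned above can be sketched with a user-defined scikit-learn transformer. `ClipOutliers` below is a hypothetical example step (not from the course materials), written the standard way: inherit from `BaseEstimator` and `TransformerMixin` and implement `fit`/`transform`:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

class ClipOutliers(BaseEstimator, TransformerMixin):
    """Custom step: clip each feature to percentiles learned in fit()."""

    def __init__(self, low=1.0, high=99.0):
        self.low = low
        self.high = high

    def fit(self, X, y=None):
        # Learn per-feature clipping bounds from the training data.
        self.low_ = np.percentile(X, self.low, axis=0)
        self.high_ = np.percentile(X, self.high, axis=0)
        return self

    def transform(self, X):
        return np.clip(X, self.low_, self.high_)

# Synthetic data: the label depends only on the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

# The custom step composes with built-in estimators in a Pipeline.
pipe = Pipeline([
    ("clip", ClipOutliers()),
    ("clf", LogisticRegression()),
])
pipe.fit(X, y)
print(pipe.score(X, y))
```

Because the step follows the estimator interface, it also works inside cross-validation and hyperparameter searches, where its `low`/`high` parameters can be tuned like any other.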
SKILLS:
-Data Analysis
-Machine Learning
-Big Data
-Development
-Programming Languages
-Web
DATE: 14 Oct - 1 Nov, 2019
DURATION: 3 Weeks
LECTURES: 3 Hours per day
LANGUAGE: English
LOCATION: Barcelona, Harbour.Space Campus
COURSE TYPE: Offline
COURSE OUTLINE
Session 1
Introduction to pandas, NumPy, SciPy, and Jupyter; extended tricks, Jupyter magic commands and techniques
Session 2
Data manipulation
Session 3
Data visualization
Session 4
scikit-learn: classifiers, regressors, pre- and post-processors, cross-validation, pipelines; custom classifiers, preprocessors, and postprocessors
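The cross-validation and hyperparameter-optimization topics in Session 4 fit together in scikit-learn's `GridSearchCV`, which cross-validates every candidate setting. A minimal sketch on the built-in Iris dataset (the parameter grid here is an illustrative assumption, not the course's actual exercise):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Try each max_depth with 5-fold cross-validation and keep the best.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [1, 2, 3, 4]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

AutoML tools automate essentially this loop at a larger scale, searching over models and preprocessing steps as well as parameters.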