Course Overview

If you are looking for a sound way to organize your data science projects, with tools and a workflow that reflect current practice, you are in the right place.

In this course you will learn what you need to work comfortably on your data science projects. Primarily, you will gain detailed knowledge of the concepts and capabilities of Jupyter, currently the most widely used tool, of its extension Jupyter Lab, and of how to set up Jupyter Hub for multiple users. You will learn the basics of working with Docker containers. The main programming language will be Python, but we also cover the foundations of working with other programming languages in Jupyter, such as R and Julia. We will show you how to apply these tools to cyber security use cases and how to manipulate data that fits into RAM with pandas, and we also show you options for working with big data using dask. You will learn how to visualize data effectively with static or interactive charts in Jupyter, and how to present your work using the various export options.
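As a small taste of the in-memory pandas work mentioned above, here is a minimal sketch of summarizing a table with groupby and agg. The data and the column names (`user`, `failed`, `bytes_sent`) are hypothetical and invented for illustration; they are not the course dataset.

```python
import pandas as pd

# Hypothetical login events; the columns are made up for this illustration.
events = pd.DataFrame(
    {
        "user": ["alice", "bob", "alice", "carol", "bob", "alice"],
        "failed": [1, 0, 1, 0, 1, 0],
        "bytes_sent": [120, 300, 80, 50, 400, 220],
    }
)

# One row per user: number of attempts, failed attempts, and mean traffic.
summary = events.groupby("user").agg(
    attempts=("failed", "size"),
    failed_logins=("failed", "sum"),
    mean_bytes=("bytes_sent", "mean"),
)
print(summary)
```

The same groupby/agg pattern is the usual starting point before moving to dask, whose dataframe API mirrors much of pandas for data that does not fit into RAM.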

Machine learning is one of the hottest topics today, and it will not be left out. You will learn the basic concepts of selected supervised and unsupervised ML algorithms and how to build ML pipelines using sklearn, from data preparation through variable transformation and selection to the final model. You will be able to build “good enough” models on your own with minimum effort and theory but maximum practical usability, using state-of-the-art algorithms like XGBoost and LightGBM, and you will get the foundations of automatically tuning the hyperparameters of the pipeline to find the best ones. All of it in Jupyter, Jupyter Lab, and Python.
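The pipeline idea described above can be sketched roughly as follows. This is an illustrative example under stated assumptions, not the course's own code: it uses a synthetic dataset, and it swaps in scikit-learn's LogisticRegression as the final model instead of XGBoost or LightGBM so that it needs no extra packages. The step names (`scale`, `select`, `model`) and the grid values are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (an assumption, not the course dataset).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Data preparation, variable selection, and the final model in one object.
pipeline = Pipeline(
    [
        ("scale", StandardScaler()),
        ("select", SelectKBest(score_func=f_classif)),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# Hyperparameters are addressed as <step name>__<parameter name>.
param_grid = {
    "select__k": [3, 5, 10],
    "model__C": [0.1, 1.0, 10.0],
}

# 5-fold cross-validated grid search over the whole pipeline.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.score(X_test, y_test))
```

Because the scaler and the selector sit inside the pipeline, they are refit on each training fold during cross-validation, which avoids leaking information from the validation fold into the preprocessing steps.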

What You Will Learn

  • Acquire working knowledge of the Jupyter ecosystem
  • Understand Docker basics
  • Learn data analytics using pandas
  • Get familiar with big data processing using dask
  • Understand data visualization using pandas, matplotlib, and plotly express
  • Get equipped with practical machine learning fundamentals
  • Understand how to design machine learning pipelines in the scikit-learn ecosystem

Program Curriculum

  • The Jupyter Ecosystem
  • Why to Use Jupyter Notebooks?
  • Chapter 1 Quiz

  • What Do We Need to Install?
  • Docker Basics
  • Jupyter Notebooks on Docker
  • Chapter 2 Quiz

  • Jupyter Notebook App UI
  • File Browser
  • File Editor
  • Terminal
  • Extensions
  • Configuration
  • Chapter 3 Quiz

  • UI - Main Panel and Toolbar
  • Dataset Introduction
  • Working With Cells
  • Markdown
  • LaTeX
  • Simple Regression Model
  • Working With Kernels
  • Use Kernel from Other Environment
  • R & Python in Same Notebook
  • Downloading and Exporting the Notebook
  • Running the Notebook and Exporting It via Terminal
  • Parameterising the Notebook, Running, and Exporting It from Python
  • Embedding HTML
  • Interactivity with Ipywidgets
  • Turning the Notebook into a Dashboard with Voila
  • Debugging Python Code
  • Sharing Notebooks
  • Chapter 4 Quiz

  • Jupyter Lab – Workspace
  • Jupyter Lab UI
  • Debugger
  • Binary Classification
  • Extensions
  • Collaborative Editing
  • Chapter 5 Quiz

  • Pandas Dataframe & Series
  • Loading Data into Dataframes
  • Initial Data Audit
  • Data Types
  • Index and Multiindex
  • Dataframe Operations
  • Apply Method
  • Merging Dataframes
  • Summarizing Data with Crosstab and Pivot Table
  • Summarizing Data Using groupby and agg
  • Styling Pandas Dataframes in Jupyter Notebooks
  • Chapter 6 Quiz

  • Dataset
  • Univariate Analysis – Numeric Variables
  • Univariate Analysis – Categorical Variables
  • Bivariate Analysis
  • Multivariate Analysis
  • Other Useful Plots
  • Jupyter Lab Extensions for EDA
  • Chapter 7 Quiz

  • Strategies for Dealing with Big Data
  • Dask Introduction
  • Our Dataset
  • Dask Graph
  • Dask Dashboard
  • Analytics on a Dask Dataframe Part 1
  • Analytics on a Dask Dataframe Part 2
  • Visualisation on Dask Dataframe via Datashader
  • Chapter 8 Quiz

  • K-means Clustering - Theory
  • K-means Clustering - Practice
  • XGBoost – Theory
  • XGBoost – Practice
  • Cross Validation
  • Making Pipelines
  • Hyperparameter Tuning – Gridsearch
  • Hyperparameter Tuning – Randomized Search
  • Hyperparameter Tuning – TuneSearchCV
  • LightGBM – Theory
  • LightGBM – Practice
  • Chapter 9 Quiz

  • Jupyter Hub - Introduction
  • Launching Jupyter Hub
  • Configuration
  • Creating Users
  • Administration Panel
  • Chapter 10 Quiz

Instructor

Jaroslav Klen

During his 9 years of experience in data science, Jaroslav has worked on many enterprise-level projects, mainly in the marketing and banking domains. He also contributed to the development of a data science platform. His main experience covers data analytics, building data and machine learning pipelines, reporting and its automation, analytic web application development, building production-ready predictive models, and Python development. His approach to projects can be summed up as “build it quick to be wrong, then tweak it to be good enough” and “do not let the perfect be the enemy of the good”. He emphasizes continuous self-improvement and learning new things. Along his own data science learning path, he found that what matters most when learning is knowing the foundations, the concepts, and the possibilities. The rest comes down to your own practice, research, and the time you commit.
