Course Overview

As more data is stored and evolves over time, the challenge is generally how to use that data and extract meaningful information from it. Azure Databricks is a fast, easy, and collaborative Apache Spark-based big data analytics service designed for data science and data engineering.

In this course, you will learn how to take unstructured data and turn it into meaningful insight. We kick off by looking at how Azure Databricks evolved and why there was a need for another big data solution. Next, we work with smaller datasets and mount Azure Blob Storage to store the processed data. You will also get to know FileStore, where we process CSV data stored at production scale. Finally, we touch on security and on how to enforce cluster policies for different teams: we create cluster policies using a JSON template and apply them to existing and new clusters. We also integrate a Git repo with Databricks to follow continuous integration and delivery, and then set up CI/CD for our Spark application using Azure DevOps.

By the end of this course, you will have mastered the deployment of Azure Databricks and will be well equipped to read and transform data and to load the transformed data into a sink.
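The overview mentions creating cluster policies from a JSON template. As a rough illustration of what such a template can look like, the sketch below builds a policy definition in Python and serializes it to JSON. The helper name `build_team_policy`, the team name, the worker limit, and the VM sizes are all assumptions made for this example and are not taken from the course material; the attribute paths and rule types (`range`, `allowlist`, `fixed`) follow the general shape of Databricks cluster policy definitions.

```python
import json


def build_team_policy(team, max_workers, node_types):
    """Return a cluster policy definition as a plain dict.

    Hypothetical helper for illustration only; all limits are example values.
    """
    return {
        # Force idle clusters to shut down within a bounded window.
        "autotermination_minutes": {
            "type": "range",
            "minValue": 10,
            "maxValue": 60,
            "defaultValue": 30,
        },
        # Cap how far the cluster may autoscale.
        "autoscale.max_workers": {"type": "range", "maxValue": max_workers},
        # Only allow the listed VM sizes.
        "node_type_id": {"type": "allowlist", "values": node_types},
        # Tag every cluster with the owning team (useful for cost tracking).
        "custom_tags.team": {"type": "fixed", "value": team},
    }


# Example values only: the team name and VM sizes are assumptions.
policy = build_team_policy(
    team="data-engineering",
    max_workers=4,
    node_types=["Standard_DS3_v2", "Standard_DS4_v2"],
)

# The JSON text is what would be supplied as the policy definition.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Keeping the template in a small function like this makes it possible to generate one policy per team with different limits while the overall structure stays the same.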

What You Will Learn

  • Azure Databricks deployment
  • What is a notebook, a job, and a cluster?
  • How to read and transform data
  • How to load the transformed data into a sink
  • Deploy a job written to process data to a Spark cluster
  • Perform automated deployment

Program Curriculum

  • Introduction to Big Data
  • Spark Internals and Architecture
  • Overview of Azure Databricks
  • Chapter 1 Quiz

  • Create and Deploy Azure Databricks Workspace
  • Introduction to Databricks Utility via CLI
  • Create Notebooks and Run Initial Databricks Command to Validate
  • Manage and Automate Databricks Deployment Using Terraform
  • Running Your First Spark Query
  • Global Init Script
  • Chapter 2 Quiz

  • Mount Azure Blob Storage
  • Reading Data from External Sources
  • Understand How to Read Data, Filter Data, and Create Data Frames
  • Introduction to Streaming Application
  • Chapter 3 Quiz

  • Configuring Security in a Databricks Environment
  • Deploy the Spark Application as a Job in a Databricks Cluster
  • Configure and Integrate Git Repo with Databricks
  • Configure CI/CD of Jobs Using Azure DevOps
  • Chapter 4 Quiz

Instructor

Shantanu Das

Infrastructure Consultant

Shantanu Das is a seasoned author of many DevOps courses, covering Puppet, Microservices, Terraform, and Azure DevOps. He has worked as a Site Reliability Engineer with solid hands-on experience in Google Cloud and Microsoft Azure technologies in Cloud/DevOps practice, architecting solutions as well as migrating, managing, and supporting enterprise suites of applications. He has helped Fortune 500 companies automate the build and deployment of code shipped to different environments. He also has experience hosting and supporting Dockerized applications via Kubernetes on GCP, and supporting Google-managed Hadoop/Dataproc clusters along with ETL solutions built on top of Airflow, backed by Hive and MySQL.
