The DP-203T00: Data Engineering on Microsoft Azure course is designed to impart the knowledge and skills necessary to design and implement data engineering solutions on Azure. In this course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.
Learners gain practical experience through labs that reinforce the lessons, such as querying data with serverless SQL pools, performing data engineering with Azure Synapse Apache Spark pools, and implementing real-time analytics with Azure Stream Analytics. The course also covers security and compliance through modules on end-to-end security, and explores Power BI integration for reporting and integrated machine learning processes in Azure Synapse Analytics. By the end of this course, participants will have a strong foundation in data engineering practices, enabling them to build scalable and secure data solutions in the cloud.
Audience Profile:
This course is tailored for data professionals, data architects, and business intelligence experts who aim to build and manage data engineering solutions and analytical platforms using Microsoft Azure. It is also relevant for data analysts and data scientists who work with analytical solutions built on Azure.
At Course Completion:
After completing this course, students will be able to:
- Explore Azure's compute and storage options for data engineering workloads.
- Design and implement the serving layer for data solutions.
- Understand considerations for effective data engineering.
- Execute interactive queries using serverless SQL pools.
- Explore, transform, and load data into the Data Warehouse with Apache Spark.
- Perform data exploration and transformation in Azure Databricks.
- Ingest and load data into the Data Warehouse.
- Transform data using Azure Data Factory or Azure Synapse Pipelines.
- Integrate data from notebooks with Azure Data Factory or Azure Synapse Pipelines.
- Optimize query performance with dedicated SQL pools in Azure Synapse.
- Analyze and optimize data warehouse storage.
- Support Hybrid Transactional Analytical Processing (HTAP) with Azure Synapse Link.
- Implement end-to-end security with Azure Synapse Analytics.
- Perform real-time stream processing with Stream Analytics.
- Create a stream processing solution with Event Hubs and Azure Databricks.
- Build reports using Power BI integration with Azure Synapse Analytics.
- Conduct integrated machine learning processes within Azure Synapse Analytics.
Prerequisites:
Course Outline:
Module 1: Explore Compute and Storage Options for Data Engineering Workloads
Lessons:
- Introduction to Azure Synapse Analytics
- Azure Databricks and Delta Lake Architecture
- Azure Data Lake Storage
- Azure Stream Analytics for Data Streams
Lab:
After Completion:
Module 2: Design and Implement the Serving Layer
Lessons:
Lab:
After Completion:
Module 3: Data Engineering Considerations for Source Files
Lessons:
Lab:
After Completion:
Module 4: Run Interactive Queries Using Serverless SQL Pools
Lessons:
Lab:
After Completion:
Module 5: Explore, Transform, and Load Data into the Data Warehouse Using Apache Spark
Lessons:
Lab:
After Completion:
Module 6: Data Exploration and Transformation in Azure Databricks
Lessons:
Lab:
After Completion:
Module 7: Ingest and Load Data into the Data Warehouse
Lessons:
Lab:
After Completion:
Module 8: Transform Data with Azure Data Factory or Azure Synapse Pipelines
Lessons:
Lab:
After Completion:
Module 9: Orchestrate Data Movement and Transformation in Azure Synapse Pipelines
Lessons:
Lab:
After Completion:
Module 10: Optimize Query Performance with Dedicated SQL Pools
Lessons:
Lab:
After Completion:
Module 11: Analyze and Optimize Data Warehouse Storage
Lessons:
Lab:
After Completion:
Module 12: Support Hybrid Transactional Analytical Processing (HTAP) with Azure Synapse Link
Lessons:
Lab:
After Completion:
Module 13: End-to-End Security with Azure Synapse Analytics
Lessons:
Lab:
After Completion:
Module 14: Real-Time Stream Processing with Stream Analytics
Lessons:
Lab:
After Completion:
Module 15: Create a Stream Processing Solution with Event Hubs and Azure Databricks
Lessons:
Lab:
After Completion: