MS-DP3012: Implementing a Data Analytics Solution With Azure Synapse Analytics
Course Code: MS-DP3012
This is a one-day, instructor-led course that teaches learners how to work with dedicated SQL pools, serverless SQL pools, and Apache Spark pools in Azure Synapse Analytics. It also covers data wrangling and the ELT process using Synapse Pipelines, which will feel familiar to anyone who has used Azure Data Factory (ADF), to move data into the Synapse dedicated SQL pool database.
The audience should be familiar with notebooks that use different languages and a Spark engine, such as Databricks, Jupyter, or Zeppelin notebooks. They should also have some experience with SQL, Python, and Azure tools such as Data Factory.
This course does not have any formal prerequisites; however, it is beneficial if participants have familiarity with:
- Notebooks that use different languages and a Spark engine, such as Databricks, Jupyter Notebooks, or Zeppelin Notebooks.
- SQL, Python, and Azure tools such as Data Factory.
After completion of this course, you will be able to:
- Introduction to Azure Synapse Analytics: Understand the features and capabilities of Azure Synapse Analytics, including its architecture and components.
- Data Ingestion and Preparation: Learn how to ingest, prepare, and transform data using Synapse Pipelines and other data integration tools.
- Data Storage and Management: Explore different data storage options, including data lakes and data warehouses, and learn how to manage data effectively.
- Data Processing with Apache Spark: Use Apache Spark within Azure Synapse Analytics to process and analyze large datasets.
- Querying Data with SQL: Utilize serverless SQL pools to query data stored in data lakes and other sources without the need for data movement.
- Building Data Pipelines: Create and manage data pipelines to automate data workflows and ensure data consistency.
- Implementing Security and Compliance: Learn best practices for securing data and ensuring compliance with industry standards.
- Performance Optimization: Optimize the performance of data analytics solutions to handle large-scale data processing efficiently.
There is no associated certification or exam for this course.
Modules
Learn about the features and capabilities of Azure Synapse Analytics - a cloud-based platform for big data processing and analysis.
Lessons
- Introduction.
- What is Azure Synapse Analytics.
- How Azure Synapse Analytics works.
- When to use Azure Synapse Analytics.
- Exercise - Explore Azure Synapse Analytics.
- Knowledge check.
- Summary.
By the end of this module, you'll be able to:
- Identify the business problems that Azure Synapse Analytics addresses.
- Describe core capabilities of Azure Synapse Analytics.
- Determine when to use Azure Synapse Analytics.
With Azure Synapse serverless SQL pool, you can leverage your SQL skills to explore and analyze data in files, without the need to load the data into a relational database.
Lessons
- Introduction.
- Understand Azure Synapse serverless SQL pool capabilities and use cases.
- Query files using a serverless SQL pool.
- Create external database objects.
- Exercise - Query files using a serverless SQL pool.
- Knowledge check.
- Summary.
By the end of this module, you'll be able to:
- Identify capabilities and use cases for serverless SQL pools in Azure Synapse Analytics.
- Query CSV, JSON, and Parquet files using a serverless SQL pool.
- Create external database objects in a serverless SQL pool.
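To illustrate the kind of query this module teaches, the following sketch runs a serverless SQL pool query over Parquet files in a data lake from Python. The workspace endpoint, database, and storage path are placeholder assumptions, not values from the course.

```python
# Minimal sketch: query data lake files through a serverless SQL pool from Python.
# The server name, database, and storage path are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace-ondemand.sql.azuresynapse.net;"  # serverless SQL endpoint (assumed)
    "Database=master;"
    "Authentication=ActiveDirectoryInteractive;"
    "Encrypt=yes;"
)

# OPENROWSET lets the serverless pool read files in place, with no data movement.
query = """
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://mydatalake.dfs.core.windows.net/files/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS sales;
"""

for row in conn.execute(query):
    print(row)
```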
Apache Spark is a core technology for large-scale data analytics. Learn how to use Spark in Azure Synapse Analytics to analyze and visualize data in a data lake.
Lessons
- Introduction.
- Get to know Apache Spark.
- Use Spark in Azure Synapse Analytics.
- Analyze data with Spark.
- Visualize data with Spark.
- Exercise - Analyze data with Spark.
- Knowledge check.
- Summary.
By the end of this module, you'll be able to:
- Identify core features and capabilities of Apache Spark.
- Configure a Spark pool in Azure Synapse Analytics.
- Run code to load, analyze, and visualize data in a Spark notebook.
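As a rough illustration of what the exercise involves, the sketch below follows the flow of a Spark notebook in Azure Synapse Analytics: load files from the data lake into a dataframe, aggregate them, and chart the result. The storage path and column names are assumptions for illustration only.

```python
# Sketch of a Synapse Spark notebook cell; `spark` is the session provided by the notebook.
df = spark.read.load(
    "abfss://files@mydatalake.dfs.core.windows.net/sales/*.csv",  # hypothetical path
    format="csv", header=True, inferSchema=True
)

# Analyze: total revenue per product category.
summary = (df.groupBy("Category")
             .sum("Revenue")
             .withColumnRenamed("sum(Revenue)", "TotalRevenue"))
summary.show()

# Visualize: convert the small aggregated result to pandas and plot it with matplotlib.
import matplotlib.pyplot as plt

pdf = summary.toPandas()
plt.bar(pdf["Category"], pdf["TotalRevenue"])
plt.xlabel("Category")
plt.ylabel("Total revenue")
plt.show()
```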
Delta Lake is an open-source relational storage layer for Spark that you can use to implement a data lakehouse architecture in Azure Synapse Analytics.
Lessons
- Introduction.
- Understand Delta Lake.
- Create Delta Lake tables.
- Create catalog tables.
- Use Delta Lake with streaming data.
- Use Delta Lake in a SQL pool.
- Exercise - Use Delta Lake in Azure Synapse Analytics.
- Knowledge check.
- Summary.
By the end of this module, you'll be able to:
- Describe core features and capabilities of Delta Lake.
- Create and use Delta Lake tables in a Synapse Analytics Spark pool.
- Create Spark catalog tables for Delta Lake data.
- Use Delta Lake tables for streaming data.
- Query Delta Lake tables from a Synapse Analytics SQL pool.
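The following sketch illustrates the core Delta Lake operations listed above in a Synapse Spark pool; the file paths and table name are placeholders.

```python
# Sketch of basic Delta Lake usage in a Synapse Spark notebook (paths are hypothetical).
delta_path = "abfss://files@mydatalake.dfs.core.windows.net/delta/products"

# Save a dataframe in Delta format (Parquet files plus a transaction log).
df = spark.read.load(
    "abfss://files@mydatalake.dfs.core.windows.net/products.csv",
    format="csv", header=True
)
df.write.format("delta").mode("overwrite").save(delta_path)

# Register a catalog table over the same files so it can be queried with SQL.
spark.sql(f"CREATE TABLE IF NOT EXISTS Products USING DELTA LOCATION '{delta_path}'")
spark.sql("SELECT * FROM Products LIMIT 10").show()
```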
Relational data warehouses are a core element of most enterprise Business Intelligence (BI) solutions, and are used as the basis for data models, reports, and analysis.
Lessons
- Introduction.
- Design a data warehouse schema.
- Create data warehouse tables.
- Load data warehouse tables.
- Query a data warehouse.
- Exercise - Explore a data warehouse.
- Knowledge check.
- Summary.
By the end of this module, you'll be able to:
- Design a schema for a relational data warehouse.
- Create fact, dimension, and staging tables.
- Use SQL to load data into data warehouse tables.
- Use SQL to query relational data warehouse tables.
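To make the table-design concepts concrete, here is a hedged sketch of creating a replicated dimension table and a hash-distributed fact table in a dedicated SQL pool, run from Python via pyodbc; every object name and the connection settings are assumptions, and the same T-SQL can be run directly in Synapse Studio.

```python
# Sketch: create star-schema tables in a dedicated SQL pool (all names are hypothetical).
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.sql.azuresynapse.net;"
    "Database=mydedicatedpool;"
    "Authentication=ActiveDirectoryInteractive;Encrypt=yes;",
    autocommit=True,
)

# Small dimension tables are often replicated to every distribution.
conn.execute("""
CREATE TABLE dbo.DimProduct
(
    ProductKey  INT IDENTITY NOT NULL,
    ProductName NVARCHAR(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);
""")

# Large fact tables are typically hash-distributed on a frequently joined key.
conn.execute("""
CREATE TABLE dbo.FactSales
(
    ProductKey   INT NOT NULL,
    OrderDateKey INT NOT NULL,
    SalesAmount  DECIMAL(18,2) NOT NULL
)
WITH (DISTRIBUTION = HASH(ProductKey), CLUSTERED COLUMNSTORE INDEX);
""")
```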
Pipelines are the lifeblood of a data analytics solution. Learn how to use Azure Synapse Analytics pipelines to build integrated data solutions that extract, transform, and load data across diverse systems.
Lessons
- Introduction.
- Understand pipelines in Azure Synapse Analytics.
- Create a pipeline in Azure Synapse Studio.
- Define data flows.
- Run a pipeline.
- Exercise - Build a data pipeline in Azure Synapse Analytics.
- Knowledge check.
- Summary.
In this module, you'll learn how to:
- Describe core concepts for Azure Synapse Analytics pipelines.
- Create a pipeline in Azure Synapse Studio.
- Implement a data flow activity in a pipeline.
- Initiate and monitor pipeline runs.
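Pipelines are normally built and triggered in Synapse Studio, but as a rough sketch of run initiation and monitoring, the snippet below calls the Synapse REST data-plane endpoints from Python. The workspace URL, pipeline name, and API version are assumptions and should be checked against current documentation.

```python
# Hedged sketch: start a pipeline run and check its status via the Synapse REST API.
import requests
from azure.identity import DefaultAzureCredential

workspace = "https://myworkspace.dev.azuresynapse.net"   # hypothetical workspace endpoint
pipeline_name = "Load Sales Data"                        # hypothetical pipeline name

token = DefaultAzureCredential().get_token("https://dev.azuresynapse.net/.default").token
headers = {"Authorization": f"Bearer {token}"}

# Initiate the run.
run = requests.post(
    f"{workspace}/pipelines/{pipeline_name}/createRun",
    params={"api-version": "2020-12-01"},
    headers=headers,
).json()

# Monitor the run (status is typically Queued, InProgress, Succeeded, or Failed).
status = requests.get(
    f"{workspace}/pipelineruns/{run['runId']}",
    params={"api-version": "2020-12-01"},
    headers=headers,
).json()
print(status["status"])
```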
By using a serverless SQL pool in Azure Synapse Analytics, you can use the ubiquitous SQL language to transform data in files in a data lake.
Lessons
- Introduction.
- Transform data files with the CREATE EXTERNAL TABLE AS SELECT statement.
- Encapsulate data transformations in a stored procedure.
- Include a data transformation stored procedure in a pipeline.
- Exercise - Transform files using a serverless SQL pool.
- Knowledge check.
- Summary.
After completing this module, you'll be able to:
- Use a CREATE EXTERNAL TABLE AS SELECT (CETAS) statement to transform data.
- Encapsulate a CETAS statement in a stored procedure.
- Include a data transformation stored procedure in a pipeline.
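The sketch below shows the CETAS pattern end to end from Python: the serverless SQL pool reads source files, transforms them with a SELECT, and writes the result back to the lake as new files. It assumes an external data source and file format have already been created in the database, and all names and paths are illustrative.

```python
# Sketch: transform data lake files with CETAS in a serverless SQL pool (names are hypothetical).
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace-ondemand.sql.azuresynapse.net;"
    "Database=salesdb;"   # serverless database with a data source and file format defined
    "Authentication=ActiveDirectoryInteractive;Encrypt=yes;",
    autocommit=True,
)

conn.execute("""
CREATE EXTERNAL TABLE SalesByCategory
WITH (
    LOCATION = 'transformed/sales_by_category/',
    DATA_SOURCE = sales_data,        -- external data source created beforehand
    FILE_FORMAT = parquet_format     -- external file format created beforehand
)
AS
SELECT Category, SUM(Amount) AS TotalAmount
FROM OPENROWSET(
    BULK 'https://mydatalake.dfs.core.windows.net/files/sales/*.csv',
    FORMAT = 'CSV', PARSER_VERSION = '2.0', HEADER_ROW = TRUE
) AS source
GROUP BY Category;
""")
```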
Why choose between working with files in a data lake or a relational database schema? With lake databases in Azure Synapse Analytics, you can combine the benefits of both.
Lessons
- Introduction.
- Understand lake database concepts.
- Explore database templates.
- Create a lake database.
- Use a lake database.
- Exercise - Analyze data in a lake database.
- Knowledge check.
- Summary.
After completing this module, you'll be able to:
- Understand lake database concepts and components.
- Describe database templates in Azure Synapse Analytics.
- Create a lake database.
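Once a lake database exists, its tables can be queried by name from either engine; the one-line sketch below assumes a hypothetical RetailDB lake database queried from a Spark notebook.

```python
# Sketch: query a lake database table from a Synapse Spark notebook (names are hypothetical).
spark.sql("SELECT * FROM RetailDB.Customer LIMIT 10").show()
```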
Learn how you can set up security when using Azure Synapse serverless SQL pools.
Lessons
- Introduction.
- Choose an authentication method in Azure Synapse serverless SQL pools.
- Manage users in Azure Synapse serverless SQL pools.
- Manage user permissions in Azure Synapse serverless SQL pools.
- Knowledge check.
- Summary.
After completing this module, you'll be able to:
- Choose an authentication method in Azure Synapse serverless SQL pools.
- Manage users in Azure Synapse serverless SQL pools.
- Manage user permissions in Azure Synapse serverless SQL pools.
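As a hedged illustration of user and permission management, the sketch below creates a database user for a Microsoft Entra (Azure AD) identity and grants it query rights in a serverless SQL pool database; the principal, database, and object names are placeholders.

```python
# Sketch: manage users and permissions in a serverless SQL pool database (names are hypothetical).
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace-ondemand.sql.azuresynapse.net;"
    "Database=salesdb;"
    "Authentication=ActiveDirectoryInteractive;Encrypt=yes;",
    autocommit=True,
)

# Create a database user for a Microsoft Entra (Azure AD) identity.
conn.execute("CREATE USER [analyst@contoso.com] FROM EXTERNAL PROVIDER;")

# Grant that user permission to query a specific object.
conn.execute("GRANT SELECT ON OBJECT::dbo.SalesByCategory TO [analyst@contoso.com];")
```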
Data engineers commonly need to transform large volumes of data. Apache Spark pools in Azure Synapse Analytics provide a distributed processing platform that they can use to accomplish this goal.
Lessons
- Introduction.
- Modify and save dataframes.
- Partition data files.
- Transform data with SQL.
- Exercise: Transform data with Spark in Azure Synapse Analytics.
- Knowledge check.
- Summary.
After completing this module, you'll be able to:
- Use Apache Spark to modify and save dataframes.
- Partition data files for improved performance and scalability.
- Transform data with SQL.
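The sketch below mirrors the three objectives above in a Spark notebook: modify a dataframe, write it back partitioned by a derived column, and run the same transformation in SQL. The paths and column names are assumptions.

```python
# Sketch of Spark transformations in a Synapse notebook (paths and columns are hypothetical).
from pyspark.sql.functions import col, year

orders = spark.read.load(
    "abfss://files@mydatalake.dfs.core.windows.net/orders/*.parquet",
    format="parquet"
)

# Modify the dataframe: derive a Year column and drop rows with missing amounts.
transformed = (orders
               .withColumn("Year", year(col("OrderDate")))
               .dropna(subset=["Amount"]))

# Save the result partitioned by Year, producing one folder per year for faster filtered reads.
transformed.write.mode("overwrite").partitionBy("Year").parquet(
    "abfss://files@mydatalake.dfs.core.windows.net/orders_by_year/"
)

# The same transformation expressed in SQL against a temporary view.
orders.createOrReplaceTempView("orders")
spark.sql(
    "SELECT YEAR(OrderDate) AS Year, SUM(Amount) AS Total FROM orders GROUP BY YEAR(OrderDate)"
).show()
```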
A core responsibility for a data engineer is to implement a data ingestion solution that loads new data into a relational data warehouse.
Lessons
- Introduction.
- Load staging tables.
- Load dimension tables.
- Load time dimension tables.
- Load slowly changing dimensions.
- Load fact tables.
- Perform post load optimization.
- Exercise - load data into a relational data warehouse.
- Knowledge check.
- Summary.
After completing this module, you'll be able to:
- Load staging tables in a data warehouse.
- Load dimension tables in a data warehouse.
- Load time dimensions in a data warehouse.
- Load slowly changing dimensions in a data warehouse.
- Load fact tables in a data warehouse.
- Perform post-load optimizations in a data warehouse.
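As a hedged sketch of one common load pattern, the snippet below bulk-loads files into a staging table with COPY INTO and refreshes statistics afterwards; the pool name, table, path, and credential choice are assumptions.

```python
# Sketch: load a staging table in a dedicated SQL pool and perform post-load optimization.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.sql.azuresynapse.net;"
    "Database=mydedicatedpool;"
    "Authentication=ActiveDirectoryInteractive;Encrypt=yes;",
    autocommit=True,
)

# Bulk-load new files from the data lake into a staging table (names and path are hypothetical).
conn.execute("""
COPY INTO dbo.StageSales
FROM 'https://mydatalake.dfs.core.windows.net/files/new_sales/*.parquet'
WITH (FILE_TYPE = 'PARQUET', CREDENTIAL = (IDENTITY = 'Managed Identity'));
""")

# Post-load optimization: keep statistics current so the optimizer has accurate row counts.
conn.execute("UPDATE STATISTICS dbo.StageSales;")
```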
Learn how to manage and monitor Azure Synapse Analytics.
Lessons
- Introduction.
- Scale compute resources in Azure Synapse Analytics.
- Pause compute in Azure Synapse Analytics.
- Manage workloads in Azure Synapse Analytics.
- Use Azure Advisor to review recommendations.
- Use dynamic management views to identify and troubleshoot query performance.
- Knowledge check.
- Summary.
After completing this module, you'll be able to:
- Scale compute resources in Azure Synapse Analytics.
- Pause compute in Azure Synapse Analytics.
- Manage workloads in Azure Synapse Analytics.
- Use Azure Advisor to review recommendations.
- Use Dynamic Management Views to identify and troubleshoot query performance.
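Two of these tasks can also be carried out with T-SQL, as the hedged sketch below shows: inspecting active requests through a dynamic management view and scaling a dedicated SQL pool. The pool name and service objective are placeholders, and pausing or scaling can equally be done from the Azure portal or Synapse Studio.

```python
# Sketch: monitor queries with a DMV and scale a dedicated SQL pool (names are hypothetical).
import pyodbc

pool = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.sql.azuresynapse.net;"
    "Database=mydedicatedpool;"
    "Authentication=ActiveDirectoryInteractive;Encrypt=yes;",
    autocommit=True,
)

# Identify the longest-running requests in the dedicated SQL pool.
for row in pool.execute("""
    SELECT TOP 5 request_id, status, total_elapsed_time, command
    FROM sys.dm_pdw_exec_requests
    ORDER BY total_elapsed_time DESC;
"""):
    print(row.request_id, row.status, row.total_elapsed_time)

# Scale the pool to a different service level (run against the master database).
master = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.sql.azuresynapse.net;"
    "Database=master;"
    "Authentication=ActiveDirectoryInteractive;Encrypt=yes;",
    autocommit=True,
)
master.execute("ALTER DATABASE mydedicatedpool MODIFY (SERVICE_OBJECTIVE = 'DW300c');")
```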
Learn how to approach and implement security to protect your data with Azure Synapse Analytics.
Lessons
- Introduction.
- Understand network security options for Azure Synapse Analytics.
- Configure Conditional Access.
- Configure authentication.
- Manage authorization through column and row level security.
- Exercise - Manage authorization through column and row level security.
- Manage sensitive data with Dynamic Data Masking.
- Implement encryption in Azure Synapse Analytics.
- Knowledge check.
- Summary.
After completing this module, you'll be able to:
- Understand network security options for Azure Synapse Analytics.
- Configure Conditional Access.
- Configure authentication.
- Manage authorization through column and row level security.
- Manage sensitive data with Dynamic Data Masking.
- Implement encryption in Azure Synapse Analytics.
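To make the authorization and masking topics concrete, the hedged sketch below applies dynamic data masking to a column and adds a row-level security policy in a dedicated SQL pool. The tables, columns, and the assumption that region values match database user names are illustrative only.

```python
# Sketch: dynamic data masking and row-level security in a dedicated SQL pool (names are hypothetical).
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.sql.azuresynapse.net;"
    "Database=mydedicatedpool;"
    "Authentication=ActiveDirectoryInteractive;Encrypt=yes;",
    autocommit=True,
)

# Dynamic data masking: hide most of the email address from non-privileged users.
conn.execute("""
ALTER TABLE dbo.DimCustomer
ALTER COLUMN EmailAddress ADD MASKED WITH (FUNCTION = 'email()');
""")

# Row-level security: a predicate function plus a policy that filters rows so each
# user sees only their own region (assumes a SalesRegion column on the fact table).
conn.execute("""
CREATE FUNCTION dbo.fn_RegionFilter(@Region AS NVARCHAR(50))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS allowed WHERE @Region = USER_NAME();
""")
conn.execute("""
CREATE SECURITY POLICY RegionPolicy
ADD FILTER PREDICATE dbo.fn_RegionFilter(SalesRegion) ON dbo.FactSales
WITH (STATE = ON);
""")
```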
Apache Spark provides data engineers with a scalable, distributed data processing platform, which can be integrated into an Azure Synapse Analytics pipeline.
Lessons
- Introduction.
- Understand Synapse Notebooks and Pipelines.
- Use a Synapse notebook activity in a pipeline.
- Use parameters in a notebook.
- Exercise - Use an Apache Spark notebook in a pipeline.
- Knowledge check.
- Summary.
After completing this module, you'll be able to:
- Describe notebook and pipeline integration.
- Use a Synapse notebook activity in a pipeline.
- Use parameters with a notebook activity.
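The sketch below shows how notebook parameters behave: values set in the cell tagged as the parameters cell act as defaults, and the pipeline's notebook activity can override them through its base parameters. The variable names and path are illustrative.

```python
# --- parameters cell (toggle "Parameters" on this cell in Synapse Studio) ---
folder_name = "sales/2024"    # default value; a pipeline notebook activity can override it
load_mode = "incremental"

# --- later cells use the parameter values (path is hypothetical) ---
df = spark.read.load(
    f"abfss://files@mydatalake.dfs.core.windows.net/{folder_name}/*.parquet",
    format="parquet"
)
print(f"Loaded {df.count()} rows in {load_mode} mode")
```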