This document provides learning resources to help with training at an intermediate level. The list is not exhaustive; it is intended to support learning some of the core data engineering concepts for that level. We have included a variety of resources, from articles to online courses, to help you progress towards completing these learning objectives. At the end we have also listed optional certifications you can pursue to consolidate your knowledge. Please send any comments, feedback, or reports of missing or broken links to the cop-data Slack channel.
If you enjoyed using these learning paths or have feedback, please use this feedback form.
If you want to explore beyond what is in this document, please see these further resources: Data Wiki, Awesome Data Engineering
Learn Python Programming Masterclass (course)
Test Driven Development: By Example - Kent Beck (book)
Python Cookbook: Recipes for Mastering Python 3 - David Beazley & Brian K Jones (book)
Advanced SQL for Query Tuning and Performance Optimization (course)
Implements and guides best practice, coaching others in designing, coding, testing, correcting and documenting moderate-to-complex programs and scripts.
Data Engineering on Azure - Vlad Riscutia (book)
Spark - The Definitive Guide: Big data processing made simple (book)
Creates complex, automated data architecture from scratch, owning deployment of cloud infrastructure to provide DevOps automation with a standard and centralized platform for testing, deployment, and production.
Azure Learning Path DevOps (course)
Owns and implements the entire data pipeline, leveraging integrations which connect the orchestration solution to each data tool.
Data Pipelines Pocket Reference: Moving and Processing Data for Analytics - James Densmore (book)
Building (Better) Data Pipelines with Apache Airflow (video)
Data as Code — Achieving Zero Production Defects for Analytics Datasets (article)
Employs in-depth knowledge of a chosen IaC language to design, implement, and deploy application infrastructure with software best practices, leveraging knowledge and skills to educate clients.
AWS CloudFormation Step by Step: Intermediate to Advanced (course)
Uses basic knowledge of ML/NLP to incorporate data science needs into construction of data platforms, identifying opportunities to work with data scientists.
Machine learning workflow (website)
An Overview of the End-to-End Machine Learning Workflow (website)
Data Pipelines Vs. ML Pipelines – Similarities and Differences part 1 (article)
Data Pipelines Vs. ML Pipelines – Similarities and Differences part 2 (article)
Creates interactive dashboards using cloud-based analysis services to extract and visualise interactive data.
Python Plotly Tutorial (website)
Tableau in Two Minutes - Tableau Basics for Beginners (video)
Introduction to Azure Data Explorer (course)
What is AWS QuickSight (website)
AWS Certified Developer - Associate
AWS Certified Solutions Architect - Associate
AWS Certified Solutions Architect - Professional