Data 102 Learning Path

This document provides learning resources to support training at an intermediate level. The list is not exhaustive; it is intended to help you learn some of the core data engineering concepts we expect at that level. The resources range from articles to online courses and are meant to help you progress towards the learning objectives below. Optional certifications you can pursue to consolidate your knowledge are listed at the end. For comments, feedback, or reports of missing or broken links, please message the cop-data channel on Slack.

If you enjoyed using these learning paths or have feedback, please use this feedback form.

To explore beyond what is in this document, see these further resources: Data Wiki, Awesome Data Engineering.

Applies working knowledge of programming languages and data processing to support the TDD process.

Learn Python Programming Masterclass (course)

Test Driven Development: By Example - Kent Beck (book)

Python Cookbook: Recipes for Mastering Python 3 - David Beazley & Brian K Jones (book)

Advanced SQL for Query Tuning and Performance Optimization (course)

Implements and guides best practice, coaching others in designing, coding, testing, correcting and documenting moderate-to-complex programs and scripts.

Agile Coaching DNA (website)

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems - Martin Kleppmann (book)

Data Engineering on Azure - Vlad Riscutia (book)

Spark - The Definitive Guide: Big data processing made simple (book)

Creates complex, automated data architecture from scratch, owning deployment of cloud infrastructure to provide DevOps automation with a standard and centralized platform for testing, deployment, and production.

Azure Learning Path DevOps (course)

AWS DevOps CI/CD (blog)

Awesome CI/CD Security (repo)

Owns and implements the entire data pipeline, leveraging integrations which connect the orchestration solution to each data tool.

Data Pipelines Pocket Reference: Moving and Processing Data for Analytics - James Densmore (book)

Building (Better) Data Pipelines with Apache Airflow (video)

Data as Code — Achieving Zero Production Defects for Analytics Datasets (article)

Employs in-depth knowledge of a chosen IaC language to design, implement, and deploy application infrastructure with software best practices, leveraging knowledge and skills to educate clients.

AWS CloudFormation Step by Step: Intermediate to Advanced (course)

HashiCorp Learn (website)

Terratest (website)

Uses basic knowledge of ML/NLP to incorporate data science needs into the construction of data platforms, identifying opportunities to work with data scientists.

Machine learning workflow (website)

An Overview of the End-to-End Machine Learning Workflow (website)

Data Pipelines Vs. ML Pipelines – Similarities and Differences part 1 (article)

Data Pipelines Vs. ML Pipelines – Similarities and Differences part 2 (article)

Creates interactive dashboards using cloud-based analysis services to extract and visualise interactive data.

Python Plotly Tutorial (website)

Power BI Beginners (video)

Tableau in Two Minutes - Tableau Basics for Beginners (video)

Introduction to Azure Data Explorer (course)

What is AWS QuickSight (website)

Senior Data Engineer Exams

AWS Certified Developer - Associate

AWS Certified Solutions Architect - Associate

AWS Certified Solutions Architect - Professional

Azure Data Engineer

Databricks Engineer Associate

Databricks Spark Developer Associate

Azure AI Fundamentals