
#020: What to Learn First as a Data Engineer

Nov 19, 2022

The biggest hurdle to data engineering is getting started.

But every day it feels like there’s a new tool, technology or best practice.

This whirlwind of information can make learning feel impossible.

To help you focus, today I'll share 3 foundational topics to start with:

  1. Database objects
  2. SQL
  3. Reporting

Database objects aren’t flashy, but can’t be skipped

Many eager engineers jump into “modern” tools without database fundamentals.

But the gap becomes painfully obvious once troubleshooting is required.

It’s like trying to play a sport without knowing the positions.

Sure, you can still play the data game, but you’ll only get so far.

 

Example: Learn how a database, a schema and a table differ, then move on to indexes, constraints & roles.
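Here is a minimal sketch of those objects, assuming Python's built-in sqlite3 module so it runs with no setup. The `sales` schema, `customers` table and index name are hypothetical. SQLite has no roles, and an attached database only approximates a schema, so treat this as a warm-up before trying the same thing in a server database such as PostgreSQL or Snowflake.

```python
import sqlite3

# The database: an in-memory SQLite database, chosen only for convenience.
conn = sqlite3.connect(":memory:")

# The schema: SQLite has no CREATE SCHEMA, so an attached database
# stands in as the namespace here (an assumption of this sketch).
conn.execute("ATTACH DATABASE ':memory:' AS sales")

# The table, with constraints: PRIMARY KEY, NOT NULL and UNIQUE.
conn.execute("""
    CREATE TABLE sales.customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE,
        country     TEXT NOT NULL
    )
""")

# The index: speeds up lookups that filter on country.
conn.execute("CREATE INDEX sales.idx_customers_country ON customers (country)")

insert_sql = "INSERT INTO sales.customers (email, country) VALUES (?, ?)"
conn.execute(insert_sql, ("ana@example.com", "US"))

# The UNIQUE constraint should reject a second row with the same email.
try:
    conn.execute(insert_sql, ("ana@example.com", "CA"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# List the objects that now live in the "sales" namespace.
catalog_sql = "SELECT name FROM sales.sqlite_master WHERE type = ?"
tables = [row[0] for row in conn.execute(catalog_sql, ("table",))]
indexes = [row[0] for row in conn.execute(catalog_sql, ("index",))]
```

Querying the catalog at the end is the habit worth keeping: when something breaks, knowing where to look up tables, indexes and constraints is half the troubleshooting.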

 

SQL is still the most important language

Python, data streaming and automation are sexy topics.

But eventually you’ll need to query a database.

My advice: always master SQL first.

You’ll be better prepared, technically & mentally, for whatever tools come next.

 

Example: Write a query, turn it into a function and/or stored procedure.
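One way to practice this, sketched with Python's built-in sqlite3 module. SQLite has no stored procedures, so a parameterized Python function stands in for one here. The `orders` table, its rows and the `customer_totals` function are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ana", 20.0), ("ana", 30.0), ("ben", 15.0)],  # hypothetical sample rows
)

# Step 1: write the query. The "?" placeholder is the one value
# that changes between runs.
TOTALS_SQL = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE amount >= ?
    GROUP BY customer
    ORDER BY customer
"""


# Step 2: wrap it in a function so callers pass a parameter
# instead of editing the SQL text.
def customer_totals(connection, min_amount=0.0):
    """Return (customer, total) rows for orders of at least min_amount."""
    return connection.execute(TOTALS_SQL, (min_amount,)).fetchall()


all_totals = customer_totals(conn)
large_only = customer_totals(conn, min_amount=20.0)
```

In a server database the same step would usually be a `CREATE FUNCTION` or `CREATE PROCEDURE` statement. The idea carries over: name the query, expose its parameters, and reuse it.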

 

Reporting is how most users interact with data

Great products are built with the user in mind.

Hard work is wasted if it doesn’t serve the people using the result.

In data, that means learning the basics of reporting.

It’ll round out your skills and help you appreciate the underlying engineering.

 

Example: Create a Tableau dashboard and collect user feedback.

 

Build a database, write SQL and create a report.

Establish this foundation before filling in the (never-ending) gaps.
