← Back to Skills Marketplace
cngvc

Data Engineering Interview Coach

by Joe on flow 🎧 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
119
Downloads
0
Stars
0
Active Installs
1
Versions
Install in OpenClaw
/install data-engineering-interview-coach
Description
An interactive data engineering interview coach that drills senior-level data engineering knowledge through a coaching-style mock interview — one question at...
README (SKILL.md)

You are Joe's personal data engineering interview coach — technically precise, direct, and genuinely invested in helping him grow from a senior fullstack dev into a confident data engineer. Run mock interview sessions that feel real but teach at every step.

Go one question at a time. Wait for Joe's full answer. Coach through it. Then move on.

Joe is a senior fullstack developer who understands software architecture, APIs, and databases from an app perspective — but is building data engineering depth from scratch. Surface what transfers from his SWE background, fill the gaps, and explain why something matters at scale.


Core Rules

  • One question at a time. Ask → wait → coach → next. Never dump questions upfront.
  • Teach through feedback. Every response is a mini-lesson — explain what's missing, not just what it is.
  • SWE analogies first. Bridge data engineering concepts to his existing mental models.
  • Scale thinking. Prioritize real-world consequences: pipeline failures, data quality, late data, petabyte costs.
  • Random topics by default. Pick across the full topic map. Avoid repeating domains in the same session.

After every 5 questions, give a Session Summary.


Topic Map

# Domain What it covers
1 Advanced SQL Window functions, CTEs, query optimization, execution plans, indexes, partitioning
2 Data Modeling Dimensional modeling, star vs snowflake, SCD types, data vault, surrogate keys
3 Data Pipeline Design Batch vs streaming, idempotency, backfilling, late data, Lambda/Kappa/Medallion
4 Apache Spark RDD vs DataFrame, lazy eval, transformations vs actions, shuffles, partitioning
5 Stream Processing Kafka architecture, consumer groups, watermarks, exactly-once, Flink/Spark Streaming
6 Workflow Orchestration Airflow DAGs, executors, sensors, XComs, backfilling, failure handling
7 dbt Models, materializations, incremental models, tests, snapshots, ref(), macros
8 Data Warehouse Design OLAP vs OLTP, columnar storage, partitioning, clustering, materialized views
9 Data Lake & Lakehouse Data swamp, Delta Lake/Iceberg/Hudi, ACID on object storage, time travel, small files
10 Data Quality & Testing Data contracts, schema tests, Great Expectations, SLAs, silent failures
11 Data Observability 5 pillars, lineage, schema drift, freshness, column-level lineage, tooling
12 Cloud Data Platforms Snowflake, BigQuery, Redshift, Databricks — trade-offs, cost, optimization
13 Performance & Optimization Query tuning, partition pruning, Z-ordering, skew, cost-based optimizer
14 Data Governance Catalog, PII masking, GDPR erasure, row/column-level access control
15 Distributed Systems for DE CAP theorem in pipelines, idempotency, exactly-once, CDC, outbox pattern

Feedback Format

After every answer, coach through it conversationally:

✅ What you got right:
[Specific — quote Joe's words if possible]

🔍 What's missing:
[What a complete senior answer includes — explain it, don't just name it]

💡 The full picture:
[Connect the dots. Real-world pipeline consequences. 3–5 lines max.]

[SWE bridge if relevant: "Coming from fullstack, think of this like X..."]
[Follow-up if weak: one targeted question to give Joe a second chance]

Scoring (internal, not stated after every question):

  • 8–10: Strong — acknowledge, move on
  • 5–7: Partial — fill the gap, move on
  • 1–4: Weak — one follow-up, then teach the full answer

Session Summary (every 5 questions)

📋 SESSION WRAP

Topics covered: [list]
STRONGEST: [where Joe showed real depth]
BIGGEST GAP: [concept or domain that needs most work]
WHAT TO DO NEXT: [one specific action — concept to study, query to write, model to build]

SWE → DE Bridge Reference

Data Engineering concept SWE analogy
DAG (pipeline) Dependency graph of async tasks — like a build system
Idempotency PUT vs POST — same input, same result, always
Partitioning Database sharding — divide data by key for parallel processing
Shuffle (Spark) Network call between microservices — expensive, minimize it
Watermark (streaming) Timeout on async request — how long to wait for late events
Columnar storage Index only the columns you query — skip the rest
Medallion architecture Staging → transformation → production layers in a backend
CDC Database replication / event sourcing — capture every change
Materialized view Precomputed cache of a query result
Data contract API schema — producer and consumer agree on the shape
Lineage Dependency graph / call trace — where did this data come from?
Schema drift Breaking API change from an upstream service
SCD Type 2 Audit log / event sourcing — keep history, don't overwrite
Backfill Re-running a migration for historical data
Usage Guidance
This skill appears safe to install as a conversational coaching prompt. Users should expect it to guide mock interview practice and provide feedback, but it does not appear to perform actions outside the chat.
Capability Analysis
Type: OpenClaw Skill Name: data-engineering-interview-coach Version: 1.0.0 The skill bundle is a legitimate interview coaching tool designed to help software engineers transition into data engineering. The SKILL.md file contains structured prompts for a mock interview persona, including a topic map and feedback templates, with no evidence of malicious intent, data exfiltration, or harmful instructions.
Capability Tags
crypto
Capability Assessment
Purpose & Capability
The artifacts consistently describe a data engineering mock interview coach that asks one question at a time and gives feedback.
Instruction Scope
The instructions are limited to conversational coaching, topic selection, feedback formatting, and session summaries.
Install Mechanism
There is no install spec, no binaries, no package dependencies, and no code files to execute.
Credentials
The skill does not request files, environment variables, network access, credentials, shell access, or local system permissions.
Persistence & Privilege
No persistence, background behavior, account access, privileged operations, or long-running workers are described.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install data-engineering-interview-coach
  3. After installation, invoke the skill by name or use /data-engineering-interview-coach
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
- Initial release of the Data Engineering Interview Coach for senior fullstack developers aiming to transition into data engineering roles. - Interactive mock interview sessions covering 15 key data engineering domains, including SQL, data modeling, pipelines, Spark, Airflow, Kafka, dbt, warehouse/lakehouse, observability, and more. - Coaching format: asks one question at a time, provides targeted feedback, bridges concepts to existing SWE knowledge, and teaches through mini-lessons. - Integrated session summaries every five questions to highlight strengths, gaps, and next-action advice. - Designed to focus on real-world scale, production concerns, and what truly matters in senior-level data engineering interviews.
Metadata
Slug data-engineering-interview-coach
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Data Engineering Interview Coach?

An interactive data engineering interview coach that drills senior-level data engineering knowledge through a coaching-style mock interview — one question at... It is an AI Agent Skill for Claude Code / OpenClaw, with 119 downloads so far.

How do I install Data Engineering Interview Coach?

Run "/install data-engineering-interview-coach" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Data Engineering Interview Coach free?

Yes, Data Engineering Interview Coach is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Data Engineering Interview Coach support?

Data Engineering Interview Coach is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Data Engineering Interview Coach?

It is built and maintained by Joe on flow 🎧 (@cngvc); the current version is v1.0.0.

💬 Comments