Senior Data Engineer

Production-ready data engineering skill for designing ETL/ELT and real-time streaming pipelines, data quality, and pipeline performance optimization.

triggers: build etl pipeline, streaming pipeline, data quality, generate airflow dag, kafka config, optimize spark


What it does

Provides a comprehensive, production-focused data engineering toolkit: Airflow DAG generation, streaming job scaffolds (Flink/Spark), Kafka config generators, data quality validation, and performance analysis. It helps build end-to-end ETL/ELT pipelines and real-time streaming architectures with monitoring and CI/CD integration.
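As an illustration of what a Kafka config generator might produce, here is a minimal sketch. It is not the skill's kafka_config_generator.py, whose interface is not shown on this page. The function names and the "durability"/"throughput" profile names are hypothetical; the keys themselves are standard Kafka producer settings.

```python
from typing import Dict

# Hypothetical profiles (an assumption for this sketch); the keys are
# standard Kafka producer configuration names.
_PROFILES: Dict[str, Dict[str, str]] = {
    "durability": {
        "acks": "all",                      # wait for all in-sync replicas
        "enable.idempotence": "true",       # avoid duplicates on retry
        "max.in.flight.requests.per.connection": "5",
        "compression.type": "lz4",
        "linger.ms": "5",
    },
    "throughput": {
        "acks": "1",                        # leader acknowledgement only
        "enable.idempotence": "false",
        "compression.type": "lz4",
        "linger.ms": "50",                  # larger batches, more latency
        "batch.size": "131072",
    },
}


def producer_config(bootstrap_servers: str, profile: str = "durability") -> Dict[str, str]:
    """Return a producer config dict for the given (hypothetical) profile."""
    if profile not in _PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    config = {"bootstrap.servers": bootstrap_servers}
    config.update(_PROFILES[profile])
    return config


def to_properties(config: Dict[str, str]) -> str:
    """Render a config dict as sorted key=value lines (.properties style)."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


if __name__ == "__main__":
    print(to_properties(producer_config("localhost:9092")))
```

Keeping profiles as plain dictionaries makes the trade-off between durability and throughput explicit and easy to review in version control.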

When to use it

Use this skill when you need to design or implement production data pipelines (batch or streaming), add automated data quality checks, scaffold streaming jobs (Flink/Spark), generate Kafka configurations, or profile and optimize pipeline performance. It is well suited to enterprise ETL, data lake and data warehouse builds, and production ML data infrastructure.

What's included

Scripts: pipeline_orchestrator.py, data_quality_validator.py, stream_processor.py, kafka_config_generator.py, etl_performance_optimizer.py, streaming_quality_validator.py
References: frameworks, templates, tools documentation
Instructions: YAML-driven pipeline generation, Airflow DAG templates, dbt model guidance, data-quality rule patterns, streaming validation and monitoring workflows.
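To make "data-quality rule patterns" concrete, here is a minimal sketch of declarative rules applied to rows. It is an assumption about what such patterns can look like, not the skill's data_quality_validator.py; Rule, not_null, in_range, and validate are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A named check applied to one column of every row (hypothetical type)."""
    name: str
    column: str
    check: Callable[[Any], bool]


def not_null(column: str) -> Rule:
    """Rule that fails when the column is missing or None."""
    return Rule(f"{column}_not_null", column, lambda value: value is not None)


def in_range(column: str, low: float, high: float) -> Rule:
    """Rule that fails when the value is None or outside [low, high]."""
    return Rule(
        f"{column}_in_range",
        column,
        lambda value: value is not None and low <= value <= high,
    )


def validate(rows: List[Row], rules: List[Rule]) -> Dict[str, List[int]]:
    """Return a mapping of rule name to the indexes of rows that failed it."""
    failures: Dict[str, List[int]] = {}
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row.get(rule.column)):
                failures.setdefault(rule.name, []).append(index)
    return failures


if __name__ == "__main__":
    sample = [{"id": 1, "amount": 10.0}, {"id": None, "amount": -5.0}]
    print(validate(sample, [not_null("id"), in_range("amount", 0, 1000)]))
```

Returning row indexes per rule, rather than a single pass/fail flag, lets a pipeline decide whether to quarantine the failing rows or abort the whole run.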

Compatible agents

Best suited to agents that can run Python tooling and integrate with Git/CI systems (Copilot/Code assistants, Claude Code/Cursor), and that can operate with containerized or cloud data stacks.

Not yet audited

This skill has not been reviewed by our automated audit pipeline yet.

Information

Repository: skillsbench
Stars: 1,301

Related Skills

OpenTestAI

Automated, high-confidence AI testing: bug detection, persona feedback, and prioritized test-case generation using many specialized tester profiles.

SSE Stream Resilience

Provide robust server-sent-events (SSE) stream management with Redis-backed registry, heartbeat monitoring, completion persistence, and background guardian clea

Go Data Structures

Authoritative guidance on choosing and using Go built-in and standard-library data structures, with practical best practices for slices, maps, arrays, container

React Development Expert

Provides authoritative React guidance on hooks, state patterns, Server Components, performance optimization, and common architectural patterns.

Code Reviewer

Perform structured, prioritized code reviews that find correctness, security, performance, reliability, and testing issues and provide concrete fix suggestions.

dotLottie Web

Guidelines and patterns for implementing performant dotLottie/Lottie animations on the web (vanilla JS and React), including workers, state machines, and themin

Party Engine Skill

Guidance and examples for using the @cazala/party particle engine (engine lifecycle, modules, WebGPU vs CPU patterns) in custom apps.

Party Skill

Programmatic guide for the @cazala/party library: engine setup, modules, particle APIs, and performance tips for WebGPU and CPU runtimes.
