
from skillsbench (1,301)
Production-ready data engineering skill for designing ETL/ELT and real-time streaming pipelines, validating data quality, and optimizing pipeline performance.
This skill provides a production-focused data engineering toolkit: Airflow DAG generation, streaming job scaffolds (Flink/Spark), Kafka config generators, data quality validation, and performance analysis. It helps you build end-to-end ETL/ELT pipelines and real-time streaming architectures with monitoring and CI/CD integration.
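To make the data quality validation piece concrete, here is a minimal sketch of rule-based row checks in plain Python. It is not the skill's actual validator: `CheckResult`, `run_checks`, the sample rows, and the rule names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


@dataclass
class CheckResult:
    """Outcome of one named rule applied to every row (hypothetical type)."""
    name: str
    passed: bool
    failed_rows: int


def run_checks(rows: Iterable[Row], checks: Dict[str, Callable[[Row], bool]]) -> List[CheckResult]:
    """Apply each predicate to every row and count the rows that violate it."""
    materialized = list(rows)  # allow generators to be scanned once per rule
    results = []
    for name, predicate in checks.items():
        failed = sum(1 for row in materialized if not predicate(row))
        results.append(CheckResult(name=name, passed=failed == 0, failed_rows=failed))
    return results


# Illustrative sample data, not taken from any real pipeline.
rows = [
    {"order_id": 1, "amount": 19.99, "currency": "USD"},
    {"order_id": 2, "amount": -5.00, "currency": "USD"},
    {"order_id": None, "amount": 7.50, "currency": "EUR"},
]

checks = {
    "order_id_not_null": lambda r: r["order_id"] is not None,
    "amount_non_negative": lambda r: r["amount"] >= 0,
    "currency_known": lambda r: r["currency"] in {"USD", "EUR"},
}

results = run_checks(rows, checks)
for result in results:
    status = "PASS" if result.passed else "FAIL"
    print(f"{status} {result.name} (failed rows: {result.failed_rows})")
```

In a pipeline, a step like this would typically run after extraction and fail the task (or route rows to quarantine) when any rule reports failures; which of those fits depends on how strict the downstream consumers are.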
Use this skill when you need to design or implement production data pipelines (batch or streaming), add automated data quality checks, scaffold Flink or Spark streaming jobs, generate Kafka configurations, or profile and optimize pipeline performance. It is a good fit for enterprise ETL, data lake and warehouse builds, and production ML data infrastructure.
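As an illustration of what generating a Kafka configuration can look like, the sketch below builds a producer config and renders it in `.properties` form. The helper names `producer_config` and `to_properties` are hypothetical, and the numeric values are illustrative starting points rather than recommendations from the skill; the keys themselves are standard Kafka producer settings.

```python
from typing import Dict, List, Union

ConfigValue = Union[str, int]


def producer_config(bootstrap_servers: List[str], *, durable: bool = True) -> Dict[str, ConfigValue]:
    """Build a Kafka producer config dict (hypothetical helper, illustrative values)."""
    config: Dict[str, ConfigValue] = {
        "bootstrap.servers": ",".join(bootstrap_servers),
        "compression.type": "lz4",  # trade a little CPU for less network traffic
        "linger.ms": 20,            # wait briefly so batches can fill
        "batch.size": 65536,        # bytes per partition batch
    }
    if durable:
        # Favor delivery guarantees: wait for all in-sync replicas, dedupe retries.
        config.update({
            "acks": "all",
            "enable.idempotence": "true",
            "max.in.flight.requests.per.connection": 5,
        })
    else:
        # Favor latency: the partition leader's acknowledgement is enough.
        config["acks"] = "1"
    return config


def to_properties(config: Dict[str, ConfigValue]) -> str:
    """Render the config as sorted key=value lines for a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


# Broker addresses are placeholders.
config = producer_config(["broker-1:9092", "broker-2:9092"])
properties_text = to_properties(config)
print(properties_text)
```

Keeping the durability choice behind a single flag makes the trade-off explicit at the call site: `durable=True` suits pipelines where lost or duplicated records are costly, while `durable=False` suits high-volume telemetry where occasional loss is acceptable.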
Best suited to agents that can run Python tooling and integrate with Git and CI systems (for example, Copilot-style code assistants, Claude Code, or Cursor), and that can operate against containerized or cloud data stacks.
This skill has not been reviewed by our automated audit pipeline yet.