
from NVIDIA Skills
Reference guide to Megatron-LM CI/CD: pipeline structure, PR scope labels, triggering internal CI, and steps for investigating CI failures.
Documents the CI/CD workflows used by NVIDIA's Megatron-LM project. The skill explains the main GitHub Actions workflow structure, the decision tree for PR labels that control test scopes and repeat counts, how images are pushed to registries, and practical commands for triggering internal CI and locating pipeline logs. It also provides procedures for investigating failures and correlating them back to PR changes.
Use when debugging failed CI runs, deciding which labels to attach when opening a PR, triggering internal pipelines without UI access, or understanding how CI stages map to test scope and container images. Ideal for maintainers, CI engineers, and contributors working on model training code where test scope selection matters.
tools/trigger_internal_ci.py and usage examples for triggering internal GitLab CI.
gh to view PR metadata and runs, and guidance for locating and reading CI artifacts and logs.
Inferred: developer-facing agents with shell access and GitHub CLI capability (gh, bash), and agents used by maintainers for CI automation.
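As a rough illustration of the gh usage mentioned above, the following Python sketch builds the GitHub CLI commands for reading PR metadata, listing workflow runs on a branch, and fetching the logs of failed steps. It is not taken from the skill itself: the repository slug `NVIDIA/Megatron-LM`, the helper names, and the chosen JSON fields are assumptions made for this example, and the flags of tools/trigger_internal_ci.py are deliberately not shown because the listing does not document them.

```python
import json
import subprocess

# Assumption: the upstream GitHub repository slug for Megatron-LM.
REPO = "NVIDIA/Megatron-LM"


def pr_metadata_cmd(pr_number: int) -> list[str]:
    """Build a `gh pr view` command returning title, labels, and head branch as JSON."""
    return ["gh", "pr", "view", str(pr_number), "--repo", REPO,
            "--json", "title,labels,headRefName"]


def run_list_cmd(branch: str, limit: int = 10) -> list[str]:
    """Build a `gh run list` command for recent workflow runs on a branch."""
    return ["gh", "run", "list", "--repo", REPO, "--branch", branch,
            "--limit", str(limit),
            "--json", "databaseId,workflowName,status,conclusion"]


def failed_log_cmd(run_id: int) -> list[str]:
    """Build a `gh run view` command that prints only the logs of failed steps."""
    return ["gh", "run", "view", str(run_id), "--repo", REPO, "--log-failed"]


def failed_runs(runs: list[dict]) -> list[dict]:
    """Keep only the runs whose conclusion is 'failure'."""
    return [run for run in runs if run.get("conclusion") == "failure"]


def gh_json(cmd: list[str]) -> object:
    """Run a gh command that emits JSON and parse its output.

    Requires an installed and authenticated `gh`; raises CalledProcessError
    if the command exits with a non-zero status.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


def investigate(pr_number: int) -> None:
    """Hypothetical helper: print failed-step logs for each failed run of a PR."""
    pr = gh_json(pr_metadata_cmd(pr_number))
    print("labels:", [label["name"] for label in pr["labels"]])
    runs = gh_json(run_list_cmd(pr["headRefName"]))
    for run in failed_runs(runs):
        subprocess.run(failed_log_cmd(run["databaseId"]), check=False)
```

Calling `investigate` with a PR number prints the PR's labels, which is where the scope labels would show up, and then the failed-step logs of each failed run on the PR's head branch. Keeping the command builders separate from execution makes them easy to inspect or log before anything is run.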
This skill has not been reviewed by our automated audit pipeline yet.