Runpod Flash provides a fast development cycle for AI workloads. Developers write code locally and run it on remote Runpod GPUs or CPUs via flash dev, which hot-reloads by syncing function bodies as they change. Once the code is ready, flash deploy ships the workload as a stable serverless endpoint.
Use this skill when you need to deploy Python-based AI functions, manage GPU resources (from RTX 4090s to H100s), or set up load-balanced serverless APIs for ML models without the overhead of manual Docker management.
Covers the Endpoint constructor, GPU/CPU instance types, and specific gotchas regarding cloudpickle and module imports.
Intended for agents with shell access and Python capabilities (e.g., Claude Code, Codex, or any ACP harness) that can drive a long-running background process and interact with it via HTTP.
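The requirement that an agent can drive a long-running background process and talk to it over HTTP can be sketched roughly as follows. This is a minimal illustration, not Flash's own API: the helper names (find_free_port, wait_until_ready, run_with_background_server) are hypothetical, and Python's built-in http.server stands in for the dev process because this description does not state which port or URL flash dev exposes.

```python
import socket
import subprocess
import sys
import time
import urllib.request


def find_free_port() -> int:
    """Ask the OS for an unused local TCP port (hypothetical helper)."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def wait_until_ready(url: str, timeout: float = 15.0) -> int:
    """Poll url until it answers, then return the HTTP status code."""
    # Bypass any proxy settings so requests to localhost stay local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=2) as resp:
                return resp.status
        except OSError:
            # Server not accepting connections yet; retry shortly.
            time.sleep(0.2)
    raise TimeoutError(f"{url} did not respond within {timeout}s")


def run_with_background_server(command: list[str], url: str) -> int:
    """Start command in the background, probe url, then shut it down."""
    proc = subprocess.Popen(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return wait_until_ready(url)
    finally:
        # Always stop the background process, even if the probe failed.
        proc.terminate()
        proc.wait(timeout=10)


if __name__ == "__main__":
    port = find_free_port()
    # Stand-in server (assumption): replace with the real dev command
    # and its actual URL once those are known.
    stand_in = [sys.executable, "-m", "http.server", str(port),
                "--bind", "127.0.0.1"]
    print(run_with_background_server(stand_in, f"http://127.0.0.1:{port}/"))
```

An agent harness that cannot keep a child process alive across steps, or cannot issue HTTP requests to localhost, would not be able to follow this pattern.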
This skill has not been reviewed by our automated audit pipeline yet.