
from xiaotianfotos
Multi-engine AI creative toolchain: TTS, ASR, image generation (ComfyUI), and subtitle/video cut tooling.
- TTS: opc tts and opc say with multiple engines (edge-tts, qwen), voice design and cloning modes, rate/pitch controls.
- ASR: robust 4-stage pipeline (ASR → forced alignment → sentence breaking → CSV fix → render) with SRT/ASS/JSON outputs and resume/fix features.
- Image generation: ComfyUI integrations, PromptKG skeletons, templates and analysis commands.
- Cut: subtitle-timestamp-driven video editing and web UI for manual adjustments.
- Dashboard and configuration management for local workspaces.

## Typical prompts / usage
- "Generate Chinese TTS using edge-tts with faster rate."
- "Transcribe audio.mp3 and output SRT and ASS subtitles."
- "Run prompt-kg planning and generate an image with ernie-turbo."
- "Launch the cut dashboard for video.mp4."

## Notes
- Cross-platform: CUDA on Linux, MLX on macOS.
- Configuration stored in ~/.opc_cli/opc/config.json.
- Provides helper commands for model source selection and local model cache.

OPC is a sophisticated digraph-based agent orchestration skill that decomposes tasks into multi-role pipelines with independent evaluation gates. SKILL.md is exceptionally detailed with clear flow templates, execution protocols, and harness command reference. The two bundled scripts are supplementary: a Node postinstall hook and a Python verifier for devil's-advocate output. The verifier script ran cleanly but requires a file argument to operate. No security concerns beyond the auto-install postinstall hook.
Well-architected skill with strong separation of concerns. The digraph flow engine with cycle limits, escape hatches, and state recovery is impressive. Postinstall auto-exec is the main security note.
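
For readers unfamiliar with the pattern, a cycle-limited digraph flow with evaluation gates can be sketched roughly as follows. This is not OPC's actual engine: `run_flow`, the node names, the verdict labels, and the `max_visits` parameter are all hypothetical, chosen only to illustrate how a gate can send work back to an earlier role while a visit counter acts as the escape hatch.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical flow description: each node maps a gate verdict to the next node.
Edges = Dict[str, Dict[str, str]]
# A gate receives how many times its node has been visited and returns a verdict.
Gate = Callable[[int], str]


def run_flow(edges: Edges, gates: Dict[str, Gate], start: str, end: str,
             max_visits: int = 3) -> Tuple[List[str], bool]:
    """Walk the digraph from start to end.

    Returns (path, escaped). escaped is True when some node would exceed
    max_visits, in which case the walk stops early instead of looping forever.
    """
    visits: Dict[str, int] = {}
    path: List[str] = []
    node = start
    while node != end:
        visits[node] = visits.get(node, 0) + 1
        if visits[node] > max_visits:
            return path, True  # escape hatch: cycle limit exceeded
        path.append(node)
        verdict = gates[node](visits[node])
        node = edges[node][verdict]
    path.append(end)
    return path, False


# Illustrative two-role pipeline: a builder followed by a reviewing gate.
edges = {
    "build": {"pass": "review"},
    "review": {"pass": "done", "fail": "build"},
}
gates = {
    "build": lambda n: "pass",
    # The reviewer rejects the first attempt and accepts the second.
    "review": lambda n: "pass" if n >= 2 else "fail",
}

path, escaped = run_flow(edges, gates, "build", "done")
print(path, escaped)
```

In this sketch a failing review loops back to the build step, and a reviewer that never passes would trip the visit limit and return with `escaped` set rather than cycling indefinitely. State recovery is out of scope here.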