vLLM Qwen3 Coder Optimization

Rating: 54 (1 review)
Author: bbuf

Trust Score 54/100

Guided, PR-backed manual for auditing, debugging, and extending the Qwen3 Coder tool parser in vLLM—focuses on schema edge cases, tool-parser regressions, and v

triggers: qwen3, vllm, tool parser, anyOf, schema validation, Responses API, coder tool


What it does

This skill provides a focused, evidence-backed optimization dossier for the Qwen3 Coder tool parser in the vLLM runtime. It documents landed PRs, runtime surfaces, and a validation plan so an agent (e.g., Codex or a code-focused assistant) can audit, diagnose, and patch regressions related to JSON-schema edge cases (anyOf/oneOf), nullable parameters, and Responses API tool calls. The content is built from diffs and PR notes to ensure recommendations are traceable.
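To make the schema edge cases concrete, here is a minimal sketch of how a parser might pick a target type for a raw string argument when the parameter schema uses `anyOf`/`oneOf` or a nullable type list. This is an illustration only, not vLLM's implementation: the helper names `candidate_types` and `coerce_argument` are hypothetical, and the coercion order (null, integer, number, boolean, array/object, then raw string) is an assumption chosen for the example.

```python
import json
from typing import Any


def candidate_types(schema: dict) -> list[str]:
    """Collect every JSON-schema type a parameter may take, in declaration order."""
    found: list[str] = []
    declared = schema.get("type")
    if isinstance(declared, str):
        found.append(declared)
    elif isinstance(declared, list):  # e.g. {"type": ["integer", "null"]}
        found.extend(declared)
    # Walk union branches recursively so nested anyOf/oneOf are also covered.
    for key in ("anyOf", "oneOf"):
        for branch in schema.get(key, []):
            for t in candidate_types(branch):
                if t not in found:
                    found.append(t)
    return found


def coerce_argument(raw: str, schema: dict) -> Any:
    """Convert a raw string argument to the first schema type that fits (hypothetical order)."""
    types = candidate_types(schema)
    text = raw.strip()
    if "null" in types and text.lower() in ("null", "none"):
        return None
    if "integer" in types:
        try:
            return int(text)
        except ValueError:
            pass
    if "number" in types:
        try:
            return float(text)
        except ValueError:
            pass
    if "boolean" in types and text.lower() in ("true", "false"):
        return text.lower() == "true"
    if "array" in types or "object" in types:
        try:
            parsed = json.loads(text)
            if isinstance(parsed, (list, dict)):
                return parsed
        except json.JSONDecodeError:
            pass
    # Nothing matched: keep the original string untouched.
    return raw


nullable_int = {"anyOf": [{"type": "integer"}, {"type": "null"}]}
print(candidate_types(nullable_int))
print(coerce_argument("42", nullable_int), coerce_argument("null", nullable_int))
```

A regression test for a nullable parameter would assert that `"null"` maps to `None` only when the schema actually allows `null`; for a plain `{"type": "string"}` parameter the same input should stay the string `"null"`.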

When to use it

Use this skill when an agent must: reproduce or investigate a regression in vLLM's Qwen3 Coder tool parsing; create or review PRs that change tool-parser behavior; validate tool-call integrity under streaming or speculative decoding; or prepare test lanes that exercise complex schema combinations. It is intended for engineering review, QA automation, and PR triage workflows.
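One way to check tool-call integrity under streaming is to reassemble the argument fragments and confirm the result is valid JSON. The sketch below assumes chunk deltas shaped like OpenAI-style chat-completion streaming (a `tool_calls` list whose entries carry an `index` and a `function` with optional `name` and `arguments` fragments). The helper names `assemble_tool_calls` and `check_integrity` are hypothetical and are not part of vLLM or the skill.

```python
import json


def assemble_tool_calls(deltas: list[dict]) -> dict[int, dict]:
    """Concatenate streamed argument fragments per tool-call index."""
    calls: dict[int, dict] = {}
    for delta in deltas:
        for tc in delta.get("tool_calls") or []:
            slot = calls.setdefault(tc["index"], {"name": "", "arguments": ""})
            fn = tc.get("function") or {}
            if fn.get("name"):
                slot["name"] = fn["name"]
            slot["arguments"] += fn.get("arguments") or ""
    return calls


def check_integrity(calls: dict[int, dict]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the calls look intact."""
    problems: list[str] = []
    for index, call in sorted(calls.items()):
        if not call["name"]:
            problems.append(f"call {index}: missing function name")
        try:
            json.loads(call["arguments"])
        except json.JSONDecodeError as exc:
            problems.append(f"call {index}: arguments are not valid JSON ({exc.msg})")
    return problems


# Hand-written sample deltas standing in for a captured stream.
sample = [
    {"tool_calls": [{"index": 0, "function": {"name": "get_weather", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": "{\"city\": "}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": "\"Paris\"}"}}]},
    {"content": None},
]
print(check_integrity(assemble_tool_calls(sample)))
```

Running the same check on a stream with its final chunk dropped should report a JSON error, which is the kind of truncation a streaming regression would produce.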

What's included

Scripts: none bundled, but the repo includes a references/ directory with PR history and validation notes.
References: yes — canonical PR notes and history mirrors (references/pr-history.md, model-pr-optimization-history/...).
Instructions: a procedural checklist to re-run PR searches, verify the mainline commit, and execute validation lanes focusing on schema extraction and Responses API tool execution.

Compatible agents

Best suited for code-capable agents (Codex-family, GPT-code assistants, Claude Code) and any workflow that can read PR diffs and run validation test lanes.

Audit Summary

Skill references a GitHub path (skills/model-optimization/vllm/vllm-qwen3-coder-optimization/SKILL.md) that does not exist in the repo — the content is inaccessible. No scripts were bundled. Based on metadata alone, it targets vLLM Qwen3 Coder tool-parser debugging and PR auditing, a niche but real use case. The skill appears to have been removed from the source repo or was never properly created at the recorded path.

Watch Out

SKILL.md body not found at recorded GitHub path — skill may have been deleted or path is incorrect
No scripts to test

Notes

Source path skills/model-optimization/vllm/vllm-qwen3-coder-optimization does not exist in BBuf/AI-Infra-Auto-Driven-SKILLS repo. The repo has vllm-related content under model-pr-optimization-history/ but no skill at the recorded path. Low quality/architecture scores due to missing content. Security score kept moderate-high since no malicious content was found but also no content to audit deeply.

Information

Repository: ai-infra-auto-driven-skills
Stars: 699

Trust Score

Overall: 54
Security: 85
Code Quality: 25
Architecture: 20
Usefulness: 35

More from ai-infra-auto-driven-skills

vLLM Qwen3.6 Optimization

Guidance and PR-driven validation steps for optimizing and documenting Qwen3.6 support in vLLM; use when auditing or implementing model-specific changes.

LLM Pipeline Analysis

Analyze LLM torch profiler traces to identify layer-level bottlenecks, kernel flows, and timing gaps in forward passes.

Related Skills

LLM Evaluation

Evaluation framework and tools for systematically measuring LLM performance using automated metrics, human judgment, and A/B testing.

vLLM-Omni Video Generation

Generate videos (text→video, image→video, text+image→video) using vLLM-Omni and Wan2.2-style diffusion models, with guidance on parameters and performance trade

TDD (Test-Driven Development) Skill

Guides an agent through a strict red–green–refactor TDD cycle: write a failing test, implement the minimal change, and refactor with verification.

Hugging Face Evaluation Manager

Extract, import, and add structured model evaluation results to Hugging Face model cards; run or import benchmark evaluations and generate model-index YAML for

Drizzle ORM Knowledge Patch

Add knowledge about Drizzle ORM changes (v1.0.0-beta.x) — validator import consolidation, Effect Schema support, node-sqlite driver, and .comment() query taggin

Bug Fix — Stop-the-Line Protocol

Structured bug-fix workflow and triage protocol: reproduce, localize, reduce, fix, add regression test, and verify before resuming development.

Bug Fix

A disciplined, test-first workflow for reproducing, triaging, and fixing software bugs while preventing regressions.
