
from mkurman
OpenAI CLIP (Contrastive Language–Image Pre-training) learns joint text–image representations by contrastive pre-training on image–caption pairs. The shared embedding space enables zero-shot image classification, image–text similarity scoring, and cross-modal retrieval without task-specific fine-tuning.
Includes example code for loading a CLIP model, preprocessing images, tokenizing text, and computing similarity or classification scores.
pip install openai-clip
This skill has not been reviewed by our automated audit pipeline yet.