GPT-OSS 120B
2x A100 80GB

Large language model for chat completions

Container: 400GB
Volume: 100GB
Ports: 8082
Auto-deploy enabled

Qwen VL Embedding
1x A40 48GB

Text/image embedding model

Container: 100GB
Volume: 100GB
Ports: 8082
Auto-deploy enabled

Whisper + PaddleOCR + Reranker
1x A40 48GB

Speech-to-text, OCR, and reranking models

Container: 100GB
Volume: 100GB
Ports: 8082, 8083, 8084
Auto-deploy enabled

Select one of the model configurations above.
After the pod is created, run the deploy command from the pod detail page.
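
When scripting against these configurations, it can help to hold the three presets as plain data. The following is a minimal sketch in Python, not the platform's own format: the class name `PodPreset`, its field names, the `PRESETS` dictionary, and its keys are all hypothetical. Only the values (GPU counts, sizes, ports) come from the listing above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PodPreset:
    """One model configuration from the listing (hypothetical structure)."""
    name: str
    description: str
    gpu_count: int
    gpu_model: str
    gpu_memory_gb: int      # memory per GPU, as shown in the listing
    container_gb: int
    volume_gb: int
    ports: Tuple[int, ...]
    auto_deploy: bool = True

    def total_gpu_memory_gb(self) -> int:
        # Combined GPU memory across all GPUs in the pod.
        return self.gpu_count * self.gpu_memory_gb


# Keys are illustrative identifiers, not names used by the platform.
PRESETS = {
    "gpt-oss-120b": PodPreset(
        name="GPT-OSS 120B",
        description="Large language model for chat completions",
        gpu_count=2, gpu_model="A100", gpu_memory_gb=80,
        container_gb=400, volume_gb=100, ports=(8082,),
    ),
    "qwen-vl-embedding": PodPreset(
        name="Qwen VL Embedding",
        description="Text/image embedding model",
        gpu_count=1, gpu_model="A40", gpu_memory_gb=48,
        container_gb=100, volume_gb=100, ports=(8082,),
    ),
    "whisper-paddleocr-reranker": PodPreset(
        name="Whisper + PaddleOCR + Reranker",
        description="Speech-to-text, OCR, and reranking models",
        gpu_count=1, gpu_model="A40", gpu_memory_gb=48,
        container_gb=100, volume_gb=100, ports=(8082, 8083, 8084),
    ),
}
```

For example, `PRESETS["gpt-oss-120b"].total_gpu_memory_gb()` evaluates to 160, since that preset uses two 80 GB GPUs.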