📄️ ICICLE AI Embed Service
A FastAPI service that turns text into embedding vectors using Qwen3-Embedding-0.6B (GGUF-quantized) via llama-cpp-python, designed for the ICICLE AI Tapis tenant. The service runs the model locally, with no external API calls, so a deployment needs only a single .gguf file and a Tapis token.
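As a rough illustration of what "runs the model locally" can look like, the sketch below loads a GGUF file with llama-cpp-python and exposes a text-to-vector function. It is not taken from the service's source: the names `load_local_embedder` and `embed_texts` are hypothetical, the FastAPI routes and Tapis token check are left out, and it assumes the model's pooling yields one vector per input.

```python
from typing import Callable, List, Sequence

# A function that maps one text to one embedding vector.
Embedder = Callable[[str], List[float]]


def load_local_embedder(model_path: str) -> Embedder:
    """Load a GGUF embedding model from disk (hypothetical helper)."""
    # Imported lazily so the rest of this sketch runs without
    # llama-cpp-python installed.
    from llama_cpp import Llama

    # embedding=True puts llama-cpp-python into embedding mode.
    llm = Llama(model_path=model_path, embedding=True, verbose=False)

    def embed(text: str) -> List[float]:
        response = llm.create_embedding(text)
        # Assumption: the response carries one pooled vector per input.
        return response["data"][0]["embedding"]

    return embed


def embed_texts(texts: Sequence[str], embedder: Embedder) -> List[List[float]]:
    """Embed each text in order, rejecting blank input (hypothetical helper)."""
    vectors: List[List[float]] = []
    for text in texts:
        if not text.strip():
            raise ValueError("cannot embed empty text")
        vectors.append(embedder(text))
    return vectors
```

With a model file on disk, `embed_texts(["hello"], load_local_embedder("path/to/model.gguf"))` would return one vector per input; the path here is a placeholder. Passing the embedder in as an argument keeps the model-loading step separate, so the surrounding logic can be exercised with a stand-in function.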
📄️ Explanation
Architecture
📄️ How-To Guides
Authentication
📄️ Tutorials
Quickstart