A curated hub for on-device AI
Google’s AI Edge Gallery is built on LiteRT (formerly TensorFlow Lite) and MediaPipe, optimized for running AI on resource-constrained devices. It supports open-source models from Hugging Face, including Google’s Gemma 3n, a small, multimodal language model that handles text and images, with audio and video support in the pipeline.
The 529MB Gemma 3 1B model delivers up to 2,585 tokens per second during prefill inference on mobile GPUs, enabling sub-second tasks like text generation and image analysis. Models run entirely offline on CPUs, GPUs, or NPUs, preserving data privacy.
The app includes a Prompt Lab for single-turn tasks such as summarization, code generation, and image queries, with templates and tunable settings (e.g., temperature, top-k). The RAG library lets models reference local documents or images without fine-tuning, while a Function Calling library enables automation via API calls or form filling.
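Temperature and top-k are standard decoding knobs: temperature rescales the model's token scores before sampling, and top-k restricts sampling to the k most likely tokens. A minimal Python sketch of that sampling step (a generic illustration, not the Gallery's or MediaPipe's actual API) shows how the two interact:

```python
import math
import random

def sample_next_token(logits, temperature=0.8, top_k=40):
    """Pick one token id from raw logits using temperature + top-k sampling."""
    # Keep only the k highest-scoring candidate tokens.
    top = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:top_k]
    # Temperature rescales logits: values < 1 sharpen the distribution
    # (more deterministic), values > 1 flatten it (more diverse output).
    scaled = [score / temperature for _, score in top]
    # Numerically stable softmax over the surviving candidates.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    token_ids = [tok for tok, _ in top]
    return random.choices(token_ids, weights=[w / total for w in weights])[0]
```

With `top_k=1` this reduces to greedy decoding (always the single most likely token), which is why lowering both settings makes output more repeatable at the cost of variety.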