Privacy-First Local AI Chat

PocketLlama is a fully local AI chat application for iOS and iPadOS. Run powerful LLMs like Gemma 3, Llama 3.2, Qwen 2.5, and Phi-3 directly on your device with no internet required.

Key Features

  • 100% Local Inference — Your conversations never leave your device
  • Curated Model Library — Choose from Gemma, Llama, Qwen, Phi, and more
  • Smart Download Manager — Real-time disk usage tracking
  • Performance Metrics — Monitor tokens/second and memory usage
  • Customizable Answers — Adjust response style and length
  • Adaptive UI — Light/Dark modes with multi-device optimization
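
The tokens/second figure mentioned under Performance Metrics can be derived from generation timestamps. A minimal sketch of one way to do it — the `GenerationMetrics` type and its methods are illustrative, not PocketLlama's actual implementation:

```swift
import Foundation

/// Illustrative tokens-per-second tracker (hypothetical type,
/// not PocketLlama's real metrics code).
struct GenerationMetrics {
    private var startTime: Date?
    private(set) var tokenCount: Int = 0

    /// Call when generation begins.
    mutating func begin(at time: Date = Date()) {
        startTime = time
        tokenCount = 0
    }

    /// Call once per token as the model streams output.
    mutating func recordToken() {
        tokenCount += 1
    }

    /// Throughput since `begin(at:)`, or nil if timing is unavailable.
    func tokensPerSecond(now: Date = Date()) -> Double? {
        guard let start = startTime else { return nil }
        let elapsed = now.timeIntervalSince(start)
        guard elapsed > 0 else { return nil }
        return Double(tokenCount) / elapsed
    }
}

// Example with a synthetic 2-second window:
var metrics = GenerationMetrics()
let start = Date()
metrics.begin(at: start)
for _ in 0..<120 { metrics.recordToken() }
let rate = metrics.tokensPerSecond(now: start.addingTimeInterval(2)) ?? 0
print(rate)  // 120 tokens / 2 s = 60 tok/s
```

Displaying this alongside resident memory (e.g. via `task_info` on iOS) gives the two metrics the feature list describes.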

Technology Stack

  • SwiftUI — Native iOS/iPadOS interface
  • llama.cpp — Efficient local model inference
  • GGUF Models — Optimized quantized model format
  • Core Data — Persistent conversation storage
  • Combine — Reactive data flow
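
The Combine piece of the stack can be pictured as a view model that publishes streamed tokens to SwiftUI. The sketch below is an assumption about the shape of such a pipeline — `ChatViewModel` and `tokenSubject` are illustrative names, not PocketLlama's actual types:

```swift
import Combine
import Foundation

/// Illustrative view model: tokens emitted by the inference engine are
/// pushed through a Combine subject and accumulated into a @Published
/// string that a SwiftUI view can observe.
final class ChatViewModel: ObservableObject {
    @Published private(set) var transcript: String = ""

    /// The inference layer would call `tokenSubject.send(token)` as
    /// each token is produced.
    let tokenSubject = PassthroughSubject<String, Never>()
    private var cancellables = Set<AnyCancellable>()

    init() {
        tokenSubject
            // In the app you would add .receive(on: DispatchQueue.main)
            // here so UI updates land on the main thread; it is omitted
            // to keep this example synchronous.
            .sink { [weak self] token in
                self?.transcript += token
            }
            .store(in: &cancellables)
    }
}

// Usage: a SwiftUI view observing this model re-renders as the
// transcript grows token by token.
let vm = ChatViewModel()
vm.tokenSubject.send("Hello")
vm.tokenSubject.send(", world")
print(vm.transcript)  // "Hello, world"
```

Keeping the engine decoupled behind a subject like this is one common way to bridge a C inference library such as llama.cpp into a reactive SwiftUI interface.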

Screenshots

A clean, modern interface with smooth animations and intuitive controls.

About This Project

PocketLlama was built as a private, offline AI assistant. Unlike cloud-based AI services, all processing happens locally on your device. This means:

  • No data is sent to external servers
  • Works completely offline
  • No subscription fees or API costs
  • Full control over your conversations