Agentic Systems · AI / Productivity · Active Development

LocalMind

Personal Project

Local RAG Intelligence

Indexing: 1000+ files per minute
Retrieval latency: <1s
Supported formats: 12+ file types
Storage: 1GB per 14K files

The Challenge

Building a privacy-first RAG system that runs entirely on local hardware while maintaining the accuracy and speed of cloud-based solutions.

The Solution

Implemented a high-speed incremental indexer and hybrid search pipeline that optimizes local compute resources while leveraging Gemini for high-quality grounding.

What It Does

LocalMind indexes your local filesystem and provides a chat interface to query your documents. It uses semantic and keyword search to retrieve relevant context from PDFs, code, notes, and data files, providing answers grounded in actual local content with full citations.

How It Works

The indexer scans directories and fingerprints each file with SHA-256 so only changed files are re-processed on subsequent runs. Parsers handle diverse formats (AST-based for Python, regex-based for TS/JS), content is split into recursive chunks with overlap, and embeddings are stored in a local ChromaDB instance. At query time, hybrid search combines semantic and keyword results via Reciprocal Rank Fusion (RRF) and passes the top-ranked context to Gemini for final generation.
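The incremental scan can be sketched as a walk that compares each file's SHA-256 digest against the previous run's map. This is an illustrative sketch, not LocalMind's actual code; the `changed_files` helper and the path-to-digest `seen` map are assumed names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Stream the file through SHA-256 so large files never load fully into memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def changed_files(root: Path, seen: dict[str, str]) -> list[Path]:
    """Return files whose digest differs from the last run; update `seen` in place."""
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        digest = file_digest(path)
        key = str(path)
        if seen.get(key) != digest:
            seen[key] = digest
            changed.append(path)
    return changed
```

On the first run every file is "changed" and gets indexed; afterwards only new or modified files pass the digest check, which is what makes re-indexing fast after the initial pass.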

Process Flow

1. Filesystem Scanning & Hashing
2. Multi-format Content Parsing
3. Intelligent Recursive Chunking
4. Vector Embedding Generation
5. Hybrid Retrieval & RRF
6. Context-Grounded Generation
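The hybrid retrieval step above can be sketched as standard Reciprocal Rank Fusion over the semantic and keyword result lists. The `k = 60` constant is the common default from the RRF literature; this is an illustrative implementation, not the project's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked result lists: each document scores the sum of 1 / (k + rank)
    over every list it appears in, so items ranked well by both searches win."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF uses only ranks, not raw scores, it fuses the two searches without having to normalize cosine similarities against BM25-style keyword scores.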

Key Innovations

Incremental indexing via SHA-256 file hashing
Hybrid Search (Semantic + Keyword) with RRF
AST-based intelligent parsing for source code
Recursive character splitting with overlap
Grounding with citations and source transparency
Zero-infrastructure local persistent storage
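The recursive splitting with overlap listed above can be sketched as follows: try the coarsest separator first, recurse on oversized pieces with finer separators, then prepend a small tail of each chunk to its successor. A simplified illustration under assumed defaults; LocalMind's actual chunker may differ.

```python
def recursive_split(text: str, chunk_size: int = 400,
                    separators: tuple[str, ...] = ("\n\n", "\n", " ", "")) -> list[str]:
    """Split text on the coarsest separator, packing parts up to chunk_size;
    parts that are still too large recurse with the next, finer separator."""
    if len(text) <= chunk_size:
        return [text]
    sep = separators[0]
    rest = separators[1:] if len(separators) > 1 else separators
    parts = text.split(sep) if sep else list(text)  # "" means char-level fallback
    chunks, current = [], ""
    for part in parts:
        candidate = (current + sep + part) if current else part
        if len(candidate) <= chunk_size:
            current = candidate
        elif len(part) > chunk_size:
            if current:
                chunks.append(current)
            chunks.extend(recursive_split(part, chunk_size, rest))
            current = ""
        else:
            if current:
                chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks


def with_overlap(chunks: list[str], overlap: int = 50) -> list[str]:
    """Prepend the tail of the previous chunk so context spans chunk boundaries."""
    return [(chunks[i - 1][-overlap:] if i else "") + c for i, c in enumerate(chunks)]
```

The overlap keeps a sentence that straddles a boundary retrievable from either chunk, at the cost of slightly more storage.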

Technologies Used

Python, Gemini, ChromaDB, RAG, Hybrid Search, CLI, RRF, pdfplumber, trafilatura

Performance Metrics

Indexing speed: 1000 files/min (after initial run)
Retrieval latency: <1s (p99 end-to-end)

Interested in working together?

Let's discuss how AI enablement can transform your operations.

Get in Touch