Obsidian LLM Wiki Local: Karpathy's Concept Graphs Go 100% Private

kytmanov/obsidian-llm-wiki-local · Updated 2026-04-18T04:02:56.470Z
Trend 33
Stars 148
Weekly +0

Summary

This Python pipeline transforms Obsidian into a self-hosted semantic knowledge base by using local LLMs to extract concepts and auto-link related notes. It eliminates cloud dependencies while delivering the "LLM Wiki" experience Andrej Karpathy described—creating a living, growing second brain that remains entirely on your hardware. The 377% monthly growth signals pent-up demand for privacy-preserving PKM tools that don't sacrifice AI capabilities.

Architecture & Design

Zero-Network Pipeline Design

The architecture centers on a local-only inference loop that never exposes note content to external APIs. A Python watcher monitors your Obsidian vault, triggering Ollama-hosted LLMs to analyze Markdown semantics.

| Layer | Component | Implementation |
|---|---|---|
| Ingestion | Vault Monitor | Python watchdog or Git hooks |
| Processing | Concept Extractor | Ollama API (Llama 3.x/Mistral) |
| Graph Engine | Link Suggester | Semantic similarity + entity resolution |
| Output | Markdown Writer | Bi-directional [[WikiLinks]] injection |
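A minimal sketch of this loop, assuming Ollama's default local endpoint (`http://localhost:11434/api/generate`, which is Ollama's documented default); the function names and prompt below are illustrative, not the repository's actual code:

```python
"""Sketch of the local-only pipeline: enumerate notes -> local LLM -> concepts."""
import json
import urllib.request
from pathlib import Path

# Ollama's default local endpoint; note content never leaves localhost.
OLLAMA_URL = "http://localhost:11434/api/generate"

def markdown_files(vault: Path) -> list[Path]:
    """Ingestion layer: enumerate candidate notes, skipping Obsidian's config dir."""
    return [p for p in vault.rglob("*.md") if ".obsidian" not in p.parts]

def extract_concepts(note_text: str, model: str = "llama3.2") -> list[str]:
    """Processing layer: ask the local model for concepts (prompt is an assumption)."""
    payload = {
        "model": model,
        "prompt": "List the abstract concepts in this note, one per line:\n\n" + note_text,
        "stream": False,  # return one JSON object instead of a token stream
    }
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    # One concept per line; strip common list markers the model may emit.
    return [ln.strip("-* ").strip() for ln in body["response"].splitlines() if ln.strip()]
```

Because the extractor talks only to `localhost`, unplugging the network does not break the pipeline, which is the whole point of the zero-network design.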

Concept Extraction Flow

  1. Chunking: Parses Markdown into semantic blocks (paragraphs, lists) while preserving frontmatter
  2. LLM Analysis: Prompts local model to identify concepts (abstract ideas) vs. keywords (literal text)
  3. Link Resolution: Matches extracted concepts against existing note titles and previous extractions
  4. Vault Mutation: Writes bi-directional links directly into source Markdown, compatible with Obsidian's graph view

Privacy Architecture: Unlike vector-RAG systems that embed content into retrievable vectors, this approach writes connections as plain text links, making the "AI layer" removable without data loss.
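Steps 1, 3, and 4 of the flow above reduce to plain string handling, which is what keeps the "AI layer" removable. A sketch with hypothetical helpers (not the repository's implementation):

```python
import re

def chunk_note(markdown: str) -> tuple[str, list[str]]:
    """Step 1: split YAML frontmatter from the body, then the body into
    paragraph-level blocks, so frontmatter is preserved untouched."""
    frontmatter, body = "", markdown
    if markdown.startswith("---\n"):
        end = markdown.find("\n---\n", 4)
        if end != -1:
            frontmatter = markdown[: end + 5]
            body = markdown[end + 5 :]
    blocks = [b.strip() for b in re.split(r"\n\s*\n", body) if b.strip()]
    return frontmatter, blocks

def inject_links(text: str, known_titles: set[str]) -> str:
    """Steps 3-4: wrap mentions of existing note titles in [[WikiLinks]],
    leaving already-linked occurrences untouched (idempotent)."""
    for title in sorted(known_titles, key=len, reverse=True):
        pattern = re.compile(rf"(?<!\[\[)\b{re.escape(title)}\b(?!\]\])")
        text = pattern.sub(f"[[{title}]]", text)
    return text
```

Because `inject_links` emits standard wiki-link syntax, deleting the pipeline later leaves an ordinary Obsidian vault behind: the links are just text.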

Key Innovations

The "RAG Alternative" Philosophy

Traditional RAG is reactive: you query, it retrieves. This system is proactive: it writes connections into your notes as you create them, effectively turning your LLM into a co-author rather than a search engine.

  • Concept-Centric Linking: Uses LLM reasoning to connect "Transformers" (AI) with "Attention Is All You Need" (paper) even when literal keywords don't overlap
  • Organic Growth: The wiki structure emerges from content semantics rather than manual curation or rigid folder hierarchies
  • Git-Native Versioning: Treats knowledge evolution as code—every AI-suggested link is a diff you can review, revert, or merge
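The Git-native idea above can be sketched as a post-processing step that commits each batch of AI edits so every suggestion arrives as a reviewable diff; the function name and commit message here are invented for illustration:

```python
import subprocess
from pathlib import Path

def commit_ai_links(vault: Path, message: str = "ai: suggested wiki-links") -> str:
    """Stage and commit AI-written link edits in the vault's git repo,
    returning a summary of the new commit for human review."""
    subprocess.run(["git", "-C", str(vault), "add", "-A"], check=True)
    subprocess.run(["git", "-C", str(vault), "commit", "-q", "-m", message], check=True)
    summary = subprocess.run(
        ["git", "-C", str(vault), "show", "--stat", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return summary.stdout
```

From there, `git revert` undoes a bad batch of links and `git diff` shows exactly what the model changed, which is the "review, revert, or merge" workflow the bullet describes.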

Local-First Constraints as Features

By mandating Ollama, the system forces optimization for quantized models (4-bit/8-bit), resulting in surprisingly efficient concept extraction that runs on consumer hardware (M1 Macs, RTX 3060s). The constraint eliminates the "API anxiety" of sending personal notes to OpenAI/Anthropic.
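One way such a constraint shows up in code is selecting a quantization tier from available RAM; the thresholds and Ollama model tags below are illustrative assumptions, not taken from the project:

```python
def pick_model(available_ram_gb: float) -> str:
    """Map available RAM to a quantized model tier (illustrative thresholds
    and tags; adjust to whatever models are pulled locally)."""
    if available_ram_gb >= 48:
        return "llama3.1:70b"   # deep reasoning, ~40 GB at 4-bit
    if available_ram_gb >= 8:
        return "mistral:7b"     # best quality/latency balance on consumer hardware
    return "llama3.2:3b"        # entity extraction on low-RAM machines
```

On an M1 Mac or an RTX 3060, this kind of tiering is what makes "no cloud" viable: the pipeline degrades to a smaller quantized model instead of falling back to a hosted API.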

Performance Characteristics

Inference Latency by Model Tier

| Model | Quantization | Concepts/Sec | RAM Usage | Quality |
|---|---|---|---|---|
| Llama 3.2 | 4-bit | 12-15 | 2.5 GB | Good for entities |
| Mistral 7B | Q4_K_M | 8-10 | 5 GB | Best balance |
| Llama 3.1 70B | Q4 | 1-2 | 40 GB | Deep reasoning |

Benchmarked on M2 Pro (32GB). Performance scales linearly with vault size but parallelizes across files.

Scalability Ceiling

The current architecture hits practical limits around 10,000 notes with 7B models, primarily due to context window constraints during cross-note linking. For larger corpora, the tool supports sharded processing: recent notes are processed daily, with a full-vault link reconciliation run weekly.
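That sharding schedule can be sketched as a selector shared by the daily and weekly jobs; the names and the one-day cutoff are assumptions, not the repository's code:

```python
import time
from pathlib import Path

def select_shard(vault: Path, full_pass: bool, recent_days: int = 1) -> list[Path]:
    """Sharded processing: daily runs touch only recently modified notes;
    a weekly full pass re-links the whole vault."""
    notes = sorted(p for p in vault.rglob("*.md") if ".obsidian" not in p.parts)
    if full_pass:
        return notes                      # weekly reconciliation: everything
    cutoff = time.time() - recent_days * 86400
    return [p for p in notes if p.stat().st_mtime >= cutoff]
```

Keeping the daily shard small also keeps each LLM call's context short, which is exactly the constraint that caps the full-vault approach near 10,000 notes.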

Limitations

  • Cold Start: Initial processing of a 1,000-note vault takes 20-40 minutes depending on model size
  • Hallucinated Links: Smaller models (3B) occasionally suggest spurious connections requiring manual cleanup
  • English-Centric: Concept extraction quality degrades significantly for non-Latin scripts with base Ollama models

Ecosystem & Alternatives

Obsidian Integration

Works natively with Obsidian's core plugins—suggested links appear as standard [[WikiLinks]], compatible with Graph View, Backlinks panel, and Dataview queries. No proprietary JSON formats or lock-in.

Ollama Model Compatibility

| Model Family | Support | Recommended For |
|---|---|---|
| Llama 3.1/3.2 | ✅ Native | General knowledge work |
| CodeLlama | ✅ Tested | Technical documentation |
| Mixtral 8x7B | ⚠️ Heavy | Complex reasoning (requires 32GB+ RAM) |
| Phi-3 | ✅ Fast | Daily notes/quick capture |

Deployment Patterns

  • Desktop: Python script + Ollama desktop app (macOS/Windows/Linux)
  • Homelab: Dockerized Ollama + cron-triggered wiki updates
  • Sync-Safe: Git-based conflict resolution when using Obsidian Sync or GitHub

Commercial Considerations

MIT licensed with no commercial restrictions. The 27 forks suggest active experimentation, including community adaptations for Logseq and Emacs Org-mode. No SaaS upsell—truly local-first.

Momentum Analysis

AISignal exclusive — based on live signal data

Growth Trajectory: Explosive

| Metric | Value | Context |
|---|---|---|
| Weekly Growth | +0 stars/week | Baseline (newly indexed) |
| 7d Velocity | 179.3% | Viral within PKM communities |
| 30d Velocity | 377.4% | Breakout momentum in local-AI niche |

Adoption Phase Analysis

Currently in the early adopter phase within the privacy-conscious developer segment. The 148-star count understates its traction: 27 forks (an 18% fork rate) indicate active experimentation rather than passive starring.

Forward-Looking Assessment

The project sits at the intersection of two explosive trends: local LLM inference (Ollama adoption) and tools for thought (Obsidian). The "RAG-alternative" positioning is prescient—users are fatigued by vector DB complexity and API costs. If the maintainer adds support for incremental updates (processing only changed paragraphs rather than full notes), this could become the default local PKM stack. Risk: Dependency on Ollama's API stability and Obsidian's closed-source ecosystem.