I built a local-first RAG research tool that runs entirely on a single GPU: Nemotron Nano 9B v2 on vLLM + FastAPI + SQLite FTS5 with a two-step Extract → Execute flow. Tool calling +...
A deep dive into the fork that replaces Claude Code with Qwen 3.5 9B + Ollama in Karpathy's autoresearch framework. Run fully autonomous ML research on a single GPU with zero API c...
Canonical, the company behind Ubuntu, is targeting an IPO with $292M revenue and 88% gross margins. If they go public, it will symbolize the new era of Linux/OSS business. We trace...
Using the Google Places Text Search API, I scraped 1,914 unagi restaurants across all 47 Japanese prefectures with under 1.6% noise. This article dissects why BM25/FTS5 can never r...
How I replaced slow LIKE queries with SQLite FTS5 full-text search on a 1.73 million row patent database, achieving 100x+ speedup with BM25 ranking and boolean query support....
How to set up Anthropic's official SQLite MCP server with Claude Code to run queries, inspect schemas, and manage databases directly from your AI coding assistant....
How I rebuilt five separate HTML prototypes into a single Flutter Web app backed by a Flask API, using 874MB of Claude Code session history as the data source for local LLM analysi...
How I debugged a patent analysis pipeline where Gemini generated plausible-but-fake patent numbers because the FTS5 queries returned zero results, and the three fixes that made it ...
Practical walkthrough of integrating Stripe Checkout into a Python SaaS targeting US patent law firms, including graceful degradation, local subscription caching, and the decision ...
AI coding assistants like Claude Code don't automatically read your README before making changes. Here are three strategies that enforce documentation-first workflows....
A deep dive into converting a PyTorch shogi (Japanese chess) model to ONNX for TensorRT inference, and what MCTS parameter tuning taught me about why raw model size isn't everythin...
How I built SoyLM, a single-file RAG tool using FastAPI, SQLite FTS5, and a local LLM, and what I learned about documentation-driven development when Reddit pointed out my README w...
A practical explanation of how Flutter Web apps become installable PWAs, the difference between Flutter's native compilation and its web target, and why Google built Flutter this w...
A technical deep dive into Tailscale's architecture including WireGuard foundations, DERP relay servers, NAT traversal, and why the mesh-network approach is replacing traditional h...
The story of systemd from Lennart Poettering's frustration with SysVinit to the most heated technical debate in Linux history, the Devuan fork, and why systemd won despite the cont...
Analyzing NVIDIA's open-source strategy revealed at GTC 2026. From NemoClaw to Vera Rubin, Physical AI, and cuDF/cuVS — why NVIDIA bet on open, viewed through the lens of Linux his...
A practical guide from soy-tuber for individual developers with RTX 40 Series GPUs on running LLMs cheaply and quickly using the latest OSS inference engi...
OpenAI acquired Astral, the company behind Python's fast package manager uv and linter Ruff. This article analyzes the strategic implications for both sides — OpenAI's play for "AI...
NVIDIA NemoClaw (OpenShell) sandboxes are network-isolated by design. Here's how I broke through three layers of isolation — iptables, network policy, and namespace firewalls — to ...
Why the entire local AI ecosystem's adoption of OpenAI-compatible APIs created a fundamental security blind spot—and how kernel-level trust models should replace network-layer hack...
FastAPI 0.133 with Starlette 0.52 returns 405 for HEAD requests on GET routes. This silently broke Googlebot crawling and left 93 pages unindexed. Here's how I found the bug and th...
A practical, experience-based comparison of four LLM inference engines on RTX 5090 (32GB VRAM). Why vLLM is the pragmatic choice for Mamba-hybrid models on consumer Blackwell hardw...
In December 2025, I wrote my first line of code. By March 2026, I had built a patent search engine with 3.54 million US patents, a Japanese case law RAG system, a shogi AI ranked #...
Data organization lessons learned from developing the patent search app PatentLLM and the case law search app Hanrei-DB. It covers the practical use cases...