PatentLLM Blog

Today

AI

Built a Local-First RAG Research Tool with Nemotron + vLLM + Tool Calling

Built a local-first RAG research tool that runs entirely on a single GPU. Nemotron Nano 9B v2 on vLLM + FastAPI + SQLite FTS5 with a two-step Extract → Execute flow. Tool calling +...

AI

Running Karpathy's autoresearch with Local LLM — Zero API Cost Autonomous AI Research

A deep dive into the fork that replaces Claude Code with Qwen 3.5 9B + ollama in Karpathy's autoresearch framework. Run fully autonomous ML research on a single GPU with zero API c...

OSS

Canonical Eyes IPO — Ubuntu Proves the Revival of Linux and OSS

Canonical, the company behind Ubuntu, is targeting an IPO with $292M in revenue and 88% gross margins. If it goes public, the listing will symbolize a new era for Linux/OSS business. We trace...

2026-03-21

AI

How Google Finds Every Restaurant in Japan — And Why Your Full-Text Search Can't

Using the Google Places Text Search API, I scraped 1,914 unagi restaurants across all 47 Japanese prefectures with under 1.6% noise. This article dissects why BM25/FTS5 can never r...

AI

From 30 Seconds to 3 Milliseconds: Replacing LIKE with FTS5 on 1.7M Patent Records

How I replaced slow LIKE queries with SQLite FTS5 full-text search on a 1.73 million row patent database, achieving 100x+ speedup with BM25 ranking and boolean query support....

AI

Claude Code + MCP SQLite Server: Query Your Database Without Leaving the Conversation

How to set up Anthropic's official SQLite MCP server with Claude Code to run queries, inspect schemas, and manage databases directly from your AI coding assistant....

AI

Building a 5-in-1 Local LLM App with Flutter Web and Flask

How I rebuilt five separate HTML prototypes into a single Flutter Web app backed by a Flask API, using 874MB of Claude Code session history as the data source for local LLM analysi...

AI

When Gemini Hallucinates Patent Numbers: Fixing the FTS5 + LLM Analysis Pipeline

How I debugged a patent analysis pipeline where Gemini generated plausible-but-fake patent numbers because the FTS5 queries returned zero results, and the three fixes that made it ...

AI

Adding Stripe Checkout to a Solo SaaS: Lessons from PatentLLM's $1K/mo Plan

Practical walkthrough of integrating Stripe Checkout into a Python SaaS targeting US patent law firms, including graceful degradation, local subscription caching, and the decision ...

AI

The README Trap: Why AI Coding Assistants Skip Your Docs (and 3 Fixes)

AI coding assistants like Claude Code don't automatically read your README before making changes. Here are three strategies that enforce documentation-first workflows....

AI

Training a Shogi Engine: ONNX Conversion, TensorRT, and Getting Crushed by Ryfamate

A deep dive into converting a PyTorch shogi (Japanese chess) model to ONNX for TensorRT inference, and what MCTS parameter tuning taught me about why raw model size isn't everythin...

AI

SoyLM: Building a Zero-Dependency Local RAG Tool in a Single Python File

How I built SoyLM, a single-file RAG tool using FastAPI, SQLite FTS5, and a local LLM, and what I learned about documentation-driven development when Reddit pointed out my README w...

AI

Flutter Web + PWA: Why Add to Home Screen Gives You a Full-Screen App

A practical explanation of how Flutter Web apps become installable PWAs, the difference between Flutter's native compilation and its web target, and why Google built Flutter this w...

AI

Tailscale Deep Dive: Why Developers Are Ditching Traditional VPNs

A technical deep dive into Tailscale's architecture including WireGuard foundations, DERP relay servers, NAT traversal, and why the mesh-network approach is replacing traditional h...

AI

Lennart Poettering and the systemd Wars: The Most Controversial Software in Linux History

The story of systemd from Lennart Poettering's frustration with SysVinit to the most heated technical debate in Linux history, the Devuan fork, and why systemd won despite the cont...

AI

The Real Inflection Point GTC 2026 Quietly Announced — Why NVIDIA Bet on "Open"

Analyzing NVIDIA's open-source strategy revealed at GTC 2026. From NemoClaw to Vera Rubin, Physical AI, and cuDF/cuVS — why NVIDIA bet on open, viewed through the lens of Linux his...

GPU Inference

RTX 40 Series Makes LLM Blazing Fast! The Complete Guide to Inference Optimization for Individual Developers [2026 Latest Edition]

For individual developers with RTX 40 Series GPUs, soy-tuber gives a practical explanation of how to run LLMs at low cost and high speed, utilizing the latest OSS inference engi...

2026-03-19

Dev Tools

OpenAI Acquires Astral (uv / Ruff) — What It Really Means

OpenAI acquired Astral, the company behind Python's fast package manager uv and linter Ruff. This article analyzes the strategic implications for both sides — OpenAI's play for "AI...

2026-03-18

GPU Inference

Punching Through NVIDIA NemoClaw's Sandbox to Hit Local vLLM on RTX 5090

NVIDIA NemoClaw (OpenShell) sandboxes are network-isolated by design. Here's how I broke through three layers of isolation — iptables, network policy, and namespace firewalls — to ...

GPU Inference

The Technical Debt Local AI Must Fix Before It's Too Late — What NemoClaw Says About NVIDIA's Philosophy

Why the entire local AI ecosystem's adoption of OpenAI-compatible APIs created a fundamental security blind spot—and how kernel-level trust models should replace network-layer hack...

2026-03-17

Web / Infra

Why Google Wasn't Indexing My FastAPI Site — The HEAD Request Trap

FastAPI 0.133 with Starlette 0.52 returns 405 for HEAD requests on GET routes. This silently broke Googlebot crawling and left 93 pages unindexed. Here's how I found the bug and th...

2026-03-14

GPU Inference

vLLM vs TensorRT-LLM vs Ollama vs llama.cpp — Choosing the Right Inference Engine on RTX 5090

A practical, experience-based comparison of four LLM inference engines on RTX 5090 (32GB VRAM). Why vLLM is the pragmatic choice for Mamba-hybrid models on consumer Blackwell hardw...

2026-03-13

Dev Tools

Three Months of Code: What a Patent Lawyer Built from Zero

In December 2025, I wrote my first line of code. By March 2026, I had built a patent search engine with 3.54 million US patents, a Japanese case law RAG system, a shogi AI ranked #...

2026-03-08

Dev Tools

SQLite vs JSONL vs XML vs TSV — Data Wrangling for AI Projects

This article shares the data organization lessons learned from developing the patent search app PatentLLM and the case law search app Hanrei-DB. It covers the practical use cases...
