This article explains the data organization know-how gained from developing the patent search app PatentLLM and the case law search app Hanrei-DB. It covers the practical use cases...
This article details the steps for running 'Nemotron-Nano-9B-v2-Japanese', the Japanese-specialized 9B-parameter LLM released by NVIDIA, on a local machine. It features the Mamba SSM architecture and ...
This article explains how to introduce and practically use uv, a Rust-based Python package manager developed by Astral. It covers its speed (10 to 100 times faster than pip), venv-...
This article traces the origins and design philosophies of the major OSS components that make up the AI development stack, following their historical development. We will pro...
This article outlines a design for running NVIDIA's Nemotron-Nano-9B-v2-Japanese with vLLM to analyze and structure development environment data using batch processing. It discusse...
A manzai comedy skit between Engineer Takechi, whose brain has been completely taken over by Claude Code, and his junior colleague Niiyama, who desperately tries to pull him back t...
We apply insights gained from distillation experiments with Shogi AI to LLMs. Fine-tuning (FT) a distilled model is either meaningless or harmful, and LoRA can be replaced by promp...
Searching 1.73 million patent records was impractical with SQLite's LIKE queries; FTS5 full-text search solves the problem. This article explains the implementation steps for inv...
This article explains how to complete SQLite database operations within an AI assistant by utilizing Claude Code's MCP (Model Context Protocol) server functionality. We will cover ...
This article explains the process of building an AI development support app that integrates five functions into one app using Flutter Web. It details how to run a local LLM with vL...
A practical record of implementing Stripe Checkout billing for 'PatentLLM', a SaaS for US IP law firms. This article explains a design that does not store card information on the s...
This article explains how to launch NVIDIA's Nemotron-Nano-9B-v2-Japanese with vLLM and integrate it into your custom application as an OpenAI-compatible API. It eliminates the nee...
This article explains how to securely expose multiple web applications from a WSL2 environment by combining Cloudflare Tunnel and Caddy reverse proxy. This configuration eliminates...
This article explains a method for automatically preventing port conflicts and the execution of dangerous commands (e.g., rm -rf, git push --force) before they happen by leveraging Cl...
This article introduces a method to solve the problem of excessive token consumption when all information is written in CLAUDE.md, by using a two-layer structure: a Tier 1 index (l...
This article explains how to build a daily report system that automatically aggregates Claude Code and Gemini CLI usage history every morning with a cron job, visualizing token con...
This article explains how to build a free research agent that doesn't require an API key, by combining the ddgs library and a local LLM (Nemotron). It also includes an implementati...
This article explains methods to reduce API costs and shorten processing time for large-scale data analysis by leveraging Google Gemini's Context Caching feature. Specific examples...
This article introduces an implementation pattern that balances cost, quality, and privacy by combining Gemini 2.5 Flash and Nemotron 9B. It also explains the design of a common in...
The google.generativeai package has been deprecated, so this guide explains the specific steps to migrate to the google-genai SDK. We will show examples of import changes, Gener...
This article explains how to build a legal AI search system that converts court case law PDFs into text, achieves full-text search with SQLite FTS5, and automates issue extraction ...
This article explains an implementation method for giving Minecraft NPCs natural language-based situational judgment and response capabilities by running a local LLM (Nemotron 9B) ...
This article explains the design and implementation of a 2-stage pipeline that generates content with Nemotron 9B and refines and fact-checks it with Gemini 2.5 Flash. We also intr...
This article explains the method for building a dashboard that indexes data from the Chemical Substance Risk Information Platform (CHRIP) with SQLite FTS5 and performs regulatory s...
This article explains the common infrastructure design and resource management strategy for operating 13 projects, including Shogi AI, LLM applications, and legal systems, on a sin...
This article explains how to build an AI development environment that makes full use of the RTX 5090's 32 GB of VRAM in a WSL2 environment, allowing vLLM, TensorRT, Shogi AI, an...
This is a record of operating the dlshogi Shogi engine on an RTX 5090 with TensorRT FP8 quantization. We explain the structure of the Fuka40B model, the effects of quantization, Fl...
This article explains an implementation method for integrating Streamlit and Flutter apps on WSL2 using Nginx proxy and WebSocket. It also covers solving CORS issues and introduces...
This article explains how to enable systemd in WSL2 and manage a vLLM server, a Flask API, and periodic tasks as systemd services. It also covers startup order dependencies and log mon...
A free patent search engine covering 3.5M US patents (2016-2025). Built with SQLite FTS5, BM25 ranking, CPC classification filtering, and Nemotron 9B for tech tag classification....
A chapter-by-chapter guide to Peter Seibel's 'Coders at Work', introducing all 15 programmers interviewed — from UNIX creator Ken Thompson to Erlang inventor Joe Armstrong, JavaScr...