This article explains how to build an AI development environment that makes full use of the RTX 5090's 32GB VRAM under WSL2, allowing vLLM, TensorRT, Shogi AI, an...
This is a record of operating the dlshogi Shogi engine on an RTX 5090 with TensorRT FP8 quantization. We explain the structure of the Fuka40B model, the effects of quantization, Fl...
This article explains how to integrate Streamlit and Flutter apps on WSL2 using an Nginx proxy and WebSocket. It also covers solving CORS issues and introduces...
This article explains how to enable systemd in WSL2 and manage a vLLM server, a Flask API, and periodic tasks as systemd services. It also covers startup order dependencies and log mon...
A free patent search engine covering 3.5M US patents (2016-2025). Built with SQLite FTS5, BM25 ranking, CPC classification filtering, and Nemotron 9B for tech tag classification....
Kiyoshi Oka, a solitary genius who solved historical problems in Western mathematics one after another. What drove him was not "motivation", but the "Spring Mind" rooted in Japanes...
Real-world benchmarks of Nemotron Nano 9B v2 Japanese on an RTX 5090 with vLLM 0.15.1: 83 tok/s single, 630 tok/s batched. Includes a fix for the broken reasoning parser import pat...
A chapter-by-chapter guide to Peter Seibel's 'Coders at Work', introducing all 15 programmers interviewed — from UNIX creator Ken Thompson to Erlang inventor Joe Armstrong, JavaScr...
When I posted my custom free patent search engine to Reddit's r/LocalLLaMA, I received 65 upvotes and over 20 technical questions within 2 hours. This is a record of practical Q&A ...
A practical guide to improving development efficiency using Anthropic's CLI tool "Claude Code" and the Opus 4.6 model. We explain prompt design for cost reduction, Flask app debugg...
To back up ever-growing AI development data to Google Drive, we explain the authentication process on a headless server using Rclone, and how to build a robust automated backup system ...
This guide explains how to securely expose your home AI server (equipped with an RTX 5090) to the internet without port forwarding. We summarize practical infrastructure setup step...
We explain how to automate video generation by combining "Remotion", a React-based video generation framework, and the "VOICEVOX" speech synthesis engine. From environment setup to...
A practical guide to turning AI dialogue data into assets by combining the Gemini API and an RTX 5090 (32GB VRAM). We explain everything with copy-pasteable code, from history expo...
While researching LLMs and RAG, I worked on open-source Shogi AI projects as a testbed for inference optimization. I modified two engines, a DL-based one and an NNUE-based one, and a...
For developers struggling with local LLM inference speed, this guide walks through building an environment around an RTX 5090 with 32GB VRAM and a Core Ultra 9. Dr...
When trying to load Google Docs into AI, direct URL access often fails. To eliminate the hassle of manual copy-paste and file conversion, we present a Python script implementation ...