PatentLLM Tech Blog

GPU Inference

RTX 5090 + WSL2 AI Dev Environment — Full 32GB VRAM Setup

This article explains how to build an AI development environment on WSL2 that makes full use of the RTX 5090's 32GB VRAM, allowing vLLM, TensorRT, Shogi AI, an...

GPU Inference

Shogi AI on RTX 5090 — TensorRT FP8 & Floodgate Results

This is a record of operating the dlshogi Shogi engine on an RTX 5090 with TensorRT FP8 quantization. We explain the structure of the Fuka40B model, the effects of quantization, Fl...

Web / Infra

Streamlit × Flutter Bidirectional Integration on WSL2

This article explains how to integrate Streamlit and Flutter apps on WSL2 using an Nginx proxy and WebSocket. It also covers solving CORS issues and introduces...

Dev Tools

Auto-Start vLLM, Flask & cron on WSL2 with systemd Services

This article explains how to enable systemd in WSL2 and manage a vLLM server, a Flask API, and periodic tasks as systemd services. It also covers startup order dependencies and log mon...

Dev Tools

PatentLLM: Free Search Engine for 3.5M US Patents

A free patent search engine covering 3.5M US patents (2016-2025). Built with SQLite FTS5, BM25 ranking, CPC classification filtering, and Nemotron 9B for tech tag classification....

AI

Talent Blooms When You Stop Relying on "Motivation": 7 Insights on the "Spring Mind" Left by Genius Mathematician Kiyoshi Oka

Kiyoshi Oka was a solitary genius who solved one historic problem in Western mathematics after another. What drove him was not "motivation" but the "Spring Mind" rooted in Japanes...

GPU Inference

RTX 5090 + Nemotron 9B on vLLM — Benchmarks & TRT-LLM Comparison

Real-world benchmarks of Nemotron Nano 9B v2 Japanese on an RTX 5090 with vLLM 0.15.1: 83 tok/s single, 630 tok/s batched. Includes a fix for the broken reasoning parser import pat...

Dev Tools

Coders at Work — Index of All 15 Programmer Interviews

A chapter-by-chapter guide to Peter Seibel's 'Coders at Work', introducing all 15 programmers interviewed — from UNIX creator Ken Thompson to Erlang inventor Joe Armstrong, JavaScr...

AI

I Posted My Patent Search AI to Reddit r/LocalLLaMA and Got 65 Upvotes and Over 20 Questions

When I posted my custom free patent search engine to Reddit's r/LocalLLaMA, I received 65 upvotes and over 20 technical questions within 2 hours. This is a record of practical Q&A ...

Dev Tools

Claude Code Practical Guide: Debugging, Test Automation, and CUDA Environment Setup with Opus 4.6

A practical guide to improving development efficiency using Anthropic's CLI tool "Claude Code" and the Opus 4.6 model. We explain prompt design for cost reduction, Flask app debugg...

Web / Infra

Automated Google Drive Backup with Rclone: Headless OAuth Authentication and systemd Configuration

To back up ever-growing AI development data to Google Drive, we explain how to authenticate Rclone on a headless server and how to build a robust automated backup system ...

Web / Infra

Cloudflare Tunnel Practical Guide: Securely Exposing a Home AI Server Without Port Forwarding

This guide explains how to securely expose your home AI server (equipped with an RTX 5090) to the internet without port forwarding. We summarize practical infrastructure setup step...

Web / Infra

Automating Video Generation with Remotion and VOICEVOX: From Environment Setup to Performance Optimization

We explain how to automate video generation by combining "Remotion", a React-based video generation framework, and the "VOICEVOX" speech synthesis engine. From environment setup to...

AI

Turn Conversation Data into Assets with Gemini API: History Export, RAG, and Streamlit

A practical guide to turning AI dialogue data into assets by combining the Gemini API and an RTX 5090 (32GB VRAM). We explain everything with copy-pasteable code, from history expo...

AI

What I Gained from Interacting with Shogi AI: The Path to 1st Place in Floodgate and My Approach to Distilled Models

While researching LLMs and RAG, I worked on open-source Shogi AI projects as a testbed for inference optimization. I modified two engines—a DL-based one and an NNUE-based one—and a...

GPU Inference

Hardware Selection for Local LLMs: Overcoming the VRAM Wall with Practical GPU, CPU, and Memory Configurations

For developers struggling with local LLM inference speed, this guide presents a recommended configuration built around an RTX 5090 with 32GB VRAM and a Core Ultra 9. Dr...

Dev Tools

Using Python to Load Google Docs into AI — Drive API Minimal Permission Setup

When trying to load Google Docs into AI, direct URL access often fails. To eliminate the hassle of manual copy-paste and file conversion, we present a Python script implementation ...


2026-03-08

RTX 5090 + WSL2 AI Dev Environment — Full 32GB VRAM Setup

Shogi AI on RTX 5090 — TensorRT FP8 & Floodgate Results

Streamlit × Flutter Bidirectional Integration on WSL2

Auto-Start vLLM, Flask & cron on WSL2 with systemd Services

PatentLLM: Free Search Engine for 3.5M US Patents

2025-08-17

Talent Blooms When You Stop Relying on "Motivation": 7 Insights on the "Spring Mind" Left by Genius Mathematician Kiyoshi Oka

2025-08-10

RTX 5090 + Nemotron 9B on vLLM — Benchmarks & TRT-LLM Comparison

2025-08-03

Coders at Work — Index of All 15 Programmer Interviews

2025-07-27

I Posted My Patent Search AI to Reddit r/LocalLLaMA and Got 65 Upvotes and Over 20 Questions

2025-07-20

Claude Code Practical Guide: Debugging, Test Automation, and CUDA Environment Setup with Opus 4.6

2025-07-13

Automated Google Drive Backup with Rclone: Headless OAuth Authentication and systemd Configuration

2025-07-06

Cloudflare Tunnel Practical Guide: Securely Exposing a Home AI Server Without Port Forwarding

2025-06-29

Automating Video Generation with Remotion and VOICEVOX: From Environment Setup to Performance Optimization

2025-06-22

Turn Conversation Data into Assets with Gemini API: History Export, RAG, and Streamlit

2025-06-15

What I Gained from Interacting with Shogi AI: The Path to 1st Place in Floodgate and My Approach to Distilled Models

2025-06-08

Hardware Selection for Local LLMs: Overcoming the VRAM Wall with Practical GPU, CPU, and Memory Configurations

2025-06-01

Using Python to Load Google Docs into AI — Drive API Minimal Permission Setup