Deep Dive
In-depth technical articles on AI, GPU inference, and developer tools
Today's Local LLM Acceleration Techniques: ik_llama.cpp Speedup, Tinybox, and NVIDIA GTC Latest Trends
GPU & Inference | 2026: Local AI Evolves! From Offline Devices to Large-Scale Inference on RTX
Today's Highlights: In 2026, AI's evolution is remarkable, with accelerated adoption in...
GPU & Inference | RTX 40 Series Makes LLMs Blazing Fast! The Complete Guide to Inference Optimization for Individual Developers [2026 Latest Edi...
Hello everyone! I'm soy-tuber, an AI researcher and individual developer. I usually push my RTX 5090...
GPU & Inference | The Technical Debt Local AI Must Fix Before It's Too Late — What NemoClaw Says About NVIDIA's Philosophy
If you've been following my recent posts, you might have seen my repository and the issue I opened on...
GPU & Inference | Punching Through NVIDIA NemoClaw's Sandbox to Hit Local vLLM on RTX 5090
Disclaimer: This is an experimental build, not a production setup. NemoClaw is early-stage, the...
GPU & Inference | Hardware Selection for Local LLMs: Overcoming the VRAM Wall with Practical GPU, CPU, and Memory Configurations
Introduction: Gemini Flash Equivalent Locally? The Despair of a Slow Development...
GPU & Inference | RTX 5090 + Nemotron 9B on vLLM — Benchmarks & TRT-LLM Comparison
I've been running Nemotron Nano 9B v2 Japanese on an RTX 5090 with vLLM 0.15.1 and wanted to share...
GPU & Inference | Shogi AI with RTX 5090 — Record of TensorRT FP8 Quantization and Floodgate Practical Games
What is dlshogi? dlshogi is a Shogi engine incorporating deep learning, consisting of a...
GPU & Inference | Practical Guide to Running Nemotron-Nano-9B-v2-Japanese with vLLM and Integrating it into Your Custom Application via an Open...
Introduction: Recently, an article on Qiita titled "Running Nemotron-Nano-9B-v2-Japanese...
GPU & Inference | Personal AI Development Environment Built with RTX 5090 + WSL2 — A Practical Setup Fully Utilizing 32GB GPU
Why RTX 5090 + WSL2? The 32GB VRAM of the RTX 5090 is a practical choice for local...
GPU & Inference | Individual Developer's Portfolio Strategy: Running 13 Projects on a Single RTX 5090
The 13-Project List: The portfolio consists of the following categories: Legal...