This article explains how to launch NVIDIA's Nemotron-Nano-9B-v2-Japanese with vLLM and integrate it into your custom application as an OpenAI-compatible API. It eliminates the nee...
This article explains the shared infrastructure design and resource management strategy for operating 13 projects, including Shogi AI, LLM applications, and legal systems, on a sin...
This article explains how to build an AI development environment under WSL2 that makes full use of the RTX 5090's 32 GB of VRAM, allowing vLLM, TensorRT, Shogi AI, an...
This article documents running the dlshogi Shogi engine on an RTX 5090 with TensorRT FP8 quantization. We explain the structure of the Fuka40B model, the effects of quantization, Fl...