
Why Google Wasn't Indexing My FastAPI Site — The HEAD Request Trap

The Symptom: 93 Pages Invisible to Google

I run a technical blog on FastAPI behind Cloudflare Tunnel. One day I checked Google Search Console and found that almost none of the site was indexed.

The site was working perfectly in a browser. Every page returned 200. The sitemap was valid. The robots.txt allowed everything. So why was Google refusing to index 93 out of ~95 pages?

The Investigation

The first instinct was to check for obvious problems:

# Site loads fine
curl -s -o /dev/null -w "%{http_code}" https://media.patentllm.org/
# 200

# Sitemap is valid
curl -s -o /dev/null -w "%{http_code}" https://media.patentllm.org/sitemap.xml
# 200

# Article pages work
curl -s -o /dev/null -w "%{http_code}" https://media.patentllm.org/blog/ai/some-article
# 200

Everything looks fine. But then I tried what Googlebot does — a HEAD request:

# -I sends a real HEAD request; -X HEAD only renames the method, and curl still waits for a body
curl -s -o /dev/null -w "%{http_code}" -I https://media.patentllm.org/
# 405

405 Method Not Allowed. Every single page on the site rejected HEAD requests.

Why HEAD Requests Matter for SEO

The HTTP specification (RFC 9110, section 9.1) requires all general-purpose servers to support both GET and HEAD. HEAD is identical to GET except that the server must not send a response body.

Googlebot uses HEAD requests during crawling to:

  1. Check if a page exists before fetching the full content
  2. Verify Content-Type and other headers efficiently
  3. Detect redirects without downloading the body

When HEAD returns 405, Googlebot logs it as a server error. The page gets stuck in "Discovered — currently not indexed" limbo. Google knows the URL exists (from the sitemap), but it can't crawl it because the pre-flight HEAD check fails.

The Root Cause: FastAPI + Starlette Version Behavior

Here's the surprise: FastAPI 0.133.1 with Starlette 0.52.1 does not automatically handle HEAD requests on GET routes.

I verified this with a minimal reproduction:

from fastapi import FastAPI
from fastapi.testclient import TestClient

app = FastAPI()

@app.get("/")
def index():
    return {"hello": "world"}

client = TestClient(app)
print("GET:", client.get("/").status_code)    # 200
print("HEAD:", client.head("/").status_code)  # 405

This is not a bug in your application code. It comes from how the framework registers routes: Starlette's own Route adds HEAD to any route that accepts GET, but FastAPI's @app.get registers GET only, so a HEAD request matches the path, fails the method check, and gets 405. If you're running FastAPI and haven't explicitly addressed HEAD requests, your site likely has the same problem; you just might not know it yet.

The Fix: One Middleware

The cleanest solution is a middleware that intercepts HEAD requests, processes them as GET internally, and strips the response body:

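from fastapi import Request, Response

@app.middleware("http")
async def handle_head_requests(request: Request, call_next):
    """Serve HEAD by running the GET route and dropping the body (HTTP spec compliance)."""
    if request.method != "HEAD":
        return await call_next(request)
    request.scope["method"] = "GET"  # let the router match the GET route
    try:
        response = await call_next(request)
    finally:
        request.scope["method"] = "HEAD"  # restore it so the server still sees HEAD
    # call_next returns a streaming response, so assigning response.body has no
    # effect. Return a body-less response that keeps the status and headers.
    head_response = Response(status_code=response.status_code)
    head_response.raw_headers = response.raw_headers
    return head_response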

This middleware needs to run before any other middleware. FastAPI runs the most recently registered middleware first, so register it after your other middleware definitions.

After adding this and restarting the service:

curl -s -o /dev/null -w "%{http_code}" -I https://media.patentllm.org/
# 200

curl -s -o /dev/null -w "%{http_code}" -I https://media.patentllm.org/blog/ai/some-article
# 200

curl -s -o /dev/null -w "%{http_code}" -I https://media.patentllm.org/sitemap.xml
# 200

Every route now correctly responds to HEAD requests.

How to Check If You're Affected

Run this against your own FastAPI site:

# Replace with your URL
curl -s -o /dev/null -w "HEAD: %{http_code}\n" -I https://your-site.com/
curl -s -o /dev/null -w "GET:  %{http_code}\n" https://your-site.com/

If HEAD returns 405 and GET returns 200, you have this problem.
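The same comparison can be scripted with the Python standard library, which is handy if you want to run it from a cron job or CI step. A minimal sketch: `status_for` and `head_is_rejected` are hypothetical helper names, and the 10-second timeout is an arbitrary assumption.

```python
import urllib.error
import urllib.request


def status_for(url: str, method: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for one request, without raising on 4xx/5xx."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for error statuses; the code is what we want here.
        return error.code


def head_is_rejected(url: str) -> bool:
    """True when GET succeeds but HEAD does not."""
    return status_for(url, "GET") == 200 and status_for(url, "HEAD") != 200
```

Calling `head_is_rejected("https://your-site.com/")` returns True for a site showing the symptom described here.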

You can also check your FastAPI/Starlette versions:

python -c "import fastapi; print(fastapi.__version__); import starlette; print(starlette.__version__)"

What Happened After the Fix

After deploying the middleware fix:

  1. I requested re-indexing for key pages in Google Search Console
  2. The "Server error (5xx)" count dropped to 0 within 24 hours
  3. Pages started moving from "Discovered — currently not indexed" to "Indexed" over the following days

The site had been live for weeks with this invisible problem. Every page worked in a browser, every monitoring check passed (because they all used GET), but Google could crawl almost nothing beyond the initial discovery.

Lessons Learned

  1. Test with HEAD, not just GET. Browser testing and typical health checks only use GET. Add curl -I (which sends a real HEAD request) to your deployment verification.

  2. Google Search Console errors are real. When GSC reports "Server error (5xx)" and your monitoring shows all-green, the discrepancy is a clue — not a false positive. Something about how Google accesses your site differs from how you test it.

  3. Framework defaults change between versions. Don't assume HTTP method handling behavior is stable across framework upgrades. Check the changelog when upgrading Starlette/FastAPI.

  4. Middleware order matters in FastAPI. FastAPI executes middleware in reverse registration order. The HEAD handler must run before middleware that might depend on the request method.

  5. The HTTP spec is your friend. RFC 9110 says HEAD must work wherever GET works. Violating this doesn't just break Googlebot — it breaks any well-behaved HTTP client, proxy, or CDN that relies on HEAD for cache validation or pre-flight checks.