Log file analysis

Every request to your server is logged, and that includes every single Googlebot visit. Log analysis tells you what Google actually crawls (not what you think it crawls), where it spends its time, what it skips, and where it hits errors.
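As a sketch of what this looks like mechanically, assuming the common combined log format (the sample lines and field layout here are illustrative, not from any real site), pulling the Googlebot entries out of a log might look like:

```python
import re
from collections import Counter

# Combined Log Format: IP - - [timestamp] "METHOD path HTTP/x" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def googlebot_hits(lines):
    """Yield (ip, path, status) for lines whose user-agent claims to be Googlebot."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m and "Googlebot" in m.group("ua"):
            yield m.group("ip"), m.group("path"), int(m.group("status"))

# Illustrative sample lines, not real traffic
sample = [
    '66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /products/widget HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Oct/2024:13:55:37 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '66.249.66.1 - - [10/Oct/2024:13:55:38 +0000] "GET /old-page HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

hits = list(googlebot_hits(sample))
status_counts = Counter(status for _, _, status in hits)
print(hits)           # the two Googlebot lines
print(status_counts)  # Counter({200: 1, 404: 1})
```

Matching on the user-agent string is only the first pass; it catches spoofers too, which is why the DNS verification below matters.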

When log analysis is worth it

What you can learn

Getting the logs

Depends on hosting:

Tools

Verifying Googlebot

Bad actors spoof the Googlebot user-agent. Verify by doing a reverse DNS lookup on the IP:

  1. Take the IP from the log entry.
  2. Run a reverse lookup: nslookup [IP]. The hostname should end in googlebot.com or google.com.
  3. Run a forward lookup on that hostname: it should resolve back to the original IP.

If the forward lookup doesn't return the original IP, it's not real Googlebot. Most log analysis tools automate this check.
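The three steps above can be sketched in Python. This is a minimal version using the standard library's resolver functions; the lookup functions are injectable parameters (an assumption of this sketch, purely so it can be tested without network access):

```python
import socket

def is_real_googlebot(ip,
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Verify a claimed-Googlebot IP via reverse-then-forward DNS lookup."""
    try:
        # Step 1+2: reverse lookup; hostname must be under googlebot.com or google.com
        host, _, _ = reverse(ip)
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Step 3: forward lookup must resolve back to the original IP
        return ip in forward(host)[2]
    except socket.herror:   # no reverse DNS record at all
        return False
    except socket.gaierror: # forward lookup failed
        return False
```

With the default arguments each call does two live DNS queries, so in practice you'd cache results per IP rather than re-verify every log line.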

Key ratios to compute

Findings that matter

  1. Googlebot crawling URLs you don't care about: filter, sort, and session parameter URLs. Handle with robots.txt or better URL hygiene.
  2. Googlebot NOT crawling URLs you do care about: low crawl frequency on money pages. Fix the site architecture or freshness signals.
  3. A crawl frequency drop on a specific section: this can precede a ranking drop, so treat it as an early warning.
  4. A high 404 rate: fix the links or remove the pages cleanly.
  5. Slow response times to the bot: crawl efficiency suffers, and the same slowness is hurting your LCP.
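Several of these findings fall out of simple counting once the verified Googlebot hits are extracted. A minimal sketch, assuming the hits have already been reduced to (path, status) pairs (the input shape and sample data here are hypothetical):

```python
from collections import Counter
from urllib.parse import urlsplit

def crawl_report(hits):
    """Summarize Googlebot hits given as (path, status) pairs."""
    sections = Counter()
    statuses = Counter()
    param_hits = 0
    for path, status in hits:
        parts = urlsplit(path)
        # Finding 1: parameter URLs (filter/sort/session) eating crawl budget
        if parts.query:
            param_hits += 1
        # Finding 3: crawl share per top-level section
        section = "/" + parts.path.strip("/").split("/")[0]
        sections[section] += 1
        statuses[status] += 1
    total = len(hits)
    return {
        "hits": total,
        "pct_404": 100 * statuses[404] / total if total else 0.0,          # finding 4
        "pct_parameter_urls": 100 * param_hits / total if total else 0.0,  # finding 1
        "top_sections": sections.most_common(5),                           # findings 2-3
    }

# Illustrative data only
hits = [("/products/widget", 200), ("/products/widget?sort=price", 200),
        ("/blog/post-1", 200), ("/old-page", 404)]
print(crawl_report(hits))
```

Run this per day or per week and diff the section counts over time; a section whose crawl share is shrinking is the early-warning signal from finding 3.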

Cadence

After big changes (migrations, restructures): daily for 2-4 weeks.