Google Sheets API Rate Limits Explained

Understanding Google Sheets API quotas, what the 429 error means, and how GKit SheetsAPI helps you stay within limits without slowing your app down.

You've built something. It works perfectly in development. You deploy, traffic picks up, and then — somewhere around the third feature launch or the first real spike — requests start failing with a status you hadn't planned for:

HTTP 429 Too Many Requests

Google Sheets API rate limits are one of those things developers discover at the worst possible moment. This post explains exactly how the quotas work, what the 429 means, and what you can do about it — both in your own code and at the infrastructure layer.

Two quota dimensions, not one

Most APIs gate you on a single number. Google Sheets API uses two separate counters simultaneously, and you can hit either one independently.

Per-project quota applies to all requests made under your Google Cloud project, regardless of which user triggered them:

  • Read requests: 300 per minute
  • Write requests: 300 per minute

Per-user-per-project quota applies to requests made on behalf of a specific authenticated user within your project:

  • Read requests: 60 per minute per user
  • Write requests: 60 per minute per user

The per-user limit is the one that surprises people. You can have 10 users each making 6 reads per minute and never approach the 300/min project ceiling — but if a single user's workflow triggers 61 reads in one minute, that user's requests start failing even though the project as a whole is well under its limit.

Keep both numbers in mind when you're estimating load. A dashboard that auto-refreshes every 5 seconds for a single power user is already running at 12 reads per minute for that user. Scale that across tabs, concurrent sessions, or a webhook fan-out and you'll hit 60 faster than expected.
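To make the two counters concrete, here is a minimal sketch of a client-side guard that tracks reads against both ceilings before a request is sent. The class names (`SlidingWindowCounter`, `SheetsReadBudget`) are hypothetical, and the sliding 60-second window is an assumption about how to approximate Google's per-minute accounting, not a description of how Google measures it.

```typescript
// Hypothetical helper: counts events inside a sliding time window.
class SlidingWindowCounter {
  private timestamps: number[] = [];
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // True if one more event fits in the window ending at `now`.
  hasRoom(now: number): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have aged out of the window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    return this.timestamps.length < this.limit;
  }

  record(now: number): void {
    this.timestamps.push(now);
  }
}

// Hypothetical guard mirroring the read quotas quoted above:
// 300/min for the project, 60/min for each user.
class SheetsReadBudget {
  private project = new SlidingWindowCounter(300);
  private perUser = new Map<string, SlidingWindowCounter>();

  // Returns true and records the read only if BOTH counters have room.
  tryRead(userId: string, now: number = Date.now()): boolean {
    let user = this.perUser.get(userId);
    if (!user) {
      user = new SlidingWindowCounter(60);
      this.perUser.set(userId, user);
    }
    if (!user.hasRoom(now) || !this.project.hasRoom(now)) {
      return false;
    }
    user.record(now);
    this.project.record(now);
    return true;
  }
}
```

Calling `tryRead(userId)` before each read and deferring the request when it returns false keeps a single busy user from burning through their own allowance while the project counter still has room.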

What a 429 response actually looks like

When you exceed a quota, the API responds with HTTP 429 and a JSON body that looks roughly like this:

{
  "error": {
    "code": 429,
    "message": "Quota exceeded for quota metric 'sheets.googleapis.com/read_requests' and limit 'READ_REQUESTS_PER_MINUTE_PER_USER' of service 'sheets.googleapis.com'.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "RATE_LIMIT_EXCEEDED",
        "domain": "googleapis.com"
      }
    ]
  }
}

Two things worth noting: the message field tells you which limit you hit (per-project vs per-user), and the status is RESOURCE_EXHAUSTED — the same status Google uses for quota exhaustion across all its APIs. If you're pattern-matching on status strings, RESOURCE_EXHAUSTED is the value to catch.
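As a rough sketch of that pattern-matching, the function below inspects an error body shaped like the one above and reports which kind of limit was hit. The function name `classifyQuotaError` is hypothetical, and treating any limit name that does not mention a user as project-level is an assumption based on the sample message, not a documented contract.

```typescript
type QuotaScope = "per-user" | "per-project" | "unknown";

// Only the fields this sketch reads from the error body.
interface GoogleApiErrorBody {
  error?: {
    code?: number;
    message?: string;
    status?: string;
  };
}

// Returns null when the body is not a quota error at all.
function classifyQuotaError(body: GoogleApiErrorBody): QuotaScope | null {
  const err = body.error;
  if (!err) return null;

  const isQuotaError = err.code === 429 || err.status === "RESOURCE_EXHAUSTED";
  if (!isQuotaError) return null;

  // Pull the limit name out of: ... and limit 'SOME_LIMIT_NAME' of service ...
  const match = /limit '([^']+)'/.exec(err.message ?? "");
  if (!match) return "unknown";

  const limitName = match[1] ?? "";
  // Assumption: per-user limits mention "per user" (with "_" or a space).
  return /per[_ ]user/i.test(limitName) ? "per-user" : "per-project";
}
```

Logging the returned scope alongside the user ID makes it easier to tell a single noisy user apart from a project-wide spike.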

Exponential backoff: the only retry strategy that works

Retrying immediately after a 429 makes the problem worse. You're already over quota, and an instant retry just adds more load while the window resets. Google's documented recommendation is exponential backoff with jitter.

Here's a production-ready TypeScript implementation:

interface RetryOptions {
  maxAttempts?: number;
  baseDelayMs?: number;
  maxDelayMs?: number;
}
 
async function withExponentialBackoff<T>(
  fn: () => Promise<T>,
  options: RetryOptions = {}
): Promise<T> {
  const { maxAttempts = 5, baseDelayMs = 1000, maxDelayMs = 32000 } = options;
 
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err: unknown) {
      const isRateLimited =
        err instanceof Error &&
        "status" in (err as Record<string, unknown>) &&
        (err as Record<string, unknown>).status === 429;
 
      const isLastAttempt = attempt === maxAttempts;
 
      if (!isRateLimited || isLastAttempt) {
        throw err;
      }
 
      // Exponential backoff: 1s, 2s, 4s, 8s with the default 5 attempts, capped at maxDelayMs
      const exponentialDelay = Math.min(
        baseDelayMs * Math.pow(2, attempt - 1),
        maxDelayMs
      );
 
      // Add jitter: ±20% of the calculated delay
      // This prevents thundering herd when multiple clients retry simultaneously
      const jitter = exponentialDelay * 0.2 * (Math.random() * 2 - 1);
      const delay = Math.round(exponentialDelay + jitter);
 
      console.warn(
        `Rate limited by Sheets API. Attempt ${attempt}/${maxAttempts}. Retrying in ${delay}ms.`
      );
 
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
 
  // TypeScript requires this even though the loop always returns or throws
  throw new Error("Unreachable");
}

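Usage is straightforward: wrap any Sheets API call. Note that fetch resolves on HTTP error statuses instead of throwing, so the callback has to throw an error carrying the status for the wrapper to see the 429:

const response = await withExponentialBackoff(async () => {
  const res = await fetch(`https://sheetsapi.gkit.mreshank.com/api/spreadsheets/${userKey}/${sheetName}`);
  // fetch resolves on HTTP errors, so throw with a status the wrapper can inspect
  if (!res.ok) throw Object.assign(new Error(`HTTP ${res.status}`), { status: res.status });
  return res;
});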

The jitter is important. Without it, every client that hit the same quota window will retry at the same interval, causing a synchronized spike that re-triggers the limit. Adding ±20% randomness staggers the retries across clients.

Practical ways to stay under the limit

Backoff handles failure gracefully, but it's better to not fail in the first place. A few patterns that reduce your quota consumption significantly:

Batch reads instead of issuing them one by one. If you need data from multiple tabs or ranges in the same spreadsheet, request them together with spreadsheets.values.batchGet, which returns several ranges in a single call and counts as one read request. Fetching the tabs in parallel with Promise.all finishes sooner but still spends one request per tab, so it does not reduce quota usage. If you need data from a single tab at multiple points in a user flow, fetch it once and pass it down.
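When several ranges live in the same spreadsheet, the Sheets API's `spreadsheets.values.batchGet` endpoint can return them in one request. The sketch below builds that URL and reads the result. The helper names (`buildBatchGetUrl`, `readRanges`) are hypothetical, and `accessToken` is assumed to be an OAuth 2.0 token you obtain elsewhere.

```typescript
// Only the parts of the batchGet response this sketch uses.
interface BatchGetResponse {
  valueRanges?: { range: string; values?: unknown[][] }[];
}

// Builds a values:batchGet URL with one `ranges` parameter per A1 range.
function buildBatchGetUrl(spreadsheetId: string, ranges: string[]): string {
  const url = new URL(
    `https://sheets.googleapis.com/v4/spreadsheets/${encodeURIComponent(spreadsheetId)}/values:batchGet`
  );
  for (const range of ranges) {
    url.searchParams.append("ranges", range);
  }
  return url.toString();
}

// Reads all ranges with a single HTTP request.
// Keys in the returned map are the range strings as the API reports them,
// which may be normalized relative to what was requested.
async function readRanges(
  spreadsheetId: string,
  ranges: string[],
  accessToken: string
): Promise<Map<string, unknown[][]>> {
  const res = await fetch(buildBatchGetUrl(spreadsheetId, ranges), {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  if (!res.ok) {
    // Attach the status so a retry wrapper can detect a 429.
    throw Object.assign(new Error(`batchGet failed with status ${res.status}`), {
      status: res.status,
    });
  }
  const body = (await res.json()) as BatchGetResponse;
  const byRange = new Map<string, unknown[][]>();
  for (const valueRange of body.valueRanges ?? []) {
    byRange.set(valueRange.range, valueRange.values ?? []);
  }
  return byRange;
}
```

For example, `readRanges(id, ["Products!A1:D50", "Pricing!A1:C20"], token)` sends one request where two separate reads would otherwise be needed.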

Cache aggressively on your server. A read request that returns a cached response consumes zero Sheets API quota. Even a 30-second in-memory cache on your server can absorb a significant number of repeated reads — particularly for data that doesn't change between requests.
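A minimal sketch of such a cache, assuming a single server process and data that is safe to serve up to 30 seconds stale. The `TtlCache` class is hypothetical. It stores the in-flight promise so that concurrent requests for the same key share one upstream read.

```typescript
// Hypothetical in-memory cache with a fixed time-to-live per entry.
class TtlCache<T> {
  private entries = new Map<string, { promise: Promise<T>; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached promise if it is still fresh, otherwise calls `load`.
  getOrLoad(key: string, load: () => Promise<T>, now: number = Date.now()): Promise<T> {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > now) {
      return hit.promise;
    }

    const promise = load();
    const entry = { promise, expiresAt: now + this.ttlMs };
    this.entries.set(key, entry);

    // Do not keep failures cached: drop the entry if the load rejects.
    promise.catch(() => {
      if (this.entries.get(key) === entry) {
        this.entries.delete(key);
      }
    });
    return promise;
  }
}
```

With `const cache = new TtlCache<unknown>(30_000)`, a handler can call `cache.getOrLoad(sheetName, () => readSheet(sheetName))`, where `readSheet` stands in for whatever function performs the real read. Only the first call in each 30-second window reaches Google.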

Stop polling. Use push instead. Auto-refreshing a sheet on a 5-second interval burns quota continuously. The Sheets API has no push channel of its own, but an Apps Script installable trigger (on edit or on change) can call your server with UrlFetchApp when the sheet changes, so you read only when there is something new. That shifts the read pattern from "always-on" to "on-demand," which is dramatically more efficient.

Separate service accounts by function. If you have multiple services reading the same spreadsheet, each one authenticated as a different service account counts against the per-user quota separately. This doesn't raise the 300/min project quota, but each service account gets its own 60/min per-user allowance, so one busy service can't starve the others.

Sheets API quota is separate from Drive API quota. If you're also calling the Drive API (listing files, checking metadata), those quota counters are independent. Hitting a Drive limit won't affect your Sheets quota and vice versa.

How GKit SheetsAPI reduces upstream calls

When you route reads through SheetsAPI instead of calling the Google Sheets API directly, you get worker-level response caching as part of the infrastructure. Requests that arrive within the cache window are served from the Cloudflare Worker edge without making a new call to Google — which means they consume no Sheets API quota at all.

This matters most for public-facing use cases: a product catalog, a pricing table, a content feed. These pages get hit by many users in a short window, and without caching every one of those requests would count against your project quota. With edge caching in place, Google only sees a trickle of upstream calls regardless of how many users are reading.

For write-heavy workflows the story is different — writes always need to reach Google, so quota management there still falls on your application logic. But for read-dominated workloads, the caching layer is effectively a quota multiplier.

Monitor your actual quota usage

Don't wait for a 429 to find out how close you are to the limit. Google Cloud Console shows real-time and historical quota usage:

  1. Open Google Cloud Console and select your project.
  2. Navigate to APIs & Services → Google Sheets API → Quotas & System Limits.
  3. You'll see per-minute read and write usage graphed over time, broken down by quota type.

The "Per user" view is the one most developers underestimate. Check it on your heaviest-usage days — if you're consistently reaching 40–50 of the 60/min per-user ceiling, a caching layer or batching refactor will pay for itself before you ever hit the wall.


Rate limits are a constraint, not a mystery. Know the two quota dimensions, implement backoff correctly, design your read patterns to be parsimonious, and cache at the edge wherever you can. The 429 is almost always solvable before it becomes a production incident.
