
API Design: REST vs GraphQL vs gRPC

API design decisions have long tails: once you publish an API and clients integrate with it, changing it is expensive. The choice of protocol, versioning strategy, and backwards-compatibility approach should each be a deliberate decision, not a default.


REST: The Default Choice and Why It’s Usually Right

REST is HTTP-native — it uses standard verbs (GET, POST, PUT, PATCH, DELETE), status codes, headers, and content negotiation. It’s stateless, cacheable, and every HTTP client in existence can call it.

REST wins when:

  • Your consumers are diverse (mobile apps, third-party developers, browsers, other services)
  • You need HTTP caching (GET responses with Cache-Control)
  • The access patterns map naturally to resources and CRUD
  • Team familiarity matters — REST is the most widely understood API style
  • You need public or partner APIs where simplicity and documentation matter

REST’s weaknesses:

  • Over-fetching: API returns a User object with 30 fields; client needed 3. Wastes bandwidth and parsing time, especially on mobile.
  • Under-fetching: Client needs user + orders + profile. Three round trips unless you build a custom endpoint.
  • Versioning drift: Over time, APIs accumulate versions and deprecated fields, and the surface area becomes unwieldy.

For most internal and external APIs, these weaknesses are manageable with thoughtful design (field selection, composite endpoints for common patterns) and don’t justify the complexity of an alternative.
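
As a rough illustration of the field-selection idea, here is a minimal sketch of trimming a response to the fields named in a query parameter such as ?fields=name,email. The parameter name, the select_fields helper, and the sample user record are all assumptions made for this example, not part of any standard.

```python
from typing import Any, Dict, Optional


def select_fields(resource: Dict[str, Any], fields_param: Optional[str]) -> Dict[str, Any]:
    """Return only the requested top-level fields, or everything if none were requested."""
    if not fields_param:
        return dict(resource)
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    # Unknown field names are silently skipped here; a stricter API might return 400.
    return {name: resource[name] for name in wanted if name in resource}


# Hypothetical resource standing in for the "30-field User object".
user = {"id": "123", "name": "Ada", "email": "ada@example.com", "locale": "en-GB"}
trimmed = select_fields(user, "name,email")
```

This only handles top-level fields. Nested selection is where the design starts to resemble GraphQL, which is a reasonable signal to stop and reconsider.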


GraphQL: When It’s Worth the Complexity

GraphQL is a query language — clients specify exactly what data they need in the shape they need it.

query {
  user(id: "123") {
    name
    email
    orders(last: 5) {
      id
      status
      total
    }
  }
}

GraphQL wins when:

  • Multiple clients with different data needs. Mobile app needs fewer fields; web app needs more. With REST, you build multiple endpoints or bloat the response. With GraphQL, each client requests exactly what it needs.
  • BFF (Backend for Frontend) aggregation. A single GraphQL layer aggregates data from multiple backend services. The client doesn’t need to know about backend service topology.
  • Rapidly evolving data model. Adding new fields doesn’t break existing queries. Deprecating fields is visible in the schema.
  • Complex, nested data relationships. GraphQL resolvers compose naturally for graph-shaped data.

GraphQL’s real costs:

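  • Caching is harder. REST GET requests are trivially cacheable by URL. GraphQL queries are usually sent as POST requests with a body, so HTTP caching doesn’t apply by default. You need application-level caching, such as persisted queries (which can be sent as cacheable GETs) or a client-side normalized cache. DataLoader’s per-request cache helps with deduplication, but it addresses N+1 batching rather than HTTP caching.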
  • N+1 queries are easy to introduce. A naive GraphQL resolver fetches each item’s related data in a loop. DataLoader batches these, but it must be implemented correctly.
  • Error handling is non-standard. GraphQL servers typically return HTTP 200 even when the query partially fails, with the failures listed in the errors array of the response body. This breaks conventional monitoring that keys on HTTP status codes.
  • Security surface: Clients can write arbitrarily complex queries. Depth limiting, query complexity budgets, and persisted queries are necessary to prevent abuse.
  • Tooling and expertise: The ecosystem is good but smaller than REST. Debugging, federation (Apollo Federation), schema stitching — all add complexity.
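
To make the N+1 cost from the list above concrete, here is a small sketch that compares a per-item resolver with a batched one. It does not use the real DataLoader library; the in-memory ORDERS list, the query counter, and both resolver functions are hypothetical stand-ins for a database and resolver layer.

```python
from collections import defaultdict

# Hypothetical in-memory "orders table" standing in for a real database.
ORDERS = [
    {"id": "o1", "user_id": "u1", "total": 20},
    {"id": "o2", "user_id": "u2", "total": 35},
    {"id": "o3", "user_id": "u1", "total": 5},
]
query_count = 0  # counts simulated round trips to the database


def fetch_orders_for_users(user_ids):
    """One round trip for many users (think: WHERE user_id IN (...))."""
    global query_count
    query_count += 1
    grouped = defaultdict(list)
    for order in ORDERS:
        if order["user_id"] in user_ids:
            grouped[order["user_id"]].append(order)
    return grouped


def resolve_orders_naive(user_ids):
    # N+1 shape: one fetch per user in the parent list.
    return {uid: fetch_orders_for_users({uid})[uid] for uid in user_ids}


def resolve_orders_batched(user_ids):
    # Batched shape: collect the keys first, fetch once, then fan results back out.
    grouped = fetch_orders_for_users(set(user_ids))
    return {uid: grouped[uid] for uid in user_ids}


users = ["u1", "u2", "u3"]
naive = resolve_orders_naive(users)
naive_queries = query_count

query_count = 0
batched = resolve_orders_batched(users)
batched_queries = query_count
```

Both resolvers return the same data. The naive one makes one fetch per user, while the batched one makes a single fetch regardless of list size. A production batcher also has to collect keys across resolvers within one request, which is the part that is easy to get wrong.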

The honest EM take: GraphQL is genuinely valuable for consumer-facing APIs where multiple clients (iOS, Android, web) have divergent data needs, or for a BFF aggregation layer. For internal service-to-service communication, it’s rarely the right choice — gRPC or REST is simpler.
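
On the security point in the cost list above, depth limiting can be sketched in a few lines. This is a deliberately crude version that counts brace nesting in the raw query string. The MAX_DEPTH budget and the function names are assumptions for illustration, and a real server would more likely walk the parsed query AST.

```python
def query_depth(query: str) -> int:
    """Maximum brace nesting depth of a query string.

    Simplification: counts '{' and '}' only, so it ignores fragments and would
    miscount braces that appear inside string or input-object arguments.
    """
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


MAX_DEPTH = 5  # hypothetical budget; tune per schema


def is_allowed(query: str) -> bool:
    """Reject queries nested deeper than the configured budget."""
    return query_depth(query) <= MAX_DEPTH


sample = 'query { user(id: "123") { name orders(last: 5) { id status total } } }'
too_deep = "{ a { b { c { d { e { f } } } } } }"
```

Depth alone does not bound cost, since a shallow query can still request a very wide list. That is why the list above pairs depth limits with complexity budgets and persisted queries.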


gRPC: Internal Service-to-Service Communication

gRPC uses Protocol Buffers (binary serialization) over HTTP/2. It’s contract-first — the .proto file defines the API, and code is generated for both client and server.

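syntax = "proto3";
// UserRequest, UserResponse, and UserEvent message definitions are omitted here.
service UserService {
  // Unary call: one request, one response.
  rpc GetUser (UserRequest) returns (UserResponse);
  // Server streaming: one request, a stream of events back.
  rpc StreamUserEvents (UserRequest) returns (stream UserEvent);
}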

gRPC wins when:

  • Internal service-to-service communication where performance matters
  • Strongly typed contracts between services reduce integration bugs
  • You want auto-generated client libraries in multiple languages
  • You need streaming (server streaming, client streaming, bidirectional streaming)
  • Polyglot microservices — generated clients work in Go, Java, Python, etc.

gRPC’s costs:

  • Not browser-native — gRPC-Web proxy needed for browser clients (adds complexity)
  • Binary protocol means you can’t curl it without tooling (grpcurl, Postman with gRPC support)
  • HTTP/2 can be problematic through certain proxies, load balancers, and firewalls
  • Protobuf schema evolution requires discipline (don’t reuse field numbers)
  • Steeper learning curve than REST for teams new to it
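
The "don’t reuse field numbers" rule above can be enforced mechanically. Below is a minimal sketch of such a check, assuming each schema version has already been reduced to a {field_number: (name, type)} mapping. That representation, the check_evolution helper, and the sample fields are all hypothetical; this is not how protoc or any existing linter models a schema.

```python
from typing import Dict, List, Set, Tuple

Schema = Dict[int, Tuple[str, str]]  # field number -> (field name, field type)


def check_evolution(old: Schema, new: Schema, reserved: Set[int]) -> List[str]:
    """Flag changes that risk misreading data written by older clients."""
    problems = []
    for number, (name, ftype) in old.items():
        if number in new:
            _, new_type = new[number]
            # Conservative: flags any type change, although protobuf treats
            # a few (for example int32 to int64) as wire-compatible.
            if new_type != ftype:
                problems.append(f"field {number} changed type {ftype} -> {new_type}")
        elif number not in reserved:
            problems.append(f"field {number} ({name}) removed without being reserved")
    return problems


v1: Schema = {1: ("id", "string"), 2: ("email", "string"), 3: ("nickname", "string")}
# v2 drops field 2 and reuses number 3 for an unrelated int32 field.
v2: Schema = {1: ("id", "string"), 3: ("age", "int32"), 4: ("display_name", "string")}
# v3 retires fields 2 and 3 properly and only adds a new number.
v3: Schema = {1: ("id", "string"), 4: ("display_name", "string")}
```

Running check_evolution(v1, v2, set()) reports both the unreserved removal and the reused number, while check_evolution(v1, v3, {2, 3}) comes back empty.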

REST vs gRPC for internal services:

  • Small team, REST expertise, simple request/response: REST is fine
  • Performance-critical inter-service calls, polyglot environment, strict typing: gRPC
  • The performance difference (binary vs JSON, HTTP/2 multiplexing) is real but usually not the bottleneck — don’t over-optimize

API Versioning

Versioning is a commitment to support multiple API behaviors simultaneously. Choose your strategy upfront because changing it later is painful.

URL Versioning (/v1/users, /v2/users)

  • Explicit, discoverable
  • Easy to route at API gateway
  • Clients know exactly what version they’re using
  • Downside: version proliferation. /v1, /v2, and /v3 all require parallel maintenance

Header Versioning (Accept: application/vnd.api+json;version=2)

  • Clean URLs
  • Downside: harder to test (can’t just change the URL)
  • Downside: less discoverable
  • Often used for content negotiation-style versioning

No Versioning (Evolution instead)

  • Only add fields; never remove or rename one without a deprecation period
  • Use @deprecated annotation in schemas and documentation
  • When a field must go, set a sunset date and enforce client migration before removing it
  • Requires disciplined schema evolution (additive-only changes)
  • Works well for mature APIs with trusted consumers

Recommendation: URL versioning for public APIs (clarity over elegance). For internal APIs whose consumers you can coordinate with directly, skip versioning and evolve additively.
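
For comparison, here is a small sketch of how a gateway or handler might resolve the requested version under the URL and header strategies described above. The resolve_version function, its precedence order (URL first, then the Accept parameter, then a default), and the default of 1 are assumptions made for the example.

```python
import re


def resolve_version(path: str, accept: str = "", default: int = 1) -> int:
    """Pick an API version: URL prefix wins, then an Accept parameter, then the default."""
    url_match = re.match(r"^/v(\d+)(?:/|$)", path)
    if url_match:
        return int(url_match.group(1))
    # Looks for a media-type parameter such as ";version=2" in the Accept header.
    header_match = re.search(r";\s*version=(\d+)", accept)
    if header_match:
        return int(header_match.group(1))
    return default


from_url = resolve_version("/v2/users")
from_header = resolve_version("/users", "application/vnd.api+json;version=2")
fallback = resolve_version("/users")
```

The URL branch is a single anchored regex, which is why URL versioning is easy to route at a gateway. The header branch needs the request headers and a parsing convention that every client has to learn.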


Backwards Compatibility

When changing an API used by many clients, the risks are:

  1. Removing a field a client depends on
  2. Changing a field’s type
  3. Changing behavior of an existing operation

Safe changes (backwards compatible):

  • Adding optional fields to requests
  • Adding fields to responses (clients must ignore unknown fields — enforce this)
  • Adding new endpoints
  • Adding new enum values (with care — some clients break on unknown enums)
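
The two caveats in the list above (ignore unknown fields, survive unknown enum values) describe a tolerant client. A minimal sketch of one follows; the Order type, the OrderStatus values, and the UNKNOWN fallback member are hypothetical choices for this example.

```python
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    UNKNOWN = "unknown"  # fallback for values this client version does not know


@dataclass
class Order:
    id: str
    status: OrderStatus


def parse_order(payload: dict) -> Order:
    """Read only the fields this client uses and tolerate everything else."""
    try:
        status = OrderStatus(payload["status"])
    except ValueError:
        # The server added an enum value after this client shipped.
        status = OrderStatus.UNKNOWN
    # Extra keys in the payload are never inspected, so new fields are harmless.
    return Order(id=payload["id"], status=status)


# A newer server response with an added field and an added enum value.
order = parse_order({"id": "o1", "status": "refunded", "gift_note": "hi"})
```

Clients that deserialize into strict schemas and reject unknown keys or enum values are the ones that turn an additive server change into an outage.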

Breaking changes:

  • Removing or renaming fields
  • Changing field types
  • Changing error codes or response structure
  • Changing required/optional semantics

Consumer-driven contract testing (Pact): Each consumer publishes a contract describing the requests it makes and the response fields it depends on. The provider’s CI verifies every new API version against all published contracts and fails the build on a violation. This is the most rigorous approach for a large consumer base.
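
The idea behind contract testing can be shown without the Pact tooling itself. The sketch below is a hand-rolled stand-in, not Pact’s API: the CONTRACTS registry, the consumer names, and the violations helper are all hypothetical, and a contract here is just the set of response fields and types a consumer relies on.

```python
from typing import Dict, List

# Hypothetical registry: each consumer declares only the fields it actually reads.
CONTRACTS: Dict[str, Dict[str, type]] = {
    "mobile-app": {"id": str, "name": str},
    "billing-service": {"id": str, "email": str},
}


def violations(response: dict, contracts: Dict[str, Dict[str, type]] = CONTRACTS) -> List[str]:
    """List every consumer expectation that a candidate response would break."""
    problems = []
    for consumer, fields in contracts.items():
        for field, expected in fields.items():
            if field not in response:
                problems.append(f"{consumer}: missing '{field}'")
            elif not isinstance(response[field], expected):
                problems.append(f"{consumer}: '{field}' is not {expected.__name__}")
    return problems


current = {"id": "123", "name": "Ada", "email": "ada@example.com"}
# Candidate change: id becomes an integer and email is dropped.
candidate = {"id": 123, "name": "Ada"}
```

In CI, a non-empty result from violations(candidate) would fail the provider build and name the affected consumers, which is what you want to know before shipping a change.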

Deprecation and Sunset headers: Deprecation: true, Sunset: Fri, 01 Jan 2027 00:00:00 GMT. These give clients a programmatic signal to migrate. Monitor usage of deprecated endpoints before removal.
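
A minimal sketch of emitting these headers, assuming a hypothetical deprecation_headers helper that a handler or middleware would call. Formatting the date with the standard library keeps the weekday consistent with the calendar date, which is easy to get wrong by hand.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def deprecation_headers(sunset_at: datetime) -> Dict[str, str]:
    """Build response headers for a deprecated endpoint.

    Expects a timezone-aware datetime; it is converted to UTC before formatting.
    """
    return {
        # Mirrors the "Deprecation: true" form used in the text above.
        "Deprecation": "true",
        # usegmt=True produces the HTTP date format ending in "GMT".
        "Sunset": format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True),
    }


headers = deprecation_headers(datetime(2027, 1, 1, tzinfo=timezone.utc))
```

Logging which clients still hit endpoints that carry these headers gives you the usage data to check before the removal date.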


WebSockets and Server-Sent Events vs Polling

Polling: Client calls /status?id=123 every N seconds. Simple, stateless, easy to scale. The downside is that the server is bombarded with requests, most of which return no new data. Acceptable for low-frequency status checks (job status, slow-changing data).
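
A polling client is short enough to sketch in full. In this version the fetch_status callable, the "done" and "failed" terminal states, and the "timeout" return value are assumptions for illustration; the sleep function is injectable so the loop can be tested without waiting.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status at a fixed interval until it reports a terminal state."""
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in ("done", "failed"):
            return status
        # Skip the wait after the final attempt; we are about to give up anyway.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return "timeout"


# Simulated job that finishes on the third check.
responses = iter(["pending", "pending", "done"])
waits = []
result = poll_until_done(lambda: next(responses), interval_s=0.5, sleep=waits.append)
```

Two of the three calls in this run returned nothing new. Multiply that by every connected client and you have the wasted load that push-based options avoid.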

Long Polling: Client makes a request; server holds it open until there’s data to send (or timeout). Reduces unnecessary requests but complicates server-side connection management. Largely superseded by SSE and WebSockets.

Server-Sent Events (SSE): HTTP-based unidirectional push from server to client. Standard EventSource API in browsers. Automatic reconnection. Works through most proxies. Good for: live dashboards, news feeds, notification pushes, progress updates.
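
Part of SSE’s appeal is how simple the wire format is: text fields terminated by a blank line, served with Content-Type: text/event-stream. Here is a sketch of a formatter for one event; the format_sse name and the sample "progress" event are made up for the example.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        # The browser echoes the last id back as Last-Event-ID when it reconnects.
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads become one "data:" line per line of text.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"


frame = format_sse("42%", event="progress", event_id="7")
```

A server writes frames like this to a long-lived HTTP response and flushes after each one. The browser’s EventSource handles parsing and reconnection.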

WebSockets: Full-duplex, bidirectional. Client and server both push and receive. More complex to scale (stateful connections, sticky sessions or pub/sub fan-out layer). Good for: chat applications, real-time collaborative editing, live gaming, trading platforms.

The decision:

  • One-way server-to-client push, browser client: SSE
  • Bidirectional real-time communication: WebSocket
  • Infrequent updates, simple implementation: polling
  • Never use WebSockets just because “it’s faster” for standard request/response — the overhead of connection management outweighs the benefit