URL Shortener (bit.ly)
Every QR code on a cereal box, every tweet with a link, every SMS campaign from your bank — they all run through a URL shortener. bit.ly processes ~10 billion clicks per month. The system looks trivial: take a long URL, return a 7-character string. But at scale it becomes a masterclass in read-heavy systems, cache warming, hash collision handling, and hot-partition avoidance. Interviewers love it because the surface area is small but the depth is unlimited.
- `POST /shorten` — given a long URL, return a unique short code (e.g. `https://sho.rt/aB3xKz`)
- `GET /{code}` — redirect the user to the original long URL (HTTP 301 or 302)
- Custom aliases — user can request a vanity slug (e.g. `/my-brand`)
- Link expiry — optional TTL after which the short URL returns 410 Gone
- Analytics — track click count, country, referrer, device (eventual consistency is fine)
| Property | Target |
|---|---|
| Redirect latency (p99) | < 20 ms |
| Availability | 99.99% (< 53 min downtime/year) |
| Write throughput | 600 URLs created / sec (peak, with headroom) |
| Read throughput | 60,000 redirects / sec (peak; 100:1 read:write) |
| Short code collision probability | < 1 in 10^12 |
| Data retention | 5 years default, configurable |
- Full analytics dashboards (we store events; aggregation is a separate service)
- User authentication / link ownership (assume API-key auth at gateway)
- Link previews / safety scanning (separate async pipeline)
Assumptions:
- 500M redirects/day, 5M new URLs/day
- Average long URL: 200 bytes; short URL record: ~500 bytes
- 80% of traffic hits 20% of links (Pareto)
| Metric | Calculation | Result |
|---|---|---|
| Write QPS | 5M / 86,400 | ~58 writes/sec (peak ×10 → 580/sec) |
| Read QPS | 500M / 86,400 | ~5,800/sec (peak ×10 → 58,000/sec) |
| Storage/day | 5M × 500B | 2.5 GB/day |
| Storage/year | 2.5 GB × 365 | ~900 GB/year |
| Storage/5 years | ~900 GB × 5 | ~4.5 TB |
| Bandwidth (read) | 58,000 × 500B | ~29 MB/s outbound |
| Cache size (hot 20%) | 5M × 365 × 20% × 500B | ~183 GB (fits in Redis cluster) |
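The arithmetic in the table can be sanity-checked directly. This small class just encodes the assumptions above (5M writes/day, 500M reads/day, 500-byte records, 20% hot set); the class and method names are illustrative, not part of the design.

```java
// Sanity-checking the capacity table. All inputs come straight from the
// assumptions above; the class exists only to make the arithmetic auditable.
public class CapacityEstimate {
    static final long WRITES_PER_DAY = 5_000_000L;
    static final long READS_PER_DAY = 500_000_000L;
    static final long RECORD_BYTES = 500L;
    static final long SECONDS_PER_DAY = 86_400L;

    public static double writeQps() { return (double) WRITES_PER_DAY / SECONDS_PER_DAY; }
    public static double readQps()  { return (double) READS_PER_DAY / SECONDS_PER_DAY; }
    public static double storagePerDayGb() { return WRITES_PER_DAY * RECORD_BYTES / 1e9; }

    public static double hotCacheGb() {
        // one year of links, 20% of which are hot (Pareto assumption)
        return WRITES_PER_DAY * 365 * 0.20 * RECORD_BYTES / 1e9;
    }

    public static void main(String[] args) {
        System.out.printf("write QPS ~%.0f, read QPS ~%.0f%n", writeQps(), readQps());
        System.out.printf("storage/day %.1f GB, hot cache ~%.0f GB%n",
                storagePerDayGb(), hotCacheGb());
    }
}
```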
Key insight: This is overwhelmingly read-heavy. The cache hit ratio is the single most important performance lever.
┌──────────────────────────────────────┐
│ Clients │
└────────────┬─────────────────────────┘
│ HTTPS
┌────────────▼─────────────────────────┐
│ API Gateway │
│ (rate-limit, auth, TLS termination) │
└──────┬──────────────┬────────────────┘
│ │
┌───────────▼──┐ ┌──────▼───────────┐
│ Write Path │ │ Read Path │
│ (Shortener) │ │ (Redirect Svc) │
└───────┬──────┘ └──────┬────────────┘
│ │ cache miss
┌───────▼──────┐ ┌──────▼────────────┐
│ ID Generator│ │ Redis Cluster │
│ (Snowflake) │ │ (L1 cache) │
└───────┬──────┘ └──────┬────────────┘
│ │ still miss
┌───────▼──────────────────▼────────────┐
│ Primary DB (Cassandra) │
│ Partition key: short_code │
└───────────────────────────────────────┘
│ async
┌─────────────────▼──────────────────────┐
│ Analytics Event Bus (Kafka) │
└────────────────────────────────────────┘
Write path: Client → API Gateway → Shortener Service → ID Generator (Snowflake) → Base62 encode → write to Cassandra + warm Redis
Read path: Client → API Gateway → Redirect Service → Redis lookup (hit: 302 immediately) → on miss: Cassandra → populate Redis → 302
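The read path above is a classic cache-aside lookup. A minimal sketch, with in-memory maps standing in for Redis and Cassandra; `RedirectResolver` and its field names are illustrative, not the production API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Read-path sketch: cache-aside lookup. Maps stand in for Redis (L1 cache)
// and Cassandra (source of truth). A miss falls through to the DB and
// populates the cache on the way back.
public class RedirectResolver {
    private final Map<String, String> redis = new ConcurrentHashMap<>();
    private final Map<String, String> cassandra = new ConcurrentHashMap<>();

    public RedirectResolver(Map<String, String> seed) { cassandra.putAll(seed); }

    /** Returns the long URL for a code: cache hit, or DB fallback that warms the cache. */
    public Optional<String> resolve(String code) {
        String cached = redis.get(code);
        if (cached != null) return Optional.of(cached);   // Redis hit: 302 immediately
        String fromDb = cassandra.get(code);
        if (fromDb == null) return Optional.empty();      // unknown code: 404
        redis.put(code, fromDb);                          // populate cache on miss
        return Optional.of(fromDb);
    }

    public boolean isCached(String code) { return redis.containsKey(code); }
}
```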
The most common mistake is MD5/SHA256-then-truncate. That gives you a fixed output for a given URL — which means two users shortening the same URL get the same code (good for dedup, bad if you want user-scoped expiry). The cleaner approach is ID-based encoding:
- Generate a globally unique 64-bit integer via Snowflake (timestamp + datacenter + worker + sequence)
- Base62-encode it (`[0-9][a-z][A-Z]` = 62 chars)
- A full 64-bit integer needs up to `ceil(log62(2^64))` = 11 chars. To keep codes at 7 chars (62^7 ≈ 3.5 trillion codes), use a compact ID layout — a recent custom epoch and fewer worker/sequence bits — so generated IDs stay below 62^7; naively truncating an 11-char encoding would reintroduce collisions
This is collision-free by construction — Snowflake guarantees uniqueness. No DB round-trip to check.
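A minimal Snowflake-style generator, sketched under assumptions: 41-bit timestamp on a custom epoch, 10-bit worker ID, 12-bit per-millisecond sequence. The epoch value and class name are illustrative, not Twitter's exact layout.

```java
// Minimal Snowflake-style ID generator. Uniqueness within a worker comes from
// the (timestamp, sequence) pair; uniqueness across workers from the worker ID.
public class SnowflakeId {
    private static final long EPOCH = 1_700_000_000_000L; // custom epoch (illustrative)
    private final long workerId;                          // 0..1023 (10 bits)
    private long lastTimestamp = -1L;
    private long sequence = 0L;                           // 12 bits: 4096 IDs/ms/worker

    public SnowflakeId(long workerId) {
        if (workerId < 0 || workerId > 1023) throw new IllegalArgumentException("workerId");
        this.workerId = workerId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) throw new IllegalStateException("clock moved backwards");
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & 0xFFF;            // wrap within 12 bits
            if (sequence == 0) {                          // sequence exhausted this ms:
                while ((now = System.currentTimeMillis()) <= lastTimestamp) { /* spin */ }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << 22) | (workerId << 12) | sequence;
    }
}
```

IDs from a single worker are strictly increasing, which is what makes the resulting Base62 codes roughly time-ordered.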
// Java 17 — Base62 encoder using records
public record ShortCode(String value) {
private static final String ALPHABET =
"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
private static final int BASE = ALPHABET.length(); // 62
public static ShortCode fromId(long id) {
if (id <= 0) throw new IllegalArgumentException("id must be positive");
var sb = new StringBuilder();
while (id > 0) {
sb.append(ALPHABET.charAt((int)(id % BASE)));
id /= BASE;
}
return new ShortCode(sb.reverse().toString());
}
    public long toId() {
        long id = 0;
        for (char c : value.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) throw new IllegalArgumentException("invalid Base62 char: " + c);
            id = id * BASE + digit;
        }
        return id;
    }
}
301 vs 302: Use 302 Found (temporary redirect). 301 is cached permanently by browsers — you lose all analytics after the first visit. 302 hits your servers every time; that’s what you want.
API contract:
GET /{code}
→ 302 Location: https://original-long-url.com/path
→ 404 if code unknown
→ 410 Gone if expired
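The contract above can be sketched as a pure function over the stored record. `LinkRecord` and `Response` are illustrative types, with a nullable expiry standing in for the `expires_at` column.

```java
import java.util.Map;

// Sketch of the redirect contract: 302 with Location on success, 404 for
// unknown codes, 410 Gone once expires_at has passed.
public class RedirectContract {
    public record LinkRecord(String longUrl, Long expiresAtMillis) {}
    public record Response(int status, String location) {}

    public static Response handle(Map<String, LinkRecord> store, String code, long nowMillis) {
        LinkRecord rec = store.get(code);
        if (rec == null) return new Response(404, null);              // unknown code
        if (rec.expiresAtMillis() != null && nowMillis >= rec.expiresAtMillis())
            return new Response(410, null);                           // expired: Gone
        return new Response(302, rec.longUrl());                      // temporary redirect
    }
}
```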
Store in the same table with a is_custom flag. On write: check existence first (Cassandra lightweight transaction — INSERT IF NOT EXISTS). Charge custom aliases against a user’s quota at the gateway.
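The claim semantics are first-writer-wins. As an in-memory stand-in for Cassandra's `INSERT ... IF NOT EXISTS` lightweight transaction, `ConcurrentHashMap#putIfAbsent` gives the same atomic check-and-set behavior; `AliasRegistry` is an illustrative name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Custom-alias claim semantics: the first writer wins, later writers learn the
// alias is taken. putIfAbsent is an in-memory stand-in for Cassandra's LWT.
public class AliasRegistry {
    private final ConcurrentMap<String, String> aliases = new ConcurrentHashMap<>();

    /** Returns true if the alias was claimed; false if someone else holds it. */
    public boolean claim(String alias, String longUrl) {
        return aliases.putIfAbsent(alias, longUrl) == null;
    }
}
```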
Cassandra table (optimised for point lookups by short_code):
| Column | Type | Notes |
|---|---|---|
| short_code | text | Partition key — even distribution |
| long_url | text | Up to 2KB |
| created_at | timestamp | Creation time |
| expires_at | timestamp | NULL = never; Cassandra TTL set on write |
| user_id | uuid | Owner (nullable for anonymous) |
| is_custom | boolean | Vanity slug flag |
| click_count | counter | Approximate; updated async (Cassandra counters must live in a separate counter table) |
Secondary access pattern — “list all links for user_id”: separate table links_by_user (user_id, created_at, short_code) — materialised view in Cassandra or a separate write on creation.
Indexes: None on the primary table — Cassandra discourages secondary indexes at this scale. Use the materialised view pattern.
Partitioning: short_code as partition key distributes evenly. Avoid using user_id as partition key on the main table — a power user would create a hot partition.
TTL: Set Cassandra’s native TTL on write for expiring links. No background sweep job needed.
| Option | Pros | Cons | When to use |
|---|---|---|---|
| Hash (MD5 + truncate) | Deterministic dedup | Collision handling needed, SHA overhead | Tiny scale, dedup is critical |
| Auto-increment (DB sequence) | Simple, no collision | Single point of failure, predictable/guessable | Prototypes only |
| Snowflake ID + Base62 | Collision-free, decentralised, sortable | Slightly longer code; worker ID coordination | This design — production |
| Random + DB existence check | Easy to understand | DB round-trip per write, thundering herd on hot URLs | Low write volume |
Conclusion: Snowflake + Base62. Decentralised, roughly time-ordered (good for time-range queries), zero collision risk.
| Option | Pros | Cons |
|---|---|---|
| 301 Permanent | Client caches → zero server load | Lose analytics after first visit |
| 302 Temporary | Every redirect measurable | Higher server load |
| 307 Temporary (method-preserving) | Correct for POST forms | Rarely needed for URL shorteners |
Conclusion: 302 for analytics-first products. 301 only if you’re shutting down and want graceful offload.
| Option | Pros | Cons |
|---|---|---|
| Cassandra | Linear scale, high write throughput, built-in TTL | Eventual consistency, no joins |
| DynamoDB | Managed, predictable latency | Vendor lock-in, cost at scale |
| MySQL/Postgres | ACID, familiar | Hard to shard at this write volume |
Conclusion: Cassandra for writes + Redis for reads. Accept eventual consistency on analytics.
This system prioritises AP (Availability + Partition Tolerance). A short redirect returning a stale URL is far less harmful than a 503 error. We tolerate eventual consistency on click counts; redirect correctness is protected by Redis + Cassandra replication.
| Component | Failure | Impact | Mitigation |
|---|---|---|---|
| Redis node down | Cache miss storm | Cassandra gets full redirect traffic | Redis Cluster (3+ shards, replicas); circuit breaker to limit Cassandra blast |
| Snowflake worker restart | Possible duplicate IDs if clock skew | Collision in short codes | Workers refuse to issue IDs while the clock is behind the last seen timestamp; worker ID reassignment guard |
| Cassandra partition unavailable | Reads/writes fail for affected codes | Redirects return 503 | Replication factor 3, quorum reads (LOCAL_QUORUM); fallback to stale Redis |
| Thundering herd on viral URL | Single key hammers Redis | Redis CPU spikes | Local in-process cache (Caffeine, 1s TTL) in Redirect Service pods; read-through with mutex/singleflight |
| Hot partition (popular short code) | Cassandra token overload | Latency spike for that key | Cache is the fix — hot codes never reach Cassandra after warm-up |
| Analytics Kafka lag | Click counts delayed | Stale dashboards | Acceptable; Kafka consumer lag alert at >100K messages |
| DLQ (analytics events) | Failed event processing | Lost click data | Kafka DLQ topic; reprocessing job with idempotent consumer |
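The singleflight mitigation in the thundering-herd row deserves a concrete shape: concurrent cache misses for the same key share one in-flight load instead of each hitting Cassandra. A minimal sketch using `CompletableFuture`; the class name is illustrative.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Singleflight sketch: concurrent callers for the same key piggyback on one
// in-flight load; the loader runs at most once per key at a time.
public class SingleFlight<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    public V load(K key, Function<K, V> loader) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) return existing.join();   // piggyback on in-flight load
        try {
            V value = loader.apply(key);
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e);              // wake piggybacked callers with the error
            throw e;
        } finally {
            inFlight.remove(key, mine);                 // allow future reloads of this key
        }
    }
}
```

In production this sits in front of the Cassandra fallback in the Redirect Service, alongside the short-TTL local cache.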
AuthN/AuthZ: API key per client at the gateway. Write endpoints require a valid key. Read (redirect) is public — no auth needed.
Encryption: TLS 1.3 in transit. Cassandra encryption at rest (AES-256). Redis TLS + AUTH.
Input validation:
- URL must parse as valid `http://` or `https://` — reject `javascript:`, `data:`, `ftp:` schemes
- Max URL length: 2,048 chars (standard browser limit)
- Custom alias: alphanumeric + hyphens only, 3–30 chars, reserved words blocklist (`api`, `admin`, `health`)
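These validation rules are mechanical enough to sketch directly. `LinkValidator` and its constants mirror the rules above; the class name itself is illustrative.

```java
import java.net.URI;
import java.util.Set;
import java.util.regex.Pattern;

// Input-validation sketch: scheme allowlist, 2,048-char URL cap, and alias
// shape plus reserved-word blocklist, matching the rules listed above.
public class LinkValidator {
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https");
    private static final Set<String> RESERVED = Set.of("api", "admin", "health");
    private static final Pattern ALIAS = Pattern.compile("[A-Za-z0-9-]{3,30}");

    public static boolean isValidUrl(String url) {
        if (url == null || url.length() > 2048) return false;
        try {
            String scheme = URI.create(url).getScheme();   // null for relative URLs
            return scheme != null && ALLOWED_SCHEMES.contains(scheme.toLowerCase());
        } catch (IllegalArgumentException e) {
            return false;                                  // unparseable input
        }
    }

    public static boolean isValidAlias(String alias) {
        return alias != null && ALIAS.matcher(alias).matches()
                && !RESERVED.contains(alias.toLowerCase());
    }
}
```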
Abuse / Phishing: Async URL scanner (Google Safe Browsing API) on write. Flag malicious URLs; short code still created but returns a warning interstitial (like bit.ly’s spam page).
Rate limiting: 100 writes/min per API key (token bucket at gateway). 10,000 reads/min per IP (sliding window in Redis).
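The write-side token bucket can be sketched in a few lines. Time is passed in explicitly so refill is deterministic and testable; the constructor parameters mirror the 100 writes/min policy but the class itself is illustrative.

```java
// Token-bucket sketch for the per-key write rate limit. Tokens refill
// continuously at a fixed rate up to a burst capacity.
public class TokenBucket {
    private final double capacity;
    private final double refillPerMillis;
    private double tokens;
    private long lastRefillMillis;

    public TokenBucket(double capacity, double refillPerMinute, long nowMillis) {
        this.capacity = capacity;
        this.refillPerMillis = refillPerMinute / 60_000.0;
        this.tokens = capacity;               // start full (allows an initial burst)
        this.lastRefillMillis = nowMillis;
    }

    /** Returns true and consumes a token if the request is allowed. */
    public synchronized boolean tryAcquire(long nowMillis) {
        tokens = Math.min(capacity, tokens + (nowMillis - lastRefillMillis) * refillPerMillis);
        lastRefillMillis = nowMillis;
        if (tokens >= 1.0) { tokens -= 1.0; return true; }
        return false;
    }
}
```

At the gateway this state would live in Redis keyed by API key rather than in process memory, but the refill arithmetic is the same.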
PII/GDPR: Long URLs may contain PII (e.g. search queries). Log only short_code in access logs, never the long URL. Analytics events store hashed IP. Right-to-erasure: delete row + Redis DEL + tombstone in Cassandra.
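One way to store a hashed IP, sketched under assumptions: a salted SHA-256, with the salt management and rotation policy left out. `IpHasher` and the salt argument are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// PII sketch: analytics events carry a salted SHA-256 of the client IP
// instead of the raw address.
public class IpHasher {
    public static String hash(String ip, String salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest((salt + ip).getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);    // 64 hex chars
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);         // SHA-256 is always available
        }
    }
}
```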
Audit log: All write operations (create, delete, update expiry) written to immutable audit Kafka topic.
| Metric | Alert Threshold |
|---|---|
| Redirect latency p99 | > 50ms → warn; > 100ms → page |
| Redirect error rate (5xx) | > 0.1% → page |
| Write latency p99 | > 200ms → warn |
| Cache hit ratio | < 90% → warn |
| Metric | Alert |
|---|---|
| Redis memory usage | > 75% → warn |
| Cassandra disk usage per node | > 70% → warn |
| Kafka consumer lag (analytics) | > 100K → warn |
- Unique short URLs created per hour (anomaly = abuse spike)
- Click-through rate per link (dashboard)
- Top 100 hottest links by click volume (real-time Flink job)
OpenTelemetry spans on every redirect: gateway → redirect-svc → redis-lookup → cassandra-fallback. Trace ID injected into analytics event for correlation. Sample at 1% baseline + 100% on errors (tail-based sampling via Grafana Tempo).
- Single Redirect Service + single Redis + single Postgres
- Snowflake with one worker node
- What breaks first: Postgres read throughput at ~2K RPS
- Migrate to Cassandra (3-node cluster, RF=3)
- Redis Cluster (3 shards)
- Horizontal scale Redirect Service (stateless, 5+ pods)
- What breaks first: Redis hot keys for viral URLs
- Add local in-process cache (Caffeine, 1s TTL) in each Redirect pod — absorbs viral URL spikes without Redis
- CDN layer (Cloudflare) for top 1M hot codes — serve redirects from edge, zero origin hits
- Cassandra expand to 9 nodes across 3 AZs
- What breaks first: CDN cache invalidation on link deletion/expiry
- Full CDN-first: 95%+ of redirects served at edge via Cloudflare Workers (KV store)
- Cassandra multi-region active-active (2 regions, LOCAL_QUORUM)
- Snowflake workers per datacenter (datacenter ID in worker config)
- Analytics moves to Flink + Iceberg (real-time + historical)
- What breaks: cross-region consistency for link deletion — use tombstone TTL + CDN purge API
Brownfield integration: Enterprises often need to keep their own domain (links.acme.com). Support custom domains via CNAME → your CDN. Each domain maps to a tenant namespace in Cassandra.
Build vs Buy:
- ID generation: Build Snowflake (trivial, no vendor dependency)
- Cache: Buy — Redis Enterprise or Elasticache; operational complexity not worth owning
- Analytics pipeline: Buy — Confluent (Kafka managed) + Databricks or BigQuery
- CDN: Buy — Cloudflare or Fastly; edge redirect latency you can’t match in-house
Multi-tenancy: Partition by tenant_id as the first token in Cassandra (composite partition key: (tenant_id, short_code)). Each tenant gets rate limits, custom domain, and analytics namespace.
Vendor lock-in: Cassandra is open-source — no lock-in. If using DynamoDB, abstract behind a UrlRepository interface so you can swap. Redis is replaceable with Dragonfly or KeyDB if licensing becomes a concern.
TCO ballpark (1B redirects/month):
- 3× Cassandra `i3en.2xlarge` = ~$1,500/mo
- Redis Cluster 3× `r6g.xlarge` = ~$900/mo
- Redirect Service pods (10× `t3.medium`) = ~$300/mo
- Kafka (Confluent serverless) = ~$200/mo
- CDN (Cloudflare Pro) = ~$200/mo
- Total: ~$3,100/mo — $0.003 per 1,000 redirects
Conway’s Law: Split ownership naturally: Platform team owns ID generation + Cassandra. Product team owns Redirect + Shortener services. Data team owns analytics pipeline. Avoid the monolith temptation on Day 1 — the redirect path is hot enough to justify its own deployment.
- Clarify before drawing: Ask “Is dedup important? Should two users shortening the same URL get the same code?” — it completely changes your hash vs Snowflake decision.
- Lead with the read/write ratio: Saying “this is 100:1 read-heavy, so the entire design is a cache in front of a DB” immediately signals senior-level thinking.
- Justify 301 vs 302 explicitly — most candidates miss this. It’s a 10-second point but interviewers notice.
- Avoid the MD5 trap: Truncated hashes have collision probability that grows faster than people expect. If asked, say “I’d use ID-based encoding to guarantee uniqueness without a DB round-trip.”
- Name the thundering herd scenario: “A viral tweet sends 500K clicks in 60 seconds to one short code — here’s how I protect Redis and Cassandra.” Then explain singleflight / local cache / CDN. This separates candidates who’ve operated systems from those who’ve only designed them on paper.
Fluency vocabulary: Base62, Snowflake ID, singleflight, write-around cache, read-through cache, Cassandra TTL, LOCAL_QUORUM, 302 vs 301, CDN cache invalidation, hot partition, thundering herd.
- Cassandra Data Modeling — Official guide; partition key selection is directly applicable here
- Bitly Engineering: Building a Reliable URL Shortener — Real-world lessons from the team that runs billions of redirects
- Snowflake ID — Twitter Engineering — Original announcement and design rationale
- RFC 7231 §6.4.3 — Formal definition of 302 Found and caching semantics