SQL is SQL until it isn’t. When you’re making a database selection for a new service, the choice between PostgreSQL, MySQL, and SQL Server comes down to features, ecosystem, operational model, and political reality. Here’s how to reason through it.
PostgreSQL: The Default Choice for Most New Work #
Postgres is the right default for most greenfield services at most companies. The reasons are concrete:
Feature set that matters in practice:
- JSONB: Binary JSON storage with indexing support. You get SQL querying power over semi-structured data. Hybrid approach: structured fields (user_id, created_at, status) as columns + flexible attributes as JSONB. This is genuinely useful — it’s not a NoSQL replacement, it’s an escape hatch for variable-schema data without leaving your transactional database.
- Window functions: `ROW_NUMBER()`, `RANK()`, `LAG()`/`LEAD()`, running totals — essential for analytics queries that would otherwise require multiple subqueries or application-side logic.
- CTEs (Common Table Expressions): Readable, composable, recursive queries. Postgres materialized CTEs by default before version 12; the planner can now inline them, and either behavior can be forced with `MATERIALIZED`/`NOT MATERIALIZED`, which is an important optimizer consideration.
- Partial indexes: Index only rows matching a condition (`CREATE INDEX ON orders(created_at) WHERE status = 'PENDING'`). Dramatically smaller index for the queries that need it.
- LISTEN/NOTIFY: Lightweight pub/sub within Postgres. Services can subscribe to database-level events. Often used for simple event-driven patterns without introducing Kafka.
- Full-text search: Built-in `tsvector`/`tsquery` — not Elasticsearch, but it handles many search requirements without adding another system.
- Strong type system: Native UUID, array, hstore, and range types — not just `VARCHAR` and `INT`.
- Logical replication: Feeds CDC tools (Debezium), streaming to data warehouses, multi-region setups.
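Several of these features combine naturally. A minimal sketch of the hybrid column-plus-JSONB pattern, with a partial index and a window-function query (table and index names are illustrative; `gen_random_uuid()` is built in from Postgres 13, and comes from the pgcrypto extension before that):

```sql
-- Hypothetical orders table: structured columns + flexible JSONB attributes
CREATE TABLE orders (
    order_id   uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id    bigint NOT NULL,
    status     text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    attributes jsonb NOT NULL DEFAULT '{}'
);

-- GIN index supports containment (@>) queries over the JSONB column
CREATE INDEX idx_orders_attributes ON orders USING gin (attributes);

-- Partial index: covers only PENDING rows, far smaller than a full index
CREATE INDEX idx_orders_pending ON orders (created_at)
    WHERE status = 'PENDING';

-- Mix structured and semi-structured predicates in one query
SELECT order_id, created_at
FROM orders
WHERE status = 'PENDING'
  AND attributes @> '{"channel": "mobile"}';

-- Window function: number each user's orders by recency
SELECT user_id, order_id,
       ROW_NUMBER() OVER (PARTITION BY user_id
                          ORDER BY created_at DESC) AS rn
FROM orders;
```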
Ecosystem and licensing: Open source (PostgreSQL License), no commercial licensing concerns, huge community, works on every cloud and on-prem.
MySQL: Still Valid, Specific Trade-offs #
MySQL (and its drop-in-compatible Aurora MySQL) is still a solid choice, especially if your team has deep MySQL expertise or if you’re in an environment where Aurora MySQL is the standard.
Where MySQL has historically lagged Postgres:
- Less complete SQL standard support (historically lacked window functions in older versions, CTEs, etc.) — much of this was addressed in MySQL 8.0
- Weaker full-text search
- No JSONB equivalent — has JSON type but indexing is more limited
- InnoDB’s behavior around locking and MVCC differs subtly from Postgres
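The JSON indexing gap is workable in MySQL 8.0 via generated columns: you can’t index a JSON value directly, but you can index a scalar extracted from it. A sketch (table and column names are illustrative):

```sql
-- MySQL 8.0: extract a JSON scalar into a generated column, then index it
CREATE TABLE events (
    id         BIGINT AUTO_INCREMENT PRIMARY KEY,
    payload    JSON NOT NULL,
    -- ->> unquotes the extracted value; STORED columns are indexable
    event_type VARCHAR(64)
        GENERATED ALWAYS AS (payload->>'$.type') STORED,
    INDEX idx_event_type (event_type)
);

-- The optimizer can now use idx_event_type for this predicate
SELECT id FROM events WHERE event_type = 'signup';
```

This works, but it requires declaring each indexed path up front — Postgres’s GIN index over a whole JSONB column needs no such per-path declaration.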
Where MySQL tends to be preferred:
- Teams with deep MySQL expertise and existing tooling around it
- High-read workloads where MySQL’s simpler replication model (binlog) is well-understood
- WordPress/PHP/LAMP ecosystem (effectively MySQL by default)
- When you’re on AWS and Aurora MySQL meets your needs — it’s extremely mature
Honest take: For new services at a company not already standardized on MySQL, Postgres is usually the better long-term choice on features. But if your DBAs know MySQL deeply and your tooling is built around it, the migration overhead to Postgres rarely pays off.
SQL Server: Enterprise, Windows, and Microsoft Shops #
SQL Server is the right answer when:
- You’re in a .NET / Azure-first environment where SQL Server integration is deep
- The business requires SQL Server for licensing/support contract reasons
- You’re working with enterprise software (SAP, Dynamics) that runs on SQL Server
- You need features like SQL Server Reporting Services, Integration Services, or Analysis Services
SQL Server is expensive. Licensing for high-core-count servers is significant. For startups or cloud-native teams, this is usually a non-starter unless an enterprise customer or compliance requirement mandates it.
Managed Cloud SQL: When to Use RDS / Aurora / Cloud SQL #
When managed makes sense:
- You don’t have DBAs. Managed services handle patching, backups, failover, and minor version upgrades.
- You want automated backups with point-in-time recovery (PITR) — essential for production.
- You need read replicas without operational complexity.
- You want automated failover (Multi-AZ RDS, Aurora).
What you give up with managed:
- Control over OS-level tuning (huge pages, filesystem settings)
- Access to certain Postgres extensions not supported by RDS
- Cost — RDS is meaningfully more expensive than a self-managed EC2 instance for the same specs
- Some advanced configurations require moving to Aurora entirely (e.g., Aurora-specific parameters)
Self-hosted makes sense when:
- You have DBA expertise on the team
- Cost at scale justifies the operational investment
- You need specific extensions (PostGIS, TimescaleDB, pgvector) not available on managed
- You’re on-prem or in a private cloud
Aurora vs Standard RDS Postgres #
Aurora Postgres is a Postgres-compatible engine built on a distributed, replicated storage layer. It’s not the same as RDS Postgres — it’s a different database that speaks Postgres.
Aurora advantages:
- Storage automatically grows (no need to provision disk upfront)
- Faster failover (~30s vs ~60–120s for Multi-AZ RDS)
- Aurora Global Database for cross-region replication with single-digit-millisecond replication lag
- Aurora Serverless v2 for auto-scaling (minimum ACUs to maximum ACUs, scales down to near-zero)
- Up to 15 read replicas vs 5 for standard RDS
Aurora trade-offs:
- Higher cost than standard RDS (storage cost model is different)
- Some Postgres extensions and features aren’t supported
- Aurora Serverless v2 cold start latency (scaling from minimum to active) can be a problem for latency-sensitive workloads with spiky traffic
The decision: For services that need high availability, fast failover, and global replication — Aurora. For simpler needs, standard RDS Postgres is cheaper and more straightforward. Aurora’s cost model only makes sense when you’re utilizing the capabilities.
Heavy Write Bottleneck: Decision Tree #
Your DB is the write bottleneck. Walk through this before reaching for sharding:
Step 1: Profile and diagnose
- Identify the slow/hot queries (pg_stat_statements)
- Check I/O wait vs CPU — are you I/O bound or CPU bound?
- Check for lock contention (pg_locks, pg_stat_activity)
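Step 1 usually comes down to a couple of catalog queries (pg_stat_statements must be loaded via `shared_preload_libraries`; note the timing column is named `total_time` before Postgres 13):

```sql
-- Top statements by cumulative execution time
SELECT query, calls, total_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Sessions currently waiting on locks
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```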
Step 2: Query optimization
- Missing indexes on write-heavy tables (insertions are only half the story — queries blocking writes via long transactions are often the real issue)
- Batch writes instead of individual INSERTs (COPY or batch INSERT)
- Reduce write amplification — are you writing the same data multiple times?
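The batching advice in step 2 looks like this in practice (a sketch with an illustrative `metrics` table; `COPY ... FROM` reads a server-side file and needs the appropriate privilege — use `\copy` from psql for client-side files):

```sql
-- Individual autocommitted INSERTs: one round trip and one commit each
INSERT INTO metrics (ts, value) VALUES (now(), 1.0);

-- Batched multi-row INSERT: one statement, one round trip, one commit
INSERT INTO metrics (ts, value) VALUES
    (now(), 1.0),
    (now(), 2.0),
    (now(), 3.0);

-- COPY: the fastest bulk-load path, bypassing per-row statement overhead
COPY metrics (ts, value) FROM '/tmp/metrics.csv' WITH (FORMAT csv);
```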
Step 3: Schema optimization
- UNLOGGED tables for data that can be reconstructed (reduces WAL overhead)
- Partial indexes to reduce index write overhead
- Partition tables by time range (pg_partman) — per-partition indexes stay small, which keeps insert overhead down, and old data can be detached cheaply
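The step 3 techniques in DDL form (names are illustrative; pg_partman automates creating partitions like the one shown):

```sql
-- Time-range partitioned table: inserts are routed to the right partition,
-- and each partition's indexes stay small
CREATE TABLE events (
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_06 PARTITION OF events
    FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');

-- UNLOGGED: skips WAL entirely — much cheaper writes, but the table is
-- truncated on crash recovery, so only use it for reconstructible data
CREATE UNLOGGED TABLE session_cache (
    session_id text PRIMARY KEY,
    data       jsonb
);
```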
Step 4: Hardware/instance sizing
- Upgrade to a larger instance with faster NVMe SSDs
- Memory sizing — the Postgres buffer cache hit rate should be >99% for hot data
- Increase max_wal_size, checkpoint_completion_target for write-heavy workloads
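The step 4 WAL settings as a config fragment — these are illustrative starting points, not prescriptions; tune them against your own checkpoint frequency and I/O metrics:

```ini
# postgresql.conf — illustrative values for a write-heavy workload
max_wal_size = 8GB                    # fewer forced checkpoints
checkpoint_completion_target = 0.9    # spread checkpoint I/O over the interval
shared_buffers = 8GB                  # ~25% of RAM is a common starting point
wal_compression = on                  # trade CPU for reduced WAL volume
```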
Step 5: Write offloading
- Queue writes through a buffer (Kafka → consumer → batch insert)
- Async write paths for non-critical data
Step 6: Connection management
- PgBouncer in transaction pooling mode — often eliminates connection overhead that masquerades as a write bottleneck
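A minimal PgBouncer sketch for transaction pooling (database name and pool sizes are illustrative):

```ini
; pgbouncer.ini — transaction pooling sketch
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_port = 6432
pool_mode = transaction      ; server connection released at transaction end
default_pool_size = 20       ; server connections per user/database pair
max_client_conn = 1000       ; many app connections share few server ones
```

Note that transaction pooling breaks session-level features (prepared statements in older PgBouncer versions, session `SET`s, advisory locks held across transactions), so verify your driver and workload are compatible before switching.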
Step 7: Read replicas
- Many "write bottleneck" problems are actually read-triggered lock contention
- Moving heavy reads to replicas reduces lock pressure on primary
Step 8 (last resort): Sharding / distributed DB
- CockroachDB, CitusDB, or application-layer sharding
- This is a significant architecture change — exhaust all the options above first

The number of teams that jump to step 8 while still being at step 2 is remarkable.