SQL Flavors: Postgres vs MySQL vs SQL Server
SQL is SQL until it isn't. When you're choosing a database for a new service, the decision between PostgreSQL, MySQL, and SQL Server comes down to features, ecosystem, operational model, and political reality. Here's how to reason through it.
Postgres is the right default for most greenfield services at most companies. The reasons are concrete:
Feature set that matters in practice:
- JSONB: Binary JSON stored with indexing support. You get SQL querying power over semi-structured data. Hybrid approach: structured fields (user_id, created_at, status) as columns + flexible attributes as JSONB. This is genuinely useful — it’s not a NoSQL replacement, it’s an escape hatch for variable-schema data without leaving your transactional database.
- Window functions: `ROW_NUMBER()`, `RANK()`, `LAG()`/`LEAD()`, running totals. Essential for analytics queries that would otherwise require multiple subqueries or application-side logic.
- CTEs (Common Table Expressions): Readable, composable, recursive queries. Before Postgres 12, CTEs were always materialized; since 12 the planner can inline them, tunable per query with `MATERIALIZED`/`NOT MATERIALIZED`. Either way, it's an important optimizer consideration.
- Partial indexes: Index only rows matching a condition (`CREATE INDEX ON orders (created_at) WHERE status = 'PENDING'`). Dramatically smaller index for the queries that need it.
- LISTEN/NOTIFY: Lightweight pub/sub within Postgres. Services can subscribe to database-level events. Often used for simple event-driven patterns without introducing Kafka.
- Full-text search: Built-in `tsvector`/`tsquery`. Not Elasticsearch, but it handles many search requirements without adding another system.
- Strong type system: Native UUID, array, hstore, and range types, not just `VARCHAR` and `INT`.
- Logical replication: Feeds CDC tools (Debezium), streaming to data warehouses, and multi-region setups.
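A sketch of several of these features working together; the table and column names are illustrative, not from any particular schema:

```sql
-- Hybrid pattern: structured fields as columns, variable attributes as JSONB
CREATE TABLE events (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id    uuid        NOT NULL,
    status     text        NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    attrs      jsonb       NOT NULL DEFAULT '{}'
);

-- GIN index makes JSONB containment queries (@>) fast
CREATE INDEX events_attrs_idx ON events USING gin (attrs);

-- Partial index: only pending rows, much smaller than indexing everything
CREATE INDEX events_pending_idx ON events (created_at) WHERE status = 'PENDING';

-- Relational predicates and JSONB containment in one query
SELECT id, attrs->>'campaign'
FROM events
WHERE status = 'PENDING'
  AND attrs @> '{"source": "mobile"}';

-- Window function: rank each user's events by recency, no subquery needed
SELECT user_id, created_at,
       ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS rn
FROM events;
```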
Ecosystem and licensing: Open source (PostgreSQL License), no commercial licensing concerns, huge community, works on every cloud and on-prem.
MySQL (and its drop-in-compatible Aurora MySQL) is still a solid choice, especially if your team has deep MySQL expertise or if you’re in an environment where Aurora MySQL is the standard.
Where MySQL has historically lagged Postgres:
- Less complete SQL standard support; window functions and CTEs only arrived in MySQL 8.0, which closed much of the gap
- Weaker full-text search
- No JSONB equivalent; the JSON type exists, but indexing it generally means extracting values into generated columns
- InnoDB’s behavior around locking and MVCC differs subtly from Postgres
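For comparison, MySQL's usual JSON indexing pattern is to pull a scalar out into a generated column and index that. A sketch against MySQL 8.0; the names are illustrative:

```sql
-- MySQL 8.0: the JSON column itself isn't indexed directly
CREATE TABLE events (
    id     BIGINT AUTO_INCREMENT PRIMARY KEY,
    attrs  JSON NOT NULL,
    -- extract one JSON path into a generated column...
    source VARCHAR(32) GENERATED ALWAYS AS (attrs->>'$.source'),
    -- ...and index that column instead
    KEY idx_source (source)
);

-- This predicate can now use idx_source
SELECT id FROM events WHERE source = 'mobile';
```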
Where MySQL tends to be preferred:
- Teams with deep MySQL expertise and existing tooling around it
- High-read workloads where MySQL’s simpler replication model (binlog) is well-understood
- WordPress/PHP/LAMP ecosystem (effectively MySQL by default)
- When you’re on AWS and Aurora MySQL meets your needs — it’s extremely mature
Honest take: For new services at a company not already standardized on MySQL, Postgres is usually the better long-term choice on features. But if your DBAs know MySQL deeply and your tooling is built around it, the migration overhead to Postgres rarely pays off.
SQL Server is the right answer when:
- You’re in a .NET / Azure-first environment where SQL Server integration is deep
- The business requires SQL Server for licensing/support contract reasons
- You’re working with enterprise software (SAP, Dynamics) that runs on SQL Server
- You need features like SQL Server Reporting Services, Integration Services, or Analysis Services
SQL Server is expensive. Licensing for high-core-count servers is significant. For startups or cloud-native teams, this is usually a non-starter unless an enterprise customer or compliance requirement mandates it.
When managed makes sense:
- You don’t have DBAs. Managed services handle patching, backups, failover, and minor version upgrades.
- You want automated backups with point-in-time recovery (PITR) — essential for production.
- You need read replicas without operational complexity.
- You want automated failover (Multi-AZ RDS, Aurora).
What you give up with managed:
- Control over OS-level tuning (huge pages, filesystem settings)
- Access to certain Postgres extensions not supported by RDS
- Cost — RDS is meaningfully more expensive than a self-managed EC2 instance for the same specs
- Some advanced configurations require moving to Aurora (e.g., Aurora-specific parameters)
Self-hosted makes sense when:
- You have DBA expertise on the team
- Cost at scale justifies the operational investment
- You need specific extensions (PostGIS, TimescaleDB, pgvector) not available on managed
- You’re on-prem or in a private cloud
Aurora Postgres runs a modified Postgres engine on top of AWS's distributed, replicated storage layer. It's not the same as RDS Postgres — it's a different database that speaks Postgres.
Aurora advantages:
- Storage automatically grows (no need to provision disk upfront)
- Faster failover (~30s vs ~60–120s for Multi-AZ RDS)
- Aurora Global Database for cross-region replication, with replication lag typically under a second
- Aurora Serverless v2 for auto-scaling (minimum ACUs to maximum ACUs, scales down to near-zero)
- Up to 15 read replicas vs 5 for standard RDS
Aurora trade-offs:
- Higher cost than standard RDS (storage cost model is different)
- Some Postgres extensions and features aren’t supported
- Aurora Serverless v2 cold start latency (scaling from minimum to active) can be a problem for latency-sensitive workloads with spiky traffic
The decision: For services that need high availability, fast failover, and global replication — Aurora. For simpler needs, standard RDS Postgres is cheaper and more straightforward. Aurora’s cost model only makes sense when you’re utilizing the capabilities.
Your DB is the write bottleneck. Walk through this before reaching for sharding:
Step 1: Profile and diagnose
- Identify the slow/hot queries (pg_stat_statements)
- Check I/O wait vs CPU — are you I/O bound or CPU bound?
- Check for lock contention (pg_locks, pg_stat_activity)
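Step 1 in practice, assuming the `pg_stat_statements` extension is installed (the timing columns shown are the Postgres 13+ names):

```sql
-- Top statements by total execution time
SELECT query, calls, total_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Who is blocking whom right now
SELECT blocked.pid    AS blocked_pid,
       blocked.query  AS blocked_query,
       blocking.pid   AS blocking_pid,
       blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;
```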
Step 2: Query optimization
- Missing indexes on write-heavy tables (insertions are only half the story; queries blocking writes via long transactions are often the real issue)
- Batch writes instead of individual INSERTs (COPY or batch INSERT)
- Reduce write amplification — are you writing the same data multiple times?
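The batching advice above, sketched in SQL (table and file names are made up; `COPY` is faster still for bulk loads):

```sql
-- One round trip and one transaction instead of three
INSERT INTO orders (customer_id, amount, status)
VALUES (101, 19.99, 'PENDING'),
       (102,  5.49, 'PENDING'),
       (103, 42.00, 'PENDING');

-- For bulk loads, COPY bypasses per-statement overhead entirely
COPY orders (customer_id, amount, status)
FROM '/tmp/orders.csv' WITH (FORMAT csv);
```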
Step 3: Schema optimization
- UNLOGGED tables for data that can be reconstructed (reduces WAL overhead)
- Partial indexes to reduce index write overhead
- Partition tables by time range (pg_partman) — partition pruning speeds writes to current partition
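The schema-level options above in DDL form. This uses Postgres's native declarative partitioning; pg_partman automates creating and dropping partitions like the one shown (names are illustrative):

```sql
-- UNLOGGED: skips WAL, so writes are cheaper, but data is lost on crash.
-- Only for data you can rebuild.
CREATE UNLOGGED TABLE session_cache (
    session_id uuid PRIMARY KEY,
    payload    jsonb
);

-- Time-range partitioning: writes to the current partition touch a
-- small table and small indexes
CREATE TABLE metrics (
    recorded_at timestamptz NOT NULL,
    value       double precision
) PARTITION BY RANGE (recorded_at);

CREATE TABLE metrics_2025_01 PARTITION OF metrics
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
```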
Step 4: Hardware/instance sizing
- Upgrade to a larger instance with faster NVMe SSDs
- Memory sizing — Postgres buffer pool hit rate should be >99% for hot data
- Increase max_wal_size, checkpoint_completion_target for write-heavy workloads
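Step 4's check and knobs as one sketch; the values are starting points for a write-heavy box, not recommendations:

```sql
-- Buffer cache hit ratio across databases; aim for > 0.99 on hot data
SELECT sum(blks_hit)::float
       / nullif(sum(blks_hit) + sum(blks_read), 0) AS hit_ratio
FROM pg_stat_database;

-- Spread checkpoints out for write-heavy workloads (then reload config)
ALTER SYSTEM SET max_wal_size = '8GB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
SELECT pg_reload_conf();
```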
Step 5: Write offloading
- Queue writes through a buffer (Kafka → consumer → batch insert)
- Async write paths for non-critical data
Step 6: Connection management
- PgBouncer in transaction pooling mode: often eliminates connection overhead that masquerades as a write bottleneck
Step 7: Read replicas
- Many "write bottleneck" problems are actually read-triggered lock contention
- Moving heavy reads to replicas reduces lock pressure on primary
Step 8 (last resort): Sharding / distributed DB
- CockroachDB, CitusDB, or application-layer sharding
- This is a significant architecture change — exhaust all above options first
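For scale, Citus sharding looks deceptively simple in DDL; the hard part is choosing a distribution key that keeps most queries on a single shard. A sketch, assuming the citus extension is available on your cluster:

```sql
CREATE EXTENSION citus;

CREATE TABLE orders (
    order_id    bigint,
    customer_id bigint,
    amount      numeric,
    -- the distribution column must be part of any unique constraint
    PRIMARY KEY (customer_id, order_id)
);

-- Hash-distribute rows across worker nodes by customer_id
SELECT create_distributed_table('orders', 'customer_id');
```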
The number of teams that jump to step 8 while still at step 2 is remarkable.