System Design: Digital Wallet
A digital wallet (think PayPal, Apple Pay balance, in-app credits) is a system where correctness is non-negotiable. The design challenge: every debit must be atomic, concurrent operations must not cause double-spending, and the balance you show must always be consistent with the transaction history.
Functional:
- User has a wallet with a balance
- Deposit funds (from card, bank, another wallet)
- Withdraw funds (to bank, payment for goods/services)
- Transfer between wallets (peer-to-peer)
- View balance and transaction history
- Multi-currency (USD, EUR, etc.)
- Wallet statements and exports
Non-functional:
- No double-spending under any race condition. Two concurrent debits must not both succeed if the balance is insufficient.
- No money creation. Every credit has a corresponding debit somewhere.
- Durability. Once a transaction is confirmed to the user, it must not be lost.
- Audit trail. Every balance change traceable to a transaction.
- Consistency. Balance shown to user matches the actual ledger sum.
Never store “balance” as a mutable field in the user record. That’s a trap — two concurrent updates can both read the same balance, both subtract, and both write back without conflict (read-modify-write race condition).
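The lost update can be shown deterministically by staging the two reads before either write (a sketch of the interleaving, not a live race):

```python
# Sketch of the read-modify-write race on a mutable balance field.
# Both "requests" read before either writes, so one update is lost.
balance = 150  # the stored wallet balance

read_a = balance  # request A reads 150
read_b = balance  # request B reads 150

balance = read_a - 100  # A writes back 50
balance = read_b - 100  # B also writes back 50, clobbering A's debit

print(balance)  # 50, even though $200 was "withdrawn" from a $150 balance
```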
The correct model: double-entry ledger
Every transaction creates two ledger entries — a debit on one account and a credit on another. Balance is derived by summing ledger entries, never stored directly.
CREATE TABLE ledger_entries (
id BIGSERIAL PRIMARY KEY,
wallet_id BIGINT NOT NULL REFERENCES wallets(id),
amount NUMERIC(18, 6) NOT NULL, -- positive = credit, negative = debit
transaction_id BIGINT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Balance query:
SELECT SUM(amount) FROM ledger_entries WHERE wallet_id = 123;
-- A $50 payment from user A to user B creates:
INSERT INTO ledger_entries (wallet_id, amount, transaction_id) VALUES
(user_a_wallet, -50.00, tx_id), -- debit user A
(user_b_wallet, +50.00, tx_id); -- credit user B
-- Both in the same DB transaction — atomic
Immutable ledger: Ledger entries are never updated or deleted. Corrections are new entries (reversal = negative counterpart). This gives you a complete, auditable history.
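A minimal append-only ledger sketch (function and variable names are hypothetical; amounts are integer cents):

```python
# Append-only ledger sketch: corrections are new reversal entries,
# never updates or deletes of existing rows.
ledger = []  # (wallet_id, amount_cents, tx_id)

def post(wallet_id, amount_cents, tx_id):
    ledger.append((wallet_id, amount_cents, tx_id))  # append-only

def balance(wallet_id):
    return sum(a for w, a, _ in ledger if w == wallet_id)

post(1, 5000, "tx1")       # credit $50.00
post(1, -5000, "tx1-rev")  # correction: reversal entry; the original stays intact
print(balance(1))  # 0
```

The history remains complete: both the original entry and its reversal are visible to auditors.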
Two requests to withdraw $100 from a $150 balance arrive simultaneously. Both read balance = $150, both see “sufficient funds”, both write debit entries, and now balance is -$50. This is the double-spend problem.
Solution 1: Optimistic Locking with Version Counter
Add a version column to the wallet. Read the balance and the version. Write the debit only if the version still matches.
-- Read:
SELECT balance_cache, version FROM wallets WHERE id = 123;
-- balance_cache = 150, version = 42
-- Write (debit $100):
UPDATE wallets SET balance_cache = 50, version = 43
WHERE id = 123 AND version = 42;
-- If 0 rows updated → version changed (concurrent modification) → retry or reject
Under contention, one request passes the version check; the other sees zero rows updated, re-reads the balance, and either retries or rejects the withdrawal. This eliminates the double-spend.
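A sketch of the read-check-update loop, using SQLite in place of Postgres (the schema and the wallet row with id 123, $150, version 42 are illustrative):

```python
import sqlite3

# Optimistic-lock debit sketch; SQLite stands in for Postgres.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE wallets (id INTEGER PRIMARY KEY, balance_cache INTEGER, version INTEGER)")
db.execute("INSERT INTO wallets VALUES (123, 150, 42)")

def debit(wallet_id, amount, max_retries=3):
    for _ in range(max_retries):
        bal, ver = db.execute(
            "SELECT balance_cache, version FROM wallets WHERE id = ?", (wallet_id,)
        ).fetchone()
        if bal < amount:
            return "insufficient funds"
        cur = db.execute(
            "UPDATE wallets SET balance_cache = ?, version = ? "
            "WHERE id = ? AND version = ?",
            (bal - amount, ver + 1, wallet_id, ver),
        )
        if cur.rowcount == 1:  # version still matched: the debit took effect
            db.commit()
            return "ok"
        # 0 rows updated: concurrent modification; loop re-reads and retries
    return "conflict"

print(debit(123, 100))  # ok
print(debit(123, 100))  # insufficient funds
```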
Solution 2: Pessimistic Locking (SELECT FOR UPDATE)
Lock the wallet row for the duration of the transaction.
BEGIN;
SELECT balance_cache FROM wallets WHERE id = 123 FOR UPDATE;
-- Other transactions trying to update wallet 123 now block here
-- Perform balance check
-- Write ledger entries
-- Update balance_cache
COMMIT;
Guaranteed consistency but reduces concurrency (requests to the same wallet serialize). For high-throughput wallets (a merchant wallet receiving many payments), this becomes a bottleneck.
Recommendation: Optimistic locking for most wallets (low per-user contention). Pessimistic locking for high-contention scenarios or when you need absolute simplicity.
Computing balance from all ledger entries (SELECT SUM(amount)...) is a table scan that gets slower as entries accumulate.
Solution: Materialized balance cache + periodic snapshot
Maintain a balance_cache on the wallet row. Update it atomically when writing ledger entries. Periodically snapshot the balance (balance at a specific point in time), so balance queries only need to sum entries since the last snapshot.
-- Snapshot approach (assumes one snapshot row per wallet, upserted at each snapshot):
SELECT ws.snapshot_balance + COALESCE(SUM(le.amount), 0) AS balance
FROM wallet_snapshots ws
LEFT JOIN ledger_entries le
  ON le.wallet_id = ws.wallet_id
 AND le.created_at > ws.snapshot_at
WHERE ws.wallet_id = 123
GROUP BY ws.snapshot_balance;
The cache may drift from the sum due to bugs. Run a reconciliation job nightly: compute actual balance from ledger, compare with cache, alert on discrepancies.
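The reconciliation check itself is a few lines; a sketch assuming the job is handed ledger rows and the cached balances (all names hypothetical, amounts in integer cents):

```python
# Nightly reconciliation sketch: recompute each wallet's balance from the
# ledger and flag wallets whose cached balance disagrees.
def reconcile(ledger_rows, cached_balances):
    actual = {}
    for wallet_id, amount in ledger_rows:
        actual[wallet_id] = actual.get(wallet_id, 0) + amount
    return {  # wallet_id -> (cached, actual) for every discrepancy
        w: (cached_balances.get(w, 0), actual.get(w, 0))
        for w in set(actual) | set(cached_balances)
        if cached_balances.get(w, 0) != actual.get(w, 0)
    }

drift = reconcile([(1, 5000), (1, -2000)], {1: 3000, 2: 100})
print(drift)  # wallet 2's cache (100) disagrees with the ledger (0)
```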
A transfer between two wallets must debit one and credit the other atomically. Partial transfer (debit succeeded, credit failed) is money destruction.
BEGIN;
-- Lock wallets in consistent order (lower ID first) to prevent deadlock
SELECT id FROM wallets WHERE id IN (wallet_a, wallet_b)
ORDER BY id FOR UPDATE;
-- Check sender balance
SELECT SUM(amount) FROM ledger_entries WHERE wallet_id = wallet_a;
-- If insufficient: ROLLBACK;
-- Write both ledger entries
INSERT INTO transactions (type, amount) VALUES ('TRANSFER', 50.00) RETURNING id;
-- (the application captures the returned id and binds it as tx_id below)
INSERT INTO ledger_entries (wallet_id, amount, transaction_id) VALUES
(wallet_a, -50.00, tx_id),
(wallet_b, +50.00, tx_id);
-- Update balance caches
UPDATE wallets SET balance_cache = balance_cache - 50.00 WHERE id = wallet_a;
UPDATE wallets SET balance_cache = balance_cache + 50.00 WHERE id = wallet_b;
COMMIT;
Deadlock prevention: Always acquire locks in the same order (wallet ID ascending). If wallet A always locks before wallet B, two concurrent A→B and B→A transfers can’t deadlock.
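A sketch of the ordering rule, with in-process locks standing in for row locks (wallet ids and amounts are illustrative):

```python
import threading

# Consistent lock-ordering sketch: always lock the lower wallet id first,
# so concurrent A->B and B->A transfers cannot deadlock.
locks = {1: threading.Lock(), 2: threading.Lock()}
balances = {1: 150, 2: 0}

def transfer(src, dst, amount):
    first, second = sorted((src, dst))  # order by wallet id, not transfer direction
    with locks[first], locks[second]:
        if balances[src] < amount:
            return False  # insufficient funds: nothing written
        balances[src] -= amount
        balances[dst] += amount
        return True

transfer(1, 2, 50)  # A -> B
transfer(2, 1, 25)  # B -> A acquires the same locks in the same order
print(balances)  # {1: 125, 2: 25}
```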
Client
│
API Gateway (auth, rate limiting)
│
Wallet Service
├── Balance query → Postgres (balance_cache or computed)
├── Deposit → Postgres (ledger + balance_cache, atomic)
├── Withdrawal → Postgres (optimistic lock check, ledger, balance_cache)
├── Transfer → Postgres (both wallets, atomic)
└── Async notifications → Kafka → Notification Service
Reconciliation Job (nightly):
Compute actual balance from ledger, compare with cache, alert on drift
Statement Service:
Query ledger_entries with pagination for history/export
For large history: offload to read replica or data warehouse
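For the pagination itself, keyset pagination (paging by the last seen entry id rather than OFFSET) keeps deep statement pages fast; a sketch over in-memory rows with illustrative field names:

```python
# Keyset pagination sketch for statements: filter by id > last_seen_id
# instead of skipping OFFSET rows, so deep pages stay cheap.
def statement_page(entries, wallet_id, after_id=0, limit=2):
    rows = [e for e in entries if e["wallet_id"] == wallet_id and e["id"] > after_id]
    return rows[:limit]

entries = [{"id": i, "wallet_id": 1, "amount": -100} for i in range(1, 6)]
page1 = statement_page(entries, 1)
page2 = statement_page(entries, 1, after_id=page1[-1]["id"])
print([e["id"] for e in page2])  # [3, 4]
```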
Balance reads are frequent. Options:
- Read from primary — always consistent, adds primary load
- Read balance_cache — fast, may be milliseconds stale (acceptable for display)
- Read from replica with read-after-write consistency — user sees their own updates
For display purposes (showing balance on dashboard): read balance_cache from replica. For debit decision (can this user withdraw?): read from primary with lock.
Each wallet can hold multiple currencies. A wallet isn’t “amount X” — it’s “amounts per currency.”
CREATE TABLE wallet_balances (
wallet_id BIGINT,
currency VARCHAR(3), -- 'USD', 'EUR', 'GBP'
balance NUMERIC(18, 6),
version BIGINT,
PRIMARY KEY (wallet_id, currency)
);
Currency conversion is a separate operation (FX transaction) with a locked-in exchange rate at conversion time. Never do implicit currency conversion in a debit/credit path — make it explicit and audit-logged.
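A sketch of an explicit FX operation that records the locked-in rate on the resulting transaction record (function and field names are hypothetical):

```python
from decimal import Decimal

# Explicit FX conversion sketch: debit the USD leg, credit the EUR leg,
# and record the rate that was locked in at conversion time.
def convert(balances, wallet_id, amount_usd, rate_usd_eur):
    amount_eur = (amount_usd * rate_usd_eur).quantize(Decimal("0.01"))
    balances[(wallet_id, "USD")] -= amount_usd   # debit USD leg
    balances[(wallet_id, "EUR")] += amount_eur   # credit EUR leg
    return {"type": "FX", "rate": rate_usd_eur,
            "debit": amount_usd, "credit": amount_eur}

b = {(1, "USD"): Decimal("100.00"), (1, "EUR"): Decimal("0.00")}
tx = convert(b, 1, Decimal("50.00"), Decimal("0.92"))
print(b[(1, "USD")], b[(1, "EUR")])  # 50.00 46.00
```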
DB write fails mid-transfer:
- Transaction rollback — both ledger entries are rolled back. No partial state.
- Idempotency key: client retries → server checks if the idempotency key’s transaction already succeeded
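A sketch of the idempotency-key check, with an in-memory map standing in for a persistent store (names illustrative):

```python
# Idempotency-key sketch: a retried request with the same key replays the
# stored result instead of debiting twice.
processed = {}  # idempotency_key -> result

def withdraw(key, wallet_id, amount, balances):
    if key in processed:
        return processed[key]  # retry: replay the original outcome
    if balances[wallet_id] < amount:
        result = "insufficient funds"
    else:
        balances[wallet_id] -= amount
        result = "ok"
    processed[key] = result
    return result

balances = {1: 150}
print(withdraw("key-1", 1, 100, balances))  # ok
print(withdraw("key-1", 1, 100, balances))  # ok (replayed; no second debit)
print(balances)  # {1: 50}
```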
Optimistic lock collision:
- Two concurrent withdrawals: one succeeds, one gets a version mismatch
- The failing request retries. On retry, it re-reads the balance. If now insufficient, it returns “insufficient funds.”
Balance cache drift:
- Bug causes cache to be wrong. Reconciliation job detects it.
- Never trust the cache for debit authorization — recompute from ledger or use pessimistic lock.
- Why not use a mutable balance field? Classic read-modify-write race condition. Two concurrent reads see the same balance; both write back. You need either pessimistic locking or optimistic concurrency control.
- Why double-entry? Ensures money is always conserved (no creation or destruction). Provides complete audit trail. Industry standard in accounting.
- Numeric precision: Never use floating point for money. Use NUMERIC (exact) in Postgres, or store as integer cents. 0.1 + 0.2 != 0.3 in floating point.
- At what scale does this break? A single Postgres primary can handle thousands of wallet transactions per second. Sharding by wallet_id becomes necessary at millions of TPS — but most wallets don’t reach this scale. The bottleneck is usually hot wallets (merchant accounts receiving many payments), which need special handling (async aggregation, escrow sub-wallets).
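The floating-point pitfall can be checked directly:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly; exact types can.
print(0.1 + 0.2 == 0.3)  # False
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True
print(10 + 20 == 30)     # True: integer cents sidestep the issue entirely
```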