Some notes about system design
- NoSQL for:
- low latency
- unstructured or non relational data
- only need to serialize and deserialize (json, xml, yaml, etc)
- store massive amount of data
- consistency is not a big deal, consistent scaling.
- read frequenly, modified infrequently
- implement expiration policy
- consistency is hard for multiple regions
- eviction policy: LRU, LFU, FIFO
- Dynamic content caching??
- Cache static content near the user
- The origin returns TTL to describe how long to cache.
- Move session data out of web servers and into a cache / NoSQL.
- Message queue for async tasks, pub/sub.
- Sharding: choose a good sharding key / partition key
- reshard when a single shard is not enough, have exhausted shards due to uneven distribution.
- beware of celebrity problem, what if Justin Bieber and Lady gaga are in the same shard.
- joins are hard in sharded databases, try denormalizing.
Back of the envelope estimation
Powers of two
Power Exact Value Approx Value Bytes --------------------------------------------------------------- 7 128 8 256 10 1024 1 thousand 1 KB 16 65,536 64 KB 20 1,048,576 1 million 1 MB 30 1,073,741,824 1 billion 1 GB 32 4,294,967,296 4 GB 40 1,099,511,627,776 1 trillion 1 TB
Latency Comparison Numbers -------------------------- L1 cache reference 0.5 ns Branch mispredict 5 ns L2 cache reference 7 ns 14x L1 cache Mutex lock/unlock 25 ns Main memory reference 100 ns 20x L2 cache, 200x L1 cache Compress 1K bytes with Zippy 10,000 ns 10 us Send 1 KB bytes over 1 Gbps network 10,000 ns 10 us Read 4 KB randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD Read 1 MB sequentially from memory 250,000 ns 250 us Round trip within same datacenter 500,000 ns 500 us Read 1 MB sequentially from SSD* 1,000,000 ns 1,000 us 1 ms ~1GB/sec SSD, 4X memory HDD seek 10,000,000 ns 10,000 us 10 ms 20x datacenter roundtrip Read 1 MB sequentially from 1 Gbps 10,000,000 ns 10,000 us 10 ms 40x memory, 10X SSD Read 1 MB sequentially from HDD 30,000,000 ns 30,000 us 30 ms 120x memory, 30X SSD Send packet CA->Netherlands->CA 150,000,000 ns 150,000 us 150 ms Notes ----- 1 ns = 10^-9 seconds 1 us = 10^-6 seconds = 1,000 ns 1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns
|Availability %||Downtime per day||Downtime per year|
|99%||14.40 minutes||3.65 days|
|99.9%||1.44 minutes||8.77 hours|
|99.99%||8.64 seconds||52.60 minutes|
|99.999%||864.00 milliseconds||5.26 minutes|
|99.9999%||86.40 milliseconds||31.56 seconds|
The Twelve-Factor App
- Codebase: Use VCS (git), one codebase per app. Refactor shared code into libraries.
- Dependencies: Declare and isolate dependencies, vendor if necessary.
- Config: Use environment variables, do not group them together.
- Backing services: Treat them (MySQL, RabbitMQ) as attached resources via config.
- Build, release, run: Strict separation between these stages, releases cannot be mutated and are append-only.
- Processes: Processes are stateless and share-nothing. Implement sticky sessions using Memcached or Redis.
- Port binding: Export services via port binding.
- Concurrency: Scale out via the process model horizontally.
- Disposability: Fast startup and graceful shutdown using disposable processes.
- Dev/prod parity: Keep development, staging, and production as similar as possible, continuous deployment.
- Logs: Print unbuffered to stdout and should be captured by the executing environment.
- Admin processes: Run admin/management tasks as one-off processes.
Scalability for Dummies
- Hide servers behing a load balancer.
- Servers does not store any user related data, store sessions in an external cache.
- Denormalize and use NoSQL + cache
- Try to retrieve data from cache and use database on miss.
- Cache objects (sessions, articles, activity streams, user relations) instead of queries.
- Precomputing (pre-render static html for CDN)
- Job queues
- Twitter: Primer, CodelyTV, Exponent
- Pastebin: Primer
- Dropbox: Stanford,
- TikTok: Exponent
- Instagram: Exponent