There's a pretty cool article about the scale of S3: Building and operating a pretty big storage system called S3.
It's pretty long: ~6.5k words, ~36k chars. Here's my tl;dr.
Numbers
- 280 trillion objects
- 100 million requests per second
- 125 billion event notifications per day to serverless applications
- 100PB data moved per week for S3 Replication
- 1PB per day restored from Glacier
- 4 billion checksums per second
- Millions of hard disks
280 TRILLION objects, with a T! 4 billion checksums per second! MILLIONS of hard disks!
Holy fucking shit.
I suspected Amazon dealt with numbers like these, since it hosts a large chunk of the internet, but seeing them spelled out still feels unreal.
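To make the big numbers a bit more tangible, here's a quick back-of-envelope conversion of the weekly/daily figures into sustained per-second rates (decimal units; just arithmetic on the article's numbers):

```python
# Back-of-envelope: turn the article's headline numbers into per-second rates.
SECONDS_PER_DAY = 86_400
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY

replication_bytes_per_week = 100e15   # 100 PB moved per week by S3 Replication
glacier_bytes_per_day = 1e15          # 1 PB restored from Glacier per day

replication_gb_per_sec = replication_bytes_per_week / SECONDS_PER_WEEK / 1e9
glacier_gb_per_sec = glacier_bytes_per_day / SECONDS_PER_DAY / 1e9

print(f"Replication: ~{replication_gb_per_sec:.0f} GB/s sustained")   # ~165 GB/s
print(f"Glacier restores: ~{glacier_gb_per_sec:.1f} GB/s sustained")  # ~11.6 GB/s
```

So Replication alone sustains roughly the write bandwidth of hundreds of fast SSDs, around the clock.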
Hard disk bit error rates
Hard disks have a bit error rate of roughly 1 in 10^15 bits read. Mortal humans usually don't even need to think about this. S3 actually encounters bit errors routinely and has to account for them.
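A quick sketch of why: multiply that tiny rate by a huge fleet. The fleet size and per-disk throughput below are my own illustrative assumptions (the article only says "millions of disks"), not figures from the article:

```python
# Rough sketch: expected media bit errors across a large disk fleet.
# ASSUMPTIONS (mine, for illustration): 1 million disks, each sustaining
# 50 MB/s of reads. Only the 1-in-10^15 bit error rate is from the post.
BIT_ERROR_RATE = 1e-15                  # errors per bit read
disks = 1_000_000                       # assumed fleet size
read_bytes_per_sec_per_disk = 50e6      # assumed per-disk read throughput

bits_read_per_sec = disks * read_bytes_per_sec_per_disk * 8
errors_per_sec = bits_read_per_sec * BIT_ERROR_RATE
errors_per_day = errors_per_sec * 86_400

print(f"~{errors_per_sec:.1f} bit errors/sec, ~{errors_per_day:,.0f}/day")
```

Even under these modest assumptions that's tens of thousands of flipped bits a day, which is presumably why S3 runs billions of checksum computations per second.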
How to deal with eventual consistency
Don't. Just let there be inconsistency — it isn't a big deal.
Dynamo was eventually consistent, so it was possible for your shopping cart to be wrong. [...] ultimately, these conflicts were rare, and you could resolve them by getting support staff involved and making a human decision.
This is a key takeaway for me. If a problem is rare enough to hand-wave away even at Amazon's scale, it's almost certainly rare enough for me to ignore.
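The Dynamo stance above can be sketched in a few lines: auto-merge what you can (e.g. take the union of divergent cart replicas) and just flag the rare disagreement for a human. The data shapes here are made up for illustration:

```python
# Toy sketch of Dynamo-style conflict handling: merge automatically,
# escalate rare disagreements to a human instead of preventing them upfront.

def merge_carts(replica_a: set[str], replica_b: set[str]) -> set[str]:
    """Auto-resolve divergent cart replicas by taking the union."""
    return replica_a | replica_b

def resolve(replica_a: set[str], replica_b: set[str]):
    merged = merge_carts(replica_a, replica_b)
    # If the replicas disagreed at all, flag it; support staff only ever
    # see the rare cart that a customer actually disputes.
    needs_human_review = replica_a != replica_b
    return merged, needs_human_review

cart, review = resolve({"book", "mug"}, {"book", "lamp"})
print(sorted(cart), review)  # ['book', 'lamp', 'mug'] True
```

The point isn't the merge function; it's that the escalation path is cheap because conflicts are rare.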
Leadership, Motivation, Ownership
Explain what the problem is and let people come up with their own solutions.
It’s a lot harder to get invested in an idea that you don’t own. I consciously spend a lot more time trying to develop problems, and to do a really good job of articulating them, rather than trying to pitch solutions.
Super biased and with a ton of caveats, but the idea is interesting.