Let’s be clear: This is not a fun read for anyone who doesn’t possess some advanced knowledge of current applications for data partitioning and simple sharding techniques. If you glossed over the words in the last sentence, trust us, you’re out already. Here’s an example of a “helpful” graph meant to illuminate the information:
Which, to most people, comes across like this:
Ultimately, what it boils down to is this (and yes, we’re well aware that discussing tech architecture of any sort, especially summarizing it like this, is basically an open invitation to scorn): The company has all sorts of things it wants to keep track of: what you watch, when you watch it, how long you watch it for, and pretty much any other little piece of data that could possibly be useful. When Netflix started, keeping track wasn’t too hard, and even though some of the older data occasionally got lost, it was preferable to sacrifice a little responsiveness to make sure that every part of the streaming system, both the viewer’s side and the provider’s (the “nodes”), was seeing the same thing at the same time. In distributed-systems speak, the old design chose consistency over availability. But that meant that if even one node failed, a certain percentage of members wouldn’t be able to see or interact with their viewing history at all. This wasn’t going to fly with Netflix.
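For the two of you who want to see exactly why that trade-off bites, here’s a minimal sketch in Python. To be clear, every class and name below is ours, not Netflix’s, and the logic is simplified past recognition; it just shows what happens when a store insists that every replica agree before it answers:

```python
# Purely illustrative, not Netflix's actual code. The point: if reads and
# writes must hit EVERY replica so they all stay in sync, one dead replica
# makes the whole feature unavailable (consistency over availability).

class Node:
    """One toy storage replica."""
    def __init__(self):
        self.up = True      # is this replica reachable?
        self.data = {}      # member_id -> list of shows watched

class ConsistentHistoryStore:
    """Consistency-first: refuse to answer unless all replicas can agree."""
    def __init__(self, nodes):
        self.nodes = nodes

    def record_view(self, member_id, show):
        if not all(n.up for n in self.nodes):
            raise RuntimeError("a replica is down; refusing the write")
        for n in self.nodes:
            n.data.setdefault(member_id, []).append(show)

    def viewing_history(self, member_id):
        if not all(n.up for n in self.nodes):
            raise RuntimeError("a replica is down; refusing the read")
        return self.nodes[0].data.get(member_id, [])

nodes = [Node(), Node(), Node()]
store = ConsistentHistoryStore(nodes)
store.record_view("member-42", "Black Mirror S3E1")

nodes[1].up = False                       # a single node fails...
try:
    store.viewing_history("member-42")
except RuntimeError as err:
    print(err)                            # ...and every member is locked out
```

Every member’s history is perfectly in sync right up until one machine hiccups, at which point nobody can see anything.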
So, the company is modifying and upgrading its system to decouple the various pieces of its data and storage from one another, so that if one piece fails, it doesn’t knock out a bunch of other information with it; functions that previously would have crashed outright instead stay at least partially available. This will let Netflix keep collecting all that sweet, sweet data on its members while losing as little as possible when things break (there’s one last sketch of the idea below, nerds). Hooray! Now, if you’ll excuse us, we still have to finish catching up on Black Mirror.
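Okay, one more before we go. Same disclaimers as above: the services, the recent/archive split, and every name here are our inventions for illustration, not Netflix’s implementation. The gist of the decoupled approach is to break the history into independent pieces and serve whatever is still standing:

```python
# Purely illustrative. The point: split the data into independent services
# and stitch together whatever is still up, so one failure degrades the
# page instead of killing it.

class RecentHistoryService:
    """Holds the last few things a member watched (hypothetical)."""
    def __init__(self, recent):
        self.up = True
        self.recent = recent

    def get(self, member_id):
        if not self.up:
            raise RuntimeError("recent-history service down")
        return self.recent.get(member_id, [])

class ArchiveHistoryService:
    """Holds the older bulk of the viewing history (hypothetical)."""
    def __init__(self, archive):
        self.up = True
        self.archive = archive

    def get(self, member_id):
        if not self.up:
            raise RuntimeError("archive service down")
        return self.archive.get(member_id, [])

def viewing_history(member_id, recent_svc, archive_svc):
    """Serve whatever pieces are available; skip the ones that aren't."""
    history = []
    for svc in (archive_svc, recent_svc):
        try:
            history += svc.get(member_id)
        except RuntimeError:
            pass  # partial history beats no history
    return history

recent = RecentHistoryService({"member-42": ["Black Mirror S3E1"]})
archive = ArchiveHistoryService({"member-42": ["House of Cards S1E1"]})

archive.up = False  # an outage in one piece...
print(viewing_history("member-42", recent, archive))
# ['Black Mirror S3E1'] -- ...and members still get *something*
```

One broken piece now costs members a slice of their history instead of the whole page. Okay, now we’re really going.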