Tagged “distributed-systems”
-
MIT 6.824: Lecture 20 - Blockstack
·
7 min read
The final post in this lecture series is about Blockstack. Blockstack is a network for building decentralized applications based on blockchain. Decentralized applications promise to give users more ownership and control of their data. Here, I'll start with an overview of how a decentralized app might work before describing Blockstack.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 19 - Bitcoin
·
9 min read
Following the lecture on Certificate Transparency, we're exploring Bitcoin, another open system comprising mutually untrustworthy components. Bitcoin is a digital currency for making online payments. I'll start this post by making a case for digital currencies, before describing Bitcoin and how it addresses the double-spending problem.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 18 - Certificate Transparency
·
6 min read
The systems we have seen so far are closed systems for which we have assumed that all the participants are trustworthy. But in an open system like the Web where anyone can take part, and there is no universally trusted authority, trust and security are top-level issues to address. Certificate Transparency is one approach to ensuring trust and improving security on the Web.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 17 - Causal Consistency, COPS
·
6 min read
In studying distributed systems, I've come across systems like Spanner, which incurs additional latency for strong consistency, and DynamoDB, which sacrifices strong consistency for low latency in responding to requests. This latency vs consistency tradeoff is one that many systems have to make, including COPS—this lecture's focus—but it offers a unique balance.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 16 - Scaling Memcache at Facebook
·
10 min read
This lecture is about building systems at scale. I'll start this post by describing how a website's architecture might evolve to cope with increasing load, before highlighting Facebook's use of memcached to support the world's largest social network.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 15 - Spark
·
5 min read
In the first lecture of this series, I wrote about MapReduce as a distributed computation framework. MapReduce partitions the input data across worker nodes, which process data in two stages: map and reduce. While MapReduce was innovative, it was inefficient for iterative and more complex computations. Researchers at UC Berkeley invented Spark to deal with these limitations.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 14 - Optimistic Concurrency Control
·
9 min read
This lecture on optimistic concurrency control is based on a system called FaRM. FaRM is a main memory distributed computing platform that provides distributed transactions with strict serializability, high performance, durability and high availability by taking advantage of two hardware trends. Here, I explain how FaRM uses these techniques to perform faster and yield far greater throughput than Spanner for simple transactions.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 13 - Spanner
·
10 min read
Spanner is a rare example of a distributed database that supports externally consistent distributed transactions. Many other databases either choose not to implement distributed transactions at all, or opt for weaker consistency models because of the performance cost involved. In this post, we'll learn how Google's TrueTime API enables Spanner to provide this guarantee with good performance.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 12 - Distributed Transactions
·
5 min read
Distributed databases typically divide their tables into partitions spread across different servers which get accessed by many clients. In these databases, client transactions often span the different servers, as the transactions may need to read from various partitions. A distributed transaction is a database transaction which spans multiple servers. This post will detail how databases guarantee some ACID properties when executing distributed transactions.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 11 - Cache Consistency, Frangipani
·
6 min read
The ideal distributed file system would guarantee that all its users have coherent access to a shared set of files and be easily scalable. It would also be fault-tolerant and require minimal human administration. This post will cover how Frangipani approximates this ideal, with a focus on how it provides a consistent view of shared files while maintaining a cache for each user.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 10 - Cloud Replicated DB, Aurora
·
6 min read
Amazon Aurora is a distributed database service provided by AWS. The paper describes the considerations in building a database for the cloud and details how Aurora's architecture differs from many traditional databases today. This post will explain how traditional databases work and then highlight how Aurora provides great performance through quorum writes and by building a database around the log.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 9 - CRAQ
·
6 min read
Many distributed systems today sacrifice stronger consistency guarantees for the sake of greater availability and higher throughput. CRAQ, which stands for Chain Replication with Apportioned Queries, is a system designed to challenge this trade-off. CRAQ's approach differs from the replication techniques we have seen so far, such as Raft's, and it improves on the original form of Chain Replication. This post will start by presenting the Chain Replication approach, before describing how CRAQ improves on it.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 8 - ZooKeeper
·
8 min read
Can the coordination of distributed systems be handled by a stand-alone general-purpose service? If so, what should the API of that service look like? In addition, can we improve the performance of a system by N times if we add N times as many replica servers? This post will focus on answering these questions using the ZooKeeper system as a case study.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lectures 6 & 7 - Fault Tolerance (Raft)
·
14 min read
One common pattern in the systems we have discussed so far, such as MapReduce, GFS, and VMware FT, is that they all rely on a single entity to make the key decisions. While this makes it easier for the system to reach a decision, the downside is that the entity becomes a single point of failure. In this post, we'll learn how the Raft consensus algorithm solves this problem.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 5 - Go, Threads, and Raft
·
5 min read
This post will contain some examples of good and bad Go code, using them to show common mistakes that can be made when starting to build concurrent programs, and how those can be corrected. It will cover goroutines, mutexes, condition variables, and channels.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 4 - Primary/Backup Replication
·
10 min read
Replication is one way to make applications more fault tolerant. Using the VMware FT system as a case study, we'll discuss the different ways in which replication can be implemented, the challenges associated with each approach, and some acceptable tradeoffs that can be made when implementing replication in a system.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 3 - GFS
·
10 min read
Building distributed storage is a hard problem. In this lecture, we'll examine the Google File System and highlight the challenges involved in building distributed storage. We'll consider the tradeoffs that were made in building this system, and briefly discuss why Google eventually had to build a successor to GFS.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 2 - RPC and Threads
·
6 min read
Concurrency is important in distributed systems, but it is easy to get wrong. In this post, we'll learn about threads and why one needs to be careful when dealing with them. We'll also discuss an alternative to managing multiple threads, along with some of its downsides. Finally, we'll discuss Remote Procedure Calls (RPC), a technique for client-server communication, highlighting their challenges and semantics.
mit-6.824 distributed-systems learning-diary -
MIT 6.824: Lecture 1 - MapReduce
·
4 min read
I started a study group with some friends where we will be going through the MIT 6.824: Distributed Systems course. Over the next couple of weeks, I intend to upload my notes from studying each week's material. This post contains my notes from the first lecture on the MapReduce paradigm for large-scale data processing.
mit-6.824 distributed-systems learning-diary -
Consistency Models
·
3 min read
A brief description of Serial Consistency, External Consistency and Linearizability in distributed database systems.
distributed-systems learning-diary -
Chapter 9 - Consistency and Consensus (Part Two)
·
20 min read
Second part of my notes from Chapter 9 of Martin Kleppmann's 'Designing Data-Intensive Applications' book.
distributed-systems learning-diary ddia -
Chapter 9 - Consistency and Consensus (Part One)
·
15 min read
Notes from Chapter 9 of Martin Kleppmann's 'Designing Data-Intensive Applications' book.
distributed-systems learning-diary ddia -
Chapter 8 - The Trouble with Distributed Systems
·
18 min read
My notes from Chapter 8 of Martin Kleppmann's 'Designing Data-Intensive Applications' book.
learning-diary distributed-systems ddia -
Chapter 7 - Transactions
·
21 min read
My notes from Chapter 7 of 'Designing Data-Intensive Applications' by Martin Kleppmann.
learning-diary distributed-systems ddia -
Chapter 6 - Partitioning
·
9 min read
My notes from Chapter 6 of 'Designing Data-Intensive Applications' by Martin Kleppmann.
learning-diary distributed-systems ddia -
Chapter 5 - Replication
·
20 min read
My notes from the fifth chapter of Martin Kleppmann's book: Designing Data-Intensive Applications.
distributed-systems learning-diary ddia -
Chapter 4 - Encoding and Evolution
·
6 min read
My notes from the fourth chapter of Martin Kleppmann's book: Designing Data-Intensive Applications.
learning-diary ddia distributed-systems -
Chapter 3 - Storage and Retrieval
·
18 min read
My notes from the third chapter of Martin Kleppmann's book: Designing Data-Intensive Applications.
learning-diary ddia distributed-systems -
Chapter 2 - Data Models and Query Languages
·
4 min read
My notes from the second chapter of Martin Kleppmann's book: Designing Data-Intensive Applications.
ddia distributed-systems learning-diary -
Chapter 1 - Reliable, Scalable and Maintainable Applications
·
8 min read
These are my notes from the first chapter of Martin Kleppmann's book: Designing Data-Intensive Applications.
distributed-systems learning-diary ddia -
Learning Diary: Designing Data Intensive Applications by Martin Kleppmann
·
2 min read