From 39203c4167d16e4491fa063fb1c670b90ada44fe Mon Sep 17 00:00:00 2001 From: freemanzhang Date: Thu, 9 Mar 2017 10:02:19 -0800 Subject: [PATCH] restructure readme outline --- README.md | 1193 +++++++++++++++++++++++++---------------------------- 1 file changed, 570 insertions(+), 623 deletions(-) diff --git a/README.md b/README.md index 74c966f..6315395 100755 --- a/README.md +++ b/README.md @@ -20,51 +20,28 @@ - [Search](#search) - [Scale](#scale) - [System design evaluation standards](#system-design-evaluation-standards) - - [Work solution 25%](#work-solution-25%) - - [Special case 20%](#special-case-20%) - - [Analysis 25%](#analysis-25%) - - [Tradeoff 15%](#tradeoff-15%) - - [Knowledge base 15%](#knowledge-base-15%) - - [Tradeoffs](#tradeoffs) - - [Tradeoffs between latency and durability](#tradeoffs-between-latency-and-durability) - - [Tradeoffs between availability and consistency](#tradeoffs-between-availability-and-consistency) - - [Consistency](#consistency) - - [Update consistency](#update-consistency) - - [Read consistency](#read-consistency) -- [Principles of Good Software Design](#principles-of-good-software-design) - - [Simplicity](#simplicity) - - [Loose coupling](#loose-coupling) - - [Don't repeat yourself](#dont-repeat-yourself) - - [Coding to contract](#coding-to-contract) - - [Draw diagrams](#draw-diagrams) - - [Single responsibility](#single-responsibility) - - [Open-Closed principle](#open-closed-principle) - - [Dependency injection](#dependency-injection) - - [Inversion of control](#inversion-of-control) - - [Design for Scale](#design-for-scale) - - [Replication](#replication) - - [Consistency](#consistency-1) - - [Topology](#topology) - - [Master-slave vs peer-to-peer](#master-slave-vs-peer-to-peer) - - [Master-slave replication](#master-slave-replication) - - [Peer-to-peer replication](#peer-to-peer-replication) - - [Replication mode](#replication-mode) - - [Synchronous and Asynchronous](#synchronous-and-asynchronous) - - [Synchronous vs Asynchronous](#synchronous-vs-asynchronous) - - [Replication purpose](#replication-purpose) - - [High availability by creating redundancy](#high-availability-by-creating-redundancy) - - [Replication for scaling read](#replication-for-scaling-read) - - [Sharding](#sharding) - - [Benefits](#benefits) - - [Sharding key](#sharding-key) - - [Sharding function](#sharding-function) - - [Static sharding](#static-sharding) - - [Dynamic sharding](#dynamic-sharding) - - [Challenges](#challenges) - - [Cross-shard joins](#cross-shard-joins) - - [Using AUTO_INCREMENT](#using-autoincrement) - - [Distributed transactions](#distributed-transactions) +- [OO design principles](#oo-design-principles) + - [SRP: The Single Responsibility Principle](#srp-the-single-responsibility-principle) + - [OCP: The Open-Closed Principle](#ocp-the-open-closed-principle) + - [LSP: The Liskov Substitution Principle](#lsp-the-liskov-substitution-principle) + - [DIP: The Dependency-Inversion Principle](#dip-the-dependency-inversion-principle) + - [ISP: The Interface-Segregation Principle](#isp-the-interface-segregation-principle) + - [DRY: Don't repeat yourself](#dry-dont-repeat-yourself) +- [Distributed system concepts](#distributed-system-concepts) + - [CAP theorem](#cap-theorem) + - [Consistency](#consistency) + - [Update consistency](#update-consistency) + - [Read consistency](#read-consistency) + - [Replication Consistency](#replication-consistency) + - [Message queue](#message-queue) + - [Benefits](#benefits) + - [Components](#components) + - [Routing 
methods](#routing-methods) + - [Protocols](#protocols) + - [Metrics to decide which message broker to use](#metrics-to-decide-which-message-broker-to-use) + - [Challenges](#challenges) - [Networking](#networking) + - [TCP vs UDP](#tcp-vs-udp) - [HTTP](#http) - [Status code](#status-code) - [Groups](#groups) @@ -81,20 +58,13 @@ - [Stateless applications](#stateless-applications) - [Structure of a session](#structure-of-a-session) - [Server-side session vs client-side cookie](#server-side-session-vs-client-side-cookie) - - [Store session state in client-side cookies](#store-session-state-in-client-side-cookies) - - [Cookie Def](#cookie-def) - - [Cookie typical workflow](#cookie-typical-workflow) - - [Cookie Pros and cons](#cookie-pros-and-cons) - - [Store session state in server-side](#store-session-state-in-server-side) - - [Typical server-side session workflow](#typical-server-side-session-workflow) - - [Use a load balancer that supports sticky sessions:](#use-a-load-balancer-that-supports-sticky-sessions) - - [TCP vs UDP](#tcp-vs-udp) - - [SSL](#ssl) - - [Definition](#definition) - - [How does HTTPS work](#how-does-https-work) - - [How to avoid public key being modified?](#how-to-avoid-public-key-being-modified) - - [How to avoid computation consumption from PKI](#how-to-avoid-computation-consumption-from-pki) -- [Data Center Infrastructure](#data-center-infrastructure) + - [Store session state in client-side cookies](#store-session-state-in-client-side-cookies) + - [Cookie Def](#cookie-def) + - [Cookie typical workflow](#cookie-typical-workflow) + - [Cookie Pros and cons](#cookie-pros-and-cons) + - [Store session state in server-side](#store-session-state-in-server-side) + - [Typical server-side session workflow](#typical-server-side-session-workflow) + - [Use a load balancer that supports sticky sessions:](#use-a-load-balancer-that-supports-sticky-sessions) - [DNS](#dns) - [Design](#design) - [Initial design](#initial-design) @@ -116,27 +86,23 @@ - [Load balancers](#load-balancers) - [Benefits](#benefits-1) - [Round-robin algorithm](#round-robin-algorithm) - - [Hardware vs software](#hardware-vs-software) - - [HAProxy vs Nginx](#haproxy-vs-nginx) - - [Web Application Layer](#web-application-layer) - - [Apache and Nginx](#apache-and-nginx) - - [Apache vs Nginx](#apache-vs-nginx) - - [Web service Layer](#web-service-layer) - - [Design web services](#design-web-services) - - [Monolithic approach](#monolithic-approach) - - [Process](#process) - - [Benefits](#benefits-2) - - [Downsides](#downsides) - - [API-First approach](#api-first-approach) - - [Benefits](#benefits-3) - - [Downsides](#downsides-1) - - [Pragmatic approach](#pragmatic-approach) - - [Types](#types-1) - - [Function-Centric Services](#function-centric-services) - - [Resource-Centric Services](#resource-centric-services) - - [REST](#rest) + - [Security](#security) + - [SSL](#ssl) + - [Definition](#definition) + - [How does HTTPS work](#how-does-https-work) + - [How to avoid public key being modified?](#how-to-avoid-public-key-being-modified) + - [How to avoid computation consumption from PKI](#how-to-avoid-computation-consumption-from-pki) +- [NoSQL](#nosql) + - [NoSQL vs SQL](#nosql-vs-sql) + - [NoSQL flavors](#nosql-flavors) + - [Key-value](#key-value) + - [Document](#document) + - [Column-Family](#column-family) + - [Graph](#graph) +- [Scaling](#scaling) + - [Functional partitioning](#functional-partitioning) - [REST best practices](#rest-best-practices) - - [Consistency](#consistency-2) + - [Consistency](#consistency-1) - 
[Endpoint naming conventions](#endpoint-naming-conventions) - [HTTP verbs and CRUD consistency](#http-verbs-and-crud-consistency) - [Versioning](#versioning) @@ -145,13 +111,45 @@ - [Paging](#paging) - [Scaling REST web services](#scaling-rest-web-services) - [Keeping service machine stateless](#keeping-service-machine-stateless) + - [Benefits](#benefits-2) + - [Common use cases needing share state](#common-use-cases-needing-share-state) - [Caching service responses](#caching-service-responses) - - [Functional partitioning](#functional-partitioning) - - [Security](#security) + - [Cache-Control header](#cache-control-header) + - [Expires](#expires) + - [Last-Modified/If-Modified-Since/Max-age](#last-modifiedif-modified-sincemax-age) + - [ETag](#etag) + - [Vary: Authorization](#vary-authorization) + - [Functional partitioning](#functional-partitioning-1) + - [Security](#security-1) - [Throttling](#throttling) - [Use OAuth2 with HTTPS for authorization, authentication and confidentiality.](#use-oauth2-with-https-for-authorization-authentication-and-confidentiality) - [Documentation](#documentation) - [Others](#others-1) + - [Data partitioning - Sharding](#data-partitioning---sharding) + - [Sharding benefits](#sharding-benefits) + - [Sharding key](#sharding-key) + - [Sharding function](#sharding-function) + - [Static sharding](#static-sharding) + - [Dynamic sharding](#dynamic-sharding) + - [Challenges](#challenges-1) + - [Cross-shard joins](#cross-shard-joins) + - [Using AUTO_INCREMENT](#using-autoincrement) + - [Distributed transactions](#distributed-transactions) + - [Clones - Replication](#clones---replication) + - [Replication purpose](#replication-purpose) + - [High availability by creating redundancy](#high-availability-by-creating-redundancy) + - [Planning for failures](#planning-for-failures) + - [Replication for scaling read](#replication-for-scaling-read) + - [When to use](#when-to-use) + - [When not to use](#when-not-to-use) + - [Replication Topology](#replication-topology) + - [Master-slave vs peer-to-peer](#master-slave-vs-peer-to-peer) + - [Master-slave replication](#master-slave-replication) + - [Number of slaves](#number-of-slaves) + - [Peer-to-peer replication](#peer-to-peer-replication) + - [Replication mode](#replication-mode) + - [Synchronous and Asynchronous](#synchronous-and-asynchronous) + - [Synchronous vs Asynchronous](#synchronous-vs-asynchronous) - [Cache](#cache) - [Why does cache work](#why-does-cache-work) - [Cache hit ratio](#cache-hit-ratio) @@ -164,18 +162,18 @@ - [Typical caching scenarios](#typical-caching-scenarios) - [HTTP Cache](#http-cache) - [Headers](#headers-1) - - [Types](#types-2) + - [Types](#types-1) - [Browser cache](#browser-cache) - [Caching proxies](#caching-proxies) - [Reverse proxy](#reverse-proxy) - [Content delivery networks](#content-delivery-networks) - - [Scaling](#scaling) + - [Scaling](#scaling-1) - [Application objects cache](#application-objects-cache) - - [Types](#types-3) + - [Types](#types-2) - [Client-side web storage](#client-side-web-storage) - [Caches co-located with code: One located directly on your web servers.](#caches-co-located-with-code-one-located-directly-on-your-web-servers) - [Distributed cache store](#distributed-cache-store) - - [Scaling](#scaling-1) + - [Scaling](#scaling-2) - [Caching rules of thumb](#caching-rules-of-thumb) - [Cache priority](#cache-priority) - [Cache reuse](#cache-reuse) @@ -189,26 +187,35 @@ - [Def](#def-1) - [Solutions](#solutions) - [Scaling Memcached at 
Facebook](#scaling-memcached-at-facebook) - - [Message queue](#message-queue) - - [Benefits](#benefits-4) - - [Components](#components) - - [Routing methods](#routing-methods) - - [Protocols](#protocols) - - [Metrics to decide which message broker to use](#metrics-to-decide-which-message-broker-to-use) - - [Challenges](#challenges-1) - - [Data Persistent Layer](#data-persistent-layer) - - [MySQL](#mysql) - - [NoSQL](#nosql) - - [NoSQL vs SQL](#nosql-vs-sql) - - [NoSQL flavors](#nosql-flavors) - - [Key-value](#key-value) - - [Document](#document) - - [Column-Family](#column-family) - - [Graph](#graph) -- [Troubleshooting](#troubleshooting) - - [What happened if we cannot access a website](#what-happened-if-we-cannot-access-a-website) - - [What happened if a werbserver is too slow](#what-happened-if-a-werbserver-is-too-slow) - - [What should we do for increasing traffic](#what-should-we-do-for-increasing-traffic) +- [Architecture](#architecture) + - [Lambda architecture](#lambda-architecture) +- [Building blocks](#building-blocks) + - [Load balancer](#load-balancer) + - [Hardware vs software](#hardware-vs-software) + - [HAProxy vs Nginx](#haproxy-vs-nginx) + - [Web server](#web-server) + - [Apache and Nginx](#apache-and-nginx) + - [Apache vs Nginx](#apache-vs-nginx) + - [Cache](#cache-1) + - [In-memory cache - Guava cache](#in-memory-cache---guava-cache) + - [Standalone cache](#standalone-cache) + - [Memcached](#memcached) + - [Redis](#redis) + - [Database](#database) + - [DynamoDB](#dynamodb) + - [Cassandra](#cassandra) + - [Queue](#queue) + - [ActiveMQ](#activemq) + - [RabbitMQ](#rabbitmq) + - [SQS](#sqs) + - [Kafka](#kafka) + - [Data Processing](#data-processing) + - [Hadoop](#hadoop) + - [Spark](#spark) + - [EMR](#emr) + - [Stream Processing](#stream-processing) + - [Samza](#samza) + - [Storm](#storm) - [References](#references) @@ -293,23 +300,48 @@ * Denormalization # System design evaluation standards -## Work solution 25% -## Special case 20% -## Analysis 25% -## Tradeoff 15% -## Knowledge base 15% +* Work solution 25% +* Special case 20% +* Analysis 25% +* Tradeoff 15% +* Knowledge base 15% + +# OO design principles + +## SRP: The Single Responsibility Principle +* Your classes should have one single responsibility and no more. + - Take validation of an e-mail address as an example. If you place your validation logic directly in the code that creates user accounts, you will not be able to reuse it in a different context. Having validation logic separated into a distinct class would let you reuse it in multiple places and have only a single implementation. -## Tradeoffs -### Tradeoffs between latency and durability +## OCP: The Open-Closed Principle +* Create code that does not have to be modified when requirements change or when new use cases arise. "Open for extension but closed for modification" + - Requires you to break the problem into a set of smaller problems. Each of these tasks can then vary independently without affecting the reusability of remaining components. + - MVC frameworks. You have the ability to extend the MVC components by adding new routes, intercepting requests, returning different responses, and overriding default behaviors. + +## LSP: The Liskov Substitution Principle + +## DIP: The Dependency-Inversion Principle +* Dependency injection provides references to objects that the class depends on instead of allowing the class to gather the dependencies itself. 
In practice, dependency injection can be summarized as not using the "new" keyword in your classes and demanding instances of your dependencies to be provided to your class by its clients.
+* Dependency injection is an important principle and a subclass of a broader principle called inversion of control. Dependency injection is limited to object creation and assembly of its dependencies. Inversion of control, on the other hand, is a more generic idea and can be applied to different problems on different levels of abstraction.
+  - IOC is heavily used by several frameworks such as Spring, Rails and even Java EE containers. Instead of you being in control of creating instances of your objects and invoking methods, you become the creator of plugins or extensions to the framework. The IOC framework will look at the web request and figure out which classes should be instantiated and which components should be delegated to. This means your classes do not have to know when their instances are created, who is using them, or how their dependencies are put together.
-### Tradeoffs between availability and consistency
-* CAP theorem: if you get a network partition, you have to trade off consistency versus availability.
+## ISP: The Interface-Segregation Principle
+* Clients should not be forced to depend on interfaces they do not use. Prefer several small, client-specific interfaces over one general-purpose interface.
+
+## DRY: Don't repeat yourself
+* There are a number of reasons developers repeatedly waste time:
+  - Following an inefficient process
+  - Lack of automation
+  - Reinventing the wheel
+  - Copy/Paste programming
+
+# Distributed system concepts
+## CAP theorem
+* If you get a network partition, you have to trade off consistency versus availability.
  - Consistency: Every read would get the most recent write.
  - Availability: Every request received by a nonfailing node in the system must result in a response.
  - Partition tolerance: The cluster can survive communication breakages that separate the cluster into multiple partitions unable to communicate with each other.
-#### Consistency
-##### Update consistency
+## Consistency
+### Update consistency
* Def: Write-write conflicts occur when two clients try to write the same data at the same time. The result is a lost update.
* Solutions:
  - Pessimistic approach: Preventing conflicts from occurring.
@@ -320,7 +352,7 @@
* Problems of the solution: Both the pessimistic and the optimistic approach rely on a consistent serialization of the updates. Within a single server, this is obvious. But if it is more than one server, such as with peer-to-peer replication, then two nodes might apply the updates in a different order.
* Often, when people first encounter these issues, their reaction is to prefer pessimistic concurrency because they are determined to avoid conflicts. Concurrent programming involves a fundamental tradeoff between safety (avoiding errors such as update conflicts) and liveness (responding quickly to clients). Pessimistic approaches often severely degrade the responsiveness of a system to the degree that it becomes unfit for its purpose. This problem is made worse by the danger of errors such as deadlocks.
-##### Read consistency
+### Read consistency
* Def:
  - Read-write conflicts occur when one client reads inconsistent data in the middle of another client's write.
* Types:
@@ -340,62 +372,7 @@
  + Solution1: A sticky session, i.e. a session that's tied to one node. A sticky session allows you to ensure that as long as you keep read-your-writes consistency on a node, you'll get it for sessions too. The downside is that sticky sessions reduce the ability of the load balancer to do its job.
+ Solution2: Version stamps and ensure every interaction with the data store includes the latest version stamp seen by a session. - -# Principles of Good Software Design -## Simplicity -* Hide complexity and build abstractions -* Avoid overengineering - - When you try to predict every possible use case and every edge case, you lose focus on the most common use cases. Good design allows you to add more details and features later on. -* Test-driven development: - - To write tests, you assume the viewpoint of the client code using your component, rather than focusing on the internal implementation of it. This slight difference in approach results in greatly improved code design and API simplicity. -* Learn from models of simplicity - - Grails - - Hadoop - - Google Maps API - -## Loose coupling -* Promoting loose coupling - - In OO languages like Java, you can use public, protected and private key words. You want to declare as many methods as private/protected as possible. -* Avoid unnecessary coupling - - One example is when clients of a module or class need to invoke methods in a particular order for the work to be done correctly. Sometimes there are valid reasons for it, but more often it is caused by bad API design, such as the existence of initialization functions. Clients of your class/module should not have to know how you expect them to use your code. They should be able to use the public interface in any way they want. - - Avoid circular dependencies. -* Learn from models of loose coupling - - Unix command-line programs and their use of pipes. - - SLF4J. - -## Don't repeat yourself -* There are a number of reasons developers repeated waste time: - - Following an inefficient process - - Lack of automation - - Reinventing the wheel - - Copy/Paste programming - -## Coding to contract -## Draw diagrams -* Use case -* Class diagram -* Module diagram - -## Single responsibility -* Your classes should have one single responsibility and no more. - - Take validation of an e-mail address as an example. If you place your validation logic directly in the code that creates user accounts, you will not be able to reuse it in a different context. Having validation logic separated into a distinct class would let you reuse it in multiple places and have only a single implementation. - -## Open-Closed principle -* Create code that does not have to be modified when requirements change or when new use cases arise. "Open for extension but closed for modification" - - Requires you to break the problem into a set of smaller problems. Each of these tasks can then vary independently without affecting the reusability of remaining components. - - MVC frameworks. You have the ability to extend the MVC components by adding new routes, intercepting requests, returning different responses, and overriding default behaviors. - -## Dependency injection -* Dependency injection provides references to objects that the class depends on instead of allowing the class to gather the dependencies itself. In practice, dependency injection can be summarized as not using the "new" keyword in your classes and demanding instances of your dependencies to be provided to your class by its clients. - -## Inversion of control -* Dependency injection is an important principle and a subclass of a broader principle called inversion of control. Dependency injection is limited to object creation and assembly of its dependencies. 
Inversion of control, on the other hand, is a more generic idea and can be applied to different problems on different levels of abstraction. - - IOC is heavily used by several frameworks such as Spring, Rails and even Java EE containers. Instead of you being in control of creating instances of your objects and invoking methods, you become the creator of plugins or extensions to the framework. The IOC framework will look at the web request and figure out which classes should be instantiated and which components should be delegated to. This means your classes do not have to know when their instances are created, who is using them, or how their dependencies are put together. - - -## Design for Scale -### Replication -#### Consistency +### Replication Consistency * Def: Slaves could return stale data. * Reason: - Replication is usually asynchronous, and any change made on the master needs some time to replicate to its slaves. Depending on the replication lag, the delay between requests, and the speed of each server, you may get the freshest data or you may get stale data. @@ -404,158 +381,96 @@ - Cache the data that has been written on the client side so that you would not need to read the data you have just written. - Minize the replication lag to reduce the chance of stale data being read from stale slaves. -#### Topology - -##### Master-slave vs peer-to-peer - -| Types | Strengths | Weakness | -| ------------ |:----------------:|:-------------------:| -| Master-slave | | | -| p2p: Master-master | | Not a viable scalability technique. | -| p2p: Ring-based | Chain three or more masters together to create a ring. | | - - -##### Master-slave replication -* Responsibility: - - Master is reponsible for all data-modifying commands like updates, inserts, deletes or create table statements. The master server records all of these statements in a log file called a binlog, together with a timestamp, and a sequence number to each statement. Once a statement is written to a binlog, it can then be sent to slave servers. - - Slave is responsible for all read statements. -* Replication process: The master server writes commands to its own binlog, regardless if any slave servers are connected or not. The slave server knows where it left off and makes sure to get the right updates. This asynchronous process decouples the master from its slaves - you can always connect a new slave or disconnect slaves at any point in time without affecting the master. - 1. First the client connects to the master server and executes a data modification statement. The statement is executed and written to a binlog file. At this stage the master server returns a response to the client and continues processing other transactions. - 2. At any point in time the slave server can connect to the master server and ask for an incremental update of the master' binlog file. In its request, the slave server provides the sequence number of the last command that it saw. - 3. Since all of the commands stored in the binlog file are sorted by sequence number, the master server can quickly locate the right place and begin streaming the binlog file back to the slave server. - 4. The slave server then writes all of these statements to its own copy of the master's binlog file, called a relay log. - 5. Once a statement is written to the relay log, it is executed on the slave data set, and the offset of the most recently seen command is increased. - -###### Number of slaves -* It is a common practice to have two or more slaves for each master server. 
Having more than one slave machine have the following benefits: - - Distribute read-only statements among more servers, thus sharding the load among more servers - - Use different slaves for different types of queries. E.g. Use one slave for regular application queries and another slave for slow, long-running reports. - - Losing a slave is a nonevent, as slaves do not have any information that would not be available via the master or other slaves. - -##### Peer-to-peer replication -* Dual masters - - Two masters replicate each other to keep both current. This setup is very simple to use because it is symmetric. Failing over to the standby master does not require any reconfiguration of the main master, and failing back to the main master again when the standby master fails in turn is very easy. - + Active-active: Writes go to both servers, which then transfer changes to the other master. - + Active-passive: One of the masters handles writes while the other server, just keeps current with the active master - - The most common use of active-active dumal masters setup is to have the servers geographically close to different sets of users - for example, in branch offices at different places in the world. The users can then work with local server, and the changes will be replicated over to the other master so that both masters are kept in sync. - -* Circular replication - -#### Replication mode -##### Synchronous and Asynchronous -* Asynchronous: The master does not wait for the slaves to apply the changes, but instead just dispatches each change request to the slaves and assume they will catch up eventually and replicate all the changes. -* Synchronous: The master and slaves are always in sync and a transaction is not allowed to be committed on the master unless the slaves agrees to commit it as well (i.e. synchronous replication makes the master wait for all the slaves to keep up with the writes.) - -##### Synchronous vs Asynchronous -* Asynchronous replication is a lot faster than synchronous replication. Compared with asynchronous replication, synchronous replication requires extra synchronization to guarantee consistency. It is usually implemented through a protocol called two-phase commit, which guarantees consistency between the master and slaves. What makes this protocol slow is that it requires a total of four messages, including messages with the transaction and the prepare request. The major problem is not the amount of network traffic required to handle the synchronization, but the latency introduced by the network and by processing the commit on the slave, together with the fact that the commit is blocked on the master until all the slaves have acknowledged the transaction. In contrast, the master does not have to wait for the slave, but can report the transaction as committed immediately, which improves performance significantly. -* The performance of asynchronous replication comes at the price of consistency. In asynchronous replication the transaction is reported as committed immediately, without waiting for any acknowledgement from the slave. - -#### Replication purpose -##### High availability by creating redundancy -* Duplicate components - - Def: Keep duplicates around for each component - ready to take over immediately if the original component fails. - + Characteristics: Do not lose performance when switching and switching to the standby is usually faster than restructuring the system. But expensive. 
- - For example: Hot standby - + A dedicated server that just duplicates the main master. The hot standby is connected to the master as a slave, so that it reads and applies all changes. This setup is often called primary-backup configuration. - -* Create spare capacity - - Def: Have extra capacity in the system so that if a component fails, you can still handle the load. - - Characteristics: Should one of the component fail, the system will still be responding, but the capacity of the system will be reduced. - -###### Planning for failures -* Slave failures - - Because the slaves are used only for read quires, it is sufficient to inform the load balancer that the slave is missing. Then we can take the failing slave out of rotation. rebuild it and put it back. - -* Master failures - - Problems: - + All the slaves have stale data. - + Some queries may block if they are waiting for changes to arrive at the slave. Some queries may make it into the relay log of the slave and therefore will eventually be executed by the slave. No special consideration has to be taken on the behalf of these queries. - + For queries that are waiting for events that did not leave the master before it crashed, they are usually reported as failures so users should reissue the query. - - Solutions: - + If simply restart does not work - + First find out which of your slaves is most up to date. - + Then reconfigure it to become a master. - + Finally reconfigure all remaining slaves to replicate from the new master. - -* Relay failures - - For servers acting as relay servers, the situation has to be handled specially. If they fail, the remaining slaves have to be redirected to use some other relay or the master itself. - -* Disaster recovery - - Disaster does not have to mean earthquakes or floods; it just means that something went very bad for the computer and it is not local to the machine that failed. Typical examples are lost power in the data center (not necessarily because the power was lost in the city; just losing power in the building is sufficient.) - - The nature of a disaster is that many things fail at once, making it impossible to handle redundancy by duplicating servers at a single data center. Instead, it is necessary to ensure data is kept safe at another geographic location, and it is quite common for companies to ensure high availability by having different components at different offices. - -##### Replication for scaling read -###### When to use -* Scale reads: Instead of a single server having to respond to all the queries, you can have many clones sharing the load. You can keep scaling read capacity by simply adding more slaves. And if you ever hit the limit of how many slaves your master can handle, you can use multilevel replication to further distribute the load and keep adding even more slaves. By adding multiple levels of replication, your replication lag increases, as changes need to propogate through more servers, but you can increase read capacity. -* Scale the number of concurrently reading clients and the number of queries per second: If you want to scale your database to support 5,000 concurrent read connections, then adding more slaves or caching more aggressively can be a great way to go. - -###### When not to use -* Scale writes: No matter what topology you use, all of your writes need to go through a single machine. - - Although a dual master architecture appears to double the capacity for handling writes (because there are two masters), it actually doesn't. 
Writes are just as expensive as before because each statement has to be executed twice: once when it is received from the client and once when it is received from the other master. All the writes done by the A clients, as well as B clients, are replicated and get executed twice, which leaves you in no better position than before. -* Not a good way to scale the overall data set size: If you want to scale your active data set to 5TB, replication would not help you get there. The reason why replication does not help in scaling the data set size is that all of the data must be present on each of the machines. The master and each of its slave need to have all of the data. - - Def of active data set: All of the data that must be accessed frequently by your application. (all of the data your database needs to read from or write to disk within a time window, like an hour, a day, or a week.) - - Size of active data set: When the active data set is small, the database can buffer most of it in memory. As your active data set grows, your database needs to load more disk blocks because in-memory buffers are not large enough to contain enough of the active disk blocks. - - Access pattern of data set - + Like a time-window: In an e-commerce website, you use tables to store information about each purchase. This type of data is usually accessed right after the purchase and then it becomes less and less relevant as time goes by. Sometimes you may still access older transactions after a few days or weeks to update shipping details or to perform a refund, but after that, the data is pretty much dead except for an occasional report query accessing it. - + Unlimited data set growth: A website that allowed users to listen to music online, your users would likely come back every day or every week to listen to their music. In such case, no matter how old an account is, the user is still likely to log in and request her playlists on a weekly or daily basis. - -### Sharding -#### Benefits -* Scale horizontally to any size. Without sharding, sooner or later, your data set size will be too large for a single server to manage or you will get too many concurrent connections for a single server to handle. You are also likely to reach your I/O throughput capacity as you keep reading and writing more data. By using application-level sharing, none of the servers need to have all of the data. This allows you to have multiple MySQL servers, each with a reasonable amount of RAM, hard drives, and CPUs and each of them being responsible for a small subset of the overall data, queries, and read/write throughput. -* Since sharding splits data into disjoint subsets, you end up with a share-nothing architecture. There is no overhead of communication between servers, and there is no cluster-wide synchronization or blocking. Servers are independent from each other because they shared nothing. Each server can make authoritative decisions about data modifications -* You can implement in the application layer and then apply it to any data store, regardless of whether it supports sharding out of the box or not. You can apply sharding to object caches, message queues, nonstructured data stores, or even file systems. -#### Sharding key -* Determine what tables need to be sharded. A good starting point for deciding that is to look at the number of rows in the tables as well as the dependencies between the tables. - - Typically you use only a single column as partition key. 
Using multiple columns is possible but can be hard to maintain.
-  - Sharding on a column that is a primary key offers significant advantages. The reason for this is that the column should have a unique index, so that each value in the column uniquely identifies the row.
+## Message queue
+### Benefits
+* **Enabling asynchronous processing**:
+  - Defer processing of time-consuming tasks without blocking our clients. Anything that is slow or unpredictable is a candidate for asynchronous processing. Examples include:
+    + Interact with remote servers
+    + Low-value processing in the critical path
+    + Resource-intensive work
+    + Independent processing of high- and low-priority jobs
+  - Message queues enable your application to operate in an asynchronous way, but they only add value if your application is not built in an asynchronous way to begin with. If you develop in an environment like Node.js, which is built with asynchronous processing at its core, you will not benefit from a message broker that much. What is good about message brokers is that they allow you to easily introduce asynchronous processing to other platforms, like those that are synchronous by nature (C, Java, Ruby).
+* **Easier scalability**:
+  - Producers and consumers can be scaled separately. We can add more producers at any time without overloading the system. Messages that cannot be consumed fast enough will just begin to line up in the message queue. We can also scale consumers separately, as now they can be hosted on separate machines and the number of consumers can grow independently of producers.
+* **Decoupling**:
+  - All that publishers need to know is the format of the message and where to publish it. Consumers can become oblivious as to who publishes messages and why. Consumers can focus solely on processing messages from the queue. Such a high level of decoupling enables consumers and producers to be developed independently. They can even be developed by different teams using different technologies.
+* **Evening out traffic spikes**:
+  - You should be able to keep accepting requests at high rates even at times of increased traffic. Even if your publishers generate messages much faster than consumers can keep up with, you can keep enqueueing messages, and publishers do not have to be affected by a temporary capacity problem on the consumer side.
+* **Isolating failures and self-healing**:
+  - The fact that consumers' availability does not affect producers allows us to stop message processing at any time. This means that we can perform maintenance and deployments on back-end servers at any time. We can simply restart, remove, or add servers without affecting producers' availability, which simplifies deployments and server management. Instead of breaking the entire application whenever a back-end server goes offline, all that we experience is reduced throughput, but there is no reduction of availability. Reduced throughput of asynchronous tasks is usually invisible to the user, so there is no consumer impact.
-#### Sharding function
-##### Static sharding
-* Def: The sharding key is mapped to a shard identifier using a fixed assignment that never changes.
-  - Static sharding schemes run into problems when the distribution of the queries is not even.
-* Types:
-  - Range partitioning (used in HBase)
-    + Easy to implement, but the distribution can easily become uneven. For example, if you are using URIs as keys, "hot" sites will be clustered together when you actually want the opposite, to spread them out.
One shard can become overloaded and you have to split it a lot to be able to cope with the increase in load.
-  - Hash partitioning
-    + Computes a hash of the input in some manner (MD5 or SHA-1) and then uses modulo arithmetic to get a number between 1 and the number of shards.
-      * Evenly distributed, but needs a large amount of data migration (rehashing) when the number of servers changes.
-    + Consistent hashing: Guaranteed to move rows from just one old shard to the new shard. The entire hash range is shown as a ring. On the hash ring, the shards are assigned to points on the ring using the hash function. In a similar manner, the rows are distributed over the ring using the same hash function. Each shard is now responsible for the region of the ring that starts at the shard's point on the ring and continues to the next shard point. Because a region may start at the end of the hash range and wrap around to the beginning of the hash range, a ring is used here instead of a flat line.
-      - Pick a hash function which must have a big range, hence a lot of "points" on the hash ring where rows can be assigned. The most commonly used functions are MD5, SHA and Murmur hash (murmur3: -2^128 to 2^128).
-      - Less data migration, but hard to balance nodes.
-        * Unbalanced scenario 1: Machines with different processing power/speed.
-        * Unbalanced scenario 2: Ring is not evenly partitioned.
-        * Unbalanced scenario 3: The same range length can hold different amounts of data.
-      - Virtual nodes (used in Dynamo and Cassandra)
-        + Solution: Each physical node is associated with a different number of virtual nodes.
-        + Problems: Replicas of the same data should not be placed on virtual nodes that map to the same physical node.
-##### Dynamic sharding
-* The sharding key is looked up in a dictionary that indicates which shard contains the data.
-  - More flexible. You are allowed to change the location of shards and it is also easy to move data between shards if you have to. You do not need to migrate all of the data in one shot, but you can do it incrementally, one account at a time. To migrate a user, you need to lock its account, migrate the data, and then unlock it. You could usually do these migrations at night to reduce the impact on the system, and you could also migrate multiple accounts at the same time. There is an additional level of flexibility, as you can cherry-pick users and migrate them to the shards of your choice. Depending on the application requirements, you could migrate your largest or busiest clients to separate dedicated database instances to give them more capacity.
-  - Requires a centralized store called the sharding database and extra queries to find the correct shard to retrieve the data from.
+### Components
+* Message producer
+  - Locates the message queue and sends a valid message to it.
+* Message broker - where messages are sent and buffered for consumers.
+  - Be available at all times for producers and to accept their messages.
+  - Buffer messages and allow consumers to consume related messages.
+* Message consumer
+  - Receives and processes messages from the message queue.
+  - The two most common ways of implementing consumers are a "cron-like" and a "daemon-like" approach.
+    + A cron-like consumer connects periodically to the queue and checks its status. If there are messages, it consumes them and stops when the queue is empty or after consuming a certain amount of messages. This model is common in scripting languages where you do not have a persistently running application container, such as PHP, Ruby, or Perl. Cron-like is also referred to as a pull model because the consumer pulls messages from the queue. It can also be used if messages are added to the queue rarely or if network connectivity is unreliable. For example, a mobile application may try to pull the queue from time to time, assuming that the connection may be lost at any point in time.
+    + A daemon-like consumer runs constantly in an infinite loop, and it usually has a permanent connection to the message broker. Instead of checking the status of the queue periodically, it simply blocks on the socket read operation. This means that the consumer is waiting idly until messages are pushed by the message broker over the connection. This model is more common in languages with persistent application containers, such as Java, C#, and Node.js. This is also referred to as a push model because messages are pushed by the message broker onto the consumer as fast as the consumer can keep processing them.
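To make the daemon-like (push) model concrete, here is a minimal sketch using the standard JMS API against an ActiveMQ broker; the broker URL and the `tasks` queue name are hypothetical placeholders:

```java
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class DaemonConsumer {
    public static void main(String[] args) throws JMSException, InterruptedException {
        // Hypothetical broker URL and queue name - adjust for your environment.
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createQueue("tasks"));

        // Push model: the broker invokes this callback as messages arrive,
        // instead of the consumer polling the queue on a schedule.
        consumer.setMessageListener(message -> {
            try {
                if (message instanceof TextMessage) {
                    System.out.println("Processing: " + ((TextMessage) message).getText());
                }
            } catch (JMSException e) {
                e.printStackTrace();
            }
        });

        connection.start();            // begin delivery
        Thread.currentThread().join(); // block forever; work happens on broker-driven threads
    }
}
```

A cron-like consumer would instead call `consumer.receive(timeout)` from a periodic job and exit once the queue is empty.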
-##### Dynamic sharding -* The sharding key is looked up in a dictionary that indicates which shard contains the data. - - More flexible. You are allowed to change the location of shards and it is also easy to move data between shards if you have to. You do not need to migrate all of the data in one shot, but you can do it incrementally, one account at a time. To migrate a user, you need to lock its account, migrate the data, and then unlock it. You could usually do these migrations at night to reduce the impact on the system, and you could also migrate multiple accounts at the same time. There is an additional level of flexibility, as you can cherry-pick users and migrate them to the shards of your choice. Depending on the application requirements, you could migrate your largest or busiest clients to separate dedicated database instances to give them more capacity. - - Requires a centralized store called the sharding database and extra queries to find the correct shard to retrieve the data from. +### Routing methods +* Direct worker queue method + - Consumers and producers only have to know the name of the queue. + - Well suited for the distribution of time-consuming tasks such as sending out e-mails, processing videos, resizing images, or uploading content to third-party web services. +* Publish/Subscribe method + - Producers publish message to a topic, not a queue. Messages arriving to a topic are then cloned for each consumer that has a declared subscription to that topic. +* Custom routing rules + - A consumer can decide in a more flexible way what messages should be routed to its queue. + - Logging and alerting are good examples of custom routing based on pattern matching. -#### Challenges -##### Cross-shard joins -* Tricky to execute queries spanning multiple shards. The most common reason for using cross-shard joins is to create reports. This usually requires collecting information from the entire database. There are basically two approaches to solve this problem - - Execute the query in a map-reduce fashion (i.e., send the query to all shards and collect the result into a single result set). It is pretty common that running the same query on each of your servers and picking the highest of the values will not guarantee a correct result. - - Replicate all the shards to a separate reporting server and run the query there. This approach is easier. It is usually feasible, as well, because most reporting is done at specific times, is long-running, and does not depend on the current state of the database. +### Protocols +* AMQP: A standardized protocol accepted by OASIS. Aims at enterprise integration and interoperability. +* STOMP: A minimalist protocol. + - Simplicity is one of its main advantages. It supports fewer than a dozen operations, so implementation and debugging of libraries are much easier. It also means that the protocol layer does not add much performance overhead. + - But interoperability can be limited because there is no standard way of doing certain things. A good example of impaired is message prefetch count. Prefetch is a great way of increasing throughput because messages are received in batches instead of one message at a time. Although both RabbitMQ and ActiveMQ support this feature, they both implement it using different custom STOMP headers. +* JMS + - A good feature set and is popular + - Your ability to integrate with non-JVM-based languages will be very limited. 
### Metrics to decide which message broker to use
+* Number of messages published per second
+* Average message size
+* Number of messages consumed per second (this can be much higher than the publishing rate, as multiple consumers may be subscribed to receive copies of the same message)
+* Number of concurrent publishers
+* Number of concurrent consumers
+* If message persistence is needed (no message loss during message broker crash)
+* If message acknowledgement is needed (no message loss during consumer crash)
-#### Challenges
-##### Cross-shard joins
-* Tricky to execute queries spanning multiple shards. The most common reason for using cross-shard joins is to create reports. This usually requires collecting information from the entire database. There are basically two approaches to solve this problem:
-  - Execute the query in a map-reduce fashion (i.e., send the query to all shards and collect the result into a single result set). It is pretty common that running the same query on each of your servers and picking the highest of the values will not guarantee a correct result.
-  - Replicate all the shards to a separate reporting server and run the query there. This approach is easier. It is usually feasible, as well, because most reporting is done at specific times, is long-running, and does not depend on the current state of the database.
+### Challenges
+* No message ordering: Messages are processed in parallel and there is no synchronization between consumers. Each consumer works on a single message at a time and has no knowledge of other consumers running in parallel to it. Since your consumers are running in parallel and any of them can become slow or even crash at any point in time, it is difficult to prevent messages from being occasionally delivered out of order.
+  - Solutions:
+    + Limit the number of consumers to a single thread per queue
+    + Build the system to assume that messages can arrive in random order
+    + Use a messaging broker that supports partial message ordering guarantees.
+  - It is best to depend on the message broker to deliver messages in the right order by using a partial message ordering guarantee (ActiveMQ) or topic partitioning (Kafka). If your broker does not support such functionality, you will need to ensure that your application can handle messages being processed in an unpredictable order.
+    + Partial message ordering is a clever mechanism provided by ActiveMQ called message groups. Messages can be published with a special label called a message group ID. The group ID is defined by the application developer. Then all messages belonging to the same group are guaranteed to be consumed in the same order they were produced. Whenever a message with a new group ID gets published, the message broker maps the new group ID to one of the existing consumers. From then on, all the messages belonging to the same group are delivered to the same consumer. This may cause other consumers to wait idly without messages, as the message broker routes messages based on the mapping rather than random distribution.
  - Message ordering is a serious issue to consider when architecting a message-based application; the RabbitMQ, ActiveMQ and Amazon SQS messaging platforms cannot guarantee global message ordering with parallel workers. In fact, Amazon SQS is known for unpredictable ordering of messages because their infrastructure is heavily distributed and ordering of messages is not supported.
+* Message requeueing
+  - By allowing messages to be delivered to your consumers more than once, you make your system more robust and reduce constraints put on the message queue and its workers. For this approach to work, you need to make all of your consumers idempotent.
+    + But this is not an easy thing to do. Sending emails is, by nature, not an idempotent operation. Adding an extra layer of tracking and persistence could help, but it would add a lot of complexity and may not be able to handle all of the failures.
+    + Idempotent consumers may be more sensitive to messages being processed out of order. If we have two messages, one to set the product's price to $55 and another one to set the price of the same product to $60, we could end up with different results based on their processing order.
+* Race conditions become more likely
+* Risk of increased complexity
+  - When integrating applications using a message broker, you must be very diligent in documenting dependencies and the overarching message flow. Without good documentation of the message routes and visibility of how the messages flow through the system, you may increase the complexity and make it much harder for developers to understand how the system works.
-##### Using AUTO_INCREMENT
-* It is quite common to use AUTO_INCREMENT to create a unique identifier for a column. However, this fails in a sharded environment because the shards do not synchronize their AUTO_INCREMENT identifiers. This means if you insert a row in one shard, it might well happen that the same identifier is used on another shard. If you truly want to generate a unique identifier, there are basically three approaches.
-  - Generate a unique UUID. The drawback is that the identifier takes 128 bits (16 bytes).
-  - Use a composite identifier, where the first part is the shard identifier and the second part is a locally generated identifier. Note that the shard identifier is used when generating the key, so if a row with this identifier is moved, the original shard identifier has to move with it. You can solve this by maintaining, in addition to the column with the AUTO_INCREMENT, an extra column containing the shard identifier for the shard where the row was created.
-  - Use atomic counters provided by some data stores. For example, if you already use Redis, you could create a counter for each unique identifier. You would then use Redis' INCR command to atomically increment a selected counter and return the new value.
-
-##### Distributed transactions
-* Lose the ACID properties of your database as a whole. Maintaining ACID properties across shards requires you to use distributed transactions, which are complex and expensive to execute (most open-source database engines like MySQL do not even support distributed transactions).

# Networking
## TCP vs UDP

| TCP | UDP |
|-----|-----|
| Reliable: TCP is a connection-oriented protocol. When a file or message is sent, it will get delivered unless the connection fails. If the connection is lost, the server will request the lost part. There is no corruption while transferring a message. | Not Reliable: UDP is a connectionless protocol. When you send data or a message, you don't know if it'll get there; it could get lost on the way. There may be corruption while transferring a message. |
| Ordered: If you send two messages along a connection, one after the other, you know the first message will get there first. You don't have to worry about data arriving in the wrong order. | Not Ordered: If you send two messages out, you don't know what order they'll arrive in, i.e., ordering is not guaranteed. |
| Heavyweight: When the low-level parts of the TCP "stream" arrive in the wrong order, resend requests have to be sent, and all the out-of-sequence parts have to be put back together, so it requires a bit of work to piece together. | Lightweight: No ordering of messages, no tracking connections, etc. It's just fire and forget!
This means it's a lot quicker, and the network card / OS have to do very little work to translate the data back from the packets. |
| Streaming: Data is read as a "stream," with nothing distinguishing where one packet ends and another begins. There may be multiple packets per read call. | Datagrams: Packets are sent individually and are guaranteed to be whole if they arrive. One packet per one read call. |
| Examples: World Wide Web (Apache TCP port 80), e-mail (SMTP TCP port 25 Postfix MTA), File Transfer Protocol (FTP port 21) and Secure Shell (OpenSSH port 22) etc. | Examples: Domain Name System (DNS UDP port 53), streaming media applications such as IPTV or movies, Voice over IP (VoIP), Trivial File Transfer Protocol (TFTP) and online multiplayer games |

-# Networking
## HTTP
### Status code
#### Groups
@@ -667,11 +582,11 @@
| Scalability | Needs effort to scale because requests depend on server state | Easier to implement |

-### Store session state in client-side cookies
-#### Cookie Def
+#### Store session state in client-side cookies
+##### Cookie Def
* Cookies are key/value pairs used by websites to store state information on the browser. Say you have a website (example.com); when the browser requests a webpage, the website can send cookies to store information on the browser.

-#### Cookie typical workflow
+##### Cookie typical workflow

```
// Browser request example:

GET /index.html HTTP/1.1
Host: www.example.org
Cookie: foo=10; bar=20
Accept: */*
```

-#### Cookie Pros and cons
+##### Cookie Pros and cons
* Advantage: You do not have to store the session state anywhere in your data center. The entire session state is being handed to your web server with every web request, thus making your application stateless in the context of the HTTP session.
* Disadvantage: Session storage can become expensive. Cookies are sent by the browser with every single request, regardless of the type of resource being requested. As a result, all requests within the same cookie domain will have session storage appended as part of the request.
* Use case: When you can keep your data minimal. If all you need to keep in session scope is userID or some security token, you will benefit from the simplicity and speed of this solution. Unfortunately, if you are not careful, adding more data to the session scope can quickly grow into kilobytes, making web requests much slower, especially on mobile devices. The cost of cookie-based session storage is also amplified by the fact that encrypting serialized data and then Base64 encoding it increases the overall byte count by one third, so that 1KB of session scope data becomes 1.3KB of additional data transferred with each web request and web response.

-### Store session state in server-side
+#### Store session state in server-side
* Approaches:
  - Keep state in main memory
  - Store session state in files on disk
  - Delegate the session storage to an external data store: Your web application would take the session identifier from the web request and then load session data from an external data store. At the end of the web request life cycle, just before a response is sent back to the user, the application would serialize the session data and save it back in the data store. In this model, the web server does not hold any of the session data between web requests, which makes it stateless in the context of an HTTP session.
    + Many data stores are suitable for this use case, for example, Memcached, Redis, DynamoDB, or Cassandra.
The only requirement here is to have very low latency on get-by-key and put-by-key operations. It is best if your data store provides automatic scalability, but even if you had to do data partitioning yourself in the application layer, it is not a problem, as sessions can be partitioned by the session ID itself.

-#### Typical server-side session workflow
+##### Typical server-side session workflow
1. Every time an internet user visits a specific website, a new session ID (a unique number that a web site's server assigns to a specific user for the duration of that user's visit) is generated, and an entry is created inside the server's session table

| Columns | Type | Meaning |

4. Each time the user sends a request to the server, the cookie for that domain is automatically attached.
5. The server validates the sessionID inside the request. If it is valid, then the user has logged in before.

##### Use a load balancer that supports sticky sessions:
* The load balancer needs to be able to inspect the headers of the request to make sure that requests with the same session cookie always go to the server that initially issued the cookie.
* But sticky sessions break the fundamental principle of statelessness, and I recommend avoiding them. Once you allow your web servers to be unique, by storing any local state, you lose flexibility. You will not be able to restart, decommission, or safely auto-scale web servers without breaking users' sessions, because their session data will be bound to a single physical machine.

-## TCP vs UDP
-
-| TCP | UDP |
-|-----|-----|
-| Reliable: TCP is connection-oriented protocol. When a file or message send it will get delivered unless connections fails. If connection lost, the server will request the lost part. There is no corruption while transferring a message. | Not Reliable: UDP is connectionless protocol. When you a send a data or message, you don’t know if it’ll get there, it could get lost on the way. There may be corruption while transferring a message. |
-| Ordered: If you send two messages along a connection, one after the other, you know the first message will get there first. You don’t have to worry about data arriving in the wrong order. | Not Ordered: If you send two messages out, you don’t know what order they’ll arrive in i.e. no ordered |
-| Heavyweight: – when the low level parts of the TCP “stream” arrive in the wrong order, resend requests have to be sent, and all the out of sequence parts have to be put back together, so requires a bit of work to piece together. | Lightweight: No ordering of messages, no tracking connections, etc. It’s just fire and forget! This means it’s a lot quicker, and the network card / OS have to do very little work to translate the data back from the packets. |
-| Streaming: Data is read as a “stream,” with nothing distinguishing where one packet ends and another begins. There may be multiple packets per read call.
| Datagrams: Packets are sent individually and are guaranteed to be whole if they arrive. One packet per one read call. | -| Examples: World Wide Web (Apache TCP port 80), e-mail (SMTP TCP port 25 Postfix MTA), File Transfer Protocol (FTP port 21) and Secure Shell (OpenSSH port 22) etc. | Examples: Domain Name System (DNS UDP port 53), streaming media applications such as IPTV or movies, Voice over IP (VoIP), Trivial File Transfer Protocol (TFTP) and online multiplayer games | - -## SSL -### Definition -* Hyper Text Transfer Protocol Secure (HTTPS) is the secure version of HTTP, the protocol over which data is sent between your browser and the website that you are connected to. The 'S' at the end of HTTPS stands for 'Secure'. It means all communications between your browser and the website are encrypted. HTTPS is often used to protect highly confidential online transactions like online banking and online shopping order forms. - -### How does HTTPS work -* HTTPS pages typically use one of two secure protocols to encrypt communications - SSL (Secure Sockets Layer) or TLS (Transport Layer Security). Both the TLS and SSL protocols use what is known as an 'asymmetric' Public Key Infrastructure (PKI) system. An asymmetric system uses two 'keys' to encrypt communications, a 'public' key and a 'private' key. Anything encrypted with the public key can only be decrypted by the private key and vice-versa. -* As the names suggest, the 'private' key should be kept strictly protected and should only be accessible the owner of the private key. In the case of a website, the private key remains securely ensconced on the web server. Conversely, the public key is intended to be distributed to anybody and everybody that needs to be able to decrypt information that was encrypted with the private key. - -### How to avoid public key being modified? -* Put public key inside digital certificate. - - When you request a HTTPS connection to a webpage, the website will initially send its SSL certificate to your browser. This certificate contains the public key needed to begin the secure session. Based on this initial exchange, your browser and the website then initiate the 'SSL handshake'. The SSL handshake involves the generation of shared secrets to establish a uniquely secure connection between yourself and the website. - - When a trusted SSL Digital Certificate is used during a HTTPS connection, users will see a padlock icon in the browser address bar. When an Extended Validation Certificate is installed on a web site, the address bar will turn green. - -### How to avoid computation consumption from PKI -* Only use PKI to generate session key and use the session key for further communications. - - -# Data Center Infrastructure - ## DNS * Resolve domain name to IP address @@ -853,90 +737,105 @@ Accept: */* - Not all servers have identical processing power. Need to query back-end server to discover memory and CPU usage, server load, and perhaps even network latency. - How to support sticky sessions: Hashing based on network address might help but is not a reliable option. Or the load balancer could maintain a lookup table mapping session ID to server. 
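To make the lookup-table idea above concrete, here is a minimal Python sketch (illustrative only, not tied to any real load balancer product; all names are made up) of a dispatcher that uses plain round-robin by default but pins requests carrying a known session ID to the server that first served them:

```python
import itertools

class StickyRoundRobinBalancer:
    """Round-robin dispatch with an optional session-affinity lookup table."""

    def __init__(self, servers):
        self.rotation = itertools.cycle(servers)
        self.session_to_server = {}  # session ID -> pinned server

    def route(self, session_id=None):
        # A request with a known session ID sticks to its original server.
        if session_id in self.session_to_server:
            return self.session_to_server[session_id]
        server = next(self.rotation)
        if session_id is not None:
            self.session_to_server[session_id] = server
        return server

balancer = StickyRoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(balancer.route())           # plain round-robin
print(balancer.route("sess-42"))  # pinned to one server from now on
print(balancer.route("sess-42"))  # same server again
```

Note that this sketch inherits the sticky-session drawbacks discussed earlier: once sessions are pinned, servers can no longer be drained or replaced freely.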
-### Hardware vs software
-| Category | Software | Hardware |
-|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
-| Def | Run on standard PC hardware, using applications like Nginx and HAProxy | Run on special hardware and contain any software pre-installed and configured by the vendor. |
-| Model | Operate on Application Layer | Operate on network and transport layer and work with TCP/IP packets. Route traffic to backend servers and possibly handling network address translation |
-| Strength/Weakness | More intelligent because can talk HTTP (can perform the compression of resources passing through and routing-based on the presence of cookies) and more flexible for hacking in new features or changes | Higher throughput and lower latency. High purchase cost. Hardware load balancer prices start from a few thousand dollars and go as high as over 100,000 dollars per device. Specialized training and harder to find people with the work experience necessary to operate them. |

-### HAProxy vs Nginx
+## Security
+### SSL
+#### Definition
+* Hyper Text Transfer Protocol Secure (HTTPS) is the secure version of HTTP, the protocol over which data is sent between your browser and the website that you are connected to. The 'S' at the end of HTTPS stands for 'Secure'. It means all communications between your browser and the website are encrypted. HTTPS is often used to protect highly confidential online transactions like online banking and online shopping order forms.

-| Category | Nginx | HAProxy |
-|-----------|---------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
-| Strengths | Can cache HTTP responses from your servers. | A little faster than Nginx and a wealth of extra features. It can be configured as either a layer 4 or layer 7 load balancer. |

+#### How does HTTPS work
+* HTTPS pages typically use one of two secure protocols to encrypt communications - SSL (Secure Sockets Layer) or TLS (Transport Layer Security). Both the TLS and SSL protocols use what is known as an 'asymmetric' Public Key Infrastructure (PKI) system. An asymmetric system uses two 'keys' to encrypt communications, a 'public' key and a 'private' key. Anything encrypted with the public key can only be decrypted by the private key and vice-versa.
+* As the names suggest, the 'private' key should be kept strictly protected and should only be accessible to the owner of the private key. In the case of a website, the private key remains securely ensconced on the web server. Conversely, the public key is intended to be distributed to anybody and everybody that needs to be able to decrypt information that was encrypted with the private key.

-* Extra functionalities of HAProxy. It can be configured as either a layer 4 or layer 7 load balancer.
- - When HAProxy is set up to be a layer 4 proxy, it does not inspect higher-level protocols and it depends solely on TCP/IP headers to distribute the traffic. This, in turn, allows HAProxy to be a load balancer for any protocol, not just HTTP/HTTPS. You can use HAProxy to distribute traffic for services like cache servers, message queues, or databases.
- - HAProxy can also be configured as a layer 7 proxy, in which case it supports sticky sessions and SSL termination, but needs more resources to be able to inspect and track HTTP-specific information. The fact that HAProxy is simpler in design makes it perform sligthly better than Nginx, especially when configured as a layer 4 load balancer. Finally, HAProxy has built-in high-availability support.

+#### How to avoid public key being modified?
+* Put the public key inside a digital certificate.
+ - When you request an HTTPS connection to a webpage, the website will initially send its SSL certificate to your browser. This certificate contains the public key needed to begin the secure session. Based on this initial exchange, your browser and the website then initiate the 'SSL handshake'. The SSL handshake involves the generation of shared secrets to establish a uniquely secure connection between yourself and the website.
+ - When a trusted SSL Digital Certificate is used during an HTTPS connection, users will see a padlock icon in the browser address bar. When an Extended Validation Certificate is installed on a website, the address bar will turn green.

+#### How to avoid computation consumption from PKI
+* Use PKI only to generate a session key, and use the session key for further communications.

-## Web Application Layer
-### Apache and Nginx
-* Apache and Nginx could always be used together.
- - NGINX provides all of the core features of a web server, without sacrificing the lightweight and high‑performance qualities that have made it successful, and can also serve as a proxy that forwards HTTP requests to upstream web servers (such as an Apache backend) and FastCGI, memcached, SCGI, and uWSGI servers. NGINX does not seek to implement the huge range of functionality necessary to run an application, instead relying on specialized third‑party servers such as PHP‑FPM, Node.js, and even Apache.
- - A very common use pattern is to deploy NGINX software as a proxy in front of an Apache-based web application. Can use Nginx's proxying abilities to forward requests for dynamic resources to Apache backend server. NGINX serves static resources and Apache serves dynamic content such as PHP or Perl CGI scripts.

+# NoSQL
+## NoSQL vs SQL
+* There is no generally accepted definition. All we can do is discuss some common characteristics of the databases that tend to be called "NoSQL".

-### Apache vs Nginx

-| Category | Apache | Nginx |
-|--------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| History | Invented around 1990s when web traffic is low and web pages are really simple. Apache's heavyweight, monolithic model has its limit.
Tunning Apache to cope with real-world traffic efficiently is a complex art. | Heavy traffic and web pages. Designed for high concurrency. Provides 12 features including which make them appropriate for microservices. |
-| Architecture | One process/threads per connection. Each requests to be handled as a separate child/thread. | Asynchronous event-driven model. There is a single master process with one or more worker processes. |
-| Performance | To decrease page-rendering time, web browsers routinely open six or more TCP connections to a web server for each user session so that resources can download in parallel. Browsers hold these connections open for a period of time to reduce delay for future requests the user might make during the session. Each open connection exclusively reserves an httpd process, meaning that at busy times, Apache needs to create a large number of processes. Each additional process consumes an extra 4MB or 5MB of memory. Not to mention the overhead involved in creating and destroying child processes. | Can handle a huge number of concurrent requests |
-| Easier development | Very easy to insert additional code at any point in Apache's web-serving logic. Developers could add code securely in the knowledge that if newly added code is blocked, ran slowly, leaked resources, or even crashed, only the worker process running the code would be affected. Processing of all other connections would continue undisturbed | Developing modules for it isn't as simple and easy as with Apache. Nginx module developers need to be very careful to create efficient and accurate code, without any resource leakage, and to interact appropriately with the complex event-driven kernel to avoid blocking operations. |

+| Database | SQL | NoSQL |
+| --------------------- |:-------------:| ------------:|
+| Data uniformness | Uniform data. Best visualized as a set of tables. Each table has rows, with each row representing an entity of interest. Each row is described through columns. One row cannot be nested inside another. | Non-uniform data. NoSQL databases recognize that it is often necessary to operate on data in units that have a more complex structure than a set of rows. This is particularly useful in dealing with nonuniform data and custom fields. The NoSQL data model can be put into four categories: key-value, document, column-family and graph. |
+| Schema change | Define what tables exist, what columns exist, and what the data types are. Although relational schemas can actually be changed at any time with standard SQL commands, doing so comes at a high cost. | Changing the schema is casual and low cost. Essentially, a schemaless database shifts the schema into the application code. |
+| Query flexibility | Low cost to change queries. It allows you to easily look at the data in different ways. Standard SQL supports things like joins and subqueries. | High cost to change queries. It does not allow you to easily look at the data in different ways. NoSQL databases do not have the flexibility of joins or subqueries. |
+| Transactions | SQL has ACID transactions (Atomic, Consistent, Isolated, and Durable). It allows you to manipulate any combination of rows from any tables in a single transaction. This operation either succeeds or fails entirely, and concurrent operations are isolated from each other so they cannot see a partial update. | Graph databases support ACID transactions. Aggregate-oriented databases do not have ACID transactions that span multiple aggregates.
Instead, they support atomic manipulation of a single aggregate at a time. If we need to manipulate multiple aggregates in an atomic way, we have to manage that ourselves in application code. An aggregate structure may help with some data interactions but be an obstacle for others. |
+| Consistency | Strong consistency | Trade consistency for availability or partition tolerance. Eventual consistency |
+| Scalability | Relational databases use ACID transactions to handle consistency across the whole database. This inherently clashes with a cluster environment | Aggregate structure helps greatly with running on a cluster. If we are running on a cluster, we need to minimize how many nodes we need to query when we are gathering data. By using aggregates, we give the database important information about which bits of data (an aggregate) will be manipulated together, and thus should live on the same node. |
+| Performance | MySQL/PostgreSQL ~ 1k QPS | MongoDB/Cassandra ~ 10k QPS. Redis/Memcached ~ 100k ~ 1M QPS |
+| Maturity | Over 20 years. Integrate naturally with most web frameworks. For example, Active Record inside Ruby on Rails | Usually less than 10 years. Not great support for serialization and secondary indexes |

+## NoSQL flavors
+### Key-value
+* Suitable use cases
+ - **Storing session information**: Generally, every web session is unique and is assigned a unique sessionid value. Applications that store the sessionid on disk or in an RDBMS will greatly benefit from moving to a key-value store, since everything about the session can be stored by a single PUT request or retrieved using GET. This single-request operation makes it very fast, as everything about the session is stored in a single object. Solutions such as Memcached are used by many web applications, and Riak can be used when availability is important
+ - **User profiles, Preferences**: Almost every user has a unique userId, username, or some other attributes, as well as preferences such as language, color, timezone, which products the user has access to, and so on. This can all be put into an object, so getting preferences of a user takes a single GET operation. Similarly, product profiles can be stored.
+ - **Shopping Cart Data**: E-commerce websites have shopping carts tied to the user. As we want the shopping carts to be available all the time, across browsers, machines, and sessions, all the shopping information can be put into a value whose key is the userid. A Riak cluster would be best suited for these kinds of applications.

+* When not to use
+ * **Relationships among Data**: If you need to have relationships between different sets of data, or correlate the data between different sets of keys, key-value stores are not the best solution to use, even though some key-value stores provide link-walking features.
+ * **Multioperation transactions**: If you're saving multiple keys and there is a failure to save any of them, and you want to revert or roll back the rest of the operations, key-value stores are not the best solution to be used.
 * **Query by data**: If you need to search the keys based on something found in the value part of the key-value pairs, then key-value stores are not going to perform well for you. There is no way to inspect the value on the database side, with the exception of some products like Riak Search or indexing engines like Lucene.
 * **Operations by sets**: Since operations are limited to one key at a time, there is no way to operate upon multiple keys at the same time. If you need to operate upon multiple keys, you have to handle this from the client side.

-##### Process
-* Build the web application first and then add web services as an alternative interface to it.
-* Take a hotel booking website as example
- - First implement the front end (HTML views with some AJAX and Cascading Style Sheets) and your business logic (usually backend code running within some Model View Controller framework). Your webiste the allow users to do the usual things like searching for hotels, checking availability, and booking hotel rooms.
- - After the core functionality was completed, you would then add web services to your web application when a particular need arose. For example, a few months after your product was live, you wanted to integrate with a partner company and allow them to promote your hotels. Then as part of integration efforts you would design and implement web services, allowing your partner to perform certain operations.

+### Document
+* Suitable use cases
+ - **Event logging**: Applications have different event logging needs; within the enterprise, there are many different applications that want to log events. Document databases can store all these different types of events and can act as a central data store for event storage. This is especially true when the type of data being captured by the events keeps changing. Events can be sharded by the name of the application where the event originated or by the type of event such as order_processed or customer_logged.
+ - **Content Management Systems, Blogging Platforms**: Since document databases have no predefined schemas and usually understand JSON documents, they work well in content management systems or applications for publishing websites, managing user comments, user registrations, profiles, web-facing documents.
+ - **Web Analytics or Real-Time Analytics**: Document databases can store data for real-time analytics; since parts of the document can be updated, it's very easy to store page views or unique visitors, and new metrics can be easily added without schema changes (see the sketch below).
+ - **E-Commerce Applications**: E-commerce applications often need to have flexible schema for products and orders, as well as the ability to evolve their data models without expensive database refactoring or data migration.

-##### Benefits
-* You can add features and make changes to your code at very high speed, especially in early phases of development. Not having APIs reduces the number of components, layers, and the overall complexity of the system, which makes it easier to work with. If you do not have any customers yet, you do not know whether your business model will work, and if you are trying to get the early minimum viable product out the door, you may benefit from a lightweight approach like this.
-* You defer implementation of any web service code until you have proven that your product works and that it is worth further development. It helps you to avoid overengineering.
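As a concrete illustration of the event-logging and real-time-analytics use cases above, here is a minimal sketch assuming a MongoDB instance accessed through the `pymongo` driver (the connection string, database, and collection names are made up for the example):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumes a local MongoDB
db = client["analytics"]

# Schemaless event logging: each event document can carry its own fields.
db.events.insert_one({"type": "order_processed", "order_id": 42, "total": 19.99})
db.events.insert_one({"type": "customer_logged", "customer": "alice"})

# Real-time analytics: update just part of a document in place.
db.pages.update_one(
    {"_id": "/home"},
    {"$inc": {"views": 1}},  # adding a new metric needs no schema change
    upsert=True,             # create the page document on first sight
)
```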
+* When not to use
+ - **Complex Transactions Spanning Different Operations**: If you need to have atomic cross-document operations, then document databases may not be for you. However, there are some document databases that do support these kinds of operations, such as RavenDB.
+ - **Queries against Varying Aggregate Structure**: Flexible schema means that the database does not enforce any restrictions on the schema. Data is saved in the form of application entities. If you need to query these entities ad hoc, your queries will be changing (in RDBMS terms, this would mean that as you join criteria between tables, the tables to join keep changing). Since the data is saved as an aggregate, if the design of the aggregate is constantly changing, you need to save the aggregates at the lowest level of granularity; basically, you need to normalize the data. In this scenario, document databases may not work.

-##### Downsides
-* Potential future costs like the need for major refactoring or rewrites.

+### Column-Family
+* Suitable use cases
+ - **Event Logging**: Column-family databases with their ability to store any data structures are a great choice to store event information, such as application state or errors encountered by the application. Within the enterprise, all applications can write their events to Cassandra with their own columns and the row key of the form appname:timestamp. Since we can scale writes, Cassandra would work ideally for an event logging system.
+ - **Content Management Systems, Blogging Platforms**: Using column-families, you can store blog entries with tags, categories, links, and trackbacks in different columns. Comments can be either stored in the same row or moved to a different keyspace; similarly, blog users and the actual blogs can be put into different column families.
+ - **Counters**: Often, in web applications you need to count and categorize visitors of a page to calculate analytics. You can use the CounterColumnType during creation of a column family.
+ - **Expiring usage**: You may provide demo access to users, or may want to show ad banners on a website for a specific time. You can do this by using expiring columns: Cassandra allows you to have columns which, after a given time, are deleted automatically. This time is known as TTL and is defined in seconds. The column is deleted after the TTL has elapsed; when the column does not exist, the access can be revoked or the banner can be removed.

-#### API-First approach
-* API-first implies designing and building your API contract first and then building clients consuming that API and the actual implementation of the web service. The concept of API-first came about as a solution to the problem of multiple user interfaces. It is common nowadays for a company to have a mobile application, a desktop website, a mobile website, and a need to integrate with third parties by giving them programmtic access to the functionality and data of their system.
-* A straightforward solution is to implement each use cases separately. You would likely end up with multiple implementations of the same logic spread across different parts of your system. Since your web application, mobile client, and your partners each have slightly different needs, it feels natural to satisfy each of the use cases by providing slightly different interfaces. Before you realize it, you will have duplicate code spread across all of your controllers.
An alternative approach to that problem is to create a layer of web services that encapsulates most of the business logic and hides complexity behind a single API contract. In this scenario, all of your clients use the same API interface when talking to your web application. +```sql +CREATE COLUMN FAMILY visit_counter +WITH default_validation_class=CounterColumnType +AND key_validation_class=UTF8Type AND comparator=UTF8Type -##### Benefits -* By having a single web service with all of the business logic, you only need to maintain one copy of that code. That in turn means that you need to modify less code when making changes, since you can make changes to the web service alone rather than having to apply these changes to all of the clients. -* Having an API also makes it easier to scale your system because you can use functional partitioning and divide your web services layer into a set of smaller independent web services. +// Once a column family is created, you can have arbitrary columns for each page visited within the web application for every user. +INCR visit_counter['mfowler'][home] BY 1; +INCR visit_counter['mfowler'][products] BY 1; +INCR visit_counter['mfowler'][contactus] BY 1; -##### Downsides -* To make sure you do not overengineer and still provide all of the functionality needed by your clients, you may need to spend much more time designing and researching your future use cases. No matter how much you try, you still take a risk of implementing too much or designing too restrictively. +// expiring columns +SET Customer['mfowler']['demo_access'] = 'allowed' WITH ttl=2592000; +``` -#### Pragmatic approach -* When you see a use case that can be easily isolated into a separate web service and that will most likely require multiple clients performing the same type of functionality, then you should consider building a web service for it. On the other hand, when you are just testing the waters with very loosely defined requirements, you may be better off by starting small and learning quickly rather than investing too much upfront. +* When not to use + - **ACID transactions for writes and reads** + - **Database to aggregate the data using queries (such as SUM or AVG)**: you have to do this on the client side using data retrieved by the client from all the rows. + - **Early prototypes or initial tech spikes**: During the early stages, we are not sure how the query patterns may change, and as the query patterns change, we have to change the column family design. This causes friction for the product innovation team and slows down developer productivity. RDBMS impose high cost on schema change, which is traded off for a low cost of query change; in Cassandra, the cost may be higher for query change as compared to schema change. -### Types -#### Function-Centric Services -* A simple way of thinking about function-centric web services is to imagine that anywhere in your code you could call a function. As a result of that function call, your arguments and all the data needed to execute that function would be serialized and sent over the network to a machine that is supposed to execute it. After reaching the remote server, the data would be converted back to the native formats used by that machine, the function would be invoked, and then results would be serialized back to the network abstraction format. 
Then the result would be sent to your server and unserialized to your native machine formats so that your code could continue working without ever knowing that the function was executed on a remote machine.
-* SOAP is the dominant technology.
- - Features
- + One of the most important features was that it allowed web services to be discovered and the integration code to be generated based on contract descriptors themselves.
- + Another important feature is its extensibility.
- - Downsides
- + The richness of features came at a cost of reduced interoperability. Integration between development stack (dynamic languages such as PHP, Ruby, Perl) was difficult.
- + You cannot use HTTP-level caching with SOAP. SOAP requests are issued by sending XML documents, where request parameters and method names are contained in the XML document itself. Since the URL does not contain all of the information needed to perform the remote procedure call, the response cannot be cached on the HTTP layer based on the URL alone.
- + Some of the additional specifications introduce state into the web service protocol, making it stateful. In theory, you could implement a stateless SOAP web service using just the bare minimum of SOAP-related specifications, but in practice, companies often want to use more than that. As soon as you begin supporting things like transactions or secure conversation, you forfeit the ability to treat your web services machines as stateless clones and distribute requests among them.

+### Graph
+* Suitable use cases
+ - **Connected data**:
+ + Social networks are where graph databases can be deployed and used very effectively. These social graphs don't have to be only of the friend kind; for example, they can represent employees, their knowledge, and where they worked with other employees on different projects. Any link-rich domain is well-suited for graph databases.
+ + If you have relationships between domain entities from different domains (such as social, spatial, commerce) in a single database, you can make these relationships more valuable by providing the ability to traverse across domains.
+ - **Routing, Dispatch, and Location-Based Services**: Every location or address that has a delivery is a node, and all the nodes where the delivery has to be made by the delivery person can be modeled as graph nodes. Relationships between nodes can have the property of distance, thus allowing you to deliver the goods in an efficient manner. Distance and location properties can also be used in graphs of places of interest, so that your application can provide recommendations of good restaurants or entertainment options nearby. You can also create nodes for your points of sales, such as bookstores or restaurants, and notify the users when they are close to any of the nodes to provide location-based services.
+ - **Recommendation Engines**:
+ + As nodes and relationships are created in the system, they can be used to make recommendations like "your friends also bought this product" or "when invoicing this item, these other items are usually invoiced." Or, it can be used to make recommendations to travelers mentioning that when other visitors come to Barcelona they usually visit Antonio Gaudi's creations.
+ + An interesting side effect of using the graph databases for recommendations is that as the data size grows, the number of nodes and relationships available to make the recommendations quickly increases.
The same data can also be used to mine information; for example, which products are always bought together or which items are always invoiced together, so that alerts can be raised when these conditions are not met. Like other recommendation engines, graph databases can be used to search for patterns in relationships to detect fraud in transactions.

+* When not to use
+ - When you want to update all or a subset of entities - for example, in an analytics solution where all entities may need to be updated with a changed property - graph databases may not be optimal since changing a property on all the nodes is not a straightforward operation. Even if the data model works for the problem domain, some databases may be unable to handle lots of data, especially in global graph operations.

-#### Resource-Centric Services
-* Each resource can be treated as a type of object, and there are only a few operations that can be performed on these objects (you can create, delete, update and fetch them). You model your resources in any way you wish, but you interact with them in more standardized ways.

-### REST
-* REST is not always the best. For example, mobile will force you to move away from the model of a single resource per call. There are various ways to support the mobile use case, but none of them is particularly RESTful. That's because mobile applications need to be able to make a single call per screen, even if that screen demonstrates multiple types of resources.

+# Scaling
+## Functional partitioning

 ### REST best practices
 * Could look at an industry-level API design example by [Github](https://developer.github.com/v3/)
@@ -1149,68 +1048,217 @@ HTTP/1.1 400 Bad Request
 * This kind of safeguarding is usually unnecessary when dealing with an internal API, or an API meant only for your front end, but it's a crucial measure to make when exposing the API publicly.
 * Suppose you define a rate limit of 2,000 requests per hour for unauthenticated users; the API should include the following headers in its responses, with every request shaving off a point from the remainder. The X-RateLimit-Reset header should contain a UNIX timestamp describing the moment when the limit will be reset

-> X-RateLimit-Limit: 2000
-> X-RateLimit-Remaining: 1999
-> X-RateLimit-Reset: 1404429213925
+> X-RateLimit-Limit: 2000
+> X-RateLimit-Remaining: 1999
+> X-RateLimit-Reset: 1404429213925
+
+
+* Once the request quota is drained, the API should return a 429 Too Many Requests response, with a helpful error message wrapped in the usual error envelope:
+
+```
+X-RateLimit-Limit: 2000
+X-RateLimit-Remaining: 0
+X-RateLimit-Reset: 1404429213925
+{
+ "error": {
+ "code": "bf-429",
+ "message": "Request quota exceeded. Wait 3 minutes and try again.",
+ "context": {
+ "renewal": 1404429213925
+ }
+ }
+}
+```
+
+* However, it can be very useful to notify the consumer of their limits before they actually hit it. This is an area that currently lacks standards but has a number of popular conventions using HTTP response headers.

+##### Use OAuth2 with HTTPS for authorization, authentication and confidentiality.
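As a minimal illustration of this point (assuming the Python `requests` library and a made-up endpoint; obtaining the token via a specific OAuth2 grant flow is out of scope here), a client sends the bearer token only over HTTPS and honors the rate-limiting headers described above:

```python
import requests

API_ROOT = "https://api.example.com"  # placeholder HTTPS-only endpoint
ACCESS_TOKEN = "..."                  # obtained beforehand via an OAuth2 flow

resp = requests.get(
    f"{API_ROOT}/cars",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},  # never send over plain HTTP
    timeout=5,
)

if resp.status_code == 429:
    # Back off until the moment advertised by the rate limiter.
    print("Throttled; retry after", resp.headers.get("X-RateLimit-Reset"))
else:
    resp.raise_for_status()
    print(resp.json())
```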
+
+#### Documentation
+* Good documentation should
+ - Explain how the response envelope works
+ - Demonstrate how error reporting works
+ - Show how authentication, paging, throttling, and caching work on a high level
+ - Detail every single endpoint, explain the HTTP verbs used to query those endpoints, and describe each piece of data that should be in the request and the fields that may appear in the response
+* Test cases can sometimes help as documentation by providing up-to-date working examples that also indicate best practices in accessing an API. The docs should show examples of complete request/response cycles. Preferably, the requests should be pastable examples - either links that can be pasted into a browser or curl examples that can be pasted into a terminal. GitHub and Stripe do a great job with this.
+ - cURL: always illustrate your API call documentation with cURL examples. Readers can simply cut-and-paste them, and they remove any ambiguity regarding call details.
+* Another desired component in API documentation is a changelog that briefly details the changes that occur from one version to the next. The documentation must include any deprecation schedules and details surrounding externally visible API updates. Updates should be delivered via a blog (i.e. a changelog) or a mailing list (preferably both!).

+#### Others
+* Provide filtering, sorting, field selection and paging for collections
+ - Filtering: Use a unique query parameter for all fields or a query language for filtering.
+ + GET /cars?color=red Returns a list of red cars
+ + GET /cars?seats<=2 Returns a list of cars with a maximum of 2 seats
+ - Sorting: Allow ascending and descending sorting over multiple fields
+ + GET /cars?sort=-manufacturer,+model. This returns a list of cars sorted by descending manufacturers and ascending models.
+ - Field selection: Mobile clients display just a few attributes in a list. They don’t need all attributes of a resource. Give the API consumer the ability to choose returned fields. This will also reduce the network traffic and speed up the usage of the API.
+ + GET /cars?fields=manufacturer,model,id,color
+ - Paging:
+ + Use limit and offset. It is flexible for the user and common in leading databases. The default should be limit=20 and offset=0. GET /cars?offset=10&limit=5.
+ + To send the total entries back to the user use the custom HTTP header: X-Total-Count.
+ + Links to the next or previous page should be provided in the HTTP header link as well. It is important to follow these link header values instead of constructing your own URLs.
+* Content negotiation
+ - Content-type defines the request format.
+ - Accept defines a list of acceptable response formats. If a client requires you to return application/xml and the server can only return application/json, then you should return status code 406.
+ - We recommend handling several content distribution formats. We can use the HTTP Header dedicated to this purpose: “Accept”.
+ - By default, the API will share resources in the JSON format, but if the request begins with “Accept: application/xml”, resources should be sent in the XML format.
+ - It is recommended to manage at least 2 formats: JSON and XML. The order of the formats queried by the header “Accept” must be observed to define the response format.
+ - In cases where it is not possible to supply the required format, a 406 HTTP Error Code is sent (cf. Errors — Status Codes).
+* Pretty print by default and ensure gzip is supported
+ - An API that provides white-space compressed output isn't very fun to look at from a browser. Although some sort of query parameter (like ?pretty=true) could be provided to enable pretty printing, an API that pretty prints by default is much more approachable.
+* HATEOAS: Hypertext As The Engine of Application State
+ - There should be a single endpoint for the resource, and all of the other actions you’d need to undertake should be able to be discovered by inspecting that resource.
+ - People are not doing this because the tooling just isn't there.

+## Data partitioning - Sharding
+### Sharding benefits
+* Scale horizontally to any size. Without sharding, sooner or later, your data set size will be too large for a single server to manage or you will get too many concurrent connections for a single server to handle. You are also likely to reach your I/O throughput capacity as you keep reading and writing more data. By using application-level sharding, none of the servers needs to have all of the data. This allows you to have multiple MySQL servers, each with a reasonable amount of RAM, hard drives, and CPUs and each of them being responsible for a small subset of the overall data, queries, and read/write throughput.
+* Since sharding splits data into disjoint subsets, you end up with a share-nothing architecture. There is no overhead of communication between servers, and there is no cluster-wide synchronization or blocking. Servers are independent from each other because they share nothing. Each server can make authoritative decisions about data modifications.
+* You can implement sharding in the application layer and then apply it to any data store, regardless of whether it supports sharding out of the box or not. You can apply sharding to object caches, message queues, nonstructured data stores, or even file systems.

+### Sharding key
+* Determine what tables need to be sharded. A good starting point for deciding that is to look at the number of rows in the tables as well as the dependencies between the tables.
+ - Typically you use only a single column as the partition key. Using multiple columns as the partition key is possible but can be hard to maintain.
+ - Sharding on a column that is a primary key offers significant advantages. The reason for this is that the column should have a unique index, so that each value in the column uniquely identifies the row.

+### Sharding function
+#### Static sharding
+* Def: The sharding key is mapped to a shard identifier using a fixed assignment that never changes.
+ - Static sharding schemes run into problems when the distribution of the queries is not even.
+* Types:
+ - Range partitioning (used in HBase)
+ + Easy to implement, but the distribution can easily become uneven. For example, if you are using URIs as keys, "hot" sites will be clustered together when you actually want the opposite, to spread them out. One shard can become overloaded, and you have to split it a lot to be able to cope with the increase in load.
+ - Hash partitioning
+ + Computes a hash of the input in some manner (MD5 or SHA-1) and then uses modulo arithmetic to get a number between 1 and the number of the shards.
+ * Evenly distributed, but requires a large amount of data migration and rehashing when the number of servers changes
+ + Consistent hashing: Guaranteed to move rows from just one old shard to the new shard. The entire hash range is shown as a ring.
On the hash ring, the shards are assigned to points on the ring using the hash function. In a similar manner, the rows are distributed over the ring using the same hash function. Each shard is now responsible for the region of the ring that starts at the shard's point on the ring and continues to the next shard point. Because a region may start at the end of the hash range and wrap around to the beginning of the hash range, a ring is used here instead of a flat line.
+ - Pick a hash function which must have a big range, hence a lot of "points" on the hash ring where rows can be assigned. The most commonly used functions are MD5, SHA and Murmur hash (murmur3 -2^128, 2^128)
+ - Less data migration, but harder to keep nodes balanced
+ * Unbalanced scenario 1: Machines with different processing power/speed.
+ * Unbalanced scenario 2: Ring is not evenly partitioned.
+ * Unbalanced scenario 3: Ranges of the same length can hold different amounts of data.
+ - Virtual nodes (Used in Dynamo and Cassandra)
+ + Solution: Each physical node is associated with a different number of virtual nodes.
+ + Problems: Replicas of the same data must land on different physical nodes, not merely on different virtual nodes that map to the same physical node.

+#### Dynamic sharding
+* The sharding key is looked up in a dictionary that indicates which shard contains the data.
+ - More flexible. You are allowed to change the location of shards and it is also easy to move data between shards if you have to. You do not need to migrate all of the data in one shot, but you can do it incrementally, one account at a time. To migrate a user, you need to lock its account, migrate the data, and then unlock it. You could usually do these migrations at night to reduce the impact on the system, and you could also migrate multiple accounts at the same time. There is an additional level of flexibility, as you can cherry-pick users and migrate them to the shards of your choice. Depending on the application requirements, you could migrate your largest or busiest clients to separate dedicated database instances to give them more capacity.
+ - Requires a centralized store called the sharding database and extra queries to find the correct shard to retrieve the data from.

+### Challenges
+#### Cross-shard joins
+* Tricky to execute queries spanning multiple shards. The most common reason for using cross-shard joins is to create reports. This usually requires collecting information from the entire database. There are basically two approaches to solve this problem:
+ - Execute the query in a map-reduce fashion (i.e., send the query to all shards and collect the result into a single result set). It is pretty common that running the same query on each of your servers and picking the highest of the values will not guarantee a correct result.
+ - Replicate all the shards to a separate reporting server and run the query there. This approach is easier. It is usually feasible, as well, because most reporting is done at specific times, is long-running, and does not depend on the current state of the database.

+#### Using AUTO_INCREMENT
+* It is quite common to use AUTO_INCREMENT to create a unique identifier for a column. However, this fails in a sharded environment because the shards do not synchronize their AUTO_INCREMENT identifiers. This means if you insert a row in one shard, it might well happen that the same identifier is used on another shard. If you truly want to generate a unique identifier, there are basically three approaches.
+ - Generate a unique UUID.
The drawback is that the identifier takes 128 bits (16 bytes).
+ - Use a composite identifier, where the first part is the shard identifier and the second part is a locally generated identifier. Note that the shard identifier is used when generating the key, so if a row with this identifier is moved, the original shard identifier has to move with it. You can solve this by maintaining, in addition to the column with the AUTO_INCREMENT, an extra column containing the shard identifier for the shard where the row was created.
+ - Use atomic counters provided by some data stores. For example, if you already use Redis, you could create a counter for each unique identifier. You would then use Redis' INCR command to increase the value of a selected counter and return the new value.

+#### Distributed transactions
+* Lose the ACID properties of your database as a whole. Maintaining ACID properties across shards requires you to use distributed transactions, which are complex and expensive to execute (most open-source database engines like MySQL do not even support distributed transactions).

+## Clones - Replication
+### Replication purpose
+#### High availability by creating redundancy
+* Duplicate components
+ - Def: Keep duplicates around for each component - ready to take over immediately if the original component fails.
+ + Characteristics: Do not lose performance when switching, and switching to the standby is usually faster than restructuring the system. But expensive.
+ - For example: Hot standby
+ + A dedicated server that just duplicates the main master. The hot standby is connected to the master as a slave, so that it reads and applies all changes. This setup is often called primary-backup configuration.

+* Create spare capacity
+ - Def: Have extra capacity in the system so that if a component fails, you can still handle the load.
+ - Characteristics: Should one of the components fail, the system will still be responding, but the capacity of the system will be reduced.

+##### Planning for failures
+* Slave failures
+ - Because the slaves are used only for read queries, it is sufficient to inform the load balancer that the slave is missing. Then we can take the failing slave out of rotation, rebuild it, and put it back.

+* Master failures
+ - Problems:
+ + All the slaves have stale data.
+ + Some queries may block if they are waiting for changes to arrive at the slave. Some queries may make it into the relay log of the slave and therefore will eventually be executed by the slave. No special consideration has to be taken on the behalf of these queries.
+ + Queries that are waiting for events that did not leave the master before it crashed are usually reported as failures, so users should reissue the query.
+ - Solutions:
+ + If a simple restart does not work:
+ + First find out which of your slaves is most up to date.
+ + Then reconfigure it to become a master.
+ + Finally reconfigure all remaining slaves to replicate from the new master.

+* Relay failures
+ - For servers acting as relay servers, the situation has to be handled specially. If they fail, the remaining slaves have to be redirected to use some other relay or the master itself.

+* Disaster recovery
+ - Disaster does not have to mean earthquakes or floods; it just means that something went very bad for the computer and it is not local to the machine that failed.
Typical examples are lost power in the data center (not necessarily because the power was lost in the city; just losing power in the building is sufficient).
+ - The nature of a disaster is that many things fail at once, making it impossible to handle redundancy by duplicating servers at a single data center. Instead, it is necessary to ensure data is kept safe at another geographic location, and it is quite common for companies to ensure high availability by having different components at different offices.

+#### Replication for scaling read
+##### When to use
+* Scale reads: Instead of a single server having to respond to all the queries, you can have many clones sharing the load. You can keep scaling read capacity by simply adding more slaves. And if you ever hit the limit of how many slaves your master can handle, you can use multilevel replication to further distribute the load and keep adding even more slaves. By adding multiple levels of replication, your replication lag increases, as changes need to propagate through more servers, but you can increase read capacity.
+* Scale the number of concurrently reading clients and the number of queries per second: If you want to scale your database to support 5,000 concurrent read connections, then adding more slaves or caching more aggressively can be a great way to go.

+##### When not to use
+* Scale writes: No matter what topology you use, all of your writes need to go through a single machine.
+ - Although a dual master architecture appears to double the capacity for handling writes (because there are two masters), it actually doesn't. Writes are just as expensive as before because each statement has to be executed twice: once when it is received from the client and once when it is received from the other master. All the writes done by the A clients, as well as B clients, are replicated and get executed twice, which leaves you in no better position than before.
+* Not a good way to scale the overall data set size: If you want to scale your active data set to 5TB, replication would not help you get there. The reason why replication does not help in scaling the data set size is that all of the data must be present on each of the machines. The master and each of its slaves need to have all of the data.
+ - Def of active data set: All of the data that must be accessed frequently by your application (all of the data your database needs to read from or write to disk within a time window, like an hour, a day, or a week).
+ - Size of active data set: When the active data set is small, the database can buffer most of it in memory. As your active data set grows, your database needs to load more disk blocks because in-memory buffers are not large enough to contain enough of the active disk blocks.
+ - Access pattern of data set
+ + Like a time-window: In an e-commerce website, you use tables to store information about each purchase. This type of data is usually accessed right after the purchase and then it becomes less and less relevant as time goes by. Sometimes you may still access older transactions after a few days or weeks to update shipping details or to perform a refund, but after that, the data is pretty much dead except for an occasional report query accessing it.
+ + Unlimited data set growth: For a website that allows users to listen to music online, users would likely come back every day or every week to listen to their music.
In such case, no matter how old an account is, the user is still likely to log in and request her playlists on a weekly or daily basis.

+### Replication Topology
+#### Master-slave vs peer-to-peer

| Types | Strengths | Weakness |
| ------------ |:----------------:|:-------------------:|
| Master-slave | | |
| p2p: Master-master | | Not a viable scalability technique. |
| p2p: Ring-based | Chain three or more masters together to create a ring. | |

-* Once the request quota is drained, the API should return a 429 Too Many Request response, with a helpful error message wrapped in the usual error envelope:
-
-```
-X-RateLimit-Limit: 2000
-X-RateLimit-Remaining: 0
-X-RateLimit-Reset: 1404429213925
-{
- "error": {
- "code": "bf-429",
- "message": "Request quota exceeded. Wait 3 minutes and try again.",
- "context": {
- "renewal": 1404429213925
- }
- }
-}
-```
-
-* However, it can be very useful to notify the consumer of their limits before they actually hit it. This is an area that currently lacks standards but has a number of popular conventions using HTTP response headers.

+#### Master-slave replication
+* Responsibility:
+ - Master is responsible for all data-modifying commands like updates, inserts, deletes or create table statements. The master server records all of these statements in a log file called a binlog, together with a timestamp, and assigns a sequence number to each statement. Once a statement is written to a binlog, it can then be sent to slave servers.
+ - Slave is responsible for all read statements.
+* Replication process: The master server writes commands to its own binlog, regardless if any slave servers are connected or not. The slave server knows where it left off and makes sure to get the right updates. This asynchronous process decouples the master from its slaves - you can always connect a new slave or disconnect slaves at any point in time without affecting the master.
+ 1. First the client connects to the master server and executes a data modification statement. The statement is executed and written to a binlog file. At this stage the master server returns a response to the client and continues processing other transactions.
+ 2. At any point in time the slave server can connect to the master server and ask for an incremental update of the master's binlog file. In its request, the slave server provides the sequence number of the last command that it saw.
+ 3. Since all of the commands stored in the binlog file are sorted by sequence number, the master server can quickly locate the right place and begin streaming the binlog file back to the slave server.
+ 4. The slave server then writes all of these statements to its own copy of the master's binlog file, called a relay log.
+ 5. Once a statement is written to the relay log, it is executed on the slave data set, and the offset of the most recently seen command is increased.

+##### Number of slaves
+* It is a common practice to have two or more slaves for each master server. Having more than one slave machine has the following benefits (a read/write routing sketch follows the list):
+ - Distribute read-only statements among more servers, thus sharing the load among more servers
+ - Use different slaves for different types of queries. E.g. Use one slave for regular application queries and another slave for slow, long-running reports.
+ - Losing a slave is a nonevent, as slaves do not have any information that would not be available via the master or other slaves.
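In practice, these benefits are realized with read/write splitting in the application's data access layer. Below is a minimal Python sketch (the connection objects are assumed to expose a DB-API-style `execute`; all names are illustrative):

```python
import itertools

class ReplicatedPool:
    """Send writes to the master; spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.read_rotation = itertools.cycle(slaves)

    def execute_write(self, sql, params=()):
        # Every data-modifying statement must go through the single master.
        return self.master.execute(sql, params)

    def execute_read(self, sql, params=()):
        # Read-only statements can go to any slave; round-robin spreads the load.
        return next(self.read_rotation).execute(sql, params)
```

Because of replication lag, a real implementation would also need a read-your-writes policy, e.g. serving reads that immediately follow a write from the master.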
-#### Documentation
-* Good documentation should
- - Explain how the response envelope works
- - Demonstrate how error reporting works
- - Show how authentication, paging, throttling, and caching work on a high level
- - Detail every single endpoint, explain the HTTP verbs used to query those endpoints, and describe each piece of data that should be in the request and the fields that may appear in the response
-* Test cases can sometimes help as documentation by providing up-to-date working examples that also indicate best practices in accessing an API. The docs should show examples of complete request/response cycles. Preferably, the requests should be pastable examples - either links that can be pasted into a browser or curl examples that can be pasted into a terminal. GitHub and Stripe do a great job with this.
- - CURL: always illustrating your API call documentation by cURL examples. Readers can simply cut-and-paste them, and they remove any ambiguity regarding call details.
-* Another desired component in API documentation is a changelog that briefly details the changes that occur from one version to the next. The documentation must include any deprecation schedules and details surrounding externally visible API updates. Updates should be delivered via a blog (i.e. a changelog) or a mailing list (preferably both!).

+#### Peer-to-peer replication
+* Dual masters
+ - Two masters replicate each other to keep both current. This setup is very simple to use because it is symmetric. Failing over to the standby master does not require any reconfiguration of the main master, and failing back to the main master again when the standby master fails in turn is very easy.
+ + Active-active: Writes go to both servers, which then transfer changes to the other master.
+ + Active-passive: One of the masters handles writes while the other server just keeps current with the active master.
+ - The most common use of the active-active dual-masters setup is to have the servers geographically close to different sets of users - for example, in branch offices at different places in the world. The users can then work with the local server, and the changes will be replicated over to the other master so that both masters are kept in sync.

-#### Others
-* Provide filtering, sorting, field selection and paging for collections
- - Filtering: Use a unique query parameter for all fields or a query language for filtering.
- + GET /cars?color=red Returns a list of red cars
- + GET /cars?seats<=2 Returns a list of cars with a maximum of 2 seats
- - Sorting: Allow ascending and descending sorting over multiple fields
- + GET /cars?sort=-manufactorer,+model. This returns a list of cars sorted by descending manufacturers and ascending models.
- - Field selection: Mobile clients display just a few attributes in a list. They don’t need all attributes of a resource. Give the API consumer the ability to choose returned fields. This will also reduce the network traffic and speed up the usage of the API.
- + GET /cars?fields=manufacturer,model,id,color
- - Paging:
- + Use limit and offset. It is flexible for the user and common in leading databases. The default should be limit=20 and offset=0. GET /cars?offset=10&limit=5.
- + To send the total entries back to the user use the custom HTTP header: X-Total-Count.
- + Links to the next or previous page should be provided in the HTTP header link as well. It is important to follow this link header values instead of constructing your own URLs.
-* Content negotiation
-  - Content-type defines the request format.
-  - Accept defines a list of acceptable response formats. If a client requires you to return application/xml and the server can only return application/json, you should return status code 406.
-  - We recommend handling several content distribution formats. We can use the HTTP header dedicated to this purpose: “Accept”.
-  - By default, the API will share resources in the JSON format, but if the request begins with “Accept: application/xml”, resources should be sent in the XML format.
-  - It is recommended to manage at least 2 formats: JSON and XML. The order of the formats queried by the header “Accept” must be observed to define the response format.
-  - In cases where it is not possible to supply the required format, a 406 HTTP Error Code is sent (cf. Errors — Status Codes).
-* Pretty print by default and ensure gzip is supported
-  - An API that provides white-space compressed output isn't very fun to look at from a browser. Although some sort of query parameter (like ?pretty=true) could be provided to enable pretty printing, an API that pretty prints by default is much more approachable.
-* HATEOAS: Hypermedia As The Engine of Application State
-  - There should be a single endpoint for the resource, and all of the other actions you’d need to undertake should be able to be discovered by inspecting that resource.
-  - People are not doing this because the tooling just isn't there.

+* Circular replication

+### Replication mode
+#### Synchronous and Asynchronous
+* Asynchronous: The master does not wait for the slaves to apply the changes, but instead just dispatches each change request to the slaves and assumes they will catch up eventually and replicate all the changes.
+* Synchronous: The master and slaves are always in sync, and a transaction is not allowed to be committed on the master unless the slaves agree to commit it as well (i.e. synchronous replication makes the master wait for all the slaves to keep up with the writes).
+#### Synchronous vs Asynchronous
+* Asynchronous replication is a lot faster than synchronous replication. Compared with asynchronous replication, synchronous replication requires extra synchronization to guarantee consistency. It is usually implemented through a protocol called two-phase commit, which guarantees consistency between the master and slaves. What makes this protocol slow is that it requires a total of four messages, including messages with the transaction and the prepare request. The major problem is not the amount of network traffic required to handle the synchronization, but the latency introduced by the network and by processing the commit on the slave, together with the fact that the commit is blocked on the master until all the slaves have acknowledged the transaction. In contrast, with asynchronous replication the master does not have to wait for the slaves, but can report the transaction as committed immediately, which improves performance significantly.
+* The performance of asynchronous replication comes at the price of consistency. In asynchronous replication the transaction is reported as committed immediately, without waiting for any acknowledgement from the slave.

 ## Cache
 ### Why does cache work
@@ -1451,169 +1499,68 @@ public V readSomeData(K key) {
   - In a region: Replication
   - Across regions: Consistency

+# Architecture
+## Lambda architecture

-## Message queue
-### Benefits
-* **Enabling asynchronous processing**:
-  - Defer processing of time-consuming tasks without blocking our clients.
Anything that is slow or unpredictable is a candidate for asynchronous processing. Examples include:
-    + Interacting with remote servers
-    + Low-value processing in the critical path
-    + Resource-intensive work
-    + Independent processing of high- and low-priority jobs
-  - Message queues enable your application to operate in an asynchronous way, but they only add value if your application is not built in an asynchronous way to begin with. If you develop in an environment like Node.js, which is built with asynchronous processing at its core, you will not benefit from a message broker that much. What is good about message brokers is that they allow you to easily introduce asynchronous processing to other platforms, like those that are synchronous by nature (C, Java, Ruby).
-* **Easier scalability**:
-  - Producers and consumers can be scaled separately. We can add more producers at any time without overloading the system. Messages that cannot be consumed fast enough will just begin to line up in the message queue. We can also scale consumers separately, as now they can be hosted on separate machines and the number of consumers can grow independently of producers.
-* **Decoupling**:
-  - All that publishers need to know is the format of the message and where to publish it. Consumers can become oblivious as to who publishes messages and why. Consumers can focus solely on processing messages from the queue. Such a high level of decoupling enables consumers and producers to be developed independently. They can even be developed by different teams using different technologies.
-* **Evening out traffic spikes**:
-  - You should be able to keep accepting requests at high rates even at times of increased traffic. Even if your publishers generate messages much faster than consumers can keep up with, you can keep enqueueing messages, and publishers do not have to be affected by a temporary capacity problem on the consumer side.
-* **Isolating failures and self-healing**:
-  - The fact that consumers' availability does not affect producers allows us to stop message processing at any time. This means that we can perform maintenance and deployments on back-end servers at any time. We can simply restart, remove, or add servers without affecting producers' availability, which simplifies deployments and server management. Instead of breaking the entire application whenever a back-end server goes offline, all that we experience is reduced throughput, but there is no reduction of availability. Reduced throughput of asynchronous tasks is usually invisible to the user, so there is no consumer impact.

-### Components
-* Message producer
-  - Locates the message queue and sends a valid message to it.
-* Message broker - where messages are sent and buffered for consumers.
-  - Is available at all times for producers and accepts their messages.
-  - Buffers messages and allows consumers to consume related messages.
-* Message consumer
-  - Receives and processes messages from the message queue.
-  - The two most common ways of implementing consumers are the "cron-like" and the "daemon-like" approach.
-    + A cron-like consumer connects periodically to the queue and checks its status. If there are messages, it consumes them and stops when the queue is empty or after consuming a certain amount of messages. This model is common in scripting languages where you do not have a persistently running application container, such as PHP, Ruby, or Perl. Cron-like is also referred to as a pull model because the consumer pulls messages from the queue. It can also be used if messages are added to the queue rarely or if network connectivity is unreliable. For example, a mobile application may try to pull the queue from time to time, assuming that the connection may be lost at any point in time.
-    + A daemon-like consumer runs constantly in an infinite loop, and it usually has a permanent connection to the message broker. Instead of checking the status of the queue periodically, it simply blocks on the socket read operation. This means that the consumer is waiting idly until messages are pushed by the message broker over the connection. This model is more common in languages with persistent application containers, such as Java, C#, and Node.js. This is also referred to as a push model because messages are pushed by the message broker onto the consumer as fast as the consumer can keep processing them.
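As an illustration of the daemon-like (push) model just described, here is a minimal sketch using the JMS API against ActiveMQ; the broker URL and the queue name `email-tasks` are assumptions, not part of the original text:

```java
import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import javax.jms.TextMessage;

import org.apache.activemq.ActiveMQConnectionFactory;

public class DaemonLikeConsumer {
    public static void main(String[] args) throws Exception {
        // Permanent connection to the broker (assumed to be ActiveMQ on localhost).
        Connection connection =
                new ActiveMQConnectionFactory("tcp://localhost:61616").createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer =
                session.createConsumer(session.createQueue("email-tasks"));
        while (true) {
            Message message = consumer.receive(); // blocks until the broker pushes a message
            if (message instanceof TextMessage) {
                System.out.println("Processing: " + ((TextMessage) message).getText());
            }
        }
    }
}
```

A cron-like consumer would instead call `consumer.receive(timeoutMillis)` on a schedule and disconnect once the queue is drained.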
-### Routing methods
-* Direct worker queue method
-  - Consumers and producers only have to know the name of the queue.
-  - Well suited for the distribution of time-consuming tasks such as sending out e-mails, processing videos, resizing images, or uploading content to third-party web services.
-* Publish/Subscribe method
-  - Producers publish messages to a topic, not a queue. Messages arriving at a topic are then cloned for each consumer that has a declared subscription to that topic.
-* Custom routing rules
-  - A consumer can decide in a more flexible way what messages should be routed to its queue.
-  - Logging and alerting are good examples of custom routing based on pattern matching.

-### Protocols
-* AMQP: A standardized protocol accepted by OASIS. Aims at enterprise integration and interoperability.
-* STOMP: A minimalist protocol.
-  - Simplicity is one of its main advantages. It supports fewer than a dozen operations, so implementation and debugging of libraries are much easier. It also means that the protocol layer does not add much performance overhead.
-  - But interoperability can be limited because there is no standard way of doing certain things. A good example of impaired interoperability is the message prefetch count. Prefetch is a great way of increasing throughput because messages are received in batches instead of one message at a time. Although both RabbitMQ and ActiveMQ support this feature, they both implement it using different custom STOMP headers.
-* JMS
-  - Offers a good feature set and is popular.
-  - Your ability to integrate with non-JVM-based languages will be very limited.

-### Metrics to decide which message broker to use
-* Number of messages published per second
-* Average message size
-* Number of messages consumed per second (this can be much higher than the publishing rate, as multiple consumers may be subscribed to receive copies of the same message)
-* Number of concurrent publishers
-* Number of concurrent consumers
-* Whether message persistence is needed (no message loss during a message broker crash)
-* Whether message acknowledgement is needed (no message loss during a consumer crash)

-### Challenges
-* No message ordering: Messages are processed in parallel and there is no synchronization between consumers. Each consumer works on a single message at a time and has no knowledge of other consumers running in parallel to it. Since your consumers are running in parallel and any of them can become slow or even crash at any point in time, it is difficult to prevent messages from being occasionally delivered out of order.
-  - Solutions:
-    + Limit the number of consumers to a single thread per queue
-    + Build the system to assume that messages can arrive in random order
-    + Use a messaging broker that supports partial message ordering guarantees
-  - It is best to depend on the message broker to deliver messages in the right order by using a partial message guarantee (ActiveMQ) or topic partitioning (Kafka). If your broker does not support such functionality, you will need to ensure that your application can handle messages being processed in an unpredictable order.
-    + Partial message ordering is a clever mechanism provided by ActiveMQ called message groups. Messages can be published with a special label called a message group ID. The group ID is defined by the application developer. Then all messages belonging to the same group are guaranteed to be consumed in the same order they were produced. Whenever a message with a new group ID gets published, the message broker maps the new group ID to one of the existing consumers. From then on, all the messages belonging to the same group are delivered to the same consumer. This may cause other consumers to wait idly without messages, as the message broker routes messages based on the mapping rather than random distribution.
-  - Message ordering is a serious issue to consider when architecting a message-based application, and the RabbitMQ, ActiveMQ and Amazon SQS messaging platforms cannot guarantee global message ordering with parallel workers. In fact, Amazon SQS is known for unpredictable ordering of messages because its infrastructure is heavily distributed and ordering of messages is not supported.
-* Message requeueing
-  - By allowing messages to be delivered to your consumers more than once, you make your system more robust and reduce constraints put on the message queue and its workers. For this approach to work, you need to make all of your consumers idempotent (see the sketch after this list).
-    + But it is not an easy thing to do. Sending emails is, by nature, not an idempotent operation. Adding an extra layer of tracking and persistence could help, but it would add a lot of complexity and may not be able to handle all of the failures.
-    + Idempotent consumers may be more sensitive to messages being processed out of order. If we have two messages, one to set the product's price to $55 and another one to set the price of the same product to $60, we could end up with different results based on their processing order.
-* Race conditions become more likely
-* Risk of increased complexity
-  - When integrating applications using a message broker, you must be very diligent in documenting dependencies and the overarching message flow. Without good documentation of the message routes and visibility of how the messages flow through the system, you may increase the complexity and make it much harder for developers to understand how the system works.
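One common way to approximate idempotency, referenced in the requeueing bullet above, is to deduplicate on a message identifier before processing. A minimal sketch, assuming each message carries a unique ID; the in-memory set is a simplification, as a real consumer would persist (and eventually expire) the processed IDs to survive restarts:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentConsumer {
    // In-memory for illustration only; production systems would keep
    // this state in durable storage shared by all consumer instances.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    public void onMessage(String messageId, String payload) {
        // add() returns false if the ID was already present, so a
        // redelivered message is acknowledged but not re-processed.
        if (!processed.add(messageId)) {
            return;
        }
        handle(payload);
    }

    private void handle(String payload) {
        System.out.println("Handling: " + payload);
    }
}
```

Note that deduplication does not address the ordering problem from the price example above; two distinct messages applied in the wrong order still produce the wrong final state.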
-## Data Persistent Layer
-### MySQL

-### NoSQL
-#### NoSQL vs SQL
-* There is no generally accepted definition. All we can do is discuss some common characteristics of the databases that tend to be called "NoSQL".

-| Dimension | SQL | NoSQL |
-| --------------------- |:-------------:| ------------:|
-| Data uniformity | Uniform data. Best visualized as a set of tables. Each table has rows, with each row representing an entity of interest. Each row is described through columns. One row cannot be nested inside another. | Non-uniform data. NoSQL databases recognize that it is often necessary to operate on data in units that have a more complex structure than a set of rows. This is particularly useful in dealing with nonuniform data and custom fields. NoSQL data models fall into four categories: key-value, document, column-family and graph. |
-| Schema change | Defines what tables exist, what columns exist, and what data types they hold. Although relational schemas can actually be changed at any time with standard SQL commands, such changes are costly. | Changing the schema is casual and low cost. Essentially, a schemaless database shifts the schema into the application code. |
-| Query flexibility | Low cost of changing queries. It allows you to easily look at the data in different ways. Standard SQL supports things like joins and subqueries. | High cost of changing queries. It does not allow you to easily look at the data in different ways. NoSQL databases do not have the flexibility of joins or subqueries. |
-| Transactions | SQL has ACID transactions (Atomic, Consistent, Isolated, and Durable). It allows you to manipulate any combination of rows from any tables in a single transaction. This operation either succeeds or fails entirely, and concurrent operations are isolated from each other so they cannot see a partial update. | Graph databases support ACID transactions. Aggregate-oriented databases do not have ACID transactions that span multiple aggregates. Instead, they support atomic manipulation of a single aggregate at a time. If we need to manipulate multiple aggregates in an atomic way, we have to manage that ourselves in application code. An aggregate structure may help with some data interactions but be an obstacle for others. |
-| Consistency | Strong consistency. | Trades consistency for availability or partition tolerance. Eventual consistency. |
-| Scalability | Relational databases use ACID transactions to handle consistency across the whole database. This inherently clashes with a cluster environment. | The aggregate structure helps greatly with running on a cluster. If we are running on a cluster, we need to minimize how many nodes we query when gathering data. By using aggregates, we give the database important information about which bits of data (an aggregate) will be manipulated together, and thus should live on the same node. |
-| Performance | MySQL/PostgreSQL ~ 1k QPS | MongoDB/Cassandra ~ 10k QPS. Redis/Memcached ~ 100k-1M QPS |
-| Maturity | Over 20 years. Integrates naturally with most web frameworks, e.g. Active Record inside Ruby on Rails. | Usually less than 10 years. Not great support for serialization and secondary indexes. |

-#### NoSQL flavors
-##### Key-value
-* Suitable use cases
-  - **Storing session information**: Generally, every web session is unique and is assigned a unique sessionid value. Applications that store the sessionid on disk or in an RDBMS will greatly benefit from moving to a key-value store, since everything about the session can be stored by a single PUT request or retrieved using GET (see the sketch after this list). This single-request operation makes it very fast, as everything about the session is stored in a single object. Solutions such as Memcached are used by many web applications, and Riak can be used when availability is important.
-  - **User profiles, preferences**: Almost every user has a unique userId, username, or some other attribute, as well as preferences such as language, color, timezone, which products the user has access to, and so on. This can all be put into an object, so getting the preferences of a user takes a single GET operation. Similarly, product profiles can be stored.
-  - **Shopping Cart Data**: E-commerce websites have shopping carts tied to the user. As we want the shopping carts to be available all the time, across browsers, machines, and sessions, all the shopping information can be put into the value, where the key is the userid. A Riak cluster would be best suited for these kinds of applications.
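A minimal sketch of the session-storage case above, using Redis through the Jedis client; the host, key prefix, and TTL are illustrative assumptions. The whole session travels as one value, so a single SET or GET covers it:

```java
import redis.clients.jedis.Jedis;

public class SessionStore {
    private final Jedis jedis = new Jedis("localhost", 6379); // assumed Redis host

    // One PUT stores everything about the session...
    public void save(String sessionId, String sessionJson) {
        jedis.setex("session:" + sessionId, 1800, sessionJson); // expires after 30 minutes
    }

    // ...and one GET retrieves it.
    public String load(String sessionId) {
        return jedis.get("session:" + sessionId);
    }
}
```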
-* When not to use
-  * **Relationships among data**: If you need to have relationships between different sets of data, or correlate the data between different sets of keys, key-value stores are not the best solution to use, even though some key-value stores provide link-walking features.
-  * **Multioperation transactions**: If you're saving multiple keys and there is a failure to save any of them, and you want to revert or roll back the rest of the operations, key-value stores are not the best solution to be used.
-  * **Query by data**: If you need to search the keys based on something found in the value part of the key-value pairs, then key-value stores are not going to perform well for you. There is no way to inspect the value on the database side, with the exception of some products like Riak Search or indexing engines like Lucene.
-  * **Operations by sets**: Since operations are limited to one key at a time, there is no way to operate upon multiple keys at the same time. If you need to operate upon multiple keys, you have to handle this from the client side.

-##### Document
-* Suitable use cases
-  - **Event logging**: Applications have different event logging needs; within the enterprise, there are many different applications that want to log events. Document databases can store all these different types of events and can act as a central data store for event storage. This is especially true when the type of data being captured by the events keeps changing. Events can be sharded by the name of the application where the event originated or by the type of event, such as order_processed or customer_logged.
-  - **Content Management Systems, Blogging Platforms**: Since document databases have no predefined schemas and usually understand JSON documents, they work well in content management systems or applications for publishing websites, managing user comments, user registrations, profiles, and web-facing documents.
-  - **Web Analytics or Real-Time Analytics**: Document databases can store data for real-time analytics; since parts of the document can be updated, it's very easy to store page views or unique visitors, and new metrics can be easily added without schema changes (see the sketch after this section).
-  - **E-Commerce Applications**: E-commerce applications often need to have flexible schemas for products and orders, as well as the ability to evolve their data models without expensive database refactoring or data migration.

-* When not to use
-  - **Complex Transactions Spanning Different Operations**: If you need to have atomic cross-document operations, then document databases may not be for you. However, there are some document databases that do support these kinds of operations, such as RavenDB.
-  - **Queries against Varying Aggregate Structure**: Flexible schema means that the database does not enforce any restrictions on the schema. Data is saved in the form of application entities. If you need to query these entities ad hoc, your queries will be changing (in RDBMS terms, this would mean that as you join criteria between tables, the tables to join keep changing). Since the data is saved as an aggregate, if the design of the aggregate is constantly changing, you need to save the aggregates at the lowest level of granularity; basically, you need to normalize the data. In this scenario, document databases may not work.
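To make the web-analytics case above concrete: because a document store can update part of a document in place, a page-view counter is a single in-place update. A sketch with the MongoDB Java driver; the connection string, database, collection, and field names are hypothetical:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.UpdateOptions;
import org.bson.Document;

import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Updates.inc;

public class PageViewCounter {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> pages =
                    client.getDatabase("analytics").getCollection("pages");
            // Increment a single field in place; upsert creates the document
            // the first time a new page is seen, with no schema migration.
            pages.updateOne(eq("url", "/home"), inc("views", 1),
                    new UpdateOptions().upsert(true));
        }
    }
}
```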
-##### Column-Family
-* Suitable use cases
-  - **Event Logging**: Column-family databases, with their ability to store any data structures, are a great choice for storing event information, such as application state or errors encountered by the application. Within the enterprise, all applications can write their events to Cassandra with their own columns and a row key of the form appname:timestamp. Since we can scale writes, Cassandra would work ideally for an event logging system.
-  - **Content Management Systems, Blogging Platforms**: Using column families, you can store blog entries with tags, categories, links, and trackbacks in different columns. Comments can be either stored in the same row or moved to a different keyspace; similarly, blog users and the actual blogs can be put into different column families.
-  - **Counters**: Often, in web applications, you need to count and categorize visitors of a page to calculate analytics. For this, you can use the CounterColumnType during creation of a column family.
-  - **Expiring usage**: You may provide demo access to users, or may want to show ad banners on a website for a specific time. You can do this by using expiring columns: Cassandra allows you to have columns which, after a given time, are deleted automatically. This time is known as TTL and is defined in seconds. The column is deleted after the TTL has elapsed; when the column does not exist, the access can be revoked or the banner can be removed.
+# Building blocks
+## Load balancer
+### Hardware vs software
+| Category | Software | Hardware |
+|----------|----------|----------|
+| Def | Run on standard PC hardware, using applications like Nginx and HAProxy | Run on special hardware and come with the vendor's software pre-installed and configured. |
+| Model | Operate at the application layer | Operate at the network and transport layers and work with TCP/IP packets. Route traffic to backend servers, possibly handling network address translation. |
+| Strengths/Weaknesses | More intelligent, because they can talk HTTP (they can compress resources passing through and route based on the presence of cookies), and more flexible for hacking in new features or changes | Higher throughput and lower latency, but high purchase cost: hardware load balancer prices start from a few thousand dollars and go as high as over 100,000 dollars per device. They also require specialized training, and it is harder to find people with the work experience necessary to operate them. |

-```sql
-CREATE COLUMN FAMILY visit_counter
-WITH default_validation_class=CounterColumnType
-AND key_validation_class=UTF8Type AND comparator=UTF8Type

+### HAProxy vs Nginx

-// Once a column family is created, you can have arbitrary columns for each page visited within the web application for every user.
-INCR visit_counter['mfowler'][home] BY 1;
-INCR visit_counter['mfowler'][products] BY 1;
-INCR visit_counter['mfowler'][contactus] BY 1;
+| Category | Nginx | HAProxy |
+|-----------|---------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
+| Strengths | Can cache HTTP responses from your servers. | A little faster than Nginx, with a wealth of extra features. It can be configured as either a layer 4 or layer 7 load balancer. |

-// expiring columns
-SET Customer['mfowler']['demo_access'] = 'allowed' WITH ttl=2592000;
-```

+* Extra functionality of HAProxy: it can be configured as either a layer 4 or layer 7 load balancer.
+  - When HAProxy is set up to be a layer 4 proxy, it does not inspect higher-level protocols and depends solely on TCP/IP headers to distribute the traffic. This, in turn, allows HAProxy to be a load balancer for any protocol, not just HTTP/HTTPS. You can use HAProxy to distribute traffic for services like cache servers, message queues, or databases.
+  - HAProxy can also be configured as a layer 7 proxy, in which case it supports sticky sessions and SSL termination, but needs more resources to be able to inspect and track HTTP-specific information. The fact that HAProxy is simpler in design makes it perform slightly better than Nginx, especially when configured as a layer 4 load balancer. Finally, HAProxy has built-in high-availability support.

-* When not to use
-  - **ACID transactions for writes and reads**
-  - **Database to aggregate the data using queries (such as SUM or AVG)**: you have to do this on the client side using data retrieved by the client from all the rows.
-  - **Early prototypes or initial tech spikes**: During the early stages, we are not sure how the query patterns may change, and as the query patterns change, we have to change the column family design. This causes friction for the product innovation team and slows down developer productivity. RDBMSs impose a high cost on schema change, which is traded off for a low cost of query change; in Cassandra, the cost may be higher for query change as compared to schema change.

-##### Graph
-* Suitable use cases
-  - **Connected data**:
-    + Social networks are where graph databases can be deployed and used very effectively. These social graphs don't have to be only of the friend kind; for example, they can represent employees, their knowledge, and where they worked with other employees on different projects. Any link-rich domain is well suited for graph databases.
-    + If you have relationships between domain entities from different domains (such as social, spatial, commerce) in a single database, you can make these relationships more valuable by providing the ability to traverse across domains.
-  - **Routing, Dispatch, and Location-Based Services**: Every location or address that has a delivery is a node, and all the nodes where the delivery has to be made by the delivery person can be modeled as graph nodes. Relationships between nodes can have the property of distance, thus allowing you to deliver the goods in an efficient manner. Distance and location properties can also be used in graphs of places of interest, so that your application can provide recommendations of good restaurants or entertainment options nearby.
You can also create nodes for your points of sale, such as bookstores or restaurants, and notify the users when they are close to any of the nodes to provide location-based services.
-  - **Recommendation Engines**:
-    + As nodes and relationships are created in the system, they can be used to make recommendations like "your friends also bought this product" or "when invoicing this item, these other items are usually invoiced" (see the sketch after this section). Or, they can be used to make recommendations to travelers mentioning that when other visitors come to Barcelona they usually visit Antonio Gaudi's creations.
-    + An interesting side effect of using graph databases for recommendations is that as the data size grows, the number of nodes and relationships available to make the recommendations quickly increases. The same data can also be used to mine information; for example, which products are always bought together, or which items are always invoiced together, and alerts can be raised when these conditions are not met. Like other recommendation engines, graph databases can be used to search for patterns in relationships to detect fraud in transactions.

+## Web server
+### Apache and Nginx
+* Apache and Nginx can always be used together.
+  - NGINX provides all of the core features of a web server, without sacrificing the lightweight and high-performance qualities that have made it successful, and can also serve as a proxy that forwards HTTP requests to upstream web servers (such as an Apache backend) and FastCGI, memcached, SCGI, and uWSGI servers. NGINX does not seek to implement the huge range of functionality necessary to run an application, instead relying on specialized third-party servers such as PHP-FPM, Node.js, and even Apache.
+  - A very common use pattern is to deploy NGINX as a proxy in front of an Apache-based web application, using Nginx's proxying abilities to forward requests for dynamic resources to the Apache backend server. NGINX serves static resources, while Apache serves dynamic content such as PHP or Perl CGI scripts.

-* When not to use
-  - When you want to update all or a subset of entities - for example, in an analytics solution where all entities may need to be updated with a changed property - graph databases may not be optimal, since changing a property on all the nodes is not a straightforward operation. Even if the data model works for the problem domain, some databases may be unable to handle lots of data, especially in global graph operations.
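A toy sketch of the "your friends also bought this product" traversal described in the recommendation section above, using plain in-memory adjacency sets; a real deployment would run this inside a graph database, and all names here are hypothetical:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FriendsAlsoBought {
    // Two relationship types: person -> friends, person -> purchased products.
    private final Map<String, Set<String>> friends = new HashMap<>();
    private final Map<String, Set<String>> bought = new HashMap<>();

    public void addFriend(String a, String b) {
        friends.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        friends.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    public void addPurchase(String person, String product) {
        bought.computeIfAbsent(person, k -> new HashSet<>()).add(product);
    }

    // One-hop traversal: products my friends bought that I have not.
    public Set<String> recommend(String person) {
        Set<String> result = new HashSet<>();
        for (String friend : friends.getOrDefault(person, Set.of())) {
            result.addAll(bought.getOrDefault(friend, Set.of()));
        }
        result.removeAll(bought.getOrDefault(person, Set.of()));
        return result;
    }
}
```

As the graph grows, the same relationships can be traversed for the other patterns the text mentions, such as items usually invoiced together.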
+### Apache vs Nginx

-# Troubleshooting
-## What happens if we cannot access a website
-## What happens if a webserver is too slow
-## What should we do for increasing traffic

+| Category | Apache | Nginx |
+|--------------------|--------|-------|
+| History | Invented in the 1990s, when web traffic was low and web pages were simple. Apache's heavyweight, monolithic model has its limits, and tuning Apache to cope with real-world traffic efficiently is a complex art. | Built for heavy traffic and complex web pages. Designed for high concurrency, and provides features which make it appropriate for microservices. |
+| Architecture | One process/thread per connection. Each request is handled as a separate child process or thread. | Asynchronous event-driven model. There is a single master process with one or more worker processes. |
+| Performance | To decrease page-rendering time, web browsers routinely open six or more TCP connections to a web server for each user session so that resources can download in parallel. Browsers hold these connections open for a period of time to reduce delay for future requests the user might make during the session. Each open connection exclusively reserves an httpd process, meaning that at busy times, Apache needs to create a large number of processes. Each additional process consumes an extra 4MB or 5MB of memory, not to mention the overhead involved in creating and destroying child processes. | Can handle a huge number of concurrent requests. |
+| Easier development | Very easy to insert additional code at any point in Apache's web-serving logic. Developers could add code securely in the knowledge that if newly added code blocked, ran slowly, leaked resources, or even crashed, only the worker process running the code would be affected. Processing of all other connections would continue undisturbed. | Developing modules for it isn't as simple and easy as with Apache. Nginx module developers need to be very careful to create efficient and accurate code, without any resource leakage, and to interact appropriately with the complex event-driven kernel to avoid blocking operations. |

+## Cache
+### In-memory cache - Guava cache
+### Standalone cache
+#### Memcached
+#### Redis

+## Database
+### DynamoDB
+### Cassandra

+## Queue
+### ActiveMQ
+### RabbitMQ
+### SQS
+### Kafka

+## Data Processing
+### Hadoop
+### Spark
+### EMR

+## Stream Processing
+### Samza
+### Storm

 # References
 * [Hired in Tech courses](https://www.hiredintech.com/courses)