Uniform Resource Locator
http://sub.example.com/product/electric/phone/
The distinction between path and resource does not matter that much.
URLs are a type of URI.
https://danielmiessler.com/study/difference-between-uri-url/
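As a small illustration of the URL parts discussed above, the following sketch splits the example URL with Go's standard net/url package. The helper name splitURL is made up for this note.

```go
package main

import (
	"fmt"
	"net/url"
)

// splitURL (hypothetical helper) breaks a URL into scheme, host and path.
func splitURL(raw string) (scheme, host, path string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	return u.Scheme, u.Host, u.Path, nil
}

func main() {
	scheme, host, path, err := splitURL("http://sub.example.com/product/electric/phone/")
	if err != nil {
		panic(err)
	}
	fmt.Println("scheme:", scheme) // http
	fmt.Println("host:", host)     // sub.example.com (subdomain + domain)
	fmt.Println("path:", path)     // /product/electric/phone/
}
```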
Domain Name System
DNS translates domain names to IP addresses, similar to a phone book.
Transmission Control Protocol
The browser needs to establish a TCP connection with the web server. This takes several network round trips, so it is an expensive operation.
Modern browsers use "keep-alive" connections to reuse an already established TCP connection.
Once connected, the browser sends HTTP requests to the server, and the server sends back HTTP responses. A typical website involves multiple requests and responses, because each image and each JavaScript bundle / file has to be requested separately.
HTTPS requires an SSL/TLS handshake to establish an encrypted connection. This handshake is even more expensive than the regular TCP handshake.
Browsers use tricks like SSL session resumption to reduce this cost.
Playlist: Web API Design Series
Application Programming Interface
Video: REST vs RPC vs GraphQL API - How do I pick the right API paradigm?
Article: Debunking the Myths of RPC & REST
https://developer.mozilla.org/en-US/docs/Web/HTTP/Status
Common codes:
200: OK
  GET: Resource fetched successfully and returned in body.
  HEAD: Representation headers are included in response without any message body.
  PUT or POST: Resource describing the result of the action is in body.
  TRACE: Message body contains request message as received by server.
201: Created
  Typically sent after POST, and sometimes PUT.
301: Moved Permanently
304: Not Modified
400: Bad Request
401: Unauthorised
403: Forbidden
  Servers may return 404 instead of 403 to hide the existence of the resource.
404: Not Found
  Servers may return 404 instead of 403 to hide the existence of the resource.
418: I'm a teapot
429: Too Many Requests
500: Internal Server Error
501: Not Implemented
  GET and HEAD methods must be supported by the server.
502: Bad Gateway
503: Service Unavailable
  Send a Retry-After HTTP header, if possible.
504: Gateway Timeout
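A minimal sketch of how a few of these codes could be produced by a Go handler, exercised in memory with net/http/httptest. The paths /public, /secret and /busy and the 30 second Retry-After value are invented for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// handler serves three hypothetical paths, one per status code of interest.
func handler(w http.ResponseWriter, r *http.Request) {
	switch r.URL.Path {
	case "/public":
		fmt.Fprint(w, "hello") // implicit 200 OK
	case "/secret":
		// 404 instead of 403, to hide the existence of the resource.
		w.WriteHeader(http.StatusNotFound)
	case "/busy":
		// 503 plus a Retry-After header (value in seconds).
		w.Header().Set("Retry-After", "30")
		w.WriteHeader(http.StatusServiceUnavailable)
	default:
		w.WriteHeader(http.StatusNotFound)
	}
}

// probe calls the handler in memory and reports the status code
// and the Retry-After header of the response.
func probe(path string) (int, string) {
	rec := httptest.NewRecorder()
	handler(rec, httptest.NewRequest(http.MethodGet, path, nil))
	res := rec.Result()
	return res.StatusCode, res.Header.Get("Retry-After")
}

func main() {
	for _, p := range []string{"/public", "/secret", "/busy"} {
		code, retry := probe(p)
		fmt.Println(p, code, http.StatusText(code), retry)
	}
}
```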
Allows newer versions to introduce changes that would otherwise break compatibility.
Video: Web API Versioning | Additive Change Strategy
Changes to the API must never break compatibility.
Better for smaller and simpler projects that are not likely to change in the future.
Use a numbered versioning scheme to allow the user to pick which version of the endpoint.
Better for larger projects and enterprise applications.
Methods
URI Components
Before resource (https://www.youtube.com/api/v1.1/channels): use when the version scheme applies to a collection of endpoints
After resource (https://www.youtube.com/api/channels/v1.1): use when the version scheme applies to a single endpoint
Easier to debug, because more visible
URIs need to be permanently supported links
HTTP Headers
Use either:
Custom Header: Youtube-Version: 1.2
Accept Header: Accept: application/json; version=1.2
Reduces noise in URIs
Harder to debug, because less visible
Potential client caching issues
Request Parameters
https://youtube.com/api/channels?version=1.2
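The header and request-parameter methods can be sketched as a small resolver. This is only an illustration: the precedence (custom header first, then the version query parameter) and the default of 1.0 are assumptions, and resolve is a helper written just to exercise the function in memory.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// defaultVersion is an assumed fallback when the client names no version.
const defaultVersion = "1.0"

// versionOf checks the custom header first, then the query parameter.
func versionOf(r *http.Request) string {
	if v := r.Header.Get("Youtube-Version"); v != "" {
		return v
	}
	if v := r.URL.Query().Get("version"); v != "" {
		return v
	}
	return defaultVersion
}

// resolve builds an in-memory request and returns the version it asks for.
func resolve(target, header string) string {
	r := httptest.NewRequest(http.MethodGet, target, nil)
	if header != "" {
		r.Header.Set("Youtube-Version", header)
	}
	return versionOf(r)
}

func main() {
	fmt.Println(resolve("/api/channels", "1.2"))          // 1.2 (custom header)
	fmt.Println(resolve("/api/channels?version=1.1", "")) // 1.1 (query parameter)
	fmt.Println(resolve("/api/channels", ""))             // 1.0 (default)
}
```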
semver.org | Semantic Versioning
Major.Minor.Patch (2.0.1)
Typically, only major versions are exposed differently to the end user.
Deprecate old versions to lower maintenance work.
Deprecations need to be communicated to consumers through email, newsletters, articles, etc. Sunset response headers can be used as well.
Sunset response headers specify the sunset date, e.g. Sunset: Sat, 31 Dec 2022 23:59:59 GMT
After the sunset date, accessing the resource should result in either a 400-level or a 300-level response.
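One way the Sunset behaviour could look in Go middleware is sketched below. Answering with 410 Gone after the sunset date is an assumption (any 400-level or 300-level response would fit the note), and withSunset, statusInYear and the request path are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// sunset is the date after which the old version stops being served.
var sunset = time.Date(2022, time.December, 31, 23, 59, 59, 0, time.UTC)

// withSunset always advertises the Sunset header and, once the date has
// passed, answers 410 Gone instead of calling the wrapped handler.
func withSunset(now func() time.Time, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Sunset", sunset.Format(http.TimeFormat))
		if now().After(sunset) {
			w.WriteHeader(http.StatusGone)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusInYear calls the wrapped handler as if "now" were 1 June of year.
func statusInYear(year int) (int, string) {
	now := func() time.Time { return time.Date(year, time.June, 1, 0, 0, 0, 0, time.UTC) }
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { fmt.Fprint(w, "ok") })
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/v1/channels", nil)
	withSunset(now, ok).ServeHTTP(rec, req)
	res := rec.Result()
	return res.StatusCode, res.Header.Get("Sunset")
}

func main() {
	fmt.Println(statusInYear(2022)) // 200 Sat, 31 Dec 2022 23:59:59 GMT
	fmt.Println(statusInYear(2023)) // 410 Sat, 31 Dec 2022 23:59:59 GMT
}
```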
Video: What Is REST API? Examples And How To Use It
Representational State Transfer Application Programming Interface
Models the problem domain as resources.
Stateless: client & server do not need to store information about each other; every request is independent
Organises resources into a set of URIs (Uniform Resource Identifiers). They differentiate the different types of resources.
Resources should be nouns rather than verbs: GET /user, rather than GET /get_user.
CRUD: create, read, update, delete
GET - Read
POST - Create
PUT - Full Update
PATCH - Partial Update
DELETE - Delete
HEAD - GET but without the response body (only the headers)
Typically returns JSON or XML.
PUT - full update - overwrites the entire resource if it exists, otherwise creates the new resource
PATCH - partial update - only send the data to be updated (a set of changes/instructions to be implemented), other parameters will not be affected
PUT vs PATCH idempotency
PUT is idempotent.
PATCH can have "commands", like "op": "add" to /users/ for adding a user.
With an operation like adding a user, PATCH is not idempotent (if usernames are not unique).
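The idempotency difference can be seen with an in-memory store. This is only a sketch: Store, Put and PatchAdd are hypothetical names that model the HTTP semantics without any networking, and usernames are deliberately not unique.

```go
package main

import "fmt"

type User struct{ Name string }

// Store is a toy in-memory user collection.
type Store struct {
	users  map[string]User
	nextID int
}

func NewStore() *Store { return &Store{users: map[string]User{}} }

// Put models PUT /users/{id}: it overwrites or creates the resource at id,
// so repeating the same call leaves the store in the same state.
func (s *Store) Put(id string, u User) { s.users[id] = u }

// PatchAdd models a PATCH on /users/ carrying an "op": "add" command:
// every call appends a new user under a freshly generated id.
func (s *Store) PatchAdd(u User) string {
	s.nextID++
	id := fmt.Sprintf("user-%d", s.nextID)
	s.users[id] = u
	return id
}

func (s *Store) Count() int { return len(s.users) }

func main() {
	s := NewStore()
	s.Put("bob", User{Name: "bob"})
	s.Put("bob", User{Name: "bob"})
	fmt.Println("after two identical PUTs:", s.Count()) // 1

	s.PatchAdd(User{Name: "bob"})
	s.PatchAdd(User{Name: "bob"})
	fmt.Println("after two identical PATCH adds:", s.Count()) // 3
}
```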
https://stackoverflow.com/questions/4024271/rest-api-best-practices-where-to-put-parameters
/api/user/XXX (path variables)
/api/user?username=XXX (query parameters)
Path variables are generally used for:
One reason to prefer path variables where possible is that some browsers do not cache the results if the URL contains query parameters (because of RPC APIs).
Query parameters are generally used for:
However, these are not set in stone, and there is no general consensus on when to use which. It is more important to be consistent!
Some operations like archiving, deactivating and searching do not naturally fit into any of the CRUD operations.
Archive
Typically, would use PATCH with a flag of "archive": true.
Deactivate
PUT /users/user-1/deactivate
Search
GET /search/code?name=bob
Use GET with query parameters. This is because browsers cache GET requests.
Do NOT use the request body because browser caching works only with the URL.
GET requests with a request body also break the REST principle.
Remote Procedure Call
Exposes actions (think function calls), rather than CRUD operations.
Has endpoints for each action.
Examples:
https://slack.com/api/chat.postMessage
https://slack.com/api/chat.scheduleMessage
https://slack.com/api/chat.deleteScheduledMessage
Supports only
Video: What is RPC? gRPC Introduction
Google's implementation of RPC that is very widely used.
Protocol Buffers are used as the data interchange format. They are language-agnostic and platform-agnostic (they work the same on all languages and platforms).
.proto file
.proto file
.proto files can be used to generate code / classes in the client and server for making RPC calls
Only a single endpoint.
Client requests a structure and the server returns the result in the exact format requested.
Supports only
https://www.youtube.com/watch?v=6RvlKYgRFYQ
Instead of constantly polling using Request-Response APIs, use Event-Driven APIs.
Producer: Server producing / writing events. Consumer: Client consuming / reading events.
Kafka is a distributed system of servers and clients that communicate over TCP. It can be deployed on bare-metal hardware, virtual machines, containers, cloud environments and the like.
There are Kafka client libraries in various popular programming languages.
Producer <-> Kafka Cluster (contains multiple brokers) <-> Consumer
https://kafka.apache.org/intro#intro_platform
Kafka use cases: https://www.neovasolutions.com/2020/07/20/apache-kafka-a-quick-overview/
Pros
Cons
Typically, HTTP responses are of a finite length. But with HTTP streaming, the server responds with an indefinite response.
How?
Pros
Cons
https://blog.logrocket.com/infinite-scrolling-graphql/ (first section is more relevant)
N products when N can be really large, like thousands, millions, billions...
Two main ways:
Commonly use limit & offset as parameters. If not specified, usually there are sensible default values.
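A minimal in-memory sketch of limit & offset paging follows. The default limit of 20 and the clamping behaviour are assumptions chosen for the example, and page is a hypothetical helper.

```go
package main

import "fmt"

// defaultLimit is an assumed "sensible default" when no limit is given.
const defaultLimit = 20

// page returns at most limit items starting at offset, clamped to the
// bounds of the slice. A non-positive limit falls back to defaultLimit.
func page(items []string, limit, offset int) []string {
	if limit <= 0 {
		limit = defaultLimit
	}
	if offset < 0 {
		offset = 0
	}
	if offset >= len(items) {
		return nil
	}
	end := offset + limit
	if end > len(items) {
		end = len(items)
	}
	return items[offset:end]
}

func main() {
	items := []string{"a", "b", "c", "d", "e"}
	fmt.Println(page(items, 2, 0)) // [a b]
	fmt.Println(page(items, 2, 2)) // [c d]
	fmt.Println(page(items, 2, 4)) // [e]
}
```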
limit
next-cursor (typically integer value), aka continuation tokens
Basically a pointer that the server uses to optimise performance. There are many ways to implement this.
A possible cursor is: timestamp of row creation.
SELECT * FROM Products
WHERE created_timestamp < $1 -- $1 is the cursor
ORDER BY created_timestamp DESC -- newest first, so each page continues just before the cursor
LIMIT 50;
Be sure to index columns that the cursor value comes from to improve performance. SQL:
CREATE INDEX index_name ON table_name (column_name);
next-cursor
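The cursor idea can be mimicked in memory as follows. This is a sketch under assumptions: products are kept sorted newest first, the creation timestamp is the cursor, timestamps are unique, and Product, nextPage and firstPage are made-up names.

```go
package main

import (
	"fmt"
	"math"
)

type Product struct {
	Name    string
	Created int64 // creation timestamp, used as the cursor
}

// firstPage is the cursor to pass when no page has been fetched yet.
const firstPage int64 = math.MaxInt64

// nextPage returns up to limit products created strictly before cursor,
// plus the cursor for the following page. products must be sorted by
// Created in descending order.
func nextPage(products []Product, cursor int64, limit int) ([]Product, int64) {
	var page []Product
	for _, p := range products {
		if len(page) == limit {
			break
		}
		if p.Created < cursor {
			page = append(page, p)
		}
	}
	next := cursor
	if len(page) > 0 {
		next = page[len(page)-1].Created
	}
	return page, next
}

func main() {
	products := []Product{{"e", 50}, {"d", 40}, {"c", 30}, {"b", 20}, {"a", 10}}
	cursor := firstPage
	for {
		page, next := nextPage(products, cursor, 2)
		if len(page) == 0 {
			break
		}
		fmt.Println(page, "next-cursor:", next)
		cursor = next
	}
}
```

In a database the linear scan would be replaced by an indexed range query on the cursor column.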
Open Web Application Security Project (OWASP)
Guidelines from OWASP.
Validation
Video: Web API Security | Basic Auth, OAuth, OpenID Connect, Scopes & Refresh Tokens
Cons
https://developer.okta.com/blog/2017/06/21/what-the-heck-is-oauth (very in-depth article)
OAuth is an open standard for authorisation.
Allows users to grant access to applications without having to share passwords with them.
One such service is Auth0.
Used for authorisation, not authentication (user email, id, etc)!
code is the most commonly used (authorisation code flow)
Returns an authorisation code, then the client exchanges this code with the authorisation server to get an access token. Subsequent requests to the API use this access token.
The authorisation code has to be exchanged for an access token for security reasons. During this exchange, a secret is also provided as proof of identity.
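The first step of the authorisation code flow is redirecting the user to the authorisation server. A sketch of building that redirect URL is shown below. The query parameter names are the standard OAuth 2.0 ones, while the endpoint, client id, redirect URI, scope and state values are placeholders invented for this example.

```go
package main

import (
	"fmt"
	"net/url"
)

// authCodeURL builds the URL the user is sent to in order to grant access.
// response_type=code asks the server for an authorisation code.
func authCodeURL(authEndpoint, clientID, redirectURI, scope, state string) string {
	q := url.Values{}
	q.Set("response_type", "code")
	q.Set("client_id", clientID)
	q.Set("redirect_uri", redirectURI)
	q.Set("scope", scope)
	q.Set("state", state) // opaque value echoed back to the client
	return authEndpoint + "?" + q.Encode()
}

func main() {
	// All values below are hypothetical.
	fmt.Println(authCodeURL(
		"https://auth.example.com/authorize",
		"my-client",
		"https://app.example.com/callback",
		"profile",
		"xyz",
	))
}
```

The code returned to the redirect URI would then be exchanged, together with the client secret, for an access token in a server-to-server request.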
Video: What is Single Sign-on (SSO)? How It Works
https://stormpath.com/blog/oauth-is-not-sso
SSO is not a protocol. Instead, it is a high-level concept.
OpenID Connect and SAML are very similar and are widely supported by most services.
A layer on top of the OAuth protocol that enables authentication. Allows client to verify the identity of end-users to get basic user profile.
Add a scope of openid.
Then, exchange for an access token and an ID token from the authorisation server.
Uses JWT (JSON Web Token) to share identity between services.
The workflow is similar to SAML below, but instead of a signed SAML Assertion, a signed JWT is instead used.
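A JWT is three base64url-encoded parts joined by dots: header, payload and signature. The sketch below only decodes the payload of a made-up token to show where the identity claims live. It does not verify the signature, which a real client must do before trusting any claim. sampleToken and claimsOf are hypothetical helpers.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// sampleToken builds a fake, unsigned token purely for demonstration.
func sampleToken() string {
	enc := base64.RawURLEncoding.EncodeToString
	header := enc([]byte(`{"alg":"RS256","typ":"JWT"}`))
	payload := enc([]byte(`{"sub":"user-1","email":"bob@example.com"}`))
	return header + "." + payload + "." + enc([]byte("not-a-real-signature"))
}

// claimsOf decodes the payload (second part) of a JWT.
// It performs NO signature verification.
func claimsOf(token string) (map[string]interface{}, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("expected header.payload.signature")
	}
	raw, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, err
	}
	var claims map[string]interface{}
	if err := json.Unmarshal(raw, &claims); err != nil {
		return nil, err
	}
	return claims, nil
}

func main() {
	claims, err := claimsOf(sampleToken())
	if err != nil {
		panic(err)
	}
	fmt.Println("sub:", claims["sub"], "email:", claims["email"])
}
```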
Security Assertion Markup Language
XML-based open standard for exchanging identity information. An alternative to OpenID.
Generally used in enterprise. One account to connect to many services.
Basic SSO Login Flow
Navigating to another SSO application
Video: Web API Rate Limiting - Why it's so IMPORTANT for your APIs
Why
Considerations
Video: Proxy vs Reverse Proxy (Real-world Examples)
A Forward Proxy sits between a User and the Internet. It intercepts requests to web servers and acts as a middleman between the User and a Web Server.
A Reverse Proxy sits between the Internet and a Web Server. It intercepts requests from clients and acts as a middleman between a User and the Web Server.
There can be many layers of reverse proxies, e.g. Cloudflare + API Gateway / Load Balancer.
Nginx and Apache are popular reverse proxies.
https://en.wikipedia.org/wiki/Anycast
Anycast is a method used for having many destination devices (which can be in different locations) share a single IP address.
This method is used by Content Delivery Networks to serve content closer to end users to minimise latency.
Typically, handles:
Request flow
Video: What is a CDN? How Does It Work?
CDN brings content closer to the user by caching the site in a nearer (physical) location.
CDNs employ servers in various locations, called Point of Presence (PoPs). A server in a PoP is called an Edge Server.
Popular CDNs include Amazon CloudFront, Cloudflare, Akamai and Microsoft Azure CDN. They direct user requests to the nearest PoP using either Anycast or DNS-based routing.
Edge Servers act as Reverse Proxies with a content cache.
Benefits
Modern Edge Servers also do further optimisation, like minification, or transforming file formats to more modern and web-friendly ones.
All TLS handshakes terminate at the edge server. This reduces the cost (remember that TLS handshakes require multiple back-and-forth round trips to establish). So, even for dynamic un-cacheable content, edge servers are used.
Edge Servers are more resilient to DDoS attacks because the attack is distributed across them. This is even more effective with Anycast, because requests can be re-directed to servers with a lower load.
Each PoP has its own IP address. When the User looks up the IP address of the CDN, the DNS returns the IP address of the nearest / best PoP.
Video: FAANG System Design Interview: Design A Location Based Service (Yelp, Google Places)
For efficiently querying nearby locations.
To query for nearby locations, typically will find the quadrant in O(1) and then find the 8 neighbouring quadrants in O(1) again. These will be the nearby locations.
Many libraries exist to convert GeoHash <-> latitude & longitude
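For intuition, a GeoHash can be computed by repeatedly halving the longitude and latitude ranges and packing the resulting bits into base32 characters. The sketch below is a plain implementation of that standard encoding, not a replacement for a library. The function name geohash and the sample coordinates are chosen for this note.

```go
package main

import "fmt"

// base32 is the GeoHash alphabet (no a, i, l, o).
const base32 = "0123456789bcdefghjkmnpqrstuvwxyz"

// geohash encodes a latitude/longitude pair into a GeoHash of the given
// length. Bits alternate between longitude and latitude, starting with
// longitude, and every 5 bits become one character.
func geohash(lat, lon float64, precision int) string {
	latLo, latHi := -90.0, 90.0
	lonLo, lonHi := -180.0, 180.0
	hash := make([]byte, 0, precision)
	bits, idx := 0, 0
	useLon := true
	for len(hash) < precision {
		if useLon {
			mid := (lonLo + lonHi) / 2
			if lon >= mid {
				idx = idx<<1 | 1
				lonLo = mid
			} else {
				idx <<= 1
				lonHi = mid
			}
		} else {
			mid := (latLo + latHi) / 2
			if lat >= mid {
				idx = idx<<1 | 1
				latLo = mid
			} else {
				idx <<= 1
				latHi = mid
			}
		}
		useLon = !useLon
		bits++
		if bits == 5 {
			hash = append(hash, base32[idx])
			bits, idx = 0, 0
		}
	}
	return string(hash)
}

func main() {
	fmt.Println(geohash(57.64911, 10.40744, 5)) // u4pru
	fmt.Println(geohash(57.64911, 10.40744, 3)) // u4p (a prefix names a larger cell)
}
```

Because a shorter hash is a prefix of the longer one, a prefix match on the stored string is enough to find everything inside a larger cell.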
Relational Database
Any relational database can handle this because it only requires a TEXT column!
Use compound key of (business_id, geohash) to allow for efficient removal of businesses.
In-memory data structures, not database solutions!
Distributed databases are often handled via Load Balancers.
https://en.wikipedia.org/wiki/Database_caching
Add a cache layer in front of the database to cache commonly queried data to minimise actual DB queries.
There are many ways to achieve this.
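One of those ways is the cache-aside pattern: check the cache first, fall back to the database on a miss, then remember the result. The sketch below uses two maps to stand in for the cache and the database. CachedDB and its fields are hypothetical, and eviction and invalidation are left out.

```go
package main

import "fmt"

// CachedDB puts a map-based cache in front of a map-based "database".
type CachedDB struct {
	cache   map[string]string
	db      map[string]string
	Queries int // number of lookups that actually reached the database
}

func NewCachedDB(rows map[string]string) *CachedDB {
	return &CachedDB{cache: map[string]string{}, db: rows}
}

// Get returns the cached value when present. On a miss it queries the
// database and stores the result for subsequent reads.
func (c *CachedDB) Get(key string) (string, bool) {
	if v, ok := c.cache[key]; ok {
		return v, true
	}
	c.Queries++
	v, ok := c.db[key]
	if ok {
		c.cache[key] = v
	}
	return v, ok
}

func main() {
	store := NewCachedDB(map[string]string{"user-1": "bob"})
	store.Get("user-1") // miss: goes to the database
	store.Get("user-1") // hit: served from the cache
	fmt.Println("database queries:", store.Queries) // 1
}
```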
https://www.digitalocean.com/community/tutorials/understanding-database-sharding
Split the data into multiple smaller databases, called logical shards. Each shard contains a subset of the total data. Collectively, the shards hold the entire dataset.
Each data point might exist in more than 1 shard. There are some situations where this is useful: a table that contains conversion rate data.
Sharding is typically implemented at the application level. However, some DBMS have sharding capabilities built in.
Key Based (Hash Based) Sharding
Essentially, just hashing.
Shard keys chosen should ideally be static (don't change often over time).
Hash function must be chosen appropriately, otherwise shards can become very unbalanced.
Cons: hard to add/remove servers, because there is a need to re-balance everything, which leads to downtime.
Consistent Hashing: to counter the need to re-balance keys, by minimising movement of data on adding/removing shards
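A minimal consistent-hashing ring could look like the sketch below. Several choices are assumptions made for the example: SHA-256 as the hash, 50 virtual points per shard, and the shard and key names. The rebalance helper counts how many keys change owner when a fourth shard is added, and how many of those moved anywhere other than the new shard.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

const replicas = 50 // virtual points per shard (assumed value)

type Ring struct {
	points []uint64          // sorted positions on the ring
	owner  map[uint64]string // position -> shard
}

func NewRing() *Ring { return &Ring{owner: map[uint64]string{}} }

// position maps a string to a point on the ring.
func position(s string) uint64 {
	sum := sha256.Sum256([]byte(s))
	return binary.BigEndian.Uint64(sum[:8])
}

// Add places the shard's virtual points on the ring.
func (r *Ring) Add(shard string) {
	for i := 0; i < replicas; i++ {
		p := position(fmt.Sprintf("%s#%d", shard, i))
		r.points = append(r.points, p)
		r.owner[p] = shard
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
}

// Lookup returns the shard owning the first point at or after the key,
// wrapping around to the start of the ring.
func (r *Ring) Lookup(key string) string {
	h := position(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

// rebalance adds a fourth shard and reports how many of n keys moved,
// and how many moved to a shard other than the new one.
func rebalance(n int) (moved, movedElsewhere int) {
	ring := NewRing()
	for _, s := range []string{"shard-a", "shard-b", "shard-c"} {
		ring.Add(s)
	}
	before := make([]string, n)
	for i := range before {
		before[i] = ring.Lookup(fmt.Sprintf("user-%d", i))
	}
	ring.Add("shard-d")
	for i := range before {
		after := ring.Lookup(fmt.Sprintf("user-%d", i))
		if after != before[i] {
			moved++
			if after != "shard-d" {
				movedElsewhere++
			}
		}
	}
	return moved, movedElsewhere
}

func main() {
	moved, elsewhere := rebalance(1000)
	fmt.Println("keys moved:", moved, "of 1000; moved to an old shard:", elsewhere)
}
```

Keys only ever move to the newly added shard, which is what keeps re-balancing cheap compared with plain hash-modulo sharding.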
Range Based Sharding
Similar to key-based, but using range of values instead. Very simple to implement but unfortunately, it often leads to unbalanced shards.
Directory Based Sharding
Similar to key based, but instead of a hash function, use a lookup table instead.
It is the most flexible option out of the 3, but there is an additional lookup cost involved.
https://www.integrate.io/glossary/what-is-database-replication/
https://www.indeed.com/career-advice/career-development/database-replication
Create partial or complete copies of a database.
Publisher Database -- (replicates to) -> Subscriber Databases
DDBMS - Distributed Database Management System
Change Data Capture (CDC) records the changes made to the publisher database. Then, the DDBMS applies these changes into the subscriber databases.
Video: SSL, TLS, HTTPS Explained
Hypertext Transfer Protocol
Hypertext Transfer Protocol Secure
HTTP extended to be encrypted by SSL/TLS.
https://www.cloudflare.com/en-gb/learning/ssl/what-is-ssl/
Secure Sockets Layer
SSL is the predecessor to TLS. SSL is now deprecated and has not been updated since 1996, but people still refer to both technologies collectively as SSL/TLS.
Transport Layer Security
A man-in-the-middle attacker can only see encrypted data.
Refer to HTTPS section for how it is used.
Video: Microservices explained in 5 minutes (high-level overview)
(UI + Business Logic + Data) all in 1 application.
Separate different parts of the application into various tiers.
A common model is 3-tier Architecture:
Split the application into smaller, independent microservices. Each microservice deals with a single service of the application.
Microservices communicate through either HTTP or message queues.
A process is an instance of an application / executable.
Processes are independent from each other.
Each process has its own:
Each process will have at least one thread: the main thread.
CPUs perform Context Switching, by saving a process's state and loading another process's state to run different processes on one core. This is expensive.
Since Context Switching is expensive, there are other mechanisms like Fibers and Coroutines which are more efficient. However, these are more complex and generally require the application to manage the threads itself (instead of the OS).
A thread is the unit of execution within a process.
Each thread has its own:
Threads within a Process share a memory address space. So, threads within a process can communicate. One malfunctioning thread can crash the entire process.
CPUs perform Context Switching on Threads too, similarly to Processes. Switching threads is generally faster than switching processes because there is no need to switch out memory pages.
https://www.quora.com/What-is-difference-between-Goroutines-vs-OS-threads
Goroutines use N:M scheduling, where N goroutines are backed by M OS threads.
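To get a feel for the N:M model, the sketch below starts many goroutines and lets the Go runtime multiplex them onto OS threads. The number of threads running Go code at the same time is capped by GOMAXPROCS. countWithGoroutines is a hypothetical helper.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// countWithGoroutines starts n goroutines that each increment a shared
// counter once, waits for all of them, and returns the final count.
func countWithGoroutines(n int) int64 {
	var wg sync.WaitGroup
	var counter int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1) // goroutines share memory, so use atomics
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0)) // 0 only reads the current value
	fmt.Println("goroutines finished:", countWithGoroutines(10000))
}
```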
aka Brewer's Theorem
Video: CAP Theorem Simplified | System Design Fundamentals
https://en.wikipedia.org/wiki/CAP_theorem
CAP Theorem refers to the fact that any distributed system can only achieve 2 out of the 3 guarantees of CAP.
Consistency: Every read receives the most recent write or an error.
Availability: Every request receives a (non-error) response, without the guarantee that it is the most recent write.
Partition Tolerance: System continues to operate regardless of the number of messages being dropped or delayed by the network between nodes.
When a network partition failure occurs, a decision must be made between:
However, this "two out of three" concept is somewhat misleading because partitions are rare in most systems.
As an example, most traditional database systems with ACID guarantees choose consistency over availability. However, some NoSQL systems based around the BASE philosophy choose availability over consistency.
https://www.digitalocean.com/community/tutorials/how-to-use-contexts-in-go
https://quii.gitbook.io/learn-go-with-tests/go-fundamentals/context
The context package is in the standard library!
Convention is to use ctx as the variable name and to use it as the first parameter of every function.
context.TODO() and context.Background() create "empty" contexts.
Background: main function, initialisation, and tests, and as the top-level Context for incoming requests
TODO: when it's unclear which Context to use or it is not yet available (because the surrounding function has not yet been extended to accept a Context parameter).
* these are directly quoted from the library's documentation
http.Request has a Context() method that returns the context of that request. This context is also cancelled if the client disconnects before the request is done.
Context can contain values, but it is not wise to use them as they are untyped!
Instead, explicitly pass the values as function parameters!
context.WithCancel(parent context.Context) (ctx context.Context, cancel context.CancelFunc)
Similar to cancel, but provide a deadline where the context will be done.
context.WithDeadline(parent context.Context, d time.Time) (context.Context, context.CancelFunc)
Similar to deadline, but provide a duration instead of a time.
context.WithTimeout(parent context.Context, timeout time.Duration) (context.Context, context.CancelFunc)
ctx.Done() returns a <-chan struct{} that closes when the context is done.
select {
case <-ctx.Done():
	// context is done!
	return
case result := <-data:
	// data arrived on the channel
	_ = result // use the result here
default:
	// if not using a data channel, use a default clause instead
}
ctx.Err()
If Done is not yet closed, Err returns nil. If Done is closed, Err returns a non-nil error explaining why: Canceled if the context was canceled or DeadlineExceeded if the context's deadline passed. After Err returns a non-nil error, successive calls to Err return the same error.
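Putting WithCancel, WithTimeout, Done and Err together, the sketch below waits for either data or the end of the context. The helper names (waitFor, demoTimeout, demoCancel, demoData) and the durations are arbitrary choices for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitFor blocks until data arrives or the context is done,
// whichever happens first.
func waitFor(ctx context.Context, data <-chan string) (string, error) {
	select {
	case <-ctx.Done():
		return "", ctx.Err() // Canceled or DeadlineExceeded
	case v := <-data:
		return v, nil
	}
}

// demoTimeout waits on a channel nobody writes to, so the timeout fires.
func demoTimeout() error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel() // always release the context's resources
	_, err := waitFor(ctx, make(chan string))
	return err
}

// demoCancel cancels the context before waiting.
func demoCancel() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := waitFor(ctx, make(chan string))
	return err
}

// demoData has a value ready well before the deadline.
func demoData() (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	data := make(chan string, 1)
	data <- "ready"
	return waitFor(ctx, data)
}

func main() {
	fmt.Println(demoTimeout()) // context deadline exceeded
	fmt.Println(demoCancel())  // context canceled
	fmt.Println(demoData())    // ready <nil>
}
```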
Customer Identification to fight against financial crime, money laundering, and fraud.
https://en.wikipedia.org/wiki/Database_normalization
https://en.wikipedia.org/wiki/Denormalization
Strategy used on a previously normalised database, aims to
Simple forms of denormalisation
Article: The complete guide to System Design: Short introductions to a lot of topics, but with links to more in-depth content (some of which are paid content).
Article: The Practical Test Pyramid: Talks about various types of testing and how to effectively utilise them.