Proxies

When people just say "proxy", they usually mean a forward proxy
Think of it like a friend (the middle layer, i.e. the proxy) who hides and protects your identity and does something on your behalf
Forward Proxy → Sits in between the client and the server (or between a set of clients and a set of servers) and communicates with the server on behalf of the client
- If the server doesn’t have a reverse proxy set up, the server just sends its answer back to the proxy, and the proxy passes it on to us, the client
- This is roughly how VPNs work → They sit between the client and the server to hide the client’s identity
In summary, a forward proxy server serves as an intermediary in network communication:
1. Providing privacy (i.e. the server sees the proxy’s IP, not yours)
2. Increasing efficiency through caching
3. Allowing control over network traffic
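The three points above can be sketched as a toy model in Python. This is not a real network proxy: `ForwardProxy`, `origin_server`, the blocked host, and the IP addresses are all made-up names for illustration, and the "server" is just a function that records which IP it saw.

```python
from urllib.parse import urlparse

seen_ips = []  # what the origin server observes (hypothetical stand-in for a real server)

def origin_server(source_ip, url):
    seen_ips.append(source_ip)
    return f"content of {url}"

class ForwardProxy:
    """Toy forward proxy: no real networking, names are assumptions."""

    def __init__(self, proxy_ip, origin, blocked_hosts=()):
        self.proxy_ip = proxy_ip
        self.origin = origin                    # callable(source_ip, url) -> str
        self.blocked_hosts = set(blocked_hosts)
        self.cache = {}

    def fetch(self, client_ip, url):
        # 3. Control: refuse hosts the admin has blocked.
        if urlparse(url).hostname in self.blocked_hosts:
            return "blocked by proxy policy"
        # 2. Caching: only contact the origin on a cache miss.
        if url not in self.cache:
            # 1. Privacy: the origin sees the proxy's IP, never client_ip.
            self.cache[url] = self.origin(self.proxy_ip, url)
        return self.cache[url]

proxy = ForwardProxy("198.51.100.1", origin_server, blocked_hosts={"blocked.example"})
first = proxy.fetch("10.0.0.5", "http://site.example/page")
second = proxy.fetch("10.0.0.6", "http://site.example/page")  # served from cache
denied = proxy.fetch("10.0.0.5", "http://blocked.example/")
```

After these three calls the origin has been contacted once, and only ever with the proxy's address.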
Reverse Proxy → Sitting between server and clients and acts on behalf of servers, typically used for logging, load balancing, or caching.
- When a user sends a request to the server, the PROXY actually intercepts it (the client doesn’t know it’s talking to the proxy), filters the request based on some logic (defined by the server admin), and then forwards it to the server if necessary
- For example, suppose a request comes in from an ID that is known to belong to a hacker; the proxy will intercept and deny it automatically before it ever reaches the server (hypothetical example)
- The most useful case of reverse proxies is load balancing: depending on how the servers are doing (capacity-wise), send the request to the server that has the most spare capacity, etc.
- Nginx is a very popular web server that’s often used as a reverse proxy and load balancer!
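A minimal sketch of the "intercept, filter, then forward" behaviour, assuming a made-up blocklist of client IDs. `ReverseProxy`, `app_server`, and `"known-bad-id"` are hypothetical names; a real deployment would configure this in something like Nginx rather than hand-roll it.

```python
class ReverseProxy:
    """Toy reverse proxy: logs every request and filters by a blocklist."""

    def __init__(self, backend, blocked_ids):
        self.backend = backend            # callable(path) -> str, stands in for the server
        self.blocked_ids = set(blocked_ids)
        self.access_log = []

    def handle(self, client_id, path):
        allowed = client_id not in self.blocked_ids
        self.access_log.append((client_id, path, allowed))  # logging
        if not allowed:
            return 403, "Forbidden"       # denied before reaching the server
        return 200, self.backend(path)    # forwarded on the client's behalf

def app_server(path):
    return f"response for {path}"

rp = ReverseProxy(app_server, blocked_ids={"known-bad-id"})
ok = rp.handle("alice", "/home")
rejected = rp.handle("known-bad-id", "/admin")
```

From the client's point of view there is only one endpoint; whether `app_server` was ever called is invisible to it.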
Reverse and Forward Proxy Diagram

Load Balancers

Works as a type of reverse proxy: it redirects clients’ requests / traffic based on some logic the server admin has configured in the load balancer.

After you throw in a load balancer, you can then horizontally scale those servers as needed.

How exactly do we distribute that traffic evenly though?

Many algorithms to do this! (Round Robin, Weighted Round Robin, User Location, Least Number of Connections, Layer 4 and Layer 7 Load Balancing)

Algorithms for Load Distribution

Round Robin → Essentially just send each request to the next server in sequence as requests come in, and wrap around to the first server after the last one. I.e. suppose you have 3 servers: Req1 → Server 1, Req2 → Server 2, Req3 → Server 3, and then Req4 → Server 1, …, and so on.
- Basically just go in cycles, but it’s possible some of the servers are less powerful than others (i.e. suppose you have 100 servers, but 25% only have 8 GB of RAM, while the other 75% have 1 TB of RAM; note they also need a better CPU, motherboard, etc. to utilize that much RAM). Plain round robin ignores this and sends every server the same share.
- When we refer to "computational power", we mean the server's CPU processing power, RAM, storage, and network bandwidth. Computational power influences how many requests a server can handle and how quickly it can process and respond to them. With Weighted Round Robin (WRR, algorithm number 2), the distribution of requests is proportional to the resources of each server, which uses the available resources more efficiently.
- The above caveat gives rise to algorithm number 2 below!
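The plain round robin cycle from the 3-server example can be sketched with `itertools.cycle` from the standard library (the server names are just the ones used in the example above):

```python
from itertools import cycle

servers = ["Server 1", "Server 2", "Server 3"]
next_server = cycle(servers)  # endless iterator: 1, 2, 3, 1, 2, 3, ...

# Hand out four incoming requests in arrival order.
assignments = {f"Req{i}": next(next_server) for i in range(1, 5)}
print(assignments)
# {'Req1': 'Server 1', 'Req2': 'Server 2', 'Req3': 'Server 3', 'Req4': 'Server 1'}
```

Note that nothing here looks at how powerful or how busy each server is, which is exactly the caveat above.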
Weighted Round Robin → Each server gets requests in proportion to how much it has been configured to handle (i.e. in the diagram below, 50% of the requests should go to server 1, and 25% each to servers 2 & 3)
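One simple way to get the 50% / 25% / 25% split is to repeat each server in the cycle as many times as its weight. This is only a sketch under that assumption (the weights 2, 1, 1 and the function name `weighted_cycle` are made up here); production load balancers typically interleave the picks more smoothly so that one server doesn't get short bursts.

```python
from collections import Counter
from itertools import cycle

# Hypothetical weights: server1 is configured to take twice as much traffic.
weights = {"server1": 2, "server2": 1, "server3": 1}

def weighted_cycle(weights):
    # Expand to [server1, server1, server2, server3] and loop over it forever.
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)

picker = weighted_cycle(weights)
counts = Counter(next(picker) for _ in range(100))
print(counts["server1"], counts["server2"], counts["server3"])  # 50 25 25
```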
Least Number of Connections → Send requests to the server that is currently dealing with the LEAST number of active connections.
- This is specifically useful for when we don’t know how long each user request is going to be (high variability)
- The least connections method works by assigning new requests to the server with the fewest active connections. This proves advantageous in scenarios where servers may complete tasks at different rates.
- Example: E-commerce website during Black Friday; some shoppers are just browsing forever, whereas others are adding items to cart, checking out, etc. So it’s good to distribute load based on the number of connections per server in this case
- By dynamically distributing requests to servers with the fewest active connections, the least connections method helps prevent imbalances and optimizes resource utilization within a load-balanced environment.
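A minimal sketch of least connections, assuming the balancer itself keeps a counter of open connections per server (`LeastConnectionsBalancer`, `acquire`, and `release` are hypothetical names). Ties go to whichever server was listed first.

```python
class LeastConnectionsBalancer:
    def __init__(self, servers):
        self.active = {server: 0 for server in servers}  # open connections per server

    def acquire(self):
        # Pick the server with the fewest active connections right now.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection finishes, however long it took.
        self.active[server] -= 1

lb = LeastConnectionsBalancer(["a", "b", "c"])
picks = [lb.acquire() for _ in range(3)]  # one connection each: a, b, c
lb.release("b")                           # b's shopper checks out quickly
next_pick = lb.acquire()                  # b is now the least loaded
```

Unlike round robin, the next pick depends on which connections have already finished, so a server stuck with long-lived sessions naturally receives fewer new ones.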
User Location → If users are geographically located closer to a specific server, we could route them to the server that is closest to them

Under this strategy, we want to ensure that users receive data from the nearest server, minimizing the distance that data needs to travel. This approach helps reduce latency and provides a faster and smoother user experience.

If our servers A, B, and C are located in Asia, Europe, and North America, respectively, the load balancer uses the user's IP address to determine their location. Based on this information, the load balancer routes the user's request to the server that is geographically closest to them. This helps optimize response times and reduces latency by delivering the content from the server in close proximity to the user.
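A rough sketch of the A / B / C example, assuming we already have some way to turn an IP into coordinates. The `GEOIP` table, the example IPs, and the server coordinates (roughly Singapore, Frankfurt, and Virginia) are all made up for illustration; a real load balancer would use a GeoIP database or DNS-based routing instead.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical server locations as (latitude, longitude).
SERVERS = {"A": (1.35, 103.82), "B": (50.11, 8.68), "C": (39.04, -77.49)}

# Hypothetical IP -> location lookup (Tokyo, Paris, Chicago).
GEOIP = {
    "203.0.113.7": (35.68, 139.69),
    "198.51.100.9": (48.86, 2.35),
    "192.0.2.44": (41.88, -87.63),
}

def haversine_km(a, b):
    # Great-circle distance between two (lat, lon) points in kilometres.
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

def nearest_server(ip):
    user = GEOIP[ip]
    return min(SERVERS, key=lambda name: haversine_km(user, SERVERS[name]))

routes = {ip: nearest_server(ip) for ip in GEOIP}
```

With these sample points the Tokyo user goes to A, the Paris user to B, and the Chicago user to C.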
Layer 4 and Layer 7 → Load balancers that operate at different layers of the network stack
- Recall the OSI model, which provides a standard for different computers to communicate w/ each other (universal language for computer networking, by splitting up the communication into seven layers)
- Here, a Layer 4 load balancer works at the Transport layer, which is responsible for end-to-end communication using TCP, UDP, etc. Layer 4 load balancers route traffic based on network- and transport-level data, such as IP address and TCP port, rather than the contents of the requests. This makes Layer 4 load balancing fast and straightforward.
  - Faster because we only look at the IP address and port to balance (using location-based or round robin routing), but we can’t see the data itself, so that’s a con
- Layer 7 (Application) Load Balancers → We can see the data itself, we can intelligently route user requests based on what type of request it is.
  - For example, having a server deal with tweets, another with user authentication, another with browsing new content, etc.
- The downside of Layer 7 is that it needs MULTIPLE connections: one TCP connection from Client → Load Balancer and another TCP connection from Load Balancer → Server. Terminating the client’s connection at the load balancer is what lets it see / decrypt each request
- Whereas a Layer 4 load balancer just forwards the packets to one of the servers and relays the response back
- Layer 7 → Slower, but much more flexible
- Layer 4 → Faster, but much more restricted
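The difference in what each kind of balancer is allowed to look at can be sketched like this. Everything here is an assumption for illustration: the server names, the URL prefixes in `ROUTES`, and the choice of hashing the client address for the Layer 4 pick (round robin would work just as well).

```python
import zlib

SERVERS = ["tweet-server", "auth-server", "browse-server"]

def layer4_pick(client_ip, client_port):
    # Layer 4 only sees addresses and ports, never the request contents.
    key = f"{client_ip}:{client_port}".encode()
    return SERVERS[zlib.crc32(key) % len(SERVERS)]

# Layer 7 can read the request itself, e.g. the HTTP path.
ROUTES = {"/tweets": "tweet-server", "/auth": "auth-server", "/explore": "browse-server"}

def layer7_pick(http_path):
    for prefix, server in ROUTES.items():
        if http_path.startswith(prefix):
            return server
    return "browse-server"  # fallback for anything unrecognised

l7_choice = layer7_pick("/auth/login")
l4_choice = layer4_pick("203.0.113.7", 51000)
```

`layer7_pick` can send logins to the auth server because it reads the path; `layer4_pick` has no idea what the request is about and can only spread connections around.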

What happens if you only have 1 load balancer? What if it goes down itself?

The easiest thing is to have MULTIPLE load balancers, or have a BACKUP load balancer!
They usually have very high throughput as they aren’t doing much, just forwarding requests back and forth
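The backup idea can be sketched as trying each load balancer in order and using the first healthy one. The `LoadBalancer` class, the `healthy` flag, and the `route` function are hypothetical; in practice the health checking and failover usually happen at the DNS or network level rather than in application code.

```python
class LoadBalancer:
    def __init__(self, name):
        self.name = name
        self.healthy = True  # would be set by a health check in a real system

    def forward(self, request):
        return f"{self.name} forwarded {request}"

def route(request, balancers):
    # Use the first load balancer that is still up.
    for lb in balancers:
        if lb.healthy:
            return lb.forward(request)
    raise RuntimeError("no healthy load balancer available")

primary, backup = LoadBalancer("lb-primary"), LoadBalancer("lb-backup")
before = route("Req1", [primary, backup])
primary.healthy = False                    # simulate the primary going down
after = route("Req2", [primary, backup])
```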

Consistent Hashing

Recall that with load balancers, we can use multiple algorithms to decide how to route traffic to a specific server (i.e. round robin, layer 4, layer 7, user location, etc.)