- Recall that our CPU reads from and writes to RAM, disk, and cache → the cache is the fastest, but it can't store that much data
- Data access hierarchy from fastest to slowest: Cache, RAM, Disk, Network (e.g. API calls)
- Caching is simply the process of making copies of data.
- Caching is also used in distributed systems, where we have lots of computers!
- Caching essentially takes a copy of data from RAM/disk and puts a subset of it in the cache so that access is faster, which lowers latency and increases throughput
- Even though the cache outperforms disk and RAM in terms of speed, it does have a downside: its capacity is restricted, usually to just kilobytes or megabytes. This limitation poses a considerable challenge. As a result, the system managing the cache must decide carefully which data to keep in it.
- So how does it work in practice?
- Well, every time a website has static content (e.g. JS, HTML, CSS files), the client checks whether that data is already on disk (the cache; an entry lasts about 3600 seconds → 60 mins). If it is, that's a cache hit → the website loads quickly. If it's a cache miss → i.e. the data is not there or has expired, the client makes a call over the network to retrieve it (slower)
- When a cached file is found, this is known as a cache hit. When a cached file is not found, this is known as a cache miss.
- Cache hit ratio → hits / (hits + misses) → we want the ratio to be as high as possible
- We try to use Cache whenever we can, because of the speed it provides
- How does Caching work for servers?
Question: The difference between 123 ms and 11 ms is rather minuscule, so why even bother enabling the cache?
Answer: Because that is only one file. If we have 100 resources and each takes an additional 100 ms
to load, the cost starts to add up. This negatively impacts the user
experience, and even small delays in page load time can lead to users
abandoning a website.
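To make the "adds up" point concrete, here is a rough back-of-the-envelope sketch using the 123 ms and 11 ms figures from the question. It assumes the 100 resources load one after another, which is a simplification (real browsers fetch several resources in parallel), and the names `UNCACHED_MS`, `CACHED_MS`, and `total_load_ms` are made up for illustration.

```python
# Per-resource load times taken from the question above.
UNCACHED_MS = 123
CACHED_MS = 11
RESOURCE_COUNT = 100  # the "100 resources" from the answer


def total_load_ms(per_resource_ms: int, count: int) -> int:
    """Total time if resources load strictly one after another (an assumption)."""
    return per_resource_ms * count


uncached_total = total_load_ms(UNCACHED_MS, RESOURCE_COUNT)
cached_total = total_load_ms(CACHED_MS, RESOURCE_COUNT)
saved_ms = uncached_total - cached_total

print(f"without cache: {uncached_total} ms")  # 12300 ms
print(f"with cache:    {cached_total} ms")    # 1100 ms
print(f"saved:         {saved_ms} ms")        # 11200 ms
```

Under that sequential assumption, a 112 ms difference per file grows to roughly 11 seconds across the page.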
The client's perspective
When a browser needs to load a resource, such as an image file, it
follows a sequence of steps to determine where to get the file:
- Check the Memory Cache: The browser first checks its memory cache. This holds resources downloaded in the current
browsing session (since memory is non-persistent).
- Check the Disk Cache: If the resource isn't in
the memory cache, the browser checks the disk cache, a more persistent
cache that contains resources from sites visited in the past.
- Network Request: If the resource isn't in either the memory cache or the disk cache, the browser makes a network request to the server hosting the resource.
It should also be noted that data can be in the cache and still result
in a cache miss: the cached copy might be stale because its entry has
expired.
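The lookup order and the expiry rule can be sketched in a few lines of Python. This is only a toy model, not how any real browser implements its cache: `TieredCache`, `Entry`, and `fake_fetch` are hypothetical names, two dictionaries stand in for the memory and disk caches, and the 3600-second lifetime is the example value from these notes. It also counts hits and misses so the hit ratio, hits / (hits + misses), can be computed.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    value: bytes
    expires_at: float  # clock time after which the entry counts as stale


class TieredCache:
    """Toy model: memory cache, then disk cache, then network."""

    def __init__(self, fetch: Callable[[str], bytes],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.memory: Dict[str, Entry] = {}
        self.disk: Dict[str, Entry] = {}  # stand-in for a real on-disk cache
        self.fetch = fetch                # the slow "network request"
        self.ttl = ttl_seconds
        self.clock = clock
        self.hits = 0
        self.misses = 0

    def _fresh(self, tier: Dict[str, Entry], key: str) -> Optional[Entry]:
        """Return the entry if present and not expired; evict it if stale."""
        entry = tier.get(key)
        if entry is None:
            return None
        if entry.expires_at <= self.clock():
            del tier[key]  # cached, but expired: treated as a miss
            return None
        return entry

    def get(self, key: str) -> bytes:
        entry = self._fresh(self.memory, key)      # 1. memory cache
        if entry is None:
            entry = self._fresh(self.disk, key)    # 2. disk cache
            if entry is not None:
                self.memory[key] = entry           # keep a copy in memory
        if entry is not None:
            self.hits += 1
            return entry.value
        self.misses += 1
        value = self.fetch(key)                    # 3. network request
        fresh = Entry(value, self.clock() + self.ttl)
        self.memory[key] = fresh
        self.disk[key] = fresh
        return value

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Usage with a fake clock and a fake network call, so the run is deterministic.
now = [0.0]
network_calls = []


def fake_fetch(url: str) -> bytes:
    network_calls.append(url)
    return b"contents of " + url.encode()


cache = TieredCache(fake_fetch, ttl_seconds=3600, clock=lambda: now[0])
cache.get("/app.js")   # miss: nothing cached yet, goes to the network
cache.get("/app.js")   # hit: served from the memory cache
now[0] = 3601          # more than 60 minutes later
cache.get("/app.js")   # miss: the entry is still stored but has expired
print(cache.hits, cache.misses, len(network_calls))  # 1 2 2
```

The third call shows the case described above: the data was cached, but because its lifetime had passed it still counted as a miss and triggered a second network request.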