Caching Strategies for Mobile Apps
I was once debugging a feed-loading issue on an Android app. Users were complaining that the app showed stale content from three days ago, but only sometimes, and only after they reopened the app from the background. The root cause? Our in-memory cache was getting wiped during OS-initiated process death, and the disk cache fallback had a broken TTL check that never expired entries. Two caching layers, both failing in different ways, combining into a bug that was nearly impossible to reproduce on a developer's device.
Caching comes up in every single mobile system design interview. But most candidates treat it as an afterthought — "and then we'll add a cache." That's not enough. Interviewers want to see that you understand which cache, where in the data flow, how it gets invalidated, and what happens when the OS kills your app mid-write. This article will get you there.
Why Caching Is Different on Mobile
If you've worked on backend systems, forget most of what you know about caching. Server-side caches run on machines with gigabytes of RAM, persistent processes, and reliable storage. Mobile is a different world.
The OS is hostile to your cache. On iOS, the system can terminate your app's process at any time when it's in the background. On Android, the OS aggressively kills background processes under memory pressure. Your carefully populated in-memory cache? Gone. No warning, no callback, no graceful shutdown.
Storage is shared and limited. Your app doesn't own the device. The user has photos, music, and other apps, all competing for the same storage. iOS can purge your app's Caches directory at any time when the device runs low on disk space. Android behaves similarly: the system may delete files in your app's cache directory (context.cacheDir) when storage runs low.
Memory pressure is real. An iPhone might have 6GB of RAM, but your app gets a fraction of that. Hold too much in memory, and the OS will kill you. didReceiveMemoryWarning on iOS isn't a suggestion — it's a threat.
Network is unreliable and expensive. Users switch between Wi-Fi and cellular. They go through tunnels. They have data caps. A good caching strategy directly reduces network usage and makes your app feel fast even on a 3G connection.
Interview Tip: When discussing caching in an interview, start by acknowledging these constraints. Saying "mobile caching is tricky because the OS can kill your process and wipe your in-memory state at any time" immediately signals that you understand the platform.
Memory Cache vs Disk Cache
A typical mobile caching architecture has two layers: memory (fast, volatile) and disk (slower, persistent). Understanding when to use each, and how they interact, is foundational.
| | Memory Cache | Disk Cache |
|---|---|---|
| Speed | Nanoseconds | Milliseconds |
| Survives app kill | No | Yes |
| Survives device reboot | No | Yes |
| Size limit | ~50-100MB practical max | Hundreds of MB, but OS can purge |
| Best for | Decoded images, parsed models, hot data | Serialized responses, image files, large datasets |
| iOS tooling | NSCache, Dictionary | FileManager, Core Data, SQLite |
| Android tooling | LruCache, HashMap | Room, SQLite, File storage |
| Eviction | Automatic under memory pressure (NSCache) or manual | Manual, or OS purge on low disk |
Figure (interactive): what process death actually does. The same three entries (user_42, a parsed model; feed_page_1, an API response; avatar_a8f.jpg, a decoded image) live in both tiers until the OS needs your RAM.
Memory Cache
On iOS, NSCache is your best friend. It automatically evicts entries under memory pressure, it's thread-safe, and unlike a dictionary it doesn't copy its keys. On Android, LruCache gives you an LRU eviction policy with a configurable size limit.
// iOS: Simple memory cache using NSCache
final class MemoryCache<Key: Hashable, Value> {
private let cache = NSCache<WrappedKey, WrappedValue>()
init(countLimit: Int = 100) {
cache.countLimit = countLimit
}
func get(_ key: Key) -> Value? {
cache.object(forKey: WrappedKey(key))?.value
}
func set(_ value: Value, forKey key: Key) {
cache.setObject(WrappedValue(value), forKey: WrappedKey(key))
}
func remove(_ key: Key) {
cache.removeObject(forKey: WrappedKey(key))
}
// NSCache keys and values must be class instances; keys subclass NSObject for hash/isEqual
private class WrappedKey: NSObject {
let key: Key
init(_ key: Key) { self.key = key }
override var hash: Int { key.hashValue }
override func isEqual(_ object: Any?) -> Bool {
guard let other = object as? WrappedKey else { return false }
return key == other.key
}
}
private class WrappedValue {
let value: Value
init(_ value: Value) { self.value = value }
}
}
// Android: Memory cache using LruCache
class MemoryCache<K, V>(maxSize: Int) {
private val cache = object : LruCache<K, V>(maxSize) {
override fun sizeOf(key: K & Any, value: V & Any): Int {
// Override this to measure actual size in KB
return 1
}
}
fun get(key: K): V? = cache.get(key)
fun put(key: K, value: V) {
cache.put(key, value)
}
fun remove(key: K) {
cache.remove(key)
}
fun clear() {
cache.evictAll()
}
}
Disk Cache
Disk caches persist across app restarts. Use them for data that's expensive to re-fetch: API responses, images, computed results. The trade-off is speed — reading from disk involves I/O, deserialization, and potentially database queries.
On iOS, write to the Caches directory (the OS can purge it, but it won't be backed up to iCloud). On Android, use context.cacheDir for the same semantics.
Interview Tip: Always mention the two-tier approach in interviews: "I'd use an in-memory LRU cache for hot data backed by a disk cache for persistence. On a cache miss in memory, we check disk before going to the network."
Cache-Aside (Lazy Loading)
Cache-aside is the most common caching pattern on mobile. The application code manages the cache explicitly: check the cache first, fetch from the network on a miss, then populate the cache.
The flow is simple:
- Check memory cache
- On miss, check disk cache
- On miss, fetch from network
- Store result in both memory and disk cache
- Return the data
Figure (interactive): cache-aside, step by step. One request travels from the app UI through memory (~ns), disk (~ms), and network (100s of ms).
// iOS: Cache-aside pattern with two-tier caching
final class UserRepository {
private let memoryCache = MemoryCache<String, UserProfile>()
private let diskCache: DiskCache<UserProfile>
private let apiClient: APIClient
init(apiClient: APIClient, diskCache: DiskCache<UserProfile>) {
self.apiClient = apiClient
self.diskCache = diskCache
}
func getUser(id: String) async throws -> UserProfile {
// 1. Check memory cache
if let cached = memoryCache.get(id) {
return cached
}
// 2. Check disk cache (a failed read is treated as a miss, not a fatal error)
if let diskCached = try? diskCache.read(key: id) {
memoryCache.set(diskCached, forKey: id)
return diskCached
}
// 3. Fetch from network
let user = try await apiClient.fetchUser(id: id)
// 4. Populate both caches (a failed disk write should not fail the request)
memoryCache.set(user, forKey: id)
try? diskCache.write(user, key: id)
return user
}
}
// Android: Cache-aside with coroutines
class UserRepository(
private val memoryCache: MemoryCache<String, UserProfile>,
private val userDao: UserDao,
private val apiService: ApiService
) {
suspend fun getUser(id: String): UserProfile {
// 1. Check memory cache
memoryCache.get(id)?.let { return it }
// 2. Check disk cache (Room)
userDao.getById(id)?.let { entity ->
val profile = entity.toUserProfile()
memoryCache.put(id, profile)
return profile
}
// 3. Fetch from network
val response = apiService.getUser(id)
val profile = response.toUserProfile()
// 4. Populate both caches
memoryCache.put(id, profile)
userDao.insert(profile.toEntity())
return profile
}
}
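This pattern is straightforward, but watch out for a common mistake: duplicate requests on a cold cache (a small-scale thundering herd). If 10 UI components all request the same user at the same time, all 10 miss the cache and you get 10 network calls. Deduplicate in-flight requests by keeping a map from key to the pending work: an actor that stores in-flight Task values on iOS, or a Mutex-guarded map of Deferred values (a single-flight helper) on Android.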
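One way to deduplicate in-flight requests on Android is a small single-flight helper. The sketch below is an illustration under stated assumptions, not a library API: `SingleFlight`, its `run` method, and the default scope are hypothetical names, and it assumes kotlinx.coroutines is on the classpath.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.NonCancellable
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.async
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
import kotlinx.coroutines.withContext

/** Hypothetical helper: concurrent callers asking for the same key share one load. */
class SingleFlight<K, V>(
    private val scope: CoroutineScope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
) {
    private val mutex = Mutex()
    private val inFlight = mutableMapOf<K, Deferred<V>>()

    suspend fun run(key: K, load: suspend () -> V): V {
        val shared = mutex.withLock {
            // Reuse the pending load for this key, or start a new one.
            inFlight.getOrPut(key) {
                scope.async {
                    try {
                        load()
                    } finally {
                        // Always unregister, even if the load failed or was cancelled.
                        withContext(NonCancellable) {
                            mutex.withLock { inFlight.remove(key) }
                        }
                    }
                }
            }
        }
        return shared.await()
    }
}
```

In the repository above, the network step could then be wrapped as `singleFlight.run(id) { apiService.getUser(id).toUserProfile() }`, so ten simultaneous cache misses for the same id produce one request. A failed load is reported to every waiting caller, and the next call starts a fresh attempt.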
Write-Through and Write-Behind
Cache-aside handles reads. But what about writes? Two patterns dominate here, and the right choice depends on your consistency requirements.
Write-Through
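With write-through, every write goes to both the cache and the backend as part of the same operation. The write isn't considered complete until both succeed. On mobile that usually means writing to the backend first and updating the cache only once it confirms. This gives you strong consistency, but writes are slower because you're waiting on the network.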
When to use it: Banking apps, payment flows, anything where showing stale data could cause real harm. If a user transfers money, you need the backend to confirm before updating the local state.
// Android: Write-through for a banking app
suspend fun transferFunds(from: Account, to: Account, amount: BigDecimal) {
// Write to backend first — this is the source of truth
val result = apiService.transfer(
fromId = from.id,
toId = to.id,
amount = amount
)
// Only update cache after backend confirms
val updatedFrom = from.copy(balance = from.balance - amount)
val updatedTo = to.copy(balance = to.balance + amount)
memoryCache.put(from.id, updatedFrom)
memoryCache.put(to.id, updatedTo)
accountDao.update(updatedFrom.toEntity())
accountDao.update(updatedTo.toEntity())
}
// iOS: Write-through for a banking app
func transferFunds(from: Account, to: Account, amount: Decimal) async throws {
// Write to backend first — this is the source of truth
let result = try await apiClient.transfer(
fromId: from.id,
toId: to.id,
amount: amount
)
// Only update cache after backend confirms
let updatedFrom = from.withBalance(from.balance - amount)
let updatedTo = to.withBalance(to.balance + amount)
memoryCache.set(updatedFrom, forKey: from.id)
memoryCache.set(updatedTo, forKey: to.id)
try diskCache.write(updatedFrom, key: from.id)
try diskCache.write(updatedTo, key: to.id)
}
Write-Behind (Write-Back)
With write-behind, you update the cache immediately and return to the caller. The backend write happens asynchronously, often batched. This gives instant UI feedback but introduces eventual consistency.
When to use it: Social media likes, reactions, draft saves, analytics events. The user taps "like" and sees the heart fill immediately. The actual API call can happen a second later, or be batched with other pending writes.
// Android: Write-behind for social media likes
class LikeRepository(
private val likeDao: LikeDao,
private val apiService: ApiService,
private val workManager: WorkManager
) {
suspend fun toggleLike(postId: String, isLiked: Boolean) {
// 1. Update cache immediately — UI responds instantly
likeDao.upsert(LikeEntity(postId = postId, isLiked = isLiked, synced = false))
// 2. Schedule background sync — batched, retried on failure
val workRequest = OneTimeWorkRequestBuilder<LikeSyncWorker>()
.setConstraints(
Constraints.Builder()
.setRequiredNetworkType(NetworkType.CONNECTED)
.build()
)
.build()
workManager.enqueueUniqueWork(
"like_sync_$postId",
ExistingWorkPolicy.REPLACE,
workRequest
)
}
}
// iOS: Write-behind for social media likes
final class LikeRepository {
private let likeStore: LikeStore // local persistence (Core Data / SQLite)
private let apiClient: APIClient
init(likeStore: LikeStore, apiClient: APIClient) {
self.likeStore = likeStore
self.apiClient = apiClient
}
func toggleLike(postId: String, isLiked: Bool) async throws {
// 1. Update cache immediately — UI responds instantly
try await likeStore.upsert(
PendingLike(postId: postId, isLiked: isLiked, synced: false)
)
// 2. Schedule background sync — batched, retried on failure
let request = BGProcessingTaskRequest(identifier: "com.app.likeSync")
request.requiresNetworkConnectivity = true
try BGTaskScheduler.shared.submit(request)
}
}
Figure (interactive): one like tap, two write strategies. Write-through (strong consistency): the cache is written only after the server confirms. Write-behind (eventual consistency): the cache is written first and the backend sync is deferred.
Interview Tip: When an interviewer asks about caching for writes, name the pattern explicitly. "For the like button, I'd use write-behind caching — update the local state immediately so the UI is responsive, then sync to the backend asynchronously using WorkManager on Android or BGTaskScheduler on iOS." That kind of precision stands out.
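The batching side of write-behind can be sketched without any platform APIs. The example below is a simplified, in-memory illustration: `PendingLike`, `LikeWriteBehindQueue`, and the `sendBatch` callback are hypothetical names, and a production version would persist pending writes to disk (as the repositories above do) so they survive process death.

```kotlin
/** One pending like state waiting to be synced. */
data class PendingLike(val postId: String, val isLiked: Boolean)

/**
 * Hypothetical in-memory write-behind queue. `sendBatch` stands in for the real
 * API call and returns true when the backend accepted the batch.
 */
class LikeWriteBehindQueue(private val sendBatch: (List<PendingLike>) -> Boolean) {
    private val lock = Any()
    private val pending = LinkedHashMap<String, PendingLike>()

    /** Called on tap: records the latest state and returns immediately. */
    fun record(postId: String, isLiked: Boolean) {
        synchronized(lock) {
            pending.remove(postId) // re-insert so ordering follows the latest tap
            pending[postId] = PendingLike(postId, isLiked)
        }
    }

    /** Called by the background sync: returns how many writes were confirmed. */
    fun flush(): Int {
        val batch = synchronized(lock) { pending.values.toList() }
        if (batch.isEmpty() || !sendBatch(batch)) return 0
        synchronized(lock) {
            // Keep entries the user changed again while the batch was in flight.
            for (sent in batch) {
                if (pending[sent.postId] == sent) pending.remove(sent.postId)
            }
        }
        return batch.size
    }

    fun pendingCount(): Int = synchronized(lock) { pending.size }
}
```

Keeping only the latest state per post means a rapid like, unlike, like sequence collapses into a single write, and a failed flush leaves everything queued for the next attempt.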
Cache Invalidation
"There are only two hard things in computer science: cache invalidation and naming things." — Phil Karlton
It's a cliché because it's true. A cache that never expires serves stale data forever. A cache that expires too aggressively defeats the purpose of caching. Here are the three strategies that matter on mobile.
TTL-Based (Time-To-Live)
The simplest approach: each cached entry has an expiration timestamp. After that time, the entry is considered stale and must be re-fetched.
// Android: TTL-based cache entry
data class CacheEntry<T>(
val value: T,
val cachedAtMillis: Long,
val ttlMillis: Long
) {
val isExpired: Boolean
get() = System.currentTimeMillis() - cachedAtMillis > ttlMillis
}
// Usage in repository
suspend fun getUser(id: String): UserProfile {
userDao.getEntry(id)?.let { entry ->
if (!entry.isExpired) return entry.value
}
val user = apiService.fetchUser(id)
userDao.upsert(
CacheEntry(
value = user,
cachedAtMillis = System.currentTimeMillis(),
ttlMillis = 300_000L // 5 min TTL
)
)
return user
}
// iOS: TTL-based cache entry
struct CacheEntry<T: Codable>: Codable {
let value: T
let cachedAt: Date
let ttlSeconds: TimeInterval
var isExpired: Bool {
Date().timeIntervalSince(cachedAt) > ttlSeconds
}
}
// Usage in repository
func getUser(id: String) async throws -> UserProfile {
if let entry: CacheEntry<UserProfile> = try diskCache.read(key: id),
!entry.isExpired {
return entry.value
}
let user = try await apiClient.fetchUser(id: id)
let entry = CacheEntry(value: user, cachedAt: Date(), ttlSeconds: 300) // 5 min TTL
try diskCache.write(entry, key: id)
return user
}
TTL works well when staleness is tolerable within a known window. User profiles? 5 minutes is fine. Stock prices? 10 seconds or less. A feed? Maybe 60 seconds.
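TTL checks based on the wall clock have one trap worth showing: if the device clock moves backwards, a naive "now minus cachedAt" comparison can keep an entry alive far longer than intended. The sketch below is a minimal in-memory illustration; `TtlCache` and its injected `now` function are hypothetical names chosen so the expiry logic can be tested without waiting.

```kotlin
/**
 * Hypothetical in-memory TTL cache. `now` is injected so tests can control time;
 * it defaults to the wall clock, which the user can change.
 */
class TtlCache<K, V>(
    private val ttlMillis: Long,
    private val now: () -> Long = { System.currentTimeMillis() }
) {
    private class Entry<V>(val value: V, val cachedAtMillis: Long)

    private val entries = HashMap<K, Entry<V>>()

    @Synchronized
    fun put(key: K, value: V) {
        entries[key] = Entry(value, now())
    }

    @Synchronized
    fun get(key: K): V? {
        val entry = entries[key] ?: return null
        val age = now() - entry.cachedAtMillis
        // A negative age means the clock moved backwards; treat the entry as
        // expired instead of letting it live longer than the TTL.
        if (age < 0 || age > ttlMillis) {
            entries.remove(key)
            return null
        }
        return entry.value
    }
}
```

With `TtlCache<String, UserProfile>(ttlMillis = 300_000L)`, a `null` from `get` is the signal to go to the network and call `put` with the fresh value.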
Event-Based Invalidation
Instead of guessing when data becomes stale, the server tells you. This is common with WebSocket connections or push notifications.
For example, in a chat app, when a message is edited on another device, the server pushes an event. The client receives it and invalidates the cached version of that message. This gives you real-time consistency without polling.
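On the client, event-based invalidation comes down to mapping each server event to the cache entries it makes stale. The sketch below assumes a hypothetical event model (`ServerEvent`) and cache (`MessageCache`); the real event names and payloads depend on your backend.

```kotlin
import java.util.concurrent.ConcurrentHashMap

data class CachedMessage(val id: String, val conversationId: String, val text: String)

/** Hypothetical events pushed by the server over a WebSocket or push channel. */
sealed interface ServerEvent {
    data class MessageEdited(val messageId: String) : ServerEvent
    data class ConversationCleared(val conversationId: String) : ServerEvent
}

class MessageCache {
    private val messages = ConcurrentHashMap<String, CachedMessage>()

    fun put(message: CachedMessage) {
        messages[message.id] = message
    }

    fun get(id: String): CachedMessage? = messages[id]

    /** Drops exactly the entries the event made stale; the next read refetches them. */
    fun onServerEvent(event: ServerEvent) {
        when (event) {
            is ServerEvent.MessageEdited -> messages.remove(event.messageId)
            is ServerEvent.ConversationCleared ->
                messages.values.removeAll { it.conversationId == event.conversationId }
        }
    }
}
```

If the event carries the new content, an alternative is to update the entry in place instead of removing it, which saves a round trip.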
Version-Based Invalidation
Each piece of data has a version number or ETag. When fetching, you send your cached version. The server responds with either "304 Not Modified" (your cache is current) or the new data. This avoids transferring data when nothing has changed.
// Android: Version-based cache check
class FeedRepository(
private val feedDao: FeedDao,
private val apiService: ApiService
) {
suspend fun getFeed(): List<FeedItem> {
val cachedVersion = feedDao.getCurrentVersion()
return try {
val response = apiService.getFeed(ifNoneMatch = cachedVersion)
if (response.code() == 304) {
// Cache is still valid
feedDao.getAll().map { it.toFeedItem() }
} else {
val items = response.body()!!
val newVersion = response.headers()["ETag"]
feedDao.replaceAll(items.map { it.toEntity() }, newVersion)
items
}
} catch (e: IOException) {
// Network error — fall back to cache
feedDao.getAll().map { it.toFeedItem() }
}
}
}
// iOS: Version-based cache check
final class FeedRepository {
private let feedStore: FeedStore
private let apiClient: APIClient
init(feedStore: FeedStore, apiClient: APIClient) {
self.feedStore = feedStore
self.apiClient = apiClient
}
func getFeed() async throws -> [FeedItem] {
let cachedVersion = try feedStore.currentVersion()
do {
let (items, response) = try await apiClient.getFeed(ifNoneMatch: cachedVersion)
if response.statusCode == 304 {
// Cache is still valid
return try feedStore.getAll()
} else {
let newVersion = response.value(forHTTPHeaderField: "ETag")
try feedStore.replaceAll(items, version: newVersion)
return items
}
} catch is URLError {
// Network error — fall back to cache
return try feedStore.getAll()
}
}
}
In practice, you often combine strategies. Use TTL as a baseline (don't hit the network more than once per minute), event-based for real-time features (chat messages, notifications), and version-based for large datasets where you want to avoid unnecessary transfers (feed, product catalog).
Image Caching
Image caching is the most common mobile caching use case. It's also the one interviewers expect you to know cold, because almost every mobile app displays images.
Libraries like Kingfisher and SDWebImage (iOS) or Coil and Glide (Android) all follow the same fundamental pattern: a three-tier waterfall.
Memory cache (decoded UIImage/Bitmap objects, ready to display) -> Disk cache (compressed image files on the file system) -> Network (download from the URL).
Here's what happens under the hood when you load an image:
- Hash the URL to create a cache key.
- Check memory cache — if the decoded image is there, return it immediately. This is why scrolling back up in a list feels instant.
- Check disk cache — if the compressed file exists, read it from disk, decode it into a displayable image, store the decoded image in the memory cache, return it.
- Download from network — fetch the image, write the compressed data to disk, decode it, store the decoded image in memory, return it.
// Android: Simplified image caching pipeline
object ImageLoader {
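// Sized by entry count for brevity; production code should override sizeOf to use bitmap.byteCount
private val memoryCache = LruCache<String, Bitmap>(100)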
private lateinit var diskCacheDir: File
private val client = OkHttpClient()
// Track in-flight downloads to avoid duplicate requests
private val inFlightRequests = mutableMapOf<String, Deferred<Bitmap>>()
private val mutex = Mutex()
private val scope = CoroutineScope(SupervisorJob() + Dispatchers.IO)
fun init(context: Context) {
diskCacheDir = File(context.cacheDir, "ImageCache").apply { mkdirs() }
}
suspend fun loadImage(url: String): Bitmap {
// 1. Memory cache
memoryCache.get(url)?.let { return it }
// 2. Disk cache
val file = File(diskCacheDir, url.sha256Hash())
if (file.exists()) {
BitmapFactory.decodeFile(file.path)?.let { bitmap ->
memoryCache.put(url, bitmap)
return bitmap
}
}
// 3. Deduplicate in-flight requests
val download = mutex.withLock {
inFlightRequests.getOrPut(url) {
scope.async {
val request = Request.Builder().url(url).build()
val bytes = client.newCall(request).execute().use { response ->
response.body?.bytes() ?: throw IOException("Empty body")
}
val bitmap = BitmapFactory.decodeByteArray(bytes, 0, bytes.size)
?: throw IOException("Decoding failed")
file.writeBytes(bytes)
memoryCache.put(url, bitmap)
bitmap
}
}
}
try {
return download.await()
} finally {
mutex.withLock { inFlightRequests.remove(url) }
}
}
}
// iOS: Simplified image caching pipeline
// An actor serializes access to inFlightRequests, which a plain class would race on
actor ImageLoader {
static let shared = ImageLoader()
private let memoryCache = NSCache<NSString, UIImage>()
private let diskCachePath: URL
private let session = URLSession.shared
// Track in-flight downloads to avoid duplicate requests
private var inFlightRequests: [String: Task<UIImage, Error>] = [:]
init() {
let caches = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
diskCachePath = caches.appendingPathComponent("ImageCache")
try? FileManager.default.createDirectory(at: diskCachePath, withIntermediateDirectories: true)
}
func loadImage(from url: URL) async throws -> UIImage {
let key = url.absoluteString
// 1. Memory cache
if let cached = memoryCache.object(forKey: key as NSString) {
return cached
}
// 2. Disk cache
let filePath = diskCachePath.appendingPathComponent(key.sha256Hash)
if let data = try? Data(contentsOf: filePath),
let image = UIImage(data: data) {
memoryCache.setObject(image, forKey: key as NSString)
return image
}
// 3. Deduplicate in-flight requests
if let existingTask = inFlightRequests[key] {
return try await existingTask.value
}
let task = Task<UIImage, Error> {
let (data, _) = try await session.data(from: url)
guard let image = UIImage(data: data) else {
throw ImageLoadingError.decodingFailed
}
try data.write(to: filePath)
memoryCache.setObject(image, forKey: key as NSString)
return image
}
inFlightRequests[key] = task
defer { inFlightRequests.removeValue(forKey: key) }
return try await task.value
}
}
A few things that real image caching libraries handle that you should mention in interviews:
- Downsampling: Decoding a 4000x3000 photo into a UIImage consumes ~48MB of memory. If you're displaying it in a 200x200 cell, that's wasteful. Libraries downsample during decoding to match the target view size.
- Progressive loading: Show a blurry placeholder first, then sharpen as more data arrives.
- Disk size limits: Cap the disk cache at something reasonable (100-200MB) and evict oldest entries when the limit is hit.
- Memory eviction on warning: Clear the memory cache entirely when the OS sends a low-memory warning.
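The downsampling arithmetic from the list above is easy to check by hand. The sketch below assumes 4 bytes per pixel (ARGB_8888 on Android) and uses the power-of-two sample-size calculation from the Android guide on loading large bitmaps; the function names are illustrative.

```kotlin
/** Bytes needed for a decoded bitmap at 4 bytes per pixel (ARGB_8888 on Android). */
fun decodedSizeBytes(width: Int, height: Int): Long = width.toLong() * height * 4

/**
 * Largest power-of-two sample size that keeps both dimensions at or above the
 * requested size, following the approach in the Android "Loading Large Bitmaps" guide.
 */
fun calculateInSampleSize(srcWidth: Int, srcHeight: Int, reqWidth: Int, reqHeight: Int): Int {
    var inSampleSize = 1
    if (srcHeight > reqHeight || srcWidth > reqWidth) {
        val halfHeight = srcHeight / 2
        val halfWidth = srcWidth / 2
        while (halfHeight / inSampleSize >= reqHeight && halfWidth / inSampleSize >= reqWidth) {
            inSampleSize *= 2
        }
    }
    return inSampleSize
}
```

For a 4000x3000 source and a 200x200 target, this yields a sample size of 8, so the decoded bitmap is 500x375 and takes 750,000 bytes instead of 48,000,000. On Android the result would be assigned to `BitmapFactory.Options.inSampleSize` before decoding.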
Interview Tip: Don't just say "I'd use Kingfisher for image loading." Say "Kingfisher implements a two-tier cache — decoded images in memory via NSCache, compressed data on disk. It deduplicates in-flight downloads and downsamples images to match the display size, which prevents memory spikes when loading large photos in a scrolling list."
HTTP Caching Headers
HTTP has a built-in caching mechanism, and your mobile client should respect it. Three headers matter most.
Cache-Control — tells the client how long a response can be cached and under what conditions.
- max-age=3600: cache for 1 hour.
- no-cache: you can cache it, but must revalidate with the server before using it.
- no-store: don't cache this at all (sensitive data).
- private: only the client can cache this, not intermediate proxies.
ETag — a version identifier for the response. On subsequent requests, send If-None-Match: <etag>. If the data hasn't changed, the server returns 304 Not Modified with no body, saving bandwidth.
Last-Modified — a timestamp of when the resource was last changed. Works like ETag but with timestamps via If-Modified-Since.
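To make the Cache-Control semantics concrete, here is a rough sketch of how a client might interpret the header. It is deliberately simplified: `CacheDirectives`, `parseCacheControl`, and `isFresh` are hypothetical names, only the four directives listed above are modeled, and a response without max-age is treated as not fresh (real HTTP caches may apply heuristic freshness instead).

```kotlin
data class CacheDirectives(
    val maxAgeSeconds: Long? = null,
    val noCache: Boolean = false,
    val noStore: Boolean = false,
    val isPrivate: Boolean = false
)

/** Parses a Cache-Control header value such as "private, max-age=3600". */
fun parseCacheControl(header: String): CacheDirectives {
    var result = CacheDirectives()
    for (raw in header.split(',')) {
        val directive = raw.trim().lowercase()
        result = when {
            directive == "no-cache" -> result.copy(noCache = true)
            directive == "no-store" -> result.copy(noStore = true)
            directive == "private" -> result.copy(isPrivate = true)
            directive.startsWith("max-age=") ->
                result.copy(maxAgeSeconds = directive.substringAfter('=').toLongOrNull())
            else -> result // directives this sketch does not model are ignored
        }
    }
    return result
}

/** True when a stored response may be used without contacting the server. */
fun isFresh(directives: CacheDirectives, ageSeconds: Long): Boolean {
    if (directives.noStore || directives.noCache) return false
    val maxAge = directives.maxAgeSeconds ?: return false
    return ageSeconds < maxAge
}
```

A stale or no-cache response is not necessarily discarded: it can still be revalidated with If-None-Match or If-Modified-Since, which is where ETag and Last-Modified come in.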
On iOS, URLSession respects these headers by default if you use the default URLCache. On Android, OkHttp does the same once you configure a Cache on the client; without one, it caches nothing.
// Android: Setting up OkHttp with HTTP caching
val cacheDir = File(context.cacheDir, "http_cache")
val cacheSize = 50L * 1024 * 1024 // 50 MB
val client = OkHttpClient.Builder()
.cache(Cache(cacheDir, cacheSize))
.addInterceptor { chain ->
var request = chain.request()
// Force cache when offline
if (!isNetworkAvailable()) {
request = request.newBuilder()
.cacheControl(CacheControl.FORCE_CACHE)
.build()
}
chain.proceed(request)
}
.build()
// iOS: Setting up URLSession with HTTP caching
let cacheDir = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
.appendingPathComponent("http_cache")
let config = URLSessionConfiguration.default
config.urlCache = URLCache(
memoryCapacity: 10 * 1024 * 1024, // 10 MB in memory
diskCapacity: 50 * 1024 * 1024, // 50 MB on disk
directory: cacheDir
)
config.requestCachePolicy = .useProtocolCachePolicy
let session = URLSession(configuration: config)
// Force cache when offline
func makeRequest(url: URL) -> URLRequest {
var request = URLRequest(url: url)
if !isNetworkAvailable() {
request.cachePolicy = .returnCacheDataDontLoad
}
return request
}
Sometimes you'll want to override the server's caching headers. Maybe the server sends no-cache for an endpoint, but you know the data is safe to cache for 5 minutes on the client. In that case, use a network interceptor (Android) or the willCacheResponse delegate (iOS) to rewrite the response headers:
// Override server cache headers for specific endpoints
.addNetworkInterceptor { chain ->
val response = chain.proceed(chain.request())
if (chain.request().url.encodedPath.contains("/feed")) {
response.newBuilder()
.header("Cache-Control", "public, max-age=60")
.build()
} else {
response
}
}
// Override server cache headers for specific endpoints
func urlSession(
_ session: URLSession,
dataTask: URLSessionDataTask,
willCacheResponse proposedResponse: CachedURLResponse
) async -> CachedURLResponse? {
guard let response = proposedResponse.response as? HTTPURLResponse,
let url = response.url, url.path.contains("/feed"),
var headers = response.allHeaderFields as? [String: String] else {
return proposedResponse
}
headers["Cache-Control"] = "public, max-age=60"
guard let newResponse = HTTPURLResponse(
url: url,
statusCode: response.statusCode,
httpVersion: nil,
headerFields: headers
) else {
return proposedResponse
}
return CachedURLResponse(response: newResponse, data: proposedResponse.data)
}
Interview Tip: Mentioning HTTP caching headers shows that you think about caching at every layer, not just in your application code. It's a signal of real-world experience.
Pagination and Cache
Most list-based features — feeds, search results, message histories — use pagination. Caching paginated data introduces specific challenges that interviewers love to explore.
The core problem: you cache page 1 of a feed. The user scrolls down, you fetch and cache page 2. Now the user pulls to refresh. New posts have been added at the top. Do you throw away your entire cache? Merge the new data in? What if an item that was on page 1 is now on page 2?
Offset-Based Pagination and Caching
With offset-based pagination (?page=2&limit=20), cache keys map directly to page numbers. But insertion at the top shifts everything. The item that was 20th is now 21st, so a freshly fetched page 2 starts with an item you already cached on page 1, and a cached page 2 no longer lines up with a refreshed page 1. This is why offset-based pagination caches poorly.
Cursor-Based Pagination and Caching
Cursor-based pagination (?after=cursor_abc&limit=20) is more cache-friendly. Each page is anchored to a specific item, so insertions at the top don't affect existing pages.
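The difference is easy to reproduce with two toy paging functions over an in-memory list. This is a sketch with hypothetical helpers (`pageByOffset`, `pageAfterCursor`) and item ids used directly as cursors; a real server would typically use opaque cursor tokens.

```kotlin
/** Offset pagination: pages are positions in whatever the feed looks like right now. */
fun pageByOffset(feed: List<String>, page: Int, limit: Int): List<String> =
    feed.drop((page - 1) * limit).take(limit)

/**
 * Cursor pagination: a page is "the items after this specific item".
 * In this sketch an unknown cursor falls back to the first page.
 */
fun pageAfterCursor(feed: List<String>, after: String?, limit: Int): List<String> {
    val start = if (after == null) 0 else feed.indexOf(after) + 1
    return feed.drop(start).take(limit)
}
```

Take a feed of A through F with a page size of 3, so the cached page 1 is [A, B, C]. If a new post N lands at the top before the user scrolls, `pageByOffset(feed, 2, 3)` returns [C, D, E] and C shows up twice, while `pageAfterCursor(feed, "C", 3)` still returns [D, E, F].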
Figure (interactive): why offset pagination breaks your cache. Same feed, same new post, same scroll, but two very different page 2s for the offset-based and cursor-based clients.
The pattern I've used in practice:
// Android: Caching cursor-paginated feed data
class FeedCache(
private val feedDao: FeedDao,
private val apiService: ApiService
) {
/** Load the next page after the given cursor. */
suspend fun loadPage(afterCursor: String?): FeedPage {
// Check if we have this page cached
feedDao.getPage(afterCursor)?.let { cached ->
if (!cached.isExpired) return cached
}
val page = apiService.getFeed(after = afterCursor, limit = 20)
// Store items with their position metadata
feedDao.insertPage(
items = page.items,
nextCursor = page.nextCursor,
previousCursor = afterCursor,
fetchedAt = System.currentTimeMillis()
)
return page
}
/** Pull-to-refresh: fetch new items and prepend to cache */
suspend fun refresh(): List<FeedItem> {
val latestCachedId = feedDao.getLatestItemId()
// Fetch the newest page and look for overlap with our cache
val newItems = apiService.getFeed(after = null, limit = 20)
val overlapIndex = newItems.items.indexOfFirst { it.id == latestCachedId }
return if (overlapIndex >= 0) {
// Only insert items before the overlap
val freshItems = newItems.items.take(overlapIndex)
feedDao.prependItems(freshItems)
freshItems
} else {
// No overlap — gap too large, reset cache
feedDao.clearAndInsert(newItems.items, nextCursor = newItems.nextCursor)
newItems.items
}
}
}
// iOS: Caching cursor-paginated feed data
final class FeedCache {
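private let feedDao: FeedDao
private let apiClient: APIClient
init(feedDao: FeedDao, apiClient: APIClient) {
self.feedDao = feedDao
self.apiClient = apiClient
}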
/// Load the next page after the given cursor.
func loadPage(after cursor: String?) async throws -> FeedPage {
// Check if we have this page cached
if let cached = try feedDao.getPage(afterCursor: cursor),
!cached.isExpired {
return cached
}
let page = try await apiClient.getFeed(after: cursor, limit: 20)
// Store items with their position metadata
try feedDao.insertPage(
items: page.items,
nextCursor: page.nextCursor,
previousCursor: cursor,
fetchedAt: Date()
)
return page
}
/// Pull-to-refresh: fetch new items and prepend to cache
func refresh() async throws -> [FeedItem] {
let latestCachedId = try feedDao.getLatestItemId()
// Fetch the newest page and look for overlap with our cache
let newItems = try await apiClient.getFeed(after: nil, limit: 20)
if let overlapIndex = newItems.items.firstIndex(where: { $0.id == latestCachedId }) {
// Only insert items before the overlap
let freshItems = Array(newItems.items.prefix(upTo: overlapIndex))
try feedDao.prependItems(freshItems)
return freshItems
} else {
// No overlap — gap too large, reset cache
try feedDao.clearAndInsert(newItems.items, nextCursor: newItems.nextCursor)
return newItems.items
}
}
}
The key insight for interviews: cursor-based pagination lets you append new pages to the cache without worrying about shifted offsets, and pull-to-refresh becomes a matter of finding the overlap point between fresh data and your cached data.
Cache Eviction Policies
When a cache reaches its size limit, something has to go. Three policies show up in interviews.
LRU (Least Recently Used): evict the entry that hasn't been accessed for the longest time. This is the policy behind Android's LruCache and most image caching libraries. NSCache's eviction order is not documented, so don't rely on it being strictly LRU. LRU works well because recently accessed data is likely to be accessed again (temporal locality). For mobile, LRU is almost always the right answer.
LFU (Least Frequently Used) — evict the entry with the fewest accesses. This keeps popular items in cache longer, but it has a cold-start problem: new items have low frequency and might get evicted before they have a chance to prove their popularity. Rarely used on mobile.
FIFO (First In, First Out) — evict the oldest entry regardless of access patterns. Simple to implement, but it doesn't adapt to usage patterns. Useful for log buffers or event queues where order matters more than access frequency.
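The LRU policy described above fits in a few lines on the JVM, because `java.util.LinkedHashMap` can keep entries in access order. The class below is a minimal sketch (`LruMap` is a hypothetical name), not a replacement for Android's LruCache, which also supports size-based limits.

```kotlin
/**
 * Minimal LRU map built on java.util.LinkedHashMap in access order.
 * Not thread-safe; wrap calls in a lock if it is shared across threads.
 */
class LruMap<K, V>(private val capacity: Int) : LinkedHashMap<K, V>(16, 0.75f, true) {
    // Called by LinkedHashMap after each insertion; returning true evicts the
    // least recently accessed entry.
    override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>): Boolean =
        size > capacity
}
```

With a capacity of 2, inserting a and b, reading a, then inserting c evicts b, because the read moved a to the most-recently-used position.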
Why LRU wins on mobile: Mobile usage is bursty and recency-driven. A user scrolling through a feed will likely scroll back up. A user viewing a profile might tap back and view it again. LRU naturally keeps these recently-viewed items cached. LFU would keep items from yesterday's browsing session that happened to be viewed many times, wasting cache space on data the user no longer needs.
Interview Tip: If asked about eviction, say LRU and explain why it matches mobile access patterns. If the interviewer pushes, mention that LFU could work for something like a music app's "most played" cache, but for general purpose, LRU is the standard.
Presenting Caching in Interviews
This might be the most important section. Knowing caching strategies is one thing. Weaving them into your system design naturally is what separates strong candidates from everyone else.
Don't bolt caching on at the end. The worst thing you can do is design your entire system, then say "and we could add caching too." By that point, it feels like an afterthought because it is one.
Instead, introduce caching when you design the data flow. When you draw the Repository layer, say: "The repository checks a memory cache first, then disk, then network. Let me show you the flow." This makes caching an integral part of your architecture, not an optimization you might add later.
Here's a structure that works:
- During requirements gathering, ask: "What's the acceptable staleness for this data? Can we show data from 5 minutes ago, or does it need to be real-time?" This tells the interviewer you're already thinking about caching.
- During high-level design, draw the cache as a first-class component in your data flow diagram. Put it between the ViewModel and the Network layer, inside the Repository.
- During deep dive, explain your specific strategy: "For the feed, I'd use cache-aside with a 60-second TTL. For the user's profile, write-through so the UI stays consistent after edits. For likes, write-behind because we want instant UI feedback."
- During optimization, discuss eviction policies, size limits, and what happens under memory pressure.
Figure: ViewModel (observes state) -> Repository (single source of truth for data access) -> Cache Manager [Memory: LRU, ~ns; Disk: SQLite, ~ms] -> Network Service (100s of ms, costs battery and data).
Match the strategy to the feature. In a single app, you'll use different caching strategies for different data:
- User profile: Cache-aside, write-through, 5-minute TTL
- Feed items: Cache-aside, cursor-based pagination, 60-second TTL
- Likes/reactions: Write-behind with background sync
- Images: Three-tier waterfall (memory, disk, network) with LRU eviction
- Chat messages: Event-based invalidation via WebSocket, disk-persisted
When you can articulate this level of detail — matching specific strategies to specific features with clear reasoning — you're showing the interviewer that you've actually built these systems. That's what they're looking for.