Adding Caching

We add support for caching network requests without altering our original networking abstraction.

00:05 In this episode, we're going to build a network cache. We built it for the Swift Talk iOS app, which we started before we released the first episode, but we haven't managed to finish it yet. We still want to finish it though!

00:29 In order to make the app work offline, we need to cache the metadata for the episodes as well as the videos themselves. Today we'll only look at caching the metadata.

00:44 There are many different ways we could approach this problem. For example, we could store all the episodes in Core Data and persist them that way. Or, we could somehow serialize the episode structs we generate from the network data. However, what we want to do is cache the plain network response. We might go for a different solution later on if our requirements change, but for now we'll work on the level of the network request itself.

01:17 Foundation already has a class for caching network requests called NSURLCache. However, on iOS, this doesn't give you control over where the cached data gets stored. The system stores it in a purgeable location, i.e. when your device runs low on disk storage, the data can be cleared out. We want our offline data to be available to the user at any time, so we're going to build our own simple caching layer.

The Networking Layer

01:49 Before we start building the caching layer though, let's take a look at how we do networking in this app. We've covered this before in Swift Talk episodes #1 (Networking) and #8 (Networking: POST Requests). For each endpoint, we create a Resource struct that specifies the endpoint's URL as well as a parsing function that can turn the data from the network into a result with a specific type. Then we can use the load function on our Webservice class to load such a resource:

var allEpisodes: Resource<[Episode]> = try! Resource(
    url: URL(string: "http://localhost:8000/episodes.json")!,
    parseElement: Episode.init
)

let webservice = Webservice()
webservice.load(allEpisodes) { result in
    print(result)
}

To make this asynchronous code work in the playground, we have to specify that it needs indefinite execution:

import PlaygroundSupport
PlaygroundPage.current.needsIndefiniteExecution = true

The Cached Webservice API

03:18 The API we want to have for the cache is very similar to the API of the Webservice class. In fact, the CachedWebservice class is just a thin wrapper around an instance of Webservice. We initialize it with an instance of Webservice, and we define a load method that at first just forwards the request:

final class CachedWebservice {
    let webservice: Webservice

    init(_ webservice: Webservice) {
        self.webservice = webservice
    }

    func load<A>(_ resource: Resource<A>, update: @escaping (Result<A>) -> ()) {
        webservice.load(resource, completion: update)
    }
}

04:52 The idea behind the CachedWebservice class is to maintain a separation of concerns: the Webservice should only be concerned with executing network requests, while CachedWebservice adds the caching functionality on top. We could've gone into our networking code and added caching there, but we think the modular approach will result in simpler code.

05:18 Since the load method on CachedWebservice has exactly the same signature as load on Webservice, we might be tempted to introduce a protocol here. While there could be situations where this would become useful, we currently don't need this extra layer of abstraction; we can always add it later if we have a specific use case for it.

05:46 Unlike the Webservice class, the CachedWebservice is much more application specific. The load method will implement the specific caching strategy we want to use in this project: we return the cached data immediately if it's available, and we make a network request each time regardless of whether or not cached data was available. This strategy might vary depending on the specific use case.

Because of this strategy, the behavior of the load method on CachedWebservice differs from the load method on Webservice in that it can call the update callback multiple times: e.g. once with the cached results, and then once again with the results from the network.
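To make the "multiple updates" behavior concrete, here is a rough sketch that is independent of the episode's types. The function loadCachedThenFresh and its parameters are hypothetical names used only for illustration, and the "network" is a synchronous fake:

```swift
// Hypothetical stand-in for the strategy described above: deliver the cached
// value first (if there is one), then always deliver the freshly fetched value.
func loadCachedThenFresh<A>(
    cached: A?,
    fetch: (@escaping (A) -> ()) -> (),
    update: @escaping (A) -> ()
) {
    if let value = cached {
        update(value)   // first call: cached data, delivered immediately
    }
    fetch(update)       // second call: whenever the fetch completes
}

var received: [String] = []
loadCachedThenFresh(
    cached: "episodes (cached)",
    fetch: { completion in completion("episodes (network)") }, // fake network
    update: { received.append($0) }
)
print(received)
```

A caller therefore has to treat update as something that can fire more than once, for example by reloading a table view on every call.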

Implementing the Cache

06:52 We'll implement the actual caching in a separate class to keep the concerns separated: Webservice only deals with network requests, Cache only deals with storing and retrieving cached data, and CachedWebservice combines the two.

07:10 The Cache class has two methods: one to store data into the cache, and one to load data from the cache. Both methods take a Resource as parameter, because we always want to store/load data for a particular resource. Let's start with load:

final class Cache {
    func load<A>(_ resource: Resource<A>) -> A? {
        // ...
    }
}

07:46 Within load, we first have to construct a URL for the cached data. For this, we define a baseURL property (for simplicity, we just hardcode this to the documents directory). Then we combine this base URL with a cache key to create a unique cache file name for this particular resource:

final class Cache {
    let baseURL = try! FileManager.default.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: true)

    func load<A>(_ resource: Resource<A>) -> A? {
        let url = baseURL.appendingPathComponent(resource.cacheKey)
        // ...
    }
}

We'll come back to actually defining cacheKey on Resource in a bit.

09:36 With this URL at hand, we can now add the code that actually loads the data from disk. First we call Data(contentsOf:) to load the data from disk, and then we use the resource's parse function to turn it from Data into the load method's return type, A?:

final class Cache {
    let baseURL = try! FileManager.default.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: true)

    func load<A>(_ resource: Resource<A>) -> A? {
        let url = baseURL.appendingPathComponent(resource.cacheKey)
        let data = try? Data(contentsOf: url)
        return data.flatMap(resource.parse)
    }
}

We use flatMap to apply resource.parse to data, because data is optional and the result of resource.parse is also optional.

10:20 We still have to implement cacheKey on Resource to make the above code work. Here we'll cut some corners and just use the URL's hashValue property to generate a key. In production we use a SHA1 hash because we need to be sure not to run into hash collisions. Since we're working in a playground and can't easily import CommonCrypto, using hashValue does the job for the sake of this example:

extension Resource {
    var cacheKey: String {
        return "cache" + String(url.hashValue) // TODO use sha1
    }
}
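One caveat with hashValue: Swift's Hashable documentation states that hash values are not guaranteed to be equal across different executions of a program, so a file name derived from hashValue may not be found again after a relaunch. As a playground-friendly stand-in for SHA1, the sketch below derives the key from the URL string with FNV-1a, a small deterministic hash. This is an assumption on our part, not the production code: FNV-1a isn't cryptographic and collisions are possible. The names fnv1a and stableCacheKey are made up for this example.

```swift
import Foundation

// FNV-1a (64-bit): a small, deterministic, non-cryptographic hash.
func fnv1a(_ string: String) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325   // FNV offset basis
    for byte in string.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3        // FNV prime, wrapping multiply
    }
    return hash
}

// Hypothetical replacement for `cacheKey` that takes the URL directly.
// The same URL string always yields the same key, also across launches.
func stableCacheKey(for url: URL) -> String {
    return "cache" + String(fnv1a(url.absoluteString), radix: 16)
}
```

In the Resource extension, the body of cacheKey could then return stableCacheKey(for: url) instead of using hashValue.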

11:14 Now we can make use of the Cache's load method in our CachedWebservice class:

final class CachedWebservice {
    // ...
    let cache = Cache()

    func load<A>(_ resource: Resource<A>, update: @escaping (Result<A>) -> ()) {
        if let result = cache.load(resource) {
            print("cache hit")
            update(.success(result))
        }
        webservice.load(resource, completion: update)
    }
}

12:37 To store data into the cache, we'll add a save method on the Cache class:

final class Cache {
    // ...
    func save<A>(_ data: Data, for resource: Resource<A>) {
        let url = baseURL.appendingPathComponent(resource.cacheKey)
        _ = try? data.write(to: url)
    }
}

13:33 One addition we'll make to the save and load methods is to add a check to see whether we're dealing with a GET request, since we only want to cache data for this request type:

final class Cache {
    func load<A>(_ resource: Resource<A>) -> A? {
        guard case .get = resource.method else { return nil }
        // ...
    }
    func save<A>(_ data: Data, for resource: Resource<A>) {
        guard case .get = resource.method else { return }
        // ...
    }
}

Putting Everything Together

14:27 If we try to use the new save method in the CachedWebservice's load method to actually store some data into the cache, we encounter a problem: the call to webservice.load doesn't return the plain network data since it already applied the resource's parse function to it. So we've already missed the point where we could've stored the plain network response.

14:59 To solve this, we create a new resource from the one that gets passed into load. This new resource is constructed in a way so that it doesn't parse the raw data:

let dataResource = Resource<Data>(url: resource.url, parse: { $0 }, method: resource.method)

If we use this resource when we call webservice.load, then we can store the raw data in the cache before applying the original resource's parse function to turn the raw data into the final result:

final class CachedWebservice {
    // ...
    func load<A>(_ resource: Resource<A>, update: @escaping (Result<A>) -> ()) {
        if let result = cache.load(resource) {
            print("cache hit")
            update(.success(result))
        }
        
        let dataResource = Resource<Data>(url: resource.url, parse: { $0 }, method: resource.method)
        webservice.load(dataResource, completion: { result in
            switch result {
            case let .error(error):
                update(.error(error))
            case let .success(data):
                self.cache.save(data, for: resource)
                update(Result(resource.parse(data), or: WebserviceError.other))
            }
        })
    }
}

17:02 In the last call to update, we use an initializer on the Result type that allows us to specify an error value in case the first parameter is nil. The result of the call, resource.parse(data), could be nil. In this case, we return an .other error.
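The Result type and its initializer aren't shown in this episode. A minimal sketch of what they might look like, assuming a project-defined enum with .success and .error cases as used in the code above (the exact definitions in the app may differ):

```swift
// Assumed error type; only the `.other` case appears in the episode.
enum WebserviceError: Error {
    case other
}

// Assumed shape of the project's own Result type. Note that declaring it
// shadows the Result type that ships with newer Swift versions.
enum Result<A> {
    case success(A)
    case error(Error)

    // Wraps an optional: a nil value becomes the supplied error.
    init(_ value: A?, or error: Error) {
        if let value = value {
            self = .success(value)
        } else {
            self = .error(error)
        }
    }
}
```

With this in place, Result(resource.parse(data), or: WebserviceError.other) yields .success when parsing works and .error(WebserviceError.other) when parse returns nil.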

17:35 To test out our CachedWebservice, we can instantiate it with the existing Webservice instance and call load on it:

let webservice = Webservice()
let cachedWebservice = CachedWebservice(webservice)
cachedWebservice.load(allEpisodes) { result in
    print(result)
}

Refactoring the Cache Class

18:47 Currently, the Cache class implements a particular type of caching, namely writing the data to disk. However, we'd like to pull the code specific to a particular kind of caching out of the Cache class so that we could also create, for example, an in-memory instance of Cache.

19:28 We create a struct, FileStorage, that will only provide a subscript API to save and load data. We only use the struct as a namespace, so we could've also chosen to use a class:

struct FileStorage {
    let baseURL = try! FileManager.default.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: true)

    subscript(key: String) -> Data? {
        // ...
    }
}

Usually you'd want the baseURL to be configurable from the outside, but we'll stick with the hardcoded URL for this example.

20:16 To implement the subscript's getter and setter, we can use the code we've already written in the load and save methods of the Cache class:

struct FileStorage {
    let baseURL = try! FileManager.default.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: true)

    subscript(key: String) -> Data? {
        get {
            let url = baseURL.appendingPathComponent(key)
            return try? Data(contentsOf: url)
        }
        set {
            let url = baseURL.appendingPathComponent(key)
            _ = try? newValue?.write(to: url)
        }
    }
}

21:09 In the setter, we could also implement the possibility to clear the cache for a particular key by setting it to nil.
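A sketch of how that could look, assuming we remove the file when nil is assigned. To keep the example self-contained, baseURL is passed in here rather than hardcoded, which is a deviation from the episode's code:

```swift
import Foundation

struct FileStorage {
    // Assumption: the base directory is injected; the episode hardcodes it.
    let baseURL: URL

    subscript(key: String) -> Data? {
        get {
            let url = baseURL.appendingPathComponent(key)
            return try? Data(contentsOf: url)
        }
        set {
            let url = baseURL.appendingPathComponent(key)
            if let data = newValue {
                _ = try? data.write(to: url)
            } else {
                // Assigning nil clears the entry; a missing file is ignored.
                _ = try? FileManager.default.removeItem(at: url)
            }
        }
    }
}

var storage = FileStorage(baseURL: FileManager.default.temporaryDirectory)
let key = "cache-demo-" + UUID().uuidString
storage[key] = Data("hello".utf8)   // writes the file
storage[key] = nil                  // removes it again
```

As in the original setter, errors are swallowed with try?, so a failed write or delete goes unnoticed.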

21:19 Now we can make use of FileStorage in the Cache class:

final class Cache {
    var storage = FileStorage()
    
    func load<A>(_ resource: Resource<A>) -> A? {
        guard case .get = resource.method else { return nil }
        let data = storage[resource.cacheKey]
        return data.flatMap(resource.parse)
    }
    
    func save<A>(_ data: Data, for resource: Resource<A>) {
        guard case .get = resource.method else { return }
        storage[resource.cacheKey] = data
    }
}

22:04 This additional separation of concerns cleans up our code and will also make it easier to test.
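To illustrate the testing benefit, here is one way an in-memory variant could be plugged in. This is a sketch under our own assumptions: the Storage protocol, InMemoryStorage, and the generic Cache are hypothetical and not part of the episode, and Resource is reduced to the two members the cache needs (the GET check is left out):

```swift
import Foundation

// Hypothetical protocol: anything that maps string keys to raw data.
protocol Storage {
    subscript(key: String) -> Data? { get set }
}

// Dictionary-backed storage; contents vanish when the instance goes away.
struct InMemoryStorage: Storage {
    private var entries: [String: Data] = [:]

    init() {}

    subscript(key: String) -> Data? {
        get { return entries[key] }
        set { entries[key] = newValue }   // assigning nil removes the entry
    }
}

// Minimal stand-in for the episode's Resource type.
struct Resource<A> {
    let cacheKey: String
    let parse: (Data) -> A?
}

// Same load/save logic as before, but generic over the storage backend.
final class Cache<S: Storage> {
    var storage: S

    init(storage: S) {
        self.storage = storage
    }

    func load<A>(_ resource: Resource<A>) -> A? {
        return storage[resource.cacheKey].flatMap(resource.parse)
    }

    func save<A>(_ data: Data, for resource: Resource<A>) {
        storage[resource.cacheKey] = data
    }
}

let text = Resource<String>(cacheKey: "greeting", parse: { String(data: $0, encoding: .utf8) })
let cache = Cache(storage: InMemoryStorage())
cache.save(Data("hello".utf8), for: text)
print(cache.load(text) ?? "miss")   // prints "hello"
```

A test can then exercise Cache without touching the file system, while the app keeps using the file-backed storage.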

Conclusion

22:31 What we like about this approach is that each component has a very simple job. The Webservice only does network requests, and the Cache only does caching. The only part that isn't that simple is the CachedWebservice, because it combines the different parts and implements the specific caching behavior we want to have for our app.

23:04 If you want to explore other ways to implement caching logic, there was an interesting talk at CUFP 2016 about Composable Caching in Swift using monoids. It sounds a bit scary at first, but it's actually not that hard to understand.