Swift Talk #71

Type-Safe File Paths with Phantom Types

with special guest Brandon Kase

This episode is freely available thanks to the support of our subscribers

Subscribers get exclusive access to new and all previous subscriber-only episodes, video downloads, and 30% discount for team members. Become a Subscriber

Brandon Kase joins us to show how Swift's type system can be leveraged to check file paths at compile time.

00:06 Brandon is in Berlin for the Functional Swift Conference to talk about phantom types, a technique that can make file paths more type-safe. We'll discuss parts of that technique here, but Brandon's talk dives deeper into the matter.

The Problems

00:42 Let's say we're creating an API that opens a file path:

let path = "/Users/chris/test.md"

func open(path: String) {
    // ...
}

open(path: path)

01:28 If the method takes a string, there are many things that could go wrong. The compiler doesn't enforce the fact that the string we're passing into the method is actually a path. We could just try to use a random string, like an email address. But as users of this API, we're responsible for using it correctly ourselves, so that's not ideal.

02:14 Even if the strings we work with are proper paths, things can still go wrong. We may want to append components to a path, but we have to make sure we only append a file to a directory path, and not to a path that already points to a file:

let path = "/Users/chris/test.md"
path + "/" + "test.md" // bad

03:04 Additionally, when we're dealing with an absolute path like the above, we shouldn't be allowed to prepend to it.

Path

03:38 In order to make working with paths safer, we start by introducing a Path struct that wraps the string:

struct Path {
    var path: String
    init(_ path: String) {
        self.path = path
    }
}

let path = Path("/Users/chris/test.md")

04:40 The value we created now has a more descriptive type, Path, and an API that asks for a Path reminds us that we shouldn't pass in, say, an email string:

func open(path: Path) {
}

04:55 That's a small win, though it doesn't prevent us from making mistakes when we want to transform the path. But before looking into fixing these problems, we first add the ability to append to the path:

struct Path {
    // ...
    func appending(_ component: String) -> Path {
        return Path(path + "/" + component)
    }
}

06:03 Now we can append components to a path:

let path = Path("/Users/chris")
let path2 = path.appending("test.md")
print(path2) // Path(path: "/Users/chris/test.md")

06:26 Using this API, we end up with nonsense paths if we append to a path that has a trailing slash or if we append a filename twice:

let path = Path("/Users/chris/")
let path2 = path.appending("test.md")
let path3 = path2.appending("test.md")

print(path3) // Path(path: "/Users/chris//test.md/test.md")

07:13 Instead of wrapping a single string, we can let Path wrap an array of path components. A label on the initializer adds a little more clarity:

struct Path {
    var components: [String]
    init(directoryComponents: [String]) {
        self.components = directoryComponents
    }

    func appending(_ component: String) -> Path {
        return Path(directoryComponents: components + [component])
    }
}

09:18 We add a property that renders the components to a path string. Here we make the decision that we only work with absolute paths — so we start the rendered path with a slash:

struct Path {
    // ...
    var rendered: String {
        return "/" + components.joined(separator: "/")
    }
}

Directory or File

10:23 We still have to prevent appending two filenames. We can catch this violation at runtime by checking a Boolean that indicates whether or not we're still allowed to append to the path:

struct Path {
    var components: [String]
    var isFile: Bool

    init(directoryComponents: [String]) {
        isFile = false
        self.components = directoryComponents
    }
    // ...
}

11:20 We need separate appending methods for directories and files. In the file-appending method, we want to initialize a path with isFile set to true. By moving our custom initializer into an extension, we gain back the memberwise initializer, which we'll call in the file-appending method:

struct Path {
    var components: [String]
    var isFile: Bool

    func appending(directory: String) -> Path {
        return Path(directoryComponents: components + [directory])
    }

    func appending(file: String) -> Path {
        return Path(components: components + [file], isFile: true)
    }

    var rendered: String {
        return "/" + components.joined(separator: "/")
    }
}

extension Path {
    init(directoryComponents: [String]) {
        isFile = false
        self.components = directoryComponents
    }
}

12:58 We can check the Boolean at runtime with a precondition:

func appending(directory: String) -> Path {
    precondition(!isFile)
    return Path(directoryComponents: components + [directory])
}

func appending(file: String) -> Path {
    precondition(!isFile)
    return Path(components: components + [file], isFile: true)
}

13:33 Now we get a failure when we run the code and try to append a file twice:

let path = Path(directoryComponents: ["Users", "chris"])
let path2 = path.appending(file: "test.md")
let path3 = path2.appending(file: "test.md")// error

13:36 This error reporting works as long as we have good test coverage. But it'd be even better if the compiler gives us an error before runtime. This means the compiler has to understand the difference between directory paths and file paths, so we introduce a new type parameter for Path:

struct Path<FileType> {
    // ...
}

14:32 Now that we have a type parameter, we can insert a File or Directory type into it. We create these types as enums without any cases. This way, the types don't have any constructors, which means there can't be a value of the types at runtime; they can only exist at compile time:

enum File {}
enum Directory {}

15:04 We get rid of the isFile Boolean, and instead we use the type parameter of Path to express that it points to a file or a directory. We implement this by only appending components within an extension constrained to Directory paths:

struct Path<FileType> {
    var components: [String]

    var rendered: String {
        return "/" + components.joined(separator: "/")
    }
}

extension Path where FileType == Directory {
    init(directoryComponents: [String]) {
        self.components = directoryComponents
    }

    func appending(directory: String) -> Path<Directory> {
        return Path(directoryComponents: components + [directory])
    }

    func appending(file: String) -> Path<File> {
        return Path<File>(components: components + [file])
    }
}

16:40 The appending methods now return a path with an explicit type. If we'd simply return a Path, the compiler would implicitly return a path with the same file type as the path we're appending to.

17:14 We add a private initializer that overwrites the default public memberwise initializer so that users outside the framework can't mess things up:

struct Path<FileType> {
    var components: [String]

    private init(_ components: [String]) {
        self.components = components
    }

    // ...
}

extension Path where FileType == Directory {
    // ...

    func appending(file: String) -> Path<File> {
        return Path<File>(components + [file])
    }
}

18:39 We get an ambiguous reference error from the compiler when we try to append a file to a file path. It's not a very helpful error message, but it'll become clearer when we give the appending methods different names:

func appending(directory: String) -> Path<Directory> {
    return Path(directoryComponents: components + [directory])
}

func appendingFile(_ file: String) -> Path<File> {
    return Path<File>(components + [file])
}

19:25 Now we get the error we want, saying that Path<File> is not convertible to Path<Directory>:

let path = Path(directoryComponents: ["Users", "chris"])

let path2 = path.appendingFile("test.md")
let path3 = path2.appendingFile("test.md") // error

19:33 To confirm our code works, we first append a directory component:

let path = Path(directoryComponents: ["Users", "chris"])
let path1 = path.appending(directory: "Documents")
let path2 = path1.appendingFile("test.md")

Then we check that appending a directory after a file fails:

let path = Path(directoryComponents: ["Users", "chris"])
let path1 = path.appendingFile("test.md")
let path2 = path1.appending(directory: "Documents") // error

20:34 This works much better already. We could also expand Path to understand absolute and relative paths and support combining the two. In Brandon's conference talk, he solves that challenge with another type parameter.

Phantom Types

20:56 As a member of Pinterest's Core Platform team, Brandon works on tooling and build systems that read and write files all the time. An implementation of Path like the one we wrote today helps him prevent many mistakes.

21:49 For an app that loads one or two files, the Path implementation might be too much overhead. But the technique we used can be applied to whatever is most important for your domain.

22:15 Path is a phantom type: at least one of its type parameters isn't used in constructors, but only as a constraint. For an app that does currency conversion, we might use a phantom type to keep track of which currency we're dealing with.

23:12 Another way to look at today's challenge is to think of the program as a state machine. Starting out with a directory path, we can keep appending directory components and we'll stay in the same state. As soon as we append a file, we switch to a different state, where different rules apply. Anytime a program can be seen as a state machine, phantom types can be helpful.