00:06 Wouter Swierstra is back, and
we're going to talk about the importance of choosing the right types in our
code. In Swift, we can use classes, structs, and enums, as well as optionals and
results, and all these types have different meanings for the code we write.
00:39 We have a lot of freedom in writing our code, but we should be
considerate about choosing types and leverage the type system in such a way that
our code precisely models the data of the domain we're working in.
01:08 A library that lays out text might define various ways of aligning
text, and it could do so using integers, where 0 means left-aligned, 1 means
right-aligned, and 2 means centered. But using an integer to represent text
alignment allows for values without any meaning. For example, how should the
library deal with the value 27? It makes sense to define the possible values
with an enum, thus ensuring only valid values can exist.
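As a minimal sketch of that idea (the names TextAlignment, describe, and alignment(fromLegacyCode:) are our own for illustration, not taken from any real layout library), an enum makes the compiler check that every alignment is handled, and any conversion from integers has to say explicitly what happens with a value like 27:

```swift
// Hypothetical alignment type: only these three values can exist.
enum TextAlignment {
    case left, right, center
}

// The switch is exhaustive, so no default case is needed.
func describe(_ alignment: TextAlignment) -> String {
    switch alignment {
    case .left: return "left-aligned"
    case .right: return "right-aligned"
    case .center: return "centered"
    }
}

// Converting from the integer encoding forces us to decide what a
// meaningless value means. Here we return nil for it.
func alignment(fromLegacyCode code: Int) -> TextAlignment? {
    switch code {
    case 0: return .left
    case 1: return .right
    case 2: return .center
    default: return nil
    }
}
```

The question of what to do with 27 doesn't disappear, but it's now answered once, at the boundary, instead of in every function that receives an alignment.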
URLSession Completion Handler
01:44 In Foundation, we find an API where the types used may cause some
confusion. When we let URLSession load data from a URL, we have to supply a
completion handler that takes three parameters:
func dataTask(with request: URLRequest, completionHandler: @escaping (Data?, URLResponse?, Error?) -> Void) -> URLSessionDataTask
02:12 The three parameters are all optional, so any or all of them can
be present or absent. This means that the completion handler, in theory, has to
deal with eight possible states. We can see that more clearly if we write out
the parameters in a different way:
struct CallbackInfo {
    var data: Data?
    var response: URLResponse?
    var error: Error?
}
03:13 In reality, some of the eight possible states may never occur.
We can read the documentation to get a hint of how our callback might be called,
but documentation doesn't give us guarantees like the compiler and the type
system would.
03:25 We can think about which states make sense ourselves. We could
assume that if we get data, then we don't get an error. And the other way
around: if we get an error, there's no data. We could model these two situations
as an enum:
enum CallbackInfo2 {
    case success(Data, URLResponse)
    case failure(Error)
}
04:11 But we don't know for sure that this enum covers all possible
states. Of the eight possible states of the CallbackInfo struct, there might
be some that can occur but that the CallbackInfo2 enum can't express.
04:25 It's hard to tell which situations may happen and which ones never
will just by reading the documentation of the data task method. The case could
be made that an enum-based approach, like CallbackInfo2, does a better job at
making clear to the user which situations need to be handled.
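To make the gap between the two representations concrete, here is a sketch of a conversion from the three optionals to an enum. The names CallbackState and classify are hypothetical, and the rules for what counts as success or failure are our own assumption, not something the documentation guarantees. We add an explicit third case for every combination the two-case enum can't express:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLResponse lives here on Linux
#endif

// Hypothetical enum: the two expected states plus a catch-all.
enum CallbackState {
    case success(Data, URLResponse)
    case failure(Error)
    case unexpected  // any combination the two cases above can't express
}

// Assumption: success means data and response without an error,
// and failure means an error without data.
func classify(data: Data?, response: URLResponse?, error: Error?) -> CallbackState {
    switch (data, response, error) {
    case (let data?, let response?, nil):
        return .success(data, response)
    case (nil, _, let error?):
        return .failure(error)
    default:
        return .unexpected
    }
}
```

Of the eight combinations, only one maps to success and two map to failure under these assumptions. The remaining five end up in the catch-all case, which is exactly the set of states we can't rule out by reading the documentation alone.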
04:59 On the other hand, we can't say that an enum is always better than
optional arguments. If we're dealing with four or five optionals and all
possible combinations may occur, then we'd have to define a huge enum with 16
or 32 cases. Doing so probably wouldn't make the usage of such an API less
complicated.
05:27 We can already illustrate this problem with our own example. Let's
say that the failure case also carries an optional Data? value:
enum CallbackInfo2 {
    case success(Data, URLResponse)
    case failure(Data?, Error)
}
Given that both cases can contain data, it would make more sense for the data to
be provided as an optional property instead of being tucked away in the
associated values of the enum, where it's harder to access.
05:54 Neither enums nor a set of optionals are always the better choice;
it depends on the possible states that we're trying to model.
User Session
06:04 The second example comes from Apple's book on Swift, The Swift
Programming Language:
struct Session {
    var user: User?
    var expired: Bool
}
06:31 Here we have a user session. The user property is optional,
since there may not be a registered user. This model of the user session allows
for four possible states: a user can be present or not, and the expired
Boolean property can be true or false.
07:13 The Swift book then goes on to say that we can alternatively model
the session as an enum, thereby eliminating the state that doesn't occur (in
which we have no user and an expired session):
enum Session1 {
    case loggedIn(User)
    case expired(User)
    case notRegistered
}
07:55 The enum version models the domain more precisely than the struct
does, because it can only represent possible states of the user session.
08:25 But like in the previous example, we now have two cases that share
an associated value, and we have to switch over the enum in order to extract a
User from the session. However, there's a third way of modeling the session
that makes it easier to access the user without becoming any less precise:
struct Session {
    var user: User
    var expired: Bool
}
var session: Session?
08:45 Here we're using a struct again, but this time the user property
is not optional. Instead, the session itself is stored in an optional variable.
A possible state is that session is nil, which means the same as the
notRegistered case of the Session1 enum. In the other two states, there is a
session, and therefore also a user, and the session either is or isn't expired.
09:28 We come across this situation quite a lot: when multiple cases of
an enum share the same associated value, then we can often wrap the enum in a
struct and pull the associated value out into a property of the struct.
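One way to convince ourselves that the optional struct expresses exactly the same three states as the enum is to write the conversion between them. This is a sketch with renamed types (SessionState and UserSession, plus a stand-in User with a single name property) so that it compiles on its own:

```swift
// Stand-in for the real user type, just for this sketch.
struct User {
    var name: String
}

// The enum model: two cases share the User associated value.
enum SessionState {
    case loggedIn(User)
    case expired(User)
    case notRegistered
}

// The struct model: the shared value becomes a plain property.
struct UserSession {
    var user: User
    var expired: Bool
}

extension UserSession {
    // Each enum case maps to exactly one value of UserSession?.
    init?(_ state: SessionState) {
        switch state {
        case .loggedIn(let user): self.init(user: user, expired: false)
        case .expired(let user): self.init(user: user, expired: true)
        case .notRegistered: return nil
        }
    }
}

let session = UserSession(.expired(User(name: "Ada")))
// No switch needed to reach the user.
let name = session?.user.name
```

Because every enum case has exactly one counterpart, nothing is lost in precision, and reading the user becomes a simple optional chain.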
Mapping File Names to Data
09:46 Let's look at another example. Suppose we have an array of file
names as strings, and we're writing a function that maps over the array and
returns data from the files. What should the result type of this function be?
10:05 The function could simply return an array of Data:
func readFiles(_ fileNames: [String]) -> [Data] {
}
10:33 That works fine, but what happens if one of the files doesn't
exist or if it fails to be read? The function can leave out that file's data and
return the rest, but as users, we have no way of knowing which of the files
failed.
11:08 The result type could also be an array of optionals:
func readFiles(_ fileNames: [String]) -> [Data?] {
}
This way we can try to figure out which of the files are successfully loaded,
but we can't be absolutely sure, because we don't have the guarantee that the
result array is ordered the same way as the input array.
11:55 And we might want to report an error about missing files, so
perhaps the function should return the file names along with the optional data
values, combined in tuples:
func readFiles(_ fileNames: [String]) -> [(String, Data?)] {
}
12:20 Another option is to make the entire array optional. That makes
the result all or nothing: we either get data from all requested files, or
something failed with one of the files and we get no results at all:
func readFiles(_ fileNames: [String]) -> [(String, Data)]? {
}
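To compare the last two signatures, here is a sketch that fills in their bodies. The function names readEachFile and readAllFiles are our own, and we assume the file names are paths that can be passed to URL(fileURLWithPath:). The all-or-nothing variant is built on top of the per-file variant:

```swift
import Foundation

// Per-file variant: every input name appears in the output, paired with
// its data, or with nil if reading failed for any reason.
func readEachFile(_ fileNames: [String]) -> [(String, Data?)] {
    fileNames.map { name -> (String, Data?) in
        let data = try? Data(contentsOf: URL(fileURLWithPath: name))
        return (name, data)
    }
}

// All-or-nothing variant: a single failure discards everything.
func readAllFiles(_ fileNames: [String]) -> [(String, Data)]? {
    var loaded: [(String, Data)] = []
    for (name, data) in readEachFile(fileNames) {
        guard let data = data else { return nil }
        loaded.append((name, data))
    }
    return loaded
}
```

The second function can be derived from the first but not the other way around, which shows that the optional array carries strictly less information about what went wrong.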
12:50 Even for a simple function like the one above, we can easily think
of seven variations. For example, we could decide to return a Result instead
of an optional, or maybe we want to include a custom enum describing different
kinds of failures. Choosing between types depends entirely on what makes the
most sense for the application.
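As a sketch of the Result variation, the version below pairs every file name with either its data or a reason for the failure. The FileReadError enum and the name readFileResults are hypothetical, and the split into "missing" and "unreadable" is just one way we might choose to classify failures:

```swift
import Foundation

// Hypothetical error type describing why a file couldn't be read.
enum FileReadError: Error {
    case missing(String)
    case unreadable(String, underlying: Error)
}

func readFileResults(_ fileNames: [String]) -> [(String, Result<Data, FileReadError>)] {
    fileNames.map { name -> (String, Result<Data, FileReadError>) in
        // Check for existence first so we can report a missing file
        // separately from other read failures.
        guard FileManager.default.fileExists(atPath: name) else {
            return (name, .failure(.missing(name)))
        }
        do {
            let data = try Data(contentsOf: URL(fileURLWithPath: name))
            return (name, .success(data))
        } catch {
            return (name, .failure(.unreadable(name, underlying: error)))
        }
    }
}
```

Compared to the optional-based signatures, the caller now learns why each file failed, at the cost of a more involved result type to unpack.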
Preciseness vs. Ease of Use
13:33 We could go even further and try to enforce the fact that the
input and output arrays of the readFiles function should have the same length.
There are certain programming languages that let you express this, but in Swift
we also have some tricks that can help out.
14:12 We could try to tag an array with its length somehow. Then we
could define a map function that preserves the length and use this map to
implement readFiles. But we'd be pushing how much information we can put into
types, and we should ask ourselves if the added complexity is worth it.
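One possible shape of such a trick is sketched below. It's our own illustration, not a technique shown in the episode: LengthTagged is a hypothetical wrapper whose Tag parameter stands for "has the same length as everything else with this tag". The array is immutable and the only transformation offered is a length-preserving map. The tag is only as trustworthy as the code that applies it, since Swift can't stop us from tagging two arrays of different lengths with the same type:

```swift
import Foundation

// Hypothetical wrapper: Tag is a phantom type that never holds a value.
struct LengthTagged<Tag, Element> {
    let elements: [Element]

    init(_ elements: [Element], tag: Tag.Type) {
        self.elements = elements
    }

    // map keeps the tag, so the result is known to have the same length.
    func map<T>(_ transform: (Element) -> T) -> LengthTagged<Tag, T> {
        LengthTagged<Tag, T>(elements.map(transform), tag: Tag.self)
    }
}

// The signature now states that output and input share a length,
// and building on map is the only way to produce the return value.
func readFiles<Tag>(_ fileNames: LengthTagged<Tag, String>) -> LengthTagged<Tag, Data?> {
    fileNames.map { try? Data(contentsOf: URL(fileURLWithPath: $0)) }
}
```

Even in this small form, callers have to wrap and unwrap their arrays and invent tag types, which gives a feel for the complexity we'd be trading for the extra guarantee.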
15:05 Having less strict types means that we need to trust the
implementation of a piece of code more. And we can always write tests that check
how that code behaves when we feed it lots of sample input.
15:28 The standard library has plenty of examples where, in favor of
simplicity of use, the types used are not the most precise in describing what
they do. For one, an Array is indexed by integers (Int), and not by unsigned
integers (UInt), even though indices are never negative.
The same goes for the count of an array or a string. An amount can never be a
negative number, so it would be more precise to use UInt instead of Int. But
this would make the count property more difficult to use, because in most
cases, we'd have to convert the type to Int before passing it on to some other
API.
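A small sketch of what that friction would look like. Here we simulate an unsigned count ourselves by converting, since the real count property returns Int; the variable names are only for illustration:

```swift
let fileNames = ["a.txt", "b.txt", "c.txt"]

// With Int, arithmetic on the count just works.
let lastIndex = fileNames.count - 1

// Simulating a UInt-based count: mixing it with Int values needs
// an explicit conversion at every use.
let unsignedCount = UInt(fileNames.count)
let offset = -1
// let broken = unsignedCount + offset   // does not compile: UInt + Int
let lastIndexAgain = Int(unsignedCount) + offset
```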
16:32 Choosing the right types means dealing with the tradeoff between
preciseness and ease of use. The best approach is to explore different types and
to settle on the one that most accurately describes what we're modeling,
without admitting junk values that represent impossible states.
Using Phantom Types
17:04 In our episode about phantom types with Brandon Kase, we discussed
the concept of tagging types in order to make the types both more descriptive
and more restrictive, and in doing so, we leverage the type system to prevent
incorrect usage of our APIs.
17:49 We can find a practical example of phantom types being used in
Auto Layout. There, the anchors of views are tagged with a phantom type to
distinguish between horizontal and vertical anchors. This makes it impossible
to, for example, constrain a leading anchor to a top anchor, which wouldn't
make sense.
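The idea can be sketched without UIKit or AppKit. The types below (XAxis, YAxis, Anchor, Constraint) are simplified stand-ins of our own, not the real API. The actual frameworks express this through the generic NSLayoutAnchor class and its NSLayoutXAxisAnchor and NSLayoutYAxisAnchor subclasses:

```swift
// Phantom tags: empty enums that can never be instantiated.
enum XAxis {}
enum YAxis {}

// Axis only appears in the type, never in a stored value.
struct Anchor<Axis> {
    let name: String
}

struct Constraint<Axis> {
    let first: Anchor<Axis>
    let second: Anchor<Axis>
}

extension Anchor {
    // Both anchors must share the same Axis tag.
    func constraint(equalTo other: Anchor<Axis>) -> Constraint<Axis> {
        Constraint(first: self, second: other)
    }
}

let leading = Anchor<XAxis>(name: "leading")
let trailing = Anchor<XAxis>(name: "trailing")
let top = Anchor<YAxis>(name: "top")

let horizontal = leading.constraint(equalTo: trailing)
// let mixed = leading.constraint(equalTo: top)   // does not compile
```

The tag costs nothing at runtime, yet the nonsensical constraint is rejected before the code ever runs.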
18:29 The matter of accurately modeling the data of a particular domain
automatically comes up in communities for strongly typed languages. One of the
main Elm developers, Richard Feldman, gave a talk about making impossible
states impossible. And there have been similar talks about F#, OCaml, and
Haskell as well, all discussing how to find the data representation that lets
you define nothing but the functions that make sense.
And we'll leave it there. See you next time.