00:06 In this episode, we'll continue adding functionality to IndexSet
by conforming to more protocols. By conforming to Sequence
in the last
episode, we were able
to iterate over the values, and we gained helper methods like map
, filter
,
and contains
.
Now we want to make our IndexSet
conform to Collection
as well, which will
look similar to a lot of stuff we did when we implemented SortedArray
, which
was our first example of a custom collection. This time, it'll be more
complicated because IndexSet
doesn't have a simple index type like Int
. We
need a new index type that allows us to index into the ranges and consequently
into an element of a range.
01:37 We create an Index
struct nested inside IndexSet
. It has two
properties: an index for the range, and one for the element in that range, for
which we use the index type of episode
#38's RangeType
:
extension IndexSet {
struct Index {
let rangeIndex: Int
let elementIndex: RangeType.Index
}
}
To conform to the Collection
protocol, we have to implement startIndex
,
endIndex
, subscript
, and index(after:)
.
Creating startIndex and endIndex
02:22 The startIndex
is the index of the first element of the first
range, so its rangeIndex
will simply be the startIndex
of ranges
, and the
elementIndex
is the first range's startIndex
. However, this is an optional
because there could be no ranges at all. In such a case, we return a zero for
elementIndex
, which isn't an integer but rather a RangeType.Index
. For this
dummy zero index, we return the startIndex
of a fake range:
var startIndex: Index {
let zero = (0...0).startIndex
return Index(rangeIndex: ranges.startIndex, elementIndex: ranges.first)?.startIndex ?? zero
}
03:40 The endIndex
is somewhat similar but perhaps less
straightforward. It has to meet three conditions:
-
The endIndex
must be out of bounds.
-
The endIndex
must be equal to startIndex
if the collection is empty.
-
All valid indices of the collection should compare as less than the
endIndex
.
For the rangeIndex
, we can return the endIndex
of ranges
, which already
places the entire index out of bounds. For elementIndex
, we can use the same
dummy value as before, which satisfies the second condition:
var endIndex: Index {
let zero = (0...0).startIndex
return Index(rangeIndex: ranges.endIndex, elementIndex: zero)
}
Comparable
04:59 To meet the third condition, the index type needs to be
Comparable
. We make our Index
conform to this protocol in another extension
by writing the functions ==
and <
. The equals function is pretty easy; it's
true if both subindices are equal:
extension IndexSet.Index: Comparable {
static func ==(lhs: IndexSet.Index, rhs: IndexSet.Index) -> Bool {
return lhs.rangeIndex == rhs.rangeIndex && lhs.elementIndex == rhs.elementIndex
}
}
The less than comparison has some more cases to consider. Firstly, we can
compare the rangeIndex
of both sides — if they compare less than, we can
already return true
. If they compare equal, we're in the same range and we can
return the comparison of the two sides' elementIndex
. In all other cases, we
can return false
:
extension IndexSet.Index: Comparable {
static func <(lhs: IndexSet.Index, rhs: IndexSet.Index) -> Bool {
if lhs.rangeIndex < rhs.rangeIndex { return true }
if lhs.rangeIndex == rhs.rangeIndex {
return lhs.elementIndex < rhs.elementIndex
}
return false
}
}
Subscript
07:38 The subscript
function takes an Index
and returns the
collection's element type: Int
. We can simply return the corresponding value
without checking the bounds. This means we can cause a crash, but that's
actually the expected behavior:
subscript(index: Index) -> Int {
return ranges[index.rangeIndex][index.elementIndex]
}
Advancing with index(after:)
08:27 The tricky part is index(after:)
, in which we have to return the
next index. Again, there are a few cases: we have to return either a next index
within the same range or the first index of a next range, or we should return
endIndex
, which signifies the end of the collection.
First we'll take the range of the given Index
. If the range's next index is
still in bounds, we can return it:
func index(after index: Index) -> Index {
let range = ranges[index.rangeIndex]
let nextElementIndex = range.index(after: index.elementIndex)
if nextElement < range.endIndex {
return Index(rangeIndex: index.rangeIndex, elementIndex: nextElementIndex)
}
}
Otherwise, we try to get the next range's index, which could be within or out of
bounds. If it's within bounds, we return the first index of that range.
Otherwise, we can return our endIndex
:
func index(after index: Index) -> Index {
let nextRangeIndex = ranges.index(after: index.rangeIndex)
if nextRangeIndex < ranges.endIndex {
let nextRange = ranges[nextRangeIndex]
return Index(rangeIndex: nextRangeIndex, elementIndex: nextRange.startIndex)
} else {
return endIndex
}
}
11:35 That's a lot of typing, and it's easy to make mistakes here. Plus,
writing an integer-based collection would be less complicated than our compound
index type, so the question is how much we've gained by doing it this way. For
example, we can try to use subscript
with startIndex
to get the first index
element out:
var set = IndexSet()
set.insert(4...5)
set.insert(0...2)
let idx = set.startIndex
set[idx]
12:31 We have to remember to actually declare our Collection
conformance:
extension IndexSet: Collection {
}
Choosing Protocols
13:04 By implementing Collection
on top of Sequence
, we didn't
really gain a lot of extra functionality. We can count
:
set.count
But for other methods, like last
, we'd have to also implement
BidirectionalCollection
by adding index(before:)
.
14:08 You could look at conforming to protocols as a statement about
your type, and not just as a means to gain functionality. By implementing
Collection
, you're saying it's OK to iterate over this type multiple times.
With a Sequence
, you might only be able to iterate over it once, e.g. standard
input. Our IndexSet
can be iterated over as many times as we want, so making
it into a Collection
makes more sense.
14:39 Also, a Collection
promises that using subscript
takes a
constant time, which, in this case, is a requirement we can fulfill. Array
, on
the other hand, also implements RandomAccessCollection
, which we could
technically implement, but the protocol has the soft requirement that you can
perform count
in constant time. We can't promise that because our type has
different characteristics. As such, we shouldn't conform to that protocol.
15:17 Another important aspect to note when working with indices is the
fact that stored indices might become invalid when you mutate the collection.
For example, if you remove an element from an IndexSet
, its range could split
into two ranges. The same goes for Arrays
: if you remove an element, all the
indices after that element will point to different elements or they'll be out of
bounds.
16:21 So that's it for our custom collection type. Implementing these
protocols turned out to be not much work. If you have a data type that works
like a collection, it could be worth the effort to improve readability or
encapsulate functionality.