Evaluating Expressions

Ledger Mac App: Evaluating Expressions

In this episode29:36

This episode is freely available thanks to the support of our subscribers

Subscribers get exclusive access to new and all previous subscriber-only episodes, video downloads, and 30% discount for team members. Become a Subscriber →

Expressions are at the heart of Ledger. We write an evaluator for this expression language in a test-driven way.

00:06 Let's look at evaluating expressions in Ledger. When we display a Ledger file, we first parse the file. After that, we evaluate the contents of the parsed file. This episode is about the step that comes after parsing: computing everything. 00:29 To illustrate, here are some sample expressions in a Ledger file:

2016/07/25 Lunch
  Expenses:Food    10 USD
  Assets:Checking

00:34 We're interested in the expressions on the right-hand side of a posting. For example, the Expenses:Food posting has the expression 10 USD. This is the simplest possible expression: it's a number with a currency. 00:53 In Ledger, we can also have more complicated expressions. For example, we can calculate on the right-hand side, and we could say `(10 USD

5)`. 01:08 We can also perform all kinds of different calculations: add, divide, multiply, and so on. We can even use variables. For example, we could introduce a variable first and then use it in an expression:

define numberOfPeople=5
2016/07/25 Lunch
  Expenses:Food    (10 USD * numberOfPeople)
  Assets:Checking

01:35 We have to handle all these cases in our evaluator. The topic of evaluating expressions might seem a bit obscure at first. However, many apps have some form of expression evaluation. 01:53 If you look at a spreadsheet, for example, or at a compiler, it's the same thing: you get a parsed expression and you need to do something with it — maybe interpret it or compile it. Even in apps like Photoshop or Sketch, you can often type in the measurements as an expression, so you could write: 100px/2, which evaluates to 50px. 02:18 Expression evaluation is an interesting problem to solve, and once you know how to solve it, you can apply the techniques everywhere.

Representing the Expressions

02:24 That said, let's look at how to do this in Ledger. 02:26 The first step is to represent the expressions. Currently, when we look at our Xcode project, we already have an Amount struct defined. It represents the combination of a number and a commodity — for example, 10 USD:

struct Amount: Equatable {
    var value: LedgerDouble
    var commodity: Commodity?
    
    init(value: LedgerDouble, commodity: Commodity? = nil) {
        self.value = value
        self.commodity = commodity
    }
}

02:51 If Ledger only supported amounts for postings, we would already be done. However, since we can write expressions, we need to wrap the above. We can start by writing an Expression enum for all the different cases in an expression. The most straightforward case is the amount case, which is what we use whenever we have an expression that's just an amount. Later on, we'll add cases for operators and identifiers:

enum Expression {
    case amount(Amount)
    // operator
    // variable
}

03:49 Let's write a test for this. We want to evaluate an Expression and compute the final amount we get out, so we'll start with a simple test. For the .amount case, it's easy — we test whether or not an Expression.amount evaluates to the corresponding amount:

class ExpressionEvaluationTests: XCTestCase {
    func testAmount() {
        // 5 EUR
        let expr = Expression.amount(Amount(value: 5, commodity: "EUR"))
        XCTAssertEqual(expr.evaluate(), Amount(value: 5, commodity: "EUR"))
    }
}

04:19 In our finished application, we get expressions out of the parser. Here, however, we instantiate expressions in our test code. 05:15 Let's write an extension and implement the evaluate method. The method returns an Amount, and the initial implementation is straightforward: we switch on self, and because we only have a single case, we just return the amount. 05:43 Now the tests pass:

extension Expression {
    func evaluate() -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        }
    }
}

Operators

05:54 In a second test case, we test an operator. We'd like to test an expression like 5 + 2, so we write an expression representing the addition. We need a second Expression case for the operator, which we could call infixOperator. It will have three associated values: an operator name, and two amounts. We'd like to test that it evaluates to 7:

func testOperator() {
    // 5 + 2
    let expr = Expression.infixOperator("+", Amount(value: 5), Amount(value: 2))
    XCTAssertEqual(expr.evaluate(), Amount(value: 7))
}

07:06 Before we make our test pass, let's add a small convenience helper to read and write these tests, because they get a little bit difficult to read after a while. We can make our Amount struct conform to the IntegerLiteralConvertible protocol so that we can initialize an Amount from an integer literal:

extension Amount: IntegerLiteralConvertible {
    init(integerLiteral value: LedgerDouble) {
        self.value = value
        self.commodity = nil
    }
}

08:10 We can now simplify our tests:

func testOperator() {
    // 5 + 2
    let expr = Expression.infixOperator("+", 5, 2)
    XCTAssertEqual(expr.evaluate(), 7)
}

08:35 In Expression, we now add the infixOperator case:

enum Expression {
    case amount(Amount)
    case infixOperator(String, Amount, Amount)
    // variable
}

09:09 The compiler helps us by warning that the switch statement in evaluate is no longer exhaustive, so let's implement the + operator. We'll switch on self and add a new case, infixOperator. We match on the + symbol and define variables for the left-hand and right-hand sides:

extension Expression {
    func evaluate() -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        case let .infixOperator("+", lhs, rhs):
            // TODO
    }
}

09:38 Here we want to return an amount. We could, for example, write lhs + rhs. However, there's not yet a definition of + on Amount. So before we do that, let's add a default case to the switch statement and make it crash for now. This is something we'll fix later:

extension Expression {
    func evaluate() -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        case let .infixOperator("+", lhs, rhs):
            return lhs + rhs
        default:
            fatalError()
        }
    }
}

10:12 We now need a + operation for amounts, and we can simply define it as a function that takes two Amounts and returns an Amount. Because we can't add amounts like 5 USD + 2 EUR, we add a guard statement to ensure that the commodities are equal. However, what do we do in the else case? We could crash or we could return nil, but we choose to throw an error. This way, we know what went wrong, and we can use the information during evaluation.

Error Messages

11:16 Because there are a lot of different places where things could go wrong, it would be nice to throw an error with more specific information than just a nil value. 11:26 We made String conform to the Error protocol, and that way we can throw a simple String value. We also need to mark our function as throws. 11:54 Now that we're outside of the guard, we can create a new Amount with the two values added together, along with their commodity:

func +(lhs: Amount, rhs: Amount) throws -> Amount {
    guard lhs.commodity == rhs.commodity else {
        throw "Commodities don't match"
    }
    return Amount(value: lhs.value+rhs.value, commodity: lhs.commodity)
}

12:15 Now we'll have to add try everywhere. We need to mark the call to + with a try keyword, mark evaluate as throws, and add try keywords in our tests:

extension Expression {
    func evaluate() throws -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        case let .infixOperator("+", lhs, rhs):
            return try lhs + rhs
        default:
            fatalError()
        }
    }
}

// ...

class ExpressionEvaluationTests: XCTestCase {
    func testAmount() {
        // 5 EUR
        let expr = Expression.amount(Amount(value: 5, commodity: "EUR"))
        XCTAssertEqual(try! expr.evaluate(), Amount(value: 5, commodity: "EUR"))
    }

    // ...
    }

12:36 Now our tests pass.

12:48 The next step is to add another operator. We currently cover +, so let's add * for multiplication. We'll start by copying and modifying our test for addition:

func testMultiplication() {
    // 5 * 2
    let expr = Expression.infixOperator("*", 5, 2)
    XCTAssertEqual(try! expr.evaluate(), 10)
}

13:19 The test-driven approach works really well for the problem we're now solving, but it would be hard to test all of this manually, as we don't have an easy way to construct expressions without a parser and a UI. We don't often write tests for large parts of our projects because it's just the two of us, and the tradeoff isn't worth it. But here, it's perfect.

14:10 When we run our test, we now crash in our default case, and we'll implement multiplication by again copy-pasting the code for addition. We'll copy the case in the switch statement:

// ...
case let .infixOperator("*", lhs, rhs):
    return try lhs * rhs
// ...

14:41 For the * operator, we can copy-paste the + operator and modify it accordingly:

func *(lhs: Amount, rhs: Amount) throws -> Amount {
    guard lhs.commodity == rhs.commodity else {
        throw "Commodities don't match"
    }
    return Amount(value: lhs.value*rhs.value, commodity: lhs.commodity)
}

15:04 Now all the tests pass, but our code isn't nice — and it's repetitive. We need to refactor, and we have the tests in place, so it should be possible.

Refactoring

15:23 First, we could remove the duplication in the operator definitions, and we could write them as an extension on Amount. The method takes an amount and an operator function with type (LedgerDouble,LedgerDouble) -> LedgerDouble. It returns an Amount as well. And just like the original operator definitions, this method can throw:

extension Amount {
    func compute(operatorFunction: (LedgerDouble,LedgerDouble) -> LedgerDouble, other: Amount) throws -> Amount {
        guard commodity == other.commodity else {
            throw "Commodities don't match"
        }
        return Amount(value: operatorFunction(value,other.value), commodity: commodity)
    }
}

We can now delete our operator definitions and use our compute method inside evaluate:

extension Expression {
    func evaluate() throws -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        case let .infixOperator("+", lhs, rhs):
            return try lhs.compute(operatorFunction: +, other: rhs)
        case let .infixOperator("*", lhs, rhs):
            return try lhs.compute(operatorFunction: *, other: rhs)
        default:
            fatalError()
        }
    }
}

17:49 The tests still pass.

17:57 Next, we should improve our switch statement. Having the default case is dangerous, as we might forget to implement a case on Expression, so let's restructure the switch statement. We add an infixOperator case, and we match on the operator symbol in a nested switch. This allows us to move the default case inside the nested switch. This is an improvement, because the default isn't at the outermost switch anymore. If we were to add another case to Expression, we'd get a warning again:

extension Expression {
    func evaluate() throws -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        case let .infixOperator(op, lhs, rhs):
            switch op {
            case "+":
                return try lhs.compute(operatorFunction: +, other: rhs)
            case "*":
                return try lhs.compute(operatorFunction: *, other: rhs)
            default:
                fatalError()
            }
        }
    }
}

Removing Duplication

19:18 We can do even better, because the cases for addition and multiplication are almost identical — only the operator is different. Instead of the switch, we can put the operators in a dictionary with the symbols as the keys. 20:52 In case we can't look up an operator, we can throw an error instead of crashing. We can also get rid of the switch statement, and our code becomes a lot simpler. Our tests still run:

extension Expression {
    func evaluate() throws -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        case let .infixOperator(op, lhs, rhs):
            let dict: [String: (LedgerDouble, LedgerDouble) -> LedgerDouble] = [
                "+": (+),
                "*": (*)
            ]
            guard let operatorFunction = dict[op] else { throw "Undefined operator: \(op)" }
            return try lhs.compute(operatorFunction: operatorFunction, other: rhs)
        }
    }
}

21:40 The name dict isn't very descriptive, so let's change it to operators.

21:52 Also, we're cheating a bit. We can represent expressions like 2+7, but we can't represent expressions like 2+7+8. We defined our infixOperator with Amounts, and we'd like the operands to be Expressions instead, so we change the associated value to Expression:

enum Expression {
    case amount(Amount)
    case infixOperator(String, Expression, Expression)
    // variable
}

22:27 In our evaluate method, we can't simply call compute on the left-hand and right-hand sides. Instead, we first need to recursively evaluate the operands:

extension Expression {
    func evaluate() throws -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        case let .infixOperator(op, lhs, rhs):
            let operators: [String: (LedgerDouble, LedgerDouble) -> LedgerDouble] = [
                "+": (+),
                "*": (*)
            ]
            guard let operatorFunction = operators[op] else { throw "Undefined operator: \(op)" }
            let left = try lhs.evaluate()
            let right = try rhs.evaluate()
            return try left.compute(operatorFunction: operatorFunction, other: right)
        }
    }
}

23:02 We also need to change our tests:

func testOperator() {
    // 5 + 2
    let expr = Expression.infixOperator("+", .amount(5), .amount(2))
    XCTAssertEqual(try! expr.evaluate(), 7)
}

23:18 The code doesn't work yet. The compiler tells us that the Expression enum needs to be marked as indirect because we defined it recursively. After doing that, the tests pass, and we can have complicated nested expressions.

Adding Variables

23:56 There's a final element missing: we want to use variables within our expressions. Let's write a test for that. More specifically, we'd like to test that an identifier lookup works. When we evaluate the expression numberOfPeople, we expect the result to be five. This means that we also need some way to specify that the numberOfPeople variable currently has a value of five. Of course, we don't really know up front where it will be defined. Just like in a programming language, a variable could be defined as a local variable, or maybe it's defined one scope up, or somewhere else. To represent this, we'll add a context dictionary with the current variables and their values, and we'll pass the context dictionary to our evaluate method:

func testIdentifier() {
    // numberOfPeople
    let expr = Expression.identifier("numberOfPeople")
    let context: [String: Amount] = ["numberOfPeople": 5]
    XCTAssertEqual(try! expr.evaluate(context: context), 5)
}

26:16 We'll need to change our evaluate method and add a parameter for the context. We'll also need to pass it on recursively:

extension Expression {
    func evaluate(context: [String:Amount]) throws -> Amount {
        switch self {
        case .amount(let amount):
            return amount
        case let .infixOperator(op, lhs, rhs):
            let operators: [String: (LedgerDouble, LedgerDouble) -> LedgerDouble] = [
                "+": (+),
                "*": (*)
            ]
            guard let operatorFunction = operators[op] else { throw "Undefined operator: \(op)" }
            let left = try lhs.evaluate(context: context)
            let right = try rhs.evaluate(context: context)
            return try left.compute(operatorFunction: operatorFunction, other: right)
        }
    }
}

26:48 Additionally, we'll add a third case to our Expression enum:

indirect enum Expression {
    case amount(Amount)
    case infixOperator(String, Expression, Expression)
    case identifier(String)
}

26:52 In the evaluate method, we look up the identifier in the context dictionary. In the nil case, we simply throw an error:

case .identifier(let name):
    guard let value = context[name] else {
        throw "Unknown variable \(name)"
    }
    return value

27:46 We update all our tests and add an empty context. Rather than adding a default value to the evaluate method, we require callers to specify the value explicitly. This way, we can't accidentally forget to pass it on.

28:12 Our tests now pass. Let's add a final test for recursive expressions, where we combine multiplication and an identifier into a single expression:

func testMultiplicationWithIdentifier() {
    // 10 * numberOfPeople
    let expr = Expression.infixOperator("*", .amount(10), .identifier("numberOfPeople"))
    XCTAssertEqual(try! expr.evaluate(context: ["numberOfPeople": 5]), 50)
}

28:54 Because we already implemented this, the test passes straight away.

29:06 It almost feels like we're implementing a little scripting language, which makes sense because the techniques you'd use for evaluating an interpreted language are exactly the same. In the next episode, we'll look into evaluating entire transactions and postings.