Declarative Text Kit: Expression and Evaluation

May 27th, 2024

In DeclarativeTextKit, I found it useful to have an “expression” abstraction in my vocabulary to represent the instructions of the Domain-Specific Language (DSL).

Here is an example of a valid block of changes:

Select(LineRange(buffer.selection)) { selectedLineRange in
    Modifying(selectedLineRange) { editedRange in
        Delete(
            location: editedRange.location,
            length: editedRange.length / 2
        )
    }

    Select(selectedLineRange.endLocation)
}

This uses two kinds of Swift Result Builder to define the DSL’s grammar:

- Select(...), with a block that is assembled via a CommandSequenceBuilder. That block contains:
  - Modifying(...), which is a Command that goes into the CommandSequenceBuilder. Its own block is assembled via a ModificationBuilder (that accepts either Deletes or Inserts).
    - Delete, which is a Modification that goes into the ModificationBuilder block. Once it’s applied, the editedRange.length is halved, “bubbling up” this information.
  - Select(...), which is a Command that goes into the CommandSequenceBuilder, this time without a block. It only executes the side effect of moving the cursor to the end location after the deletion has been applied. At that point, the selectedLineRange reflects the changes from the Delete command.

The rules of the grammar are essentially this:

1. You cannot mix Delete and Insert modifications. It’s either-or.
2. You can mix some commands, like chaining Modifying blocks and adding a Select or two.
3. You cannot mix the chainable commands from (2) with the modifications from (1).

My initial approach led me to a separation into two Result Builder blocks: @ModificationBuilder for modifications, and @CommandSequenceBuilder for commands that on their own don’t modify the content of a text buffer.

Expression

I noodled my way towards an abstraction until I arrived at the insight that I’m essentially sketching instructions of a scripting or programming language with this DSL, and that a proper name for this would be “expression” (or “statement”). It represents an instruction that will be executed or rather interpreted later, and produces an “evaluation result”.

The types I started with, Expression and its associated Evaluation, are these:

protocol Expression {
    associatedtype Evaluation
    associatedtype Failure: Error
    func evaluate(in buffer: Buffer) -> Result<Evaluation, Failure>
}

That could be anything. I started with Insert (for which I had already written the Line- and string-based pieces), and in its raw form, it would look something like this:

struct Insert {
    let location: Buffer.Location
    let content: Insertable
    
    func evaluate(
        in buffer: Buffer
    ) -> Result<ChangeInLength, BufferAccessFailure> {
        do { 
            let changeInLength = try content.insert(into: buffer)
            return .success(changeInLength)
        } catch {
            return .failure(.wrap(error))
        }
    }
}

ChangeInLength is a wrapper around an Int value I use to forward information about the total change from a Modifying block. It’s returned from Insertable as it is being applied because this information is not static:

If you recall how Line(...) works, you’ll remember that it wraps the string with "\n" left and right as needed, so its actual change in length is dynamic: for any string of length N, the change in length is between N and N+2. Many things can go wrong on the way: the buffer may not be readable at the insertion point for some reason, or it may be write-locked, so we prepare to get errors.
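To make the "between N and N+2" range concrete, here is a minimal sketch. TextBuffer and insertLine are hypothetical stand-ins and not DeclarativeTextKit's API, and the padding rule (add a line break on a side only when the neighboring character is not already one, and never at the very start or end of the buffer) is my assumption based on the description above.

```swift
// Hypothetical stand-in for a text buffer; not DeclarativeTextKit's Buffer.
struct TextBuffer {
    var content: String
}

// Simplified wrapper around the number of characters added.
struct ChangeInLength: Equatable {
    let delta: Int
}

// Assumed rule: pad with "\n" on a side unless the insertion point
// already sits at a line boundary on that side.
func insertLine(_ string: String, into buffer: inout TextBuffer, at offset: Int) -> ChangeInLength {
    let index = buffer.content.index(buffer.content.startIndex, offsetBy: offset)
    let needsLeadingBreak = index > buffer.content.startIndex
        && buffer.content[buffer.content.index(before: index)] != "\n"
    let needsTrailingBreak = index < buffer.content.endIndex
        && buffer.content[index] != "\n"
    let wrapped = (needsLeadingBreak ? "\n" : "") + string + (needsTrailingBreak ? "\n" : "")
    buffer.content.insert(contentsOf: wrapped, at: index)
    // The change is only known once the surrounding text has been inspected.
    return ChangeInLength(delta: wrapped.count)
}
```

Under these assumptions, inserting a 3-character line into the middle of "ab" changes the length by 5, while inserting it into an empty buffer changes it by 3. That is why the value has to come back from applying the insertion instead of being computed up front.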

Delete can work in a similar way: change the buffer, but expect a negative change in length. Which leads us to an abstraction to group both modification commands:

Modification Expressions

The primitive commands like Insert and Delete can be grouped as a specialization of Expression called Modification.

A Modification evaluates in a Buffer and returns either a ChangeInLength, or a failure produced by Buffer mutation methods:

protocol Modification: Expression
where Evaluation == ChangeInLength, Failure == BufferAccessFailure { }

extension Insert: Modification { ... }
extension Delete: Modification { ... }

Both modification types can be used in a Result Builder block, but not mixed. The trick here is to not define Result Builder functions on the protocol, but on the concrete types, and thus only allow one or the other in sequence:

@resultBuilder public struct ModificationBuilder { }

extension ModificationBuilder {
    static func buildPartialBlock(first: Insert) -> Insert
    static func buildPartialBlock(accumulated: Insert, next: Insert) -> Insert
    static func buildArray(_ components: [Insert]) -> Insert
}

extension ModificationBuilder {
    static func buildPartialBlock(first: Delete) -> Delete
    static func buildPartialBlock(accumulated: Delete, next: Delete) -> Delete
    static func buildArray(_ components: [Delete]) -> Delete
}

If you were wondering: Two modifications of the same kind coalesce into one modification with two elements (both are backed by a sorted array under the hood), but that’s an unimportant implementation detail here.
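Here is a self-contained sketch of that trick with method bodies filled in. The Insert and Delete types are simplified stand-ins (plain arrays, no sorting, no buffer access), and the modifications function is a hypothetical entry point that only exists to run the builder block. It assumes Swift 5.7 or later for buildPartialBlock.

```swift
// Simplified stand-ins for the real modification types.
struct Insert {
    var contents: [String]
    init(_ content: String) { contents = [content] }
}

struct Delete {
    var ranges: [Range<Int>]
    init(_ range: Range<Int>) { ranges = [range] }
}

@resultBuilder
struct ModificationBuilder {
    // Insert-only chain: two Inserts coalesce into one.
    static func buildPartialBlock(first: Insert) -> Insert { first }
    static func buildPartialBlock(accumulated: Insert, next: Insert) -> Insert {
        var merged = accumulated
        merged.contents += next.contents
        return merged
    }

    // Delete-only chain: two Deletes coalesce into one.
    static func buildPartialBlock(first: Delete) -> Delete { first }
    static func buildPartialBlock(accumulated: Delete, next: Delete) -> Delete {
        var merged = accumulated
        merged.ranges += next.ranges
        return merged
    }
}

// Hypothetical entry point that just runs the builder block.
func modifications<M>(@ModificationBuilder _ body: () -> M) -> M { body() }

let inserts = modifications {
    Insert("Hello")
    Insert("World")
}

let deletes = modifications {
    Delete(0..<2)
    Delete(5..<7)
}

// Mixing kinds does not compile, because no overload accepts
// (accumulated: Insert, next: Delete):
//
// let mixed = modifications {
//     Insert("Hello")
//     Delete(0..<2)
// }
```

Because the overloads are declared on the concrete types, the compiler rejects a mixed block at build time instead of the library having to validate it at run time.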

With this group defined, it feels natural to define another group for the other expressions. We’ll see later how this “natural” inclination bit me eventually.

Commands: Non-Modifying Expressions

With the @ModificationBuilder ready, the Select and Modifying expressions needed to become a separate group with a Result Builder of their own, so that I could mix these two in one block.

Their branch in the vocabulary was called Command:

protocol Command: Expression
where Evaluation == Void { }  // Failure is still `any Error`

This allowed me to create a new Result Builder for Command-conforming types, which excluded Insert/Delete automatically. Since mixing is allowed, using the Command protocol in the builder functions is perfectly fine. The implementation is so simple I can show it in full:

@resultBuilder struct CommandSequenceBuilder {
    static func buildBlock(_ components: any Command...) -> CommandSequence {
        CommandSequence(components)
    }
}

struct CommandSequenceFailure: Error {
    let wrapped: Error
}

struct CommandSequence: Command {
    let commands: [any Command]
    init(_ commands: [any Command]) {
        self.commands = commands
    }

    // Favor the throwing alternative of the protocol extension (read on)
    @_disfavoredOverload  
    func evaluate(in buffer: Buffer) -> Result<Void, CommandSequenceFailure> {
        do {
            for command in commands {
                try command.evaluate(in: buffer)
            }
            return .success(())
        } catch {
            return .failure(CommandSequenceFailure(wrapped: error))
        }
    }
}

Collect each Command in a block into an array, then execute each Command in sequence. Wrap errors in a common type to make working with Result easier, and return () on success.

A Result<Void, CommandSequenceFailure> is awkward to work with, but with protocol extensions, you can make it feel like any regular Void-returning function:

extension Expression {
    @inlinable @inline(__always)
    func evaluate(in buffer: Buffer) throws -> Evaluation {
        try self.evaluate(in: buffer).get()
    }
}

For an Evaluation of type Void, this is just a throwing function without a return value.

Thanks to the @_disfavoredOverload annotation, the Swift compiler will try to default to the throwing overload from the extension. The Result-based function is best suited for the API internals, because you have strongly typed errors, but the throwing version feels nicer to use in most external call sites.
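As a sketch of how the two overloads coexist, the following example pairs a Result-based requirement with the throwing convenience. Buffer, CountCharacters, EmptyBufferFailure, and characterCount are hypothetical names for illustration only. To stay independent of @_disfavoredOverload, this sketch leaves the attribute out and lets the surrounding type context pick the overload instead.

```swift
// Hypothetical minimal buffer; the real Buffer protocol is richer.
final class Buffer {
    var content: String
    init(_ content: String) { self.content = content }
}

protocol Expression {
    associatedtype Evaluation
    associatedtype Failure: Error
    func evaluate(in buffer: Buffer) -> Result<Evaluation, Failure>
}

extension Expression {
    // Throwing convenience that unwraps the Result-based requirement.
    func evaluate(in buffer: Buffer) throws -> Evaluation {
        try self.evaluate(in: buffer).get()
    }
}

struct EmptyBufferFailure: Error {}

// Hypothetical expression: yields the buffer's character count,
// or fails if the buffer is empty.
struct CountCharacters: Expression {
    typealias Evaluation = Int
    typealias Failure = EmptyBufferFailure

    func evaluate(in buffer: Buffer) -> Result<Int, EmptyBufferFailure> {
        if buffer.content.isEmpty {
            return .failure(EmptyBufferFailure())
        }
        return .success(buffer.content.count)
    }
}

// The declared Result type selects the strongly typed overload.
let typed: Result<Int, EmptyBufferFailure> = CountCharacters().evaluate(in: Buffer("hello"))

// The Int return type selects the throwing overload from the extension.
func characterCount(in buffer: Buffer) throws -> Int {
    try CountCharacters().evaluate(in: buffer)
}
```

Internal code can switch over the typed Result and its concrete failure type, while callers that only want the value use the throwing form and handle errors with do/catch.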

This leg of the journey was really fun because I enjoy building up vocabularies like this. This all took a while to actually write, fiddle with, and get to work properly. I didn’t think of the ChangeInLength type immediately in top-down planning. That came naturally during the process. Tests were written and edge cases found. The Buffer protocol needed to be tweaked and changed – the more features I added, the more text buffer inspections and transformations needed to be performed.

Next time, I’ll tell you about how I needed to change all of this a couple of days later.