Advent of Code 2023: Day 1

We’re kicking things off the same way as last year: taking the (hopefully) easiest problem and solving it in one of the more difficult languages to code in: Brainfuck.

A brief introduction to Brainfuck

As a brief reminder (or introduction if you’ve never had the pleasure), Brainfuck programs run using an array of memory cells, each storing an integer. There is a pointer into this array, initially pointing at the first cell.

There are only eight instructions for Brainfuck programs:

Command Description
> Move the pointer one cell to the right.
< Move the pointer one cell to the left.
+ Increment the integer in the pointed-at cell.
- Decrement the integer in the pointed-at cell.
, Read from the input into the pointed-at cell.
. Write the pointed-at cell to the output.
[ Jump past the matching ] if the pointed-at cell is 0.
] Jump back to the matching [.

With these instructions, we can start to build higher-level constructs. The simplest of these is [-]. This repeatedly decrements the pointed-at cell until it is equal to zero, or more simply, it sets the pointed-at cell to zero. Add some +s after that, and we now can set the pointed-at cell to any number, regardless of its prior state.

We also will want to be able to copy a value from one cell to another. Let’s first look at how we’d move a value from one cell to another: [->+<]. This decrements the first cell, moves to the next cell, increments it, and then moves back to the first cell. The loop will repeat until the first cell is zero, at which point we will have finished moving the value.

To copy a cell then, we need a temporary cell: we’ll start by moving from the first cell to the second and third cells with [->+>+<<], and then follow up by moving the third cell back to the first cell with >>[-<<+>>]. Note that in the constructs we’ve seen so far, the < and > within […] loops are all balanced: this is usually the case, until you start wanting to use the memory cells as an array.

Another useful high-level construct is to be able to read the input one character at a time until you encounter some specific value. Here’s how you’d read a single line of input: ,----------[++++++++++ $inner ,----------]. This is pretty straightforward, we read a character and subtract 10 so the newline character ('\n' = 10) will become a zero that terminates the loop. The ++++++++++ restores the original character value, though this is optional if you’re happy to process a shifted character value.

We’ll also want conditionals (if/else statements): we can use [[-] $ifbody] to run $ifbody once if the pointed-at cell is nonzero, though this also sets the pointed-at cell to zero, so we may wish to use a temporary and then move the value back with [[->+<] $ifbody]>[-<+>]<.

To implement an else statement, we need to use another memory cell as a flag:

# set the "else flag" to one
>[-]+<

# if the initial memory cell is nonzero (destructive)
[
    [-]
    $ifbody
    # clear the "else flag"
    >-<
]

# if the "else flag" is nonzero, we didn't run $ifbody
>[
    [-]
    $elsebody
]<

Some conventions I will continue to use: # indicates a comment (there is a way to do comments natively in Brainfuck, but this should be familiar to readers), and $ indicates that the supplied code is merely a template for a Brainfuck program: in the case above, $ifbody and $elsebody would need to be replaced with some actual Brainfuck instructions.

Part 1: the problem

Heavens, all that and we haven’t even looked at the problem yet?! As expected, the problem is pretty straightforward: we want the sum of calibration values for each line, and each calibration value is formed by concatenating the first and last digits found in the line. The last line of the example input helpfully points out the edge case where there is only one digit in the line, so the first and last digits will be the same.

We’ll start by designing an algorithm for this that works character-by-character (since that’s how we’ll need to implement it):

  1. Until the end of the input:
    1. Read characters until we find a digit
    2. Add the digit 10 times to the total
    3. Store the digit as the last-seen digit
    4. Read characters until a newline, updating the last-seen digit as we go
    5. Add the last-seen digit to the total
  2. Print the total

Is it a digit?

We have most of the pieces we need already, but there is one tricky part: how do we tell if a character is a digit or not? The digits '0123456789' have ASCII/Unicode values 48, 49, …, 57, so if we knew we had a digit, we could get its value by subtracting 48. But if we have another character like 'a' (value 97), then subtracting 48 will give us 49. So we could subtract 48 and then test if the value is in the range 0-9… but we don’t have a construct for doing that in Brainfuck (we have "if nonzero", and that can be combined with subtraction to get "if equal to", but we don’t have an "if less than" or "if greater than").

Rather than being clever and coming up for a construct for that, let’s do something simpler and just test each of the 10 different digits. Here’s some pseudo-code to explain the approach:

function is_digit(in c, out is_digit, temp c_copy, temp else_flag):
    is_digit = 0
    unroll for $digit in [48...57]:
        c_copy = c - $digit (temporary=else_flag)
        else_flag = 1
        if c_copy != 0:
            else_flag = 0
            c_copy = 0
        else: (flag=else_flag)
            else_flag = 0
            is_digit = 1

We need four memory cells:

  • c: the character to test, which we pinky-promise will be unchanged at the end of the function
  • is_digit: a flag indicating if c is a digit (1) or not (0)
  • c_copy: a temporary variable which will hold c - '0', c - '1', etc. If this is ever zero, it means we’ve found a digit.
  • else_flag: a temporary variable used for copying c to c_copy and indicating if the else conditional needs to be run.

Let’s now turn that into Brainfuck:

# is_digit = 0
>[-]<

# $digit = '0'
    # c_copy = c
    # else_flag = c
    # c = 0
    [->>+>+<<<]

    # c = else_flag
    # else_flag = 0
    >>>[-<<<+>>>]<<<

    # c_copy -= '0'
    >>------------------------------------------------

    # else_flag = 1
    >[-]+
    # if c_copy != 0
    <[
        # c_copy = 0
        [-]
        # else_flag = 0
        >[-]<
    ]
    # else
    >[
        # else_flag = 0
        [-]
        # is_digit = 1
        <<[-]+>>
    ]<<<

# $digit = '1'
    [->>+>+<<<]
    >>>[-<<<+>>>]<<<

    # c_copy -= '1'
    >>-------------------------------------------------

    >[-]+
    <[
        [-]
        >[-]<
    ]
    >[
        [-]
        <<[-]+>>
    ]<<<

# $digit = '2'
    [->>+>+<<<]>>>[-<<<+>>>]<<<
    >>--------------------------------------------------
    >[-]+<[[-]>[-]<]>[[-]<<[-]+>>]<<<
# $digit = '3'
    [->>+>+<<<]>>>[-<<<+>>>]<<<
    >>---------------------------------------------------
    >[-]+<[[-]>[-]<]>[[-]<<[-]+>>]<<<
# $digit = '4'
    [->>+>+<<<]>>>[-<<<+>>>]<<<
    >>----------------------------------------------------
    >[-]+<[[-]>[-]<]>[[-]<<[-]+>>]<<<
# $digit = '5'
    [->>+>+<<<]>>>[-<<<+>>>]<<<
    >>-----------------------------------------------------
    >[-]+<[[-]>[-]<]>[[-]<<[-]+>>]<<<
# $digit = '6'
    [->>+>+<<<]>>>[-<<<+>>>]<<<
    >>------------------------------------------------------
    >[-]+<[[-]>[-]<]>[[-]<<[-]+>>]<<<
# $digit = '7'
    [->>+>+<<<]>>>[-<<<+>>>]<<<
    >>-------------------------------------------------------
    >[-]+<[[-]>[-]<]>[[-]<<[-]+>>]<<<
# $digit = '8'
    [->>+>+<<<]>>>[-<<<+>>>]<<<
    >>--------------------------------------------------------
    >[-]+<[[-]>[-]<]>[[-]<<[-]+>>]<<<
# $digit = '9'
    [->>+>+<<<]>>>[-<<<+>>>]<<<
    >>---------------------------------------------------------
    >[-]+<[[-]>[-]<]>[[-]<<[-]+>>]<<<

Gorgeous.

For the "read characters until we find a digit" step, we’re going to want to have a loop that terminates when we’ve found a digit. Since loops terminate on zero, it’s convenient to have a variant of this which returns 0 if the character is a digit, else 1, which can be constructed by swapping all is_digit = 0 with is_digit = 1 (and vice versa).

Putting it together

We start with our memory layout:

  1. total: The total of all calibration values.
  2. last_seen: The last-seen digit (as a value 0–9).
  3. c: The current character.
  4. is_digit: Whether the current character is a digit or not (though the assignment of 0/1 to true/false is contextual).
  5. c_copy: Temporary variable, primarily for copying c.
  6. else_flag: Temporary variable, primarily for else statements.

The overall program structure then looks like:

# Read a character until EOF.
>>,[
    # Check if that character was a digit.
    $is_digit(false→1, true→0)

    # Until we find a digit:
    >[
        # Read the next character.
        <,
        $is_digit(false→1, true→0)
    ]

    # Set last_seen to the first digit, and add it ten times to the total.
    <------------------------------------------------
    [-<+<++++++++++>>]

    # Read characters until a newline
    ,----------[
        ++++++++++
        # If this character is a digit, store it in last_seen.
        $is_digit(false→0, true→1)
        >[
            [-]
            <------------------------------------------------
            <[-]>[-<+>]>
        ]

        # Read the next character.
        ,----------
    ]

    # Add last_seen to total.
    <[-<+>]>

    # Read the next character (start of the next line).
    ,
]

<<
# Print total,
[>>>>++++++++++<<<<[->+>>+>-[<-]<[<<[->>>+<<<]>>>>+<<-<]<<]++++++++[->++++++<]>[-<+>]>>>>[-<<<<+>>>>]<[-]<<<]<[.<]
# then a newline.
++++++++++.

Running this is a bit of a challenge: the input is 1000 lines long, and each line can add at most 99 to the total, so our total cell needs to be able to hold the number 99000, which rules out 8-bit and 16-bit cell widths. Fortunately, there is an optimizing brainfuck compiler which can recognise some of the constructs in our program and produce equivalent code that runs much faster.

Onwards to Part 2

That wasn’t too tricky, was it? Unfortunately, Part 2 is here to ruin our fun. Now our digits can also be spelled out with letters, so we need to recognise one, two, …, nine. Uh oh, that’s going to be a lot of conditionals…

Here’s the new memory layout:

  1. total: The total of all calibration values.
  2. last_seen: The last-seen digit (as a value 0–9).
  3. multiplier: How many times to add a digit to the total.
  4. c4: The fifth-last character.
  5. c3: The fourth-last character.
  6. c2: The third-last character.
  7. c1: The second-last character.
  8. c0: The last (current) character.
  9. match: does the current word match so far? / else_flag: temporary variable for else statements.
  10. c_copy: Temporary variable, primarily for copying cN.

There’s really only two changes here: we’ve introduced multiplier and a sliding window of the last five input characters. One of the problems of the Part 1 solution is that $is_digit had to be repeated three times. With the addition of the words for digits, we’re going to have a lot of extra code for recognising digits, so we really want to restructure the code to avoid that repetition.

This is where the multiplier cell comes in. We’re going to simplify the algorithm so we just have one loop reading input characters:

  1. Set multiplier to 10
  2. Until the end of the input:
    1. Shift the sliding window (c0 goes to c1, … c3 goes to c4)
    2. Read a character into c0
    3. If that character was a newline:
      1. Add the last-seen digit to the total
      2. Set multiplier to 10
    4. For each digit word (including the actual digits), if the word appears at the end of the sliding window:
      1. Set the last-seen digit to the digit’s value
      2. Add the digit’s value multiplier times to total
      3. Set multiplier to 0
  3. Print the total

The first time we see a digit, multiplier will be 10, so we’ll add it to the total ten times. But for every digit after that, multiplier will be 0, so although the same code path will be run, we’ll add it to the total zero times, having no effect. Then once we see the newline, we’ll add the last digit to the total, and reset multiplier for the first digit of the next line.

Finding words in the sliding window

The phrase "if the word appears at the end of the sliding window" in the algorithm above is suspiciously high-level for a Brainfuck program. Implementing it turns out to be not too difficult, albeit extremely verbose.

Here’s some pseudo-code to explain the approach:

function test_character($letter, in c, out match, temp c_copy):
    c -= $letter
    if c != 0:
        c_copy += c
        c = 0
        match = 0
    c += c_copy
    c += $letter

function test_word($word, in c4, in c3, in c2, in c1, in c0, out match, temp c_copy):
    match = 1
    $if length($word) >= 5:
        $test_character($word[-5], c4, match, c_copy)
    $if length($word) >= 4:
        $test_character($word[-4], c3, match, c_copy)
    $if length($word) >= 3:
        $test_character($word[-3], c2, match, c_copy)
    $if length($word) >= 2:
        $test_character($word[-2], c1, match, c_copy)
    $if length($word) >= 2:
        $test_character($word[-1], c0, match, c_copy)

For 'two', that turns into the following Brainfuck code (the last-seen digit and multiplier adjustments are included):

# match = 1
[-]+

<<<
# $index = 2, $letter = 't'
    # c2 -= 't'
    --------------------------------------------------------------------------------------------------------------------
    # if c2 != 0:
    [
        # c_copy = c2; c2 = 0
        [->>>>+<<<<]
        # match = 0
        >>>[-]<<<
    ]
    # c2 += c_copy; c_copy = 0
    >>>>[-<<<<+>>>>]<<<<
    # c2 += 't'
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    >

# $index = 1, $letter = 'w'
    # c1 -= 'w'
    -----------------------------------------------------------------------------------------------------------------------
    # if c1 != 0:
    [
        # c_copy = c1; c1 = 0
        [->>>+<<<]
        # match = 0
        >>[-]<<
    ]
    # c1 += c_copy; c_copy = 0
    >>>[-<<<+>>>]<<<
    # c1 += 'w'
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    >
# $index = 0, $letter = 'o'
    # c0 -= 'o'
    ---------------------------------------------------------------------------------------------------------------
    # if c0 != 0:
    [
        # c_copy = c0; c0 = 0
        [->>+<<]
        # match = 0
        >[-]<
    ]
    # c0 += c_copy; c_copy = 0
    >>[-<<+>>]<<
    # c0 += 'o'
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    >

Putting it together (again)

Once we’ve tested a word, match will be 1 if we found the digit word at the end of the sliding window, so we’re nicely set up to make the adjustments to last_seen, multiplier and total from our algorithm.

function check_digit_word($word, $value, in c4, in c3, in c2, in c1, in c0, temp match, temp c_copy):
    $test_word($word, c4, c3, c2, c1, c0, match, c_copy)
    if match != 0:
        match = 0
        last_seen = $value
        total += $value * multiplier
        multiplier = 0

For 'two', $value = 2 and will turn into (omitting $test_word) this Brainfuck code:

# if match != 0
[
    # match = 0
    [-]
    # last_seen = 2
    <<<<<<<[-]++
    # total += 2 * multiplier; multiplier = 0
    >[-<<++>>]
    >>>>>>
]

We now have everything we need to implement the entire algorithm:

# set multiplier to 10
>>++++++++++
# c0 = read()
>>>>>,
# while c0 != eof:
[
    # else_flag = 1
    >+<
    # if c0 != '\n'
    ----------[
        # c_copy = c0; c0 = 0
        [->>+<<]
        # else_flag = 0
        >[-]<
    ]
    # c0 += c_copy
    >>[-<<+>>]<<
    # c0 += '\n'
    ++++++++++
    # else / if else_flag == 1 / if c0 == '\n'
    >[
        # else_flag = 0
        [-]
        # add last-seen digit to total
        <<<<<<<[-<+>]
        # multiplier = 10
        >[-]++++++++++
        >>>>>>
    ]

    $check_digit_word($word="one", $value=1)
    $check_digit_word($word="two", $value=2)
    $check_digit_word($word="three", $value=3)
    $check_digit_word($word="four", $value=4)
    $check_digit_word($word="five", $value=5)
    $check_digit_word($word="six", $value=6)
    $check_digit_word($word="seven", $value=7)
    $check_digit_word($word="eight", $value=8)
    $check_digit_word($word="nine", $value=9)
    $check_digit_word($word="0", $value=0)
    $check_digit_word($word="1", $value=1)
    $check_digit_word($word="2", $value=2)
    $check_digit_word($word="3", $value=3)
    $check_digit_word($word="4", $value=4)
    $check_digit_word($word="5", $value=5)
    $check_digit_word($word="6", $value=6)
    $check_digit_word($word="7", $value=7)
    $check_digit_word($word="8", $value=8)
    $check_digit_word($word="9", $value=9)

    # update the character window
        # c4 = 0
        <<<<<[-]
        # c4 += c3; c3 = 0
        >[-<+>]
        # c3 += c2; c2 = 0
        >[-<+>]
        # c2 += c1; c1 = 0
        >[-<+>]
        # c1 += c0; c0 = 0
        >[-<+>]
        # c0 = read()
        ,
]

<[-]<[-]<[-]<[-]<[-]<[-]<
# Print total
[>>>>++++++++++<<<<[->+>>+>-[<-]<[<<[->>>+<<<]>>>>+<<-<]<<]++++++++[->++++++<]>[-<+>]>>>>[-<<<<+>>>>]<[-]<<<]<[.<]
# Print '\n'
++++++++++.

A complete code listing, debugging interpreter and a tool to strip comments/whitespace are available in the repository.

Comments