Advent of Code 2023: Day 5

My language choice for Day 5’s puzzle is brought to you by the wonderful combination of a what if spark of inspiration and a concerning amount of stubbornness. The inspiration was this: what if we could make a text adventure game where the puzzle input resulted in the game computing the correct output?

Inform 7 is a domain specific language for creating interactive fiction, where the source code is largely written in natural language, though punctuation and indentation is structurally significant, and (unsurprisingly) sentences often need to be written in specific ways to avoid ambiguity.

Part 1

Our input is structured as a list of seeds, followed by a number of maps that transform the seed numbers to soil numbers, to fertilizer numbers, and so on. The types of things don’t really matter, so all we need to do is read the seed list, then (repeated seven times) read a map and transform the numbers, and finally output the smallest value.

Numbers

There is an unfortunate complication though. The actual input has unsigned 32-bit integers, but the best native integer type Inform offers is signed 32-bit (targeting Glulx) or 16-bit (targeting Z-machine) integers. There’s a few options for working around this:

  • Restrict to Glulx, use signed 32-bit integers, representing 0 as -231, up to 232-1 as 231-1. This lets us use signed comparison, addition and subtraction, but would require custom parsing and display routines. The representation is also somewhat annoying to understand when debugging.
  • Restrict to Glulx, combine the unsigned parts of two 32-bit integers to implement up to 62-bit unsigned integers. This would require custom operators for comparison, addition, subtraction, parsing and display. The representation is also somewhat annoying to understand when debugging.
  • Combine the unsigned parts of at least three 16-bit integers to implement up to 45-bit unsigned integers. Again, requires implementing everything, and the representation is also not suited to debugging.
  • Use a string of the decimal representation. This does not require parsing and display routines1, but would require custom comparison, addition and subtraction routines. This is likely the least performant option, but does have the advantage of being arbitrary precision and having an easily understood representation when debugging.

Since performance is not a concern (and because, as far as I can tell, bitwise operators are only available by embedding Inform 6 code), I went with the last option.

Reading the seeds list

Let’s get onto some code:

"Advent of Code 2023 Day 5.1" by flowblok

Chapter 0 - The Setting

The Garden is a room.

This is just a preamble, specifying the title and author of the story. Inform 7 source code can be organised by a hierarchy of headings (volumes, books, parts, chapters and sections), primarily to improve navigation of the source. Inform 7 organises the world into rooms, so we need at least one room (for the player to start in), so we make one called "The Garden".

Chapter 1 - On the Identification of Seeds

Defining the seed list is an action applying to one topic.
Understand "seeds [text]" as defining the seed list.
The seed list is a list of text which varies.

Carry out defining the seed list:
    now the seed list is the words of the topic understood.

There’s a couple of things going on here: firstly, we define an action called "defining the seed list", which applies to (has a parameter of) one topic (some text). We also need to specify how the player can trigger this action, so we configure Inform 7’s player command parsing to understand that seeds [text] should invoke that action (with everything after the word seeds as the topic). We then define a global variable called "the seed list" which is a list of text (since we’re using the decimal representation for our numbers). And finally, we specify what should happen when the "defining the seed list" action is carried out: now the seed list is the words of the topic understood would be written in traditional programming languages something like seed_list = input_topic.split(" ").

There’s no input validation here, we’re just going to assume that the player actually entered numbers. Keen readers may also note that the puzzle input looks like "seeds: 79 14 55 13", but our parsing only works with "seeds 79 14 55 13". Unfortunately, Inform assumes that the colon means we’re asking something called "seeds" to perform the "79" action on "14 55 13"2, which is not very easy to make work for our processing. To work around this, we can hook into the command parsing and remove the colons before the parser tries to understand what action is being invoked on what:

Section - Input Sanitization

After reading a command:
    let T be "[the player's command]";
    replace the regular expression ":" in T with "";
    change the text of the player's command to T.

Reading maps

Each map is a list of (destination start, source start, range length) entries that specifies the interval [source start, source start + range length) should be remapped to the interval [destination start, destination start + range length). We only need to store one map at a time, so we do so in a table.

Chapter 2 - On the Reading of Various Kinds of Maps

Table of Translation
Destination start (text)    Source start (text) Source end (text)
with 100 blank rows

Defining a map is an action applying to one topic.
Understand "[text] map" as defining a map.

Carry out defining a map:
    blank out the whole of the Table of Translation.

Entering a map row is an action applying to one topic.
Understand "[text]" as entering a map row.

Carry out entering a map row:
    let the number list be the words of the topic understood;
    choose a blank row in the Table of Translation;
    now destination start entry is entry 1 of the number list;
    now source start entry is entry 2 of the number list;
    now source end entry is (source start entry + entry 3 of the number list) - "1".

Our Table of Translation has three columns, and many blank rows. Rather than storing the length of the range, we store source start + range length - 1 (the inclusive end of the source range), which will aid in searching through the map.

When we see "seed-to-soil map" or "soil-to-fertilizer map" (or similar), we understand that as the "defining a map" action, which clears the table of its previous values.

Each map row will be three numbers, but there’s no word at the start we can use to help with parsing. Fortunately though, our overall format is not overly complex, so we can understand anything (which doesn’t match another action) as the "entering a map row" action. The arithmetic in this action is using [text] + [text] and [text] - [text] operators (which we haven’t implemented yet, but will define later).

Transforming the seeds list

Once the map has been entered, the player will enter a blank line, and we should apply the mapping to the seeds list (which then should be called the soil list, but we won’t bother tracking the name changes).

Chapter 3 - On the Application of Maps to Seeds

To decide which text is (input - text) after being mapped (this is mapping):
    repeat through the Table of Translation:
        if input is numerically at least source start entry and input is numerically at most source end entry:
            let offset be input - source start entry;
            decide on the destination start entry + offset;
    decide on input.

Finishing a map is an action applying to nothing.
[Understand "" as finishing a map.]

Carry out finishing a map:
    now the seed list is mapping applied to the seed list.

We define a phrase (a first-class function) that takes a string (decimal number) and remaps it according to the current entries in the Table of Translation. The remapping is pretty simple: we loop through all the entries in our map, and if our input is in the source interval, we figure out its offset to determine its place in the destination interval. If it doesn’t match anything, it stays as-is.

Then when the player enters a blank line, we carry out the "finishing a map" action, which applies the "mapping" phrase to every entry in the seed list. There is (of course) a complication: you’d think that Understand "" as finishing a map. would configure Inform to understand the empty line, but it’s actually a parser error, so we need to hook into the parser in a different way to get it to actually work:

Rule for printing a parser error when the latest parser error is the I beg your pardon error:
    try finishing a map instead.

Computing the answer

The actions above don’t keep track of which maps is being entered, so we need to figure out which one is the last and print out the answer after its mapping has been applied. This turns out to be beautifully simple:

Chapter 4 - On the Production of the Answer

After finishing a map for the eighth time:
    say "[the minimization reduction of the seed list]".

Inform keeps track of how many times the action is invoked for us, so we just need to find the smallest entry in the seed list. To do this, we rely on having a phrase for finding the smaller of two decimal numbers, and then reduce over the list (this is more or less the maximization example in the docs).

Interlude: decimal numbers

That application logic wasn’t too hard to implement, but that’s mostly because I hand-waved over all those numerical operations. We can skip equality (assuming we never have leading zeros in numbers), so let’s start by writing a phrase that can decide if one number is smaller, equal, or larger than another:

Comparisons

[This is an enum. Maybe there's an existing/better way of doing this, but I couldn't find it.]
Comparison is a kind of value. The comparisons are smaller, equal and larger.

To decide what comparison is (N - text) compared to (M - text):
    if the number of characters in N < the number of characters in M, decide on smaller;
    if the number of characters in N > the number of characters in M, decide on larger;
    repeat with i running from 1 to the number of characters in N:
        if character number i in N < character number i in M, decide on smaller;
        if character number i in N > character number i in M, decide on larger;
    decide on equal.

If one number has more digits than the other, it’s the larger one. If we have the same number of digits, then moving from the highest-value digit to lowest, whichever number has a higher value digit is larger. There is an assumption in this code: it compares the digits as strings, so it relies on the lexicographic comparison being the same, which appears to be true for the Z-machine and Glulx targets. We could do the string to number conversion, but uh, you’ll see why I wanted to avoid that soon enough.

We can then build on top of this phrase to provide the <, <=, >= and > relations:

[<]
Numerically less than relates a text (called N) to a text (called M) when N compared to M is smaller.
The verb to be numerically less than means the numerically less than relation.
[<=]
Numerically at most relates a text (called N) to a text (called M) when N compared to M is not larger.
The verb to be numerically at most means the numerically at most relation.
[>=]
Numerically at least relates a text (called N) to a text (called M) when N compared to M is not smaller.
The verb to be numerically at least means the numerically at least relation.
[>]
Numerically greater than relates a text (called N) to a text (called M) when N compared to M is larger.
The verb to be numerically greater than means the numerically greater than relation.

There is also one last thing to define: in order to use the minimization reduction when computing the answer, we need:

To decide what text is the smaller of (N - text) and (M - text) (this is minimization):
    if N compared to M is smaller, decide on N;
    decide on M.

Addition

Consider how you would perform 972 + 35 by hand:

  • add the last digit of each: 2 + 5 = 7, so write 7 (→ 7)
  • add the second-last digit of each: 7 + 3 = 10, so write 0 and carry the 1 (→ 07)
  • add the carry to the first digit of the left number: 9 + 1 = 10, so write 0 and carry the 1 (→ 007)
  • write the carry (→ 1007)

We’re going to do something very similar:

  • convert "972" + "35" to lists: [9, 7, 2] + [3, 5]
  • reverse each list: [2, 7, 9] + [5, 3]
  • pad with zeros to make them the same length: [2, 7, 9] + [5, 3, 0]
  • produce a new list adding the digits together, keeping track of the carry:
    • [2, 7, 9] + [5, 3, 0] = []
    • [7, 9] + [3, 0] = [7]
    • [9] + [0] = [7, 0] (+carry)
    • [] + [] = [7, 0, 0] (+carry)
  • add the final carry (if there is one):
    • = [7, 0, 0, 1]
  • reverse the list: [1, 0, 0, 7]
  • convert back into a string: "1007"

Here’s the Inform implementation:

To decide what text is the (N - text) + (M - text):
    let N digits be the digits of N;
    let M digits be the digits of M;
    reverse N digits;
    reverse M digits;
    extend N digits to the number of entries in M digits entries;
    extend M digits to the number of entries in N digits entries;
    let result digits be a list of numbers;
    let carry be 0;
    repeat with i running from 1 to the number of entries in N digits:
        let digit sum be entry i of N digits + entry i of M digits + carry;
        add the remainder after dividing digit sum by 10 to result digits;
        now carry is digit sum / 10;
    if carry > 0, add carry to result digits;
    reverse result digits;
    decide on the concatenation reduction of str applied to result digits.

This code has a suprising numbers of dependencies. The line let N digits be the digits of N uses this phrase:

To decide what list of numbers is the digits of (N - text):
    let digits be a list of numbers;
    repeat with i running from 1 to the number of characters in N:
        add the digit value of character number i in N to digits;
    decide on digits.

Which itself relies on this phrase:

To decide what number is the digit value of (N - text):
    if N is "1", decide on 1;
    if N is "2", decide on 2;
    if N is "3", decide on 3;
    if N is "4", decide on 4;
    if N is "5", decide on 5;
    if N is "6", decide on 6;
    if N is "7", decide on 7;
    if N is "8", decide on 8;
    if N is "9", decide on 9;
    decide on 0.

Sigh.

Moving swiftly along, the line extend N digits to the number of entries in M digits entries is also notable: it fills any extra values needed with the default value of the list, which since we have a list of numbers will be zero. The final point of interest is turning the list back into a string: the concatenation reduction of str applied to result digits says to apply the str phrase to each entry of the list (converting each number into a string) and then reduce the list with the concatenation phrase, which joins the strings together. Of course, we need to define these phrases:

To decide what text is the concatenation of (X - text) and (Y - text) (this is concatenation):
    decide on "[X][Y]".

To decide what text is the str of (X - number) (this is str):
    decide on "[X]".

Subtraction

You know, let’s just skip over this: it’s very similar to addition. The only thing to call out is that I decided not to implement negative numbers, so my subtraction operator only implements N - M under the condition N >= M.

Part 2

It turns out we’ve been reading the input wrong! The line seeds: 79 14 55 13 actually specifies two intervals of seeds to plant: [79, 79 + 14) and [55, 55 + 13). This is a modest 27 seed numbers, but looking at the actual input suggests we’re going to have billions of seeds, so expanding these ranges would be rather ill-advised.

Reading the seed ranges

Instead, we’ll track each range as a list of [start, end], and adjust our algorithms to work on lists of ranges.

Chapter 1 - On the Identification of Seeds

Defining the seed ranges is an action applying to one topic.
Understand "seeds [text]" as defining the seed ranges.
The seed ranges is a list of lists of text which varies.

Carry out defining the seed ranges:
    change the seed ranges to have 0 entries;
    repeat with pair running through the pairs of words of the topic understood:
        let range be a list of text;
        add entry 1 of pair to range;
        add (entry 1 of pair + entry 2 of pair) - "1" to range;
        add range to the seed ranges.

When the seed list is entered, we convert each pair of start/length into start/end ranges. Of course, we also have to define what a pair of words is:

Section - Text Manipulation

[This phrase assumes there are an even number of words.]
To decide which list of lists of text is the pairs of words of (strings - text):
    let result be a list of lists of text;
    repeat with i running from 1 to the number of words in strings:
        let pair be a list of text;
        add word number i in strings to pair;
        add word number i + 1 in strings to pair;
        add pair to result;
        now i is i + 1;  [skip the second word of the pair]
    decide on result.

Reading maps

This section is slightly different to Part 1: for reasons that will shortly become apparent, we’re going to add a column for the number of digits in the source start.

Chapter 2 - On the Reading of Various Kinds of Maps

Table of Translation
Destination start (text)    Source start (text) Source end (text)   Source start length (number)
with 100 blank rows

Defining a map is an action applying to one topic.
Understand "[text] map" as defining a map.

Carry out defining a map:
    blank out the whole of the Table of Translation.

Entering a map row is an action applying to one topic.
Understand "[text]" as entering a map row.

Carry out entering a map row:
    let the number list be the words of the topic understood;
    choose a blank row in the Table of Translation;
    now destination start entry is entry 1 of the number list;
    now source start entry is entry 2 of the number list;
    now source start length entry is the number of characters in source start entry;
    now source end entry is (source start entry + entry 3 of the number list) - "1".

Transforming the seed ranges

In Part 1, we just needed to find the mapping for a single number, but now we need to find the mapping for intervals. Let’s work through each step of that process:

Chapter 3 - On the Application of Maps to Seeds

Carry out finishing a map:
    sort the Table of Translation in source start order;
    sort the Table of Translation in source start length order;

The first thing we do is to sort the Table of Translation by the source start order. Unfortunately, this is using lexicographical order, so "10" < "2". But fear not: sorting tables is defined to be stable, so if we next sort by the length of the source start column (helpfully in its own explicit column), the rows within each length will remain sorted lexicographically, which gives us a numerical sort.

    now the seed ranges is the sorted interval union of the seed ranges;
    let the destination ranges be a list of lists of text;

Next, we get the "sorted interval union" of the seed ranges. We’ll define this later, but this takes the seed ranges, sorts them by their interval start and then merges any adjacent/overlapping intervals together, so we have a canonical, minimal representation of the intervals.

We also define the destination ranges to store the mapped intervals as we go.

    let the starting row be 1;
    repeat with seed range running through the seed ranges:
        repeat with N running from the starting row to the number of filled rows in the Table of Translation:
            now the starting row is N;
            choose row N in the Table of Translation;

Nested loops, in my interactive fiction? But fear not: both the seed ranges and the Table of Translation are sorted, so for each seed range, we scan through the Table of Translation to find out how it should be mapped, but then we know that the rows before that point cannot possibly be useful for processing future seed ranges, so we keep starting row updated to note where we can start from (so this is linear time).

Now we need to match a few cases for how the seed range intervals correspond to the intervals in the Table of Translation:

            if entry 1 of seed range is numerically less than source start entry:
                [seed range start < source start, so we have an unmapped subrange]
                let unmapped subrange be a list of text;
                add entry 1 of seed range to unmapped subrange;
                let last possible unmapped number be the source start entry - "1";
                add the smaller of entry 2 of seed range and the last possible unmapped number to unmapped subrange;
                let new source end be entry 2 of unmapped subrange + "1";
                now entry 1 of seed range is new source end;
                add unmapped subrange to the destination ranges;
                [if seed range is now empty, there's no point looking at any more translations]
                if seed range is an empty interval, break;

If we have a seed range like [2, 20] and the first row of the Table of Translation is for the source range [10, 30], then we need to first split off the [2, 9] part, as it is unmapped and should just be added to the destination ranges as-is. We’d then continue processing with the interval [10, 20].

But we also could have had a seed range [0, 5]: in that case, we add it entirely to the destination ranges, and then update the seed range to be [6, 5]. The if seed range is an empty interval condition catches this case and breaks out of the loop: we entirely processed this seed range and can move onto the next one. Note that starting row remains at this row, as our next seed range could also hit this case (e.g. [7, 9]).

            [seed range start >= source start, but we also need seed range start <= source end for overlap]
            if entry 1 of seed range is numerically at most source end entry:
                let mapped subrange be a list of text;
                let destination start be the mapping of entry 1 of seed range on row N;
                let subrange source end be the smaller of entry 2 of the seed range and source end entry;
                add destination start to mapped subrange;
                add the mapping of subrange source end on row N to mapped subrange;
                add mapped subrange to destination ranges;
                now entry 1 of seed range is subrange source end + "1";
                [if seed range is now empty, there's no point looking at any more translations]
                if seed range is an empty interval, break;

The next case is for when the seed range interval overlaps the source range in the Table of Translation. The previous case ensures that the seed range start is now at least the source range start3, but our interval might be entirely after this source range, so we need to check our start is before the source range end.

If it is, then we need to find the subinterval which overlaps and apply the mapping to the destination range. The destination start is simple: we know the seed range start is in the source range, so we just need to map it, which we do with the mapping of entry 1 of seed range on row N. The arithmetic here is a little messy inlined, so I’ve split off the source → destination mapping for a single number to a separate phrase.

The end of the interval is a little trickier, since it could extend outside the source range of the current table row: so we find the subrange source end as the smaller of the seed range and and the source range end, and then map it (using the source → destination mapping phrase again) to find the destination end.

In case the seed range does extend past the source range, we adjust the seed range start to source range end + 1. If it didn’t, then this will make the interval empty and we’ll break out of the loop. If it did, then we’ll continue on to the next table row with the rest of the interval to be mapped.

        [we reached the end of the (possibly-empty) table]
        unless seed range is an empty interval:
            [with a non-empty range, so it must be unmapped]
            add seed range to destination ranges;

Dropping out of the Table loop, if we reached the end of the table and our seed range is still not empty, then we have another unmapped range.

    now the seed ranges is the destination ranges.

And finally, once we’ve mapped all the seed ranges into destination mappings, it replaces seed ranges. Phew.

And let’s tidy up a couple of loose threads:

To decide which text is the mapping of (input - text) on row (N - number):
    choose row N in the Table of Translation;
    decide on destination start entry + (input - source start entry);

To decide if (X - list of text) is an empty interval:
    if entry 1 of X is numerically greater than entry 2 of X, decide yes;
    decide no.

Finding the sorted interval union

We tidied up the loose threads, now let’s tidy up a loose rope: we need a phrase to find the "sorted interval union" of the seed ranges, which I defined as sorting by the interval start and merging any adjacent or overlapping intervals together.

To decide if (R1 - list of text) is before (R2 - list of text) (this is interval start order):
    if entry 1 of R1 is numerically less than entry 1 of R2, decide yes;
    decide no.

To decide which list of lists of text is the sorted interval union of (ranges - list of lists of text):
    let result be a list of lists of text;
    now ranges is ranges sorted by interval start order;
    repeat with range running through ranges:
        [The first entry cannot be merged with a previous interval.]
        if the number of entries in result is 0:
            add range to result;
            next;
        [We can merge if the start of this range <= the end of the previous range + 1]
        let this interval start be entry 1 of range;
        let the previous interval end be entry 2 of the last entry of result;
        if this interval start is numerically at most the previous interval end + "1":
            let this interval end be entry 2 of range;
            now entry 2 of the last entry of result is the larger of the previous interval end and this interval end;
        otherwise:
            add range to result;
    decide on result.

As you can see, we sort the ranges by interval start order, which is defined above as comparing the first entries (interval start) numerically. Then we go through each interval. The first gets added to result unconditionally. For every interval after that though, we look at the previous interval in result and see if we can merge it: if this interval’s start is at most the previous interval’s end + 1, it is either adjacent or overlapping, and instead of adding the interval to the list, we just update the previous interval’s end (if this interval’s end is larger). If they’re not adjacent/overlapping, we just add the range to the list.

This is all well and good, but there’s another slight hiccup. Inform’s built-in sorting will sort a list of values by their natural ordering, but does not let us specify a custom comparator (our lists of intervals are lists of lists of text, and so the natural ordering would sort by the interval starts lexicographically, not numerically).

That’s fine, let’s just bang out a quick and terrible implementation of quicksort:

To decide which list of lists of text is (ranges - list of lists of text) sorted by (cmp - phrase (list of text, list of text) -> truth state):
    if the number of entries in ranges <= 1, decide on ranges;
    let pivot be entry 1 of ranges;
    let left be a list of lists of text;
    let right be a list of lists of text;
    repeat with i running from 2 to the number of entries in ranges:
        let range be entry i of ranges;
        if cmp applied to range and pivot is true:
            add range to left;
        otherwise:
            add range to right;
    now left is left sorted by cmp;
    now right is right sorted by cmp;
    add pivot to left;
    add right to left;
    decide on left;

While implementing this, I discovered two bugs: firstly because I initially tried to remove the pivot from the list, which corrupts the list, and then when I tried to use kind variables (generic types, more or less), as To decide what list of K is the (input - list of values of kind K) sorted by (cmp - phrase (K, K) -> truth state) does not compile (this is apparently a regression).

Computing the answer

The answer will then be the smallest start of any interval, which we could find in linear time, but instead of writing more code, let’s just sort the intervals and look at the first one:

After finishing a map for the eighth time:
    now the seed ranges is the sorted interval union of the seed ranges;
    let the answer be entry 1 of entry 1 of the seed ranges;
    say the answer.

Transcript

The sections above focus on the programming elements of Inform, but what’s the point of using an Interactive Fiction tool if you’re not going to tell a story? So my actual code adds descriptions and some flavour text as the story goes through the process of computing the answer.

Here’s a transcript for the example for Part 2:

Advent of Code 2023 Day 5.2
An Interactive Fiction by flowblok
Release 1 / Serial number 240316 / Inform 7 v10.1.2

Garden
A giant garden that looks more to you like a farm.

The latest Island Island almanac, containing an overwhelmingly amount of information on seeds, soils, fertilizers, water, lighting, temperature and humidity.

You can also see the gardener here.

> seeds: 79 14 55 13
You open the almanac and read out the long list of seeds, as the gardener looks on in appreciation.

>
You flip to the next page of the almanac.

> seed-to-soil map:
> 50 98 2
> 52 50 48
>
One by one, you slowly find the right type of soil for each kind of seed. The gardener looks fascinated to understand how you're figuring everything out.

> soil-to-fertilizer map:
> 0 15 37
> 37 52 2
> 39 0 15
>
As you find the right type of fertilizer for each kind of soil, the gardener looks on with fading interest.

> fertilizer-to-water map:
> 49 53 8
> 0 11 42
> 42 0 7
> 57 7 4
>
The gardener is looking very bored now and wanders off to tend to other tasks.

> water-to-light map:
> 88 18 7
> 18 25 70
>
Without an audience, you're starting to tire of this onerous task. You wonder how there could possibly be this many types of water.

> light-to-temperature map:
> 45 77 23
> 81 45 19
> 68 64 13
>
Some of these temperature numbers look a bit concerning, but there's no time to stop and think, you have to keep translating these numbers.

> temperature-to-humidity map:
> 0 69 1
> 1 0 69
>
As you flip to the next page of the almanac, you realise you're almost at the end. A wave of determination washes over you, and you set yourself to the task anew.

> humidity-to-location map:
> 60 56 37
> 56 93 4
>
The gardener wanders over to you. "You found it already?" he exclaims. "It would have taken me ages to figure out it was 46!"

Source code is in my repository here, as always.


  1. Unless you’re the kind of person who asserts the identity function requires implementation. 

  2. Well, not necessarily, it depends on what understanding sentences have been written. But colons are pesky in general. 

  3. This may not be obvious, but if we’re here, we have a non-empty interval which does not have any part before the source range start (as we removed it as unmapped). 

Comments