Advent of Code 2023: Day 2

I don’t know about you, but when I see a problem that needs some basic numeric operations like min, max, addition and multiplication, I think to myself: “this is a great opportunity to pull out sed”. Oh, just me then?

sed is a line-oriented text processor: you give it a sequence of commands, and it applies those commands to each line of the input. The substitution command is the most well-known, and many folks will only have ever used sed 's/regex/replacement/g' on the command line, opting for other tools for more complex problems. But sed has plenty more functionality to offer!

The problem input

We have a game where we’re pulling marbles from a bag, and we’re given a record of what happened for these games. The input format is:

Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green
Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue
Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red
Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red
Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green

It turns out we don’t need to care about the commas and semicolons: neither of the parts have any kind of dependency on the sequence or conjunction of cubes.

Part 1

To work with numbers in sed, we’re going to convert our numbers between numeral systems. The three forms we will have are:

  • Decimal: 43 (the coefficients of 4×10¹ + 3×10⁰)
  • Tally: BBBBAAA (B = 10¹, A = 10⁰, add together the values)
  • Unary: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (count the symbols)

In our input, the largest number is three digits (Game 100), so we can convert from decimal to tally format like so:

# Convert all numbers to tally format.
s/9([0-9]{2})/C8\1/g
s/8([0-9]{2})/C7\1/g
s/7([0-9]{2})/C6\1/g
s/6([0-9]{2})/C5\1/g
s/5([0-9]{2})/C4\1/g
s/4([0-9]{2})/C3\1/g
s/3([0-9]{2})/C2\1/g
s/2([0-9]{2})/C1\1/g
s/1([0-9]{2})/C\1/g
s/9([0-9])/B8\1/g
s/8([0-9])/B7\1/g
s/7([0-9])/B6\1/g
s/6([0-9])/B5\1/g
s/5([0-9])/B4\1/g
s/4([0-9])/B3\1/g
s/3([0-9])/B2\1/g
s/2([0-9])/B1\1/g
s/1([0-9])/B\1/g
s/9/A8/g
s/8/A7/g
s/7/A6/g
s/6/A5/g
s/5/A4/g
s/4/A3/g
s/3/A2/g
s/2/A1/g
s/1/A/g
s/0//g

The first line turns any 9xx decimal number into C8xx, which then gets turned by the second line into CC7xx, and so on, until the coefficient of 10² is entirely in tally format. The next ten lines do the same for the coefficient of 10¹, and the last ten lines for 10⁰.

Our example now looks like:

Game A: AAA blue, AAAA red; A red, AA green, AAAAAA blue; AA green
Game AA: A blue, AA green; AAA green, AAAA blue, A red; A green, A blue
Game AAA: AAAAAAAA green, AAAAAA blue, BB red; AAAAA blue, AAAA red, BAAA green; AAAAA green, A red
Game AAAA: A green, AAA red, AAAAAA blue; AAA green, AAAAAA red; AAA green, BAAAAA blue, BAAAA red
Game AAAAA: AAAAAA red, A blue, AAA green; AA blue, A red, AA green

We can then turn this into unary format with:

# Convert tally format to unary.
s/C/BBBBBBBBBB/g
s/B/AAAAAAAAAA/g

Which turns our example into:

Game A: AAA blue, AAAA red; A red, AA green, AAAAAA blue; AA green
Game AA: A blue, AA green; AAA green, AAAA blue, A red; A green, A blue
Game AAA: AAAAAAAA green, AAAAAA blue, AAAAAAAAAAAAAAAAAAAA red; AAAAA blue, AAAA red, AAAAAAAAAAAAA green; AAAAA green, A red
Game AAAA: A green, AAA red, AAAAAA blue; AAA green, AAAAAA red; AAA green, AAAAAAAAAAAAAAA blue, AAAAAAAAAAAAAA red
Game AAAAA: AAAAAA red, A blue, AAA green; AA blue, A red, AA green

Now we need to discard any game which has more than 12 red cubes, 13 green cubes or 14 blue cubes. Consider the red cubes: if we see "AAAAAAAAAAAAA red" (that’s 13 As) somewhere in the line, then it means that one of the draws had at least 13 red cubes. So, we can discard the games with:

# 13+ red cubes -> impossible
s/^.*AAAAAAAAAAAAA red.*$//
# 14+ green cubes -> impossible
s/^.*AAAAAAAAAAAAAA green.*$//
# 15+ blue cubes -> impossible
s/^.*AAAAAAAAAAAAAAA blue.*$//

Typically, you’d discard lines using the d command (e.g. /AAAAAAAAAAAAA red/d), but I couldn’t figure out how to make that work with the next part, so we’re just emptying the line. Our example is now:

Game A: AAA blue, AAAA red; A red, AA green, AAAAAA blue; AA green
Game AA: A blue, AA green; AAA green, AAAA blue, A red; A green, A blue


Game AAAAA: AAAAAA red, A blue, AAA green; AA blue, A red, AA green

We next need only the game ids, so we filter out everything else:

# Get just the ids.
s/^Game (A+):.*$/\1/

Which gives us:

A
AA


AAAAA

And then we need to merge all the lines together to add our numbers, which obviously is just:

# Merge the lines together.
H
$!d
x
s/\n//g

Obviously? Not in the slightest, so let’s break this down. Remember how sed processes text line-by-line? In order to do multi-line operations, we need to make use of the hold buffer. The H command appends the current line to the hold buffer, which is great, we can now accumulate all our lines into one place.

But we only want to do anything with what we put into the hold buffer once we reach the ends, so what $!d says is “if it’s not (!) the last line ($), then delete the line (and go back to the start of the program to process the next line)”.

On the last line1 however, we get to continue with x, which (to skip some detail) says that our hold buffer (of all the lines we accumulated) is now the line we’re processing. This “line” actually contains multiple lines with newline characters between, which the s/\n//g removes to give us a single line.

Anyhow, this mess does what we need, and we get:

AAAAAAAA

We then just need to turn this back into a decimal number, which we do by first turning it into the tally format:

# Convert the tally back to a number.
# We have 100 input lines, so at worst we can have 1 + 2 + ... + 100 = 5050
s/AAAAAAAAAA/B/g
s/BBBBBBBBBB/C/g
s/CCCCCCCCCC/D/g

And then decimal:

# Insert zeros.
s/D([^CD]|$)/D0\1/
s/C([^BC]|$)/C0\1/
s/B([^AB]|$)/B0\1/

# Convert back to digits.
s/([ABCD])(\1{8})/9/g
s/([ABCD])(\1{7})/8/g
s/([ABCD])(\1{6})/7/g
s/([ABCD])(\1{5})/6/g
s/([ABCD])(\1{4})/5/g
s/([ABCD])(\1{3})/4/g
s/([ABCD])(\1{2})/3/g
s/([ABCD])(\1{1})/2/g
s/[ABCD]/1/g

Part 2

For the second part, we don’t care about the game ids, and the commas and semicolons are going to get in the way, so we start by removing them:

# Remove unnecessary parts.
s/^.*://
s/[,;]//g

Which turns our example into:

 3 blue 4 red 1 red 2 green 6 blue 2 green
 1 blue 2 green 3 green 4 blue 1 red 1 green 1 blue
 8 green 6 blue 20 red 5 blue 4 red 13 green 5 green 1 red
 1 green 3 red 6 blue 3 green 6 red 3 green 15 blue 14 red
 6 red 1 blue 3 green 2 blue 1 red 2 green

We then convert our numbers to unary format as before (though this time we only need to do two digits), and then turn the colours into a single-letter suffix on our numbers:

# (decimal->tally->unary from above)

# Turn the colour into a suffix.
s/ red/r/g
s/ green/g/g
s/ blue/b/g

Our example is now:

 AAAb AAAAr Ar AAg AAAAAAb AAg
 Ab AAg AAAg AAAAb Ar Ag Ab
 AAAAAAAAg AAAAAAb AAAAAAAAAAAAAAAAAAAAr AAAAAb AAAAr AAAAAAAAAAAAAg AAAAAg Ar
 Ag AAAr AAAAAAb AAAg AAAAAAr AAAg AAAAAAAAAAAAAAAb AAAAAAAAAAAAAAr
 AAAAAAr Ab AAAg AAb Ar AAg

In order to find the minimum number of cubes for any game, we need to find the maximum of each colour. We can do this by repeatedly removing any number where there is a larger number in the same line (e.g. in the first line, we should remove AAAb because there is also AAAAAAb which is bigger). The implementation:

# Find the minimum number of cubes of each colour required.
:min1
    s/ (A+[rgb])(.*)\1/\2\1/g
    tmin1

:min2
    s/(A+[rgb])(.*) \1/\1\2/g
    tmin2

There are two new features shown here: :min1 defines a label named min1, and tmin1 says to jump to min1 if an s command has succeeded since the last input line was read or a conditional branch (which includes t) was taken. Combined, this just says “keep running the s command in between until it doesn’t do anything”.

So what does the s command do? It matches a space, then a number with a colour-suffix (which is the first capture group \1), then anything (the second capture group \2), then a repetition of the first capture group. Consider the first line of our example, where we can find this match: (AAAb)( AAAAr Ar AAg AAA)AAAb. This finds an example where the first capture group is a number which is less than or equal to a number (of the same colour) appearing later in the line. This gets substituted with \2\1, which would be ( AAAAr Ar AAg AAA)(AAAb) (without the parentheses), which removes the left number.

Unsurprisingly, min2 does the same thing, but for the cases where the larger number appears earlier in the line. Once these have both run to completion, we’ll be left with one number per colour, though in no particular order. Here’s what the result looks like:

 AAAAr AAAAAAb AAg
 AAAg AAAAb Ar
 AAAAAAb AAAAAAAAAAAAAAAAAAAAr AAAAAAAAAAAAAg
 AAAg AAAAAAAAAAAAAAAb AAAAAAAAAAAAAAr
 AAAAAAr AAAg AAb

We don’t need the order though, we just need to multiply them together, which we’ll do with:

# Discard colour suffixes.
s/[rgb]//g

# Multiply the first two numbers together.
s/^/=/
:mul1
    s/= (A+) AA/\1= \1 A/
    tmul1
s/(A*)= (A+) A / \1\2 /

# Multiply the third number in.
s/^/=/
:mul2
    s/= (A+) AA/\1= \1 A/
    tmul2
s/(A*)= (A+) A$/\1\2/

Consider the first line of our example, " AAAAr AAAAAAb AAg". We start by discarding the colour suffixes and adding a = at the start of the line, giving "= AAAA AAAAAA AA". The mul1 loop then performs naïve multiplication of AAAA and AAAAAA, by adding AAAA (4) to a total AAAAAA (6) times. Here’s each substitution (parentheses indicate capture groups, square brackets indicate the entire match):

  • [= (AAAA) AA]AAAA AA[AAAA= AAAA A]AAAA AA
  • AAAA[= AAAA AA]AAA AAAAAA[AAAA= AAAA A]AAA AA
  • AAAAAAAA[= AAAA AA]AA AAAAAAAAAA[AAAA= AAAA A]AA AA
  • AAAAAAAAAAAA[= AAAA AA]A AAAAAAAAAAAAAA[AAAA= AAAA A]A AA
  • AAAAAAAAAAAAAAAA[= AAAA AA] AAAAAAAAAAAAAAAAAA[AAAA= AAAA A] AA

And then post-loop, we make the substitution:

  • [(AAAAAAAAAAAAAAAAAAAA)= AAAA A ]AA[AAAAAAAAAAAAAAAAAAAAAAAA] AA

Which leaves us with 24×2 as the multiplication for mul2, which operates almost exactly the same.

Once we’ve done that, it’s just a matter of merging the lines together to add all the numbers, then converting back to the tally format and then back to decimal. Simple.

As always, the code is in my repository.


  1. This is why I couldn’t use the delete command earlier: if the last line of the input was an impossible game, we’d delete it and the program would terminate without getting to the processing of the hold buffer. Perhaps there’s a nicer solution? I’d love to know. 

Comments