Advent of Code 2023: Day 3

On Day 3 we reach the gondola lift station, but the engine is broken and for some inexplicable reason the only computer we have is a mainframe running COBOL (which appears to be the COmmon Bauble Ornamentation Language on this machine).

To repair the engine, we’re going to need a program:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. aoc.

And we’re going to need it to be able to read the engine schematic:

       ENVIRONMENT DIVISION.
          INPUT-OUTPUT SECTION.
             FILE-CONTROL.
             SELECT STDIN ASSIGN TO KEYBOARD
             ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
           FILE SECTION.
           FD STDIN.
           01 STDIN-LINE PIC X(200).

There’s a lot of verbosity here, but the key points are we declare a file called STDIN which comes from the keyboard input and is a sequence of lines, with each line containing 200 characters (this is what PIC X(200) indicates: X means any character). Our actual input is only 140 characters wide, I’m just rounding up.

We’re going to want to read the input into memory so we can manipulate it, so we’ll define a bunch of variables for doing this:

       WORKING-STORAGE SECTION.
           01 WIDTH PIC 9(3).
           01 HEIGHT PIC 9(3).
           01 X PIC 9(3).
           01 Y PIC 9(3).
           01 CHAR-GRID.
               02 CHAR-GRID-ROW OCCURS 200 TIMES.
                   03 CHAR-GRID-CELL PIC X OCCURS 200 TIMES.

The first four variables (WIDTH, HEIGHT, X and Y) are all defined with PIC 9(3), meaning they can store three-digit numbers (000–999), which is all we need. The definition of CHAR-GRID will be a little less obvious to COBOL newcomers. I’d say that it’s effectively char[200][200], but that’s not exactly true, since we give names to each level of the array, so it’s more like this C++ code:

struct CharGridRowRecord {
    char CHAR_GRID_CELL[200];
};

struct CharGridRecord {
    CharGridRowRecord CHAR_GRID_ROW[200];
};

CharGridRecord CHAR_GRID;

Anyhow, let’s read into this structure:

       PROCEDURE DIVISION.
      * Read the grid.
           MOVE 1 TO Y
           OPEN INPUT STDIN
               PERFORM UNTIL EXIT
                   READ STDIN INTO CHAR-GRID-ROW(Y)
                       AT END EXIT PERFORM
                       NOT AT END ADD 1 TO Y
                   END-READ
               END-PERFORM
           CLOSE STDIN
      * Calculate width and height.
           COMPUTE HEIGHT = Y - 1
           INSPECT CHAR-GRID-ROW(1) TALLYING WIDTH FOR ALL " "
           COMPUTE WIDTH = 200 - WIDTH

Reading the grid is pretty straightforward: PERFORM UNTIL EXIT is what most other languages call while (true), and EXIT PERFORM is break. Computing the width is messier than I’d like: I’m counting the number of spaces (which occur at the end of the line, after the read characters) in the first line, and then subtracting from 200 (the size I picked above). I’d love to know if someone has a better way to do this.

Part 1

Now we’ve got that boilerplate out of the way, let’s get onto the main part of the problem. We’re given a grid of characters, and are asked for the sum of numbers which are adjacent (including diagonals) to symbols (excluding periods). Here’s what an example input looks like:

467..114..
...*......
..35..633.
......#...
617*......
.....+.58.
..592.....
......755.
...$.*....
.664.598..

The way I’m going to approach this is to build a grid of booleans (i.e. a bitmap or 2D bitmask), each indicating whether the corresponding cell in the input grid can reach a symbol. The first step is to construct the bitmap with 1s where there are symbols in the input, and 0s everywhere else:

0000000000
0001000000
0000000000
0000001000
0001000000
0000010000
0000000000
0000000000
0001010000
0000000000

We’ll need some more data structures to represent this in our working storage section:

       WORKING-STORAGE SECTION.
      * … (as above)
           01 FILTER.
               02 FILTER-ROW OCCURS 200 TIMES.
                   03 FILTER-CELL PIC 1 USAGE BIT OCCURS 200 TIMES.

FILTER is a 2D array like CHAR-GRID, but this time of booleans (PIC 1 USAGE BIT1). We can construct the bitmap for the first step with:

       PROCEDURE DIVISION.
      * … (as above)
      * Construct the filter grid.
           PERFORM VARYING Y FROM 1 BY 1 UNTIL Y > HEIGHT
               PERFORM VARYING X FROM 1 BY 1 UNTIL X > WIDTH
                   IF CHAR-GRID-CELL(Y,X) IS NOT NUMERIC
                       AND CHAR-GRID-CELL(Y,X) <> "."
                   THEN
                       MOVE 1 TO FILTER-CELL(Y,X)
                   ELSE
                       MOVE 0 TO FILTER-CELL(Y,X)
                   END-IF
               END-PERFORM
           END-PERFORM

We then expand this bitmap: for each cell, we can set it to a 1 if the corresponding character in the input is a digit and one of the eight neighbours in the bitmap is a 1. The first application of this rule produces:

0010000000
0001000000
0011001000
0000001000
0011000000
0000010000
0000100000
0000001000
0001010000
0011011000

Rinse and repeat: once we stop making changes, we end up with:

1110000000
0001000000
0011001110
0000001000
1111000000
0000010000
0011100000
0000001110
0001010000
0111011100

To implement this, we need a few more bits of storage: a boolean indicating whether applying the rule resulted in any changes (FILTER-CHANGED), some integers tracking the delta to each neighbour (DX, DY) and each neighbour’s position (X2, Y2):

       WORKING-STORAGE SECTION.
      * … (as above)
           01 FILTER-CHANGED PIC 1 USAGE BIT.
           01 DX PIC S9.
           01 DY PIC S9.
           01 X2 PIC 9(3).
           01 Y2 PIC 9(3).

The additions to the procedure division are a mess of nested loops:

       PROCEDURE DIVISION.
      * … (as above)
      * Expand the bitmap.
           MOVE 1 TO FILTER-CHANGED
           PERFORM UNTIL FILTER-CHANGED = 0
               MOVE 0 TO FILTER-CHANGED
               PERFORM VARYING Y FROM 1 BY 1 UNTIL Y > HEIGHT
               PERFORM VARYING X FROM 1 BY 1 UNTIL X > WIDTH
                   IF FILTER-CELL(Y,X) = 0 AND
                       CHAR-GRID-CELL(Y,X) IS NUMERIC
                   THEN
                       PERFORM VARYING DY FROM -1 BY 1 UNTIL DY = 2
                       PERFORM VARYING DX FROM -1 BY 1 UNTIL DX = 2
                           COMPUTE X2 = X + DX
                           COMPUTE Y2 = Y + DY
                           IF X2 > 0 AND X2 <= WIDTH AND
                               Y2 > 0 AND Y2 <= HEIGHT AND
                               FILTER-CELL(Y2,X2) <> 0
                           THEN
                               MOVE 1 TO FILTER-CELL(Y,X)
                               MOVE 1 TO FILTER-CHANGED
                           END-IF
                       END-PERFORM
                       END-PERFORM
                   END-IF
               END-PERFORM
               END-PERFORM
           END-PERFORM

The structure here is:

  • Outermost loop: keep running until we do an iteration without changing any cells (indicated by setting FILTER-CHANGED to 1)
  • Second/third loops: look at each cell (X,Y) in the bitmap, check if the corresponding input character is a digit (IS NUMERIC), and the bitmap cell is not already a 1.
  • Fourth/fifth loops: loop over each of the relative offsets to the eight neighbouring cells: (-1, -1), (0, -1), (1, -1), (-1, 0), ..., (1, 1). This does also include (0, 0), but in that case we already know that FILTER-CELL(Y2,X2) will be 0, so it doesn’t matter.
  • Innermost body: compute (X2, Y2) as the absolute position of the neighbour, check if it’s in bounds and the neighbour’s cell is a 1. If so, we can set this cell to a 1 and set FILTER-CHANGED to 1 to indicate we’ve made a change.

We can then apply this bitmask back to the original input to hide the numbers which aren’t touching a symbol, leaving us with the simpler problem of adding the remaining numbers:

           PERFORM VARYING Y FROM 1 BY 1 UNTIL Y > HEIGHT
               PERFORM VARYING X FROM 1 BY 1 UNTIL X > WIDTH
                   IF FILTER-CELL(Y,X) = 0 THEN
                       MOVE "." TO CHAR-GRID-CELL(Y,X)
                   END-IF
               END-PERFORM
           END-PERFORM

After this, our example input is now missing the numbers which weren’t touching a non-period symbol:

467.......
...*......
..35..633.
......#...
617*......
.....+....
..592.....
......755.
...$.*....
.664.598..

The only thing left is to add up the remaining numbers and print the total. We need two last variables in our working storage section:

       WORKING-STORAGE SECTION.
      * … (as above)
           01 PARSED-NUMBER PIC 9(3).
           01 TOTAL PIC 9(9).

To add up the numbers, we first need to assemble each number out of its digits. We loop over each cell in the grid: when we see a digit, we multiply the existing digits in PARSED-NUMBER by 10 to shift them up a place, then add on the digit. Unfortunately while we know that CHAR-GRID-CELL(Y,X) is a digit, it’s still a PIC X, so we need FUNCTION NUMVAL(…) to turn it into a PIC 9 so we can add it.

Whenever we reach a non-numeric character or the end of the row, we add PARSED-NUMBER to the total and reset it to zero. It doesn’t matter if we hadn’t actually seen a number: in that case, PARSED-NUMBER will be zero and adding it to TOTAL will have no effect.

      * Find numbers in the grid, add them up.
           MOVE 0 TO PARSED-NUMBER
           MOVE 0 TO TOTAL
           PERFORM VARYING Y
               FROM 1 BY 1 UNTIL Y > HEIGHT
               PERFORM VARYING X
                   FROM 1 BY 1 UNTIL X > WIDTH
                   IF CHAR-GRID-CELL(Y,X) IS NUMERIC THEN
                       COMPUTE NUMBER = NUMBER * 10 +
                           FUNCTION NUMVAL(CHAR-GRID-CELL(Y,X))
                   ELSE
                       ADD PARSED-NUMBER TO TOTAL
                       MOVE 0 TO PARSED-NUMBER
                   END-IF
               END-PERFORM
               ADD PARSED-NUMBER TO TOTAL
               MOVE 0 TO PARSED-NUMBER
           END-PERFORM
      * Print the total.
           DISPLAY TOTAL
           STOP RUN.

Part 2

Having found the missing part, we need to take things up a gear. We now only care about the numbers which are touching a * symbol: when there are two of these, we need to multiply them together and add them to the total.

Let’s give each gear (a * symbol) a number, starting from 1. Then we can turn our bitmap into something a bit more detailed: rather than just storing 0 or 1, it’ll store 0 or the gear number that digit is connected to.

       WORKING-STORAGE SECTION.
      * … (as above, but without FILTER / FILTER-CHANGED)
           01 GEAR-GRID.
               02 GEAR-GRID-ROW OCCURS 200 TIMES.
                   03 GEAR-GRID-CELL PIC 9(3) OCCURS 200 TIMES.
           01 GEARS-CHANGED PIC 1 USAGE BIT.
           01 GEARS-COUNT PIC 9(3).
           01 GEAR-INFO OCCURS 1000 TIMES.
               02 GEAR-RATIO PIC 9(6).
               02 GEAR-COUNT PIC 9.
           01 CURRENT-GEAR PIC 9(3).

We also have the number of gears (GEARS-COUNT) and for each gear, a record (GEAR-INFO) which will have the result of multiplying the touching numbers (GEAR-RATIO) and the count of how many touching numbers there are (GEAR-COUNT2). There’s also CURRENT-GEAR, which is an index into GEAR-INFO we’ll need later.

To construct the gear grid, we loop over each cell: whenever we see a * we increment the gear count to get the current gear number, put it in the gear grid, and initialize the corresponding entries in GEAR-INFO. We initialize GEAR-RATIO to 1, which will come in handy later, when we’re multiplying numbers into it.

      * (replacing the "Construct the filter grid" section)
      * Construct the gears grid.
           INITIALIZE GEAR-GRID
           MOVE 0 TO GEARS-COUNT.
           PERFORM VARYING Y
               FROM 1 BY 1 UNTIL Y > HEIGHT
               PERFORM VARYING X
                   FROM 1 BY 1 UNTIL X > WIDTH
                   IF CHAR-GRID-CELL(Y,X) = "*" THEN
                       ADD 1 TO GEARS-COUNT
                       MOVE GEARS-COUNT TO GEAR-GRID-CELL(Y,X)
                       MOVE 1 TO GEAR-RATIO(GEARS-COUNT)
                       MOVE 0 to GEAR-COUNT(GEARS-COUNT)
                   ELSE
                       MOVE 0 TO GEAR-GRID-CELL(Y,X)
                   END-IF
               END-PERFORM
           END-PERFORM

For our example input, this gives us (pretending that GEAR-GRID-CELL is a single-digit number):

0000000000
0001000000
0000000000
0000000000
0002000000
0000000000
0000000000
0000000000
0000030000
0000000000

Iteratively expanding the gears grid is very similar to the filter:

      * (replacing the "Expand the bitmap" section)
      * Expand the gears grid.
           MOVE 1 TO GEARS-CHANGED
           PERFORM UNTIL GEARS-CHANGED = 0
               MOVE 0 TO GEARS-CHANGED
               PERFORM VARYING Y FROM 1 BY 1 UNTIL Y > HEIGHT
               PERFORM VARYING X FROM 1 BY 1 UNTIL X > WIDTH
                   IF GEAR-GRID-CELL(Y,X) = 0 AND
                       CHAR-GRID-CELL(Y,X) IS NUMERIC THEN
                       PERFORM VARYING DY FROM -1 BY 1 UNTIL DY = 2
                       PERFORM VARYING DX FROM -1 BY 1 UNTIL DX = 2
                           COMPUTE X2 = X + DX
                           COMPUTE Y2 = Y + DY
                           IF X2 > 0 AND X2 <= WIDTH AND
                               Y2 > 0 AND Y2 <= HEIGHT AND
                               GEAR-GRID-CELL(Y2,X2) <> 0
                           THEN
                               MOVE GEAR-GRID-CELL(Y2,X2)
                                   TO GEAR-GRID-CELL(Y,X)
                               MOVE 1 TO GEARS-CHANGED
                           END-IF
                       END-PERFORM
                       END-PERFORM
                   END-IF
               END-PERFORM
               END-PERFORM
           END-PERFORM

Aside from a bunch of renaming, pretty much the only change is that we now need to copy the gear number from the neighbour (GEAR-GRID-CELL(Y2,X2)) when setting each updated cell’s gear number (GEAR-GRID-CELL(Y,X)).

Once expanded, we get (again pretending that GEAR-GRID-CELL is a single-digit number):

1110000000
0001000000
0011000000
0000000000
2222000000
0000000000
0000000000
0000003330
0000030000
0000033300

Like in Part 1, we filter the CHAR-GRID using GEAR-GRID to remove anything which isn’t touching a gear, but now our final step is a bit more complicated. When we find a number, we need to figure out which gear it corresponds to, multiply the number into GEAR-RATIO and increment GEAR-COUNT (so we’ll be able to tell if there’s exactly two touching numbers we multiplied):

      * (replacing the "Find numbers in the grid, add them up" section)
      * Find numbers in the grid, multiply them into their gears.
           MOVE 0 TO CURRENT-GEAR.
           MOVE 0 TO PARSED-NUMBER
           PERFORM VARYING Y FROM 1 BY 1 UNTIL Y > HEIGHT
               PERFORM VARYING X FROM 1 BY 1 UNTIL X > WIDTH
                   IF CHAR-GRID-CELL(Y,X) IS NUMERIC THEN
                       COMPUTE PARSED-NUMBER = PARSED-NUMBER * 10 +
                           FUNCTION NUMVAL(CHAR-GRID-CELL(Y,X))
                       MOVE GEAR-GRID-CELL(Y,X) TO CURRENT-GEAR
                   ELSE
                       IF CURRENT-GEAR <> 0 THEN
                           MULTIPLY PARSED-NUMBER
                               BY GEAR-RATIO(CURRENT-GEAR)
                           ADD 1 TO GEAR-COUNT(CURRENT-GEAR)
                           MOVE 0 TO CURRENT-GEAR
                           MOVE 0 TO PARSED-NUMBER
                       END-IF
                   END-IF
               END-PERFORM
               IF CURRENT-GEAR <> 0 THEN
                   MULTIPLY PARSED-NUMBER
                       BY GEAR-RATIO(CURRENT-GEAR)
                   ADD 1 TO GEAR-COUNT(CURRENT-GEAR)
                   MOVE 0 TO CURRENT-GEAR
                   MOVE 0 TO PARSED-NUMBER
               END-IF
           END-PERFORM

In this part, we need to track which gear the number belongs to, so when we find a digit, we store GEAR-GRID-CELL(Y,X) (which has the gear number) in CURRENT-GEAR. Unlike addition, multiplying by zero does have an effect, so we also need to check when we actually have a number to do something with, which is provided by CURRENT-GEAR being non-zero.

All that’s left is to add up the gear ratios for any gear where we saw exactly two touching numbers and print it out:

      * Add up the gear ratios.
           MOVE 0 TO TOTAL
           PERFORM VARYING CURRENT-GEAR
               FROM 1 BY 1 UNTIL CURRENT-GEAR > GEARS-COUNT
               IF GEAR-COUNT(CURRENT-GEAR) = 2 THEN
                   ADD GEAR-RATIO(CURRENT-GEAR) TO TOTAL
               END-IF
           END-PERFORM
           DISPLAY TOTAL
           STOP RUN.

As always, the code is in my repository.


  1. Unfortunately, my COBOL implementation doesn’t implement this efficiently and currently my blog’s syntax highlighter treats this as an error. But it’s totally a thing! 

  2. I regret this is very similar to GEARS-COUNT, but I couldn’t find an alternative I liked. 

Comments