It’s the end of the Advent of Code for 2020. The last week was tough – the part 1s were fairly easy, but the part 2s were often a lot harder. In the end, I managed to solve all but one of this week’s puzzles in BaseA Alteryx. For the one I couldn’t, I used BaseA Alteryx as much as possible and the Python tool for a single step.

As Christmas approaches, we all get busier, so there are fewer of us still going strong. I will still pick the odd alternative solution where I found one that is sufficiently different from my own and interesting. **A word of warning**: some of my approaches to solving within BaseA rules are complicated and not for the faint of heart!

My previous summaries can be found:

Anyway onto the final 6 challenges.

*Tools used: a lot – about 105, run-time: 0.8s*

This puzzle was all about putting a jigsaw together. The pieces were represented by a 10 x 10 text block of `#` and `.` characters:

```
..##.#..#.
##..#.....
#...##..#.
####.#...#
##.##.###.
##...#.###
.#.#.#..##
..#....#..
###...#.#.
..###..###
```

The first challenge was to find the four corners. Two pieces could fit together if a side lined up with another side of a different tile with the `#` and `.` all aligned. The second tile could be rotated or reflected. I chose to read around the 4 sides clockwise, producing 4 fields (`RT`, `RR`, `RB` and `RL`). I then reversed these four strings to produce the four edges reading anti-clockwise (`FT`, `FR`, `FB` and `FL`) – note `FL` is the reverse of `RR` and `FR` is the reverse of `RL` as the tile is flipped. Next, I looked for joins with the left side being one of the `R` fields and not joining to the same tile. Finding the corners was then just a case of finding tiles with 2 possible joins.
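This edge-matching idea can be sketched in Python. This is a simplified sketch, not the workflow itself: it assumes `tiles` maps a tile id to its rows of text, and treats an edge and its reverse as equivalent rather than tracking the clockwise/anti-clockwise fields separately.

```python
def edges(rows):
    """Read the four sides of a square tile: top, right, bottom, left."""
    top = rows[0]
    bottom = rows[-1]
    left = "".join(r[0] for r in rows)
    right = "".join(r[-1] for r in rows)
    return [top, right, bottom, left]

def find_corners(tiles):
    """Return ids of tiles whose edges match other tiles on exactly two sides."""
    all_edges = {tid: edges(rows) for tid, rows in tiles.items()}
    corners = []
    for tid, es in all_edges.items():
        joins = 0
        for e in es:
            for other, oes in all_edges.items():
                if other == tid:
                    continue
                # A side matches if it equals another tile's side in either direction
                if e in oes or e[::-1] in oes:
                    joins += 1
        if joins == 2:  # corner tiles can only join on two sides
            corners.append(tid)
    return corners
```

In a full 12 x 12 picture only four tiles have exactly two joins, which is what makes this test sufficient.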

For part 2, the first problem was putting the picture together. I chose first to pick one of the corners. Using a summarise tool, I took the lowest value tile corner. I then wanted to orientate it, so it is the bottom left tile. By looking at the two sides I had joins on, I can work out which side is the right and top in terms of the original orientation:

Side1 | Side2 | Top | Right |
---|---|---|---|
RB | RR | RR (Right) | RB (Bottom) |
RR | RT | RT (Top) | RR (Right) |
RL | RT | RL (Left) | RT (Top) |
RB | RL | RB (Bottom) | RL (Left) |

I then produced a map of the joins as a single string. I did some pre-computation on this. If two tiles were both aligned the original way up, then the `RR` on the first tile would join to an `FL` on the second tile; hence I need to flip `R` and `F`. The map was such that the second tile’s entry would be the opposite side to the joining side. In this example, if tile 1234 was up the original way next to 4321, the entry would be `1234RR ==> 4321RR`. I encoded all of this into a long string that looked like:

```
1907FB:2111FR 1907RB:2111RR 2017FL:3343FR 2017RL:3343RR 2477FR:3613FB 2477RR:3613RB 3671FR:2411RL ...
```

After this, I walked right from the starting corner using a generate rows tool until I reached the end of a row – denoted by not finding another step in the map. This produced 12 rows of 1 tile each. I knew the orientation for each of these tiles as I knew what the right side was, so I could deduce the top side and whether it was reflected or not. Having found the top of all 12, a second generate rows tool allowed me to complete the tile layout, including orientation.

I won’t go over all the remaining steps in full detail, as this post would be enormous, but will give the rough steps instead. First, for every tile, I produced 8 copies, one in each of the possible rotations and reflections. These could then be joined to the layout to produce the full picture I needed, with some concatenations producing the full lines.

Within the full picture, you needed to hunt for sea monsters. I chose to search using a regular expression for the middle line of the sea monster (replacing spaces with `.` was all that was needed). Again, the picture could be in one of 8 orientations; however, you only needed to check the four rotations to determine which orientation was correct (as the monster would just be upside down). A multi-row formula allowed me to check if the previous and following rows matched for the found monster.
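The middle-line search can be sketched in Python. This is a hedged sketch, not the workflow: it uses the standard AoC 2020 monster pattern and checks the rows above and below at each offset where the middle line matches.

```python
import re

# The three rows of the sea monster; spaces become "." wildcards
MONSTER = [
    "                  # ",
    "#    ##    ##    ###",
    " #  #  #  #  #  #   ",
]
ROW_RE = [re.compile(r.replace(" ", ".")) for r in MONSTER]

def count_monsters(picture):
    """Count monsters in one orientation of the picture (a list of strings)."""
    found = 0
    for y in range(1, len(picture) - 1):
        # Every candidate offset of the middle line on this row
        # (finditer is non-overlapping, which is fine for typical inputs)
        for m in ROW_RE[1].finditer(picture[y]):
            x = m.start()
            if (ROW_RE[0].match(picture[y - 1], x)
                    and ROW_RE[2].match(picture[y + 1], x)):
                found += 1
    return found
```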

The second stage of this problem was fiddly, to say the least (and took a fair amount of debugging), but fun to think through, and though the workflow is massive, it is a fairly clean solution when complete.

*Tools used: 36, run-time: 3.3s*

In some ways, this was similar to some of the other days – well suited to Alteryx, with just lots of parsing and joining. The first step was breaking the input into a list of foods with their possible allergens. For an ingredient to possibly contain an allergen, it must be present in every list for that allergen. Using an append fields tool to create all possible joins of the lists of possible ingredients for each allergen, it was then a case of removing any which were not always possible (using the `L` output of a join tool in my case).

For part 2, it was again the case of walking a hierarchy. This should probably have been an iterative macro, but as I was tired of debugging these by this point, I went for copying and pasting each block. Each block allocated the allergens which had one possible ingredient, then removed those ingredients from the remaining lists and repeated. In my dataset’s case, it concluded in 4 steps (hence avoiding an iterative macro).

I think my approach over-complicated the first part somewhat. Danilang produced a much more sensible approach to solving part one. This involved just looking at the counts of allergens versus the ingredient/allergen count. It made the joins a lot cleaner and easier to see what is going on (instead of building a dynamic regular expression as I did!).

Danilang also chose to use an iterative macro to fill all the possible rows. The iteration is as described above but involved less copy and pasting!
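Both halves of the approach – intersecting the ingredient lists per allergen, then allocating one-candidate allergens in turn – can be sketched in Python. This is a minimal sketch assuming `foods` is a list of (ingredients, allergens) pairs, not a translation of either workflow.

```python
def assign_allergens(foods):
    # An allergen's candidates are the intersection of every food list naming it
    candidates = {}
    for ingredients, allergens in foods:
        for a in allergens:
            if a in candidates:
                candidates[a] &= set(ingredients)
            else:
                candidates[a] = set(ingredients)
    # Repeatedly fix any allergen that is down to one possible ingredient
    assigned = {}
    while candidates:
        a = next(a for a, c in candidates.items() if len(c) == 1)
        ing = candidates.pop(a).pop()
        assigned[a] = ing
        for c in candidates.values():
            c.discard(ing)
    return assigned
```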

*Tools used: 21, run-time: 1hr 2mins*

This puzzle was basically a simplified version of Top Trumps. My two children would have been pleased. Part 1 was just an iteration and could be solved with an iterative macro – however, the solution I present was built to allow me to solve part 2 (albeit very slowly!!!).

My input had a range of 1 to 50. I chose to encode these as ASCII characters so they would be 1 character each regardless of value. To make debugging easier, I chose it so that 1 mapped to character 1, 2 to 2, 3 to 3 up to 9, and then following with the next ASCII characters (e.g. 10 to :). A formula of `CharFromInt(48+ToNumber([Field1]))` does this. I then joined the characters for each player into a string separated by a space. The example ends up as `92631 5847:`. Using a generate rows tool, I can iterate comparing the first letter of each word and moving cards around until only one word remains.

```
iif(CharToInt(GetWord(C,0)) > CharToInt(GetWord(C,1)),
    // Player 1 wins
    Substring(GetWord(C,0),1) + Left(GetWord(C,0),1) + Left(GetWord(C,1),1)
        + " " + Substring(GetWord(C,1),1),
    // Player 2 wins
    Substring(GetWord(C,0),1)
        + " " + Substring(GetWord(C,1),1) + Left(GetWord(C,1),1) + Left(GetWord(C,0),1))
```

Because of my termination condition, I need to apply this once more to finish it (using a formula tool). Now onto part 2…
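For reference, the same part 1 game is trivial in plain Python. This deque-based sketch is an equivalent of the string encoding above, not a translation of it.

```python
from collections import deque

def play_combat(p1, p2):
    """Play Combat: higher card wins the round and takes both, winner's card first."""
    d1, d2 = deque(p1), deque(p2)
    while d1 and d2:
        a, b = d1.popleft(), d2.popleft()
        if a > b:
            d1.extend([a, b])
        else:
            d2.extend([b, a])
    winner = d1 or d2
    # Score: bottom card * 1, next card up * 2, and so on
    return sum(i * c for i, c in enumerate(reversed(winner), 1))
```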

In this case, we need to keep a record of where we have been and deal with recursion! Alteryx is not designed to implement a recursive algorithm. This would clearly be pretty straightforward in a general-purpose programming language. Let’s deal with the record of past moves first and then think about recursion.

```
// Play a turn
iif(CharToInt(GetWord(C,0)) > CharToInt(GetWord(C,1)),
    Substring(GetWord(C,0),1) + Left(GetWord(C,0),1) + Left(GetWord(C,1),1)
        + " " + Substring(GetWord(C,1),1),
    Substring(GetWord(C,0),1)
        + " " + Substring(GetWord(C,1),1) + Left(GetWord(C,1),1) + Left(GetWord(C,0),1))
// Add New Played
+ " " + GetWord(C,0) + "#" + GetWord(C,1)
// Keep Old
+ Regex_Replace(C, "^[^ ]+ [^ ]+ ?", " ")
```

This handles playing the turn and adding a new word recording the current state, with the two hands joined by `#` (`#` is safe to use as its ASCII code is less than 47). You can then look for this string in the current game’s part of the state to see if you have already played this position:

```
ELSEIF Contains(REGEX_Replace(C, " ! .*$", ""), GetWord(C,0) + "#" + GetWord(C,1)) THEN
// Player 1 win by termination
```

This gives a sequence like:

```
92631 5847:
263195 847: 92631#5847:
63195 47:82 263195#847: 92631#5847:
319564 7:82 63195#47:82 263195#847: 92631#5847:
19564 :8273 319564#7:82 63195#47:82 263195#847: 92631#5847:
```

The next problem is how to deal with a recursive step being needed. The start of a sub-game can be easily detected by looking at the lengths of the first two words versus their first character’s ASCII code:

```
ELSEIF CharToInt(GetWord(C,0)) - 48 < Length(GetWord(C,0))
   and CharToInt(GetWord(C,1)) - 48 < Length(GetWord(C,1)) THEN
// Do I need to recurse
Substring(GetWord(C,0), 1, CharToInt(GetWord(C,0)) - 48)
    + " " + Substring(GetWord(C,1), 1, CharToInt(GetWord(C,1)) - 48)
    + " ! " + C
```

When the condition is true, the process needs to recurse; 2 new words representing the starting positions of the sub-game are added at the start of the string, followed by a `!` to represent the end of the sub-game. This sub-game can then be played until a winner is found (or a new sub-game is needed).

The complicated formula’s final piece is working out how to resolve a game or sub-game when a player wins. Either the whole game ends with a single word (as per part 1), or the parent game’s state must be adjusted. The expression below represents player 1 winning:

```
iif(Contains(C, " ! "),
    Substring(GetWord(REGEX_Replace(C, "^[^!]+ ! ", ""), 0), 1)
      + Left(GetWord(REGEX_Replace(C, "^[^!]+ ! ", ""), 0), 1)
      + Left(GetWord(REGEX_Replace(C, "^[^!]+ ! ", ""), 1), 1)
      + " " + Substring(GetWord(REGEX_Replace(C, "^[^!]+ ! ", ""), 1), 1)
      + " " + GetWord(REGEX_Replace(C, "^[^!]+ ! ", ""), 0)
      + "#" + GetWord(REGEX_Replace(C, "^[^!]+ ! ", ""), 1)
      + Regex_Replace(C, "^[^!]+ ! [^ ]+ [^ ]+ ?", " "),
    Substring(GetWord(C,0),1) + Left(GetWord(C,0),1) + Left(GetWord(C,1),1))
```

If the string doesn’t contain a `!`, then the outer game has been completed; a final single word is produced, and the generate rows ends. Alternatively, everything is deleted up to and including the first `!`. Then the next two words are adjusted to account for the winner of the sub-game, and the state of the parent game is appended to the existing string. The parent game then continues to be played.

Putting it all together gives a very long expression, but one which will run the whole recursive game within a generate rows tool! I added a small additional step which meant an extra formula tool wasn’t needed. The long strings, and hence slow manipulation, make this a long process to complete, but it works in BaseA!
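For comparison, here is the same recursive game sketched in a general-purpose language, as noted above it is straightforward there. This is an illustrative sketch of Recursive Combat, not part of the BaseA solution.

```python
from collections import deque

def play_recursive(p1, p2):
    """Play Recursive Combat; returns (player1_won, winning_deck)."""
    d1, d2 = deque(p1), deque(p2)
    seen = set()
    while d1 and d2:
        state = (tuple(d1), tuple(d2))
        if state in seen:       # repeated position: player 1 wins this game
            return True, d1
        seen.add(state)
        a, b = d1.popleft(), d2.popleft()
        if len(d1) >= a and len(d2) >= b:
            # Recurse into a sub-game with copies of the next a/b cards
            p1_wins, _ = play_recursive(list(d1)[:a], list(d2)[:b])
        else:
            p1_wins = a > b
        if p1_wins:
            d1.extend([a, b])
        else:
            d2.extend([b, a])
    return bool(d1), d1 or d2

def score(deck):
    # Bottom card * 1, next card up * 2, and so on
    return sum(i * c for i, c in enumerate(reversed(deck), 1))
```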

*Tools used: 16 (including the Python tool and iterative macro), run-time: 1min*

Part 1 of this puzzle was solved using a straight forward iterative macro. Carrying a current value and the 9 number ring’s order, each iteration finds the new value and mutates the ring state. Running this 100 times produces the required answer.

For part 2, instead of a ring of 9, you need a ring of 1,000,000 numbers and to iterate it 10,000,000 times. This is impossible within BaseA rules as far as I know. I tried adding a `VarListIndexOf` function to the Abacus library, which would allow the iteration to be run within a generate rows tool, but so far I haven’t managed to make it a reasonable solution (the lookup is an `O(n)` operation, so the full process needs on the order of 1 with 13 zeros after it steps!).

I chose to solve this using the Python tool but minimising its use to just the iteration step. First, I generate the 1,000,000 rows in Alteryx and then use Python to run the process. The Python tool allowed me to create a linked list with a dictionary mapping each value to its entry in the linked list. This meant that each move was an O(1) operation, so performant enough to run in a reasonable time. The code is below:

```python
from ayx import Alteryx
import pandas as pd

df = Alteryx.read("#1")
vals = list(df['V'])
l = len(vals)

class Cup:
    def __init__(self, v: int):
        self.v = v
        self.next = None
    def __repr__(self):
        return f'C({self.v}:{self.next.v})'

# Build the linked list and the value -> Cup lookup dictionary
d = {}
for i, v in enumerate(vals):
    c = Cup(v)
    if i > 0:
        d[vals[i - 1]].next = c
    d[v] = c
d[vals[-1]].next = d[vals[0]]

current = d[vals[0]]
for _ in range(df['games'][0]):
    # Remove the three cups after the current one
    c_next = current.next
    current.next = current.next.next.next.next
    pickup = [c_next.v, c_next.next.v, c_next.next.next.v]
    # Find the destination cup: current value minus one, wrapping and skipping picked-up cups
    i = 1
    while (current.v - i if current.v - i > 0 else current.v - i + l) in pickup:
        i += 1
    i = (current.v - i if current.v - i > 0 else current.v - i + l)
    # Splice the three cups back in after the destination
    t_next = d[i].next
    d[i].next = c_next
    c_next.next.next.next = t_next
    current = current.next

# Read the ring starting from cup 1
current = d[1]
output = []
while (current.next.v != 1):
    output.append(current.v)
    current = current.next
df = pd.DataFrame(output)
Alteryx.write(df, 1)
```

If I can find a way to make it work using the Abacus function, I will publish it. I’m not sure if it will be possible without specialised functions which don’t seem a valid solution.

*Tools used: 24 (including iterative macro), run-time: 3.3s*

The first challenge for today was to work out how to represent a hex-grid within Alteryx. I chose to think of it like:

It’s worth noting that this is not correct. Assuming the x co-ordinates are correct, the centres are 10 apart. The side length of the hexagon is actually then `10/√3`, so the y co-ordinate should be about 8.66. For what was needed for this puzzle, it was simpler to treat it as 10.

My solution to part one was first to parse the input string into rows using a regular expression of `ne|nw|se|sw|e|w`. I then chose to use a couple of multi-row formula tools to compute the current `x` and `y` values for each row. Finally, a sample tool picked the final cell reached.
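The parsing and co-ordinate walking can be sketched in Python, using the same scheme as above: east/west steps of ±10 in x, diagonal steps of ±5 in x and ±10 in y (treating the row spacing as 10).

```python
import re

# Offsets for each of the six hex moves under the scheme described above
MOVES = {"e": (10, 0), "w": (-10, 0),
         "ne": (5, 10), "nw": (-5, 10),
         "se": (5, -10), "sw": (-5, -10)}

def walk(path):
    """Follow a move string like 'esew' and return the final (x, y)."""
    x = y = 0
    for step in re.findall("ne|nw|se|sw|e|w", path):
        dx, dy = MOVES[step]
        x, y = x + dx, y + dy
    return x, y
```

Note the alternation order matters: `ne|nw|se|sw` must come before the bare `e` and `w` so the two-letter moves are matched first.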

For part 2, I chose to build an iterative macro. I passed the set of filled hexagons as the input. For each of these cells, I moved east, west, northeast, northwest, southeast and southwest (by using an append fields tool and a couple of formulas) to get all the neighbours. Having got these 6 new co-ordinates, it is then a case of joining back to the set of filled hexagons. This allows you to produce both the filled cells flipping white and the white cells becoming filled (in both cases by summing and filtering). The resulting set of filled cells can then be looped around. Using a final filter tool on `Engine.IterationNumber` allows the macro to terminate easily.

*Tools used: 8, run-time: 21.5s*

And so we come to the final day. The first part of this was a simple generate rows tool running an iteration to work out the required number of steps. A second generate rows tool then computes the required result. A nice and gentle finish to the puzzles.
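The iteration those two generate rows tools perform can be sketched in Python, assuming the puzzle's standard handshake (subject number 7, modulus 20201227):

```python
MOD = 20201227

def loop_size(public_key, subject=7):
    """Count how many transform steps produce the given public key."""
    value, loops = 1, 0
    while value != public_key:
        value = (value * subject) % MOD
        loops += 1
    return loops

def encryption_key(card_public, door_public):
    # Transform one public key by the other's loop size
    return pow(door_public, loop_size(card_public), MOD)
```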

There we have it. Advent of Code 2020 done. The table below shows my successes (* – BaseA, **A** – Abacus, **P** – Python tool):

Day | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Part 1 | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * |

Part 2 | * | * | * | * | * | * | * | * | * | * | * | * | A | * | A | * | * | * | * | * | * | * | P | * | * |

We had a few people make it to the end. I can’t confirm whether anyone found BaseA solutions to 15 part 2 or 23 part 2 (I believe not), though some did find a workaround or pure BaseA solution for 13 part 2. Either way, 50 stars feels like a great accomplishment, with 47 in BaseA and all within Alteryx.

The list of repositories I know of is below (no new ones added this week):

- Mine: https://github.com/jdunkerley/adventofcode/
- NicoleJohnson: https://github.com/AlteryxNJ/AdventOfCode_2020
- ColoradoNed: https://github.com/NedHarding/Advent2020
- CGoodman3: https://github.com/ChrisDataBlog/AdventOfCode_2020
- AlteryxAd: https://gitlab.com/adriley/adventofcode2020-alteryx/
- NiklasEk: https://github.com/NiklasJEk/AdventOfCode_2020
- peter_gb: https://github.com/peter-gb/AdventofCode
- AkimasaKajitani: https://github.com/AkimasaKajitani/AdventOfCode
- dsmdavid: https://github.com/dsmdavid/AdventCode2020

That’s a wrap for AoC 2020 – it’s been a lot of fun. Hopefully, my overviews have given you an insight into the thought processes I used to solve these puzzles and this will help you when solving your own challenges. I look forward to 2021 and the next set of challenges. A final huge thanks to Eric Wastl for setting the amazing puzzles and all the work that entails.

So week 1 and week 2 were both possible in BaseA Alteryx, although getting harder as the puzzles progress. Week 3 was the first time I needed to go beyond BaseA to find a solution for a couple of the parts (though in at least one case the community found a BaseA solution).

As with a couple of years ago, doing the Advent of Code inspired me to do more work on the Abacus library. This time I added four functions allowing for 64-bit integer-based arithmetic. The numbers need to be passed as strings and are returned as strings. The new functions are:

- `Int64Add(a,b,c...)` – sums all the inputs
- `Int64Mult(a,b,c...)` – multiplies all the inputs
- `Int64Div(a,b)` – computes `a / b` using integer division (i.e. returns `Floor(a / b)`)
- `Int64Mod(a,b)` – computes `a % b` (i.e. returns the remainder of `a / b`)

Anyway, onto the puzzles!

*Tools used: 21, run-time: 0.3s*

Part 1 was straightforward: a simple formula of `CEIL([Time]/[ID])*[ID]-[Time]` gave the first time the bus would depart after the initial time. Part 2, however, was a lot harder and involved remembering (or googling) some maths I had long since forgotten – the Chinese Remainder Theorem. Roughly, it breaks into the following steps:

- Work out the *Product* of all the divisors (*n_i*)
- For each row, work out *p_i*, given by *Product / n_i*
- Solve the equation *s_i p_i = 1 (mod n_i)*
- A solution to the equations, *x*, is given by the sum of all the *a_i s_i p_i* (where *a_i* is the required remainder for row *i*)
- The minimal solution is given by *x mod Product*
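These steps can be sketched in Python, whose arbitrary-precision integers sidestep any overflow (this sketch uses the modular-inverse form `pow(x, -1, n)`, available from Python 3.8):

```python
from math import prod

def crt(divisors, remainders):
    """Chinese Remainder Theorem: smallest x with x = a_i (mod n_i) for all i."""
    product = prod(divisors)                 # step 1
    total = 0
    for n_i, a_i in zip(divisors, remainders):
        p_i = product // n_i                 # step 2
        s_i = pow(p_i, -1, n_i)              # step 3: inverse of p_i mod n_i
        total += a_i * s_i * p_i             # step 4
    return total % product                   # step 5
```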

I took a look at the python code from Rosetta Code to help implement this in Alteryx. This solution worked perfectly for all the examples given in the puzzles. However, for the specific problem (a much larger set), the engine reported the following:

The problem is Alteryx formulas are always evaluated as doubles. This is only a problem in step 4, where the values get very high. One feature I didn’t know is that the summarise tool will accurately sum fixed decimal types (thanks to Ned for this hint). This allows me to complete step 4 but still leaves me stuck on step 5. This is where I went to the Abacus library and implemented the needed 64-bit integer functions. Having done this, a formula of `Int64Mod([Sum_v], [Product])`, taking the values converted into strings, computes the correct answer to the puzzle.

CGoodman3 implemented an additional generate rows to try 300 values around the result produced by Alteryx and see if it solves the equations. Ned Harding produced a macro which will handle big integer multiplication in BaseA. A useful tool if you need to deal with these huge numbers.

dsmdavid‘s solution was a very elegant iterative macro solving each equation one at a time and then increasing the step size to solve the next one. To quote his post:

> I began with pen and paper, then some excel: I started with
>
> - number 3, offset 0
> - number 5, offset 1
> - number 7, offset 2
> - number 11, offset 3
>
> 3 & 5 --> the first number that qualifies is 9 (mod(9,3) = 0 && mod(9+1,5)=0). Then I need to find a number that qualifies for (3,5) and 7. I start with the first number that qualifies for the previous condition, go in increments of (3*5) --will all keep qualifying for mod(z,3) = 0, mod(z+1,5) = 0 -- until I find one that qualifies mod(z+2,7) = 0 (the first number that qualifies is 54). For the next, I'll start at 54 and increase in increments of 105 (3*5*7) until I find one that qualifies mod(z+3,11) = 0 (789). And so on.

A great solution – and BaseA without issue.

*Tools used: 16, run-time: 4.6s*

Back into a comfortable puzzle for Alteryx. This problem involved some binary masking operations. For part 1, you needed to ignore the `X`s and to set the bits of the value to `0` when the mask was `0` and `1` when the mask was `1`. A couple of expressions easily achieved this:

```
BinaryOr([Computed], BinToInt(ReplaceChar([Mask], "X", "0")))   # Set the 1 mask bits
BinaryAnd([Computed], BinToInt(ReplaceChar([Mask], "X", "1")))  # Set the 0 mask bits
```

After this, it’s just a case of using a summarise tool to pick the final value (this could also have been done with a sample tool), and then a final summarise to total all the values.
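The two masking expressions amount to the following in Python (a sketch with illustrative names):

```python
def apply_mask(value, mask):
    """Apply a '01X' mask string to an integer value (part 1 rules)."""
    or_mask = int(mask.replace("X", "0"), 2)   # 1s in the mask force bits on
    and_mask = int(mask.replace("X", "1"), 2)  # 0s in the mask force bits off
    return (value | or_mask) & and_mask
```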

For part 2, the meaning of the mask was changed. In this case, it is applied to the address rather than the value. Additionally, the `0`s are ignored, the `1`s are set on the address, and the `X`s become wildcards, meaning both `0` and `1` should be evaluated.

For simplicity of diagnosing problems, I chose to use the `IntToBin` function to write the address as a 36-character binary string and applied the `1`s. Then, using some generate rows tools, I created a row for every permutation and used regular expressions to create the new address values. Finally, the same double summarise tools produced the answer.
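The wildcard expansion can be sketched as follows (a minimal sketch, branching each `X` into both bit values):

```python
def addresses(address, mask):
    """Expand a '01X' mask over an address into every concrete address."""
    bits = format(address, f"0{len(mask)}b")
    results = [""]
    for a_bit, m_bit in zip(bits, mask):
        if m_bit == "0":
            results = [r + a_bit for r in results]    # 0: keep the address bit
        elif m_bit == "1":
            results = [r + "1" for r in results]      # 1: force the bit on
        else:
            # X floats: take both branches
            results = [r + b for r in results for b in "01"]
    return sorted(int(r, 2) for r in results)
```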

*Tools used: 13, run-time: 3m49s*

This puzzle reminded me of day 9 from 2018. You need to hold a lot of state information and keep iterating. For part 1, at each step, you add a new number to the list. I chose to do this by keeping the current list as a string:

A generate rows tool is used to create 2020 rows, and then the state is generated using a multi-row formula with the expression:

```
[Row-1:State] + " " +
iif(Regex_CountMatches(
        REGEX_Replace([Row-1:State], "\d+$", ""),
        "\b" + Regex_Replace([Row-1:State], ".* ", "") + "\b") == 0,
    "0",
    ToString(REGEX_CountMatches(
        REGEX_Replace(
            REGEX_Replace([Row-1:State], "\d+$", ""),
            ".*\b" + Regex_Replace([Row-1:State], ".* ", "") + " ",
            " "),
        " ")))
```

On each row, the current value is read from the last word of the string. If this value is present earlier in the list (checked using `Regex_CountMatches`), then we count the number of spaces between its last earlier instance and the end of the string. If it wasn’t present, then a 0 is added.

For part 2, you need to do 30,000,000 iterations. This is clearly not possible with this string state I was keeping. It would involve a string with at least 60,000,000 characters. My first idea was to change to storing the state as key and value in the string, for example:

The expression to update this is fairly complicated (some horribly nested regular expressions). This approach was an improvement on my part 1 approach but took nearly 10 minutes to run 100,000 iterations. Doing a little bit of analysis, I estimated it would take at least 2 years to run this to 30,000,000 iterations.

My non-BaseA solution to this was to use the `VarNumExists` and `VarNum` features of the Abacus library. These allow you to define a variable dynamically and then update it as needed. This makes the iterative formula:

```
iif([RowCount] <= [Init],
    GetWord([Field1], [RowCount]-1)
      + iif([RowCount] > 1 AND VarNum([Row-1:Value], RowCount - 1), "", ""),
    ToString(
      iif(VarNumExists([Row-1:Value]), RowCount - VarNum([Row-1:Value]) - 1, 0)
      + VarNum([Row-1:Value], RowCount - 1) * 0
    ))
```

The row value is just the current number, and the last time each number was seen is stored in the `VarNum` variables. This allows Alteryx to complete the 30 million rows in about 4 minutes.

For part 1, a few went with the iterative macro approach to solving this. This was generally successful but significantly slower (4 minutes or so) versus the generate rows approach. That being said, AkimasaKajitani found a huge performance boost by using the AMP engine.

I do not believe it is possible to do part 2 within BaseA rules in anything resembling a sensible time. Some resorted to the python tool to solve this which allowed for answers in a few seconds.
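The dictionary-of-last-seen approach those Python solutions used can be sketched as:

```python
def memory_game(starting, turns):
    """Play the memory game and return the number spoken on the given turn."""
    # Record the turn each starting number (except the last) was spoken, 1-indexed
    last_seen = {v: i + 1 for i, v in enumerate(starting[:-1])}
    current = starting[-1]
    for turn in range(len(starting), turns):
        # Next number is the gap since `current` was last spoken, or 0 if new
        nxt = turn - last_seen[current] if current in last_seen else 0
        last_seen[current] = turn
        current = nxt
    return current
```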

*Tools used: 30, run-time: 0.7s*

The first task was to parse the input. This input contains a set of rules, each looking like `seat: 2-3 or 7-9`, and a set of tickets, each being a comma-separated list. Using a Regex tool and Text to Columns, I ended up with:

Having parsed the input, the next task is to filter the ticket fields to see which rules are valid for which fields. I chose to use an append fields tool to add every possible rule to every ticket and column. You can then join this to the set of tickets to produce the tickets where no field passed any rule (the unique tool in my workflow is not needed – bad tool golf!).

For part 2, you need to work out which rule applies to which column. I chose to use an iterative macro to solve this. Firstly, I filtered out the invalid tickets (using a join tool). After this, I filter down to the valid Tickets, Columns and Rules. For each rule, I count the number of valid tickets and compare this with the distinct count of valid tickets. This gives the set of valid rules for each column:

The last step is the iterative macro:

In this case, each iteration picks out `ColumnName`s that only occur once in the list. The `Column` associated with this is then returned with the name; all other rows with the same `Column` value are removed, and the iteration repeats. This is pretty similar to previous macros. Having solved this, the answer to the puzzle is easily obtained with a join and summarise.
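The elimination step the macro performs can be sketched as follows, with `possible` assumed to map each column index to the set of rule names still valid for it:

```python
def resolve_columns(possible):
    """Repeatedly fix any column with a single candidate rule name."""
    resolved = {}
    while possible:
        col = next(c for c, names in possible.items() if len(names) == 1)
        name = possible.pop(col).pop()
        resolved[col] = name
        # Remove the assigned name from every other column's candidates
        for names in possible.values():
            names.discard(name)
    return resolved
```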

Good to be back to pure BaseA!

Most solutions for part 1 were pretty similar. So I chose to focus on Danilang‘s macro-free solution to build the column assignments. As with my macro approach, first, the results are filtered down to just the possible ones. Next, for each rule, a count of the possible matching columns is added (using the summary and join).

For each row, a concatenated string of all the possible column numbers is created and assigned to a value of the number of possible fields plus 1. By joining this to the ruleset, it is possible to work out which column has not been used and produce the required column mapping.

*Tools used: 39, run-time: 23s*

The main issue I had with day 17 was understanding the example! I must have read it about 5 times before I cottoned on to what was happening. As this only needed to be run 6 times, and standard macros are a lot easier to debug than iterative macros, I chose to copy the macro 6 times! My macro works for both parts 1 and 2 (i.e. it has been updated to work with 4 dimensions!).

Much like day 11, the first task is to parse the input into the set of active cubes, each with an `x`, `y`, `z` and `w` co-ordinate. This is then the input into the macro. The macro performs one iteration following the rules of the puzzle.

Within the macro, I chose to work out the minimum and maximum for each dimension and then created a 4-dimensional grid one larger in each direction. For every cell, I also create the shifts up and down by 1 in each dimension (for part 1, I disable the `w` shifts). Having created this grid with the shifts, it is then a case of joining onto the active cube set, and it becomes a straightforward summation and formula to compute the new set of active cells. The answer for the puzzle is just given by the number of rows after the final macro.
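The count-and-filter step one pass of the macro performs can be sketched over a set of active co-ordinate tuples (a minimal sketch, with `dims=3` standing in for the part 1 behaviour of disabling the `w` shifts):

```python
from collections import Counter
from itertools import product

def step(active, dims=4):
    """One generation: count each cell's active neighbours, then apply the rules."""
    counts = Counter()
    for cell in active:
        for delta in product((-1, 0, 1), repeat=dims):
            if any(delta):  # skip the zero shift (the cell itself)
                counts[tuple(c + d for c, d in zip(cell, delta))] += 1
    # Active cells survive with 2 or 3 neighbours; inactive activate with exactly 3
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in active)}
```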

My macro also produced a diagnostic output reproducing the string input in the puzzle. This allowed me to work out what was going on with the example and ensure I produced the same results.

*Tools used: 8, run-time: 0.6s*

This puzzle involves implementing a custom order of execution. I chose to do this using an iterative macro evaluating one operation at a time. The macro was adapted to cope with addition being evaluated ahead of multiplication, so it solves both parts 1 and 2. The first question is whether the expression contains any brackets, and if it does, it zooms in on that part:

```
I: iif(Contains([Field1], "("),
       Length(REGEX_Replace([Field1], "(.*?)(\([0-9 *+]+\)).*", "$1")),
       0)
```

This will look for the innermost paired brackets and count the characters up to this point; otherwise, it leaves the index at 0. After this, it picks out the block to evaluate using:

```
ToEval: REGEX_Replace([Field1], ".{" + ToString(I) + "}(\([^)]+\)|[^(]+).*", "$1")
```

This will either be the entire expression or the innermost bracket. Within this block, it then either picks the first 2 values and the operator or, if in addition-first mode (denoted by `#1` being true), it hunts for the first `+`:

```
Ex: REGEX_Replace(Substring([ToEval], Index), "(\(?\d+ [*+] \d+\)?).*", "$1")
```

`Ex` includes leading and trailing brackets, allowing for these to be preserved during the evaluation:

```
Eval: iif(right(Ex,1) != ")" and left(Ex,1) = "(", "(", "")
      + ToString(iif(Contains(Ex, "*"),
          ToNumber(GetWord(Trim(Ex,"()"),0)) * ToNumber(GetWord(Trim(Ex,"()"),2)),
          ToNumber(GetWord(Trim(Ex,"()"),0)) + ToNumber(GetWord(Trim(Ex,"()"),2))))
      + iif(right(Ex,1) = ")" and left(Ex,1) != "(", ")", "")
```

After this, it is just a case of substituting back into the outer expression. If the expression still has any operators in it, then the iteration runs again. To answer the puzzle, all that is left to do is cast to an integer and sum the result.
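The same innermost-bracket, one-fragment-at-a-time strategy can be sketched in Python (an illustrative sketch, not a translation of the exact expressions above):

```python
import re

def evaluate(expr, add_first=True):
    """Evaluate + and * left to right; with add_first, + binds tighter (part 2)."""
    def eval_flat(flat):
        # Evaluate a bracket-free fragment like "1 + 2 * 3"
        while add_first and "+" in flat:
            flat = re.sub(r"(\d+) \+ (\d+)",
                          lambda m: str(int(m.group(1)) + int(m.group(2))),
                          flat, count=1)
        while " " in flat:
            a, op, b, *rest = flat.split(" ", 3)
            v = int(a) * int(b) if op == "*" else int(a) + int(b)
            flat = " ".join([str(v)] + rest)
        return flat
    # Repeatedly collapse the innermost bracketed fragment
    while "(" in expr:
        expr = re.sub(r"\(([^()]+)\)", lambda m: eval_flat(m.group(1)),
                      expr, count=1)
    return int(eval_flat(expr))
```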

*Tools used: 20, run-time: 0.5s*

Once more diving into regular expressions. This time the input defines an expression tree which can be built into a regular expression. This becomes yet another hierarchy-walking iterative macro, substituting the leaves into the parent and building a very long (and hard to read!) regular expression. I use far too many brackets in my building up. So consider the input:

```
3: 4 5 | 5 4
4: "a"
5: "b"
```

The macro I built removes the `"` from 4 and 5 and adds a lot of brackets, so that 3 gets evaluated to `((ab)|(ba))`. Clearly, this could be simplified to `(ab|ba)`, but as it was working, I left it as it was. The full expression ends up with a lot of brackets! Having built the expression for 0, part 1 is given just by using a `REGEX_Match` filter.

Part 2 involved a little more thought. Having looked through the expression for 0, you can evaluate it by removing pairs of blocks – matching expression 42 at the start and expression 31 at the end. If you end up with something which matches repeated expression 42, then it is valid. I chose to do this with a generate rows tool removing the paired blocks, one pair for each generated rows. Then finally it looks to see if it matches repeat blocks of expression 42 using a filter tool.

All of the solutions posted to the community end up looking very similar, though with slightly different approaches to solving part 2. A shout out to stephM for coming up with a solution which doesn’t involve an iterative macro.

So many iterative macros! However, I am really pleased to have passed my total stars from 2018 this week (33 stars in that year). I did have to solve 2 using Abacus functions, but still great to see how far we have got using Alteryx.

There are still a fair number of people trying to solve these each day, and over 40 people have solved one or more! A few more git repositories added this week:

- Mine: https://github.com/jdunkerley/adventofcode/
- NicoleJohnson: https://github.com/AlteryxNJ/AdventOfCode_2020
- ColoradoNed: https://github.com/NedHarding/Advent2020
- CGoodman3: https://github.com/ChrisDataBlog/AdventOfCode_2020
- AlteryxAd: https://gitlab.com/adriley/adventofcode2020-alteryx/
- NiklasEk: https://github.com/NiklasJEk/AdventOfCode_2020
- peter_gb: https://github.com/peter-gb/AdventofCode
- AkimasaKajitani: https://github.com/AkimasaKajitani/AdventOfCode
- dsmdavid: https://github.com/dsmdavid/AdventCode2020

Onto the final six days. Hopefully, someone can get the first 50 stars using Alteryx (with as much as possible in BaseA) this year – though as always it’s a busy time of the year and the puzzles are getting harder!

So week 1 was well suited to Alteryx, let’s see how week 2 unfolded! A nice and gentle Sunday puzzle lulled me into the belief that it was going to be an easy week, followed by the first needed use of an iterative macro, and then something that looked far too much like the dreaded IntCode of 2019…

As with last week, I’ve picked some examples from around the community for different approaches to my own. This week also saw a useful macro by Ned Harding which will download and parse the input from the Advent of Code site. I also played with a version of this, which will download the leaderboard so I could play with the results – and see if anyone had beaten Nicole Johnson yet!

Some of the puzzles this week involve complicated workflows, so I will do my best to explain them as clearly as I can. Where I couldn’t find a substantially different approach (or didn’t understand the other one!), I haven’t included one below.

*Tools used: 6, run-time: 0.2s*

A well-suited problem for Alteryx. First, I used a multi-row formula tool to identify each group, with the `null` rows delimiting when a group ends. The old trick of a Regex tool in tokenise mode with an expression of `.` breaks each character into a separate record. A summarise tool grouping by `Group` and `Char` then produces a record for each combination, meaning the answer for part 1 is just the row count.

For part 2, you need to know how many people are in each group and then join this with those characters within the group which have the same count. This can easily be done using a Join tool on `Group` and `Count`, with the `J` output record count answering part 2.
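The same two counts can be sketched in a few lines of Python (the `raw` string here is a made-up example, not the puzzle input):

```python
# Blank lines delimit the groups, as in the workflow's multi-row formula.
raw = "ab\nac\n\na\na\na\n\nb"
groups = [g.splitlines() for g in raw.split("\n\n")]

# Part 1: one record per distinct (group, character) pair - the row
# count after the summarise tool.
part1 = sum(len(set("".join(g))) for g in groups)

# Part 2: characters answered by everyone in the group - the join on
# Group and Count.
part2 = sum(len(set.intersection(*(set(person) for person in g))) for g in groups)
```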

There weren’t any real alternative approaches to this one. A few people used a Unique tool for part 1 to produce a unique set.

*Tools used: 22 (including iterative macro), run-time: 3.6s*

This is a fairly classic problem of needing to build and walk a hierarchy. For my solution, I chose to use an iterative macro to fold leaf nodes up into their parents and then remove them. The first task, as always, is to parse the input – back to the usual Regex and Text to Columns tools to produce something like:

Each row represents a parent and child (with the count for the child). My iterative macro selects the `Outer` colours whose `InnerColour` does not appear in the Outer list (the leaves of the network as it stands). These rows are then written to the output. Additionally, their children are copied into their parent nodes and scaled by the `InnerNumber`, so for example:

shiny gold 1 dark olive
shiny gold 2 vibrant plum
dark olive 3 faded blue
dark olive 4 dotted black
vibrant plum 5 faded blue
vibrant plum 6 dotted black

`Dark olive` and `vibrant plum` are the leaf nodes. The last four lines are written as an output, and the input for the next iteration becomes:

shiny gold 1 dark olive
shiny gold 2 vibrant plum
shiny gold 3 faded blue
shiny gold 4 dotted black
shiny gold 10 faded blue
shiny gold 12 dotted black

On the next iteration, the new leaf node is `shiny gold`, so these 6 rows are written to the output. The iterative loop is then empty, hence the macro exits. As I calculated the counts as I went along, parts 1 and 2 are both solved by just filtering and summarising the rows.
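As a cross-check on the counting, the example above can be folded up in a few lines of Python (a sketch of the logic, not the macro):

```python
# Parsed parent -> [(count, child)] rules from the example above.
rules = {
    "shiny gold": [(1, "dark olive"), (2, "vibrant plum")],
    "dark olive": [(3, "faded blue"), (4, "dotted black")],
    "vibrant plum": [(5, "faded blue"), (6, "dotted black")],
    "faded blue": [],
    "dotted black": [],
}

def bags_inside(colour: str) -> int:
    # Each contained bag counts itself plus everything inside it.
    return sum(n * (1 + bags_inside(inner)) for n, inner in rules[colour])
```

For the example rules this gives 32 bags inside `shiny gold`, matching the puzzle’s worked answer.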

The approaches were all generally similar, but I thought I would highlight Nicole Johnson‘s simpler hierarchy macro. She has answered similar questions to this on the community and has a blog post about Kevin Bacon on the subject.

Unlike my macro, Nicole’s takes in 2 inputs – the set of all connections (the same basic format as my input) and the immediate children of `shiny gold`. Each iteration moves down to the children and multiplies the quantities. These rows are output at each step (the `R` output) and looped round to be the input to the next loop. When a leaf is reached, there will be no connections joined, so the macro terminates.

*Tools used: 26 (including macros), run-time: 43.8s*

My first reaction was – uh oh, this is going to be like IntCode and take forever. However, it turned out to be a lot easier. My approach was fairly straightforward. First, I parsed the instructions and then passed them into the iterative macro. The iterative macro also takes a state input which is looped round in the iteration. This looks like:

Ptr: 1     # Current Instruction
Value: 0   # Current Value
Exec: ''   # Set of executed values

On each iteration, the instruction at `Ptr` is looked up. The `Exec` string is checked to see if it contains the `Ptr` already (the termination condition for the loop). If it does, the current row is written to the `R` output and the loop terminates. Otherwise, `Ptr` is added to the `Exec` string, and new values for `Ptr` and `Value` are computed and passed to the loop output. The result at each step is also written out (to a third output), as this is needed for part 2. The answer for part 1 is given in the `R` output.

For part 2, I chose to use a batch macro to vary one instruction at a time and then run the above iterative macro. In this case, the required answer comes when the iterative macro terminates with a `null` result. You only need to test changing the `nop` and `jmp` operations – this gave a limited set (94) of cases to try. Each case is passed in as control parameters, and then the instruction set is altered using a formula tool. Ideally, this would terminate on the first successful result, but I never got that termination to work within the batch macro.
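For comparison, the whole simulate-and-patch idea fits in a short Python sketch (`program` is a list of parsed `(op, arg)` pairs; this mirrors the logic, not the macros):

```python
def run(program):
    # Returns (accumulator, terminated); terminated is False when a
    # loop is detected - the Exec-string check in the macro.
    ptr, acc, seen = 0, 0, set()
    while ptr < len(program):
        if ptr in seen:
            return acc, False
        seen.add(ptr)
        op, arg = program[ptr]
        if op == "acc":
            acc += arg
        ptr += arg if op == "jmp" else 1
    return acc, True

def fix(program):
    # Part 2: flip one nop/jmp at a time until the program terminates.
    for i, (op, arg) in enumerate(program):
        if op == "acc":
            continue
        patched = list(program)
        patched[i] = ("jmp" if op == "nop" else "nop", arg)
        acc, ok = run(patched)
        if ok:
            return acc
```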

I had to choose Ned Harding’s macro-free approach for this one. As he says, if you can avoid iterative or batch macros, it is easier to debug and much faster (this version runs in 2.9 seconds).

First, Ned combines the instructions into a single long string, with each instruction being 5 characters long: the operation is shortened to a single character, and the value is padded with spaces to 4 characters. Next, a generate rows tool is used to create as many rows as there are instructions plus 1. This is used to mutate a `jmp` to `nop` (and vice versa) within each copy of the set. A unique tool then removes the copies which have not been changed.

Each of these ‘programs’ is fed into a generate rows tool which creates up to 300 steps for each. A multi-row formula tool then traces through which instruction would be processed on each step. A second multi-row formula tool evaluates the value of the accumulator. Finally, a third multi-row formula tool tracks the steps which have been executed: if a repeat is detected, this expression returns `#error`; if it finds the terminating instruction, `#success` is written.

A very clever and speedy way to solve this problem.

*Tools used: 17, run-time: 0.8s*

Following Ned’s demonstration for day 8, I chose to go macro-free for this one. First, I add a RecordID to the input data. Next, using a multi-row formula tool, I concatenate the numbers together, keeping the last 25 values in a string. This is then exploded into 25 rows for each RecordID. Then it is a case of following the same idea as day 1: computing the missing difference and joining onto itself. Using a multi-row tool to identify the missing row (when the step goes up by more than 1) and then joining back to the input gives the missing value.

For part 2, I first computed the running sum and then appended the target value. It is then easy to work out the missing value and join it to the set of running sums. This creates the block of rows needed; a summarise tool then picks the minimum and maximum of the block, which combine to give the answer.

One feature I used in this case was to hold the number of rows (25 for the real set, 5 for the test) as a workflow constant. It meant I could refer to it in all the formulas and change it in a single place as needed.
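The same logic reads naturally in Python (a minimal sketch; the window size is a parameter so the 5-value test data works too):

```python
from itertools import combinations

def first_invalid(nums, window=25):
    # The first number that is not the sum of a pair drawn from the
    # preceding `window` values.
    for i in range(window, len(nums)):
        if not any(a + b == nums[i] for a, b in combinations(nums[i - window:i], 2)):
            return nums[i]

def weakness(nums, target):
    # Running sums: a contiguous block summing to `target` is a pair
    # of prefix sums differing by exactly `target`.
    sums = [0]
    for n in nums:
        sums.append(sums[-1] + n)
    for j in range(len(sums)):
        for i in range(j):
            if sums[j] - sums[i] == target:
                block = nums[i:j]
                return min(block) + max(block)
```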

For an alternative, I chose to look at peter_gb‘s solution. For the first part, instead of building a string and splitting, a generate rows tool is used to create 25 extra rows. These are then joined on the RecordID to get the set of input values. A second join creates all possible pairs of numbers, and then it is just a case of filtering to find a valid pair.

For part 2, Peter uses an iterative macro removing one row at a time from the front. He then computes the running total across this set and looks to see if it ever meets the target. If it does, the iteration stops; otherwise, a single row is removed and the macro loops round.

*Tools used: 8, run-time: 0.2s*

Part 1 of this problem was very straightforward: sort the data set, work out the row differences (they turned out to always be 1 or 3), and then use a CrossTab to count the number of each. The answer is then given by a formula tool (remembering to add 1).

Part 2, however, includes this warning:

A simple brute force approach will not work, so some time with a piece of paper and some thought was needed before jumping in. The goal is to count the combinations of 1s you can skip. Looking at a sequence of differences like `1,1,3`, you can skip the first 1. You cannot skip the second 1, as the jump would then be 4. Likewise, you can’t skip any 3s, as this would make the jump too big. This leads to a table like:

1,3 - 1 option
1,1,3 - 2 options (1,1,3 or 0,1,3)
1,1,1,3 - 4 options (1,1,1,3; 0,1,1,3; 1,0,1,3 or 0,0,1,3)
1,1,1,1,3 - 7 options (1,1,1,1,3; 0,1,1,1,3; 1,0,1,1,3; 1,1,0,1,3; 0,0,1,1,3; 0,1,0,1,3 or 1,0,0,1,3)
1,1,1,1,1,3 - 13 options (...)

I then just counted the blocks of 1s to see how many there were in each section, and then used a formula tool to convert each count to the number of possibilities. Finally, a multi-row formula tool computes the total by multiplying the blocks together.

For the elegance and simplicity of his part 2 solution, I chose Balder’s answer as an alternative. Part 1 is exactly the same, but for part 2, he uses a single multi-row formula with the expression:

IF [Row-3:data] + 3 >= [data] THEN MAX([Row-3:ans],1) ELSE 0 ENDIF
+ IF [Row-2:data] + 3 >= [data] THEN [Row-2:ans] ELSE 0 ENDIF
+ IF [Row-1:data] + 3 >= [data] THEN [Row-1:ans] ELSE 0 ENDIF

Using a look back of up to three rows, he first assesses whether each of the preceding rows is within 3 of the current value. The combination counts of those rows within 3 are then totalled to give the count for the current row.
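Balder’s recurrence translates directly into a small dynamic programme. A Python sketch, assuming `data` is the sorted joltage list with the 0-jolt outlet and the device (maximum + 3) included:

```python
def arrangements(data):
    # ways[j] = number of valid paths reaching joltage j; each joltage
    # can be reached from any of the previous three values, exactly as
    # in the multi-row formula's look-back.
    ways = {data[0]: 1}
    for j in data[1:]:
        ways[j] = sum(ways.get(j - d, 0) for d in (1, 2, 3))
    return ways[data[-1]]
```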

Very nice win for tool golf (5 tools excluding Browse) and speed (0.2s)!

*Tools used: 32 (including iterative macro), run-time: 12.7s*

My first attempt at this puzzle was a complete mess – though it did eventually work. This was my second attempt. First, I parsed the input into a list of seats with their row and column positions. I filtered out the non-seats to give a smaller set to work with. The prepared input looked like:

The next step was to work out the neighbour for each seat if still present in the list. I did this by appending a set of directional moves (Up Left, Up, Up Right, Left, Right, Down Left, Down, Down Right) and then joining to the seats to see if the neighbour was present. The result was a list of all seats and their neighbours.

The iterative macro takes this set of neighbours and the input, and then for every seat counts the occupied neighbours and switches the seat state as needed. If anything changes, the new seat states are looped around; otherwise, the result is returned.

For part 2, the only difference was that you need to move outwards from the chair in each of the 8 directions until either you reach the end or you find a seat. This was done by generating rows and picking the lowest matching chair. The same iterative macro then produces the correct result.
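For reference, one generation of the part 1 rules can be sketched in Python (count occupied neighbours, then apply the two switching rules):

```python
def step(grid):
    # grid is a list of equal-length strings of 'L', '#' and '.'.
    rows, cols = len(grid), len(grid[0])

    def occupied(r, c):
        return sum(
            grid[r + dr][c + dc] == "#"
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < rows and 0 <= c + dc < cols
        )

    new = []
    for r in range(rows):
        row = ""
        for c in range(cols):
            seat = grid[r][c]
            if seat == "L" and occupied(r, c) == 0:
                seat = "#"   # empty seat with no occupied neighbours
            elif seat == "#" and occupied(r, c) >= 4:
                seat = "L"   # occupied seat with 4+ occupied neighbours
            row += seat
        new.append(row)
    return new
```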

For an alternative, I chose to take AdamR‘s spatial-based solution. The inner iterative macro is very similar to mine (with an optimisation that each seat is assigned a unique id), so I am just going to talk about the spatial approach to finding the neighbours.

For each seat, the x and y values are divided by 10. These are then used as Latitude and Longitude inputs into a Create Points tool. Around each point a polygon is created, moving 0.15° North/South and East/West to create a rectangle. A Spatial Match tool then identifies the neighbouring seats.

For part 2, lines are extended from the original seat in the eight specified directions. These lines are then buffered to create a region which can then be used to find the other seats which are within the buffered regions. Next, a distance tool is used to work out how far each matched seat is from the original. The minimum distance for each seat and direction is kept to produce the pairs for part 2.

Clever use of the spatial tools to find the neighbours.

*Tools used: 10, run-time: 0.2s*

For this problem, a lot of multi-row formula tools was my way to go. First, I computed the angle the boat was facing and normalised this to be between 0 and 359. Then I computed the `X` for the boat with a second multi-row formula, and a third multi-row formula computes `Y`.

For part 2, the added complication is the waypoint rotating around the boat. This means that you need to mutate both the waypoint X and Y at the same time. I chose to do this by storing them as a single space-separated string (`X Y`). A multi-row formula tool can then evaluate it using an expression of:

iif([Instruction] in ('F','R','L'),
    SWITCH(mod(360+[Facing]-[Row-1:Facing],360),
        [Row-1:WP],
        90, ToString( ToNumber(GetWord([Row-1:WP],1)))+" "+
            ToString(-ToNumber(GetWord([Row-1:WP],0))),
        180,ToString(-ToNumber(GetWord([Row-1:WP],0)))+" "+
            ToString(-ToNumber(GetWord([Row-1:WP],1))),
        270,ToString(-ToNumber(GetWord([Row-1:WP],1)))+" "+
            ToString( ToNumber(GetWord([Row-1:WP],0)))),
    ToString(
        IIF([Row-1:WP]="",10,tonumber(GetWord([Row-1:WP],0))) +
        switch([Instruction],0,"E",1,"W",-1) * [Value]
    ) + " " +
    ToString(
        IIF([Row-1:WP]="",1,tonumber(GetWord([Row-1:WP],1))) +
        switch([Instruction],0,"N",1,"S",-1) * [Value]
    )
)

The `GetWord` function allows easy reading of the two parts. The one trick is handling the rotations. Working it through, a rotation of 90° takes `(X,Y)` to `(Y,-X)`. Likewise, 180° results in `(-X,-Y)`, and finally 270° gives `(-Y,X)`.
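The rotation rule is easier to see outside the string encoding. A Python sketch (x positive East, y positive North):

```python
def rotate(x, y, degrees_clockwise):
    # Each 90-degree clockwise turn maps (X,Y) to (Y,-X).
    for _ in range((degrees_clockwise // 90) % 4):
        x, y = y, -x
    return x, y
```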

Most people seem to have gone with multi-row formulas for part 1. For part 2, the only alternative approach I saw was an iterative macro, so I chose to show Danilang‘s iterative macro for part 2. It is a very elegant solution: every row has a `WayX`, `WayY`, `X` and `Y`, as well as the instruction and row being applied. On each iteration, the first row is applied to the state columns and then removed, before looping the remaining rows around and repeating.

A significantly harder week but still a lot of success with BaseA. Lots of practice on doing iterations in Alteryx – either via Generate Rows or Iterative Macros. Many people have now passed my total of 18 stars from last year and are still going strong. Maybe this year will be the first year of 50 stars.

An increased collection of git repositories this week:

- Mine: https://github.com/jdunkerley/adventofcode/
- NicoleJohnson: https://github.com/AlteryxNJ/AdventOfCode_2020
- ColoradoNed: https://github.com/NedHarding/Advent2020
- CGoodman3: https://github.com/ChrisDataBlog/AdventOfCode_2020
- AlteryxAd: https://gitlab.com/adriley/adventofcode2020-alteryx/
- NiklasEk: https://github.com/NiklasJEk/AdventOfCode_2020
- peter_gb: https://github.com/peter-gb/AdventofCode

As the community is a competitive bunch, grossal has put a Google sheet together where we can compare the speed of solutions.

Onto week 3 and possibly passing my high score of 33 stars!

So it’s December and time again for the annual Advent of Code. For those not familiar, this is a set of 25 puzzles (each with 2 parts) set by Eric Wastl. They have a Christmas theme and story and are solvable with just a little programming knowledge and some puzzle-solving skills. The puzzles start quite easy and get increasingly more complicated, and part 2 can often be a lot harder.

A couple of years ago, Adam Riley suggested we try solving them in Alteryx and so a new annual tradition began. It is worth noting that the puzzles do not necessarily suit Alteryx, but trying to think about how to solve them is a great chance to sharpen your skills. We created some rules – solving using BaseA:

- No RunCommand tool
- No Python tool
- No R tool
- No SDK based Custom Tools (macros are fine)
- No Formula SDK extensions
- Download tool allowed for downloading (no posting to APIs to get answers from!)

If you want to join us, we have an Alteryx leaderboard you can join by going to https://adventofcode.com/2019/leaderboard/private and using the code `453066-ca912f80`. We are chatting on the Alteryx Community under the Advent of Code label. The leaderboard awards points by the order in which puzzles are solved. As the puzzles are published at midnight Eastern time, this gives those who live on the West Coast an advantage. For those of us in the UK, it’s 5 am, which is not a good time for my brain at least! Generally, this means it is fairest to look at total stars rather than points.

For this year, I thought I would write up my solutions with some alternative solutions from other people every week (well, at least while we succeed in solving them!). When building a workflow, there are a couple of other dimensions we can look at beyond just working: how fast is the workflow, and ‘Tool Golf’ (i.e. how few tools can we use)! So I will try and pick different approaches to my own for each day.

*Tools used: 10, run-time: 0.3s*

So for this puzzle, you need to find the pair of numbers which total 2020. I chose to work out the second number for each input row (a new column `Miss`); I can then join this to itself (on `Miss=Field1`) to find the valid possibilities. I then used a filter tool to pick when `Miss` is greater than `Field1` to get a unique solution. Finally, I computed the product with another formula tool.

For part 2, you need to do the same but with three numbers. First, I chose to make all possible pairs using an Append Fields tool (set to allow all appends). After this, the process is similar: compute the missing third number and join.

For best ‘tool golf’, I could have merged a lot of this into the formula tool (doing the comparison and only producing a `Miss` if the fields will be in order), and then the join tool would produce a unique result.
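The whole join-based search reduces to a couple of lines outside Alteryx. A Python sketch using the example numbers from the puzzle:

```python
from itertools import combinations

nums = [1721, 979, 366, 299, 675, 1456]

# Part 1: the unique pair totalling 2020; part 2: the unique triple.
part1 = next(a * b for a, b in combinations(nums, 2) if a + b == 2020)
part2 = next(a * b * c for a, b, c in combinations(nums, 3) if a + b + c == 2020)
```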

This solution belongs to @JeanBalteryx. In this case, first, he creates all possible pairs (again using Append Fields tool) and then filters to the pairs where the total is correct. Using a Sample tool, you get a single unique answer which you then compute the Result on (using the formula tool). He also tidies up the data using a Select tool and puts the results into a Browse tool (mostly I choose to rely on Browse Anywhere for Advent of Code).

For part 2, the process starts from the full set of pairs using another Append Fields tool to get the complete set of triplets. After this performing the same filtering and sampling to produce the required result.

*Tools used: 5, run-time: 0.4s*

And so we enter the world of Regular Expressions. First, a Regex tool parses the input into columns: an expression of `(\d+)-(\d+) (.): (.*)` will break it out into 4 columns:

- `Min`: The first number
- `Max`: The second number
- `Char`: The character
- `Password`: The password to check

I then fed this into a filter tool, using `REGEX_CountMatches` to count the number of times the specified character occurs – `REGEX_CountMatches([Password], [Char])` – which can then be compared with `Min` and `Max` to determine valid matches.

For part 2, I relied on the simple `Substring` function to find the characters to compare. Watch out for the off by 1 error – my brain wasn’t working well at the time when I was building it, and it became off by 2 before realigning!
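Both policies are simple to state outside the regex. A Python sketch of the same checks on one of the example lines:

```python
import re

line = "1-3 a: abcde"
lo, hi, ch, pw = re.match(r"(\d+)-(\d+) (.): (.*)", line).groups()
lo, hi = int(lo), int(hi)

valid1 = lo <= pw.count(ch) <= hi                  # part 1: occurrence count
valid2 = (pw[lo - 1] == ch) != (pw[hi - 1] == ch)  # part 2: exactly one position
```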

For this puzzle, I had to choose Ned Harding‘s solution. He won the ‘tool golf’ challenge with a score of 2 tools! **But this is not for the faint of heart!** Using `REGEX_Replace`, he constructs the regular expression for a `REGEX_Match` in the filter tool:

REGEX_Match(REGEX_Replace([Field1], ".*:", ""),
            REGEX_Replace([Field1], "^(\d+)-(\d+) (.).*$", "^[^$3]*\($3[^$3]*\){$1,$2}$"))

The first `REGEX_Replace` removes everything up to and including the `:`. The second takes the `1-8 n` and turns it into `^[^n]*(n[^n]*){1,8}$`. This will ignore everything until the first `n` and then match between 1 and 8 blocks, each starting with an `n` followed by some non-`n` characters. The `^` and `$` mean it must match the whole password.

For part 2, the formula becomes:

REGEX_Match(REGEX_Replace([Field1], ".*:", ""),
            REGEX_Replace([Field1], "^(\d+)-(\d+) (.).*$",
                          "\(^\(?=.{$1}[^$3]\)\(?=.{$2}$3\).*$\)|\(^\(?=.{$1}$3\)\(?=.{$2}[^$3]\).*$\)"))

In this case, the input `1-8 n` is turned into `(^(?=.{1}[^n])(?=.{8}n).*$)|(^(?=.{1}n)(?=.{8}[^n]).*$)`. This is a substantially more complicated expression. It consists of 2 scenarios separated by a `|`; I will look at just the first part, as the second just flips the 1 and 8. The first case looks for when the 1st character is not `n` and the 8th character is `n`. The `(?=.{1}[^n])` is a positive lookahead checking that the second character of the string is not `n` – note that to correct for the off by 1 problem, Ned kept a leading space. The second block, `(?=.{8}n)`, is a second positive lookahead checking that character 8 is an `n`. The `.*` then matches the entire string as long as both lookaheads were fulfilled.

Some very clever regular expressions in here but not simple!

*Tools used: 11, run-time: 0.4s*

In this puzzle, we are given a map of where the trees are. I chose to turn this into a list of trees with their Row and Column co-ordinates, using the useful trick of a Regex tool in tokenise mode with an expression of `.` to break each character into a row.

Having got this list of trees, you can then filter it to cases where the column is equal to `MOD((Row - 1) * Step + 1, Len)`, where `Len` is the length of the input string. For case 1, the step is 2. This works well for all but one of the case 2 scenarios too. In the case of 2 rows for 1 column, you need to amend the expression a little more. I chose to make `Pos` a double and reproduce the `MOD` function with the following 3 steps:

Pos=([Row]-1)*[Step] + 1
Pos=[Pos]-floor([Pos]/[Len])*[Len]
Pos=iif([Pos]=0,[Len],[Pos])

Having produced this value, you can filter for when `Pos=Col`. Finally, a summarise tool and a multi-row formula allow the computation of the counts and the product.
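The wrap-around arithmetic is the whole trick here. A Python sketch using the example map from the puzzle (`down` covers the 2-rows-per-column case):

```python
grid = [
    "..##.......",
    "#...#...#..",
    ".#....#..#.",
    "..#.#...#.#",
    ".#...##..#.",
    "..#.##.....",
    ".#.#.#....#",
    ".#........#",
    "#.##...#...",
    "#...##....#",
    ".#..#...#.#",
]

def trees(grid, right, down=1):
    # The column wraps modulo the row width - the MOD expression above.
    return sum(
        row[(i * right) % len(row)] == "#"
        for i, row in enumerate(grid[::down])
    )
```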

Danilang‘s solution is a nice improvement on my approach. Instead of parsing the input into row-based data, he used the `Substring` function to pick the specific character out.

Additionally, he chose to use a `RowSkip` so every other row is removed in the specific 2-row case. This is cleaner than switching to a floating-point number and allows a cleaner expression to get the character:

substring([Field_1], MOD([RecordID]/[RowSkip]*[Offset],length(trim([Field_1]))), 1)

A nice clean solution and with a win of 7 in Tool Golf!

*Tools used: 9, run-time: 0.3s*

So we’re back in the land of RegEx. There is a set of fields we need to parse and determine if they are valid – a perfect use case for `REGEX_Match`.

The first task of this puzzle is to read the input and identify when each record ends and the next one begins. This is identified by a `null` line (worth noting I had an odd defect when copying and pasting the input into Alteryx, where the empty rows were skipped). After this, a Regex tool in tokenise mode with an expression of `(...:\S+)` breaks each field out into its own row. In retrospect, this would have been simpler with a Text to Columns tool using a space as the separator. After this, I chose to use a formula tool to split into `Field` and `Value` (again easier to do with a Text to Columns):

Field=left([Field1],3)
Value=Substring([Field1],4,1000)

For part one, it’s just a case of counting the number of matching valid fields (ignoring the `cid` field for simplicity). For part 2, we need to move on to validating the actual values. I chose to use a Find and Replace to append a validation regex. The image below shows the expressions I used:

I was in the mood to do everything within the regex. This meant I didn’t need to do any range checks on top, as it was all built into these expressions. So, for example, the check for a 4 digit year from 1920 to 2002 becomes `19[2-9]\d|200[0-2]`. The first part (`19[2-9]\d`) deals with all values from 1920 to 1999, and the second (`200[0-2]`) deals with the last 3 years needed.
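That year expression can be checked in isolation; a quick Python sketch (`byr_valid` is my name for it, not something from the workflow):

```python
import re

# 1920-1999 via 19[2-9]\d, and 2000-2002 via 200[0-2]; the anchors make
# it a whole-value match so no separate range check is needed.
byr_valid = re.compile(r"^(?:19[2-9]\d|200[0-2])$")

checks = {y: bool(byr_valid.match(y)) for y in ("1919", "1920", "2002", "2003")}
```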

CGoodman3 took almost the opposite approach to me and entirely avoided using Regex. He used a Text to Columns tool to break up the input into separate rows. This section shown above solves part 1 and is quite similar to my approach.

For part 2, Chris chose to implement a filter tool for each expression. A couple of examples of these are shown below:

Eye Colour:
[Concat_Field_11] = "ecl" AND [Concat_Field_12] IN ("amb","blu","brn","gry","grn","hzl","oth")

Issue Year:
[Concat_Field_11] = "iyr" AND [Concat_Field_12] >= "2010" AND [Concat_Field_12] <= "2020"

Height:
[Concat_Field_11] = "hgt" AND (
  (right([Concat_Field_12],2) = "in"
    AND tonumber(left([Concat_Field_12],length([Concat_Field_12])-2)) >= 59
    AND tonumber(left([Concat_Field_12],length([Concat_Field_12])-2)) <= 76)
  OR
  (right([Concat_Field_12],2) = "cm"
    AND tonumber(left([Concat_Field_12],length([Concat_Field_12])-2)) >= 150
    AND tonumber(left([Concat_Field_12],length([Concat_Field_12])-2)) <= 193)
)

One advantage of this approach is that each expression is simple and easy to debug. Each filter deals with one type of field, and then the valid results can be joined together using a Union tool.

*Tools used: 4, run-time: 0.3s*

Day 5’s problem describes the binary representation of an integer with 10 bits, with `F` and `L` being `0`, and `B` and `R` being `1`. Alteryx has a `BinToInt` function which takes a binary string and converts it to an integer. So the unique seat id is given by:

BinToInt( ReplaceChar(ReplaceChar(ReplaceChar(ReplaceChar( [Field1], "B", "1"), "R", "1"), "F", "0"), "L", "0"))

There is no need to treat Row and Column separately, as the expression combining the two is the same as a 10-bit integer. To find the missing value, I computed the `Next` seat and joined the data to itself. The `L` output then returns 2 rows – the missing row and the final seat.

There is a slight improvement (as pointed out to me by Ned):

BinToInt(ReplaceChar(ReplaceChar( [Field1], "RB", "1"), "FL", "0"))

The `ReplaceChar` function accepts a list of target characters, so you can reduce to 2 calls instead of 4.
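The same conversion fits in one line of Python (a sketch; `str.translate` plays the role of the paired `ReplaceChar` calls):

```python
def seat_id(code: str) -> int:
    # B/R become 1, F/L become 0, then read the result as base 2.
    return int(code.translate(str.maketrans("BRFL", "1100")), 2)
```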

The vast majority of submissions for this question were basically the same. There were a few variations. As a bit of a fun experiment, Ned and I came up with an alternative approach:

First, a generate rows tool creates a `Pos` field going from 1 to the length of `SID` (the input string). Then the following expression is used:

iif(Substring([SID],Pos-1,1) in ('B','R'), pow(2, length([SID])-[Pos]), 0)

This checks the character and uses its position within the input string to compute its binary place value (`pow(2, length([SID])-[Pos])`). After this, grouping by `SID` and summing these values gives the required value.

To find the missing seat and the maximum, the values are then sorted, and a Multi-Row formula computes the difference with the next row. The 2 cases where this is not 1 give the required outputs for parts 1 and 2.

So that’s week one (well first 5 days) down. Generally, these puzzles have been well suited to Alteryx. If you want to take a deeper look at my solutions, they are posted to GitHub. A few other git repositories are listed below:

- NicoleJohnson: https://github.com/AlteryxNJ/AdventOfCode_2020
- ColoradoNed: https://github.com/NedHarding/Advent2020
- CGoodman3: https://github.com/ChrisDataBlog/AdventOfCode_2020

As it stands at 2 pm on 5th December, the leaderboard looks like:

Let’s see what week 2 brings and how much further we can go!

When I was young and starting to program, one of the first things that fascinated me was generating images with fractals. Given an often simple algorithm, you can create incredible images with very little code. The Mandelbrot set is one such. As an experiment, and to test out the AMP engine, I wanted to try creating one in Alteryx. As usual, BaseA rules apply!

I thought it would also be interesting for people to see some of the iterations I went through rather than just the end product. This first post is only going to concentrate on the computation of the data for the set, the way I chose to produce the image, rendering as a bitmap file, is one for another day.

The Mandelbrot set starts with a simple iterative equation:

z_{n+1} = z_n² + c

With z_0 = 0, and c the complex number for the point being tested.

Expanding this into real and imaginary parts: let z_n = x_n + y_n·i and c = a + b·i, where *i* is the square root of -1. Then:

x_{n+1} = x_n² - y_n² + a
y_{n+1} = 2·x_n·y_n + b

To produce the picture of the Mandelbrot set, you scan through different values of *c* and see if the iterative equations tend to infinity or remain bounded. This is normally tested by whether the absolute value of *z* exceeds 2 within a certain number of iterations. The absolute value of z_n is:

|z_n| = √(x_n² + y_n²)

To speed up the computation slightly, I chose to check when the absolute of z squared exceeded 4.

The colour of the point at *c* can be produced by how fast the iteration exceeded the limit. I chose to use a colour scheme from D3 – specifically the Rainbow one.

So there are a few options for how to create the set within Alteryx. Let’s start with the simple part – creating the grid of points we will evaluate across.

The input I chose for this was:

| Dim | Min | Max | Points |
|---|---|---|---|
| Real | -2 | 1 | 800 |
| Imaginary | -1.1 | 1 | 600 |

Using a formula tool, I can compute the step size (`([Max]-[Min]) / (Points - 1)`). Feeding this into a generate rows tool makes the 800 real points and 600 imaginary points. After this, an Append Fields tool, set to allow all appends, creates the complete grid of points.

The complication for the iterative equation is that we need to iterate 2 variables and track the numbers of steps. I tried a few different approaches to this.

Probably the most obvious is to use an iterative macro. These can be sluggish, but as each iteration acts on the entire grid and a maximum of about 80 steps will be needed, it’s not too slow and is a straightforward approach. First, let’s add a couple of columns to the grid:

- *Thres*: The threshold to declare the point is tending to infinity
- *MaxSteps*: Number of steps to test the data on
- *ri*: Current real value
- *ii*: Current imaginary value

Each iteration of the macro does the following:

- Compute *di*, the current value of *|z|*
- Compute the new values for *ri* and *ii*
- Split the data set, looking for those points which have *di > thres* or where the maximum number of iterations has been reached
- Return those rows to the outer workflow; loop the others with the updated *ri* and *ii* values

This approach was pretty easy and produced a 1080p image in about 45 seconds on my laptop. In this approach, the AMP engine came out slightly slower than the old E1 engine – 52.3 seconds vs 41.1 seconds.
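The per-point iteration the macro performs can be sketched in Python (an illustrative escape-time function, not part of the workflow):

```python
def escape_steps(r0, i0, max_steps=80, threshold=4.0):
    # Iterate z -> z^2 + c in real/imaginary parts and report how many
    # steps it takes for |z|^2 to exceed the threshold (4 = 2 squared).
    ri = ii = 0.0
    for step in range(1, max_steps + 1):
        ri, ii = ri * ri - ii * ii + r0, 2 * ri * ii + i0
        if ri * ri + ii * ii > threshold:
            return step
    return None  # still bounded: treat the point as inside the set
```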

One approach to iterate a value in Alteryx is to use the Generate Rows tool. This is pretty straightforward for iterating a single value – for example, to create 100 rows you can count from 1 to 100. In this case, we need to iterate 3 values: the iteration number, the real value and the imaginary value. One simple trick to do this is to keep them as a single string:

<Count> <Real> <Imaginary>

Alteryx provides a handy function `GetWord`

, which allows you to extract a single word from the text. Combining this with the `ToNumber`

and `ToString`

functions, you can construct the iteration:

The Generate Rows tool is set to create a new `V_String(48)`

field. It is initially set to `1 0 0`

. The step is given by:

ToString(ToNumber(GetWord([State], 0)) + 1) + " " +
ToString(Pow(ToNumber(GetWord([State], 1)),2) - Pow(ToNumber(GetWord([State], 2)),2) + [r0]) + " " +
ToString(2 * ToNumber(GetWord([State], 1)) * ToNumber(GetWord([State], 2)) + [i0])

The expression breaks into three parts:

- `ToString(ToNumber(GetWord([State], 0)) + 1)` increments the first value by 1.
- `ToString(Pow(ToNumber(GetWord([State], 1)),2) - Pow(ToNumber(GetWord([State], 2)),2) + [r0])` implements the Real part calculation.
- `ToString(2 * ToNumber(GetWord([State], 1)) * ToNumber(GetWord([State], 2)) + [i0])` implements the Imaginary part.

The `ToNumber(GetWord([State], 1))`

allows you to extract one of the values from the string and convert it back to a number.

The final part of the Generate Rows is the termination condition:

ToNumber(GetWord([State], 0)) <= [MaxSteps]
AND Pow(ToNumber(GetWord([State], 1)),2) + Pow(ToNumber(GetWord([State], 2)),2) < [Thres]

In this case, the first line checks the number of steps executed. The second evaluates if the iteration has broken out.

After this, a Sample tool is used to pick the final row of each iteration, and then a formula tool extracts the initial number.
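The whole string-state trick can be simulated in Python – the function names here are mine, and `thres`/`max_steps` stand in for the workflow's constants:

```python
def step(state: str, r0: float, i0: float) -> str:
    """One Generate Rows step: unpack '<count> <real> <imag>', advance, repack."""
    count, re, im = state.split(" ")
    count, re, im = int(count), float(re), float(im)
    new_re = re * re - im * im + r0   # Re(z^2 + c)
    new_im = 2 * re * im + i0         # Im(z^2 + c)
    return f"{count + 1} {new_re} {new_im}"

def condition(state: str, max_steps: int, thres: float) -> bool:
    """Generate Rows keeps producing rows while this holds."""
    count, re, im = state.split(" ")
    return int(count) <= max_steps and float(re) ** 2 + float(im) ** 2 < thres

def escape_count(r0: float, i0: float, max_steps: int = 80, thres: float = 4.0) -> int:
    """Run the iteration from the initial state '1 0 0' and return the count."""
    state = "1 0 0"
    while condition(state, max_steps, thres):
        state = step(state, r0, i0)
    return int(state.split(" ")[0])
```

The repeated `ToNumber(GetWord(...))`/`ToString(...)` round trips in the workflow are exactly the `split`/f-string pair here, which is a big part of why this approach is slower.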

Producing a 1080p image with this workflow is substantially slower than the iterative macro approach. Using E1, it took about 8 minutes; AMP gave it a nice performance boost, down to about 5 minutes 30 seconds. Either way, it is a chunk slower than the first approach.

My next approach was to use spatial objects and formulas in the Generate Rows tool. The idea was to use the Latitude and Longitude of points as the Real and Imaginary values. Each step would add a new point to the current spatial object and then assess when to terminate based on its distance from (0,0) and how many points there are.

Unfortunately, this was a failed experiment: it was much slower than either of the above approaches, so I didn’t complete it. I tried generating 80 rows each time and using multi-row formulas, but again with no significant success.

The one shout out from this approach is the Spatial Functions. They are often overlooked but provide a lot of power in formula tools when working with spatial objects.

My last approach was to use a dynamic formula to build the iteration in a single Formula tool. Start by defining `ri` and `ii` as the current real and imaginary values, initialised to 0. Additionally, define `i` to be the step of the iteration. The one catch is I need to know when the iteration terminates – I chose to indicate this by switching `i` to a negative value. The following four expressions evaluate one step of the iteration in a formula tool:

i: iif([i]<0,[i],iif([i]>=[MaxSteps] OR [ri]*[ri]+[ii]*[ii]>[Thres],-[i],[i]+1))
t: iif([i]<0,0,[ri]*[ii])
ri: iif([i]<0,[ri],[ri]*[ri]-[ii]*[ii]+[r0])
ii: iif([i]<0,[ii],2*[t]+[i0])

Note, *t* is needed as `ri`

is mutated before computing `ii`

.
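Why *t* is needed is easiest to see in a Python sketch of one unrolled step – the names mirror the formula fields, though the ordering of the checks is simplified here:

```python
def unrolled_step(i, ri, ii, r0, i0, max_steps=80, thres=4.0):
    """One step of the in-place update; a negative i marks a finished point."""
    if i < 0:
        return i, ri, ii                          # already terminated: pass through
    if i >= max_steps or ri * ri + ii * ii > thres:
        return -i, ri, ii                         # flip the sign to mark termination
    t = ri * ii                                   # save ri*ii before ri is overwritten
    ri = ri * ri - ii * ii + r0                   # ri now holds the NEW real part
    ii = 2 * t + i0                               # must use the saved product, not new ri
    return i + 1, ri, ii
```

Without *t*, the `ii` line would read the already-updated `ri` and compute the wrong imaginary part.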

The next step is to extract the XML for these expressions:

<FormulaField expression="iif([i]<0,[i],iif([i]>=[MaxSteps] OR [ri]*[ri]+[ii]*[ii]>[Thres],-[i],[i]+1))" field="i" />
<FormulaField expression="iif([i]<0,0,[ri]*[ii])" field="t" size="8" type="Double" />
<FormulaField expression="iif([i]<0,[ri],[ri]*[ri]-[ii]*[ii]+[r0])" field="ri" />
<FormulaField expression="iif([i]<0,[ii],2*[t]+[i0])" field="ii" />

The XML can be found by going to the XML tab in the properties pane for the Formula tool. Use a Formula tool to create a copy of this XML, then a Generate Rows tool to repeat it MaxSteps times. This can then be fed into a Summarise tool in concatenate mode to build one big block of XML.

For initial testing, manually edit the tool’s XML to place this long new expression between the `FormulaFields` tags. If all is working, a simple batch macro updating a Formula tool allows this to be more automated:

The control parameter is fed into the Formula tool, with an Action tool updating the inner XML between the `FormulaFields` tags. The very last thing to remember is that `i` will be negative at the end, so an additional Formula tool is needed to switch the sign back.

Using my same test of producing a 1080p image, the difference between AMP and E1 was minor (1:06 vs 1:10). I also tested without the batch macro, and this made next to no difference.

The next step for producing the image is to convert the number of steps taken into a colour.

Starting with a list of colours from the D3 page expressed as hexadecimal strings, add a RecordID. Then, using a MOD function, filter the list so only the right number of distinct colours is kept. Add a new RecordID, which can then be used with a Find and Replace tool to join to the computed set. Using Find and Replace avoids sorting the larger data set, but does mean the RecordID needs to be cast to a string using a Select tool.
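The palette selection can be sketched as follows – the helper names and the tiny palette are hypothetical; the real colour list comes from the D3 page:

```python
def pick_colours(palette, wanted):
    """Keep `wanted` evenly spaced colours from a larger palette using MOD."""
    step = len(palette) // wanted
    kept = [c for i, c in enumerate(palette) if i % step == 0]
    return kept[:wanted]

def colour_for(steps, colours):
    """Map an escape count to a colour, cycling through the kept list."""
    return colours[steps % len(colours)]
```

The MOD-based filter keeps every `step`-th RecordID, which is the same even-spacing effect as the slicing here.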

I’m not covering the rendering of a bitmap in this post (hopefully will detail that in a later post) but using a little trickery with blobs and base64 encoding it is possible to render it in a reporting tool. The final output looks like:

Hopefully, this post gives you some insight into the different ways I tried to produce the image. The workflows for this post are available below:

]]>Stealing an idea from Tasha Alfano, I thought I would do it in both Python and Alteryx from first principles. A quick shout out to MathAPI – a handy site used to render all the LaTeX to SVG.

So let’s start by reviewing how to create a cubic spline and then build it up. I chose to use the algorithm as described in Wikiversity. Specifically with type II simple boundary conditions. I’m not going through the maths but will define the steps to build the spline.

The first step: given an *X* array and a *Y* array of equal length *n* (greater than 2), we want to build a *tridiagonal matrix*, which we will then solve to produce the coefficients for the piece-wise spline. The goal of the spline is that it hits every point *(x, y)* and that the first and second derivatives match at these points too.

Sticking with the notation in the paper, let’s define `H` to be an `n-1` length array of the differences in `X`:

`h[i] = x[i+1] - x[i]` for `i = 0, ..., n-2`

A tridiagonal matrix is a square matrix where all values are zero except for the main diagonal and the first diagonals directly above and below it. For example:

1 2 0 0
2 3 2 0
0 2 3 2
0 0 2 1

One advantage of tridiagonal matrices is that they are fairly straightforward to invert and to solve linear equations based on them. For the sake of coding up the algorithm, let’s define `B` to be the `n` length array holding the diagonal elements, `A` to be the `n-1` length array of the diagonal below this, and `C` to be the `n-1` length array of the diagonal above:

b0 c0  0  0
a0 b1 c1  0
 0 a1 b2 c2
 0  0 a2 b3

Please note the indexes here are different from those used in the Wikiversity article, as they align with a standard array starting at 0. For the spline, these arrays are given by:

`b[i] = 2` for `i = 0, ..., n-1`
`a[i] = h[i] / (h[i] + h[i+1])` for `i = 0, ..., n-3`
`c[i] = h[i] / (h[i-1] + h[i])` for `i = 1, ..., n-2`

Using the simple boundary condition that the second derivative is equal to 0 at the ends gives values for `c[0]` and `a[n-2]` both equal to 0. This can easily be coded up in Python:

from typing import Tuple, List

def compute_changes(x: List[float]) -> List[float]:
    return [x[i+1] - x[i] for i in range(len(x) - 1)]

def create_tridiagonalmatrix(n: int, h: List[float]) -> Tuple[List[float], List[float], List[float]]:
    A = [h[i] / (h[i] + h[i + 1]) for i in range(n - 2)] + [0]
    B = [2] * n
    C = [0] + [h[i + 1] / (h[i] + h[i + 1]) for i in range(n - 2)]
    return A, B, C

The next step is to compute the right-hand side of the equation. This will be an array of length `n`. For notation, let’s call this `D` – the same as in the Wikiversity article:

`d[0] = 0` and `d[n-1] = 0`
`d[i] = 6 * ((y[i+1] - y[i]) / h[i] - (y[i] - y[i-1]) / h[i-1]) / (h[i] + h[i-1])` for `i = 1, ..., n-2`

Implementing this in Python looks like:

def create_target(n: int, h: List[float], y: List[float]):
    return [0] + [
        6 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1]) / (h[i] + h[i-1])
        for i in range(1, n - 1)
    ] + [0]

To solve a tridiagonal system, you can use the Thomas algorithm. Mapping this onto the terminology above, we first derive length *n* vectors *C’* and *D’*:

`c'[0] = c[0] / b[0]`
`c'[i] = c[i] / (b[i] - c'[i-1] * a[i-1])` for `i = 1, ..., n-1`

`d'[0] = d[0] / b[0]`
`d'[i] = (d[i] - d'[i-1] * a[i-1]) / (b[i] - c'[i-1] * a[i-1])` for `i = 1, ..., n-1`

Having worked out *C’* and *D’*, calculate the result vector `X` by back substitution:

`x[n-1] = d'[n-1]`
`x[i] = d'[i] - c'[i] * x[i+1]` for `i = n-2, ..., 0`

The implementation of this in Python is shown below:

def solve_tridiagonalsystem(A: List[float], B: List[float], C: List[float], D: List[float]):
    c_p = C + [0]
    d_p = [0] * len(B)
    X = [0] * len(B)

    c_p[0] = C[0] / B[0]
    d_p[0] = D[0] / B[0]
    for i in range(1, len(B)):
        c_p[i] = c_p[i] / (B[i] - c_p[i - 1] * A[i - 1])
        d_p[i] = (D[i] - d_p[i - 1] * A[i - 1]) / (B[i] - c_p[i - 1] * A[i - 1])

    X[-1] = d_p[-1]
    for i in range(len(B) - 2, -1, -1):
        X[i] = d_p[i] - c_p[i] * X[i + 1]

    return X

So the last step is to convert this into a set of cubic curves. To find the value of the spline at a point *x*, you want to find *j* such that *x_j < x < x_{j+1}*. Let’s define:

`z = (x - x[j]) / h[j]`

*z* has the property of being 0 when *x = x_j* and 1 when *x = x_{j+1}*. The spline value is then a cubic in *z*, whose coefficients are built from the solved *M* values and the *y* values (see the code below).

Now to put it all together and create a function to build the spline coefficients. The final part needed is to create a closure and wrap it up as a function which will find *j* and then evaluate the spline. There is an excellent module in Python, `bisect`, which will do a binary search to find *j* easily and quickly.

The code below implements this, and also validates the input arrays:

import bisect

def compute_spline(x: List[float], y: List[float]):
    n = len(x)
    if n < 3:
        raise ValueError('Too short an array')
    if n != len(y):
        raise ValueError('Array lengths are different')

    h = compute_changes(x)
    if any(v <= 0 for v in h):
        raise ValueError('X must be strictly increasing')

    A, B, C = create_tridiagonalmatrix(n, h)
    D = create_target(n, h, y)
    M = solve_tridiagonalsystem(A, B, C, D)

    coefficients = [
        [(M[i+1]-M[i])*h[i]*h[i]/6, M[i]*h[i]*h[i]/2,
         (y[i+1] - y[i] - (M[i+1]+2*M[i])*h[i]*h[i]/6), y[i]]
        for i in range(n-1)
    ]

    def spline(val):
        idx = min(bisect.bisect(x, val)-1, n-2)
        z = (val - x[idx]) / h[idx]
        C = coefficients[idx]
        return (((C[0] * z) + C[1]) * z + C[2]) * z + C[3]

    return spline

The complete python code is available as a gist:

from typing import Tuple, List
import bisect

def compute_changes(x: List[float]) -> List[float]:
    return [x[i+1] - x[i] for i in range(len(x) - 1)]

def create_tridiagonalmatrix(n: int, h: List[float]) -> Tuple[List[float], List[float], List[float]]:
    A = [h[i] / (h[i] + h[i + 1]) for i in range(n - 2)] + [0]
    B = [2] * n
    C = [0] + [h[i + 1] / (h[i] + h[i + 1]) for i in range(n - 2)]
    return A, B, C

def create_target(n: int, h: List[float], y: List[float]):
    return [0] + [6 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1]) / (h[i] + h[i-1]) for i in range(1, n - 1)] + [0]

def solve_tridiagonalsystem(A: List[float], B: List[float], C: List[float], D: List[float]):
    c_p = C + [0]
    d_p = [0] * len(B)
    X = [0] * len(B)

    c_p[0] = C[0] / B[0]
    d_p[0] = D[0] / B[0]
    for i in range(1, len(B)):
        c_p[i] = c_p[i] / (B[i] - c_p[i - 1] * A[i - 1])
        d_p[i] = (D[i] - d_p[i - 1] * A[i - 1]) / (B[i] - c_p[i - 1] * A[i - 1])

    X[-1] = d_p[-1]
    for i in range(len(B) - 2, -1, -1):
        X[i] = d_p[i] - c_p[i] * X[i + 1]

    return X

def compute_spline(x: List[float], y: List[float]):
    n = len(x)
    if n < 3:
        raise ValueError('Too short an array')
    if n != len(y):
        raise ValueError('Array lengths are different')

    h = compute_changes(x)
    if any(v <= 0 for v in h):
        raise ValueError('X must be strictly increasing')

    A, B, C = create_tridiagonalmatrix(n, h)
    D = create_target(n, h, y)
    M = solve_tridiagonalsystem(A, B, C, D)

    coefficients = [[(M[i+1]-M[i])*h[i]*h[i]/6, M[i]*h[i]*h[i]/2, (y[i+1] - y[i] - (M[i+1]+2*M[i])*h[i]*h[i]/6), y[i]] for i in range(n-1)]

    def spline(val):
        idx = min(bisect.bisect(x, val)-1, n-2)
        z = (val - x[idx]) / h[idx]
        C = coefficients[idx]
        return (((C[0] * z) + C[1]) * z + C[2]) * z + C[3]

    return spline

As always, it is essential to test to make sure all is working:

import matplotlib.pyplot as plt

test_x = [0, 1, 2, 3]
test_y = [0, 0.5, 2, 1.5]
spline = compute_spline(test_x, test_y)

for i, x in enumerate(test_x):
    assert abs(test_y[i] - spline(x)) < 1e-8, f'Error at {x}, {test_y[i]}'

x_vals = [v / 10 for v in range(0, 50, 1)]
y_vals = [spline(v) for v in x_vals]
plt.plot(x_vals, y_vals)

This creates a small spline and ensures that the fitted *y* values match at the input points. Finally, it plots the results:

So that’s the full process, now to rebuild it in Alteryx using BaseA. For the input, the macro takes two separate inputs – a table of KnownXs and KnownYs and a list of target Xs. Again, the first task is to build *H, A, B, C, D* from the inputs:

Using some Multi-Row Formula tools, it is fairly easy to create these. The expressions are shown below. In all cases, the value is a *Double* and *NULL* is used for rows which don’t exist:

H=[X]-[Row-1:X]
A=IIF(ISNULL([Row+1:H]),0,[H]/([H]+[Row+1:H]))
C=IIF(ISNULL([Row-1:H]),0,[H]/([H]+[Row-1:H]))
B=2

Then a Join (on row position) and a Union add the last row to the set. Finally, *D* is given by:

IIF(IsNull([Row-1:X]) or IsNull([Row+1:X]), 0, 6 * (([Row+1:Y]-[Y])/[H] - ([Y]-[Row-1:Y])/[Row-1:H]) / ([H]+[Row-1:H]) )

Now to solve the produced system. In order to save on storage, instead of creating *C’* and *D’*, the multi-row formula tools update the values of *C* and *D*:

C=IIF(ISNULL([Row-1:X]),[C]/[B],IIF(ISNULL([Row+1:X]),0,[C]/([B]-[Row-1:C]*[Row-1:A])))
D=IIF(ISNULL([Row-1:X]),[D]/[B],([D]-[Row-1:D]*[Row-1:A])/([B]-[Row-1:C]*[Row-1:A]))

To compute the solution vector, *M*, it is necessary to reverse the direction of the data. While you can use `Row+1` to access the next row in a Multi-Row Formula tool, it won’t allow a full computation backwards. To do this, add a Record ID, then sort the data on it into descending order. After which, *M* can be calculated using another multi-row formula:

M=IIF(IsNull([Row-1:X]),[D],[D]-[C]*[Row-1:M])

After reverting the sort, we now have all the inputs, so we can move on to computing the coefficients for each piece of the spline:

One small trick here is to skip the first row and join back to the same stream. This gives *x*, *y* and *M* for the next row. The coefficients are then computed using a normal formula tool:

CubeCoefficient=([EndM]-[M])*[H]*[H]/6
SquareCoefficient=[M]*[H]*[H]/2
LinearCoefficient=([EndY]-[Y]-([EndM]+2*[M])*[H]*[H]/6)
Constant=[Y]

The final challenge is to reproduce the *bisect* functionality to find the row for each wanted *X*.

In this case, using a Multiple Join allows creating a full-outer join of known and wanted *X* values. After this, add the first KnownX at the top and use a multi-row formula to fill in the gaps.

The last step is just to compute the spline. First, join to get the coefficients and then just a standard formula tool to calculate the fitted value.

Having wrapped it up as a macro, a quick test to see it worked:

The final macro and test workflow can be downloaded from DropBox (macro, workflow).

Hopefully, this gives you some insight into how to build a cubic spline and also how to move some concepts from programming straight into Alteryx.

]]>This puzzle felt like one that Alteryx would be well suited to solving.

The board looks like:

Each location can either have a coin or not have a coin. This means a binary representation is straightforward. Keeping the numbering order the same, I chose to have position 1 be the first bit through to position 10 being the 10th bit. So a board like:

Would be encoded as the number 662:

| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Coin | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 5 |
| Decimal Value | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 662 |

Additionally, we can now create all the possible states of the board by running through every number from 1 to 1022. We can ignore the completely empty board (state 0) and the completely full board (state 1023). Our goal is to get to a state where only one bit is set.

Use a Generate Rows tool to create a sequence from 1 to 1022, then feed this into a Formula tool to create the binary representation. The expression I used was:

ReverseString(PadLeft(IntToBin([Mode]),10,'0'))

I chose to reverse the binary representation so character 1 of the string represents coin 1.
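The encoding can be sketched in Python – a sketch of the same idea, not the workflow itself:

```python
def encode(coins):
    """Encode a set of occupied positions (1-10) as a 10-bit state number."""
    return sum(1 << (p - 1) for p in coins)

def to_binary_string(state):
    """Mirror the Alteryx expression: reversed so character 1 represents coin 1."""
    return format(state, "010b")[::-1]
```

For the example board above (coins at positions 2, 3, 5, 8 and 10), this gives the state 662.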

Next, we want to consider the set of possible moves. The board is very symmetrical, so looking at moving coin 1 also covers 7 and 10. Likewise, 2 is like looking at 3, 4, 6, 8 and 9. And finally, we have 5 to look at.

Coin 1 can only jump to either 4 or 6 and only if 2 or 3 are occupied respectively. All other moves are either not in a straight line or exceed the bounds of the board.

Coin 2 can only jump to either 7 or 9, and only if 4 or 5 are occupied respectively.

Finally, coin 5 cannot itself jump anywhere without exceeding the bounds of the board.

We can now build an entire set of all legitimate moves:

| Start | Moves (Start – Jump – End) |
|---|---|
| 1 | 1-2-4, 1-3-6 |
| 2 | 2-4-7, 2-5-9 |
| 3 | 3-5-8, 3-6-10 |
| 4 | 4-2-1, 4-5-6 |
| 5 | |
| 6 | 6-3-1, 6-5-4 |
| 7 | 7-4-2, 7-8-9 |
| 8 | 8-5-3, 8-9-10 |
| 9 | 9-5-2, 9-8-7 |
| 10 | 10-6-3, 10-9-8 |

What I chose to do was compute all the states that each state could move to. The first part of this was to break each state into 10 rows, one per cell, with a 1 or 0 to show if a coin is present.

After this a sequence of joins allows you to work out the legitimate moves. First, join the cells with a coin to the start of a move. Then join the jump cell with an occupied cell. And then finally join the target cell with an empty cell. Having done this we end up with a table of State, Start, Jump, Target of all possible starting states and moves. The diagram below shows the state of the board for a random state and the possible moves:

Finally, using a formula tool to compute the new state of the board:

State-Pow(2,[Start]-1)-Pow(2,[Jump]-1)+Pow(2,[Target]-1)
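Applying a move with this bit arithmetic can be sketched in Python – the `MOVES` list here is truncated to just two entries from the table:

```python
# (start, jump, target) triples - two of the legitimate moves from the table
MOVES = [(1, 2, 4), (1, 3, 6)]

def legal(state, start, jump, target):
    """A move is legal if start and jump are occupied and target is empty."""
    occupied = lambda p: (state >> (p - 1)) & 1
    return bool(occupied(start) and occupied(jump) and not occupied(target))

def apply_move(state, start, jump, target):
    """Remove the coins at start and jump, add one at target."""
    return state - (1 << (start - 1)) - (1 << (jump - 1)) + (1 << (target - 1))
```

The three joins in the workflow are doing the `legal` check, and the formula above is `apply_move`.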

Now we need to work out the sequence of plays. I chose to run the sequence backwards: start with the end state and reverse moves until you find an initial state. As I was being lazy, I repeated blocks rather than using an iterative macro, but this seemed a simple solution given the scale of the problem.

To compute final states, I used a `REGEX_CountMatches`

expression to count the 1s. A final state will be an *integral non-negative power of 2* and so will have a single 1 in the binary representation. We can then join the state transition to produce the possible last moves. For tracking the sequence of moves, I create two additional columns `Moves`

and `MoveSequence`

. Moves will count the number of separate moves and MoveSequence will track the moves made.

We then need to make 7 more back steps. Each of these steps is identical. They take the current possible set of plays until this point and the state transitions and produces the next set of plays. If the `Target`

of the new move is equal to the `Start`

of the current move then we don’t increment the `Moves`

count otherwise we add 1. The expression I used was:

[Moves] + IIF(StartsWith([MoveSequence],[Target] + '-'),0,1)

Having repeated this 6 more times, we end up with the 84 possible plays which win the game. The last step is to choose the lowest Moves and then we are done.

As always it is a lot of fun to challenge the bounds of what is possible within Alteryx. This was a nice puzzle and had some interesting challenges to build it quickly.

I’m not posting my sequence here but if you run the workflow it will give you the answer. The workflow is available on DropBox

]]>So the lights are up across London, the weather is atrocious and the trains are going nowhere fast – it must be time to think about the Advent of Code. For those of you who haven’t heard of it, this is a fantastic set of 25 puzzles released one by one every day in December until Christmas Day.

Jesse Clark and I spoke about gamification within Alteryx at Inspire London this year. One of the ways I like to have fun with it is finding challenges which are fun to attempt to solve within the platform. The weekly challenges are a great set, but these are curated specifically for Alteryx. In the Advent of Code, the challenges are general programming problems that start off easy and get progressively harder and harder.

Alteryx is in many ways like a programming language. This naturally led Adam Riley to suggest last year that we try doing the Advent of Code in it. It was a lot of fun, if very challenging. Following some very late additional solves, Patrick Digan is leading the real Alteryx entries:

The rules for playing are straightforward – solve the puzzle each day as quickly as you can using Alteryx. Nowadays, Alteryx lets you do lots of extra things, so we need a few constraints – at London, the ACEs nicknamed these rules BaseA:

- No RunCommand tool
- No Python tool
- No R tool
- No SDK based Custom Tools (macros are fine)
- No Formula SDK extensions
- Download tool allowed for downloading (no posting to APIs to get answers from!)

Thanks to Sam for our logo.

Last year, some of the puzzles were beyond our capabilities with these constraints but many were perfectly solvable. If you want to read a retrospective of solving them, I wrote a post here.

So as we start another year’s challenges, I invite you to come join us and see who can be the Alteryx user to achieve a perfect 25 Gold Stars.

You can join the Alteryx leaderboard by going to https://adventofcode.com/2019/leaderboard/private and using the code `453066-ca912f80`

. Last year, we chatted on twitter but it might be easier to chat on the Alteryx Community where we can share problems and solutions.

For this little experiment, you will need:

- A GitHub account
- An Azure account
- An AWS account

Let’s create an empty repository in GitHub as we need somewhere to keep code! Go to https://github.com/new and create a new repository, I called mine `PythonLambda`

. I made it public and with a README (so not empty to start).

Next hop over to Visual Studio Online and click get started. You will then need to log in using your Azure account. Then click *Create Environment*. Enter in a name (I chose the same as the GitHub project) and then paste the URL for the GitHub repo into the Git repository. Click create and wait for it to be available.

Next, connect to the environment. A window that looks remarkably like Visual Studio Code will appear; I chose to install support for Python. This took virtually no time whatsoever. Next hit `Ctrl-'`

to open the terminal windows. Let’s check for python by running `python --version`

:

On to installing the AWS CLI. Simply run the command `pip3 install awscli --upgrade --user`

. Once this completes you can run `aws --version`

:

So now we have a working development environment. Took about 3 minutes to set this up. We have a git repository, a copy of Visual Studio Code set up to write python code and the AWS CLI.

Next, we can edit the README and check we can push back to GitHub as a new branch. Change the README file and save it. You can then go to Source Control tab (press `Ctrl-Shift-G`

), commit the files and push to GitHub. It will pop up a window asking you to authorise `microsoft-vs`

to access your GitHub account. After that, it will push the code to GitHub.

Next head to the AWS IAM console and log in. Ideally, you will create a new user but for the sake of simplicity, I will just create a new access key. Go to the `Users`

link, find your user, go to `Security Credentials`

and select `Create Access Key`

. Back within the terminal in Visual Studio Online run `aws configure`

and copy the Access Key ID and Secret Access Key into the prompts.

This step took me about 5 minutes. So, we are ready to start building the lambda in about 10 minutes.

The goal of this post is not really the Lambda itself – more just getting it all set up. The code below is a really trivial function. Create a new file called `lambda.py` and add the following content:

from datetime import datetime
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    logger.info(event)
    return {
        'statusCode': 200,
        'body': datetime.now().isoformat()
    }

This is a very simple function which will log whatever is passed to it and return the server’s current date.

The bare minimum we need to do for a lambda to run is to create an IAM role and policy, and then we can publish a function. Run the following within the terminal:

lambdaName='PythonLambda'
account=`aws sts get-caller-identity --output text --query 'Account'`
region=`aws configure get region`

# Create Role
aws iam create-role --role-name $lambdaName --assume-role-policy-document "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"lambda.amazonaws.com\"},\"Action\":\"sts:AssumeRole\"}]}"

# Create Policy and Attach
aws iam create-policy --policy-name $lambdaName --policy-document "{\"Version\": \"2012-10-17\",\"Statement\": [{\"Effect\": \"Allow\",\"Action\": \"logs:CreateLogGroup\",\"Resource\": \"arn:aws:logs:$region:$account:*\"},{\"Effect\": \"Allow\",\"Action\": [\"logs:CreateLogStream\",\"logs:PutLogEvents\"],\"Resource\": [\"arn:aws:logs:$region:$account:log-group:/aws/lambda/$lambdaName:*\"]}]}"
aws iam attach-role-policy --role-name $lambdaName --policy-arn "arn:aws:iam::$account:policy/$lambdaName"

This will create the Role and Policy for the lambda to use. Finally, lets publish the first version of the lambda. In the terminal, run:

zip lambda.zip lambda.py
aws lambda create-function --function-name $lambdaName --runtime "python3.7" --handler "lambda.lambda_handler" --zip-file fileb://lambda.zip --role "arn:aws:iam::$account:role/$lambdaName"
rm lambda.zip

Finally, we will set up a build task to publish this to AWS. Create a new folder called `.vscode`

and add a new file called `tasks.json`

. Add the following content:

{ "version": "2.0.0", "tasks": [ { "label": "Publish Lambda", "command": "zip lambda.zip lambda.py && aws lambda update-function-code --function-name PythonLambda --zip-file fileb://lambda.zip && rm lambda.zip", "type": "shell", "group": { "kind": "build", "isDefault": true } } ] }

Now if you hit `Ctrl-Shift-B`

within the editor, it will zip and deploy the python code as it stands to AWS.

Visual Studio Online is amazing. You can get a development environment up and running in minutes. It works anywhere and can easily be hooked into whatever you like. While this post has only created a very basic set up, it is hopefully enough to show you how you could do everything from the comfort of your own browser and in very little time.

]]>So, this is my regime – I am far from an expert but this is how I deal with it. Everybody who has Diabetes will have their own way of doing what they need. I was diagnosed as a diabetic back in November 2004. I am a Type 1, hence this is what I know about. While there are similarities with Type 2, I won’t talk about that as have no personal experience. My knowledge comes from years of living with it and the fantastic DAFNE course.

In the UK, about 3.5 million people have diabetes, of those about 10% of these are type 1. It is a huge cost for the NHS with about 10% of its yearly budget used for the treatment of diabetes.

Diabetes mellitus type 1 is an autoimmune disease where the immune system attacks the insulin-producing beta cells in the pancreas. This means that the body can produce only little or no insulin. Without insulin, the body cannot regulate the level of sugar in the blood. Incorrect sugar levels can lead to a whole range of complications, both in the short term and the longer term. In order to control this, you need to inject insulin into the body in some way.

The causes of Type 1 are complicated. It is not a simple genetic condition (though the presence of a family member with it increases the risk). It is not caused by exercise or diet, but some environmental factors do play into it. In my case, I had an operation and a nasty infection and that was the trigger event.

Using insulin to control your blood sugar levels is far from trivial, it is a delicate balancing act. If you have too little sugar in your blood, then you can have an incident of hypoglycaemia (a hypo). This can lead to confusion and clumsiness, and in extreme cases loss of consciousness or even death. At the opposite end of the spectrum, if you have high levels of sugar in your blood, then you can cause significant organ damage over time. In extreme cases, you can develop Ketoacidosis which can again end in a coma and time in hospital. So careful and continual management is key.

The way to monitor this is to check your blood sugar levels. The best measure of this is a full blood test called an HbA1C. This measures the level of sugar in the blood over the last 2 to 3 months. I get this done about twice a year when I see a nurse or consultant. The rough goal for a Type 1 is to have this less than 7%. For day to day monitoring, you can check the instant level of sugar in your blood using a glucose monitor. I use an Accu-Chek Mobile – I prick my finger, put some blood on the strip and it gives me a reading of the instant level. For instant measurement, the goal is to be in the range of *4 – 8 mmol/l*. If you averaged *7 mmol/l*, you would get an HbA1C of about 6%. Readings of less than *4 mmol/l* indicate a hypo; higher than *8 mmol/l* indicates too much sugar.

There are so many factors which affect the blood sugar level, it is crazy – some are obvious, some less so. Let’s start with insulin. I have two types of insulin that I need to take – Levemir and Novo Rapid. Levemir is long-acting insulin (basal) designed to provide a background level in the blood to keep the body able to process some sugar. It takes a while to start acting and then lasts for about 18 hours. I need to take this twice a day. Novo Rapid is fast-acting insulin (bolus) taken in response to food. It starts acting within about 15 minutes and lasts around 3 hours. The real complication is working out how much insulin to take. This takes us on to the next factor – food! The chart below shows a qualitative view of the insulin level in my body over a 24-hour period.

Food and drink are the source of carbohydrates in our bodies. Often in diabetes we talk about sugar, but it is actually carbohydrates more generally that matter. The amount of bolus insulin you need is directly related (approximately linearly) to the amount of carbohydrate you eat. This leads to the concept of ‘carb counting’. When you eat food, you can break it down into the different food types and work out (approximately) how much carbohydrate you have consumed. The nutritional information on packs will have this, and some restaurant chains provide this too.

Having got this number, you can then convert it into the number of insulin units needed. In my case, 10g of carbohydrate equates to needing about 1.5 units of Novo Rapid – so a meal containing 60g of carbohydrate would mean a dose of about 9 units. The ratios will vary from person to person and even meal to meal (as they do for me) and need to be worked out by each person. It is so easy to get this wrong. A couple of extra units of insulin and you will have a hypo; a couple of units short and you will be sky-high. Like everything with this disease, you get better with practice and continual work.

Calculating the basal doses is a different exercise. For me, I find it easiest to take them before bed and mid-morning. The before-bed dose is easy to work out, as I want my sugar level to be the same in the morning as it was when I went to bed. For the morning dose, I want the same concept – that my sugar levels remain constant – but this is harder to get a feel for, as activity and food vary. One technique is to have carb-free meals, in which case you don’t need the rapid dose; this can allow you to work out the correct levels. DAFNE has guidance and workflows to learn how to figure these doses and the ratios out.

So those are the principal factors. The next area is exercise. If you do additional exercise, the amount of insulin you need goes down – first a reduction in the quick-acting insulin to avoid a hypo. Doing exercise regularly will improve your metabolism and probably increase your body’s sensitivity to insulin, further reducing the basal dose you need. Exercise can also help you lose weight.

Weight is yet another factor in how much insulin you need: in general, the more you weigh, the more insulin you need. Unfortunately, this comes with a nasty positive feedback loop. The more insulin you take, the hungrier you feel, the more weight you put on, and hence the more insulin you need. It is a very difficult cycle to break.

There are many more factors that affect sugar levels, but this gives you a feel for them. All of this adds a pretty heavy burden to the day-to-day life of a diabetic. The potential complications caused by the disease also need to be checked for: my annual review includes checks of many internal organs’ functions (e.g. kidneys), and I need to keep a watch on my blood pressure, as high blood pressure can compound issues with diabetes.

All of this adds up to a stressful condition to manage. A lot of diabetics struggle with it – it is always present; there is never a holiday from it. My goal for my project is to make my ways of managing it as easy as possible. Fortunately, technology is improving…

Diabetes care is getting better and better. The insulins are much more advanced than they once were. They allow me to lead a normal life – I can eat what I want and do anything I wish. Sure, there is a constant workload to manage the condition, but the modern drugs allow you to balance things around your life.

The glucose meter I use today is significantly more advanced than the old one I got when I was first diagnosed. It links to my phone over Bluetooth and lets me transfer readings instantly. While I don’t have one, continuous glucose monitors are becoming more common; they allow real-time checking and monitoring of sugar levels.

The NovoPen 5, which I use to inject insulin, has gained a memory function that tells you when you last injected. The next version is on the horizon; it has NFC communication and keeps a record of a whole set of injections, which will minimise the risk of missed or extra injections. While I don’t have one, insulin pumps have also increased in use, and these again are developing quickly.

The holy grail for diabetes is the closed loop. If you can take the readings from a glucose monitor and automatically inject the correct amount of insulin, you can vastly reduce the burden a diabetic lives with. OpenAPS is a truly incredible project, started by Dana Lewis and Scott Leibrand, to build an artificial pancreas system from existing medical devices with an open-source design and implementation. The work they do certainly gives hope to me and, I am sure, to many other diabetics.

Hopefully, this has given you a flavour of living with the disease. Every day I end up:

- Testing my blood about 5 times a day (my fingers are perforated)
- Injecting two different insulins a total of 5 times
- Computing dose levels and carb counting food for every meal
- Doing exercise to reduce insulin levels and improve controls
- Monitoring blood pressure and weight

This adds up to a ton of data produced by a disparate set of devices, and while they all have little apps and tools, I want it all in one place. After taking the DAFNE course, I came up with my own spreadsheet-based record tool. It records:

- Weight and BMI
- Insulin injections, what I ate, blood glucose levels
- Estimated daily and weekly HbA1c from average readings
- Blood pressure readings
- Basic charts to show some quick details
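The post doesn’t spell out how the spreadsheet estimates HbA1c, but a common approach is to invert the ADAG study’s regression (eAG in mg/dL ≈ 28.7 × HbA1c − 46.7). A hedged sketch, assuming meter readings in mmol/L – this is an approximation, not a substitute for a lab test:

```python
def estimated_hba1c(avg_glucose_mmol_l: float) -> float:
    """Rough HbA1c (%) estimate from an average glucose reading.

    Inverts the ADAG regression eAG(mg/dL) = 28.7 * HbA1c - 46.7.
    1 mmol/L of glucose = 18 mg/dL. Whether the author's spreadsheet
    uses exactly this formula is an assumption.
    """
    avg_mg_dl = avg_glucose_mmol_l * 18.0
    return round((avg_mg_dl + 46.7) / 28.7, 1)

# An average reading of 7.8 mmol/L (~140 mg/dL):
print(estimated_hba1c(7.8))  # → 6.5
```

Tracking this estimate daily and weekly gives an early-warning trend line between the proper lab HbA1c tests done at review time.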

I find it helpful for my control, and I want to turn it from a rough-and-ready Excel prototype into a more reliable cloud-based solution that others could potentially use as well. Having the data at your fingertips lets you see patterns as they develop, and hopefully you can intervene earlier rather than later to keep things under control.

I want a simple web interface I can easily throw entries into, as well as a simple way to pull the whole dataset down and muck around with it in Alteryx or other tools.
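To make the “pull the whole dataset down” idea concrete, here is a hypothetical sketch of a flat record layout – the field names are my guesses, not the actual spreadsheet’s – where one row per event keeps the CSV export trivial for Alteryx or pandas:

```python
from dataclasses import dataclass, asdict
from datetime import datetime
import csv
import io

@dataclass
class DiaryEntry:
    # Hypothetical flat record: one row per event of any kind.
    timestamp: datetime
    event_type: str   # e.g. "glucose", "bolus", "basal", "meal", "bp", "weight"
    value: float      # reading, units injected, carbs in g, etc.
    notes: str = ""

entries = [
    DiaryEntry(datetime(2019, 1, 1, 8, 0), "glucose", 6.2),
    DiaryEntry(datetime(2019, 1, 1, 8, 5), "bolus", 4.5, "breakfast"),
]

# Export to CSV in memory; a real tool would write to a file or an API.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["timestamp", "event_type", "value", "notes"])
writer.writeheader()
for e in entries:
    writer.writerow(asdict(e))
print(buf.getvalue())
```

A long, narrow table like this is easy to filter and pivot downstream, which suits the “muck around with it” workflow better than one wide row per day.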

In the next post, I will start thinking about the data store and structures and begin building it out.
