In the first part of this series, I gave an overview of the Alteryx platform and some of the functionality offered by the SDK. The first parts of the SDK I am going to look at concern adding additional functions into the formula function library. There are two methods of doing this: XML Macro functions and C-style DLL functions. This post will take a look at first of these – XML Macro functions.
This was the first part of the SDK I played with (not really realising the power of the SDK at that point), I came across them based on a blog by Adam Riley: FormulaAddIn (or 3Rd Floor Flat part 2). My early experimenting was to add some additional string conditional functions – StartsWith, EndsWith and (following Ned’s suggestion) Contains. These are now built in (though I think as C++ functions in the library) to core functions library.
The function library is used within various of the core first party workflow tools, as well as within some reporting and tools for macros or apps design. One exception worth noting is the In-DB formula tool – this is a SQL function tool not linked to the formula library. Some of the tools using the formula system and the different operation they perform are listed below. This in not an exhaustive list, I have listed the ones I use but gives an idea of how widespread the function library is within Alteryx.
This lets you create new fields or update existing fields using an expression based on existing values on the row
- Multi-Row Formula
This allows you to look ahead or look behind a number of rows and create a new field or update an existing field based on them
- Multi-Field Formula
This allows you to perform an operation over all or a subset of the fields, and either create a new field for each (the default) or overwrite the existing field (including changing the type). This can be very useful for data cleansing for example.
You can use the formula based expressions as custom filters
- Generate Rows
This allows you to perform the equivalent of a For…Next loop within Alteryx, generating a set of rows for each input row.
The initial value, loop step and termination expression are all formula expressions.
This tool lets you raise a warning or an error on a condition. This condition can be a formula expression.
This is a fantastic tool and allows you to add safety checks to joins, etc. We will be using this in the next part of this series for creating test workflows.
This tool allows you send a message either at the start, at the end or based on a conditional expression. The message is also an expression as well
This tool allows you to take inputs from interface tools and then update the configuration of other tools within a macro or analytic app.
It is worth noting that XML Macro function cannot add anything that wasn’t possible before. In many ways, they are syntactic sugar. In other words, they have the power to make using function both a lot more pleasant and also a lot quicker. Take as a simple example, the Contains function. Prior to its inclusion in the library you had to write something like:
While not a big change, the contains function reads cleaner:
As a tutorial, we will create two simple functions:
This will return the Golden Ratio
This will return value unless null, in which case it will return value_if_null
We will look at the process to create both of these. In the next post, we will look at creating a simple installer script and setting up some testing workflows.
The Formula Tool
The basic formula tool is the best place to start when wanting to create new functions.
This post is based around the current version of the formula tool (as of version 10.6). A new UI was demonstrated as part of the keynotes at Inspire 2016 in San Diego (see George Mathew’s Keynote). As discussed in the previous posts, the UI and the engine are actually two separate DLLs connected via the XML configuration. The engine piece is not being changed (I don’t believe) and all formulas and extension functions should work as before within the workflow. From what I have seen the new UI is a fantastic step forward (and great demonstration of using HTML as the UI piece) – giving both IntelliSense suggestions and syntax colouring much like in Tableau.
The first step to creating a macro function is to first work out the formula for the function you want. The formula for the Phi and for IfNull are:
You can set these up in the usual manner in a formula tool, and then test that it gives you the results you expect:
For a XML macro the parameters are passed as P1, P2, P3 … The Phi function is not going to take any so we don’t need to change anything there. I find the easiest way to do this is to add a field called P1 and P2 in formula tool and then replace the fields in the function as needed (this will make Alteryx do all the work of writing the XML for us). You don’t need to surround the field with [ ], this is only needed if your field name contains operators or white space. The new formula tool and results table look like:
We now need to get to the XML for the function. This will handle encoding special characters in the XML for us (such as “ and &). The table below gives the characters that need encoding:
|“||"||double quotation mark|
There are two ways simple ways to do this. Either save the workflow and open it in a text editor. If you look at the top of the configuration panel there is a number in brackets (5 in the case above). This is the Tool Id. You can use this to find the section of the xml you need (search for ToolID=”5”). You are looking for the Node element and the Configuration section within it:
An easier way to access the XML (and one to get very familiar with when doing SDK development) is to use the XML view within the Alteryx designer. You will need to turn this on within the user settings. Go to Options => User Settings => Edit User Settings. Then within the Advanced tab, there is a tick box ‘Display XML in Properties Window’:
Having done that a new section appears in the Properties Window for the workflow and for the tools within it. For the workflow itself, it gives you read-only access to the whole yxmd file. For a tool, it gives the Configuration element for the tool:
Creating The Add In
We now have the XML expression for the functions we want to create, we now need to convert them into a formula add in so we can use them as functions. A formula add in is just an XML with each function defined as a Function Node. Lets start by creating an XML with the following contents:
<?xml version=“1.0” encoding=“utf-8”?>
I am not sure if it the document element name (FormulaAddIn) matters but have always used this. We now need to create Formula nodes for the two functions. Each Formula node has six required elements for a macro function:
|Category||Category name for function to appear under. This can be either one of the existing ones or a custom name.|
|NumParams||Specifies the number of parameters. It has a required boolean attribute of variable. For XML Macro functions this must be false.|
|Description||Text to display to inform the user what the function does.|
|InsertText||The text to insert into the Expression box should the user double click on the function in the function list|
|Formula||The actual formula to substitute in place of the function name with P1, P2, … as place holders for the parameters|
Lets start by creating the PHI function. We can get the formula we need from the existing formula tool XML. We want this to be within the Math category. It takes 0 parameters. The XML for this looks like:
<Description>The Golden Ratio</Description>
<Formula>(1 + SQRT(5)) / 2</Formula>
Insert this block into the XML file. This specifies the function name (PHI), the category (Math) and description (The Golden Ratio)for the function list. If you double click on PHI within the list, it will insert PHI() into the expression box. This function will be replace with the formula ((1 + SQRT(5)) / 2) when the engine evaluates this expression. The NumParams node is specifying that this function takes no arguments. The variable attribute must be set to false for macro functions, this is for C-style DLL functions which we will look at in another post.
Now moving on to look at the IFNULL function:
<Description>If the first parameter is null replace with the second</Description>
This block is very similar to the block for PHI. I have chosen to put the IFNULL function with the Specialized category. The formula is exactly the same as we saw in the formula tool above. The only very slight complication is the InsertText element. This is where you can leave placeholders for the parameters for the user to replace. It worth making these descriptive so the user knows what each parameter does. This is as close as you get to providing syntax help to the user.
We have now created the basic XML add in file. You can download this file from here. This needs to be saved to the bin\RuntimeData\FormulaAddIn within the Alteryx install directory. For an admin install of Alteryx, this will be probably something like C:\Program Files\Alteryx\bin\RuntimeData\FormulaAddIn and the folder will be protected by UAC. To get around UAC save to an unprotect location and then copy to the folder. If you have a non-admin install the equivalent install folder will probably be %LOCALAPPDATA%\Alteryx\bin\RuntimeData\FormulaAddIn. After this restart Alteryx, and the functions should be in your formula list (you don’t need to restart for the engine to be able to use the functions as the engine is a separate process to the designer):
You can then use the functions within the formula expression as you would expect:
There is a sample workflow for these function you can download here. This workflow will write the add in file to a temporary location and install it on first run. You will get a UAC prompt on this run. You will need to restart Alteryx for it to appear in the function list. Once the add in is installed the UAC prompt wont trigger again. If you wish to uninstall the add in then disable the Installer tool container and enable the Uninstaller tool container.
Limitations of Macro Functions
First as we said above, these are just syntactic sugar. They get replaced with their formulae early on in the evaluation of an expression. This being said, they can make you expression significantly easier to read and write. For example, the IFNULL function means you do not need to repeat the first argument, if this is a long expression then this results in a nice simplification. I also believe the argument is only evaluated once so if the evaluation is hard then this can be a big performance improvement.
Debugging these add ins can be difficult. Alteryx will not give you any error messages if the XML is badly formed. The add ins will just not load and the functions will not be available in the designer or the engine. I have been playing with creating a XSD (XML Schema) file and then using an online XML validation tool (e.g. http://www.utilities-online.info/xsdvalidation/#.V6-VCZgrIcU) but haven’t completed this yet.
Because debugging them are hard – don’t take it too far. In my experimenting with then. I did go as far as to create HexBin functions (based off the d3.js code) in XML (see image below). They took a lot of tuning and playing with to get to working and were impossible to adjust and really understand. I have now re-implemented them as C-style DLL functions.
Final mention of some fun and games with recursive calls. I was trying to set up a number of business days function and wanted it to call itself if start was after end. Easy enough to express that in the formula but Alteryx does not like it. It will load the add in quite happily but the second you call the function it will just die! I think the substitution engine cannot cope and while fun to find a reliable way to make Alteryx die (as it is a very stable product), it is trivial to create another function and avoid this issue. I do not believe you can mark a function as internal or private and if you get to much complexity or layers worth thinking about C-style DLL functions instead of XML macros.
We have created our first add in. In the next post, I’ll take a look at source control, unit testing and creating an installer.