Beyond Alteryx Macros – Source Control, Testing and Installing (part 3)

While I am writing this as part of the series on the Alteryx SDK, many of the techniques can be used when working with Alteryx in general.

This is the third post in my series on extending Alteryx via its SDKs and APIs. The first post gave a brief overview of Alteryx as a platform and in the last post, we created our first formula add in. This post covers some of the practices and processes I have used for building these add-ins.

Source Control

Everything related to these posts can be kept in source control. Most of Alteryx own files format are XML files, so can easily be diffed and kept in source control (though be careful with merges!). The larger compressed formats, yxdb (data files) and yxzp (packaged workflows), can just be treated as binary files. When this series gets onto covering the C++ based functions and the various platforms for creating custom tools (C#, C++ and HTML/JS) then all of these can and should be kept in some form of source control.

It is very useful to keep the macro directories within source control. Firstly, so if a change breaks them, then they can be reverted to a previous version, but also because using branches it is possible to enable ‘project’ level macros. Create a branch for a project with macros only for that project in the branch. When the branch is changed, those macros disappear from the working set.

SubversionHistorically, I have used Subversion a lot in my work (and CVS before it but that was a long time ago now!). I used this at first when I got started with Alteryx and it suited me fine. It does require a server and does not have the local repository functionality of a more modern distributed version control system. If, however this is the platform available, then I recommend TortoiseSVN for working with it in Windows, it makes the experience a lot more integrated.

Git-logo.svgToday, almost everything I do with Alteryx is kept within git repositories. The distributed nature means I can work on my Surface tablet without a network connection and still be able to keep versions of my work. If you are starting from scratch, git also allows you to just run a repository locally adding a server (remote repository) later. There is a nice online course, tryGit, which will take you through the basics in about 15 minutes.

For a remote repository, I use GitHub. Most of the stuff I do I am happy to be done in the open and hence it is free. If you pay a subscription then you can also have private repositories. In my case, I am also a big fan of Visual Studio Team Services. As I do a lot of work within C# as well it is a great platform and it allows you to set up private git repositories for free. I like Source Tree for working with git, though the integrated support within Visual Studio (and Visual Studio Code – which is my most used tool for formula add-ins) is very good.

Alteryx Server has version control built into its workflow repository. I have seen this demoed at the Inspire Europe conference and it seemed very straight forward but I don’t run a server instance and hence don’t have much experience of it. That said the Alteryx Gallery is I believe a big version of the server, and I have found that publishing and controlling version of macros on there to be very quick and simple.

Creating An Installer Script

This section describes how to install formula add-ins. Many of the pieces are the same for custom tool plugins but I will revisit that in a later post.

For Alteryx to use a formula add in, the XML files and (if necessary) the compiled C++ DLLs need to be copied into a folder within the bin directory of Alteryx’s install called RunTimeData\FormulaAddIn. Alteryx include a Sample.xml file so the folder should already exist. The registry contains entries of where Alteryx is installed, for admin installs it is HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\SRC\Alteryx\InstallDir64, and for user installs it is HKEY _CURRENT_USER\SOFTWARE\SRC\Alteryx\InstallDir64. Both of these are the 64-bit version, but as of version 11 the 32-bit version is no longer available so I would target these. The code below shows how to read the admin install value into %alteryxPath%:

FOR /F "usebackq tokens=2,* skip=2" %%L IN (
 `reg query "HKLM\SOFTWARE\WOW6432Node\SRC\Alteryx" /v InstallDir64`
) DO SET alteryxPath=%%M

If you have an admin install of Alteryx then this is defaults to within Program Files which is protected by Windows UAC. The old version of the installer script for Alteryx Abacus used this technique to request UAC access. The new version uses an executable, InstallAlteyxAbacus.exe, to copy the files that will ask for administrator rights if it needs to. The syntax for this command is:

InstallAlteryxAbacus.exe  [Pattern2] ... [PatternN]

One additional complication when publishing add-ins on the internet, modern versions of Windows automatically block files downloaded from the internet.

image

Fortunately, it is straight forward to unblock these files using the PowerShell cmdlet Unblock-File. The code below lists all files in the current working directory (gci – GetChildItem), and then passes them to Unblock-File:

powershell -Command "gci . | Unblock-File"

Putting it all together results in a script like this:

@echo off

pushd "%~dp0"
powershell -Command "gci . | Unblock-File"

FOR /F "usebackq tokens=2,* skip=2" %%L IN (
    `reg query "HKLM\SOFTWARE\WOW6432Node\SRC\Alteryx" /v InstallDir64`
) DO SET alteryxPath=%%M

if "%alteryxPath%" NEQ "" (
    .\InstallAlteryxAbacus.exe "%~dp0\*.dll" "%~dp0\*.xml" "%alteryxPath%\RuntimeData\FormulaAddIn"
    echo FormulaAddIns installed to %alteryxPath%\RuntimeData\FormulaAddIn\
)

FOR /F "usebackq tokens=2,* skip=2" %%L IN (
    `reg query "HKCU\SOFTWARE\SRC\Alteryx" /v InstallDir64`
) DO SET alteryxUserPath=%%M

if "%alteryxUserPath%" NEQ "" (
    .\InstallAlteryxAbacus.exe "%~dp0\*.dll" "%~dp0\*.xml" "%alteryxUserPath%\RuntimeData\FormulaAddIn"
    echo FormulaAddIns installed to %alteryxUserPath%\RuntimeData\FormulaAddIn\
)

if "%alteryxUserPath%%alteryxPath%" EQU "" (
    echo Please copy all xml and dll files from "%~dp0"  to $AlteryxInstallDir$\RuntimeData\FormulaAddIn\
    pause
)

popd

If you copy this install script and the InstallAlteryxAbacus.exe executable to your own add in folder then it should install any XML and DLL files you have created into the appropriate subfolder within Alteryx bin directory for you.

Testing

Whenever you are creating reusable components for Alteryx, be they macros, functions of custom plugins, it is important to think about how to test them. This post covers a few of the helpful tools for testing anything in Alteryx. While concentrating on reusable components, the techniques can also be used on normal workflows to ensure they are behaving as expected and not losing any data. All of the built-in tools below can be found within the developer’s tool group (which is full of lots of very useful tools):

image

The first tool to mention is the Test tool. It has various different test types it can run either looking at the record count or evaluating an expression. If the test fails then an error is reported from the engine with the specified test name. More generally, this tool is fantastic for ensuring that joins are complete and no records are being lost: Add one to the L or R outputs of a Join tool and verify that the record count is 0. Combined with the setting to make a workflow stop if an error occurs, this makes safety checks on workflows much simpler – and something you should consider doing on any important processes.

image

From a testing perspective, this tool allows us to assert that all the records match a certain expression (using the ‘Expression is true for all records’ type). I find it clearer to use a Filter tool to evaluate the expression (as the check is then more clearly displayed within the designer) before passing in the failing rows and using a test tool to check that the row count is 0, but this is functionally equivalent:

image

An alternative to the test tool is the Message tool. In a test tool, you can have multiple different asserting (tests) whereas within the message tool you can only have a single one. The message tool allows you to control the severity of the message which is sent (Message, Warning, Field Conversion Error, Error) which is no possible with the test tool. Likewise, it is possible to include more details in as the message itself can use values from the data.

A special additional field, [RecordNumber], is also available within the Message tool. This allows you to do record count checks (you should specify to send messages After Last Record). This tool is particularly useful if you are running a workflow through the AlteryxEngineCmd as its output will be displayed within the output.

The final built in tool to mention is the Field Info tool. This reads out the meta data of an incoming connection returning the field names and types. Combining this with a Test tool, it is straight forward to verify that fields are present as you expect and also of the expected type.

Using the combination of these tools (though primarily just the Test and Field Info tools), all of the function of the Alteryx Abacus add in have simple tests. These tests also work as a kind of self-documentation allowing users of the add in to see the syntax and the results. In general, I create a set of data testing various scenarios for the functions with an expected output which I can check against.

So now we can write individual test works flows we now want to put them together. Using a directory tool to search for all workflows and then combined with the List Runner macro (part of the CReW macros created by Adam Riley) allow you to create a workflow which will search for test workflows and run them:

image

The List Runner macro takes the list of files and then runs them one at a time capturing the output messages and providing counts of warnings and errors:

SNAGHTML3d5a739

When we encounter any errors when running the tests, we can read the log entry messages. One warning to be aware of is that the engine will report if it is running low on memory as a warning. Splitting the log into separate lines, it is easy to filter this message out using a regular expression:

!REGEX_Match([Log], "Warning - ToolId \d+: The machine is running low on available physical memory")

A second regular expression can be used first to filter down to just the errors and warnings, and then to parse it into ToolId, type and message:

REGEX_Match([Log], "(Warning|Error|FieldConversionError) - ToolId \d+: .*")

image The final steps are to group by the module and concatenate the errors and then join back to the original list of test workflows to create the of successful and unsuccessful tests. To allow running this from the AlteryxEngineCmd, a test tool checks for no failed tests – that way the exit code will report if there are any failed tests:

image

Non ‘’Happy Path” Tests

Testing for the happy path (where all is well and the value is computed as expected) is straight forward. Happy path tests can and should include checking support for NULL values. Normally, when writing a function or a macro it is possible to choose a reasonable value for the function to return if given a NULL input. These can then be included within the scenarios run as described above. Extending the process above to check for expected errors and warnings, i.e. Non-Happy Path, is a little more complicated.

For XML based add ins, you can only accept a fixed number of arguments. The formula engine will only call into your function if you have the matching number of arguments. For C++ based functions, you can take a variable number with you just specifying a minimum in the XML. In both cases, you might want to return a nice error message to the user when invalid arguments are passed. By default, it is difficult to return a nice error message from XML (you will get errors from the underlying functions though), there is a C++ based function ReportError in Alteryx Abacus (from version 1.3) which returns an error with specified message as a way to allow nicer errors from XML based functions.

The way I chose to implement this is to use the annotation on tools within the test. As Alteryx workflows are just XML files it is possible to read the workflow in dynamically and then extract the Annotation Text.  Using regular expressions, it is fairly easy to find custom annotations on tools and extract them:

image

The first expression is used to break the workflow into tools – Tokenize to rows with an expression of:

.*?

Each of these nodes is then parsed using the second RegEx tool, using the expression below to extract, the ToolID (from the attribute on the Node), and the expected type (one of Warning, Error or FieldConversionError) and then the format (expressed as a regular) from the annotation text for the tool.

.*?(Warning|Error|FieldConversionError): (.*).*?

These tests need to fail if the error is not of the correct type or if the message does not match the regular expression. It also fails if there is no error or warning recorded for the tool. By joining the output of the above back to output from the unit test runner, it is straight forward to filter out the expected messages. The R output of the join will be empty if all expected errors were received, or will give a list of errors that were not triggered. We also need to continue to pass through any errors which were not expected at all, these will be in the L output of the join.

image

imageSample Test Framework

If you wish to try this framework out you can download this sample. There is a deliberately failing test within it and a sample happy and non-happy path test.

A Minor Issue…

I did find a minor bug with AlteryxEngineCmd (as of version 11.0) when putting the non-happy path tests together. If an error is returned from a C++ based function when data is being passed through then it does not get reported in output:

image

image

As the Runner macro, I use for the unit testing is based on the AlteryxEngineCmd, it won’t pick up messages from here.

Multiple Version Compatibility

As version 11 of Alteryx has just come out, I of course wanted to upgrade as soon as possible. However, I want to continue to support the previous version (10.6) for the next few months. Alteryx allows you to install both an admin and a non-admin install and these can be different versions. This is great for beta testing the upcoming release, or for testing compatibility with the previous version. There are a few complications on doing this.

Firstly, whichever instance of Alteryx you install last will be the one that Windows will use by default. You can specify which version you want to open each type by right clicking on a file and going to Open With => Choose Another App. Within the window that pops up you can specify to use the other Alteryx and make this a default:

image

image

Second issue that can occur is that the file being saved in the new version of Alteryx might not be compatible with previous versions. This post has how adjust Alteryx versions for different versions. As with many things in programming remember to “Keep it Simple, Stupid”  when writing the tests and then compatibility will be less of a problem. All of the tools and processes I have used in this post should work with various old version of Alteryx.

When creating function there is a constant, [Engine.Version], which can be used to determine which version of Alteryx you are in. The formula below reads the major and minor release numbers and converts it to a number:

ToNumber(REGEX_Replace([Engine.Version],"(\d+\.\d+)\..*","$1"))

Using this, you can create functions which work with multiple version of the engine. The example below show a formula to parse a date in American format with or without leading 0s for both 10.6 and 11:

DATETIMEPARSE(
    IIF(ToNumber(REGEX_Replace([Engine.Version],"(\d+\.\d+)\..*","$1")) < 11.0,
        Trim(REGEX_Replace(" " + P1,"(\D)(\d)(?=\D)","${1}0$2")),
        P1), 
    "%m/%d/" + IIF(REGEX_Match(P1,".*\D\d{1,2}$"), "%y", "%Y"))

This trick however will not work if you are using a function which did not exist in the old version as the formula will not be able to be parsed. In this scenario you would need to have version specific add in files.

Up Next …

This post has covered a lot of random bits and bobs that I find useful when using Alteryx or developing custom plugs in. All of it is useful for both creating new functions and also for creating custom tool plug ins.

In the next post in this series, I going to take these ideas a little bit further and look at performance testing custom functions, setting up an automated test and release process.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s