Connecting Tableau to Web Data (part 2) – Quandl

My main goal is to connect Tableau in an easy way to my company’s Asp.Net WebAPI site. We use JSON as our interchange format between server code (C#) and the client code (JavaScript and HTML). While that data is only available within my company network it is easy enough to find some JSON data sources on the web.

A friend introduced me to Quandl. They have load of different financial dataset readily available in a variety of different formats. For what I needed for developing the connection I chose to look at the USD/JPY FX rate history: https://www.quandl.com/CURRFX/USDJPY-Currency-Exchange-Rates-USD-vs-JPY. They provide this data in a variety of downloadable formats:

image

For what I wanted to do, I used the JSON format. The URL to the data is:

https://www.quandl.com/api/v1/datasets/CURRFX/USDJPY.json?trim_start=1991-03-05&trim_end=2014-10-31

There is documentation available on the API here https://www.quandl.com/help/api

Chrome has some great JSON tools available. I tend to use PostMan to make test JSON requests that I can then fiddle with:

image

Reading the dataset in within python is very straight forward:

def downloadJSONData(ticker, startDate, endDate):
      dataUrl = "https://www.quandl.com/api/v1/datasets/{0}.json?trim_start={1}&trim_end={2}".format(ticker, startDate.strftime("%Y-%m-%d"), endDate.strftime("%Y-%m-%d"))
     data = urllib2.urlopen(dataUrl).read()
     return json.loads(data)

The dataset I work with in the office tends to be an array of JSON objects out of an ASP.Net Web API service, so I chose to reshape the JSON data in python to be more like the data I wanted:

def dateFromString(dateYYYY_MM_dd):
    return datetime.date(int(dateYYYY_MM_dd[0:4]), int(dateYYYY_MM_dd[5:7]), int(dateYYYY_MM_dd[8:10]))

def extractArrayOfDictionaries(jsonObject):
    colNames = jsonObject["column_names"]
    data = []
    for dataRow in jsonObject["data"]:
        dataObj = {}
        for colii in range(len(colNames)):    
            if colNames[colii]=="Date":
                dataObj[colNames[colii]] = dateFromString(dataRow[colii])
            else:
                dataObj[colNames[colii]] = dataRow[colii]
        data.append(dataObj)
    return data

This remapping function will need be adjusted to match the response from an web service, but after running this process on the JSON response the data looks like:

[{u'Date': datetime.date(2014, 11, 14), u'Rate': 115.567, u'Low (est)': 114.27, u'High (est)': 116.88},
, {u'Date': datetime.date(2014, 11, 13), u'Rate': 115.43, u'Low (est)': 0.0, u'High (est)': 0.0},  … ]

As the data is now typed and uniform, I use this to create the table definition and then iterate over each of the dictionaries and add it to the tableau file:

def createTDETableDef(tdeFile, dict):
     tableDef = tde.TableDefinition()
     for col in dict.keys():
         typeval = type(dict[col])
         if typeval is datetime.date:
             tableDef.addColumn(col, tde.Type.DATE)
         elif typeval is datetime.datetime:
             tableDef.addColumn(col, tde.Type.DATETIME)
         elif typeval is datetime.timedelta:
             tableDef.addColumn(col, tde.Type.DURATION)
         elif typeval is bool:
             tableDef.addColumn(col, tde.Type.BOOLEAN)
         elif typeval is int:
             tableDef.addColumn(col, tde.Type.INTEGER)
         elif typeval is float:
             tableDef.addColumn(col, tde.Type.DOUBLE)
         else:
             tableDef.addColumn(col, tde.Type.UNICODE_STRING)
     return tableDef
 
def addRowToTable(tableDef, dict):
     row = tde.Row(tableDef)
     for ii in range(tableDef.getColumnCount()):
         val = dict[tableDef.getColumnName(ii)]
         type = tableDef.getColumnType(ii)
         if (type == tde.Type.UNICODE_STRING):
             row.setCharString(ii, str(val))
         elif (type == tde.Type.BOOLEAN):
             row.setInteger(ii, bool(val))
         elif (type == tde.Type.INTEGER):
             row.setInteger(ii, int(val))
         elif (type == tde.Type.DOUBLE):
             row.setDouble(ii, float(val))
         elif (type == tde.Type.DATE):
             row.setDate(ii, val.year, val.month, val.day)
         elif (type == tde.Type.DATETIME):
             row.setDateTime(ii, val.year, val.month, val.day, val.hour, val.minute, val.second, (val.microsecond // 100))
         elif (type == tde.Type.DURATION):
             row.setDuration(ii, val.days, val.seconds // 3600, (val.seconds // 60) % 60, val.seconds % 60, (val.microsecond // 100))
     return row
 
# Given a set of dictionaries create TDE
def createTDEFile(fileName, dicts):
     # Create TDE File, TableDef and Table
     tdeFile = tde.Extract(fileName)
     tableDef = createTDETableDef(tdeFile, dicts[0])
     table = tdeFile.addTable("Extract", tableDef)
     for dict in dicts:
         table.insert(addRowToTable(dict))
     tdeFile.close()
     return tdeFile

The whole process is now:

ticker = "CURRFX/USDJPY"
jsonObject = downloadJSONData(ticker, datetime.date(2014,11,1), datetime.date.today())
data = extractArrayOfDictionaries(jsonObject)
fileName = "D:\\Home\\Tableau\\" + ticker.replace("/", ".") + ".quandl.tde"
createTDEFile(fileName, data)

Nothing too difficult. I wrapped it up as a function call and can download data from python into TDE files.

A big help came from reading Scott Hanselman’s blog on the Python Tools for Visual Studio. While I am a complete novice in Python, I spend most of my day job buried in Visual Studio so keeping it all in there is great. And as you can now get Community Edition, it is a great place to do this experimenting.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s