
These are useful tools for processing the SQL data.

This is the code we wrote to modify SQL to have a consistent style, specifically:

Tests were written alongside the code and are also included. If you use this, we suggest proceeding with care: if your SQL contains constructs we did not consider, the results could be unexpected.

Collects a few simple statistics about a dataset:

A convenient tool to convert from our JSON format to three files (train, dev, test) containing one example per line: sentence | query with variables filled in.
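The conversion can be pictured with a minimal sketch. This is not the tool itself. It assumes each JSON entry has an `sql` list of query templates and a `sentences` list whose items carry `text`, a `variables` mapping, and a `question-split` field naming the split. These field names, the choice of the first query, and the helper names `fill_variables`, `to_lines` and `write_splits` are all assumptions made for illustration.

```python
from pathlib import Path


def fill_variables(template, variables):
    """Replace each placeholder name in the template with its value."""
    # Substitute longer names first so a name that is a prefix of another
    # is not replaced inside the longer one.
    for name in sorted(variables, key=len, reverse=True):
        template = template.replace(name, variables[name])
    return template


def to_lines(dataset):
    """Return {'train': [...], 'dev': [...], 'test': [...]} of 'sentence | query' lines."""
    lines = {"train": [], "dev": [], "test": []}
    for entry in dataset:
        query_template = entry["sql"][0]  # assumption: use the first query
        for sentence in entry["sentences"]:
            variables = sentence["variables"]
            text = fill_variables(sentence["text"], variables)
            query = fill_variables(query_template, variables)
            # assumption: the split name is one of train, dev, test
            lines[sentence["question-split"]].append(f"{text} | {query}")
    return lines


def write_splits(lines, out_dir):
    """Write one file per split, one example per line."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for split, examples in lines.items():
        (out / f"{split}.txt").write_text("\n".join(examples) + "\n")
```

With these assumptions, a sentence such as `what cities are in state0` with `{"state0": "texas"}` would be written as a single line containing the filled-in sentence, a `|` separator, and the filled-in query.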

A utility script to write JSON-formatted datasets split by question or query splits, and further divided into train/dev/test or cross-validation splits. Each split can then be read in independently, which simplifies data loading.
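A rough sketch of this kind of partitioning is shown below. It is not the script itself. It assumes that each sentence carries a `question-split` field and each entry a `query-split` field, holding either a name such as `train` or a cross-validation fold identifier. Those field names, the output file naming, and the functions `partition` and `write_partitions` are assumptions for illustration.

```python
import json
from collections import defaultdict
from pathlib import Path


def partition(dataset, split_type):
    """Group examples by split name under the 'question' or 'query' scheme."""
    if split_type not in ("question", "query"):
        raise ValueError("split_type must be 'question' or 'query'")
    groups = defaultdict(list)
    for entry in dataset:
        for sentence in entry["sentences"]:
            # Question splits are stored per sentence, query splits per entry
            # (both field names are assumptions).
            if split_type == "question":
                name = sentence["question-split"]
            else:
                name = entry["query-split"]
            groups[name].append({
                "text": sentence["text"],
                "variables": sentence["variables"],
                "sql": entry["sql"],
            })
    return dict(groups)


def write_partitions(dataset, out_dir):
    """Write one JSON file per (split type, split name) pair."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for split_type in ("question", "query"):
        for name, examples in partition(dataset, split_type).items():
            path = out / f"{split_type}_split_{name}.json"
            path.write_text(json.dumps(examples, indent=2))
```

Writing each group to its own file means a training script only has to load the one file it needs, rather than filtering the full dataset every time.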