External tools

Spotlight can connect to a range of data analysis tools, including Jupyter Notebooks, Tableau, and Looker. It serves as a bridge between your analytics tool and the data sources that feed the Spotlight dataset. When you run a query in your analytics tool, Spotlight distributes it to the relevant source systems and returns the live results to you.

Permissions

To pass your query to the source systems that hold your data, Spotlight needs stored credentials for those sources. Spotlight automatically stores these credentials whenever you authenticate to view a dataset.

If you encounter any permission errors in your analytics tool, log in to Spotlight and make sure you can view the contents of the dataset there. Enter any missing credentials for specific data sources or update them if your credentials have changed on the source system.
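
If your analytics tool is Python-based, as in the Jupyter setup described later on this page, a small wrapper can turn a failed connection into a reminder of the steps above. This is only a sketch and not part of Spotlight: `connect_with_hint` is a hypothetical helper, and `connect` stands for whatever connection function your driver library provides (for example `pyodbc.connect`). It does not try to tell permission errors apart from other failures, because the exact error raised depends on the driver.

```python
def connect_with_hint(connect, connection_string):
    """Call connect(connection_string); on failure, re-raise with a hint.

    `connect` is any callable that opens a connection, e.g. pyodbc.connect.
    This helper is hypothetical and is not provided by Spotlight.
    """
    try:
        return connect(connection_string)
    except Exception as exc:  # driver-specific error types vary
        raise RuntimeError(
            "Could not connect through Spotlight. Log in to Spotlight, check "
            "that you can view the dataset there, and enter or update any "
            "stored credentials for its data sources."
        ) from exc
```

The original driver error stays attached as the exception's `__cause__`, so the underlying message is still available for troubleshooting.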

Open in...

To open one of your Spotlight datasets in an external analysis tool:

First:

  • Visit the dataset's detail page, or
  • Select it while in a Workspace's Workbench, or
  • Open the dataset's context menu in a Workspace and select the "Open in..." menu item.

Then:

  • Click the 'External tools' button that corresponds to the tool you wish to open.

See the section for your preferred analytics tool below for additional configuration details.

Note that datasets representing file uploads or tables on external databases do not have these buttons on their detail pages. You can still open them in external tools by adding them to a Workspace and using the "Open in..." buttons in the Workspace's Workbench.

Tableau

When you use the 'Tableau' button on a dataset, your browser will start downloading a TDS file. To use this file, open it in a licensed copy of Tableau Desktop and a live connection will be set up automatically.

From Tableau Desktop, you can publish a workbook with the live data to Tableau Server.
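
A TDS file is an XML document, so you can inspect what it contains before opening it in Tableau. The sketch below lists the attributes of every `connection` element it finds. The `list_connections` helper and the sample content are hypothetical and heavily simplified; the file Spotlight generates will contain more elements, and its attribute names and values may differ.

```python
import xml.etree.ElementTree as ET


def list_connections(tds_text):
    """Return the attributes of every <connection> element in a TDS document."""
    root = ET.fromstring(tds_text)
    return [dict(element.attrib) for element in root.iter("connection")]


# Hypothetical, simplified TDS content for illustration only.
sample_tds = """
<datasource formatted-name='sales_view' inline='true'>
  <connection class='spark' server='localhost' port='10001' username='anna' />
</datasource>
""".strip()

for connection in list_connections(sample_tds):
    print(connection)
```

To inspect a downloaded file, read it as text (for example with `open(path, encoding="utf-8").read()`) and pass the result to `list_connections`.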

Jupyter Notebooks

Requirements

Connecting to Spotlight from Jupyter Notebooks requires a Spark ODBC driver and Python ODBC support libraries such as pyodbc. These instructions are based on the Simba Spark ODBC driver and the freely available Anaconda Distribution of data science tools. Other Spark drivers are not known to work.

The .py file

When you use the 'Jupyter' button on a dataset, your browser will start downloading a .py file with contents similar to:

   import pyodbc
   # Setup connection
   # Configure python odbc driver location and credentials to connect to Europa:
   cnxnstr = 'Driver={<your python odbc driver>};HIVESERVERTYPE=1;HOST=localhost;PORT=10001;HTTPPATH=/cliservice/;UID=anna;PWD=<your password>;SCHEMA=sales;AuthMECH=3;SSL=0;THRIFTTRANSPORT=2;'
   # Python odbc driver examples:
   # For instance Simba driver for MacOS
   # /Library/simba/spark/lib/libsparkodbc_sbu.dylib

   # Connect to database
   cnxn = pyodbc.connect(cnxnstr, autocommit=True)
   # Create cursor
   cursor = cnxn.cursor()
   # Execute query
   cursor.execute('select * from sales_view;')

Save this file to a directory on your local machine and then edit it in a text editor to add in your user credentials and the location of your Simba driver.
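
If you would rather not type your password into the file, one option is to assemble the connection string in code and supply the password at run time. The sketch below rebuilds the same keys and fixed values shown in the file above. `build_connection_string` is a hypothetical helper, not something Spotlight generates, and it does not escape values that contain `;` or braces.

```python
def build_connection_string(driver, uid, pwd, schema,
                            host="localhost", port=10001):
    """Assemble an ODBC connection string with the keys used in the file above.

    Hypothetical helper: the fixed values (HIVESERVERTYPE, AuthMECH, SSL,
    THRIFTTRANSPORT, HTTPPATH) are copied from the example file and may
    differ in the file you download.
    """
    parts = [
        ("Driver", "{" + driver + "}"),
        ("HIVESERVERTYPE", 1),
        ("HOST", host),
        ("PORT", port),
        ("HTTPPATH", "/cliservice/"),
        ("UID", uid),
        ("PWD", pwd),
        ("SCHEMA", schema),
        ("AuthMECH", 3),
        ("SSL", 0),
        ("THRIFTTRANSPORT", 2),
    ]
    # Each key=value pair is terminated by a semicolon, as in the example file.
    return "".join(f"{key}={value};" for key, value in parts)
```

In a notebook you could then prompt for the password with `getpass.getpass()` from the standard library and pass the result as `pwd`, so the password is never saved to disk.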

Finally, run:

jupyter-notebook

in the directory containing your downloaded .py file, and you can begin accessing your Spotlight dataset from Jupyter Notebooks.
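
The downloaded file stops after `cursor.execute(...)`, so you still need to read the results. As a minimal sketch, the hypothetical helper below runs a query on any DB-API style cursor (such as the pyodbc cursor created in the file) and returns a capped number of rows as dictionaries keyed by column name. The `max_rows` default is an arbitrary choice for illustration.

```python
def fetch_as_dicts(cursor, query, max_rows=1000):
    """Run `query` and return up to `max_rows` rows as column-name dicts.

    Works with any DB-API 2.0 cursor; hypothetical helper for illustration.
    """
    cursor.execute(query)
    # cursor.description holds one tuple per column; the name comes first.
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchmany(max_rows)]
```

With the cursor from the downloaded file, `fetch_as_dicts(cursor, 'select * from sales_view')` would return the first rows of the example view. Capping the row count keeps a large dataset from being pulled into the notebook by accident.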

Download as CSV

When you use the 'Download as CSV' button on a dataset, Spotlight will begin preparing a CSV version of the dataset's current data.

A snackbar message appears at the bottom of your screen while the download is being prepared, followed by a second snackbar message with a download link once it is ready. The message with the download link remains visible until you click the link, close the message, or close your Spotlight window.

CSV downloads are limited to 100,000 rows by default, though this limit can be changed by your Spotlight administrator. If you attempt to download a dataset larger than the configured limit, you will receive a message that the dataset is too large and no download link will be generated.
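
If you already have a query connection to the dataset, such as the pyodbc cursor from the Jupyter section, you can check the row count before requesting a CSV. This is a sketch under assumptions: `exceeds_csv_limit` is a hypothetical helper, it assumes the dataset accepts a `select count(*)` query, and the limit configured on your Spotlight instance may differ from the default used here.

```python
DEFAULT_CSV_ROW_LIMIT = 100_000  # documented default; an administrator may change it


def exceeds_csv_limit(cursor, table_name, limit=DEFAULT_CSV_ROW_LIMIT):
    """Return True if `table_name` has more rows than `limit`.

    Works with any DB-API 2.0 cursor; hypothetical helper for illustration.
    """
    # The name is interpolated into SQL, so accept plain identifiers only.
    if not table_name.isidentifier():
        raise ValueError(f"unexpected table name: {table_name!r}")
    cursor.execute(f"select count(*) from {table_name}")
    row_count = cursor.fetchone()[0]
    return row_count > limit
```

For example, `exceeds_csv_limit(cursor, "sales_view")` returning `True` would suggest that the CSV download will be refused unless the limit has been raised.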