Concepts of Spotlight

The following topics should help you understand how Spotlight interacts with other systems and how it organizes your work.

Assets#

In Spotlight, asset is a general term covering Workspaces, datasets, documents, connections and other resources you might use for making analytic decisions. Almost everything you interact with in Spotlight is an asset. Each different asset type has unique features to help you get your work done in Spotlight. All assets also share common features to make them easier to discover and monitor.

Common asset features#

Name

An asset name can be up to 100 characters long. Names cannot start with a space or underscore (_), and cannot contain the backtick character (`). Note that an asset name does not need to be unique.
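The naming rules above can be expressed as a short check. This is a minimal sketch for illustration only, not Spotlight's own validation code; the function name `validate_asset_name` is hypothetical, and the sketch does not check for empty names because the text does not say whether they are allowed.

```python
MAX_NAME_LENGTH = 100  # limit stated in the text above


def validate_asset_name(name: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name is longer than {MAX_NAME_LENGTH} characters")
    if name.startswith((" ", "_")):
        problems.append("name starts with a space or underscore")
    if "`" in name:
        problems.append("name contains a backtick")
    # Uniqueness is deliberately not checked: names do not need to be unique.
    return problems
```

For example, `validate_asset_name("Quarterly sales")` returns an empty list, while `validate_asset_name("_tmp`x")` reports two problems.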

Description

The description can be up to 400 characters long.

Tags

Tags help you quickly identify, search for, and categorize assets in Spotlight. A tag can be up to 80 characters long and cannot contain spaces. Clicking a tag launches a search for all assets in Spotlight that carry that tag.
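The description and tag limits can be checked in the same spirit. The sketch below is an assumption-laden illustration, not part of Spotlight: the function name `check_description_and_tags` is hypothetical, and only the two stated limits (400 characters for a description, 80 characters and no spaces for a tag) are enforced.

```python
MAX_DESCRIPTION_LENGTH = 400  # limit stated for descriptions
MAX_TAG_LENGTH = 80           # limit stated for tags


def check_description_and_tags(description: str, tags: list[str]) -> list[str]:
    """Return a list of rule violations; an empty list means everything passes."""
    problems = []
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description is longer than 400 characters")
    for tag in tags:
        if len(tag) > MAX_TAG_LENGTH:
            problems.append(f"tag {tag[:20]!r} is longer than 80 characters")
        if " " in tag:
            problems.append(f"tag {tag!r} contains a space")
    return problems
```

A tag such as `q3_report` passes, while `q3 report` is rejected because of the space.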

Properties

Properties are custom metadata fields defined by your Spotlight administrator to help you capture the state and features of an asset as structured data. Common properties include the person responsible for an asset's maintenance, how often the asset is supposed to be updated, and whether it has been trusted by a data steward. Properties make assets easier to discover through search (see Find assets: Find by other features).

Owner

The owner is generally the person who added an asset to Spotlight. The owner can change basic information about the asset, such as its name and description, delete the asset (Connections and Datasets from databases or data warehouses cannot be deleted), and transfer ownership (user-uploaded files and the user-specific connections that house them cannot be transferred).

Spotlight Administrators have owner-level permissions to all assets. This includes the ability to transfer ownership.
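The ownership rules can be summarized as a small decision function. This is a sketch under stated assumptions rather than Spotlight's permission model: the `Asset` class, its field names, and the `kind` labels are all hypothetical, and the delete rule assumes that the restriction applies to connections and datasets that come from a database or data warehouse.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    owner: str
    kind: str                    # hypothetical labels: "workspace", "dataset", "connection", ...
    from_database: bool = False  # hosted on a database or data warehouse
    user_upload: bool = False    # an uploaded file, or the user-specific connection housing it


def has_owner_rights(user: str, asset: Asset, is_admin: bool = False) -> bool:
    # Administrators have owner-level permissions on every asset.
    return is_admin or user == asset.owner


def can_delete(user: str, asset: Asset, is_admin: bool = False) -> bool:
    if not has_owner_rights(user, asset, is_admin):
        return False
    # Assumed reading: database-backed connections and datasets cannot be deleted.
    return not (asset.kind in {"connection", "dataset"} and asset.from_database)


def can_transfer(user: str, asset: Asset, is_admin: bool = False) -> bool:
    if not has_owner_rights(user, asset, is_admin):
        return False
    # User uploads and their user-specific connections cannot change owner.
    return not asset.user_upload
```

Under these assumptions, the owner of a warehouse table can transfer it but not delete it, while the owner of an uploaded file can delete it but not transfer it.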

Follow/Un-follow

Follow an asset to have any changes to it included in your Activities panel. Assets you are following display the Follow button in its followed state on their detail pages and in search results. Assets you are not following show the Follow button in its un-followed state instead. Click the button to toggle between following and not following the asset.

Datasets#

Datasets are the data assets in Spotlight. With a dataset you can preview your data, cache it inside Spotlight, discuss it with colleagues, and open it in external tools for further analysis. Datasets can be added to Spotlight or created inside a Workspace's Workbench.

You edit or transform data in Spotlight by creating a Workspace, adding a reference to your data in the Workspace, and then using the Workbench to create a new dataset from that reference. See "What is a reference?" for more on the difference between a reference and a dataset.

Workspaces#

Workspaces are Spotlight's collaborative project areas. Add references to datasets and documents that you need for your project. Use the attached Workbench to combine, filter, and otherwise transform your referenced data into new datasets.

Use the Flow area to trace the lineage of your datasets and quickly show details about the referenced assets that provide data to them. Document with comments, tags, and the description field so that your work is more discoverable and you can easily distinguish between related projects.

Workbench#

The Workbench lets you view data from the references and datasets in your Workspace, then use that data to create new datasets. New datasets can be edited with operations, used elsewhere in Spotlight, or opened in external tools like Tableau or Jupyter for further analysis.

Three datasets pull data from a single reference in the Flow area of this Workbench.

Visibility#

Each asset has a Visibility setting that controls who can see it in Spotlight. This includes who can find an asset in Spotlight's search and who can visit its detail page to view related metadata. Visibility does not control who can access the contents of assets. Connected systems (databases, data warehouses, S3, etc.) have their own permission systems to control access to the content of the datasets and documents they host. To help distinguish between the two permissions, we refer to the setting controlled by Spotlight as "metadata visibility".

The two permissions are independent of each other. It is possible to have metadata visibility permission for an asset in Spotlight (for instance, permission to view the entry for a table on a database) but not have permission on the connected system to access the contents of that asset. It is also possible, though less common, to have permission to access the content of an asset but not have metadata visibility permission to see it in Spotlight.
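Because the two permissions are independent, there are four possible combinations. The sketch below simply enumerates them; the function name and the wording of each outcome are illustrative assumptions, not Spotlight behavior or messages.

```python
def what_user_experiences(metadata_visible: bool, content_access: bool) -> str:
    """Describe the outcome for one user and one asset.

    metadata_visible: granted by the asset's Visibility setting in Spotlight.
    content_access:   granted by the connected system that hosts the data.
    """
    if metadata_visible and content_access:
        return "can find the asset in Spotlight and open its contents"
    if metadata_visible:
        return "can find the asset and view its metadata, but cannot open its contents"
    if content_access:
        return "can open the contents on the connected system, but cannot see the asset in Spotlight"
    return "cannot see the asset in Spotlight or open its contents"
```

Neither flag is derived from the other, which is the point of the distinction: changing Visibility in Spotlight never grants or revokes access on the connected system.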

Home page#

The home page is your discovery and navigation center as well as where you add assets to Spotlight. You can return to it from anywhere in Spotlight by clicking the Spotlight logo in the top corner of the screen. See Browse Spotlight for a tour of the home page.

Introspection#

To help you locate your data wherever it lives, Spotlight indexes basic metadata from all of your organization's connected data source systems. This process is called "Introspection". It collects the names of all tables and columns that Spotlight can access. Spotlight keeps this inventory of metadata updated as resources change and new systems are connected.

Only table-based systems like databases and data warehouses are introspected. File-based systems (Datameer, S3, and your local computer when uploading files) are not introspected.
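To make the idea concrete, the sketch below collects table and column names from a database without reading any row data. It uses Python's built-in `sqlite3` module as a stand-in for a connected system; how Spotlight actually performs introspection is not described here, and the `introspect` function and the sample tables are assumptions made for illustration.

```python
import sqlite3


def introspect(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Return {table_name: [column_name, ...]} using only catalog metadata."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    inventory = {}
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        inventory[table] = [column[1] for column in columns]
    return inventory


# Hypothetical sample schema, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
print(introspect(conn))
# {'customers': ['id', 'name'], 'orders': ['id', 'customer', 'total']}
```

A file-based source has no comparable catalog of tables and columns to query, which is consistent with such systems not being introspected.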

Virtualization#

When you connect databases and data warehouses to Spotlight, Spotlight creates a virtual representation of the data assets they contain. Spotlight acts as a live connection to these data assets and does not move or copy the data itself. You can discover, combine, and enrich these virtual assets inside the unified Spotlight interface, then connect to them through your BI tools of choice to perform analysis.

Virtualization has several key advantages.

  • BI queries against a virtual dataset inherit any Spotlight data prep done on it, making queries more specific and minimizing data traffic out of your systems.
  • Compute is pushed down to your data systems where possible, enabling better optimization and load management.
  • Your data systems authenticate users with each Spotlight query, so your existing access control mechanisms retain control over your data.
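A SQL view is a reasonable analogy for a virtual dataset: it stores a query definition rather than a copy of the rows, and any query against it inherits the filtering already defined. The sketch below uses Python's built-in `sqlite3` module to show that behavior. It is an analogy only and does not describe how Spotlight implements virtualization; the table, view, and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL, year INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 100.0, 2023), ("west", 250.0, 2023), ("east", 75.0, 2022)],
)

# "Data prep": the view holds only a query definition; no rows are copied.
conn.execute(
    "CREATE VIEW sales_2023 AS SELECT region, amount FROM sales WHERE year = 2023"
)


def total_by_region(connection: sqlite3.Connection) -> dict[str, float]:
    """A 'BI query' against the view; the year filter is applied by the database."""
    rows = connection.execute(
        "SELECT region, SUM(amount) FROM sales_2023 GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


print(total_by_region(conn))  # {'east': 100.0, 'west': 250.0}

# The view is live: new source rows appear without refreshing any copy.
conn.execute("INSERT INTO sales VALUES ('west', 50.0, 2023)")
print(total_by_region(conn))  # {'east': 100.0, 'west': 300.0}
```

The filtering and aggregation both run inside the database, so only the two summary rows leave the system, which mirrors the first two advantages listed above.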