Workbench

The Workbench lets you view data from the references and datasets in your Workspace, create new datasets from that referenced data, and customize your new datasets using operations.

Access the Workbench from the Workspace detail page by clicking the 'Open Workbench' button or by double-clicking any dataset node in your Workspace's Flow area. To go back to the Workspace detail page, click the 'Return to Workspace' button or the Workspace's name in the navigation breadcrumbs at the very top of the screen.

Create new dataset#

From the Workbench you can create new datasets from the data in any of the references or other datasets in your Workspace. See "What is a reference" for details.

  1. Click on an item in the Flow area to make it your active item. This can be a reference to data elsewhere in Neebo or another dataset you have already created in this Workspace. Just make sure it has data you want to include in your new dataset.
  2. Your active item will be colored green and have a 'New dataset' button next to it. Clicking this green button feeds the data from your active item into a newly created dataset in this Workspace. You can also right-click an item in the Flow and select the "Create Dataset" option from its context menu.
  3. Your new dataset is ready! It can be edited with operations in the asset toolbox, used elsewhere in Neebo, or opened in external tools for further analysis.

Your newly created dataset will become your active item and you will see an arrow from the flow item that is feeding data to your dataset. Neebo names new datasets using the name of this parent item followed by a number starting at 2 and increasing by 1 for each additional dataset created from the same parent (see figure in Flow area below).
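The naming rule above can be sketched in a few lines of Python. This is only an illustration, not Neebo's implementation: the function name `next_dataset_name`, the space separator, and the choice to pick the lowest unused number are all assumptions, since the text only states that the counter starts at 2 and increases by 1.

```python
from typing import Iterable


def next_dataset_name(parent_name: str, existing_names: Iterable[str], sep: str = " ") -> str:
    """Return the parent's name followed by the next free number.

    Hypothetical helper: the separator and the "lowest unused number"
    strategy are assumptions made for this sketch.
    """
    taken = set(existing_names)
    counter = 2  # the text states that numbering starts at 2
    while f"{parent_name}{sep}{counter}" in taken:
        counter += 1
    return f"{parent_name}{sep}{counter}"


# Create three datasets from the same parent item.
names = []
for _ in range(3):
    names.append(next_dataset_name("Orders", names))
print(names)  # ['Orders 2', 'Orders 3', 'Orders 4']
```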

Add or edit operations#

The Workbench is made up of three areas: Flow on the top, data sample on the bottom, and an asset toolbox on the right side.

Flow#

The Workbench shows how data moves between the different references and datasets in your Workspace. This is the same Flow area as on the Workspace overview page.

Click on any item in the Flow area to make it your active item. Your active item will be highlighted in green and have a green 'New dataset' button next to it, which lets you feed its data into a newly created dataset in this Workspace. All items in the Flow that feed data into or pull data from your active item are colored dark gray. Right-click on any item to open its context menu.


Three datasets pull data from a single reference in the Flow area of this Workbench.
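The highlighting described in this section can be modeled as a small directed graph. The sketch below is illustrative only: the `highlight` function, the `(source, dataset)` edge representation, the item names, and the `"default"` color label are hypothetical, and it assumes that only items directly connected to the active item are colored dark gray.

```python
from typing import Dict, List, Tuple

# (item that feeds data, dataset that pulls from it)
Edge = Tuple[str, str]


def highlight(edges: List[Edge], active: str) -> Dict[str, str]:
    """Return a color for every item in the Flow, given the active item."""
    colors: Dict[str, str] = {}
    for src, dst in edges:
        colors.setdefault(src, "default")
        colors.setdefault(dst, "default")
    for src, dst in edges:
        if dst == active:
            colors[src] = "dark gray"  # feeds data into the active item
        if src == active:
            colors[dst] = "dark gray"  # pulls data from the active item
    colors[active] = "green"
    return colors


# Three datasets pull data from a single reference, as in the figure.
flow = [("Orders", "Orders 2"), ("Orders", "Orders 3"), ("Orders", "Orders 4")]
print(highlight(flow, "Orders 2"))
```

With "Orders 2" active, the reference "Orders" is dark gray because it feeds the active item, while the sibling datasets keep the default color.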

Use the zoom control buttons in the corner of the Flow to zoom in or out, or to scale the area to fit. Move your view area by clicking anywhere in the Flow and dragging in any direction.

Data sample#

The data sample includes the first 1000 rows of data as well as some column metrics that Neebo has calculated for the sample (see "Column metrics" below). Scroll the sample up and down or left and right to view all rows and columns. To resize the sample area, click the resize handle at the bottom of the Flow area and drag it up or down.

Column metrics#

Column metrics provide a graphic summary of the data schema in a dataset. They are shown by default, but can be hidden by clicking the Collapse column metrics button to the left of the column metrics. If a dataset is close to the system size limit or if system resources are being heavily used, Neebo may be temporarily unable to generate the column metrics display.

Click on the "more..." link to open a larger window displaying all values for columns with a cardinality ("Uniques") of less than 20. For date and number values, the histogram shows either the start and end dates or the minimum, average, and maximum values, respectively. Hover over the horizontal indicator on each column to see the total number of records in the dataset ("valid"), the number of "empty" records, and the number of "unique" record values. Note that when more than 90% of a column's values are unique, no chart is displayed.
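As a rough illustration of how these figures relate, the following sketch computes the three counts and the two display rules for a single column. It is not Neebo's implementation: the function name `column_metrics`, the treatment of `None` and empty strings as "empty", and the use of non-empty records as the denominator for the 90% rule are assumptions.

```python
from typing import Any, Dict, Sequence


def column_metrics(values: Sequence[Any]) -> Dict[str, Any]:
    """Summarize one column of a data sample (hypothetical helper)."""
    non_empty = [v for v in values if v is not None and v != ""]
    unique = len(set(non_empty))
    # Assumption: uniqueness is measured against non-empty records.
    unique_ratio = unique / len(non_empty) if non_empty else 0.0
    return {
        "valid": len(values),  # total records, as the text describes it
        "empty": len(values) - len(non_empty),
        "unique": unique,
        "list_all_values": unique < 20,     # "more..." window shows every value
        "show_chart": unique_ratio <= 0.9,  # no chart above 90% unique
    }


status = column_metrics(["open", "closed", "open", "", None, "pending"])
ids = column_metrics(list(range(100)))
print(status)
print(ids)
```

In this example the `status` column has 3 unique values out of 4 non-empty records, so it gets a chart and a full value list, while the `ids` column is 100% unique and gets neither.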

The initial display reflects the data sample, but you can right-click in the column metrics area and select "Load Full Metrics" from the context menu to update the display to reflect the full record set. Note that if the dataset is small or the confidence score is 100%, the full and sample graphics will be the same.
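To see why the sample and full displays match for small datasets, consider this minimal sketch, which compares a metric computed on the first 1000 rows with the same metric over every row. The names `SAMPLE_ROWS`, `unique_count`, and `sample_vs_full` are hypothetical; only the 1000-row sample size comes from the text.

```python
from itertools import islice
from typing import Hashable, List, Tuple

SAMPLE_ROWS = 1000  # sample size stated in the text


def unique_count(values: List[Hashable]) -> int:
    """Number of distinct values in a column."""
    return len(set(values))


def sample_vs_full(column: List[Hashable]) -> Tuple[int, int]:
    """Return (unique count in the sample, unique count in the full column)."""
    sample = list(islice(column, SAMPLE_ROWS))
    return unique_count(sample), unique_count(column)


small = list(range(500))                 # fits entirely in the sample
large = [i % 1500 for i in range(3000)]  # sample misses some values
print(sample_vs_full(small))  # (500, 500)
print(sample_vs_full(large))  # (1000, 1500)
```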

Asset toolbox#

The asset toolbox contains information about your active item and any additional actions you can perform on it, including adding/editing operations and configuring caching.

If your active item was created in this Workspace, it is a dataset, and you can edit and configure it in the asset toolbox. Everything else in the Flow is a reference and cannot be modified here. See "What is a reference" for details.

Rename your dataset by clicking the name in the top of the asset toolbox and entering the new name in the field that appears.

Below the dataset's name you will see information about where the data for this dataset ultimately originated as well as a list of any downstream datasets in this Workspace.

All operations applied to create the dataset are listed at the bottom of the toolbox, underneath an 'Add operation' button for adding further operations. Click on any operation to view or edit its details.

Configure caching for the active dataset underneath its name. See "Datasets: Cache" for details.