Skip to content

Publishing Data to Snowflake

After transforming your data, you can publish both a view or a table to your Snowflake instance.

A view that was created in Snowflake is a 'CREATE VIEW' view and therefore is not materialized and not a table. Materialized views have limitations (including not supporting joins). Views in Snowflake are a point-in-time of the view in Datameer. They are not auto-updated when the user modifies the Datameer view.

The following applies:

  • The location chosen for the view or table is an existing Database and Schema. The possible Databases and Schemas are configured on the 'Integration' page for your Snowflake system. You are not allowed to create a new Database and/or Schema at this time.
  • Published views and tables appear in both Snowflake and Datameer immediately. These views can be shared across other Projects as well.
  • Views: A view is a result of a table that can be accessed as if it was a table. The performance is slower than tables and the results are not stored for the future use. Views may join several tables into a single result set.
  • Tables: The results are stored, almost as though the results were a table. This allows faster access, but requires storage space and active maintenance.
  • If a referring dataset becomes out of sync with the published view/ table in Snowflake due to changes in the Project, the asset is marked and can be re-published.
  • Overwriting an existing view/ table is not allowed. To do so, you have to republish. Otherwise the view/ table must be renamed.

Publishing Views to Snowflake#

To publish a view to Snowflake:

  1. In the Project's Workbench, click on the "Publish" button on top of the Flow Area. The 'Publish to Snowflake' dialog opens.

  2. Select "View".

  3. (Optional) Rename the view.

  4. Select the target destination from the list below and confirm with "Publish Data". Publishing to Snowflake proceeds. The highlighted published view appears in the Flow Area. Datasets that are published to Snowflake maintain a visual link with the original dataset used to publish it.

Publishing Materialized Tables to Snowflake#

To publish a materialized table:

  1. In the Project's Workbench, click on the "Publish" button on top of the Flow Area. The 'Publish to Snowflake' dialog opens.

  2. Select "Table".

  3. (Optional) Rename the view.

  4. Select the target destination from the list below and confirm with "Publish Data". Publishing to Snowflake proceeds. The highlighted published table appears in the Flow Area. Datasets that are published to Snowflake maintain a visual link with the original dataset used to publish it.

Scheduling Publishing Materialized Tables#

Scheduling the materialization from a Project allows you to keep your target Snowflake table up to date. Scheduling is possible for daily, weekly and customized publishing and is accessable and configurable in the 'Schedule' tab in the Inspector.

Requirements

To create a Snowflake task, a default warehouse is needed. If you don't have set up one, set up the default warehouse:

ALTER USER your_user SET DEFAULT_WAREHOUSE = your_warehouse;

The role that uses this warehouse needs the privileges 'EXECUTE' and 'MANAGE EXECUTE TASK':

GRANT EXECUTE TASK ON ACCOUNT TO ROLE your_role;
GRANT EXECUTE MANAGED TASK ON ACCOUNT TO ROLE your_role;

To configure scheduling:

  1. In the Inspector, switch to the "Schedule" tab.

  2. Click on the "Enable Scheduler" switch. The configuration options appear below.

  3. From the drop-down, select the repeat interval. You can select between 'Daily', 'Weekly' and 'Custom'. The configuration options adapt to the repeat interval.

  4. Adjust the configuration to your needs, e.g. when selecting 'Weekly', mark the days and set the time, and confirm with "Schedule". The configuration is displayed in the 'Schedule' tab.

Viewing the Publishing History#

The publishing history provides information for all published nodes, such as a history timestamp or status.

You have two options to open the publishing information:

  • from the Data Grid, click on the "Show Publish History" button

  • right-click on the published node and select "Show Publish History"

You can investigate the following information in reverse chronological order:

  • timestamp of the start date and time
  • trigger name
  • number of records
  • publishing duration
  • timestamp of the end date and time
  • warehouse
  • error status and error details, if error case

You can filter the publish history according to:

  • scheduled assets
  • manual published assets
  • published assets with error state

Detecting Non-Synchronized Views/ Tables#

You can simply detect when a view or table that has been published, is no longer in synchronization with the one from which the published view or table is based on. Non-synchronized views and tables are highlighted with a red border. The visual warning will persist until the view/ table is re-published or until the changes are reverted.

You can investigate the following information for the view/ table:

  • last publish date
  • last change date to any of the views to generate that view/ table

You can also re-publish the view/ table to make sure both views the one in your Datameer and the one in Snowflake instance are synchronized. Once you re-published the view/ table, the visual indicator that the published view/ table is no longer in sync and warning message disappears.