Sample Use Case
This page walks through a sample use case to show the basics of how Datameer works.
You will be guided through:
- Logging into Datameer
- Creating a new Project
- Adding tables to a Project
- Performing a Join operation
- Performing an Aggregate operation
- Publishing the View to Snowflake
Logging into Datameer
First, we log in to access our Datameer instance. We open a browser (Google Chrome in our case) and enter our Datameer URL. We then enter the "Username" and "Password" and confirm with "Log In".
Creating a New Project and Adding Tables to the Project
Since this is our first time using Datameer, the next step is to create a Project, in which we will add our Snowflake datasets and schemas and perform our transformations. Usually, after logging in, you are redirected to the Project Overview page, where you click the '+ NEW PROJECT' button to create a new Project. On first use, however, Datameer automatically creates a new Project for us.
In our sample, we only have one Snowflake schema: 'SNOWFLAKE_SAMPLE_DATA.TPCH_SF1'. Clicking on it opens the overview of all available datasets.
We select both Snowflake sources, 'SUPPLIER' and 'NATION', and confirm with "Add to Project". Our first Project, named 'New Project' by default, is now created. We are taken to the Project Workbench, where we find our two datasets in the flow area.
We could quickly change the Project's name by clicking on the "Edit Project Name" button next to the Project's name, but for our sample, we leave the name as it is.
What we now see is the created Project page, with the two Snowflake sources in the flow area in the middle. Below it are the scrollable columns, which appear when we click on one of our sources in the flow area. On the left side we have the 'Sources Tree', and on the right the tab containing the information for a highlighted source (or, later on, for Datameer views and published views). Let's now perform our first transformation.
Performing the Join Operation
Our goal in the first part of this sample use case is to join both source datasets by their 'NATIONKEY' columns.
First, we click on the "+" icon of the 'NATION' source and select the light data preparation operation "Join". The 'Join Configuration' view opens, with the configuration tab on the right side.
Because we started from 'NATION', it is already selected as the first source and appears on the left 'Sources' side. Next, we select the second source from the dropdown on the right side. The dropdown offers the 'SUPPLIER' dataset, and we select it.
Below the flow area, we now get suggestions for which columns fit the join best. In our case we want to join the 'N_NATIONKEY' column with the 'S_NATIONKEY' column. This pairing is shown with a fit of 56%, which makes it the best match.
Now we select the join mode. In our example we want to perform an 'Inner Join', which returns only the records whose key value appears in both 'N_NATIONKEY' and 'S_NATIONKEY'.
Afterwards we have two options for selecting our columns: we can pick one of the suggestions below the flow area, or select the columns from the dropdown in the configuration tab. We choose the first option and click on "+ Use Suggested Columns". After a few moments, the join operation is executed and we see a preview of the join result. To finish the join operation, we confirm with "Create Join".
What we now see in our flow area are the two original Snowflake sources and the new Datameer view, 'NATION 2', connected by arrow lines. The new view is shown as a highlighted square, which means that an operation has been applied to it. The counter in the 'NATION 2' node indicates that one operation has been applied so far.
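To make the effect of the 'Inner Join' concrete, here is a minimal sketch that reproduces the same kind of join outside Datameer, using Python's built-in `sqlite3` module. The table and column names follow the sample, but the rows are invented for illustration, and the SQL is our own assumption rather than the statement Datameer generates.

```python
import sqlite3

# Hypothetical miniature stand-ins for the two sources; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NATION (N_NATIONKEY INTEGER, N_NAME TEXT);
    CREATE TABLE SUPPLIER (S_SUPPKEY INTEGER, S_NATIONKEY INTEGER, S_ACCTBAL REAL);
    INSERT INTO NATION VALUES (0, 'ALGERIA'), (1, 'ARGENTINA'), (2, 'BRAZIL');
    INSERT INTO SUPPLIER VALUES
        (10, 0, 100.0), (11, 0, 50.5), (12, 2, 20.0), (13, 99, 7.0);
""")

# Inner join: only rows whose key value exists on both sides are kept.
joined = conn.execute("""
    SELECT n.N_NAME, s.S_SUPPKEY, s.S_ACCTBAL
    FROM NATION AS n
    INNER JOIN SUPPLIER AS s ON n.N_NATIONKEY = s.S_NATIONKEY
    ORDER BY s.S_SUPPKEY
""").fetchall()

for row in joined:
    print(row)
```

In this sketch, 'ARGENTINA' has no supplier and supplier 13 points to a nation key that does not exist, so neither shows up in the result.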
Performing the Aggregate Operation
After joining the two sources, we want to perform another transformation. We want to aggregate the new Datameer view, grouping by the 'N_NAME' column and using the account balance column 'S_ACCTBAL' as the measure.
There are two ways to apply the operation. Which one you choose may depend on how many operations you want to apply in total.
Option 1 - Creating a new view based on an existing view
We start from the new 'NATION 2' view and click on the "+" and select "Aggregate". The 'Aggregate' view opens and on the left side we can see our columns.
Next, we click on the "+" next to 'Group Bys' and select the column we want to group by. In our example, we mark the 'N_NAME' entry and confirm with "Apply".
Then we click on the "+" next to 'Measures', mark the 'S_ACCTBAL' entry and confirm with "Apply".
We can now finish the aggregate operation and confirm with "Create".
The result of the aggregation looks as follows: in our flow area we have the two original Snowflake sources and the joined Datameer view we created first. The new highlighted square named 'NATION 2 2' is the new aggregate Datameer view. We can rename the new view later in the view's details, but for now we leave it as it is. The counter in the 'NATION 2 2' node indicates that only one operation has been applied to the view. The connecting arrow line illustrates that this new Datameer view is based on the former Datameer view.
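The aggregation can be sketched the same way. The example below groups a small, invented stand-in for the joined view by 'N_NAME' and aggregates 'S_ACCTBAL'. The walkthrough does not say which aggregate function is applied to the measure, so `SUM` is an assumption here, as are the table name `NATION_2` and the output column `TOTAL_ACCTBAL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the joined view; rows are invented.
    CREATE TABLE NATION_2 (N_NAME TEXT, S_ACCTBAL REAL);
    INSERT INTO NATION_2 VALUES
        ('ALGERIA', 100.0), ('ALGERIA', 50.5), ('BRAZIL', 20.0);
""")

# One output row per group; SUM is an assumed choice of aggregate function.
totals = conn.execute("""
    SELECT N_NAME, SUM(S_ACCTBAL) AS TOTAL_ACCTBAL
    FROM NATION_2
    GROUP BY N_NAME
    ORDER BY N_NAME
""").fetchall()

for name, total in totals:
    print(name, total)
```

Each distinct 'N_NAME' value produces exactly one output row, which is why the aggregated view has fewer records than the joined view it is based on.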
Option 2 - Creating a new view by adding another operation to the recipe
The second option is to add the operation directly to the recipe. This requires that at least one operation has already been applied to the Datameer view.
We mark the 'NATION 2' view and click on "+ Add to Recipe". The operation overview opens and we select the "Aggregate" operation.
The 'Aggregate' view opens and we carry out the remaining steps as in Option 1.
As a result, the counter of our view 'NATION 2' has increased from '1' to '2'. Furthermore, we can see all applied operations in the operation stack on the right side as part of the transformation recipe. The operations are listed in the order in which they were performed, with the most recent operation at the bottom.
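One way to picture a recipe is as an ordered list of operations, each applied to the output of the previous one. The sketch below is a hypothetical model for illustration only and says nothing about how Datameer stores or executes recipes internally; the function names and sample rows are invented.

```python
# Hypothetical sample rows, invented for illustration.
nations = [
    {"N_NATIONKEY": 0, "N_NAME": "ALGERIA"},
    {"N_NATIONKEY": 2, "N_NAME": "BRAZIL"},
]
suppliers = [
    {"S_NATIONKEY": 0, "S_ACCTBAL": 100.0},
    {"S_NATIONKEY": 0, "S_ACCTBAL": 50.5},
    {"S_NATIONKEY": 2, "S_ACCTBAL": 20.0},
]


def join_with_supplier(rows):
    """Inner join of the incoming rows with the suppliers on the nation key."""
    return [
        {**n, **s}
        for n in rows
        for s in suppliers
        if n["N_NATIONKEY"] == s["S_NATIONKEY"]
    ]


def aggregate_by_name(rows):
    """Group by N_NAME and sum S_ACCTBAL (SUM is an assumed aggregate)."""
    totals = {}
    for r in rows:
        totals[r["N_NAME"]] = totals.get(r["N_NAME"], 0.0) + r["S_ACCTBAL"]
    return [{"N_NAME": k, "TOTAL_ACCTBAL": v} for k, v in totals.items()]


# The recipe lists operations in the order they were added; newest is last.
recipe = [("Join", join_with_supplier), ("Aggregate", aggregate_by_name)]


def run_recipe(source_rows, steps):
    """Apply every operation in order, feeding each result into the next."""
    rows = source_rows
    for _name, operation in steps:
        rows = operation(rows)
    return rows


result = run_recipe(nations, recipe)
operation_count = len(recipe)  # corresponds to the counter shown on the node
print(operation_count, result)
```

In this model, Option 2 appends a second entry to the same recipe, whereas Option 1 would start a new one-entry recipe whose input is the first view.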
Publishing the View to Snowflake
We are almost done. As the final step, we want to publish our Datameer view to Snowflake.
To do so, we click on "Publish" at the top of the flow area.
We can rename our view first and then select our Snowflake destination by clicking on "Choose Destination". The next dialog lists all available publishing targets we have in our Snowflake account.
We click on "SNOWFLAKE_SAMPLE_DATA.TPCH_SF1" and confirm with "Connect".
After a few moments, the publishing process is finished and we can see our published view in the flow area highlighted by a green border. The arrow line connects the view we created in Datameer with our published Snowflake view.
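Conceptually, a published view is a stored query over the source tables rather than a copy of the data. The sketch below illustrates that idea with `sqlite3`. The view name `PUBLISHED_VIEW`, the sample rows and the SQL itself are assumptions for illustration; the statement Datameer actually issues against Snowflake is not shown in this walkthrough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical miniature sources; rows are invented.
    CREATE TABLE NATION (N_NATIONKEY INTEGER, N_NAME TEXT);
    CREATE TABLE SUPPLIER (S_NATIONKEY INTEGER, S_ACCTBAL REAL);
    INSERT INTO NATION VALUES (0, 'ALGERIA'), (2, 'BRAZIL');
    INSERT INTO SUPPLIER VALUES (0, 100.0), (0, 50.5);

    -- The view stores the join and aggregate as a query, not as data.
    CREATE VIEW PUBLISHED_VIEW AS
        SELECT n.N_NAME, SUM(s.S_ACCTBAL) AS TOTAL_ACCTBAL
        FROM NATION AS n
        INNER JOIN SUPPLIER AS s ON n.N_NATIONKEY = s.S_NATIONKEY
        GROUP BY n.N_NAME;
""")

query = "SELECT N_NAME, TOTAL_ACCTBAL FROM PUBLISHED_VIEW ORDER BY N_NAME"
before = conn.execute(query).fetchall()

# A new source row is picked up the next time the view is queried.
conn.execute("INSERT INTO SUPPLIER VALUES (2, 20.0)")
after = conn.execute(query).fetchall()

print(before)
print(after)
```

Because the view is evaluated when it is queried, the second result includes the newly added supplier without the view having to be recreated.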
Congratulations! We made it.