Create a Dataset

Datasets let you write SQL to transform your data and create custom or calculated fields, even without direct access to Snowflake, so that marketers can build personalized campaigns on exactly the fields they need.

Interested in using Datasets?

Reach out to your account manager to get started.

Create a new dataset

Datasets define views in your underlying Snowflake database. These views are then made available for use in the Schema Builder.

  1. Click Schema Builder in the left-hand navigation bar.
  2. Click the Add New + button at the top of Connected Tables, then click Dataset.
  3. Choose the type of data in the dataset.
    1. Event Data: 1:many with a customer (e.g. page views, add-to-cart events)
    2. Property Data: 1:1 with a customer (e.g. birthday, LTV, orders placed)
    3. Lookup Data: data that is not directly tied to a customer (e.g. product catalog)
  4. Configure the dataset (note: this applies to both Event and Property datasets).
    1. Name the dataset. A database-friendly version of this name will appear in the Schema Builder as well as in Snowflake.
    2. Select Database Query as the source (note: in the future we will provide the ability to upload CSVs here).
    3. Select your Snowflake database.
    4. Click Start.
  5. You are now in the self-service SQL editor. You will see your database schema in the lower left-hand quadrant.
  6. Add a Description of the dataset if you'd like. This can be done at the top, right underneath the dataset name and configuration details.
  7. Write your query, then click Validate.

📘

FOR EVENT DATASETS ONLY:

If you're writing a query for Event Data, you must include a timestamp field. You will be asked to configure this later.
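
As an illustration, an Event Data query might look like the sketch below. The table `analytics.web.page_views` and every column name are hypothetical, so substitute objects from your own schema. The two things to notice are the timestamp column and the absence of a trailing semicolon, which this guide lists as a common validation error.

```sql
-- Hypothetical source table and columns; replace with your own.
SELECT
    email_address,                 -- identifier you can later pick as the join key
    page_url,
    viewed_at AS event_timestamp   -- timestamp field required for Event datasets
FROM analytics.web.page_views
WHERE viewed_at IS NOT NULL        -- skip rows that have no usable timestamp
```

Aliasing the timestamp to a descriptive name such as `event_timestamp` is optional, but it can make the field easier to find when you are asked for the Timestamp Key.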

  8. Clear up any validation or syntax errors if necessary (e.g. trailing semicolons or misspelled column names).
  9. Once validated, a sample of the data will appear. If the preview of the output matches your expectations, click Save Draft, then click Commit.
  10. After you commit your dataset, you will be asked to connect the Snowflake view you just created to Simon via the Schema Builder. If you skip this step, you will not be able to use any of the fields in the dataset in segmentation or personalization.
    1. The first section shows the Data Type (i.e. Events or Property), Table Location, and Table Name of the dataset you just created. These fields are not editable once a dataset has been committed and a view has been created in Snowflake. If you need to make a change here, you must first disconnect the query from the Schema Builder and create the dataset again with the proper configuration.

📘

FOR EVENT DATASETS ONLY:

You will be asked to specify the Timestamp Key here. Select the timestamp field you included in your query from the dropdown.

📘

FOR LOOKUP TABLES ONLY:

Because lookup tables don't contain PII, marketing channel identifiers, or any other data that's directly tied to your customers, you need to configure a Lookup Key rather than a join key.

Once you've committed your lookup query, you will be redirected so you can configure your lookup key. This is the primary key for your table. For example, if you've created a product catalog, your lookup key is most likely a product ID column such as Product_ID.

After choosing your lookup key, click Save. You will be redirected to the Schema Builder, where you can see your newly connected or created lookup tables in the Lookup Tables section of the page.
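
Because the lookup key acts as a primary key, it helps if the query returns one row per key value. The sketch below shows one way to do that for a product catalog. It assumes a hypothetical `catalog.public.products` table with an `updated_at` column, and it assumes that keeping the most recently updated row is the right tie-break for your data.

```sql
-- Hypothetical table and columns; replace with your own.
SELECT
    product_id,      -- candidate lookup key
    product_name,
    category,
    price
FROM catalog.public.products
-- Keep only the most recently updated row for each product_id
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY product_id
    ORDER BY updated_at DESC
) = 1
```

`QUALIFY` is Snowflake syntax for filtering on a window function. If your source table already has exactly one row per product, the plain `SELECT` without the `QUALIFY` clause is enough.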

  11. Choose your join key. This is how Simon will join the data in the new Snowflake view to the Identity table powering your Simon instance.
    1. The identifiers you select on each side must represent the same thing. For example, if you select email as the join key in your identity table, you must also select email in your new table. It's okay if the column names are different (e.g. email versus email_address), but the underlying values must match.
    2. For Property Datasets, each unique identifier value is expected to have exactly one row (e.g. one email_address should have only one first_name). For Event Datasets, the identifier tells Simon how each event relates to a contact's profile (e.g. email_address1 has 5 page_views).
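
To make the one-row-per-identifier expectation for Property Datasets concrete, here is a minimal sketch that aggregates order rows down to a single row per email address. The table `sales.public.orders` and its columns are hypothetical, so adjust them to match your schema.

```sql
-- Hypothetical table and columns; replace with your own.
SELECT
    email_address,                      -- join key: one output row per value
    COUNT(*)         AS orders_placed,
    SUM(order_total) AS lifetime_value
FROM sales.public.orders
WHERE email_address IS NOT NULL         -- rows without an identifier cannot be joined
GROUP BY email_address
```

Grouping by the join key means each email address appears once in the output, which is the shape a Property Dataset expects. The same source rows, selected without the `GROUP BY`, would instead be shaped like an Event Dataset (many rows per contact).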

📘

FOR PROPERTY DATASETS ONLY:

After choosing a join key, you will be asked to select the fields (if any) you'd like to be made available for use in content personalization.

For example, you might want to include the value of a field such as LOYALTY_POINT_BALANCE in emails to your customers.

  12. Click Save.
  13. You will be redirected to the Schema Builder, where you can see your newly created and connected query at the bottom of the Connected Tables section!

📘

On a Managed deployment?

If you have a Managed Simon Data deployment, your account is limited to 50 Simon credits generated by the Datasets product. If you require more usage, talk to your account manager.