Openflow - Google Sheets
The goal of this tutorial is to load data from a Google Sheet into Snowflake via Openflow.
Video
The video is still in development.
Requirements
- You can NOT be on a trial account. (Link)
- Your Snowflake account has to be in an AWS region. (Link)
- A Google Cloud account; you can set up a free account to get started.
Snowflake
Let's start the Snowflake setup by going into a worksheet (1) and creating the necessary objects for Openflow and the connector.
If you don't have a database, schema, or warehouse yet, create them first.
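Below is a minimal sketch of what those statements could look like; the names OPENFLOW_DB, SHEETS, and OPENFLOW_WH are placeholders, so swap in your own.

```sql
-- Placeholder object names: use whatever database, schema, and warehouse fit your environment.
CREATE DATABASE IF NOT EXISTS OPENFLOW_DB;
CREATE SCHEMA IF NOT EXISTS OPENFLOW_DB.SHEETS;
CREATE WAREHOUSE IF NOT EXISTS OPENFLOW_WH
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;
```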
Only required if you're hosting Openflow in Snowflake (SPCS)
Let's create the network rule and external access integration that will allow Openflow/Snowflake to talk with Google Sheets.
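As a rough sketch, the worksheet statements could look like the following; the rule and integration names are placeholders, and the host list is an assumption about the Google endpoints the connector needs to reach.

```sql
-- Placeholder names; the hosts listed are an assumption for the Google Sheets API endpoints.
CREATE OR REPLACE NETWORK RULE google_sheets_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('sheets.googleapis.com', 'oauth2.googleapis.com', 'www.googleapis.com');

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION google_sheets_access_integration
  ALLOWED_NETWORK_RULES = (google_sheets_network_rule)
  ENABLED = TRUE;
```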
Once run, the worksheet should report "Statement executed successfully."
Openflow
In this tutorial we assume you already have a deployment. If you don't, it only takes a few minutes; please follow one of these three options.
- SPCS, also known as Snowflake hosted. (Link)
- AWS with a Snowflake-managed VPC. (Link)
- AWS self-hosted VPC. (Link)
Runtime
Inside Openflow, click "Runtimes".

Select your deployment, give your runtime a name, and select ACCOUNTADMIN as the role. Scroll down to external access.

Only required if you're hosting Openflow in Snowflake (SPCS)
Select the external access integration we created in our worksheet and finally click "Create".

Now your runtime will start being created. Let's head to the connectors.

Install Connector
On the connectors list, install the Google Sheets connector.

Select the runtime to install the connector into; you may have to wait for the runtime to finalize.

Once selected, click "Add". The connector will be added and you'll be required to log in to get into the underlying container.

If you get the error 'Invalid consent request' or 'TOTP Invalid'
You will have to change your default role to a role that is not an admin role; for example, a default of PUBLIC would work.
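If you prefer to change it from a worksheet, a minimal sketch (assuming a hypothetical username MY_USER) would be:

```sql
-- MY_USER is a placeholder; this sets the default role to PUBLIC so the login flow works.
ALTER USER MY_USER SET DEFAULT_ROLE = PUBLIC;
```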

Now you should see the Openflow canvas with the Google Sheets connector block. We will switch to Google Sheets here to get some information to set up our connector.

Google Sheets
Note
To set up Google, make sure you have Super Admin permissions; we will need these to create a service account.
Ensure that you are in a project associated with your organization, not a project outside your organization.

Organization Policy
Search "Organization Policies", then click in the filter box and search "Disable service account key creation". Select the managed constraint.

Click "Override parent policy", edit rule and select off for enforcement. Next click done and set policy.

Service account
Search "Service account" and once on the page click "+ Create service account".

Give your service account a name and a description, then click "Done".

Now copy the service account email; we will use this later. Once copied, click the service account.

Go to "Keys", click "Add key", and create a new key.

We'll want the JSON key; this key will be added to Openflow later. Click "Create".

Enable Sheets API
Next we'll want to enable the Google Sheets API so that Snowflake can talk with it. Search "Google Sheets API".

Once on the API page, click "Enable".

Once enabled you will be able to see usage metrics.

Share with service account
Next we'll share the Google Sheet with the service account email (1) we copied earlier. Click the "Share" button.
Insert the email and click "Send".

Connector Configuration
Let's head back to Openflow, right-click the Google Sheets connector, and then click "Parameters".

Here we will see three sections where we will have to enter our configuration parameters.

Destination Parameters
Let's click the three dots on the right side of the destination parameters.

If you're using an SPCS deployment
As an example, we'll click the three dots and click "Edit". We'll put in the database, schema, role, and warehouse.

One special parameter is the "Snowflake Authentication Strategy": with container services you can set it to "SNOWFLAKE_SESSION_TOKEN" and it will not require a key pair.

This is an example input if you used the configurations given at the beginning of the tutorial.

If you're using a BYOC deployment
These are the parameters you'll need to fill out. We will see how to get them below.

To get the Snowflake account identifier, go to the bottom left of the homepage and click "Account details".

Copy your account identifier and paste it into Openflow.

Next, to get your key, we'll have to generate a public/private key pair and apply the public key to our user. To generate the keys, run this bash script.
This will generate three files. We will apply the content of the .pub file to our Snowflake user using the ALTER USER command.
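A minimal sketch of that ALTER USER statement, assuming a hypothetical username MY_USER and with the key value truncated for illustration:

```sql
-- MY_USER is a placeholder; paste the .pub file contents (without the BEGIN/END header lines) as the key value.
ALTER USER MY_USER SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqhki...';
```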
Now that we have our public key on our user, we'll move on to uploading the .pem file to Openflow.

Ingestion Parameters
Next, for the "Ingestion Parameters" we'll need the sheet name and the sheet range. Select the range you want and copy it along with the sheet name.

Next we'll want to copy the Spreadsheet ID from the URL.

Now we'll enter what we want the table name to be, the sheet name and range (1) in the format "Sheet_name!range" (for example, "Sheet1!A1:D10"), and finally the spreadsheet ID (2) we got from the URL. Click "Apply".
Source Parameters
Next we'll want to copy the contents of our service account JSON file. I opened my file by dragging it to the browser.

We'll enter that into the service account JSON field. Once entered, it will show as "Sensitive value set".

Now we're done with the parameters. We can go back to the process group to get it started.

Run the connector
Let's right-click the connector, enable "All controller services", and then start the connector. Since the data is small, it will load into Snowflake in seconds.

And we're done. Once loaded, you will be able to see the data in your database/schema with the table name we set earlier.
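If you want to double-check from a worksheet, a quick query like the sketch below works; the database, schema, and table names are the placeholders used earlier, so swap in your own.

```sql
-- Placeholder names; use the database/schema and the table name you configured in the connector.
SELECT * FROM OPENFLOW_DB.SHEETS.MY_SHEET_TABLE LIMIT 10;
```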

Scenarios
Here we will cover some what-if scenarios with the data.
Multiple Spreadsheets / Sheets
Most users will want to ingest multiple spreadsheets, potentially with multiple sheets per spreadsheet. To accomplish this, you'll want to go back and add another Google Sheets connector to the runtime.

You'll want to right-click your new connector and click "Configure".

This is where you can rename both connectors to give them unique names that differentiate them.

After we've named them, we'll want to go into our new connector and open its parameters.

We'll select "Inheritance" in the top menu and select the destination and google source parameters because these are going to be the same for all our google sheets connectors.

Next we'll go back to our "Parameters" and add our next Google spreadsheet. It has two sheets, so we'll add both sheet names with both ranges, separating them with a comma (for example, "Sheet1!A1:D10, Sheet2!A1:B5").

Finally, we'll start the second connector by right-clicking it and clicking start, and then we'll be able to see the result in our Snowflake table.

Empty Cells
In this scenario the question is what happens when cells are empty but you include them in your range.

If you include the cells and they are blank, they will not show up in the resulting table.

Now we'll add values to the cells and run the pipeline again.

You'll see the pipeline refreshes the whole table and now includes the values that were previously missing.

Schema Change
In this scenario we'll see what will happen when we have schema changes.

When we change the cell range to include the empty C column, it will not be included.

Here we fill in the C column but keep the bottom row empty.

We see the column get added, and the empty cell is set to NULL.












