Openflow - RDS Postgres
The goal of this tutorial is to load data from an RDS Postgres database into Snowflake via Openflow. This tutorial will not cover how to set up a deployment in Snowflake or AWS.
Video
The video is still in development.
Requirements
- You can NOT be on a trial account. (Link)
- Your Snowflake account has to be in an AWS region. (Link)
- An AWS account; you can set up a free account to get started.
AWS
If you don't have a Postgres RDS database, follow along here.
We'll select PostgreSQL for the engine, select a version (I tested with 17), and the free tier if you're just learning.

Select a single instance since this is for learning, then give your database a name, username, and password.

If you're using SPCS to deploy your Openflow, this database will need to be public so that Snowflake can reach across the internet to connect to it.

Next we'll select our database to change its network configuration.

Select the security group by clicking the checkbox, then go to inbound traffic and click "Edit inbound rules".

Now we'll allow all traffic to access the Postgres instance. We do this because pgAdmin will have to connect, and if you're using the SPCS deployment, Snowflake does not have static IPs yet.

Now we'll do the same thing as before but for outbound traffic.

Select all traffic and click "Save Rules". You're done with your Postgres RDS setup.

Next in AWS we'll need to copy our RDS database URL and port, and configure the database parameters to allow for replication. Let's start by searching for RDS.

Give your parameter group a name and description, select the PostgreSQL engine type, select your Postgres version, and select DB parameter group. Click Create.

Search "rds.logical_replication" and dit the value to be "1". Click save changes.

Next search "wal_sender_timeout" set this value to "0". Click save changes.

Next we'll head back to our database and click modify.

Head down to "Additional configurations" and add your parameter group.

Apply immediately if you do not mind a short amount of downtime.

You will see the database begin to modify the configuration and then return to the active state.

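Once pgAdmin is connected in the next section, you can optionally confirm the parameter changes took effect. This is just a sketch using the standard pg_settings view:

```sql
-- Verify the replication-related settings picked up from the parameter group
SELECT name, setting
FROM pg_settings
WHERE name IN ('rds.logical_replication', 'wal_level', 'wal_sender_timeout');
-- Expect rds.logical_replication = on, wal_level = logical, wal_sender_timeout = 0
```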
PGAdmin
Let's start by downloading pgAdmin. After that we'll connect, add sample data, and configure a "publication" and a "replication slot".
Example Data
Once installed, we'll right-click Servers, hover over "Register", and click "Server".

Next we'll click connection and put in our host, port, username and password. Click Save.

Next we'll want to add our sample data. First, we'll open a query window.

Next, add the code below to first create the objects and then insert data into them.
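If you need something to replicate, here's a minimal sketch of sample objects. The schema name replicate matches the filter used later in the Ingestion parameters; the table and column names are just placeholders:

```sql
-- Create a schema and a sample table to replicate (placeholder names)
CREATE SCHEMA IF NOT EXISTS replicate;

CREATE TABLE IF NOT EXISTS replicate.customers (
    id         SERIAL PRIMARY KEY,
    name       TEXT NOT NULL,
    email      TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Insert a few rows of sample data
INSERT INTO replicate.customers (name, email) VALUES
    ('Ada Lovelace', 'ada@example.com'),
    ('Alan Turing', 'alan@example.com');
```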
Pub / Replication Slot
Next we'll create a publication and replication slot by entering the code below. You can also edit it to cover only certain tables.
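Here's a minimal sketch, assuming the names openflow_publication and openflow_slot (any names work as long as they match what you enter in Openflow later):

```sql
-- Publication covering all tables; list specific tables instead of ALL TABLES to narrow it down
CREATE PUBLICATION openflow_publication FOR ALL TABLES;

-- Logical replication slot using the built-in pgoutput plugin
SELECT pg_create_logical_replication_slot('openflow_slot', 'pgoutput');
```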
Snowflake
Let's start the Snowflake setup by going into a worksheet (1) and creating the necessary objects for Openflow and the connector.
If you don't have a database, schema, or warehouse yet, create them first.
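A minimal sketch, using the placeholder names openflow_db, openflow_schema, and openflow_wh:

```sql
-- Placeholder objects for the connector's destination
CREATE DATABASE IF NOT EXISTS openflow_db;
CREATE SCHEMA IF NOT EXISTS openflow_db.openflow_schema;
CREATE WAREHOUSE IF NOT EXISTS openflow_wh
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE;
```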
Only required if you're hosting Openflow in Snowflake (SPCS).
Let's create the network rule and external access integration that will allow Openflow/Snowflake to talk with the Postgres database.
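A sketch of what that could look like, assuming your RDS endpoint and the default Postgres port 5432 (the object names are placeholders):

```sql
-- Egress network rule pointing at the RDS endpoint
CREATE OR REPLACE NETWORK RULE postgres_network_rule
    MODE = EGRESS
    TYPE = HOST_PORT
    VALUE_LIST = ('<your-rds-endpoint>.rds.amazonaws.com:5432');

-- External access integration that the runtime will use
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION postgres_access_integration
    ALLOWED_NETWORK_RULES = (postgres_network_rule)
    ENABLED = TRUE;
```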
You should see "Statement executed successfully."
Now that we have our objects, let's add the Postgres connector to our deployment. Navigate to Openflow in the navbar.

If you get the error 'Invalid consent request' or 'TOTP Invalid':
You will have to change your default role to a role that is not an admin role. An example default would be PUBLIC.

From there we can switch to the deployments view, where we can see our deployment and that it's active. If you don't have a deployment, use either SPCS or AWS to deploy your Openflow instance.

Next we'll head to runtime and click "+ Create Runtime".

Note
External access will only show up if you're on an SPCS deployment.
We'll then select our runtime, give it a name, select ACCOUNTADMIN as the role, and, if you're on SPCS, select your external access integration.

Once your runtime is active and ready to go, we can head to Overview and add the connector.

Let's search for Postgres and click Install.

We'll select our runtime and click add.

Next head to runtime and select your runtime if not already in it.

Once in, we'll want to right-click the process group and select Parameters.

Parameters
Now that we are in Parameters, we can break it into three sections: Destination, Ingestion, and Source.
Destination
Let's click the three dots for our destination and click Edit.

Now we can follow either note below, based on our deployment.
If you're using the SPCS deployment
As an example, we'll click the three dots and click Edit. We'll put in the database, schema, role, and warehouse.

One special parameter is the "Snowflake Authentication Strategy": with the container service you can put in "SNOWFLAKE_SESSION_TOKEN" and it will not require a key pair.

This is an example input if you used the configurations given at the beginning of the tutorial.

If you're using the AWS deployment
These are the parameters you'll need to fill out. We will see how to get them below.

To get the Snowflake Account Identifier, you'll go to the bottom left of the homepage and click account details.

You'll copy your account identifier and paste it into Openflow.

Next, to get your key, we'll have to generate a public and private key pair and apply it to our user. To generate the keys, run this bash script.
This will generate three files. We will apply the content of the .pub file to our Snowflake user using the ALTER USER command.
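As a sketch, assuming a user named OPENFLOW_USER, the command looks like this (paste the contents of the .pub file without the BEGIN/END delimiter lines):

```sql
-- Attach the public key to the Snowflake user used by the connector
ALTER USER OPENFLOW_USER SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';
```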
Now that we have our public key on our user we'll move to uploading the .pem file to openflow.

Click the upload button again and select your .pem file.

Once uploaded, select the key and click "OK". Then fill out the remaining fields and click Apply.

Ingestion
For this section, it's all about selecting what data you want and how often you want the warehouse to merge the changes into the final table. In my case I'll use the regular expression replicate\..* to grab all tables from the replicate schema. It's a comma-separated list if you want multiple tables or multiple schemas.
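For example, if you also had a hypothetical sales schema to replicate, the filter could be written as replicate\..*, sales\..* (both schema names here are only examples).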

Source
Next we'll head to the final parameter section, Source. We'll first edit the connection URL.

The format here is as shown below:
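Assuming the standard PostgreSQL JDBC URL format, it looks like jdbc:postgresql://<your-rds-endpoint>:5432/<database-name>, where the endpoint and port come from the RDS console and the database name is the one you created earlier.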
Next we'll want to download the JDBC Driver for our database. I used the Java 8 version.

We'll head back to openflow and upload our driver.

Click upload again in the next window.

Once uploaded you can select your driver and click "ok".

Finally you'll put in your postgres password, username, publication name, and replication slot name. Click Apply.

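If you've forgotten the publication or slot names you created earlier, you can look them up in pgAdmin with a quick query against the standard Postgres catalog views:

```sql
-- List existing publications and replication slots
SELECT pubname FROM pg_publication;
SELECT slot_name, plugin, active FROM pg_replication_slots;
```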
Run
Finally we can run our connector. Head back to the process group.

Right-click the process group again, click "Enable all controller services", and then click Start.

Now, if you don't get any errors, you can go back to Snowflake and find your schema with your tables loaded.

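As a quick check, assuming the placeholder destination from earlier and that the connector created a schema matching the source schema name, you can run something like:

```sql
-- Hypothetical names; adjust database, schema, and table to your own setup
SELECT * FROM openflow_db."replicate".customers LIMIT 10;
```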