Openflow - Box
The goal of this tutorial is to load data from Box into Snowflake via Openflow. This tutorial will not cover how to set up a deployment in Snowflake or AWS.
Video
Still in development
Requirements
- You can NOT be on a trial account. (Link)
- Your Snowflake account has to be in an AWS region. (Link)
- Your Box account must be at minimum a Business account. This is a requirement from Box.
Downloads
- Sample data - Link
Box
This tutorial will use a Box Sandbox to prevent production security complications.
Sandbox Creation
Let's log in to our Box account and go to "Admin Console".

Now we'll go to Platform, select Sandbox, and click "Create Sandbox". This gives us full access to an account.

We'll give our sandbox a name and select our email as the admin. Click "Create".

Now that the sandbox is created, we'll log in to it.

Box Platform App
Once we are in the account (or sandbox account), we'll go to https://app.box.com/developers/console

We'll need to create an app so that Openflow can connect to the account through it. Click "Create Platform App".

We'll give our app a name and select "Server Auth - JWT" as the app type.

App Configuration
Once created, we'll be launched into Configuration, where we'll need to make changes. To start, let's update "App Access Level" to "App + Enterprise Access".

Make sure your application scopes follow the Snowflake connector's requirements.

Next we'll generate a public/private key. Clicking this button will require you to enter your Duo auth code or sign up for one.

Once Duo auth is complete, you'll have to click the button again.

Once clicked, it will download a .json file. We will use this file in Openflow later.

Now let's submit our app for authorization so that our Box enterprise can use it later.

Copy your Client ID and submit it to your enterprise; this removes an unneeded step later.

Only needed when using a sandbox
Next let's head to General Settings and copy our Service Account ID. This is only needed if you are using a sandbox, as the description explains.

Navigate to Integrations, select "Server Authentication Apps", and view your application.

Now your app and account are ready. Let's go add data and share it with our service account.

Sample Data / Share
Now let's upload our sample PDF files and share them with our service account, since we're in a sandbox. Create a new folder.

Give the folder a name and paste the service account email in the "Invite Additional People" box. Click "Create".

Upload the example PDF files.

Next copy your "Folder ID" from the URL. This will be used in Openflow later.

Snowflake
Let's start the Snowflake setup by going into a Workspace SQL file (1) and creating the necessary objects for Openflow and the connector.
If you don't have a database, schema, or warehouse yet, create them first.
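A minimal sketch of what that could look like, assuming you want dedicated objects for this tutorial (names are illustrative):

```sql
-- Create a database, schema, and warehouse for the Box connector (names are illustrative).
CREATE DATABASE IF NOT EXISTS BOX_DB;
CREATE SCHEMA IF NOT EXISTS BOX_DB.BOX_SCHEMA;
CREATE WAREHOUSE IF NOT EXISTS BOX_WH
  WITH WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;
```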
Only required if you're hosting Openflow in Snowflake (SPCS)
Let's create the network rule and external access integration that will allow Openflow/Snowflake to talk to Box.
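A sketch of what that can look like, assuming the Box hosts below are the ones the connector needs (check the connector documentation for the exact list):

```sql
-- Network rule allowing egress to Box, plus an external access integration Openflow can use.
CREATE OR REPLACE NETWORK RULE box_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('api.box.com', 'app.box.com', 'upload.box.com');

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION OPENFLOW_EXTERNAL_ACCESS
  ALLOWED_NETWORK_RULES = (box_network_rule)
  ENABLED = TRUE;
```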
Integration OPENFLOW_EXTERNAL_ACCESS successfully created.
Openflow
Now that we have our objects, let's add the Box connector to our deployment. Navigate to Openflow in the navbar.

If you get the error 'Invalid consent request' or 'TOTP Invalid'
You will have to change your default role to a role that is not an admin role; an example default would be PUBLIC.

From there we can switch to the deployments tab, where we can see our deployment and that it's active. If you don't have a deployment, use either SPCS or AWS to deploy your Openflow instance.

Next we'll head to Runtimes and click "+ Create Runtime".

Note
External access will only show up if you're on an SPCS deployment.
We'll then select our runtime, give it a name, select ACCOUNTADMIN as the role, and, if you're on SPCS, select your external access integration.

Once your runtime is active and ready to go, we can head to Overview and add the connector.

Click Install on the "Box (Cortex Connect)" version to have the Cortex Search service and all needed tables set up by the connector. There are other Box connectors that are very similar but used in different ways.

Parameters
Now that the connector is installed, we'll want to input our Box and Snowflake parameters.

Now that we are in Parameters, we can break it into three sections: Destination, Ingestion, and Source.
Destination
Let's click the three dots on our destination and click Edit.

Now we can follow whichever of the two options noted below matches our deployment.
If you're using an SPCS deployment
As an example, we'll click the three dots and click Edit. We'll put in the database, schema, role, and warehouse.

One special parameter is "Snowflake Authentication Strategy": with container services you can put in "SNOWFLAKE_SESSION_TOKEN" and it will not require a key pair.

This is an example input if you used the configurations given at the beginning of the tutorial.

If you're using an AWS deployment
These are the parameters you'll need to fill out. We will see how to get them below.

To get the Snowflake Account Identifier, you'll go to the bottom left of the homepage and click account details.

You'll copy your account identifier and paste it into Openflow.
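If you prefer SQL, a sketch like the one below also returns the identifier in the standard org-name dash account-name format:

```sql
-- Returns an identifier like MYORG-MYACCOUNT
SELECT CURRENT_ORGANIZATION_NAME() || '-' || CURRENT_ACCOUNT_NAME();
```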

Next, to get your key, we'll have to generate a public and private key pair and apply it to our user. To generate the keys, run a bash script like the sketch below.
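A sketch of that script, assuming OpenSSL is installed (filenames are illustrative):

```bash
# Generate an RSA key pair for Snowflake key-pair authentication.
openssl genrsa -out rsa_key.pem 2048                                        # private key (.pem) - uploaded to Openflow
openssl pkcs8 -topk8 -inform PEM -in rsa_key.pem -out rsa_key.p8 -nocrypt   # unencrypted PKCS#8 copy of the private key
openssl rsa -in rsa_key.pem -pubout -out rsa_key.pub                        # public key (.pub) - applied to the Snowflake user
```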
This will generate three files. We will apply the content of the .pub file to our Snowflake user using the ALTER USER command.
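A sketch of that command, assuming your user is named OPENFLOW_USER (paste the .pub contents without the BEGIN/END header lines):

```sql
-- Attach the public key to the user that Openflow will connect as.
ALTER USER OPENFLOW_USER SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';
```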
Now that we have our public key on our user, we'll move on to uploading the .pem file to Openflow.

Click the upload button again and select your .pem file.

Once uploaded, select the key and click "OK". Then fill out the remaining fields and click Apply.

Source
Next let's head to Source. Click Edit on the "Run App Config File".

Select "Reference assists" and then the upload button.

Upload the JSON public/private key file we got from Box earlier (1).
Once uploaded, we'll click "Close".

We'll select our key and click "OK".

Ingestion
Now we'll move to the final section, Ingestion. We'll first edit our folder ID; the reason we don't use 0 in this tutorial is that sandboxes don't allow it. Click Edit.

We'll put in our "Folder ID" we got from Box (1).
Next we'll edit the role that will have access to the search service.

Since we do everything in this tutorial with ACCOUNTADMIN, we'll keep using it.

Run
Finally we can run our connector. Head back to the process group.

Right-click the process group again and click "Enable all controller services".

Next we'll right-click the connector and click "Enable".

Note
If you get an error here, I would let the connector run one more time; sometimes the connector has race-condition problems where it doesn't know objects have been created yet and fails. It will fix itself on the next run.
Then we'll right-click and hit "Start". This will kick off the ingestion.

Stage
Now, if you don't get any errors, you can go back to Snowflake and find our Box schema with everything in it. Let's start by checking that the files have been replicated.
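One quick way to check, assuming the connector replicated the files into a stage in our Box schema (the stage name below is illustrative; use the one the connector created):

```sql
-- List the files the connector has replicated into the stage.
LS @BOX_DB.BOX_SCHEMA.BOX_FILES_STAGE;
```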

Cortex Search
Now that we have the files loaded and indexed in Cortex Search automatically, let's use the playground to search through the files.

Now we can search in the playground to get relevant chunks of information.
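If you'd rather query from SQL than the playground, a sketch like this works (the service name is illustrative; use the one the connector created):

```sql
-- Preview a Cortex Search query against the connector's search service.
SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
    'BOX_DB.BOX_SCHEMA.BOX_SEARCH_SERVICE',
    '{"query": "international cellular plan fees", "limit": 5}'
  )
)['results'];
```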

Snowflake Intelligence
Now let's allow Snowflake Intelligence to use our Cortex Search service during its question answering. Let's start by going to Agents.

Next, put our agent into the same schema and give it an object name and display name. Click "Create Agent".

Next, in our agent, we'll go to Tools and add a Cortex Search service.

Select the location of the search service, then select the service. For the ID column select web_url, and the title column will be full_name. Finally, we'll give the tool a name and description and click "Add".

Let's navigate to Snowflake Intelligence.

We'll ask a question similar to "How much of a fee do we charge for international cellular plans in Africa?".

We can see that it was able to answer the question with our documents. We can also click the document to see it in Box.

We can see that it pulled it from our international plan section.
