Dataflow: Cloud Storage Source, Transform, and BigQuery Sink

Data Processing, GCP, 2024

image

We will build a very simple Dataflow job of a type I have seen used a lot: move data from Cloud Storage to BigQuery with a few transform steps in between.

1. Source: Cloud Storage

image

We can configure the source of the job as below.

image
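For readers who prefer code to the console, the source step can be sketched in Python. This is a minimal sketch under assumptions, not the job shown above: the post does not show the file contents, so the CSV layout and the column names (`id`, `name`, `amount`) are hypothetical. In an Apache Beam pipeline, a function like this would typically be applied with `beam.Map` to the lines produced by `beam.io.ReadFromText`.

```python
import csv

# Hypothetical column names; the real file layout is not shown in this post.
COLUMNS = ["id", "name", "amount"]

def parse_csv_line(line):
    """Turn one CSV line into a dict keyed by column name."""
    # csv.reader handles quoted fields that contain commas.
    values = next(csv.reader([line]))
    return dict(zip(COLUMNS, values))

row = parse_csv_line('1,"Smith, Jane",42.50')
print(row)
```

Using the `csv` module instead of `line.split(",")` keeps quoted values with embedded commas in one field.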

2. Target: BigQuery

I created a schema and table in BigQuery.

image
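If the same job were written with the Apache Beam Python SDK, the sink would need the table schema in text form. The sketch below builds the compact `name:TYPE,name:TYPE` string that `beam.io.WriteToBigQuery` accepts for its `schema` argument. The field names and types are assumptions for illustration; the actual schema of the table created here is only visible in the screenshot.

```python
# Hypothetical fields; replace with the columns of your own table.
FIELDS = [("id", "INTEGER"), ("name", "STRING"), ("amount", "FLOAT")]

def to_schema_string(fields):
    """Build the compact 'name:TYPE,name:TYPE' schema form."""
    return ",".join(f"{name}:{field_type}" for name, field_type in fields)

print(to_schema_string(FIELDS))
```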

The sink for the job is configured to write to that table.

3. Transform

This is a simple task that filters rows.

image
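A row filter like this one boils down to a predicate that returns true for rows to keep. The following is a minimal sketch of such a predicate in Python. The `amount` column and the threshold of 10 are assumptions, since the post does not state the actual filter condition. In a Beam pipeline the same function could be passed to `beam.Filter`.

```python
def keep_row(row, min_amount=10.0):
    """Return True for rows that should reach the sink."""
    try:
        return float(row["amount"]) >= min_amount
    except (KeyError, TypeError, ValueError):
        # Rows with a missing or non-numeric amount are dropped.
        return False

rows = [
    {"id": "1", "amount": "42.50"},
    {"id": "2", "amount": "3.00"},
    {"id": "3", "amount": "n/a"},
]
kept = [r for r in rows if keep_row(r)]
print(kept)
```

Dropping malformed rows silently is a design choice made here for brevity; a production job would more likely route them to a separate output for inspection.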

This results in the job configuration below.

image
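For comparison, roughly the same source, transform, and sink chain can be expressed with the Apache Beam Python SDK. This is a sketch under assumptions, not an export of the job above: the column names, the filter threshold, and the helper names `to_row`, `dataflow_args`, and `run` are all hypothetical, and the simple `split(",")` assumes no quoted commas in the file.

```python
def to_row(line):
    """Parse 'id,name,amount' into a typed dict (hypothetical layout)."""
    row_id, name, amount = line.split(",")
    return {"id": int(row_id), "name": name, "amount": float(amount)}

def dataflow_args(project, region, temp_location):
    """Standard Beam pipeline options for running on Dataflow."""
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
    ]

def run(input_path, table, schema, args):
    # Imported here so the helpers above work without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    with beam.Pipeline(options=PipelineOptions(args)) as p:
        (
            p
            | "Read" >> beam.io.ReadFromText(input_path, skip_header_lines=1)
            | "Parse" >> beam.Map(to_row)
            # Assumed filter condition; the post does not state the real one.
            | "Filter" >> beam.Filter(lambda row: row["amount"] >= 10.0)
            | "Write" >> beam.io.WriteToBigQuery(
                table,
                schema=schema,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

`CREATE_NEVER` matches the approach in this post, where the table is created in BigQuery before the job runs. Calling `run` requires the `apache-beam[gcp]` package and real values for the bucket path, table name, and project.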

The Run

image

Validating the result

image
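The same check can be scripted instead of done in the console. The sketch below assumes the `google-cloud-bigquery` client library is installed and that default credentials are available; the table name passed in is a placeholder, and `count_query` and `fetch_row_count` are hypothetical helper names.

```python
def count_query(table):
    """Build a row-count query for a fully qualified table name."""
    return f"SELECT COUNT(*) AS row_count FROM `{table}`"

def fetch_row_count(table):
    # Imported here so count_query can be used without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    result = client.query(count_query(table)).result()
    return next(iter(result)).row_count

print(count_query("my-project.my_dataset.my_table"))
```

Comparing the returned count with the number of source rows that pass the filter is a quick way to confirm the job wrote what was expected.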