Part of: Official Google Cloud Certified Professional Data Engineer Study Guide (Wiley Data and Cybersecurity)

Designing Data Pipelines


Chapter Abstract:

This chapter reviews high-level data pipeline design patterns, along with some variations on those patterns, and how GCP services such as Cloud Dataflow, Cloud Dataproc, Cloud Pub/Sub, and Cloud Composer are used to implement data pipelines.

A data pipeline is an abstraction that captures the idea that data flows from one stage of processing to another. Data pipelines are modeled as directed acyclic graphs (DAGs). A graph is a set of nodes linked by edges; in a directed graph, each edge points from one node to another, and in an acyclic graph no path leads from a node back to itself, so data always flows forward through the pipeline. A pipeline may have multiple nodes in a single stage. For example, a data warehouse that extracts data from three different sources would have three ingestion nodes.

Not all pipelines have all stages. A pipeline may ingest audit log messages, transform them, and write them to a Cloud Storage file without ever analyzing them; most of those log messages may never be viewed, but they must be stored in case they are needed. Log messages written to storage without reformatting or other processing would not need a transformation stage at all. Transformation is the process of mapping data from the structure used in the source system to the structure used in the storage and analysis stages of the pipeline.
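The ingest, transform, and store stages described above can be sketched as a small Apache Beam pipeline, the programming model that Cloud Dataflow executes. This is a minimal illustration, not an example from the chapter: the bucket paths, the field names, and the to_storage_record helper are hypothetical placeholders, and the same pipeline would only run on Cloud Dataflow if submitted with the appropriate runner and project options.

    # Minimal sketch of an ingest -> transform -> store pipeline in Apache Beam.
    # Bucket paths, field names, and the record schema are hypothetical.
    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def to_storage_record(line: str) -> str:
        """Transform stage: map a raw audit-log line to the storage-side structure."""
        entry = json.loads(line)
        # Keep only the fields the downstream stages need (hypothetical schema).
        return json.dumps({
            "timestamp": entry.get("timestamp"),
            "principal": entry.get("principalEmail"),
            "method": entry.get("methodName"),
        })


    def run():
        # Pass --runner=DataflowRunner, --project, and --region here to run on Cloud Dataflow.
        options = PipelineOptions()
        with beam.Pipeline(options=options) as pipeline:
            (
                pipeline
                # Ingestion node: read raw log lines from Cloud Storage.
                | "Ingest" >> beam.io.ReadFromText("gs://example-bucket/raw/audit-*.json")
                # Transformation node: reshape each record for storage.
                | "Transform" >> beam.Map(to_storage_record)
                # Storage node: write the reshaped records back to Cloud Storage.
                | "Store" >> beam.io.WriteToText("gs://example-bucket/curated/audit")
            )


    if __name__ == "__main__":
        run()

Each labeled step ("Ingest", "Transform", "Store") corresponds to a node in the pipeline's DAG, and the pipe operators define the directed edges between them; dropping the "Transform" step would yield the storage-only variant described above.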
Page(s): 61 - 88
Copyright Year: 2020
Edition: 1