AWS Data Pipeline Overview
AWS Data Pipeline is a web-based ETL service for processing and moving data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. It is used with AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR.
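A minimal sketch of creating a pipeline programmatically with boto3, assuming boto3 is installed and AWS credentials/region are already configured; the pipeline name, unique ID, and description are placeholders. A pipeline only does work after a definition is attached and it is activated, as shown in the Features section sketch below.

```python
# Sketch: register an AWS Data Pipeline shell with boto3.
# Assumes AWS credentials/region are configured; names are placeholders.
import boto3

client = boto3.client("datapipeline")

# uniqueId acts as an idempotency token so repeated calls do not
# create duplicate pipelines.
created = client.create_pipeline(
    name="daily-s3-copy",          # placeholder pipeline name
    uniqueId="daily-s3-copy-001",  # placeholder idempotency token
    description="Example pipeline shell; definition is attached separately",
)
pipeline_id = created["pipelineId"]
print("Created pipeline:", pipeline_id)

# The pipeline runs only after a definition is attached
# (put_pipeline_definition) and the pipeline is activated
# (activate_pipeline); see the Features sketch below.
```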
AWS Data Pipeline Benefits
- reliable
- easy to use
- flexible
- scalable
- transparent
- low cost
AWS Data Pipeline Features
- distributed, highly available infrastructure designed for fault-tolerant execution
- automatic retry capability
- configured through a visual interface
- library of templates
- scheduling (see the definition sketch after this list)
- dependency tracking
- error handling
- work can be dispatched to one machine or many in parallel
- full execution logs are automatically delivered to Amazon S3
- full control over the compute resources
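A sketch of a pipeline definition that exercises several of these features, attached and activated with boto3: scheduling via a Schedule object, automatic retries via maximumRetries, execution logs delivered to S3 via pipelineLogUri, and explicit control over the compute resource via an Ec2Resource object. The pipeline ID, bucket names, IAM roles, and instance settings are placeholders, not values from the original notes.

```python
# Sketch: attach and activate a minimal definition for an existing pipeline.
# pipeline_id, bucket names, roles, and instance settings are placeholders.
import boto3

client = boto3.client("datapipeline")
pipeline_id = "df-EXAMPLE1234567"  # placeholder; returned by create_pipeline

pipeline_objects = [
    {   # Default object: execution logs are delivered to S3 automatically.
        "id": "Default",
        "name": "Default",
        "fields": [
            {"key": "type", "stringValue": "Default"},
            {"key": "scheduleType", "stringValue": "cron"},
            {"key": "schedule", "refValue": "DailySchedule"},
            {"key": "pipelineLogUri", "stringValue": "s3://example-bucket/logs/"},
            {"key": "role", "stringValue": "DataPipelineDefaultRole"},
            {"key": "resourceRole", "stringValue": "DataPipelineDefaultResourceRole"},
            {"key": "failureAndRerunMode", "stringValue": "CASCADE"},
        ],
    },
    {   # Schedule object: run once per day (a low frequency schedule).
        "id": "DailySchedule",
        "name": "DailySchedule",
        "fields": [
            {"key": "type", "stringValue": "Schedule"},
            {"key": "period", "stringValue": "1 day"},
            {"key": "startAt", "stringValue": "FIRST_ACTIVATION_DATE_TIME"},
        ],
    },
    {   # Compute resource the work is dispatched to.
        "id": "WorkerInstance",
        "name": "WorkerInstance",
        "fields": [
            {"key": "type", "stringValue": "Ec2Resource"},
            {"key": "instanceType", "stringValue": "t1.micro"},
            {"key": "terminateAfter", "stringValue": "1 hour"},
        ],
    },
    {   # Activity with automatic retry on failure.
        "id": "CopyJob",
        "name": "CopyJob",
        "fields": [
            {"key": "type", "stringValue": "ShellCommandActivity"},
            {"key": "command", "stringValue": "aws s3 cp s3://example-bucket/in/ s3://example-bucket/out/ --recursive"},
            {"key": "runsOn", "refValue": "WorkerInstance"},
            {"key": "maximumRetries", "stringValue": "3"},
        ],
    },
]

client.put_pipeline_definition(pipelineId=pipeline_id, pipelineObjects=pipeline_objects)
client.activate_pipeline(pipelineId=pipeline_id)
```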
AWS Data Pipeline Costs
- low monthly rates: low frequency jobs cost $0.60 per month
- high frequency jobs cost $1.00 per month
- high frequency means the job runs more than once per day