Automate Data Ingestion with Amazon Kinesis and AWS Glue

INTERMEDIATE
90 minutes
5 tasks

In this lab, you will automate data ingestion with Amazon Kinesis and AWS Glue to build a reliable, scalable data processing pipeline. You will set up a Kinesis data stream to capture streaming data, then use AWS Glue to extract, transform, and load (ETL) that data into Amazon S3 for storage. Along the way, you will see how these components work together to process real-time data and shorten the time to insight.
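As a rough illustration of the producer side of this pipeline, the sketch below shows how sales events might be serialized and batched before being written to a Kinesis data stream. The event fields (`order_id`, `amount`), the helper names, and the choice of `order_id` as the partition key are assumptions made for this example and are not taken from the lab. The only AWS call used is boto3's `put_records`, which accepts at most 500 records per request.

```python
import json
from typing import Any, Dict, Iterable, List

# PutRecords accepts at most 500 records per request.
MAX_RECORDS_PER_CALL = 500


def to_kinesis_record(event: Dict[str, Any]) -> Dict[str, Any]:
    """Serialize one sales event into the entry shape PutRecords expects."""
    return {
        "Data": json.dumps(event, separators=(",", ":")).encode("utf-8"),
        # Assumption: partitioning by order_id spreads events across shards.
        "PartitionKey": str(event["order_id"]),
    }


def batch_records(
    events: Iterable[Dict[str, Any]], size: int = MAX_RECORDS_PER_CALL
) -> List[List[Dict[str, Any]]]:
    """Convert events to Kinesis records and split them into request-sized batches."""
    records = [to_kinesis_record(e) for e in events]
    return [records[i:i + size] for i in range(0, len(records), size)]


def send_events(events: Iterable[Dict[str, Any]], stream_name: str) -> int:
    """Send all events to the stream and return the number of failed records."""
    import boto3  # imported lazily so the helpers above work without the SDK

    client = boto3.client("kinesis")
    failed = 0
    for batch in batch_records(events):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response["FailedRecordCount"]
    return failed
```

A real producer would normally retry the records that `put_records` reports as failed; that step is left out here to keep the sketch short.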

Scenario

A retail company needs to streamline its data ingestion process to handle growing volumes of real-time sales data from its e-commerce platform. It wants to use AWS services to build a pipeline that ingests and processes this data efficiently, reducing latency in sales reporting and analytics.

Learning Objectives

  • Understand the integration between Amazon Kinesis Data Streams and AWS Glue.
  • Learn how to configure AWS Glue to process data from Kinesis streams.
  • Implement a scalable data ingestion pipeline using AWS services.
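To make the scalability objective concrete, here is a minimal sketch of how one might estimate the shard count for a provisioned-mode stream. It relies on the documented per-shard write limits (1,000 records per second and 1 MiB per second). The function names and the traffic figures in the usage note are hypothetical, and the lab itself may use different settings or on-demand capacity mode.

```python
import math
from typing import Any, Dict

# Documented per-shard write limits for Kinesis Data Streams.
SHARD_WRITE_BYTES_PER_SEC = 1024 * 1024  # 1 MiB per second
SHARD_WRITE_RECORDS_PER_SEC = 1000       # 1,000 records per second


def estimate_shards(records_per_sec: float, avg_record_bytes: int) -> int:
    """Return the smallest shard count that satisfies both write limits."""
    by_count = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    by_bytes = math.ceil(
        records_per_sec * avg_record_bytes / SHARD_WRITE_BYTES_PER_SEC
    )
    # A provisioned stream always needs at least one shard.
    return max(1, by_count, by_bytes)


def create_stream_params(
    stream_name: str, records_per_sec: float, avg_record_bytes: int
) -> Dict[str, Any]:
    """Build keyword arguments for boto3's kinesis create_stream call."""
    return {
        "StreamName": stream_name,
        "ShardCount": estimate_shards(records_per_sec, avg_record_bytes),
    }
```

For example, `create_stream_params("sales-stream", 2500, 200)` yields a shard count of 3, because the record-rate limit rather than the byte-rate limit is the binding constraint. The stream name here is a placeholder.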

Tasks (5)

Task 1: Create a Kinesis Data Stream

15 min

Task 2: Configure an AWS Glue ETL Job

20 min

Task 3: Verify Data Output in S3

15 min

Task 4: Set Up Monitoring with CloudWatch

20 min

Task 5: Optimize Data Storage in S3

20 min
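
The task list does not say how storage is optimized in Task 5. One common approach, assumed here purely for illustration, is to write output under Hive-style date partitions so that downstream queries can skip irrelevant prefixes. The sketch below builds such an object key; the prefix, file name, and `year=/month=/day=` layout are hypothetical choices, not details from the lab.

```python
from datetime import datetime, timezone


def partitioned_key(prefix: str, event_time: datetime, filename: str) -> str:
    """Build an S3 object key with Hive-style year/month/day partitions (UTC)."""
    if event_time.tzinfo is None:
        # Naive datetimes are ambiguous, so require an explicit time zone.
        raise ValueError("event_time must be timezone-aware")
    t = event_time.astimezone(timezone.utc)
    return (
        f"{prefix.strip('/')}/"
        f"year={t:%Y}/month={t:%m}/day={t:%d}/"
        f"{filename}"
    )
```

Called with the prefix `sales/`, a timestamp of 5 March 2024 at 23:30 UTC, and the file name `part-0000.parquet`, this returns `sales/year=2024/month=03/day=05/part-0000.parquet`.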

Prerequisites

  • Basic understanding of AWS services such as Kinesis, S3, and Glue.
  • Familiarity with data processing concepts and ETL operations.

Skills Tested

  • Automate data ingestion with Amazon Kinesis and AWS Glue.
  • Data pipeline orchestration and troubleshooting.
  • Process data efficiently using AWS services.
