Ingest and Transform Streaming Data with Kinesis and AWS Glue

INTERMEDIATE
75 minutes
5 tasks

In this lab, students build a streaming data pipeline that uses Amazon Kinesis Data Streams to ingest data and AWS Glue to transform it in real time. Along the way, they gain a deeper understanding of processing streaming data at scale. The lab simulates a financial services company that collects real-time stock market data and provides analytics dashboards to its clients. Students set up a data stream, configure an AWS Glue job for ETL, and validate the data flow through the system. This includes setting up Kinesis producers, processing records with the Glue job, and confirming that the transformed data is stored correctly in Amazon S3.

Scenario

Global Financial Services Inc. needs to process real-time stock market data so that it can give its clients live analytics for better decision-making. The system must be robust and scalable enough to handle high-frequency data ingestion and transformation. The current challenge is to set up a streaming pipeline with Amazon Kinesis and AWS Glue that delivers timely and accurate data for analysis.

Learning Objectives

  • Set up and configure Amazon Kinesis Data Streams for real-time data ingestion.
  • Create AWS Glue jobs to transform incoming data streams.
  • Store processed data securely in Amazon S3 for further analysis.

Tasks (5)

Task 1: Create a Kinesis Data Stream for stock market data ingestion

20 min
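
As a rough illustration of Task 1, the sketch below creates a provisioned stream with boto3 and waits for it to become active. The stream name `stock-market-stream` and the single shard are assumptions made for this example, not values prescribed by the lab. The function takes the Kinesis client as a parameter so it can be exercised without AWS access.

```python
def create_stock_stream(client, stream_name="stock-market-stream", shard_count=1):
    """Create a provisioned Kinesis data stream and return its status.

    `client` is expected to behave like boto3.client("kinesis").
    The default stream name and shard count are illustrative assumptions.
    """
    client.create_stream(StreamName=stream_name, ShardCount=shard_count)
    # A new stream starts in the CREATING state; this waiter polls
    # DescribeStream until the stream is ACTIVE.
    client.get_waiter("stream_exists").wait(StreamName=stream_name)
    summary = client.describe_stream_summary(StreamName=stream_name)
    return summary["StreamDescriptionSummary"]["StreamStatus"]


# Usage (requires boto3, AWS credentials, and a configured region):
#   import boto3
#   status = create_stock_stream(boto3.client("kinesis"))
```

A single shard is usually enough for a lab-sized test; a production stream would size shards (or use on-demand mode) according to expected throughput.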

Task 2: Configure an AWS Glue job to consume and transform data from the Kinesis stream

35 min
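
To make the transformation step in Task 2 more concrete, here is a minimal sketch of the kind of per-record cleanup such a job might perform. It is written as plain Python so it can be run locally. In an actual Glue streaming job the same logic would more likely be expressed as Spark DataFrame operations inside the job script. The field names (`symbol`, `price`, `volume`, `timestamp`) and the derived `notional` column are assumptions for illustration, since the lab description does not specify the record schema.

```python
import json
from datetime import datetime, timezone


def transform_tick(raw: bytes) -> dict:
    """Parse one JSON-encoded stock tick and return a cleaned, typed record.

    Assumes a hypothetical schema: symbol (str), price (number or numeric
    string), volume (int), timestamp (Unix epoch seconds).
    """
    tick = json.loads(raw)
    price = float(tick["price"])
    volume = int(tick["volume"])
    event_time = datetime.fromtimestamp(tick["timestamp"], tz=timezone.utc)
    return {
        "symbol": tick["symbol"].strip().upper(),   # normalize ticker symbols
        "price": round(price, 2),
        "volume": volume,
        "notional": round(price * volume, 2),       # derived trade value
        "event_time": event_time.isoformat(),
        "trade_date": event_time.date().isoformat(),  # handy as an S3 partition key
    }
```

Deriving a date column such as `trade_date` is a common choice because it lets the output in S3 be partitioned by day.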

Task 3: Create an S3 bucket to store transformed data

15 min
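
For Task 3, the following sketch shows one way to create the output bucket with boto3, set default server-side encryption explicitly, and block public access. The bucket name and region are supplied by the caller and are not defined by the lab. SSE-S3 (`AES256`) is an assumed choice here; a KMS key could be used instead.

```python
def create_encrypted_bucket(s3, bucket_name, region="us-east-1"):
    """Create a private S3 bucket with default SSE-S3 encryption.

    `s3` is expected to behave like boto3.client("s3", region_name=region).
    """
    params = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be sent as a
    # LocationConstraint; every other region must be stated explicitly.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3.create_bucket(**params)

    # Set default encryption for new objects (assumed here: SSE-S3).
    s3.put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration={
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    )

    # Block all forms of public access to the transformed data.
    s3.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )


# Usage (requires boto3 and AWS credentials; the bucket name is a placeholder):
#   import boto3
#   create_encrypted_bucket(boto3.client("s3"), "my-transformed-stock-data")
```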

Task 4: Enable logging and monitoring for AWS Glue jobs

15 min

Task 5: Test the data pipeline by injecting sample data and verify output

25 min
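
The injection step in Task 5 could look roughly like the sketch below, which generates synthetic ticks and sends them with the Kinesis `PutRecords` API in batches. The record fields, symbol list, price range, and helper names (`make_ticks`, `send_ticks`) are hypothetical and chosen only for this example. The client is passed in so the batching logic can be tested without AWS access.

```python
import json
import random
import time


def make_ticks(symbols, count, seed=None):
    """Generate `count` synthetic stock ticks (illustrative schema)."""
    rng = random.Random(seed)
    now = int(time.time())
    return [
        {
            "symbol": rng.choice(symbols),
            "price": round(rng.uniform(50, 500), 2),
            "volume": rng.randint(1, 1000),
            "timestamp": now,
        }
        for _ in range(count)
    ]


def send_ticks(client, stream_name, ticks):
    """Send ticks to a Kinesis stream; return how many records failed.

    `client` is expected to behave like boto3.client("kinesis").
    """
    failed = 0
    # PutRecords accepts at most 500 records per request.
    for start in range(0, len(ticks), 500):
        batch = ticks[start:start + 500]
        response = client.put_records(
            StreamName=stream_name,
            Records=[
                {
                    "Data": json.dumps(tick).encode("utf-8"),
                    # Partitioning by symbol keeps each ticker's records ordered within a shard.
                    "PartitionKey": tick["symbol"],
                }
                for tick in batch
            ],
        )
        failed += response.get("FailedRecordCount", 0)
    return failed


# Usage (requires boto3 and AWS credentials; the stream name is a placeholder):
#   import boto3
#   ticks = make_ticks(["AMZN", "MSFT", "GOOG"], 100)
#   print(send_ticks(boto3.client("kinesis"), "stock-market-stream", ticks))
```

A non-zero return value means some records were rejected, typically because of throttling, and would need to be retried. After sending, the output can be verified by checking that the Glue job has written transformed objects to the S3 bucket.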

Prerequisites

  • Understanding of Kinesis Data Streams concepts
  • Basic knowledge of AWS Glue and ETL processes
  • Familiarity with S3 bucket setup and configuration

Skills Tested

  • Configure Kinesis Data Streams for real-time ingestion
  • Implement AWS Glue jobs for data transformation
  • Set up S3 for data storage with encryption