EMR & Glue
EMR
Run Hadoop clusters on AWS (Big Data)
Clusters can be made up of hundreds of EC2 instances, with auto scaling
Runs within a VPC in a single AZ - data can be exported to S3 so it persists after the cluster is terminated
Every cluster consists of a Master, Core, and optional Task nodes, each with a specific role.
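As a rough illustration of the three node roles, the sketch below builds the kind of instance-group list that boto3's `run_job_flow` accepts under `Instances["InstanceGroups"]`. The helper name `build_instance_groups`, the group names, and the `m5.xlarge` instance type are assumptions made for this example, not part of the notes above; no AWS call is made.

```python
def build_instance_groups(core_count, task_count=0, instance_type="m5.xlarge"):
    """Return an InstanceGroups list: one master, N core, optional task nodes.

    Hypothetical helper for illustration; instance_type is an assumed default.
    """
    groups = [
        # Master node: coordinates the cluster and tracks job status.
        {
            "Name": "Master",
            "InstanceRole": "MASTER",
            "InstanceType": instance_type,
            "InstanceCount": 1,
        },
        # Core nodes: run tasks and store data in HDFS.
        {
            "Name": "Core",
            "InstanceRole": "CORE",
            "InstanceType": instance_type,
            "InstanceCount": core_count,
        },
    ]
    if task_count > 0:
        # Task nodes: optional, run tasks only and hold no HDFS data.
        groups.append(
            {
                "Name": "Task",
                "InstanceRole": "TASK",
                "InstanceType": instance_type,
                "InstanceCount": task_count,
            }
        )
    return groups


if __name__ == "__main__":
    for group in build_instance_groups(core_count=2, task_count=4):
        print(group["InstanceRole"], group["InstanceCount"])
```

The resulting list could then be passed as `Instances={"InstanceGroups": groups, ...}` to `boto3.client("emr").run_job_flow(...)`, along with the other required arguments such as the cluster name and IAM roles.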
Glue
Managed ETL service to prepare data for analytics - can also build a data catalog from metadata extracted from the data sources
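For the data catalog side, a minimal sketch of the request a Glue crawler might be created with is shown below. It only builds the keyword arguments in the shape boto3's `create_crawler` expects (`Name`, `Role`, `DatabaseName`, `Targets`); the helper name `build_crawler_request` and every value passed to it are hypothetical.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for a Glue crawler that scans one S3 path.

    Hypothetical helper for illustration; it does not call AWS.
    """
    return {
        "Name": name,
        # IAM role the crawler assumes to read the data.
        "Role": role_arn,
        # Catalog database where the discovered table metadata is written.
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


if __name__ == "__main__":
    # All values below are placeholders.
    request = build_crawler_request(
        name="example-crawler",
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
        database="example_db",
        s3_path="s3://example-bucket/raw/",
    )
    print(request["Targets"])
```

With real values, the dictionary could be passed as `boto3.client("glue").create_crawler(**request)`; running the crawler afterwards populates the catalog with the tables it finds.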