AWS for Big Data


AWS is a very popular cloud platform that hosts a large number of applications, so a huge amount of data is stored on AWS as well.

1. AWS EC2 and S3

2. DynamoDB: NoSQL. Index types: primary index, local secondary index, global secondary index
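
To make the three index types concrete, here is a minimal sketch of a table definition that uses all of them. The table name, attribute names, and index names (Orders, customer_id, by_order_date, and so on) are made up for illustration. The dictionary follows the parameter layout that boto3's DynamoDB create_table call accepts, but the sketch only builds the parameters and does not contact AWS.

```python
def build_table_params(table_name="Orders"):
    """Build create_table parameters for a hypothetical orders table."""
    return {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand, so no throughput settings
        # Every attribute used in any key schema below must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "customer_id", "AttributeType": "S"},
            {"AttributeName": "order_id", "AttributeType": "S"},
            {"AttributeName": "order_date", "AttributeType": "S"},
            {"AttributeName": "status", "AttributeType": "S"},
        ],
        # Primary index: partition (HASH) key plus sort (RANGE) key.
        "KeySchema": [
            {"AttributeName": "customer_id", "KeyType": "HASH"},
            {"AttributeName": "order_id", "KeyType": "RANGE"},
        ],
        # Local secondary index: same partition key, different sort key.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "by_order_date",
                "KeySchema": [
                    {"AttributeName": "customer_id", "KeyType": "HASH"},
                    {"AttributeName": "order_date", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # Global secondary index: its own partition key, independent of the table's.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "by_status",
                "KeySchema": [
                    {"AttributeName": "status", "KeyType": "HASH"},
                    {"AttributeName": "order_date", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
    }
```

With boto3 installed and credentials configured, these parameters could be passed as boto3.client("dynamodb").create_table(**build_table_params()). Note that a local secondary index can only be defined when the table is created, while a global secondary index can also be added later.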

3. Elastic MapReduce (EMR): Apache Hadoop MapReduce + HDFS (the Hadoop stack)


EMR components: MPP over HDFS: Cloudera Impala -> HiveQL and Pig Latin over MapReduce -> Database: HBase -> MapReduce
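
Since MapReduce sits at the bottom of this stack, a tiny word count may help show what the model does. This is a single-process sketch in plain Python, not Hadoop or EMR code. The function names (map_phase, shuffle, reduce_phase, word_count) are illustrative. On a real cluster the map and reduce steps run in parallel across nodes and the shuffle moves data over the network.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data on aws", "big data"]))
```

HiveQL and Pig Latin generate jobs of this shape from higher-level queries, which is why they appear above MapReduce in the stack.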

4. Redshift (MPP): uses SQL; MPP vs. MapReduce
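
One way to see the difference in programming model is to compute the same aggregate both ways. In the sketch below, SQLite stands in for a SQL engine purely for illustration. It is not Redshift and it is not massively parallel. The sales table, its columns, and the sample rows are invented. The point is only that the SQL side states what result is wanted in one declarative query, while the MapReduce side spells out the grouping and summing steps by hand.

```python
import sqlite3
from collections import defaultdict

# Invented sample data: (region, amount)
ROWS = [("us", 10.0), ("eu", 5.0), ("us", 2.5), ("eu", 1.5), ("ap", 4.0)]


def totals_with_sql(rows):
    """Declarative style: describe the result and let the engine plan it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    cursor = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    result = dict(cursor)  # each row is a (region, total) pair
    conn.close()
    return result


def totals_with_mapreduce(rows):
    """Procedural style: group values by key, then reduce each group."""
    groups = defaultdict(list)
    for region, amount in rows:  # map and shuffle
        groups[region].append(amount)
    return {region: sum(amounts) for region, amounts in groups.items()}  # reduce


if __name__ == "__main__":
    print(totals_with_sql(ROWS))
    print(totals_with_mapreduce(ROWS))
```

In an MPP warehouse the engine distributes the GROUP BY across nodes on its own. With MapReduce, the developer (or a layer such as Hive) writes the distribution logic as map and reduce functions.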

5. AWS Data Pipeline

6. Other