These are my notes from a recent discovery task:
- EC2 can write to a Kinesis data stream; a Firehose delivery stream can then transform the data with Lambda and store the result in S3, where EMR can read it (Firehose does not deliver to EMR directly).
- EMR is a good choice for running Spark applications (Python, Java, Scala) and can use S3 as destination storage.
- Athena performs 30% better on S3 data stored as Parquet than on the same data stored as JSON; Parquet is columnar, so a query scans only the columns it needs.
- Redshift is read-optimized (columnar storage, built for analytic queries), while MySQL is write-optimized (row storage, built for transactional workloads).
- With MySQL, application performance benefits from loading small volumes of data more frequently.
- Redshift is more efficient when loading large volumes of data less frequently.
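
To make the Lambda step in the Firehose pipeline above more concrete, here is a minimal sketch of a transformation function. Firehose hands the function a batch of base64-encoded records and expects each one back with the same `recordId`, a `result` status, and the (re-encoded) data. The `processed` field and the assumption that every payload is a JSON object are my own illustration, not something from the discovery task.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and hand it back for delivery."""
    output = []
    for record in event["records"]:
        try:
            # Firehose delivers the payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transformation: tag the event with a marker field.
            payload["processed"] = True
            # A trailing newline keeps the JSON objects separable once in S3.
            data = (json.dumps(payload) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(data).decode("utf-8"),
            })
        except (ValueError, TypeError):
            # Not valid base64/JSON, or not a JSON object:
            # return the record unchanged and mark it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every incoming `recordId` has to appear in the response, which is why the failure branch returns the original data instead of skipping the record.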
And a great video with Vladimir and Benjamin discussing the evolution of the architecture: