Amazon Kinesis Data Streams - Kinesis vs Kafka
Keywords: AWS, Amazon, Kinesis, Data Stream, Best Practice
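Amazon Kinesis is AWS's counterpart to Apache Kafka. The two have the following in common:
Both follow the publish/subscribe model. Kafka splits a topic into partitions; Kinesis splits a stream into shards.
Both are distributed systems.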
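To make the shard concept concrete, here is a minimal sketch of how a partition key ends up on a shard. Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The helper names (`even_shard_ranges`, `hash_key`, `shard_for`) and the evenly split ranges are assumptions made for illustration; a real stream reports its actual ranges through the API. Kafka's default partitioner also hashes the record key, though with a different hash function.

```python
import hashlib

# Upper bound of the 128-bit hash key space used for shard ranges.
MAX_HASH_KEY = 2**128 - 1


def even_shard_ranges(shard_count):
    """Split the hash key space into contiguous, roughly equal ranges (illustrative)."""
    step = (MAX_HASH_KEY + 1) // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * step
        # The last range absorbs any remainder so the whole space is covered.
        end = MAX_HASH_KEY if i == shard_count - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def hash_key(partition_key):
    """MD5 of the UTF-8 partition key, read as a 128-bit unsigned integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key, ranges):
    """Return the index of the shard whose range contains the key's hash."""
    h = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return index
    raise ValueError("ranges do not cover the hash key space")


ranges = even_shard_ranges(4)
for key in ("user-1", "user-2", "user-1"):
    # The same key always maps to the same shard, which is what
    # gives per-key ordering within a shard.
    print(key, "-> shard", shard_for(key, ranges))
```

Because the mapping is deterministic, all records with the same partition key land on the same shard and keep their relative order there.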
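They differ in the following ways:
Kafka performs better, but configuring it yourself is more error-prone. Kinesis is somewhat slower, but carries almost no operational burden.
Kafka can retain data indefinitely. Kinesis cannot: it keeps records for at most 365 days, and anything older has to be persisted elsewhere, for example in S3.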
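The retention limit can be sketched in code. The snippet below is a minimal example, assuming boto3 is installed and credentials are configured; `retention_hours` and `extend_retention` are hypothetical helper names. The boto3 call `increase_stream_retention_period` takes the retention in hours, so 365 days corresponds to 8760 hours, and the default retention is 24 hours.

```python
MIN_RETENTION_HOURS = 24        # default retention of a Kinesis data stream
MAX_RETENTION_HOURS = 365 * 24  # 8760 hours, the maximum retention


def retention_hours(days):
    """Convert days to hours and check the value against the Kinesis limits."""
    hours = days * 24
    if not MIN_RETENTION_HOURS <= hours <= MAX_RETENTION_HOURS:
        raise ValueError(
            f"retention must be between 1 and 365 days, got {days}"
        )
    return hours


def extend_retention(stream_name, days):
    """Raise the retention period of an existing stream (hypothetical helper)."""
    # Imported lazily so retention_hours() can be used without boto3.
    import boto3

    client = boto3.client("kinesis")
    # This call only increases retention; it is rejected if the new value
    # is lower than the stream's current retention period.
    client.increase_stream_retention_period(
        StreamName=stream_name,
        RetentionPeriodHours=retention_hours(days),
    )


print(retention_hours(7))    # 168
print(retention_hours(365))  # 8760
```

Anything that must outlive the 365-day window still needs a separate archive such as S3.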
Comparison
| | Apache Kafka | Amazon Kinesis |
|---|---|---|
| Developed/Hosted By | Apache Software Foundation (originally LinkedIn) | Amazon |
| Software | Open-source | Proprietary |
| SDK Support | Official Kafka client is Java; other languages rely on community clients | AWS SDK supports Android, Java, Go, .NET, among others |
| Configuration & Features | More control over configuration and better performance | Mostly limited to retention period and number of shards |
| Data Stored In | Kafka partition | Kinesis shard |
| Reliability | Replication factor can be configured | Kinesis writes synchronously to 3 different machines/data centers |
| Performance | Kafka wins | Slower, because each message is written synchronously to 3 different machines |
| Configuration Store | Apache ZooKeeper | Amazon DynamoDB |
| Setup | Weeks | A couple of hours |
| Data Retention | Configurable, including indefinite retention | 24 hours by default; configurable up to 365 days |
| Log Compaction | Supported | Not supported; records are simply dropped when the retention period expires |
| Processing Events | Thousands of events/sec and more, depending on hardware and configuration | About 1,000 records/sec (1 MB/sec) of writes per shard; scales by adding shards |
| Checkpointing | Offsets stored in a special topic | DynamoDB |
| Ordering | Partition level | Shard level |
| Human Costs | Requires people to install and manage clusters and to handle high availability, durability, and recovery | Pay-as-you-go; no cluster to operate |
| Producer Throughput | Kafka wins | Kinesis is a bit slower than Kafka |
| Incident Risk/Maintenance | Higher with Kafka | Amazon takes care of it |
| Ordered sequence of immutable data records | Kafka topic | Kinesis stream |
| Each record has a unique number called | Offset number | Sequence number |
| Concepts | Kafka Streams | Kinesis Analytics |
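Since Kinesis throughput scales with the number of shards, a rough capacity estimate is simple arithmetic. The sketch below assumes the per-shard limits of a provisioned stream with shared-throughput consumers: 1 MB/sec or 1,000 records/sec for writes, and 2 MB/sec for reads. The function name `shards_needed` and the sample workload are made up for illustration.

```python
import math

# Per-shard limits assumed for a provisioned-mode stream.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three limits."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_sec / READ_MB_PER_SHARD),
    )


# 5 MB/sec in, 12,000 records/sec, 10 MB/sec out:
# the record rate is the bottleneck here.
print(shards_needed(5, 12000, 10))  # 12
```

With Kafka the equivalent sizing depends on broker hardware and configuration, which is part of why the table lists more control but also more setup effort on the Kafka side.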
References: