I have several questions about auto.offset.reset. This configuration parameter governs how a consumer reads messages from Kafka when there is no initial offset in ZooKeeper or when an offset is out of range.
Q1: Does "no initial offset in ZooKeeper" mean that no consumer has consumed the messages yet (the offset is set once the consumer starts to consume)?
-- Yes, or if you consumed messages, but auto offset commit is disabled and you haven't explicitly committed any offsets.
Q2: What does "offset is out of range" mean? Can you elaborate on a scenario in which "offset is out of range" could happen?
Kafka uses a retention policy for topics to expire data and clean it up. If some messages expire and your consumer hasn't run in a while, the last committed offset may no longer exist in the log, which makes it out of range.
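To make the retention scenario concrete, here is a minimal sketch of the range check. It is a simplified model, not Kafka's actual code: the function name and its parameters are hypothetical, and the offsets are made-up numbers.

```python
def is_out_of_range(committed_offset, log_start_offset, log_end_offset):
    """Return True if the committed offset no longer points into the log.

    log_start_offset: offset of the oldest message still retained.
    log_end_offset:   offset that the next produced message will receive.
    (Hypothetical helper for illustration only.)
    """
    return committed_offset < log_start_offset or committed_offset > log_end_offset


# The consumer committed offset 40 and then stopped for a while.
# Retention has since deleted everything below offset 100.
print(is_out_of_range(40, 100, 250))   # True: offset 40 has expired
print(is_out_of_range(120, 100, 250))  # False: offset 120 is still retained
```

When the check comes out true, the consumer cannot resume from its committed offset, and auto.offset.reset decides where it starts instead.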
auto.offset.reset has two values: smallest and largest.
Assume one scenario: a producer has produced 10 messages to Kafka, and no consumer has consumed them yet.
Q3: If auto.offset.reset is set to "smallest", does it mean that the consumer will read messages starting from offset 0 (0 being the smallest here)?
Q4: If auto.offset.reset is set to "largest", does it mean that the consumer will not read any existing messages but will wait until new messages arrive?
Also correct. This is why in the quickstart you need to use the --from-beginning flag on the console consumer. Since the consumer is started after the console producer, it wouldn't see any messages unless it set auto.offset.reset to smallest, which is what --from-beginning does.
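The 10-message scenario can be worked through with a small model of the reset decision. This is only a sketch under stated assumptions, not the consumer's real implementation: resolve_start_offset and its parameters are hypothetical names, and the log is assumed to hold offsets 0 through 9 with nothing expired yet.

```python
def resolve_start_offset(committed_offset, log_start_offset, log_end_offset,
                         auto_offset_reset):
    """Pick the offset a consumer starts reading from (simplified model).

    committed_offset is None when no offset has been stored yet.
    log_end_offset is the offset the next produced message will receive.
    """
    has_valid_offset = (
        committed_offset is not None
        and log_start_offset <= committed_offset <= log_end_offset
    )
    if has_valid_offset:
        # A usable committed offset always wins; the reset policy is ignored.
        return committed_offset
    if auto_offset_reset == "smallest":
        return log_start_offset
    if auto_offset_reset == "largest":
        return log_end_offset
    raise ValueError("unknown auto.offset.reset value: %r" % auto_offset_reset)


# 10 messages produced (offsets 0..9), no consumer has run yet.
print(resolve_start_offset(None, 0, 10, "smallest"))  # 0: reads all 10 messages
print(resolve_start_offset(None, 0, 10, "largest"))   # 10: waits for new messages
print(resolve_start_offset(4, 0, 10, "largest"))      # 4: committed offset is used
```

The last call shows that the setting only matters when there is no usable committed offset; a consumer that has already committed offset 4 resumes from there regardless of the policy.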