Description
Read the FAQ first: https://github.com/edenhill/librdkafka/wiki/FAQ
We have a huge Kafka cluster (1000+ nodes). I use the librdkafka C++ producer to send messages to Kafka.
When we increased the partition count of a topic, one of the brokers was out of sync, so the producer may have received wrong metadata.
The log shows: LOG:5PARTCNT[thrd:main]: Topic org.xxxxxxx partition count changed from 16 to 10
but the topic actually has 22 partitions.
The producer then crashed.
I got a core file, core-rdk:broker405-31053-1531725863, which is 89G in size.
The backtrace is:
#0 0x00007f26ecaff495 in raise () from /lib64/libc.so.6
#1 0x00007f26ecb00c75 in abort () from /lib64/libc.so.6
#2 0x00000000005ad293 in rd_kafka_crash () at rdkafka.c:3367
#3 0x0000000000604df5 in rd_kafka_toppar_destroy_final () at rdkafka_partition.c:269
#4 0x00000000005e94e8 in rd_kafka_handle_Produce () at rdkafka_request.c:1934
#5 0x00000000005dc896 in rd_kafka_buf_callback () at rdkafka_buf.c:444
#6 0x00000000005c17ba in rd_kafka_recv () at rdkafka_broker.c:1288
#7 0x00000000005d9ed0 in rd_kafka_transport_io_event () at rdkafka_transport.c:1419
#8 0x00000000005c8a40 in rd_kafka_broker_serve () at rdkafka_broker.c:2533
#9 0x00000000005ca499 in rd_kafka_broker_thread_main () at rdkafka_broker.c:2820
#10 0x0000000000616a17 in _thrd_wrapper_function ()
#11 0x00007f26ece68aa1 in start_thread () from /lib64/libpthread.so.0
#12 0x00007f26ecbb5bcd in clone () from /lib64/libc.so.6
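To make the suspected trigger concrete: Kafka does not let a topic's partition count shrink (short of deleting and recreating the topic), so a reported drop from 16 to 10 most likely means the metadata came from a stale broker. The following is a minimal application-side sketch of that sanity check. `PartitionCountGuard` and its methods are hypothetical names for illustration only; they are not part of librdkafka and do not describe how librdkafka handles metadata internally.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not a librdkafka API): remembers the highest
// partition count seen for each topic and rejects any later report
// that is lower, since that suggests stale metadata.
class PartitionCountGuard {
public:
    // Returns true if the reported count is plausible (first report, or
    // not lower than the highest count seen so far). Returns false if the
    // count decreased; the stored maximum is left unchanged in that case.
    bool accept(const std::string& topic, int reported) {
        auto it = max_seen_.find(topic);
        if (it == max_seen_.end()) {
            max_seen_.emplace(topic, reported);
            return true;
        }
        if (reported < it->second) {
            return false;  // likely stale metadata from an out-of-sync broker
        }
        it->second = reported;
        return true;
    }

    // Highest partition count accepted so far, or -1 for an unknown topic.
    int maxSeen(const std::string& topic) const {
        auto it = max_seen_.find(topic);
        return it == max_seen_.end() ? -1 : it->second;
    }

private:
    std::unordered_map<std::string, int> max_seen_;
};
```

With the numbers from this report, `accept("org.xxxxxxx", 16)` would return true, a following `accept("org.xxxxxxx", 10)` would return false, and a later `accept("org.xxxxxxx", 22)` would return true again.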
How to reproduce
It can't be reproduced every time. It depends on broker state, and it is difficult to make one broker go out of sync.
Checklist
- librdkafka version (release number or git tag): v0.11.5-PRE7
- Apache Kafka version: 0.8.2.1
- librdkafka client configuration:
metadata.broker.list=rz-data-rt023:9092,rz-data-rt198:9092,gh-data-rt0774:9092,gh-data-rt1066:9092
api.version.request=false
broker.version.fallback=0.8.2.1
queue.buffering.max.messages=300000
message.max.bytes=4000000
topic.metadata.refresh.interval.ms=300000
metadata.max.age.ms=1500000
queue.buffering.max.ms=5
batch.num.messages=5000
message.send.max.retries=1
message.timeout.ms=900000
request.required.acks=1
request.timeout.ms=900000
socket.max.fails=0
log.connection.close=false
socket.keepalive.enable=true
queue.buffering.backpressure.threshold=0
- Operating system: CentOS 6