Broker Load error

[Details] Broker Load importing 300 million rows from HDFS fails with an error
[StarRocks version] 2.1.4
[Cluster size] 3 FE (3 followers) + 13 BE
[Machine specs] 96C / 64G / 10 GbE
BE error log:
I0425 13:06:03.072351 30408 client_cache.cpp:64] get_client(): cached client for TNetworkAddress(hostname=172.16.40.81, port=9021)
I0425 13:06:03.115648 30318 client_cache.cpp:64] get_client(): cached client for TNetworkAddress(hostname=172.16.40.83, port=8000)
W0425 13:06:03.150573 29985 delta_writer.cpp:294] Invalid argument: Fail to do LZ4FRAME decompress, res=ERROR_allocation_failed
/root/starrocks/be/src/storage/rowset/page_io.cpp:187 opts.codec->decompress(compressed_body, &decompressed_body)
/root/starrocks/be/src/storage/rowset/scalar_column_iterator.cpp:299 _reader->read_page(_opts, iter.page(), &handle, &page_body, &footer)
/root/starrocks/be/src/storage/rowset/scalar_column_iterator.cpp:101 _read_data_page(_page_iter)
/root/starrocks/be/src/storage/rowset/vectorized/segment_iterator.cpp:105 iter->seek_to_ordinal(pos)
/root/starrocks/be/src/storage/rowset/vectorized/segment_iterator.cpp:598 _context->seek_columns(_cur_rowid)
/root/starrocks/be/src/storage/rowset/vectorized/segment_iterator.cpp:674 _read(chunk, rowid, chunk_capacity - chunk_start)
/root/starrocks/be/src/storage/vectorized/merge_iterator.cpp:141 fill(i)
/root/starrocks/be/src/storage/vectorized/merge_iterator.cpp:185 init()
/root/starrocks/be/src/storage/rowset/beta_rowset_writer.cpp:363 _final_merge()
W0425 13:06:03.174194 29977 delta_writer.cpp:294] Invalid argument: Fail to do LZ4FRAME decompress, res=ERROR_allocation_failed
/root/starrocks/be/src/storage/rowset/page_io.cpp:187 opts.codec->decompress(compressed_body, &decompressed_body)
/root/starrocks/be/src/storage/rowset/scalar_column_iterator.cpp:299 _reader->read_page(_opts, iter.page(), &handle, &page_body, &footer)
/root/starrocks/be/src/storage/rowset/scalar_column_iterator.cpp:101 _read_data_page(_page_iter)
/root/starrocks/be/src/storage/rowset/vectorized/segment_iterator.cpp:105 iter->seek_to_ordinal(pos)
/root/starrocks/be/src/storage/rowset/vectorized/segment_iterator.cpp:598 _context->seek_columns(_cur_rowid)
/root/starrocks/be/src/storage/rowset/vectorized/segment_iterator.cpp:674 _read(chunk, rowid, chunk_capacity - chunk_start)
/root/starrocks/be/src/storage/vectorized/merge_iterator.cpp:141 fill(i)
/root/starrocks/be/src/storage/vectorized/merge_iterator.cpp:185 init()
/root/starrocks/be/src/storage/rowset/beta_rowset_writer.cpp:363 _final_merge()

Could you paste the detailed broker log so we can take a look at the error?


type: LOAD_RUN_FAIL; msg: intolerable failure in opening node channels

See the post "导入失败: too many tablet version 问题解决方法" (Import failed: how to resolve the "too many tablet versions" problem); you can follow it to check whether this is your issue.
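For context, the "too many tablet versions" error means loads are creating data versions faster than compaction can merge them; the usual remedies are batching into fewer, larger load jobs and, if needed, raising the BE-side version cap. A hedged sketch of the relevant be.conf setting (assuming the `tablet_max_versions` key, which commonly defaults to 1000; verify it exists in your 2.1.4 build before relying on it):

```
# be.conf -- hypothetical tuning, verify the key in your StarRocks version.
# Upper bound on versions a tablet may accumulate before new loads are
# rejected with "too many tablet versions" (commonly defaults to 1000).
tablet_max_versions = 2000
```

Raising the cap only buys headroom; reducing load frequency so compaction can keep up is the more durable fix.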

broker.log (35.0 KB)

W0401 20:44:49.579327 25009 tablet_sink.cpp:219] NodeChannel[31704-11339] add batch req success but status isn't ok, load_id=6c44fe9d-57e4-1af2-fbf8-68c4dfee0581, txn_id=123309, node=xxx:8060, errmsg=too many tablet versions

But the timestamps don't match up; this import ran today a little after 12:00.

In the broker log, does org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset mean the connection to HDFS was reset?

Is your query concurrency high? Check whether the FE has been under pressure and restarted.

Query concurrency is not high. [screenshot]

Please confirm that every one of your BE nodes has permission to access HDFS, because data pulls are randomly assigned across the brokers. Also check that the network is reachable.
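To rule out basic reachability problems from each BE host, a quick TCP probe of the NameNode RPC port can help. A minimal sketch in Python; the hostname and port below are placeholders, so substitute the host and port from your cluster's fs.defaultFS:

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder address -- replace with your NameNode host and RPC port
# (often 8020 or 9000, per fs.defaultFS in core-site.xml).
print(can_connect("namenode.example.com", 8020))
```

Running this on every BE host quickly shows whether any node is firewalled off from HDFS; a TCP connect succeeding does not prove HDFS permissions are correct, but a failure pinpoints a network problem.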

The network is definitely reachable, because the BE, FE, and Hadoop nodes are co-deployed on the same machines.