All BEs suddenly crashed, version 2.1.4


All of our BEs suddenly went down just now. Looking at be.out, every one of them reports the same problem:

start time: Mon May 16 19:47:22 CST 2022
*** Aborted at 1652701645 (unix time) try "date -d @1652701645" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 18722 (TID 0x7f3d57951700) from PID 0; stack trace: ***
@ 0x350a8a2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f3db0ce0630 (unknown)
@ 0x0 (unknown)
start time: Mon May 16 19:47:26 CST 2022
*** Aborted at 1652701649 (unix time) try "date -d @1652701649" if you are using GNU date ***
PC: @ 0x286d0d7 starrocks::ExprContext::close()
*** SIGSEGV (@0x0) received by PID 19997 (TID 0x7f06ab361700) from PID 0; stack trace: ***
@ 0x350a8a2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f06fc552630 (unknown)
@ 0x286d0d7 starrocks::ExprContext::close()
@ 0x286efcf starrocks::Expr::close()
@ 0x24c6a1e starrocks::vectorized::TabletScanner::close()
@ 0x24c70a8 starrocks::vectorized::TabletScanner::~TabletScanner()
@ 0x22117c7 _ZZN9starrocks10ObjectPool3addINS_10vectorized13TabletScannerEEEPT_S5_ENUlPvE_4_FUNES6_
@ 0x221104f starrocks::vectorized::OlapScanNode::~OlapScanNode()
@ 0x2211612 starrocks::vectorized::OlapScanNode::~OlapScanNode()
@ 0x1cb93c7 std::_Sp_counted_ptr<>::_M_dispose()
@ 0x1cb3c32 starrocks::RuntimeState::~RuntimeState()
@ 0x1c4c802 starrocks::FragmentExecState::~FragmentExecState()
@ 0x1c55bab std::_Sp_counted_ptr<>::_M_dispose()
@ 0x16d703a std::_Sp_counted_base<>::_M_release()
@ 0x1c4dd15 _ZNSt17_Function_handlerIFvvEZN9starrocks11FragmentMgr18exec_plan_fragmentERKNS1_23TExecPlanFragmentParamsERKSt8functionIFvPNS1_20PlanFragmentExecutorEEESC_EUlvE_E10_M_managerERSt9_Any_dataRKSF_St18_Manager_operation
@ 0x1d794f2 starrocks::FunctionRunnable::~FunctionRunnable()
@ 0x1d78ff2 starrocks::ThreadPool::dispatch_thread()
@ 0x1d747f8 starrocks::thread::supervise_thread()
@ 0x7f06fc54aea5 start_thread
@ 0x7f06fb94fb0d __clone
@ 0x0 (unknown)
start time: Mon May 16 19:47:32 CST 2022
*** Aborted at 1652701654 (unix time) try "date -d @1652701654" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 21301 (TID 0x7f207ed51700) from PID 0; stack trace: ***
@ 0x350a8a2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f20d7f4e630 (unknown)
@ 0x0 (unknown)
start time: Mon May 16 19:47:38 CST 2022
*** Aborted at 1652701661 (unix time) try "date -d @1652701661" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 22582 (TID 0x7f6cb015f700) from PID 0; stack trace: ***
@ 0x350a8a2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f6d02433630 (unknown)
@ 0x0 (unknown)
start time: Mon May 16 20:06:54 CST 2022
*** Aborted at 1652703280 (unix time) try "date -d @1652703280" if you are using GNU date ***
PC: @ 0x207e1f6 starrocks::EvHttpServer::on_header()
*** SIGSEGV (@0x0) received by PID 25569 (TID 0x7f3981796700) from PID 0; stack trace: ***
@ 0x350a8a2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f3a4d784630 (unknown)
@ 0x207e1f6 starrocks::EvHttpServer::on_header()
@ 0x359565c evhttp_read_header
@ 0x35973f3 bufferevent_readcb
@ 0x35855c1 event_process_active_single_queue
@ 0x3585c2f event_base_loop
@ 0x207c5f5 _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN9starrocks12EvHttpServer5startEvEUlvE_EEEEE6_M_runEv
@ 0x4fe7870 execute_native_thread_routine
@ 0x7f3a4d77cea5 start_thread
@ 0x7f3a4cb81b0d __clone
@ 0x0 (unknown)
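
For anyone triaging a be.out like the one above, a small script can condense each crash block into one record: when it aborted, which signal, which PID, and where the program counter was. The following is only a sketch. The function name `summarize_crashes`, the `SAMPLE` text, and the fixed UTC+8 offset (taken from the "CST" in the log's start-time lines) are assumptions made for illustration; nothing here is a StarRocks tool.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: "CST" in the log means China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8))

ABORT_RE = re.compile(r"\*\*\* Aborted at (\d+) \(unix time\)")
PC_RE = re.compile(r"^PC: @\s+(0x[0-9a-f]+)\s+(.*)$")
SIGNAL_RE = re.compile(r"\*\*\* (SIG\w+) .* received by PID (\d+)")


def summarize_crashes(text):
    """Return one dict per '*** Aborted at ...' block found in the text."""
    crashes = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = ABORT_RE.search(line)
        if match:
            # A new crash block starts here; convert the unix timestamp.
            stamp = int(match.group(1))
            current = {
                "time": datetime.fromtimestamp(stamp, CST).isoformat(),
                "pc": None,
                "signal": None,
                "pid": None,
            }
            crashes.append(current)
            continue
        if current is None:
            continue  # lines before the first crash block
        match = PC_RE.match(line)
        if match:
            current["pc"] = match.group(2)
            continue
        match = SIGNAL_RE.search(line)
        if match:
            current["signal"] = match.group(1)
            current["pid"] = int(match.group(2))
    return crashes


# Two blocks copied from the log above, used as sample input.
SAMPLE = """\
start time: Mon May 16 19:47:22 CST 2022
*** Aborted at 1652701645 (unix time) try "date -d @1652701645" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 18722 (TID 0x7f3d57951700) from PID 0; stack trace: ***
start time: Mon May 16 19:47:26 CST 2022
*** Aborted at 1652701649 (unix time) try "date -d @1652701649" if you are using GNU date ***
PC: @ 0x286d0d7 starrocks::ExprContext::close()
*** SIGSEGV (@0x0) received by PID 19997 (TID 0x7f06ab361700) from PID 0; stack trace: ***
"""

if __name__ == "__main__":
    for crash in summarize_crashes(SAMPLE):
        print(crash)
```

Run against the full be.out, this makes it easy to see that the process was restarted and crashed again within a few seconds each time, and that only some of the crashes carry a usable stack.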

What is causing this, and how should it be resolved?

Were any operations performed beforehand? Can the BEs be restarted successfully?

They restart successfully, but two more went down just now and were brought back up automatically.

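No particular operation was performed, but the cluster runs quite a few real-time and offline load jobs, using the DataX plugin and the Flink plugin.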

BEs kept going down one after another. After we restarted all the FEs, the cluster returned to normal.

Please send the be.warning log from around the time of the crash, and the core file as well if you can. You can upload it to Baidu Cloud.

Hi, I have sent you the error log by private message. Which file is the core file?
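
On the core file question: on Linux, a core file is the memory dump the kernel writes when a process dies from a signal such as SIGSEGV. Whether one is written depends on the process's core size limit (`ulimit -c`), and its name and location come from `/proc/sys/kernel/core_pattern`. The sketch below only inspects those two generic Linux settings. The function name `core_dump_settings` is made up for illustration, and the thread does not say whether core dumps were enabled for these BE processes.

```python
import resource
from pathlib import Path


def core_dump_settings(pattern_path="/proc/sys/kernel/core_pattern"):
    """Report the core-dump limit of this process and the kernel core pattern."""
    # RLIMIT_CORE is what `ulimit -c` shows; a soft limit of 0 disables core files.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    path = Path(pattern_path)
    pattern = path.read_text().strip() if path.exists() else None
    return {
        "soft_limit": soft,
        "hard_limit": hard,
        "enabled": soft != 0,
        "core_pattern": pattern,
    }


if __name__ == "__main__":
    print(core_dump_settings())
```

Note that the limit reported is that of the shell or process running the script. For an already running BE, the effective value is the "Max core file size" row in `/proc/<pid>/limits`.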

Hi, has this issue been resolved?