StarRocks 2.3.2 pipeline engine performance issue

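[Details] After enabling the pipeline engine, the cluster's p99 latency degraded significantly. Analysis of query times shows that several hundred slow queries appear every 5 minutes, and the profiles of those slow queries show that PendingTime accounts for most of the elapsed time.
After disabling the pipeline engine, the problem went away.
[Background] We tuned the base compaction configuration to lower the compaction frequency, which brought the cluster's disk I/O down to a fairly low level, so disk I/O can be ruled out as the cause of the slow queries.
After disabling the pipeline engine, the number of slow queries dropped sharply, which also rules out the SQL sent by external clients as the cause of the slow queries.
[Business impact]
[StarRocks version] 2.3.2
[Cluster size] 5 FE (2 follower + 2 observer) + 16 BE (FE and BE co-located)
[Machine info] CPU virtual cores / memory / NIC: 40C / 128G / 10 GbE
[Attachments]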
Query:
Summary:
- Query ID: a82028f0-3975-11ed-846b-6c92bf67ef6c
- Start Time: 2022-09-21 14:21:41
- End Time: 2022-09-21 14:21:41
- Total: 304ms
- Query Type: Query
- Query State: EOF
- StarRocks Version: 2.3.2
- User: root
- Default Db: default_cluster:cache_log
- Sql Statement: SELECT UNIX_TIMESTAMP(time_at_5min), domain, isp, province, sum(traffic) as traffic FROM nginx_region_detail WHERE (domain IN (116153,228593,136786,132537,222065,254919,118285,220270,70766,224345) AND time_at_5min >= '2022-09-20 23:20:00' AND time_at_5min <= '2022-09-20 23:50:00' AND isp IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,19,21,22,23,25,27,28,29) AND province IN (1,2,3,4,5,6,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,386) AND country IN (1) AND is_parent = 0 AND is_https = 1) GROUP BY time_at_5min, domain, isp, province
- QueryCpuCost: 104ms
- QueryMemCost: 16.788MB
- Variables: parallel_fragment_exec_instance_num=1,pipeline_dop=0
- Collect Profile Time: 8ms
Planner:
- Total: 44ms / 1
Execution Profile a82028f0-3975-11ed-846b-6c92bf67ef6c:
- ExecutionTotalTime: 239.905ms
Fragment 0:
- BackendNum: 1
- InstanceNum: 1
- MemoryLimit: 91.60 GB
- PeakMemoryUsage: 331.45 KB
Pipeline (id=0):
- ActiveTime: 799.395us
- BlockByInputEmpty: 54
- __MAX_OF_BlockByInputEmpty: 4
- __MIN_OF_BlockByInputEmpty: 1
- BlockByOutputFull: 231
- __MAX_OF_BlockByOutputFull: 17
- __MIN_OF_BlockByOutputFull: 6
- BlockByPrecondition: 0
- DegreeOfParallelism: 20
- DriverTotalTime: 239.905ms
- OverheadTime: 799.395us
- PendingTime: 237.302ms
- InputEmptyTime: 196.373ms
- FirstInputEmptyTime: 195.500ms
- FollowupInputEmptyTime: 873.30us
- OutputFullTime: 41.528ms
- PendingFinishTime: 0ns
- PreconditionBlockTime: 0ns
- ScheduleCount: 306
- __MAX_OF_ScheduleCount: 21
- __MIN_OF_ScheduleCount: 9
- ScheduleTime: 1.803ms
- TotalDegreeOfParallelism: 20
- YieldByTimeLimit: 0
RESULT_SINK:
CommonMetrics:
- CloseTime: 1.130us
- OperatorTotalTime: 342.103us
- PeakMemoryUsage: 0.00
- PrepareTime: 17.588us
- PullChunkNum: 0
- PullRowNum: 0
- PullTotalTime: 0ns
- PushChunkNum: 320
- __MAX_OF_PushChunkNum: 39
- __MIN_OF_PushChunkNum: 8
- PushRowNum: 4.073K (4073)
- __MAX_OF_PushRowNum: 484
- __MIN_OF_PushRowNum: 122
- PushTotalTime: 340.774us
- SetFinishedTime: 47ns
- SetFinishingTime: 150ns
UniqueMetrics:
EXCHANGE_SOURCE (plan_node_id=6):
CommonMetrics:
- CloseTime: 543ns
- OperatorTotalTime: 110.327us
- PeakMemoryUsage: 0.00
- PrepareTime: 13.11us
- PullChunkNum: 324
- __MAX_OF_PullChunkNum: 39
- __MIN_OF_PullChunkNum: 8
- PullRowNum: 4.073K (4073)
- __MAX_OF_PullRowNum: 484
- __MIN_OF_PullRowNum: 122
- PullTotalTime: 108.373us
- PushChunkNum: 0
- PushRowNum: 0
- PushTotalTime: 0ns
- SetFinishedTime: 107ns
- SetFinishingTime: 1.303us
UniqueMetrics:
- BytesPassThrough: 6.25 KB
- BytesReceived: 72.18 KB
- DecompressChunkTime: 828.114us
- DeserializeChunkTime: 3.644ms
- ReceiverProcessTotalTime: 7.380ms
- RequestReceived: 321
- SenderTotalTime: 7.294ms
- SenderWaitLockTime: 1.87ms
Fragment 1:
- BackendNum: 16
- InstanceNum: 16
- MemoryLimit: 91.60 GB
- PeakMemoryUsage: 23.87 MB
- __MAX_OF_PeakMemoryUsage: 1.56 MB
- __MIN_OF_PeakMemoryUsage: 1.47 MB
Pipeline (id=1):
- ActiveTime: 189.891us
- BlockByInputEmpty: 0
- BlockByOutputFull: 0
- BlockByPrecondition: 0
- DegreeOfParallelism: 20
- DriverTotalTime: 191.310ms
- OverheadTime: 189.891us
- PendingTime: 190.64ms
- InputEmptyTime: 187.597ms
- FirstInputEmptyTime: 187.597ms
- FollowupInputEmptyTime: 0ns
- OutputFullTime: 0ns
- PendingFinishTime: 2.489ms
- PreconditionBlockTime: 0ns
- ScheduleCount: 320
- ScheduleTime: 1.56ms
- TotalDegreeOfParallelism: 320
- YieldByTimeLimit: 0
EXCHANGE_SINK (plan_node_id=6):
CommonMetrics:
- CloseTime: 1.349us
- OperatorTotalTime: 97.597us
- PeakMemoryUsage: 0.00
- PrepareTime: 34.750us
- PullChunkNum: 0
- PullRowNum: 0
- PullTotalTime: 0ns
- PushChunkNum: 320
- PushRowNum: 4.073K (4073)
- __MAX_OF_PushRowNum: 26
- __MIN_OF_PushRowNum: 5
- PushTotalTime: 22.254us
- SetFinishedTime: 46ns
- SetFinishingTime: 73.945us
UniqueMetrics:
- DestID: 6
- DestFragments: a82028f0397511ed-846b6c92bf67ef8d
- PartType: UNPARTITIONED
- BytesPassThrough: 6.25 KB
- __MAX_OF_BytesPassThrough: 454.00 B
- __MIN_OF_BytesPassThrough: 0.00
- BytesSent: 72.18 KB
- __MAX_OF_BytesSent: 5.14 KB
- __MIN_OF_BytesSent: 0.00
- CompressTime: 11.567us
- NetworkTime: 3.31ms
- __MAX_OF_NetworkTime: 36.699ms
- __MIN_OF_NetworkTime: 0ns
- OverallThroughput: 2.4360837936401367 MB/sec
- __MAX_OF_OverallThroughput: 1.7358732223510742 MB/sec
- __MIN_OF_OverallThroughput: 0.0 /sec
- RequestSent: 300
- __MAX_OF_RequestSent: 20
- __MIN_OF_RequestSent: 0
- SerializeChunkTime: 5.984us
- ShuffleHashTime: 0ns
- UncompressedBytes: 92.51 KB
- __MAX_OF_UncompressedBytes: 608.00 B
- __MIN_OF_UncompressedBytes: 0.00
- WaitTime: 2.346ms
PROJECT (plan_node_id=5):
CommonMetrics:
- CloseTime: 8.276us
- OperatorTotalTime: 21.571us
- PeakMemoryUsage: 0.00
- PrepareTime: 13.173us
- PullChunkNum: 320
- PullRowNum: 4.073K (4073)
- __MAX_OF_PullRowNum: 26
- __MIN_OF_PullRowNum: 5
- PullTotalTime: 231ns
- PushChunkNum: 320
- PushRowNum: 4.073K (4073)
- __MAX_OF_PushRowNum: 26
- __MIN_OF_PushRowNum: 5
- PushTotalTime: 12.859us
- SetFinishedTime: 97ns
- SetFinishingTime: 103ns
UniqueMetrics:
AGGREGATE_BLOCKING_SOURCE (plan_node_id=4):
CommonMetrics:
- CloseTime: 9.155us
- OperatorTotalTime: 76.100us
- PeakMemoryUsage: 0.00
- PrepareTime: 13.794us
- PullChunkNum: 320
- PullRowNum: 4.073K (4073)
- __MAX_OF_PullRowNum: 26
- __MIN_OF_PullRowNum: 5
- PullTotalTime: 66.657us
- PushChunkNum: 0
- PushRowNum: 0
- PushTotalTime: 0ns
- SetFinishedTime: 132ns
- SetFinishingTime: 153ns
UniqueMetrics:
Pipeline (id=0):
- ActiveTime: 346.251us
- BlockByInputEmpty: 1.709K (1709)
- __MAX_OF_BlockByInputEmpty: 8
- __MIN_OF_BlockByInputEmpty: 3
- BlockByOutputFull: 0
- BlockByPrecondition: 0
- DegreeOfParallelism: 20
- DriverTotalTime: 190.489ms
- OverheadTime: 346.251us
- PendingTime: 186.776ms
- InputEmptyTime: 186.814ms
- FirstInputEmptyTime: 44.850ms
- __MAX_OF_FirstInputEmptyTime: 113.402ms
- __MIN_OF_FirstInputEmptyTime: 34.887ms
- FollowupInputEmptyTime: 141.963ms
- __MAX_OF_FollowupInputEmptyTime: 143.772ms
- __MIN_OF_FollowupInputEmptyTime: 75.324ms
- OutputFullTime: 0ns
- PendingFinishTime: 0ns
- PreconditionBlockTime: 0ns
- ScheduleCount: 2.056K (2056)
- __MAX_OF_ScheduleCount: 9
- __MIN_OF_ScheduleCount: 4
- ScheduleTime: 3.366ms
- TotalDegreeOfParallelism: 320
- YieldByTimeLimit: 0
AGGREGATE_BLOCKING_SINK (plan_node_id=4):
CommonMetrics:
- CloseTime: 8.186us
- OperatorTotalTime: 80.363us
- PeakMemoryUsage: 1.46 MB
- __MAX_OF_PeakMemoryUsage: 5.03 KB
- __MIN_OF_PeakMemoryUsage: 4.26 KB
- PrepareTime: 80.562us
- PullChunkNum: 0
- PullRowNum: 0
- PullTotalTime: 0ns
- PushChunkNum: 1.855K (1855)
- __MAX_OF_PushChunkNum: 8
- __MIN_OF_PushChunkNum: 3
- PushRowNum: 4.073K (4073)
- __MAX_OF_PushRowNum: 26
- __MIN_OF_PushRowNum: 5
- PushTotalTime: 70.883us
- SetFinishedTime: 42ns
- SetFinishingTime: 1.249us
UniqueMetrics:
- GroupingKeys: 15: time_at_5min, 1: domain, 4: isp, 5: province
- AggregateFunctions: sum(23: sum)
- AggComputeTime: 26.736us
- ExprComputeTime: 18.680us
- ExprReleaseTime: 19.650us
- GetResultsTime: 41.356us
- HashTableSize: 4.073K (4073)
- __MAX_OF_HashTableSize: 26
- __MIN_OF_HashTableSize: 5
- InputRowCount: 4.073K (4073)
- __MAX_OF_InputRowCount: 26
- __MIN_OF_InputRowCount: 5
- PassThroughRowCount: 0
- ResultAggAppendTime: 1.582us
- ResultGroupByAppendTime: 3.885us
- ResultIteratorTime: 20.716us
- RowsReturned: 0
- StreamingTime: 0ns
EXCHANGE_SOURCE (plan_node_id=3):
CommonMetrics:
- CloseTime: 716ns
- OperatorTotalTime: 85.44us
- PeakMemoryUsage: 0.00
- PrepareTime: 15.458us
- PullChunkNum: 1.855K (1855)
- __MAX_OF_PullChunkNum: 8
- __MIN_OF_PullChunkNum: 3
- PullRowNum: 4.073K (4073)
- __MAX_OF_PullRowNum: 26
- __MIN_OF_PullRowNum: 5
- PullTotalTime: 80.157us
- PushChunkNum: 0
- PushRowNum: 0
- PushTotalTime: 0ns
- SetFinishedTime: 109ns
- SetFinishingTime: 4.59us
UniqueMetrics:
- BytesPassThrough: 9.79 KB
- __MAX_OF_BytesPassThrough: 2.42 KB
- __MIN_OF_BytesPassThrough: 0.00
- BytesReceived: 126.74 KB
- __MAX_OF_BytesReceived: 9.12 KB
- __MIN_OF_BytesReceived: 6.26 KB
- DecompressChunkTime: 37.342us
- DeserializeChunkTime: 468.390us
- ReceiverProcessTotalTime: 680.241us
- RequestReceived: 129
- __MAX_OF_RequestReceived: 9
- __MIN_OF_RequestReceived: 7
- SenderTotalTime: 678.388us
- SenderWaitLockTime: 3.655us
Fragment 2:
- BackendNum: 16
- InstanceNum: 16
- MemoryLimit: 91.48 GB
- PeakMemoryUsage: 274.09 MB
- __MAX_OF_PeakMemoryUsage: 17.38 MB
- __MIN_OF_PeakMemoryUsage: 16.54 MB
Pipeline (id=2):
- ActiveTime: 112.347us
- BlockByInputEmpty: 0
- BlockByOutputFull: 0
- BlockByPrecondition: 0
- DegreeOfParallelism: 20
- DriverTotalTime: 36.794ms
- __MAX_OF_DriverTotalTime: 157.956ms
- __MIN_OF_DriverTotalTime: 4.746ms
- OverheadTime: 112.347us
- PendingTime: 35.741ms
- __MAX_OF_PendingTime: 156.832ms
- __MIN_OF_PendingTime: 3.898ms
- InputEmptyTime: 4.985ms
- __MAX_OF_InputEmptyTime: 153.727ms
- __MIN_OF_InputEmptyTime: 2.395ms
- FirstInputEmptyTime: 4.985ms
- __MAX_OF_FirstInputEmptyTime: 153.727ms
- __MIN_OF_FirstInputEmptyTime: 2.395ms
- FollowupInputEmptyTime: 0ns
- OutputFullTime: 0ns
- PendingFinishTime: 30.758ms
- __MAX_OF_PendingFinishTime: 153.615ms
- __MIN_OF_PendingFinishTime: 109.971us
- PreconditionBlockTime: 0ns
- ScheduleCount: 320
- ScheduleTime: 940.614us
- TotalDegreeOfParallelism: 320
- YieldByTimeLimit: 0
EXCHANGE_SINK (plan_node_id=3):
CommonMetrics:
- CloseTime: 1.52us
- OperatorTotalTime: 107.428us
- PeakMemoryUsage: 0.00
- PrepareTime: 46.280us
- PullChunkNum: 0
- PullRowNum: 0
- PullTotalTime: 0ns
- PushChunkNum: 8
- __MAX_OF_PushChunkNum: 1
- __MIN_OF_PushChunkNum: 0
- PushRowNum: 4.073K (4073)
- __MAX_OF_PushRowNum: 907
- __MIN_OF_PushRowNum: 0
- PushTotalTime: 28.578us
- SetFinishedTime: 93ns
- SetFinishingTime: 77.702us
UniqueMetrics:
- DestID: 3
- DestFragments: a82028f0397511ed-846b6c92bf67ef7d, a82028f0397511ed-846b6c92bf67ef7e, a82028f0397511ed-846b6c92bf67ef7f, a82028f0397511ed-846b6c92bf67ef80, a82028f0397511ed-846b6c92bf67ef81, a82028f0397511ed-846b6c92bf67ef82, a82028f0397511ed-846b6c92bf67ef83, a82028f0397511ed-846b6c92bf67ef84, a82028f0397511ed-846b6c92bf67ef85, a82028f0397511ed-846b6c92bf67ef86, a82028f0397511ed-846b6c92bf67ef87, a82028f0397511ed-846b6c92bf67ef88, a82028f0397511ed-846b6c92bf67ef89, a82028f0397511ed-846b6c92bf67ef8a, a82028f0397511ed-846b6c92bf67ef8b, a82028f0397511ed-846b6c92bf67ef8c
- PartType: HASH_PARTITIONED
- BytesPassThrough: 9.79 KB
- __MAX_OF_BytesPassThrough: 1.96 KB
- __MIN_OF_BytesPassThrough: 0.00
- BytesSent: 126.74 KB
- __MAX_OF_BytesSent: 32.84 KB
- __MIN_OF_BytesSent: 0.00
- CompressTime: 6.399us
- NetworkTime: 7.0ms
- __MAX_OF_NetworkTime: 20.708ms
- __MIN_OF_NetworkTime: 28.310us
- OverallThroughput: 549.9258985519409 MB/sec
- __MAX_OF_OverallThroughput: 538.416693687439 MB/sec
- __MIN_OF_OverallThroughput: 0.0 /sec
- RequestSent: 113
- __MAX_OF_RequestSent: 30
- __MIN_OF_RequestSent: 0
- SerializeChunkTime: 4.798us
- ShuffleHashTime: 618ns
- UncompressedBytes: 147.62 KB
- __MAX_OF_UncompressedBytes: 29.65 KB
- __MIN_OF_UncompressedBytes: 0.00
- WaitTime: 229.529us
AGGREGATE_STREAMING_SOURCE (plan_node_id=2):
CommonMetrics:
- CloseTime: 11.754us
- OperatorTotalTime: 13.985us
- PeakMemoryUsage: 0.00
- PrepareTime: 14.294us
- PullChunkNum: 8
- __MAX_OF_PullChunkNum: 1
- __MIN_OF_PullChunkNum: 0
- PullRowNum: 4.073K (4073)
- __MAX_OF_PullRowNum: 907
- __MIN_OF_PullRowNum: 0
- PullTotalTime: 1.999us
- PushChunkNum: 0
- PushRowNum: 0
- PushTotalTime: 0ns
- SetFinishedTime: 117ns
- SetFinishingTime: 112ns
UniqueMetrics:
Pipeline (id=1):
- ActiveTime: 3.227ms
- BlockByInputEmpty: 20
- __MAX_OF_BlockByInputEmpty: 3
- __MIN_OF_BlockByInputEmpty: 0
- BlockByOutputFull: 0
- BlockByPrecondition: 0
- DegreeOfParallelism: 20
- DriverTotalTime: 9.127ms
- __MAX_OF_DriverTotalTime: 158.329ms
- __MIN_OF_DriverTotalTime: 4.425ms
- OverheadTime: 3.227ms
- PendingTime: 1.695ms
- __MAX_OF_PendingTime: 148.643ms
- __MIN_OF_PendingTime: 143.130us
- InputEmptyTime: 1.695ms
- __MAX_OF_InputEmptyTime: 148.644ms
- __MIN_OF_InputEmptyTime: 142.866us
- FirstInputEmptyTime: 202.829us
- FollowupInputEmptyTime: 1.492ms
- __MAX_OF_FollowupInputEmptyTime: 148.404ms
- __MIN_OF_FollowupInputEmptyTime: 0ns
- OutputFullTime: 0ns
- PendingFinishTime: 0ns
- PreconditionBlockTime: 0ns
- ScheduleCount: 340
- __MAX_OF_ScheduleCount: 4
- __MIN_OF_ScheduleCount: 1
- ScheduleTime: 4.205ms
- __MAX_OF_ScheduleTime: 10.684ms
- __MIN_OF_ScheduleTime: 2.618ms
- TotalDegreeOfParallelism: 320
- YieldByTimeLimit: 0
AGGREGATE_STREAMING_SINK (plan_node_id=2):
CommonMetrics:
- CloseTime: 5.88us
- OperatorTotalTime: 8.232us
- PeakMemoryUsage: 336.03 KB
- __MAX_OF_PeakMemoryUsage: 94.00 KB
- __MIN_OF_PeakMemoryUsage: 0.00
- PrepareTime: 82.638us
- PullChunkNum: 0
- PullRowNum: 0
- PullTotalTime: 0ns
- PushChunkNum: 8
- __MAX_OF_PushChunkNum: 1
- __MIN_OF_PushChunkNum: 0
- PushRowNum: 7.845K (7845)
- __MAX_OF_PushRowNum: 2.747K (2747)
- __MIN_OF_PushRowNum: 0
- PushTotalTime: 2.534us
- SetFinishedTime: 47ns
- SetFinishingTime: 560ns
UniqueMetrics:
- GroupingKeys: 15: time_at_5min, 1: domain, 4: isp, 5: province
- AggregateFunctions: sum(19: traffic)
- AggComputeTime: 2.370us
- ExprComputeTime: 64ns
- ExprReleaseTime: 29ns
- GetResultsTime: 1.546us
- HashTableSize: 4.073K (4073)
- __MAX_OF_HashTableSize: 907
- __MIN_OF_HashTableSize: 0
- InputRowCount: 7.845K (7845)
- __MAX_OF_InputRowCount: 2.747K (2747)
- __MIN_OF_InputRowCount: 0
- PassThroughRowCount: 0
- ResultAggAppendTime: 375ns
- ResultGroupByAppendTime: 414ns
- ResultIteratorTime: 417ns
- RowsReturned: 0
- StreamingTime: 0ns
PROJECT (plan_node_id=1):
CommonMetrics:
- CloseTime: 5.767us
- OperatorTotalTime: 6.148us
- PeakMemoryUsage: 0.00
- PrepareTime: 13.485us
- PullChunkNum: 8
- __MAX_OF_PullChunkNum: 1
- __MIN_OF_PullChunkNum: 0
- PullRowNum: 7.845K (7845)
- __MAX_OF_PullRowNum: 2.747K (2747)
- __MIN_OF_PullRowNum: 0
- PullTotalTime: 4ns
- PushChunkNum: 8
- __MAX_OF_PushChunkNum: 1
- __MIN_OF_PushChunkNum: 0
- PushRowNum: 7.845K (7845)
- __MAX_OF_PushRowNum: 2.747K (2747)
- __MIN_OF_PushRowNum: 0
- PushTotalTime: 192ns
- SetFinishedTime: 59ns
- SetFinishingTime: 124ns
UniqueMetrics:
CHUNK_ACCUMULATE (plan_node_id=0):
CommonMetrics:
- CloseTime: 386ns
- OperatorTotalTime: 752ns
- PeakMemoryUsage: 0.00
- PrepareTime: 13.591us
- PullChunkNum: 8
- __MAX_OF_PullChunkNum: 1
- __MIN_OF_PullChunkNum: 0
- PullRowNum: 7.845K (7845)
- __MAX_OF_PullRowNum: 2.747K (2747)
- __MIN_OF_PullRowNum: 0
- PullTotalTime: 6ns
- PushChunkNum: 9
- __MAX_OF_PushChunkNum: 2
- __MIN_OF_PushChunkNum: 0
- PushRowNum: 7.845K (7845)
- __MAX_OF_PushRowNum: 2.747K (2747)
- __MIN_OF_PushRowNum: 0
- PushTotalTime: 82ns
- SetFinishedTime: 131ns
- SetFinishingTime: 144ns
UniqueMetrics:
OLAP_SCAN (plan_node_id=0):
CommonMetrics:
- CloseTime: 636.511us
- OperatorTotalTime: 3.732ms
- PeakMemoryUsage: 0.00
- PrepareTime: 34.898us
- PullChunkNum: 2.354K (2354)
- __MAX_OF_PullChunkNum: 14
- __MIN_OF_PullChunkNum: 2
- PullRowNum: 7.845K (7845)
- __MAX_OF_PullRowNum: 2.747K (2747)
- __MIN_OF_PullRowNum: 0
- PullTotalTime: 3.95ms
- PushChunkNum: 0
- PushRowNum: 0
- PushTotalTime: 0ns
- SetFinishedTime: 145ns
- SetFinishingTime: 135ns
UniqueMetrics:
- MorselQueueType: fixed_morsel_queue
- Predicates: 1: domain IN (116153, 228593, 136786, 132537, 222065, 254919, 118285, 220270, 70766, 224345), 15: time_at_5min >= '2022-09-20 23:20:00', 15: time_at_5min <= '2022-09-20 23:50:00', 4: isp IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 21, 22, 23, 25, 27, 28, 29), 5: province IN (1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 386), 6: country = 1, 3: is_parent = 0, 10: is_https = 1
- Rollup: region_5min
- Table: nginx_region_detail
- Aggr: 62.593ms
- __MAX_OF_Aggr: 74.299ms
- __MIN_OF_Aggr: 6.447ms
- BytesRead: 595.85 KB
- __MAX_OF_BytesRead: 187.12 KB
- __MIN_OF_BytesRead: 0.00
- CachedPagesNum: 0
- ChunkBufferCapacity: 20.48K (20480)
- CompressedBytesRead: 2.37 MB
- __MAX_OF_CompressedBytesRead: 361.24 KB
- __MIN_OF_CompressedBytesRead: 0.00
- CreateSegmentIter: 323.571us
- DefaultChunkBufferCapacity: 20.48K (20480)
- IOTime: 125.702us
- __MAX_OF_IOTime: 47.6ms
- __MIN_OF_IOTime: 0ns
- LateMaterialize: 1.744ms
- __MAX_OF_LateMaterialize: 6.583ms
- __MIN_OF_LateMaterialize: 104.785us
- MorselsCount: 2.344K (2344)
- __MAX_OF_MorselsCount: 14
- __MIN_OF_MorselsCount: 1
- PeakChunkBufferSize: 960
- __MAX_OF_PeakChunkBufferSize: 62
- __MIN_OF_PeakChunkBufferSize: 57
- PushdownPredicates: 8
- RawRowsRead: 21.432K (21432)
- __MAX_OF_RawRowsRead: 7.424K (7424)
- __MIN_OF_RawRowsRead: 0
- ReadPagesNum: 360
- __MAX_OF_ReadPagesNum: 64
- __MIN_OF_ReadPagesNum: 0
- RowsRead: 7.845K (7845)
- __MAX_OF_RowsRead: 2.747K (2747)
- __MIN_OF_RowsRead: 0
- ScanTime: 379.819us
- __MAX_OF_ScanTime: 148.579ms
- __MIN_OF_ScanTime: 1.635us
- SegmentInit: 249.228us
- __MAX_OF_SegmentInit: 120.982ms
- __MIN_OF_SegmentInit: 0ns
- BitmapIndexFilter: 7.114us
- __MAX_OF_BitmapIndexFilter: 8.120ms
- __MIN_OF_BitmapIndexFilter: 0ns
- BitmapIndexFilterRows: 0
- BloomFilterFilterRows: 0
- SegmentZoneMapFilterRows: 34.308333436B (34308333436)
- __MAX_OF_SegmentZoneMapFilterRows: 92.094182M (92094182)
- __MIN_OF_SegmentZoneMapFilterRows: 0
- ShortKeyFilterRows: 123.559917M (123559917)
- __MAX_OF_ShortKeyFilterRows: 20.085039M (20085039)
- __MIN_OF_ShortKeyFilterRows: 0
- ZoneMapIndexFilterRows: 0
- SegmentRead: 106.444us
- __MAX_OF_SegmentRead: 41.633ms
- __MIN_OF_SegmentRead: 0ns
- BlockFetch: 5.727us
- BlockFetchCount: 12
- __MAX_OF_BlockFetchCount: 3
- __MIN_OF_BlockFetchCount: 0
- BlockSeek: 128.732us
- __MAX_OF_BlockSeek: 43.388ms
- __MIN_OF_BlockSeek: 0ns
- BlockSeekCount: 31
- __MAX_OF_BlockSeekCount: 8
- __MIN_OF_BlockSeekCount: 0
- ChunkCopy: 186ns
- DecompressT: 541ns
- DelVecFilterRows: 0
- IndexLoad: 0ns
- PredFilter: 270ns
- PredFilterRows: 13.587K (13587)
- __MAX_OF_PredFilterRows: 4.677K (4677)
- __MIN_OF_PredFilterRows: 0
- RowsetsReadCount: 16.708K (16708)
- __MAX_OF_RowsetsReadCount: 118
- __MIN_OF_RowsetsReadCount: 1
- SegmentsReadCount: 7.342K (7342)
- __MAX_OF_SegmentsReadCount: 106
- __MIN_OF_SegmentsReadCount: 0
- TotalColumnsDataPageCount: 53.133K (53133)
- __MAX_OF_TotalColumnsDataPageCount: 8.636K (8636)
- __MIN_OF_TotalColumnsDataPageCount: 0
- SubmitIOTaskCount: 2.344K (2344)
- __MAX_OF_SubmitIOTaskCount: 14
- __MIN_OF_SubmitIOTaskCount: 1
- TabletCount: 2.344K (2344)
- __MAX_OF_TabletCount: 147
- __MIN_OF_TabletCount: 146
- UncompressedBytesRead: 2.43 MB
- __MAX_OF_UncompressedBytesRead: 367.80 KB
- __MIN_OF_UncompressedBytesRead: 0.00
- Union: 62.577ms
- __MAX_OF_Union: 74.281ms
- __MIN_OF_Union: 6.437ms
Pipeline (id=0):
- ActiveTime: 188.301us
- BlockByInputEmpty: 0
- BlockByOutputFull: 0
- BlockByPrecondition: 0
- DegreeOfParallelism: 1
- DriverTotalTime: 5.352ms
- __MAX_OF_DriverTotalTime: 10.616ms
- __MIN_OF_DriverTotalTime: 4.164ms
- OverheadTime: 188.301us
- PendingTime: 0ns
- InputEmptyTime: 0ns
- FirstInputEmptyTime: 0ns
- FollowupInputEmptyTime: 0ns
- OutputFullTime: 0ns
- PendingFinishTime: 0ns
- PreconditionBlockTime: 0ns
- ScheduleCount: 16
- ScheduleTime: 5.164ms
- __MAX_OF_ScheduleTime: 10.448ms
- __MIN_OF_ScheduleTime: 3.990ms
- TotalDegreeOfParallelism: 16
- YieldByTimeLimit: 0
NOOP_SINK (plan_node_id=0):
CommonMetrics:
- CloseTime: 207ns
- OperatorTotalTime: 349ns
- PeakMemoryUsage: 0.00
- PrepareTime: 13.934us
- PullChunkNum: 0
- PullRowNum: 0
- PullTotalTime: 0ns
- PushChunkNum: 0
- PushRowNum: 0
- PushTotalTime: 0ns
- SetFinishedTime: 37ns
- SetFinishingTime: 104ns
UniqueMetrics:
OLAP_SCAN_PREPARE (plan_node_id=0):
CommonMetrics:
- CloseTime: 5.273us
- OperatorTotalTime: 189.582us
- PeakMemoryUsage: 0.00
- PrepareTime: 794.154us
- PullChunkNum: 16
- PullRowNum: 0
- PullTotalTime: 184.70us
- PushChunkNum: 0
- PushRowNum: 0
- PushTotalTime: 0ns
- SetFinishedTime: 66ns
- SetFinishingTime: 171ns
UniqueMetrics:
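
The claim that PendingTime dominates can be checked mechanically against a pasted profile. The following is a minimal sketch, not a StarRocks tool: `to_ms` and `pending_ratios` are hypothetical helper names, and the parser assumes the flat `- Name: value` layout and the single-unit durations (`ns`, `us`, `ms`, `s`) seen in this profile. The sample text holds two pairs of values copied from the profile above.

```python
import re

# Multipliers to milliseconds for the single-unit durations seen in this profile.
# Assumption: compound durations such as "1s200ms" do not occur and are rejected.
_UNIT_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "s": 1e3}
_DURATION = re.compile(r"^([0-9.]+)(ns|us|ms|s)$")


def to_ms(text):
    """Convert a duration such as '237.302ms' or '799.395us' to milliseconds."""
    match = _DURATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(match.group(1)) * _UNIT_MS[match.group(2)]


def pending_ratios(profile_text):
    """Return (fragment, pipeline, PendingTime / DriverTotalTime) per pipeline."""
    rows = []
    fragment = pipeline = driver_ms = None
    for raw in profile_text.splitlines():
        line = raw.strip()
        if line.startswith("Fragment "):
            fragment, pipeline, driver_ms = line.rstrip(":"), None, None
        elif line.startswith("Pipeline (id="):
            pipeline, driver_ms = line.rstrip(":"), None
        elif line.startswith("- DriverTotalTime:"):
            driver_ms = to_ms(line.split(":", 1)[1])
        elif line.startswith("- PendingTime:") and pipeline and driver_ms:
            pending_ms = to_ms(line.split(":", 1)[1])
            rows.append((fragment, pipeline, pending_ms / driver_ms))
    return rows


# Values copied from Fragment 0 / Pipeline 0 and Fragment 1 / Pipeline 1 above.
SAMPLE = """\
Fragment 0:
Pipeline (id=0):
- DriverTotalTime: 239.905ms
- PendingTime: 237.302ms
Fragment 1:
Pipeline (id=1):
- DriverTotalTime: 191.310ms
- PendingTime: 190.64ms
"""

for fragment, pipeline, ratio in pending_ratios(SAMPLE):
    print(f"{fragment} / {pipeline}: PendingTime is {ratio:.1%} of DriverTotalTime")
```

A high ratio only says that the drivers spent most of their lifetime waiting; it does not by itself say what they were waiting on.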


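On the database side, any query that takes more than 150 ms is defined as a slow query. This chart shows the number of slow queries per minute. The pipeline engine was disabled at 14:37 with no other change, and the slow-query count dropped markedly afterward.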

Could you provide the explain costs output plus the SQL for the same query, both with pipeline enabled and with it disabled?

Without pipeline (121.8 KB)

This is the profile with pipeline disabled. It is the same statement as the pipeline-enabled profile I pasted in my post.

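Even with pipeline enabled, the same statement does not turn into a slow query 100% of the time. What I observe is a spike at every 5-minute mark. I suspect the pipeline engine does not allocate resources well under high concurrency, so some otherwise normal queries wait a long time for resources and become slow queries; the queries themselves are fine.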
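
The "spike at every 5-minute mark" observation can be quantified from any export of query start times and latencies. This is a minimal sketch under stated assumptions: the input is a list of `(start_time, latency_ms)` pairs in the `YYYY-MM-DD HH:MM:SS` format used by the profile, the 150 ms threshold comes from this thread, and the function names and sample records are made up for illustration (only the 304 ms query at 14:21:41 comes from the profile above).

```python
from collections import Counter
from datetime import datetime

SLOW_MS = 150  # slow-query threshold quoted in this thread


def slow_per_minute(records, slow_ms=SLOW_MS):
    """Count slow queries per minute from (start_time_str, latency_ms) pairs."""
    counts = Counter()
    for start, latency_ms in records:
        if latency_ms > slow_ms:
            ts = datetime.strptime(start, "%Y-%m-%d %H:%M:%S")
            counts[ts.replace(second=0)] += 1
    return counts


def share_on_five_minute_marks(counts):
    """Fraction of slow queries that started in a minute divisible by 5."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    on_mark = sum(n for minute, n in counts.items() if minute.minute % 5 == 0)
    return on_mark / total


# Illustrative records only; real input would come from the FE audit log or similar.
records = [
    ("2022-09-21 14:20:03", 400),
    ("2022-09-21 14:20:10", 380),
    ("2022-09-21 14:21:41", 304),
    ("2022-09-21 14:22:00", 90),
]
counts = slow_per_minute(records)
print(f"share on 5-minute marks: {share_on_five_minute_marks(counts):.2f}")
```

If slow queries were spread evenly over time, the share would sit near 0.2, so a value well above that supports the periodic-spike reading.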


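Could the default thread count be too low? I have not configured the number of execution threads for the pipeline engine separately, so it defaults to the number of CPU cores, which is 40 threads.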
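
To put the thread-count question in numbers, the DegreeOfParallelism values from the profile can be summed per BE. This is a rough back-of-the-envelope sketch, assuming that each unit of DegreeOfParallelism corresponds to one pipeline driver that the executor threads have to schedule, and that every BE runs one instance of Fragment 1 and Fragment 2 (the profile reports BackendNum: 16 and InstanceNum: 16 for both). The names below are hypothetical, and the concurrency figure of 100 is an arbitrary illustration, not a measurement.

```python
# DegreeOfParallelism per pipeline, copied from the profile in this thread.
PIPELINE_DOP = {
    "Fragment 0": [20],         # BackendNum: 1 (result fragment)
    "Fragment 1": [20, 20],     # BackendNum: 16
    "Fragment 2": [20, 20, 1],  # BackendNum: 16
}
EXEC_THREADS = 40  # assumption from the post: default equals the CPU core count


def drivers_on_backend(fragment_names, pipeline_dop=PIPELINE_DOP):
    """Sum the drivers one BE hosts for a single query, given the fragments it runs."""
    return sum(sum(pipeline_dop[name]) for name in fragment_names)


typical_be = drivers_on_backend(["Fragment 1", "Fragment 2"])
result_be = drivers_on_backend(["Fragment 0", "Fragment 1", "Fragment 2"])

print(f"drivers per query on a typical BE: {typical_be}")
print(f"drivers per query on the BE that also runs Fragment 0: {result_be}")

# Hypothetical burst of 100 concurrent queries of the same shape.
concurrent = 100
print(f"drivers per BE at {concurrent} concurrent queries: {typical_be * concurrent}"
      f" for {EXEC_THREADS} executor threads")
```

Most of these drivers are blocked rather than runnable at any moment, so the totals are an upper bound on scheduling pressure rather than a direct measure of it.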

Have you ever set the pipeline_dop parameter?

No, it is still at the default value of 0.

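Please send the pipeline-enabled profile as a txt file and I will take a look. PendingTime is probably not the main cause; it is the time spent waiting for the operators below to deliver data.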

With pipeline (32.3 KB)
I can no longer find the original profile, so I pulled another pipeline-enabled profile from the FE.

The two profiles are for different SQL. I need two profiles for the same SQL.

Without pipeline (124.0 KB)
This one uses the same SQL.


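I disabled the pipeline engine at around 14:37 yesterday and performance improved.
I then noticed just now that the number of "Fail to report exec state due to query not found" log lines changed noticeably before and after disabling the engine.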

I suspect the slow queries are related to this log line.

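Actually, that seems wrong. This log line is only emitted by the pipeline engine, so it is expected that its count drops once the engine is off. It probably keeps appearing afterward because the pipeline engine was not turned off immediately for sessions that had not yet reconnected.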

There is only one file with pipeline enabled. I opened the latest pipeline-disabled file you sent, and the two statements are different…


I downloaded the two files from this page, and the statements are the same.

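Comparing the two: 150 ms without pipeline and 160 ms with it, so the difference is not large. What business scenario needs responses this fast? I see you define slow queries as 150 ms and above.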

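Because I have already disabled the pipeline engine in production, the pipeline-enabled profile I provided came from a manual run. When the pipeline engine was on in production, a batch of slow queries appeared every 5 minutes. Nothing was unusual about the slow statements, but they suddenly slowed down in clusters: queries that normally take 150 ms suddenly took 400 ms.
With the pipeline engine disabled, these intermittent bursts of slow queries no longer occur.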


This is the slow-query trend chart from yesterday, when pipeline was enabled.

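The pipeline profile in my original post, the one that is not in txt format, is the one I captured in production. I did not save it at the time and pasted it straight into the post, so I cannot turn it into a txt file.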