too many tablet versions

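Problem description:
Imports arrive too frequently for compaction to merge them in time, so the tablet accumulates too many versions. The default version limit is 1000.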

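1. Increase the amount of data per import and lower the import frequency.
2. Tune the compaction strategy in be.conf to speed up merging (watch memory and I/O after the change). If a BE has more than 10 disks, adjusting the parameters below is not recommended; the main fix is still to lower the import frequency.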
cumulative_compaction_num_threads_per_disk = 4
base_compaction_num_threads_per_disk = 2
cumulative_compaction_check_interval_seconds = 2


Is this the right way to modify be.conf? I still get the error after making the change.

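Which import method are you using? insert into values must not be used in online or production environments. For the supported loading methods, see https://docs.starrocks.com/zh-cn/main/loading/Loading_intro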

The data is imported with DataX.

Please post your DataX sink configuration.

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1,
        "byte": 6291456
      },
      "errorLimit": {
        "record": 1000,
        "percentage": 0.02
      }
    },
    "content": [
      {
        "reader": {
          "name": "oraclereader",
          "parameter": {
            "username": "xx",
            "password": "xx",
            "column": [
              "\"DATA_MONTH\"",
              "\"PLATFORM_ID\"",
              "\"PLATFORM\"",
              "\"SHOP_ID\"",
              "\"SHOP_NAME\"",
              "\"COMPANY_NAME_T\"",
              "\"CREDITCODE\"",
              "\"COMPANY_NAME\"",
              "\"SHOP_SALES_VOLUME\"",
              "\"SHOP_SALES_AMOUNT\"",
              "\"SHOP_SALES_VOLUME_SUM\"",
              "\"SHOP_SALES_AMOUNT_SUM\"",
              "\"SHOP_SALES_VOLUME_YOY\"",
              "\"SHOP_SALES_AMOUNT_YOY\"",
              "\"SHOP_SALES_VOLUME_SUM_YOY\"",
              "\"SHOP_SALES_AMOUNT_SUM_YOY\"",
              "\"SHOP_ITEM_NUM\"",
              "\"IS_SELF_RUN\"",
              "\"SHOP_URL\"",
              "\"SHOP_OPEN_DATE\"",
              "\"SHOP_OPEN_YEARS\"",
              "\"SHOP_SATATUS\"",
              "\"SHOP_CLOSE_DATE\"",
              "\"MAIN_BUSINESS\"",
              "\"STAT_MAIN_BUSINESS\"",
              "\"BUSINESS_SCOPE\"",
              "\"COMPANY_PROVINCE\"",
              "\"COMPANY_CITY\"",
              "\"COMPANY_COUNTY\"",
              "\"COMPANY_AREA_CODE\"",
              "\"SHOP_PROVINCE\"",
              "\"SHOP_CITY\"",
              "\"SHOP_COUNTY\"",
              "\"SHOP_AREA_CODE\"",
              "\"SHOP_DELIVERY_PROVINCE\"",
              "\"SHOP_DELIVERY_CITY\"",
              "\"SHOP_DELIVERY_COUNTY\"",
              "\"SHOP_DELIVERY_CODE\"",
              "\"CREATTIME\"",
              "\"UPDATETIME\"",
              "\"SHOP_IMG_URL\""
            ],
            "splitPk": "",
            "connection": [
              {
                "table": [
                  "WX_SUM_SINFO"
                ],
                "jdbcUrl": [
                  "jdbc:oracle:thin:@172.22xx:1521:dcdw1"
                ]
              }
            ]
          }
        },
        "writer": {
          "name": "mysqlwriter",
          "parameter": {
            "username": "xx",
            "password": "xx",
            "column": [
              "DATA_MONTH",
              "PLATFORM_ID",
              "PLATFORM",
              "SHOP_ID",
              "SHOP_NAME",
              "COMPANY_NAME_T",
              "CREDITCODE",
              "COMPANY_NAME",
              "SHOP_SALES_VOLUME",
              "SHOP_SALES_AMOUNT",
              "SHOP_SALES_VOLUME_SUM",
              "SHOP_SALES_AMOUNT_SUM",
              "SHOP_SALES_VOLUME_YOY",
              "SHOP_SALES_AMOUNT_YOY",
              "SHOP_SALES_VOLUME_SUM_YOY",
              "SHOP_SALES_AMOUNT_SUM_YOY",
              "SHOP_ITEM_NUM",
              "IS_SELF_RUN",
              "SHOP_URL",
              "SHOP_OPEN_DATE",
              "SHOP_OPEN_YEARS",
              "SHOP_SATATUS",
              "SHOP_CLOSE_DATE",
              "MAIN_BUSINESS",
              "STAT_MAIN_BUSINESS",
              "BUSINESS_SCOPE",
              "COMPANY_PROVINCE",
              "COMPANY_CITY",
              "COMPANY_COUNTY",
              "COMPANY_AREA_CODE",
              "SHOP_PROVINCE",
              "SHOP_CITY",
              "SHOP_COUNTY",
              "SHOP_AREA_CODE",
              "SHOP_DELIVERY_PROVINCE",
              "SHOP_DELIVERY_CITY",
              "SHOP_DELIVERY_COUNTY",
              "SHOP_DELIVERY_CODE",
              "CREATTIME",
              "UPDATETIME",
              "SHOP_IMG_URL"
            ],
            "connection": [
              {
                "table": [
                  "wx_sum_sinfo"
                ],
                "jdbcUrl": "jdbc:mysql://10.58.xx:9030/stars",
                "loadUrl": [
                  "10.58.xx:8030",
                  "10.58.xx:8030"
                ]
              }
            ]
          }
        }
      }
    ]
  }
}

Is there anything wrong with configuring be.conf like this?

Sorry, I have only just started with DataX. Which file does "sink configuration" refer to?

Try removing this setting ("byte": 6291456) and see whether it helps.

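Also, are there any other imports into this table? The cause of the problem is that the import frequency is too high: each import counts as one version, so frequent imports quickly reach the version limit. Increase the amount of data per batch and lower the import frequency.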

There are no other imports. I saw other users given the same advice, namely to set the following in be.conf:
cumulative_compaction_num_threads_per_disk = 4
base_compaction_num_threads_per_disk = 2
cumulative_compaction_check_interval_seconds = 2

I have already configured these.

Have you restarted the BE service?

I removed "byte": 6291456 and still get the same error. I restarted the BE right after configuring the compaction settings.

Your usage is wrong: the writer needs to be starrockswriter. For details, see https://docs.starrocks.com/zh-cn/main/loading/DataX-starrocks-writer
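For illustration, here is a rough sketch of what the writer block might look like after switching to starrockswriter. It assumes the parameter layout described on the linked page (database, table, jdbcUrl and loadUrl directly under parameter rather than inside connection). The database name stars and the table name are taken from the job posted above, the column list is shortened to three entries, and the maxBatchRows and flushInterval values are illustrative assumptions. Check every field against the linked documentation before using it.

```json
{
  "writer": {
    "name": "starrockswriter",
    "parameter": {
      "username": "xx",
      "password": "xx",
      "database": "stars",
      "table": "wx_sum_sinfo",
      "column": [
        "DATA_MONTH",
        "PLATFORM_ID",
        "PLATFORM"
      ],
      "preSql": [],
      "postSql": [],
      "jdbcUrl": "jdbc:mysql://10.58.xx:9030/",
      "loadUrl": [
        "10.58.xx:8030",
        "10.58.xx:8030"
      ],
      "maxBatchRows": 500000,
      "flushInterval": 300000,
      "loadProps": {}
    }
  }
}
```

The point of the switch is that starrockswriter buffers rows and loads them in large batches through the loadUrl HTTP endpoints, so each flush produces one version. mysqlwriter instead sends many small insert statements, each of which creates its own version, which is how the version limit gets hit.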


starrockswriter

Then why does the job you posted above use mysqlwriter?

After switching to starrockswriter and adjusting the JSON format, it works. Thank you very much :grinning: :grinning: :grinning:


I built the JSON with datax-web. datax-web has no StarRocks option, so I picked the MySQL option and meant to change it later, and then never got around to it.


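It works now, but the import is a bit slow and resource usage does not look high. To optimize, should I tune the parameters mentioned earlier?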
cumulative_compaction_num_threads_per_disk = 4
base_compaction_num_threads_per_disk = 2
cumulative_compaction_check_interval_seconds = 2