[Fix](compaction) Should do_lease for full compaction #47436

Open
wants to merge 2 commits into
base: master
from

Conversation

bobhan1
Contributor

@bobhan1 bobhan1 commented Jan 24, 2025

What problem does this PR solve?

Problem Summary:

A manually triggered full compaction may fail with the message "there is no running compaction".

Full compaction should renew its lease (do_lease) while it runs. Otherwise, if a full compaction lasts longer than config::lease_compaction_interval_seconds * 4 = 80s, a later compaction on the same tablet will remove the full compaction from the tablet's job record in the meta service (MS), and the full compaction then fails when it tries to commit.
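
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the Doris implementation: `TabletJobTable`, `JobRecord`, `run_full_compaction`, and the job names are hypothetical stand-ins for the tablet job record kept in MS. The 20s interval is only inferred from the "* 4 = 80s" arithmetic in the description.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical constants: 20s is inferred from "interval * 4 = 80s" above.
constexpr int64_t kLeaseIntervalSeconds = 20;
constexpr int64_t kLeaseTimeoutSeconds = kLeaseIntervalSeconds * 4;

// Hypothetical job entry; the expiration is an absolute time in seconds.
struct JobRecord {
    int64_t expiration;
};

// Hypothetical model of the per-tablet job record kept in MS.
class TabletJobTable {
public:
    // Registers a job and evicts every job whose lease has already expired.
    void start(const std::string& job_id, int64_t now) {
        for (auto it = jobs_.begin(); it != jobs_.end();) {
            if (it->second.expiration <= now) {
                it = jobs_.erase(it);
            } else {
                ++it;
            }
        }
        jobs_[job_id] = JobRecord{now + kLeaseTimeoutSeconds};
    }

    // Renews the lease; returns false if the job is no longer registered.
    bool lease(const std::string& job_id, int64_t now) {
        auto it = jobs_.find(job_id);
        if (it == jobs_.end()) return false;
        it->second.expiration = now + kLeaseTimeoutSeconds;
        return true;
    }

    // Commits the job; false models "there is no running compaction".
    bool commit(const std::string& job_id) { return jobs_.erase(job_id) == 1; }

private:
    std::map<std::string, JobRecord> jobs_;
};

// Simulates a full compaction that runs for `duration` seconds while another
// compaction starts on the same tablet right before the full one commits.
bool run_full_compaction(int64_t duration, bool renew_lease) {
    assert(duration >= 0);
    TabletJobTable table;
    table.start("full", 0);
    if (renew_lease) {
        for (int64_t t = kLeaseIntervalSeconds; t < duration; t += kLeaseIntervalSeconds) {
            table.lease("full", t);
        }
    }
    table.start("cumulative", duration);  // the later compaction on the same tablet
    return table.commit("full");
}
```

In this model, `run_full_compaction(100, false)` fails to commit because the un-renewed lease expires at 80s, while `run_full_compaction(100, true)` commits because each renewal pushes the expiration forward.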

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@bobhan1
Contributor Author

bobhan1 commented Jan 24, 2025

run buildall

@bobhan1 bobhan1 force-pushed the do-lease-for-full-compaction branch from f34e86a to 140f1b3 Compare January 24, 2025 12:40
@bobhan1
Contributor Author

bobhan1 commented Jan 24, 2025

run buildall

@bobhan1 bobhan1 force-pushed the do-lease-for-full-compaction branch from 140f1b3 to 52e0f38 Compare January 24, 2025 12:54
@bobhan1
Contributor Author

bobhan1 commented Jan 24, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32121 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 52e0f38398fecd5145130adcc9fc278ff07bc339, data reload: false

------ Round 1 ----------------------------------
q1	17583	5449	5308	5308
q2	2042	302	165	165
q3	10428	1275	730	730
q4	10212	968	532	532
q5	7532	2344	2158	2158
q6	187	160	132	132
q7	922	741	595	595
q8	9229	1334	1211	1211
q9	5307	4862	4895	4862
q10	6825	2332	1899	1899
q11	472	279	254	254
q12	345	362	217	217
q13	17781	3684	3080	3080
q14	227	217	216	216
q15	514	473	461	461
q16	645	614	591	591
q17	571	854	323	323
q18	6991	6524	6339	6339
q19	1208	947	529	529
q20	338	332	204	204
q21	2827	2337	2006	2006
q22	370	332	309	309
Total cold run time: 102556 ms
Total hot run time: 32121 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5446	5486	5459	5459
q2	238	326	240	240
q3	2301	2638	2317	2317
q4	1412	1877	1355	1355
q5	4370	4751	4671	4671
q6	166	159	124	124
q7	1985	2008	1794	1794
q8	2657	2858	2687	2687
q9	7322	7193	7228	7193
q10	3007	3254	2741	2741
q11	588	507	501	501
q12	681	815	615	615
q13	3442	3886	3360	3360
q14	295	318	286	286
q15	517	467	468	467
q16	629	686	652	652
q17	1230	1717	1268	1268
q18	7648	7440	7397	7397
q19	810	1043	1140	1043
q20	1954	2029	1918	1918
q21	5762	5348	5054	5054
q22	618	593	589	589
Total cold run time: 53078 ms
Total hot run time: 51731 ms

@doris-robot

TPC-DS: Total hot run time: 184763 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 52e0f38398fecd5145130adcc9fc278ff07bc339, data reload: false

query1	973	368	384	368
query2	6516	2097	2074	2074
query3	6788	237	212	212
query4	33319	23597	23022	23022
query5	4332	609	443	443
query6	300	196	188	188
query7	4610	489	309	309
query8	308	251	233	233
query9	9301	2609	2601	2601
query10	480	305	270	270
query11	17775	15130	14962	14962
query12	156	104	104	104
query13	1646	508	398	398
query14	9248	6876	6702	6702
query15	221	199	178	178
query16	7808	591	437	437
query17	1578	718	536	536
query18	1983	393	294	294
query19	189	178	148	148
query20	117	111	109	109
query21	209	119	98	98
query22	4296	4460	4068	4068
query23	33627	32968	32990	32968
query24	6825	2327	2212	2212
query25	466	443	383	383
query26	1214	277	153	153
query27	2152	463	335	335
query28	5441	2454	2421	2421
query29	708	528	414	414
query30	232	181	161	161
query31	956	928	779	779
query32	75	61	62	61
query33	517	358	306	306
query34	743	829	510	510
query35	792	822	716	716
query36	993	997	964	964
query37	119	97	81	81
query38	4208	4243	4051	4051
query39	1436	1406	1411	1406
query40	207	113	102	102
query41	53	50	57	50
query42	117	102	101	101
query43	508	527	512	512
query44	1297	833	813	813
query45	183	170	171	170
query46	854	1029	639	639
query47	1839	1823	1754	1754
query48	380	388	313	313
query49	797	502	402	402
query50	638	649	392	392
query51	4139	4206	4110	4110
query52	107	100	95	95
query53	231	263	187	187
query54	479	488	406	406
query55	84	78	77	77
query56	247	273	236	236
query57	1142	1152	1096	1096
query58	255	224	241	224
query59	3090	3299	3017	3017
query60	286	275	258	258
query61	141	140	145	140
query62	773	701	665	665
query63	226	203	217	203
query64	4169	1004	662	662
query65	3191	3160	3201	3160
query66	1068	412	303	303
query67	15673	15458	15478	15458
query68	2384	817	550	550
query69	412	295	256	256
query70	1217	1143	1154	1143
query71	336	297	262	262
query72	5046	3827	3806	3806
query73	649	744	367	367
query74	9939	9217	9042	9042
query75	3136	3151	2645	2645
query76	2254	1122	771	771
query77	331	361	272	272
query78	10111	10080	9359	9359
query79	1001	872	594	594
query80	684	533	457	457
query81	485	268	237	237
query82	1179	154	122	122
query83	250	173	159	159
query84	238	95	71	71
query85	816	358	301	301
query86	367	294	300	294
query87	4472	4450	4418	4418
query88	3043	2208	2176	2176
query89	402	325	305	305
query90	1692	190	191	190
query91	135	137	109	109
query92	56	59	56	56
query93	933	869	541	541
query94	567	396	295	295
query95	332	270	255	255
query96	497	600	277	277
query97	2805	2852	2776	2776
query98	237	214	193	193
query99	1281	1366	1244	1244
Total cold run time: 272118 ms
Total hot run time: 184763 ms

@doris-robot

ClickBench: Total hot run time: 31.53 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 52e0f38398fecd5145130adcc9fc278ff07bc339, data reload: false

query1	0.03	0.03	0.03
query2	0.08	0.03	0.03
query3	0.24	0.07	0.06
query4	1.62	0.11	0.10
query5	0.42	0.42	0.42
query6	1.16	0.67	0.66
query7	0.02	0.01	0.02
query8	0.04	0.03	0.03
query9	0.57	0.52	0.50
query10	0.56	0.56	0.55
query11	0.14	0.11	0.10
query12	0.14	0.10	0.11
query13	0.62	0.60	0.61
query14	2.72	2.72	2.74
query15	0.91	0.84	0.82
query16	0.39	0.39	0.39
query17	1.05	1.06	1.05
query18	0.24	0.21	0.21
query19	1.93	1.86	2.02
query20	0.01	0.01	0.01
query21	15.39	0.95	0.58
query22	0.75	0.75	0.94
query23	15.08	1.45	0.56
query24	3.15	1.66	1.78
query25	0.30	0.14	0.12
query26	0.22	0.14	0.13
query27	0.05	0.06	0.05
query28	14.47	1.00	0.43
query29	12.58	3.98	3.30
query30	0.24	0.09	0.06
query31	2.82	0.64	0.38
query32	3.24	0.56	0.47
query33	3.10	2.99	3.07
query34	16.73	5.17	4.50
query35	4.60	4.51	4.54
query36	0.63	0.51	0.49
query37	0.10	0.06	0.06
query38	0.04	0.04	0.03
query39	0.04	0.03	0.03
query40	0.16	0.14	0.12
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.02
Total cold run time: 106.72 s
Total hot run time: 31.53 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 42.06% (10969/26082)
Line Coverage: 32.33% (92718/286800)
Region Coverage: 31.47% (47536/151040)
Branch Coverage: 27.51% (24077/87506)
Coverage Report: http://coverage.selectdb-in.cc/coverage/52e0f38398fecd5145130adcc9fc278ff07bc339_52e0f38398fecd5145130adcc9fc278ff07bc339/report/index.html

@bobhan1
Contributor Author

bobhan1 commented Jan 24, 2025

run p0

Contributor

@gavinchou gavinchou left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 26, 2025
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@bobhan1
Contributor Author

bobhan1 commented Jan 26, 2025

run p0

@bobhan1
Contributor Author

bobhan1 commented Jan 26, 2025

run p0

Contributor

@dataroaring dataroaring left a comment

LGTM

Labels
approved Indicates a PR has been approved by one committer. dev/3.0.x reviewed
6 participants