
[CARBONDATA-4270]Delete segment expect remain_number #4203

Open · wants to merge 4 commits into base: master
Conversation

@W1thOut (Author) commented Aug 14, 2021

Why is this PR needed?

In some scenarios, users need to delete old segments in batches and keep only the latest few segments.

What changes were proposed in this PR?

Add a new DML statement that deletes all segments except the latest remain_number segments.

Does this PR introduce any user interface change?

  • No

Is any new testcase added?

  • Yes
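Based on the PR title and the test code later in this thread, the new statement appears to take the form `delete from table t expect segment.remain_number = n`. A minimal sketch of the intended keep-latest-n semantics (the `Segment` tuple and helper function are hypothetical stand-ins, not CarbonData APIs):

```python
# Sketch of the keep-latest-n semantics behind
# "delete from table t expect segment.remain_number = n".
# Segment here is a stand-in, not a CarbonData class.
from collections import namedtuple

Segment = namedtuple("Segment", ["id", "load_time"])

def segments_to_delete(segments, remain_number):
    """Return the ids of all but the latest `remain_number` segments."""
    if remain_number < 0:
        raise ValueError("remain_number must be non-negative")
    ordered = sorted(segments, key=lambda s: s.load_time)
    if len(ordered) <= remain_number:
        return []  # fewer segments than the keep count: delete nothing
    return [s.id for s in ordered[:len(ordered) - remain_number]]
```

If the table already has no more than `remain_number` segments, the statement is a no-op under this reading.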

@CarbonDataQA2:

Can one of the admins verify this patch?

@W1thOut (Author) commented Aug 14, 2021

start build

@W1thOut W1thOut changed the title Delete segment expect remain_number [CARBONDATA-4270]Delete segment expect remain_number Aug 14, 2021
@W1thOut (Author) commented Aug 14, 2021

retest this please

val ex = intercept[Exception] {
  sql("delete from table deleteSegmentTable expect segment.remain_number = -1")
}
assert(ex.getMessage.contains("not found in database"))
Contributor:
"not found in database" has nothing to do with -1; why choose this message?

Author:
Done.

sql(
  s"""LOAD DATA local inpath '$resourcesPath/dataretention3.csv'
     | INTO TABLE deleteSegmentTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""".stripMargin)
}
Contributor:
We need an afterAll function to make sure the table has been removed.

Author:
I have defined a beforeEach function to drop the table.

// if insert overwrite is in progress, do not allow delete segment
if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
  throw new ConcurrentOperationException(carbonTable, "insert overwrite", "delete segment")
}
Contributor:
Why does this PR not support running concurrently with insert overwrite?

Author:
The table cannot be updated while segments are being deleted.

}

val segments = CarbonStore.readSegments(carbonTable.getTablePath, showHistory = false, None)
if (segments.length == remaining.toInt) {
Contributor:
remaining.toInt means the segment count must be within Int range; is this restriction necessary?

Contributor:
If there are segments with status SegmentStatus.MARKED_FOR_DELETE, is this condition still correct?

Author:
Integer values range from -2147483648 to 2147483647; I think the remaining number of segments will not exceed this range.

Author:
Segments with SegmentStatus.MARKED_FOR_DELETE won't be in the remaining range.

Contributor:
remaining.toInt can throw java.lang.NumberFormatException; handle it appropriately.

Author:
OK. I have modified the SQL parser to convert 'remain_number' directly to an Integer value.
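The parse-time fix discussed above, validating remain_number while parsing instead of calling toInt later, can be sketched like this (a Python stand-in for the Scala parser change; the function name is hypothetical):

```python
# Sketch: validate remain_number at parse time instead of calling
# toInt later and risking an unhandled NumberFormatException.
# parse_remain_number is a hypothetical helper, not CarbonData code.
def parse_remain_number(token: str) -> int:
    try:
        value = int(token)
    except ValueError:
        # equivalent role to catching NumberFormatException in Scala/Java
        raise ValueError(f"invalid remain_number: {token!r}") from None
    if value < 0:
        raise ValueError("remain_number must be >= 0")
    return value
```

Rejecting non-numeric and negative values at the parser keeps the execution path free of conversion errors.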

@W1thOut (Author) commented Aug 16, 2021

retest this please

@MarvinLitt (Contributor):

jenkins, add to whitelist

@MarvinLitt (Contributor):

add to whitelist

@W1thOut (Author) commented Aug 17, 2021

retest this please

@MarvinLitt (Contributor):

@CarbonDataQA2 why can this PR not trigger the CI? Please help.

@W1thOut (Author) commented Aug 17, 2021

@CarbonDataQA2 why can this PR not trigger the CI? Please help.

@ajantha-bhat (Member):

add to whitelist

}

// From the remaining number, get the segment ids to delete
val deleteSegmentIds = segments.filter(segment => segment.getSegmentStatus != SegmentStatus.MARKED_FOR_DELETE)
@vikramahuja1001 (Contributor) commented Aug 17, 2021:
Instead of filtering out MFD segments, take only SUCCESS and COMPACTED segments and avoid in-progress segments, because clean files will take care of those segments anyway.

Author:
I have fixed this problem.
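The reviewer's suggestion, selecting only SUCCESS and COMPACTED segments rather than filtering out MARKED_FOR_DELETE ones, could look roughly like this (a hedged Python stand-in; statuses are plain strings mirroring SegmentStatus names, not the real enum):

```python
# Sketch of the reviewer's suggestion: pick delete candidates from
# SUCCESS/COMPACTED segments only, leaving in-progress and
# marked-for-delete segments for clean files to handle.
# Statuses are plain strings here, not the real SegmentStatus enum.
DELETABLE_STATUSES = {"SUCCESS", "COMPACTED"}

def delete_candidates(segments):
    """segments: list of (segment_id, status) pairs."""
    return [seg_id for seg_id, status in segments
            if status in DELETABLE_STATUSES]
```

An allow-list reads more safely than a deny-list here: any new or transient status is excluded by default instead of being deleted by accident.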

@CarbonDataQA2:

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/263/

@CarbonDataQA2:

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5860/

@vikramahuja1001 (Contributor):

Hi @W1thOut, can you raise the discussion in the community first? As you want to expose a new DDL, please check the community guidelines here: https://www.mail-archive.com/[email protected]/msg01835.html

@CarbonDataQA2:

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4117/

override def beforeEach(): Unit = {
initTestTable
}

Contributor:
add test cases with SI as well

Author:
OK. I have added test cases with SI.

@MarvinLitt (Contributor):

retest this please

@W1thOut (Author) commented Aug 17, 2021

retest this please

@CarbonDataQA2:

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5865/

@CarbonDataQA2:

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4123/

@CarbonDataQA2:

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/268/

@CarbonDataQA2:

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/269/

@CarbonDataQA2:

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5866/

@CarbonDataQA2:

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4124/

@W1thOut (Author) commented Aug 18, 2021

> Hi @W1thOut, can you raise the discussion in the community first? As you want to expose a new DDL, please check the community guidelines here: https://www.mail-archive.com/[email protected]/msg01835.html

Okay, but I cannot access the community list: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/
Is it because I don't have permission?

@nihal0107 (Contributor) commented Aug 18, 2021

@W1thOut (Author) commented Aug 18, 2021

retest this please

@CarbonDataQA2:

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4127/

@CarbonDataQA2:

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/272/

@W1thOut (Author) commented Aug 18, 2021

retest this please

@CarbonDataQA2:

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5871/

@CarbonDataQA2:

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4129/

@CarbonDataQA2:

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/274/

Commits:
- Revise
- modify code
- add SI test / add no partition table test
- code checkstyle
@W1thOut (Author) commented Aug 19, 2021

retest this please

@CarbonDataQA2:

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4133/

@CarbonDataQA2:

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5876/

@CarbonDataQA2:

Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/278/

@W1thOut (Author) commented Aug 21, 2021

> Hi @W1thOut, can you raise the discussion in the community first? As you want to expose a new DDL, please check the community guidelines here: https://www.mail-archive.com/[email protected]/msg01835.html
>
> Okay, but I cannot access the community list: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/
> Is it because I don't have permission?

Could you give me your valuable comments, inputs, and suggestions?
http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-CARBONDATA-4270-New-DDL-delete-table-tableName-expect-segment-remain-number-n-td109013.html

@MarvinLitt (Contributor):

Can this PR become a new kind of segment management? There may be scenarios where segments need to be aged out, keeping only a few.

@CarbonDataQA2:

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4383/

@CarbonDataQA2:

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6126/

@CarbonDataQA2:

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/516/
