merge from apache master #7

Open: wants to merge 3,218 commits into base: master
Commits
ecebee5
[CARBONDATA-4092] Fix concurrent issues in delete segment API's and M…
vikramahuja1001 Dec 18, 2020
aae93c1
[CARBONDATA-4093] Added logs for MV and method to verify if mv is in …
Dec 19, 2020
1dfcdec
[CARBONDATA-4094]: Fix fallback count(*) issue on partition table wit…
vikramahuja1001 Dec 21, 2020
c8cec12
[CARBONDATA-4089] Create table with location, if the location doesn't…
jack86596 Dec 22, 2020
11ae435
[CARBONDATA-4095] Fix Select Query with SI filter fails, when columnD…
Dec 22, 2020
385d9ab
[CARBONDATA-4088] Drop metacache didn't clear some cache information …
jack86596 Dec 17, 2020
316939b
[CARBONDATA-4099] Fixed select query on main table with a SI table in…
vikramahuja1001 Dec 28, 2020
19f9027
[CARBONDATA-4100] Fix SI segments are in inconsistent state with main…
Dec 28, 2020
8831af4
[CARBONDATA-4073] Added FT for missing scenarios and removed dead cod…
akkio-97 Nov 29, 2020
44db434
[CARBONDATA-3987] Handled filter and IUD operation for pagination rea…
nihal0107 Dec 30, 2020
4d8a01f
[CARBONDATA-4070] [CARBONDATA-4059] Fixed SI issues and improved FT
nihal0107 Nov 27, 2020
e019806
[CARBONDATA-4065] Support MERGE INTO SQL Command
BrooksLI Nov 5, 2020
2129466
[DOC] Running the Thrift JDBC/ODBC server with CarbonExtensions
QiangCai Jan 15, 2021
aa2121e
[CARBONDATA-4055]Fix creation of empty segment directory and meta
akashrn5 Nov 23, 2020
7585656
[CARBONDATA-4096] SDK read fails from cluster and sdk read filter que…
ShreelekhyaG Dec 22, 2020
5971417
[CARBONDATA-4051] Geo spatial index algorithm improvement and UDFs en…
shenjiayu17 Nov 19, 2020
f5e35cd
[CARBONDATA-4097] ColumnVectors should not be initialized as
Karan980 Dec 22, 2020
54f8697
[CARBONDATA-4104] Vector filling for complex decimal type needs to be…
akkio-97 Jan 7, 2021
46a46a0
[CARBONDATA-4109] Improve carbondata coverage for presto-integration …
akkio-97 Jan 11, 2021
5a2edc3
[CARBONDATA-4112] Data mismatch issue in SI global sort merge flow
Karan980 Jan 28, 2021
440ab03
[CARBONDATA-4113] Partition prune and cache fix when carbon.read.part…
ShreelekhyaG Jan 28, 2021
aa7efda
[CARBONDATA-4082] Fix alter table add segment query on adding a segme…
Karan980 Jan 4, 2021
9b04540
[CARBONDATA-4107] Added related MV tables Map to fact table and added…
Jan 12, 2021
afbf531
[CARBONDATA-4111] Filter query having invalid results after add segme…
ShreelekhyaG Jan 25, 2021
ec1c0ca
[CARBONDATA-4102] Added UT and FT to improve coverage of SI module.
nihal0107 Dec 23, 2020
115182d
[CARBONDATA-4122] Use CarbonFile API instead of java File API for Fli…
Feb 7, 2021
791857b
[CARBONDATA-4125] SI compatability issue fix
Feb 3, 2021
91f1b69
[CARBONDATA-4124] Fix Refresh MV which does not exist error message
Feb 9, 2021
3f1db97
[CARBONDATA-4117][CARBONDATA-4123] cg index and bloom index query iss…
ShreelekhyaG Feb 5, 2021
1cab165
[CARBONDATA-3962] Fixed concurrent load failure with flat folder stru…
nihal0107 Feb 16, 2021
5ec3536
[CARBONDATA-4126] Concurrent compaction failed with load on table
Karan980 Feb 11, 2021
59ad77a
[CARBONDATA-4121] Prepriming is not working in Index Server
Karan980 Feb 5, 2021
0112268
[CARBONDATA-4115] Successful load and insert will return segment ID
areyouokfreejoe Jan 30, 2021
8f2ee7f
[CARBONDATA-4137] Refactor CarbonDataSourceScan without the soruces.F…
QiangCai Feb 22, 2021
35c4b33
[CARBONDATA-4133] Concurrent Insert Overwrite with static partition o…
ShreelekhyaG Feb 23, 2021
25c5687
[CARBONDATA-4141] Index Server is not caching indexes for external ta…
Karan980 Mar 2, 2021
d5b3b8c
[CARBONDATA-4075] Using withEvents instead of fireEvent
QiangCai Jan 19, 2021
d9f69ae
[CARBONDATA-4110] Support clean files dry run operation and show stat…
vikramahuja1001 Jan 7, 2021
bce6481
[CARBONDATA-4144] During compaction, the segment lock of SI table is …
liuhe0702 Mar 10, 2021
a4921e9
[CARBONDATA-4145] Query fails and the message "File does not exist: x…
liuhe0702 Mar 10, 2021
b00efca
[CARBONDATA-4149] Fix query issues after alter add partition.
ShreelekhyaG Mar 13, 2021
b74645e
[CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file
jack86596 Mar 12, 2021
8d17de6
[CARBONDATA-4147] Fix re-arrange schema in logical relation on MV par…
Mar 12, 2021
6ab3647
[CARBONDATA-4146]Query fails and the error message "unable to get fil…
liuhe0702 Mar 10, 2021
fd0ff22
[CARBONDATA-4153] Fix DoNot Push down not equal to filter with Cast o…
Mar 16, 2021
0f53bdb
[CARBONDATA-4155] Fix Create table like table with MV
Mar 22, 2021
f5e4c89
[CARBONDATA-4149] Fix query issues after alter add empty partition lo…
ShreelekhyaG Mar 23, 2021
865ec9b
[CARBONDATA-4156] Fix Writing Segment Min max with all blocks of a se…
Mar 9, 2021
d535a1e
[CARBONDATA-4154] Fix various concurrent issues with clean files
vikramahuja1001 Mar 16, 2021
4ec3e58
add .asf.yaml
chenliang613 Mar 28, 2021
603133f
[maven-release-plugin] prepare release apache-carbondata-2.1.1-rc2
ajantha-bhat Mar 26, 2021
baa1f69
[maven-release-plugin] prepare for next development iteration
ajantha-bhat Mar 26, 2021
db8666c
Enable github's merge function
chenliang613 Apr 15, 2021
6be1691
Enable github's merge function This closes #4119
chenliang613 Apr 15, 2021
f67c8fa
[CARBONDATA-4161] Describe complex columns
ShreelekhyaG Mar 23, 2021
d01d9f5
[CARBONDATA-4163] Support adding of single-level complex columns(arra…
akkio-97 Mar 31, 2021
09ad509
[CARBONDATA-4158]Add Secondary Index as a coarse-grain index and use …
VenuReddy2103 Mar 9, 2021
71910fb
[CARBONDATA-4037] Improve the table status and segment file writing
ShreelekhyaG Oct 16, 2020
3a6e4a4
[CARBONDATA-4173][CARBONDATA-4174] Fix inverted index query issue and…
ShreelekhyaG Apr 22, 2021
3b411bb
[CARBONDATA-4172] Select query having parent and child struct column …
Apr 22, 2021
e5b1dd0
[CARBONDATA-4167][CARBONDATA-4168] Fix case sensitive issues and inpu…
ShreelekhyaG Apr 6, 2021
2f93479
[CARBONDATA-4170] Support dropping of parent complex columns(array/st…
akkio-97 Mar 31, 2021
7350c33
[HOTFIX] Remove hitcount link due to not working
chenliang613 Apr 30, 2021
c825730
[CARBONDATA-4166] Geo spatial Query Enhancements
Mar 12, 2021
8996369
[CARBONDATA-4175] [CARBONDATA-4162] Leverage Secondary Index till seg…
nihal0107 Mar 24, 2021
41a756f
[CARBONDATA-4188] Fixed select query with small table page size after…
nihal0107 May 17, 2021
861ba2e
[CARBONDATA-4185] Doc Changes for Heterogeneous format segments in ca…
maheshrajus May 12, 2021
35091a2
[CARBONDATA-4184] alter table Set TBLPROPERTIES for RANGE_COLUMN sets…
Karan980 May 12, 2021
07c98e8
[CARBONDATA-4189] alter table validation issues
maheshrajus May 18, 2021
a90243c
[CARBONDATA-4183] Local sort Partition Load and Compaction fix
Apr 2, 2021
01fd120
[CARBONDATA-4186] Fixed insert failure when partition column present …
nihal0107 May 12, 2021
4c04f7c
[CARBONDATA-4191] update table for primitive column not working when …
maheshrajus May 24, 2021
26e9182
[Doc] syntax and format issues in README.md and how-to-contribute-to-…
Sunt-ing May 14, 2021
8740016
[CARBONDATA-4192] UT cases correction for validating the exception me…
maheshrajus May 20, 2021
fee8b18
[CARBONDATA-4193] Fix compaction failure after alter add complex column.
ShreelekhyaG May 26, 2021
70643df
[CARBONDATA-4196] Allow zero or more white space in GEO UDFs
nihal0107 Jun 4, 2021
d838e3b
[CARBONDATA-4143] Enable UT with index server and fix related issues
ShreelekhyaG Feb 25, 2021
cfa02dd
[CARBONDATA-4179] Support renaming of complex columns (array/struct)
akkio-97 May 4, 2021
90841bc
[CARBONDATA-4202] Fix issue when refresh main table with MV
ShreelekhyaG Jun 4, 2021
f1da9e8
[CARBONDATA-4206] Support rename SI table
jack86596 Jun 10, 2021
65fad98
[CARBONDATA-4208] Wrong Exception received for complex child long str…
maheshrajus Jun 11, 2021
95ab745
[CARBONDATA-4212] Fix case sensitive issue with Update query having A…
Jun 16, 2021
fdd00ab
[CARBONDATA-4213] Fix update/delete issue in index server
vikramahuja1001 Jun 16, 2021
d8f7df9
[CARBONDATA-4211] Fix - from xx Insert into select fails if an SQL st…
ShreelekhyaG Jun 14, 2021
d5cb011
[CARBONDATA-4217] Fix rename SI table, other applications didn't get …
jack86596 Jun 19, 2021
18665cc
[CARBONDATA-4214] inserting NULL value when timestamp value received …
maheshrajus Jun 17, 2021
8ceb4fd
[CARBONDATA-4190] Integrate Carbondata with Spark 3.1.1 version
vikramahuja1001 Apr 6, 2021
d4ddd07
[CARBONDATA-4225] Fix Update performance issues when auto merge compa…
Jun 21, 2021
899b7ae
[HOTFIX] Correct CI build status
chenliang613 Jun 26, 2021
5e2adad
[HOTFIX] Correct CI build status This closes #4166
chenliang613 Jun 26, 2021
65462ff
[CARBONDATA-4230] table properties not updated with lower-case and table
maheshrajus Jun 23, 2021
aefa977
[maven-release-plugin] prepare release apache-carbondata-2.2.0-rc1
akashrn5 Jul 5, 2021
718490e
[maven-release-plugin] prepare for next development iteration
akashrn5 Jul 5, 2021
c7a3d6d
[HOTFIX]Revert wrong pom changes commit during prepare release process.
akashrn5 Jul 5, 2021
88fdf60
[CARBONDATA-4232] Add missing doc change for secondary index.
nihal0107 Jun 24, 2021
02e7723
[CARBONDATA-4210] Handle 3.1 parsing failures related to alter comple…
akkio-97 Jun 23, 2021
c9a5231
[CARBONDATA-4204][CARBONDATA-4231] Fix add segment error message,
ShreelekhyaG Jun 21, 2021
0337c32
[CARBONDATA-4250] Ignoring presto random test cases
maheshrajus Jul 19, 2021
9aaeba5
[CARBONDATA-4251][CARBONDATA-4253] Optimize Clean Files Performance
marchpure Jul 26, 2021
f2698fe
[CARBONDATA-4248] Fixed upper case column name in explain command
nihal0107 Jul 19, 2021
feb0521
[CARBONDATA-4247][CARBONDATA-4241] Fix Wrong timestamp value query re…
Jul 20, 2021
1e2fc4c
[CARBONDATA-4242]Improve cdc performance and introduce new APIs for U…
akashrn5 May 31, 2021
3c81c7a
[CARBONDATA-4255] Prohibit Create/Drop Database when databaselocation…
marchpure Jul 27, 2021
aceaa44
add new ways of "contact us" in README.md
Jeromestein Aug 1, 2021
62354e3
[CARBONDATA-4261][Doc][summer-2021] Add new ways of slack in README.m…
chenliang613 Aug 3, 2021
7b0d8f6
[maven-release-plugin] prepare for next development iteration
akashrn5 Aug 2, 2021
11dab76
[Doc] Add new dev mailing list (website) link and update the Nabble a…
Jeromestein Aug 5, 2021
a5bb652
[CARBONDATA-4268][Doc][summer-2021] Add new dev mailing list (website…
chenliang613 Aug 7, 2021
e8f8c02
[Doc][summer-2021] Add TOC and format how-to-contribute-to-apache-car…
Jeromestein Aug 1, 2021
d4abe76
[CARBONDATA-4266][Doc][summer-2021] Add TOC and format how-to-contrib…
chenliang613 Aug 7, 2021
926b67b
Update quick-start-guide.md
ChanceXin Aug 5, 2021
fac48be
[CARBONDATA-4267][Doc][summer-2021]Update and modify some content in …
chenliang613 Aug 8, 2021
bdd4a8c
[CARBONDATA-4256] Fixed parsing failure on SI creation for complex co…
nihal0107 Jul 29, 2021
1ccf295
[CARBONDATA-4091] support prestosql 333 integartion with carbon
ajantha-bhat Mar 9, 2020
5804060
[CARBONDATA-4269] Update url and description for new prestosql-guide.md
Aug 13, 2021
0e59ddb
[CARBONDATA-4272]carbondata test case not including the load command …
MarvinLitt Aug 19, 2021
9f9ea1f
[CARBONDATA-4272]carbondata test case not including the load command …
chenliang613 Aug 22, 2021
8de65a2
[CARBONDATA-4119][CARBONDATA-4238][CARBONDATA-4237][CARBONDATA-4236] …
ShreelekhyaG Aug 16, 2021
f52aa20
[CARBONDATA-4164][CARBONDATA-4198][CARBONDATA-4199][CARBONDATA-4234] …
ShreelekhyaG Jul 14, 2021
ca659b5
[CARBONDATA-4274] Fix create partition table error with spark 3.1
ShreelekhyaG Aug 19, 2021
bdc9484
[CARBONDATA-4271] Support DPP for carbon
Jul 13, 2021
42f6982
[CARBONDATA-4273] Fix Cannot create external table with partitions
Aug 27, 2021
226228f
[CARBONDATA-4278] Avoid refetching all indexes to get segment properties
maheshrajus Aug 23, 2021
4d8bc9e
[CARBONDATA-4282] Fix issues with table having complex columns relate…
ShreelekhyaG Sep 6, 2021
7199357
[CARBONDATA-4277] geo instance compatability fix
ShreelekhyaG Sep 14, 2021
3b29bcb
[CARBONDATA-4284] Load/insert after alter add column on partition tab…
ShreelekhyaG Sep 13, 2021
2d1907b
[CARBONDATA-4286] Fixed measure comparator
nihal0107 Sep 15, 2021
22342f8
[CARBONDATA-4285] Fix alter add complex columns with global sort comp…
maheshrajus Sep 16, 2021
ce860d0
[CARBONDATA-4288][CARBONDATA-4289] Fix various issues with Index Serv…
vikramahuja1001 Sep 17, 2021
9944936
[CARBONDATA-4243] Fixed si with column meta cache on same column
nihal0107 Sep 27, 2021
bca62cd
[CARBONDATA-4228] [CARBONDATA-4203] Fixed update/delete after alter a…
nihal0107 Sep 21, 2021
5a710f9
[CARBONDATA-4293] Make Table created without external keyword as Tran…
Sep 22, 2021
8b3d78b
[CARBONDATA-4215] Fix query issue after add segment other formats wit…
ShreelekhyaG Sep 28, 2021
b8d9a97
[CARBONDATA-4292] Spatial index creation using spark dataframe
ShreelekhyaG Sep 22, 2021
305851e
[CARBONDATA-4298][CARBONDATA-4281] Empty bad record support for compl…
ShreelekhyaG Sep 29, 2021
8953cde
[CARBONDATA-4306] Fix Query Performance issue for Spark 3.1
Sep 30, 2021
9dbd2a5
[CARBONDATA-4303] Columns mismatch when insert into table with static…
jack86596 Oct 12, 2021
7d94691
[CARBONDATA-4240]: Added missing properties on the configurations page
pratyakshsharma Oct 27, 2021
07b41a5
[CARBONDATA-4194] Fixed presto read after update/delete from spark
nihal0107 Sep 24, 2021
3be05d2
[CARBONDATA-4296]: schema evolution, enforcement and deduplication ut…
pratyakshsharma Oct 27, 2021
81c2e29
Supplementary information for add segment syntax .
bieremayi Nov 25, 2021
18840af
[CARBONDATA-4305] Support Carbondata Streamer tool for incremental fe…
akashrn5 Sep 1, 2021
598d1ce
Add FAQ How to manage mix file format in carbondata table.
bieremayi Nov 29, 2021
885a21c
FAQ: carbon rename to carbondata
bieremayi Nov 29, 2021
69ab06c
remove add segment refer
bieremayi Dec 2, 2021
7af81ad
remove useless numerical value , revert some typo issues
bieremayi Dec 4, 2021
42d59be
Revert "remove useless numerical value , revert some typo issues"
bieremayi Dec 7, 2021
580f7f6
Revert "remove add segment refer"
bieremayi Dec 7, 2021
341f1bf
Revert "FAQ: carbon rename to carbondata"
bieremayi Dec 7, 2021
ce5747d
Revert "Add FAQ How to manage mix file format in carbondata table."
bieremayi Dec 7, 2021
f544e59
Revert "Supplementary information for add segment syntax ."
bieremayi Dec 7, 2021
c29fee2
Supplementary information for add segment syntax
bieremayi Dec 7, 2021
379f5ad
Update docs/addsegment-guide.md
bieremayi Dec 18, 2021
a1b6d99
Update docs/addsegment-guide.md
bieremayi Dec 18, 2021
fc3914f
[maven-release-plugin] prepare release apache-carbondata-2.3.0-rc1
kunal642 Dec 20, 2021
861fc67
[maven-release-plugin] prepare for next development iteration
kunal642 Dec 20, 2021
01f8e1a
[maven-release-plugin] Reverted the pom changes to 2.2.0-SNAPSHOT
kunal642 Dec 20, 2021
053d080
[maven-release-plugin] prepare release apache-carbondata-2.3.0-rc1
kunal642 Dec 20, 2021
c0211fc
[maven-release-plugin] prepare for next development iteration
kunal642 Dec 20, 2021
0ced3c8
[maven-release-plugin] Reverted the pom changes to 2.2.0-SNAPSHOT
kunal642 Dec 20, 2021
f266a73
[CARBONDATA-4315] Supplementary information for add segment syntax Th…
chenliang613 Dec 20, 2021
d629dc0
[CARBONDATA-4316]Fix horizontal compaction failure for partition tables
akashrn5 Dec 7, 2021
0f1d2a4
[CARBONDATA-4317] Fix TPCDS performance issues
Dec 7, 2021
a072e7a
[CARBONDATA-4319] Fixed clean files not deleteting stale delete delta…
vikramahuja1001 Dec 22, 2021
970f11d
[CARBONDATA-4308]: added docs for streamer tool configs
pratyakshsharma Dec 10, 2021
308906e
[CARBONDATA-4318]Improve load overwrite performance for partition tables
akashrn5 Dec 8, 2021
05aff87
[CARBONDATA-4320] Fix clean files removing wrong delta files
vikramahuja1001 Jan 4, 2022
59f23c0
[CARBONDATA-4322] Apply local sort task level property for insert
ShreelekhyaG Jan 24, 2022
c840b5f
[CARBONDATA-4325] Update Data frame supported options in document and…
ShreelekhyaG Feb 28, 2022
19343a7
[CARBONDATA-4326] MV not hitting with multiple sessions issue fix
ShreelekhyaG Mar 2, 2022
9b74951
[maven-release-plugin] prepare release apache-carbondata-2.3.0-rc2
kunal642 Jan 19, 2022
e25d5b6
[maven-release-plugin] prepare for next development iteration
kunal642 Jan 19, 2022
a838531
[CARBONDATA-4306] Fix Query Performance issue for Spark 3.1
Mar 4, 2022
41831ce
[CARBONDATA-4327] Update documentation related to partition
ShreelekhyaG Mar 17, 2022
d6ce946
[CARBONDATA-4328] Load parquet table with options error message fix
ShreelekhyaG Mar 14, 2022
46b62cf
[CARBONDATA-4329] Fix multiple issues with External table
Mar 23, 2022
45acd67
[CARBONDATA-4330] Incremental Dataload of Average aggregate in MV
ShreelekhyaG Jan 24, 2022
57e76ee
[CARBONDATA-4336] Table Status Versioning
Apr 22, 2022
33408be
[CARBONDATA-4335] Disable MV by default
maheshrajus May 10, 2022
4b8846d
[CARBONDATA-4341] Drop Index Fails after TABLE RENAME
ShreelekhyaG Jun 15, 2022
93b0af2
[CARBONDATA-4339]Fix NullPointerException in load overwrite on partit…
akashrn5 Jun 16, 2022
858afc7
[CARBONDATA-4344] Create MV fails with "LOCAL_DICTIONARY_INCLUDE/LOCA…
ShreelekhyaG Jun 22, 2022
b8511b6
[CARBONDATA-4345] update/delete operations failed when other format s…
maheshrajus Jun 28, 2022
8691cb7
[CARBONDATA-4342] Fix Desc Columns shows New Column added, even thoug…
Jun 9, 2022
04b1756
[CARBONDATA-4338] Moving dropped partition data to trash
maheshrajus Jun 6, 2022
b690cf2
remove wrong link for jenkin ci
chenliang613 Apr 8, 2023
8e9ffd5
remove wechat info
chenliang613 Apr 8, 2023
8b8345a
use github issues to replace jira issues
chenliang613 Apr 8, 2023
2f0241d
[ISSUE-4298] Fixed mail list issue
xubo245 Apr 8, 2023
c780221
[ISSUE-4299] Fixed compile issue with spark 2.3
xubo245 Apr 8, 2023
4af8af4
Merge pull request #4300 from xubo245/issue-4298
chenliang613 Apr 8, 2023
92f4dff
Merge pull request #4301 from xubo245/issue-4299
chenliang613 Apr 8, 2023
3cc0367
[ISSUE-4305] Optimize the magic number
xubo245 Apr 9, 2023
cba9a8a
Merge pull request #4307 from xubo245/ISSUE-4305-magicNumber
chenliang613 Apr 9, 2023
f92ae07
[ISSUE-4305] Optimize the usage of equalsIgnoreCase, which can avoid …
xubo245 Apr 9, 2023
9d43c78
[ISSUE-4305] Add @Override for override method (#4308)
xubo245 Apr 9, 2023
01dd526
[ISSUE-4306] Fix the error of SDKS3SchemaReadExample (#4312)
xubo245 Apr 10, 2023
44c2bca
[ISSUE-4313] Variable should use LowerCamelCase style (#4315)
xubo245 Apr 10, 2023
b941983
Support using Apache CarbonData in notebook (#4317)
xubo245 Apr 13, 2023
2439589
[ISSUE-4305] Optimize the usage of static method (#4309)
xubo245 Apr 24, 2023
f31edd6
Build carbondata notebook docker image by manual and by dockerfile (#…
xubo245 Jun 8, 2023
208afe9
Add new example:Using CarbonData to visualization in notebook (#4318)
xubo245 Jun 26, 2023
6e031c0
[ISSUE-4305] Optimize the constants and variable style (#4311)
xubo245 Jun 26, 2023
8264b3b
[ISSUE-4313] Add @Override for override method of processing module (…
xubo245 Jul 8, 2023
cd180c9
Create maven.yml
chenliang613 Aug 20, 2023
95b50e8
Update maven.yml
chenliang613 Aug 20, 2023
13a2c97
optimize code smell in presto module, (#4332)
xubo245 Oct 1, 2023
beb426c
[ISSUE-4329] optimize some code smells in presto module (#4330)
xubo245 Oct 1, 2023
9f604fc
Update maven.yml
chenliang613 Oct 1, 2023
95a6407
Update maven.yml
chenliang613 Oct 1, 2023
4462461
Update maven.yml
chenliang613 Oct 1, 2023
af9c6c3
Update maven.yml
chenliang613 Oct 1, 2023
d499699
Create github-actions-demo.yml
chenliang613 Oct 1, 2023
bcb30a5
fix ci issues
chenliang613 Oct 1, 2023
30c1aa8
Create maven.yml
chenliang613 Oct 10, 2023
84cfd20
Update maven.yml
chenliang613 Oct 10, 2023
504a5ae
Update pom.xml
chenliang613 Oct 10, 2023
66cb3a3
Update maven.yml
chenliang613 Oct 10, 2023
13ac2ef
Update maven.yml
chenliang613 Oct 10, 2023
0e0523e
Update pom.xml
chenliang613 Oct 10, 2023
0cae3d1
Update maven.yml
chenliang613 Oct 10, 2023
fd66031
Update maven.yml
chenliang613 Oct 10, 2023
38fdb16
Update maven.yml
chenliang613 Oct 10, 2023
ebe4101
fix
chenliang613 Oct 17, 2023
4618808
optmize code smell in presto module (#4331)
xubo245 Oct 17, 2023
448564a
optimizeCodeSmellInSpark (#4328)
xubo245 Oct 19, 2023
f18846c
optimize code smell in streaming module (#4327)
xubo245 Oct 19, 2023
39dd8ce
[ISSUE-4313] Add @Override for override method of processing module (…
xubo245 Oct 19, 2023
dd74408
Update pom.xml
chenliang613 Nov 5, 2023
1e327f2
line 117 (#4237)
WANGSHIHUAI Nov 5, 2023
57de4a3
fix testcase error (#4337)
QiangCai Nov 6, 2023
4a1b36f
[ISSUE-4338] Fix checkstyle issue in sdk module (#4339)
QiangCai Nov 8, 2023
7abc7cd
[CARBONDATA-4333][Doc] Update the declaration of supported String dat…
tangchuan92 Nov 11, 2023
53d3370
Update pom.xml for release
chenliang613 Nov 11, 2023
64ecd77
Update pom.xml
chenliang613 Nov 11, 2023
48f5976
Update pom.xml
chenliang613 Nov 11, 2023
7195869
[ISSUE-4342] Fix test case errors (#4343)
QiangCai Nov 19, 2023
d326118
[maven-release-plugin] prepare release carbondata-parent-apache-carbo…
chenliang613 Nov 19, 2023
a6e9e37
[maven-release-plugin] prepare for next development iteration
chenliang613 Nov 19, 2023
bcc7137
Create maven-publish.yml
chenliang613 Dec 2, 2023
dee20b8
Bump pyarrow from 0.11.1 to 14.0.1 in /python (#4341)
dependabot[bot] Dec 9, 2023
c1b0e9c
Use HTTPS protocol for the official website (#4348)
git-hulk Mar 16, 2024
74e6e93
Minor refactor the build/README.md (#4349)
git-hulk Mar 22, 2024
71abab0
[WIP] Optimize geo module, the feature seems less be used
chenliang613 Apr 6, 2024
5ff36b6
modify thrift version (#4356)
jackylk Jun 30, 2024
f370d20
[CARBONDATA-4349] Upgrade thrift version (#4355)
jackylk Jun 30, 2024
8f9ce4e
Bump org.apache.commons:commons-compress in /integration/presto (#4345)
dependabot[bot] Jul 6, 2024
2c78847
fix format and docker version issues
chenliang613 Jul 6, 2024
29607c3
[WIP] Optimize geo module, the feature seems less be used (#4353)
chenliang613 Jul 6, 2024
e0ac69a
[ISSUE-4351] Add github action for building (#4358)
kevinjmh Oct 5, 2024
The diff you're trying to view is too large. We only load the first 3000 changed files.
Binary file added .DS_Store
Binary file not shown.
42 changes: 42 additions & 0 deletions .asf.yaml
@@ -0,0 +1,42 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

notifications:
  commits: [email protected]
  issues: [email protected]
  pullrequests: [email protected]

github:
  description: High performance data store solution
  homepage: carbondata.apache.org
  labels:
    - big-data
    - data-format
    - scala
    - java
    - carbondata
    - apache
    - spark
    - hadoop
  features:
    issues: true
    projects: true
    wiki: true
  enabled_merge_buttons:
    squash: true
    merge: true
    rebase: false
34 changes: 14 additions & 20 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,21 +1,15 @@
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

- [ ] Make sure the PR title is formatted like:
`[CARBONDATA-<Jira issue #>] Description of pull request`
- [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
Travis-CI on your fork and ensure the whole test matrix passes).
- [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
number, if there is one.
- [ ] If this contribution is large, please file an Apache
[Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).
- [ ] Testing done
### Why is this PR needed?


Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- What manual testing you have done?
- Any additional information to help reviewers in testing this change.

- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

---
### What changes were proposed in this PR?


### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes


49 changes: 49 additions & 0 deletions .github/workflows/build-test.yml
@@ -0,0 +1,49 @@
name: Build Test

on: [push, pull_request]

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        spark-profile: [ 'spark-2.3','spark-2.4','spark-3.1' ]
    name: Build with spark-version ${{ matrix.spark-profile }}

    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up JDK
        uses: actions/setup-java@v4
        with:
          java-version: 8
          distribution: temurin
      - name: Cache
        uses: actions/cache@v4
        with:
          path: |
            ~/.m2/repository
            /usr/local/bin/thrift
          key: ${{ runner.os }}-maven
          restore-keys: |
            ${{ runner.os }}-maven
      - name: setup-thrift
        run: |
          if [ ! -f "/usr/local/bin/thrift" ];then
            echo "build thrift binary"
            sudo apt-get update -qq
            sudo apt-get install -qq protobuf-compiler
            sudo apt-get install -qq libboost-dev libboost-test-dev libboost-program-options-dev libevent-dev automake libtool flex bison pkg-config g++ libssl-dev
            wget -qO- https://archive.apache.org/dist/thrift/0.20.0/thrift-0.20.0.tar.gz | tar zxf -
            cd thrift-0.20.0/
            chmod +x ./configure
            ./configure --disable-libs
            sudo make -j4 install
          else
            echo "use cache thrift binary"
          fi
      - name: install
        run: |
          mvn clean install -DskipTests -Pbuild-with-format -P${{ matrix.spark-profile }}
34 changes: 34 additions & 0 deletions .github/workflows/maven-publish.yml
@@ -0,0 +1,34 @@
# This workflow will build a package using Maven and then publish it to GitHub packages when a release is created
# For more information see: https://github.com/actions/setup-java/blob/main/docs/advanced-usage.md#apache-maven-with-a-settings-path

name: Maven Package

on:
  release:
    types: [created]

jobs:
  build:

    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
    - uses: actions/checkout@v3
    - name: Set up JDK 18
      uses: actions/setup-java@v3
      with:
        java-version: '18'
        distribution: 'temurin'
        server-id: github # Value of the distributionManagement/repository/id field of the pom.xml
        settings-path: ${{ github.workspace }} # location for the settings.xml file

    - name: Build with Maven
      run: mvn -B package --file pom.xml

    - name: Publish to GitHub Packages Apache Maven
      run: mvn deploy -s $GITHUB_WORKSPACE/settings.xml
      env:
        GITHUB_TOKEN: ${{ github.token }}
13 changes: 12 additions & 1 deletion .gitignore
@@ -12,7 +12,18 @@
.settings
.cache
target/
store/CSDK/cmake-build-debug/*
.project
.classpath
.DS_Store
metastore_db/
derby.log
derby.log
python/.idea/
*/.cache-main
*/.cache-tests
*/*/.cache-main
*/*/.cache-tests
*/*/*/.cache-main
*/*/*/.cache-tests
*.flattened-pom.xml
python/pycarbon/.pylintrc
17 changes: 0 additions & 17 deletions .travis.yml

This file was deleted.

10 changes: 0 additions & 10 deletions DISCLAIMER

This file was deleted.

13 changes: 13 additions & 0 deletions LICENSE
@@ -200,3 +200,16 @@
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


---------------------------------------------------------------------------
This product bundles various third-party components under other open source licenses.
This section summarizes those components and their licenses. See licenses-binary/
for text of these licenses.

BSD 2-Clause
------------

com.github.luben:zstd-jni

com.github.paul-hammant:paranamer
4 changes: 2 additions & 2 deletions NOTICE
@@ -1,5 +1,5 @@
Apache CarbonData (incubating)
Copyright 2016-2017 The Apache Software Foundation
Apache CarbonData
Copyright 2016 and onwards The Apache Software Foundation.

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
111 changes: 75 additions & 36 deletions README.md
@@ -1,67 +1,106 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to you under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<img src="/docs/images/CarbonData_logo.png" width="200" height="40">

Apache CarbonData(incubating) is an indexed columnar data format for fast analytics on big data platform, e.g.Apache Hadoop, Apache Spark, etc.
Apache CarbonData is an indexed columnar data store solution for fast analytics on big data platform, e.g. Apache Hadoop, Apache Spark, etc.

You can find the latest CarbonData document and learn more at:
[http://carbondata.incubator.apache.org](http://carbondata.incubator.apache.org/)
[https://carbondata.apache.org](https://carbondata.apache.org/)

[CarbonData cwiki](https://cwiki.apache.org/confluence/display/CARBONDATA/)

## Status
[![Build Status](https://travis-ci.org/apache/incubator-carbondata.svg?branch=master)](https://travis-ci.org/apache/incubator-carbondata)
Spark2.4:
[![Coverage Status](https://coveralls.io/repos/github/apache/carbondata/badge.svg?branch=master)](https://coveralls.io/github/apache/carbondata?branch=master)
<a href="https://scan.coverity.com/projects/carbondata">
<img alt="Coverity Scan Build Status"
src="https://scan.coverity.com/projects/13444/badge.svg"/>
</a>

## Features
The CarbonData file format is a columnar store in HDFS. It has many of the features of a modern columnar format, such as being splittable, compression schemes, and complex data types, and it has the following unique features:
* Stores data along with an index: this can significantly accelerate query performance and reduce I/O scans and CPU usage when the query contains filters. The CarbonData index consists of multiple levels of indices. A processing framework can leverage this index to reduce the number of tasks it needs to schedule and process, and it can also skip-scan at a finer-grained unit (called a blocklet) during task-side scanning instead of scanning the whole file.
* Operable encoded data: by supporting efficient compression and global encoding schemes, CarbonData can query compressed/encoded data directly. The data is converted just before the results are returned to the user, which is known as "late materialization".
* Supports various use cases with a single data format: interactive OLAP-style queries, sequential access (big scan), and random access (narrow scan).
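
The skip-scan idea in the first feature can be illustrated with a minimal Python sketch. It assumes each blocklet keeps the minimum and maximum of a column, so an equality filter can skip any blocklet whose range cannot contain the target. The names `Blocklet` and `scan_equals` are hypothetical and are not part of the CarbonData API; the real index has multiple levels and is considerably more involved.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Blocklet:
    """Hypothetical stand-in for a blocklet: column values plus min/max statistics."""
    values: List[int]
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self) -> None:
        # Statistics are computed once, when the blocklet is "written".
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def scan_equals(blocklets: List[Blocklet], target: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of blocklets actually scanned."""
    matches: List[int] = []
    scanned = 0
    for blocklet in blocklets:
        # Skip the blocklet when the target cannot fall inside its value range.
        if target < blocklet.min_value or target > blocklet.max_value:
            continue
        scanned += 1
        matches.extend(v for v in blocklet.values if v == target)
    return matches, scanned


blocklets = [Blocklet([1, 3, 5]), Blocklet([10, 12, 12]), Blocklet([20, 25, 30])]
matches, scanned = scan_equals(blocklets, 12)
print(matches, scanned)  # [12, 12] 1
```

In this toy data set only one of the three blocklets is read for the filter `value = 12`; the other two are pruned from their statistics alone.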

## Building CarbonData
CarbonData is built using Apache Maven. See how to [build CarbonData](https://github.com/apache/carbondata/blob/master/build).

## Online Documentation
* [What is CarbonData](https://github.com/apache/carbondata/blob/master/docs/introduction.md)
* [Quick Start](https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md)
* [Use Cases](https://github.com/apache/carbondata/blob/master/docs/usecases.md)
* [Language Reference](https://github.com/apache/carbondata/blob/master/docs/language-manual.md)
* [CarbonData Data Definition Language](https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md)
* [CarbonData Data Manipulation Language](https://github.com/apache/carbondata/blob/master/docs/dml-of-carbondata.md)
* [CarbonData Streaming Ingestion](https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md)
* [Configuring CarbonData](https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md)
* [Index Developer Guide](https://github.com/apache/carbondata/blob/master/docs/index-developer-guide.md)
* [Data Types](https://github.com/apache/carbondata/blob/master/docs/supported-data-types-in-carbondata.md)
* [CarbonData Index Management](https://github.com/apache/carbondata/blob/master/docs/index/index-management.md)
* [CarbonData BloomFilter Index](https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md)
* [CarbonData Lucene Index](https://github.com/apache/carbondata/blob/master/docs/index/lucene-index-guide.md)
* [CarbonData MV](https://github.com/apache/carbondata/blob/master/docs/mv-guide.md)
* [CarbonData Secondary Index](https://github.com/apache/carbondata/blob/master/docs/index/secondary-index-guide.md)
* [Heterogeneous format segments in CarbonData](https://github.com/apache/carbondata/blob/master/docs/addsegment-guide.md)
* [SDK Guide](https://github.com/apache/carbondata/blob/master/docs/sdk-guide.md)
* [C++ SDK Guide](https://github.com/apache/carbondata/blob/master/docs/csdk-guide.md)
* [Performance Tuning](https://github.com/apache/carbondata/blob/master/docs/performance-tuning.md)
* [S3 Storage](https://github.com/apache/carbondata/blob/master/docs/s3-guide.md)
* [Distributed Index Server](https://github.com/apache/carbondata/blob/master/docs/index-server.md)
* [CDC and SCD](https://github.com/apache/carbondata/blob/master/docs/scd-and-cdc-guide.md)
* [Carbon as Spark's Datasource](https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md)
* [FAQs](https://github.com/apache/carbondata/blob/master/docs/faq.md)

## Experimental Features

Some features are marked as experimental because the syntax/implementation might change in the future.
1. Hybrid format table using Add Segment.
2. Accelerating performance using MV on Parquet/ORC.
3. Merge API for Spark DataFrame.
4. Hive write for non-transactional table.
5. Secondary Index as a Coarse Grain Index in query processing.

## Integration
* [Hive](https://github.com/apache/carbondata/blob/master/docs/hive-guide.md)
* [Presto](https://github.com/apache/carbondata/blob/master/docs/prestodb-guide.md)
* [Alluxio](https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md)
* [Flink](https://github.com/apache/carbondata/blob/master/docs/flink-integration-guide.md)

## Other Technical Material
* [Apache CarbonData meetup material](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=66850609)
* [Use Case Articles](https://cwiki.apache.org/confluence/display/CARBONDATA/CarbonData+Articles)

## Fork and Contribute
This is an active open source project for everyone, and we are always open to people who want to use this system or contribute to it.
This guide explains [how to contribute to CarbonData](https://github.com/apache/carbondata/blob/master/docs/how-to-contribute-to-apache-carbondata.md).

## Contact us
To get involved in CarbonData:

* First join by emailing to [[email protected]](mailto:[email protected]), then you can discuss issues by emailing to [[email protected]](mailto:[email protected]).
You can also directly visit [[email protected]](https://lists.apache.org/[email protected]).
Or you can visit [Apache CarbonData Dev Mailing List archive](https://lists.apache.org/[email protected]).

* Report issues on [GitHub Issues](https://github.com/apache/carbondata/issues).

* You can also use Slack to get in touch with the community. After we invite you, you can use this [Slack Link](https://carbondataworkspace.slack.com/) to sign in to the CarbonData workspace.


## About
Apache CarbonData is an open source project of The Apache Software Foundation (ASF).