
merge from apache master #7

Open - wants to merge 3,218 commits into base: master

This pull request is big! We're only showing the most recent 250 commits.

Commits on Dec 21, 2020

  1. [CARBONDATA-4092] Fix concurrent issues in delete segment APIs and MV flow
    
    Why is this PR needed?
    There are multiple issues with the delete segment API:
    
    It does not use the latest loadmetadatadetails while writing to the table status file, and so can remove the table status entry of any concurrently loaded insert-in-progress/success segment.
    The code reads the table status file twice.
    Under concurrent queries, both access checkAndReloadSchema for MV on all databases; two different queries try to create a file at the same location, HDFS grants the lock to one and fails the other, thus failing the query.
    
    What changes were proposed in this PR?
    Read the table status file only once.
    Use the latest table status to mark the segment as marked-for-delete, so no concurrency issues arise.
    Made the touchMDT and checkAndReloadSchema methods synchronized, so that only one instance can access them at a time; a sketch follows below.
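
    A minimal sketch of the synchronization change, with hypothetical method bodies (the real methods live in CarbonData's MV schema-provider code):

    ```scala
    object MVSchemaProvider {
      // Only one thread per JVM may touch the modified-time (MDT) file at a
      // time, so concurrent queries no longer race to create the same HDFS file.
      def touchMDT(systemFolderPath: String): Unit = this.synchronized {
        // create or update the last-modified-time file
      }

      def checkAndReloadSchema(databaseName: String): Unit = this.synchronized {
        // reload MV schemas only if the MDT file has changed
      }
    }
    ```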
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4059
    vikramahuja1001 authored and ajantha-bhat committed Dec 21, 2020
    Commit ecebee5
  2. [CARBONDATA-4093] Added logs for MV and a method to verify if the MV is in sync during query

    Why is this PR needed?
    Added logs for MV and a method to verify whether the MV is in sync during query.
    
    What changes were proposed in this PR?
    1. Move the MV enable check to the beginning, to avoid transforming the logical plan unnecessarily.
    2. Log an error if an exception occurs while fetching the MV schema.
    3. Check whether the MV is in sync and only then allow query rewrite.
    4. Reuse the already-read LoadMetadataDetails to get mergedLoadMapping.
    5. Set no-dictionary schema types for the insert-partition flow - missed from [CARBONDATA-4077].
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4060
    Indhumathi27 authored and akashrn5 committed Dec 21, 2020
    Commit aae93c1

Commits on Dec 22, 2020

  1. [CARBONDATA-4094]: Fix fallback count(*) issue on partition table with index server
    
    Why is this PR needed?
    The asJava call converts to Java "in place": to save time and memory it does not copy the data, but
    simply wraps the Scala collection in a class that conforms to the Java interface. The Java
    serializer is therefore unable to serialize it.
    
    What changes were proposed in this PR?
    Convert it to a concrete list, which the serializer can handle.
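
    A small illustration of the difference, as a sketch (the collection contents are made up):

    ```scala
    import scala.collection.JavaConverters._

    val splits: Seq[String] = Seq("part-0", "part-1")

    // asJava only wraps the Scala Seq; the wrapper class is not
    // java.io.Serializable, so Java serialization fails on it.
    val wrapped: java.util.List[String] = splits.asJava

    // Copying into a concrete java.util.ArrayList yields a serializable list.
    val copied = new java.util.ArrayList[String](splits.asJava)
    ```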
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4061
    vikramahuja1001 authored and Indhumathi27 committed Dec 22, 2020
    Commit 1dfcdec

Commits on Dec 23, 2020

  1. [CARBONDATA-4089] Create table with location, if the location doesn't have scheme, the default will be local file system, which is not the file system defined by fs.defaultFS
    
    Why is this PR needed?
    When a table is created with a location that has no scheme, the location defaults to the local file system, which is not the file system defined by fs.defaultFS.
    
    What changes were proposed in this PR?
    If the location has no scheme, prepend the fs.defaultFS scheme to the location.
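
    A sketch of the idea using the Hadoop FileSystem API (not the exact CarbonData code):

    ```scala
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path

    def qualifyLocation(location: String, conf: Configuration): String = {
      val path = new Path(location)
      if (path.toUri.getScheme == null) {
        // getFileSystem resolves via fs.defaultFS when no scheme is present
        path.getFileSystem(conf).makeQualified(path).toString
      } else {
        location
      }
    }
    ```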
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4065
    jack86596 authored and ajantha-bhat committed Dec 23, 2020
    Commit c8cec12
  2. [CARBONDATA-4095] Fix Select Query with SI filter fails, when columnDrift is Set
    
    Why is this PR needed?
    After converting the expression to an IN expression for a main table with SI, the expression
    is not processed when ColumnDrift is enabled. The query fails with an NPE during
    resolveFilter. The exception is attached in the JIRA.
    
    What changes were proposed in this PR?
    Process the filter expression after adding implicit expression
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4063
    Indhumathi27 authored and akashrn5 committed Dec 23, 2020
    Commit 11ae435
  3. [CARBONDATA-4088] Drop metacache didn't clear some cache information which leads to memory leak
    
    Why is this PR needed?
    When there are two Spark applications and one drops a table, some cache information for that table
    stays in the other application and cannot be removed by any method, including the "Drop metacache" command.
    This causes a memory leak. Over time the leak accumulates, which
    finally leads to driver OOM. The leak points are:
    1) tableModifiedTimeStore in CarbonFileMetastore;
    2) segmentLockMap in BlockletDataMapIndexStore;
    3) absoluteTableIdentifierByteMap in SegmentPropertiesAndSchemaHolder;
    4) tableInfoMap in CarbonMetadata.
    
    What changes were proposed in this PR?
    Use an expiring map to cache the table information in CarbonMetadata and the modified time in
    CarbonFileMetaStore, so that stale information is cleared automatically after the expiration
    time. Operations in BlockletDataMapIndexStore do not need to be locked; remove all the logic
    related to segmentLockMap. A sketch of the expiry idea follows below.
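
    A minimal, self-contained sketch of time-based expiry (illustrative only, not the actual CarbonMetadata code):

    ```scala
    import java.util.concurrent.ConcurrentHashMap

    class ExpiringCache[K, V](expirationMs: Long) {
      private case class Entry(value: V, insertedAt: Long)
      private val map = new ConcurrentHashMap[K, Entry]()

      def put(key: K, value: V): Unit =
        map.put(key, Entry(value, System.currentTimeMillis()))

      // Entries older than expirationMs are dropped on access, so stale table
      // info from other applications cannot accumulate and leak memory.
      def get(key: K): Option[V] = {
        val e = map.get(key)
        if (e == null) None
        else if (System.currentTimeMillis() - e.insertedAt > expirationMs) {
          map.remove(key)
          None
        } else Some(e.value)
      }
    }
    ```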
    
    Does this PR introduce any user interface change?
    New configuration carbon.metacache.expiration.seconds is added.
    
    Is any new testcase added?
    No
    
    This closes #4057
    jack86596 authored and akashrn5 committed Dec 23, 2020
    Commit 385d9ab

Commits on Dec 29, 2020

  1. [CARBONDATA-4099] Fixed select query on main table with a SI table in case of concurrent load, compact and clean files operation
    
    Why is this PR needed?
    There were 2 issues in the clean files post event listener:
    
    1. In concurrent cases, while writing the entry back to the table status file, a wrong path was given,
    due to which the table status file was not updated in the case of the SI table.
    2. While writing the loadMetadataDetails to the table status file during concurrent scenarios,
    we were writing only the unwanted segments and not all the segments, which could leave segments
    stale in the SI table.
    Due to these two issues, when a select query was executed on the SI table, the table status would have an entry
    for a segment whose carbondata file had been deleted, thus throwing an IOException.
    3. Segment ID is null when writing a hive table.
    
    What changes were proposed in this PR?
    1. & 2. Used the correct table status path and sent the correct loadMetadataDetails to be updated in
    the table status file. Now when a select query is fired on the SI table, it will not throw
    a carbondata-file-not-found exception.
    3. Set the load model after the committer's setup job.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4066
    vikramahuja1001 authored and akashrn5 committed Dec 29, 2020
    Commit 316939b

Commits on Dec 30, 2020

  1. [CARBONDATA-4100] Fix SI segments are in inconsistent state with maintable after concurrent Load & Compaction operation
    
    Why is this PR needed?
    When concurrent LOAD and COMPACTION are in progress on a main table having SI, the SILoadEventListenerForFailedSegments listener is called to repair failed SI segments, if any. It compares the SI and main table segment status; if there is a mismatch, it adds that specific load to failedLoads to be re-loaded again.
    
    During compaction, SI is updated first and then the main table. So, in some cases, the SI segment will be in COMPACTED state while the main table segment is still in SUCCESS state (the compaction may still be in progress, or some operation may have failed). SI index repair adds those segments to failedLoads after checking that the segment lock can be acquired. But if the main table compaction has finished by the time the SI repair comparison is done, it can still acquire the segment lock and add those loads to failedLoads (even though the main table load is COMPACTED). After the concurrent operation finishes, some SI segments are left marked as INSERT_IN_PROGRESS. This leads to an inconsistent state between the SI and main table segments.
    
    What changes were proposed in this PR?
    Acquire the compaction lock on the main table (to ensure compaction is not running), and then compare the SI and main table load details to repair the SI segments.
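
    A minimal sketch of the proposed flow; getCompactionLock, compareLoadDetails and reloadFailedSegments are placeholder helpers, not CarbonData APIs:

    ```scala
    val compactionLock = getCompactionLock(mainTable) // placeholder helper
    if (compactionLock.lockWithRetries()) {
      try {
        // safe to compare: no compaction can run while we hold the lock
        val failedLoads = compareLoadDetails(mainTable, indexTable)
        reloadFailedSegments(failedLoads)
      } finally {
        compactionLock.unlock()
      }
    } else {
      // a compaction is in progress; skip SI repair for now
    }
    ```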
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No (concurrent scenario)
    
    This closes #4067
    Indhumathi27 authored and ajantha-bhat committed Dec 30, 2020
    Commit 19f9027
  2. [CARBONDATA-4073] Added FT for missing scenarios and removed dead code in Presto integration
    
    Why is this PR needed?
    FTs for the following cases have been added. Here the store is created by Spark and read by Presto.
    
    update without local-dict
    delete operations on table
    minor, major, custom compaction
    add and delete segments
    test update with inverted index
    read with partition columns
    Filter on partition columns
    Bloom index
    test range columns
    read streaming data
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4031
    akkio-97 authored and ajantha-bhat committed Dec 30, 2020
    Commit 8831af4

Commits on Jan 5, 2021

  1. [CARBONDATA-3987] Handled filter and IUD operation for pagination reader in SDK
    
    Why is this PR needed?
    Currently, the SDK pagination reader does not support filter expressions, and it also returns wrong results after IUD operations performed through the SDK.
    
    What changes were proposed in this PR?
    If a filter is present, or an update/delete operation has occurred, get the total rows in the splits after building the carbon reader; otherwise get the row count from the detail info of each split.
    Handled ArrayIndexOutOfBoundsException and return zero when rowCountInSplits.size() == 0.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4068
    nihal0107 authored and ajantha-bhat committed Jan 5, 2021
    Commit 44db434

Commits on Jan 6, 2021

  1. [CARBONDATA-4070] [CARBONDATA-4059] Fixed SI issues and improved FT

    Why is this PR needed?
    1. Block SI creation on binary column.
    2. Block alter table drop column directly on SI table.
    3. Create table as like should not be allowed for SI tables.
    4. Filter with like should not scan SI table.
    5. Compaction is currently allowed on SI tables. Because of this, if only the SI table
    is compacted, running a filter query on the main table causes a larger data
    scan of the SI table, which degrades performance.
    
    What changes were proposed in this PR?
    1. Blocked SI creation on binary column.
    2. Blocked alter table drop column directly on SI table.
    3. Handled Create table as like for SI tables.
    4. Handled filter with like to not scan SI table.
    5. Block the direct compaction on SI table and add FTs for compaction scenario of SI.
    6. Added FT for compression and range column on SI table.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4037
    nihal0107 authored and Indhumathi27 committed Jan 6, 2021
    Commit 4d8a01f

Commits on Jan 11, 2021

  1. [CARBONDATA-4065] Support MERGE INTO SQL Command

    Why is this PR needed?
    To support the MERGE INTO SQL command in CarbonData.
    The previous Scala parser had trouble parsing the complicated MERGE INTO SQL command.
    
    What changes were proposed in this PR?
    Add an ANTLR parser, and support parsing the MERGE INTO SQL command into a dataset command.
    
    Does this PR introduce any user interface change?
    Yes.
    The PR introduces the MERGE INTO SQL Command.
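
    An illustrative use of the new command from Spark SQL (table and column names are made up; the exact accepted grammar is defined by the new ANTLR parser, and spark is an existing SparkSession):

    ```scala
    spark.sql(
      """MERGE INTO target t
        |USING source s
        |ON t.id = s.id
        |WHEN MATCHED THEN UPDATE SET t.value = s.value
        |WHEN NOT MATCHED THEN INSERT (id, value) VALUES (s.id, s.value)
        |""".stripMargin)
    ```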
    
    Is any new testcase added?
    Yes
    
    This closes #4032
    
    Co-authored-by: Zhangshunyu <[email protected]>
    2 people authored and QiangCai committed Jan 11, 2021
    Commit e019806

Commits on Jan 19, 2021

  1. [DOC] Running the Thrift JDBC/ODBC server with CarbonExtensions

    Why is this PR needed?
    Since version 2.0, Carbon supports starting the Spark ThriftServer with CarbonExtensions.
    
    What changes were proposed in this PR?
    Add documentation on starting the Spark ThriftServer with CarbonExtensions.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4077
    QiangCai authored and ajantha-bhat committed Jan 19, 2021
    Commit 2129466

Commits on Jan 21, 2021

  1. [CARBONDATA-4055] Fix creation of empty segment directory and meta entry when there is no update/insert data
    
    Why is this PR needed?
    1. After #3999, when an update happens on the table, a new segment
    is created for the updated data. But when there is no data to update,
    segments are still created, and the table status has in-progress
    entries for those empty segments. This leads to unnecessary segment
    directories and an increase in table status entries.
    2. After this, clean files does not clean these empty segments.
    3. When the source table has no data, CTAS results in the same
    problem mentioned above.
    
    What changes were proposed in this PR?
    When no data is present during an update, mark the segment as marked-for-delete
    so that clean files takes care of deleting the segment.
    CTAS was already handled; added test cases.
    
    This closes #4018
    akashrn5 authored and kunal642 committed Jan 21, 2021
    Commit aa2121e

Commits on Jan 22, 2021

  1. [CARBONDATA-4096] SDK read fails from cluster and SDK read filter query on sort column giving wrong result with IndexServer
    
    Why is this PR needed?
    1. Creating a table and reading from SDK-written files fails in a cluster with
    java.nio.file.NoSuchFileException: hdfs:/hacluster/user/hive/warehouse/carbon.store/default/sdk.
    2. After fixing the above path issue, a filter query on a sort column gives
    the wrong result with IndexServer.
    
    What changes were proposed in this PR?
    1. In getAllDeleteDeltaFiles, used CarbonFiles.listFiles instead of Files.walk
    to handle custom file types.
    2. In PruneWithFilter, isResolvedOnSegment is used in the filterResolver step.
    Set the table and expression on the executor side, so the index server can use them
    in the filterResolver step.
    
    This closes #4064
    ShreelekhyaG authored and kunal642 committed Jan 22, 2021
    Commit 7585656

Commits on Jan 25, 2021

  1. [CARBONDATA-4051] Geo spatial index algorithm improvement and UDFs enhancement
    
    Why is this PR needed?
    Spatial index feature optimization of CarbonData
    
    What changes were proposed in this PR?
    1. Update the spatial index encoding algorithm, which reduces the properties required to create a geo table.
    2. Enhance the geo query UDFs: support querying a geo table with a polygon list, polyline list, or geoId range list, and add some geo-transforming utility UDFs.
    3. Data loading (both LOAD and INSERT INTO) allows the user to supply the spatial index column, which is still generated internally when the user does not provide it.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4012
    shenjiayu17 authored and ajantha-bhat committed Jan 25, 2021
    Commit 5971417

Commits on Jan 27, 2021

  1. [CARBONDATA-4097] ColumnVectors should not be initialized as ColumnVectorWrapperDirect for alter tables
    
    Why is this PR needed?
    Direct filling of column vectors is not allowed for altered tables,
    but their column vectors were being initialized as ColumnVectorWrapperDirect.
    
    What changes were proposed in this PR?
    Changed the initialization of column vectors to ColumnVectorWrapper
    for altered tables.
    
    This closes #4062
    Karan980 authored and kunal642 committed Jan 27, 2021
    Commit f5e35cd

Commits on Jan 29, 2021

  1. [CARBONDATA-4104] Vector filling for complex decimal type needs to be handled
    
    Why is this PR needed?
    Vector filling for complex decimal types whose precision is greater than 18 is not handled properly.
    For example:
    array<decimal(20,3)>
    
    What changes were proposed in this PR?
    Ensured proper vector filling based on the page data type.
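
    An illustrative branch, not the actual CarbonData code; longFromBytes is an assumed helper. The point is that unscaled values of decimals with precision <= 18 fit in a long, while higher precision values must be rebuilt from the page's raw bytes:

    ```scala
    import java.math.{BigDecimal, BigInteger}

    def toDecimal(unscaled: Array[Byte], precision: Int, scale: Int): BigDecimal =
      if (precision <= 18) {
        // assumed helper: reads the unscaled long from the page bytes
        BigDecimal.valueOf(longFromBytes(unscaled), scale)
      } else {
        // precision > 18: decode via BigInteger from the raw bytes
        new BigDecimal(new BigInteger(unscaled), scale)
      }
    ```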
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4073
    akkio-97 authored and ajantha-bhat committed Jan 29, 2021
    Commit 54f8697

Commits on Jan 30, 2021

  1. [CARBONDATA-4109] Improve carbondata coverage for presto-integration code
    
    Why is this PR needed?
    A few scenarios had missing coverage in the presto-integration code. This PR aims to improve coverage by covering all such scenarios.
    
    Dead code: ObjectStreamReader.java was created with the aim of querying complex types, but ComplexTypeStreamReader was created instead, making ObjectStreamReader obsolete.
    
    What changes were proposed in this PR?
    Test cases added for scenarios in the presto-integration code that were not covered earlier.
    Removed dead code.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4074
    akkio-97 authored and ajantha-bhat committed Jan 30, 2021
    Commit 46a46a0

Commits on Feb 2, 2021

  1. [CARBONDATA-4112] Data mismatch issue in SI global sort merge flow

    Why is this PR needed?
    When the data files of an SI segment are merged, the SI table ends up with more rows than the main table.
    
    What changes were proposed in this PR?
    The CARBON_INPUT_SEGMENT property was not set before creating the DataFrame from the SI segment, so the DataFrame was created from all the rows in the table rather than only from the particular segment.
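
    A hypothetical sketch of scoping a read to one segment before building the DataFrame (carbon.input.segments.<db>.<table> is the documented CarbonData property for this; the names here are illustrative):

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}

    def readSegment(spark: SparkSession, db: String, table: String,
                    segmentId: String): DataFrame = {
      // restrict subsequent reads of this table to the given segment
      spark.sql(s"SET carbon.input.segments.$db.$table = $segmentId")
      spark.sql(s"SELECT * FROM $db.$table")
    }
    ```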
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4083
    Karan980 authored and ajantha-bhat committed Feb 2, 2021
    Commit 5a2edc3
  2. [CARBONDATA-4113] Partition prune and cache fix when carbon.read.partition.hive.direct is disabled
    
    Why is this PR needed?
    When carbon.read.partition.hive.direct is false, select queries on a
    partition table give invalid results. For a single partition, partition
    values are appended to form a wrong path when loaded by the same segment.
    Example: for a partition on column b, path: /tablepath/b=1/b=2
    
    What changes were proposed in this PR?
    In PartitionCacheManager, changes were made to handle single and multiple partitions.
    Encoded the URI path to handle space characters in the string.
    
    This closes #4084
    ShreelekhyaG authored and kunal642 committed Feb 2, 2021
    Commit 440ab03

Commits on Feb 4, 2021

  1. [CARBONDATA-4082] Fix alter table add segment query on adding a segment having delete delta files
    
    Why is this PR needed?
    When a segment is added to a carbon table by an alter table add segment query
    and that segment also has a delete delta file present in it, then on querying
    the carbon table the deleted rows appear in the result.
    
    What changes were proposed in this PR?
    Update the tableStatus and tableUpdateStatus files in the correct way for
    segments having delete delta files.
    
    This closes #4070
    Karan980 authored and kunal642 committed Feb 4, 2021
    Commit aa7efda
  2. [CARBONDATA-4107] Added related MV tables Map to fact table and added lock while touchMDTFile
    
    Why is this PR needed?
    1. After the MV multi-tenancy PR, the mv system folder moved to database level. Hence,
    during each operation (insert/load/IUD/show MV/query), we list all the databases in the
    system, collect the MV schemas, and check whether any MV is mapped to the table.
    Collecting MV schemas from all databases degrades query performance, whether or not
    the table actually has an MV.
    
    2. When different JVM processes call the touchMDTFile method, file creation and deletion can
    happen at the same time. This may fail the operation.
    
    What changes were proposed in this PR?
    1. Added a table property relatedMVTablesMap to the fact tables of an MV during MV creation. During
    any operation, check whether the table has an MV using this property; if it does,
    collect schemas from only the related databases. In this way, we avoid collecting MV schemas
    for tables which have no MV.
    
    2. Take a global-level lock on the system folder location to update the last modified time.
    
    NOTE: For compatibility scenarios, a refresh MV operation can be performed to update these table properties.
    
    Does this PR introduce any user interface change?
    Yes.
    For compatibility scenarios, a refresh MV operation can be performed to update these table properties.
    
    Is any new testcase added?
    No
    
    This closes #4076
    Indhumathi27 authored and akashrn5 committed Feb 4, 2021
    Commit 9b04540
  3. [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver
    
    Why is this PR needed?
    When the index server is enabled, a filter query on an SI column after alter table
    add SDK segment on the main table throws NoSuchMethodException, and the rows added
    by the SDK segment are not returned in the result.
    
    What changes were proposed in this PR?
    Added the segment path in the index server flow, as it is used to identify an external segment
    in the filter resolver step. There is no need to load to SI if it is an add load command.
    Declared a default constructor for SegmentWrapperContainer.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4080
    ShreelekhyaG authored and Indhumathi27 committed Feb 4, 2021
    Commit afbf531
  4. [CARBONDATA-4102] Added UT and FT to improve coverage of SI module.

    Why is this PR needed?
    Added UT and FT to improve coverage of SI module and also removed the dead or unused code.
    
    What changes were proposed in this PR?
    Added UT and FT to improve coverage of SI module and also removed the dead or unused code.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4071
    nihal0107 authored and Indhumathi27 committed Feb 4, 2021
    Commit ec1c0ca

Commits on Feb 10, 2021

  1. [CARBONDATA-4122] Use CarbonFile API instead of java File API for Flink CarbonLocalWriter
    
    Why is this PR needed?
    Currently, only two writers (Local & S3) are supported for Flink carbon streaming. If a user wants to ingest data from Flink in carbon format directly into an HDFS carbon table, there is no writer type to support it.
    
    What changes were proposed in this PR?
    Since the code for writing Flink stage data is the same for the local and HDFS file systems, we can use the existing CarbonLocalWriter to write data into HDFS by using the CarbonFile API instead of the Java File API.

    Changed the code to use the CarbonFile API instead of java.io.File.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4090
    Indhumathi27 authored and ajantha-bhat committed Feb 10, 2021
    Commit 115182d

Commits on Feb 17, 2021

  1. [CARBONDATA-4125] SI compatibility issue fix

    Why is this PR needed?
    Currently, while upgrading a table store with SI, we have to execute the REFRESH TABLE and
    REGISTER INDEX commands to refresh and register the index to the main table. Also, during
    SI creation we add a property 'indexTableExists' to the main table, to identify whether the table
    has an SI. If a table has an SI, then we load the index information for that table
    from Hive {org.apache.spark.sql.secondaryindex.hive.CarbonInternalMetastore#refreshIndexInfo}.
    indexTableExists defaults to 'false' for all tables which have no SI, and for
    SI tables this property is not added.

    {org.apache.spark.sql.secondaryindex.hive.CarbonInternalMetastore#refreshIndexInfo} will
    be called on any command to refresh the indexInfo. The indexTableExists property should be either
    true (main table) or null (SI) in order to get the index information from Hive and set it on the
    carbon table.

    Issue 1:
    While upgrading tables with SI, after refreshing the main table and SI, if the user runs any operation
    such as select or show cache, the indexTableExists property is set to false. After register
    index, on any operation involving the SI (load or select),
    {org.apache.spark.sql.secondaryindex.hive.CarbonInternalMetastore#refreshIndexInfo} does not
    update the index information on the SI table, since indexTableExists is false. Hence, loads to the SI
    will fail.

    Issue 2:
    While upgrading tables with SI, after refreshing the main table and SI, if the user performs any operation
    like update, alter, or delete on the SI table, registering it as an index does not validate
    the alter operations done on that table.
    
    What changes were proposed in this PR?
    Issue 1:
    While registering an SI table as an index, check whether the SI table has the indexTableExists property and
    remove it. For an already registered index, allow re-registering the index to remove the property.

    Issue 2:
    Added validations to check whether the SI has undergone a load/update/delete/alter operation before
    registering it as an index, and throw an exception if so.
    
    This closes #4087
    Indhumathi27 authored and akashrn5 committed Feb 17, 2021
    Commit 791857b
  2. [CARBONDATA-4124] Fix Refresh MV which does not exist error message

    Why is this PR needed?
    Refreshing an MV which does not exist does not throw a proper carbon error message;
    it throws a "Table NOT found" message from Spark. This is because getSchema
    returns null if the schema is not present.
    
    What changes were proposed in this PR?
    1. Check if getSchema is null and throw a "No such MV" exception.
    2. While dropping a table, drop the MV first and then drop the fact table from the metastore, to avoid
    a NullPointerException when trying to access the fact table while dropping the MV.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4091
    Indhumathi27 authored and akashrn5 committed Feb 17, 2021
    Commit 91f1b69
  3. [CARBONDATA-4117][CARBONDATA-4123] cg index and bloom index query issue with Index server
    
    Why is this PR needed?
    1. A test CG index query with the index server fails with an NPE: while initializing the index model,
    a parsing error is thrown when trying to uncompress with snappy.
    2. A bloom index query with the index server gives incorrect results when splits have more than one blocklet.
    Blocklet-level details are not serialized for the index server, as it is treated as a block-level cache.
    
    What changes were proposed in this PR?
    1. Set the segment and schema details on the BlockletIndexInputSplit object. While writing the
    min/max object, write the byte size instead of the position.
    2. Create a BlockletIndex when the bloom filter is used, so that in the createBlocklet step isBlockCache
    is set to false.
    
    This closes #4089
    ShreelekhyaG authored and kunal642 committed Feb 17, 2021
    Commit 3f1db97

Commits on Feb 18, 2021

  1. [CARBONDATA-3962] Fixed concurrent load failure with flat folder structure.
    
    Why is this PR needed?
    PR #3904 added code to remove the fact directory, and because of this a concurrent
    load fails with a file-not-found exception.
    
    What changes were proposed in this PR?
    Reverted PR 3904.
    
    This closes #4905
    nihal0107 authored and Indhumathi27 committed Feb 18, 2021
    Commit 1cab165
  2. [CARBONDATA-4126] Concurrent compaction failed with load on table

    Why is this PR needed?
    Concurrent compaction was failing when run in parallel with a load.
    During a load we acquire a SegmentLock for the particular segment; when
    we try to acquire that same lock during compaction, we cannot, and
    the compaction fails.
    
    What changes were proposed in this PR?
    Skip compaction for segments whose SegmentLock cannot be acquired,
    instead of throwing an exception; see the sketch below.
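
    A minimal sketch of the new behaviour; candidateSegments, acquireSegmentLock and log are placeholders, not exact CarbonData APIs:

    ```scala
    // segments whose lock is held by an ongoing load are skipped
    // rather than failing the whole compaction
    val compactableSegments = candidateSegments.filter { segmentId =>
      val acquired = acquireSegmentLock(segmentId) // placeholder helper
      if (!acquired) {
        log(s"Skipping segment $segmentId: lock held by a concurrent load")
      }
      acquired
    }
    ```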
    
    This closes #4093
    Karan980 authored and kunal642 committed Feb 18, 2021
    Commit 5ec3536
  3. [CARBONDATA-4121] Prepriming is not working in Index Server

    Why is this PR needed?
    Pre-priming is not working in the index server. Server.getRemoteUser
    returns null in the async pre-priming call, which results in an
    NPE and crashes the index server application. Issue introduced by PR #3952.
    
    What changes were proposed in this PR?
    Compute the Server.getRemoteUser value before making the async pre-priming
    call, and then use that same value inside the async call. Code reset to the state before PR #3952.
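
    A sketch of the fix (names are illustrative): capture the caller's identity on the RPC handler thread, where Server.getRemoteUser() is valid, and pass it into the async task instead of reading it there, where it is null:

    ```scala
    import java.util.concurrent.Executors
    import scala.concurrent.{ExecutionContext, Future}

    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())

    def triggerPrepriming(remoteUser: String): Unit = Future {
      // run pre-priming on behalf of remoteUser; never call
      // Server.getRemoteUser() inside this async block
    }
    ```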
    
    This closes #4088
    Karan980 authored and kunal642 committed Feb 18, 2021
    Commit 59ad77a

Commits on Mar 3, 2021

  1. [CARBONDATA-4115] Successful load and insert will return segment ID

    Why is this PR needed?
    Currently a successful load or insert SQL statement returns an empty Seq in CarbonData; we need it to return the segment ID.
    
    What changes were proposed in this PR?
    Successful load and insert will return segment ID.
    
    Does this PR introduce any user interface change?
    Yes. (Successful load and insert will return segment ID.)
    
    Is any new testcase added?
    Yes
    
    This closes #4086
    areyouokfreejoe authored and ajantha-bhat committed Mar 3, 2021
    Commit 0112268

Commits on Mar 5, 2021

  1. [CARBONDATA-4137] Refactor CarbonDataSourceScan without the sources.Filter of Spark 3
    
    Why is this PR needed?
    1. In Spark 3, org.apache.spark.sql.sources.Filter is sealed, so carbon can't extend it in carbon code.
    2. The name of the CarbonLateDecodeStrategy class is misleading, and the code is complex and hard to read.
    3. CarbonDataSourceScan can be the same for 2.3 and 2.4, and should support both batch reading and row reading.
    
    What changes were proposed in this PR?
    1. Translate Spark Expressions to carbon Expressions directly, skipping the Spark Filter step. Remove all Spark Filters from carbon code.
      old flow: Spark Expression => Spark Filter => Carbon Expression
      new flow: Spark Expression => Carbon Expression
    2. Remove the filter reordering; expression reordering still needs to be implemented (added CARBONDATA-4138).
    3. Separate CarbonLateDecodeStrategy into CarbonSourceStrategy and DMLStrategy, and simplify the code of CarbonSourceStrategy.
    4. Move CarbonDataSourceScan back to the source folder and use one CarbonDataSourceScan for all versions.
      CarbonDataSourceScan supports both VectorReader and RowReader; Carbon will not use RowDataSourceScanExec.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    QiangCai authored and MarvinLitt committed Mar 5, 2021
    Commit 8f2ee7f

Commits on Mar 9, 2021

  1. [CARBONDATA-4133] Concurrent Insert Overwrite with static partition on Index server fails
    
    Why is this PR needed?
    Concurrent insert overwrite with a static partition on the index server fails. When the index server
    and pre-priming are enabled, pre-priming is triggered even when the load fails, because the call sits in a finally block.
    There is also performance degradation with the index server due to #4080.
    
    What changes were proposed in this PR?
    Removed the triggerPrepriming call from the finally block; see the sketch below.
    Reverted #4080 and used a boolean flag to identify the external segment.
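
    A sketch of the control-flow change; doLoad, triggerPrepriming and releaseLocks are placeholders:

    ```scala
    try {
      doLoad()
      triggerPrepriming() // moved out of finally: never runs if doLoad() throws
    } finally {
      releaseLocks()      // cleanup still always runs
    }
    ```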
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No, tested in cluster.
    
    This closes #4096
    ShreelekhyaG authored and Indhumathi27 committed Mar 9, 2021
    Commit 35c4b33

Commits on Mar 10, 2021

  1. [CARBONDATA-4141] Index Server is not caching indexes for external tables with sdk segments
    
    Why is this PR needed?
    Indexes cached in the executor cache are not dropped when drop table is called for an external table
    with SDK segments, because external tables with SDK segments have no metadata such as a table
    status file. So in the drop table command we send zero segments to the index server's clearIndexes job,
    which clears nothing on the executor side. So when we drop this type of table, the executor-side
    indexes are not dropped. Now when we again create an external table at the same location and run
    select * or select count(*), the indexes for this table are not cached, because indexes for the
    same location are already present. Show metacache on this newly created table uses the new tableId,
    but the indexes present have the old tableId, whose table is already dropped. So show metacache returns
    nothing, because of the tableId mismatch.
    
    What changes were proposed in this PR?
    Prepared the validSegments from the index files present at the external table location and sent them to the index server's clearIndexes job through IndexInputFormat.
    
    This closes #4099
    Karan980 authored and kunal642 committed Mar 10, 2021
    Commit 25c5687

Commits on Mar 12, 2021

  1. [CARBONDATA-4075] Using withEvents instead of fireEvent

    Why is this PR needed?
    The withEvents method simplifies the code that fires events.
    
    What changes were proposed in this PR?
    Refactor code to use the withEvents method instead of fireEvent
    
    This closes #4078
    QiangCai authored and Indhumathi27 committed Mar 12, 2021
    Commit d5b3b8c

Commits on Mar 15, 2021

  1. [CARBONDATA-4110] Support clean files dry run operation and show statistics after clean files operation
    
    Why is this PR needed?
    Currently, in the clean files operation, the user does not know how much space will be freed.
    The idea is to add support for a dry run in clean files, which tells the user how much space
    the clean files operation will free without cleaning the actual data.
    
    What changes were proposed in this PR?
    This PR has the following changes:
    
    1. Support dry run in clean files: show the user how much space the
       clean files operation will free and how much space is left (releasable after the expiration time)
       after the operation; see the usage sketch after this list.
    2. Clean files output: total size released during the clean files operation.
    3. Option to disable clean files statistics, in case the user does not want them.
    4. Clean files log: enhance the clean files log to print the name of every file being
    deleted, in the info log.
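
    Hypothetical usage of the dry run option (the option name follows the feature description here; verify the exact syntax against the CarbonData documentation):

    ```scala
    spark.sql("CLEAN FILES FOR TABLE mydb.mytable OPTIONS('dryrun'='true')").show()
    // reports the space that would be freed, without deleting any data
    ```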
    
    This closes #4072
    vikramahuja1001 authored and akashrn5 committed Mar 15, 2021
    Commit d9f69ae

Commits on Mar 19, 2021

  1. [CARBONDATA-4144] During compaction, the segment lock of SI table is not released in abnormal scenarios.
    
    Why is this PR needed?
    When a compaction operation fails, the segment lock of the SI table is not released. On running compaction again,
    the segment lock of the SI table cannot be obtained and compaction does nothing, but in the tablestatus
    file of the SI table the merged segment status is set to success, the segment file is
    xxx_null.segments, and the value of indexsize is 0.
    
    What changes were proposed in this PR?
    If an exception occurs, release the obtained segment locks.
    If getting the segment locks failed, do not update the segment status.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4102
    liuhe0702 authored and akashrn5 committed Mar 19, 2021
    Commit bce6481
  2. [CARBONDATA-4145] Query fails and the message "File does not exist: xxxx.carbondata" is displayed
    
    Why is this PR needed?
    If an exception occurs while the refresh index command is executed after a task has already
    succeeded, subsequent queries fail.
    Reason: after the compaction task executes successfully, the old carbondata files are
    deleted. If another exception then occurs, the deleted files are found to be missing.
    This PR fixes this issue.
    
    What changes were proposed in this PR?
    When all tasks are successful, the driver deletes the old carbondata files.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4103
    liuhe0702 authored and akashrn5 committed Mar 19, 2021
    Commit a4921e9
  3. [CARBONDATA-4149] Fix query issues after alter add partition.

    Why is this PR needed?
    A query with SI after add partition based on location on a partition table gives incorrect results.
    1. While pruning, if it's an external segment, it should use ExternalSegmentResolver, and there is no
       need to use ImplicitIncludeFilterExecutor, as an external segment is not added in the SI table.
    2. If the partition table has external partitions, after compaction the new files are loaded
       to the external path.
    3. Data is not loaded to the child table (MV) after executing the add partition command.
    
    What changes were proposed in this PR?
    1. Add the path to loadMetadataDetails for an external partition. It is used to identify the partition as an
       external segment.
    2. After compaction, to avoid keeping any link to the external partition, the compacted files
       are added as a new partition in the table. To update the partition spec details in the hive metastore,
       (drop partition + add partition) operations are performed.
    3. Added Load Pre and Post listeners in CarbonAlterTableAddHivePartitionCommand to trigger the data
       load to the materialized view.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4107
    ShreelekhyaG authored and Indhumathi27 committed Mar 19, 2021
    Commit b00efca
  4. [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

    Why is this PR needed?
    Reindex fails when the SI has a stale carbonindexmerge file, throwing a FileNotFoundException.
    This is because SegmentFileStore.getIndexFiles stores the mapping of index file to index merge file;
    when a stale carbon index merge file exists, the index merge file will not be null. During index file
    merging, a new index merge file is created with the same name as before in the same location.
    At the end of CarbonIndexFileMergeWriter.writeMergeIndexFileBasedOnSegmentFile, the carbon index
    files are deleted. Since the index merge file is stored in the indexFiles list, the newly created
    index merge file is also deleted, which leads to the FileNotFoundException.
    
    What changes were proposed in this PR?
    1. SegmentFileStore.getIndexFiles no longer stores the mapping of index file to index merge file, which was redundant.
    2. SegmentFileStore.getIndexOrMergeFiles returns both index files and index merge files, so the
       function name was incorrect; renamed to getIndexAndMergeFiles.
    3. CarbonLoaderUtil.getActiveExecutor actually gets the active node, so the function name was incorrect;
       renamed to getActiveNode, and replaced all "executor" with "node" in the function assignBlocksByDataLocality.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4105
    jack86596 authored and Indhumathi27 committed Mar 19, 2021
    Commit b74645e

Commits on Mar 21, 2021

  1. [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column
    
    Why is this PR needed?
    After PR-3615, we avoid rearranging the catalog table schema if it is already re-arranged.
    For an MV on a partition table, we always move the partition column to the end of the MV partition table.
    The catalog table will also have the column schema in the same order (partition column last). Hence, in
    this case, we do not re-arrange the logical relation in the catalog table again.

    But if there is a sort column present in the MV table, then the selected column schema and the catalog table
    schema will not be in the same order. In that case, we have to re-arrange the catalog table schema.
    Currently, we use rearrangedIndex to re-arrange the catalog table logical relation, but
    rearrangedIndex keeps the index of the partition column at the end, whereas the catalog table already has the
    partition column at the end. Hence, we re-arrange the partition column index
    again in the catalog table relation, which leads to insertion failure.
    
    Example:
    Create MV on columns: c1, c2 (partition), c3(sort_column), c4
    Problem:
    Create order: c1,c2,c3,c4
    Create order index: 0,1,2,3
    
    Rearranged Index:
    Existing Catalog table schema order: c1, c3, c4, c2 (for MV, partition column will be moved to Last)
    Rearrange index: 2,0,3,1
    After re-arrange, catalog table order: c4, c1, c2, c3 (which is wrong)
    
    Solution:
    Change MV create order as below
    New Create order: c1,c4,c3,c2
    Create order index: 0,1,2,3
    
    Rearranged Index:
    Existing Catalog table schema order: c1, c3, c4, c2 (for MV, partition column will be moved to Last)
    Rearrange index: 1,0,2,3
    After Re-arrange catalog table order: c3,c1,c4,c2
    
    What changes were proposed in this PR?
    In MV case, if there is any column schema order change apart from partition column, then re-arrange
    index of only those columns and use the same to re-arrange catalog table logical relation.
    
    This closes #4106
    Indhumathi27 authored and akashrn5 committed Mar 21, 2021
    Commit 8d17de6

Commits on Mar 23, 2021

  1. [CARBONDATA-4146] Query fails and the error message "unable to get file status" is displayed

    The query is normal after the "drop metacache on table" command is executed.
    
    Why is this PR needed?
    During compact execution, the status of the new segment is set to success before index
    files are merged. After index files are merged, the carbonindex files are deleted.
    As a result, the query task cannot find the cached carbonindex files.
    
    What changes were proposed in this PR?
    Set the status of the new segment to succeeded after index files are merged.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4104
    liuhe0702 authored and akashrn5 committed Mar 23, 2021
    Commit 6ab3647
  2. [CARBONDATA-4153] Fix DoNot Push down not equal to filter with Cast on SI
    
    Why is this PR needed?
    A NOT EQUAL TO filter on an SI index column should not be pushed down to the SI table.
    Currently, where x != '2' is not pushed down to SI, but where x != 2 is pushed down to SI.

    This is because "x != 2" is wrapped in a CAST expression like NOT EQUAL TO(cast(x as int) = 2).
    
    What changes were proposed in this PR?
    Handle the CAST case while checking whether a filter must not be pushed down to SI; an illustrative pattern follows below.
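
    An illustrative pattern using Spark Catalyst expression types (indexCols is an assumed set of SI column names, not a CarbonData API):

    ```scala
    import org.apache.spark.sql.catalyst.expressions._

    // unwrap a Cast around the attribute before deciding that a not-equal-to
    // filter touches an index column and must NOT be pushed down to SI
    def isNotEqualOnIndexColumn(expr: Expression, indexCols: Set[String]): Boolean =
      expr match {
        case Not(EqualTo(a: AttributeReference, _)) => indexCols.contains(a.name)
        case Not(EqualTo(c: Cast, _)) => c.child match {
          case a: AttributeReference => indexCols.contains(a.name)
          case _ => false
        }
        case _ => false
      }
    ```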
    
    This closes #4108
    Indhumathi27 authored and kunal642 committed Mar 23, 2021
    Commit fd0ff22
  3. [CARBONDATA-4155] Fix Create table like table with MV

    Why is this PR needed?
    PR-4076 added a new table property to the fact table.
    While executing the create table like command, this property
    is not excluded, which leads to a parsing exception.
    
    What changes were proposed in this PR?
    Remove MV related info from destination table properties
    
    This closes #4111
    Indhumathi27 authored and kunal642 committed Mar 23, 2021
    Commit 0f53bdb
  4. [CARBONDATA-4149] Fix query issues after alter add empty partition location
    
    Why is this PR needed?
    A query with SI after add partition based on an empty location on a partition
    table gives incorrect results. PR 4107 fixed the issue for add
    partition when the location is not empty.
    
    What changes were proposed in this PR?
    While creating the block id, get the segment number from the file name for
    the external partition. This block id is added to SI and used
    for pruning. To identify an external partition during the compaction
    process, instead of checking against loadmetapath, check whether the
    file path starts with the table path (filepath.startswith(tablepath)).
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4112
    ShreelekhyaG authored and Indhumathi27 committed Mar 23, 2021
    Commit f5e4c89

Commits on Mar 25, 2021

  1. [CARBONDATA-4156] Fix Writing Segment Min max with all blocks of a segment
    
    Why is this PR needed?
    PR-3999 removed some code related to getting the segment min/max from all blocks.
    Because of this, if a segment has more than one block, the min/max is currently
    written considering only one block.
    
    What changes were proposed in this PR?
    Reverted the specific code from the above PR. Removed unwanted synchronization from some methods.
    
    This closes #4101
    Indhumathi27 authored and kunal642 committed Mar 25, 2021
    Commit 865ec9b

Commits on Mar 26, 2021

  1. [CARBONDATA-4154] Fix various concurrent issues with clean files

    Why is this PR needed?
    There are two issues in the clean files operation when run concurrently with multiple load operations:

    1. Dry run can show negative space freed for clean files under a concurrent load.
    2. Accidental deletion of an insert-in-progress (ongoing load) segment during the clean files operation.

    What changes were proposed in this PR?
    To solve the negative dry-run result: save the old loadMetadataDetails before the clean files operation and compare it with the loadMetadataDetails after the operation, ignoring any newly added entry; in effect, take the intersection of the new and old details to report the correct space freed (see the sketch after this section).
    For the load failure issue: there are scenarios where a load is ongoing (insert-in-progress state, segment lock held), and by the time the clean files operation releases the final table status lock, the load has completed and released its segment lock, yet in the final list of loadMetadataDetails to be deleted that load can still appear as insert-in-progress with its segment lock released. The clean files operation would delete such loads. To solve this, instead of passing a boolean that decides whether the table status needs updating, pass a list of load numbers and delete only those load numbers.
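
    A sketch of the intersection idea; oldDetails/newDetails are placeholder snapshots, and getLoadName mirrors the LoadMetadataDetails accessor (treat the exact name as an assumption):

    ```scala
    // only segments present in BOTH snapshots are compared, so a load that
    // started mid-operation cannot make the reported freed space negative
    val before: Set[String] = oldDetails.map(_.getLoadName).toSet
    val comparable = newDetails.filter(d => before.contains(d.getLoadName))
    ```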
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4109
    vikramahuja1001 authored and ajantha-bhat committed Mar 26, 2021
    Commit d535a1e

Commits on Mar 28, 2021

  1. add .asf.yaml

    chenliang613 committed Mar 28, 2021
    Commit 4ec3e58

Commits on Mar 29, 2021

  1. Commit 603133f
  2. Commit baa1f69

Commits on Apr 15, 2021

  1. Enable github's merge function

    chenliang613 authored Apr 15, 2021
    Commit db8666c
  2. Configuration menu
    Copy the full SHA
    6be1691 View commit details
    Browse the repository at this point in the history

Commits on Apr 19, 2021

  1. [CARBONDATA-4161] Describe complex columns

    Why is this PR needed?
    Currently, describe formatted displays the column information
    of a table plus some additional information. When complex
    types such as ARRAY, STRUCT, and MAP are present in the
    table, the column definition can be long and difficult to
    read in a nested format.
    
    What changes were proposed in this PR?
    The DESCRIBE output can be formatted to avoid long lines
    for multiple fields. We can pass the column name to the
    command and visualize its structure with child fields.
    
    Does this PR introduce any user interface change?
    Yes.
    DDL Commands:
    DESCRIBE COLUMN fieldname ON [db_name.]table_name;
    DESCRIBE short [db_name.]table_name;
    
    Is any new testcase added?
    Yes
    
    This closes #4113
    ShreelekhyaG authored and Indhumathi27 committed Apr 19, 2021
    Commit f67c8fa

Commits on Apr 20, 2021

  1. [CARBONDATA-4163] Support adding of single-level complex columns(array/struct)
    
    Why is this PR needed?
    This PR enables adding single-level complex columns (only array and struct)
    to a carbon table. Commands -
    ALTER TABLE <table_name> ADD COLUMNS(arr1 ARRAY (double) )
    ALTER TABLE <table_name> ADD COLUMNS(struct1 STRUCT<a:int, b:string>)
    The default value for the column in case of old rows will be null.
    
    What changes were proposed in this PR?
    1. Create instances of ColumnSchema for each of the children. By doing this,
       each child column has its own ordinal. The new columns are first
       identified and stored in a flat structure. For example, for arr1 array(int),
       two column schemas are created - arr1 and arr1.val, the first being the parent
       and the second its child, each with its own ordinal.
    2. Later, while updating the schema evolution entry, we only account for the newly
       added parent columns while discarding children columns (as they are no longer
       required; otherwise we would have the child as a separate column in the schema).
    3. Using the schema evolution entry, the final schema is updated. Since ColumnSchemas
       are stored as a flat structure, we later convert them to a nested structure of type Dimensions.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4115
    akkio-97 authored and Indhumathi27 committed Apr 20, 2021
    Commit d01d9f5
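
    A minimal sketch of the commands above, assuming a `spark` session and
    standard Spark type syntax; the table name t1 is illustrative:

      spark.sql("CREATE TABLE t1(id INT) STORED AS carbondata")
      spark.sql("INSERT INTO t1 SELECT 1")
      // add single-level complex columns; old rows read back null for them
      spark.sql("ALTER TABLE t1 ADD COLUMNS(arr1 ARRAY<double>)")
      spark.sql("ALTER TABLE t1 ADD COLUMNS(struct1 STRUCT<a:int, b:string>)")
      spark.sql("SELECT id, arr1, struct1 FROM t1").show(false)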

Commits on Apr 22, 2021

  1. [CARBONDATA-4158]Add Secondary Index as a coarse-grain index and use …

    …secondary indexes for Presto queries
    
    Why is this PR needed?
    At present, secondary indexes are leveraged for query pruning via spark plan modification.
    This approach is tightly coupled with spark because the plan modification is specific to
    the spark engine. To use secondary indexes for Presto or Hive queries, it is not
    feasible to modify the query plans as the current approach requires. Thus the need arises
    for an engine-agnostic approach to using secondary indexes in query pruning.
    
    What changes were proposed in this PR?
    1. Add Secondary Index as a coarse-grain index.
    2. Add a new insegment() UDF to support querying within particular segments
    3. Control the use of Secondary Index as a coarse-grain index for pruning with
    the property 'carbon.coarse.grain.secondary.index'
    4. Use the Index Server driver for Secondary Index pruning
    5. Use Secondary Indexes with Presto queries
    
    This closes #4110
    VenuReddy2103 authored and kunal642 committed Apr 22, 2021
    Commit 09ad509
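
    A hedged sketch of items 2 and 3 above, assuming a `spark` session; the
    table name and segment ids are illustrative, and the exact insegment()
    argument format is an assumption:

      // enable coarse-grain secondary index pruning
      spark.sql("SET carbon.coarse.grain.secondary.index=true")
      // restrict the query to the given segments via the new insegment() UDF
      spark.sql("SELECT * FROM maintable WHERE insegment('0,1') AND city = 'shenzhen'").show()
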
  2. [CARBONDATA-4037] Improve the table status and segment file writing

    Why is this PR needed?
    Currently, we update the table status and segment files multiple times for a
    single IUD/merge/compact operation and delete the index files immediately after
    merge. When concurrent queries run, a query may try to access segment index
    files that are no longer present, which is an availability issue.
    
    What changes were proposed in this PR?
    1. Generate segment file after merge index, and update table status at the beginning
       and after merge index. If the merge index/table status update fails, the load will also fail.
       order:
       create table status file => index files => merge index => generate segment file => update table status
     * The same order is now maintained for SI, compaction, IUD, addHivePartition, addSegment scenarios.
     * Whenever a segment file needs to be updated for the main table, a new segment file is created
       instead of updating the existing one.
    2. When compact 'segment_index' is triggered,
       For new tables - if there are no index files to merge, a warning is logged and the command exits.
       For old tables - index files are not deleted.
    3. After SI small files merge,
       For newly loaded SI segments - DeleteOldIndexOrMergeFiles deletes them immediately after merge.
       For segments that are already present (rebuild) - old index files and data files are not deleted.
    4. Removed the carbon.merge.index.in.segment property from config-parameters. This property
       is to be used only for debugging/test purposes.
    
    Note: Cleaning of stale index/segment files is to be handled in CARBONDATA-4074
    
    This closes #3988
    ShreelekhyaG authored and akashrn5 committed Apr 22, 2021
    Commit 71910fb

Commits on Apr 26, 2021

  1. [CARBONDATA-4173][CARBONDATA-4174] Fix inverted index query issue and…

    … handle exception for desc column
    
    Why is this PR needed?
    After creating an inverted index on a dimension column, some filter queries give incorrect results.
    Also handle the exception for a non-existing higher-level child column in desc column.
    
    What changes were proposed in this PR?
    While sorting byte arrays with the inverted index, we use the compareTo method of ByteArrayColumnWithRowId. It was sorting based on the last byte only; made changes to sort based on the entire byte length when a dictionary is used.
    Handled the exception and added a testcase.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4124
    ShreelekhyaG authored and ajantha-bhat committed Apr 26, 2021
    Commit 3a6e4a4
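
    The fix amounts to an unsigned lexicographic comparison over the entire byte
    length instead of only the trailing byte; a minimal, self-contained sketch
    of such a comparison (not the actual ByteArrayColumnWithRowId code):

      def compareBytes(a: Array[Byte], b: Array[Byte]): Int = {
        val minLen = math.min(a.length, b.length)
        var i = 0
        while (i < minLen) {
          // compare as unsigned bytes so 0xFF sorts after 0x01
          val cmp = (a(i) & 0xFF) - (b(i) & 0xFF)
          if (cmp != 0) return cmp
          i += 1
        }
        a.length - b.length // on a common prefix, the shorter array sorts first
      }
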
  2. [CARBONDATA-4172] Select query having parent and child struct column …

    …in projection returns incorrect results
    
    Why is this PR needed?
    After PR-3574, a scenario was missed during code refactoring.
    Currently, if a select query has both a parent and its child struct column in the projection,
    only the child column is pushed down to carbon for filling the result. For the other columns in the parent struct, the output is null.
    
    What changes were proposed in this PR?
    If the parent struct column is also present in the projection, push down only the parent column to carbon.
    
    This closes #4123
    Indhumathi27 authored and kunal642 committed Apr 26, 2021
    Commit 3b411bb
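
    An illustrative query shape for this fix, assuming a `spark` session;
    table/column names are hypothetical:

      spark.sql("CREATE TABLE t3(id INT, st STRUCT<a:INT, b:STRING>) STORED AS carbondata")
      spark.sql("INSERT INTO t3 SELECT 1, named_struct('a', 10, 'b', 'x')")
      // both the parent struct and a child field are projected; with the fix,
      // only the parent column is pushed down and st is no longer null
      spark.sql("SELECT st, st.a FROM t3").show(false)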

Commits on Apr 27, 2021

  1. [CARBONDATA-4167][CARBONDATA-4168] Fix case sensitive issues and inpu…

    …t validation for Geo values.
    
    Why is this PR needed?
    1. The SPATIAL_INDEX property and the POLYGON, LINESTRING, and RANGELIST UDFs are case sensitive.
    2. SPATIAL_INDEX.xx.gridSize and SPATIAL_INDEX.xxx.conversionRatio accept negative values.
    3. Geo UDFs accept invalid values.
    
    What changes were proposed in this PR?
    1. Converted properties to lower case and made the UDFs case insensitive.
    2. Added validation.
    3. Refactored readAllIIndexOfSegment.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4118
    ShreelekhyaG authored and Indhumathi27 committed Apr 27, 2021
    Commit e5b1dd0
  2. [CARBONDATA-4170] Support dropping of parent complex columns(array/st…

    …ruct/map)
    
    Why is this PR needed?
    This PR supports dropping parent complex columns (single and multi-level)
    from a carbon table. Dropping a parent column will in turn drop all of
    its child columns too.
    
    What changes were proposed in this PR?
    Child columns are prefixed with their parent column name, so the identified
    columns are added to the delete-column list and the schema is updated based
    on that. Test cases cover up to 3 levels of nesting.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4121
    akkio-97 authored and Indhumathi27 committed Apr 27, 2021
    Commit 2f93479
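
    A minimal sketch, assuming a `spark` session; table/column names are
    illustrative:

      spark.sql("CREATE TABLE t2(id INT, str1 STRUCT<a:INT, b:STRUCT<c:STRING>>, arr1 ARRAY<INT>) STORED AS carbondata")
      // dropping the parent removes str1 and all its children (str1.a, str1.b, str1.b.c)
      spark.sql("ALTER TABLE t2 DROP COLUMNS(str1)")
      spark.sql("ALTER TABLE t2 DROP COLUMNS(arr1)")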

Commits on Apr 30, 2021

  1. [HOTFIX] Remove hitcount link due to not working

    Why is this PR needed?
    The hitcount link in the README md file is not working.
    
    What changes were proposed in this PR?
    Remove the hitcount link as it's not required.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4128
    chenliang613 authored and akashrn5 committed Apr 30, 2021
    Commit 7350c33

Commits on May 10, 2021

  1. [CARBONDATA-4166] Geo spatial Query Enhancements

    Why is this PR needed?
    Currently, for the IN_POLYGON_LIST and IN_POLYLINE_LIST UDFs, polygons need to be
    specified in SQL. If the polygon list grows in size, the SQL also becomes very long,
    which may affect query performance, as the SQL parsing cost grows.
    If polygons are instead defined as a column in a new dimension table, then a spatial
    dimension table join can be supported, enabling aggregation on spatial table columns
    based on polygons.
    
    What changes were proposed in this PR?
    Support IN_POLYGON_LIST and IN_POLYLINE_LIST with a SELECT query on the
    polygon table.
    Support the IN_POLYGON filter as a join condition for spatial JOIN queries.
    
    Does this PR introduce any user interface change?
    Yes.
    
    Is any new testcase added?
    Yes
    
    This closes #4127
    Indhumathi27 authored and ajantha-bhat committed May 10, 2021
    Commit c825730
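
    A hedged, hypothetical sketch of the first change: polygons kept in a
    dimension table and fed to IN_POLYGON_LIST via a select. The exact argument
    format is an assumption, and table names are illustrative:

      // polygon rows come from the dimension table rather than a long SQL literal
      spark.sql(
        """SELECT * FROM spatialtable
          | WHERE IN_POLYGON_LIST('SELECT polygon FROM polygontable', 'OR')""".stripMargin).show()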

Commits on May 11, 2021

  1. [CARBONDATA-4175] [CARBONDATA-4162] Leverage Secondary Index till seg…

    …ment level
    
    Why is this PR needed?
    In the existing architecture, if the parent (main) table and the SI table don't have
    the same valid segments, we disable the SI table. From the next query
    onwards, we scan and prune only the parent table until the next load or
    REINDEX command (as these commands bring the parent and SI table segments back in sync).
    Because of this, queries take longer while SI is disabled.
    
    What changes were proposed in this PR?
    Instead of disabling the SI table (when parent and child table segments are not in sync),
    we will prune on SI tables for all the valid segments (segments with status success,
    marked for update and load partial success), and the rest of the segments will be pruned by the parent table.
    As of now, a query on the SI table can be pruned in two ways:
    a) With SI as a datamap.
    b) With spark plan rewrite.
    This PR contains changes for both methods, so SI is leveraged till segment level.
    
    This closes #4116
    nihal0107 authored and kunal642 committed May 11, 2021
    Commit 8996369

Commits on May 20, 2021

  1. [CARBONDATA-4188] Fixed select query with small table page size after…

    … alter add column
    
    Why is this PR needed?
    A select query on a table with a long string data type and small page size throws
    ArrayIndexOutOfBoundsException after alter add columns.
    The query fails because, after changing the schema, the number of rows set per page in
    the bitsetGroup (RestructureIncludeFilterExecutorImpl.applyFilter()) is not correct.
    
    What changes were proposed in this PR?
    Set the correct number of rows inside every page of bitsetGroup.
    
    This closes #4137
    nihal0107 authored and kunal642 committed May 20, 2021
    Commit 41a756f
  2. [CARBONDATA-4185] Doc Changes for Heterogeneous format segments in ca…

    …rbondata
    
    Why is this PR needed?
    Heterogeneous format segments in carbondata documentation.
    
    What changes were proposed in this PR?
    Add the segment feature background and its impact on existing carbondata features
    
    This closes #4134
    maheshrajus authored and kunal642 committed May 20, 2021
    Commit 861ba2e

Commits on May 24, 2021

  1. [CARBONDATA-4184] alter table Set TBLPROPERTIES for RANGE_COLUMN sets…

    … unsupported
    
    datatype(complex_datatypes/Binary/Boolean/Decimal) as RANGE_COLUMN
    
    Why is this PR needed?
    Alter table set command was not validating unsupported dataTypes for range column.
    
    What changes were proposed in this PR?
    Added validation for unsupported dataTypes before setting range column value.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4133
    Karan980 authored and Indhumathi27 committed May 24, 2021
    Commit 35091a2
  2. [CARBONDATA-4189] alter table validation issues

    Why is this PR needed?
    1. The alter table duplicate-columns check was missed for dimensions/complex columns
    2. Alter table properties should not support long strings for complex columns
    
    What changes were proposed in this PR?
    1. Changed the dimension columns list type when preparing dimension columns
       [LinkedHashSet to Scala Seq] to handle duplicate columns
    2. Added a check to throw an exception in case of long strings for complex columns
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4138
    maheshrajus authored and Indhumathi27 committed May 24, 2021
    Commit 07c98e8

Commits on May 25, 2021

  1. [CARBONDATA-4183] Local sort Partition Load and Compaction fix

    Why is this PR needed?
    Currently, the number of tasks for a local sort load on a partition table is decided based on input file size. In this case, the data will not be properly sorted, as more tasks are launched. For compaction, the number of tasks equals the number of partitions. If the data for a partition is huge, compaction can fail with OOM under low-memory configurations.
    
    What changes were proposed in this PR?
    When the local sort task-level property is enabled:
    
    For local sort load, divide input files based on node locality (number of tasks = number of nodes), which does the local sorting properly.
    For compaction, launch tasks based on the task IDs within a partition, so more tasks are launched per partition.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4130
    Indhumathi27 authored and ajantha-bhat committed May 25, 2021
    Commit a90243c

Commits on Jun 2, 2021

  1. [CARBONDATA-4186] Fixed insert failure when partition column present …

    …in local sort scope
    
    Why is this PR needed?
    Currently, when we create a table with a partition column and put the same column in the
    local sort scope, the insert query fails with an ArrayIndexOutOfBounds exception.
    
    What changes were proposed in this PR?
    Handled the ArrayIndexOutOfBounds exception; earlier the array size was not increasing because the data
    was inconsistent and in the wrong order for sort columns and isDimNoDictFlags.
    
    This closes #4132
    nihal0107 authored and kunal642 committed Jun 2, 2021
    Commit 01fd120
  2. [CARBONDATA-4191] update table for primitive column not working when …

    …complex child
    
    column name and primitive column name match
    
    Why is this PR needed?
    Updating a primitive column does not work when a complex column's child name and the
    primitive column name are the same.
    When an update for a primitive column is received, we check the complex child columns;
    if a column name matches, an UnsupportedOperationException is returned.
    
    What changes were proposed in this PR?
    Currently, we ignore the prefix of all columns and pass only the column/child
    column info to the update command.
    New changes: pass the full column name (alias name/table name.columnName) as given
    by the user, and add checks for handling the unsupported update operation on complex columns.
    
    This closes #4139
    maheshrajus authored and kunal642 committed Jun 2, 2021
    Commit 4c04f7c

Commits on Jun 4, 2021

  1. [Doc] syntax and format issues in README.md and how-to-contribute-to-…

    …apache-carbondata.md
    
    Why is this PR needed?
    To improve the quality of README.md and how-to-contribute-to-apache-carbondata.md.
    
    What changes were proposed in this PR?
    Syntax and format changes.
    
    This closes #4136
    Sunt-ing authored and Indhumathi27 committed Jun 4, 2021
    Commit 26e9182
  2. [CARBONDATA-4192] UT cases correction for validating the exception me…

    …ssage correctly
    
    Why is this PR needed?
    Currently, when we check the exception message as below, the test does not assert/fail/
    catch if the message content is different.
    `intercept[UnsupportedOperationException](
     sql("update test set(a)=(4) where id=1").collect()).getMessage.contains("abc")`
    
    What changes were proposed in this PR?
    1. Added assert condition like below for validating the exception message correctly
       `assert(intercept[UnsupportedOperationException](
        sql("update test set(a)=(4) where id=1").collect()).getMessage.contains("abc"))`
    2. Added assert condition to check exception message for some test cases which are
       not checking exception message
    3. Fixed add segment doc heading related issues
    
    This closes #4140
    maheshrajus authored and Indhumathi27 committed Jun 4, 2021
    Commit 8740016

Commits on Jun 7, 2021

  1. [CARBONDATA-4193] Fix compaction failure after alter add complex column.

    Why is this PR needed?
    1. When we perform compaction after alter add of a complex column, the query fails with an
       ArrayIndexOutOfBounds exception. While converting and adding a row after the merge step
       in WriteStepRowUtil.fromMergerRow, as a complex dimension is present, the complexKeys
       array is accessed but has no values, which throws the exception.
    2. Creating SI with globalsort on a newly added complex column throws a TreeNodeException
       (Caused by: java.lang.RuntimeException: Couldn't find positionId#172 in [arr2#153])
    
    What changes were proposed in this PR?
    1. While restructuring the row, added changes to fill complexKeys with default values (null
       values for children) according to the latest schema.
       In the SI query result processor, used the column property isParentColumnComplex to identify
       any complex type. If the complex index column is not present in the parent table block,
       assigned the SI row value to empty bytes.
    2. For SI with globalsort, in case of a complex type projection, the TableProperties object in
       carbonEnv is not the same as in the carbonTable object, so requiredColumns was not
       updated with positionId. So, update tableproperties from the carbon env itself.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4142
    ShreelekhyaG authored and akashrn5 committed Jun 7, 2021
    Commit fee8b18
  2. [CARBONDATA-4196] Allow zero or more white space in GEO UDFs

    Why is this PR needed?
    Currently, the regex of the geo UDFs does not allow zero spaces between the
    UDF name and the parenthesis; it always expects a single space in
    between, for ex: linestring (120.184179 30.327465). Because of
    this, using the UDFs without a space sometimes does not give
    the expected result.
    
    What changes were proposed in this PR?
    Allow zero or more spaces between the UDF name and the parenthesis.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4145
    nihal0107 authored and Indhumathi27 committed Jun 7, 2021
    Commit 70643df
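
    The essence of the fix is a pattern that tolerates zero or more whitespace
    characters between the UDF name and its opening parenthesis; a minimal
    sketch (not the actual carbon regex):

      // \s* (zero or more) replaces the previous single mandatory space
      val lineString = """(?i)linestring\s*\(([^)]*)\)""".r
      Seq("LINESTRING (120.184179 30.327465)", "linestring(120.184179 30.327465)")
        .foreach { s => println(lineString.findFirstMatchIn(s).map(_.group(1))) } // both match
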
  3. [CARBONDATA-4143] Enable UT with index server and fix related issues

    Why is this PR needed?
    Enable running UTs with the index server.
    Fix the below issues:
    1. With the index server enabled, a select query gives incorrect results with
       SI when parent and child table segments are not in sync.
    2. When reindex is triggered, if stale files are present in the segment
       directory, the segment file is written with incorrect file names
       (both valid index and stale mergeindex file names). As a result, duplicate
       data is present in the SI table, but there are no errors/incorrect query results.
    
    What changes were proposed in this PR?
    Use the flag useIndexServer; excluded some test cases from running with the index server.
    1. While pruning from the index server, missingSISegments values were not
       being considered. Passed those values down and set them on the filter.
    2. Before loading data to an SI segment, added changes to delete the segment
       directory if already present.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4098
    ShreelekhyaG authored and Indhumathi27 committed Jun 7, 2021
    Commit d838e3b

Commits on Jun 10, 2021

  1. [CARBONDATA-4179] Support renaming of complex columns (array/struct)

    Why is this PR needed?
    This PR enables renaming of complex columns - the parent as well as child columns at nested levels
    example: if the schema contains columns - str1 struct<a:int, b:string>, arr1 array<long>
    1. alter table <table_name> change str1 str2 struct<a:int, b:string>
    2. alter table <table_name> change arr1 arr2 array<long>
    3. Changing parent name as well as child name
    4. alter table <table_name> change str1 str2 struct<abc:int, b:string>
    NOTE - The rename operation fails if the structure of the complex column has been altered.
    This check ensures the old and new columns are compatible with each other, meaning
    the number of children and complex levels must be unaltered while attempting to rename.
    
    What changes were proposed in this PR?
    1. Parses the incoming new complex type. Create a nested DatatypeInfo structure.
    2. This DatatypeInfo is then passed on to the AlterTableDataTypeChangeModel.
    3. Validation for compatibility and duplicate columns happens here.
    4. Add the parent column to the schema evolution entry.
    5. Update the spark catalog table.
    Limitation - Renaming is not supported for Map types yet
    
    Does this PR introduce any user interface change?
    Yes
    
    Is any new testcase added?
    Yes
    
    This closes #4129
    akkio-97 authored and Indhumathi27 committed Jun 10, 2021
    Commit cfa02dd
  2. [CARBONDATA-4202] Fix issue when refresh main table with MV

    Why is this PR needed?
    When trying to register a table from an old store which has an MV, it fails with a parser
    error (syntax issue while creating the table): it tries to create the table with the
    relatedmvtablesmap property, which is not valid.
    
    What changes were proposed in this PR?
    1. Removed relatedmvtablesmap from table properties in RefreshCarbonTableCommand
    2. After the main table has been registered, made changes to get the MV schema
       from the system folder and register it.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4147
    ShreelekhyaG authored and Indhumathi27 committed Jun 10, 2021
    Commit 90841bc

Commits on Jun 16, 2021

  1. [CARBONDATA-4206] Support rename SI table

    Why is this PR needed?
    Currently, renaming an SI table can succeed, but after the rename, inserts and queries on
    the main table fail with a 'no such table' exception. This is because, after the SI table
    is renamed, the main table's tblproperties are not updated and still store the old SI
    table name; when referring to the SI table, carbon looks it up by the old name, which leads to the exception.
    
    What changes were proposed in this PR?
    After SI table renamed, update the main table's tblproperties with new SI information.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4149
    jack86596 authored and Indhumathi27 committed Jun 16, 2021
    Commit f1da9e8

Commits on Jun 18, 2021

  1. [CARBONDATA-4208] Wrong Exception received for complex child long str…

    …ing columns
    
    Why is this PR needed?
    When we create a table with a complex column whose child columns have the long string
    data type, a 'column not found in table' exception is thrown. Instead, it should
    throw an exception saying that complex child columns do not support the long string
    data type.
    
    What changes were proposed in this PR?
    Added a check: if a complex child column has the long string data type, throw the correct
    exception.
    Exception: MalformedCarbonCommandException
    Exception Message: Complex child column cannot be set as LONG_STRING_COLUMNS
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4150
    maheshrajus authored and akashrn5 committed Jun 18, 2021
    Commit 65fad98
  2. [CARBONDATA-4212] Fix case sensitive issue with Update query having A…

    …lias Table name
    
    Why is this PR needed?
    An update query having an alias table name fails with an 'Unsupported complex types' error,
    even if the table does not have any complex columns.
    
    What changes were proposed in this PR?
    Check the columnName irrespective of case
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4152
    Indhumathi27 authored and akashrn5 committed Jun 18, 2021
    Commit 95ab745
  3. [CARBONDATA-4213] Fix update/delete issue in index server

    Why is this PR needed?
    During update/delete, the segment file entry in the segment came as an empty
    string, due to which the segment file could not be read.
    
    What changes were proposed in this PR?
    1. Changed the empty string to NULL
    2. Added empty segment file condition while creating SegmentFileStore.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4153
    vikramahuja1001 authored and Indhumathi27 committed Jun 18, 2021
    Commit fdd00ab

Commits on Jun 19, 2021

  1. [CARBONDATA-4211] Fix - from xx Insert into select fails if an SQL st…

    …atement contains multiple inserts
    
    Why is this PR needed?
    When multiple inserts are used in a single query, it fails from SparkPlan with java.lang.ClassCastException:
    GenericInternalRow cannot be cast to UnsafeRow.
    For every successful insert/load we return the Segment ID as a row. For multiple inserts we also return
    a row containing the Segment ID, but while processing in spark the ClassCastException is thrown.
    
    What changes were proposed in this PR?
    When a multiple-insert query is given, its plan has a Union node. Based on its presence, made changes
    to use the flag isMultipleInserts to call the class UnionCommandExec, and implemented a custom sideEffectResult which
    converts GenericInternalRow to UnsafeRow and returns it.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4151
    ShreelekhyaG authored and akashrn5 committed Jun 19, 2021
    Commit d8f7df9
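
    An illustrative multi-insert statement of the kind this fix addresses,
    assuming a `spark` session; table names are hypothetical. The plan for such
    a statement contains the Union node mentioned above:

      spark.sql(
        """FROM src_table
          |INSERT INTO target_a SELECT id, name WHERE id < 100
          |INSERT INTO target_b SELECT id, name WHERE id >= 100""".stripMargin)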

Commits on Jun 22, 2021

  1. [CARBONDATA-4217] Fix rename SI table, other applications didn't get …

    …reflected issue
    
    Why is this PR needed?
    After one application renames an SI table, other applications do not see the
    change, which makes queries on the SI column fail.
    
    What changes were proposed in this PR?
    After updating the index info of the parent table, persist the schema info so that other
    applications can refresh the table metadata in time.
    
    This closes #4155
    jack86596 authored and Indhumathi27 committed Jun 22, 2021
    Commit d5cb011

Commits on Jun 23, 2021

  1. [CARBONDATA-4214] inserting NULL value when timestamp value received …

    …from FROM_UNIXTIME(0)
    
    Why is this PR needed?
    NULL was filled in when a timestamp value came from FROM_UNIXTIME(0), because the original
    insert RDD value [internalRow] received by spark is zero in this case. If the original column
    value [internalRow] is zero, the insert flow adds NULL and gives NULL to spark,
    so querying the same column returns NULL instead of the timestamp value.
    Problem code: if (internalRow.getLong(index) == 0) { internalRow.setNullAt(index) }
    
    What changes were proposed in this PR?
    Removed the null-filling check for the zero-value case; now the internalRow timestamp
    value is set only if the internalRow value is non-null/non-empty.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4154
    maheshrajus authored and Indhumathi27 committed Jun 23, 2021
    Commit 18665cc
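
    An illustrative reproduction, assuming a `spark` session; the table name is
    hypothetical:

      spark.sql("CREATE TABLE tstable(t TIMESTAMP) STORED AS carbondata")
      spark.sql("INSERT INTO tstable SELECT FROM_UNIXTIME(0)")
      // with the fix this returns the epoch timestamp (rendered in the
      // session time zone) instead of NULL
      spark.sql("SELECT t FROM tstable").show(false)
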
  2. [CARBONDATA-4190] Integrate Carbondata with Spark 3.1.1 version

    Why is this PR needed?
    To integrate Carbondata with Spark3.1.1
    
    What changes were proposed in this PR?
    Refactored code to add changes to support Spark 3.1.1 along with Spark 2.3 and 2.4 versions
    Changes:
    
    1. Compile Related Changes
    	1. New Spark package in MV, Streaming and spark-integration.
    	2. API wise changes as per spark changes
    2. Spark has moved to Proleptic Gregorian Calendar, due to which timestamp related changes in carbondata are also required.
    3. Show segment by select command refactor
    4. A few Lucene test cases are ignored due to a deadlock in the spark DAGScheduler, which does not allow them to work.
    5. Alter rename: Parser enabled in Carbon and check for carbon
    6. doExecuteColumnar() changes in CarbonDataSourceScan.scala
    7. char/varchar changes from spark side.
    8. Rule name changed in MV
    9. In univocity parser, CSVParser version changed.
    10. New Configs added in SparkTestQueryExecutor to keep some behaviour same as 2.3 and 2.4
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4141
    vikramahuja1001 authored and akashrn5 committed Jun 23, 2021
    Commit 8ceb4fd
  3. [CARBONDATA-4225] Fix Update performance issues when auto merge compa…

    …ction is enabled
    
    Why is this PR needed?
    1. When auto-compaction is enabled, during update we try to do compaction after the
       insert. Auto-compaction throws an exception after multiple retries, since carbon does not allow
       concurrent compaction and update.
    2. dataframe.rdd.isEmpty launches a Job. This code is called two times, and the result
       is not reused.
    
    What changes were proposed in this PR?
    1. Avoid trying to do auto-compaction during update.
    2. Reuse the dataframe.rdd.isEmpty result and avoid launching a second Job.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4156
    Indhumathi27 authored and akashrn5 committed Jun 23, 2021
    Commit d4ddd07
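
    A minimal sketch of change 2, assuming a Spark DataFrame; the surrounding
    update logic is elided:

      import org.apache.spark.sql.DataFrame

      def processUpdate(updated: DataFrame): Unit = {
        val noRows = updated.rdd.isEmpty() // launches a single Spark job
        if (!noRows) {
          // ... write the updated rows / update table status ...
        }
        // later checks reuse `noRows` instead of triggering a second job
      }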

Commits on Jun 26, 2021

  1. [HOTFIX] Correct CI build status

    The apache Jenkins CI address changed; correct the CI build status.
    chenliang613 authored Jun 26, 2021
    Commit 899b7ae
  2. Commit 5e2adad

Commits on Jun 29, 2021

  1. [CARBONDATA-4230] table properties not updated with lower-case and table

    comment is not working in carbon spark3.1
    
    Why is this PR needed?
    1. Table properties are stored case-sensitively; when we query a table
       property in lower case, the property cannot be found, so the table
       create command fails. This was introduced by the spark 3.1 integration changes.
    2. The table comment is displayed as byte code in a spark 3.1 cluster.
       CommentSpecContext changed in 3.1.
    
    What changes were proposed in this PR?
    1. Convert to lower case and store in table properties.
    2. Get the string value from the commentSpec and set it as the table comment.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No, a test case is already present, but it did not fail in the local UT setup, as the create
    flow differs between the local UT env and a real cluster setup.
    
    This closes #4163
    maheshrajus authored and Indhumathi27 committed Jun 29, 2021
    Commit 65462ff

Commits on Jul 5, 2021

  1. Commit aefa977
  2. Commit 718490e
  3. [HOTFIX]Revert wrong pom changes commit during prepare release process.

    Why is this PR needed?
    Due to a wrong branch release, wrong pom changes are present.
    
    What changes were proposed in this PR?
    Revert the pom changes.
    
    This closes #4167
    akashrn5 authored and kunal642 committed Jul 5, 2021
    Commit c7a3d6d

Commits on Jul 7, 2021

  1. [CARBONDATA-4232] Add missing doc change for secondary index.

    Why is this PR needed?
    Documentation changes were not handled in PR 4116
    
    What changes were proposed in this PR?
    Added missing documentation.
    
    This closes #4164
    nihal0107 authored and Indhumathi27 committed Jul 7, 2021
    Commit 88fdf60

Commits on Jul 14, 2021

  1. [CARBONDATA-4210] Handle 3.1 parsing failures related to alter comple…

    …x types
    
    Why is this PR needed?
    For spark 2.3 and 2.4, parsing of alter commands is done by spark, which is not the case for 3.1.
    
    What changes were proposed in this PR?
    So carbon is responsible for the parsing here.
    Test cases previously ignored due to this issue are now enabled.
    
    This closes #4162
    akkio-97 authored and kunal642 committed Jul 14, 2021
    Commit 02e7723

Commits on Jul 27, 2021

  1. [CARBONDATA-4204][CARBONDATA-4231] Fix add segment error message,

    index server failed testcases and dataload fail error on update
    
    Why is this PR needed?
    1. When the path is empty in carbon add segments,
    StringIndexOutOfBoundsException is thrown.
    2. Index server UT failures fix.
    3. Update fails with a dataload fail error if the bad
    records action is set to force with spark 3.1.
    
    What changes were proposed in this PR?
    1. Added a check to see if the path is empty and then throw
    a valid error message.
    2. Used checkAnswer instead of assert in test cases so
    that the order of rows returned is the same with or
    without the index server. Excluded 2 test cases where explain
    with query statistics is used, as we do not set any
    pruning info from the index server.
    3. On the update command, dataframe.persist is called and,
    with the latest spark 3.1 changes, spark returns a cloned
    SparkSession from the cacheManager with all specified
    configurations disabled. It now uses a different
    sparkSession for 3.1 which is not initialized in CarbonEnv,
    so CarbonEnv.init is called, where a new CarbonSessionInfo is
    created with no sessionParams; hence the properties that were
    set were not accessible. Made changes to set the existing
    sessionParams from currentThreadSessionInfo when a new
    carbonSessionInfo object is created.
    
    This closes #4157
    ShreelekhyaG authored and kunal642 committed Jul 27, 2021
    Commit c9a5231
  2. [CARBONDATA-4250] Ignoring presto random test cases

    Why is this PR needed?
    Presto test cases fail randomly and add time to CI verification of other PRs.
    
    What changes were proposed in this PR?
    The presto random test cases are ignored for now and will be fixed under the JIRAs raised.
    1. JIRA [CARBONDATA-4250] raised for ignoring the presto test cases for now, as these random
       failures were causing PR CI failures.
    2. JIRA [CARBONDATA-4249] raised for fixing the presto random tests in the concurrent scenario.
       More details on the issue reproduction and problem snippet are on that JIRA.
    3. [CARBONDATA-4254] raised to fix "Test alter add for structs enabling local dictionary"
       and "CarbonIndexFileMergeTestCaseWithSI.Verify command of index merge"
    
    This closes #4176
    maheshrajus authored and Indhumathi27 committed Jul 27, 2021
    Commit 0337c32

Commits on Jul 28, 2021

  1. [CARBONDATA-4251][CARBONDATA-4253] Optimize Clean Files Performance

    Why is this PR needed?
     1) When executing the clean files command, it cleans up all the carbonindex and
        carbonmergeindex files that ever existed, even though carbonindex files have been
        merged into carbonmergeindex and deleted. When tens of thousands
        of carbonindex files once existed after the completion of compaction,
        the clean files command takes several hours to clean index files which
        don't actually exist. We only need to clean up the existing
        carbonmergeindex or carbonindex files.
     2) The rename command lists the partitions of the table, but the partition
        information is not actually used. If the table has hundreds of thousands of
        partitions, the performance of rename table degrades a lot.
    
    What changes were proposed in this PR?
     1) There is a variable indexOrMergeFiles, which holds all existing index files.
        The CLEAN FILES command now deletes the existing files instead of deleting all
        files in 'indexFilesMap', which holds every '.carbonindex' file that once
        existed. Cleaning 'indexOrMergeFiles' improves CLEAN FILES performance a lot.
     2) Skip listing partitions during rename table, since the partition
        information is not used.
    
    This closes #4183
    marchpure authored and Indhumathi27 committed Jul 28, 2021
    Commit 9aaeba5

Commits on Jul 29, 2021

  1. [CARBONDATA-4248] Fixed upper case column name in explain command

    Why is this PR needed?
    The explain command with an upper-case column name fails with a 'key not found' exception.
    
    What changes were proposed in this PR?
    Changed the column name to lower case before converting the spark data type to the carbon data type.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4175
    nihal0107 authored and Indhumathi27 committed Jul 29, 2021
    Commit f2698fe
  2. [CARBONDATA-4247][CARBONDATA-4241] Fix Wrong timestamp value query re…

    …sults for data before
    
    1900 years with Spark 3.1
    
    Why is this PR needed?
    1. Spark 3.1 stores timestamp values as Julian micros and rebases timestamp values with
    JulianToGregorianMicros during query.
    -> Since carbon parses and formats timestamp values with SimpleDateFormat, queries give
    incorrect results when spark rebases with JulianToGregorianMicros.
    2. CARBONDATA-4241 -> Global sort load and compaction fail on a table having a timestamp column
    
    What changes were proposed in this PR?
    1. Use java Instant to parse new timestamp values. For old stores queried with Spark 3.1,
    rebase the timestamp value from Julian to Gregorian micros.
    2. If the timestamp value is of type Instant, convert the value to a java timestamp.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No (Existing testcase is sufficient)
    
    This closes #4177
    Indhumathi27 authored and akashrn5 committed Jul 29, 2021
    Commit feb0521
  3. [CARBONDATA-4242]Improve cdc performance and introduce new APIs for U…

    …PSERT, DELETE, INSERT and UPDATE
    
    Why is this PR needed?
    1. In the existing solution, when we perform the join of the source and target datasets for tagging records to delete, update and insert, we scan all the data of the target table and then join with the source dataset. But the source data may be small, and its range may cover only some hundreds of carbondata files out of thousands in the target table. Pruning is the main bottleneck here: scanning all records and joining them results in a lot of shuffle and reduces performance.
    2. Source data caching was not there; caching the source data improves its multiple scans, and since the input source data is small, we can persist the dataset.
    3. When performing the join, we used to first get the Row object, operate on it, cast each datatype to the spark datatype, and then convert to an InternalRow object for further processing of the joined data. This adds extra deserializeToObject and map nodes in the DAG and increases time.
    4. During tagging of records (the join operation), we prepared a new projection of the required columns, which involves preparing an internal row object as explained in point 3 and then applying an eval function on each row to prepare the projection. This applies the same expression eval on the joined data - repeated work that increases time.
    5. The join operation used all the columns of the source dataset plus the required columns of the target table, like the join key column and other columns such as tupleID, status_on_mergeds etc. When there are many columns in the table, execution time increases due to heavy data shuffling.
    6. The current merge APIs are a little complex, generalized, and confusing to the user for simple upsert, delete and insert operations.
    
    What changes were proposed in this PR?
    1. Add pruning logic before the join operation. Compare the incoming rows against an interval-tree data structure that holds each carbondata file path with its min and max, to identify the carbondata files where the incoming rows can be present. In favourable scenarios this scans far fewer files rather than blindly scanning all the carbondata files in the target table.
    2. Cache the incoming source dataset (srcDS.cache()), so the cached data is used in all operations and speed improves; uncache() after the merge operation.
    3. Instead of operating on the Row object and then converting to InternalRow, operate directly on the InternalRow object to avoid the datatype conversions.
    4. Instead of evaluating the expression again based on the required projection columns of the matching conditions, directly identify the indexes required for the output row and access those indices on the incoming internal row after step 3; evaluation is avoided, and array access by index gives O(1) performance.
    5. During the join (tagging of records), do not include all column data; include just the join key columns and identify the tupleIDs to delete and the rows to insert. This avoids a lot of shuffle and improves performance significantly.
    6. Introduce new APIs for UPSERT, UPDATE, DELETE and INSERT and make the user-exposed APIs simple: the user only gives the key column for the join, the source dataset, and the operation type. These new APIs use all the improvements above and avoid the unnecessary operations of the existing merge APIs.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4148
    akashrn5 authored and ajantha-bhat committed Jul 29, 2021
    Commit 1e2fc4c
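
    A minimal, self-contained sketch of the min/max pruning idea in change 1;
    the names are illustrative, and a linear scan stands in for the interval
    tree described above:

      case class FileRange(path: String, min: Long, max: Long)

      // keep only files whose [min, max] range can contain an incoming key
      def pruneFiles(files: Seq[FileRange], keys: Seq[Long]): Seq[FileRange] =
        files.filter(f => keys.exists(k => k >= f.min && k <= f.max))

      val files = Seq(FileRange("part-0", 0L, 99L), FileRange("part-1", 100L, 199L))
      println(pruneFiles(files, Seq(42L, 57L)).map(_.path)) // List(part-0)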

Commits on Jul 30, 2021

  1. [CARBONDATA-4255] Prohibit Create/Drop Database when databaselocation…

    … is inconsistent
    
    Why is this PR needed?
    When carbon.storelocation and spark.sql.warehouse.dir are configured to
    different values, the database location may be inconsistent. When the DROP DATABASE
    command is executed, both locations (the carbon dblocation and the hive
    dblocation) may be cleared, which may confuse users.
    
    What changes were proposed in this PR?
    Drop database is prohibited when the database location is inconsistent.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4186
    marchpure authored and Indhumathi27 committed Jul 30, 2021
    Commit 3c81c7a

Commits on Aug 1, 2021

  1. Commit aceaa44

Commits on Aug 3, 2021

  1. Commit 62354e3

Commits on Aug 5, 2021

  1. Commit 7b0d8f6

Commits on Aug 7, 2021

  1. Commit 11dab76
  2. [CARBONDATA-4268][Doc][summer-2021] Add new dev mailing list (website…

    …) link and update the Nabble address This closes #4195
    chenliang613 committed Aug 7, 2021
    Commit a5bb652
  3. [Doc][summer-2021] Add TOC and format how-to-contribute-to-apache-car…

    …bondata.md
    
    GitHub Flavored Markdown does not support automatic TOC generation in Markdown files, so anchors are used to implement a TOC of headings.
    Jeromestein authored and chenliang613 committed Aug 7, 2021
    Commit e8f8c02
  4. [CARBONDATA-4266][Doc][summer-2021] Add TOC and format how-to-contrib…

    …ute-to-apache-carbondata.md This closes #4192
    chenliang613 committed Aug 7, 2021
    Commit d4abe76

Commits on Aug 8, 2021

  1. Update quick-start-guide.md

    Modify minor errors and correct some misunderstandings in the document
    
    Create quick-start-guide.md
    ChanceXin authored and chenliang613 committed Aug 8, 2021
    Commit 926b67b
  2. [CARBONDATA-4267][Doc][summer-2021]Update and modify some content in …

    …quick-start-guide.md This closes #4197
    chenliang613 committed Aug 8, 2021
    Commit fac48be

Commits on Aug 11, 2021

  1. [CARBONDATA-4256] Fixed parsing failure on SI creation for complex co…

    …lumn
    
    Why is this PR needed?
    Currently, SI creation on a complex column that includes a child column
    referenced with a dot (.) fails with a parse exception.
    
    What changes were proposed in this PR?
    Handled parsing for create index on complex column.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4187
    nihal0107 authored and Indhumathi27 committed Aug 11, 2021
    Commit bdd4a8c
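
    A hedged sketch of the failing shape, assuming a `spark` session; the table
    and column names are hypothetical, and the dot notation for the child
    column follows the description above:

      spark.sql("CREATE TABLE cmplx(id INT, persons ARRAY<STRUCT<name:STRING>>) STORED AS carbondata")
      // the child column is referenced with a dot, which previously failed to parse
      spark.sql("CREATE INDEX idx_name ON TABLE cmplx(persons.name) AS 'carbondata'")
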
  2. [CARBONDATA-4091] support prestosql 333 integartion with carbon

    Why is this PR needed?
    Currently carbondata is integrated with presto-sql 316, which is 1.5 years old.
    Many good features and optimizations have come into presto since,
    like dynamic filtering, the Rubix data cache, and some performance improvements.
    
    It is always good to use the latest version; the latest is presto-sql 348.
    But jumping from 316 to 348 is too many changes at once.
    So, to utilize these new features and based on customer demand, this PR
    upgrades presto-sql to version 333. It will be upgraded again
    to a more recent version in a few months.
    
    Note:
    This is a plain integration to support all existing features of presto 316;
    deep integration to support new features like dynamic filtering and the
    Rubix cache will be handled in another PR.
    
    What changes were proposed in this PR?
    1. Adapt to the new hive adapter changes, like some constructor changes;
       made a carbonDataConnector to support CarbonDataHandleResolver
    2. Java 11 removed the ConstructorAccessor class, so the unsafe class is used for
       reflection (presto 333 depends on java 11 at runtime)
    3. POM changes to support presto 333
    
    Note: a JAVA 11 environment is needed for running presto 333 with carbon, and the
    jvm property "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" must also be added
    
    This closes #4034
    ajantha-bhat authored and Indhumathi27 committed Aug 11, 2021
    Commit 1ccf295

Commits on Aug 16, 2021

  1. [CARBONDATA-4269] Update url and description for new prestosql-guide.md

    Why is this PR needed?
    PrestoSQL has changed its name to Trino. Facebook established the Presto Foundation at The Linux Foundation®, which led to prestosql having to change its name.
    More information here: https://trino.io/blog/2020/12/27/announcing-trino.html
    
    What changes were proposed in this PR?
    1. Change the url to prestosql 333
    2. Added a description indicating that PrestoSQL has been renamed to Trino
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4202
    czy006 authored and ajantha-bhat committed Aug 16, 2021
    Commit 5804060

Commits on Aug 19, 2021

  1. Commit 0e59ddb

Commits on Aug 22, 2021

  1. [CARBONDATA-4272]carbondata test case not including the load command …

    …with overwrite This closes #4207
    chenliang613 committed Aug 22, 2021
    Commit 9f9ea1f

Commits on Aug 24, 2021

  1. [CARBONDATA-4119][CARBONDATA-4238][CARBONDATA-4237][CARBONDATA-4236] …

    …Support geo insert without geoId and document changes
    
    Why is this PR needed?
    1. To insert without geoId (like load) on a geo table.
    2. [CARBONDATA-4119] : User input for the geoId column is not validated.
    3. [CARBONDATA-4238] : Documentation issue in ddl-of-carbondata.md#add-columns
    4. [CARBONDATA-4237] : Documentation issues in streaming-guide.md, file-structure-of-carbondata.md and sdk-guide.md.
    5. [CARBONDATA-4236] : Documentation issues in configuration-parameters.md.
    6. The imported processing class in streaming-guide.md is wrong
    
    What changes were proposed in this PR?
    1. Made changes to support insert on a geo table with an auto-generated geoId.
    2. [CARBONDATA-4119] : Added documentation about insert with a custom geoId. Changes in docs/spatial-index-guide.md
    3. Other documentation changes added.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4205
    ShreelekhyaG authored and Indhumathi27 committed Aug 24, 2021
    Commit 8de65a2
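
    A hedged sketch following the docs' geo table shape (timevalue, longitude,
    latitude), assuming a `spark` session; the values are illustrative:

      // geoId is auto-generated when omitted, mirroring load behaviour
      spark.sql("INSERT INTO source_index SELECT 1575428400000, 116285807, 40084087")
      // a custom geoId can still be supplied as the leading column
      spark.sql("INSERT INTO source_index SELECT 0, 1575428400000, 116285807, 40084087")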

Commits on Aug 26, 2021

  1. [CARBONDATA-4164][CARBONDATA-4198][CARBONDATA-4199][CARBONDATA-4234] …

    …Support alter add map, multilevel complex columns and rename/change datatype.
    
    Why is this PR needed?
    Support alter add for map and multilevel complex columns, and change datatype at nested levels for complex types.
    
    What changes were proposed in this PR?
    1. Support adding of single-level and multi-level map columns
    2. Support adding of multi-level complex columns(array/struct)
    3. Support renaming of map columns including nested levels
    4. Alter change datatype at nested levels (array/map/struct)
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4180
    ShreelekhyaG authored and Indhumathi27 committed Aug 26, 2021
    Commit f52aa20

Commits on Aug 31, 2021

  1. [CARBONDATA-4274] Fix create partition table error with spark 3.1

    Why is this PR needed?
    With spark 3.1, we can create a partition table by giving partition
    columns from schema.
    Like below example:
    create table partitionTable(c1 int, c2 int, v1 string, v2 string)
    stored as carbondata partitioned by (v2,c2)
    
    When the table is created by a SparkSession with CarbonExtension,
    the catalog table is created with the specified partitions.
    But in a cluster / with carbon session, creating a partition
    table with the above syntax creates a normal table with no partitions.
    
    What changes were proposed in this PR?
    partitionByStructFields is empty when partition column names are
    given directly, so a partition table was not being created. Made
    changes to identify the partition column names and get the struct
    field and datatype info from the table columns (see the sketch below).
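
    A minimal sketch of that lookup (a hypothetical helper, not the actual CarbonData code):

        import org.apache.spark.sql.types.{StructField, StructType}

        // Resolve partition columns given only their names, by finding the matching
        // field (name + datatype) in the full table schema.
        def resolvePartitionFields(schema: StructType, names: Seq[String]): Seq[StructField] =
          names.map { n =>
            schema.find(_.name.equalsIgnoreCase(n)).getOrElse(
              throw new IllegalArgumentException(s"Partition column $n not found in schema"))
          }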
    
    This closes #4208
    ShreelekhyaG authored and kunal642 committed Aug 31, 2021
    ca659b5

Commits on Sep 1, 2021

  1. [CARBONDATA-4271] Support DPP for carbon

    Why is this PR needed?
    This PR enables Dynamic Partition Pruning for carbon.
    
    What changes were proposed in this PR?
    CarbonDatasourceHadoopRelation has to extend HadoopFsRelation,
    because spark has added a check to use DPP only for relations matching HadoopFsRelation.
    Apply the dynamic filter, get the runtimePartitions, and set them on CarbonScanRDD for pruning.
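
    Spark's eligibility check is roughly the following (a paraphrased sketch of its
    dynamic partition pruning rule, not the verbatim Spark source):

        import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
        import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}

        // Only scans backed by a HadoopFsRelation qualify for DPP, which is why
        // CarbonDatasourceHadoopRelation must extend HadoopFsRelation.
        def eligibleForDpp(plan: LogicalPlan): Boolean =
          plan.collectFirst {
            case LogicalRelation(_: HadoopFsRelation, _, _, _) => true
          }.isDefined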
    
    This closes #4199
    Indhumathi27 authored and kunal642 committed Sep 1, 2021
    bdc9484
  2. [CARBONDATA-4273] Fix Cannot create external table with partitions

    Why is this PR needed?
    Create partition table with location fails with unsupported message.
    
    What changes were proposed in this PR?
    This scenario works in cluster mode. The same check can be applied in local
    mode as well, so that a partition table can be created with a location.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4211
    Indhumathi27 authored and akashrn5 committed Sep 1, 2021
    42f6982
  3. [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

    Why is this PR needed?
    When the block index (BlockIndex) is already available, there is no need to prepare the indexes (List[BlockIndex]) from the available segments and partition locations, which might delay query performance.
    
    What changes were proposed in this PR?
    Directly get the segment properties if a block index (BlockIndex) is available:
    
          // If the first index is already a BlockIndex, read the segment properties
          // from it directly instead of re-preparing the index list for the segment.
          if (segmentIndices.get(0) instanceof BlockIndex) {
            segmentProperties =
                segmentPropertiesFetcher.getSegmentPropertiesFromIndex(segmentIndices.get(0));
          } else {
            // Otherwise fall back to building the properties from the segment
            // and its partition locations.
            segmentProperties =
                segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
          }
    getSegmentPropertiesFromIndex reads the segment properties directly from the block index.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No. Index-related test cases are already present which cover the added code.
    
    This closes #4209
    maheshrajus authored and ajantha-bhat committed Sep 1, 2021
    226228f

Commits on Sep 8, 2021

  1. [CARBONDATA-4282] Fix issues with table having complex columns relate…

    …d to long string, SI, local dictionary
    
    Why is this PR needed?
    1. Insert/load fails after alter add complex column if the table contains long string columns.
    2. Create index on an array of complex columns (map/struct) throws a null pointer exception instead of a correct error message.
    3. Alter table property local dictionary include/exclude with a newly added map column is failing.
    
    What changes were proposed in this PR?
    1. The datatypes array and the data row are in different orders, leading to a ClassCastException. Made changes to add newly added complex columns after the long string columns and other dimensions in carbonTableSchemaCommon.scala.
    2. For complex columns, SI creation is allowed only on arrays of primitive types. Check if the child column is of complex type and throw an exception. Changes made in SICreationCommand.scala.
    3. In AlterTableUtil.scala, while validating local dictionary columns, array and struct types are handled but map type is missed. Added a check for complex types.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4214
    ShreelekhyaG authored and ajantha-bhat committed Sep 8, 2021
    4d8bc9e

Commits on Sep 16, 2021

  1. [CARBONDATA-4277] geo instance compatibility fix

    Why is this PR needed?
    The CustomIndex interface extends Serializable, and for a store written by a
    different version, if the serialization id doesn't match, it throws
    java.io.InvalidClassException during load/update/query operations.
    
    What changes were proposed in this PR?
    As the instance is stored in the table properties, made changes to
    initialize and update the instance while refreshing the table. Also added
    a static serialId to the CustomIndex interface.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No, tested in cluster
    
    This closes #4216
    ShreelekhyaG authored and Indhumathi27 committed Sep 16, 2021
    7199357
  2. [CARBONDATA-4284] Load/insert after alter add column on partition tab…

    …le with complex column fails
    
    Why is this PR needed?
    Insert after alter add column on a partition table with a complex column fails with a BufferUnderflowException.
    The order of columns in the TableSchema is different after alter add column.
    Ex: If the partition is of dimension type, when the table is created the schema column order is
    dimension columns (partition column included) + complex column.
    After alter add, the order of columns in the schema was changed by moving the partition column to the end:
    complex column + partition column.
    Due to this change in order, the indexing in fillDimensionAndMeasureDetails is wrong, as it
    expects the complex column to always be last, which causes the BufferUnderflowException while flattening the complex row.
    
    What changes were proposed in this PR?
    After alter add, removed the change that moves the partition column to the end.
    
    This closes #4215
    ShreelekhyaG authored and kunal642 committed Sep 16, 2021
    3b29bcb
  3. [CARBONDATA-4286] Fixed measure comparator

    Why is this PR needed?
    A select query with an AND filter condition on a table returns an empty result
    even though valid data is present in the table.
    
    Root cause: Currently, while building the min-max index at block level,
    the unsafe byte comparator is used for both dimension and measure
    columns, which returns incorrect results for measure columns.
    
    What changes were proposed in this PR?
    Use different comparators for dimension and measure columns, as is
    already done when writing the min-max index at blocklet level (see the sketch below).
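
    A minimal sketch of the idea (a hypothetical helper; CarbonData's actual comparator
    classes differ, and the measure branch assumes 8-byte big-endian long values):

        import java.nio.ByteBuffer
        import java.util.Comparator

        // Dimensions can be compared as raw unsigned bytes, but measures must be
        // compared on their decoded numeric values.
        def minMaxComparator(isMeasure: Boolean): Comparator[Array[Byte]] =
          if (isMeasure)
            (a: Array[Byte], b: Array[Byte]) =>
              java.lang.Long.compare(ByteBuffer.wrap(a).getLong, ByteBuffer.wrap(b).getLong)
          else
            (a: Array[Byte], b: Array[Byte]) => java.util.Arrays.compareUnsigned(a, b)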
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4217
    nihal0107 authored and akashrn5 committed Sep 16, 2021
    2d1907b

Commits on Sep 20, 2021

  1. [CARBONDATA-4285] Fix alter add complex columns with global sort comp…

    …action failure
    
    Why is this PR needed?
    Alter add complex columns with global sort compaction is failing due to:
    
    ArrayIndexOutOfBounds exception: Currently a default complex delimiter list of size 3 is
    created in global sort compaction. The map case needs an extra complex delimiter to handle the key-value pair.
    Bad record handling: When complex columns are added after data has been inserted, the complex
    columns hold null data for previously loaded segments. This null value gets treated as a bad
    record and the compaction fails.
    
    What changes were proposed in this PR?
    In the global sort compaction flow, create the default complex delimiter list with size 4, as is
    already done in the load flow.
    Bad record handling is pruned for the compaction case. There is no need to check bad records during
    compaction, as they were already checked while loading; compaction only re-inserts data from
    previously loaded segments.
    
    This closes #4218
    maheshrajus authored and kunal642 committed Sep 20, 2021
    22342f8
  2. [CARBONDATA-4288][CARBONDATA-4289] Fix various issues with Index Serv…

    …er caching mechanism.
    
    Why is this PR needed?
    There are 2 issues in the Index Server flow:
    In the case of a main table with an SI table, with pre-priming disabled and index server
    enabled, a new load to the main table and SI table puts the cache for the main table in the index
    server. The cache is loaded again when a select query is fired. This happens because,
    during the load to the SI table, getSplits is called on the main table segment which is in Insert In
    Progress state. The index server considers this segment a legacy segment because its index
    size is 0, and does not put its entry in the tableToExecutor mapping. In the getSplits method,
    isRefreshNeeded is false the first time getSplits is called. During the select query, in the
    getSplits method, isRefreshNeeded is true and the previously loaded entry is removed from the
    driver, but since there is no entry for that table in the tableToExecutor mapping, the previous
    cache value becomes dead cache and always stays in the index server. The newly loaded cache
    is loaded to a new executor, and 2 copies of the cache for the same segment are maintained.
    Concurrent select queries to the index server show wrong cache values in the Index Server.
    
    What changes were proposed in this PR?
    The following changes are proposed to the index server code:
    Remove the cache object from the index server in case the segment is Insert In Progress, and
    in the case of a legacy segment add the value to the tableToExecutor mapping so that the cache
    is also removed from the executor side.
    Concurrent queries were able to add duplicate cache values to other executors. Changed the logic
    of the assign-executors method so that concurrent queries cannot add cache for the same segment
    to other executors.
    
    This closes #4219
    vikramahuja1001 authored and kunal642 committed Sep 20, 2021
    ce860d0

Commits on Oct 7, 2021

  1. [CARBONDATA-4243] Fixed si with column meta cache on same column

    Why is this PR needed?
    Currently, a select query fails when the table has SI and column_meta_cache
    on the same columns and uses the to_date() UDF. This happens because pushDownFilters
    is null in CarbonDataSourceScanHelper, causing a null pointer exception.
    
    What changes were proposed in this PR?
    Passed Seq.empty instead of a null value for pushDownFilters in CarbonDataSourceScan.doCanonicalize (see the sketch below).
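
    In spirit (a one-line sketch of the change, not the exact patch):

        // before: pushDownFilters was null, which caused the NPE during doCanonicalize
        // after: an empty sequence is passed instead
        val pushDownFilters: Seq[org.apache.spark.sql.sources.Filter] = Seq.empty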
    
    This closes #4225
    nihal0107 authored and Indhumathi27 committed Oct 7, 2021
    9944936
  2. [CARBONDATA-4228] [CARBONDATA-4203] Fixed update/delete after alter a…

    …dd segment
    
    Why is this PR needed?
    Deleted records reappear, or updated records show old values, in select
    queries. This is because after horizontal compaction the delete delta file for an external
    segment is written to the default path, which is Fact\part0\segment_x\, whereas for an
    external segment the delete delta file should be written to the path
    where the segment is present.
    
    What changes were proposed in this PR?
    After a delete/update operation on the segment, horizontal compaction is triggered.
    Now, after horizontal compaction for external segments, the delete delta file is
    written to the segment path instead of the default path.
    
    This closes #4220
    nihal0107 authored and kunal642 committed Oct 7, 2021
    bca62cd
  3. [CARBONDATA-4293] Make Table created without external keyword as Tran…

    …sactional table
    
    Why is this PR needed?
    Currently, when you create a table with a location (without the external keyword) in cluster mode,
    the corresponding table is created as a transactional table. If the external keyword is
    present, then it is created as a non-transactional table. This scenario was not handled
    in local mode.
    
    What changes were proposed in this PR?
    Made changes to check whether the external keyword is present or not. If it is not present, the
    corresponding table is made a transactional table.
    
    This closes #4221
    Indhumathi27 authored and kunal642 committed Oct 7, 2021
    5a710f9

Commits on Oct 8, 2021

  1. [CARBONDATA-4215] Fix query issue after add segment other formats wit…

    …h vector read disabled
    
    Why is this PR needed?
    If carbon.enable.vector.reader is disabled and parquet/orc segments are added
    to a carbon table, then a query fails with java.lang.ClassCastException:
    org.apache.spark.sql.vectorized.ColumnarBatch cannot be cast to
    org.apache.spark.sql.catalyst.InternalRow. When the vector reader property is
    disabled, ColumnarBatchScan's supportBatch is overridden
    to false while scanning, but supportBatch of an external file format like
    ParquetFileFormat is not overridden and defaults to true.
    
    What changes were proposed in this PR?
    Made changes to override supportBatch of external file formats based on the
    carbon.enable.vector.reader property (a sketch follows).
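
    A minimal sketch of the idea, assuming Spark's FileFormat API (the wrapper class
    below is illustrative, not the actual CarbonData change):

        import org.apache.spark.sql.SparkSession
        import org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat
        import org.apache.spark.sql.types.StructType

        // Force the delegate format's supportBatch to honour the carbon
        // vector-reader setting instead of its own default (true).
        class VectorAwareParquetFormat(vectorReaderEnabled: Boolean) extends ParquetFileFormat {
          override def supportBatch(sparkSession: SparkSession, schema: StructType): Boolean =
            vectorReaderEnabled && super.supportBatch(sparkSession, schema)
        }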
    
    This closes #4226
    ShreelekhyaG authored and Indhumathi27 committed Oct 8, 2021
    8b3d78b

Commits on Oct 12, 2021

  1. [CARBONDATA-4292] Spatial index creation using spark dataframe

    Why is this PR needed?
    To support spatial index creation using spark data frame
    
    What changes were proposed in this PR?
    Added spatial properties in carbonOptions and edited existing testcases.
    
    Does this PR introduce any user interface change?
    Yes
    
    Is any new testcase added?
    Yes
    
    This closes #4222
    ShreelekhyaG authored and Indhumathi27 committed Oct 12, 2021
    b8d9a97

Commits on Oct 21, 2021

  1. [CARBONDATA-4298][CARBONDATA-4281] Empty bad record support for compl…

    …ex type
    
    Why is this PR needed?
    1. The IS_EMPTY_DATA_BAD_RECORD property is not supported for complex types.
    2. To update the documentation that COLUMN_META_CACHE and RANGE_COLUMN
       don't support complex datatypes.
    
    What changes were proposed in this PR?
    1. Made changes to pass down the IS_EMPTY_DATA_BAD_RECORD property and
       throw an exception. Store an empty complex value instead of storing
       a null value, which matches the hive table result.
    2. Updated the document and added a testcase (a usage sketch follows).
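
    Hypothetical usage of the option (assuming CarbonData's standard bad-record load
    options; the path and table name are illustrative):

        // Treat empty complex values in the input as bad records and fail the load
        spark.sql("""
          LOAD DATA INPATH 'hdfs://host/data.csv' INTO TABLE t
          OPTIONS('BAD_RECORDS_ACTION'='FAIL', 'IS_EMPTY_DATA_BAD_RECORD'='TRUE')
        """)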
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4228
    ShreelekhyaG authored and Indhumathi27 committed Oct 21, 2021
    305851e

Commits on Oct 23, 2021

  1. [CARBONDATA-4306] Fix Query Performance issue for Spark 3.1

    Why is this PR needed?
    Currently, with Spark 3.1, some rules are applied many times, resulting in performance degradation.
    
    What changes were proposed in this PR?
    Changed the rule apply strategy from Fixed to Once, and CarbonOptimizer now directly extends SparkOptimizer, avoiding applying the same rules many times.
    
    This Closes #4229
    Indhumathi27 authored and kunal642 committed Oct 23, 2021
    8953cde

Commits on Oct 26, 2021

  1. [CARBONDATA-4303] Columns mismatch when insert into table with static…

    … partition
    
    Why is this PR needed?
    When inserting into a table with a static partition, the source projection should not contain
    the static partition columns, while the target table has all columns. The column number
    comparison between the source and the target table is therefore: source table column
    number = target table column number - static partition column number.
    
    What changes were proposed in this PR?
    Before doing the column number comparison, remove the static partition columns
    from the target table (see the example below).
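
    For example (a hypothetical table, using standard Spark SQL syntax): with one static
    partition column, the SELECT supplies one column fewer than the target table has.

        // target t has 3 columns (c1, v1, v2); v2 is supplied as a static partition value,
        // so the source projection provides only 3 - 1 = 2 columns
        spark.sql("CREATE TABLE t (c1 INT, v1 STRING) PARTITIONED BY (v2 STRING) STORED AS carbondata")
        spark.sql("INSERT INTO t PARTITION (v2 = 'a') SELECT 1, 'x'")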
    
    This Closes #4233
    jack86596 authored and Indhumathi27 committed Oct 26, 2021
    9dbd2a5

Commits on Oct 28, 2021

  1. [CARBONDATA-4240]: Added missing properties on the configurations page

    Why is this PR needed?
    A few user-facing properties that were missing from the configurations
    page have been added.
    
    What changes were proposed in this PR?
    Addition of missing properties
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This Closes #4210
    pratyakshsharma authored and akashrn5 committed Oct 28, 2021
    7d94691
  2. [CARBONDATA-4194] Fixed presto read after update/delete from spark

    Why is this PR needed?
    After an update/delete with spark on a table which contains an array/struct column,
    reading from presto throws a class cast exception.
    This is because after an update/delete the page contains a vector of type
    ColumnarVectorWrapperDirectWithDeleteDelta, which was being typecast to
    CarbonColumnVectorImpl, causing the typecast exception.
    After fixing this (added an instanceof check) it started throwing IllegalArgumentException,
    because:
    
    1. With local dictionary enabled, CarbondataPageSource.load calls
    ComplexTypeStreamReader.putComplexObject before setting the correct number
    of rows (it doesn't subtract deleted rows), and it throws IllegalArgumentException while
    building blocks for child elements.
    2. The position count is wrong in the case of a struct. It should subtract
    the number of deleted rows in LocalDictDimensionDataChunkStore.fillVector.
    This is not required in the case of an array, because the
    data length of the array already takes care of deleted rows in
    ColumnVectorInfo.getUpdatedPageSizeForChildVector.
    
    What changes were proposed in this PR?
    First fixed the class cast exception by putting an instanceof condition in the if block.
    Then subtracted the deleted row count before calling ComplexTypeStreamReader.putComplexObject
    in DirectCompressCodec.decodeAndFillVector. Also handled deleted rows in the struct case
    in LocalDictDimensionDataChunkStore.fillVector.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This Closes #4224
    nihal0107 authored and akashrn5 committed Oct 28, 2021
    07b41a5

Commits on Nov 15, 2021

  1. [CARBONDATA-4296]: schema evolution, enforcement and deduplication ut…

    …ilities added
    
    Why is this PR needed?
    This PR adds schema enforcement, schema evolution and deduplication capabilities for
    the carbondata streamer tool specifically. For the existing IUD scenarios, some work
    needs to be done to handle them completely, for example -
    1. passing default values and storing them in table properties.
    
    Changes proposed for phase 2 -
    1. Handling delete use cases with the upsert operation/command itself. Right now we
    consider an update as delete + insert. With the new streamer tool, it is possible that
    the user sets upsert as the operation type and the incoming stream has delete records as well.
    
    What changes were proposed in this PR?
    Configs and utility methods are added for the following use cases -
    1. Schema enforcement
    2. Schema evolution - add column, delete column, data type change scenarios
    3. Deduplicate the incoming dataset against the incoming dataset itself. This is useful
    in scenarios where the incoming stream of data has multiple updates for the same record
    and we want to pick the latest (see the sketch after this list).
    4. Deduplicate the incoming dataset against the existing target dataset. This is useful
    when the operation type is set to INSERT and the user does not want to insert duplicate records.
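
    A minimal sketch of use case 3 (hypothetical column names; the streamer tool's actual
    config-driven implementation differs):

        import org.apache.spark.sql.DataFrame
        import org.apache.spark.sql.expressions.Window
        import org.apache.spark.sql.functions.{col, row_number}

        // Keep only the latest update per key within the incoming micro-batch itself.
        def dedupIncoming(incoming: DataFrame, keyCol: String, tsCol: String): DataFrame = {
          val w = Window.partitionBy(col(keyCol)).orderBy(col(tsCol).desc)
          incoming.withColumn("_rn", row_number().over(w))
            .filter(col("_rn") === 1)
            .drop("_rn")
        }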
    
    This closes #4227
    pratyakshsharma authored and kunal642 committed Nov 15, 2021
    3be05d2

Commits on Nov 25, 2021

  1. Supplementary information for add segment syntax.

    1. add segment option (partition)
    2. segment-management-on-carbondata.md links to addsegment-guide.md
    bieremayi committed Nov 25, 2021
    81c2e29

Commits on Nov 26, 2021

  1. [CARBONDATA-4305] Support Carbondata Streamer tool for incremental fe…

    …tch and merge from kafka and DFS Sources
    
    Why is this PR needed?
    In the current Carbondata CDC solution, if a user wants to integrate it with a streaming source, they
    need to write a separate spark application to capture changes, which is an overhead. We should be able to
    incrementally capture the data changes from primary databases and incrementally ingest
    them into the data lake so that the overall latency decreases. The former is taken care of by
    log-based CDC systems like Maxwell and Debezium. Here is a solution for the second aspect using Apache Carbondata.
    
    What changes were proposed in this PR?
    The Carbondata streamer tool is a spark streaming application which enables users to incrementally ingest data
    from various sources, like Kafka (a standard pipeline would be MYSQL => debezium => (kafka + schema registry) => Carbondata streamer tool)
    and DFS, into their data lakes. The tool comes with out-of-the-box support for almost all types of schema
    evolution use cases. With the streamer tool, only add-column support is provided for now, with drop-column and
    other schema change capabilities in line for the upcoming days. Please refer to the design document for
    more details about the usage and working of the tool.
    
    This closes #4235
    akashrn5 authored and kunal642 committed Nov 26, 2021
    18840af

Commits on Nov 29, 2021

  1. Add FAQ: How to manage mixed file formats in a carbondata table.

    1. add segment example.
    2. faq.md links to addsegment-guide.md
    bieremayi committed Nov 29, 2021
    598d1ce
  2. 885a21c

Commits on Dec 2, 2021

  1. remove add segment refer

    bieremayi committed Dec 2, 2021
    69ab06c

Commits on Dec 4, 2021

  1. 7af81ad

Commits on Dec 7, 2021

  1. 42d59be
  2. Revert "remove add segment refer"

    This reverts commit 69ab06c.
    bieremayi committed Dec 7, 2021
    580f7f6
  3. Revert "FAQ: carbon rename to carbondata"

    This reverts commit 885a21c.
    bieremayi committed Dec 7, 2021
    341f1bf
  4. ce5747d
  5. f544e59
  6. c29fee2

Commits on Dec 18, 2021

  1. Update docs/addsegment-guide.md

    thanks!
    
    Co-authored-by: Indhumathi27 <[email protected]>
    bieremayi and Indhumathi27 authored Dec 18, 2021
    379f5ad
  2. Update docs/addsegment-guide.md

    thanks!
    
    Co-authored-by: Indhumathi27 <[email protected]>
    bieremayi and Indhumathi27 authored Dec 18, 2021
    a1b6d99

Commits on Dec 20, 2021

  1. fc3914f
  2. 861fc67
  3. 01f8e1a
  4. 053d080
  5. c0211fc
  6. 0ced3c8
  7. f266a73

Commits on Dec 22, 2021

  1. [CARBONDATA-4316] Fix horizontal compaction failure for partition tables

    Why is this PR needed?
    Horizontal compaction fails for partition tables, leaving many delete
    delta files for a single block and thus slower query performance.
    This happens because during horizontal compaction the delta file
    path prepared for the partition table is wrong, which fails to identify
    the path and fails the operation.
    
    What changes were proposed in this PR?
    If it is a partition table, read the segment file and identify the
    partition where the block is present to prepare a proper partition path.
    
    This closes #4240
    akashrn5 authored and kunal642 committed Dec 22, 2021
    d629dc0
  2. [CARBONDATA-4317] Fix TPCDS performance issues

    Why is this PR needed?
    The following issues have degraded TPCDS query performance:
    1. If a dynamic filter is not present in the partitionFilters set, that filter is skipped instead of being pushed down to spark.
    2. In some cases, nodes like Exchange / Shuffle are not reused, because the CarbonDataSourceScan plans are not matched.
    3. Accessing the metadata on the canonicalized plan throws an NPE.
    
    What changes were proposed in this PR?
    1. Check if the dynamic filter is present in the partitionFilters set. If not, push down the filter.
    2. Match the plans by converting them to canonicalized form and normalizing the expressions.
    3. Move variables used in metadata(), to avoid the NPE while comparing plans.
    
    This closes #4241
    Indhumathi27 authored and kunal642 committed Dec 22, 2021
    0f1d2a4

Commits on Dec 28, 2021

  1. [CARBONDATA-4319] Fixed clean files not deleting stale delete delta…

    … files after horizontal compaction
    
    Why is this PR needed?
    After horizontal compaction was performed on partition and non-partition tables, the clean files
    operation was not deleting the stale delete delta files. The code had been removed as part of the clean
    files refactoring done previously.
    
    What changes were proposed in this PR?
    Clean files with the force option now handles removal of these stale delta files as well as the stale
    tableupdatestatus file, for both partition and non-partition tables.
    
    This closes #4245
    vikramahuja1001 authored and kunal642 committed Dec 28, 2021
    a072e7a
  2. [CARBONDATA-4308]: added docs for streamer tool configs

    Why is this PR needed?
    Documentation for the CDC streamer tool is missing.
    
    What changes were proposed in this PR?
    Added the documentation for the CDC streamer tool, containing the configs
    along with images and example commands to try out.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    This closes #4243
    pratyakshsharma authored and akashrn5 committed Dec 28, 2021
    970f11d

Commits on Dec 29, 2021

  1. [CARBONDATA-4318] Improve load overwrite performance for partition tables

    Why is this PR needed?
    With the increase in the number of overwrite loads for a partition table,
    the time taken for each load keeps increasing over time. This is because:
    
    1. Whenever a load overwrite for a partition table is fired, it basically means
    that we need to overwrite or drop the partitions if anything overlaps with the
    current partitions getting loaded. Since carbondata stores the partition
    information in the segment files, to identify and drop partitions it
    reads all the previous segment files to identify and drop the overwritten
    partitions, which leads to a decrease in performance.
    
    2. After a partition load is completed, a cleanSegments method is called which
    again reads the segment file and table status file to identify Marked for Delete
    segments to clean. But since force clean is false and the timeout is
    more than a day by default, it's not necessary to call this method.
    Clean files should handle this part.
    
    What changes were proposed in this PR?
    1. We already have the information about the current partitions, so first
    identify if there are any partitions to overwrite; only if there are do we read segment
    files to call dropPartition, else we don't read the segment files unnecessarily.
    It also contains other refactoring to avoid reading the table status file.
    2. There is no need to call clean segments after every load. Clean files will take care
    of deleting the expired ones.
    
    This closes #4242
    akashrn5 authored and kunal642 committed Dec 29, 2021
    308906e

Commits on Jan 13, 2022

  1. [CARBONDATA-4320] Fix clean files removing wrong delta files

    Why is this PR needed?
    In the case where there are multiple delete delta files in a partition
    of a partition table, some delta files were being missed by the check and deleted,
    thus changing the results during query.
    
    What changes were proposed in this PR?
    Fixed the logic which checks which delta file to delete. Now checking
    the deltaStartTime and comparing it with deltaEndTime so as to consider
    all the delta files during clean files.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes, one test case has been added.
    
    This closes #4246
    vikramahuja1001 authored and akashrn5 committed Jan 13, 2022
    05aff87

Commits on Feb 14, 2022

  1. [CARBONDATA-4322] Apply local sort task level property for insert

    Why is this PR needed?
    Currently, when carbon.partition.data.on.tasklevel is enabled with
    local sort, the number of tasks launched for a load is based on
    node locality. But for the insert command, the local sort task level
    property is not applied, causing the number of tasks
    launched to be based on the input files.
    
    What changes were proposed in this PR?
    Included changes to apply the carbon.partition.data.on.tasklevel property
    for the insert command as well (see below). Used DataLoadCoalescedRDD to coalesce
    the partitions and a DataLoadCoalescedUnwrapRDD to unwrap partitions
    from DataLoadPartitionWrap and iterate.
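
    For reference, this is a carbon system property; it can be set like this (a sketch
    assuming the usual CarbonProperties API):

        import org.apache.carbondata.core.util.CarbonProperties

        // Distribute partition data at task level during local-sort loads and inserts
        CarbonProperties.getInstance()
          .addProperty("carbon.partition.data.on.tasklevel", "true")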
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4248
    ShreelekhyaG authored and Indhumathi27 committed Feb 14, 2022
    59f23c0

Commits on Mar 4, 2022

  1. [CARBONDATA-4325] Update Data frame supported options in document and…

    … fix partition table creation with df spatial property
    
    Why is this PR needed?
    1. Only specific properties are supported via dataframe options. The documentation needs to be updated.
    2. Create partition table fails with the spatial index property for a carbon table created with a dataframe in spark-shell.
    
    What changes were proposed in this PR?
    1. Added the dataframe-supported properties to the documentation.
    2. Using spark-shell, the table gets created with a carbon session, and catalogTable.properties
    is empty here. Now getting the properties from catalogTable.storage.properties to access the properties that were set.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No, tested in cluster.
    
    This closes #4250
    ShreelekhyaG authored and Indhumathi27 committed Mar 4, 2022
    c840b5f
  2. [CARBONDATA-4326] MV not hitting with multiple sessions issue fix

    Why is this PR needed?
    An MV created in beeline is not hit in sql/shell, and vice versa, if both
    beeline and sql/shell are running in parallel. Currently, if the view
    catalog for a particular session is already initialized, the schemas
    are not reloaded each time. So when an mv is created in another session
    and queried from the currently open session, the mv is not hit.
    
    What changes were proposed in this PR?
    1. Reload the mv catalog every time to getSchemas from the path. Register the
    schema if not present in the catalog and deregister the schema if it's dropped.
    2. When create SI is triggered, there is no need to try rewriting the plan and
    check for mv schemas. So, returning the plan if DeserializeToObject is present.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No, tested in cluster
    
    This closes #4251
    ShreelekhyaG authored and Indhumathi27 committed Mar 4, 2022
    19343a7

Commits on Mar 7, 2022

  1. 9b74951
  2. e25d5b6
  3. [CARBONDATA-4306] Fix Query Performance issue for Spark 3.1

    Why is this PR needed?
    Some non-partition filters, which cannot be handled by carbon, are not pushed down to spark.
    
    What changes were proposed in this PR?
    If the partition filter set is non-empty and the filter column is not a partition column, then push the filter down to spark.
    
    This closes #4252
    Indhumathi27 authored and kunal642 committed Mar 7, 2022
    a838531

Commits on Mar 18, 2022

  1. [CARBONDATA-4327] Update documentation related to partition

    Why is this PR needed?
    Drop partition with data is not supported and a few of the links are not working.
    
    What changes were proposed in this PR?
    Removed unsupported syntax and duplicate headings, and updated the header with proper links.
    
    This closes #4254
    ShreelekhyaG authored and kunal642 committed Mar 18, 2022
    41831ce

Commits on Mar 29, 2022

  1. [CARBONDATA-4328] Load parquet table with options error message fix

    Why is this PR needed?
    If a parquet table is created and a load statement with options is
    triggered, then it fails with NoSuchTableException:
    Table ${tableIdentifier.table} does not exist.
    
    What changes were proposed in this PR?
    As parquet table load is not handled, added a check to filter out
    non-carbon tables in the parser, so that the spark parser can handle the statement.
    
    This closes #4253
    ShreelekhyaG authored and Indhumathi27 committed Mar 29, 2022
    d6ce946

Commits on Apr 1, 2022

  1. [CARBONDATA-4329] Fix multiple issues with External table

    Why is this PR needed?
    Issue 1:
    When we create an external table on a transactional table location,
    a schema file is already present. While creating the external table,
    which is also transactional, the schema file is overwritten.
    
    Issue 2:
    If an external table is created on a location where the source table
    already exists, dropping the external table deletes the table data,
    and queries on the source table fail.
    
    What changes were proposed in this PR?
    Avoid writing the schema file if the table type is external and transactional.
    Don't drop the external table location data if table_type is external.
    
    This closes #4255
    Indhumathi27 authored and kunal642 committed Apr 1, 2022
    46b62cf

Commits on Apr 28, 2022

  1. [CARBONDATA-4330] Incremental Dataload of Average aggregate in MV

    Why is this PR needed?
    Currently, whenever an MV is created with an average aggregate, a full
    refresh is done, meaning the whole MV is reloaded for any newly
    added segments. This slows down the loading. With incremental
    data load, only the newly added segments need to be loaded to the MV.
    
    What changes were proposed in this PR?
    If avg is present, rewrite the query with the sum and count of the
    columns to create the MV, and use them to derive avg (sketched below).
    Refer: https://docs.google.com/document/d/1kPEMCX50FLZcmyzm6kcIQtUH9KXWDIqh-Hco7NkTp80/edit
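
    Conceptually, the rewrite looks like this (a sketch, not the exact internal plan;
    table and column names are hypothetical):

        // user-facing MV definition with avg
        spark.sql("CREATE MATERIALIZED VIEW mv1 AS SELECT a, avg(b) FROM t GROUP BY a")
        // internally maintained incrementally as sum/count per group, roughly:
        //   SELECT a, sum(b), count(b) FROM t GROUP BY a
        // avg(b) is then derived at query time as sum(b) / count(b), so newly added
        // segments only contribute partial sums and counts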
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4257
    ShreelekhyaG authored and Indhumathi27 committed Apr 28, 2022
    45acd67

Commits on May 27, 2022

  1. [CARBONDATA-4336] Table Status Versioning

    Why is this PR needed?
    Currently, carbondata stores the records of a transaction (load/insert/IUD/add/drop segment)
    in a metadata file named `tablestatus`, which is present in the Metadata directory.
    If the tablestatus file is lost, then the metadata for the transactions cannot be recovered
    directly, as there is no previous version file available for tablestatus. Hence, if we support
    versioning for tablestatus files, it becomes easy to recover the current version tablestatus
    meta from previous version tablestatus files.
    
    Please refer to Table Status Versioning & Recovery Tool for more info.
    
    What changes were proposed in this PR?
    -> On each transaction commit, commit the latest load metadata details to a new version file (see the sketch below)
    -> Update the latest tablestatus version timestamp in the table properties [CarbonTable cache] and in the hive metastore
    -> Added a table status version tool which can recover the latest transaction details based on the old version files
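
    A minimal sketch of the versioning idea (the naming scheme below is hypothetical,
    chosen only to illustrate one-file-per-commit versioning):

        // Each transaction commit writes a fresh version file instead of
        // overwriting the single `tablestatus` file, so older versions survive.
        def versionedTableStatusPath(metadataDir: String, commitTs: Long): String =
          s"$metadataDir/tablestatus_$commitTs"

        // e.g. versionedTableStatusPath("/store/db/t/Metadata", 1653640000000L)
        //  =>  "/store/db/t/Metadata/tablestatus_1653640000000"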
    
    Does this PR introduce any user interface change?
    Yes
    
    Is any new testcase added?
    Yes
    
    This closes #4261
    Indhumathi27 authored and akashrn5 committed May 27, 2022
    57e76ee

Commits on Jun 2, 2022

  1. [CARBONDATA-4335] Disable MV by default

    Why is this PR needed?
    Currently materialized view (mv) is enabled by default. In concurrent scenarios
    with mv enabled by default, each session goes through the list of databases
    even though mv is not used. Due to this, query time increased.
    
    What changes were proposed in this PR?
    Disable mv by default, as users use mv rarely. If required, the user can enable
    and use it (see below).
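
    Assuming the switch is exposed as a carbon property (the property key below is an
    assumption; check CarbonCommonConstants for the exact name):

        import org.apache.carbondata.core.util.CarbonProperties

        // Opt back in to MV query rewrite for deployments that need it
        CarbonProperties.getInstance().addProperty("carbon.enable.mv", "true")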
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4264
    maheshrajus authored and Indhumathi27 committed Jun 2, 2022
    33408be

Commits on Jun 22, 2022

  1. [CARBONDATA-4341] Drop Index Fails after TABLE RENAME

    Why is this PR needed?
    Drop Index Fails after TABLE RENAME
    
    What changes were proposed in this PR?
    After a table rename, the SI tables' property parentTableName is updated
    with the latest name, and the index metadata gets updated. The table is dropped
    from the metadata cache so that it is reloaded with the updated
    property when fetched next time.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4279
    ShreelekhyaG authored and Indhumathi27 committed Jun 22, 2022
    4b8846d

Commits on Jun 23, 2022

  1. [CARBONDATA-4339] Fix NullPointerException in load overwrite on partit…

    …ion table
    
    Why is this PR needed?
    After delete segment and clean files with the force option true, the load overwrite
    operation throws a null pointer exception. This is because when clean files with
    force is done, all remaining marked-for-delete segments except the 0th and last
    segments are moved to the tablestatus.history file, irrespective of the status of
    the 0th and last segments. During an overwrite load, the overwritten partition
    is dropped. Since all the segments were physically deleted by clean
    files, and the load model's load metadata details list still contains the 0th segment,
    which is marked for delete, the operation fails.
    
    What changes were proposed in this PR?
    When the valid segments are collected, filter on the segment's status to avoid the failure.
    
    This closes #4280
    akashrn5 authored and Indhumathi27 committed Jun 23, 2022
    93b0af2

Commits on Jun 27, 2022

  1. [CARBONDATA-4344] Create MV fails with "LOCAL_DICTIONARY_INCLUDE/LOCA…

    …L_DICTIONARY_EXCLUDE column: does not exist in table. Please check the DDL" error
    
    Why is this PR needed?
    Create MV fails with a "LOCAL_DICTIONARY_INCLUDE/LOCAL_DICTIONARY_EXCLUDE column: does not exist in table.
    Please check the DDL" error.
    The error occurs only in this scenario: Create Table --> Load --> Alter Add Columns --> Drop table --> Refresh Table --> Create MV
    and not in a direct scenario like: Create Table --> Load --> Alter Add Columns --> Create MV
    
    What changes were proposed in this PR?
    1. After the add column command, the LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE properties
       are added to the table even if the column lists are empty. So, when an MV is created next, since a
       LOCAL_DICTIONARY_EXCLUDE column is defined it tries to access its columns and fails.
       --> Added an empty check before adding the properties to the table to resolve this.
    2. In the direct scenario after add column, the schema gets updated in the catalog table but
       the table properties are not updated. Made changes to update the table properties in the catalog table.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4282
    ShreelekhyaG authored and Indhumathi27 committed Jun 27, 2022
    858afc7

Commits on Jul 1, 2022

  1. [CARBONDATA-4345] update/delete operations failed when other format s…

    …egment deleted from carbon table
    
    Why is this PR needed?
    Update/delete operations failed when other format segments were deleted from a carbon table.
    Steps to reproduce:
    1. create a carbon table and load the data
    2. create parquet/orc tables and load the data
    3. add parquet/orc format segments to the carbon table with the alter add segment command
    4. perform update/delete operations on the carbon table; they will fail as the table
       contains mixed format segments. This is expected behaviour.
    5. delete the other format segments which were added in step 3
    6. try to perform update/delete operations on the carbon table; they should not fail
    
    For update/delete operations, we check whether other format segments are present
    in the table path. If found, carbondata throws an exception saying mixed
    format segments exist, even though the other format segments were deleted from the table.
    
    What changes were proposed in this PR?
    When checking whether other format segments are present in the carbon table, only
    SUCCESS/PARTIAL_SUCCESS segments should be considered.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4285
    maheshrajus authored and Indhumathi27 committed Jul 1, 2022
    b8511b6

Commits on Jul 11, 2022

  1. [CARBONDATA-4342] Fix Desc Columns shows New Column added, even thoug…

    …h Alter ADD column query failed
    
    Why is this PR needed?
    1. When the spark.carbon.hive.schema.store property is enabled, alter operations fail
    with a ClassCastException.
    2. When an alter add/drop/rename column operation failed due to the issue mentioned above,
    the revert schema operation was not reverting back to the old schema.
    
    What changes were proposed in this PR?
    1. Use org.apache.spark.sql.hive.CarbonSessionCatalogUtil#getClient to get the HiveClient,
    to avoid the ClassCastException.
    2. Revert the schema in the spark catalog table as well, in case of failure.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4277
    Indhumathi27 authored and akashrn5 committed Jul 11, 2022
    8691cb7

Commits on Jul 19, 2022

  1. [CARBONDATA-4338] Moving dropped partition data to trash

    Why is this PR needed?
    When a drop partition operation is performed, carbondata
    modifies only the table status file and does not delete the actual
    partition folder, which contains data and index files. To
    comply with hive behaviour, carbondata should also delete
    the dropped partition folder in storage [hdfs/obs/etc.].
    Before deleting, carbondata keeps a copy in the Trash folder.
    The user can restore it by checking the partition name and timestamp.
    
    What changes were proposed in this PR?
    Moved the dropped partition folder files to the trash folder.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    Yes
    
    This closes #4276
    maheshrajus authored and Indhumathi27 committed Jul 19, 2022
    04b1756

Commits on Apr 8, 2023

  1. b690cf2
  2. remove wechat info

    chenliang613 committed Apr 8, 2023
    8e9ffd5
  3. 8b8345a
  4. 2f0241d
  5. c780221
  6. Merge pull request #4300 from xubo245/issue-4298

     [ISSUE-4298] Fixed mailing list issue
    chenliang613 authored Apr 8, 2023
    4af8af4
  7. Merge pull request #4301 from xubo245/issue-4299

    [ISSUE-4299] Fixed compile issue with spark 2.3
    chenliang613 authored Apr 8, 2023
    92f4dff

Commits on Apr 9, 2023

  1. 3cc0367
  2. Merge pull request #4307 from xubo245/ISSUE-4305-magicNumber

    [ISSUE-4305] Optimize the magic number
    chenliang613 authored Apr 9, 2023
    cba9a8a
  3. f92ae07
  4. 9d43c78

Commits on Apr 10, 2023

  1. [ISSUE-4306] Fix the error of SDKS3SchemaReadExample (#4312)

    Fix the issue when reading schema from S3
    xubo245 authored Apr 10, 2023
    01dd526
  2. 44c2bca

Commits on Apr 13, 2023

  1. b941983

Commits on Apr 24, 2023

  1. [ISSUE-4305] Optimize the usage of static method (#4309)

    A static method shouldn't be called on an object; it should be called on the class.
    xubo245 authored Apr 24, 2023
    2439589

Commits on Jun 8, 2023

  1. f31edd6

Commits on Jun 26, 2023

  1. Add new example: Using CarbonData for visualization in a notebook (#4318)

    * Add new example: Using CarbonData for visualization in a notebook
    
    * Update the example: Using CarbonData in a notebook
    xubo245 authored Jun 26, 2023
    208afe9
  2. [ISSUE-4305] Optimize the constants and variable style (#4311)

    CONSTANTS should be like: UPPER_NAME;
    variables should be lowerCamelCase.
    xubo245 authored Jun 26, 2023
    6e031c0

Commits on Jul 8, 2023

  1. 8264b3b

Commits on Aug 20, 2023

  1. Create maven.yml

    chenliang613 authored Aug 20, 2023
    cd180c9
  2. Update maven.yml

    chenliang613 authored Aug 20, 2023
    95b50e8

Commits on Oct 1, 2023

  1. optimize code smells in presto module (#4332)

    optimize equals
    xubo245 authored Oct 1, 2023
    13a2c97
  2. [ISSUE-4329] optimize some code smells in presto module (#4330)

    A static method shouldn't be called on an object; it should be called on the class.
    xubo245 authored Oct 1, 2023
    beb426c
  3. Update maven.yml

    chenliang613 authored Oct 1, 2023
    9f604fc
  4. Update maven.yml

    chenliang613 authored Oct 1, 2023
    95a6407
  5. Update maven.yml

    chenliang613 authored Oct 1, 2023
    4462461
  6. Update maven.yml

    chenliang613 authored Oct 1, 2023
    af9c6c3
  7. d499699
  8. fix ci issues

    chenliang613 committed Oct 1, 2023
    bcb30a5

Commits on Oct 10, 2023

  1. Create maven.yml

    chenliang613 authored Oct 10, 2023
    30c1aa8
  2. Update maven.yml

    chenliang613 authored Oct 10, 2023
    84cfd20
  3. Update pom.xml

    chenliang613 authored Oct 10, 2023
    504a5ae
  4. Update maven.yml

    chenliang613 authored Oct 10, 2023
    66cb3a3
  5. Update maven.yml

    chenliang613 authored Oct 10, 2023
    Commit: 13ac2ef
  6. Update pom.xml

    chenliang613 authored Oct 10, 2023
    Commit: 0e0523e
  7. Update maven.yml

    chenliang613 authored Oct 10, 2023
    Commit: 0cae3d1
  8. Update maven.yml

    chenliang613 authored Oct 10, 2023
    Commit: fd66031
  9. Update maven.yml

    chenliang613 authored Oct 10, 2023
    Commit: 38fdb16

Commits on Oct 17, 2023

  1. fix

    chenliang613 committed Oct 17, 2023
    Commit: ebe4101
  2. Optimize code smell in the presto module (#4331)

    add override annotations (see the sketch below)
    xubo245 authored Oct 17, 2023
    Commit: 4618808
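
    "Add override" in Java presumably means annotating overriding methods
    with @Override so the compiler verifies them; a minimal sketch (the
    types are hypothetical):

        class Base {
          String name() { return "base"; }
        }

        class Derived extends Base {
          @Override  // compiler now checks this really overrides Base.name()
          String name() { return "derived"; }
        }

        public class OverrideExample {
          public static void main(String[] args) {
            System.out.println(new Derived().name());  // prints "derived"
          }
        }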

Commits on Oct 19, 2023

  1. Commit 448564a
  2. Commit f18846c
  3. Commit 39dd8ce

Commits on Nov 5, 2023

  1. Update pom.xml

    chenliang613 authored Nov 5, 2023
    Commit: dd74408
  2. line 117 (#4237)

    #117: the longitude has six decimal places and the latitude has five digits. Why are they the same length after conversion?
    WANGSHIHUAI authored Nov 5, 2023
    Commit: 1e327f2

Commits on Nov 6, 2023

  1. fix testcase error (#4337)

    Co-authored-by: QiangCai <[email protected]>
    QiangCai and QiangCai authored Nov 6, 2023
    Commit: 57de4a3

Commits on Nov 8, 2023

  1. [ISSUE-4338] Fix checkstyle issue in sdk module (#4339)

    Co-authored-by: QiangCai <[email protected]>
    QiangCai and QiangCai authored Nov 8, 2023
    Commit: 4a1b36f

Commits on Nov 11, 2023

  1. [CARBONDATA-4333][Doc] Update the declaration of supported String data types (#4263)
    
    Why is this PR needed?
    CHAR and VARCHAR are no longer supported as String data types in Carbon, so they should be removed from the documentation's description.
    
    What changes were proposed in this PR?
    CHAR and VARCHAR no longer appear as String data types in the documentation.
    
    Does this PR introduce any user interface change?
    No
    
    Is any new testcase added?
    No
    
    Co-authored-by: tangchuan <[email protected]>
    tangchuan92 and tangchuan authored Nov 11, 2023
    Commit: 7abc7cd
  2. Commit 53d3370
  3. Update pom.xml

    chenliang613 authored Nov 11, 2023
    Commit: 64ecd77
  4. Update pom.xml

    chenliang613 authored Nov 11, 2023
    Commit: 48f5976

Commits on Nov 19, 2023

  1. [ISSUE-4342] Fix test case errors (#4343)

    Co-authored-by: QiangCai <[email protected]>
    QiangCai and QiangCai authored Nov 19, 2023
    Commit: 7195869
  2. Commit d326118
  3. Commit a6e9e37

Commits on Dec 2, 2023

  1. Commit bcc7137

Commits on Dec 9, 2023

  1. Bump pyarrow from 0.11.1 to 14.0.1 in /python (#4341)

    Bumps [pyarrow](https://github.com/apache/arrow) from 0.11.1 to 14.0.1.
    - [Commits](apache/arrow@apache-arrow-0.11.1...go/v14.0.1)
    
    ---
    updated-dependencies:
    - dependency-name: pyarrow
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Dec 9, 2023
    Commit: dee20b8

Commits on Mar 16, 2024

  1. Commit c1b0e9c

Commits on Mar 22, 2024

  1. Minor refactor of the build/README.md (#4349)

    * Minor refactor of the build docs
    
    * Fix review comments
    
    * Update build/README.md
    git-hulk authored Mar 22, 2024
    Commit: 74e6e93

Commits on Apr 6, 2024

  1. Commit 71abab0

Commits on Jun 30, 2024

  1. modify thrift version (#4356)

    Co-authored-by: jacky <[email protected]>
    jackylk and jacky authored Jun 30, 2024
    Commit: 5ff36b6
  2. [CARBONDATA-4349] Upgrade thrift version (#4355)

    * upgrade thrift version
    
    * change to use 0.20.0
    
    ---------
    
    Co-authored-by: jacky <[email protected]>
    jackylk and jacky authored Jun 30, 2024
    Commit: f370d20

Commits on Jul 6, 2024

  1. Bump org.apache.commons:commons-compress in /integration/presto (#4345)

    Bumps org.apache.commons:commons-compress from 1.4.1 to 1.26.0.
    
    ---
    updated-dependencies:
    - dependency-name: org.apache.commons:commons-compress
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Jul 6, 2024
    Commit: 8f9ce4e
  2. Commit 2c78847
  3. Commit 29607c3

Commits on Oct 5, 2024

  1. [ISSUE-4351] Add GitHub Action for building (#4358)

    * add GitHub Action for building
    
    * Revert "[WIP] Optimize geo module, the feature seems less be used (#4353)"
    
    This reverts commit 29607c3.
    
    * Revert "[WIP] Optimize geo module, the feature seems less be used"
    
    This reverts commit 71abab0.
    
    * cache thrift
    kevinjmh authored Oct 5, 2024
    Commit: e0ac69a