This repository has been archived by the owner on Dec 15, 2022. It is now read-only.
When I use a CSV file as input for a unit test and the file contains two columns (for example "id" and "a"), but I use only one of them in the mapping (for example "a") and choose the other ("id") for sorting, an exception occurs:
2019/02/28 15:07:40 - Spoon - Caused by: org.pentaho.di.core.exception.KettleException:
2019/02/28 15:07:40 - Spoon - Unable to get all rows for database data set 'addnumbers as text'
2019/02/28 15:07:40 - Spoon - -1
2019/02/28 15:07:40 - Spoon -
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.dataset.DataSetCsvGroup.getAllRows(DataSetCsvGroup.java:226)
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.dataset.DataSetGroup.getAllRows(DataSetGroup.java:133)
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.dataset.DataSet.getAllRows(DataSet.java:140)
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.dataset.spoon.xtpoint.InjectDataSetIntoTransExtensionPoint.injectDataSetIntoStep(InjectDataSetIntoTransExtensionPoint.java:198)
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.dataset.spoon.xtpoint.InjectDataSetIntoTransExtensionPoint.callExtensionPoint(InjectDataSetIntoTransExtensionPoint.java:126)
2019/02/28 15:07:40 - Spoon - ... 8 more
2019/02/28 15:07:40 - Spoon - Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.core.row.RowMeta.compare(RowMeta.java:915)
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.dataset.DataSetCsvGroup$1.compare(DataSetCsvGroup.java:214)
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.dataset.DataSetCsvGroup$1.compare(DataSetCsvGroup.java:211)
2019/02/28 15:07:40 - Spoon - at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
2019/02/28 15:07:40 - Spoon - at java.util.TimSort.sort(TimSort.java:220)
2019/02/28 15:07:40 - Spoon - at java.util.Arrays.sort(Arrays.java:1512)
2019/02/28 15:07:40 - Spoon - at java.util.ArrayList.sort(ArrayList.java:1462)
2019/02/28 15:07:40 - Spoon - at java.util.Collections.sort(Collections.java:175)
2019/02/28 15:07:40 - Spoon - at org.pentaho.di.dataset.DataSetCsvGroup.getAllRows(DataSetCsvGroup.java:211)
2019/02/28 15:07:40 - Spoon - ... 12 more
I debugged it, and I think this is the spot in the code:
(DataSetCsvGroup.java from line 200)
// Which fields are we sorting on (if any)
//
int[] sortIndexes = new int[ sortFields.size() ];
for ( int i = 0; i < sortIndexes.length; i++ ) {
  sortIndexes[ i ] = outputRowMeta.indexOfValue( sortFields.get( i ) );
}
if ( !sortFields.isEmpty() ) {
  // Sort the rows...
  //
  Collections.sort( rows, new Comparator<Object[]>() {
    @Override public int compare( Object[] o1, Object[] o2 ) {
      try {
        return outputRowMeta.compare( o1, o2, sortIndexes );
      } catch ( KettleValueException e ) {
        throw new RuntimeException( "Unable to compare 2 rows", e );
      }
    }
  } );
}
sortIndexes will not be empty, but sortIndexes[0] will be -1 because "id" is not in the mapping, and this causes an ArrayIndexOutOfBoundsException in outputRowMeta.compare.
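To illustrate the failure mode described above, here is a minimal sketch that does not use the Kettle classes at all. It assumes the field lookup behaves like `List.indexOf` and returns -1 for a field that is not mapped, as observed in the debugger. The class and method names (`SortIndexCheck`, `resolveSortIndexes`) are hypothetical and are not part of the plugin; the sketch only shows how an unmapped sort field could be detected before the comparator runs.

```java
import java.util.Arrays;
import java.util.List;

class SortIndexCheck {

    // Stand-in for a row-metadata lookup (an assumption, not the Kettle API):
    // returns the position of a field name in the mapped columns, or -1.
    static int indexOfValue(List<String> mappedFields, String name) {
        return mappedFields.indexOf(name);
    }

    // Resolves sort field names to column indexes and fails early with a
    // descriptive message instead of letting -1 reach the comparator.
    static int[] resolveSortIndexes(List<String> mappedFields, List<String> sortFields) {
        int[] sortIndexes = new int[sortFields.size()];
        for (int i = 0; i < sortIndexes.length; i++) {
            int index = indexOfValue(mappedFields, sortFields.get(i));
            if (index < 0) {
                throw new IllegalArgumentException(
                    "Sort field '" + sortFields.get(i)
                        + "' is not among the mapped fields " + mappedFields);
            }
            sortIndexes[i] = index;
        }
        return sortIndexes;
    }

    public static void main(String[] args) {
        List<String> mapped = Arrays.asList("a");
        // "id" is used for sorting but is not mapped, so the lookup yields -1.
        System.out.println(indexOfValue(mapped, "id")); // prints -1
        try {
            resolveSortIndexes(mapped, Arrays.asList("id"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this would only turn the ArrayIndexOutOfBoundsException into a clearer error; it would not make the use case work.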
You may ask why I want to sort the CSV file based on a field that is not in the mapping, but it seemed to
me a normal use case. For example, I wanted to test a transformation which adds two numbers together:
| id | a | b | c |
|----|---|---|---|
| 1  | 0 | 0 | 0 |
| 2  | 1 | 0 | 1 |
The input mapping would be the columns "a" and "b", sorted by "id".
The golden mapping would be the columns "a", "b" and "c" sorted by "id".
Thank you very much for the use case. It's true that I hadn't considered it yet.
I think we'll need to do something novel here, like adding the sort columns temporarily, sorting, and then removing them again, just to make sure the columns don't end up in the test transformation.
Cheers,
Matt
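The idea of keeping the sort columns only until sorting is done can be sketched as follows. This is a standalone illustration under stated assumptions, not the plugin's implementation: rows are plain `Object[]` arrays, all values are mutually `Comparable`, and the names `TemporarySortColumns`, `sortThenProject` and `indexesOf` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TemporarySortColumns {

    // Maps field names to their positions in the full CSV header.
    static int[] indexesOf(List<String> allFields, List<String> names) {
        int[] indexes = new int[names.size()];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = allFields.indexOf(names.get(i));
            if (indexes[i] < 0) {
                throw new IllegalArgumentException("Unknown field: " + names.get(i));
            }
        }
        return indexes;
    }

    // Sorts the full rows on sortFields (which need not be mapped), then
    // keeps only the mapped columns so the sort columns do not leak out.
    static List<Object[]> sortThenProject(List<String> allFields, List<Object[]> rows,
                                          List<String> mappedFields, List<String> sortFields) {
        int[] sortIndexes = indexesOf(allFields, sortFields);
        int[] keepIndexes = indexesOf(allFields, mappedFields);

        List<Object[]> sorted = new ArrayList<>(rows);
        sorted.sort((r1, r2) -> {
            for (int index : sortIndexes) {
                // Assumption: values in a sort column are mutually comparable.
                @SuppressWarnings("unchecked")
                int cmp = ((Comparable<Object>) r1[index]).compareTo(r2[index]);
                if (cmp != 0) {
                    return cmp;
                }
            }
            return 0;
        });

        List<Object[]> result = new ArrayList<>();
        for (Object[] row : sorted) {
            Object[] projected = new Object[keepIndexes.length];
            for (int i = 0; i < keepIndexes.length; i++) {
                projected[i] = row[keepIndexes[i]];
            }
            result.add(projected);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id", "a", "b", "c");
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {2, 1, 0, 1}); // deliberately out of order
        rows.add(new Object[] {1, 0, 0, 0});

        List<Object[]> input = sortThenProject(
            fields, rows, Arrays.asList("a", "b"), Arrays.asList("id"));
        for (Object[] row : input) {
            System.out.println(Arrays.toString(row)); // [0, 0] then [1, 0]
        }
    }
}
```

With the example data from the report, the rows come out ordered by "id" while only "a" and "b" are passed on, which matches the input mapping described in the use case.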
Dear Matt,
I put all the files to reproduce this here:
https://github.com/peterborkuti/pentaho-pdi-dataset-bug-01
Thank you for your wonderful plugin.
Péter