Commit

[Bug][Improve][LocalFileSink]Fix LocalFile Sink file_format_type. (#5118)
lightzhao authored Aug 2, 2023
1 parent d87f68b commit e717d80
Showing 24 changed files with 64 additions and 64 deletions.
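
The documentation hunks in this commit replace the option name `file_format` with `file_format_type` in the sink option tables and example configs. A minimal sketch of applying the same rename to an existing job config follows. It assumes the config uses `key = value` syntax; the helper name `rename_file_format_key` is hypothetical and is not part of SeaTunnel.

```python
import re

# Matches the old key only when it is used as an option name (followed by "=").
# The trailing \b keeps "file_format_type" from matching, because "_" is a
# word character and so there is no word boundary after "file_format" there.
_OLD_KEY = re.compile(r"\bfile_format\b(?=\s*=)")


def rename_file_format_key(config_text: str) -> str:
    """Return config_text with `file_format = ...` renamed to `file_format_type = ...`."""
    return _OLD_KEY.sub("file_format_type", config_text)


# Example sink block using the old key, taken from the LocalFile hunk.
old_block = '''LocalFile {
  path = "/tmp/hive/warehouse/test2"
  file_format = "orc"
}'''

new_block = rename_file_format_key(old_block)
print(new_block)
```

Running the helper a second time leaves the text unchanged, so it is safe to apply to configs that already use `file_format_type`. Configs that write the option with `:` instead of `=` are not covered by this sketch.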
18 changes: 9 additions & 9 deletions docs/en/connector-v2/sink/FtpFile.md
@@ -40,9 +40,9 @@ By default, we use 2PC commit to ensure `exactly-once`
| custom_filename | boolean | no | false | Whether you need custom the filename |
| file_name_expression | string | no | "${transactionId}" | Only used when custom_filename is true |
| filename_time_format | string | no | "yyyy.MM.dd" | Only used when custom_filename is true |
| file_format | string | no | "csv" | |
| field_delimiter | string | no | '\001' | Only used when file_format is text |
| row_delimiter | string | no | "\n" | Only used when file_format is text |
| file_format_type | string | no | "csv" | |
| field_delimiter | string | no | '\001' | Only used when file_format_type is text |
| row_delimiter | string | no | "\n" | Only used when file_format_type is text |
| have_partition | boolean | no | false | Whether you need processing partitions. |
| partition_by | array | no | - | Only used then have_partition is true |
| partition_dir_expression | string | no | "${k0}=${v0}/${k1}=${v1}/.../${kn}=${vn}/" | Only used then have_partition is true |
@@ -52,8 +52,8 @@ By default, we use 2PC commit to ensure `exactly-once`
| batch_size | int | no | 1000000 | |
| compress_codec | string | no | none | |
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |

### host [string]

@@ -103,13 +103,13 @@ When the format in the `file_name_expression` parameter is `xxxx-${now}` , `file
| m | Minute in hour |
| s | Second in minute |

### file_format [string]
### file_format_type [string]

We supported as the following file types:

`text` `json` `csv` `orc` `parquet` `excel`

Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`.
Please note that, The final file name will end with the file_format_type's suffix, the suffix of the text file is `txt`.

### field_delimiter [string]

@@ -198,7 +198,7 @@ FtpFile {
username = "username"
password = "password"
path = "/data/ftp"
file_format = "text"
file_format_type = "text"
field_delimiter = "\t"
row_delimiter = "\n"
sink_columns = ["name","age"]
@@ -216,7 +216,7 @@ FtpFile {
username = "username"
password = "password"
path = "/data/ftp"
file_format = "text"
file_format_type = "text"
field_delimiter = "\t"
row_delimiter = "\n"
have_partition = true
12 changes: 6 additions & 6 deletions docs/en/connector-v2/sink/HdfsFile.md
@@ -41,8 +41,8 @@ By default, we use 2PC commit to ensure `exactly-once`
| file_name_expression | string | no | "${transactionId}" | Only used when custom_filename is true |
| filename_time_format | string | no | "yyyy.MM.dd" | Only used when custom_filename is true |
| file_format_type | string | no | "csv" | |
| field_delimiter | string | no | '\001' | Only used when file_format is text |
| row_delimiter | string | no | "\n" | Only used when file_format is text |
| field_delimiter | string | no | '\001' | Only used when file_format_type is text |
| row_delimiter | string | no | "\n" | Only used when file_format_type is text |
| have_partition | boolean | no | false | Whether you need processing partitions. |
| partition_by | array | no | - | Only used then have_partition is true |
| partition_dir_expression | string | no | "${k0}=${v0}/${k1}=${v1}/.../${kn}=${vn}/" | Only used then have_partition is true |
@@ -55,8 +55,8 @@ By default, we use 2PC commit to ensure `exactly-once`
| kerberos_keytab_path | string | no | - | |
| compress_codec | string | no | none | |
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |

### fs.defaultFS [string]

@@ -104,7 +104,7 @@ We supported as the following file types:

`text` `json` `csv` `orc` `parquet` `excel`

Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`.
Please note that, The final file name will end with the file_format_type's suffix, the suffix of the text file is `txt`.

### field_delimiter [string]

@@ -198,7 +198,7 @@ For orc file format simple config
HdfsFile {
fs.defaultFS = "hdfs://hadoopcluster"
path = "/tmp/hive/warehouse/test2"
file_format = "orc"
file_format_type = "orc"
}

```
24 changes: 12 additions & 12 deletions docs/en/connector-v2/sink/LocalFile.md
@@ -20,7 +20,7 @@ If you use SeaTunnel Engine, It automatically integrated the hadoop jar when you

By default, we use 2PC commit to ensure `exactly-once`

- [x] file format
- [x] file format type
- [x] text
- [x] csv
- [x] parquet
@@ -36,9 +36,9 @@ By default, we use 2PC commit to ensure `exactly-once`
| custom_filename | boolean | no | false | Whether you need custom the filename |
| file_name_expression | string | no | "${transactionId}" | Only used when custom_filename is true |
| filename_time_format | string | no | "yyyy.MM.dd" | Only used when custom_filename is true |
| file_format | string | no | "csv" | |
| field_delimiter | string | no | '\001' | Only used when file_format is text |
| row_delimiter | string | no | "\n" | Only used when file_format is text |
| file_format_type | string | no | "csv" | |
| field_delimiter | string | no | '\001' | Only used when file_format_type is text |
| row_delimiter | string | no | "\n" | Only used when file_format_type is text |
| have_partition | boolean | no | false | Whether you need processing partitions. |
| partition_by | array | no | - | Only used then have_partition is true |
| partition_dir_expression | string | no | "${k0}=${v0}/${k1}=${v1}/.../${kn}=${vn}/" | Only used then have_partition is true |
@@ -48,8 +48,8 @@ By default, we use 2PC commit to ensure `exactly-once`
| batch_size | int | no | 1000000 | |
| compress_codec | string | no | none | |
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |

### path [string]

@@ -83,13 +83,13 @@ When the format in the `file_name_expression` parameter is `xxxx-${now}` , `file
| m | Minute in hour |
| s | Second in minute |

### file_format [string]
### file_format_type [string]

We supported as the following file types:

`text` `json` `csv` `orc` `parquet` `excel`

Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`.
Please note that, The final file name will end with the file_format_type's suffix, the suffix of the text file is `txt`.

### field_delimiter [string]

@@ -174,7 +174,7 @@ For orc file format simple config

LocalFile {
path = "/tmp/hive/warehouse/test2"
file_format = "orc"
file_format_type = "orc"
}

```
@@ -185,7 +185,7 @@ For parquet file format with `sink_columns`

LocalFile {
path = "/tmp/hive/warehouse/test2"
file_format = "parquet"
file_format_type = "parquet"
sink_columns = ["name","age"]
}

@@ -197,7 +197,7 @@ For text file format with `have_partition` and `custom_filename` and `sink_colum

LocalFile {
path = "/tmp/hive/warehouse/test2"
file_format = "text"
file_format_type = "text"
field_delimiter = "\t"
row_delimiter = "\n"
have_partition = true
@@ -224,7 +224,7 @@ LocalFile {
partition_dir_expression="${k0}=${v0}"
is_partition_field_write_in_file=true
file_name_expression="${transactionId}_${now}"
file_format="excel"
file_format_type="excel"
filename_time_format="yyyy.MM.dd"
is_enable_transaction=true
}
10 changes: 5 additions & 5 deletions docs/en/connector-v2/sink/OssFile.md
@@ -44,8 +44,8 @@ By default, we use 2PC commit to ensure `exactly-once`
| file_name_expression | string | no | "${transactionId}" | Only used when custom_filename is true |
| filename_time_format | string | no | "yyyy.MM.dd" | Only used when custom_filename is true |
| file_format_type | string | no | "csv" | |
| field_delimiter | string | no | '\001' | Only used when file_format is text |
| row_delimiter | string | no | "\n" | Only used when file_format is text |
| field_delimiter | string | no | '\001' | Only used when file_format_type is text |
| row_delimiter | string | no | "\n" | Only used when file_format_type is text |
| have_partition | boolean | no | false | Whether you need processing partitions. |
| partition_by | array | no | - | Only used then have_partition is true |
| partition_dir_expression | string | no | "${k0}=${v0}/${k1}=${v1}/.../${kn}=${vn}/" | Only used then have_partition is true |
@@ -55,8 +55,8 @@ By default, we use 2PC commit to ensure `exactly-once`
| batch_size | int | no | 1000000 | |
| compress_codec | string | no | none | |
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |

### path [string]

@@ -112,7 +112,7 @@ We supported as the following file types:

`text` `json` `csv` `orc` `parquet` `excel`

Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`.
Please note that, The final file name will end with the file_format_type's suffix, the suffix of the text file is `txt`.

### field_delimiter [string]

10 changes: 5 additions & 5 deletions docs/en/connector-v2/sink/OssJindoFile.md
@@ -44,8 +44,8 @@ By default, we use 2PC commit to ensure `exactly-once`
| file_name_expression | string | no | "${transactionId}" | Only used when custom_filename is true |
| filename_time_format | string | no | "yyyy.MM.dd" | Only used when custom_filename is true |
| file_format_type | string | no | "csv" | |
| field_delimiter | string | no | '\001' | Only used when file_format is text |
| row_delimiter | string | no | "\n" | Only used when file_format is text |
| field_delimiter | string | no | '\001' | Only used when file_format_type is text |
| row_delimiter | string | no | "\n" | Only used when file_format_type is text |
| have_partition | boolean | no | false | Whether you need processing partitions. |
| partition_by | array | no | - | Only used then have_partition is true |
| partition_dir_expression | string | no | "${k0}=${v0}/${k1}=${v1}/.../${kn}=${vn}/" | Only used then have_partition is true |
@@ -55,8 +55,8 @@ By default, we use 2PC commit to ensure `exactly-once`
| batch_size | int | no | 1000000 | |
| compress_codec | string | no | none | |
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |

### path [string]

@@ -112,7 +112,7 @@ We supported as the following file types:

`text` `json` `csv` `orc` `parquet` `excel`

Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`.
Please note that, The final file name will end with the file_format_type's suffix, the suffix of the text file is `txt`.

### field_delimiter [string]

2 changes: 1 addition & 1 deletion docs/en/connector-v2/sink/S3-Redshift.md
@@ -124,7 +124,7 @@ We supported as the following file types:

`text` `csv` `parquet` `orc` `json`

Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`.
Please note that, The final file name will end with the file_format_type's suffix, the suffix of the text file is `txt`.

### filename_time_format [string]
