Replies: 2 comments
-
@YesOrNo828 maybe we can take a look at this topic.
-
@nicochen Thanks for your input. This topic is interesting, and I have some questions:
-
Here is a case where data throughput differs significantly between day and night. A Flink job's sink parallelism is immutable once the job has started. Taking an un-keyed table as an example, when the 'none' distribution policy is selected, all parallel writers are utilized, but this also creates the largest number of files, even when only a small amount of data arrives.
Therefore, I'd like to propose a new distribution policy that dynamically allocates writing parallelism during streaming writes according to the volume of incoming data. In this way, we can not only relieve the pressure that small files put on HDFS, but also increase the efficiency of reading and of optimization work.
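To make the idea concrete, the core of such a policy could be a function that maps observed throughput to a suggested number of active writers, clamped to the job's configured parallelism. This is only a minimal sketch of the sizing logic, not Flink or Iceberg API; the class name, the `TARGET_RECORDS_PER_TASK` tuning knob, and `suggestedParallelism` are all hypothetical names introduced here for illustration.

```java
// Hypothetical sketch: derive write parallelism from observed data volume.
// None of these names come from Flink/Iceberg; they illustrate the proposal only.
public class DynamicWriteParallelism {

    // Assumed tuning knob: roughly how many records one writer subtask
    // should handle per measurement interval before we add another writer.
    static final long TARGET_RECORDS_PER_TASK = 100_000L;

    /**
     * Suggest how many writer subtasks to keep active, given the number of
     * records observed in the last interval. The result is clamped between 1
     * (never fewer than one writer) and the job's immutable sink parallelism
     * (we can only deactivate writers, never exceed the configured maximum).
     */
    static int suggestedParallelism(long recordsPerInterval, int maxParallelism) {
        // Ceiling division: how many writers are needed at the target rate.
        long needed = (recordsPerInterval + TARGET_RECORDS_PER_TASK - 1)
                / TARGET_RECORDS_PER_TASK;
        return (int) Math.max(1, Math.min(maxParallelism, needed));
    }
}
```

At night, when `recordsPerInterval` is small, most records would be routed to a single writer, producing one file per checkpoint instead of one per subtask; during the day the policy scales back up toward the full configured parallelism.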