Configurable parameters to improve performance while reading page blobs #378

Open
majumd opened this issue Jan 6, 2021 · 4 comments

@majumd
majumd commented Jan 6, 2021

Hi,
I would like to know how to improve performance when reading from a page blob. Are there any configurable parameters, such as the number of threads or the buffer size, that could be used to improve performance?
An enhancement that makes these performance factors configurable, so they can be tuned for a given environment, would be helpful.
Thanks
Udayan

@Jinming-Hu
Member

Hi @majumd, every blob API accepts a blob_request_options as a parameter. blob_request_options has a member function set_parallelism_factor, with which you can set the maximum number of threads performing the download operation.

@Jinming-Hu
Member

Jinming-Hu commented Jan 8, 2021

You also mentioned buffer size. There are actually multiple data copies during the download process. For example, if you download a 100MB blob, the 100MB of data will be copied 2 or 3 times (I cannot remember exactly). Is this also something you want to optimize?

@majumd
Author

majumd commented Jan 9, 2021

> Hi @majumd, every blob API accepts a blob_request_options as a parameter. blob_request_options has a member function set_parallelism_factor, with which you can set the maximum number of threads performing the download operation.

Thanks for the response. I can see that the default value of the member variable m_parallelism_factor is 1. Could you please explain how it can be used to improve data read performance from Azure Cloud?

Suppose we would like to read 40MB of data. Could the value be set to 10 using the function set_parallelism_factor?
Does that mean a 40MB read would ideally take about as long as a 4MB read, since 10 parallel requests would be made to Azure, each requesting 4MB of data as per m_stream_read_size?

@Jinming-Hu
Member

> Suppose we would like to read 40MB of data. Could the value be set to 10 using the function set_parallelism_factor?
> Does that mean a 40MB read would ideally take about as long as a 4MB read, since 10 parallel requests would be made to Azure, each requesting 4MB of data as per m_stream_read_size?

Yes
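
To make the arithmetic in the question concrete, here is a minimal, self-contained sketch of how a 40MB read might break down into 4MB ranges and how many sequential rounds of requests are needed at a given parallelism. It does not call the storage library: `ByteRange`, `split_into_ranges`, and `rounds_needed` are hypothetical names used only for illustration, and the library's actual scheduling may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a byte range [offset, offset + length) in the blob.
struct ByteRange {
    std::uint64_t offset;
    std::uint64_t length;
};

// Split total_size bytes into consecutive ranges of at most chunk_size bytes.
// The last range is shorter when total_size is not a multiple of chunk_size.
std::vector<ByteRange> split_into_ranges(std::uint64_t total_size,
                                         std::uint64_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<ByteRange> ranges;
    for (std::uint64_t offset = 0; offset < total_size; offset += chunk_size) {
        ranges.push_back({offset, std::min(chunk_size, total_size - offset)});
    }
    return ranges;
}

// Number of sequential rounds needed when at most `parallelism` ranges
// are downloaded at the same time (ceiling division).
std::uint64_t rounds_needed(std::uint64_t range_count,
                            std::uint64_t parallelism) {
    assert(parallelism > 0);
    return (range_count + parallelism - 1) / parallelism;
}
```

With a 40MB total and a 4MB chunk size, `split_into_ranges` yields 10 ranges; `rounds_needed` gives 1 round at a parallelism of 10 and 10 rounds at the default of 1. The "same time as 4MB" estimate is the ideal case: it assumes that network bandwidth and service-side limits are not the bottleneck.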
