Enhancing bdevfilter for replication product #96

Open
aadeodhar opened this issue Dec 12, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@aadeodhar

Description

As part of the Microsoft Azure Site Recovery product, we have a mechanism to filter I/Os in the kernel for replication. We are facing similar challenges that you pointed out - where if the upstream block layer kernel code is changed, it affects our kernel driver.
We would like to explore the feasibility of using bdevfilter as a mechanism to filter I/Os for our product as well. We have a few questions about how chained BIOs are handled in the proposed bdevfilter and blksnap patches. How are BIO splits handled? What happens if part of a chained BIO fails to be written? In case of errors, what is the impact on backup and replication?

It would be great to have a discussion on these and some other topics.

Usage tips

  • Please use the 👍 reaction to show that you are interested in this.
  • Please don't comment if you have no relevant information to add. It's just extra noise for everyone subscribed to this.
  • Subscribe to receive notifications on status changes and new comments; you can do so without adding a comment.
@aadeodhar aadeodhar added the enhancement New feature or request label Dec 12, 2024
@Fantu
Contributor

Fantu commented Dec 12, 2024

Having more developers collaborate, and having more software use blksnap, would improve the results and the chances of upstream integration. Upstream integration is essential so that the kernel does not remove other important APIs because no in-kernel module uses them. It brings many other advantages too: the module would be available in every kernel version from the release onward, without recompilation, and usable even by companies that cannot use external kernel modules.

@SergeiShtepa is the maintainer of blksnap, and Christoph Hellwig is an experienced upstream maintainer who helped a lot to improve this project. I think they are the two people with the most knowledge.

Here you can find more information and the link to all patches series proposed.

An important note in case you were thinking of using only the blkfilter part: if I remember correctly, blkfilter was initially made to support other modules in addition to blksnap, but among the changes requested by the kernel maintainers it was modified to support only blksnap.
It seems to me that they would be against a possible integration in the kernel of 2 modules or more modules that would do more or less the same thing.

@aadeodhar
Author

"It seems to me that they would be against a possible integration in the kernel of 2 modules or more modules that would do more or less the same thing."
Yes, that's the reason we want to explore bdevfilter as a generic block I/O filtering mechanism, if it can handle our use cases as well.

@SergeiShtepa
Collaborator

Hi @aadeodhar !
Thank you @Fantu !

I would be very happy to collaborate.
The GPL-2 license allows me to share any information about the blksnap project.
The "Block Device Filtering Mechanism" (blkfilter) can be used for replication, as well as for other tasks.
I use the name bdevfilter for the out-of-tree version, which is implemented as a separate kernel module.

It is obvious to me that if the blkfilter code turns out to be in the upstream, it will immediately be in demand.
However, when one person is developing for a specific product (as in my case), the code does not become universal.
I would like the Linux kernel code to be equally good for all users.

I am sure we could continue this work together in any format convenient for us.
If you are interested in this, then write to me by email [email protected].

Regarding the questions, I will try to answer:

"How are BIO splits handled? What happens if a part of the chain BIO fails to be written?"

I think that for the blksnap module, splitting BIO requests is not a problem. To create a snapshot image of a block device, it is enough to assume that some range could have changed and copy the data in a timely manner before it is overwritten. If the write was performed only partially or not at all, this does not affect the algorithm for generating the snapshot image of the block device.
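
The idea can be illustrated with a small userspace model. This is only a sketch in Python, not blksnap code: the class `SnapshotModel`, its methods, and the `fail_after` parameter are hypothetical names invented for the illustration. It shows that once the old data of the whole range has been saved before the write, a partial write leaves the snapshot image unchanged.

```python
# Userspace model only; every name here is hypothetical, not part of blksnap.
class SnapshotModel:
    def __init__(self, blocks):
        self.device = list(blocks)  # stands in for the original block device
        self.saved = {}             # block index -> data as of snapshot time

    def before_write(self, start, count):
        """Filter hook: preserve every block in the range that may change."""
        for i in range(start, start + count):
            # Save only once, so later writes cannot replace snapshot-time data.
            self.saved.setdefault(i, self.device[i])

    def write(self, start, data, fail_after=None):
        """Write data; fail_after=N simulates a write that stops after N blocks."""
        self.before_write(start, len(data))
        done = len(data) if fail_after is None else fail_after
        for offset in range(done):
            self.device[start + offset] = data[offset]
        return done == len(data)

    def read_snapshot(self, i):
        """Return the saved copy if the block was touched, else the live block."""
        return self.saved.get(i, self.device[i])


snap = SnapshotModel(["a", "b", "c", "d"])
ok = snap.write(1, ["X", "Y"], fail_after=1)  # partial write: only block 1 changes
image = [snap.read_snapshot(i) for i in range(4)]
```

Here the write reports failure and the device ends up as `a X c d`, while the snapshot image still reads `a b c d`. The model copies the full range even though only part of it changed, which is the conservative assumption described above.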

Your questions are specific to the replication algorithm. You need a guarantee that the request was successfully processed by the original block device before you send it for replication. I suppose for this purpose it will be necessary to handle a callback about the completion of BIO processing.
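
A rough userspace sketch of that completion-driven approach follows. It is Python, not kernel code, and all names (`ParentWrite`, `child_done`, `replication_queue`, `needs_resync`) are hypothetical. It assumes a write that was split into several parts completes only when every part has completed, and that it carries an error if any part failed. Marking the failed range for a later resync is also an assumption made for the example, not something bdevfilter provides.

```python
# Userspace model only; names and the resync policy are assumptions.
class ParentWrite:
    """Models a write that the block layer split into several child requests."""

    def __init__(self, sector, data, parts, on_complete):
        self.sector = sector
        self.data = data
        self.remaining = parts      # children that have not completed yet
        self.error = None           # first error reported by any child
        self.on_complete = on_complete

    def child_done(self, error=None):
        if error is not None and self.error is None:
            self.error = error
        self.remaining -= 1
        if self.remaining == 0:
            # The completion callback runs once, after the last child.
            self.on_complete(self)


replication_queue = []  # writes confirmed by the original device
needs_resync = []       # (sector, length) ranges whose state is unknown


def on_complete(write):
    if write.error is None:
        replication_queue.append((write.sector, write.data))
    else:
        needs_resync.append((write.sector, len(write.data)))


good = ParentWrite(0, b"AAAA", parts=2, on_complete=on_complete)
good.child_done()
good.child_done()

bad = ParentWrite(8, b"BBBB", parts=2, on_complete=on_complete)
bad.child_done()
bad.child_done(error="EIO")
```

Only the fully successful write reaches `replication_queue`. The write with a failed part is never replicated, and its range is recorded in `needs_resync` so the replica can be reconciled later.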

"In case of errors, what is the impact on backup and replication?"

If errors occur when writing to the original device, the blksnap kernel module cannot know this, so it does not affect it in any way. But if errors occur when reading from the original block device, the snapshot image is considered corrupted and the backup process is interrupted. Backups must be done before the hardware starts to fail :).
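
The read-error case can be modelled the same way. This is a hedged Python sketch with hypothetical names (`SnapshotState`, `ReadError`, `bad_blocks`), not the module's actual logic: if the old data cannot be read from the original device before it is overwritten, the snapshot is flagged as corrupted and nothing more is saved.

```python
# Userspace model only; every name here is hypothetical.
class ReadError(Exception):
    pass


class SnapshotState:
    def __init__(self, blocks, bad_blocks=()):
        self.device = list(blocks)
        self.bad_blocks = set(bad_blocks)  # blocks whose reads fail
        self.saved = {}                    # block index -> snapshot-time data
        self.corrupted = False

    def _read_original(self, i):
        if i in self.bad_blocks:
            raise ReadError(i)
        return self.device[i]

    def before_write(self, i):
        """Save the old data of block i; a failed read corrupts the snapshot."""
        if self.corrupted or i in self.saved:
            return
        try:
            self.saved[i] = self._read_original(i)
        except ReadError:
            self.corrupted = True  # the backup has to be aborted


state = SnapshotState(["a", "b", "c"], bad_blocks=[2])
state.before_write(0)  # readable block: old data is saved
state.before_write(2)  # unreadable block: snapshot becomes corrupted
```

After these two calls, block 0 is preserved and `state.corrupted` is true, so a backup tool polling this flag would stop.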

@SergeiShtepa
Collaborator

"It seems to me that they would be against a possible integration in the kernel of 2 modules or more modules that would do more or less the same thing."

I think that the module for replication (I'll call it azure-replica) should have completely different algorithms than the blksnap module. And it has a different purpose. Therefore, I am sure that it does not duplicate the functionality of blksnap. But it may be much more difficult to prove that it is not at all like DRBD... :)

Well, the question also remains: why do you use I/O filtering instead of building a stack of block devices, as DRBD or DM do?
