Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

callbacks on natsConnection are called after natsConnection_destroy #735

Open
thierryba opened this issue Mar 20, 2024 · 8 comments
Open
Assignees
Labels
defect Suspected defect such as a bug or regression

Comments

@thierryba
Copy link
Contributor

Observed behavior

callbacks, especially ClosedCB and DisconnectedCB are called after the call to natsConnection_Destroy

Expected behavior

Either those should not get called or there should be a way to silence those.

Server and client version

I used the nats.c 3.8.0 connecting to demo.nats.io (see the code).

Host environment

No response

Steps to reproduce

I wrote a little c++ example program. It has a class called MessagingController. I took advantage of the fact that we ca set a closure on the callbacks to pass a pointer to that object so that in the callbacks it can redirect the calls to it. That works really well except when I destruct my MessagingController instance.
In that case, some callbacks are called after the destructor of that object. I believe my program uses a very common pattern for encapsulating the nats connection functionality.
But there is no easy way to prevent callbacks on my MEssagingController instance after the destructor.
In my real code, I ended up putting all the "valid" MessagingController instance in a static set so that I can check for validity there.

I would be very grateful for any insight on this one.

Here is the code (c++17):

#include <nats/nats.h>
#include <memory>
#include <iostream>


template<class T> using UniquePtr = std::unique_ptr<T, void (*)(T*)>;

namespace {
void closedCB(natsConnection*, void* closure)
{
    std::cout << __func__ << ": " << closure << std::endl;
}

void connectedStatusChangedCB(natsConnection*, void* closure)
{
    std::cout << __func__ << ": " << closure << std::endl;
}
}


class MessagingController
{
public:
    MessagingController()
    {
        std::cout << __func__<< ": " << this << std::endl;

        //connect
        UniquePtr<natsOptions> opts{nullptr, nullptr};
        {
            natsOptions* opts_ = nullptr;
            if (natsStatus ns = natsOptions_Create(&opts_); ns == NATS_OK) {
                opts = {opts_, natsOptions_Destroy};
            } else {
                throw 1;
            }

            natsOptions_SetClosedCB(opts.get(), closedCB, this);
            natsOptions_SetDisconnectedCB(opts.get(), connectedStatusChangedCB, this);


            static const char *server = "demo.nats.io";
            natsOptions_SetServers(opts.get(), &server, 1);

            natsConnection *conn = nullptr;

            if (natsConnection_Connect(&conn, opts.get()); conn) {
                connection_ = {conn, natsConnection_Destroy};
            } else {
                throw 2;
            }
        }
    }

    ~MessagingController()
    {
        std::cout << __func__<< ": " << this << std::endl;
    }

private:
    UniquePtr<natsConnection> connection_{nullptr, nullptr};
};


int main()
{
    try {
        MessagingController controller;
    } catch(int err) {
        std::cerr << "ERROR: " << err << std::endl;
    }
    std::cout << "Here the controller is clearly already deleted (any callback after this will crash for sure)" << std::endl;
}

@thierryba thierryba added the defect Suspected defect such as a bug or regression label Mar 20, 2024
@thierryba
Copy link
Contributor Author

an example output for the execution of this program is:

MessagingController: 0x16ef0e4c0
~MessagingController: 0x16ef0e4c0
connectedStatusChangedCB: 0x16ef0e4c0
Here the controller is clearly already deleted (any callback after this will crash for sure)
closedCB: 0x16ef0e4c0

@levb levb self-assigned this Mar 20, 2024
@levb
Copy link
Collaborator

levb commented Mar 20, 2024

@thierryba the connection event callbacks are indeed asynchronous, and may be called after natsConnection_Destroy has returned. I question the need to pass the MessagingController's this pointer as the closure context for the connection callbacks. Perhaps, either

  • (a) separate the necessary context into a different struct (e.g. MessageControllerConnectionData*, and clean it up in closedCB, including deleting it;
  • (b) use your preferred synchronization mechanism to wait in the destructor until closedCB is done;
  • or (c) if the cleanup is not needed until the MessageController destructor - place it there, explicitly in the destructor, and don't pass it to closeCB?

Note that if the intention is for MessageController to auto-delete in closedCB e.g. if the server is down, then it should not be allocated on the stack, and can then be deleted from closedCB. A close() method could be exposed to severe the connection (and auto-delete the controller, asynchronously) from the client side.

Please let me know if it helps. I am hesitant to consider controls to "silence them" until I further understand your use case.

@thierryba
Copy link
Contributor Author

Hello @levb , I suppose option a or b could be implemented. My question is more about: how can this be ok that there is no builtin way to control such things. I think it is a very common use case to embed such c-style functionality inside a c++ class. But there is hardly a way to get that done easily. It is at the very least very error prone.

Having to add classes because the not-documented closedCB call is happening after I call destroy. Then I need to see if it was not already closed before and that opens up a can of worms.

I have an ugly workaround to this. But an elegant solution would be really needed.

@levb
Copy link
Collaborator

levb commented Mar 20, 2024

@thierryba, understood. I could easily add natsOptions_SetNoCallbacksFromDestroy to disable the callbacks from the destructor, and leaving you to add the cleanup code there explicitly.

I'd like to point out that natsConnection is ref-counted, and references may still exist after natsConnection_Destroy returns. You should generally avoid providing pointers to its callbacks that can be prematurely deleted by the client code. I'd need to do some more research to see if implementing something like natsConnection_Destroy(And-Wait-For-All-Callbacks?) would make sense and how exactly it would work.

@thierryba
Copy link
Contributor Author

@levb thanks you for your reply and I will be waiting for your decision on that one.

@levb
Copy link
Collaborator

levb commented Apr 11, 2024

@thierryba I think that adding natsOptions_SetNoCallbacksFromDestroy could be quite confusing. I am still unclear on why do you need to pass the MessagingController pointer to the connection callbacks. Can MessagingController be reference counted so it is not destroyed until all callbacks have released it?

Adding a synchronous method to wait for all callbacks to have finished, in the NATS client itself, seems counter to its asynchronous design philosophy.

@levb
Copy link
Collaborator

levb commented Jul 13, 2024

Ok, after running into this myself while fixing #771, I think what really needs to happen is natsConnection_Drain should wait for all async errors to be processed, much like it waits for all pending message callbacks to finish. I'll try to fix this one soon.

@mcosta74
Copy link

mcosta74 commented Jul 13, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect Suspected defect such as a bug or regression
Projects
None yet
Development

No branches or pull requests

3 participants