I'm investigating a blocking problem, and gdb shows that it occurs in current_shard_blk_ids_.count of this commit. What I don't understand is why there is a concurrent access problem for current_shard_blk_ids_.count.
In my opinion, only one thread can reach the code that the commit fixed, and by the time count is called, current_shard_blk_ids_.insert will no longer be called. So why is there still a concurrent access problem?
Looking forward to your reply. Thanks.
only one thread can reach the code that the commit fixed, and by the time count is called, current_shard_blk_ids_.insert will no longer be called
The problem lies in a slightly different scenario. If get_block_by_seqno fails for some block, td::MultiPromise calls the finishing lambda immediately, without waiting for the other tasks to finish. This causes IndexQuery to stop and be destroyed, deallocating its member variables, while some shard blocks may still be waiting to be read. When those reads finish, they access current_shard_blk_ids_, which causes the crash.
The commit title is not very accurate; it should be "Fix crash caused by accessing deallocated variable".
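To make the failure mode concrete, here is a minimal sketch in plain standard C++. It is not the fix from the commit and does not use the td actor or td::MultiPromise APIs; `Query`, `PendingReads`, and `run_demo` are hypothetical names invented for illustration. It shows one common way to keep late callbacks from touching a destroyed owner: each callback holds a `std::weak_ptr` and checks it before accessing the set.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <set>
#include <vector>

// Hypothetical stand-in for IndexQuery: owns the set that late callbacks read.
struct Query {
  std::set<int> current_shard_blk_ids_;
};

// Hypothetical stand-in for outstanding block reads: callbacks are queued
// now and run later, possibly after the query has been torn down.
struct PendingReads {
  std::vector<std::function<void()>> callbacks;
  void run_all() {
    for (auto &cb : callbacks) cb();
    callbacks.clear();
  }
};

// Returns how many late callbacks were skipped because the query was gone.
int run_demo() {
  PendingReads reads;
  int skipped = 0;
  int touched = 0;

  auto query = std::make_shared<Query>();
  query->current_shard_blk_ids_.insert(1);

  for (int id = 1; id <= 3; ++id) {
    // Capture a weak_ptr instead of a raw pointer or `this`.
    std::weak_ptr<Query> weak = query;
    reads.callbacks.push_back([weak, id, &skipped, &touched] {
      if (auto q = weak.lock()) {
        // The query is still alive, so reading the set is safe.
        touched += static_cast<int>(q->current_shard_blk_ids_.count(id));
      } else {
        // The query was destroyed before this read finished: do nothing.
        ++skipped;
      }
    });
  }

  // One read fails, so the query is torn down before the others complete.
  query.reset();
  reads.run_all();

  assert(touched == 0);  // no callback accessed the destroyed set
  return skipped;
}
```

Calling `run_demo()` exercises the early-teardown path. With a raw pointer captured in place of the `weak_ptr`, the same sequence would read freed memory, which is the kind of access described above.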
Thanks for your reply. I understand now. Both IndexQuery::finish and IndexQuery::error call Actor::stop, which destroys the IndexQuery object (unique_ptr.reset).
But here's another question: is it reasonable to reset the unique_ptr directly? In my opinion, Actor::stop should just decrease the reference count instead of deallocating the object. In our case, I believe the reference count of IndexQuery::actor_info_ptr_ won't be 0, yet IndexQuery is still deallocated forcibly.
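The distinction behind this question can be sketched with standard smart pointers. This is only an illustration of the two ownership models, not a description of how the actor framework is implemented; `Tracked` and both functions are hypothetical names.

```cpp
#include <memory>

// Hypothetical object that records whether it is still alive.
struct Tracked {
  explicit Tracked(bool *alive) : alive_(alive) { *alive_ = true; }
  ~Tracked() { *alive_ = false; }
  bool *alive_;
};

// unique_ptr: reset() destroys the object at once, even if non-owning
// pointers to it (such as a captured `this`) still exist elsewhere.
bool alive_after_unique_reset() {
  bool alive = false;
  auto owner = std::make_unique<Tracked>(&alive);
  Tracked *observer = owner.get();  // non-owning; dangles after reset()
  (void)observer;
  owner.reset();
  return alive;
}

// shared_ptr: reset() only drops one reference; the object survives
// for as long as another owner still holds it.
bool alive_after_shared_reset() {
  bool alive = false;
  auto owner = std::make_shared<Tracked>(&alive);
  std::shared_ptr<Tracked> second = owner;
  owner.reset();
  return alive;
}
```

Under these assumptions, `alive_after_unique_reset()` returns false and `alive_after_shared_reset()` returns true. A reference count protects the object only if the code that destroys it goes through that count.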