I am trying to run HashEncoder from category_encoders, which does its own multiprocessing, inside a Celery worker that also uses multiprocessing. When a task calls HashEncoder's .transform(), I get "celery: daemonic processes are not allowed to have children". As far as I can tell, this happens because Celery runs its worker processes through billiard, and HashEncoder then tries to start its own processes with multiprocessing from inside one of those workers.
billiard and multiprocessing are different libraries - billiard is the Celery project's own fork of multiprocessing.
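For reference, the same message can be reproduced with the standard library alone, without Celery or category_encoders. This is only a minimal sketch, assuming a Unix-like system where the fork start method is available. The names `noop`, `try_to_spawn`, and `reproduce` are made up for illustration, and the daemonic `Process` merely stands in for a pool worker, which I understand to be daemonic in the prefork pool.

```python
import multiprocessing as mp

# Assumption: a Unix-like OS where the "fork" start method exists.
ctx = mp.get_context("fork")


def noop():
    """Stand-in for the work a nested process would do."""


def try_to_spawn(queue):
    """Runs inside a daemonic process and attempts to start a child."""
    try:
        child = ctx.Process(target=noop)
        child.start()  # start() refuses when the current process is daemonic
        child.join()
        queue.put("child started")
    except AssertionError as exc:
        # Report the error text back to the parent process.
        queue.put(str(exc))


def reproduce():
    """Return the message produced when a daemonic process spawns a child."""
    queue = ctx.Queue()
    # daemon=True imitates a pool worker; this is an illustrative stand-in.
    worker = ctx.Process(target=try_to_spawn, args=(queue,), daemon=True)
    worker.start()
    message = queue.get(timeout=30)  # read before join to avoid blocking
    worker.join()
    return message


if __name__ == "__main__":
    print(reproduce())
```

So the restriction seems to come from the outer process being daemonic rather than from anything specific to HashEncoder.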
Several solutions were suggested by the community, but none of them worked for me:
1. Monkey patch HashEncoder to use billiard instead of the original multiprocessing. However, HashEncoder also depends on sklearn, which makes it hard to monkey patch both libraries (category_encoders and sklearn) and could introduce instability.
2. Set the Celery pool to threads (pool=threads). However, this runs multiple threads on a single core, so there is no true parallelisation, even with concurrency=3.
3. Run Celery without daemon. This approach is not advised.
4. Replace HashEncoder with another encoder. This is not a good fit, because my data has high dimensionality and a hash encoder handles that better.
I really need my workers to run in parallel because performance is critical, and I would prefer to keep Celery's prefork pool. Is there any workaround that allows multiprocessing in both Celery and HashEncoder?
Thanks in advance!