Fix multi-threading bug and speed up compute by caching find_necessary_steps
We've introduced a cache to avoid computing find_necessary_steps multiple times during each inference call.
This has two benefits:
- It reduces the computation time of the compute call.
- It avoids a subtle multi-threading bug in networkx that appears when the graph is accessed from many threads at once.
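
For reviewers, here is a minimal sketch of the caching idea, not the actual change in this PR. The `Network` class, the `_compute_necessary_steps` helper, the cache key, and the use of a lock are all assumptions made for illustration; the placeholder helper stands in for the real graph traversal.

```python
import threading


class Network:
    """Hypothetical stand-in for the class that owns find_necessary_steps."""

    def __init__(self, steps):
        self.steps = list(steps)
        self._steps_cache = {}               # (inputs, outputs) -> cached result
        self._cache_lock = threading.Lock()  # guards the cache and the traversal
        self.traversals = 0                  # counts runs of the expensive path

    def _compute_necessary_steps(self, inputs, outputs):
        # Placeholder for the expensive graph traversal (assumed, not the
        # real logic): keep every step that is not already given as an input.
        self.traversals += 1
        return tuple(s for s in self.steps if s not in inputs)

    def find_necessary_steps(self, inputs, outputs):
        # frozenset makes the key hashable and independent of argument order.
        key = (frozenset(inputs), frozenset(outputs))
        with self._cache_lock:
            if key not in self._steps_cache:
                self._steps_cache[key] = self._compute_necessary_steps(*key)
            return self._steps_cache[key]
```

In this sketch the lock is held while the result is computed, so only one thread traverses the graph for a given key and every later call is a dictionary lookup. Whether the real implementation needs a lock, or a per-call cache instead, depends on how compute is invoked.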