Calculating lots of isochrones can lead to "infinite loops"? #1897

Open
1 task done
lenalebt opened this issue Nov 12, 2024 · 3 comments

Comments

@lenalebt

Is there an existing issue for this?

  • I have searched the existing issues

Problem description

I am calculating lots of isochrones for different purposes, basically hammering ORS with isochrone requests for days in a row. In most cases, everything works just fine. Sometimes, however, ORS never finishes calculating an isochrone, which can leave the whole server burning CPU cycles on all cores without any perceivable output. The runaway threads live for hours (basically until I manually stop ORS), although the isochrones are "only" being calculated for up to 60 minutes (car). I would expect the calculation to stop once a certain size has been reached.

I do not yet have any idea what is actually happening. It does not happen immediately, but it is basically guaranteed to happen after a few hours. I find it hard to debug because I have not yet been able to see which request triggered it. I am up for some debugging, but before I go deeper:

  • Is this behaviour known in some way? I could not find anything like it in the GitHub issues, but maybe I just missed it.
  • Do you have any ideas for debugging it specifically? I could try to get a thread dump or something similar; I just have not done it yet because ORS is running inside a container and I would need to fiddle a bit to get it "properly". But maybe you already have better ideas.

Proposed solution

More debugging is needed; I wanted to ask first before investing more time.

Additional context

ORS 8.2.0 from docker

ORS Settings:

ors:
  #  cors:
  #    allowed_origins: "*"
  #    allowed_headers: Content-Type, X-Requested-With, accept, Origin, Access-Control-Request-Method, Access-Control-Request-Headers, Authorization
  #    preflight_max_age: 600
  #  messages:
  #  ##### ORS endpoints settings #####
  endpoints:
    routing:
      enabled: true
    matrix:
      enabled: true
      maximum_visited_nodes: 10000000
      maximum_search_radius: 70
    isochrones:
      enabled: true
      allow_compute_area: false
      maximum_intervals: 180
      maximum_range_distance_default: 70000
      maximum_range_time_default: 7200
    fastisochrones:
      enabled: true

  #  ##### ORS engine settings #####
  engine:
    source_file: /home/ors/files/area.osm.pbf
    init_threads: 1
    preparation_mode: false
    graphs_root_path: ./graphs
    graphs_data_access: RAM_STORE
    elevation:
      preprocessed: false
      data_access: MMAP
      cache_clear: false
      provider: srtm
      cache_path: ./elevation_cache
    profile_default:
      maximum_snapping_radius: 70
    profiles:
      car:
        enabled: true
        profile: driving-car
        encoder_options:
          turn_costs: true
          block_fords: false
          use_acceleration: true
        preparation:
          min_network_size: 200
          methods:
            ch:
              enabled: true
              threads: 1
              weightings: fastest
            lm:
              enabled: false
              threads: 1
              weightings: fastest,shortest
              landmarks: 16
            core:
              enabled: true
              threads: 1
              weightings: fastest,shortest
              landmarks: 64
              lmsets: highways;allow_all
        execution:
          methods:
            lm:
              active_landmarks: 6
            core:
              active_landmarks: 6
        ext_storages:
          WayCategory:
          HeavyVehicle:
          WaySurfaceType:
          RoadAccessRestrictions:
            use_for_warnings: true
      bike-regular:
        enabled: true
        profile: cycling-regular
        encoder_options:
          consider_elevation: true
          turn_costs: true
        ext_storages:
          WayCategory:
          WaySurfaceType:
          HillIndex:
          TrailDifficulty:
      walking:
        enabled: true
        profile: foot-walking
        encoder_options:
          block_fords: false
        ext_storages:
          WayCategory:
          WaySurfaceType:
          HillIndex:
          TrailDifficulty:
      wheelchair:
        enabled: true
        profile: wheelchair
        encoder_options:
          block_fords: true
        maximum_snapping_radius: 50
        ext_storages:
          WayCategory:
          WaySurfaceType:
          Wheelchair:
            KerbsOnCrossings: true
          OsmId:

Forum Topic Link

No response

@sfendrich
Contributor

Thanks for reporting. We haven't faced this behavior so far. Normally ORS should stop calculations once maximum_visited_nodes is exceeded.

Things you could try to narrow down the issue:

  • Increase the log-level of ORS to get more information in the log file.
  • Check whether the OS is running out of memory and swapping, which would drastically slow down ORS.
  • Attach a profiler such as VisualVM to ORS to get information about where ORS gets stuck.
  • Check whether the JVM is running out of memory and spending most of its time in garbage collection; this can be checked with VisualVM, too.
  • Maybe specify a smaller value of maximum_visited_nodes for isochrones in your config file.

If you find useful information in your log file, or a specific request that causes the issue, you may also post it here.
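One way to pin down the specific request is to send every isochrone request with a client-side timeout and log the body of any request that does not come back. The following is a minimal sketch, not part of ORS: the class name `IsochroneProbe`, the base URL `http://localhost:8082/ors`, the sample coordinate, and the 5-minute timeout are all assumptions to adapt to the actual deployment. Note that a client timeout only records which request hung; it does not stop the server thread.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

// Hypothetical helper: sends isochrone requests with a timeout and logs the ones that hang.
final class IsochroneProbe {

    // Builds a POST request against the isochrones endpoint of the given profile.
    static HttpRequest build(String baseUrl, String profile, String jsonBody, Duration timeout) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v2/isochrones/" + profile))
                .timeout(timeout) // the client gives up after this long; the server may keep computing
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    // Returns true if a response arrived in time; otherwise logs the request body and returns false.
    static boolean completesInTime(HttpClient client, HttpRequest request, String jsonBody)
            throws Exception {
        try {
            client.send(request, HttpResponse.BodyHandlers.discarding());
            return true;
        } catch (HttpTimeoutException e) {
            System.err.println("Isochrone request timed out, body was: " + jsonBody);
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder request: one location (lon, lat) and a 60-minute range in seconds.
        String body = "{\"locations\":[[8.681495,49.41461]],\"range\":[3600]}";
        if (args.length == 0) {
            // Without a base URL argument, only show what would be sent.
            System.out.println("Would POST to "
                    + build("http://localhost:8082/ors", "driving-car", body, Duration.ofMinutes(5)).uri());
            return;
        }
        HttpRequest request = build(args[0], "driving-car", body, Duration.ofMinutes(5));
        System.out.println("completed in time: "
                + completesInTime(HttpClient.newHttpClient(), request, body));
    }
}
```

Any logged body can then be replayed against a test instance and posted here.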

@aoles
Member

aoles commented Jan 23, 2025

Hi @lenalebt,

did you try any of @sfendrich's suggestions? If you have any new insights, it would be great if you could share them.

Cheers!

@lenalebt
Author

lenalebt commented Jan 28, 2025

Hi all,
my main problem was that this only happens on the production server, which is under quite some load. So it happens from time to time (right now I have a single thread that has been running for almost 2 days, burning one CPU, but as the other cores are still available, I have kept it running).

  • Yes, I adjusted log levels, but could not find anything useful.
  • The JVM does not run low on memory, and
  • the system is not swapping.

I created a thread dump using jstack directly from the container (I'm running ORS via docker). This is what the thread in question is currently doing (using the official docker image, version 8.2.0, with the command jstack 1 from within the container):

"http-nio-8082-exec-1684" #1737 [1971] daemon prio=5 os_prio=0 cpu=169975881.04ms elapsed=174790.38s tid=0x000073a5c5335fb0 nid=1971 runnable  [0x000073a5c4fb4000]
   java.lang.Thread.State: RUNNABLE
        at org.locationtech.jts.triangulate.tri.Tri.isInteriorVertex(Tri.java:564)
        at org.locationtech.jts.algorithm.hull.HullTri.isConnecting(HullTri.java:124)
        at org.locationtech.jts.algorithm.hull.ConcaveHull.isRemovableBorder(ConcaveHull.java:416)
        at org.locationtech.jts.algorithm.hull.ConcaveHull.computeHullBorder(ConcaveHull.java:287)
        at org.locationtech.jts.algorithm.hull.ConcaveHull.computeHull(ConcaveHull.java:272)
        at org.locationtech.jts.algorithm.hull.ConcaveHull.getHull(ConcaveHull.java:234)
        at org.locationtech.jts.algorithm.hull.ConcaveHull.concaveHullByLength(ConcaveHull.java:110)
        at org.locationtech.jts.algorithm.hull.ConcaveHull.concaveHullByLength(ConcaveHull.java:93)
        at org.heigit.ors.isochrones.builders.AbstractIsochroneMapBuilder.addIsochrone(AbstractIsochroneMapBuilder.java:268)
        at org.heigit.ors.isochrones.builders.concaveballs.ConcaveBallsIsochroneMapBuilder.compute(ConcaveBallsIsochroneMapBuilder.java:135)
        at org.heigit.ors.isochrones.IsochroneMapBuilderFactory.buildMap(IsochroneMapBuilderFactory.java:35)
        at org.heigit.ors.routing.RoutingProfile.buildIsochrone(RoutingProfile.java:671)
        at org.heigit.ors.routing.RoutingProfileManager.buildIsochrone(RoutingProfileManager.java:605)
        at org.heigit.ors.api.services.IsochronesService.generateIsochronesFromRequest(IsochronesService.java:59)
        at org.heigit.ors.api.controllers.IsochronesAPI.getGeoJsonIsochrones(IsochronesAPI.java:139)
        at org.heigit.ors.api.controllers.IsochronesAPI.getDefaultIsochrones(IsochronesAPI.java:111)
        at java.lang.invoke.LambdaForm$DMH/0x000073a5d0314c00.invokeVirtual([email protected]/LambdaForm$DMH)
        at java.lang.invoke.LambdaForm$MH/0x000073a5d06cc400.invoke([email protected]/LambdaForm$MH)
        at java.lang.invoke.LambdaForm$MH/0x000073a5d0151400.invokeExact_MT([email protected]/LambdaForm$MH)
        at jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl([email protected]/DirectMethodHandleAccessor.java:155)
        at jdk.internal.reflect.DirectMethodHandleAccessor.invoke([email protected]/DirectMethodHandleAccessor.java:103)
        at java.lang.reflect.Method.invoke([email protected]/Method.java:580)
        at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:255)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:188)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:926)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:831)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1089)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:979)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1014)
        at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914)
        at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:547)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885)
        at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:614)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:195)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
        at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
        at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
        at org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:113)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:167)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:483)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:344)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:384)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:904)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1741)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
        at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190)
        at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63)
        at java.lang.Thread.runWith([email protected]/Thread.java:1596)
        at java.lang.Thread.run([email protected]/Thread.java:1583)

I ran jstack a few times to see whether the stack changes, but the thread is always at that same line (Tri.java:564). I have omitted all the parked threads.

I can't say whether it is always that line, since only one thread has gone wild at the moment. I can check again the next time it happens.
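The header line of the dump already shows how busy the thread has been: cpu=… is the CPU time the thread has consumed and elapsed=… is the time since the thread was started (not since the request arrived). A small sketch, assuming the header format shown in the dump above, that extracts both values and reports their ratio; the class name `ThreadDumpCpu` is hypothetical:

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads the cpu= and elapsed= fields of a jstack thread header line.
final class ThreadDumpCpu {

    // Captures the quoted thread name, the cpu time in ms and the elapsed time in s.
    private static final Pattern HEADER =
            Pattern.compile("^\"([^\"]+)\".*?\\bcpu=([0-9.]+)ms\\s+elapsed=([0-9.]+)s");

    record Usage(String thread, double cpuSeconds, double elapsedSeconds) {
        // Fraction of the thread's lifetime that was spent on a CPU.
        double ratio() {
            return elapsedSeconds == 0 ? 0 : cpuSeconds / elapsedSeconds;
        }
    }

    // Returns the parsed usage, or empty if the line is not a thread header.
    static Optional<Usage> parse(String line) {
        Matcher m = HEADER.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Usage(
                m.group(1),
                Double.parseDouble(m.group(2)) / 1000.0, // ms -> s
                Double.parseDouble(m.group(3))));
    }

    public static void main(String[] args) {
        // Header line copied from the thread dump in this issue.
        String header = "\"http-nio-8082-exec-1684\" #1737 [1971] daemon prio=5 os_prio=0 "
                + "cpu=169975881.04ms elapsed=174790.38s tid=0x000073a5c5335fb0 nid=1971 runnable";
        parse(header).ifPresent(u -> System.out.printf(Locale.ROOT,
                "%s: %.0f s CPU over %.0f s elapsed (%.0f%% of one core)%n",
                u.thread(), u.cpuSeconds(), u.elapsedSeconds(), 100 * u.ratio()));
    }
}
```

Running this over every header line of a full dump and sorting by ratio would be one way to spot other runaway threads without reading each stack by hand.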

Regarding maximum_visited_nodes: understood. But I find it hard to say what number I should put there. Sometimes (not very often) I also need large isochrones (maybe 120 minutes by car). Do you have a suggestion for the value? Also, after running for over 2 days on a single isochrone, it should have visited the whole graph in that time, shouldn't it? (The graph is currently DACH plus France, Spain, Benelux and the Czech Republic.)

Cheers,
Lena
