
Compare average processing time for validator between 5.0 and 6.0 #1876

Open

emmambd opened this issue Oct 8, 2024 · 1 comment
emmambd commented Oct 8, 2024

Describe the problem

To be confident that the new Flex column additions to stop_times.txt do not have a significant impact on the validator's performance, we want to compare the average validation runtime between 5.0 and 6.0.

Proposed solution

Include analytics comparing the run time in 5.0 vs. 6.0 for each feed.

Alternatives you've considered

No response

Additional context

No response

emmambd added the enhancement and status: Needs triage labels on Oct 8, 2024
emmambd added this to the 6.0 Validator Release milestone on Oct 8, 2024
emmambd removed the status: Needs triage label on Oct 8, 2024
davidgamez self-assigned this on Oct 16, 2024

davidgamez commented Oct 16, 2024

Method

How the running time was extracted:

  • Restored feat/1698, the base branch of feat: add performance assessment to acceptance tests #1771. In this PR, the field validationTimeSeconds was introduced to the JSON report. There were no significant changes between 5.0.1 and this branch.
  • Created a PR with a branch based on feat/1698 to gather all the JSON reports that include the new validationTimeSeconds field. This PR had to be opened against a similar branch because of a GitHub limitation that blocks automation from running when conflicts are present (more info).
  • Created a Python script locally to compare the metrics of the latest master acceptance tests with the ones extracted from feat/1698 (close to 5.0.1). The script also looks for the most significant changes in time consumption across all feeds; a minimal sketch of this kind of script is included after this list.
  • The Java maximum heap size was increased to 12 GB so that validator crashes would not skew the metrics.
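
For context, here is a minimal sketch of that kind of comparison script, assuming each acceptance-test run produces one JSON report per feed with a top-level validationTimeSeconds field. The directory layout, file naming, and field location are assumptions for illustration, not the exact script that was used:

```python
import json
import statistics
from pathlib import Path


def load_times(report_dir: str) -> dict[str, float]:
    """Map feed id (taken here from the report file name) to validationTimeSeconds."""
    times = {}
    for path in Path(report_dir).glob("*.json"):
        with path.open() as f:
            report = json.load(f)
        # Assumes the field sits at the top level of the JSON report.
        times[path.stem] = float(report["validationTimeSeconds"])
    return times


def summarize(label: str, times: dict[str, float]) -> None:
    """Print the summary statistics used in the comparison table."""
    values = list(times.values())
    print(f"{label}: avg={statistics.mean(values):.2f}s "
          f"median={statistics.median(values):.2f}s "
          f"stdev={statistics.stdev(values):.2f}s "
          f"min={min(values):.2f}s max={max(values):.2f}s")


if __name__ == "__main__":
    baseline = load_times("reports_5_0_1")    # reports from feat/1698 (close to 5.0.1)
    candidate = load_times("reports_master")  # reports from latest master
    summarize("5.0.1", baseline)
    summarize("master", candidate)
```

In this sketch the feed id is derived from the report file name; the actual script may read it from the report contents instead.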

Results

Time performance metrics

| Metric | 5.0.1 | Master | Difference |
| --- | --- | --- | --- |
| Average | 3.93 | 4.09 | 0.16 |
| Median | 1.38 | 1.47 | 0.09 |
| Std Dev | 11.07 | 11.44 | 0.37 |
| Min | us-florida-citrus-county-transit-gtfs-630 (0.52) | us-california-flex-v2-developer-test-feed-1-gtfs-1817 (0.55) | 0.03 |
| Max | gb-unknown-uk-aggregate-feed-gtfs-2014 (289.51) | gb-unknown-uk-aggregate-feed-gtfs-2014 (290.63) | 1.12 |

\* All values are expressed in seconds.

There is a minor increase in the metrics, but nothing significant.

Feeds with the most significant increase

The following chart lists the most significant changes in time; to reduce noise, it filters out all changes of less than 5 seconds.
[Chart: per-feed change in validation time between 5.0.1 and master, filtered to changes of at least 5 seconds]

Only four feeds differed by more than 5 seconds, and one of them decreased in time.
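
As a rough illustration (not the exact code behind the chart), the 5-second filter could be applied like this, reusing the load_times helper from the sketch in the method section above:

```python
THRESHOLD_SECONDS = 5.0


def largest_deltas(baseline: dict[str, float],
                   candidate: dict[str, float]) -> list[tuple[str, float]]:
    """Per-feed change in validation time (candidate minus baseline),
    keeping only changes of at least 5 seconds, largest absolute change first."""
    common = baseline.keys() & candidate.keys()
    deltas = [(feed_id, candidate[feed_id] - baseline[feed_id]) for feed_id in common]
    significant = [(f, d) for f, d in deltas if abs(d) >= THRESHOLD_SECONDS]
    return sorted(significant, key=lambda item: abs(item[1]), reverse=True)


# Example usage:
# largest_deltas(load_times("reports_5_0_1"), load_times("reports_master"))
```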
