Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added linear regression to compute execution time #46

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 11 additions & 1 deletion src/Hyperion/Analysis.hs
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ import Control.Lens
, foldMapOf
, folded
, to
, toListOf
)

import Control.Lens.Each
Expand All @@ -23,6 +24,8 @@ import Hyperion.Benchmark
import Hyperion.Internal
import Hyperion.Measurement
import Hyperion.Report
import Statistics.Regression
import qualified Data.Vector.Unboxed as UV

identifiers :: Fold Benchmark BenchmarkId
identifiers = go []
Expand All @@ -47,7 +50,10 @@ analyze ident samp = Report
, _reportBenchParams =
map (\(Parameter x) -> fromEnum x) $ benchmarkParameters ident
, _reportTimeInNanos =
totalDuration / trueNumIterations
totalDuration / trueNumIterations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beware! Trailing whitespace here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At any rate, we don't need that time anymore do we? We want to replace it with the slope of the linear regression. Am I correct? (cc @qnikst )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just keep totalDuration here. totalDuration on it's own is meaningful statistics. Sometime we may be interested in the total time of the test and do not care about the time of each measurement, for example that may happen if we want to test RTS options or different RTS optimizations.
So I'd keep the statistics and the name, but it should be just totalDuration

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure of what you precisely mean here: should the value of _reportTimeInNanos be replaced by totalDuration, or by slope ? In the latter case, should a new record _totalDuration be created ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_reportTimeInNanos = _totalDuration

, _linearRegressionTime = slope
, _linearRegressionConstant = yIntercept
, _linearRegressionConfidence = determinationCoefficient
, _reportCycles = Nothing
, _reportAlloc = Nothing
, _reportGarbageCollections = Nothing
Expand All @@ -64,3 +70,7 @@ analyze ident samp = Report
Sum
(foldMapOf (measurements.each.batchSize.to realToFrac))
samp
([slope,yIntercept],determinationCoefficient) = over _1 UV.toList $ olsRegress [batchSizesVect] durationsVect
where
batchSizesVect = UV.fromList $ fromIntegral <$> toListOf (measurements.each.batchSize) samp
durationsVect = UV.fromList $ fromIntegral <$> toListOf (measurements.each.duration) samp
6 changes: 5 additions & 1 deletion src/Hyperion/PrintReport.hs
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,11 @@ import Text.PrettyPrint.ANSI.Leijen
formatReport :: Report -> Doc
formatReport report =
green (bold (text (unpack (view reportBenchName report)))) <> line
<> (indent 2 $ text ("Bench time: " ++ prettyNanos (view reportTimeInNanos report)))
<> (indent 2 $ text ("Bench time: " ++ prettyNanos (view reportTimeInNanos report))) <> line
<> (indent 2 $ text ("Computed with linear regression: ")) <> line
<> (indent 4 $ text ("Bench time: " ++ prettyNanos (view linearRegressionTime report))) <> line
<> (indent 4 $ text ("Constant factor: " ++ prettyNanos (view linearRegressionConstant report))) <> line
<> (indent 4 $ text ("Confidence: " ++ showFFloat (Just 4) (view linearRegressionConfidence report) ""))
<> line
where
show2decs x= showFFloat (Just 2) x ""
Expand Down
3 changes: 3 additions & 0 deletions src/Hyperion/Report.hs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,9 @@ data Report = Report
{ _reportBenchName :: !Text
, _reportBenchParams :: [Int]
, _reportTimeInNanos :: !Double
, _linearRegressionTime :: !Double
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should just replace _reportTimeInNanos, that way all current tools benefit from the improvement.

, _linearRegressionConstant :: !Double
, _linearRegressionConfidence :: !Double
, _reportCycles :: !(Maybe Double)
, _reportAlloc :: !(Maybe Int64)
, _reportGarbageCollections :: !(Maybe Int64)
Expand Down