Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from haskell-src-exts to ghc-lib-parser #593

Merged

Conversation

toku-sa-n
Copy link
Collaborator

@toku-sa-n toku-sa-n commented Sep 6, 2022

See #587 for the background.

Problems

  • Cannot handle LINE pragmas (e.g., {-# LINE 1 "foo/bar.hs" #-}). It causes a parse error.
  • Cannot preserve comments around a pragma.
{-# LANGUAGE CPP #-} -- Comments

module Foo where
  • stack bench indicates that the HIndent after rewriting is much slower than the original one.
    • Currently, the source files are lexically parsed once to extract the extensions specified in the source code, and then parsed for the actual formatting. I believe this is one of the reasons for this.
  • Some formatting depends on ghc-lib-parser's internal formatted, and we should avoid it if possible. See the comment in HIndent.Pretty.Combinators.Outputable.

Notes

Some formatting results differ from the one by the original HIndent due to several reasons.

  • A bug in haskell-src-exts wrongly parses a left-associative infix operator as right-associative and vice versa.
  • The original HIndent used haskell-src-exts' internal formatter and reproducing it is difficult.

Benchmark results

Before the switch

Build profile: -w ghc-9.2.5 -O1
In order, the following will be built (use -v for more details):
 - hindent-5.3.5 (bench:hindent-bench) (first run)
Preprocessing benchmark 'hindent-bench' for hindent-5.3.5..
Building benchmark 'hindent-bench' for hindent-5.3.5..
Running 1 benchmarks...
Benchmark hindent-bench: RUNNING...
benchmarking Large inputs/Bunch of declarations
time                 27.41 ms   (27.12 ms .. 27.72 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 27.09 ms   (26.61 ms .. 27.27 ms)
std dev              602.4 μs   (234.1 μs .. 1.116 ms)

benchmarking Large inputs/Bunch of declarations - sans comments
time                 2.817 ms   (2.809 ms .. 2.823 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.775 ms   (2.767 ms .. 2.783 ms)
std dev              25.82 μs   (22.48 μs .. 30.44 μs)

benchmarking Complex inputs/Quasi-quotes with nested lets and operators
time                 1.579 ms   (1.576 ms .. 1.582 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.590 ms   (1.586 ms .. 1.594 ms)
std dev              12.46 μs   (10.30 μs .. 15.51 μs)

benchmarking Complex inputs/Lots of comments and operators
time                 5.861 ms   (5.833 ms .. 5.881 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.839 ms   (5.820 ms .. 5.856 ms)
std dev              54.77 μs   (45.75 μs .. 66.22 μs)

Benchmark hindent-bench: FINISH

After the switch

Build profile: -w ghc-9.2.5 -O1
In order, the following will be built (use -v for more details):
 - hindent-5.3.5 (bench:hindent-bench) (first run)
Preprocessing benchmark 'hindent-bench' for hindent-5.3.5..
Building benchmark 'hindent-bench' for hindent-5.3.5..
Running 1 benchmarks...
Benchmark hindent-bench: RUNNING...
benchmarking Large inputs/Bunch of declarations
time                 158.1 ms   (157.0 ms .. 159.2 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 159.3 ms   (158.3 ms .. 162.4 ms)
std dev              2.152 ms   (355.4 μs .. 3.220 ms)
variance introduced by outliers: 12% (moderately inflated)

benchmarking Large inputs/Bunch of declarations - sans comments
time                 157.2 ms   (153.6 ms .. 159.7 ms)
                     1.000 R²   (0.998 R² .. 1.000 R²)
mean                 157.2 ms   (155.1 ms .. 158.1 ms)
std dev              1.869 ms   (719.5 μs .. 3.124 ms)
variance introduced by outliers: 12% (moderately inflated)

benchmarking Complex inputs/Quasi-quotes with nested lets and operators
time                 68.53 ms   (68.35 ms .. 68.78 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 68.49 ms   (68.29 ms .. 68.79 ms)
std dev              431.2 μs   (245.1 μs .. 637.3 μs)

benchmarking Complex inputs/Lots of comments and operators
time                 40.77 ms   (40.46 ms .. 41.18 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 40.64 ms   (40.45 ms .. 40.82 ms)
std dev              385.9 μs   (255.7 μs .. 612.2 μs)

Benchmark hindent-bench: FINISH

Fixes: #587
Fixes: #600
Closes: #582
Closes: #599

@toku-sa-n toku-sa-n changed the title Replace 'haskell-src-exts' with 'ghc-lib-parser' Replace haskell-src-exts with ghc-lib-parser Sep 6, 2022
@toku-sa-n
Copy link
Collaborator Author

@mihaimaruseac Hello. I have a question. Do we still need to support GHC versions that are older than 8.10?

Though it's not impossible, supporting these versions is hard because ghc-lib-parser changed how it generates an AST from a given source. ghc-lib-parser of the compatible version with GHC older than 8.10 generates an AST having the old structure, and supporting old and new ones requires adding many codes, especially for handling comments.

I think supporting GHC 8.10.7 and 9.2.2 is enough. 8.10.7 is the current recommended version by ghcup, and 9.2.2 is supported currently and is newer than the recommended version.

Do you have any thoughts on this?

@mihaimaruseac
Copy link
Owner

Sorry, was not following the progress here too closely.

I think it's good to drop the old versions, I'll send a fix to remove them from CI right now

@mihaimaruseac
Copy link
Owner

Made #596

@toku-sa-n
Copy link
Collaborator Author

Thank you for the quick PR!

@toku-sa-n
Copy link
Collaborator Author

@mihaimaruseac It seems that building with GHC 9.4.1 fails because of a GHC bug. I've created #597 to remove the support of the version.

toku-sa-n added a commit to toku-sa-n/hindent that referenced this pull request Sep 22, 2022
This commit is split from
mihaimaruseac#593 to reduce the PR's
size.
@mihaimaruseac
Copy link
Owner

Thank you for the other PR. From reading the issue (and the CI on #596, I shouldn't have merged it that early) it looks like only Windows is affected? Made a suggestion on your PR for that end.

@toku-sa-n
Copy link
Collaborator Author

toku-sa-n commented Sep 22, 2022

Thanks. Yes, it only affects Windows. I've fixed it in #597.

@mihaimaruseac
Copy link
Owner

Awesome. Thank you. Merged.

@toku-sa-n toku-sa-n force-pushed the replace_haskell_src_exts branch 2 times, most recently from ded61c2 to d928610 Compare November 6, 2022 12:52
@toku-sa-n toku-sa-n changed the title Replace haskell-src-exts with ghc-lib-parser Switch from haskell-src-exts to ghc-lib-parser Nov 6, 2022
@toku-sa-n
Copy link
Collaborator Author

@mihaimaruseac There are still a few things to do (e.g., removing files used in the development from commits, exporting necessary symbols, etc.), and a few FIXMEs which I can't solve immediately, but I think the work of switching from haskell-src-exts to ghc-lib-parser is done.

I apologize that the diffs have become very large. Some of the diffs are not needed for this PR and can be submitted as separate PRs.

I have two thoughts.

  • I think we need to clean up the TESTS.md file. In particular, I think it needs to be broken up into smaller sections. Also, the regression test section is huge and not classified at all. This section should be eliminated, and each test suite should be moved to the appropriate section. BTW I've submitted Fix TESTS.md according to markdownlint #606. Could you take a look at it?
  • Do we need to re-export the functions for the tests exported in the HIndent module and the parseMode? I think testAst is sufficient for the former, and I don't see much need to export the latter.

Thank you.

toku-sa-n added a commit to toku-sa-n/hindent that referenced this pull request Dec 16, 2022
This commit is separated from mihaimaruseac#593 and has tests that added there and
pass with the current HIndent.
@toku-sa-n toku-sa-n mentioned this pull request Dec 16, 2022
@toku-sa-n toku-sa-n force-pushed the replace_haskell_src_exts branch 2 times, most recently from 84bb9b1 to cda4ac8 Compare December 23, 2022 09:27
@mihaimaruseac
Copy link
Owner

Hi @toku-sa-n.

I see you merged all commits and removed the TODO-checklist. Does this mean this is ready for review? :)

Sorry if I'm too eager and this is not yet ready :)

@toku-sa-n
Copy link
Collaborator Author

toku-sa-n commented Dec 23, 2022 via email

@toku-sa-n toku-sa-n marked this pull request as ready for review December 24, 2022 02:05
Copy link
Owner

@mihaimaruseac mihaimaruseac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks so great! There's only one nit about a comment that looks incomplete: "since GHC"

Thank you very much.

if S8.null xs
then False
else S8.last xs == '\n'
hasTrailingLine xs = not (S8.null xs) && S8.last xs == '\n'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!!

-- ,QuasiQuotes -- breaks [x| ...], making whitespace free list comps break
, GLP.PatternSynonyms -- steals the pattern keyword
, GLP.RecursiveDo -- steals the rec keyword
, GLP.TypeApplications -- since GHC
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this missing a version number?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I'll fix it.

Comment on lines +13 to +31
-- > f :: Int
-- > f = 1
-- >
-- > -- A comment between f and g
-- >
-- > -- Another comment between f and g
-- >
-- > g :: Int
-- > g = 2
--
-- to
--
-- > f :: Int
-- > f = 1
-- >
-- > -- A comment between f and g
-- > -- Another comment between f and g
-- > g :: Int
-- > g = 2
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if comment separation would be maintained, but maybe we can push this to a separate issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We need to change this behavior. I'll try it later. The issue is here: #608.

@mihaimaruseac mihaimaruseac merged commit 5ec1b53 into mihaimaruseac:master Dec 24, 2022
@toku-sa-n toku-sa-n deleted the replace_haskell_src_exts branch December 24, 2022 23:23
@toku-sa-n
Copy link
Collaborator Author

Thank you for reviewing and merging this PR! I also appreciate that you invited me as a collaborator of this repository!

toku-sa-n added a commit to toku-sa-n/hindent that referenced this pull request Dec 24, 2022
In mihaimaruseac#593, I didn't format the files with the above command for fear of
huge diffs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants