Illegal Instruction in GHCUP on x86_64 (Nobara Linux) #1003

Open
jackjohn7 opened this issue Feb 16, 2024 · 13 comments

Comments

@jackjohn7
I'm using Nobara Linux which is based on Fedora.

When I run the curl --proto '=https' --tlsv1.2 -sSf https://get-ghcup.haskell.org | sh command listed on the website's homepage, I answer all the configuration prompts and everything seems to install properly. Then, when the script goes to execute ghcup, I'm met with this output:

Welcome to Haskell!

This script can download and install the following binaries:
  * ghcup - The Haskell toolchain installer
  * ghc   - The Glasgow Haskell Compiler
  * cabal - The Cabal build tool for managing Haskell software
  * stack - A cross-platform program for developing Haskell projects (similar to cabal)
  * hls   - (optional) A language server for developers to integrate with their editor/IDE

ghcup installs only into the following directory,
which can be removed anytime:
  /home/jack/.ghcup

Press ENTER to proceed or ctrl-c to abort.
Note that this script can be re-run at any given time.

-------------------------------------------------------------------------------

Detected bash shell on your system...
Do you want ghcup to automatically add the required PATH variable to "/home/jack/.bashrc"?

[P] Yes, prepend  [A] Yes, append  [N] No  [?] Help (default is "P").

A
-------------------------------------------------------------------------------
Do you want to install haskell-language-server (HLS)?
HLS is a language-server that provides IDE-like functionality
and can integrate with different editors, such as Vim, Emacs, VS Code, Atom, ...
Also see https://haskell-language-server.readthedocs.io/en/stable/

[Y] Yes  [N] No  [?] Help (default is "N").

Y
-------------------------------------------------------------------------------
Do you want to enable better integration of stack with GHCup?
This means that stack won't install its own GHC versions, but uses GHCup's.
For more information see:
  https://docs.haskellstack.org/en/stable/yaml_configuration/#ghc-installation-customisation-experimental
If you want to keep stacks vanilla behavior, answer 'No'.

[Y] Yes  [N] No  [?] Help (default is "Y").

Y
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 15.4M  100 15.4M    0     0  8490k      0  0:00:01  0:00:01 --:--:-- 8488k
[ Info  ] downloading: https://raw.githubusercontent.com/haskell/ghcup-metadata/master/ghcup-0.0.8.yaml as file /home/jack/.ghcup/cache/ghcup-0.0.8.yaml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  369k  100  369k    0     0  1438k      0 --:--:-- --:--:-- --:--:-- 1443k
main: line 131: 71130 Illegal instruction     (core dumped) "${GHCUP_BIN}/ghcup" ${args} "$@"
"ghcup --metadata-fetching-mode=Strict upgrade" failed!

After this, I tried installing ghcup through the binaries on the file server linked in the documentation for those who don't like curl | sh. I used the most recent x86_64-linux binary. I placed it in the same location that the installation script does, and I added the location to my path. I get the same error when I attempt to use a command (only --help doesn't fail):

$ ghcup list
[ Info  ] downloading: https://raw.githubusercontent.com/haskell/ghcup-metadata/master/ghcup-0.0.8.yaml as file /home/jack/.ghcup/cache/ghcup-0.0.8.yaml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  369k  100  369k    0     0   538k      0 --:--:-- --:--:-- --:--:--  537k
Illegal instruction (core dumped)

It seems to be running an illegal CPU instruction in any case. I don't see how this could be, since I'm using an x86_64 processor (Ryzen 7 7700X). Output of lscpu below.

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 7 7700X 8-Core Processor
    CPU family:          25
    Model:               97
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           1
    Stepping:            2
    CPU(s) scaling MHz:  58%
    CPU max MHz:         5573.0000
    CPU min MHz:         400.0000
    BogoMIPS:            8999.53
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi umip pku ospke rdpid overflow_recov succor smca fsrm flush_l1d
Virtualization features:
  Virtualization:        AMD-V
Caches (sum of all):
  L1d:                   256 KiB (8 instances)
  L1i:                   256 KiB (8 instances)
  L2:                    8 MiB (8 instances)
  L3:                    32 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-15
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Vulnerable: Safe RET, no microcode
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

I didn't see another issue quite like this, and I couldn't find anyone else with the same problem on Google. I've just verified that the latest installation works fine on my Fedora laptop, which is also x86_64 (Ryzen 5 5500U).

I tried seeing if the esoteric distros section of the installation docs could help, but nothing I tried there worked either.
https://www.haskell.org/ghcup/install/#esoteric-distros

I can use the GHC and cabal-install provided by my package manager for the time being, but I thought I should still report this in case someone encounters something similar.

@hasufell
Member

Interesting, I'll investigate that.

@hasufell
Member

I have self-hosted CI runners that use the AMD Ryzen™ 7 7700, and I definitely cannot reproduce it there.

I have not tried Nobara Linux, but I can't see how that would be relevant.

Are you running under some KVM cloud stuff?

@jackjohn7
Author

No cloud stuff. It's just an ordinary desktop I use for development and gaming. I haven't had any similar issues with other toolchains.

@hasufell
Member

Can you provide the coredump?

@jackjohn7
Author

Output of coredumpctl gdb

           PID: 6116 (ghcup)
           UID: 1000 (jack)
           GID: 1000 (jack)
        Signal: 4 (ILL)
     Timestamp: Sat 2024-02-17 00:02:51 CST (18min ago)
  Command Line: ghcup list
    Executable: /home/jack/.ghcup/bin/ghcup
 Control Group: /user.slice/user-1000.slice/user@1000.service/app.slice/app-alacritty-83b7a964c25042d99cc6b8b07d91d3a7.scope
          Unit: user@1000.service
     User Unit: app-alacritty-83b7a964c25042d99cc6b8b07d91d3a7.scope
         Slice: user-1000.slice
     Owner UID: 1000 (jack)
       Boot ID: 19b6995a4b854a37986f6783f7a25360
    Machine ID: 88cbced372bf4c199ac9a3e7ffeffceb
      Hostname: nobara-pc
       Storage: /var/lib/systemd/coredump/core.ghcup.1000.19b6995a4b854a37986f6783f7a25360.6116.1708149771000000.zst (present)
  Size on Disk: 1.5M
       Message: Process 6116 (ghcup) of user 1000 dumped core.

                Module /home/jack/.ghcup/bin/ghcup without build-id.
                Stack trace of thread 6116:
                #0  0x0000000000f14e1a n/a (/home/jack/.ghcup/bin/ghcup + 0xb14e1a)
                ELF object binary architecture: AMD x86-64

GNU gdb (Fedora Linux) 14.1-4.fc39
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/jack/.ghcup/bin/ghcup...
(No debugging symbols found in /home/jack/.ghcup/bin/ghcup)
[New LWP 6116]
[New LWP 6118]
[New LWP 6117]
[New LWP 6120]
[New LWP 6119]

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for system-supplied DSO at 0x7ffee4da0000
Core was generated by `ghcup list'.
Program terminated with signal SIGILL, Illegal instruction.
#0  0x0000000000f14e1a in ?? ()
[Current thread is 1 (LWP 6116)]

I have a massive dump file (8000+ lines) as well. Would that be useful?
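
For readers following along: the "Signal: 4 (ILL)" line in the coredumpctl output and the shell's "Illegal instruction" message both refer to SIGILL. The following is a minimal sketch, not part of ghcup, of how an exit status can be mapped back to a signal name. The function name `describe_exit` is made up for illustration. It assumes the usual conventions: Python's subprocess module reports death by signal N as -N, and most shells report it as 128 + N.

```python
import signal


def describe_exit(returncode: int) -> str:
    """Describe a process exit status, naming the signal if one killed it.

    Negative values follow Python's subprocess convention (-N means signal N).
    Values above 128 follow the common shell convention (128 + N).
    """
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return f"exited with status {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a signal number known on this platform.
        name = f"signal {signum}"
    return f"killed by {name}"


if __name__ == "__main__":
    # Signal 4 is what coredumpctl reported above as "4 (ILL)".
    print(describe_exit(-4))
    print(describe_exit(128 + 4))
    print(describe_exit(0))
```

With `subprocess.run(["ghcup", "list"]).returncode` as the input, a crash like the one above would be expected to show up as a negative value.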

@hasufell
Member

Yes

@hasufell
Member

CCing @bgamari in case this might be interesting

@jackjohn7
Author

I've included a Google Drive link to the file since it's too large to be attached here: https://drive.google.com/file/d/1IkbqgBa19s33RvzReV1J1U7S3jzzCWKf/view?usp=sharing

@runeksvendsen
Collaborator

runeksvendsen commented Feb 17, 2024

> I've included a google drive link to the file since it's too large to be attached here https://drive.google.com/file/d/1IkbqgBa19s33RvzReV1J1U7S3jzzCWKf/view?usp=sharing

@jackjohn7 you can attach it here if you zip it: core_dump.zip

@bgamari
Collaborator

bgamari commented Feb 17, 2024

Very odd. Indeed it appears the executable jumped into the middle of an abyss:

>>> x/8i $pc
=> 0xf14e1a:    add    %al,(%rax)
   0xf14e1c:    add    %al,(%rax)
   0xf14e1e:    add    %al,(%rax)

Even stranger, the Haskell stack register is complete nonsense.

>>> print $rbp
$1 = (void *) 0x12

Something has gone horribly wrong in this program.

I have tried to reproduce this locally with Nobara 39 running under a VM on a Ryzen 5900X to no avail.
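
As a worked check on the addresses quoted in this thread, here is a small sketch that relates the faulting program counter to the module offset in the coredumpctl stack trace. The values are copied from the output above. The variable and function names are illustrative only. The result is consistent with an executable mapped at 0x400000, the conventional base for non-position-independent x86-64 binaries, though the thread itself does not confirm how ghcup is linked. The two-byte spacing between the disassembled addresses also matches the encoding of `add %al,(%rax)`, which is the pair of zero bytes `00 00`.

```python
# Addresses copied from the coredumpctl and gdb output in this thread.
PC = 0x0000000000F14E1A      # faulting program counter (frame #0)
MODULE_OFFSET = 0xB14E1A     # "ghcup + 0xb14e1a" from the stack trace

# Instruction addresses from the x/8i listing.
LISTING = [0xF14E1A, 0xF14E1C, 0xF14E1E]


def load_base(pc: int, module_offset: int) -> int:
    """Return the address at which the module must have been mapped."""
    return pc - module_offset


def instruction_lengths(addresses: list[int]) -> list[int]:
    """Return the byte distance between consecutive instruction addresses."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


base = load_base(PC, MODULE_OFFSET)
print(hex(base))                      # 0x400000
print(instruction_lengths(LISTING))   # [2, 2]
```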

@bgamari
Collaborator

bgamari commented Feb 17, 2024

@jackjohn7, a few questions:

  • I assume you are able to reliably reproduce this crash? Is there any variance in how it manifests?
  • Which Nobara release are you using?
  • Are you certain that the underlying hardware is solid (no overclocking, memory has been tested)? Have you observed any other instability in this system?
  • Could you try building ghcup from source with the Nobara packaged toolchain and confirm that the issue is not reproducible?

@jackjohn7
Author

jackjohn7 commented Feb 17, 2024

  • It happens when executing any command with ghcup other than --help. During installation using the script, after it installs everything, it seems to attempt to do ghcup list or something similar and it fails. It'll fail when I execute just about any command. Edit: It may be trying to run the install command to install ghc, hls, or stack. It doesn't finish though, so I can't really see the output haha
  • Nobara 39 KDE Plasma
  • I've had instability in Windows with the system in the past, but I've tested the memory with MemTest86 and everything was in order. I'm not overclocking anything, since my experience doing so in the past was quite unstable. I used to get blue screens in Windows, but I've had a very stable experience since I stopped overclocking my RAM (it's supposed to be overclocked to achieve its advertised speed). It's also been super stable on Linux. I play games and write a ton of code on it.
  • I can definitely try. I tried looking for documentation on how to do that, and I didn't see it. I'm not super familiar with Haskell or its ecosystem. Still trying to learn.

@jackjohn7
Author

Update

AVX512 was disabled for my CPU. I carelessly disabled this feature for playing a particular game. Re-enabling it seems to have fixed the issue entirely. That, or updating my system may have affected it. In any case, the toolchain is now working for me.

I see this got tagged as a bug. Was this reproduced for anyone else?
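
For anyone landing here with a similar crash: the lscpu output posted earlier in this thread lists no `avx512*` flags, which fits with AVX-512 having been switched off at the time. Below is a minimal sketch of how to check this on Linux. The function names are made up for illustration, and it assumes the kernel exposes feature flags on a `flags` line in /proc/cpuinfo, as it does on x86. Whether the ghcup binary actually requires AVX-512 is not confirmed in this thread, so treat a missing flag as a hint rather than a diagnosis.

```python
def avx512_flags(flags_text: str) -> list[str]:
    """Return the sorted, de-duplicated AVX-512 flags in a whitespace-separated flag list."""
    return sorted({flag for flag in flags_text.split() if flag.startswith("avx512")})


def read_cpuinfo_flags(path: str = "/proc/cpuinfo") -> str:
    """Return the flag list of the first CPU in /proc/cpuinfo, or "" if unavailable."""
    try:
        with open(path) as fh:
            for line in fh:
                if line.startswith("flags"):
                    return line.split(":", 1)[1]
    except OSError:
        # Not Linux, or /proc is not readable.
        pass
    return ""


if __name__ == "__main__":
    found = avx512_flags(read_cpuinfo_flags())
    print("AVX-512 flags:", " ".join(found) if found else "none reported")
```

The same function can be applied to a pasted lscpu "Flags:" line, which makes it easy to compare a working machine against a failing one.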
