Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

elfloader: improve stability #191

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Commits on Mar 21, 2024

  1. elfloader: arm: stabilize secondary core booting

    EFI may boot the elfloader with caches disabled on the secondary cores,
    we want the value of non_boot_lock to be visible.
    
    Some barriers are added to stabilize SMP booting in the elfloader.
    
    Co-authored-by: Yanyan Shen <[email protected]>
    Co-authored-by: Matthias Rosenfelder <[email protected]>
    Signed-off-by: Andy Bui <[email protected]>
    3 people committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    7e410f0 View commit details
    Browse the repository at this point in the history
  2. elfloader: arm: fix incorrect sp when restoring

    bootloader parameters.
    
    Signed-off-by: Tw <[email protected]>
    Tw authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    78644d1 View commit details
    Browse the repository at this point in the history
  3. elfloader: arm: fix function declaration

    (type mismatch).
    
    This chops off the aff3 level on Aarch64.
    
    Why does the compiler not warn??? Because the own header was not
    included. If you just include the header (without the change of the
    return value in the header), we get:
    
    seL4_tools/elfloader-tool/src/arch-arm/64/cpuid.c:14:10: error:
    conflicting types for 'read_cpuid_mpidr'
    
       14 | uint32_t read_cpuid_mpidr(void)
          |          ^~~~~~~~~~~~~~~~
    
    In file included from /home/mro/nvos_neu2/tools/seL4_tools/
    elfloader-tool/src/arch-arm/64/cpuid.c:9:
    elfloader-tool/include/arch-arm/cpuid.h:15:8: note:
    previous declaration of 'read_cpuid_mpidr' was here
       15 | word_t read_cpuid_mpidr(void);
          |        ^~~~~~~~~~~~~~~~
    [190/200] Building C object elfloader-tool/CMakeFiles/elfloader.dir/
    src/arch-arm/smp_boot.c.obj
    ninja: build stopped: subcommand failed.
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    864c7c7 View commit details
    Browse the repository at this point in the history
  4. elfloader: arm: fix alignment of AArch32

    pagetables.
    
    The 64 kiB alignment is a maximum requirement for a stage2
    concatenated pagetable. See Table G5-4 in ARM DDI 0487I.a, page
    G5-9186.
    
    Note: Both comments at the top of the file as well as in line 85
    say "64 kiB". 2^14 is unfortunately only 16 kiB.
    
    Note2: This code is not executed on AArch64, because the
    finish_relocation() function panics on AArch64. The latter always
    takes the "shortcut" via continue_boot().
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    7417e3c View commit details
    Browse the repository at this point in the history
  5. elfloader: fix variable prototype and remove var

    shadowing.
    
    The variable "dtb_size" is of type "size_t" and is defined in
    src/arch-arm/sys_boot.c, line 36.
    
    "size_t" is most certainly NOT the same size as "uint32_t", even on
    32-bit architectures. Thus, the declaration in smp_boot.c is incorrect,
    since it does not match the definition in sys_boot.c. Why even create
    a local declaration - put this in a common header file and you will
    see those problems right away. Single point of maintenance!
    
    This may lead to an incorrectly sized memory access, that only happens
    to be correct by chance in Little-Endian mode. For ARM in Big-Endian
    mode this is a bug and will most likely result in an incorrect DTB
    size of zero.
    
    This fixes c573511 ("elfloader: pass DTB from bootloader to seL4 on
    ARM").
    
    Moreover, remove the shadowing of global variables by defining local
    ones with the same name => Rename the local one in src/common.c.
    
    This could have been detected with "-Wshadow".
    
    Practically speaking our DTBs are always (a lot) smaller than 32-bit.
    Thus, continue to pass a 32-bit size to the kernel in order to not
    change the API here.
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    1b4cddf View commit details
    Browse the repository at this point in the history
  6. elfloader: fix EFI image size

    The size calculation was incorrect, unfortunately. That lead to
    an incorrect memory map provided by UEFI. When switching on EFI_DEBUG
    one can see that the memory range occupied by the ELF-loader
    ("paddr=" line) is not fully marked as used by UEFI and the last
    few pages of the ELF-loader are actually marked as being free.
    
    Output before:
    
    ELF-loader started on Image ranges:
      paddr=[7f9bc0000..7fcde1fff]
        text[7f9bc0000..7f9bd1e8f]
        data[7f9bd2000..7fcddd9ff]
         bss[7f9bd2850..7f9c0a4cf]
       edata[7fcddda00..7fcde1fff]
    [...]
    [paddr=0x180023000-0x7f9bbffff]
        [type = Conventional, attr: Normal  <- ok
    [paddr=0x7f9bc0000-0x7fcdddfff]
        [type = Loader Code, attr: Normal   <- Not ok: end address too low
                                                    Should be 0x4000 higher
    
    [paddr=0x7fcdde000-0x7ffffffff]
        [type = Conventional, attr: Normal  <- Not ok: start address too low
                                                    Should be 0x4000 higher
    
    After:
    
    ELF-loader started on Image ranges:
      paddr=[7f9bbc000..7fcdddfff]
        text[7f9bbc000..7f9bcde8f]
        data[7f9bce000..7fcdd99ff]
         bss[7f9bce850..7f9c064cf]
       edata[7fcdd9a00..7fcdddfff]
    [...]
    [paddr=0x180023000-0x7f9bbbfff]
        [type = Conventional, attr: Normal  <- ok
    [paddr=0x7f9bbc000-0x7fcdddfff]
        [type = Loader Code, attr: Normal   <- ok (same as above)
    [paddr=0x7fcdde000-0x7ffffffff]
        [type = Conventional, attr: Normal  <- ok (starts one after
                                                   paddr end)
    
    Note: You don't have that debug output (EFI_DEBUG) in your code,
    that prints the UEFI memory map. So you have to believe me that
    this is the actual output.
    
    This fixes 030d83b ("elfloader: improve EFI support").
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    326dc87 View commit details
    Browse the repository at this point in the history
  7. elfloader: move the data of ELFloader together

    The driver list section was part of the "*(COMMON)" section that
    was placed *after* the image payload (kernel, rootserver etc.).
    The driver list is data and should be placed adjacent to other
    data belonging to the ELFloader.
    
    This is clearly visible in the mapfile (which you don't generate).
    
    The driver list entry is present in the aarch32 linker script for
    EFI, why was it missing for 64 bit?
    
    No functional change.
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    6e44ac6 View commit details
    Browse the repository at this point in the history
  8. elfloader: arm: do not hard-code values

    Use existing defines to make the code more descriptive. For this
    move some defines out of the assembler-only file.
    
    This is a preparation patch for upcoming patches.
    
    No functional change.
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    33a9819 View commit details
    Browse the repository at this point in the history
  9. elfloader: arm: fix potential UB in right shift

    Regarding right shifts the standard says:
    
    "The type of the result is that of the promoted left operand.
    The behavior is undefined if the right operand is negative, or
    greater than or equal to the length in bits of the promoted left
    operand."
    
    Corresponding GCC warning (if used on a "small" type like uint8_t):
    
    main.c:25:39: warning: right shift count >= width of type
    [-Wshift-count-overflow]
    
    \#define GET_PGD_INDEX(x)        (((x) >> (ARM_2MB_BLOCK_BITS +
    PMD_BITS + PUD_BITS)) & MASK(PGD_BITS))
    
    main.c:46:39: note: in expansion of macro ‘GET_PGD_INDEX’
       46 |     printf("GET_PGD_INDEX(x): %lu\n", GET_PGD_INDEX(x));
          |                                       ^~~~~~~~~~~~~
    
    Thus, make sure that we never exceed/reach it by explicitly casting
    to a 64-bit type.
    
    It also allows using a pointer as macro parameter.
    
    This is a preparation patch for upcoming patches.
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    f317679 View commit details
    Browse the repository at this point in the history
  10. elfloader: Exit UEFI boot services very early

    UEFI is an operating system that hides as a bootloader. UEFI is in
    control of the machine as long as we didn't call exit_boot_services.
    
    For instance, UEFI may set up timers to interrupt us while we're
    fiddling with hardware and UEFI is fiddling with hardware itself and
    UEFI may be fiddling with the exact same hardware that we are
    fiddling with, while we're being preempted. That is not good.
    
    The previous state of ELFloader is that before exiting UEFI boot
    services, we already called platform_init() in main(), which may
    fiddle around with all kinds of hardware. Thus, we should have
    already exited UEFI boot services when main() is called.
    
    Note that exit_boot_services now still executes on the UEFI stack
    (since we switch the stack in _start()). But so did e.g. the
    clear_bss() function. I don't see a problem here.
    
    It's more a question the other way around: Previously, we called
    into UEFI with exit_boot_services on our own, potentially too small,
    stack. Do we have enough space for UEFI to execute? How are we
    supposed to know that? The UEFI implementation can change, so we
    can never be sure. But it would be unreasonable for UEFI to start
    us with a stack that is too small to call any UEFI API, including
    exit_boot_services. So we can safely assume that there is enough
    space when using the UEFI stack (since our use of stack to this
    point is minimal).
    
    Also, mask all exceptions until we are about to enter the kernel.
    We do not want to run with whatever state the bootloader set us up
    before, do we? We only re-enable the asyncs and debugs; interrupts
    and FIQs are still masked when entering the kernel. What would we
    gain from that? We don't expect any. Asyncs (SErrors), however,
    can indicate that we e.g. touched memory that we shouldn't have
    touched (secure memory).
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    970da42 View commit details
    Browse the repository at this point in the history
  11. elfloader: Rewrite loop: Do not use goto

    There are better ways to loop in C.
    
    No functional change.
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    657d6ca View commit details
    Browse the repository at this point in the history
  12. elfloader: fix UEFI integration bug: descriptor

    size mismatch.
    
    The UEFI specification 2.10 says in section 7 for
    EFI_BOOT_SERVICES.GetMemoryMap():
    
    "The GetMemoryMap() function also returns the size and revision
    number of the EFI_MEMORY_DESCRIPTOR. The DescriptorSize represents
    the size in bytes of an EFI_MEMORY_DESCRIPTOR array element returned
    in MemoryMap. The size is returned to allow for future expansion of
    the EFI_MEMORY_DESCRIPTOR in response to hardware innovation.
    
    The structure of the EFI_MEMORY_DESCRIPTOR may be extended in the
    future but it will remain backwards compatible with the current
    definition. Thus OS software must use the DescriptorSize to find
    the start of each EFI_MEMORY_DESCRIPTOR in the MemoryMap array."
    
    This mismatch is the case on (our) Orin UEFI. The compiled size of
    a memory descriptor is 40 Bytes, but the Orin UEFI implementation
    uses 48 Bytes per descriptor.
    
    Thus, due to the requirement to use a larger size than the
    returned total size (due to the fact that the buffer allocation
    itself may lead to one more entry in the memory map), we must
    increase by the size (in terms of number of descriptors), but
    use the number of bytes that UEFI uses for one memory map entry,
    not what we think it might be.
    
    Some other people already stumbled over this:
    https://forum.osdev.org/viewtopic.php?f=1&t=32953
    
    Based on the comment in the existing code, the author seems to
    not have understood how the size of the memory map can be
    determined. Just read the spec!
    So we better update that misleading comment.
    
    Signed-off-by: Matthias Rosenfelder <[email protected]>
    mro-github-12345 authored and Andy Bui committed Mar 21, 2024
    Configuration menu
    Copy the full SHA
    c555152 View commit details
    Browse the repository at this point in the history