Consider replacing Auroraboot for netbooting #2928

Open
jimmykarily opened this issue Oct 9, 2024 · 3 comments

Comments

@jimmykarily
Contributor

jimmykarily commented Oct 9, 2024

The golang library we use in Auroraboot to implement netbooting is no longer maintained: https://github.com/danderson/netboot

It already fails on some devices (e.g. ASUS PN64). We should consider better-maintained alternatives.

netboot.xyz seems like a good candidate. Other projects are already in the list: https://netboot.xyz/docs/faq#what-operating-systems-are-currently-available-on-netbootxyz

By default this needs internet access, but there is a way to self-host it too: https://netboot.xyz/docs/selfhosting#deploying-with-docker

We should give this a spin and see if it's a viable option that can replace auroraboot for the netbooting part. If so, there are fewer things auroraboot needs to implement, which will help us consolidate into fewer tools in the future (instead of maintaining all of osbuilder, auroraboot, enki, etc.).

Even if it doesn't work locally, we should still consider getting Kairos added to the OSes netboot.xyz supports.

@jimmykarily jimmykarily converted this from a draft issue Oct 9, 2024
@jimmykarily
Contributor Author

jimmykarily commented Oct 9, 2024

It turns out netboot.xyz only provides a better UX on the client side; it does not implement the server side.
There are other candidates, like this one: https://github.com/insomniacslk/dhcp

I tried it and it almost works, but on my ASUS it still fails with "sending block 0: code=8, error: User aborted the transfer", which suggests the issue may be specific to this hardware.
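For anyone reading that error line: the "code=8" part is the error code field of a TFTP ERROR packet. Codes 0 to 7 are defined in RFC 1350, and RFC 2347 adds code 8 for a transfer terminated during option negotiation; the message text is chosen by whoever sent the packet. The sketch below is only an illustration of how such a packet decodes, using the Go standard library. The names `tftpError`, `tftpErrorMeaning` and `parseTFTPError` are made up for this example and are not part of pin/tftp or any other library mentioned here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// tftpError holds the two fields of a TFTP ERROR packet (opcode 5).
// Hypothetical type, for illustration only.
type tftpError struct {
	Code    uint16
	Message string
}

// Meanings of the standard error codes: 0-7 from RFC 1350, 8 from RFC 2347.
var tftpErrorMeaning = map[uint16]string{
	0: "not defined, see error message",
	1: "file not found",
	2: "access violation",
	3: "disk full or allocation exceeded",
	4: "illegal TFTP operation",
	5: "unknown transfer ID",
	6: "file already exists",
	7: "no such user",
	8: "transfer terminated due to option negotiation",
}

// parseTFTPError decodes a raw ERROR packet:
// 2-byte opcode (5), 2-byte error code, NUL-terminated message.
func parseTFTPError(pkt []byte) (tftpError, error) {
	if len(pkt) < 5 {
		return tftpError{}, errors.New("packet too short for a TFTP ERROR")
	}
	if op := binary.BigEndian.Uint16(pkt[0:2]); op != 5 {
		return tftpError{}, fmt.Errorf("not an ERROR packet: opcode %d", op)
	}
	code := binary.BigEndian.Uint16(pkt[2:4])
	msg := pkt[4:]
	// Drop the trailing NUL (and anything after it) if present.
	if i := bytes.IndexByte(msg, 0); i >= 0 {
		msg = msg[:i]
	}
	return tftpError{Code: code, Message: string(msg)}, nil
}

func main() {
	// A packet shaped like the one behind the log line quoted above.
	pkt := append([]byte{0, 5, 0, 8}, []byte("User aborted the transfer\x00")...)
	e, err := parseTFTPError(pkt)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("code=%d (%s), message=%q\n", e.Code, tftpErrorMeaning[e.Code], e.Message)
}
```

If that reading is right, the abort came from the client side during or right after option negotiation rather than from a failure to read the file on the server, but that is an interpretation of the error code, not something verified on the ASUS hardware.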

For reference this is the code I used:

package main

import (
	"fmt"
	"io"
	"log"
	"net"
	"os"

	"github.com/insomniacslk/dhcp/dhcpv4"
	"github.com/insomniacslk/dhcp/dhcpv4/server4"
	"github.com/pin/tftp"
)

func main() {
	// Run the TFTP server in a separate goroutine
	go startTFTPServer()

	// Start the DHCP server
	startDHCPServer()
}

// startDHCPServer initializes and starts the DHCP server
func startDHCPServer() {
	handler := func(conn net.PacketConn, peer net.Addr, pkt *dhcpv4.DHCPv4) {
		fmt.Printf("pkt = %+v\n", *pkt)

		// Check if it's a DHCP Discover or Request
		if pkt.MessageType() != dhcpv4.MessageTypeDiscover && pkt.MessageType() != dhcpv4.MessageTypeRequest {
			return
		}

		// Create a reply based on the request packet
		resp, err := dhcpv4.NewReplyFromRequest(pkt)
		if err != nil {
			log.Printf("failed to create DHCP reply: %v", err)
			return
		}

		// Define the IP address of the TFTP server
		tftpServerIP := net.IP{192, 168, 1, 36}
		//tftpServerIP := net.IP{192, 168, 122, 1}

		// Set DHCP options for netboot
		resp.Options.Update(dhcpv4.OptBootFileName("kairos.ipxe"))
		resp.Options.Update(dhcpv4.OptServerIdentifier(tftpServerIP))
		resp.Options.Update(dhcpv4.OptTFTPServerName(tftpServerIP.String()))
		//resp.Options.Update(dhcpv4.Option{Code: dhcpv4.OptionDHCPMessageType, Value: dhcpv4.MessageTypeOffer})

		// Optionally set additional options
		//resp.Options.Update(dhcpv4.OptIPAddressLeaseTime(3600 * time.Second)) // Lease time of 1 hour

		fmt.Printf("resp.Options = %+v\n", resp.Options)
		//fmt.Printf("resp = %+v\n", resp)

		// Send the response back to the client
		if _, err := conn.WriteTo(resp.ToBytes(), peer); err != nil {
			log.Printf("failed to send DHCP response: %v", err)
		}
	}

	//iface := "enp121s0" // Replace with the actual network interface name
	iface := "" // Replace with the actual network interface name
	srv, err := server4.NewServer(iface, nil, handler)
	if err != nil {
		log.Fatalf("failed to create DHCP server: %v", err)
	}

	log.Printf("Starting DHCP server on interface %s...", iface)
	if err := srv.Serve(); err != nil {
		log.Fatalf("failed to serve DHCP: %v", err)
	}
}

// startTFTPServer initializes and starts the TFTP server
func startTFTPServer() {
	// Define the TFTP server
	srv := tftp.NewServer(readHandler, nil)

	// Start the TFTP server on port 69
	go func() {
		if err := srv.ListenAndServe(":69"); err != nil {
			log.Fatalf("failed to start TFTP server: %v", err)
		}
	}()

	log.Println("TFTP server started on port 69...")
}

// readHandler serves files requested by TFTP clients
func readHandler(filename string, rf io.ReaderFrom) error {
	fmt.Printf("Reading file %s\n", filename)
	// Path to the directory where TFTP boot files are stored
	filePath := fmt.Sprintf("./%s", filename)

	// Open the requested file
	file, err := os.Open(filePath)
	if err != nil {
		log.Printf("failed to open file %s: %v", filePath, err)
		return err
	}
	defer file.Close()

	// Use the io.ReaderFrom interface to transfer the file
	log.Printf("Serving file %s", filename)
	if _, err := rf.ReadFrom(file); err != nil {
		log.Printf("failed to serve file %s: %v", filename, err)
		return err
	}

	return nil
}

with kairos.ipxe being this file: https://github.com/kairos-io/kairos/releases/download/v3.2.1/kairos-alpine-3.19-core-amd64-generic-v3.2.1.ipxe

I'm not sure it's worth digging into this more. I would rather not maintain a PXE boot server if we can find something that works out of the box by simply providing an iPXE script.
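A side note on the experiment code, in case anyone picks it up: readHandler builds the path with fmt.Sprintf("./%s", filename), so a client-supplied name containing ".." can reach outside the serving directory. A minimal sketch of one way to confine lookups to a root directory follows. The helper name `resolveTFTPPath` and the "/srv/tftp" root are assumptions for illustration only; it uses just the standard library and does not account for symlinks inside the root.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strings"
)

// resolveTFTPPath maps a client-supplied TFTP filename to a path under root.
// Hypothetical helper, not part of pin/tftp.
func resolveTFTPPath(root, name string) string {
	// Some clients send backslashes; normalise to forward slashes first.
	name = strings.ReplaceAll(name, "\\", "/")
	// Rooting the name at "/" before cleaning means any ".." segments
	// are resolved against "/" and cannot climb above it.
	cleaned := path.Clean("/" + name)
	// Re-anchor the cleaned, rooted name under the serving directory.
	return filepath.Join(root, filepath.FromSlash(cleaned))
}

func main() {
	root := "/srv/tftp" // assumed serving directory
	for _, name := range []string{"kairos.ipxe", "/kairos.ipxe", "../../etc/passwd"} {
		fmt.Printf("%q -> %s\n", name, resolveTFTPPath(root, name))
	}
}
```

In readHandler this would replace the fmt.Sprintf call, with the result passed to os.Open as before.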

@mudler
Member

mudler commented Oct 9, 2024

netboot provides a very specific functionality at the moment, which is to work alongside an already-existing DHCP server on the same network. That is hard to replace. Maybe we can contact the maintainer and see if there is some way for the community to keep it up to date?

Maybe we can just fade it out, keep it "as is", and leverage things like UEFI HTTP boot. However, the same functionality in terms of UX (specify a container image and "boot") is hard to replicate.

@jimmykarily
Contributor Author

Let's first decide what Auroraboot is responsible for here: #1633
and then we can discuss this again.
