M1 support for website builds (plus a few component updates) #629

Closed
wants to merge 13 commits into from


rossturk
Contributor

@rossturk rossturk commented Jul 8, 2022

When running ./site.sh build-site on a new M1-based macOS machine using Docker Desktop, I encountered a few problems. The root cause of all of them: Docker runs everything inside a linux/aarch64 container instead of the linux/x86_64 one that you get when you run on an x86 Mac. This manifests in a successful build-image followed by an unsuccessful install-node-deps.
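As a rough illustration of the mismatch described above, the sketch below maps a host machine name to the platform a container would get by default. Docker calls these platforms linux/arm64 and linux/amd64; the helper name `docker_platform` and the lookup table are assumptions made for this example and are not part of site.sh.

```python
import platform

# Hypothetical lookup table: machine names as reported by platform.machine(),
# mapped to the platform strings Docker uses for Linux containers.
_MACHINE_TO_DOCKER = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "arm64": "linux/arm64",    # reported by macOS on M1
    "aarch64": "linux/arm64",  # reported by Linux on 64-bit ARM
}


def docker_platform(machine=None):
    """Return the Docker platform string for a machine name.

    When no name is given, the current host's machine name is used.
    """
    name = (machine or platform.machine()).lower()
    try:
        return _MACHINE_TO_DOCKER[name]
    except KeyError:
        raise ValueError(f"unrecognized machine architecture: {name!r}")
```

Calling `docker_platform()` with no argument on an M1 Mac would be expected to report linux/arm64, which is why prebuilt x86-only dependencies fail during install-node-deps.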

To address these issues I have done the following:

Remove node-sass
The node-sass module used by landing-pages has been deprecated and does not build on aarch64. There is a newer module, sass, that is a drop-in replacement, and it does build! This PR swaps out the module.

Upgrade to Node 16
The new sass module requires at least Node 12 to run, and we were on 10. I decided to upgrade us all the way to 16, which is the current LTS release. Seemed prudent.
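A minimal sketch of how a build script could check the Node requirement mentioned above before installing dependencies. It parses the text printed by `node --version` (for example `v16.15.1`). The function names and the `MIN_NODE_MAJOR` constant are assumptions for illustration, and nothing like this is claimed to exist in the PR.

```python
import re

# Minimum major version taken from the comment above ("at least Node 12").
MIN_NODE_MAJOR = 12


def node_major(version_output):
    """Extract the major version from `node --version` output such as 'v16.15.1'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if match is None:
        raise ValueError(f"cannot parse Node version from {version_output!r}")
    return int(match.group(1))


def node_is_new_enough(version_output, minimum=MIN_NODE_MAJOR):
    """Return True when the reported Node major version meets the minimum."""
    return node_major(version_output) >= minimum
```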

Build Hugo from source
There is no aarch64 binary for Hugo that I've been able to find. That's a huge bummer because it means we have to build one from source if we want it to work both inside Docker on newer Macs and also inside standard systems and CI. Hugo is written in golang, and the version of golang that comes with Debian stretch is too old to build the version of Hugo we need. Eep.

This PR introduces an additional stage to the Dockerfile that installs gccgo, the GCC-based Go compiler, and uses it to bootstrap a modern version of golang. It then uses that toolchain to build Hugo and copies the binary over to the second stage, where it can be used in airflow-site containers. Once the Hugo team starts releasing binaries for this arch, I think we should consume them instead.


I've verified that this can be used to successfully build a site on a new M1 Mac + a traditional x86 Linux server. I plan to come back and finalize this draft PR once I've done additional testing to make sure none of this affects the site output.

I also plan to do a bit of experimenting with Hugo versions to figure out the right one to use. The newest ones do not work, but I am not confident I have chosen the most recent (stable) one that does.

@potiuk
Member

potiuk commented Jul 8, 2022

Oh... fantastic that you are doing this! The site-building refresh has been long overdue. There is also an outstanding issue, #338, about rewriting site.sh in Python, which @Bowrna is looking at, so maybe forces can be joined :)

@potiuk
Member

potiuk commented Jul 8, 2022

> There is no aarch64 binary for Hugo that I've been able to find. That's a huge bummer because it means we have to build one from source if we want it to work both inside Docker on newer Macs and also inside standard systems and CI. Hugo is written in golang, and the version of golang that comes with Debian stretch is too old to build the version of Hugo we need. Eep.

We moved to buster in Airflow long ago, so maybe it will help if we do the same here?

@potiuk
Member

potiuk commented Jul 8, 2022

Just a thought, @rossturk. Since this is mostly golang and yarn, we do not TECHNICALLY need to run everything with Docker. I think the set of technologies we use to build "airflow-site" is "multi-platform enough" to not require a Docker image? I think the main reason there is no official Hugo aarch64 image is that it is "good enough" to have everything installed in the "host" environment rather than having to use Docker.

For one, Hugo installs nicely with brew on M1 with the arm64 architecture, and so does node. Airflow requires a number of system-level libraries (mysql and others) that make it difficult to have a "consistent" build environment across a host of different operating systems. Building the site, by contrast, is done rarely enough, and its dependencies are multi-platform enough, that it should be possible to simply drop the Docker approach altogether.

@rossturk
Contributor Author

rossturk commented Jul 8, 2022

> We moved to buster in Airflow long ago, so maybe it will help if we do the same here?

I didn't want to make such a major change as an early contributor :) but as it turns out this is quite helpful. The version of golang in buster is new enough to build the older version of Hugo that we use.

I've made this update, and it cuts quite a bit of time off of the build.

@rossturk
Contributor Author

rossturk commented Jul 8, 2022

> Just a thought, @rossturk. Since this is mostly golang and yarn, we do not TECHNICALLY need to run everything with Docker. I think the set of technologies we use to build "airflow-site" is "multi-platform enough" to not require a Docker image?

Yeah, I think this could be possible. With homebrew/apt/yum and venv we can probably create a fairly portable build environment. But I still find a lot of benefit in a containerized process. It ensures consistency between the dev's local environment and the one inside CI.

Right now the build is failing on this PR and I think it's because I updated python to python3 in a few spots in site.sh. I did this because one of the systems I test on has both python 2 and 3 installed, and the symlink for python was pointing to 2. I feel it's better to be explicit about python versions, generally. Of course, now I am solving a problem on my local machine that doesn't exist in CI, and along the way I am now suspecting I might have broken CI!

This wouldn't be a problem if those steps in site.sh ran inside the container. But the current process spans the container + the local environment. So we're somewhat missing the benefit.

On one of the other sites I manage, we support local environments as you suggest. It's easy to work with. However, one consequence is that certain components of node behave differently on different architectures, leading to YAML files that contain the same information but constantly switch order, etc., creating lots of git noise.
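The "same information, different order" noise described above usually goes away when generated files are written with a fixed key order. The sketch below shows the idea with the standard-library `json` module as a stand-in for a YAML writer. The function name `stable_dump` is an assumption for this example and does not describe how either site actually generates its files.

```python
import json


def stable_dump(data):
    """Serialize a mapping with sorted keys so the output does not depend on insertion order."""
    return json.dumps(data, sort_keys=True, indent=2) + "\n"
```

Two mappings with the same contents then produce byte-identical text regardless of the order in which a tool happened to build them, so git sees no diff.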

@potiuk
Member

potiuk commented Jul 8, 2022

> Right now the build is failing on this PR and I think it's because I updated python to python3 in a few spots in site.sh. I did this because one of the systems I test on has both python 2 and 3 installed, and the symlink for python was pointing to 2. I feel it's better to be explicit about python versions, generally. Of course, now I am solving a problem on my local machine that doesn't exist in CI, and along the way I am now suspecting I might have broken CI!

There are some problems with the python3 alias, and expecting it to exist is not universal. Even https://peps.python.org/pep-0394/ leaves freedom to decide which alias points to which version, and I've seen cases where python3 was missing as well (this is left to the discretion of system administrators and distributions). However, the "future-looking" approach is that Python 2 will all but disappear and the python alias will point to Python 3. And then there will also be a movement to remove the python3 alias even where it exists. So in all Airflow code we tend to use:

#!/usr/bin/env python

Though occasionally you will find

#!/usr/bin/env python3

BTW, on Debian- and Ubuntu-based distros you can install the python-is-python3 package to force this behaviour:

sudo apt install python-is-python3
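Since either alias may be missing on a given host, a launcher can probe for whichever one exists instead of hard-coding a name. The sketch below does that with `shutil.which`. The function name `find_python`, the candidate order, and the injectable `which` parameter are assumptions for illustration, not something site.sh does.

```python
import shutil


def find_python(candidates=("python3", "python"), which=shutil.which):
    """Return the path of the first interpreter alias found on PATH, or None.

    The candidate order is an assumption; reverse it to prefer the bare
    `python` alias, as the Airflow convention above does.
    """
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

Note that this only locates an alias. It does not verify which Python version the alias points to.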

> Yeah, I think this could be possible. With homebrew/apt/yum and venv we can probably create a fairly portable build environment. But I still find a lot of benefit in a containerized process. It ensures consistency between the dev's local environment and the one inside CI.

> This wouldn't be a problem if those steps in site.sh ran inside the container. But the current process spans the container + the local environment. So we're somewhat missing the benefit.

> On one of the other sites I manage, we support local environments as you suggest. It's easy to work with. However, one consequence is that certain components of node behave differently on different architectures, leading to YAML files that contain the same information but constantly switch order, etc., creating lots of git noise.

I wholeheartedly agree with all of the above, and I have exactly the same experience. I am (as you might notice from Airflow CI and Breeze) a huge Docker fan for CI + dev env, for precisely the reasons you describe. So if the buster migration lets us get a "simple" Docker env, this is certainly the best approach.

@rossturk
Contributor Author

rossturk commented Jul 8, 2022

> However, the "future-looking" approach is that Python 2 will all but disappear and the python alias will point to Python 3. And then there will also be a movement to remove the python3 alias even where it exists.

This makes sense to me. I'll revert this part of the PR.

And actually, while I am in here, I think instead of having site.sh call python <scriptname> I'll add the #! to the top of the script and execute it directly.
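Executing a script directly needs two things: a `#!` line at the top and the executable bit on the file. The sketch below shows both steps. The helper names `with_shebang` and `make_executable` are assumptions for illustration, and the shebang string follows the Airflow convention quoted earlier in the thread.

```python
import os
import stat

# Shebang following the convention quoted earlier in the thread.
SHEBANG = "#!/usr/bin/env python"


def with_shebang(source, shebang=SHEBANG):
    """Return the script text with a shebang line, leaving an existing one untouched."""
    if source.startswith("#!"):
        return source
    return shebang + "\n" + source


def make_executable(path):
    """Add the executable bit for user, group and others, like `chmod +x`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

With both in place, site.sh could call `./<scriptname>` and leave interpreter selection to `env`.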

(by the way, I don't think this python thing is why CI is failing...)

@potiuk
Member

potiuk commented Jul 8, 2022

BTW, speaking of consistency between dev and CI: I am not sure if you are aware of it, but our Airflow CI is, I think, the most extreme example of it (and this is the part I am most proud of). Breeze not only replicates the exact CI environment for the whole of Airflow, but whenever your tests fail, you can replicate the exact environment with a single breeze command:

[Screenshot 2022-07-09 at 00 15 49]

This command downloads the VERY DOCKER IMAGE that was used for your build (identified by a commit hash) and either drops you into a container where you can iterate on the failing test or re-runs the whole suite locally.

And this is "Container as Dev and CI image" to the fullest. So yeah, I understand exactly what you are talking about.

@Bowrna

Bowrna commented Jul 10, 2022

> think the set of technologies we use to build "airflow-site" is "multi-platform enough" to not require a Docker image? I think the main reason there is no official Hugo aarch64 image is that it is "good enough" to have everything installed in the "host" environment rather than having to use Docker.

@potiuk do we need to change the existing code in site.sh, which relies on docker commands underneath, so that it becomes Docker-free? Can you list the commands that you think could run Docker-free, installed in the "host" environment?

@rossturk
Contributor Author

> @potiuk do we need to change the existing code in site.sh, which relies on docker commands underneath, so that it becomes Docker-free? Can you list the commands that you think could run Docker-free, installed in the "host" environment?

As it turns out, this can be done mostly by tweaking the run_comand function of today's site.sh. Take a look at #634, I have given this a try. It seems to work well for me on my new M1 + my Linux test server, and the CI tests are passing.

@Bowrna you may try to make similar changes in your site.py - from what I can see, the structure is similar. I'd be happy to work with you on the Python port! Let me know if you'd like me to test it out.
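For the Python port, the same idea could look roughly like the sketch below: a single function decides whether a build step runs on the host or is wrapped in `docker run`. This is a hypothetical illustration, not the code from #634 or site.py. The function name `build_command`, the image name `airflow-site`, and the working directory are all assumptions.

```python
def build_command(command, use_docker, image="airflow-site", workdir="/opt/site"):
    """Return the argument list for a build step.

    With use_docker=False the step runs directly on the host. Otherwise it is
    wrapped in `docker run`. The image name and workdir are placeholders.
    """
    if not use_docker:
        return list(command)
    return ["docker", "run", "--rm", "--workdir", workdir, image, *command]
```

The resulting list can be passed to `subprocess.run(..., check=True)`, so every step shares one code path and switching between host and container builds is a single flag.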

@rossturk
Contributor Author

I am running into some really bizarre issues with this approach. It works in my local environments, but not in CI. It seems to be failing while trying to write somewhere in /root/.yarn inside the build container.

On a lark, I decided to try to undockerize this process and it was, well, smooth like a knife through butter! So I'll leave this open in draft mode for now and pursue that approach instead.

@rossturk rossturk closed this Jul 13, 2022

3 participants