ref: Cross-compile aarch64 python pkg#699

Closed
Swatinem wants to merge 4 commits into master from release/manylinux-cross

@Swatinem

@Swatinem Swatinem commented Oct 17, 2022

Contributor

This ports getsentry/relay#1438 to symbolic. It uses an x86_64 Docker container configured for cross-compiling to build an aarch64 symbolic binary, which is then reused when building the aarch64 wheel. This avoids running the whole cargo build in aarch64 QEMU, which can be extremely slow.


Currently the complete build for the binary Python package runs in arm64 QEMU, which is super slow. The idea is to cross-compile the symbolic-cabi library and then move it into the arm64 QEMU image. That way, the cross-compile is fast, and only the small step of packaging up the Python bits runs in QEMU.

The difficulty here is finding the right way to cross-compile the library: the C/C++ bits (mostly related to demangling) need a C++ cross-compiling setup, and the image that does the cross-compilation has to be "sufficiently old" so that its build-time glibc dependency matches the one required by the Python target.

Since I believe manylinux2014 is itself already outdated, we might jump to the next higher version (manylinux_2_24, I believe?), which has newer glibc requirements and so allows a "less old" image and C++ compiler for the build.

@codecov-commenter


Codecov Report

Merging #699 (5184b6d) into master (bb105da) will increase coverage by 0.03%.
The diff coverage is n/a.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #699      +/-   ##
==========================================
+ Coverage   72.82%   72.85%   +0.03%     
==========================================
  Files          92       92              
  Lines       18418    18428      +10     
==========================================
+ Hits        13412    13426      +14     
+ Misses       5006     5002       -4     

@supervacuus

Collaborator

Here is a quick summary of what goes wrong in this PR and our options to fix this.

Breaking builds

All the Linux builds in the connected release action break because the environment of the C++ toolchains is set up incorrectly (on aarch64, for example, g++ cannot even find the <memory> header). Without going into too much detail about why those packages are so poorly configured, we could quickly fix these issues by "manually" setting environment variables in the Dockerfile.
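The comment does not say which variables would need to be set. As a rough illustration only, the sketch below builds the variables that the Rust `cc` crate and Cargo read when choosing a cross compiler and linker for a target. The function name `cross_env` and the `aarch64-linux-gnu-` tool prefix are assumptions for this example, not taken from the PR, and fixing the missing `<memory>` header may well need further include-path settings that are not shown here.

```python
import os


def cross_env(target="aarch64-unknown-linux-gnu",
              prefix="aarch64-linux-gnu-",  # assumed toolchain prefix
              base=None):
    """Return an environment that points Rust build tooling at a cross toolchain."""
    env = dict(os.environ if base is None else base)
    key = target.replace("-", "_")
    # The `cc` crate looks up CC_<target> / CXX_<target> for C and C++ sources.
    env[f"CC_{key}"] = f"{prefix}gcc"
    env[f"CXX_{key}"] = f"{prefix}g++"
    # Cargo looks up CARGO_TARGET_<TRIPLE>_LINKER for the final link step.
    env[f"CARGO_TARGET_{key.upper()}_LINKER"] = f"{prefix}gcc"
    return env
```

The returned mapping could be passed as `env=` to `subprocess.run` when invoking `cargo build --target aarch64-unknown-linux-gnu`, or translated into `ENV` lines in a Dockerfile.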

Incompatible Compiler

The bigger problem is the supplied compiler. The versioned symbol requirements of manylinux2014 are as follows (source):

GLIBC_2.17
CXXABI_1.3.7, CXXABI_TM_1 is also allowed
GLIBCXX_3.4.19
GCC_4.8.0
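The limits listed above can be checked mechanically. The following is a minimal sketch, assuming the versioned symbol names required by a built library have already been collected (for instance from `objdump -T` output). The names `MANYLINUX2014_LIMITS` and `violations` are hypothetical helpers for this example, not part of any existing tool.

```python
import re

# Maximum symbol versions allowed by manylinux2014, as listed above.
MANYLINUX2014_LIMITS = {
    "GLIBC": (2, 17),
    "CXXABI": (1, 3, 7),
    "GLIBCXX": (3, 4, 19),
    "GCC": (4, 8, 0),
}

# Matches names such as GLIBC_2.17 or GLIBCXX_3.4.19.
_VERSIONED = re.compile(r"^([A-Z]+)_(\d+(?:\.\d+)*)$")


def violations(required, limits=MANYLINUX2014_LIMITS):
    """Return the required symbol versions that exceed the given limits."""
    bad = []
    for name in required:
        match = _VERSIONED.match(name)
        if match is None:
            # Non-numeric tags such as CXXABI_TM_1 are skipped here;
            # the policy allows that one explicitly.
            continue
        family = match.group(1)
        version = tuple(int(part) for part in match.group(2).split("."))
        limit = limits.get(family)
        # Families not covered by the table are ignored in this sketch.
        if limit is not None and version > limit:
            bad.append(name)
    return bad
```

For example, `violations(["GLIBC_2.17", "CXXABI_1.3.9", "GLIBCXX_3.4.21"])` flags the two libstdc++ versions, since both are newer than the manylinux2014 limits.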

The distribution these requirements point to, CentOS 7, only provides GCC 4.8.5, a version that does not support the C++ features currently used by the swift-demangler. This is not a problem in the pypa Docker image used for the current builds: while also based on CentOS 7, it installs the devtoolset-10 package, which contains GCC 10 and a CXXABI_1.3.7-compatible libstdc++. Sadly, to my knowledge, there is no cross-compilation version of devtoolset-10 available.

Building an aarch64 docker image with a clean cross-compilation toolchain

I started a quick prototype of a Docker image that builds a cross-compilation toolchain from source using crosstool-ng and then copies the result to a clean stage, setting up the environment to allow building symbolic-cabi.

Building the image takes around an hour on GitHub (run), but this image needs to be built very rarely, and the actual build for symbolic-cabi now takes the same time as the other Linux builds (it's uploaded, you can try it with abovevacant/the_aarch64_builder:0.0.2).

I see no reason we couldn't extend this docker-image with the pypa image as the last stage to include the python wheel building environment (which would allow us to conditionally run make wheel-many against this other image for aarch64 in the build.yml).

Contrary to the repository's name, the currently built toolchain is not compatible with manylinux2014, because the toolchain builder always packages GCC versions in lockstep with the respective libraries. It is compatible with manylinux_2_24, though (CXXABI_1.3.9 + GLIBCXX_3.4.21).

I could not find another publicly uploaded cross-compilation image that targets anything below manylinux_2_28 and supports a GCC environment sufficiently modern to compile the swift-demangler.

Open questions/Next steps

  • How hard is the requirement on manylinux2014? Which would be acceptable alternatives?
  • Does Sentry have an (internal) image repository and a preferred workflow when integrating it with GitHub actions? Are there others at Sentry who build, maintain, and host Docker images?
  • If manylinux2014 is a hard requirement, I would have to dive deeper into how devtoolset-10 achieved the libstdc++ symbol compatibility. It is also unclear how well that hack could be reproduced with crosstool-ng.
  • If manylinux_2_24 is acceptable, the next steps would be:
    • adding the pypa image as the last Docker stage
    • finding a home for the Docker config and image
    • creating a new PR with a much more minimal change, in which the docs should also reflect the manylinux_2_24 compatibility for aarch64 wheels.
  • If manylinux_2_28 is acceptable, we should try out existing cross-compilation images.
  • Any other options not mentioned in this summary?!

@Swatinem

Swatinem commented Dec 5, 2022

Contributor Author
  • How hard is the requirement on manylinux2014? Which would be acceptable alternatives?

I believe we can bump quite a bit higher than that. I will defer to @asottile-sentry to give an authoritative answer and to also chime in on the other questions.

Another alternative would be to just stop pulling the C++ (swift-demangle) dependency into the CABI crate.
We are not using it directly ourselves. Another user of our Python bindings is Mozilla. We discussed deprecations some time ago in #573 (comment), where it did not seem that they were using the demangling part from Python. CC @gabrielesvelto @willkg

We might as well just remove the need to build the C++ code altogether. We might need to do an API-breaking release soon anyway due to #729.

@gabrielesvelto

Contributor

All our de-mangling happens in dump_syms so we never access it from Python.

@Swatinem

Swatinem commented Dec 5, 2022

Contributor Author

@loewenheim noticed that we have a SourceLocation.function_name getter that does demangling in the symcache bindings:

    @property
    def function_name(self):
        """The demangled function name."""
        return demangle_name(self.symbol, lang=self.lang)

@gabrielesvelto can I assume that you are never using that? In which case we can just safely remove it with the next breaking change.

@willkg

willkg commented Dec 5, 2022


Mozilla Symbolication Service is not using function_name in symcache directly.

@asottile-sentry

Contributor

for manylinux, sentry only cares about 2_28+ so go wild

@Swatinem Swatinem closed this Dec 21, 2022
@Swatinem Swatinem deleted the release/manylinux-cross branch January 9, 2023 14:48