Some GHC bindists have a normal `$out/lib` directory which contains
symlinks to all core libs. Because it is a normal lib directory, the
bintools setup hook will pick up on it and cause ld to pass the
appropriate -L and -rpath flags. We do not want this to happen,
especially in the case of the stage2 compiler. Not only will the final
ghc have an unnecessary reference (and thus increased closure size) to
the binary ghc, but the extra libraries in the rpath mess with the rts
and cause e.g. segfaults in GHCi.
Unfortunately, there is no way to prevent this. It is a fundamental flaw
in the cc and bintools wrappers that they do not actually distinguish
between the roles of dependencies (build, host, target). Instead
the mangleVar* functions translate the dependencies, which are split up
by role, into platforms. This means that the wrappers can't distinguish
between depsBuildBuild and depsHostTarget (== buildInputs) when natively
compiling. As long as we are natively compiling the wrappers will put
the stage0 ghc (be it in depsBuildBuild, nativeBuildInputs etc.) into
the linker flags of the final ghc.
The solution is to sidestep the issue. We only had ghc in depsBuildBuild
to have it added to PATH; GHC itself will pass the appropriate linker
flags if necessary. To keep the setup hooks from picking up the GHC
libraries, we simply don't put it into depsBuildBuild or any other
dependency list. Since the GHC build system accepts the GHC binary via
an absolute path, we don't even need to add the stage0 GHC to PATH.
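A minimal sketch of the idea, not the actual derivation: `bootGhc` and its store path are made-up placeholders, and the use of a `GHC` variable in `preConfigure` is an assumption about how the absolute path reaches configure.

```nix
# Sketch only: `bootGhc` stands in for the stage0 compiler's store path.
# The path below is a placeholder, not a real output.
let
  bootGhc = "/nix/store/<hash>-ghc-binary";
in
{
  # The boot compiler is deliberately absent from every dependency list,
  # so no setup hook ever sees its lib directory.
  depsBuildBuild = [ ];

  # Assumption: configure accepts an absolute path in the GHC variable.
  preConfigure = ''
    export GHC="${bootGhc}/bin/ghc"
  '';
}
```

In a real derivation the interpolated value would be a package rather than a string, so the build would still depend on the boot compiler without it appearing in any dependency list.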
The stage0 ghc is build->build since it builds the stage1 compiler which
has build for its host platform (i.e. is build->target relative to the
entire GHC derivation).
Also annotate a bit more around the use of pkgsBuildBuild and the boot
compiler and make it more explicit where it comes from in the
derivation.
This is a safeguard against a problem we had with 9.6. Unfortunately,
since the cc wrapper emits `-L` and `-rpath` flags based on platform
config (e.g. aarch64-unknown-linux), not platform role (e.g. build),
stdenv itself doesn't prevent ghc from being linked against the boot
compiler when building a native or cross-compiling GHC (since host ==
build).
With disallowedReferences, the build will fail if such a problem is
re-introduced.
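As a rough illustration of the safeguard, assuming the boot compiler is passed in under the hypothetical name `bootGhc`:

```nix
# Sketch only: `bootGhc` is a hypothetical argument name for the boot compiler.
{ bootGhc }:
{
  # The build fails if any output still references the boot compiler,
  # e.g. through a stray -rpath entry.
  disallowedReferences = [ bootGhc ];
}
```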
We reuse the targetLibs logic for this since it is more or less the same
story. However, the terminfo library is only built when GHC is neither a
cross-compiler nor being cross-compiled. Therefore ncurses (if used)
will only ever come from pkgsHostTarget. In the other cases ncurses is
still passed via depsBuildBuild for the stage1 compiler.
This commit tries to resolve the problem that the package-db doesn't
include library and include dirs of ncurses for the terminfo package,
causing library loading and linking problems in downstream packages,
e.g. dhall-docs and dhall-toml. This problem was introduced in
4b00fbf163. With this in mind, not passing
--with-curses-* – as long as the terminfo package isn't built – seems
fine.
The Darwin LLVM backend of GHC (which is mostly interesting for
GHC < 9.2) uses clang as configured via the CLANG environment variable
as an assembler. Since it processes the output of the clang configured via
the CC variable, we need to make sure the two versions match. Otherwise
the CLANG clang may not understand the output of the CC clang.
In the past this wasn't really a problem: due to the fairly old
default clang version in the stdenv, clang 11 would be used for CC.
CLANG would always be a newer version and deal with the output without
any problems.
Ever since the upgrade of the default clang version for
darwin (bcbdb800cf), CC would often be a
newer version of clang than CLANG, causing build problems in some
packages like crypton (for GHC 8.10.7 and 9.0.2 on aarch64-darwin where
the darwin LLVM backend was actually used).
An oversight in 884a76c5e6: We need to
mirror the isGhcjs condition of targetCC (toolsForTarget) for installCC
as well. (Note that in practice, targetCC == installCC for ghcjs since
it is cross-only (?).)
The build platform doesn't matter for checking which stage is our final
stage! Stage2 means host and target are sufficiently similar; Stage1
means host and target may differ.
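The rule can be sketched as a small function. This is an illustration under a simplifying assumption: plain equality stands in for "sufficiently similar", and the argument names are chosen for this example.

```nix
# Sketch only: the build platform is intentionally not an argument.
# Assumption: equality approximates "host and target are sufficiently similar".
{ hostPlatform, targetPlatform }:
if hostPlatform == targetPlatform then "stage2" else "stage1"
```

Applied to `{ hostPlatform = "x86_64-linux"; targetPlatform = "aarch64-linux"; }`, the function picks stage1, whatever the build platform is.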
Hadrian started installing unlit with a targetPrefix (if applicable)
which wasn't the case with make before. Unfortunately, the logic to
generate the settings file wasn't updated, so GHC 9.6.* cross compilers
expect to find an unlit binary without a target prefix.
Upstream issue: https://gitlab.haskell.org/ghc/ghc/-/issues/23317
GHC's build system assumes that the C compiler, tools etc. discovered
during configure can also be used at runtime. This means that the CC,
LD, AR etc. variables given at configure time are used to populate the
settings file which GHC uses to look up the tools it needs.
The implicit assumption of this mechanism is that the build and runtime
environment of GHC are similar enough and PATH is used to find the
tools, i.e. if we set CC=clang, we wouldn't need to worry about this as
much. We, however, pass absolute paths, which is useful since it allows
GHC to work outside of stdenv (as long as e.g. no FFI is involved).
Even so, until now, we didn't really have any problems stemming from
this, as we used pkgsBuildTarget to get everything we need. The
compiler we'd want to execute would in principle need to come
from pkgsHostTarget.
1. For native compilers, all package sets are the same since
build == host == target.
2. For cross compilers build == host, so pkgsBuildTarget
is practically the same as pkgsHostTarget.
When cross-compiling a native compiler, build != host, so we need to
actually ensure that GHC uses different tools at runtime compared to
bootstrapping. There is currently no intended way to achieve this, so we
use a custom tool to edit the settings file. An alternative would be to
patch the build system, but this would be difficult to maintain. We
could go down this route if there's interest from upstream to provide a
proper way to specify the runtime tools.
Co-authored-by: sternenseemann <sternenseemann@systemli.org>
The goal of this commit is basically to eliminate the use of
targetPackages for finding libraries. Instead, we introduce a
`targetLibs` set that can be used instead. The libraries in there
philosophically come from targetPackages since they are used by the core
libs and will be linked against user code. However, when cross compiling
GHC it's always a native compiler, so we can and have to use
pkgsHostTarget (targetPackages would be empty). This is explained more
in the accompanying comment.
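A hedged sketch of how such a set could be selected. The argument names mirror the package sets mentioned above. That the cross-compiler case falls back to targetPackages is an assumption of this example.

```nix
# Sketch only: chooses where the libraries linked into user code come from.
{ hostPlatform, targetPlatform, targetPackages, pkgsHostTarget }:
{
  targetLibs =
    if hostPlatform != targetPlatform
    # Cross compiler (assumption): libraries for the target platform.
    then targetPackages
    # Native compiler, possibly cross-compiled itself: targetPackages
    # would be empty here, so pkgsHostTarget has to be used.
    else pkgsHostTarget;
}
```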
An alternative to this approach is not to pass in the libraries
explicitly via `--with-*` flags and rely on cc-wrapper and splicing to
pick the correct library. This works well for ncurses and probably
merits testing for other libraries as well since it's very simple. It
would need to be verified, however, that configure doesn't discover the
“wrong” library and leak it somewhere.
Co-authored-by: sternenseemann <sternenseemann@systemli.org>
1. Explicitly set WITH_TERMINFO. We usually match GHC's behavior well,
but it is better to tie the Nix option to make explicitly.
Unfortunately, the same is very complicated to achieve with
hadrian (iirc).
2. Disable enableTerminfo if we are cross-compiling. This is
the behavior of GHC's build system, so we'll have to match it now.
It also reduces the ncurses-related headache a bit.
3. Stop passing --with-curses* flags. Unfortunately, GHC does not
account for the fact that different platforms need different ncurses
libraries. This is somewhat mitigated by the fact that ncurses is
only ever needed for the build platform if we are cross compiling,
but I seem to remember it leaking into the final GHC somehow.
A more reliable alternative is relying on the cc/ld wrapper scripts,
as they'll always pick the correct ncurses out of the environment
when GHC's build system passes -lcurses.
4. Unconditionally add ncurses to depsBuildBuild. Stage0 unconditionally
builds terminfo (maybe the stage1 compiler needs it?), so we need to
make sure that ncurses for the build platform is available.
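The four points above could look roughly like this. It is a sketch, not the actual expression: `ncursesForBuild` and `buildMK` are names chosen for this example, and the YES/NO spelling of the make variable is an assumption.

```nix
# Sketch only: ties the Nix option to make and keeps ncurses for stage0.
{ buildPlatform, hostPlatform, targetPlatform, ncursesForBuild }:
let
  # Point 2: terminfo only when nothing about the build is cross.
  enableTerminfo =
    buildPlatform == hostPlatform && hostPlatform == targetPlatform;
in
{
  inherit enableTerminfo;

  # Point 1: set WITH_TERMINFO explicitly instead of relying on detection.
  # Assumption: the make build system expects YES or NO here.
  buildMK = "WITH_TERMINFO=${if enableTerminfo then "YES" else "NO"}";

  # Point 3: no --with-curses* flags; the wrappers resolve -lcurses.
  configureFlags = [ ];

  # Point 4: stage0 always builds terminfo, so ncurses for the build
  # platform is always needed.
  depsBuildBuild = [ ncursesForBuild ];
}
```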
Co-authored-by: sternenseemann <sternenseemann@systemli.org>
This is easy in comparison since these tools won't end up in GHC's
settings nor need to be available at runtime, so we can use
the *_FOR_BUILD environment variables.
It is important to add buildCC to depsBuildBuild to engage the
stdenv/wrapper script machinery properly.
Co-authored-by: sternenseemann <sternenseemann@systemli.org>
- Unconditionally get `install_name_tool` from cc.bintools.bintools since it is no longer wrapped; and
- Use the `strip` wrapper on both Darwin architectures. It’s the default one, and it’s the same between both.
This refactor should simplify the code a little bit and make future
changes easier. For example, for cross compiling GHC we'll have to update the
tools in the GHC settings file and calculate the host->target tool paths
for later use. Having a ready function for this will make this a lot
easier.
elfutils is used in the RTS (rts/Libdw.c), i.e. it will be used on the
target platform.
Tested via pkgsCross.gnu32.haskellPackages.ghc [1], though #304605 needs
to be cherry-picked for elfutils to build.
[1]: nix-shell -E 'with import ./. { crossSystem = "i686-linux"; };
mkShell { nativeBuildInputs = [haskellPackages.ghc ]; }'
This makes the hadrian expressions much simpler as we no longer need to
thread through extra arguments for special workarounds.
common-hadrian.nix decides in one place which patches we need and
directly applies them to the source used to build everything.
We'll want to (slowly) unify the source used by the different
derivations we use to build GHC. As a first step, use the same base
source for building GHC and all hadrian related packages. This is
achieved by wrapping the fetcher result in `srcOnly` to apply GHC
patches immediately.
To modify the patches (and source) used by GHC we now have a changed
overriding interface for >= 9.6:
```
oldGhc.override {
  ghcSrc = oldGhc.src.overrideAttrs (oldAttrs: {
    src = …;
    patches = …;
  });
}
```
When we are building a compiler for a platform we can execute ourselves,
we can build a proper stage2 compiler which unlocks some features that
are interesting for e.g. pkgsStatic.
The resulting compiler is technically a native compiler that's prefixed.
By trying to mitigate the conflict between two files on a case
insensitive fs, we will inevitably end up with a different hash than on
case sensitive filesystems. To work around this, we just delete the
directory that contains the offending files. Luckily, it is not
important to the build of GHC.
The reasoning given for disabling it is flawed: In most cases, sphinx
and its dependencies are already in the binary cache, since we only need
them as build tools—sphinx for the build platform is just the normal
pkgs.sphinx, since it doesn't care about targetPlatform.
We just need to disable it when the buildPlatform is also musl, so we
avoid pulling in sphinx in pkgsMusl.
In this situation, haddock would not be built by hadrian, as there is no
stage0:exe:haddock target by default. (We should eventually try adding
one.) If haddock is enabled and the build->host haddock is missing, Cabal
tries using the build->build haddock which may fail to load the
documentation from the interface files produced by the build->host
GHC (e.g. due to a mismatch between dynamic and static linking).
Add regression tests to haskell-updates jobset.
Resolves #275304.
This ports our infamous patch for `Cabal` which cheesily prevents an
output cycle for derivations that use separate bin outputs where
references caused by the `Paths_*` module can't be eliminated by the GHC
aarch64-darwin codegen backend.
See also
- the original issue #140774,
- the original patch for GHC 9.2 #216857,
- the ported patch for GHC 9.4 f6f780f129
Co-authored-by: sternenseemann <sternenseemann@systemli.org>
When 9.2.1 was [released], I apparently was confused by the wording. The
NCG (-fasm) codegen backend for aarch64 not only works on
aarch64-darwin, but also aarch64-linux. `useLLVM` being enabled on
aarch64-linux had no adverse effect, as GHC used -fasm anyways, but it
did inflate closure size unnecessarily which we can rectify now.
[released]: https://www.haskell.org/ghc/blog/20211029-ghc-9.2.1-released.html
The switch to cctools-llvm made several LLVM tools the default on
Darwin, which includes llvm-ar. GHC will try to use `-L` with `ar` when
it is `llvm-ar`, but that doesn’t work currently on Darwin.
See https://gitlab.haskell.org/ghc/ghc/-/issues/23188.
This saves just enough space on aarch64-linux so that the hadrian built
GHCs are under the 3GB Hydra output limit:
| compiler | before | after | Δ |
|----------|------------|------------|------------|
| ghc962 | 3241234736 | 2810740560 | -430494176 |
| ghcHEAD | 3341288328 | 2902760872 | -438527456 |
The total output size can be calculated using (don't forget to use
aarch64-linux):
```
nix-build -A <compiler> | xargs nix path-info -s | awk '{ s += $2 }; END { print s }'
```