Cross-compilation
Introduction
"Cross-compilation" means compiling a program on one machine for another
type of machine. For example, a typical use of cross compilation is to
compile programs for embedded devices. These devices often don't have the
computing power and memory to compile their own programs. One might think
that cross-compilation is a fairly niche concern, but there are advantages
to being rigorous about distinguishing build-time vs run-time environments
even when one is developing and deploying on the same machine. Nixpkgs is
increasingly adopting the opinion that packages should be written with
cross-compilation in mind, and nixpkgs should evaluate in a similar way (by
minimizing cross-compilation-specific special cases) whether or not one is
cross-compiling.
This chapter will be organized in three parts. First, it will describe the
basics of how to package software in a way that supports cross-compilation.
Second, it will describe how to use Nixpkgs when cross-compiling. Third, it
will describe the internal infrastructure supporting cross-compilation.
Packaging in a cross-friendly manner
Platform parameters
Nixpkgs follows the common historical convention of GNU autoconf of
distinguishing between 3 types of platform: build, host, and target. In
summary, build is the platform on which a package is being built, and host
is the platform on which it is to run. The third attribute, target, is
relevant only for certain specific compilers and build tools.
In Nixpkgs, these three platforms are defined as attribute sets under the
names buildPlatform, hostPlatform,
and targetPlatform. They are always defined as
attributes in the standard environment. That means one can access them
like:
{ stdenv, fooDep, barDep, ... }: ...stdenv.buildPlatform...
buildPlatform
The "build platform" is the platform on which a package is built. Once
someone has a built package, or pre-built binary package, the build
platform should not matter and be safe to ignore.
hostPlatform
The "host platform" is the platform on which a package will be run. This
is the simplest platform to understand, but also the one with the worst
name.
targetPlatform
The "target platform" attribute is, unlike the other two attributes, not
actually fundamental to the process of building software. Instead, it is
only relevant for compatibility with building certain specific compilers
and build tools. It can be safely ignored for all other packages.
The build process of certain compilers is written in such a way that the
compiler resulting from a single build can itself only produce binaries
for a single platform. The task of specifying this single "target
platform" is thus pushed to build time of the compiler. The root cause of
this mistake is often that the compiler (which will be run on the host)
and the standard library/runtime (which will be run on the target) are
built by a single build process.
There is no fundamental need to think about a single target ahead of
time like this. If the tool supports modular or pluggable backends, both
the need to specify the target at build time and the constraint of
having only a single target disappear. An example of such a tool is
LLVM.
Although the existence of a "target platform" is arguably a historical
mistake, it is a common one: examples of tools that suffer from it are
GCC, Binutils, GHC and Autoconf. Nixpkgs tries to avoid sharing in the
mistake where possible. Still, because the concept of a target platform
is so ingrained, it is best to support it as is.
The exact schema these fields follow is a bit ill-defined due to a long and
convoluted evolution, but this is slowly being cleaned up. You can see
examples of ones used in practice in
lib.systems.examples; note how they are not all very
consistent. For now, here are a few fields one can count on them containing:
system
This is a two-component shorthand for the platform. Examples of this
would be "x86_64-darwin" and "i686-linux"; see
lib.systems.doubles for more. This format isn't very
standard, but has built-in support in Nix, such as the
builtins.currentSystem impure string.
config
This is a 3- or 4- component shorthand for the platform. Examples of
this would be "x86_64-unknown-linux-gnu" and "aarch64-apple-darwin14".
This is a standard format called the "LLVM target triple", as it was
pioneered by LLVM and traditionally used only for the targetPlatform.
This format is strictly more informative than the "Nix host double", as
the previous format could analogously be termed. This needs a better name
than config!
parsed
This is a Nix representation of a parsed LLVM target triple with
white-listed components. This can be specified directly, or actually
parsed from the config. [Technically, only one need
be specified and the others can be inferred, though the precision of
inference may not be very good.] See
lib.systems.parse for the exact representation.
libc
This is a string identifying the standard C library used. Valid
identifiers include "glibc" for GNU libc, "libSystem" for Darwin's
Libsystem, and "uclibc" for µClibc. It should probably be refactored to
use the module system, like parsed.
is*
These predicates are defined in lib.systems.inspect,
and slapped on every platform. They are superior to the ones in
stdenv as they force the user to be explicit about
which platform they are inspecting. Please use these instead of those.
platform
This is, quite frankly, a dumping ground of ad-hoc settings (it's an
attribute set). See lib.systems.platforms for
examples—there's hopefully one in there that will work verbatim for
each platform that is working. Please help us triage these flags and
give them better homes!
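As a rough illustration of the fields listed above, the following expression
reads a few of them from the host platform of a natively evaluated package
set. It is only a sketch: it assumes <nixpkgs> is resolvable on NIX_PATH and
that the checkout exposes the fields as described here. The file name
fields.nix is hypothetical.

```nix
# fields.nix (hypothetical file name)
# Evaluate with: nix-instantiate --eval --strict fields.nix
let
  # Assumption: <nixpkgs> is on NIX_PATH; no crossSystem, so this is native.
  pkgs = import <nixpkgs> { };
  inherit (pkgs.stdenv) hostPlatform;
in
{
  # Two-component "Nix host double".
  system = hostPlatform.system;
  # 3- or 4-component LLVM-style triple.
  config = hostPlatform.config;
  # Identifier of the standard C library.
  libc = hostPlatform.libc;
  # One of the is* predicates attached to every platform.
  isLinux = hostPlatform.isLinux;
}
```

Reading the predicates from an explicitly named platform, as done here with
hostPlatform, is the style the is* entry above recommends.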
Specifying Dependencies
In this section we explore the relationship between both runtime and
buildtime dependencies and the 3 Autoconf platforms.
A runtime dependency between 2 packages implies that between them both the
host and target platforms match. This is directly implied by the meaning of
"host platform" and "runtime dependency": The package dependency exists
while both packages are running on a single host platform.
A build time dependency, however, implies a shift in platforms between the
depending package and the depended-on package. The meaning of a build time
dependency is that to build the depending package we need to be able to run
the depended-on package. The depending package's build platform is
therefore equal to the depended-on package's host platform. Analogously,
the depending package's host platform is equal to the depended-on package's
target platform.
In this manner, given the 3 platforms for one package, we can determine the
three platforms for all its transitive dependencies. This is the most
important guiding principle behind cross-compilation with Nixpkgs, and will
be called the sliding window principle.
Some examples will probably make this clearer. If a package is being built
with a (build, host, target) platform triple of
(foo, bar, bar), then its build-time dependencies would
have a triple of (foo, foo, bar), and those
packages' build-time dependencies would have a triple of
(foo, foo, foo). In other words, it should take two
"rounds" of following build-time dependency edges before one reaches a
fixed point where, by the sliding window principle, the platform triple no
longer changes. Indeed, this is what happens with cross compilation: only
from the second round of native (build-time) dependencies onward do the
packages necessarily coincide with native packages.
The depending package's target platform is unconstrained by the sliding
window principle, which makes sense in that one can in principle build
cross compilers targeting arbitrary platforms.
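The sliding window principle can be modelled in a few lines of pure Nix. This
is a toy sketch, not how Nixpkgs implements it: buildTimeDepsOf is a
hypothetical helper, platforms are plain strings, and the build platform of a
build-time dependency is assumed to stay the same, as in the (foo, bar, bar)
example above.

```nix
# Evaluate with: nix-instantiate --eval --strict sliding-window.nix
let
  # Hypothetical helper: the platform triple of a package's build-time
  # dependencies. The dependency runs where the depending package is built,
  # and it targets the platform on which the depending package runs.
  buildTimeDepsOf = { build, host, target }: {
    inherit build;   # assumption: building still happens on the same machine
    host = build;    # dependency's host   = depending package's build
    target = host;   # dependency's target = depending package's host
  };

  start = { build = "foo"; host = "bar"; target = "bar"; };
  round1 = buildTimeDepsOf start;   # (foo, foo, bar)
  round2 = buildTimeDepsOf round1;  # (foo, foo, foo)
in
{
  inherit start round1 round2;
  # After two rounds, sliding the window again changes nothing.
  isFixedPoint = buildTimeDepsOf round2 == round2;
}
```

Note that the depending package's own target never constrains anything in this
model, which matches the observation that it is left free by the principle.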
How does this work in practice? Nixpkgs is now structured so that
build-time dependencies are taken from buildPackages,
whereas run-time dependencies are taken from the top level attribute set.
For example, buildPackages.gcc should be used at build
time, while gcc should be used at run time. Now, for
most of Nixpkgs's history, there was no buildPackages,
and most packages have not been refactored to use it explicitly. Instead,
one can use the six (gasp) attributes used for specifying dependencies in
mkDerivation. We "splice" together the run-time and build-time package
sets with callPackage, and then mkDerivation for each of those attributes
pulls the right derivation out. This splicing can be skipped when not cross
compiling as the package sets are the same, but is a bit slow for cross
compiling. Because of this, a best-of-both-worlds solution is in the works
with no splicing or explicit access of buildPackages
needed. For now, feel free to use either method.
There is also a "backlink" targetPackages, yielding a
package set whose buildPackages is the current package
set. This is a hack, though, to accommodate compilers with lousy build
systems. Please do not use this unless you are absolutely sure you are
packaging such a compiler and there is no other way.
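The two ways of obtaining build-time dependencies described in this section
might look as follows in a package expression. This is a sketch only: fooTool
and barLib are hypothetical packages, foo-tool is a hypothetical executable,
and the expression is meant to be instantiated with callPackage rather than
evaluated on its own.

```nix
# fooTool, barLib and the foo-tool binary are hypothetical, for illustration.
{ stdenv, buildPackages, fooTool, barLib }:

stdenv.mkDerivation {
  name = "example-0.1";
  src = ./.;

  # Method 1: rely on splicing. mkDerivation picks the variant of fooTool
  # that runs on the build platform, and the variant of barLib that runs on
  # the host platform.
  nativeBuildInputs = [ fooTool ];
  buildInputs = [ barLib ];

  # Method 2: name the build-time package set explicitly.
  preConfigure = ''
    ${buildPackages.fooTool}/bin/foo-tool --version
  '';
}
```

Both methods are shown together only for comparison; a real package would
normally pick one of them for a given dependency.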
Cross packaging cookbook
Some frequently encountered problems when packaging for cross compilation
are good to just spell out and answer. Ideally the information above is
exhaustive, so this section cannot provide any new information, but it's
ludicrous and cruel to expect everyone to spend effort working through the
interaction of many features just to figure out the same answer to the same
common problem.
Feel free to add to this list!
What if my package's build system needs to build a C program to be run
under the build environment?
depsBuildBuild = [ buildPackages.stdenv.cc ];
Add it to your mkDerivation invocation.
My package fails to find ar.
Many packages assume that an unprefixed ar is available, but Nix doesn't
provide one. It provides only a prefixed one, as it does for all the other
binutils programs. It may be necessary to patch the package's build system
to use the prefixed ar.
My package's testsuite needs to run host platform code.
doCheck = stdenv.hostPlatform == stdenv.buildPlatform;
Add it to your mkDerivation invocation.
Cross-building packages
More information needs to be moved from the old wiki for this section.
Nixpkgs can be instantiated with localSystem alone, in
which case there is no cross compiling and everything is built by and for
that system, or also with crossSystem, in which case
packages run on the latter, but all building happens on the former. Both
parameters take the same schema as the 3 (build, host, and target) platforms
defined in the previous section. As mentioned above,
lib.systems.examples has some platforms which are used as
arguments for these parameters in practice. You can use them
programmatically, or on the command line:
nix-build '<nixpkgs>' --arg crossSystem '(import <nixpkgs/lib>).systems.examples.fooBarBaz' -A whatever
Eventually we would like to make these platform examples an unnecessary
convenience so that
nix-build '<nixpkgs>' --arg crossSystem.config '<arch>-<vendor>-<os>-<abi>' -A whatever
works in the vast majority of cases. The problem today is dependencies on
other sorts of configuration which aren't given proper defaults. We rely on
the examples to crudely set those configuration parameters in some
vaguely sane manner on the user's behalf. Issue
#34274
tracks this inconvenience along with its root cause in crufty configuration
options.
While one is free to pass both parameters in full, there's a lot of logic to
fill in missing fields. As discussed in the previous section, only one of
system, config, and
parsed is needed to infer the other two. Additionally,
libc will be inferred from parsed.
Finally, localSystem.system is also
impurely inferred based on the platform on which evaluation
occurs. This means it is often not necessary to pass
localSystem at all, as in the command-line example in the
previous paragraph.
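Used programmatically, the same instantiation could be sketched as below. It
assumes <nixpkgs> is on NIX_PATH and that lib.systems.examples in your
checkout contains an entry named raspberryPi; substitute whichever example
platform applies. The file name cross-hello.nix is hypothetical.

```nix
# cross-hello.nix (hypothetical file name)
# Build with: nix-build cross-hello.nix
let
  lib = import <nixpkgs/lib>;

  # localSystem is omitted, so it is inferred impurely from the evaluating
  # machine. crossSystem is taken from the bundled examples.
  # Assumption: the raspberryPi example exists in this checkout.
  pkgs = import <nixpkgs> {
    crossSystem = lib.systems.examples.raspberryPi;
  };
in
# Built on localSystem, meant to run on crossSystem.
pkgs.hello
```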
Many sources (manual, wiki, etc) probably mention passing
system, platform, along with the
optional crossSystem to nixpkgs: import
<nixpkgs> { system = ..; platform = ..; crossSystem = ..;
}. Passing those two instead of localSystem is
still supported for compatibility, but is discouraged. Indeed, much of the
inference we do for these parameters is motivated by compatibility as much
as convenience.
One would think that localSystem and
crossSystem overlap horribly with the three
*Platforms (buildPlatform,
hostPlatform, and targetPlatform; see
stage.nix or the manual). Actually, those identifiers are
purposefully not used here to draw a subtle but important distinction: While
the granularity of having 3 platforms is necessary to properly *build*
packages, it is overkill for specifying the user's *intent* when making a
build plan or package set. A simple "build vs deploy" dichotomy is adequate:
the sliding window principle described in the previous section shows how to
interpolate between these two "end points" to get the 3 platform triple
for each bootstrapping stage. That means for any package in a given package
set, even those not bound on the top level but only reachable via
dependencies or buildPackages, the three platforms will
be defined as one of localSystem or
crossSystem, with the former replacing the latter as one
traverses build-time dependencies. A last simple difference then is that
crossSystem should be null when one doesn't want to
cross-compile, while the *Platforms are always non-null.
localSystem is always non-null.
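One way to observe this interpolation is to compare the platform triples of a
cross package set with those of its buildPackages. The sketch below makes the
same assumptions as before (<nixpkgs> on NIX_PATH, a raspberryPi entry in
lib.systems.examples); tripleOf is a hypothetical helper.

```nix
# Evaluate with: nix-instantiate --eval --strict triples.nix
let
  lib = import <nixpkgs/lib>;
  pkgs = import <nixpkgs> {
    # Assumption: this example platform exists in the checkout.
    crossSystem = lib.systems.examples.raspberryPi;
  };

  # Hypothetical helper: summarise the three platforms of a package set
  # by their config strings.
  tripleOf = set: {
    build = set.stdenv.buildPlatform.config;
    host = set.stdenv.hostPlatform.config;
    target = set.stdenv.targetPlatform.config;
  };
in
{
  # Packages meant to run on crossSystem.
  topLevel = tripleOf pkgs;
  # One step along the build-time dependency edges.
  buildTime = tripleOf pkgs.buildPackages;
}
```

Each config string in the result should correspond to either localSystem or
crossSystem, with localSystem taking over as one moves to buildPackages.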
Cross-compilation infrastructure
To be written.
If one explores nixpkgs, they will see derivations with names like
gccCross. Such *Cross derivations are
a holdover from before we properly distinguished between the host and
target platforms: the derivation with "Cross" in the name covered the
build = host != target case, while the other covered the
host = target case, with build platform the same or not based
on whether one was using its .nativeDrv or
.crossDrv. This ugliness will disappear soon.