[OE-core] The state of reproducible Builds

Adrian Bunk bunk at stusta.de
Tue Jul 2 13:26:40 UTC 2019

On Mon, Jul 01, 2019 at 10:58:04AM -0500, Joshua Watt wrote:
> 1. HOSTTOOLS differences. There are a lot of tools listed in HOSTTOOLS, and
> unfortunately some of them have version dependent output and are used for
> target builds (the one I've currently stumbled upon is pod2man, but I'm sure
> there are others). Unfortunately, one could probably argue that HOSTTOOLS is
> somewhat antithetical to the above statement, at least in regard to target
> builds. Any host tool output that "leaks" into the target build output can
> result in a non-reproducible build across hosts, and possibly should be
> avoided; the alternative is to use (or mandate) the corresponding -native
> recipe that provides that tool as a DEPENDS so that the controlled
> internally built version is used instead. Note that this only really applies
> to target builds, not -native (or nativesdk right now). -native recipes would
> obviously need more HOSTTOOLS to help bootstrap the system. I suspect this
> would require reworking how HOSTTOOLS works so that they can be split into
> two categories somehow; the tools that have "ubiquitous and stable"
> interfaces and are fine for all recipes (e.g. cat, sed, true, rm, etc.) and
> those that are variable and should only be used for -native builds (e.g.
> pod2man, rpcgen(?), chrpath(?), tar(?)... others?). Anyone have thoughts on
> this?
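
For reference, the two-category split proposed above could be sketched
roughly as follows. This is only an illustration: the two sets contain just
the examples named in the quoted mail (several of which were marked with a
question mark there), and the function name tools_needing_native is
hypothetical, not anything that exists in OE-core or BitBake.

```python
# Hypothetical classification taken from the examples in the quoted mail;
# the real HOSTTOOLS list is much longer and is not modelled here.
STABLE_TOOLS = {"cat", "sed", "true", "rm"}
VARIABLE_TOOLS = {"pod2man", "rpcgen", "chrpath", "tar"}


def tools_needing_native(tools_used, is_target_recipe=True):
    """Return the host tools a recipe should get from a -native recipe."""
    if not is_target_recipe:
        # -native recipes still need host tools to bootstrap the system.
        return []
    return sorted(t for t in tools_used if t in VARIABLE_TOOLS)
```

A target recipe that uses cat, pod2man and tar would get pod2man and tar
flagged, while the same list for a -native recipe would flag nothing.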

What is the goal?

1. being able to prove that a given binary has actually been
   built from the correct sources, or
2. having builds on all hosts produce the same output

With 1. you can just record all host properties, such as the installed
packages and the running kernel, and it isn't a problem if different
hosts produce different output.
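
A minimal sketch of what recording host properties for 1. might look like,
assuming a dpkg-based build host (on other hosts the package list is simply
left empty). The function names and the manifest layout are made up for
illustration; a real record would likely need more than the kernel and the
package list.

```python
import json
import platform
import shutil
import subprocess


def installed_packages():
    """Best-effort package list; assumes a dpkg-based host, else returns []."""
    if shutil.which("dpkg-query") is None:
        return []
    out = subprocess.run(
        ["dpkg-query", "-W", "-f=${Package} ${Version}\n"],
        capture_output=True, text=True, check=False,
    ).stdout
    return sorted(line for line in out.splitlines() if line.strip())


def record_host_properties():
    """Collect host facts that would have to match to reproduce a build."""
    uname = platform.uname()
    return {
        "system": uname.system,
        "machine": uname.machine,
        "kernel": uname.release,
        "packages": installed_packages(),
    }


def to_manifest(props):
    """Serialize with sorted keys so two manifests can be compared directly."""
    return json.dumps(props, indent=2, sort_keys=True)
```

Storing the manifest next to the build output lets someone set up a matching
host later and check whether they get the same binary.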

With 2. any difference caused by the host is a problem.
You need -native for nearly everything, and then have to fix all other
kinds of differences, such as the version of the running kernel being
recorded somewhere.
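
Checking 2. comes down to comparing the output trees of two hosts file by
file. A rough sketch, assuming both build outputs are available as local
directories (the function names are hypothetical, and this only compares
file contents, not permissions or timestamps):

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each regular file under root (relative path) to its SHA-256."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_files(root_a, root_b):
    """Return relative paths that are missing on one side or differ."""
    a, b = tree_digests(root_a), tree_digests(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty result means the two hosts produced identical file contents; any
path listed is a candidate for a leaked host difference.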

For detecting malicious binaries not built from the claimed sources, 1. is
sufficient. For distributions like Debian that build natively, this is
even the only option available, since the host compiler is used.

Doing 2. would of course be more desirable, but it can also be done in 
a second step after all issues related to building on exactly the same
host have been sorted out.

> Joshua Watt



       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
