Stable release testing - notes from the autobuilder perspective
Richard Purdie
I wanted to write down my findings from trying to get and keep stable branch builds working on the autobuilder. I also have a proposal in mind for moving this forward.

Jeremy did good work in getting thud nearly building, building upon work I'd done in getting buildtools-extended-tarball working for older releases. It's not as simple a problem as it would first appear.

We have two versions of the buildtools tarball. In simple terms, one has the basic utils needed to run builds without gcc and the other includes gcc. Our current policy was to install a buildtools tarball on certain problematic autobuilders, but this doesn't work since a given release usually has a set of tools it's known to work with and it won't work with tools outside that set. We therefore suffer "bitrot" as new workers are added and older ones are replaced with new distro installs. In particular:

* gcc 10 doesn't work with older releases
* gcc 4.8 and 4.9 don't work with newer releases
* we no longer install makeinfo onto new autobuilder workers
* we no longer install python2 onto new autobuilder workers
* some older autobuilder workers have old versions of python3
* newer autobuilder workers need newer uninative versions
* some things changed, like crypt() being moved out of glibc

This means that for a given release we want to use the standard buildtools tarball on "old" systems and the extended buildtools tarball on "new" systems that didn't exist at the time the release was made.

My thoughts are that we should:

a) Remove all the current buildtools installs from the autobuilder
b) teach autobuilder-helper to install buildtools tarballs in all the older release branches
c) backport most of the autobuilder-helper changes to older releases so it's easier to maintain things
d) backport buildtools-extended-tarball to older releases
e) backport the necessary fixes to older releases to allow them to build on the current infrastructure with buildtools

Dunfell is in a good state and OK. Zeus needs poky:zeus-next and yocto-autobuilder-helper:contrib/rpurdie/zeus. Thud has branches available that need updating against the zeus changes I've figured out, which should get that working too. Pyro has example code at poky-contrib:rpurdie/pyro to allow a buildtools tarball that old to be built.

As things stand the branches are all just going to bitrot, so if we can get these branches to build cleanly, it would seem to make sense to me to merge this approximate set of changes, in the hope that stable maintenance in the case of any major security fix (for example) becomes much more possible.

Any thoughts from anyone on this?

Cheers,
Richard
Otavio Salvador
Hello all,
On Mon, Sep 7, 2020 at 13:14, Richard Purdie <richard.purdie@...> wrote:
> [...]

I second this, and at least at O.S. Systems we've been using Docker containers to keep maintenance easier for old releases. It'd be great if we could alleviate this and reduce its use as much as possible.

CI builder maintenance is indeed a time-consuming task, and the easier it gets, the easier it is to convince people to set up builders for their own use; in the end, this helps improve the quality of submitted patches and reduces the maintenance effort as well.

--
Otavio Salvador
O.S. Systems
http://www.ossystems.com.br  http://code.ossystems.com.br
Mobile: +55 (53) 9 9981-7854  Mobile: +1 (347) 903-9750
Tom Rini
On Mon, Sep 07, 2020 at 02:59:41PM -0300, Otavio Salvador wrote:
> Hello all,
> [...]

Excuse what may be a dumb question, but why are we not just building pyro, for example, in an Ubuntu 16.04 or centos7 container (or anything else with official containers available)? Is the performance hit too much, even with good volume management? And extend that to other branches, of course. But as we look at why people care about such old releases (or supporting a current release into the future), it seems like "our build environment is a container / VM so we can support this on modern HW" pops up.

--
Tom
Richard Purdie
On Mon, 2020-09-07 at 16:55 -0400, Tom Rini wrote:
> On Mon, Sep 07, 2020 at 02:59:41PM -0300, Otavio Salvador wrote:
> > Hello all,
> Excuse what may be a dumb question, but why are we not just building
> [...]

The autobuilder is set up for speed so there aren't VMs involved, it's 'baremetal'. Containers would be possible but at that point the kernel isn't the distro kernel and you have permission issues with the qemu networking, for example. Speed is extremely important as we have about a 6 hour build test time but a *massive* test range (e.g. all the gcc/glibc test suites on each arch, build+boot testing of all the arches under qemu for sysvinit+systemd, oe-selftest on each distro). I am already tearing my hair out trying to maintain what we have and deal with the races; adding containers into the mix simply isn't something I can face.

We do have older distros in the cluster for a time, e.g. centos7 is still there, although we've replaced the OS on some of the original centos7 workers as the hardware had disk failures, so there aren't as many of them as there were. Centos7 gives us problems trying to build master.

So this plan is the best practical approach we can come up with to allow us to build older releases yet not change the autobuilders too much and cause new sets of problems. I should have mentioned this, I just assume people kind of know this, sorry.

Cheers,
Richard
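To make the permissions point concrete: the default oeqa/runqemu flow uses tap networking, and creating tap devices needs root (CAP_NET_ADMIN) on the host, which is what becomes awkward inside an unprivileged container. A minimal host-side sketch with plain iproute2; the device name, owning user and addressing are illustrative (the 192.168.7.x range follows runqemu's usual convention):

    # create a persistent tap device owned by the unprivileged build user
    # (needs root / CAP_NET_ADMIN -- the crux of the container problem)
    sudo ip tuntap add dev tap0 mode tap user builder
    sudo ip addr add 192.168.7.1/24 dev tap0
    sudo ip link set dev tap0 up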
Tom Rini
On Mon, Sep 07, 2020 at 10:03:36PM +0100, Richard Purdie wrote:
> On Mon, 2020-09-07 at 16:55 -0400, Tom Rini wrote:
> > On Mon, Sep 07, 2020 at 02:59:41PM -0300, Otavio Salvador wrote:
> > > Hello all,
> > Excuse what may be a dumb question, but why are we not just building
> > [...]
> The autobuilder is set up for speed so there aren't VMs involved, it's
> [...]

Which issues do you run in to with qemu networking? I honestly don't know if the U-Boot networking tests we run via qemu under Docker are more or less complex than what you're running in to.

> Speed is extremely important as we have about a 6 hour build test time
> [...]

The reason I was thinking about containers is that it should remove some of what you have to face. Paul may or may not want to chime in on how workable it ended up being for a particular customer, but leveraging CROPS to set up a build environment of a supported host and then running it on whatever the available build hardware is, was good. It sounds like part of the autobuilder problem is that it has to be a specific set of hand-crafted machines, and that in turn feels like we've lost the thread, so to speak, about having a reproducible build system. 6 hours even beats my U-Boot world before/after times, so I do get the dread of "now it might take 5% longer", which is very real extra wallclock time. But if it means more builders could be available, as they're easy to spin up, that could bring the overall time down.

> So this plan is the best practical approach we can come up with to
> [...]

Since I don't want to put even more on your plate, what kind of test is the reasonable one to try here? Or is it hard to say since it's not just "MACHINE=qemux86-64 bitbake world" but also "run this and that and something else"?

--
Tom
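For a concrete picture of the CROPS approach Tom mentions (not a description of how the Yocto autobuilder itself runs), the documented crops/poky pattern looks roughly like this; the image tag, host path and build commands are illustrative:

    # pull a container image matching an older supported host distro
    docker pull crops/poky:ubuntu-16.04

    # run with the work area bind-mounted from whatever hardware is available
    docker run --rm -it -v /home/builder/pyro:/workdir \
        crops/poky:ubuntu-16.04 --workdir=/workdir

    # then, inside the container, build as usual:
    #   . poky/oe-init-build-env build
    #   MACHINE=qemux86 bitbake core-image-minimal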
Richard Purdie
On Mon, 2020-09-07 at 17:19 -0400, Tom Rini wrote:
On Mon, Sep 07, 2020 at 10:03:36PM +0100, Richard Purdie wrote:Its the tun/tap device requirement that tends to be the pain point.On Mon, 2020-09-07 at 16:55 -0400, Tom Rini wrote:Which issues do you run in to with qemu networking? I honestly don't Being able to ssh from the host OS into the qemu target image is a central requirement of oeqa. Everyone tells me it should use portmapping and slirp instead to avoid the privs problems and the container issues which is great but not implemented. Removes some, yes, but creates a whole set of other issues.Speed is extremely important as we have about a 6 hour build test timeThe reason I was thinking about containers is that it should remove some Paul may or may not want to chime in on howThe machines are in fact pretty much off the shelf distro installs so not hand crafted. about having a reproducible build system. 6 hoursHere we get onto infrastructure as we're not talking containers on our workers but on general cloud systems which is a different proposition. We *heavily* rely on the fast network fabric between the workers and our nas for sstate (NFS mounted). This is where we get a big chunk of speed. So "easy to spin up" isn't actually the case for different reasons. Its quite simple:So this plan is the best practical approach we can come up with toSince I don't want to put even more on your plate, what kind of is the MACHINE=qemux86-64 bitbake core-image-sato-sdk -c testimage and MACHINE=qemux86-64 bitbake core-image-sato-sdk -c testsdkext are the two to start with. If those work, the other "nasty" ones are oe-selftest and the toolchain test suites. Also need to check kvm is working. We have gone around in circles on this several times as you're not the first to suggest it :/. Cheers, Richard |
Tom Rini
On Mon, Sep 07, 2020 at 10:30:20PM +0100, Richard Purdie wrote:
> On Mon, 2020-09-07 at 17:19 -0400, Tom Rini wrote:
> > On Mon, Sep 07, 2020 at 10:03:36PM +0100, Richard Purdie wrote:
> > > On Mon, 2020-09-07 at 16:55 -0400, Tom Rini wrote:
> > > [...]
> > Which issues do you run in to with qemu networking? I honestly don't
> > [...]
> It's the tun/tap device requirement that tends to be the pain point.
> [...]

Ah, OK. Yes, we're using "user" networking, not tap.

> > > Speed is extremely important as we have about a 6 hour build test time
> > The reason I was thinking about containers is that it should remove some
> > [...]
> Removes some, yes, but creates a whole set of other issues.

Sorry, what I meant by hand-crafted is that for it to work for older installs, you have to have this particular dance to provide various host tools that weren't required at the time.

> > [...] about having a reproducible build system. 6 hours
> Here we get onto infrastructure as we're not talking containers on our
> [...]

Thanks for explaining it again. I'll go off and do some tests.

--
Tom