Intermittent failure issue summary
Richard Purdie
I'm guessing a lot of people don't follow the intermittent issues. I therefore
thought I'd share a summary of some of them along with some random thoughts on
them. There is a mix of different things here, each needing different skills.

Systemd daemon-reload unit restart failures:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14787
AlexK has got part way in figuring out the circumstances of this. Are any
systemd experts able to spot what I think is a service file dependency issue?

EFI Boot Failure:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14018
"oe-selftest - efibootpartition.GenericEFITest.test_boot_efi selftest"
Does anyone know the EFI boot process and know what logging we might add to
the system so we gain more insight when this happens?

Bitbake parsing error:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14665
"Parsing recipes...ERROR: ParseError in None: Not all recipes parsed, parser
thread killed/died? Exiting"
I just can't spot the logic bug causing this error (and some similar
variants), maybe someone else can?

sstate files not found:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14775
For this one I think we need to write a standalone replica of the tests
against an sstate mirror that sstate.bbclass runs to check if sstate objects
exist. That way we could try different load levels against the project server
and see whether it is the sstate/fetcher code (which does weird things with
threads and concurrent connections) or if it is the server side of things that
has some limit we can't spot.

pseudo do_flush_pseudodb task error:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14654
Not sure why this sometimes happens, we likely need to spot the race in the
pseudo shutdown code.

Memory resident bitbake PR Serv issue:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14786
This is one of the blocking issues on moving to memory resident bitbake by
default.

x86 boot log serio/CD drive timeout in qemu:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14743
We've talked about disabling some of the peripherals we don't need/care about
such as psmouse and the CD drive. Anyone fancy digging into this with upstream
qemu? I suspect there are other people who'd like this too.

Bitbake Server timeout:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14201
This one really needs a rework of bitbake's main loop with a new thread so
that the UI and server can talk even when whatever it is doing (parsing, event
handlers) is blocked. No takers?! Just thought I'd add to the list! :)

These are 8 of the issues and probably the most frequent/annoying or ones
where there is a clearish path forward. The full list of 57:
https://bugzilla.yoctoproject.org/buglist.cgi?quicksearch=AB-INT
(it was over 70 at one point, we've beaten it down a bit)

Cheers,

Richard
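As a starting point for the standalone replica suggested for the sstate item (bug 14775), here is a rough sketch that checks a list of object URLs at a chosen concurrency level. It assumes a plain HTTP HEAD request is a fair stand-in for the existence check; it is not the code sstate.bbclass or the fetcher actually runs, and the names head_exists, probe and sweep are made up for illustration.

```python
import concurrent.futures
import urllib.error
import urllib.request


def head_exists(url, timeout=10.0):
    """Return True if a HEAD request for url succeeds, False on any error."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors, socket timeouts, DNS failures and malformed URLs
        # all count as "not found" in this sketch.
        return False


def probe(urls, workers, check=head_exists):
    """Check every URL using `workers` concurrent threads; return the misses."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(check, urls))
    return [url for url, found in zip(urls, results) if not found]


def sweep(urls, levels=(1, 4, 16, 64), check=head_exists):
    """Run probe() at each load level; return {workers: number of misses}."""
    return {workers: len(probe(urls, workers, check)) for workers in levels}
```

Calling sweep() with URLs of objects known to exist on the mirror would show whether misses start to appear as the worker count goes up. If they do, that points at the server side or the connection handling rather than at missing objects.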
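For the Bitbake Server timeout item (bug 14201), the idea of a separate thread that keeps answering the UI while the main loop is blocked can be illustrated with a toy model. This is only a sketch of the general pattern and not bitbake's code: the Server class, ping() and blocking_work() are hypothetical names, and time.sleep stands in for parsing or an event handler.

```python
import queue
import threading
import time


class Server:
    """Toy server: one thread does blocking work, another answers pings."""

    def __init__(self):
        self.requests = queue.Queue()
        self.stop = threading.Event()
        self.responder = threading.Thread(target=self._respond, daemon=True)
        self.responder.start()

    def _respond(self):
        # Runs independently of blocking_work(), so pings are answered
        # even while the worker is busy.
        while not self.stop.is_set():
            try:
                reply = self.requests.get(timeout=0.05)
            except queue.Empty:
                continue
            reply.put("pong")

    def ping(self, timeout=1.0):
        """Ask the responder thread for a reply; raises queue.Empty on timeout."""
        reply = queue.Queue()
        self.requests.put(reply)
        return reply.get(timeout=timeout)

    def blocking_work(self, seconds):
        # Stand-in for parsing or a long-running event handler.
        time.sleep(seconds)

    def shutdown(self):
        self.stop.set()
        self.responder.join()
```

With this split, a UI that calls ping() while blocking_work() is running still gets an answer, so it can tell a busy server from a dead one.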
Markus Volk
the systemd issue could be this ?

https://github.com/systemd/systemd/pull/22552/commits/de90700f36f2126528f7ce92df0b5b5d5e277558

On 16.04.22 at 12:26, Richard Purdie wrote:
> I'm guessing a lot of people don't follow the intermittent issues. I therefore

Richard Purdie
On Sat, 2022-04-16 at 15:31 +0200, Markus Volk wrote:
> the systemd issue could be this ?

Yes, that could well be it :) Particularly when you read:

https://github.com/systemd/systemd/issues/15316

Alex: Any thoughts?

Cheers,

Richard

Alexander Kanavin
On Sat, 16 Apr 2022 at 15:40, Richard Purdie <richard.purdie@...> wrote:
> > the systemd issue could be this ?
> Yes, that could well be it :)

These commits have been backported to 250-stable, released in 250.4,
and we already carry that version :-(

https://github.com/systemd/systemd-stable/commit/367041af816d48d4852140f98fd0ba78ed83f9e4

Alex

Jose Quaresma
Richard Purdie <richard.purdie@...> wrote on Saturday, 16/04/2022 at 11:26:
> I'm guessing a lot of people don't follow the intermittent issues. I therefore

Would it be a good idea to raise a warning and try again in such cases?

A timeout on the socket looks server related to me, and the timeout issue
improved a lot after the last server infrastructure migration. Before that
migration I could work around the timeout by setting BB_NUMBER_THREADS=1,
which does one connection at a time.

The fact that BB_NUMBER_THREADS=1 helps makes me think this could be a race
condition in oe.utils.ThreadedPool, which afaik is only used in
sstate.bbclass.

Jose

Best regards,
José Quaresma

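The "raise a warning and try again" suggestion could look roughly like the sketch below. It assumes the existence check is a callable that raises a timeout exception on failure; check_with_retry and its parameters are hypothetical and are not an existing bitbake or OE API.

```python
import logging
import socket
import time

logger = logging.getLogger("sstate-retry-sketch")


def check_with_retry(check, url, attempts=3, delay=1.0, sleep=time.sleep):
    """Call check(url); on a timeout, log a warning and retry.

    The final timeout is re-raised so the caller still sees the failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            return check(url)
        except (TimeoutError, socket.timeout) as exc:
            if attempt == attempts:
                raise
            logger.warning("timeout checking %s (attempt %d of %d): %s",
                           url, attempt, attempts, exc)
            sleep(delay)
```

Logging each retry keeps the intermittent timeouts visible in the build output, so they can still be counted even when the retry succeeds.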
Ross Burton <ross@...>
On Sat, 16 Apr 2022 at 11:26, Richard Purdie <richard.purdie@...> wrote:
> x86 boot log serio/CD drive timeout in qemu:

Patches sent for the keyboard/mouse part. The CD drive is trickier...

Ross

Richard Purdie
On Tue, 2022-04-19 at 17:50 +0100, Ross Burton wrote:
> On Sat, 16 Apr 2022 at 11:26, Richard Purdie

Knocking those two out alone is great and much appreciated, thanks!

Cheers,

Richard