Re: would reducing BB_NUMBER_THREADS solve parallelism overheating?

Andre McCurdy

On Sat, Aug 7, 2021 at 8:26 AM Robert P. J. Day <rpjday@...> wrote:

i've asked about this previously, but i finally got around to
thinking about this in detail, and i'd like some feedback.

i've described the overheating/lockup issues i've been having on a
dell latitude laptop building a sizable (wind river linux-based) image
represented by 5-6000 tasks.

if i start the full build from scratch, it would take about 2 hours
if it succeeded but, on a very regular basis, when the load average
well exceeds 120 and i can feel really hot air blowing (even with a
laptop cooler), the laptop will simply lock up hard.

the CPU is an i7-9850H, so 6 cores/12 threads, and i've refactored
numerous proprietary in-house recipes to really crank up the
parallelism so that, a lot of the time, i can see all 12 threads
churning away on some task. of course, that's exactly when i get the
huge load average/overheating/lockup.

Running all CPUs at 100% load shouldn't cause lockups unless there's a
hardware problem. Another possible cause of lockups is exhausting the
available DRAM (which is obviously related to how many processes are
running simultaneously). On my laptop with 4 CPUs / 8 threads and 16GB
DRAM I have to manually override BB_NUMBER_THREADS in order to keep the
system responsive and avoid running out of memory. The default
BB_NUMBER_THREADS / PARALLEL_MAKE values are often too aggressive for
laptops and similar machines running a desktop distro.
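
For illustration, an override in conf/local.conf might look like the
sketch below. The two variable names are the ones discussed above; the
values are assumptions picked for an 8-thread laptop, not
recommendations, so tune them for your own machine.

```
# conf/local.conf
# Illustrative values only (assumed, not measured).

# Maximum number of bitbake tasks run at the same time.
BB_NUMBER_THREADS = "4"

# Maximum number of jobs each make invocation runs in parallel.
PARALLEL_MAKE = "-j 4"
```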

a couple colleagues suggested using BB_NUMBER_THREADS (BNT for
short) to dial things back a bit but, after i pondered a bit, it seems
that that would potentially make things worse, not better, so a couple
questions about cores and threads.

first, given the CPU setup, let's say i set BNT to 6. does that mean
3 cores would be taken out of play entirely, leaving the other 3 to do
all the work, or would each core simply use one thread, or who knows
what would happen, and would any one scenario be superior to another?

The kernel, not bitbake, schedules processes onto CPUs / SMT threads.
If you have exactly 6 processes to run then the kernel will most
likely schedule them on the 6 physical CPUs rather than on 2 SMT threads
x 3 physical CPUs.

The only time the kernel might run 6 processes on 3 physical CPUs and
idle the other 3 would be if you enabled a power saving scheduler in
the kernel, but that behaviour wouldn't be the default.

my other realization(?) is that reducing BNT might be the *worst*
strategy, and here's why. if i have a huge load average, that suggests
i have lots and lots of tasks waiting to run (lots of stuff on the
wait queue). but if i reduce the number of available threads, wouldn't
that just force all that work to wait for a smaller number of threads,
making the wait queues even longer?

No. If you reduce BB_NUMBER_THREADS then you reduce the number of
processes being added to the run queues, not the number of CPUs or SMT
threads which the kernel has available to execute them.
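
A rough way to see this is some back-of-the-envelope arithmetic (a
sketch, not a description of bitbake internals). It assumes the worst
case where every running task is a compile step that invokes make with
the full PARALLEL_MAKE job count, and it assumes both settings default
to the number of logical CPUs (12 on the i7-9850H). The helper name
worst_case_runnable is hypothetical.

```python
def worst_case_runnable(bb_number_threads: int, parallel_make_jobs: int) -> int:
    """Upper bound on simultaneously runnable build processes.

    Assumes every bitbake task slot is occupied by a task that runs
    make with the full -j job count. Real builds are usually below this.
    """
    if bb_number_threads < 1 or parallel_make_jobs < 1:
        raise ValueError("both settings must be at least 1")
    return bb_number_threads * parallel_make_jobs


# Logical CPU count of the laptop in question (6 cores / 12 threads).
LOGICAL_CPUS = 12

if __name__ == "__main__":
    # Vary only BB_NUMBER_THREADS; the CPU count the kernel can use
    # stays at LOGICAL_CPUS in every case.
    for threads in (12, 6, 3):
        bound = worst_case_runnable(threads, LOGICAL_CPUS)
        print(
            f"BB_NUMBER_THREADS={threads:2d}: up to {bound} runnable "
            f"processes on {LOGICAL_CPUS} logical CPUs"
        )
```

With 12 and 12 the bound is 144, which is at least consistent with a
load average well above 120. Halving BB_NUMBER_THREADS halves the
bound, while all 12 logical CPUs stay available to the kernel.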

In summary, it sounds like reducing BB_NUMBER_THREADS is exactly what
you should do. It will reduce peak system load (i.e. the amount of heat
the system produces) and peak DRAM usage (which may be the cause
of your lockups). Adding more DRAM (if it's not maxed out already)
might solve the lockups too (but won't reduce the heat).
