[poky] build performance: bb-matrix on 4-core (BB_NUMBER_THREADS and PARALLEL_MAKE optimization)

Darren Hart dvhart at linux.intel.com
Thu Jul 7 11:12:13 PDT 2011



On 07/07/2011 03:39 AM, Richard Purdie wrote:
> On Wed, 2011-07-06 at 11:16 -0700, Darren Hart wrote:
>> I ran the attached bb-matrix.sh on the following system:
>>
>> CPU (1): Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz
>> Cores: 4
>> Threads: 8
>> Memory: 8186560 kB
>> OS Disk: INTEL SSDSA2M040G2GC (SSD)
>> Build Disk: Hitachi HDT721050SLA360 (Spinning Media)
>>
>> The script runs builds with all combinations of BB_NUMBER_THREADS and
>> PARALLEL_MAKE from 4 through 16.
>>
>> Once BB_NUMBER_THREADS hit 10, the kernel OOM Killer started killing off
>> tasks and build time tripled. Those runs have been removed from the dataset.
>>
>> All of the runs with PARALLEL_MAKE=10 also failed, for a variety of
>> reasons. See bb-pm-errors.txt for details. For whatever reason, 10 seems
>> to be a bad number. Additional failures were seen at 09-11 and 10-14.
>> These have all been removed from the dat file.
>>
>> From the remaining results, a clear downward trend in build time is
>> evident with increasing BB_NUMBER_THREADS through 8, while build time
>> mostly increases again with 9 (and dramatically so with 10, not shown).
>> Optimal build time is achieved with BB_NUMBER_THREADS=8.
>>
>> Along the BB_NUMBER_THREADS=8 line, there is no clear trend with
>> increasing values of PARALLEL_MAKE. Local downward trends appear from
>> 4-7 and from 11-14. Optimal build time occurs with PARALLEL_MAKE=14, but
>> it beats PARALLEL_MAKE=7 by only 68 seconds.
>>
>> While optimal build time is achieved with BB=8 and PM=14, a more
>> resource-friendly setting of BB=8 and PM=6 yields nearly the same results.
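
For anyone reading without the attachment: the sweep described above is
essentially a nested loop over the two variables. The following is only a
minimal sketch of that idea, not the actual bb-matrix.sh. run_build and
sweep are hypothetical names, and run_build just echoes the settings where
a real, timed bitbake invocation would go.

```shell
#!/bin/sh
# Sketch of a BB_NUMBER_THREADS x PARALLEL_MAKE sweep; NOT the real bb-matrix.sh.
# run_build is a hypothetical stand-in that only echoes the settings it
# would build with; replace it with a real, timed bitbake invocation.
run_build() {
    echo "BB_NUMBER_THREADS=$1 PARALLEL_MAKE=-j$2"
}

# sweep LO HI: call run_build for every (BB, PM) pair in LO..HI inclusive.
sweep() {
    lo=$1
    hi=$2
    bb=$lo
    while [ "$bb" -le "$hi" ]; do
        pm=$lo
        while [ "$pm" -le "$hi" ]; do
            run_build "$bb" "$pm"
            pm=$((pm + 1))
        done
        bb=$((bb + 1))
    done
}

sweep 4 16
```

With a range of 4 through 16 that is 13 x 13 = 169 builds, which is why a
full run takes so long.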
> 
> Thanks Darren, I think those are interesting results.
> 
> Is the general advice we should give out therefore to set
> BB_NUMBER_THREADS = PARALLEL_MAKE = number of threads?

I don't think we have enough data to make a general recommendation, but
for 4 core systems, I think BB=8 and PM=6 is a good choice. With some
additional runs on other hardware, hopefully we can come up with a more
general number like BB=2*NR_CORES PM=1.5*NR_CORES (cores not threads).
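
To make that tentative rule of thumb concrete, here is a rough sketch that
turns a physical core count into local.conf lines. The 2x and 1.5x
multipliers are just the unvalidated guess above, suggest_settings is a
hypothetical helper name, and the core count is passed in by hand because
tools like nproc report threads rather than cores.

```shell
#!/bin/sh
# Hypothetical helper: print local.conf lines for a given physical core count.
# The 2x and 1.5x multipliers are the tentative figures from this thread,
# not a validated recommendation.
suggest_settings() {
    cores=$1
    bb_threads=$((cores * 2))
    pm_jobs=$((cores * 3 / 2))   # integer arithmetic, rounds down
    printf 'BB_NUMBER_THREADS = "%s"\n' "$bb_threads"
    printf 'PARALLEL_MAKE = "-j %s"\n' "$pm_jobs"
}

suggest_settings 4
```

For 4 cores this gives BB_NUMBER_THREADS = "8" and PARALLEL_MAKE = "-j 6",
matching the setting suggested for this machine.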

> 
> I'd love to understand why there is the peak and second dip on the
> PARALLEL_MAKE curve...

Me too! The PM axis plots are very strange and not at all expected. I'm
seeing similarly unexpected results on the 12 core machine currently
running through a 12-48 bb-matrix run.

> 
> It would also be good to put the script in scripts/contrib.

Yes, I'll try to get around to that soon... this week. I think they need
a little cleanup first.

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel


