[poky] [RFC PATCH 1/1] local.conf.sample: update suggestions for BB_NUMBER_THREADS and PARALLEL_MAKE
Darren Hart
dvhart at linux.intel.com
Mon Jun 27 13:47:22 PDT 2011
On 06/21/2011 02:14 PM, Darren Hart wrote:
> On 06/20/2011 03:57 PM, Tom Rini wrote:
>> On 06/17/2011 08:21 PM, Darren Hart wrote:
>>>
>>>
>>> On 06/17/2011 08:16 PM, Joshua Lock wrote:
>>>> It's been suggested that BB_NUMBER_THREADS should be 2 * the number of cores
>>>> and PARALLEL_MAKE should be equal to the number of cores available on the
>>>> build machine.
>>>>
>>>> Update local.conf.sample to suggest this.
>>>>
>>>> Signed-off-by: Joshua Lock <josh at linux.intel.com>
>>>> ---
>>>> meta-yocto/conf/local.conf.sample | 4 +++-
>>>> 1 files changed, 3 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/meta-yocto/conf/local.conf.sample b/meta-yocto/conf/local.conf.sample
>>>> index ea32b81..43d06e6 100644
>>>> --- a/meta-yocto/conf/local.conf.sample
>>>> +++ b/meta-yocto/conf/local.conf.sample
>>>> @@ -9,7 +9,9 @@ CONF_VERSION = "1"
>>>> #SSTATE_DIR ?= "${TOPDIR}/sstate-cache"
>>>>
>>>> # Uncomment and set to allow bitbake to execute multiple tasks at once.
>>>> -# For a quadcore, BB_NUMBER_THREADS = "4", PARALLEL_MAKE = "-j 4" would
>>>> +# Recommended values are twice the number of processor cores for
>>>> +# BB_NUMBER_THREADS and the number of processor cores for PARALLEL_MAKE
>>>> +# For a quadcore, BB_NUMBER_THREADS = "8", PARALLEL_MAKE = "-j 4" would
>>>
>>> Hrm, where is this coming from? In my experience it works better the
>>> other way around. We probably also need to be explicit about cores
>>> versus threads.
>>
>> On my older quad-core AMD box, -j 6 / 4 threads is where it's at, and
>> our testing / poking around on other hardware bears that out (for
>> example my Dell M4400 laptop is -j 3 / 2 threads).
>
> Those ratios are closer to what I have seen as optimal as well - specifically
> more PARALLEL_MAKE threads than BB_NUMBER_THREADS, which is the opposite of the
> suggestion made above.
>
>
>> That said, for much
>> more beefy configurations, we use -j 16, and 12 threads (on an 8 core
>> machine with 12GB mem). I think perhaps the best change here is to keep
>> it at 1:1 in the sample (since we've also run into older hardware too)
>> and explain that anywhere between 1:1 and 2*core:2*core could work best
>> depending on setup, ymmv, etc.
>
> I'm well into the data collection process I promised in my first response, with
> several days still remaining. However, here is a snapshot of the results. This
> is a quad-core i7.
>
> BB PM Seconds ...
> 04 04 9083.43 2704.41 16073.55 206% 5079496 57424647 2377276782 51099 0 0 1910864 0 0 0 0
> 04 05 9032.61 2708.10 16093.60 208% 5114730 56938109 2377140907 52246 0 0 2027232 0 0 0 0
> 04 06 9031.50 2711.30 16095.67 208% 5154734 56737116 2377276988 52525 0 0 1937824 0 0 0 0
> 04 07 9022.27 2711.39 16150.86 209% 5165928 56811664 2377060362 52531 0 0 2027248 0 0 0 0
> 04 08 9087.11 2716.88 16103.36 207% 5061918 56763715 2376970987 52525 0 0 1940448 0 0 0 0
> 04 09 9310.98 2698.72 16080.94 201% 5159166 56868437 2377495925 53496 0 0 2027232 0 0 0 0
> 04 10 9243.03 2709.35 16077.68 203% 4983749 56880160 2377210887 53044 0 0 2027360 0 0 0 0
> 04 11 9139.86 2711.91 16085.86 205% 5108951 56824250 2377433181 53253 0 0 2027232 0 0 0 0
> 04 12 9132.83 2719.61 16066.65 205% 4995133 56904875 2377286845 53102 0 0 1857632 0 0 0 0
> 04 13 9169.78 2715.16 16095.05 205% 5046506 56856301 2376970111 52827 0 0 2027248 0 0 0 0
> 04 14 9079.85 2715.04 16082.28 207% 5075792 56804956 2377616169 53000 0 0 1912128 0 0 0 0
> 04 15 9133.57 2710.45 16096.39 205% 5101938 56804976 2377283916 53125 0 0 2027360 0 0 0 0
> 04 16 9200.60 2728.44 16129.32 204% 4959845 56852830 2381322571 53793 0 0 2027248 0 0 0 0
> 05 04 8685.73 2942.91 17328.96 233% 8572490 54322660 2383126091 52468 0 0 2027232 0 0 0 0
> 05 05 9109.86 2964.68 18033.54 230% 8637358 54222700 2380041373 54070 0 0 2027248 0 0 0 0
> 05 06 8681.89 2950.25 17384.38 234% 8466556 54244954 2382944050 53202 0 0 2027360 0 0 0 0
> 05 07 8683.63 2950.78 17380.57 234% 8420681 54277047 2382977368 52970 0 0 2027360 0 0 0 0
> 05 08 8845.89 2922.37 17296.46 228% 8356145 54197213 2375969732 52062 0 0 2027232 0 0 0 0
> 05 09 8801.84 2932.86 17509.07 232% 8508554 54179792 2383768769 52943 0 0 2027216 0 0 0 0
> 05 10 8742.35 2942.06 17341.79 232% 8475095 54323068 2383570357 52587 0 0 2027232 0 0 0 0
> 05 11 8798.83 2943.76 17428.09 231% 8821969 54234482 2375921193 53157 0 0 2027232 0 0 0 0
> 05 12 8805.69 2947.29 17417.49 231% 8753090 54442047 2383254479 52402 0 0 1883440 0 0 0 0
> 05 13 8834.71 2948.17 17435.29 230% 8794659 54271404 2382972825 53267 0 0 2027360 0 0 0 0
> 05 14 9549.34 2867.80 16897.12 206% 7406864 54830774 2376222762 57989 0 0 2027376 0 0 0 0
> 05 15 8912.57 2935.90 17301.71 227% 8389127 54353519 2383108921 52124 0 0 1883312 0 0 0 0
> 05 16 8780.23 2935.33 17286.67 230% 8417447 54242775 2376062987 52913 0 0 2027248 0 0 0 0
> 06 04 8514.25 3134.09 18582.59 255% 12856900 50543451 2375930920 52359 0 0 2027248 0 0 0 0
> 06 05 8502.72 3138.54 18576.75 255% 12884270 50607027 2376254598 52315 0 0 2027248 0 0 0 0
> 06 06 8485.14 3144.19 18611.51 256% 12892763 50365642 2376118338 53214 0 0 2027248 0 0 0 0
> 06 07 8452.52 3123.77 18596.88 256% 13006966 50461438 2376229877 52251 0 0 2027248 0 0 0 0
> 06 08 8450.55 3135.17 18586.06 257% 12926578 50394462 2375790855 52833 0 0 2027232 0 0 0 0
> 06 09 8473.10 3123.47 18554.24 255% 12696799 50642273 2375742839 52867 0 0 2027248 0 0 0 0
> 06 10 8491.59 3125.92 18580.78 255% 12931115 50381662 2375931612 51677 0 0 2027232 0 0 0 0
>
> Here we can see diminishing returns around 7 threads or so for PARALLEL_MAKE, and
> continue to see improvements going from 4 to 6 threads on BB_NUMBER_THREADS. I
> suspect this will find an optimal build time with BB=4 and PM=6, but we'll see
> (it should be around Jul 1 if things proceed on track and I don't melt the
> machine).
>
>
The attached files present the data for BB [4-8] and PM [4-16]. I lost
power during the 9,1 run, but I think the data is adequate as is. I see
an optimal run with a BB of 7 and a PM of 6. I am not seeing the
variation with PM that I thought I should.
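
For anyone who wants to pick the fastest configuration out of rows like the ones in the snapshot quoted above, a few lines of Python will do. This is only a sketch under stated assumptions: it assumes whitespace-separated rows whose first three columns are BB, PM, and elapsed seconds (as the snapshot header suggests), the function name is hypothetical, and the embedded sample is a handful of snapshot rows with the trailing columns dropped. The attached .dat file may be laid out differently.

```python
# Sketch: find the fastest (BB, PM) pair in rows shaped like the snapshot.
# Assumption: columns 1-3 are BB, PM, and elapsed seconds; the rest is ignored.

# A few rows copied from the quoted snapshot, trailing columns dropped.
SAMPLE_ROWS = """\
04 04 9083.43 2704.41 16073.55 206%
04 07 9022.27 2711.39 16150.86 209%
05 06 8681.89 2950.25 17384.38 234%
05 14 9549.34 2867.80 16897.12 206%
06 08 8450.55 3135.17 18586.06 257%
06 10 8491.59 3125.92 18580.78 255%
"""


def fastest_run(text):
    """Return (bb, pm, seconds) for the row with the lowest elapsed time."""
    runs = []
    for line in text.splitlines():
        # Tolerate quoted mail lines ("> 04 04 ...") and blank lines.
        fields = line.lstrip("> ").split()
        if len(fields) < 3:
            continue
        try:
            runs.append((int(fields[0]), int(fields[1]), float(fields[2])))
        except ValueError:
            continue  # header or other non-data line
    if not runs:
        raise ValueError("no data rows found")
    return min(runs, key=lambda run: run[2])


if __name__ == "__main__":
    bb, pm, seconds = fastest_run(SAMPLE_ROWS)
    print(f"fastest: BB={bb} PM={pm} at {seconds:.2f} s")
```

Only wall-clock time is compared here; the other columns are ignored because the snapshot does not label them.
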

NOTE: Some testing revealed that bitbake is ignoring PARALLEL_MAKE from
the environment, which is consistent with the rather long build times I
was experiencing. I am going to rework the script to force the value in
local.conf and kick it off again for two weeks. For now, the data below
is still interesting if you assume a constant PM.
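
One way to force the values through local.conf instead of the environment is to generate the two assignments for each test configuration and write them into the file before each build. The following is a rough Python sketch, not the actual test script: the helper names are hypothetical, and only the BB_NUMBER_THREADS and PARALLEL_MAKE assignment syntax is taken from local.conf.sample.

```python
# Sketch: emit the local.conf assignments for each (BB, PM) combination.
# Assumption: a test script writes these lines into local.conf before each
# build; the helper names here are hypothetical.
from itertools import product


def conf_lines(bb, pm):
    """Return the local.conf lines for one test configuration."""
    return [
        f'BB_NUMBER_THREADS = "{bb}"',
        f'PARALLEL_MAKE = "-j {pm}"',
    ]


def sweep(bb_values, pm_values):
    """Yield (bb, pm, conf_text) for every combination, BB-major order."""
    for bb, pm in product(bb_values, pm_values):
        yield bb, pm, "\n".join(conf_lines(bb, pm)) + "\n"


if __name__ == "__main__":
    # Ranges taken from the runs described above: BB 4-8, PM 4-16.
    for bb, pm, text in sweep(range(4, 9), range(4, 17)):
        print(f"# BB={bb} PM={pm}")
        print(text, end="")
```

Keeping the generation separate from the build invocation makes it easy to inspect exactly what each run was configured with.
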
The system under test:
CPU (1): Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz
Cores: 4
Threads: 8
Memory: 8186560 kB
OS Disk: INTEL SSDSA2M040G2GC (SSD)
Build Disk: Hitachi HDT721050SLA360 (Spinning Media)

plot.png
This is an isometric 3D surface plot with a density map of sorts applied
to the xy plane. Optimal build time occurs here around BB=7 PM=6.

plot-bb.png
Effectively a runtime (y) vs BB (x) plot. Fairly little gain from
increasing BB beyond 6 (1.5 * NR_CORES).

plot-pm.png
Effectively a runtime (y) vs PM (x) plot. Some improvement is seen from
increasing PM from 4 to 5 and 6 for large values of BB. After 6 there is
no improvement, and a trend toward diminishing returns is certainly
evident. (Or rather, this would be the analysis if bitbake were paying
attention to PARALLEL_MAKE from the environment.)

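
To make the competing rules of thumb from this thread concrete, here they are side by side as a small Python sketch. The function and the labels are hypothetical, the "observed" entry reflects only this one quad-core machine (and PM results that are suspect, per the NOTE above), and `cores` is assumed to mean physical cores rather than hardware threads.

```python
# Sketch: the rules of thumb discussed in this thread, side by side.
# Assumption: `cores` is the number of physical cores, not hardware threads.
import math


def suggestions(cores):
    """Return {label: (BB_NUMBER_THREADS, PARALLEL_MAKE jobs)}."""
    one_and_a_half = math.ceil(1.5 * cores)
    return {
        # Previous sample text: same value for both settings.
        "one_to_one": (cores, cores),
        # Proposed patch: twice the cores for BB, the core count for PM.
        "patch": (2 * cores, cores),
        # Roughly where the plots flatten out on the machine tested here.
        "observed_1_5x": (one_and_a_half, one_and_a_half),
    }


if __name__ == "__main__":
    for label, (bb, pm) in suggestions(4).items():
        print(f'{label}: BB_NUMBER_THREADS = "{bb}", PARALLEL_MAKE = "-j {pm}"')
```
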
To recreate these plots, run the following with the .plt and .dat
attachments in the same directory:

$ gnuplot -persist < bb-pm-matrix.plt

This will regenerate the plots and display an interactive view of the
plot, which allows for manipulating the viewpoint.

--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bb-pm-matrix.plt
URL: <http://lists.yoctoproject.org/pipermail/poky/attachments/20110627/8cc6ea04/attachment.ksh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bb-pm-runtime.dat
Type: application/ms-tnef
Size: 5797 bytes
Desc: not available
URL: <http://lists.yoctoproject.org/pipermail/poky/attachments/20110627/8cc6ea04/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot.png
Type: image/png
Size: 14873 bytes
Desc: not available
URL: <http://lists.yoctoproject.org/pipermail/poky/attachments/20110627/8cc6ea04/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot-bb.png
Type: image/png
Size: 8205 bytes
Desc: not available
URL: <http://lists.yoctoproject.org/pipermail/poky/attachments/20110627/8cc6ea04/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot-pm.png
Type: image/png
Size: 8436 bytes
Desc: not available
URL: <http://lists.yoctoproject.org/pipermail/poky/attachments/20110627/8cc6ea04/attachment-0002.png>