[Toaster] [PATCH 0/5] Fix task buildstats gathering
Richard Purdie
richard.purdie at linuxfoundation.org
Sun Feb 21 03:04:00 PST 2016
On Sat, 2016-02-20 at 18:51 +0000, Barros Pena, Belen wrote:
> So I ran a build following the steps above, and had a look at the
> data. Information now shows, which is definitely an improvement :)
>
> * Time and Disk I/O show for all executed tasks, which is what's
> supposed to happen.
> * CPU usage is missing from some executed tasks. This used to happen
> before as well, but we never actually worked out why.
>
> Regarding the numbers, I am not sure how useful is the Disk I/O
> figure expressed in bytes. Should we convert it to something else?
> And then, our big problem is definitely the CPU usage, which shows
> pretty crazy numbers. The highest one in my build was for
> linux-yocto do_compile_kernelmodules at a whopping 2455.15%
>
> So, Richard Purdie pointed out to us that % over 100 are related to
> task parallelism. And in fact, if I divide the CPU usage value of
> the compile tasks by the value of PARALLEL_MAKE (36), I do get
> percentages below 100 for all of them. In the example of linux-yocto
> do_compile_kernelmodules, we get 68.20%
>
> If I divide the CPU usage value of all the install tasks by the
> value of PARALLEL_MAKEINST (36), the same happens: % below 100.
>
> However, we do get % over 100 for tasks that we have been told have
> no parallelism at all. I see such values for unpack, patch,
> configure, package and populate_sysroot tasks.
FWIW, do_package does contain parallelism.
For unpack/patch/configure/populate_sysroot, there is some parallelism
too, in that the parent logging runs in parallel with the child
execution. Where these tasks run quickly, I think this could account
for the 'parallelism' we see there. Was it only for short-running
tasks?
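A rough sketch of how that could push a short task over 100%, assuming
the figure is CPU time (parent plus child) divided by elapsed wall-clock
time. The function name and all the timings below are made up for
illustration; they are not taken from buildstats:

```python
def cpu_percent(cpu_seconds, elapsed_seconds):
    """CPU usage as a percentage of wall-clock time (assumed formula).

    cpu_seconds is a list of CPU times, one entry per process that ran
    during the task (e.g. the child and the logging parent).
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return 100.0 * sum(cpu_seconds) / elapsed_seconds


# Hypothetical short task: the child does the work while the parent
# handles logging at the same time on another core.
short_task = cpu_percent([0.09, 0.03], elapsed_seconds=0.10)

# Hypothetical long task with the same parent overhead: the logging
# cost is now negligible next to the child's run time.
long_task = cpu_percent([9.0, 0.03], elapsed_seconds=10.0)

print(f"short task: {short_task:.1f}%")
print(f"long task:  {long_task:.1f}%")
```

With these invented numbers the short task lands above 100% and the long
one below it, which is the pattern I'd expect if the parent overhead is
the cause.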
> So, the question is, why are those
> happening? Because for tasks that we know have parallelism we might
> be able to divide the value by the parallelism set, as I did for
> compile and install tasks. But for the others, I genuinely have no
> answer, other than that there must be some kind of bug in the data
> collection.
FWIW I'm very strongly against doing any kind of division by
PARALLEL_MAKE or similar numbers as it makes the resulting number
much less meaningful. The idea behind showing these numbers in the UI
is to allow people to make decisions based on them. If you do
that division, I can't think of any useful decisions or actions I could
then take based on the result.
For example, "2455%" above tells me that we have a parallelism factor
of about 24 in the kernel build. From that I can conclude that whilst
the kernel is good, it's not making full use of the hardware, which would
have been a factor of 36 (although the system was likely also busy
doing other things unless the task was run in isolation).
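To make that reading concrete, here is a minimal sketch of the
arithmetic. The helper names are hypothetical (nothing like this exists
in Toaster as far as this mail is concerned); the inputs are the 2455.15%
figure and the PARALLEL_MAKE value of 36 quoted above:

```python
def parallelism_factor(cpu_percent):
    """Read a CPU usage percentage as 'how many cores were busy'."""
    return cpu_percent / 100.0


def unused_parallelism(cpu_percent, max_parallel):
    """Cores' worth of parallelism left unused, relative to max_parallel."""
    return max_parallel - parallelism_factor(cpu_percent)


# linux-yocto do_compile_kernelmodules, from the build discussed above
factor = parallelism_factor(2455.15)      # roughly 24.55
unused = unused_parallelism(2455.15, 36)  # roughly 11.45

print(f"parallelism factor: {factor:.2f} of 36")
print(f"unused parallelism: {unused:.2f}")
```

The factor stays directly comparable with the configured parallelism,
whereas the divided figure (68.20%) hides how many cores were actually
in use.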
The only proposal I have is to simply display these as parallelism
factors rather than percentage usages. I don't think the data collected
is wrong, but it does require a certain amount of thinking to interpret
it. We're pulling this data direct from the kernel so it's unlikely
we can get any "better" (easier to interpret?) data either.
Cheers,
Richard