[poky] Quick hack for profiling tasks
Richard Purdie
richard.purdie at linuxfoundation.org
Tue Feb 1 03:05:21 PST 2011
On Mon, 2011-01-31 at 19:43 -0600, Mark Hatle wrote:
> On 1/31/11 6:40 PM, Richard Purdie wrote:
> > On Tue, 2011-02-01 at 00:28 +0000, Richard Purdie wrote:
> >> One thing that is bugging me whilst I've been debugging some issues
> >> we're having with the libc/libgcc package dependency issue is how long
> >> do_package takes for libc. The question is where does it spend the time?
> >> Answer, I have no idea.
> >>
> >> I hacked together the patch below to find out. Its ugly and uses the
> >> boilerplate profiling code from cooker, cut and pasted here to profile
> >> the actual tasks that run.
> >>
> >> I've yet to look at the results but it should allow us to optimise the
> >> python tasks a bit if we can see where they spend time. I'm hoping this
> >> lets others look at that too and also it give us some hints as to how we
> >> might improve the core when turning on profiling in bitbake.
> >
> > For eglibc this worked out as:
> >
> > Tue Feb 1 00:33:21 2011 profile-eglibc_2.12.bb-do_package.log
> >
> > 8339733 function calls (8001600 primitive calls) in 877.972 CPU seconds
> >
> > Ordered by: internal time
> >
> > ncalls tottime percall cumtime percall filename:lineno(function)
> > 3206 321.887 0.100 322.422 0.101 package_do_filedeps:12(process_deps)
> > 403 311.208 0.772 311.208 0.772 {posix.waitpid}
> > 134054 69.860 0.001 69.860 0.001 {method 'read' of 'file' objects}
> > 225554 23.367 0.000 23.367 0.000 {posix.stat}
> > 866 20.279 0.023 20.279 0.023 {posix.system}
> > 85562 19.406 0.000 19.406 0.000 {posix.chmod}
> > 168083 16.691 0.000 16.691 0.000 {posix.lstat}
> > 25824 14.399 0.001 14.399 0.001 {posix.rename}
> > 55391 13.731 0.000 13.731 0.000 {open}
> > 5325 9.019 0.002 9.019 0.002 {posix.popen}
> > 2279 5.490 0.002 5.490 0.002 {method 'readlines' of 'file' objects}
> > 6403 5.187 0.001 6.346 0.001 insane.bbclass:1(package_qa_hash_style)
> > 19214 5.046 0.000 5.046 0.000 {posix.mkdir}
> >
> > so its spending a third of the time in package_do_filedeps(), a lot of
> > which is in waitpid waiting for the process that was spawned.
> >
> > Mark: Is there a way we could batch up the information rather than go
> > file by file? I'm going to look at this for other areas to improve too
> > but thats obviously one worthy of attention.
>
> The way the routine works today is via a script call perfile_rpmdeps.sh. The
> package.bbclass calls this script twice for each package-split.
>
> I.e. if we have base, base-dbg and base-libs, it will run for a total of 6
> times. Each pair it is simply passed the path to the packages-split directory.
>
> Within the script itself, it is doing a find operation:
>
> find "$@" | process $process_type
>
> The output of the file is passed to "process" which is just a wrapper that calls
> the rpmdeps program with the correct parameters. We could optimize this a bit
> by ignoring directories and symlinks. But we still want to process all of the
> files in the system.
>
> Another optimization (that we do NOT have) that is done by default in RPM, is to
> only process files that are +x. We have chosen not to do this as most of our
> libraries are not set +x. An alternative is to use 'file' and check the type of
> each file, however identifying the file type is likely to take longer then
> simply running the per-file deps commands. Another possible optimization is
> only scan certain directories (or the opposite, skip certain directories...) the
> only issue here is missing files that may be dlopened or loaded via RPATH
> because they are in a non-standard location.
>
> So what I'd recommend is we start by adding "-type f" to the find. That is
> likely to help some.
>
> Maybe then add a check for either sitting in /lib or /usr/lib _or_ mode is +x?
> We would likely need to audit the system somehow and tag ELF files that are
> neither...
I did a little investigating and my brain dump on it follows...
I noticed that rpmdeps doesn't help us with any locale information in
the libc case so I tried this patch:
diff --git a/meta/classes/package.bbclass b/meta/classes/package.bbclass
index 856858c..476a84a 100644
--- a/meta/classes/package.bbclass
+++ b/meta/classes/package.bbclass
@@ -607,7 +607,7 @@ python package_do_filedeps() {
# Determine dependencies
for pkg in packages.split():
- if pkg.endswith('-dbg'):
+ if pkg.endswith('-dbg') or pkg.find('-locale-') != -1 or pkg.find('-localedata-') != -1 or pkg.find('-gconv-') != -1 or pkg.find('-charmap-') != -1:
continue
# Process provides
The result was:
Tue Feb 1 10:06:46 2011 profile-eglibc_2.12.bb-do_package.log
8106962 function calls (7768817 primitive calls) in 594.886 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
403 308.929 0.767 308.929 0.767 {posix.waitpid}
134054 72.907 0.001 72.907 0.001 {method 'read' of 'file' objects}
828 34.643 0.042 34.795 0.042 package_do_filedeps:12(process_deps)
866 29.347 0.034 29.347 0.034 {posix.system}
225558 23.459 0.000 23.459 0.000 {posix.stat}
85562 18.980 0.000 18.980 0.000 {posix.chmod}
171317 17.198 0.000 17.198 0.000 {posix.lstat}
55393 14.378 0.000 14.378 0.000 {open}
25824 13.719 0.001 13.719 0.001 {posix.rename}
2280 5.383 0.002 5.383 0.002 {method 'readlines' of 'file' objects}
6403 4.877 0.001 5.762 0.001 insane.bbclass:1(package_qa_hash_style)
19214 4.843 0.000 4.843 0.000 {posix.mkdir}
2947 3.867 0.001 3.867 0.001 {posix.popen}
So overall we dropped from 878 seconds to 595 seconds and
package_do_filedeps:12(process_deps) dropped from 322 to 34 seconds.
FWIW, the waitpid above is from the binary locale generation, not from
filedeps and we know that is "slow" in the libc case although the cross
locale generation speeds it up a lot. The reason process_deps is showing
so badly on the profile is not because that python function is slow (as
it usually the case) but because its reading from a pipe which blocks on
the output of the rpmdeps process.
I did take a look at the perfile_rpmdeps.sh script and rpmdeps itself. I
suspect we might be faster if we called "rpmdeps --requires --provides
-v" on each file ourselves and handle the prefix the -v option adds to
differentiate between the requires and provides. There is a sed
operation in the script I didn't 100% understand that we'd need to
account for too. Watching cpu usage whilst manually running the script
was not encouraging :/.
So these findings along with your thoughts should give us some avenues
to explore. Do you thing skipping locale packages is ok to do? Maybe
kernel modules too?
There are also other things to pick off in that profile list such as the
200,000 stat calls (from sstate?), the insane check thats showing up and
what all the read calls are doing.
Cheers,
Richard
More information about the poky
mailing list