[poky] [PATCH 1/1] bitbake: optimize file parsing speed
Xu, Dongxiao
dongxiao.xu at intel.com
Fri Nov 19 05:14:13 PST 2010
Richard Purdie wrote:
> On Wed, 2010-11-17 at 12:10 +0800, Dongxiao Xu wrote:
>> build some data cache for generate_dependencies() on hand, and later
>> each time when parsing the bb file, we do not need to build them again
>> and again.
>>
>> This optimization could get about 50% speed gain when parsing all ~800
>> bb files.
>>
>> Signed-off-by: Dongxiao Xu <dongxiao.xu at intel.com>
>> ---
>> bitbake/lib/bb/cooker.py | 2 ++
>> bitbake/lib/bb/data.py | 11 ++++++-----
>> 2 files changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/bitbake/lib/bb/cooker.py b/bitbake/lib/bb/cooker.py
>> index 33eb65e..05e6c16 100644
>> --- a/bitbake/lib/bb/cooker.py
>> +++ b/bitbake/lib/bb/cooker.py
>> @@ -76,6 +76,8 @@ class BBCooker:
>>
>>          self.configuration.data = bb.data.init()
>>
>> +        bb.data.init_data_cache(self.configuration.data)
>> +
>>          if not server:
>>              bb.data.setVar("BB_WORKERCONTEXT", "1", self.configuration.data)
>>
>> diff --git a/bitbake/lib/bb/data.py b/bitbake/lib/bb/data.py
>> index fee10cc..a9e539f 100644
>> --- a/bitbake/lib/bb/data.py
>> +++ b/bitbake/lib/bb/data.py
>> @@ -296,17 +296,18 @@ def build_dependencies(key, keys, shelldeps, d):
>>      #bb.note("Variable %s references %s and calls %s" % (key, str(deps), str(execs)))
>>      #d.setVarFlag(key, "vardeps", deps)
>>
>> -def generate_dependencies(d):
>> +def init_data_cache(d):
>> +    bb.data.keylist = set(key for key in d.keys() if not key.startswith("__"))
>> +    bb.data.shelldeps = set(key for key in bb.data.keylist if d.getVarFlag(key, "export") and not d.getVarFlag(key, "unexport"))
>>
>> -    keys = set(key for key in d.keys() if not key.startswith("__"))
>> -    shelldeps = set(key for key in keys if d.getVarFlag(key, "export") and not d.getVarFlag(key, "unexport"))
>> +def generate_dependencies(d):
>>
>>      deps = {}
>>      taskdeps = {}
>>
>>      tasklist = bb.data.getVar('__BBTASKS', d) or []
>>      for task in tasklist:
>> -        deps[task] = build_dependencies(task, keys, shelldeps, d)
>> +        deps[task] = build_dependencies(task, bb.data.keylist, bb.data.shelldeps, d)
>>
>>          newdeps = deps[task]
>>          seen = set()
>> @@ -316,7 +317,7 @@ def generate_dependencies(d):
>>              newdeps = set()
>>              for dep in nextdeps:
>>                  if dep not in deps:
>> -                    deps[dep] = build_dependencies(dep, keys, shelldeps, d)
>> +                    deps[dep] = build_dependencies(dep, bb.data.keylist, bb.data.shelldeps, d)
>>                  newdeps |= deps[dep]
>>              newdeps -= seen
>>          taskdeps[task] = seen | newdeps
>
>
> I'm afraid this isn't going to be quite this simple although this
> does prove those lines of code are a big hotspot in parsing.
>
> Why? You're creating the key and export lists for the base
> configuration data whereas the original code creates these lists for
> the *total* parsed metadata. There will therefore be differences in
> the values held by the two caches :(.
>
> As an example, if you set:
>
> FOO = "bar" in a .bb file, 'FOO' will not appear in your keywords
> cache.
>
> Cheers,
>
> Richard
Richard,
Yes, you are right, thanks for pointing it out.
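To make the problem concrete, here is a minimal sketch of why a cache built from the base configuration goes stale. It uses a plain dict-based stand-in (the hypothetical FakeData class and compute_keys() helper below), not the real bb.data API; only the two set expressions are taken from generate_dependencies().

```python
class FakeData:
    """Hypothetical stand-in for the datastore: maps variable name -> flags."""

    def __init__(self, parent=None):
        # A per-recipe store starts as a copy of the base configuration.
        self.vars = {k: dict(v) for k, v in parent.vars.items()} if parent else {}

    def setVar(self, key, value, **flags):
        entry = dict(flags)
        entry["value"] = value
        self.vars[key] = entry

    def keys(self):
        return self.vars.keys()

    def getVarFlag(self, key, flag):
        return self.vars.get(key, {}).get(flag)


def compute_keys(d):
    # The same two expressions as in generate_dependencies().
    keys = set(key for key in d.keys() if not key.startswith("__"))
    shelldeps = set(key for key in keys
                    if d.getVarFlag(key, "export") and not d.getVarFlag(key, "unexport"))
    return keys, shelldeps


# Base configuration, as seen when the cooker is initialised.
base = FakeData()
base.setVar("PATH", "/usr/bin", export=True)
base.setVar("__BBTASKS", [])
cached_keys, cached_shelldeps = compute_keys(base)  # what the patch would cache

# A recipe adds its own variables on top of the base configuration.
recipe = FakeData(parent=base)
recipe.setVar("FOO", "bar")                  # FOO = "bar" in a .bb file
recipe.setVar("CFLAGS", "-O2", export=True)  # an exported recipe variable

fresh_keys, fresh_shelldeps = compute_keys(recipe)
print(sorted(fresh_keys - cached_keys))            # keys the cache misses
print(sorted(fresh_shelldeps - cached_shelldeps))  # exports the cache misses
```

In this sketch both FOO and CFLAGS are absent from the cached sets, so any dependency computed from the cache would silently ignore them.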
I am now trying to address the parse-time issue in another way.
We saw that the following two lines cost a lot of cycles:
keys = set(key for key in d.keys() if not key.startswith("__"))
shelldeps = set(key for key in keys if d.getVarFlag(key, "export") and not d.getVarFlag(key, "unexport"))
After dumping d.keys(), I found that most of the entries (>90%) are variables from distro_tracking_fields.inc, which are not actually used in a normal build.
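The share of keys coming from one group of variables can be estimated with a sketch like the following. The key_share() helper, the "RECIPE_" prefix, and the sample key list are assumptions made up for illustration; they are not measurements and not the actual contents of d.keys().

```python
def key_share(keys, prefixes):
    """Return the fraction of non-internal keys that start with any prefix."""
    # Mirror the "__" filter used when building the key set.
    visible = [k for k in keys if not k.startswith("__")]
    if not visible:
        return 0.0
    hits = sum(1 for k in visible if k.startswith(tuple(prefixes)))
    return hits / len(visible)


# Made-up sample: two ordinary variables, one internal one, and 18
# tracking-style variables sharing a common (assumed) prefix.
sample_keys = ["PN", "PV", "__BBTASKS"]
sample_keys += ["RECIPE_STATUS_pn-pkg%d" % i for i in range(18)]

share = key_share(sample_keys, ["RECIPE_"])
print("%.0f%% of keys match the prefix" % (share * 100))
```

Since both set expressions iterate over every key, and the second calls getVarFlag() for each one, their cost grows with the total number of keys, whether or not a recipe ever uses them.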
Looking at the code, some functions in utility-tasks.bbclass (related to the upstream version check) need the information in distro_tracking_fields.inc, which is why poky.conf includes this file. utility-tasks.bbclass is in turn inherited by base.bbclass, which is fundamental to poky.
I am thinking of moving the distro-checking code from utility-tasks.bbclass to distrodata.bbclass, so that such a large database (distro_tracking_fields.inc) is not pulled into normal parsing. Does that make sense?
I did some simple tests, and this could save about 20% of file parsing time.
As for the repeated-parsing hot spot, I will continue to investigate whether it can be optimized.
Thanks,
Dongxiao