[poky] [PATCH 1/1] bitbake: optimize file parsing speed

Richard Purdie rpurdie at linux.intel.com
Fri Nov 19 02:27:32 PST 2010


On Wed, 2010-11-17 at 12:10 +0800, Dongxiao Xu wrote:
> build some data cache for generate_dependencies() on hand, and later
> each time when parsing the bb file, we do not need to build them again
> and again.
> 
> This optimization could get about 50% speed gain when parsing all ~800
> bb files.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu at intel.com>
> ---
>  bitbake/lib/bb/cooker.py |    2 ++
>  bitbake/lib/bb/data.py   |   11 ++++++-----
>  2 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/bitbake/lib/bb/cooker.py b/bitbake/lib/bb/cooker.py
> index 33eb65e..05e6c16 100644
> --- a/bitbake/lib/bb/cooker.py
> +++ b/bitbake/lib/bb/cooker.py
> @@ -76,6 +76,8 @@ class BBCooker:
>  
>          self.configuration.data = bb.data.init()
>  
> +        bb.data.init_data_cache(self.configuration.data)
> +
>          if not server:
>              bb.data.setVar("BB_WORKERCONTEXT", "1", self.configuration.data)
>  
> diff --git a/bitbake/lib/bb/data.py b/bitbake/lib/bb/data.py
> index fee10cc..a9e539f 100644
> --- a/bitbake/lib/bb/data.py
> +++ b/bitbake/lib/bb/data.py
> @@ -296,17 +296,18 @@ def build_dependencies(key, keys, shelldeps, d):
>      #bb.note("Variable %s references %s and calls %s" % (key, str(deps), str(execs)))
>      #d.setVarFlag(key, "vardeps", deps)
>  
> -def generate_dependencies(d):
> +def init_data_cache(d):
> +    bb.data.keylist = set(key for key in d.keys() if not key.startswith("__"))
> +    bb.data.shelldeps = set(key for key in bb.data.keylist if d.getVarFlag(key, "export") and not d.getVarFlag(key, "unexport"))
>  
> -    keys = set(key for key in d.keys() if not key.startswith("__"))
> -    shelldeps = set(key for key in keys if d.getVarFlag(key, "export") and not d.getVarFlag(key, "unexport"))
> +def generate_dependencies(d):
>  
>      deps = {}
>      taskdeps = {}
>  
>      tasklist = bb.data.getVar('__BBTASKS', d) or []
>      for task in tasklist:
> -        deps[task] = build_dependencies(task, keys, shelldeps, d)
> +        deps[task] = build_dependencies(task, bb.data.keylist, bb.data.shelldeps, d)
>  
>          newdeps = deps[task]
>          seen = set()
> @@ -316,7 +317,7 @@ def generate_dependencies(d):
>              newdeps = set()
>              for dep in nextdeps:
>                  if dep not in deps:
> -                    deps[dep] = build_dependencies(dep, keys, shelldeps, d)
> +                    deps[dep] = build_dependencies(dep, bb.data.keylist, bb.data.shelldeps, d)
>                  newdeps |=  deps[dep]
>              newdeps -= seen
>          taskdeps[task] = seen | newdeps


I'm afraid it isn't going to be quite this simple, although this does
prove that those lines of code are a big hotspot in parsing.

Why? You're creating the key and export lists for the base configuration
data whereas the original code creates these lists for the *total*
parsed metadata. There will therefore be differences in the values held
by the two caches :(.

As an example, if you set FOO = "bar" in a .bb file, 'FOO' will not
appear in your cached key list (bb.data.keylist), so the keys passed to
build_dependencies() will be missing it.
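
To make the FOO case concrete, here is a minimal sketch of the mismatch.
It uses plain dicts as a stand-in for the datastore and is not the real
bb.data API; visible_keys, base_config and recipe_data are made-up names
for illustration only.

```python
# Toy stand-in for the datastore: plain dicts, NOT the real bb.data API.
def visible_keys(d):
    """Mirror the key filter from the patch: skip internal '__' names."""
    return set(key for key in d if not key.startswith("__"))

# Base configuration, as it exists when the cooker is initialised.
base_config = {"MACHINE": "qemux86", "__BBTASKS": ["do_compile"]}

# Patched behaviour: the key list is computed once, from the base data only.
cached_keys = visible_keys(base_config)

# Parsing a recipe layers its own variables on top of the base data.
recipe_data = dict(base_config)
recipe_data["FOO"] = "bar"

# Original behaviour: the key list is rebuilt from the fully parsed data.
fresh_keys = visible_keys(recipe_data)

# Keys the one-off cache never learns about.
missing = fresh_keys - cached_keys
print(sorted(missing))  # ['FOO']
```

The same reasoning would apply to the export list: a variable that only
gets its "export" flag in a recipe would be absent from a shelldeps set
built from the base configuration.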

Cheers,

Richard




