Discussion:
[Scons-users] Recursive directories, and the ignoring of filenames.
Alistair Buxton
2018-08-02 23:45:36 UTC
Permalink
I want to build an openwrt image using their image builder. It is
invoked like this:

make -C imagebuilder image FILES=/some/path

All files under /some/path will be inserted into the image. Therefore,
if any file or subdirectory is renamed, the image needs to be rebuilt,
ie the above command needs to be run again, even if the contents of
the renamed file is the same as some previous iteration. It seems
extraordinarily difficult to make scons do this, given how simple the
concept is to explain.

Here is a simplified breakdown of the problem, using "ls" as a
surrogate for the image builder, as it is simpler and everyone should
be familiar with how it works:

I have a directory called 'files'. I want to produce a list of all the
subdirectories and files in 'files' using scons and ls. A naive
attempt would look like this:

env.Command('listing.txt', 'files', 'ls -lR ${SOURCE} > ${TARGET}')

The above will work exactly once, and as long as 'listing.txt' is
never deleted, never again.

After some research you might come up with this:

env.Command('listing.txt', ['files/', Glob('files/*')], 'ls -lR
${SOURCES[0]} > ${TARGET}')

This will fail if 'files' has subdirectories, because Glob is not recursive.

After googling for "scons recursive glob" you might come up with
something like this:

env.Command('listing.txt', ['files/', Glob('files/*'),
Glob('files/*/*'), Glob('files/*/*/*'), Glob('files/*/*/*/*')], 'ls
-lR ${SOURCES[0]} > ${TARGET}')

And you might even think it works... but it doesn't. Besides the
problem that it will miss changes deeper than the number of globs you
specify, it has a more subtle problem. If a file or directory is
renamed, scons may or may not ignore it. You can demonstrate this by
taking the above code, and putting it in a directory with
'files/a.txt' and 'files/b.txt'. Run scons, then rename b.txt to c.txt
and re-run scons. The listing will not be updated, which is incorrect
behaviour by any reasonable definition.

This happens because scons does not look at the filenames of
dependencies. It assumes that they will be delivered in the same order
each time, and only rebuilds if the number of items changed, or if the
nth checksum does not match the previous nth checksum regardless of
the filenames involved. As a result it can miss file and directory
renames. It can also not if the renaming causes dependencies to be
listed in a different order (ie if you renamed a.txt instead of b.txt,
and the files had different contents). This is rather illogical as it
means the problem won't happen every time you rename a file. Even
knowing why it happens requires deep knowledge of scons implementation
details that should be irrelevant to the user.

One possible workaround is to do this:

env.Command('listing.txt', ['files',
Value(subprocess.check_output(['sh', '-c', 'find files/ -type f -exec
sha256sum {} \; | sort | sha256sum']))], 'ls -lR ${SOURCES[0]} >
${TARGET}')

This handles any level of recursion in the directory and also won't
miss directory renames (as long as the directory is not empty). It
sure is ugly to look at though.

Apparently there is a way to do this with a custom decider, but I
cannot make it work. Deciders receive two Nodes and a FileNodeInfo,
none of which contains any information about the previous filename of
a dependency, or the other dependencies of a target and what their
previous filenames may have been:

env = Environment(tools=[])

def my_decider(dependency, target, info):
print(target.get_binfo().bdepends)
return True

env.Decider(my_decider)

env.Command('out.txt', Glob('files/*'), 'ls -lR files/ > ${TARGET}')

Outputs:

scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
[]
[]
[]
ls -lR files/ > out.txt
[]
[]
[]
scons: done building targets.

as such, a decider can't possibly know if the list of dependencies has changed.

So in summary, implementing recursive glob alone is not enough. Scons
also needs to take care of filename changes in order for recursive
glob to be useful.

--
Alistair Buxton
***@gmail.com
Alistair Buxton
2018-08-02 23:53:28 UTC
Permalink
PS note that the directory isn't even needed to reproduce the renaming
problem. The following will have the same behaviour if the a.txt and
b.txt are in the same directory as SConstruct:

env.Command('listing.txt', Glob('*.txt'), 'ls *.txt > ${TARGET}')

In this case you would instead say:

env.Command('listing.txt', Glob('*.txt'), 'ls ${SOURCES} > ${TARGET}')

and scons would identify that the build needs to be re-done because
the action string changed.

However, this is not so easy with the image builder case as it takes a
directory path as the only argument.

On 3 August 2018 at 00:45, Alistair Buxton <***@gmail.com> wrote:
> I want to build an openwrt image using their image builder. It is
> invoked like this:
>
> make -C imagebuilder image FILES=/some/path
>
> All files under /some/path will be inserted into the image. Therefore,
> if any file or subdirectory is renamed, the image needs to be rebuilt,
> ie the above command needs to be run again, even if the contents of
> the renamed file is the same as some previous iteration. It seems
> extraordinarily difficult to make scons do this, given how simple the
> concept is to explain.
>
> Here is a simplified breakdown of the problem, using "ls" as a
> surrogate for the image builder, as it is simpler and everyone should
> be familiar with how it works:
>
> I have a directory called 'files'. I want to produce a list of all the
> subdirectories and files in 'files' using scons and ls. A naive
> attempt would look like this:
>
> env.Command('listing.txt', 'files', 'ls -lR ${SOURCE} > ${TARGET}')
>
> The above will work exactly once, and as long as 'listing.txt' is
> never deleted, never again.
>
> After some research you might come up with this:
>
> env.Command('listing.txt', ['files/', Glob('files/*')], 'ls -lR
> ${SOURCES[0]} > ${TARGET}')
>
> This will fail if 'files' has subdirectories, because Glob is not recursive.
>
> After googling for "scons recursive glob" you might come up with
> something like this:
>
> env.Command('listing.txt', ['files/', Glob('files/*'),
> Glob('files/*/*'), Glob('files/*/*/*'), Glob('files/*/*/*/*')], 'ls
> -lR ${SOURCES[0]} > ${TARGET}')
>
> And you might even think it works... but it doesn't. Besides the
> problem that it will miss changes deeper than the number of globs you
> specify, it has a more subtle problem. If a file or directory is
> renamed, scons may or may not ignore it. You can demonstrate this by
> taking the above code, and putting it in a directory with
> 'files/a.txt' and 'files/b.txt'. Run scons, then rename b.txt to c.txt
> and re-run scons. The listing will not be updated, which is incorrect
> behaviour by any reasonable definition.
>
> This happens because scons does not look at the filenames of
> dependencies. It assumes that they will be delivered in the same order
> each time, and only rebuilds if the number of items changed, or if the
> nth checksum does not match the previous nth checksum regardless of
> the filenames involved. As a result it can miss file and directory
> renames. It can also not if the renaming causes dependencies to be
> listed in a different order (ie if you renamed a.txt instead of b.txt,
> and the files had different contents). This is rather illogical as it
> means the problem won't happen every time you rename a file. Even
> knowing why it happens requires deep knowledge of scons implementation
> details that should be irrelevant to the user.
>
> One possible workaround is to do this:
>
> env.Command('listing.txt', ['files',
> Value(subprocess.check_output(['sh', '-c', 'find files/ -type f -exec
> sha256sum {} \; | sort | sha256sum']))], 'ls -lR ${SOURCES[0]} >
> ${TARGET}')
>
> This handles any level of recursion in the directory and also won't
> miss directory renames (as long as the directory is not empty). It
> sure is ugly to look at though.
>
> Apparently there is a way to do this with a custom decider, but I
> cannot make it work. Deciders receive two Nodes and a FileNodeInfo,
> none of which contains any information about the previous filename of
> a dependency, or the other dependencies of a target and what their
> previous filenames may have been:
>
> env = Environment(tools=[])
>
> def my_decider(dependency, target, info):
> print(target.get_binfo().bdepends)
> return True
>
> env.Decider(my_decider)
>
> env.Command('out.txt', Glob('files/*'), 'ls -lR files/ > ${TARGET}')
>
> Outputs:
>
> scons: Reading SConscript files ...
> scons: done reading SConscript files.
> scons: Building targets ...
> []
> []
> []
> ls -lR files/ > out.txt
> []
> []
> []
> scons: done building targets.
>
> as such, a decider can't possibly know if the list of dependencies has changed.
>
> So in summary, implementing recursive glob alone is not enough. Scons
> also needs to take care of filename changes in order for recursive
> glob to be useful.
>
> --
> Alistair Buxton
> ***@gmail.com



--
Alistair Buxton
***@gmail.com
Jack Brennen
2018-08-03 00:49:07 UTC
Permalink
For your first example, which you say will work exactly once, and never
again...

Can you just make that node dependent on a Value node with the tuple of
the current time and PID as the Value?

import os, time
env.Depends('listing.txt', env.Value((time.time(),os.getpid())))


By making it dependent on the tuple of time.time() with os.getpid(), it
should effectively rerun the "ls" command every time you execute SCons. 
The chance of getting the same time and the same PID in different SCons
executions should be vanishingly small.

- Jack

On 8/2/2018 4:45 PM, Alistair Buxton wrote:
> I want to build an openwrt image using their image builder. It is
> invoked like this:
>
> make -C imagebuilder image FILES=/some/path
>
> All files under /some/path will be inserted into the image. Therefore,
> if any file or subdirectory is renamed, the image needs to be rebuilt,
> ie the above command needs to be run again, even if the contents of
> the renamed file is the same as some previous iteration. It seems
> extraordinarily difficult to make scons do this, given how simple the
> concept is to explain.
>
> Here is a simplified breakdown of the problem, using "ls" as a
> surrogate for the image builder, as it is simpler and everyone should
> be familiar with how it works:
>
> I have a directory called 'files'. I want to produce a list of all the
> subdirectories and files in 'files' using scons and ls. A naive
> attempt would look like this:
>
> env.Command('listing.txt', 'files', 'ls -lR ${SOURCE} > ${TARGET}')
>
> The above will work exactly once, and as long as 'listing.txt' is
> never deleted, never again.
>
> After some research you might come up with this:
>
> env.Command('listing.txt', ['files/', Glob('files/*')], 'ls -lR
> ${SOURCES[0]} > ${TARGET}')
>
> This will fail if 'files' has subdirectories, because Glob is not recursive.
>
> After googling for "scons recursive glob" you might come up with
> something like this:
>
> env.Command('listing.txt', ['files/', Glob('files/*'),
> Glob('files/*/*'), Glob('files/*/*/*'), Glob('files/*/*/*/*')], 'ls
> -lR ${SOURCES[0]} > ${TARGET}')
>
> And you might even think it works... but it doesn't. Besides the
> problem that it will miss changes deeper than the number of globs you
> specify, it has a more subtle problem. If a file or directory is
> renamed, scons may or may not ignore it. You can demonstrate this by
> taking the above code, and putting it in a directory with
> 'files/a.txt' and 'files/b.txt'. Run scons, then rename b.txt to c.txt
> and re-run scons. The listing will not be updated, which is incorrect
> behaviour by any reasonable definition.
>
> This happens because scons does not look at the filenames of
> dependencies. It assumes that they will be delivered in the same order
> each time, and only rebuilds if the number of items changed, or if the
> nth checksum does not match the previous nth checksum regardless of
> the filenames involved. As a result it can miss file and directory
> renames. It can also not if the renaming causes dependencies to be
> listed in a different order (ie if you renamed a.txt instead of b.txt,
> and the files had different contents). This is rather illogical as it
> means the problem won't happen every time you rename a file. Even
> knowing why it happens requires deep knowledge of scons implementation
> details that should be irrelevant to the user.
>
> One possible workaround is to do this:
>
> env.Command('listing.txt', ['files',
> Value(subprocess.check_output(['sh', '-c', 'find files/ -type f -exec
> sha256sum {} \; | sort | sha256sum']))], 'ls -lR ${SOURCES[0]} >
> ${TARGET}')
>
> This handles any level of recursion in the directory and also won't
> miss directory renames (as long as the directory is not empty). It
> sure is ugly to look at though.
>
> Apparently there is a way to do this with a custom decider, but I
> cannot make it work. Deciders receive two Nodes and a FileNodeInfo,
> none of which contains any information about the previous filename of
> a dependency, or the other dependencies of a target and what their
> previous filenames may have been:
>
> env = Environment(tools=[])
>
> def my_decider(dependency, target, info):
> print(target.get_binfo().bdepends)
> return True
>
> env.Decider(my_decider)
>
> env.Command('out.txt', Glob('files/*'), 'ls -lR files/ > ${TARGET}')
>
> Outputs:
>
> scons: Reading SConscript files ...
> scons: done reading SConscript files.
> scons: Building targets ...
> []
> []
> []
> ls -lR files/ > out.txt
> []
> []
> []
> scons: done building targets.
>
> as such, a decider can't possibly know if the list of dependencies has changed.
>
> So in summary, implementing recursive glob alone is not enough. Scons
> also needs to take care of filename changes in order for recursive
> glob to be useful.
>
Alistair Buxton
2018-08-03 01:19:22 UTC
Permalink
On 3 August 2018 at 01:49, Jack Brennen <***@qti.qualcomm.com> wrote:
> For your first example, which you say will work exactly once, and never
> again...
>
> Can you just make that node dependent on a Value node with the tuple of the
> current time and PID as the Value?
>
> import os, time
> env.Depends('listing.txt', env.Value((time.time(),os.getpid())))
>
>
> By making it dependent on the tuple of time.time() with os.getpid(), it
> should effectively rerun the "ls" command every time you execute SCons. The
> chance of getting the same time and the same PID in different SCons
> executions should be vanishingly small.

I could just use AlwaysBuild for that. The problem is to make the
build action run if and only if the dependencies have changed in some
way.


--
Alistair Buxton
***@gmail.com
Marc Branchaud
2018-08-03 13:56:05 UTC
Permalink
On 2018-08-02 07:45 PM, Alistair Buxton wrote:
> I want to build an openwrt image using their image builder. It is
> invoked like this:
>
> make -C imagebuilder image FILES=/some/path
>
> All files under /some/path will be inserted into the image. Therefore,
> if any file or subdirectory is renamed, the image needs to be rebuilt,
> ie the above command needs to be run again, even if the contents of
> the renamed file is the same as some previous iteration. It seems
> extraordinarily difficult to make scons do this, given how simple the
> concept is to explain.

I use os.walk() to make a big list of all the files. The following
assumes that the directory you want to walk is next to your SConscript.

sources = []
sources_dir = 'path/to/directory'
for path, dirs, files in os.walk(sources_dir):
for f in files:
rel_path = os.path.join(path, f)[len(sources_dir)+1:]
sources.append(rel_path)

env.Command(source = sources, .....)

(If you use variant directories, I find it simplifies things to make
sources_dir an absolute path to the non-variant version of the directory.)

One nice thing about this approach is that I can selectively ignore
paths I don't care about (e.g. ones that start with ".git/").

M.


> Here is a simplified breakdown of the problem, using "ls" as a
> surrogate for the image builder, as it is simpler and everyone should
> be familiar with how it works:
>
> I have a directory called 'files'. I want to produce a list of all the
> subdirectories and files in 'files' using scons and ls. A naive
> attempt would look like this:
>
> env.Command('listing.txt', 'files', 'ls -lR ${SOURCE} > ${TARGET}')
>
> The above will work exactly once, and as long as 'listing.txt' is
> never deleted, never again.
>
> After some research you might come up with this:
>
> env.Command('listing.txt', ['files/', Glob('files/*')], 'ls -lR
> ${SOURCES[0]} > ${TARGET}')
>
> This will fail if 'files' has subdirectories, because Glob is not recursive.
>
> After googling for "scons recursive glob" you might come up with
> something like this:
>
> env.Command('listing.txt', ['files/', Glob('files/*'),
> Glob('files/*/*'), Glob('files/*/*/*'), Glob('files/*/*/*/*')], 'ls
> -lR ${SOURCES[0]} > ${TARGET}')
>
> And you might even think it works... but it doesn't. Besides the
> problem that it will miss changes deeper than the number of globs you
> specify, it has a more subtle problem. If a file or directory is
> renamed, scons may or may not ignore it. You can demonstrate this by
> taking the above code, and putting it in a directory with
> 'files/a.txt' and 'files/b.txt'. Run scons, then rename b.txt to c.txt
> and re-run scons. The listing will not be updated, which is incorrect
> behaviour by any reasonable definition.
>
> This happens because scons does not look at the filenames of
> dependencies. It assumes that they will be delivered in the same order
> each time, and only rebuilds if the number of items changed, or if the
> nth checksum does not match the previous nth checksum regardless of
> the filenames involved. As a result it can miss file and directory
> renames. It can also not if the renaming causes dependencies to be
> listed in a different order (ie if you renamed a.txt instead of b.txt,
> and the files had different contents). This is rather illogical as it
> means the problem won't happen every time you rename a file. Even
> knowing why it happens requires deep knowledge of scons implementation
> details that should be irrelevant to the user.
>
> One possible workaround is to do this:
>
> env.Command('listing.txt', ['files',
> Value(subprocess.check_output(['sh', '-c', 'find files/ -type f -exec
> sha256sum {} \; | sort | sha256sum']))], 'ls -lR ${SOURCES[0]} >
> ${TARGET}')
>
> This handles any level of recursion in the directory and also won't
> miss directory renames (as long as the directory is not empty). It
> sure is ugly to look at though.
>
> Apparently there is a way to do this with a custom decider, but I
> cannot make it work. Deciders receive two Nodes and a FileNodeInfo,
> none of which contains any information about the previous filename of
> a dependency, or the other dependencies of a target and what their
> previous filenames may have been:
>
> env = Environment(tools=[])
>
> def my_decider(dependency, target, info):
> print(target.get_binfo().bdepends)
> return True
>
> env.Decider(my_decider)
>
> env.Command('out.txt', Glob('files/*'), 'ls -lR files/ > ${TARGET}')
>
> Outputs:
>
> scons: Reading SConscript files ...
> scons: done reading SConscript files.
> scons: Building targets ...
> []
> []
> []
> ls -lR files/ > out.txt
> []
> []
> []
> scons: done building targets.
>
> as such, a decider can't possibly know if the list of dependencies has changed.
>
> So in summary, implementing recursive glob alone is not enough. Scons
> also needs to take care of filename changes in order for recursive
> glob to be useful.
>
Alistair Buxton
2018-08-03 16:24:12 UTC
Permalink
On 3 August 2018 at 14:56, Marc Branchaud <***@xiplink.com> wrote:

> I use os.walk() to make a big list of all the files.

This is affected by the bug I described, as are all methods which
build a list of filenames and pass it to scons.

Also os.walk ordering is arbitrary so this algorithm has a small
chance of causing random spurious rebuilds.



--
Alistair Buxton
***@gmail.com
Bill Deegan
2018-08-03 16:51:54 UTC
Permalink
And what if you sort the list generated from os.walk? Then the order should
be consistant...

On Fri, Aug 3, 2018 at 9:24 AM, Alistair Buxton <***@gmail.com>
wrote:

> On 3 August 2018 at 14:56, Marc Branchaud <***@xiplink.com> wrote:
>
> > I use os.walk() to make a big list of all the files.
>
> This is affected by the bug I described, as are all methods which
> build a list of filenames and pass it to scons.
>
> Also os.walk ordering is arbitrary so this algorithm has a small
> chance of causing random spurious rebuilds.
>
>
>
> --
> Alistair Buxton
> ***@gmail.com
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
Alistair Buxton
2018-08-03 16:54:04 UTC
Permalink
On 3 August 2018 at 17:51, Bill Deegan <***@baddogconsulting.com> wrote:
> And what if you sort the list generated from os.walk? Then the order should
> be consistant...

Then you'll avoid the spurious rebuilds, but still be affected by the
bug I've described at the start of this thread.

--
Alistair Buxton
***@gmail.com
Bill Deegan
2018-08-03 17:10:39 UTC
Permalink
one step at a time.

Are the files all guaranteed to be there at the beginning of the build?
Or are some in the directory your Glob()'ing generated by the build?

-Bill

On Fri, Aug 3, 2018 at 9:54 AM, Alistair Buxton <***@gmail.com>
wrote:

> On 3 August 2018 at 17:51, Bill Deegan <***@baddogconsulting.com> wrote:
> > And what if you sort the list generated from os.walk? Then the order
> should
> > be consistant...
>
> Then you'll avoid the spurious rebuilds, but still be affected by the
> bug I've described at the start of this thread.
>
> --
> Alistair Buxton
> ***@gmail.com
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
Matthew Marinets
2018-08-03 17:08:51 UTC
Permalink
To me it sounds more like you just need an emitter.
The bug you described (as I understood it), is that the built-in SCons decider is order-sensitive. Saying [“src/a.txt”, “src/b.txt”] is different from [“src/b.txt”, “src/a.txt”].
The simple solution, as Bill said, it to just sort your list. You can do this automatically in an emitter.

Additionally, from my experience, SCons builders really don’t like taking directories as sources.
--If you need a directory path in your command line, it should not come straight from $SOURCES
---Different variables that you define work fine, say $SOURCES_DIR
---Some SCons tools, particularly Java ones, use a node’s attributes; so you’d set a source’s attribute for, say, source_dir, and call it in the command with $SOURCE.attributes.source_dir
--If you still want to specify just the sources directory when you call your builder, you can do all this calculation automatically in an emitter.

Example:
MyTool.py:
“””My custom tool”””
Import os

from my_util import custom_recursive_glob

def emitter(target, source, env):
“””Formats targets and sources;
The sources root directory will be assigned to $SOURCES_DIR,
The actual source files will be used by SCons for dependency checking
”””
assert os.isdir(str(source[0]))
env[‘SOURCES_DIR’] = source
source = custom_recursive_glob(source, pattern = “*.txt”)
return target, source

def generate(env):
env[‘MYBUILDERCOM’] = “ls -lR $SOURCES_DIR > $TARGET”
env[‘BUILDERS’][‘MyBuilder’] = SCons.Builder.Builder(action = “$MYBUILDERCOM’”, emitter = emitter)

def exists(env):
return True

SConstruct:
env = Environment(tools = [“MyTool”], toolpath = [“path/to/MyTool”])
# build calls and such

-Matthew

From: Scons-users <scons-users-***@scons.org> On Behalf Of Bill Deegan
Sent: August 3, 2018 09:52
To: SCons users mailing list <scons-***@scons.org>
Subject: Re: [Scons-users] Recursive directories, and the ignoring of filenames.

And what if you sort the list generated from os.walk? Then the order should be consistant...

On Fri, Aug 3, 2018 at 9:24 AM, Alistair Buxton <***@gmail.com<mailto:***@gmail.com>> wrote:
On 3 August 2018 at 14:56, Marc Branchaud <***@xiplink.com<mailto:***@xiplink.com>> wrote:

> I use os.walk() to make a big list of all the files.

This is affected by the bug I described, as are all methods which
build a list of filenames and pass it to scons.

Also os.walk ordering is arbitrary so this algorithm has a small
chance of causing random spurious rebuilds.



--
Alistair Buxton
***@gmail.com<mailto:***@gmail.com>
_______________________________________________
Scons-users mailing list
Scons-***@scons.org<mailto:Scons-***@scons.org>
https://pairlist4.pair.net/mailman/listinfo/scons-users
Alistair Buxton
2018-08-03 17:25:07 UTC
Permalink
On 3 August 2018 at 18:08, Matthew Marinets
<***@kardium.com> wrote:
> To me it sounds more like you just need an emitter.
>
> The bug you described (as I understood it), is that the built-in SCons
> decider is order-sensitive. Saying [“src/a.txt”, “src/b.txt”] is different
> from [“src/b.txt”, “src/a.txt”].
>
> The simple solution, as Bill said, it to just sort your list. You can do
> this automatically in an emitter.

Then you didn't correctly understand the issue. Go and read it again.
The issue is that saying ['a.txt', 'b.txt'] is the same say saying
['a.txt', 'c.txt'] if b.txt and c.txt have the same contents, ie it
will NOT trigger a rebuild, even though it should. The way you build
the list is 100% irrelevant as this is an internal implementation
detail of scons.


--
Alistair Buxton
***@gmail.com
Bill Deegan
2018-08-03 17:28:00 UTC
Permalink
On Fri, Aug 3, 2018 at 10:25 AM, Alistair Buxton <***@gmail.com>
wrote:

> On 3 August 2018 at 18:08, Matthew Marinets
> <***@kardium.com> wrote:
> > To me it sounds more like you just need an emitter.
> >
> > The bug you described (as I understood it), is that the built-in SCons
> > decider is order-sensitive. Saying [“src/a.txt”, “src/b.txt”] is
> different
> > from [“src/b.txt”, “src/a.txt”].
> >
> > The simple solution, as Bill said, it to just sort your list. You can do
> > this automatically in an emitter.
>
> Then you didn't correctly understand the issue. Go and read it again.
> The issue is that saying ['a.txt', 'b.txt'] is the same say saying
> ['a.txt', 'c.txt'] if b.txt and c.txt have the same contents, ie it
> will NOT trigger a rebuild, even though it should. The way you build
> the list is 100% irrelevant as this is an internal implementation
> detail of scons.
>

Yes. I understood the issue.
However you care if:
1) the file path changes (if so trigger a rebuild right?)
2) the contents of same name file changes (and/or timestamp) if so then
rebuild right?

-Bill


>
>
> --
> Alistair Buxton
> ***@gmail.com
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
Jack Brennen
2018-08-03 18:01:26 UTC
Permalink
Something like this seems to work for me...  It does assume that you're
in a reasonably POSIX-like environment with find and sha1sum command
line utilities.

env['SRCS_DIR'] = 'srcs'
env.Command('dir_summary.txt', '$SRCS_DIR', 'find $SOURCES -type f -exec sha1sum \{\} \; | sort > $TARGET')
env.AlwaysBuild('dir_summary.txt')

built_image = env.Command('built_image', 'dir_summary.txt', 'run_image_builder $SRCS_DIR $TARGET')


The find command will find all of the files under srcs, and create a
list of sorted message digests tagged with their filenames.  This will
be run every time you run the build, and will recreate the file
dir_summary.txt every time.

If dir_summary.txt changes, then the image builder will run and create
built_time.  If dir_summary.txt does not change, then the image builder
will not run.

It's a little bit heavyweight, in the sense that it will read every byte
of every file under the sources directory, but that could be seen as a
good thing; it will catch changes that "ls -lR" would miss.




On 8/3/2018 10:28 AM, Bill Deegan wrote:
>
>
> On Fri, Aug 3, 2018 at 10:25 AM, Alistair Buxton <***@gmail.com
> <mailto:***@gmail.com>> wrote:
>
> On 3 August 2018 at 18:08, Matthew Marinets
> <***@kardium.com
> <mailto:***@kardium.com>> wrote:
> > To me it sounds more like you just need an emitter.
> >
> > The bug you described (as I understood it), is that the built-in
> SCons
> > decider is order-sensitive. Saying [“src/a.txt”, “src/b.txt”] is
> different
> > from [“src/b.txt”, “src/a.txt”].
> >
> > The simple solution, as Bill said, it to just sort your list.
> You can do
> > this automatically in an emitter.
>
> Then you didn't correctly understand the issue. Go and read it again.
> The issue is that saying ['a.txt', 'b.txt'] is the same say saying
> ['a.txt', 'c.txt'] if b.txt and c.txt have the same contents, ie it
> will NOT trigger a rebuild, even though it should. The way you build
> the list is 100% irrelevant as this is an internal implementation
> detail of scons.
>
>
> Yes. I understood the issue.
> However you care if:
> 1) the file path changes (if so trigger a rebuild right?)
> 2) the contents of same name file changes (and/or timestamp) if so
> then rebuild right?
>
> -Bill
>
>
>
> --
> Alistair Buxton
> ***@gmail.com <mailto:***@gmail.com>
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org <mailto:Scons-***@scons.org>
> https://pairlist4.pair.net/mailman/listinfo/scons-users
> <https://pairlist4.pair.net/mailman/listinfo/scons-users>
>
>
>
>
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
Matthew Marinets
2018-08-03 17:51:47 UTC
Permalink
I can't reproduce your bug then.

Using Python 3.6.5;
scons -v:
script: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668, 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
engine: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668, 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
engine path: ['C:\\Program Files\\Python36\\scons-3.0.1\\SCons']


I made a Command() builder that Globs all .c files in the src/ directory, and copies them to the build/ directory:

sources = Glob(pattern = str(src) + '/*.c')
env.Command(target = [build.File(x.name) for x in sources], source = sources, action = 'cp $SOURCES $TARGET.dir')

First I ran it with src contents:
src/a.c
src/b.c
src/asdf.c

All files were copied, as expected. Running SCons again, the build was not executed as all targets were up to date.

Next, I changed the name of asdf.c to qwer.c.
>scons --debug=explain
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: rebuilding `build\a.c' because:
`src\asdf.c' is no longer a dependency
`src\qwer.c' is a new dependency
cp src\a.c src\b.c src\qwer.c build
scons: done building targets.

>scons --debug=explain
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: `.' is up to date.
scons: done building targets.

I think you've misidentified your bug.

I still suspect that it's because you're putting directories in the source / target arguments of a builder; SCons doesn't like that for some reason. Notice that in my example, the sources and targets were all files or lists of files, and to get a directory path in that action, I called $TARGET.dir

-Matthew

-----Original Message-----
From: Scons-users <scons-users-***@scons.org> On Behalf Of Alistair Buxton
Sent: August 3, 2018 10:25
To: SCons users mailing list <scons-***@scons.org>
Subject: Re: [Scons-users] Recursive directories, and the ignoring of filenames.

On 3 August 2018 at 18:08, Matthew Marinets <***@kardium.com> wrote:
> To me it sounds more like you just need an emitter.
>
> The bug you described (as I understood it), is that the built-in SCons
> decider is order-sensitive. Saying [“src/a.txt”, “src/b.txt”] is
> different from [“src/b.txt”, “src/a.txt”].
>
> The simple solution, as Bill said, it to just sort your list. You can
> do this automatically in an emitter.

Then you didn't correctly understand the issue. Go and read it again.
The issue is that saying ['a.txt', 'b.txt'] is the same say saying ['a.txt', 'c.txt'] if b.txt and c.txt have the same contents, ie it will NOT trigger a rebuild, even though it should. The way you build the list is 100% irrelevant as this is an internal implementation detail of scons.


--
Alistair Buxton
***@gmail.com
_______________________________________________
Scons-users mailing list
Scons-***@scons.org
https://pairlist4.pair.net/mailman/listinfo/scons-users
Alistair Buxton
2018-08-03 18:03:29 UTC
Permalink
On 3 August 2018 at 18:51, Matthew Marinets
<***@kardium.com> wrote:
> I can't reproduce your bug then.
>
> Using Python 3.6.5;
> scons -v:
> script: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668, 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
> engine: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668, 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
> engine path: ['C:\\Program Files\\Python36\\scons-3.0.1\\SCons']
>
>
> I made a Command() builder that Globs all .c files in the src/ directory, and copies them to the build/ directory:
>
> sources = Glob(pattern = str(src) + '/*.c')
> env.Command(target = [build.File(x.name) for x in sources], source = sources, action = 'cp $SOURCES $TARGET.dir')
>
> First I ran it with src contents:
> src/a.c
> src/b.c
> src/asdf.c
>
> All files were copied, as expected. Running SCons again, the build was not executed as all targets were up to date.

The build was re-run because the new name you gave to the file changed
the sorting order. I explained this in the initial report:

On 3 August 2018 at 00:45, Alistair Buxton <***@gmail.com> wrote:

> renames. It can also not if the renaming causes dependencies to be
> listed in a different order (ie if you renamed a.txt instead of b.txt,
> and the files had different contents).

If you renamed asdf.c to asdfa.c, then the build would still have
re-run, because you put the name of every file into the action string,
and scons will detect that that the action string has changed. I
explained this in my initial follow up email:


> In this case you would instead say:
>
> env.Command('listing.txt', Glob('*.txt'), 'ls ${SOURCES} > ${TARGET}')
>
> and scons would identify that the build needs to be re-done because
> the action string changed.

and the above was in reference to a reproduction of the issue which
involves no directories at all:

> env.Command('listing.txt', Glob('*.txt'), 'ls *.txt > ${TARGET}')

Please actually read the whole report before jumping to conclusions.


--
Alistair Buxton
***@gmail.com
Bill Deegan
2018-08-03 18:20:42 UTC
Permalink
On Fri, Aug 3, 2018 at 11:03 AM, Alistair Buxton <***@gmail.com>
wrote:

> On 3 August 2018 at 18:51, Matthew Marinets
> <***@kardium.com> wrote:
> > I can't reproduce your bug then.
> >
> > Using Python 3.6.5;
> > scons -v:
> > script: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668,
> 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
> > engine: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668,
> 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
> > engine path: ['C:\\Program Files\\Python36\\scons-3.0.1\\SCons']
> >
> >
> > I made a Command() builder that Globs all .c files in the src/
> directory, and copies them to the build/ directory:
> >
> > sources = Glob(pattern = str(src) + '/*.c')
> > env.Command(target = [build.File(x.name) for x in sources],
> source = sources, action = 'cp $SOURCES $TARGET.dir')
> >
> > First I ran it with src contents:
> > src/a.c
> > src/b.c
> > src/asdf.c
> >
> > All files were copied, as expected. Running SCons again, the build was
> not executed as all targets were up to date.
>
> The build was re-run because the new name you gave to the file changed
> the sorting order. I explained this in the initial report:
>

Also you could run into an issue where your command line length exceeds the
shell maximum.

That said you can exclude parts of your command line from the action
signature by wrapping them in "$(" and "$)" (start and end of sequence in
build string to ignore).
(Also in the manpage)


>
> On 3 August 2018 at 00:45, Alistair Buxton <***@gmail.com> wrote:
>
> > renames. It can also not if the renaming causes dependencies to be
> > listed in a different order (ie if you renamed a.txt instead of b.txt,
> > and the files had different contents).
>
> If you renamed asdf.c to asdfa.c, then the build would still have
> re-run, because you put the name of every file into the action string,
> and scons will detect that that the action string has changed. I
> explained this in my initial follow up email:
>


Here's what you said in your initial email
"All files under /some/path will be inserted into the image. Therefore,
if any file or subdirectory is renamed, the image needs to be rebuilt,"

So in the case above, rebuilding is correct?


>
> > In this case you would instead say:
> >
> > env.Command('listing.txt', Glob('*.txt'), 'ls ${SOURCES} > ${TARGET}')
> >
> > and scons would identify that the build needs to be re-done because
> > the action string changed.
>
> and the above was in reference to a reproduction of the issue which
> involves no directories at all:
>
> > env.Command('listing.txt', Glob('*.txt'), 'ls *.txt > ${TARGET}')
>
> Please actually read the whole report before jumping to conclusions.
>

I seem to have to keep reminding you that being courteous to those trying
to HELP you is considered good form.
Alistair Buxton
2018-08-03 18:25:01 UTC
Permalink
On 3 August 2018 at 19:20, Bill Deegan <***@baddogconsulting.com> wrote:

>
> Here's what you said in your initial email
> "All files under /some/path will be inserted into the image. Therefore,
> if any file or subdirectory is renamed, the image needs to be rebuilt,"
>
> So in the case above, rebuilding is correct?

Yes, it is the correct behaviour, but it happens for the wrong reason.
It happens as a result of the action string changing, NOT because the
sources changed, and as I wrote:

On 3 August 2018 at 00:53, Alistair Buxton <***@gmail.com> wrote:

> However, this is not so easy with the image builder case as it takes a
> directory path as the only argument.

In other words, the action string will not change when you have a tool
which takes a single directory path as the only argument, and as a
result, the build will not be re-run.
Bill Deegan
2018-08-03 18:26:53 UTC
Permalink
On Fri, Aug 3, 2018 at 11:25 AM, Alistair Buxton <***@gmail.com>
wrote:

> On 3 August 2018 at 19:20, Bill Deegan <***@baddogconsulting.com> wrote:
>
> >
> > Here's what you said in your initial email
> > "All files under /some/path will be inserted into the image. Therefore,
> > if any file or subdirectory is renamed, the image needs to be rebuilt,"
> >
> > So in the case above, rebuilding is correct?
>
> Yes, it is the correct behaviour, but it happens for the wrong reason.
> It happens as a result of the action string changing, NOT because the
> sources changed, and as I wrote:
>

Why is this the wrong reason?


>
> On 3 August 2018 at 00:53, Alistair Buxton <***@gmail.com> wrote:
>
> > However, this is not so easy with the image builder case as it takes a
> > directory path as the only argument.
>
> In other words, the action string will not change when you have a tool
> which takes a single directory path as the only argument, and as a
> result, the build will not be re-run.
>

This is typically where a scanner would come into play.

-Bill
Alistair Buxton
2018-08-03 18:40:53 UTC
Permalink
On 3 August 2018 at 19:26, Bill Deegan <***@baddogconsulting.com> wrote:
>
>
> On Fri, Aug 3, 2018 at 11:25 AM, Alistair Buxton <***@gmail.com>
> wrote:
>>
>> On 3 August 2018 at 19:20, Bill Deegan <***@baddogconsulting.com> wrote:
>>
>> >
>> > Here's what you said in your initial email
>> > "All files under /some/path will be inserted into the image. Therefore,
>> > if any file or subdirectory is renamed, the image needs to be rebuilt,"
>> >
>> > So in the case above, rebuilding is correct?
>>
>> Yes, it is the correct behaviour, but it happens for the wrong reason.
>> It happens as a result of the action string changing, NOT because the
>> sources changed, and as I wrote:
>
>
> Why is this the wrong reason?

Because the purpose of the example was to demonstrate that scons does
not reliably rebuild when the sources list changes. The fact that it
rebuilds if the action string changes is completely irrelevant, and
not helpful in the case where the tool takes a single directory as the
only argument.

>> On 3 August 2018 at 00:53, Alistair Buxton <***@gmail.com> wrote:
>>
>> > However, this is not so easy with the image builder case as it takes a
>> > directory path as the only argument.
>>
>> In other words, the action string will not change when you have a tool
>> which takes a single directory path as the only argument, and as a
>> result, the build will not be re-run.
>
>
> This is typically where a scanner would come into play.

Since a scanner returns a list of dependencies, and the problem is due
to the way scons internally processes dependency lists, this will not
solve the problem, and will in fact have identical behaviour, unless
you write the scanner to return a Value containing a recursive
checksum similar to the one I presented in the first email - which btw
is the only reliable recursive dependency code presented so far in
this thread, assuming I typed it out correctly :)

--
Alistair Buxton
***@gmail.com
Bill Deegan
2018-08-03 19:41:09 UTC
Permalink
On Fri, Aug 3, 2018 at 11:40 AM, Alistair Buxton <***@gmail.com>
wrote:

> On 3 August 2018 at 19:26, Bill Deegan <***@baddogconsulting.com> wrote:
> >
> >
> > On Fri, Aug 3, 2018 at 11:25 AM, Alistair Buxton <***@gmail.com>
> > wrote:
> >>
> >> On 3 August 2018 at 19:20, Bill Deegan <***@baddogconsulting.com>
> wrote:
> >>
> >> >
> >> > Here's what you said in your initial email
> >> > "All files under /some/path will be inserted into the image.
> Therefore,
> >> > if any file or subdirectory is renamed, the image needs to be
> rebuilt,"
> >> >
> >> > So in the case above, rebuilding is correct?
> >>
> >> Yes, it is the correct behaviour, but it happens for the wrong reason.
> >> It happens as a result of the action string changing, NOT because the
> >> sources changed, and as I wrote:
> >
> >
> > Why is this the wrong reason?
>
> Because the purpose of the example was to demonstrate that scons does
> not reliably rebuild when the sources list changes. The fact that it
> rebuilds if the action string changes is completely irrelevant, and
> not helpful in the case where the tool takes a single directory as the
> only argument.
>

So if this causes the appropriate behavior but for the wrong "reason", then
you will not use it?


> >> On 3 August 2018 at 00:53, Alistair Buxton <***@gmail.com>
> wrote:
> >>
> >> > However, this is not so easy with the image builder case as it takes a
> >> > directory path as the only argument.
> >>
> >> In other words, the action string will not change when you have a tool
> >> which takes a single directory path as the only argument, and as a
> >> result, the build will not be re-run.
> >
> >
> > This is typically where a scanner would come into play.
>
> Since a scanner returns a list of dependencies, and the problem is due
> to the way scons internally processes dependency lists, this will not
> solve the problem, and will in fact have identical behaviour, unless
> you write the scanner to return a Value containing a recursive
> checksum similar to the one I presented in the first email - which btw
> is the only reliable recursive dependency code presented so far in
> this thread, assuming I typed it out correctly :)
>


As I understand it, you're trying to solve two issues to get the build you
want:
1) Find the list of all dependencies correctly. The scanner can resolve
this. (Especially if some of the files are built files, which you've yet
to answer from a previous response in this email thread)
2) Decide if you need to rebuild (For this to work you need a correct list
of dependencies, once again the scanner can solve this part of the
puzzle). Then you just need an appropriate decider.
Alistair Buxton
2018-08-03 19:51:52 UTC
Permalink
On 3 August 2018 at 20:41, Bill Deegan <***@baddogconsulting.com> wrote:

>> Because the purpose of the example was to demonstrate that scons does
>> not reliably rebuild when the sources list changes. The fact that it
>> rebuilds if the action string changes is completely irrelevant, and
>> not helpful in the case where the tool takes a single directory as the
>> only argument.
>
>
> So if this causes the appropriate behavior but for the wrong "reason", then
> you will not use it?

I cannot use it because the tool in question only accepts a single
directory as the sole argument, and so I could not list all the files
in the action string even if I wanted to. (Which I do not.)


>
> As I understand it, you're trying to solve two issues to get the build you
> want:
> 1) Find the list of all dependencies correctly. The scanner can resolve
> this. (Especially if some of the files are built files, which you've yet to
> answer from a previous response in this email thread)
> 2) Decide if you need to rebuild (For this to work you need a correct list
> of dependencies, once again the scanner can solve this part of the puzzle).
> Then you just need an appropriate decider.


No, this is incorrect on both counts.

1) I don't need the list of dependencies, because the tool I am
running only accepts a single directory as the sole argument. The list
of dependencies is therefore useless for the purposes of generating
the action string.

2) Having a correct list of dependencies does not help in deciding if
I need to rebuild, because scons will not check if the filenames have
changed. It will therefore give the wrong result when given a correct
list of dependencies. This is the issue I am reporting, and is the
reason why none of the suggestions made so far in this thread will
work, except for the Value() based workaround I initially presented.


--
Alistair Buxton
***@gmail.com
Bill Deegan
2018-08-03 21:05:50 UTC
Permalink
On Fri, Aug 3, 2018 at 12:51 PM, Alistair Buxton <***@gmail.com>
wrote:

> On 3 August 2018 at 20:41, Bill Deegan <***@baddogconsulting.com> wrote:
>
> >> Because the purpose of the example was to demonstrate that scons does
> >> not reliably rebuild when the sources list changes. The fact that it
> >> rebuilds if the action string changes is completely irrelevant, and
> >> not helpful in the case where the tool takes a single directory as the
> >> only argument.
> >
> >
> > So if this causes the appropriate behavior but for the wrong "reason",
> then
> > you will not use it?
>
> I cannot use it because the tool in question only accepts a single
> directory as the sole argument, and so I could not list all the files
> in the action string even if I wanted to. (Which I do not.)
>

You can use it by added a env.Depends() from the target of whatever that
command is to your packaging command.
So while it won't appear on the command line of your command it will cause
your command to run.


>
> >
> > As I understand it, you're trying to solve two issues to get the build
> you
> > want:
> > 1) Find the list of all dependencies correctly. The scanner can resolve
> > this. (Especially if some of the files are built files, which you've
> yet to
> > answer from a previous response in this email thread)
> > 2) Decide if you need to rebuild (For this to work you need a correct
> list
> > of dependencies, once again the scanner can solve this part of the
> puzzle).
> > Then you just need an appropriate decider.
>
>
> No, this is incorrect on both counts.
>
> 1) I don't need the list of dependencies, because the tool I am
> running only accepts a single directory as the sole argument. The list
> of dependencies is therefore useless for the purposes of generating
> the action string.
>

You are incorrect. You need SCons to know about the full list of
dependencies for it to know if the tool needs to be rerun.


>
> 2) Having a correct list of dependencies does not help in deciding if
> I need to rebuild, because scons will not check if the filenames have
> changed. It will therefore give the wrong result when given a correct
> list of dependencies. This is the issue I am reporting, and is the
> reason why none of the suggestions made so far in this thread will
> work, except for the Value() based workaround I initially presented.
>


Let's list some facts.

A. You are just starting to use the tool
B. Everyone who is responding to you has been using the tool quite a long
time
C. You immediately dismiss their advice time and time again because it
doesn't agree with your limited knowledge of how the tool works.

You are again making assumptions about the method SCons uses to decide if a
command should be rerun based on your experience with the behavior of the
default decider.

So shall we start over.
1. You need SCons to know about the dependencies which SHOULD cause your
tool to be rerun
2. You need a method for SCons to take that information and trigger a rerun
of your tool if any of the file contents or filenames change

True?
Matthew Marinets
2018-08-03 21:16:35 UTC
Permalink
I have reproduced the issue, and it's definitely a bug. It's very specific, which is probably why no one else has run into it before.

My code:
# SConstruct
env = Environment(ENV=os.environ)
src = Dir('src')
build = Dir('build')

sources = Glob(pattern = str(src) + '/*.c')
print([str(x) for x in sources])

# Copy only the first source, but give the builder all sources
output = env.Command(target = [build.File(x.name) for x in sources[:1]], source = sources, action = 'cp $SOURCE $TARGET.dir')

------
with the same directory structure as before. (src/a.c, src/b.c, src/qwer.c)

-Running the build once copies a.c to build, as expected. Running the build again says that build/a.c is up to date.
-Making a change to any source (a.c, b.c, or qwer.c) causes a rebuild
-Changing the name of qwer.c to, say q.c DOES NOT CAUSE A REBUILD. This is the bug.
We would expect that if changing the contents of qwer.c causes a rebuild, then changing the name of qwer.c to something else should also cause a rebuild.
Additionally, adding an env.Depend() DOES NOT WORK. This has to do with how SCons handles its dependency checking:

The method in charge of determining if a node is out of date comes from SCons.Node.__init__.py, in class Node, method changed(). The method checks three things:
-If the length of the source list has changed
-If the decider decides that a child is out of date.
-If the build action changed

The only potential place that can check if a child's name changed is the decider, and the default SCons decider doesn't. It merely compares item x in the current source list with item x in the old source list, and runs hashes only on the contents. I'm still not too great at navigating the SCons codebase, but I think it comes back to SCons.Node.changed_content(), which checks the csig attribute. Csig in turn comes from get_csig(), which only runs MD5 on a node's get_contents() output.

**The script-side temporary fix would have to adjust the decider somehow.
-This function needs to take three arguments. From the SCons source code, it looks like "decider(dependency, target, prev_ni)" is used
-This function should return True if the node changed, false if it did not.
-A user could reasonably copy functions changed_content() and get_csig() from SCons/Node.FS.py and use those as a basis for a new decider.
-The adjusted decider would simply append a node's name to its contents when if calculates a csig.
I'm not 100% exactly how to implement this myself (I'm no hashing expert), so I would appreciate if anyone with the requisite know-how could implement this and share the code.

**The full SCons-side fix is probably going to include a node's name when getting a hash.
-In SCons.Node.get_csig(), we would need to change the "csig = SCons.Util.MD5signature(contents)" to "csig = SCons.Util.MD5signature(contents + self.name)" (might have type issues)
-Either adjust get_content_hash() to include the node's name, or create a new method get_node_hash() that includes the node's name.

Hope my constant walls of text aren't wearing on people : P

-Matthew

-----Original Message-----
From: Scons-users <scons-users-***@scons.org> On Behalf Of Alistair Buxton
Sent: August 3, 2018 12:52
To: SCons users mailing list <scons-***@scons.org>
Subject: Re: [Scons-users] Recursive directories, and the ignoring of filenames.

On 3 August 2018 at 20:41, Bill Deegan <***@baddogconsulting.com> wrote:

>> Because the purpose of the example was to demonstrate that scons does
>> not reliably rebuild when the sources list changes. The fact that it
>> rebuilds if the action string changes is completely irrelevant, and
>> not helpful in the case where the tool takes a single directory as
>> the only argument.
>
>
> So if this causes the appropriate behavior but for the wrong "reason",
> then you will not use it?

I cannot use it because the tool in question only accepts a single directory as the sole argument, and so I could not list all the files in the action string even if I wanted to. (Which I do not.)


>
> As I understand it, you're trying to solve two issues to get the build
> you
> want:
> 1) Find the list of all dependencies correctly. The scanner can
> resolve this. (Especially if some of the files are built files, which
> you've yet to answer from a previous response in this email thread)
> 2) Decide if you need to rebuild (For this to work you need a correct
> list of dependencies, once again the scanner can solve this part of the puzzle).
> Then you just need an appropriate decider.


No, this is incorrect on both counts.

1) I don't need the list of dependencies, because the tool I am running only accepts a single directory as the sole argument. The list of dependencies is therefore useless for the purposes of generating the action string.

2) Having a correct list of dependencies does not help in deciding if I need to rebuild, because scons will not check if the filenames have changed. It will therefore give the wrong result when given a correct list of dependencies. This is the issue I am reporting, and is the reason why none of the suggestions made so far in this thread will work, except for the Value() based workaround I initially presented.


--
Alistair Buxton
***@gmail.com
Bill Deegan
2018-08-03 22:12:46 UTC
Permalink
On Fri, Aug 3, 2018 at 2:16 PM, Matthew Marinets <
***@kardium.com> wrote:

> I have reproduced the issue, and it's definitely a bug. It's very
> specific, which is probably why no one else has run into it before.
>
>
This is very odd SConstruct..


> My code:
> # SConstruct
> env = Environment(ENV=os.environ)
> src = Dir('src')
> build = Dir('build')
>
> sources = Glob(pattern = str(src) + '/*.c')
> print([str(x) for x in sources])
>
> # Copy only the first source, but give the builder all sources
> output = env.Command(target = [build.File(x.name) for x in sources[:1]],
> source = sources, action = 'cp $SOURCE $TARGET.dir')
>

# Here's how I'd write it. More SCons'ian
import os
env = Environment(ENV=os.environ)
src = 'src'
build = 'build'

sources = Glob(src + '/*.txt')

sources_string_list=[str(x) for x in sources]
print(sources_string_list)

# Copy only the first source, but give the builder all sources
output=env.Command("${SOURCES[0].file}", sources, 'cp $SOURCE ${TARGET.dir}'
)

# The line below makes the command rerun when the list of filenames with
path changes.
# Note: No need to change decider or do any magic here..
env.Depends(output, Value(sources_string_list))



>
> ------
> with the same directory structure as before. (src/a.c, src/b.c, src/qwer.c)
>
> -Running the build once copies a.c to build, as expected. Running the
> build again says that build/a.c is up to date.
> -Making a change to any source (a.c, b.c, or qwer.c) causes a rebuild
> -Changing the name of qwer.c to, say q.c DOES NOT CAUSE A REBUILD. This is
> the bug.
>

Is it though?
So target a depends on c,d,e,f
If none of the contents change in the files it depends on then does it
really need to be rebuilt?
(at least for a standard compilation model)

If the order changes, then it does need to be rebuilt as this can have
implications for the contents of the target (in a compilation model)
Certainly with make this would not trigger a rebuild (assuming the
timestamp didn't change).
If you want SCons to check timestamp first and then (if it's changed) check
content then you'd use MD5-timestamp decider.


> We would expect that if changing the contents of qwer.c causes a rebuild,
> then changing the name of qwer.c to something else should also cause a
> rebuild.
> Additionally, adding an env.Depend() DOES NOT WORK. This has to do with
> how SCons handles its dependency checking:
>

Actually it's going to depend on what type of Node(s) you're adding via
Depends.. It's method to determine "changed" will be called.


>
> The method in charge of determining if a node is out of date comes from
> SCons.Node.__init__.py, in class Node, method changed(). The method checks
> three things:
> -If the length of the source list has changed
> -If the decider decides that a child is out of date.
> -If the build action changed
>

Not entirely correct. The items are Nodes and can be many types. Each type
can implement their own changed functions, which function is called is
based o the type and also on the decider set in the Environment (or as a
default for all environments if the Environment's hasn't been explicitly
set)


>
> The only potential place that can check if a child's name changed is the
> decider, and the default SCons decider doesn't. It merely compares item x
> in the current source list with item x in the old source list, and runs
> hashes only on the contents. I'm still not too great at navigating the
> SCons codebase, but I think it comes back to SCons.Node.changed_content(),
> which checks the csig attribute. Csig in turn comes from get_csig(), which
> only runs MD5 on a node's get_contents() output.
>

Which is valid in the traditional compilation model where such files would
also be listed on the command line, the ordering the files are evaluated is
deterministic so assuming dependencies come from the same "place" they will
be in the same order and thus comparing old vs new is valid


>
> **The script-side temporary fix would have to adjust the decider somehow.
> -This function needs to take three arguments. From the SCons source code,
> it looks like "decider(dependency, target, prev_ni)" is used
> -This function should return True if the node changed, false if it did not.
> -A user could reasonably copy functions changed_content() and get_csig()
> from SCons/Node.FS.py and use those as a basis for a new decider.
> -The adjusted decider would simply append a node's name to its contents
> when if calculates a csig.
> I'm not 100% exactly how to implement this myself (I'm no hashing expert),
> so I would appreciate if anyone with the requisite know-how could implement
> this and share the code.
>
> **The full SCons-side fix is probably going to include a node's name when
> getting a hash.
> -In SCons.Node.get_csig(), we would need to change the "csig =
> SCons.Util.MD5signature(contents)" to "csig = SCons.Util.MD5signature(contents
> + self.name)" (might have type issues)
> -Either adjust get_content_hash() to include the node's name, or create a
> new method get_node_hash() that includes the node's name.
>

I'm deep in the code on a fix for MD5-timestamp which does pretty much this.
Note that all seems simple(ish) until you start adding Repository() and
variant dirs and cachedirs
In these cases you may do a build which pulls a built object from a
repository, or from a variant dir where the actual path differs but the
requested path remains the same and so may or may not match the previous
builds actual path, but may be valid and not require a rebuild.

When thinking about handling out-of-date ness during builds some things to
keep in mind.
A) the goal when SCons was created (based on Cons) was to be better than
Make.
B) Make doesn't care about the list of files, the content of the files, or
the command line, only if the listed source(s) are newer than the listed
target(s)




>
> Hope my constant walls of text aren't wearing on people : P
>

Nope. I appreciate well though out discussion rather than walls of text
based on assumptions of how SCons "should" work.


>
> -Matthew
>
> -----Original Message-----
> From: Scons-users <scons-users-***@scons.org> On Behalf Of Alistair
> Buxton
> Sent: August 3, 2018 12:52
> To: SCons users mailing list <scons-***@scons.org>
> Subject: Re: [Scons-users] Recursive directories, and the ignoring of
> filenames.
>
> On 3 August 2018 at 20:41, Bill Deegan <***@baddogconsulting.com> wrote:
>
> >> Because the purpose of the example was to demonstrate that scons does
> >> not reliably rebuild when the sources list changes. The fact that it
> >> rebuilds if the action string changes is completely irrelevant, and
> >> not helpful in the case where the tool takes a single directory as
> >> the only argument.
> >
> >
> > So if this causes the appropriate behavior but for the wrong "reason",
> > then you will not use it?
>
> I cannot use it because the tool in question only accepts a single
> directory as the sole argument, and so I could not list all the files in
> the action string even if I wanted to. (Which I do not.)
>
>
> >
> > As I understand it, you're trying to solve two issues to get the build
> > you
> > want:
> > 1) Find the list of all dependencies correctly. The scanner can
> > resolve this. (Especially if some of the files are built files, which
> > you've yet to answer from a previous response in this email thread)
> > 2) Decide if you need to rebuild (For this to work you need a correct
> > list of dependencies, once again the scanner can solve this part of the
> puzzle).
> > Then you just need an appropriate decider.
>
>
> No, this is incorrect on both counts.
>
> 1) I don't need the list of dependencies, because the tool I am running
> only accepts a single directory as the sole argument. The list of
> dependencies is therefore useless for the purposes of generating the action
> string.
>
> 2) Having a correct list of dependencies does not help in deciding if I
> need to rebuild, because scons will not check if the filenames have
> changed. It will therefore give the wrong result when given a correct list
> of dependencies. This is the issue I am reporting, and is the reason why
> none of the suggestions made so far in this thread will work, except for
> the Value() based workaround I initially presented.
>
>
> --
> Alistair Buxton
> ***@gmail.com
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
Matthew Marinets
2018-08-04 00:35:34 UTC
Permalink
Sorry about jumping to conclusions about how it should work. I was just a bit excited because I finally managed to reproduce the bug after a couple hours on and off trying and wasn’t expecting it to work that way.

So to summarize:
-We can change the behaviour of a builder to rebuild when source names change with:

env.SomeBuilder(target, sources)
env.Depends(target, Value([str(x) for x in sources]))

-Having to do this manually gives more flexibility; “str(x)” can also be “x.name” or some kind of relative path function.
-Having similar functionality built into the decider is in the works, but is more complicated than it seems
-I should be using string paths instead of Dir / File objects in my scripts (?)

I wasn’t too familiar with Value() nodes, so it seems I jumped to conclusions.
(also don’t mind me forgetting to hit the send button two hours ago)

Have a good weekend everyone! : D

-Matthew

From: Scons-users <scons-users-***@scons.org> On Behalf Of Bill Deegan
Sent: August 3, 2018 15:13
To: SCons users mailing list <scons-***@scons.org>
Subject: Re: [Scons-users] Recursive directories, and the ignoring of filenames.



On Fri, Aug 3, 2018 at 2:16 PM, Matthew Marinets <***@kardium.com<mailto:***@kardium.com>> wrote:
I have reproduced the issue, and it's definitely a bug. It's very specific, which is probably why no one else has run into it before.

This is very odd SConstruct..

My code:
# SConstruct
env = Environment(ENV=os.environ)
src = Dir('src')
build = Dir('build')

sources = Glob(pattern = str(src) + '/*.c')
print([str(x) for x in sources])

# Copy only the first source, but give the builder all sources
output = env.Command(target = [build.File(x.name<http://x.name>) for x in sources[:1]], source = sources, action = 'cp $SOURCE $TARGET.dir')

# Here's how I'd write it. More SCons'ian
import os
env = Environment(ENV=os.environ)
src = 'src'
build = 'build'

sources = Glob(src + '/*.txt')

sources_string_list=[str(x) for x in sources]
print(sources_string_list)

# Copy only the first source, but give the builder all sources
output=env.Command("${SOURCES[0].file}", sources, 'cp $SOURCE ${TARGET.dir}')

# The line below makes the command rerun when the list of filenames with path changes.
# Note: No need to change decider or do any magic here..
env.Depends(output, Value(sources_string_list))



------
with the same directory structure as before. (src/a.c, src/b.c, src/qwer.c)

-Running the build once copies a.c to build, as expected. Running the build again says that build/a.c is up to date.
-Making a change to any source (a.c, b.c, or qwer.c) causes a rebuild
-Changing the name of qwer.c to, say q.c DOES NOT CAUSE A REBUILD. This is the bug.

Is it though?
So target a depends on c,d,e,f
If none of the contents change in the files it depends on then does it really need to be rebuilt?
(at least for a standard compilation model)

If the order changes, then it does need to be rebuilt as this can have implications for the contents of the target (in a compilation model)
Certainly with make this would not trigger a rebuild (assuming the timestamp didn't change).
If you want SCons to check timestamp first and then (if it's changed) check content then you'd use MD5-timestamp decider.

We would expect that if changing the contents of qwer.c causes a rebuild, then changing the name of qwer.c to something else should also cause a rebuild.
Additionally, adding an env.Depend() DOES NOT WORK. This has to do with how SCons handles its dependency checking:

Actually it's going to depend on what type of Node(s) you're adding via Depends.. It's method to determine "changed" will be called.


The method in charge of determining if a node is out of date comes from SCons.Node.__init__.py, in class Node, method changed(). The method checks three things:
-If the length of the source list has changed
-If the decider decides that a child is out of date.
-If the build action changed

Not entirely correct. The items are Nodes and can be many types. Each type can implement their own changed functions, which function is called is based o the type and also on the decider set in the Environment (or as a default for all environments if the Environment's hasn't been explicitly set)


The only potential place that can check if a child's name changed is the decider, and the default SCons decider doesn't. It merely compares item x in the current source list with item x in the old source list, and runs hashes only on the contents. I'm still not too great at navigating the SCons codebase, but I think it comes back to SCons.Node.changed_content(), which checks the csig attribute. Csig in turn comes from get_csig(), which only runs MD5 on a node's get_contents() output.

Which is valid in the traditional compilation model where such files would also be listed on the command line, the ordering the files are evaluated is deterministic so assuming dependencies come from the same "place" they will be in the same order and thus comparing old vs new is valid


**The script-side temporary fix would have to adjust the decider somehow.
-This function needs to take three arguments. From the SCons source code, it looks like "decider(dependency, target, prev_ni)" is used
-This function should return True if the node changed, false if it did not.
-A user could reasonably copy functions changed_content() and get_csig() from SCons/Node.FS.py<http://Node.FS.py> and use those as a basis for a new decider.
-The adjusted decider would simply append a node's name to its contents when if calculates a csig.
I'm not 100% exactly how to implement this myself (I'm no hashing expert), so I would appreciate if anyone with the requisite know-how could implement this and share the code.

**The full SCons-side fix is probably going to include a node's name when getting a hash.
-In SCons.Node.get_csig(), we would need to change the "csig = SCons.Util.MD5signature(contents)" to "csig = SCons.Util.MD5signature(contents + self.name<http://self.name>)" (might have type issues)
-Either adjust get_content_hash() to include the node's name, or create a new method get_node_hash() that includes the node's name.

I'm deep in the code on a fix for MD5-timestamp which does pretty much this.
Note that all seems simple(ish) until you start adding Repository() and variant dirs and cachedirs
In these cases you may do a build which pulls a built object from a repository, or from a variant dir where the actual path differs but the requested path remains the same and so may or may not match the previous builds actual path, but may be valid and not require a rebuild.

When thinking about handling out-of-date ness during builds some things to keep in mind.
A) the goal when SCons was created (based on Cons) was to be better than Make.
B) Make doesn't care about the list of files, the content of the files, or the command line, only if the listed source(s) are newer than the listed target(s)




Hope my constant walls of text aren't wearing on people : P

Nope. I appreciate well though out discussion rather than walls of text based on assumptions of how SCons "should" work.


-Matthew

-----Original Message-----
From: Scons-users <scons-users-***@scons.org<mailto:scons-users-***@scons.org>> On Behalf Of Alistair Buxton
Sent: August 3, 2018 12:52
To: SCons users mailing list <scons-***@scons.org<mailto:scons-***@scons.org>>
Subject: Re: [Scons-users] Recursive directories, and the ignoring of filenames.
On 3 August 2018 at 20:41, Bill Deegan <***@baddogconsulting.com<mailto:***@baddogconsulting.com>> wrote:

>> Because the purpose of the example was to demonstrate that scons does
>> not reliably rebuild when the sources list changes. The fact that it
>> rebuilds if the action string changes is completely irrelevant, and
>> not helpful in the case where the tool takes a single directory as
>> the only argument.
>
>
> So if this causes the appropriate behavior but for the wrong "reason",
> then you will not use it?

I cannot use it because the tool in question only accepts a single directory as the sole argument, and so I could not list all the files in the action string even if I wanted to. (Which I do not.)


>
> As I understand it, you're trying to solve two issues to get the build
> you
> want:
> 1) Find the list of all dependencies correctly. The scanner can
> resolve this. (Especially if some of the files are built files, which
> you've yet to answer from a previous response in this email thread)
> 2) Decide if you need to rebuild (For this to work you need a correct
> list of dependencies, once again the scanner can solve this part of the puzzle).
> Then you just need an appropriate decider.


No, this is incorrect on both counts.

1) I don't need the list of dependencies, because the tool I am running only accepts a single directory as the sole argument. The list of dependencies is therefore useless for the purposes of generating the action string.

2) Having a correct list of dependencies does not help in deciding if I need to rebuild, because scons will not check if the filenames have changed. It will therefore give the wrong result when given a correct list of dependencies. This is the issue I am reporting, and is the reason why none of the suggestions made so far in this thread will work, except for the Value() based workaround I initially presented.


--
Alistair Buxton
***@gmail.com<mailto:***@gmail.com>
_______________________________________________
Scons-users mailing list
Scons-***@scons.org<mailto:Scons-***@scons.org>
https://pairlist4.pair.net/mailman/listinfo/scons-users
_______________________________________________
Scons-users mailing list
Scons-***@scons.org<mailto:Scons-***@scons.org>
https://pairlist4.pair.net/mailman/listinfo/scons-users
Bill Deegan
2018-08-04 02:35:43 UTC
Permalink
On Fri, Aug 3, 2018 at 5:35 PM, Matthew Marinets <
***@kardium.com> wrote:

> Sorry about jumping to conclusions about how it should work. I was just a
> bit excited because I finally managed to reproduce the bug after a couple
> hours on and off trying and wasn’t expecting it to work that way.
>
>
>
> So to summarize:
>
> -We can change the behavior of a builder to rebuild when source names
> change with:
>
>
>
> env.SomeBuilder(target, sources)
>
> env.Depends(target, Value([str(x) for x in sources]))
>

Yes. Though you probably want to keep in mind that compound builders like
Program() which can take C files and handle the conversion from .c -> .o
and the link them for you.

So if you have a list of sources for a program and only one of them should
be rebuilt you could do something like

objects = [env.SharedObject(s) for s in sources[1:]
special_object = env.SharedObject(sources[0] or 'myspecialfile.c')
env.Depend(special_object,Value(Some magic here))


>
> -Having to do this manually gives more flexibility; “str(x)” can also be “
> x.name” or some kind of relative path function.
>

Indeed, or the name plus file size, or anything you can imagine..
You can populate a Value node with the Git sha for your currently checked
out sandbox, etc..


> -Having similar functionality built into the decider is in the works, but
> is more complicated than it seems
>

No. Currently only planned for when using MD5-Timestamp decider, as the
performance cost is non-trivial at this point and that's the only decider
where a mismatch has bad side effects. (it will copy the MD5 from previous
build if the timestamp is unchanged and avoid re-MD5'ing files, Github
issue 2980 explains a bit about what was going wrong, but in this case
repeated runs could corrupt the sconsign and cause unnecessary rebuilds.
This was typcially happening when checking out backwards and forward from
git).

Regardless due to the nature of the way the list of sources, dependencies,
and implicit dependencies are generated the current method of new files vs
previous build files are valid. This logic is more or less unchanged in
15 years. It's had significant use and thus far few (if any) bugs are
outstanding with this functionality.

You can see my previous comments about the correctness of not rebuilding if
a file name changes in most contexts.

SCons is not really designed to take a directory as a source or a target,
it's file based.
You can work around these limitations to some extent, but as you can see
it's not vanilla.


> -I should be using string paths instead of Dir / File objects in my
> scripts (?)
>

Generally strings are sufficient and less typing.
There are rare occasions where explicitly using Dir or File are called for.

>
>
> I wasn’t too familiar with Value() nodes, so it seems I jumped to
> conclusions.
>
> (also don’t mind me forgetting to hit the send button two hours ago)
>

Safe to assume when learning a new tool that you know less than everyone
answering the questions.
Or at least on the scons-users mailing list.
Lots of experienced users willing to share a bit of time to help you along.

If you have a complicated build, then SCons is probably the best game in
town.
It's not the fastest.. but we're working on it.
(It's not slow though)


>
>
> Have a good weekend everyone! : D
>

Right back at you!

-Bill
SCons Project Co-manager.


>
>
> -Matthew
>
>
>
> *From:* Scons-users <scons-users-***@scons.org> *On Behalf Of *Bill
> Deegan
> *Sent:* August 3, 2018 15:13
>
> *To:* SCons users mailing list <scons-***@scons.org>
> *Subject:* Re: [Scons-users] Recursive directories, and the ignoring of
> filenames.
>
>
>
>
>
>
>
> On Fri, Aug 3, 2018 at 2:16 PM, Matthew Marinets <
> ***@kardium.com> wrote:
>
> I have reproduced the issue, and it's definitely a bug. It's very
> specific, which is probably why no one else has run into it before.
>
>
>
> This is very odd SConstruct..
>
>
>
> My code:
> # SConstruct
> env = Environment(ENV=os.environ)
> src = Dir('src')
> build = Dir('build')
>
> sources = Glob(pattern = str(src) + '/*.c')
> print([str(x) for x in sources])
>
> # Copy only the first source, but give the builder all sources
> output = env.Command(target = [build.File(x.name) for x in sources[:1]],
> source = sources, action = 'cp $SOURCE $TARGET.dir')
>
>
>
> # Here's how I'd write it. More SCons'ian
>
> import os
>
> env = Environment(ENV=os.environ)
>
> src = 'src'
>
> build = 'build'
>
>
>
> sources = Glob(src + '/*.txt')
>
>
>
> sources_string_list=[str(x) for x in sources]
>
> print(sources_string_list)
>
>
>
> # Copy only the first source, but give the builder all sources
>
> output=env.Command("${SOURCES[0].file}", sources, 'cp $SOURCE $
> {TARGET.dir}')
>
>
>
> # The line below makes the command rerun when the list of filenames with
> path changes.
>
> # Note: No need to change decider or do any magic here..
>
> env.Depends(output, Value(sources_string_list))
>
>
>
>
>
>
> ------
> with the same directory structure as before. (src/a.c, src/b.c, src/qwer.c)
>
> -Running the build once copies a.c to build, as expected. Running the
> build again says that build/a.c is up to date.
> -Making a change to any source (a.c, b.c, or qwer.c) causes a rebuild
> -Changing the name of qwer.c to, say q.c DOES NOT CAUSE A REBUILD. This is
> the bug.
>
>
>
> Is it though?
>
> So target a depends on c,d,e,f
>
> If none of the contents change in the files it depends on then does it
> really need to be rebuilt?
>
> (at least for a standard compilation model)
>
>
>
> If the order changes, then it does need to be rebuilt as this can have
> implications for the contents of the target (in a compilation model)
>
> Certainly with make this would not trigger a rebuild (assuming the
> timestamp didn't change).
>
> If you want SCons to check timestamp first and then (if it's changed)
> check content then you'd use MD5-timestamp decider.
>
>
>
> We would expect that if changing the contents of qwer.c causes a rebuild,
> then changing the name of qwer.c to something else should also cause a
> rebuild.
> Additionally, adding an env.Depend() DOES NOT WORK. This has to do with
> how SCons handles its dependency checking:
>
>
>
> Actually it's going to depend on what type of Node(s) you're adding via
> Depends.. It's method to determine "changed" will be called.
>
>
>
>
> The method in charge of determining if a node is out of date comes from
> SCons.Node.__init__.py, in class Node, method changed(). The method checks
> three things:
> -If the length of the source list has changed
> -If the decider decides that a child is out of date.
> -If the build action changed
>
>
>
> Not entirely correct. The items are Nodes and can be many types. Each type
> can implement their own changed functions, which function is called is
> based o the type and also on the decider set in the Environment (or as a
> default for all environments if the Environment's hasn't been explicitly
> set)
>
>
>
>
> The only potential place that can check if a child's name changed is the
> decider, and the default SCons decider doesn't. It merely compares item x
> in the current source list with item x in the old source list, and runs
> hashes only on the contents. I'm still not too great at navigating the
> SCons codebase, but I think it comes back to SCons.Node.changed_content(),
> which checks the csig attribute. Csig in turn comes from get_csig(), which
> only runs MD5 on a node's get_contents() output.
>
>
>
> Which is valid in the traditional compilation model where such files would
> also be listed on the command line, the ordering the files are evaluated is
> deterministic so assuming dependencies come from the same "place" they will
> be in the same order and thus comparing old vs new is valid
>
>
>
>
> **The script-side temporary fix would have to adjust the decider somehow.
> -This function needs to take three arguments. From the SCons source code,
> it looks like "decider(dependency, target, prev_ni)" is used
> -This function should return True if the node changed, false if it did not.
> -A user could reasonably copy functions changed_content() and get_csig()
> from SCons/Node.FS.py and use those as a basis for a new decider.
> -The adjusted decider would simply append a node's name to its contents
> when if calculates a csig.
> I'm not 100% exactly how to implement this myself (I'm no hashing expert),
> so I would appreciate if anyone with the requisite know-how could implement
> this and share the code.
>
> **The full SCons-side fix is probably going to include a node's name when
> getting a hash.
> -In SCons.Node.get_csig(), we would need to change the "csig =
> SCons.Util.MD5signature(contents)" to "csig = SCons.Util.MD5signature(contents
> + self.name)" (might have type issues)
> -Either adjust get_content_hash() to include the node's name, or create a
> new method get_node_hash() that includes the node's name.
>
>
>
> I'm deep in the code on a fix for MD5-timestamp which does pretty much
> this.
>
> Note that all seems simple(ish) until you start adding Repository() and
> variant dirs and cachedirs
>
> In these cases you may do a build which pulls a built object from a
> repository, or from a variant dir where the actual path differs but the
> requested path remains the same and so may or may not match the previous
> builds actual path, but may be valid and not require a rebuild.
>
>
>
> When thinking about handling out-of-date ness during builds some things to
> keep in mind.
>
> A) the goal when SCons was created (based on Cons) was to be better than
> Make.
>
> B) Make doesn't care about the list of files, the content of the files, or
> the command line, only if the listed source(s) are newer than the listed
> target(s)
>
>
>
>
>
>
>
>
> Hope my constant walls of text aren't wearing on people : P
>
>
>
> Nope. I appreciate well though out discussion rather than walls of text
> based on assumptions of how SCons "should" work.
>
>
>
>
> -Matthew
>
> -----Original Message-----
> From: Scons-users <scons-users-***@scons.org> On Behalf Of Alistair
> Buxton
> Sent: August 3, 2018 12:52
> To: SCons users mailing list <scons-***@scons.org>
> Subject: Re: [Scons-users] Recursive directories, and the ignoring of
> filenames.
>
> On 3 August 2018 at 20:41, Bill Deegan <***@baddogconsulting.com> wrote:
>
> >> Because the purpose of the example was to demonstrate that scons does
> >> not reliably rebuild when the sources list changes. The fact that it
> >> rebuilds if the action string changes is completely irrelevant, and
> >> not helpful in the case where the tool takes a single directory as
> >> the only argument.
> >
> >
> > So if this causes the appropriate behavior but for the wrong "reason",
> > then you will not use it?
>
> I cannot use it because the tool in question only accepts a single
> directory as the sole argument, and so I could not list all the files in
> the action string even if I wanted to. (Which I do not.)
>
>
> >
> > As I understand it, you're trying to solve two issues to get the build
> > you
> > want:
> > 1) Find the list of all dependencies correctly. The scanner can
> > resolve this. (Especially if some of the files are built files, which
> > you've yet to answer from a previous response in this email thread)
> > 2) Decide if you need to rebuild (For this to work you need a correct
> > list of dependencies, once again the scanner can solve this part of the
> puzzle).
> > Then you just need an appropriate decider.
>
>
> No, this is incorrect on both counts.
>
> 1) I don't need the list of dependencies, because the tool I am running
> only accepts a single directory as the sole argument. The list of
> dependencies is therefore useless for the purposes of generating the action
> string.
>
> 2) Having a correct list of dependencies does not help in deciding if I
> need to rebuild, because scons will not check if the filenames have
> changed. It will therefore give the wrong result when given a correct list
> of dependencies. This is the issue I am reporting, and is the reason why
> none of the suggestions made so far in this thread will work, except for
> the Value() based workaround I initially presented.
>
>
> --
> Alistair Buxton
> ***@gmail.com
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
>
>
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
>
Matthew Marinets
2018-08-03 18:21:52 UTC
Permalink
I still can't reproduce the issue:

Renaming qwer.c to qwera.c behaves exactly as expected:
Rebuilt the first time, not rebuilt the next

I put in a print statement to check the order. Even when the order of sources did not change, SCons still behaves as expected. My complete output is below, at the ** symbol, but there's one more thing

At no point did you have a guard against a directory slipping into your sources argument in your initial post. The commands I saw attempted were as follows:
env.Command('listing.txt', 'files', 'ls -lR ${SOURCE} > ${TARGET}')

env.Command('listing.txt', ['files/', Glob('files/*')], 'ls -lR ${SOURCES[0]} > ${TARGET}')

env.Command('listing.txt', ['files/', Glob('files/*'), Glob('files/*/*'), Glob('files/*/*/*'), Glob('files/*/*/*/*')], 'ls -lR ${SOURCES[0]} > ${TARGET}')

env.Command('listing.txt', ['files',
Value(subprocess.check_output(['sh', '-c', 'find files/ -type f -exec sha256sum {} \; | sort | sha256sum']))], 'ls -lR ${SOURCES[0]} >
${TARGET}')

env.Command('out.txt', Glob('files/*'), 'ls -lR files/ > ${TARGET}')

Except for the last one, every call starts with 'files' as the first source. 'files' is a directory, as evidenced by your use of 'files/' in many places, and your introductory statement that "I have a directory called 'files'".

Under the hood, SCons converts all sources and targets to SCons nodes; usually Entry nodes. This means that:
-If the file / directory does not exist, SCons waits waits until something creates it to check whether it's a file or dir.
-If it already exists, SCons resolves it to either a File or Dir node. This might also explain why you said your script would work once then behave incorrectly.

I have read through your email multiple times. If I come off as not having read it clearly enough, I apologize. I am simply trying to help, and hopefully I'll get better at SCons myself as I do so.

-Matthew
-------------------------------------------------------
**My scons output. The only difference in the code is that I added a print statement so we can see the order of the sources being fed to the builder:

C:\Users\mmarinets\IdeaProjects\HelloWorld>dir src
Volume in drive C has no label.
Volume Serial Number is 76FB-4CD1

Directory of C:\Users\mmarinets\IdeaProjects\HelloWorld\src

03-Aug-2018 11:08 AM <DIR> .
03-Aug-2018 11:08 AM <DIR> ..
03-Aug-2018 10:34 AM 71 a.c
03-Aug-2018 10:34 AM 71 b.c
27-Jun-2018 02:20 PM <DIR> com
18-Jul-2018 04:09 PM <DIR> org
03-Aug-2018 10:34 AM 71 qwer.c
3 File(s) 213 bytes
4 Dir(s) 154,776,956,928 bytes free

C:\Users\mmarinets\IdeaProjects\HelloWorld>dir build
Volume in drive C has no label.
Volume Serial Number is 76FB-4CD1

Directory of C:\Users\mmarinets\IdeaProjects\HelloWorld\build

03-Aug-2018 11:08 AM <DIR> .
03-Aug-2018 11:08 AM <DIR> ..
0 File(s) 0 bytes
2 Dir(s) 154,776,956,928 bytes free

C:\Users\mmarinets\IdeaProjects\HelloWorld>scons
scons: Reading SConscript files ...
['src\\a.c', 'src\\b.c', 'src\\qwer.c']
scons: done reading SConscript files.
scons: Building targets ...
cp src\a.c src\b.c src\qwer.c build
scons: done building targets.

C:\Users\mmarinets\IdeaProjects\HelloWorld>scons
scons: Reading SConscript files ...
['src\\a.c', 'src\\b.c', 'src\\qwer.c']
scons: done reading SConscript files.
scons: Building targets ...
scons: `.' is up to date.
scons: done building targets.

C:\Users\mmarinets\IdeaProjects\HelloWorld>cp src/qwer.c src/qwera.c

C:\Users\mmarinets\IdeaProjects\HelloWorld>del src\qwer.c

C:\Users\mmarinets\IdeaProjects\HelloWorld>scons --debug=explain
scons: Reading SConscript files ...
['src\\a.c', 'src\\b.c', 'src\\qwera.c']
scons: done reading SConscript files.
scons: Building targets ...
scons: rebuilding `build\a.c' because:
`src\qwer.c' is no longer a dependency
`src\qwera.c' is a new dependency
cp src\a.c src\b.c src\qwera.c build
scons: done building targets.

C:\Users\mmarinets\IdeaProjects\HelloWorld>scons --debug=explain
scons: Reading SConscript files ...
['src\\a.c', 'src\\b.c', 'src\\qwera.c']
scons: done reading SConscript files.
scons: Building targets ...
scons: `.' is up to date.
scons: done building targets.

-----Original Message-----
From: Scons-users <scons-users-***@scons.org> On Behalf Of Alistair Buxton
Sent: August 3, 2018 11:03
To: SCons users mailing list <scons-***@scons.org>
Subject: Re: [Scons-users] Recursive directories, and the ignoring of filenames.

On 3 August 2018 at 18:51, Matthew Marinets <***@kardium.com> wrote:
> I can't reproduce your bug then.
>
> Using Python 3.6.5;
> scons -v:
> script: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668, 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
> engine: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668, 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
> engine path: ['C:\\Program
> Files\\Python36\\scons-3.0.1\\SCons']
>
>
> I made a Command() builder that Globs all .c files in the src/ directory, and copies them to the build/ directory:
>
> sources = Glob(pattern = str(src) + '/*.c')
> env.Command(target = [build.File(x.name) for x in sources],
> source = sources, action = 'cp $SOURCES $TARGET.dir')
>
> First I ran it with src contents:
> src/a.c
> src/b.c
> src/asdf.c
>
> All files were copied, as expected. Running SCons again, the build was not executed as all targets were up to date.

The build was re-run because the new name you gave to the file changed the sorting order. I explained this in the initial report:

On 3 August 2018 at 00:45, Alistair Buxton <***@gmail.com> wrote:

> renames. It can also not if the renaming causes dependencies to be
> listed in a different order (ie if you renamed a.txt instead of b.txt,
> and the files had different contents).

If you renamed asdf.c to asdfa.c, then the build would still have re-run, because you put the name of every file into the action string, and scons will detect that that the action string has changed. I explained this in my initial follow up email:


> In this case you would instead say:
>
> env.Command('listing.txt', Glob('*.txt'), 'ls ${SOURCES} > ${TARGET}')
>
> and scons would identify that the build needs to be re-done because
> the action string changed.

and the above was in reference to a reproduction of the issue which involves no directories at all:

> env.Command('listing.txt', Glob('*.txt'), 'ls *.txt > ${TARGET}')

Please actually read the whole report before jumping to conclusions.


--
Alistair Buxton
***@gmail.com
Alistair Buxton
2018-08-03 18:42:10 UTC
Permalink
On 3 August 2018 at 19:21, Matthew Marinets
<***@kardium.com> wrote:
> I still can't reproduce the issue:
>
> Renaming qwer.c to qwera.c behaves exactly as expected:
> Rebuilt the first time, not rebuilt the next
>
> I put in a print statement to check the order. Even when the order of sources did not change, SCons still behaves as expected. My complete output is below, at the ** symbol, but there's one more thing
>
> At no point did you have a guard against a directory slipping into your sources argument in your initial post. The commands I saw attempted were as follows:
> env.Command('listing.txt', 'files', 'ls -lR ${SOURCE} > ${TARGET}')
>
> env.Command('listing.txt', ['files/', Glob('files/*')], 'ls -lR ${SOURCES[0]} > ${TARGET}')
>
> env.Command('listing.txt', ['files/', Glob('files/*'), Glob('files/*/*'), Glob('files/*/*/*'), Glob('files/*/*/*/*')], 'ls -lR ${SOURCES[0]} > ${TARGET}')
>
> env.Command('listing.txt', ['files',
> Value(subprocess.check_output(['sh', '-c', 'find files/ -type f -exec sha256sum {} \; | sort | sha256sum']))], 'ls -lR ${SOURCES[0]} >
> ${TARGET}')
>
> env.Command('out.txt', Glob('files/*'), 'ls -lR files/ > ${TARGET}')
>
> Except for the last one, every call starts with 'files' as the first source. 'files' is a directory, as evidenced by your use of 'files/' in many places, and your introductory statement that "I have a directory called 'files'".

The difference between my example and your example is that I used
${SOURCE} and ${SOURCES[0]} which expand to only the first item in the
sources, ie "files" regardless of what else is in the list. You used
"${SOURCES}" which expands to the full list of sources, ie "files
files/a.txt files/b.txt" and this string will change depending on what
sources are given. This means that the action string will change, and
this will cause scons to rebuild. But as I just explained to Bill, not
every tool takes a list of files on the command line. Some only accept
a single directory, so using this behaviour as a workaround is not
possible.




--
Alistair Buxton
***@gmail.com
Jason Kenny
2018-08-03 18:09:14 UTC
Permalink
Just to chime in here.
I have a recurse glob in Parts you can use.. I have an updated version I have to merge in that uses Glob.
It has to use cases. it allows you to get all the files in some tree of stuff ( with include and exclude patterns). And it has a way to get the node as a set of targets and source nodes, to allow keeping the tree structure. This is useful for copies and for adding files to a zip or some kind of archive builder.
The code is here: https://bitbucket.org/dragon512/parts/src/b256cf29748d3a175302f4b83542fbefba8328a9/parts/pattern.py#lines-28
given you builder you have for the copy you then say something like:
mypattern = env.Pattern(src_dir = "mysub/dir")
trg, src = mypattern.target_source("my/copy/root/dir")
env.CCopyAs(target=trg, source=src)
I know this works fine. At least this might help provide an example to tweak your version with.
Jason
________________________________
From: Scons-users <scons-users-***@scons.org> on behalf of Matthew Marinets <***@Kardium.com>
Sent: Friday, August 3, 2018 12:51 PM
To: SCons users mailing list
Subject: Re: [Scons-users] Recursive directories, and the ignoring of filenames.

I can't reproduce your bug then.

Using Python 3.6.5;
scons -v:
script: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668, 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
engine: v3.0.1.74b2c53bc42290e911b334a6b44f187da698a668, 2017/11/14 13:16:53, by bdbaddog on hpmicrodog
engine path: ['C:\\Program Files\\Python36\\scons-3.0.1\\SCons']


I made a Command() builder that Globs all .c files in the src/ directory, and copies them to the build/ directory:

sources = Glob(pattern = str(src) + '/*.c')
env.Command(target = [build.File(x.name) for x in sources], source = sources, action = 'cp $SOURCES $TARGET.dir')

First I ran it with src contents:
src/a.c
src/b.c
src/asdf.c

All files were copied, as expected. Running SCons again, the build was not executed as all targets were up to date.

Next, I changed the name of asdf.c to qwer.c.
>scons --debug=explain
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: rebuilding `build\a.c' because:
`src\asdf.c' is no longer a dependency
`src\qwer.c' is a new dependency
cp src\a.c src\b.c src\qwer.c build
scons: done building targets.

>scons --debug=explain
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: `.' is up to date.
scons: done building targets.

I think you've misidentified your bug.

I still suspect that it's because you're putting directories in the source / target arguments of a builder; SCons doesn't like that for some reason. Notice that in my example, the sources and targets were all files or lists of files, and to get a directory path in that action, I called $TARGET.dir

-Matthew

-----Original Message-----
From: Scons-users <scons-users-***@scons.org> On Behalf Of Alistair Buxton
Sent: August 3, 2018 10:25
To: SCons users mailing list <scons-***@scons.org>
Subject: Re: [Scons-users] Recursive directories, and the ignoring of filenames.

On 3 August 2018 at 18:08, Matthew Marinets <***@kardium.com> wrote:
> To me it sounds more like you just need an emitter.
>
> The bug you described (as I understood it), is that the built-in SCons
> decider is order-sensitive. Saying [“src/a.txt”, “src/b.txt”] is
> different from [“src/b.txt”, “src/a.txt”].
>
> The simple solution, as Bill said, it to just sort your list. You can
> do this automatically in an emitter.

Then you didn't correctly understand the issue. Go and read it again.
The issue is that saying ['a.txt', 'b.txt'] is the same say saying ['a.txt', 'c.txt'] if b.txt and c.txt have the same contents, ie it will NOT trigger a rebuild, even though it should. The way you build the list is 100% irrelevant as this is an internal implementation detail of scons.


--
Alistair Buxton
***@gmail.com
Alistair Buxton
2018-08-03 18:42:41 UTC
Permalink
On 3 August 2018 at 19:09, Jason Kenny <***@live.com> wrote:

> https://bitbucket.org/dragon512/parts/src/b256cf29748d3a175302f4b83542fbefba8328a9/parts/pattern.py#lines-28

I haven't tested it yet, but since this appears to ultimately produces
a list of sources, I fully expect it to behave identically to all the
other examples wrt renaming a file in a way which does not affect the
order of the sources list.


--
Alistair Buxton
***@gmail.com
Bill Deegan
2018-08-03 17:26:27 UTC
Permalink
Matthew +1 for sorting the emitter!

And actually unless you care about empty directories changing their name,
then having the full path to all the files will yield the same information
(did a directory name change).
-Bill

On Fri, Aug 3, 2018 at 10:08 AM, Matthew Marinets <
***@kardium.com> wrote:

> To me it sounds more like you just need an emitter.
>
> The bug you described (as I understood it), is that the built-in SCons
> decider is order-sensitive. Saying [“src/a.txt”, “src/b.txt”] is different
> from [“src/b.txt”, “src/a.txt”].
>
> The simple solution, as Bill said, it to just sort your list. You can do
> this automatically in an emitter.
>
>
>
> Additionally, from my experience, SCons builders really don’t like taking
> directories as sources.
>
> --If you need a directory path in your command line, it should not come
> straight from $SOURCES
>
> ---Different variables that you define work fine, say $SOURCES_DIR
>
> ---Some SCons tools, particularly Java ones, use a node’s attributes; so
> you’d set a source’s attribute for, say, source_dir, and call it in the
> command with $SOURCE.attributes.source_dir
>
> --If you still want to specify just the sources directory when you call
> your builder, you can do all this calculation automatically in an emitter.
>
>
>
> Example:
>
> MyTool.py:
>
> “””My custom tool”””
>
> Import os
>
>
>
> from my_util import custom_recursive_glob
>
>
>
> def emitter(target, source, env):
>
> “””Formats targets and sources;
>
> The sources root directory will be assigned to $SOURCES_DIR,
>
> The actual source files will be used by SCons for dependency checking
>
> ”””
>
> assert os.isdir(str(source[0]))
>
> env[‘SOURCES_DIR’] = source
>
> source = custom_recursive_glob(source, pattern = “*.txt”)
>
> return target, source
>
>
>
> def generate(env):
>
> env[‘MYBUILDERCOM’] = “ls -lR $SOURCES_DIR > $TARGET”
>
> env[‘BUILDERS’][‘MyBuilder’] =
> SCons.Builder.Builder(action = “$MYBUILDERCOM’”, emitter = emitter)
>
>
>
> def exists(env):
>
> return True
>
>
>
> SConstruct:
>
> env = Environment(tools = [“MyTool”], toolpath = [“path/to/MyTool”])
>
> # build calls and such
>
>
>
> -Matthew
>
>
>
> *From:* Scons-users <scons-users-***@scons.org> *On Behalf Of *Bill
> Deegan
> *Sent:* August 3, 2018 09:52
> *To:* SCons users mailing list <scons-***@scons.org>
> *Subject:* Re: [Scons-users] Recursive directories, and the ignoring of
> filenames.
>
>
>
> And what if you sort the list generated from os.walk? Then the order
> should be consistant...
>
>
>
> On Fri, Aug 3, 2018 at 9:24 AM, Alistair Buxton <***@gmail.com>
> wrote:
>
> On 3 August 2018 at 14:56, Marc Branchaud <***@xiplink.com> wrote:
>
> > I use os.walk() to make a big list of all the files.
>
> This is affected by the bug I described, as are all methods which
> build a list of filenames and pass it to scons.
>
> Also os.walk ordering is arbitrary so this algorithm has a small
> chance of causing random spurious rebuilds.
>
>
>
> --
> Alistair Buxton
> ***@gmail.com
>
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
>
>
> _______________________________________________
> Scons-users mailing list
> Scons-***@scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
>
Loading...