Simplify and unify file operations calling diff_2_files.
I think this greatly simplifies the open and stat
operations needed before calling diff_2_files, and this
also unifies them. I simplified the ones in DiffWrapper
a lot, and moved them to the bottom in a new structure
DiffFileData which contains the structure array, and
cleans itself up, and then I reused this code over in
Diff.cpp (which used to have its own code, which wasn't
exactly the same as the DiffWrapper version, although
it was functionally equivalent).
I got rid of the malloc stuff in the process, using
CStrings inside of DiffFileData instead.
This is one step on a plan I have to unify the two sets
of code leading up to diff calls. The next step will be
to (try to) unify the file transform calls in both paths.
The actual goal I'm trying to achieve, although its not
directly relevant to this patch, is to implement
generic properties by name in the DiffContext, and add
some diff statistics (most importantly #diffs, but also
#bytes changed if I can get that) as named properties.
However, I found that I wanted to simplify the paths of
code leading through diffs first, so I could better
understand what is happening, before I try to figure
out the AddDiff stuff; I hope there is a way to
simplify the AddDiff stuff.
Altered and original files (Diff.cpp, DiffWrapper.*) 20k
Logged In: YES
user_id=60964
Scope:
This is code cleanup, and is not after all part of the
#diffs work (which is now covered in another patch), so I
can't think of any remote reason to claim this is in the
existing 2.2 scope.
Logged In: YES
user_id=631874
This looks great! I don't know we have two different
implementations for some diffing code. When I moved code
from CMergeDoc to CDiffWrapper, I just copied existing code.
I didn't want to take risk of breaking something when doing
rather big changes.
Like shortly discussed elsewhere, AddDiff() should be
converted to use some real item counts (shared struct or
class), not just item++.
And I hope you keep in mind we want to have directory
compare done later in two phases: first read all files
(names?) into list (CDiffContext?) and after all files are
in list, run diff for all or selected (new flag) files. We
need this for some improvements, like rescanning only
selected items or getting changed files from windows
filesystem notifications.
Logged In: YES
user_id=631874
Oops, should be "I don't know *why* we have two different..."
Logged In: YES
user_id=60964
re: directory scan in two phases
No, I didn't have this in mind, but that sounds like a good
idea.
But, I don't think I can think about that and try to unify
the two code paths at the same time -- too much thinking :)
Anyway, I think combining the code has to help make it
understandable, at least to me.
Logged In: YES
user_id=631874
re: directory scan in two phases
I have one experimental solution for this - my solution just
moves calls to file compare out of dirscan() function.
Instead I added all files dirscan() found to diffcontext and
then in one loop compared all items in diffcontext (having
"needToRescan" flag). I'm not sure if that's good approach,
but its pretty simple at least.
But my solution changes way we use CDiffContext. Currently
we add items to it after file/dir is compared, as a result.
My solution adds first all items to CDiffContext and then
changes status and other info of items after they are
compared/read.
Now when you are changing CDiffContext, and if you think
something like I described above could work, would be nice
if you can design CDiffContext attributes so that above
usage is possible. Of course its another patch etc.
Logged In: YES
user_id=60964
That sounds fairly good.
How would you like the idea of adding list items immediately
as we get items in the diffcontext to-be-scanned array? That
way, the screen would fill up with items to be scanned, and
I guess they'd all be status unknown, then as they get
scanned their icons would change to green or red (or they'd
disappear -- hm, that might be trouble, removing ones that
fail filters).
I think this may slow things down, but I think it would be
pretty :)
Maybe it would be better to add them to list after they are
scanned, instead of just when we get them into the list of
things-to-be-scanned.
Logged In: YES
user_id=631874
My idea was even simpler. I added one new BOOL
"bNeedToRescan" to CDiffContext. When reading items to
CDiffContext I set bNeedToRescan to TRUE. And then rescan()
just goes through CDiffContext and compares files/dirs
having bNeedToRescan as TRUE.
This avoids having separate lists and copying/moving items
between lists which I think can cause some overhead and
unnecessary complexity. I don't know if this is too simple
for all usages however, for example how to handle new files
found (files copied to directory) after previous rescan...
Logged In: YES
user_id=60964
re: boolean for need to rescan -- sounds good to me :)
I'll bear it in mind, in case I get to that.
Applied this patch.
Closing as Accepted.
Checking in Src/Diff.cpp;
/cvsroot/winmerge/WinMerge/Src/Diff.cpp,v <-- Diff.cpp
new revision: 1.26; previous revision: 1.25
done
Checking in Src/DiffWrapper.cpp;
/cvsroot/winmerge/WinMerge/Src/DiffWrapper.cpp,v <--
DiffWrapper.cpp
new revision: 1.19; previous revision: 1.18
done
Checking in Src/DiffWrapper.h;
/cvsroot/winmerge/WinMerge/Src/DiffWrapper.h,v <--
DiffWrapper.h
new revision: 1.11; previous revision: 1.10
done
Checking in Src/readme.txt;
/cvsroot/winmerge/WinMerge/Src/readme.txt,v <-- readme.txt
new revision: 1.721; previous revision: 1.720
done
Success, CVS operation completed