
I was using 7-Zip to compute the hash of a folder (with subfolders), which it can do in two ways: with or without including the file names.

However, the hash feature is not implemented in the Linux version of 7-Zip. I tried several methods to reproduce the result, but none of them gives the same hash on Linux and Windows.

Example results:

"7za.exe h -scrcsha1 myfolder" on Windows gives:

SHA1   for data:              D54D3168B16BFEE600C3A77E848A2A1C1DBCBC59
SHA1   for data and names:    BCE55085200581AD1774CC25AE065DE7DE60077D

whereas on Linux I get:

find . -type f -exec sha1sum "$PWD"/{} \; | sha1sum
ee44137f2462bdfea87ec824dab514f288ae3e6c  -

or

find . -type f | xargs sha1sum | sha1sum
8f971311a28bcdee36fab0ce87a892564622db40  -

So I can't use the result from one platform on another.

(I did verify that the result for a single file is the same for both platforms.)

Nygael
  • Is making a hash of each file, and then hashing this an option? – davidbaumann Mar 05 '18 at 15:27
  • What didn't work about the accepted answer you linked to? `find . -type f -exec sha1sum "$PWD"/{} \; | sha1sum` - can you tell us the problems? – Attie Mar 05 '18 at 15:28

2 Answers

1

Simply running the following command won't necessarily work:

find . -type f | xargs sha512sum | sha512sum

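The issue you may face is that the order in which find reports files can differ from system to system, or even between two copies of the same directory. Because the final hash is computed over the list of per-file hashes, a different order gives a different result.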

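Instead, sort the file list before hashing. Setting LC_ALL=C makes sort use plain byte order regardless of the system locale, and the NUL-delimited options keep file names containing spaces or newlines intact:

find . -type f -print0 | LC_ALL=C sort -z | xargs -0 sha512sum | sha512sum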

Feel free to swap sha512sum for another tool, e.g. md5sum, sha1sum or sha256sum, depending on your requirements.

Note that this may get slow for large directory trees, in which case you may prefer a more complex script to traverse the hierarchy.
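As a rough illustration of what such a script could look like, here is a minimal Python sketch of the same idea: hash each file, list the results in a fixed order, then hash that list. The function names `hash_file` and `hash_tree` and the "digest, two spaces, relative path" line format are my own choices for illustration. The resulting value is not expected to match the output of the shell pipeline above or of 7-Zip; it is only meant to be stable across platforms.

```python
import hashlib
import os


def hash_file(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of one file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root, algorithm="sha512"):
    """Hash the names and content of every file under root in a fixed order."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Use forward slashes so Windows and Linux produce identical names.
            relative = os.path.relpath(full_path, root).replace(os.sep, "/")
            entries.append((relative, full_path))

    # Sort by the UTF-8 bytes of the relative path, not by locale rules
    # and not by whatever order the file system happens to return.
    entries.sort(key=lambda entry: entry[0].encode("utf-8", "surrogateescape"))

    outer = hashlib.new(algorithm)
    for relative, full_path in entries:
        # Illustrative line format (an assumption): "<digest>  <relative path>\n".
        line = f"{hash_file(full_path, algorithm)}  {relative}\n"
        outer.update(line.encode("utf-8", "surrogateescape"))
    return outer.hexdigest()
```

Calling `hash_tree("myfolder", "sha1")` returns one hex string for the whole tree. Like the shell version, it covers file names and content only, and symbolic links to files are hashed through their targets.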


Example:

$ find . -type f | xargs sha512sum | sha512sum
097e56f6b751c1da15ce5b9dce853ffcc89e06e9cbe10a8dc0894dedb834d40dc4228c65e48bd53f136dd6a7700b0ab07e8e12e7100956db00b0d1b9ef0b9956  -

This includes file names and content in the final hash, but not metadata such as modification times or permissions.


Note that you can use these utilities on Windows by using "Windows Subsystem for Linux". I've just installed it, which was a painless experience, and which also made me realise the issue with find's reported ordering.

Also watch out for how symbolic links are handled in your tree on Linux vs. Windows.

Attie
  • Thanks, but how can I get the same hash value on Windows? – Nygael Mar 05 '18 at 17:31
  • Can you edit your question with the output of `find . -type f | xargs sha512sum` from each system? Try a small example directory. I wonder if the backslashes are causing issues... Or if your directories are actually different. – Attie Mar 05 '18 at 17:33
  • Oh... I've just re-read your question... do you have ["_Windows Subsystem for Linux_"](https://docs.microsoft.com/en-us/windows/wsl/about) set up? You should be able to run the same command on both systems. – Attie Mar 05 '18 at 17:35
  • I don't have this installed, and it seems a bit overkill for my case. I'm looking for a standalone exe or similar. (I ultimately want to call the solution from the Node.js exec command line on any computer after just an NSIS install of my application and its dependencies.) – Nygael Mar 05 '18 at 17:50
  • Well if you're using NodeJS, then this is a different question... You're not looking for a set of utilities to achieve a goal, you're interested in how to implement your required features. – Attie Mar 05 '18 at 17:55
  • Hmmm, just by stating it I realized that I was so focused on the command line, due to my previous use of 7z, that I did not think of looking for a native Node.js tool, and there is actually one: https://www.npmjs.com/package/folder-hash ... I'll try that and maybe post it as an answer if it delivers the same result on both platforms. – Nygael Mar 05 '18 at 17:55
1

Unfortunately, it seems impossible to reproduce the folder hash generated by 7-Zip.

This is because 7z uses the FindNextFileW() function to enumerate directories (7z-1900src/CPP/Windows/FileFind.cpp, line 198).

The order in which this function returns files is not guaranteed and can depend on the file system (according to https://docs.microsoft.com/zh-cn/windows/win32/api/fileapi/nf-fileapi-findnextfilew).

So if you want to implement a platform-independent directory hashing function, you should sort the entries yourself with one well-defined sorting function before hashing.
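To make "one sorting function" concrete, here is a small Python sketch of one possible canonical order. The helper name `canonical_order` is hypothetical, and the rule it applies (forward slashes, then sort by UTF-8 bytes) is only one reasonable choice, not what 7-Zip does. It also assumes that no file name contains a literal backslash.

```python
def canonical_order(relative_paths):
    """Return the paths with forward slashes, sorted by their UTF-8 bytes."""
    # Assumption: a backslash is always a Windows separator, never part of a name.
    normalized = [path.replace("\\", "/") for path in relative_paths]
    # Byte order is case-sensitive and ignores the locale, so every platform
    # ends up with the same sequence whatever order the files were listed in.
    return sorted(normalized, key=lambda path: path.encode("utf-8"))


print(canonical_order(["b\\z.txt", "a/y.txt", "B/x.txt"]))
# prints ['B/x.txt', 'a/y.txt', 'b/z.txt']
```

Feeding the files to the hash in this order removes the dependency on the enumeration order of the file system.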

仕刀_