103

Cross-platform programs are sometimes distributed as .tar.gz for the Unix version and .zip for the Windows version. This makes sense when the contents of each must be different.

If, however, the contents are going to be the same, it would be simpler to just have one download. Windows prefers .zip because that's the format it can handle out of the box. Does it matter on Unix? That is, I tried today unzipping a file on Ubuntu Linux, and it worked fine; is there any problem with this on any current Unix-like operating system, or is it okay to just provide a .zip file across the board?

rwallace
    Note that tar files may also be compressed with other, more modern compressors (like gzip replaced the original "compress" program as it was much more efficient). The file name ending changes accordingly. – Thorbjørn Ravn Andersen Oct 03 '20 at 13:13

9 Answers

70

Yes, it matters.
Actually, it depends.

tar.gz

  • Stores unix file attributes: uid, gid, permissions (most notably executable). The default may depend on your distribution, and can be toggled with options.
  • Consolidates all files to be archived in one file ("Tape ARchive").
  • Actual compression is done by GZIP, on the one .tar file

zip

  • Stores MSDOS attributes. (Archive, Readonly, Hidden, System)
  • Compresses each file individually, then consolidates the individually compressed files in one file
  • Includes a file table at the end of the file

Because zip compresses the files individually, a zip archive will most likely be larger (especially with many small files - think config files).
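To make the size point concrete, here is a rough sketch using Python's standard `zipfile` and `tarfile` modules. The file names and contents are made up for illustration, and the exact byte counts will vary with the data; it only shows the general effect for many small, similar files.

```python
import io
import tarfile
import zipfile


def make_files(count=200):
    """Return hypothetical small, similar config-style files as (name, bytes)."""
    files = []
    for i in range(count):
        body = (
            "[service]\n"
            f"name = worker{i}\n"
            "log_level = info\n"
            "retry_limit = 5\n"
            "timeout_seconds = 30\n"
        ).encode()
        files.append((f"conf/worker{i}.ini", body))
    return files


def zip_size(files):
    """Size of a zip archive: every member is deflated on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files:
            zf.writestr(name, data)
    return len(buf.getvalue())


def tar_gz_size(files):
    """Size of a tar.gz archive: one gzip stream over the whole tar file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for name, data in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())


files = make_files()
zip_bytes = zip_size(files)
tgz_bytes = tar_gz_size(files)
print(f"zip: {zip_bytes} bytes, tar.gz: {tgz_bytes} bytes")
```

With a handful of large files the gap should be much smaller, since per-file overhead and cross-file redundancy matter less.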

So, apart from file size: if you zip a bunch of files on Linux/Unix and then unzip them, the file attributes may be gone (at the very least those not supported by MS-DOS; it depends on which ZIP software you use). If you need those attributes, this matters. If you don't, the choice hardly matters, because the file-size difference is negligible in most cases.

Note:
Apparently, modern versions of ZIP also store Unix-file-attributes (depends on your ZIP-software), so with modern-zip-software, the file-size will be the only difference.
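As a small illustration of where a zip entry can carry Unix permissions: the mode bits live in the upper 16 bits of the entry's "external attributes" field. The sketch below uses Python's `zipfile` module and a made-up `run.sh` entry. Whether an extractor actually restores these bits depends on the tool, as noted above.

```python
import io
import stat
import zipfile


def zip_with_mode(name, data, mode):
    """Build an in-memory zip whose single entry records a Unix mode."""
    info = zipfile.ZipInfo(name)
    info.create_system = 3  # 3 means "Unix" in the zip "version made by" field
    # Unix mode goes into the high 16 bits of external_attr.
    info.external_attr = (stat.S_IFREG | mode) << 16
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, data)
    return buf.getvalue()


def stored_mode(zip_bytes, name):
    """Read the permission bits back out of the archive metadata."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(name)
        return stat.S_IMODE(info.external_attr >> 16)


archive = zip_with_mode("run.sh", b"#!/bin/sh\necho hi\n", 0o755)
print(oct(stored_mode(archive, "run.sh")))
```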

Quandary
    the standard distro of zip on unix-like systems (info-zip) also stores unix file attributes. – Erik Aronesty Apr 24 '18 at 21:26
    The ZIP format, e.g. via the Unix `zip` and `unzip` utilities, indeed always stores and restores Unix file permissions. Moreover, `unzip` restores the Unix file timestamps unless you provide the `-DD` option, and `unzip` even restores the UID and GID if you provide the `-X` option. – caw May 22 '21 at 02:18
39

tar.gz is better for Linux/Unix as it retains permissions, such as "executable" on scripts.
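For what it's worth, here is a minimal sketch of the executable bit surviving a round trip through a tar.gz archive, using Python's `tarfile` module. The `install.sh` name and its contents are hypothetical.

```python
import io
import tarfile


def tar_gz_with_mode(name, data, mode):
    """Build an in-memory tar.gz with one member that has the given mode."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    info.mode = mode  # permission bits are part of every tar header
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def is_executable(tgz_bytes, name):
    """Return True if any execute bit is set on the archived member."""
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r:gz") as tf:
        member = tf.getmember(name)
        return bool(member.mode & 0o111)


archive = tar_gz_with_mode("install.sh", b"#!/bin/sh\necho installing\n", 0o755)
print(is_executable(archive, "install.sh"))
```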

Zam
    OS X's Archive Utility and zip / unzip preserve permissions, but there might be other utilities that don't. – Lri Jan 19 '13 at 15:33
    Standard zip/unzip tools (info-zip) retain permissions on linux, and timestamps on windows. see: https://en.wikipedia.org/wiki/Info-ZIP for typical capabilities... which overcomes the permissions issues and file size limitations while retaining desirable random access and editable archive properties. – Erik Aronesty Apr 24 '18 at 21:23
33

Most popular Linux distros these days are by default equipped with zip compatibility. But as stated by nc3b, tar and gzip are more common on Linux/Unix systems. If you need 95% compatibility on these systems, consider using tar and gzip. If you need only 85%, zip will do fine.

Pylsa
    Okay, 95% is better than 85% :-) A very minor question, does it matter at all if the file extension is .tgz instead of .tar.gz? – rwallace May 29 '10 at 19:34
    The extension doesn't matter at all; it's just a hint for users and programs. If the extension is .XXX and you know the file is a tar archive, you can still use tar to untar it. .tgz is simply a short form of .tar.gz, and files with either extension have the same format. – Pylsa May 29 '10 at 19:43
    On the other hand, for 100% compatibility on Windows you would need to use cab. – kinokijuf Nov 15 '11 at 17:39
    tar will store uid, gid and permissions, such as +x on unix systems. zip stores archive, readonly, hidden and system on windows systems. – Andrew De Andrade Oct 30 '13 at 21:43
  • @BloodPhilia, So does that mean that we can GZip a file and rename it as .zip and it will correctly unzip? – Pacerier Apr 24 '14 at 11:33
  • @Pacerier Yes, as long as you use gzip to unzip it. – Pylsa Apr 26 '14 at 19:40
  • @BloodPhilia Actually gzip does care about the suffix. If you try to ungzip a file which doesn't end in .tgz or .gz it will give the error `gzip: npm-debug.log.zip: unknown suffix -- ignored`. – mtak Jul 31 '14 at 14:43
    @mtak you can always just use `gunzip --suffix .zip npm-debug.log.zip` or `gunzip -c < npm-debug.log.zip > npm-debug.log` – Iwan Aucamp Jan 19 '17 at 13:14
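To illustrate the point made in these comments, that the format is determined by the file's content rather than its name, here is a small sketch that sniffs the leading magic bytes and then decompresses a gzip file that was deliberately given a `.zip` name. The `sniff_format` helper is hypothetical; the file name is borrowed from the comment above.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
ZIP_MAGIC = b"PK"         # zip records start with "PK"


def sniff_format(head):
    """Guess the container format from the first bytes, ignoring the name."""
    if head[:2] == GZIP_MAGIC:
        return "gzip"
    if head[:2] == ZIP_MAGIC:
        return "zip"
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    # A gzip file with a misleading .zip suffix.
    path = os.path.join(tmp, "npm-debug.log.zip")
    with open(path, "wb") as fh:
        fh.write(gzip.compress(b"npm debug output\n"))

    with open(path, "rb") as fh:
        kind = sniff_format(fh.read(2))

    # Python's gzip module reads it regardless of the suffix.
    with gzip.open(path, "rb") as fh:
        text = fh.read()

print(kind, text)
```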
29

tar/gzip is a pretty crappy format since the archive cannot be randomly accessed, updated, verified or even appended to... without having to decompress the entire archive.

zip is much better in that regard.... you can quickly obtain the contents of a zip file, append to it without recompressing the first part, etc.
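A rough sketch of what that looks like in practice, using Python's `zipfile` module on an in-memory archive (the member names are made up): new entries are appended without touching existing ones, the listing comes from the file table, and a single member can be read on its own.

```python
import io
import zipfile

buf = io.BytesIO()

# Create an archive with a small file and a larger one.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("README.txt", "hello\n")
    zf.writestr("data/big.bin", bytes(100_000))

size_before = len(buf.getvalue())

# Append a new member; existing compressed data is left as it is.
with zipfile.ZipFile(buf, "a", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("CHANGELOG.txt", "added later\n")

# List the contents and read one member without decompressing the others.
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    readme = zf.read("README.txt")

print(names, readme, size_before, len(buf.getvalue()))
```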

zip has some size limitations, depending on the version of "zip" that you use, and these can be a problem. But the standard Info-ZIP tool that comes with most Linux-like OSes has no size limitations and preserves file permissions just fine.

see: https://en.wikipedia.org/wiki/Info-ZIP for capabilities

Erik Aronesty
11

Barebones Unix installs (e.g. server installs) may not contain unzip, but they always contain tar and gzip. If your audience is servers, I'd go for tar.gz.

Also gzip has greater compression than zip, so the file will be smaller.

Hennes
Rwky
    I wouldn't say gzip compresses better than ZIP. Both use the same DEFLATE algorithm, and all comparisons I've done give similar results in file size. – u1686_grawity May 29 '10 at 21:57
    Well, tar.gz will compress the whole file in one go, whereas zip compresses files individually. For many small files, the first approach will usually generate noticeably smaller files, because redundancies can be used across files. The difference is not huge though. – sleske Jun 24 '10 at 16:09
    @sleske actually gzip has a pretty small window size (32K) for finding redundancies, it's not the whole file – Erik Aronesty Mar 03 '21 at 17:31
    To drive home sleske's point: compressing a tar is equivalent to what's called a 'solid' archive for RAR, 7-zip, PowerArchiver and pretty much every other non-crappy archiver out there. .tgz and .zip simply don't give you the choice. Better archivers also offer features like deduplication and advanced codecs, which can improve compression by an order of magnitude on average (i.e. in actual daily use, not only in benchmarks). – DarthGizka Oct 30 '21 at 19:00
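The 32K window mentioned in these comments can be observed with a quick experiment. This is only a sketch using Python's `zlib` module (the same DEFLATE algorithm that gzip and zip use): a random block repeated twice compresses to roughly one copy when the repeat is within the window, and to roughly two copies when it is further away.

```python
import os
import zlib

WINDOW = 32 * 1024  # DEFLATE's maximum back-reference distance


def compressed_len_of_doubled(block_size):
    """Compress a random block followed by an exact copy of itself."""
    block = os.urandom(block_size)  # random data: incompressible on its own
    return len(zlib.compress(block + block, 9))


# The copy starts 16 KiB after the original: inside the window, so it is found.
near = compressed_len_of_doubled(WINDOW // 2)

# The copy starts 64 KiB after the original: outside the window, so it is missed.
far = compressed_len_of_doubled(WINDOW * 2)

print(f"16 KiB block doubled: {near} bytes")
print(f"64 KiB block doubled: {far} bytes")
```

So cross-file redundancy in a tar.gz only helps when the similar content sits close together in the tar stream, which fits the case of many small files.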
7

Yes, it matters. Tar is an archiver. And in tar.gz, we compress that archive.

Zip is both an archiver and compressor.

If you compare compression, from my experience, gzip is much better than zip.

The other significant difference is mentioned in another answer. If you have a very big archive and want to extract one small file, zip allows you to do that directly. With tar.gz, you have to decompress the archive sequentially until you reach that file, in the worst case all of it.
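A minimal sketch of the tar.gz side of this, using Python's `tarfile` module in streaming mode. The helper names and the generated file names are hypothetical. To get at the last member, every earlier member has to be passed first.

```python
import io
import tarfile


def build_tgz(count):
    """Build an in-memory tar.gz with `count` tiny members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for i in range(count):
            data = f"payload {i}\n".encode()
            info = tarfile.TarInfo(name=f"file{i:03d}.txt")
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member_streaming(tgz_bytes, wanted):
    """Scan the compressed stream from the start until `wanted` turns up.

    Returns the member's content and how many headers were scanned.
    """
    scanned = 0
    # "r|gz" reads strictly forward, like a pipe: there is no index to jump to.
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r|gz") as tf:
        for member in tf:
            scanned += 1
            if member.name == wanted:
                return tf.extractfile(member).read(), scanned
    raise KeyError(wanted)


content, scanned = read_member_streaming(build_tgz(50), "file049.txt")
print(content, scanned)
```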

Rakesh Reddy
  • Not an archive of gzipped files but a gzip of archived files. That's why you have to extract the whole archive. – m93a Feb 15 '15 at 16:24
1

The decision basically comes down to these:

  • tar.gz keeps Unix file permissions, such as whether a file is allowed to execute.

  • On the other hand, ZIP works out of the box on Windows.

1

tar and gzip are a lot more common on *nix-es than unzip. For instance, at the moment on my arch-2009.08 there is no unzip.

nc3b
-1

In my experience, there is a concrete difference.

If you are archiving programs together with their libraries, the zip format may lead to errors such as "file format not recognized" or "syntax error" after extraction, because some file information is lost. A tar archive keeps all the attributes intact.