Why do downloads give bad files?

Question

Why, when downloading, do network errors cause the download to finish and create an incomplete file?

Shouldn't the program downloading be able to recognize that the file is of X big, and it has only downloaded Y so far?

I'm assuming the download knows the size of the file in advance.

Decent question, but too broad. With web transfers & crappy networks I have seen tons of corrupted files. It comes from start/stop of the connection. Best bet I found to deal with them is to use a tool like “Download Them All.” — Giacomo1968, Oct 21 '14 at 19:37
Think about if someone sent you a 100 page document page by page. You get a load of pages and then a few don't turn up. Then a few more (not in the same order) arrive. If the first page tells you how many pages to expect then you know, but what can you possibly do to change how and when the pages arrive? You will still have an incomplete document... — Kinnectus, Oct 21 '14 at 19:40
@BigChris Sure, but I'd rather have a .crdownload file (in chrome's case) that indicates a partial download, than a bad completed file. — Nathan Merrill, Oct 21 '14 at 19:42
It depends on the quality of your down-load manager. Some of them are not very well written (to be polite). Plus, as @Ƭᴇcʜιᴇ007 says, it is not always possible to determine the source file size. That is why many sites publish MD5 check-sums, though there does not seem at present a protocol to allow down-load managers to use this information. — AFH, Oct 21 '14 at 19:48
@AFH "there does not seem at present a protocol to allow down-load managers to use this information" I'd assume that's because you'd need the whole file to use the checksum. So you'd need to download the whole thing to check it anyway. ;) — Ƭᴇcʜιᴇ007, Oct 21 '14 at 19:52

score 3 · Answer 1 · edited Mar 20 '17 at 10:16

Many transfer mechanisms have no idea how large the file they are downloading is. Check out this related SU question: Why do some downloading files not know their own size?

Also, the transfer mechanism has no idea what data is supposed to be in the file, so it cannot tell whether a 1 it just read was actually a 1 at the source or is a 1 because of corruption during the transfer.
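As noted in the comments on the question, some sites publish a checksum so that the finished file can be checked after the fact. A minimal sketch of that check, assuming the expected digest was copied by hand from the download page; the function name `file_digest_matches` and its parameters are hypothetical, not part of any download manager:

```python
import hashlib


def file_digest_matches(path, expected_hex, algorithm="sha256", chunk_size=65536):
    """Return True if the file at `path` hashes to `expected_hex`.

    `algorithm` is any name accepted by hashlib.new(); pass "md5" for
    sites that publish MD5 sums. The whole file must already be on disk.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so large downloads do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

This only works once the whole file has arrived, which is why it complements rather than replaces any check made during the transfer.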

If the download stream encounters too many retries, a time-out, or another recognizable transfer error, the transfer is stopped. Because download mechanisms save the stream as it arrives, they write the file until the incoming stream stops, regardless of why it stops. Whether that incomplete file is kept once the transfer is known to have failed is up to the client or mechanism used.
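To illustrate the last point, here is a rough sketch of how a client could avoid presenting a truncated file as finished when it does know the expected size (for example from an HTTP `Content-Length` header, when the server sends one). Everything named here (`save_stream`, `IncompleteDownload`, the `.part` suffix) is an assumption made for the example, not the behaviour of any particular browser or download manager:

```python
import os


class IncompleteDownload(Exception):
    """Raised when fewer (or more) bytes arrived than were announced."""


def save_stream(stream, dest_path, expected_size=None, chunk_size=65536):
    """Copy a file-like `stream` to disk, renaming only on success.

    Data is written to `dest_path + ".part"` as it arrives. The final
    name is used only if the byte count matches `expected_size`, or if
    no size was announced (in which case truncation cannot be detected).
    """
    part_path = dest_path + ".part"
    received = 0
    with open(part_path, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # the stream ended, for whatever reason
                break
            out.write(chunk)
            received += len(chunk)
    if expected_size is not None and received != expected_size:
        # Leave the .part file in place so the failure stays visible.
        raise IncompleteDownload(
            f"expected {expected_size} bytes, got {received}"
        )
    os.replace(part_path, dest_path)
    return received
```

A client written this way leaves a visibly partial file behind after a failed transfer. When `expected_size` is `None` it has nothing to compare against, so an early end of stream is indistinguishable from a normal one.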

Community

answered Oct 21 '14 at 19:42

Ƭᴇcʜιᴇ007

I'm not asking about downloads that don't know the size. I'm not asking why bad data creeps into downloads, but why that bad data (or no data) finishes the download prematurely. – Nathan Merrill Oct 21 '14 at 19:45
You asked: "Shouldn't the program downloading be able to recognize that the file is of X big, and it has only downloaded Y so far?" So I stated that they (often) have no idea what the expected size of the file is, to explain why that can't be used to detect a problem. – Ƭᴇcʜιᴇ007 Oct 21 '14 at 19:49
I realized that when you posted, and made an edit :P – Nathan Merrill Oct 21 '14 at 19:50
Why is bad data downloaded? Because a single packet is corrupted by noise. – Ramhound Oct 21 '14 at 20:11
Well… data corruption… If a single packet is corrupted in IP, it will be discarded. If it's corrupted in TCP, it will be retransmitted. That's why they have checksums. So where does the corruption really come from, then? – slhck Oct 22 '14 at 04:47
