>>1017469
>Why not?
Backups are a last defense against data loss.
Incremental backups are backups focused on preventing data loss in the short term: they should only include a very small portion of your files (the ones you work with in that short term) and they should be fast to create and fast to roll back to.
Compressing incremental backups makes them slower to create/verify/roll back, adds a small risk of data loss, and only saves you a bit of dirt-cheap storage space: at that point, you may as well skip incremental backups completely.
>Why not?
Because storage is dirt cheap, while data loss is not.
>if we're talking about incremental backups with almost identical copies of the same files, the raw data could easily be 10 times the size of the compressed archive or more.
Then use some version control software to deduplicate those files and back them up there.
>How would an archiver be able to detect bogus data being received by the disk or coming from the SATA controller?
I didn't mean that. I meant that the SATA controller is going to notice most read errors and either auto-correct them or retry the read, and then your archiver is going to notice most errors in the archive itself via CRCs and other verifications.
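For reference, this is roughly the kind of check an .sfv verifier or an archiver's integrity test performs: a minimal Python sketch of streaming CRC32 verification, with the file name and expected checksum made up for illustration.

import zlib

def crc32_of_file(path, chunk_size=1 << 20):
    # Stream the file in 1 MiB chunks so large files never need to fit in RAM.
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

# Hypothetical file and expected CRC, as they would appear in an .sfv line.
expected = 0x1C291CA3
actual = crc32_of_file("some_large_file.bin")
print("OK" if actual == expected else "CORRUPT: %08X != %08X" % (actual, expected))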
>but in real life there is silent data corruption
I have to doubt the stats in the CERN study, as they imply a permanent write corruption rate of about 1 in 10^9 bytes over 6 months.
That means I should see bogus hashes on many 1 GB+ files (quick calc says 60%+, sketched below) within 6 months of them being written to disk, and my use case includes a lot of large files that are validated via QuickSFV and stay around for several months.
Yet I have seen only a single one of those files fail hash verification in several years, on a system without ECC or RAID and in general very far from the state of the art.
More importantly, no hash mismatches on 10+ GB files, which should have a 99.99+% chance of corruption.
Let's not even consider system images and backups themselves: even a simple Windows install .iso should go bad more than half of the time.
I guess there was some issue at CERN, such as a disk sector going bad, because the 1 in 10^9 over 6 months figure is completely and utterly unbelievable.
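For what it's worth, the 60%+ and 99.99+% figures above are just 1 - (1 - p)^N with p = 10^-9 per byte over 6 months, treating each byte as independently at risk (a simplification, but good enough for a sanity check). A rough Python sketch of that back-of-the-envelope calc:

import math

def corruption_probability(size_bytes, per_byte_rate=1e-9):
    # P(at least one bad byte) = 1 - (1 - p)^N, approximated here as 1 - exp(-p * N).
    return 1.0 - math.exp(-per_byte_rate * size_bytes)

for label, size in [("1 GB", 1e9), ("5 GB Windows .iso", 5e9), ("10 GB", 1e10)]:
    pct = 100 * corruption_probability(size)
    print("%-18s -> %.2f%% chance of at least one bad byte in 6 months" % (label, pct))

That gives roughly 63% for 1 GB, 99.3% for 5 GB and 99.995% for 10 GB, which is where the numbers above come from.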
>Not being friendly is not the same as being impossible.
In the way I used it, it's close enough.
It means those systems cannot guarantee that appending data will produce a valid result, let alone the data you want.