
We have an LVM volume of almost 16 TB formatted with ReiserFS. We now want to add more disks, but we cannot keep using ReiserFS, as it has a maximum filesystem size of 16 TB. The volume is used to make incremental backups with rsync, which hard-links unchanged files.
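For context, the backup scheme is essentially the usual `--link-dest` pattern; roughly along these lines (paths and the exact invocation are simplified illustrations, not our real job):

```
# Simplified sketch of the nightly job; real paths differ.
SRC=/data/
DEST=/backup
TODAY=$(date +%F)

# Unchanged files are hard-linked against the previous snapshot,
# so each dated directory looks like a full copy but shares inodes.
rsync -aH --link-dest="$DEST/latest" "$SRC" "$DEST/$TODAY"
ln -sfn "$DEST/$TODAY" "$DEST/latest"
```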

We know we could use rsync or fsarchiver to copy the data, but both are very slow and consume huge amounts of memory, since they have to track the inode of every file in order to preserve the hard links.

We have used dd and ssh to copy the volume to a temporary location, and now we want to change the filesystem to 64-bit ext4 (or another filesystem if someone knows a better option).
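Roughly, the copy we already did and the filesystem we plan to create look like this (device names, paths, and the host are placeholders):

```
# Raw copy of the old logical volume to a temporary host (placeholder names).
dd if=/dev/vg0/backup bs=64M status=progress | ssh temphost 'dd of=/srv/tmp/backup.img bs=64M'

# Recreate the volume as ext4 with the 64bit feature so it can grow past 16 TB.
mkfs.ext4 -O 64bit /dev/vg0/backup
```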

emi

1 Answer


If you want to change the file system type, you're going to have to use a tool like rsync, tar, or fsarchiver. The main thing I would suggest, though, is that you consider finding a different backup system than rsync with hard links.
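If you do stick with rsync for the one-time migration, the copy between the old and the new filesystem would look roughly like this (mount points are placeholders); `-H` is what preserves the hard links, and it is also what costs the memory:

```
# Hypothetical mount points for the old ReiserFS volume and the new ext4 one.
mount /dev/vg0/backup-old /mnt/old
mount /dev/vg0/backup-new /mnt/new

# -H preserves hard links (and forces rsync to track every inode in memory),
# -A/-X carry over ACLs and extended attributes, --numeric-ids keeps raw uids/gids.
rsync -aHAX --numeric-ids /mnt/old/ /mnt/new/
```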

The problem is that hard links and duplicated directory trees are a very inefficient data structure for storing incremental backup information. This is one of the reasons why using rsync to copy all of these backups is slow and painful. It also means that running fsck for a consistency check will need a lot of memory, since verifying the link count of every file requires tracking every inode, which is the same reason rsync, tar, fsarchiver, etc. require so much memory.

So my recommendation is to look at a backup system such as bacula, which uses a proper database to store the catalog of backed-up files instead of trying to encode that information in directory trees, and to treat this migration as an opportunity to move to a much more scalable backup solution.

Theodore Ts'o
  • Do you know any method to migrate from my incremental copies to the bacula system? Or maybe bacula already has an importing tool? – emi Dec 01 '15 at 09:31
  • The one advantage of using a hard link tree as a backup system is that each top-level directory looks like a snapshot. So the simplest thing to do is to create a shell script that creates a bind mount from the first snapshot (say, /backup/01-01-2014) to some constant directory name, such as /files, and then have bacula do a backup of /files. Then remove the bind mount, create a bind mount from the next snapshot (i.e., /backup/01-08-2014) to /files, and have bacula do the next backup. It should be able to notice the unchanged files and not back them up again (a rough script for this loop is sketched below the comments). – Theodore Ts'o Dec 02 '15 at 14:54
  • I still cannot vote your response up. I'll be back when I've harvested enough reputation ;) – emi Dec 02 '15 at 19:24
  • If you don't keep very many snapshots (say, fewer than 20), then `rsync -aH ...source ...latest` followed by `cp -al latest $timestamp` is an okay backup strategy. It has the advantage that you don't need any special tools for restoring data, and all the historical copies keep exactly the same access permissions they used to have, so non-admins can restore files by themselves. Another advantage is that there's no database to potentially corrupt and lose *all* the files in the backup. – Mikko Rantalainen Jul 04 '19 at 06:20
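A rough sketch of the bind-mount loop described in Theodore Ts'o's comment above (the snapshot layout, the `/files` mount point, and the bacula job name are all assumptions):

```
#!/bin/sh
# Feed each hard-link snapshot to bacula through a constant path.
# /backup/*, /files and the job name "BackupFiles" are hypothetical.
for snap in /backup/*/; do
    mount --bind "$snap" /files
    echo "run job=BackupFiles yes" | bconsole   # start the (assumed) bacula job
    echo "wait" | bconsole                      # block until the job finishes
    umount /files
done
```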