Permissible exclusions when backing up a repository or branch

Asked by Stuart Bishop

I have scheduled offsite backups of my repository, but the backups seem needlessly large. Which parts of the repository are safe to exclude from backups? At the moment I'm excluding .bzr/repository/obsolete_packs/* but was wondering if all those indexes or other parts are needed too, and I'd rather not find out the hard way that I've been overzealous.

Question information

Language:
English
Status:
Solved
For:
Bazaar
Assignee:
No assignee
Solved by:
John A Meinel
Best John A Meinel (jameinel) said :
#1

The files in .bzr/repository/obsolete_packs/* and .bzr/repository/upload/* can be ignored.
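As a rough illustration of those exclusions, here is a minimal Python sketch that archives a branch directory while skipping the contents of the two directories. The function names, the choice of a gzipped tar archive, and the exact directory paths (taken from the .bzr/repository/... paths used elsewhere in this thread) are assumptions for the example, not anything bzr itself provides. It should only be run while no bzr operation is active on the repository.

```python
import tarfile
from pathlib import PurePosixPath

# Directories whose contents can be skipped, per the answer above.
# The exact paths are an assumption based on the rest of this thread.
EXCLUDED_DIRS = (
    ".bzr/repository/obsolete_packs",
    ".bzr/repository/upload",
)


def is_excluded(relpath):
    """Return True if relpath lies strictly inside one of EXCLUDED_DIRS."""
    parts = PurePosixPath(relpath).parts
    for excluded in EXCLUDED_DIRS:
        ex_parts = PurePosixPath(excluded).parts
        if len(parts) > len(ex_parts) and parts[:len(ex_parts)] == ex_parts:
            return True
    return False


def backup(repo_root, archive_path):
    """Write a gzipped tar of repo_root, skipping the excluded contents.

    The excluded directories themselves are kept (empty) in the archive.
    """
    def tar_filter(info):
        # Returning None tells tarfile to leave this member out.
        return None if is_excluded(info.name) else info

    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(repo_root, arcname=".", filter=tar_filter)
```

A call such as backup("/srv/bzr/myrepo", "/backups/myrepo.tar.gz") would then produce the archive; both paths are placeholders.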

Probably the size of the backups is due to auto-packing: we may combine smaller packs into a larger one, which causes that data to be re-written to the backup.

I would probably recommend that you manually run "bzr pack" to create one large pack, which will be unlikely to change anytime soon.

I would *not* recommend that you run "bzr pack" manually after that, as it will regenerate a single large pack, which means the whole history gets stored back to your backup tapes.

The indexes are needed.

Technically, you probably don't have to copy the stuff in .bzr/repository/lock/* but there shouldn't be a lot there anyway.

If you have stuff in .bzr/repository/upload, that is a sign of a lot of interrupted connections. (Push part way, ^C, etc.) Things should only be there until the upload finishes, and then they are moved into .bzr/repository/packs/*

It is also possible to get unreferenced packs in .bzr/repository/packs/*. When adding a new pack, or repacking existing ones, we put the new data in upload, move the new pack into place, update the pack-names file, and then rename the old packs out of the way. If you ^C at any time in that sequence, it is possible to leave data behind that is not referenced. (No corruption, just extra data.)

We don't (currently) have a bzr command to check and prune that sort of thing out, though I could work with you on it.

(Basically, the filenames in .bzr/repository/pack-names are the only ones we reference, so anything else in .bzr/repository/packs or .bzr/repository/indices, and anything in .bzr/repository/upload, is unreferenced.)
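To make that rule concrete, here is a small Python sketch that only reports candidate leftovers and deletes nothing. It does not parse pack-names: the caller is assumed to supply the set of referenced names by some other means. It also assumes that each file in packs and indices is named after its pack followed by a single extension. The function name find_unreferenced is hypothetical, and the result is only meaningful when no bzr operation is running.

```python
from pathlib import Path


def find_unreferenced(repo_dir, referenced_names):
    """List files that the rule above would treat as unreferenced.

    repo_dir is the .bzr/repository directory. referenced_names is the
    set of names listed in pack-names, supplied by the caller (this
    sketch does not read the pack-names file itself).
    """
    repo_dir = Path(repo_dir)
    referenced = set(referenced_names)
    leftovers = []

    # Assumption: a file's name minus its extension is the pack name.
    for sub in ("packs", "indices"):
        directory = repo_dir / sub
        if directory.is_dir():
            for path in sorted(directory.iterdir()):
                if path.is_file() and path.stem not in referenced:
                    leftovers.append(path)

    # Everything in upload counts as leftover when nothing is running.
    upload = repo_dir / "upload"
    if upload.is_dir():
        leftovers.extend(sorted(p for p in upload.iterdir() if p.is_file()))

    return leftovers
```

Printing the returned paths and reviewing them by hand before removing anything seems the safer way to use this.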

John A Meinel (jameinel) said :
#2

I want to make it clear that these statements only really hold if there is no active commit/update/push/pull happening. During an active operation there will be files in .bzr/repository/upload/*. Deleting them should just abort the current action, though.

Stuart Bishop (stub) said :
#3

Thanks John A Meinel, that solved my question.