Tuesday, June 12, 2007

Database backup compression

Our first technical post is going to be about compressing text database backups. Insurance against the dreaded SQL injection attack or the power surge monster. If you'd rather skip all this incessant babble, scroll to the bottom for the conclusion.

We used to just use straight up gzip with no options. That seemed to work but what's really the best and how does that stack up with the latest/greatest stuff of the 21st century? (7z)

Well we decided it's about time to figure that part out. We tried this on a real database backup with real data. No better way is there?

How we got the backup in case you were wondering.

> mysqldump --opt --no-create-info --database ourdb

Now lets see how the compression stacks up. Remember that in a real company, we trade off space and computer power that it takes, we can't have the server doing dumps all night long.

Compression
TimeSizeFactor
none

3324MB
gzip
421s 993MB29.9%
gzip -9
950s974MB 29.3%
7z
7,876s
705MB
21.2%
7z -9
not
worth
testing
bzip2
1,417s
711MB
21.4%
bzip2 -9
1,531s
711MB
21.4%

Things to consider when looking at these numbers:
  • Server will be offline for this period of time.
  • This also affects slaving and the machine will have to "catch up" after.
  • Does it have other duties?
So it looks like gzip is pretty good with just the defaults. A -9 flag seems to more than double the cpu usage for very little savings. 7z is just way out of the question with an insane amount of computing necessary. Bzip2 seems alright with default compression settings but is more than 3x the computing time...

Conclusion: Just use gzip, default options... the same crap we used before...

what a pointless experiment.

No comments: