
2014/03/13

encfs

WARNING: The encryption of encfs is severely broken. Do not rely on it to keep anything secret.

So one can layer encfs on top of google-drive-ocamlfuse.

Here's how I set it up.

yum --enablerepo=epel install rlog-devel boost-devel
wget http://encfs.googlecode.com/files/encfs-1.7.4.tgz
tar zxvf encfs-1.7.4.tgz
cd encfs-1.7.4
./configure --prefix=/opt/encfs-1.7.4   \
        --with-boost-serialization=boost_serialization-mt \
        --with-boost-filesystem=boost_filesystem-mt

make all && sudo make install
sudo sh -c "echo /opt/encfs-1.7.4/lib >/etc/ld.so.conf.d/encfs-1.7.4.conf"  
sudo ldconfig 
for n in /opt/encfs-1.7.4/bin/encfs* ; do
    sudo ln -s $n /usr/local/bin 
done 

encfs ~/googledrive/Backup/Encoded ~/encfs

And now "all" I have to do is rsync -av /remote/pictures/ ~/encfs/Pictures/ --progress. And wait. A lot, given I'm getting roughly 12.43kB/s though this setup.

Backups in the cloud.

Given that 1 TB of storage from Google is now $10 a month, I had to look into doing cloud backups again.

Google doesn't have a native Linux client. So one has to use google-drive-ocamlfuse. Installing this on CentOS 6 is surprisingly complex. And Google Drive's auth mechanism is based around a web interface.

But once I got it working, it Just Worked. Or rather, I could copy small files to ~/googledrive, see them via the web interface, delete them there, and watch them disappear locally.

Of course you wouldn't leave unencrypted backups on Google's servers. I futzed around with gpg a bit, but maybe layering encfs on top of google-drive-ocamlfuse would be a better idea.
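
For comparison, the gpg route looks roughly like this (the archive name is just for illustration):

# encrypt the whole archive before it ever touches Google's servers
tar czf - /remote/pictures | gpg -c --cipher-algo AES256 -o ~/googledrive/pictures.tgz.gpg

It works, but it's all-or-nothing: any change means re-uploading one giant blob, whereas encfs encrypts file by file, so rsync can keep doing its incremental thing.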

My experimentation was cut short by supper, and by the fact that uploading a 400MB file was very slow :-) (sent 418536429 bytes received 31 bytes 219416.23 bytes/sec aka 1.6 Mbit/s)

For the record, here is how I installed google-drive-ocamlfuse:

yum install m4 libcurl-devel fuse-devel sqlite-devel zlib-devel \
    libzip-devel openssl-devel
curl -kL https://raw.github.com/hcarty/ocamlbrew/master/ocamlbrew-install \
    | env OCAMLBREW_FLAGS="-r" bash
source /home/fil/ocamlbrew/ocaml-4.01.0/etc/ocamlbrew.bashrc
opam init
eval `opam config env --root=/home/fil/ocamlbrew/ocaml-4.01.0/.opam`
opam install google-drive-ocamlfuse
sudo usermod -a -G fuse fil
google-drive-ocamlfuse 
google-drive-ocamlfuse ~/googledrive

The second to last command will open a browser to get an OAuth token. This means you need htmlview and a valid DISPLAY. The token is only good for 30 days. This is something that needs to be better automated.
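
A crude stopgap, which does nothing for the re-auth but at least catches a dropped mount, would be a cron entry along these lines:

# hypothetical crontab entry: remount every 15 minutes if the FUSE mount has gone away
*/15 * * * * mountpoint -q $HOME/googledrive || google-drive-ocamlfuse $HOME/googledrive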

2011/07/08

Fun with mysqldump

mysqldump makes for a quick and dirty way of backing up MySQL. Restore is trivial, if slow. However, it ties up the DB for the entire time it's running. Say you're like me and you have a table with 918,732,676 rows (yes, 9 hundred million rows). This means that the DB is tied up for 5 hours of dumping and 7 days for the restore. Clearly something better is needed.

However, before I work on integrating xtrabackup, I decided to see how slicing the dump with a LIMIT clause would work. mysqldump doesn't allow me to set LIMIT directly; however, it does zero validation on the WHERE clause I can set. So:
(

mysqldump --opt DB --ignore-table=DB.BIGTABLE      # everything except the big table
mysqldump --no-data DB BIGTABLE                    # schema only for the big table
MAX=$(mysql -NB DB -e 'select count(*) from BIGTABLE')
INC=$(( $MAX/1000 ))                               # roughly 1000 slices
seq -f '%.0f' 0 $INC $MAX | while read start ; do
    mysqldump --compact --no-create-info DB BIGTABLE --where "1 LIMIT $start,$INC"
done
) > /my/backup.sql

This only works because mysqldump blindly tacks the WHERE clause onto its SELECT statement, giving us (roughly)
SELECT * FROM BIGTABLE WHERE 1 LIMIT 0,918732

This is a hack and a half: a- while it works now (Percona 5.1.54), there's no guarantee it won't break at some point. But more importantly, b- this will still take a week to restore.
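
If I ever do restore from one of these dumps, the usual trick is to switch off the per-row bookkeeping while the INSERTs stream in. Untested here, but roughly:

# wrap the dump so the whole load runs as one transaction with the checks deferred
( echo 'SET autocommit=0; SET unique_checks=0; SET foreign_key_checks=0;'
  cat /my/backup.sql
  echo 'COMMIT;'
) | mysql DB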

BTW, black eyes to the idiots who made %g the default seq format.
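
To see why it matters here: %g keeps only six significant digits, so nine-digit row offsets come out in scientific notation and the LIMIT clause gets garbage. Compare:

printf '%g\n'   918732676     # 9.18733e+08
printf '%.0f\n' 918732676     # 918732676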

2010/10/20

A better backup

I do backups badly. Basically, rsync to a large partition somewhere. That's not really a backup. It protects against hardware failure, yes. But not against "oops! I deleted that file 3 weeks ago." What's more, I'm sure I'm not doing it as well as it could be; since I'm backing up to a hard disk anyway, why not back up the entire OS and make the disk bootable? That would be complicated if multiple machines back up to one backup server, but for my clients I most often have one server backing up to one set of removable disks.
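
One way to get the "oops, three weeks ago" protection without burning much more disk would be rsync's --link-dest, which hard-links unchanged files against the previous snapshot. A minimal sketch, with made-up paths:

# each run gets its own dated directory; unchanged files are hard links into the previous one
TODAY=/backup/$(date +%F)
rsync -a --delete --link-dest=/backup/latest /home/ "$TODAY"/
rm -f /backup/latest
ln -s "$TODAY" /backup/latest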

What's more, I moved all my VMs from Jimmy to George yesterday. When I say "move VM" I should say "moved all the services to new CentOS 5 VMs." Which sort of shows up another problem: keeping track of what you've set up where and why. Jimmy had lighttpd running on it. Why? Oh... to see the RRD graphs of the temperature and humidity in the attic. I should document all this, now that I "know it" but ideally it should be automated.

And conformance tests: a bit like unit tests, you run some scripts to see if everything in the new install is working as expected. After all was done, I realised that I hadn't copied over my subversion repositories, nor set them up.
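
Something as dumb as the following would have caught the missing subversion repositories; the paths and services here are hypothetical, just to show the shape of it:

#!/bin/sh
# crude conformance check: complain loudly about anything that has gone missing
RC=0
fail() { echo "FAIL: $*" >&2; RC=1; }
curl -sf http://localhost/ >/dev/null   || fail "lighttpd not answering on port 80"
[ -d /var/svn/repos ]                   || fail "subversion repositories missing"
exit $RC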

One central issue, I suppose, is config files. Ideally, you just copy in the backed-up config file, start the service, run the test script, and verify success. I notice that rpm provides a --configfiles option. Combined with rpm's verify options, maybe one could detect which config files have changed and keep a backup set of them. Of course, things like /var/spool/hylafax/etc/config.ttyS0 would have to be added by hand. As would stuff installed by hand into /opt and/or /usr/local.
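
The detection half could be as simple as letting rpm verify everything and keeping whatever config files it flags as modified; a rough sketch (the archive path is made up):

# a 'c' in the second column of rpm -Va output marks a config file that differs from the package
rpm -Va 2>/dev/null | awk '$2 == "c" {print $NF}' > /tmp/changed-configs
tar czf /root/changed-configs-$(date +%F).tgz -T /tmp/changed-configs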

And a modified config file implies that the package is being used, so the package would get flagged as important. And then, maybe once a week say, you'd get email "hey, you don't have a conformance script for package X." Or "You didn't write a changelog for the latest changes to file Y."