sakana

very short memo

back up files

Here is another approach to backing up files to Dropbox on a regular basis with cron. The script is quite simple, but here are some remarks.

hostname

There are several ways to obtain the hostname; here I use socket.gethostname(), which appears to be a thin wrapper around the gethostname() system call.
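
For comparison, here is a minimal sketch (assuming Python 3 on a Unix-like host) that collects the hostname from three standard-library calls. The function name hostname_candidates is just an illustration and is not used by the backup script. The three values usually agree, but I would not rely on that.

```python
import os
import platform
import socket


def hostname_candidates():
    """Return the hostname as reported by three standard-library calls."""
    return {
        # the call used by the backup script
        "socket.gethostname": socket.gethostname(),
        # may return an empty string if the name cannot be determined
        "platform.node": platform.node(),
        # Unix only: the nodename field of uname(2)
        "os.uname": os.uname().nodename,
    }


for source, value in hostname_candidates().items():
    print(source, "->", value)
```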

compression

Here I use bzip2 as the compression algorithm, which usually achieves a better compression ratio than alternatives like gzip on files such as plain text. The trade-off for the better ratio is that compression takes longer.
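
If you want to check this trade-off on your own data, a rough sketch like the following may help (Python 3 assumed). The helper name compare_compression and the sample text are made up for illustration. Results depend heavily on the input, so treat the numbers as a hint rather than a benchmark.

```python
import bz2
import gzip
import time


def compare_compression(data):
    """Compress data with bzip2 and gzip; return sizes and elapsed seconds."""
    result = {"original": len(data)}
    for name, compress in (("bzip2", bz2.compress), ("gzip", gzip.compress)):
        start = time.perf_counter()
        packed = compress(data)
        result[name] = len(packed)
        result[name + "_seconds"] = time.perf_counter() - start
    return result


# made-up sample: repetitive plain text
sample = b"very short memo about backing up files\n" * 2000
for key, value in compare_compression(sample).items():
    print(key, value)
```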

temporary file

I first create the backup file under a temporary directory, and then copy it into the directory under Dropbox only if something has changed. Creating the backup file directly under the Dropbox directory is quite slow due to sync traffic.

checksum

Finally I compare the temporary backup file with the existing one by the MD5 checksum of each file. If they do not match, some file(s) under the target directory may have changed, so the new backup replaces the old one. If they are identical, nothing has changed under the target directory, so the script does nothing.
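
The comparison step can be sketched in isolation as follows (Python 3 assumed). The helper names file_md5 and needs_replacement are hypothetical and do not appear in the script. Unlike the script, this variant reads each file in chunks, which I assume is kinder to memory when the archive is large.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in binary chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_replacement(temp_file, dest_file):
    """True if dest_file is missing or its checksum differs from temp_file."""
    if not os.path.exists(dest_file):
        return True
    return file_md5(temp_file) != file_md5(dest_file)
```

With these helpers, the last part of a backup script could be reduced to a single "if needs_replacement(temp_file, dest_file): shutil.copy(temp_file, dest_dir)".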

import datetime
import hashlib
import os
import os.path
import shutil
import socket
import tarfile

if __name__ == "__main__":

    # base directory, which is one level upper than target directory
    base      = "<BASE_DIRECTORY_OF_YOUR_CHOICE>"
    # backup target directory under base directory
    target    = "<BACK_UP_TARGET_DIRECTORY>"
    # directories to exclude from backup file
    exclude   = ["<DIRECTORY_TO_EXCLUDE>"]

    # backup file name / HOSTNAME_DAY.tar.bz2
    backup    = socket.gethostname() + "_" +\
                datetime.date.today().strftime("%A") +\
                ".tar.bz2"
    # temporary directory & temporary file
    temp_dir  = "/tmp"
    temp_file = os.path.join(temp_dir, backup)
    # destination directory
    dest_dir  = os.path.join(base, "Dropbox/<TARGET_DIRECTORY>")
    dest_file = os.path.join(dest_dir, backup)

    # let's start backup

    # first create a backup file under the temporary directory
    tar = tarfile.open(temp_file, "w:bz2")

    os.chdir(base)
    for root, dirs, files in os.walk(target):
        for name in files:
            for d in root.split('/'):  
                if d in exclude:
                    break
            else:
                tar.add(os.path.join(root, name))
    
    tar.close()

    # backup file creation has finished
    # now determine if file replacement is required or not
    if os.path.exists(dest_file):
        # dictionary to store md5 checksum of each file
        md5 = {}
        for f in temp_file, dest_file:
            # open in binary mode so the digest covers the raw bytes
            with open(f, "rb") as file_to_check:
                data = file_to_check.read()
                md5[f] = hashlib.md5(data).hexdigest()
        # compare checksum value of each file
        # copy temporary file under destination directory
        if md5[temp_file] != md5[dest_file]:
            shutil.copy(temp_file, dest_dir)
    else:
        # somehow backup of same name does not exist, do copy
        shutil.copy(temp_file, dest_dir)
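
As an aside, the exclusion loop in the script could also be expressed with the filter argument of TarFile.add. The following is only a sketch under that assumption (Python 3). The function name make_backup is hypothetical, and the behavior differs slightly from the script: it also stores directory entries, and it skips any path component (file or directory) whose name is in the exclude list.

```python
import os
import tarfile


def make_backup(base, target, archive_path, exclude=()):
    """Archive base/target as a .tar.bz2 file, skipping excluded names."""
    excluded = set(exclude)

    def skip_excluded(tarinfo):
        # returning None drops the entry from the archive
        if excluded.intersection(tarinfo.name.split("/")):
            return None
        return tarinfo

    with tarfile.open(archive_path, "w:bz2") as tar:
        # arcname keeps paths relative to base, like os.chdir(base) does
        tar.add(os.path.join(base, target), arcname=target,
                filter=skip_excluded)
    return archive_path
```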

If you would like to take a backup, say, every hour, add a line like the following to your crontab.

$ crontab -l
0 * * * * python <PATH_TO_SCRIPT>/backup.py

x86 or x64?

I have to confess that I did not know that the CPU (AMD E2-1800) of my tiny PC (ThinkPad Edge E135) is x64 capable...

You can check whether your CPU is x64 capable by looking for the lm flag (long mode).

$ grep ^flags /proc/cpuinfo
flags               : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt arat hw_pstate npt lbrv svm_lock nrip_save pausefilter

If lm appears in the flags line, the CPU supports the x86_64 architecture. After reinstalling the OS, I now have an x64 OS at hand!

$ arch
x86_64
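
The lm check can also be scripted. Here is a small sketch (Python 3 assumed) that parses the text of /proc/cpuinfo and tests for the exact flag. The function names cpu_flags and is_x64_capable are made up. Splitting the line into a set avoids being fooled by other flags that merely contain "lm", such as lahf_lm.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            return set(value.split())
    return set()


def is_x64_capable(cpuinfo_text):
    """True if the long mode (lm) flag is present."""
    return "lm" in cpu_flags(cpuinfo_text)
```

On a Linux host you would pass in the content of /proc/cpuinfo, for example with open("/proc/cpuinfo").read().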

So easy... OK, let’s create an lxc instance on this reborn platform.

The instance's timezone seems to be EDT, which is not convenient for us. Let us change the timezone to JST (Asia/Tokyo). You can do so by making /etc/localtime point to the timezone data for Japan.

$ file /usr/share/zoneinfo/Asia/Tokyo
/usr/share/zoneinfo/Asia/Tokyo: timezone data, version 2, 3 gmt time flags, 3 std time flags, no leap seconds, 9 transition times, 3 abbreviation chars

$ sudo ln -sf /usr/share/zoneinfo/Asia/Tokyo /etc/localtime
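
To confirm which zone the symlink ends up pointing at, something like the following sketch may be handy (Python 3 assumed). The function name zone_from_localtime is hypothetical. It only works when /etc/localtime is a symlink into the zoneinfo directory, and because it resolves symlinks all the way, it may report an alias target instead of the name you linked to on systems where the zone file itself is a symlink.

```python
import os


def zone_from_localtime(localtime="/etc/localtime",
                        zoneinfo_root="/usr/share/zoneinfo"):
    """Return the zone name behind the localtime symlink, or None."""
    target = os.path.realpath(localtime)
    root = os.path.realpath(zoneinfo_root)
    prefix = root + os.sep
    if target.startswith(prefix):
        # e.g. /usr/share/zoneinfo/Asia/Tokyo -> Asia/Tokyo
        return target[len(prefix):]
    # a regular file copy (not a symlink) ends up here
    return None
```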

static IP address assignment

Let us configure the network-related properties of our DNS server host.

hostname

The hostname of the host can be configured by editing the /etc/hostname file.

static IP address

By default, hosts are configured to obtain an IP address from a DHCP server. The configuration is stored in the /etc/network/interfaces file.

auto eth0
iface eth0 inet dhcp

Replace the above with the following so that the IP address 10.0.3.10 is statically assigned to interface eth0.

auto eth0
iface eth0 inet static
address 10.0.3.10
network 10.0.3.0
netmask 255.255.255.0
broadcast 10.0.3.255
gateway 10.0.3.1
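
The network, netmask and broadcast values above follow from the address and a /24 prefix, so they can be derived rather than typed by hand. Here is a minimal sketch using the ipaddress module (Python 3 assumed). The function name static_stanza is made up, and the /24 prefix length is my reading of the 255.255.255.0 netmask.

```python
import ipaddress


def static_stanza(address, prefix_len, gateway, iface="eth0"):
    """Render a static stanza for /etc/network/interfaces as text."""
    net = ipaddress.ip_interface(f"{address}/{prefix_len}").network
    # a gateway outside the subnet is almost certainly a typo
    if ipaddress.ip_address(gateway) not in net:
        raise ValueError("gateway is not inside " + str(net))
    return "\n".join([
        f"auto {iface}",
        f"iface {iface} inet static",
        f"address {address}",
        f"network {net.network_address}",
        f"netmask {net.netmask}",
        f"broadcast {net.broadcast_address}",
        f"gateway {gateway}",
    ])


print(static_stanza("10.0.3.10", 24, "10.0.3.1"))
```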

resolver

As its own header warns, the /etc/resolv.conf file is generated automatically and will be overwritten upon system reboot.

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN

You need to configure the IP addresses of the DNS servers and the search domain in the /etc/network/interfaces file as well.

dns-search example.org
dns-nameservers 10.0.3.10 10.0.3.1 8.8.8.8

resolvconf then generates lines like the following in /etc/resolv.conf.

nameserver 10.0.3.10
nameserver 10.0.3.1
nameserver 8.8.8.8
search example.org
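
To verify that the generated file contains what you configured, a tiny parser like the following sketch could be used (Python 3 assumed). The function name parse_resolv_conf is hypothetical, and it only understands the nameserver and search keywords shown above.

```python
def parse_resolv_conf(text):
    """Return (nameservers, search_domains) parsed from resolv.conf text."""
    nameservers = []
    search = []
    for raw in text.splitlines():
        line = raw.strip()
        # skip blank lines and comments
        if not line or line.startswith("#") or line.startswith(";"):
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]
    return nameservers, search
```

On the host itself you would feed it the content of /etc/resolv.conf and compare the result with the dns-nameservers and dns-search values.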

Following entry is a good reference.