Backing up Emails using OfflineIMAP

2020-04-19

A previous post explained how to back up and migrate emails using the graphical mail client Thunderbird. This post focuses on backing up emails with OfflineIMAP, a command-line tool written in Python. OfflineIMAP is just one of several tools for this task. The mail retrieval agent getmail, which is also written in Python, is an alternative.

To get started, install OfflineIMAP with your package manager if available, or build it from source.

To configure OfflineIMAP to fetch emails for local storage, you can use the following configuration as a starting point. It includes sample settings for the popular email service providers Outlook and Gmail. The configuration file lives at ~/.offlineimaprc:

[general]
accounts = outlook, gmail
ui = TTYUI
pythonfile = ~/.offlineimap.py
fsync = False

[Account outlook]
localrepository = outlook-local
remoterepository = outlook-remote

[Repository outlook-local]
type = Maildir
localfolders = ~/mail/outlook/
sync_deletes = no

[Repository outlook-remote]
type = IMAP
remotehost = outlook.office365.com
remoteport = 993
ssl = yes
sslcacertfile = /usr/local/etc/openssl/cert.pem
readonly = True
# https://www.offlineimap.org/doc/FAQ.html#exchange-and-office365
folderfilter = lambda folder: folder not in [ 'Calendar', 'Calendar/Birthdays', 'Calendar/Sub Folder 1', 'Calendar/Sub Folder 2', 'Calendar/United States holidays', 'Contacts', 'Contacts/Sub Folder 1', 'Contacts/Sub Folder 2', 'Contacts/Skype for Business Contacts', 'Deleted Items', 'Drafts', 'Journal', 'Junk Email', 'Notes', 'Outbox', 'Sync Issues', 'Sync Issues/Conflicts', 'Sync Issues/Local Failures', 'Sync Issues/Server Failures', 'Tasks', 'Tasks/Sub Folder 1', 'Tasks/Sub Folder 2', ]
remoteuser = my-outlook-email@example.com
remotepasseval = get_keychain_pass(account="my-outlook-email@example.com", server="outlook.office365.com")

[Account gmail]
localrepository = gmail-local
remoterepository = gmail-remote

[Repository gmail-local]
type = GmailMaildir
localfolders = ~/mail/gmail/
sync_deletes = no

[Repository gmail-remote]
type = Gmail
ssl = yes
sslcacertfile = /usr/local/etc/openssl/cert.pem
readonly = True
folderfilter = lambda folder: folder not in ['[Gmail]/Trash', '[Gmail]/Spam', '[Gmail]/All Mail',]
remoteuser = my-gmail-email@example.com
remotepasseval = get_keychain_pass(account="my-gmail-email@example.com", server="imap.gmail.com")

Make sure to update at least the remote* settings and to create the localfolders directories on your disk. For details on the configuration settings, check out the official OfflineIMAP documentation.
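
Creating the localfolders directories can also be scripted. The following is a minimal sketch, assuming the configuration file can be read with Python's standard configparser module; the helper names local_folders and create_local_folders are made up for this example and are not part of OfflineIMAP:

```python
import configparser
import os


def local_folders(config_text):
    """Return the expanded localfolders path of every section that sets one."""
    # Interpolation is disabled so that '%' characters in values are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return [
        os.path.expanduser(parser[section]["localfolders"])
        for section in parser.sections()
        if parser.has_option(section, "localfolders")
    ]


def create_local_folders(config_path="~/.offlineimaprc"):
    """Create each localfolders directory named in the configuration file."""
    with open(os.path.expanduser(config_path)) as handle:
        folders = local_folders(handle.read())
    for folder in folders:
        # Restrict access to the current user; existing directories are left alone.
        os.makedirs(folder, mode=0o700, exist_ok=True)
    return folders
```

Calling create_local_folders() without arguments reads ~/.offlineimaprc and returns the list of directories it made sure exist.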

Although the configuration above sets readonly = True for the remote repositories and sync_deletes = no for the local ones, it's important to note that backing up IMAP mailboxes safely is inherently difficult. For example, if the IMAP server claims that the remote mailbox is empty, the messages in the local backup mailbox will be deleted as well. To guard against this case, you can additionally set sync_deletes = no in the [Repository *-remote] sections (see the documentation for this setting).
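
Before the first sync, it can be worth checking that these safeguards are really present in the configuration. The sketch below is one possible way to do so with Python's standard configparser module. The function name missing_safeguards is hypothetical, and the fallback values encode the assumption that an unset readonly means False and an unset sync_deletes means yes:

```python
import configparser


def missing_safeguards(config_text):
    """List (section, option) pairs for remote repositories lacking a safeguard."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    problems = []
    for section in parser.sections():
        if not section.startswith("Repository "):
            continue
        # Only the remote repository types used in this post are inspected.
        if parser.get(section, "type", fallback="") not in ("IMAP", "Gmail"):
            continue
        # Assumption: readonly is off unless set explicitly.
        if not parser.getboolean(section, "readonly", fallback=False):
            problems.append((section, "readonly"))
        # Assumption: sync_deletes is on unless set explicitly.
        if parser.getboolean(section, "sync_deletes", fallback=True):
            problems.append((section, "sync_deletes"))
    return problems
```

An empty list means every remote repository section sets both readonly = True and sync_deletes = no.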

While it's possible to store your email password in the configuration file, for security reasons it's better to keep it in a password manager or in the Keychain Access application of macOS. Steve Losh provides instructions on how to store it in Keychain Access and retrieve it with a script stored at ~/.offlineimap.py, which ~/.offlineimaprc imports via the pythonfile setting above. The script provides the get_keychain_pass function that the remotepasseval settings in ~/.offlineimaprc call to retrieve the passwords from Keychain Access. It may look as follows:

#!/usr/bin/env python

import os
import re
import subprocess

# Based on script by Steve Losh: http://stevelosh.com/blog/2012/10/the-homely-mutt/#retrieving-passwords

def get_keychain_pass(account=None, server=None):
    """Return the password stored in the login keychain for account and server."""
    keychain = os.path.join(os.environ['HOME'], 'Library', 'Keychains', 'login.keychain-db')
    # Pass the arguments as a list (no shell) so that special characters in
    # the account or server name cannot be interpreted by a shell.
    command = [
        'sudo', '-u', os.environ['USER'],
        '/usr/bin/security', '-v', 'find-internet-password',
        '-g', '-a', account, '-s', server, keychain,
    ]
    # Merge stderr into stdout so that the password line is captured.
    output = subprocess.check_output(command, stderr=subprocess.STDOUT)
    # check_output returns bytes on Python 3, so decode before matching.
    lines = output.decode('utf-8').splitlines()
    outtext = [l for l in lines if l.startswith('password: ')][0]

    return re.match(r'^password:\s+(?:[^"\s]\s+)?"(.*)"\s*$', outtext).group(1)
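
To see what the regular expression in the script extracts, it can be tried on sample text without touching the keychain. The following is a small sketch under assumptions: the sample output is invented for illustration and only mimics the password line the script looks for, and extract_password is a hypothetical helper that returns None instead of raising an error when no such line is found:

```python
import re

# Same pattern as in the script above.
PASSWORD_LINE = re.compile(r'^password:\s+(?:[^"\s]\s+)?"(.*)"\s*$')


def extract_password(output_text):
    """Return the quoted password from the first matching line, or None."""
    for line in output_text.splitlines():
        match = PASSWORD_LINE.match(line)
        if match:
            return match.group(1)
    return None


# Invented sample text, not real output of the security tool.
sample_output = 'class: "inet"\npassword: "s3cr3t"\n'
print(extract_password(sample_output))
```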

After everything is configured as desired, run the command offlineimap to initiate the email retrieval process.

Finally, to read your emails (it's a good idea to always verify that your backup works), you can open your backup mailboxes using a mail client such as NeoMutt in read-only mode:

# Outlook mailbox:
neomutt -R -f ~/mail/outlook/INBOX/

# Gmail mailbox:
neomutt -R -f ~/mail/gmail/INBOX/
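
The backup can also be checked from a script. The sketch below uses the mailbox module from the Python standard library to count the messages in a Maildir folder and collect their subjects. The function name summarize_maildir is made up for this example, and the path in the usage note assumes the localfolders setting from the configuration above:

```python
import mailbox


def summarize_maildir(path):
    """Return the message count and the Subject headers of a Maildir folder."""
    # create=False raises an error instead of creating a missing folder.
    box = mailbox.Maildir(path, create=False)
    try:
        subjects = [message.get("Subject", "") for message in box]
    finally:
        box.close()
    return len(subjects), subjects
```

For example, summarize_maildir(os.path.expanduser("~/mail/gmail/INBOX")), after importing os, should report as many messages as the inbox holds on the server if the backup is complete.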