Managing Web Files through a Repository

2013-08-17

Handling web page files with a source code management tool such as git has many advantages, including the ability to easily revert changes and to manage multiple branches. Often people deploy their files to the web server via a different tool such as rsync, SCP, or FTP. This causes extra effort in general and can be cumbersome with SCP and FTP (which files have changed since the last deployment and need to be uploaded?). SCP and FTP are also ill-suited to large files, because they cannot transfer only the changed parts of a file to the server. FTP should be avoided altogether, since it transmits credentials and files in cleartext. Instead, it is desirable to deploy the files to the web server directly with the source code management tool. This blog post explains how to achieve that goal using git.

In a non-bare git repository the folder .git contains the history of changes. We want to have that folder on the server in order to have a copy of our repository, but make it inaccessible over the web to prevent visitors from obtaining the site's source code. Only the files in the working tree should be browsable. User kan proposes a simple solution to this problem on Stack Overflow. His idea is to put the web files in a subdirectory of the git repository on the server and let the web server reference that subdirectory only. Since web servers must prevent upwards directory traversal, the .git folder is invisible to web clients. Another idea is to keep the web files directly in the root of the git repository and to prevent access to .git by deploying an .htaccess file. BozKay provides a corresponding directive in his Stack Overflow post.
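
The subdirectory layout can be sketched as follows. This is only an illustration under assumptions: the directory names site and htdocs are hypothetical, and a scratch directory stands in for the real location on the server. The web server's document root would point at site/htdocs, so site/.git lies outside of it.

```shell
#!/bin/sh
set -e

# Scratch directory standing in for the server (hypothetical location).
base=$(mktemp -d)
repo="$base/site"        # root of the git repository, contains .git
docroot="$repo/htdocs"   # the only directory the web server should serve

git init -q "$repo"
mkdir -p "$docroot"
echo '<h1>Hello</h1>' > "$docroot/index.html"

# .git sits next to htdocs, not inside it.
ls -A "$repo"
```

With this layout the web files are tracked as htdocs/index.html and so on, while the history stays one level above the document root.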

In order for this to work, the server repository must have a working tree containing the web files and therefore cannot be bare. A git push to a non-bare repository updates the history, but does not touch the working tree, which may itself have been modified on the server. Furthermore, by default the server refuses a push to the branch that is currently checked out. With the following command, run inside the server repository, it can be forced to accept pushed changes regardless of the state of the working tree:

$ git config receive.denyCurrentBranch warn

Although git push is then able to push changes to the server, the working tree on the server will remain as is. Running

$ git reset --hard HEAD

in the server repo will update the working tree, thereby discarding any uncommitted changes made on the server.
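
The behaviour described above can be tried out safely in a scratch directory. The following is a minimal sketch, assuming a git version that supports the -C option; the directory names server and dev are hypothetical stand-ins for the repository on the web server and a local clone.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)
server="$base/server"   # stands in for the non-bare repo on the web server
dev="$base/dev"         # stands in for the local clone

# Server side: a non-bare repository with one committed page.
git init -q "$server"
git -C "$server" config user.name "Example"
git -C "$server" config user.email "example@example.org"
echo 'v1' > "$server/index.html"
git -C "$server" add index.html
git -C "$server" commit -q -m 'Initial page'
git -C "$server" config receive.denyCurrentBranch warn

# Local side: clone, change the page, push it back.
git clone -q "$server" "$dev"
git -C "$dev" config user.name "Example"
git -C "$dev" config user.email "example@example.org"
echo 'v2' > "$dev/index.html"
git -C "$dev" commit -q -am 'Update page'
git -C "$dev" push -q origin HEAD

# The push updated the history only; the working tree still holds v1.
cat "$server/index.html"

# Bring the server's working tree in line with the pushed commit.
git -C "$server" reset -q --hard HEAD
cat "$server/index.html"
```

The first cat shows the old content even though the push succeeded, which is exactly the gap that the reset closes.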

Abhijit Menon-Sen came up with a cleaner solution. The first step is to set up a bare repository on the server to which the changes can be safely pushed. This can be done by creating a directory, whose name by convention ends in .git, and executing the following command inside of it:

$ git init --bare

The second step is to create a hook file that updates the directory containing the web files whenever files are pushed to the bare repo. In the bare repo, add the file hooks/post-receive with the following content:

#!/bin/sh
GIT_WORK_TREE=/var/www/www.example.org git checkout -f

Make the hook executable by running:

$ chmod +x hooks/post-receive

Here /var/www/www.example.org is the directory with the web files that the web server provides to browsers. Now a git push to the server updates the bare repository and subsequently the web files. Note that a push will overwrite the web files on the server with your copy. If the web files were modified on the server, the changes will be lost. Therefore you must make sure that you only edit the files in your local git repository.
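
The whole setup can be rehearsed in a scratch directory before touching a real server. The following is a sketch under assumptions: the directory www stands in for /var/www/www.example.org, the remote name web and the directory dev are hypothetical, and the bare repository is reached through a local path rather than over SSH.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)
bare="$base/www.example.org.git"   # bare repository that receives pushes
webroot="$base/www"                # stands in for /var/www/www.example.org
dev="$base/dev"                    # stands in for the local repository

mkdir -p "$webroot"
git init -q --bare "$bare"

# Install the post-receive hook; $webroot is expanded while writing the file.
cat > "$bare/hooks/post-receive" <<EOF
#!/bin/sh
GIT_WORK_TREE="$webroot" git checkout -f
EOF
chmod +x "$bare/hooks/post-receive"

# Local side: create a page and register the bare repo as remote "web".
git init -q "$dev"
git -C "$dev" config user.name "Example"
git -C "$dev" config user.email "example@example.org"
git -C "$dev" remote add web "$bare"
echo '<h1>Hello</h1>' > "$dev/index.html"
git -C "$dev" add index.html
git -C "$dev" commit -q -m 'Add index page'

# Push to the branch the bare repo's HEAD names, since that is what
# "git checkout -f" in the hook checks out.
branch=$(git -C "$bare" symbolic-ref --short HEAD)
git -C "$dev" push -q web "HEAD:refs/heads/$branch"

# The hook has populated the web directory.
cat "$webroot/index.html"
```

On a real server the remote would typically be an SSH URL pointing at the bare repository, and the push command stays the same.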

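Always ensure that the file owner and permissions of the whole bare repo and entire web directory are as desired. The hook runs as the user who pushes, so that user needs write access to both, and the web server needs read access to the web directory.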
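
One possible check, sketched here in a scratch directory, is to list files that anyone on the system may write to. The file names are made up for the illustration; on a server the same find command would be pointed at the bare repo and the web directory.

```shell
#!/bin/sh
set -e

# Scratch directory standing in for the web directory (hypothetical files).
webroot=$(mktemp -d)
echo 'ok' > "$webroot/index.html"
echo 'oops' > "$webroot/upload.html"
chmod 644 "$webroot/index.html"    # owner may write, everyone may read
chmod 666 "$webroot/upload.html"   # writable by everyone: usually undesired

# List regular files that are writable by "others".
loose=$(find "$webroot" -type f -perm -002)
echo "$loose"
```

Only upload.html is reported here. What counts as "desired" depends on the setup, so the mode tested for is merely an example.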