Django, Whitenoise, and Heroku
tl;dr Heroku’s filesystem is ephemeral, so be careful when running the collectstatic command - it doesn’t necessarily do what you think it does.
Our current production asset delivery setup looks like this:
- Run grunt locally to generate static files
- Commit changes to repo
- Push to Heroku
- Run collectstatic to push static files to S3
- Configure CloudFront to use S3 as its origin server
It’s a PITA, and we have had persistent issues with CloudFront holding on to cached files post-deployment. We’ve been using cache-busting querystring values, but haven’t been able to configure CloudFront to behave the way we want it to.
The important point (wrt this article) is that we call collectstatic ourselves, post-deployment, through our deployment tool, and do not rely on the default Heroku buildpack to run it for us. We do this because a full push to S3 can take a couple of minutes, and by controlling it ourselves we can elect not to run collectstatic if we know that no files have changed. The related point of significance is that the collectstatic process pushes the files to S3, i.e. off Heroku. There is a hidden gotcha in this process.
Because of persistent caching and purging issues with our CloudFront / S3 setup, and the fact that we are a low traffic, high session (i.e. few drive-by users) site, this setup causes more trouble than it solves, and so we are migrating to a simpler configuration based on WhiteNoise. WhiteNoise handles static asset serving direct from the Django app:
With a couple of lines of config WhiteNoise allows your web app to serve its own static files, making it a self-contained unit that can be deployed anywhere without relying on nginx, Amazon S3 or any other external service. (Especially useful on Heroku, OpenShift and other PaaS providers.)
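For reference, the "couple of lines of config" might look something like the fragment below. This is a minimal sketch based on the middleware integration found in recent WhiteNoise releases, not a copy of our own settings; BASE_DIR and the staticfiles directory name are assumptions, chosen to match the /app/staticfiles path that Heroku ends up with.

```python
# settings.py (fragment) - an illustrative sketch, not our production settings
from pathlib import Path

# Assumption: settings.py lives one level below the project root,
# e.g. /app/myproject/settings.py once Heroku has unpacked the app to /app.
BASE_DIR = Path(__file__).resolve().parent.parent

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise is placed directly after SecurityMiddleware
    "whitenoise.middleware.WhiteNoiseMiddleware",
    # ... the rest of the project's middleware goes here ...
]

# URL prefix under which static files are served
STATIC_URL = "/static/"

# Directory that collectstatic copies files into, and that WhiteNoise serves from
STATIC_ROOT = BASE_DIR / "staticfiles"
```

The setting that matters for the rest of this story is STATIC_ROOT: WhiteNoise can only serve what is physically present in that directory on the dyno.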
Once the core migration work was complete, and working locally, we pushed to Heroku to see how it fared. We used our own deployment app, which ran collectstatic as it always does, but when the site came up, there were no static assets to be seen. Literally.
The next step was to see if the files were there at all. We ssh’d into the environment, and ran ls /app/staticfiles/ to find an empty directory. Next step was to re-run collectstatic from within the ssh session. It ran, and output all the files exactly as expected. ls this time showed all the static files. Reload the site - still no static assets. Exit out of the ssh session, log back in, and ls now shows an empty directory. Cue head-scratching.
The problem is that the Heroku filesystem is ephemeral. They make no secret of this (Ephemeral filesystem), but because we had previously pushed our static content to S3, this is something we’d never come across - we’d never actually looked at the filesystem. But it still seemed perplexing that the staticfiles directory could be empty, as running collectstatic to move files into a single common static root is standard practice, and WhiteNoise itself is in Heroku’s own Django project template. How could this not be working, and where were our files?
The answer lies in the Heroku Buildpack, and the concept of the application slug:
Slugs are compressed and pre-packaged copies of your application optimized for distribution to the dyno manager. When you git push to Heroku, your code is received by the slug compiler which transforms your repository into a slug. Scaling an application then downloads and expands the slug to a dyno for execution.
The solution is to get your static files into the slug before it’s copied to its final destination. This means that you need to run collectstatic as part of the slug compilation process, which is where it runs by default. Running it post-compilation, as we have always done, writes the files to a dyno’s ephemeral filesystem rather than into the slug, so they are gone the next time a dyno is started from the slug. We didn’t notice this because we weren’t relying on the filesystem; we were pushing to S3.
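A cheap way to catch this earlier is to check, when the app starts, that the static root actually contains files. The sketch below is one way that could look; static_root_is_populated is a hypothetical helper name, not part of Django or WhiteNoise, and the demo uses a temporary directory in place of /app/staticfiles.

```python
import tempfile
from pathlib import Path


def static_root_is_populated(static_root):
    """Return True if static_root exists and contains at least one file."""
    root = Path(static_root)
    if not root.is_dir():
        return False
    # collectstatic keeps each app's subdirectories, so search recursively
    return any(path.is_file() for path in root.rglob("*"))


if __name__ == "__main__":
    # Demo: a temporary directory stands in for STATIC_ROOT
    with tempfile.TemporaryDirectory() as tmp:
        print(static_root_is_populated(tmp))  # False: directory is empty

        css_dir = Path(tmp) / "css"
        css_dir.mkdir()
        (css_dir / "site.css").write_text("body {}")
        print(static_root_is_populated(tmp))  # True: one file present
```

Calling something like this with settings.STATIC_ROOT at startup and logging a warning on False would have pointed us at the empty directory straight away, instead of leaving us to find it over ssh.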
In conclusion - if you are running collectstatic with WhiteNoise (or any other process that relies on moving files around the local filesystem), you must not disable the built-in buildpack support, i.e. you must remove DISABLE_COLLECTSTATIC altogether. Irrespective of its value, its existence alone is enough to disable the step. We thought that setting DISABLE_COLLECTSTATIC=0 would re-enable collectstatic; it doesn’t, which only added to the confusion.
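To make the gotcha concrete, here is a small Python model of the behaviour as we observed it. It is not the buildpack's own code, and collectstatic_disabled is a hypothetical name; the point is only that the check is on the variable's presence, not on its value.

```python
import os


def collectstatic_disabled(environ=None):
    """Model of the observed behaviour: if DISABLE_COLLECTSTATIC is present
    in the environment at all, collectstatic is skipped, whatever its value."""
    if environ is None:
        environ = os.environ
    return "DISABLE_COLLECTSTATIC" in environ


if __name__ == "__main__":
    print(collectstatic_disabled({"DISABLE_COLLECTSTATIC": "1"}))  # True
    print(collectstatic_disabled({"DISABLE_COLLECTSTATIC": "0"}))  # True
    print(collectstatic_disabled({}))                              # False
```

On that model, the fix is to unset the variable rather than change its value, e.g. with heroku config:unset DISABLE_COLLECTSTATIC.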