Building an app on Heroku? Here is a production-ready checklist.

Heroku is a convenient hosting solution for your web application. It provides a great deal of automated infrastructure so that you can deploy with a single command, along with add-on resources such as databases and cache servers. Here is a nice article about getting started with deploying an application on Heroku.

When you initially get your application running, you aren't very likely to forget about the database. After all, your application probably won't run without it. If you are feeling proactive, you might set up some caching such as Memcached or Redis. But there is a lot more to worry about when you go from a proof-of-concept application to something running in production with customers and revenue depending on it. Here is an incomplete checklist to see if your app is ready for production.

  1. A staging environment. It's great that you can deploy code in a single command, but it's worth registering a second Heroku application and making that your staging environment so that you can do more in-depth testing of major changes before deploying them to production. Even better, sign up for a CI solution such as Tddium, and have it auto-deploy to your staging environment once unit tests pass. Don't forget about cross-browser and cross-device testing for the most critical parts of your site - Sauce Labs is great for this and can be wired right into your test suite.
  2. Storing logs. The heroku logs command is very helpful, but those logs aren't kept for very long. You need something like Logentries or Papertrail to store your web logs for a longer period of time.
  3. Tracking exceptions. Your application will have bugs. What do you do when the logs show a bunch of 500 errors? You need application exception tracking and notification, which lets you monitor for crashes in your app and fix them. Most crashes are never reported by customers, so you need to find out about them yourself. I like Exceptional.
  4. Sending email. When you try to do your first password reset, you'll quickly realize that the servers need to be able to send email. Transactional emails are different from marketing/list emails, and you need to be able to get reports on which emails were delivered and which had problems. SendGrid has worked very well for me for transactional emails, and Mailchimp for newsletter-style emails.
  5. Restores. Notice how I didn't say backups? You need to regularly test that you can restore your database(s) for use in a different deployment. I recommend doing a database restore from production into staging regularly, so that you know it works. You will be surprised at how many things break the first time you try to do this, and you will sleep better when you know that your restore/recovery process is regularly being exercised.
  6. User behavior monitoring. Is anyone using your site? What pages are popular? Where does your traffic come from? Google Analytics is the default choice, but also consider KISSMetrics, which can provide valuable insights into what people really do with your application.
  7. Performance monitoring. Nobody writes slow code on purpose, but everyone is horrified when they look at New Relic for the first time and see what is really taking up the time and memory in their application. It's expensive, but use it for at least a month - that is much cheaper than an engineer spending hours optimizing the wrong thing. New Relic can also alert you when performance or error rates cross a threshold, so you know when your site is being overwhelmed before it crashes.
  8. Availability monitoring. How do you know when Heroku has an outage? What if Heroku has an outage affecting only 1% of their applications - are you affected? What if that cool search provider you added a dependency on has an outage, and everyone trying to use your site sees error messages in the sidebar? If you don't have something more sophisticated in place, at the very least set up Pingdom to monitor your site uptime. It's also a good idea to deliberately break your site by disabling each of your add-ons one by one and observing the failure mode - you may be able to handle partial failures more gracefully with very simple changes. Once you've done that, add some very simple hidden pages that each run a check against one add-on and report whether it worked, then connect each of those pages to your monitoring system so that you know when one of your add-ons has an outage even if your main application stays up.
  9. Notification and escalation. My two favorite ways to make progress on solving problems and doing creative work are to go for walks away from my computer and to sleep. That means I often won't see the emails or chatroom notifications from those monitoring systems for several hours. You need something to text you and call your phone when a bad problem happens, and to escalate to the next person in the response team if you don't acknowledge the alert within a certain amount of time. PagerDuty is the gold standard here, although OpsGenie is an interesting competitor.
  10. Security vulnerability notifications. When a serious security vulnerability is disclosed, you need to respond quickly to patch the holes in your application. Look through your development stack, and figure out what the security announcement mailing lists are, and subscribe to them. Finding out a day or two before the vulnerability is in the news can make a big difference in whether you get things fixed before someone comes and downloads your entire database.
  11. File storage and transcoding. Whether it's something as simple as user profile pictures or videos giving a tour of your product, you won't be able to store the files on Heroku. Amazon S3 is the most common choice here, and there are also services like Zencoder that can handle processing and transcoding videos and audio files into your preferred format before you store them.
  12. SSL. Serve everything over SSL. Don't let your certificates expire (set reminders or wire certificate checks into your monitoring system). Make sure your staging systems also use SSL.
  13. Payment processing. Businesses need to charge money for goods and services, and even charities need to accept donations via credit card. I like for recurring billing - the JavaScript forms fit nicely into your own site, and still offload the bulk of PCI compliance headaches to the provider. They also take care of dunning.
  14. User feedback. If you really want to engage your customers, give them a way to talk back to you. If you want to send your customers email, why wouldn't you also give them an easy way to send you a comment or question? UserVoice is a great tool for this.
  15. DNS. Use DNSimple for your app hosted on Heroku. Here's why.
  16. Search. Your users do not know how your site is organized, and don't want to look at a site map. Elasticsearch is my default choice for search, and there are several add-on providers that will host your Elasticsearch cluster - I'm using Found and have gotten excellent service.
  17. Laptop backups. We covered backing up the site, but don't forget to back up your laptop. You don't want to be crippled and unable to access production systems when your laptop is run over by a taxi. I like CrashPlan because it is cross-platform, backs up both locally and to the cloud, and warns you when backups haven't been working. Another approach is to use a cloud development environment such as Cloud9 IDE - but don't forget, they can have downtime too.
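
To make the hidden per-add-on check pages from the availability monitoring item more concrete, here is a minimal sketch in Python using only the standard library. It assumes a WSGI-style app, and the `/health/<name>` URL layout and the `check_cache` and `check_search` functions are hypothetical placeholders, not part of any Heroku or add-on API. Each page returns 200 when its check passes and 503 when it fails, which is the kind of signal an uptime monitor can usually watch.

```python
import json
from typing import Callable, Dict

# A check passes if it returns without raising an exception.
Check = Callable[[], None]


def run_checks(checks: Dict[str, Check]) -> Dict[str, str]:
    """Run every check and record "ok" or the error message for each one."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # report the failure instead of crashing the page
            results[name] = f"failed: {exc}"
    return results


def make_health_app(checks: Dict[str, Check]):
    """Build a WSGI app that serves one status page per check at /health/<name>."""

    def app(environ, start_response):
        # Use the last path segment as the check name (hypothetical URL layout).
        name = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
        headers = [("Content-Type", "application/json")]
        if name not in checks:
            start_response("404 Not Found", headers)
            return [b'{"error": "unknown check"}']
        result = run_checks({name: checks[name]})
        healthy = result[name] == "ok"
        start_response("200 OK" if healthy else "503 Service Unavailable", headers)
        return [json.dumps(result).encode("utf-8")]

    return app


def check_cache() -> None:
    """Hypothetical placeholder: replace with a real round trip to your cache add-on."""


def check_search() -> None:
    """Hypothetical placeholder that simulates a search add-on outage."""
    raise RuntimeError("search add-on unreachable")


health_app = make_health_app({"cache": check_cache, "search": check_search})
```

Keeping one URL per add-on, rather than one combined page, means the monitoring system can tell you which dependency failed instead of only that something did.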

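For the certificate reminder in the SSL item, a check along these lines could be wired into a monitoring job. This is a sketch using Python's standard `ssl` and `socket` modules; the function names and the 30-day warning threshold are assumptions chosen for illustration.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect over TLS and return the certificate's notAfter string.

    An already expired or otherwise invalid certificate makes the handshake
    raise an ssl.SSLError, which a monitor should also treat as an alert.
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return the number of days between `now` and the certificate expiry."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def needs_renewal(not_after: str, warn_days: int = 30,
                  now: Optional[float] = None) -> bool:
    """True when the certificate expires in fewer than `warn_days` days."""
    return days_until_expiry(not_after, now) < warn_days
```

A scheduled job could call `needs_renewal(fetch_not_after("www.example.com"))` for both the production and staging hostnames and raise an alert when it returns true; the hostname here is only an example.
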
When you've got your app up and running and are building your feature list for launch, don't forget all these other things. They are the difference between a professional, production-quality application and a prototype. Did I miss something, or is there a better option? Please let me know.