I've been thinking about Zero Downtime Deployment for the past few weeks, and I even raised it as a discussion topic at our LSCC Roundtable. Here are the key points we discussed. Obviously the feasibility and suitability of each one depends entirely on the application and platform architecture. This is not an attempt to provide a solution, just a record of the thoughts and ideas:
Switch over to the backup site, upgrade the primary, switch back, and then update the backup site.
Have an HA configuration: take hosts out of the pool, upgrade them, then switch traffic over gradually.
Allow for parallel runs, so you can switch over to a parallel instance while updating.
Develop the application to run against both the previous and the new version of the database. Hot deploy the app and "hot" migrate the database.
Use dynamic features in your platform to allow for hot deployment.
Use a semi-structured datastore so data migrations are minimal.
Provide targeted releases, i.e. if a particular area of the application changes, then only that area is updated and not the whole application.
If eventual consistency across partitions is acceptable, then deploy in partitions and update the partitions gradually, e.g. like Facebook, Google etc.
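To make the "take hosts out, upgrade, then switch gradually" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration: `Host`, `Pool`, `upgrade_host` and `rolling_upgrade` are hypothetical names I made up, the upgrade step is a stub that always succeeds, and a real setup would talk to a load balancer and run proper health checks.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    version: str
    in_rotation: bool = True  # True while the host is serving traffic


@dataclass
class Pool:
    hosts: list[Host]
    min_in_rotation: int = 1  # never drop below this many serving hosts
    log: list[str] = field(default_factory=list)

    def serving(self) -> list[Host]:
        return [h for h in self.hosts if h.in_rotation]


def upgrade_host(host: Host, new_version: str) -> bool:
    """Hypothetical upgrade step; a real one would install and health-check."""
    host.version = new_version
    return True  # assumption: the health check passed


def rolling_upgrade(pool: Pool, new_version: str) -> bool:
    """Upgrade one host at a time while keeping enough hosts in rotation."""
    for host in pool.hosts:
        if host.version == new_version:
            continue  # already upgraded, nothing to do
        # Refuse to drain a host if that would leave too few serving.
        if len(pool.serving()) - 1 < pool.min_in_rotation:
            pool.log.append(f"abort: cannot drain {host.name}")
            return False
        host.in_rotation = False
        pool.log.append(f"drained {host.name}")
        if not upgrade_host(host, new_version):
            # Leave the broken host out of rotation and stop the rollout.
            pool.log.append(f"failed {host.name}; left out of rotation")
            return False
        host.in_rotation = True
        pool.log.append(f"restored {host.name} on {new_version}")
    return True


if __name__ == "__main__":
    pool = Pool([Host("a", "1.0"), Host("b", "1.0"), Host("c", "1.0")],
                min_in_rotation=2)
    print(rolling_upgrade(pool, "1.1"))
    print("\n".join(pool.log))
```

The `min_in_rotation` guard is the important part: the rollout refuses to drain a host if that would leave too few hosts serving, which is why a single-host pool cannot be upgraded this way without downtime. The same loop applies to the partition-by-partition approach if you read "host" as "partition".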