Nathan Dye, Microsoft
Code requires care and attention; it's a living thing.
...versus the idea that change is bad and we must do as little of it as possible.
CMW: Quote that Jarkko used to have on his emails: "Biologists have another word for 'stable': it is 'dead'"
Their setup: lots of services (~108) deployed over lots of systems (~3500; CMW: is that right?).
Used to have infrequent releases (6-12 months).
Different errors at deployment time:
Talked to friends at Amazon about their service infrastructure (CMW: what was that great podcast with Werner Vogels discussing this and how they set up teams? SE Radio?)
Service team
Isolation
Detect
Rollback
CMW: is this also applicable for a period of time after service x.1 has been in deployment? e.g., I deploy service X.1 and 10 hours later need to roll it back to X.0; possible? Or just immediate rollback?
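CMW: A minimal sketch of how detect-then-rollback could work, including the "10 hours later" case. Nothing here comes from the talk: `ServiceSlot`, `deploy_with_check`, and the `is_healthy` callback are hypothetical names, and the sketch assumes the deployment system keeps every previously deployed version around so a rollback is possible at any time, not only immediately.

```python
class ServiceSlot:
    """Hypothetical record of the versions deployed for one service on one server."""

    def __init__(self, initial_version):
        # Oldest version first; the last entry is the one currently live.
        self._history = [initial_version]

    @property
    def current(self):
        return self._history[-1]

    def deploy(self, version):
        self._history.append(version)

    def rollback(self):
        # Works at any later time, as long as an earlier version was retained.
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()
        return self.current


def deploy_with_check(slot, version, is_healthy):
    """Deploy, run a health check (the 'detect' step), roll back on failure."""
    slot.deploy(version)
    if not is_healthy(version):
        slot.rollback()
        return False
    return True
```

In this sketch an immediate rollback and a delayed one are the same operation; what makes the delayed case hard in practice is state that X.1 changed in the meantime, which the code does not model.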
CMW: is this easier to do in the DB if you don't change anything in place, but instead add new columns (even if duplicative) and drop the old ones in a successive version? (Kindasorta similar to Helland's "always INSERT" idea for scalability)
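CMW: To make the add-columns idea concrete, here is a small sketch using Python's built-in `sqlite3`. The `users` table, the `display_name` column, and the function names are made up for illustration; the talk did not show a schema. The assumption is that X.1 only adds a column, so code from X.0 keeps working after a rollback.

```python
import sqlite3


def create_v0_schema(conn):
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")


def migrate_to_v1(conn):
    # Additive change only: the old column is left untouched.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.execute("UPDATE users SET display_name = name")


def read_names_v0(conn):
    # What X.0 code does; unaffected by the new column.
    return [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]


def read_names_v1(conn):
    # X.1 code falls back to the old column for rows written by X.0.
    query = "SELECT COALESCE(display_name, name) FROM users ORDER BY id"
    return [row[0] for row in conn.execute(query)]


conn = sqlite3.connect(":memory:")
create_v0_schema(conn)
conn.execute("INSERT INTO users (name) VALUES ('ada')")
migrate_to_v1(conn)
# Simulate a rollback: an X.0 writer that knows nothing about display_name.
conn.execute("INSERT INTO users (name) VALUES ('grace')")
```

Both readers return the same names here, which is the point: the old column would only be dropped in a later version, once nothing can roll back to X.0 anymore.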
Parallel chain: ability to replicate process for dev/testing purposes, see how things work in the real world.
CMW: The focus of all this work is to let you manage change in the smallest scope possible, both technically and human-wise.
CMW: The "how can you tell if a change you made six months ago broke stuff" question is a red herring, IMO. Does anyone do that successfully in any system, no matter what the architecture?
CMW: Same with "how do you deal with the security issues if anyone can deploy?" Make people accountable for their actions! (See presentation from Netflix CEO.) But at least you're giving people control of their own destiny, and it requires that you make things transparent (track who deployed X to server Y at time Z).
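CMW: The "who deployed X to server Y at time Z" tracking could be as simple as an append-only log. A rough sketch, with `DeploymentRecord` and `DeploymentLog` as invented names and the people, services, and servers as placeholder data; a real system would presumably persist this somewhere everyone can read it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DeploymentRecord:
    who: str
    service: str
    version: str
    server: str
    when: datetime


class DeploymentLog:
    """Hypothetical append-only record of deployments."""

    def __init__(self):
        self._records = []

    def record(self, who, service, version, server, when=None):
        # Default to the current UTC time when no timestamp is supplied.
        entry = DeploymentRecord(
            who, service, version, server, when or datetime.now(timezone.utc)
        )
        self._records.append(entry)
        return entry

    def history_for_server(self, server):
        # Everything deployed to one server, in the order it happened.
        return [r for r in self._records if r.server == server]


log = DeploymentLog()
log.record("alice", "billing", "1.1", "server-17")
log.record("bob", "search", "2.0", "server-17")
log.record("alice", "billing", "1.1", "server-18")
```

Asking `log.history_for_server("server-17")` then answers the accountability question directly: both the person and the time are on every record.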
More: http://servicedeployment.net/ (wiki)