Microservices and Delivery - Lessons Learned

One of the most important selling points of moving to a microservice architecture is speed of delivery! Today customers can research products, compare reviews and ultimately switch to better products online with unprecedented ease. To survive in this highly customer-focused and fast-moving industry, we need our teams to deliver software at an accelerated rate, possibly multiple times a day.

Over time, our delivery pipeline has evolved as well:

During the Enterprise days, the release cycle was 6 weeks long. If you missed anything, you waited another 6 weeks. We used home-grown scripts to coordinate all the deployments, and the overall experience was just awful. During the early days of Cloud, we moved to daily deployments, but these happened at the end of the day with significant downtime and required lots of coordination between teams. Today we have deployments on demand. Teams are deploying multiple times a day with zero downtime!

We love Slack at work! One of our bots shows the number of deployments to the production environment over the last 14 days.
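
As a rough illustration of what such a bot might compute, here is a minimal Python sketch. It is not our actual bot: the function names (`deployments_per_day`, `format_report`) and the input format (a list of deployment timestamps) are assumptions made for the example, and posting the text to Slack is left out.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def deployments_per_day(deploy_times, today, window_days=14):
    """Count deployments per calendar day for the window ending at `today`.

    `deploy_times` is a hypothetical list of datetime objects, one per
    production deployment.
    """
    start = today - timedelta(days=window_days - 1)
    counts = Counter(
        ts.date() for ts in deploy_times if start <= ts.date() <= today
    )
    # Include days with zero deployments so the report has no gaps.
    days = [start + timedelta(days=i) for i in range(window_days)]
    return {day: counts.get(day, 0) for day in days}


def format_report(per_day):
    """Render the per-day counts as plain text a bot could post."""
    lines = [f"{day.isoformat()}: {count}" for day, count in per_day.items()]
    total = sum(per_day.values())
    lines.append(f"Total deployments in last {len(per_day)} days: {total}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [datetime(2024, 1, 14, 9, 0), datetime(2024, 1, 14, 15, 30)]
    print(format_report(deployments_per_day(sample, date(2024, 1, 14))))
```

Listing days with zero deployments keeps quiet days visible, which is often the more interesting signal.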

Lessons Learned

Empower Development Teams to push code to production:

If there is any hand-off on the path from the developer's machine to prod, it will never scale. Invest in tools and infrastructure that enable zero-downtime deployments, and empower the development teams to push code to prod. Sounds crazy? Totally works!

Improve Development Practices: 

Educate teams on continuous-delivery-friendly practices, implement them, and REWARD the teams that adopt them in their development cycle. Master shipping code using dark launches, feature toggles and backward/forward compatibility. Bring performance and chaos engineering practices in early in development.
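
To make the feature-toggle idea concrete, here is a minimal Python sketch of a percentage-based toggle, one common way to ship code dark and then ramp it up gradually. Everything in it is hypothetical: the `FeatureToggles` class, the `new-checkout` flag and the `checkout` function are invented for illustration, and a real setup would typically load the percentages from a config service rather than a hard-coded dict.

```python
import hashlib


class FeatureToggles:
    """In-memory toggle store (illustrative only)."""

    def __init__(self, rollout_percent):
        # Maps a flag name to the percentage (0-100) of users who get it.
        self.rollout_percent = dict(rollout_percent)

    def is_enabled(self, flag, user_id):
        percent = self.rollout_percent.get(flag, 0)  # unknown flags stay off
        # Hash flag + user so each user lands in a stable bucket from 0 to 99.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent


def checkout(user_id, toggles):
    # The new code path is deployed but only runs when the toggle allows it.
    if toggles.is_enabled("new-checkout", user_id):
        return "new checkout flow"
    return "old checkout flow"


if __name__ == "__main__":
    dark = FeatureToggles({"new-checkout": 0})    # shipped, but dark
    live = FeatureToggles({"new-checkout": 100})  # fully rolled out
    print(checkout("user-42", dark))
    print(checkout("user-42", live))
```

Because the bucket comes from a hash rather than a random draw, a given user sees the same behaviour on every request while the percentage is being raised.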

Monitoring

Monitoring becomes an important part of the daily job. Teams should invest in learning their monitoring tools (Splunk, Datadog, Grafana, or whichever you use). On-call/pager duty is a must! Mature teams adopt scrum-ops: an extended version of scrum in which, after the regular scrum, the entire team looks at the health and performance of their stacks over the last 24 hours and identifies good, bad and normal operating scenarios.

Blame-free Post-Mortems

When things do go wrong, provide a space for teams to perform blame-free post-mortems. Focus only on the timeline, what contributed, what we could have done better, and action items!
