Posts

Organizational Alignment in Microservices: Lessons from Industry Pioneers

In today's fast-paced digital landscape, organizations are constantly striving for agility, scalability, and innovation. Microservices architecture has emerged as a powerful paradigm for achieving these goals. However, the successful implementation of microservices does not depend on technical aspects alone. It is equally vital to ensure organizational alignment, as misalignment can lead to severe challenges. In this article, we will explore the need for organizational alignment in the context of microservices, drawing insights from pioneering companies, and delve into the repercussions of misalignment, supported by examples of both good and bad organizational behavior patterns. Organizational alignment in the context of microservices refers to the synchronization of various organizational units, including development teams, operations, and business functions, to support the principles and practices of microservices architecture. A good organizational alignme...

Horrors of caching

Caching in general is a good practice to improve the responsiveness of a system. But sometimes it can go horribly wrong and cause cascading failures that bring the entire system down. Below are some such incidents that I encountered:

1. A backend service caches M2M auth tokens, which it gets by making API calls to an auth service. The auth service introduced a bug in which it returned an empty token with a 200 status code. The backend service checked only the response code and cached the token for a few hours, and the rest is history :)
2. A service caches a DNS entry to make API calls for a resource. DR hits and the cached DNS entry no longer works. In the absence of dynamic service discovery, the fix needs an entire cluster restart.
3. Caching a GET call and not respecting the expiration. Well, this is just silly and should have been caught in code review.

Happy caching!
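One way to guard against the first incident above is to validate the token itself before caching it, rather than trusting the status code alone. The sketch below is only an illustration under assumptions: `TokenCache`, `InvalidTokenError`, and the `fetch` callable returning a `(status, token)` pair are hypothetical names, not part of any real auth library or of the services described in the post.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


class InvalidTokenError(Exception):
    """Raised when the auth service response is not safe to cache."""


@dataclass
class CachedToken:
    value: str
    expires_at: float


class TokenCache:
    """Caches a token only after checking both status code and payload.

    `fetch` is a hypothetical callable returning (status_code, token).
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[int, Optional[str]]],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached: Optional[CachedToken] = None

    def get(self) -> str:
        now = self._clock()
        # Serve from cache only while the entry is still fresh.
        if self._cached is not None and now < self._cached.expires_at:
            return self._cached.value

        status, token = self._fetch()
        # A 200 with an empty body is still a failure: never cache it.
        if status != 200 or token is None or not token.strip():
            raise InvalidTokenError(f"unusable auth response (status={status})")

        self._cached = CachedToken(value=token, expires_at=now + self._ttl)
        return token
```

Because a bad response raises instead of being stored, the next call goes back to the auth service rather than reusing an empty token for the full TTL.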

Architecture & High Performing Teams

Companies are on a never-ending quest to adopt new tools, processes and architectures, with the goal of enabling teams to move fast and building high performing teams. What is a high performing team? If you ask a product manager, they will say "It’s a team that can deliver new value faster". If you ask a team lead, they will say something along the lines of "A high performing team can pay down its tech debt faster, spends the majority of its time building new value into the product and optimizes its work for improved quality and faster delivery." "So what makes these teams move fast?" Now this is a great question! I have been fortunate to work with some of these high performing teams. Strolling down memory lane, I tried to find the answer to this question in a way that discounts the personal traits of any individual in the team. I stumbled upon three unique qualities that distinguished a high performing team from others: Ability to make sweeping chang...

Instrumentation, Observability and Monitoring (IOM)

Terminology
Observability: the property of a system to answer either trivial or complex questions about itself. How easily you can find answers to your questions is a measure of how good the system’s observability is.
Monitoring: observing the quality of system performance over a period of time.
Instrumentation: metrics exposed by the internals of the system using code (a type of white-box monitoring).

Why IOM?
Analyzing long-term trends:
- User growth over time
- User time in the system
- System performance over time
Comparing over time or experiment groups:
- How much faster are my queries after a new version of a library?
- Cache hit/miss ratio after adding more nodes
- Is my site slower than it was last week?
Alerting - Something is broken, and somebody needs to fix it right now!
Building dashboards - Answer basic questions about your service: latency, traffic and errors.
Conducting ad hoc retrospective analysis - Our latency just shot up; what else ha...

Dockercon2020 - Monolithic to Microservices + Docker = SDLC on Steroids!

Here is the link to my #Dockercon2020 talk "Monolithic to Microservices + Docker = SDLC on Steroids!"

EWF Service Decomposition Pattern

EWF is another microservice decomposition pattern that I like to use when building services. It stands for Entity, Workflow & Functional services. In simplest terms, it’s an extension of the Single Responsibility Principle and allows for building independently testable services. To understand this pattern better, let’s start with an example. Imagine I want to write a signup service that allows a user to sign up to a website using their Google account. The service would perform the following:

1. Let the user log in to their Google account and fetch the user’s basic information from Google upon successful sign-in.
2. Apply default settings associated with creating a new user.
3. Persist the user in the database.
4. After successful registration, redirect the user to the homepage.

At a high level, this works just fine and gets the job done. But if you think about testability first and want to test this service in isolation, we have a problem! The entire testing paradigm relies on ...

Prefer Consumer over Integration Tests

Recently I attended an RCA meeting for an incident that happened in prod a few weeks back. The issue was somewhat of a second-degree failure. To put it in simple terms, let's say a workflow is composed of two services, A and B. When the user performed an action from the UI, it made a call to Service A, which then internally called Service B with some parameters based on user preferences. In this case, due to an unusual combination of preferences, Service A ended up making a call to Service B with parameters that Service B didn’t fully understand. Solution - Write an integration test to cover this scenario. Reasoning - Since this involved 3 different components (User Preferences, Service A and Service B), it should be covered by an integration test. More Reasoning: This is the usual way of thinking about integration tests, meaning that whenever multiple services are involved, we immediately think about writing an integration test. But is this the right approach, especially in microserv...