Microservices - Q & A

Thank you folks for raising some great questions. I am consolidating them here for everyone to benefit!


How is your experience using Kubernetes? 
>> Kubernetes seems very powerful and some experts would say it's the modern-day kernel for container orchestration. Internally, we use Rancher, which I believe has Kubernetes support, but I don't think we are using it yet. We have an entire SRE team that brings fun delivery tools into the pipeline, and they are actively looking into it.


What are the preferred languages for writing microservices? Any recommended frameworks?
>> It really depends on the type of work. If you are writing heavy ETL pipelines, then python is pretty awesome! If it's CPU intensive, use compiled languages - golang/.NET Core/java. If it's orchestration style or simple get/put/delete DB operations - use nodejs. All services should be deployed as containers, so it doesn't matter which language is used. In terms of frameworks, I have used aiohttp, flask, express, gin etc. In general, pick the one that has active community support.



How are sessions handled in a clustered environment in microservices?
>> Sessions are a bit tricky in certain scenarios. We need the session to figure out the authenticity of the call and further figure out which DB/storage to talk to (depending on the style of multi-tenancy). The answer here depends on the style of the operation, which usually falls into 3 categories:
  • User initiated (request/response)  
  • User initiated but long running (Pub/Sub)
  • Background jobs - need to be initiated on some schedule.
For #1 - the session token is propagated along with each call - meaning the client makes a call to a service with a session token attached, and the same token follows when the service then calls other service APIs.
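A minimal sketch of this propagation pattern (the header name and helper are illustrative, not from any specific framework): each service copies the caller's token into the headers of its own downstream calls, so the whole chain stays authenticated.

```python
# Sketch: forward the caller's session token to downstream service calls.
# The "Authorization" header name is an assumption for illustration.

def downstream_headers(incoming_headers: dict) -> dict:
    """Build headers for a downstream API call, propagating the
    caller's session token along the call chain."""
    token = incoming_headers.get("Authorization")
    if token is None:
        # Reject early rather than make an unauthenticated downstream call.
        raise PermissionError("missing session token")
    return {"Authorization": token, "Content-Type": "application/json"}
```

Whatever HTTP client you use, the key idea is the same: the token is read once at the edge and attached verbatim to every internal hop.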

For #2 - the session may be cached at different places and requires some sort of event to notify services when it expires.

For #3 - These problems are a bit tricky since you need to create the concept of a user without actually having a user. We accomplish this by creating something called a service user. The auth token for a service user never expires, and you need very specific crypto keys to be able to create one on the fly. The good thing is that you shouldn't need to worry about it, since these problems are universal and have been solved many times. Work with your auth provider to see if they have an equivalent of a service user.
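To make the idea concrete, here is a toy sketch of minting and verifying a service-user token with a signing key. Everything here (key, payload fields, HMAC scheme) is a stand-in for illustration - in practice you would use your auth provider's mechanism, not roll your own.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical signing key - in reality this would live in a secrets store
# and be available only to trusted services.
SIGNING_KEY = b"very-secret-crypto-key"

def mint_service_token(service_name: str) -> str:
    """Create a non-expiring token for a background 'service user'."""
    payload = base64.urlsafe_b64encode(
        json.dumps({"sub": service_name, "type": "service-user"}).encode()
    )
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def verify_service_token(token: str) -> dict:
    """Check the signature and return the token's claims."""
    payload_b64, sig = token.rsplit(".", 1)
    expected = hmac.new(SIGNING_KEY, payload_b64.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise PermissionError("invalid service token")
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

Note there is no expiry claim - that is the defining property of the service user, and also why the signing key must be so tightly guarded.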

How will a database active-active strategy work in a microservices architecture when a DB goes down during a transaction?
>> I am guessing you are talking about the availability of the data on the 2nd active node (even for a fraction of a second) when the primary active node goes down? If yes, then there are a few options, but first and foremost, don't try to go down the obvious path of synchronous replication. It's a lost cause! Instead, design your system around asynchronous replication. It means always assume that the data on the secondary node is a bit behind. Once you accept this fact, there are a few options depending upon the style of operation:
  • Retry if the transaction is unsuccessful - use this when you only care about persisting the data.
  • Optimistic Concurrency - use this when you want to make a decision based on first reading the underlying data. Usually this is achieved by validating the version number (or timestamp) of the rows just before persisting. Imagine you first read the data and the version number is 5. Then the service updates some aspect of this data and tries to save it. Just before persisting - make sure the version number of the rows is still 5. If it's 4 or 6 (because DR or some other service overwrote it before your transaction could finish) - then restart your work by grabbing the latest version of the data.
There is no single strategy that works for all use cases, and not every piece of data needs to be 100% transactional. E.g. an inventory service (that shows the number of items left when buying a product from, say, amazon) could be eventually consistent, but the payment processing service needs to be consistent (although amazon made this part eventually consistent as well! - if there is some issue with your credit card, you get an email a few minutes after placing an order). Apply the CAP theorem based on the use case.
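The version-number dance above can be sketched in a few lines. This uses an in-memory dict as a stand-in for the real database, and the names are illustrative; real implementations push the version check into the UPDATE's WHERE clause so it is atomic.

```python
# Sketch of optimistic concurrency with a version column.
# `db` stands in for the real datastore.

class StaleVersionError(Exception):
    pass

db = {"item-1": {"version": 5, "qty": 10}}

def save(key, updated_row, expected_version):
    """Persist only if nobody bumped the version since we read it."""
    if db[key]["version"] != expected_version:
        raise StaleVersionError("re-read the row and retry")
    updated_row["version"] = expected_version + 1
    db[key] = updated_row

def decrement_qty(key):
    while True:
        row = dict(db[key])          # read, remembering the version we saw
        version = row["version"]
        row["qty"] -= 1              # do the business-logic update
        try:
            save(key, row, version)  # commit only if version is unchanged
            return
        except StaleVersionError:
            continue                 # someone else won the race - retry fresh
```

The retry loop embodies the advice in the bullet: on a version mismatch you don't fail, you restart from the latest data.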


What does PCF need to host the microservices?
>> We don’t use PCF, but I would assume their Container Service should be sufficient for the service layer. If you use managed services for storage, then I am sure there must be some sort of integration with the popular ones.


How should the inter-service routing of microservices happen? Should it be via an API Gateway or REST exchange templates?
>> API gateway is preferred!


How are microservices different from SOA?
>> As Randy Shoup would say - “Microservices are SOA done correctly”. Conceptually, in terms of the number of tiers, they are not much different. You still have the 3 layers Client->Service->DB. But the real gain comes when services are decomposed into smaller, independently deliverable chunks that can be designed, built, deployed and scaled independently. Be aware of binary dependencies in SOA - they are probably one of the biggest pain points to overcome when moving to microservices.


What features in Oracle DB 12c can be leveraged for the DB, and would a NoSQL DB be a good option for microservices?
>> We use SQL Server instead of Oracle, and I am not familiar with Oracle technologies. But yes, do leverage NoSQL databases. Not every data point needs to be relational.


How can we achieve an independent DB per microservice when enterprises use a shared DB across applications?
>> There are two reasons why you need independent DBs:
  • It guarantees that there is only one service owner of the underlying db-schema and consumer services cannot touch it. It eliminates DB-as-API, and the only way to fetch data is by making API calls.
  • If one service is heavy on DB operations - the performance characteristics of that service should not impact your service's performance.
Generally speaking, #2 is not a problem to solve on day 1, but it will be needed for a more mature product stack. I think a good intermediate step is to make sure a service owns a set of tables within the shared DB and is the sole owner. No other service should touch these tables, and if they need information, it has to be through an API call. You will need to be religious about this, and no matter what the problem is - never allow anyone to touch tables owned by your service.

How frequently do you update snapshots for Component-tests? Is there a risk of running tests against stale data? 
>> Different teams follow different techniques. Some teams update it daily while others update it weekly. I think the deciding factor should be how stable your dependent APIs are in terms of introducing breaking changes. If the dependent API is also in development mode, then you need to update it frequently. I don’t think there is any risk of running against stale data. Here the snapshots serve as mocks. How many times do we update mocks when running unit tests? I would assume never, unless you are changing the test itself. In this case, we are updating snapshots on some regular interval, which is better than stale mocks used in unit tests.
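The deciding rule above can be captured as a tiny staleness check - the API names and intervals here are assumptions for illustration, not real conventions from the post:

```python
# Sketch: refresh a component-test snapshot more often when the
# dependent API is still changing. Intervals are illustrative.

REFRESH_SECONDS = {
    "stable-api": 7 * 24 * 3600,  # stable dependency -> weekly refresh
    "in-dev-api": 24 * 3600,      # dependency in active development -> daily
}

def snapshot_is_stale(api_name: str, recorded_at: float, now: float) -> bool:
    """Return True when the recorded snapshot is due for a refresh."""
    max_age = REFRESH_SECONDS.get(api_name, 24 * 3600)  # default: daily
    return (now - recorded_at) > max_age
```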


When is the ideal time for applying chaos-engineering in the development cycle? 
>> Good question! I think to get started, you need to use your core workflows, which means that the services that support them are probably already in prod. The fun part about the chaos mindset is that after you perform this exercise a couple of times, you may already be wired to start applying it to new development right from the beginning.

Follow up - In case we apply chaos-engineering at the design (or implementation) time - I fear that it may end up over engineering the solution. What are your thoughts? 

>> I haven’t seen any use cases so far where applying fall-back mechanisms would change the original design significantly. One thing to caution against - don’t try to build a system that auto-corrects itself in the name of chaos-engineering. Keep it simple!


What are your thoughts on the performance impact of a microservice architecture? Doesn’t having too many microservices degrade the overall performance?
>> Compared to an on-premise Enterprise product, sure, performance is going to be slower just by the fact that there are more boundaries to cross in a microservice architecture. But if you are building a cloud product, you are already paying the cost of going over the internet! If performance is still a concern, there are a few patterns to apply:
  • Stickiness routing - For subscription-style operations: if your dataset is small enough and stickiness can help - by all means go for it!
  • Local cache - For reference data points that don’t need to be fetched on every call, create a local cache with a TTL (redis/in-process).
  • GraphQL - GraphQL is a great way to save on transport by fetching only the fields you need.
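The local-cache bullet above can be sketched as a minimal in-process TTL cache (a stand-in for redis; the class name and default TTL are illustrative):

```python
import time

# Sketch: in-process cache where entries expire after a TTL.

class TTLCache:
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # lazily evict the expired entry
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)
```

On a cache miss the service falls back to the real fetch and re-populates the cache; the TTL bounds how stale the reference data can get.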

