How do you calculate Cloud ROI?

In general, there is always confusion between TCO and ROI. TCO, the Total Cost of Ownership, is a component of ROI. Reducing TCO helps businesses focus on their core competencies and reduces the load on the IT department. When IT companies came out with the concept of managed services, businesses started outsourcing their IT department's functions to IT service providers, retaining only minimal in-house IT resources and thereby reducing the total cost of ownership. COTS products replaced in-house development resources and reduced TCO further. However, the real TCO reduction started when the cloud came into the picture: it largely eliminated capex and moved the infrastructure from the premises to a third party's premises. ROI, on the other hand, focuses not just on removing capex costs but on the return the customer earns on the investment. Let us look at the parameters that impact ROI on the cloud.

1. Downtime: The single most important parameter that can affect the return on investment is the downtime that occurs across IT systems. It affects customers, internal stakeholders and so on. Thanks to extensive operational experience, the public cloud service providers have drastically reduced downtime caused by hardware infrastructure. The second biggest source of incidents is the application infrastructure. The cloud service providers have abstracted away not only the hardware but also the application infrastructure, namely web servers, application servers, cache management, container management and application deployment. Deployments without a rollback plan used to be a common cause of outages; all the major public cloud providers now offer managed services for application infrastructure in addition to raw hardware. Effectively, downtime hours have been reduced by the cloud service providers on both the hardware and application infrastructure fronts.

2. Agility: Agility is the ability to move quickly and easily. IT departments have been slow to deploy the changes required by the business, because the primary change is not only in the development code but in the workflow that must change around it. The cloud service providers offer flexible deployment workflows by bringing DevOps infrastructure into the cloud: a code repository, automatic build, automatic deployment and rollbacks, followed by rapid monitoring.

Autoscaling is an important parameter affecting agility, as are quick provisioning of compute engines, creating security policies and attaching them to the compute engines, and the ability to change a configuration in one place and have it reflected everywhere. While agility cannot be measured exactly in hours, a lack of agility shows up as disruption to the flow of work.

3. Pay for what you use: In a traditional data centre model, the resources exist even if you do not use them. A one-time, irreversible budget is created; irreversibility is the biggest issue in capex buying. There will be many idle resources, and capacity prediction has always been a hard problem. Cloud service providers solve the problems of under-utilised and over-provisioned resources: hygiene checks can be automated, and resource usage is metered so you pay only for what you consume. A rough comparison is sketched below.
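
As a rough illustration of the pay-per-use effect, this sketch compares an always-on capex deployment against on-demand usage. All the figures (hourly rate, utilisation, server count) are hypothetical assumptions, not quotes from any provider.

```python
# Hypothetical comparison of capex vs pay-per-use cost over one year.
# All rates and utilisation figures below are illustrative assumptions.

HOURS_PER_YEAR = 24 * 365

capex_per_server = 8_000      # assumed upfront hardware cost per server
servers_provisioned = 10      # sized for peak load, idle much of the time

on_demand_rate = 0.50         # assumed $/hour for an equivalent cloud instance
avg_utilisation = 0.30        # fraction of the year the capacity is actually needed

capex_cost = capex_per_server * servers_provisioned
cloud_cost = on_demand_rate * HOURS_PER_YEAR * servers_provisioned * avg_utilisation

print(f"Capex (always on):   ${capex_cost:,.0f}")
print(f"Cloud (pay per use): ${cloud_cost:,.0f}")
```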

4. Technology deprecation: Every advancement in technology leads to cost reduction or capacity enhancement at a lower price. The cost of upgrading technology in a data centre environment is enormous, and the process of moving from POC to pilot to production is a long-drawn-out one. Public cloud service providers make this upgrade path far easier.

The final equation for ROI is always in terms of money. Converting the above tangible and intangible parameters into time lost or gained is essential for calculating the return on investment. The ROI equation has a cloud investment component as well as a traditional investment component. So, a simple way to calculate cloud ROI is as follows:

  a. Estimate the traditional investment: the capex and operating cost of running the workload in the data centre over a given period.
  b. Estimate the cloud investment: the subscription and usage cost of running the same workload on the cloud over the same period.
  c. Convert the tangible and intangible parameters above (downtime avoided, agility gained, idle capacity eliminated, deprecation avoided) into money.

Cloud ROI is then ((a) - (b) + (c)) divided by (b).

Note: Step (c) is the most difficult parameter to arrive at when calculating ROI.
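
A minimal sketch of that calculation follows. The input figures are hypothetical assumptions, and in practice step (c) is where most of the estimation effort goes.

```python
# Hypothetical cloud ROI calculation following steps (a)-(c) above.
# Every figure here is an illustrative assumption.

traditional_investment = 500_000   # (a) capex + opex of the data centre option
cloud_investment = 300_000         # (b) cloud subscription and usage cost

# (c) tangible and intangible gains converted to money (the hard part).
downtime_hours_avoided = 40
cost_per_downtime_hour = 2_000
agility_gain = 30_000              # e.g. value of faster releases
monetised_gains = downtime_hours_avoided * cost_per_downtime_hour + agility_gain

roi = (traditional_investment - cloud_investment + monetised_gains) / cloud_investment
print(f"Cloud ROI: {roi:.0%}")
```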

The Cloud Operating Model – Optimize Costs, Enhance Availability, Drive Performance

The cloud operating model is the way workflows are defined on the cloud to achieve IT operational goals, and it is quite different from what has been happening in the data centre. Cloud paves the way for moving the infrastructure out of the data centre. An abstraction layer is created on the hardware with a hypervisor, which hosts the guest operating systems; these days, container images are increasingly used on top of, or in place of, full virtual machines to ensure better portability of the applications. The operating model of the data centre has many layers to ensure availability, whereas the cloud operating model has fewer, finer-grained layers to ensure availability. Although the cloud operating model evolved from the data centre operating model, it presents both challenges and opportunities to improve performance. Some advantages of the cloud operating model include the following.

SLAs: In a typical data centre operating model, SLAs are a function of all the layers of infrastructure, starting from the cables and power supply units all the way to the point where the applications are hosted. In the cloud model, hardware availability SLAs are taken off the CIO's hands and are typically 99.99% by default. If we use the platform services from the cloud service provider, the underlying application infrastructure SLAs are also taken care of by the provider, leaving application availability as the only SLA that needs to be ensured. Managing load balancers, firewalls and caches has all been taken off the IT services plate. A good example of improved SLAs is the usage of the Gmail and Microsoft O365 platforms: mail delivery has become more reliable, and people focus more on new features. Improved availability of software and hardware has made life much easier for IT people. The sketch below shows why stacking layers hurts availability.
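
To see why removing layers improves the end-to-end SLA, this sketch multiplies per-layer availabilities for layers that depend on each other in series. The layer figures are illustrative assumptions, not any provider's published SLAs.

```python
# End-to-end availability of serially dependent layers is the product of
# each layer's availability. Figures below are illustrative assumptions.

MINUTES_PER_MONTH = 30 * 24 * 60

def end_to_end(availabilities):
    result = 1.0
    for a in availabilities:
        result *= a
    return result

data_centre = [0.999, 0.9995, 0.999, 0.9985]   # power, network, hardware, app infra
cloud = [0.9999, 0.9995]                        # provider SLA, application

for name, layers in [("data centre", data_centre), ("cloud", cloud)]:
    a = end_to_end(layers)
    downtime = (1 - a) * MINUTES_PER_MONTH
    print(f"{name}: {a:.4%} available, ~{downtime:.0f} minutes downtime/month")
```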

Scalability: Hardware scalability concerns are out of the door, along with the need to run RFPs and procure various types of hardware. However, as every solution brings a new problem, scalability has brought in the problem of cost: indiscriminate use of auto-scaling on the cloud can drive costs up to the point where operations become unviable. So, the cloud operating model requires very robust auto-scaling policies, such as the one sketched below.
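
As one example of a guard-railed policy, this sketch uses AWS Auto Scaling through boto3 to track average CPU while capping the group size. The group name, target value and size limits are hypothetical choices, not recommendations.

```python
import boto3

# Hypothetical guard-railed auto-scaling policy: track 60% average CPU,
# but cap the group at 12 instances so a traffic spike cannot blow the budget.
autoscaling = boto3.client("autoscaling")

autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="web-asg",   # hypothetical group name
    MinSize=2,
    MaxSize=12,                       # hard cost ceiling
)

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="cpu-target-60",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 60.0,
    },
)
```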

Cost: A typical data centre has a budget forecast at the start of the financial year, and cost is managed against that budget. In the cloud, cost is an engineering problem: it can be continuously leveraged, and we have an opportunity to spend less by using cloud resources more diligently. Cost is not a one-time exercise but a continuous one.
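
Treating cost as an engineering problem means pulling spend data programmatically and reviewing it continuously. A minimal sketch using the AWS Cost Explorer API via boto3 follows; the date range is an arbitrary assumption.

```python
import boto3

# Pull daily unblended spend for one month from AWS Cost Explorer,
# so cost reviews can run as routinely as any other engineering job.
ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-02-01"},  # assumed range
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
)

for day in response["ResultsByTime"]:
    amount = float(day["Total"]["UnblendedCost"]["Amount"])
    print(day["TimePeriod"]["Start"], f"${amount:.2f}")
```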

Services Options: The number of services available from the cloud is mind-boggling, and no two services from two different vendors are equal. AWS has more than 150 services, Azure has hundreds of services, and GCP has around 90. Every vendor has its own approach to abstracting the hardware and software away. Understanding the service options is paramount to a good operating model. Unlike in the data centre, options to improve a service are continuously available from a cloud services provider. A managed services option is offered by almost all cloud service providers; its focus is by default on the hardware rather than on our IT applications, and increasingly on application infrastructure, namely databases, authentication services, caching services and so on.

Security: Cloud operating models provide security at the infrastructure and platform software levels by default, and most cloud service providers take care of network security as well. However, in the cloud operating model security becomes a shared responsibility; it is not the responsibility of the provider or the IT department alone. Cloud service providers still ensure a robust security mechanism to protect data, devices and application infrastructure.

Support: The support model differs between cloud service providers and depends on the choice of subscription, so it is important to go through the entire subscription model and choose what matches the business needs. Business continuity, both at the geography level and at the zone level, is available by default. Disaster recovery and data replication are features that make the cloud operating model a better model than data centre operating models.

To conclude, it is important to unlearn some of the data centre operations metrics and learn new metrics for the cloud operating model in order to provide better service to our customers. Cloud operating models are different, and with every level of abstraction the cloud service providers add on top of the underlying layers, the operating model changes again.

Multi Cloud – The new Paradigm shift in Cloud Transformation

Gartner's 2020 Cloud Service Providers Magic Quadrant report says that the leading players in the public cloud space are AWS, Azure and GCP. By the year 2026, it is estimated that the public cloud provider business will be in the range of 488 billion dollars annually. Cloud service providers provide the infrastructure, tools and software needed to run the business. While applications are becoming environment-independent thanks to container technology, which allows users to migrate an application from one environment to another seamlessly, the cloud service providers do position some tools that in turn lead to a vendor lock-in situation. Apart from this, there are multiple reasons why enterprises want to manage multiple cloud service providers. The idea of this blog is to look at the reasons why customers want to use multiple cloud service providers.

Reasons for Multi Cloud Options.

  1. Customers want to avoid vendor lock-in. Continuous usage of a vendor-specific tool or piece of infrastructure locks the customer to that vendor. Customers want to be vendor-agnostic on infrastructure and applications.
  2. Certain services are done best by certain vendors, and the customer wants the best of breed from each. As an example, AI infrastructure is delivered best by Google, as Google is a leading AI player; some of the best developer tools are available from Azure; and AWS provides one of the most mature auto-scaling capabilities.
  3. Cost is another factor pushing customers into multiple cloud environments. As an example, GCP gives per-second billing on usage while AWS has traditionally billed hourly. New entrants give better pricing, and customers look for workload migration. The cloud offers continuous cost leverage, unlike the data centre's once-a-year model of budgeting for IT.
  4. Niche services are provided by certain vendors. Many customers are adopting the SaaS model, and none of them wants to worry about managing the cost of infrastructure and platform. As an example, Salesforce leads the CRM space with its SaaS offering, and ServiceNow leads the ITSM space. However, adopting a SaaS model paves the way for vendor lock-in, as it is difficult to migrate applications once a customer is on SaaS. In practice, most enterprises run a combination of SaaS and public cloud service providers.

Managing Multi Cloud Environments

Managing multiple clouds is not easy, as the resources required differ between providers, which makes it a costly affair from a resource perspective. Each cloud service provider has a different set of products, and no two products are designed to be equal; finding an equivalent product is an exercise in itself. Here are a few tips for managing multi cloud environments.

  1. Abstract the service: Define your services independently of the cloud service provider, then see which products fit your abstraction. Assume you need a compute engine: define the compute engine configuration and choose the equivalents from multiple cloud service providers (see the sketch after this list). Service abstraction is an important part of multi cloud management.
  2. Cost Management: Cloud cost management is an engineering problem, not a finance problem. In a multi cloud environment it is important to choose a tool that can identify spend issues across the various platforms. As an example, CloudHealth from VMware gives a perspective on resource utilisation in various clouds.
  3. Reporting: Managing multi cloud environments means receiving multiple reports, so it is important to configure only the relevant reports of interest.
  4. Define the dependencies on the cloud service provider: It is important to define your dependencies on each cloud service provider explicitly. As an example, if you depend on AWS CloudWatch for your alerts rather than any third-party tool, it is better to record that dependency.
  5. DevOps Management: All cloud service providers make the DevOps toolchain much easier than the private cloud or data centre toolchain. The code pipeline is easy to set up with a cloud service provider unless a specific on-prem tool is used, but this convenience creates a good amount of dependency on the cloud service.
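
A minimal sketch of the service-abstraction idea from tip 1: a provider-neutral compute spec mapped to per-provider instance types. The mapping table is illustrative, not a recommendation; a real catalogue would be driven by price and benchmarks.

```python
from dataclasses import dataclass

# Provider-neutral description of the compute we need.
@dataclass
class ComputeSpec:
    vcpus: int
    memory_gb: int

# Illustrative mapping from the abstract spec to provider-specific
# instance types with matching vCPU/memory shapes.
CATALOGUE = {
    ("aws", 4, 16): "m5.xlarge",
    ("azure", 4, 16): "Standard_D4s_v3",
    ("gcp", 4, 16): "e2-standard-4",
}

def resolve(spec: ComputeSpec, provider: str) -> str:
    """Resolve an abstract spec to a concrete instance type."""
    return CATALOGUE[(provider, spec.vcpus, spec.memory_gb)]

spec = ComputeSpec(vcpus=4, memory_gb=16)
for provider in ("aws", "azure", "gcp"):
    print(provider, "->", resolve(spec, provider))
```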

To conclude, multi cloud is a great option, and it is one to evaluate before implementation to make sure the IT infrastructure and applications are portable.

Cloud AI Infrastructure – Redeem Better ROI from AI

The exponential growth of AI in the last decade has happened for two specific reasons: growth in processing power and growth in storage. Until 1999 the growth of processing power had been slow, but with the exponential growth brought by GPUs and TPUs, AI has truly come out of the 'AI winter'. Kryder's law states that disk drive density tends to double every 13 months, and this has truly reduced the price of storage. With 5G technology on the anvil, the need for storage will reduce further, as low-latency networks will be the order of the day and the need to keep data at multiple places for processing will shrink.

There are four powers that contribute to AI's success: processing power, the power of algorithms, domain knowledge and data. Many enterprises want to take advantage of AI; however, AI-driven transformation has become a costly affair on capex systems. The high-powered GPUs used by AI programs are normally not required to run normal day-to-day transactions, so a capex investment becomes a big problem for many businesses trying to deliver ROI using AI technology. Cloud technology has come as a saving grace for enterprises to deliver ROI on AI.

To start with, the IBM Watson platform has been delivered only on the cloud; an on-premises version of Watson is not typically available. Many enterprises have taken advantage of Watson's AI capabilities, specifically for prediction purposes. Watson offers infrastructure as well as machine learning models to deliver outcomes. Other cloud companies have not been far behind in delivering AI services from the cloud. AWS SageMaker lets large machine learning models be built, trained and deployed in one place, and services such as Amazon Comprehend and Amazon CodeGuru have helped AWS offer a range of machine learning services on its hardware.

Microsoft Azure offers a comprehensive list of AI services, including speech, vision and text analytics. Apart from this, Microsoft offers a comprehensive set of tools, from a machine learning studio to DevOps tooling for AI code deployment. Microsoft Azure also offers Bot Services to build, deploy and train bots.

The widest range of services comes from Google Cloud, which separates its AI offering into four groups of services. Google offers scalable AI infrastructure services that let compute engines attach a GPU by default and scale up to a TPU; Google Compute Engine instances run with Nvidia GPUs. Google also offers horizontal APIs for text search, computer vision through Vision AI, and a set of speech AIs. Google, one of the largest contributors to open source, has released BERT, making these services more robust; it has positioned itself as an AI-first company, not just a search company. GCP offers wide-ranging, vertical-specific, out-of-the-box AI solutions, including healthcare and life sciences, data cleansing, data labelling and data analytics. GCP focuses on the complete AI lifecycle, from data collection, cleaning and labelling to training on scalable infrastructure and deploying machine learning code using Kubeflow.
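
As a small taste of those horizontal APIs, this sketch calls Vision AI's label detection through the google-cloud-vision Python client. The image path is a placeholder, and the call assumes credentials are already configured in the environment.

```python
from google.cloud import vision

# Label detection with Vision AI: send an image, get back ranked labels.
# Assumes GOOGLE_APPLICATION_CREDENTIALS is set up in the environment.
client = vision.ImageAnnotatorClient()

with open("photo.jpg", "rb") as f:        # placeholder image path
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(f"{label.description}: {label.score:.2f}")
```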

Cloud service providers continue to evolve with new services. The ability of the cloud service providers to provide high-end computing engines on-demand gives enterprises an opportunity to transform their business using AI. They offer a comprehensive set of AI tools and frameworks, creating an ecosystem that adds value while delivering the service.

Data References:

https://www.srgresearch.com/articles
https://www.channele2e.com/channel-partners/csps/cloud-market-share-2020-amazon-aws-microsoft-azure-google-ibm/

Advent of Cloud Native Applications

Cloud native applications are defined as applications that are scalable and reliable by construction. The difference between cloud native and non-cloud native applications is that non-cloud native applications are scalable by requirement, not by construct. 'By construct' means that the design of the application must take care of scalability and reliability by default. In any application there are two major classes of failure: failures due to code and failures due to performance. Cloud native applications can detect run-time failures and mitigate them on their own. Cloud native applications are usually container-packaged, microservices-oriented and dynamically orchestrated.

Application Containers: Technically, the applications are container based, which enables deployment across different flavours of operating systems. The biggest benefits of containers are faster deployment, portability and cost efficiency. Containers are just processes running in your system: unlike a VM, which provides hardware virtualization, a container provides operating-system-level virtualization by abstracting the "user space". Containers require fewer system resources, as they do not carry full operating system images, and they are well suited to modern application development styles such as microservices.

Container-as-a-service use cases are picking up in the cloud, and many cloud-based applications are built using container as a service. Monolithic applications are decoupled into microservices using container images, as the sketch below illustrates.
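
A minimal sketch of driving containers programmatically with the Docker SDK for Python. The image and port mapping are arbitrary assumptions, and Docker must already be running locally.

```python
import docker

# Start a container from Python using the Docker SDK (docker-py).
# The image and host port below are arbitrary choices for illustration.
client = docker.from_env()

container = client.containers.run(
    "nginx:latest",            # any OCI image works the same way
    detach=True,
    ports={"80/tcp": 8080},    # map container port 80 to host port 8080
)

print(container.short_id, container.status)
container.stop()
```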

Microservices: Microservices is an architectural style in which autonomous, independently deployable services collaborate to form a broader system or application. The benefits of microservices include the following (a minimal service is sketched after the list).

  1. Applications are decomposed into smaller services, and each service can be owned by a smaller team.
  2. A good microservice is one that can be thrown away and rewritten if necessary.
  3. Microservices offer upgrade flexibility: one service can be on one version while another is on a different version.
  4. Microservices are loosely coupled and scalable.
  5. Microservices improve deployment frequency, as the services are very light and can be deployed very frequently.
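
Here is a minimal sketch of a single microservice, using Flask purely as an example framework; the endpoint names and payloads are hypothetical.

```python
from flask import Flask, jsonify

# One small, independently deployable service: it owns its own endpoints
# and can be versioned, rewritten or thrown away without touching others.
app = Flask(__name__)

@app.route("/health")
def health():
    # Health endpoint so an orchestrator can detect run-time failures.
    return jsonify(status="ok"), 200

@app.route("/orders/<order_id>")
def get_order(order_id):
    # Hypothetical business endpoint; a real service would query its own store.
    return jsonify(order_id=order_id, status="confirmed"), 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```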

Services Orchestration:

Orchestration is the process of getting all the (infrastructural) components lined up to deliver your digital service to your customers. All moving parts in an IT environment are part of orchestration: from a code change to production, everything is orchestration. Orchestration is the heartbeat of every iteration in your delivery lifecycle. There are four steps in the IT services part of orchestration.

  1. Provision the infrastructure.
  2. Code and commit changes.
  3. Build and test the service.
  4. Deploy and run the services.

Cloud native applications help with all of the above steps in an automated way, as the sketch below suggests.
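
A toy sketch of those four steps as an orchestrated pipeline. Each stage is a placeholder function standing in for a real tool (IaC, source control, CI, CD), and the failure handling is deliberately simplistic.

```python
# Toy orchestration of the four steps above. Each function is a placeholder
# for a real tool; the point is the ordering and halt-on-failure behaviour.

def provision_infrastructure() -> bool:
    print("1. provisioning infrastructure")
    return True

def code_and_commit() -> bool:
    print("2. committing changes")
    return True

def build_and_test() -> bool:
    print("3. building and testing")
    return True

def deploy_and_run() -> bool:
    print("4. deploying and running")
    return True

PIPELINE = [provision_infrastructure, code_and_commit, build_and_test, deploy_and_run]

def orchestrate() -> bool:
    for stage in PIPELINE:
        if not stage():          # stop the pipeline at the first failed stage
            print(f"pipeline halted at {stage.__name__}")
            return False
    return True

if __name__ == "__main__":
    orchestrate()
```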

To conclude, a cloud native application development approach is critical to making good use of the cloud. Many legacy applications are being migrated as cloud native applications to improve their scalability and availability.