Anthos Blog Series (Part 1) — Anthos Service Mesh

Published in Searce · 9 min read · Sep 28, 2021

Greetings! Welcome to part 1 of my blog series on Anthos. Over the course of this series, we are going to cover various topics associated with Google Cloud’s Anthos. The series will involve conceptual understanding supplemented by practical tutorials for you to get up to speed on what some consider a revolutionary piece of technology.

Before we begin, let’s read the description of Anthos as defined in the official Google Cloud documentation:

Anthos unifies the management of infrastructure and applications across on-premises, edge, and in multiple public clouds with a Google Cloud-backed control plane for consistent operation at scale. — [1]

Anthos actually comprises a suite of services, the key ones being:

1. Infrastructure management
2. Container management and orchestration
3. Service management
4. Policy enforcement

This part of the blog series will focus mostly on the third item, service management, a.k.a. Anthos Service Mesh (ASM). I recommend getting a theoretical understanding of Anthos before we delve further into our adventure.

Now, without further ado, let’s begin.

Anthos Service Mesh

It’s tough to know what a ‘cowboy boot’ is without knowing what a boot is, and tougher still to know what a ‘cassette player’ is without knowing what a cassette is. So it’s only logical that I tell you what a service mesh is before explaining ASM, lest these words start sounding like an alien language or the incoherent ramblings of someone who’s drunk a few more cups of coffee than they should have. I’ll avoid the former scenario, but I can give no guarantees about the latter.

A service mesh is an architecture that enables managed, observable, and secure communication across your services, letting you create robust enterprise applications made up of many microservices on your chosen infrastructure — [2]

Translation: When you have a microservices-based architecture, managing the individual services gets challenging. For example, you may want to authenticate and authorise requests between services, get some observability into the network traffic between services, or even split traffic between services. All this and more can be achieved with a service mesh.

ASM is powered by Istio, an open-source service mesh. The service mesh consists of a data plane and one or more control planes. Essentially, a sidecar container is deployed alongside each of your microservices. These sidecars act as network proxies, i.e., all traffic flowing in and out of a particular microservice has to go via its proxy. These proxies not only monitor and log network traffic, they also enable authentication between services (via mTLS). Prior to this, these tasks had to be performed by the application container itself, if at all.

Here is what the ASM components look like (with a Google Managed Control Plane)

Note that all communication to services happens via their proxies.

Tutorial — Installing Anthos Service Mesh with a Google Managed Control Plane

Now it’s time for the fun part. We’re gonna set up a GKE cluster, install ASM on it, deploy an application so that we have some services to observe, and then spend some time taking a look at what ASM offers.

Prerequisites of managed ASM (features and limitations): https://cloud.google.com/service-mesh/docs/supported-features-mcp

Step 1: Registering your Kubernetes cluster to Anthos

Before we begin, create a GKE cluster on your GCP project. To use ASM, our cluster must be registered with Anthos. We are gonna cover 2 methods for registering the cluster.
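If you prefer the command line for creating the cluster, a minimal sketch could look like the following. The function name create_asm_cluster, the machine type, and the node count are assumptions for illustration and are not taken from this walkthrough, so check them against the prerequisites page linked above. The --workload-pool flag turns on Workload Identity, which the console registration in the next step relies on.

```shell
# Hypothetical helper: creates a GKE cluster with Workload Identity enabled.
# Arguments: project ID, cluster name, zone.
create_asm_cluster() {
  local project_id="$1" cluster_name="$2" cluster_zone="$3"

  # Machine type and node count are assumed values, adjust them to your needs.
  gcloud container clusters create "${cluster_name}" \
    --project "${project_id}" \
    --zone "${cluster_zone}" \
    --machine-type e2-standard-4 \
    --num-nodes 2 \
    --workload-pool "${project_id}.svc.id.goog"
}
```

You would call it as: create_asm_cluster "$PROJECT_ID" "$CLUSTER_NAME" "$CLUSTER_ZONE"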

a. Registering a GKE cluster within the same project using the console:

Assuming you have Workload Identity enabled on your GKE cluster, this form of registration is super simple (one click, in fact). In the GCP console, go to Anthos > Clusters > Register Existing Clusters.

Here you will see a list of unregistered clusters from your project. Simply click the REGISTER button next to the cluster.

After you register your cluster, it will show up in the list of registered clusters in your Anthos console. You can click your registered cluster to see more details about it, and you can even unregister the cluster from here.
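The console flow above also has a command-line counterpart. This is only a sketch, assuming Workload Identity is already enabled on the cluster. The function name register_gke_cluster is hypothetical, and the membership is named after the cluster purely for convenience.

```shell
# Hypothetical helper: registers a GKE cluster using Workload Identity.
# Arguments: cluster name, cluster zone.
register_gke_cluster() {
  local cluster_name="$1" cluster_zone="$2"

  gcloud container hub memberships register "${cluster_name}" \
    --gke-cluster="${cluster_zone}/${cluster_name}" \
    --enable-workload-identity

  # List memberships to confirm the cluster now shows up.
  gcloud container hub memberships list
}
```

You would call it as: register_gke_cluster "$CLUSTER_NAME" "$CLUSTER_ZONE"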

b. Registering a cluster using the Connect for Anthos agent:

Here we are going to use the ‘Connect for Anthos’ agent to register our cluster to Anthos and connect it to the GCP control plane. This method can be used for any Kubernetes cluster (not just GKE). Just ensure you have satisfied the prerequisites before registration.

First, we must create a service account with the role roles/gkehub.connect (the GKE Connect Agent role). Create and download a key for this service account. We are going to use the permissions bound to it to register our cluster.
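The service account setup described above can be sketched with gcloud as follows. The function name create_connect_sa and the default account name anthos-connect are assumptions for illustration. The key is written to ./connect-sa-key.json because that is the file name the registration command expects.

```shell
# Hypothetical helper: creates a service account for the Connect agent,
# grants it roles/gkehub.connect, and downloads a JSON key.
# Arguments: project ID, service account name (optional).
create_connect_sa() {
  local project_id="$1"
  local sa_name="${2:-anthos-connect}"   # assumed name, pick your own
  local sa_email="${sa_name}@${project_id}.iam.gserviceaccount.com"

  gcloud iam service-accounts create "${sa_name}" --project "${project_id}"

  gcloud projects add-iam-policy-binding "${project_id}" \
    --member "serviceAccount:${sa_email}" \
    --role "roles/gkehub.connect"

  gcloud iam service-accounts keys create ./connect-sa-key.json \
    --iam-account "${sa_email}" \
    --project "${project_id}"
}
```

You would call it as: create_connect_sa "$PROJECT_ID"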

Execute the following gcloud command to install the connect agent on the cluster and register it with Anthos.

gcloud container hub memberships register ${CLUSTER_NAME}-connect \
  --gke-cluster=${CLUSTER_ZONE}/${CLUSTER_NAME} \
  --service-account-key-file=./connect-sa-key.json

The output of the command should look like this:

Step 2: Installing Anthos Service Mesh on your cluster

With the cluster registration out of the way, let’s install ASM on our cluster and deploy an application with the ASM sidecar containers injected into it.

a. Download the version of the script that installs Anthos Service Mesh 1.9.6 (latest version is 1.10 as of writing this blog) to the current working directory:

curl https://storage.googleapis.com/csm-artifacts/asm/install_asm_1.9 > install_asm

b. Download the SHA-256 of the file to the current working directory:

curl https://storage.googleapis.com/csm-artifacts/asm/install_asm_1.9.sha256 > install_asm.sha256
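Downloading the checksum only helps if you actually compare it against the script. A minimal sketch of that check, assuming GNU sha256sum is available and that the checksum file lists the script under the name install_asm (the function name verify_download is hypothetical):

```shell
# Hypothetical helper: verifies the files listed in a SHA-256 checksum file.
# Run it from the directory that holds both the script and the checksum file.
verify_download() {
  local checksum_file="$1"
  # --ignore-missing skips entries whose files are not present locally.
  sha256sum -c --ignore-missing "${checksum_file}"
}
```

You would call it as: verify_download install_asm.sha256. A non-zero exit status means the script does not match the published checksum and should not be run.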

c. Make the script executable:

chmod +x install_asm

d. Execute this command to install ASM. It enables Mesh CA by default (rather than Istio’s Citadel). The enable_all flag allows the script to enable the required Google APIs, set Identity and Access Management permissions, and make the required updates to your cluster, which includes enabling GKE Workload Identity (in case it is not already enabled).

./install_asm \
  --project_id $PROJECT_ID \
  --cluster_name $CLUSTER_NAME \
  --cluster_location us-central1-b \
  --mode install \
  --enable_all \
  --output_dir asm

e. Like Istio, ASM uses sidecars as network proxies to give us the information that we expect from a service mesh. Let’s enable sidecar injection on the default namespace. Then every application we deploy in the default namespace will be part of the service mesh.

kubectl label namespace default istio-injection- istio.io/rev=asm-198-6 --overwrite

Note: the label istio.io/rev=asm-198-6 must match the revision label on the istiod pods in the istio-system namespace. The value varies between installations, and injection won’t work if you use the wrong one, so be mindful.
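Rather than typing the revision by hand, you can read it off the istiod pod and feed it into the label command. This is a sketch under the assumption that istiod runs in istio-system with the app=istiod label. The function names get_asm_revision and enable_injection are hypothetical.

```shell
# Hypothetical helper: prints the istio.io/rev label of the first istiod pod.
get_asm_revision() {
  kubectl -n istio-system get pods -l app=istiod \
    -o jsonpath='{.items[0].metadata.labels.istio\.io/rev}'
}

# Hypothetical helper: labels a namespace for injection with that revision.
# Arguments: namespace (defaults to "default").
enable_injection() {
  local namespace="${1:-default}"
  local revision
  revision="$(get_asm_revision)"
  if [ -z "${revision}" ]; then
    echo "could not determine the ASM revision label" >&2
    return 1
  fi
  kubectl label namespace "${namespace}" \
    istio-injection- "istio.io/rev=${revision}" --overwrite
}
```

You would call it as: enable_injection default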

Basically, we’re labelling the default namespace, and any pod created in the default namespace from now on will have the sidecar container injected into it. Pods that were already running before the namespace was labelled need to be restarted to pick up the sidecar.

Now that we have ASM installed and the injection configured, we can go ahead and deploy our application to the default namespace and start observing it. I have gone ahead and deployed the sample Bookinfo application, front-ended by an Istio ingress gateway. You can find the source code here: https://github.com/istio/istio
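The exact manifests used for this post are not shown, so the following is only one plausible way to deploy the sample. It assumes the manifests from the release-1.9 branch of the Istio repository and an ingress gateway that is already running in the cluster. The function name deploy_bookinfo is hypothetical.

```shell
# Assumption: manifests come from the release-1.9 branch of the Istio repo.
BOOKINFO_BASE="https://raw.githubusercontent.com/istio/istio/release-1.9/samples/bookinfo"

# Hypothetical helper: deploys Bookinfo and its gateway into a namespace.
# Arguments: namespace (defaults to "default").
deploy_bookinfo() {
  local namespace="${1:-default}"

  # The application itself: productpage, details, reviews, ratings.
  kubectl apply -n "${namespace}" -f "${BOOKINFO_BASE}/platform/kube/bookinfo.yaml"

  # Gateway and VirtualService that expose productpage via the ingress gateway.
  kubectl apply -n "${namespace}" -f "${BOOKINFO_BASE}/networking/bookinfo-gateway.yaml"

  # Each pod should report 2/2 containers once the sidecar is injected.
  kubectl get pods -n "${namespace}"
}
```

You would call it as: deploy_bookinfo default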

Step 3: Evaluating service performance and security using ASM

Here’s what the sample book info app frontend looks like:

Anthos Service Mesh provides us with a suite of features and tools that help us observe and manage secure, reliable services in a unified way. Here are some of the things we’re gonna look at with respect to our sample book-info application:

Service metrics and logs
Preconfigured service dashboards
Telemetry data
Service-to-service relationships (with topology view)
Service Level Objectives (SLOs)

I simulated some traffic to the external Istio ingress gateway endpoint so we have something to observe.
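How the traffic was simulated is not shown here. A simple loop like the one below would do the job. It is a sketch that assumes the Bookinfo /productpage path is exposed over plain HTTP on the gateway, and the function name generate_traffic is hypothetical.

```shell
# Hypothetical helper: sends N requests to the Bookinfo product page.
# Arguments: gateway host or IP, number of requests (default 100).
generate_traffic() {
  local gateway="$1"
  local count="${2:-100}"
  local i
  for i in $(seq 1 "${count}"); do
    # Print only the HTTP status code of each response.
    curl -s -o /dev/null -w '%{http_code}\n' "http://${gateway}/productpage"
  done
}
```

The gateway address can usually be read with kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}', after which you would call generate_traffic "$GATEWAY_IP" 200.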

Go to GCP Console > Anthos > Service Mesh

If you’ve deployed your app on a namespace that has been successfully asm-injected, you should get a list (table view) of your services as shown below, along with some useful metrics like:

req/sec
error rate
50th percentile latency
99th percentile latency

You can switch to topology view to get a more graphical representation of your services and their inter-connectivity.

Each microservice is depicted as a canonical service consisting of a deployment/pod and a Kubernetes service. ASM observes the network traffic between services inside the cluster and plots this graph accordingly.

You can click on a specific service to get even more details about it, such as Server/Client error rate, CPU/Memory utilisation, etc:

Once you’ve selected a specific service, go to ‘Connected Services’ to get a service-to-service relationship topology view as shown below.

Go to the ‘Health’ section. Here you can create Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for individual services. You can set SLIs based on some pre-defined metrics like Availability (the proportion of successful requests over a period of time) and Latency (the proportion of responses that were as fast as or faster than the latency threshold you provided), and you can even use custom metrics to create your own user-defined SLIs and SLOs.

Naturally, you can set alerting policies based on these SLOs. This is a very impressive and useful feature, especially for certain applications that are latency sensitive or need to abide by certain non-functional requirements or need to maintain compliance.

E.g.: Setting a 95% availability monthly SLO on a service
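To get a feel for what a 95% availability target means in practice, it helps to turn it into an error budget, i.e. the number of requests that may fail in the period before the SLO is breached. The function below is a back-of-the-envelope sketch with a hypothetical name and a made-up request volume. It is not how ASM computes compliance internally.

```shell
# Hypothetical helper: number of failed requests an availability SLO tolerates.
# Arguments: total requests in the compliance period, SLO target in whole percent.
error_budget() {
  local total_requests="$1"
  local slo_percent="$2"
  echo $(( total_requests * (100 - slo_percent) / 100 ))
}

# With 1,000,000 requests in a month and a 95% target:
error_budget 1000000 95   # prints 50000
```

In other words, with that volume, the service can serve up to 50,000 failed requests in the month and still meet the objective.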

In my opinion, this is one (among many) of the most useful features of ASM.

Finally, go to the Security section. Here we can see the ports via which services are communicating with each other, whether or not these service communications are mTLS encrypted (as indicated by the green lock icon), and the request principals.
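If some connections in this view show up without the lock, one way to require mTLS for a whole namespace in Istio is a PeerAuthentication policy in STRICT mode. The sketch below only prints the manifest so you can review it before applying it. The function name strict_mtls_manifest is hypothetical, and applying the policy to your own workloads is a decision this walkthrough does not cover.

```shell
# Hypothetical helper: prints a PeerAuthentication manifest that requires
# mTLS for all workloads in the given namespace.
# Arguments: namespace (defaults to "default").
strict_mtls_manifest() {
  local namespace="${1:-default}"
  cat <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: ${namespace}
spec:
  mtls:
    mode: STRICT
EOF
}
```

After reviewing the output, you could apply it with: strict_mtls_manifest default | kubectl apply -f -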

Now that we’ve seen the amazing benefits of ASM, here is a bit of info that a lot of people don’t know:

“Psst, you can use Anthos Service Mesh without actually using Anthos.”

That being said, there are a lot of UI components that you miss out on. This table shows you what you get with just ASM & with ASM + Anthos.

Conclusion

In spite of the lack of brevity in this article, we have just barely explored the tip of the iceberg that is Anthos.

Congratulations on reaching this far. I hope you’ve found this document at least insightful, if not directly useful. Stay tuned, this is only part 1 of my series on Anthos, so expect more in the future.

Drop a comment letting me know what Anthos-related topic you would like covered next. Until next time :)