Practice Exams:

Google Professional Data Engineer – Managed Instance Groups and Load Balancing part 4

  1. SSL Proxy and TCP Proxy Load Balancing

We’ve just seen in a lot of detail how the global external HTTP(S) load balancing works. Let’s move on to SSL proxy load balancing and see how the Google Cloud Platform implements it. SSL proxy load balancing sits at the session layer of the OSI stack, so it’s below the application layer. Remember, the rule of thumb: if you can do application layer load balancing, you should prefer it over SSL or TCP load balancing. But if your traffic is encrypted yet not HTTPS, SSL proxy load balancing is what you will choose. We’ve just seen a visualization of the OSI network layer stack and we’ve seen where SSL proxy load balancing sits. Traffic flows over the Internet using the TCP/IP protocol.

The network layer is the IP addresses, the transport layer is TCP, and the application layer is typically HTTP traffic. If you want traffic to be secure, that is, encrypted as it passes over the network, then typically you will configure the session layer to be SSL, the Secure Sockets Layer, or at the application layer the traffic should be HTTPS. SSL proxy load balancing is specifically used for traffic which is not HTTPS but is still encrypted. SSL proxy load balancing is global load balancing. If your traffic follows the HTTP or HTTPS protocol, just use the application level load balancing; don’t move down to the lower layer. SSL connections that are received by the proxy are typically terminated at the global layer, and the proxy then reestablishes connections with the instance group in the back end.

Here is a block diagram of SSL proxy load balancing. Let’s look at what the components are and how they interact with each other. At the very top are the clients which make connections to your load balancer. You might have a user in Iowa and a user in Boston. Note that their connections are secure. They are encrypted using SSL indicated by the lock. This is external traffic and it hits the global SSL proxy load balancer. If you remember our hierarchical diagram early on, SSL proxy load balancing is global because it spans multiple regions and it’s external because it receives traffic from the Internet. The SSL connection that this load balancer receives from the clients is terminated at the SSL proxy, which means we have to install a certificate on this proxy.

This proxy then makes fresh connections to the back ends. These fresh connections can be SSL or non-SSL connections; because the proxy creates fresh connections, it is free to use either, and SSL connections are preferred for security. It is said that the SSL connections are terminated at the global layer and then proceed to the closest available instance group, the instance group that is closest to the user that made the request. That’s all we’ll cover on the SSL proxy load balancer. Let’s look at the last external global load balancer that Google offers, and that is the TCP proxy. Once again, we refer to the OSI network stack and find that the TCP proxy functions at the transport layer. TCP proxy load balancing allows you to use a single IP address for all users around the world.

It is global load balancing after all, and it automatically routes traffic to the instances that are closest to the user. We’ve spoken about this rule of thumb earlier: we prefer load balancing at higher layers of the OSI network stack where possible. So the advantage of transport layer load balancing is that more intelligent routing is possible as compared with the network layer, which is what we look at next. TCP load balancing also provides better security because TCP vulnerabilities can be patched at the load balancer itself. Here is a block diagram for how TCP proxy load balancing works. Note that it looks very similar to SSL proxy load balancing, except that the traffic is not encrypted at the very top.

We have our clients which generate traffic for our load balancer: users in Iowa, users in Boston. This is an external load balancer, which means it receives traffic from the Internet. Once again, because it is a proxy load balancer, the proxy makes new connections to the back end. The connections which come in from the Internet are terminated at the TCP proxy layer. The new connections which the proxy makes to the back end can be TCP connections or even SSL connections. Just like in the case of SSL, the TCP connections are terminated at the global layer, at the load balancer, and then proceed to the closest available instance group.

  1. Lab: SSL Proxy Load Balancing

In this demo we will create an SSL load balancer which will terminate incoming SSL connections at the global load balancing layer and then distribute these connections across our instances using either SSL or TCP. Note that an SSL load balancer is only meant for SSL traffic, not for HTTP or HTTPS; for those, an HTTP(S) load balancer is used. Let me begin, though, by showing a couple of resources which I have provisioned in advance and which you will need to create on your own before starting this tutorial. The first is a VPC network which I have named Load Balance Network, and the second is a firewall rule to allow SSL connections to come into this network. This rule applies to any instances which have the tag ssl-lb and applies to incoming connections on port 443, which is the SSL port.
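The two pre-provisioned resources can be sketched with gcloud. The names lb-network and allow-ssl here are assumptions; the demo may use different names:

```shell
# A minimal sketch of the pre-provisioned resources (names are illustrative).
gcloud compute networks create lb-network --subnet-mode=auto

# Allow inbound TCP on port 443 to any instance tagged ssl-lb.
gcloud compute firewall-rules create allow-ssl \
    --network=lb-network \
    --allow=tcp:443 \
    --target-tags=ssl-lb
```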

So with our VPC network and firewall rule in place, let us go ahead and provision our instances. Our first instance we will call first, and we will include the network tag so that our firewall rule will apply to it. We select our VPC network, and finally we do have to enter a startup script. You can find the startup script at the location displayed on the page. What it essentially does is install the Apache web server on the host and then place a basic web page in the /var/www/html directory which simply displays the text first. Our second instance will be nearly identical to the first one, so we can simply clone the first one once it’s ready. The only things we need to change over here are the name and the web page itself, so instead of first we will display the text second.

So that’s one small change to the startup script. We hit create, and with this we now have our instances ready. The next step for us is to create instance groups. Our first group we will call instance-group-first, which will be in the same zone as our instances. Note here that we specify a port name mapping which will be used later when we configure our load balancer: we call this port ssl-lb and it maps to port 443, which is used for SSL connections. For this example we will just use an unmanaged instance group, since we don’t really need autoscaling or autohealing. We select our load balance network, and finally we add our first instance to this instance group. Once that is ready, we are ready to provision our second instance group, which we shall call instance-group-second.
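The console steps for the first instance group can be approximated with gcloud; the group name, zone, and instance name here are assumptions:

```shell
# Create an unmanaged instance group and add the first instance to it
# (zone and names are illustrative, not the demo's actual values).
gcloud compute instance-groups unmanaged create instance-group-first \
    --zone=us-central1-a
gcloud compute instance-groups unmanaged add-instances instance-group-first \
    --zone=us-central1-a --instances=first

# Map the named port ssl-lb to 443; the backend service refers to this name.
gcloud compute instance-groups unmanaged set-named-ports instance-group-first \
    --zone=us-central1-a --named-ports=ssl-lb:443
```

The second group is created the same way, with the second instance added instead of the first.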

Again this is more or less the same as the first instance group. We retain the port name mapping and this shall also be an unmanaged instance group. The only difference here is that we remove the first instance and add the second one once that is ready. We are now set up with our instance groups and we can go ahead and create our SSL load balancer. So we navigate to network services and load balancing. And the kind of load balancer we are provisioning is an SSL proxy. So let us pick that option. And also our load balancer will be accepting connections from the internet and distributing them to our internal VMs. So let’s specify that this is going to be an internet facing load balancer.

One of the benefits of having an SSL load balancer is that the SSL processing can be offloaded to the load balancer itself, rather than having our instances handle the resource intensive encryption and decryption. In order to use this feature, though, we need to select multiple regions, and under connection termination we choose yes. Once we hit continue, we move on and supply a name for our load balancer. Then, in the back end configuration, we change the protocol to SSL. If you remember, in our firewall rule we had specified that communication to our instances can happen only through the SSL port, 443. Also, over here we set the named port which is going to be used; going back to our instance group, we had specified a named port of ssl-lb which maps to port 443.

So we now specify our first back end, which is our first instance group, and say that communication will occur on port 443. We then add our second instance group, and communication again is on 443. The next step is to add a health check so that the load balancer knows that the back end instances are up and in a healthy state. The way to do this for our application is to see whether the back end instances are responding to connections using the SSL protocol on port 443. So with our health check in place, let us now create a front end configuration. After giving it a name, we can assign the kind of IP address we want. In this example, I’m just using a static IP which I had reserved before.

As for the certificate, in this example I am using a self-signed certificate which I had created beforehand. You could either create one yourself by googling how to create a self-signed certificate, or if you have one already, you could just use that. There are three different components to the certificate which I’m adding here. GCP might warn us about using a self-signed certificate, but let’s proceed anyway. We just quickly review the details of our load balancer, and if all looks good, we hit create. Note that this load balancer provisioning might take a few minutes, so you will need to wait until all the health checks pass. But all our instances are healthy, so let us now go see if our application is reachable.
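For reference, the certificate and the load balancer pieces configured above can be sketched with openssl and gcloud. All resource names here are assumptions, not the ones used in the demo:

```shell
# Generate a self-signed certificate (fine for a demo, not for production).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout key.pem -out cert.pem -subj "/CN=ssl-lb-demo"

gcloud compute ssl-certificates create demo-cert \
    --certificate=cert.pem --private-key=key.pem

# SSL health check and backend service using the named port ssl-lb.
gcloud compute health-checks create ssl demo-hc --port=443
gcloud compute backend-services create demo-backend \
    --global --protocol=SSL --port-name=ssl-lb --health-checks=demo-hc
gcloud compute backend-services add-backend demo-backend \
    --global --instance-group=instance-group-first \
    --instance-group-zone=us-central1-a

# Terminate SSL at the proxy and expose it on a global frontend, port 443.
gcloud compute target-ssl-proxies create demo-proxy \
    --backend-service=demo-backend --ssl-certificates=demo-cert
gcloud compute forwarding-rules create demo-rule \
    --global --target-ssl-proxy=demo-proxy --ports=443
```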

So we grab the IP address of the load balancer, and note that we need to use the HTTPS protocol when accessing it from a browser. We enter the IP address, and since we are using a self-signed certificate, we will get a warning, but we choose to proceed anyway, and we can see that our app is accessible. To see if the load balancer is distributing the load between our instances, we may need to hit refresh a few times, but eventually we can see that our second instance is reachable as well. So we have now successfully set up a load balancer to distribute SSL traffic among different instances. That concludes this demo. I hope you found it useful. Thank you.

  1. Network Load Balancing

We are done discussing all the global load balancers that GCP offers. Let’s now look at the network load balancer, which is an external load balancer, meaning it deals with traffic from the Internet, but it is regional in that the instances to which it distributes traffic are located in a single region. They can be spread across multiple zones in the region. Network load balancing operates at the network layer of the OSI stack, which means it has access to the IP address and the protocol of the incoming traffic, and it’s based on these characteristics that it distributes the load across instances. It uses the IP address, port, and protocol type in order to make load distribution decisions. The network load balancer is not a proxy load balancer.

It is a pass-through load balancer in that once it receives an external connection from the client, it does not create a new internal connection. It does not terminate the connection; it passes the connection through to the VM instance. It’s regional because the VM instances that it connects with are all located in the same region, maybe across multiple zones, but in the same region. Network load balancers work with UDP traffic. They also work with TCP and SSL traffic, though in those cases you should prefer a load balancer operating at a higher layer of the OSI stack, that is, TCP proxy or SSL proxy load balancing. You choose network load balancing for SSL and TCP traffic only if that traffic is not supported by the SSL proxy and TCP proxy load balancers.

SSL proxy and TCP proxy load balancers support traffic only on a specified set of ports. If you use a port outside of this set, then you might need to use network load balancing. The load balancing algorithm for a network load balancer uses five different pieces of information in order to determine which instance it will connect to. It picks an instance based on a hash of the source IP and port, the destination IP and port, and the protocol of the incoming traffic. This can be thought of as a five-tuple hash. If you use a five-tuple hash, it is much harder to get session affinity: incoming TCP connections are spread across instances, and each new connection may go to a different instance.
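A toy sketch of the idea (this is not Google’s actual hash function): hash the five fields and take the result modulo the number of backends. The same five-tuple always lands on the same backend, while a change to any one field, such as the source port, may pick a different one.

```shell
#!/usr/bin/env bash
# Toy 5-tuple hash: src IP, src port, dst IP, dst port, protocol.
backends=(instance-1 instance-2 instance-3)

pick_backend() {
  # cksum gives a stable CRC of the tuple string.
  local h
  h=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo "${backends[$((h % ${#backends[@]}))]}"
}

a=$(pick_backend "203.0.113.5:51000 -> 198.51.100.9:80 TCP")
b=$(pick_backend "203.0.113.5:51000 -> 198.51.100.9:80 TCP")
c=$(pick_backend "203.0.113.5:51001 -> 198.51.100.9:80 TCP")  # new source port
echo "$a $b $c"   # a and b are always identical; c may differ
```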

Once an instance on the back end has been chosen, regardless of the session affinity settings, all packets for a connection are directed to that same instance until the connection is closed, and that connection has no impact on load balancing decisions for new incoming connections. This can be a problem because it can result in an imbalance between back ends if you have many long-lived TCP connections which send a lot of traffic. Network load balancing forwards traffic to target pools; it need not use managed instance groups or any instance groups. A target pool is a different logical partition. It’s simply a group of instances which receive incoming traffic from some forwarding rule. You can have a managed instance group associated with a target pool, but a target pool can be standalone as well.

A target pool can only be used with forwarding rules for TCP and UDP traffic. You’ll have a main target pool to which the network load balancer directs traffic, and in addition you can have backup pools which will receive requests if the main target pool is unhealthy. The network load balancer decides whether to send traffic to the backup pool based on the failover ratio, which is the proportion of healthy instances in a target pool. If a large number of instances in the primary target pool fail, such that the primary pool’s proportion of healthy instances falls below the failover ratio that we’ve configured, traffic is directed to the backup pool rather than the primary pool. Just like with managed instance groups, target pools also have health checks.
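As a worked example with made-up numbers, the failover decision compares the fraction of healthy instances in the primary pool against the configured ratio:

```shell
#!/usr/bin/env bash
# Hypothetical primary pool: 1 of 4 instances healthy; failover ratio 0.5.
healthy=1
total=4
failover_ratio_pct=50   # configured failover ratio, as a percentage

# Route to the backup pool when healthy/total drops below the ratio.
if [ $(( healthy * 100 / total )) -lt "$failover_ratio_pct" ]; then
  pool=backup-pool
else
  pool=primary-pool
fi
echo "$pool"
```

Here 1/4 = 25% is below the configured 50%, so traffic goes to the backup pool.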

You can configure health checks to check the health of specific instances within the target pool. GCP has legacy health checks as well as the current version of health checks, and network load balancing uses the legacy kind; that’s something to watch out for when you’re setting up this configuration. As in the case of the other load balancers that we’ve set up, you need to configure your firewall rules so that traffic from the load balancer can reach the instances, and health check requests should be able to reach the instances as well. The load balancer and the health checkers use specific IP ranges to connect to the instances; ensure that your firewall rules are configured to allow traffic from these IPs.
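A rule like the following admits Google’s health-check probes. The source ranges shown are the ones documented for legacy health checks at the time of writing, and the network name is an assumption, so verify both against the current documentation:

```shell
# Allow legacy health-check probes to reach the backend instances.
# Verify these source ranges against Google's current documentation.
gcloud compute firewall-rules create allow-health-checks \
    --network=lb-network \
    --allow=tcp \
    --source-ranges=35.191.0.0/16,209.85.152.0/22,209.85.204.0/22
```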

  1. Internal Load Balancing

We are done with all the external load balancers. Google offers one kind of internal load balancing, and it is regional in nature. An external load balancer is one which receives traffic from the external world, that is, from the Internet. An internal load balancer receives traffic from within your network and then distributes it across multiple instances in a region. Since internal load balancing is constrained to be within a single VPC, it can depend entirely on internal or private IP addresses. Internal load balancing will expose a private load balancing IP address that only your VPC instances can access. There are a few advantages to having a load balancer that is within a single network, the VPC: traffic stays internal, there is more security, and there is less latency to access the instances.

The fact that we do not need a public IP address reduces our exposure and reduces the threat to our load balancer and our network. The typical use case for internal load balancing is when you have front end applications, which typically form the websites that your customers will access, and these front end applications need to make requests to the corresponding back end APIs. Internal load balancing is useful to balance requests from our front end instances to back end instances. Here is a block diagram of a regional internal load balancer that serves traffic to VM instances on a subnet. At the back end we have just a single subnet, which is present in the US Central region. The instances in this subnet are served by the internal load balancer. Here we have a single subnet, but instances can belong to different subnets, as long as they are in the same VPC, for the internal load balancer to use.

The back end instances have been split across multiple zones; load balancing and the service itself will be more reliable when more than one zone is included. The load balancer has been set up along with a regional forwarding rule which forwards traffic to it. Notice the IP address of the load balancer: it is an IP address which belongs to the same VPC network. The requests which come into the internal load balancer get forwarded to one of the two instance groups within the subnet that it serves. Here is how the internal load balancer distributes traffic. The back end instance for a particular client is selected using a hashing algorithm, and this hashing algorithm also takes instance health into account. Traffic is forwarded only to those instances which are healthy.

Internal load balancing uses a five-tuple hash. It includes five parameters in the hash which determines which instance it will connect to: the client source IP, the client port, the destination IP (which is the same as the IP address of the load balancer), the destination port, and the protocol. The protocol can be either TCP or UDP. When you use a five-tuple hash, you typically do not get session affinity; each request from the client might be routed to a different instance. If you do want session affinity, you can hash only some of the five parameters in order to choose your instance. You can have a three-tuple hash, where the hash is based on the client IP, the destination IP, and the protocol of the packet. Or you can have a two-tuple hash, where you use just the client IP and the destination IP, with no protocol.

We’ve spoken about health checks earlier; the same things apply for internal load balancing as well. Based on the kind of traffic you have, you can choose one of these health checks. Internal load balancing is a managed service and is highly available; there is no additional configuration that you need to set up in order to ensure high availability. Internal load balancing is a regional service, which means that in order to increase the reliability of this service, you can configure instance groups in different zones. You cannot spread these instance groups across regions, but you can spread them across multiple zones. Failures in one zone do not affect another zone in the same region. This is how you protect yourself against zonal failure when you have multiple instance groups, whether managed or unmanaged, behind an internal load balancer.
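A toy illustration of why fewer hash fields give affinity (again, not the real hash function): a two-tuple hash ignores the client’s source port, so repeat requests from the same client IP land on the same backend even as the ephemeral port changes.

```shell
#!/usr/bin/env bash
# Toy 2-tuple hash: client IP and destination IP only.
backends=(backend-a backend-b backend-c)

pick_two_tuple() {
  local h
  h=$(printf '%s' "$1 $2" | cksum | cut -d' ' -f1)   # client IP, dest IP
  echo "${backends[$((h % ${#backends[@]}))]}"
}

# Two requests from the same client; the source port is not part of the
# hash inputs, so both map to the same backend.
r1=$(pick_two_tuple "10.0.0.7" "10.0.0.100")
r2=$(pick_two_tuple "10.0.0.7" "10.0.0.100")
echo "$r1 $r2"   # always the same backend twice
```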

All these instance groups are treated as if they are in a single pool, and the load balancer distributes traffic amongst them using the load balancing algorithm. The internal load balancer on GCP behaves a little differently from traditional proxy internal load balancing. In a traditional load balancing setup, you would configure an internal IP on a load balancing device or instance, and your clients would connect to this IP. The traffic which comes into that IP is terminated at the load balancer; this is a proxy load balancer. Based on some kind of hashing function, the load balancer will choose the back end to which it sends traffic. In fact, there are two connections: the client to the load balancer is one connection, and because it’s a proxy, the load balancer to the back end is the second connection.

This is a diagrammatic representation of how a traditional proxy internal load balancer works. Notice that the client instances talk to the load balancer, and the load balancer in turn conveys the traffic via a fresh connection to the back end instances. The GCP internal load balancer differs from traditional proxy internal load balancing in several ways. In the first instance, the GCP load balancer is not proxied. It is a pass-through load balancer in that the connection that is received from the client is passed on to the back end instance. It provides very lightweight load balancing as a managed service. It’s not a physical device which provides load balancing; it’s built on Google’s Andromeda network virtualization stack.

Internal load balancing is basically software that runs on the Google cloud, and it delivers traffic directly from the client instance to the back end instance; there is no proxy. Here is a basic diagram showing how GCP internal load balancing works. Notice that the internal load balancer receives connections from the client, which it passes on directly to the back end. A typical use case for an internal load balancer is a three tier web app. Clients connect directly to an external load balancing device; client traffic is typically HTTP or HTTPS traffic, which means this has to be a global external HTTP(S) load balancer. This load balancer then distributes the traffic across the front end instances. The front end instances and the back end instances may be hosted on the same network, in which case the front end instances connect to the back end instances via an internal load balancer.