Amazon AWS SysOps – S3 Storage and Data Management – For SysOps (incl Glacier, Athena & Snowball) Part 3

  1. CloudFront Overview

Now we’re getting into content delivery, and we’ll start with CloudFront. CloudFront is a content delivery network, or CDN. It improves read performance because the content is distributed and cached at the edge locations. Edge locations are all around the world; there are about 216 points of presence globally as I’m recording this lecture, and AWS adds new points of presence all the time. So it’s much more than the 30-something regions that AWS has. This is a worldwide thing. So what does CloudFront give you on top of this caching at the edge? Well, it gives you DDoS protection, that is, protection against distributed denial-of-service attacks. It gives you integration with Shield and also with the Web Application Firewall; we’ll see those in the security section of this course.

But the idea is that it’s really protected, and it’s a good way to front your applications when you deploy them globally. It also allows you to expose an external HTTPS endpoint by loading a certificate, and to talk internally in HTTPS to your applications if you need to encrypt that traffic as well. So let’s look at a diagram. This is a map of the world: there are some regions in orange, and everything else on this graphic is an edge location. As you can see, they’re all around the globe. So, for example, say we have an S3 bucket in Australia and some user from America wants to access it. That user is actually going to access an edge location close to them, so in America, and that traffic is going to be transmitted over the private AWS network all the way to the S3 bucket, and the content is going to be cached.

So the idea is that the more users you have in America, the more they will want to do the same kind of reads, and they will have content served directly from America, not necessarily from Australia, because it will be fetched once into America and then served from there. So it’s cached locally. Another user, maybe in Asia, will talk to an edge location closer to Asia. That edge location, again, will forward the traffic to the S3 bucket to get the content and then cache it at the edge. So CloudFront really allows you to distribute your reads all around the world based on these different edge locations, improve latency, and reduce the load on your main S3 bucket. So I said S3 buckets. But what are the different CloudFront origins?

Well, the first one is an S3 bucket, and using CloudFront in front of S3 is a very common pattern to distribute your files globally and cache them at the edge. You also get enhanced security between CloudFront and your S3 bucket, as we’ll see in the hands-on, using a CloudFront OAI, or origin access identity. This allows your S3 bucket to only allow communication from CloudFront and from nowhere else. And then finally, you could also use CloudFront as an ingress to upload files into S3 from anywhere in the world. Okay, the other option is to use a custom origin, and it must be an HTTP endpoint. So this could be anything that respects the HTTP protocol. It could be an Application Load Balancer, it could be an EC2 instance.

It can be an S3 website, but we must first enable the bucket as a static S3 website. Note that a website is different from a plain S3 bucket origin, and we need to enable that setting as we’ve seen before. And it could be any HTTP backend you want, for example one running on your on-premises infrastructure. Okay, so how does CloudFront work at a high level? We have a bunch of edge locations all around the globe, and they’re connected to the origin we define. It could be an S3 bucket, or it could be any HTTP endpoint.

And our clients want to access our CloudFront distribution. To do this, the client sends an HTTP request directly to CloudFront. And this is what an HTTP request would look like: there would be a URL, some query string parameters, and also some headers. Then the edge location will forward the request to your origin, and that includes the query strings and the request headers. So everything gets forwarded on to your origin, although you can configure this. Then your origin responds to the edge location. The edge location will cache the response based on the cache settings we’ve defined and return the response back to our client. And the next time another client makes a similar request, the edge location will first look into the cache before forwarding the request to the origin.
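The request flow just described (look in the cache first, go to the origin only on a miss, keep the response for a while) can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the `EdgeLocation` class, the `demo_origin` function, and the single fixed TTL are all hypothetical, and real CloudFront cache behavior depends on the cache settings of the distribution.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeLocation:
    """Toy model of one edge location: serve from cache, fetch from origin on a miss."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # path -> (expiry time, cached body)
        self._cache: Dict[str, Tuple[float, str]] = {}
        self.origin_requests = 0

    def get(self, path: str) -> str:
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            # Cache hit: the origin is not contacted at all.
            return entry[1]
        # Cache miss or expired entry: forward the request to the origin.
        self.origin_requests += 1
        body = self._fetch(path)
        self._cache[path] = (now + self._ttl, body)
        return body


def demo_origin(path: str) -> str:
    """Hypothetical origin standing in for an S3 bucket or HTTP backend."""
    return f"content of {path}"
```

With a TTL of 60 seconds, two reads of `/beach.jpg` in a row would reach the origin once; a third read after the TTL has passed would reach it again.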

That is the whole purpose of having a CDN. Okay, so very, very simple. This is how CloudFront works at a high level. So let’s look at S3 as an origin in detail. You have the AWS cloud and you have your origin, which is your S3 bucket. And, for example, you have an edge location in Los Angeles, and some users want to read some data from there. So your edge location is going to fetch the data from your S3 bucket over the private AWS network and give you the results from that edge location. The idea here is that for the CloudFront edge location to access your S3 bucket, it’s going to use an OAI, or origin access identity. It works like an IAM role for your CloudFront origin. And using that identity, it’s going to access your S3 bucket.

And the bucket policy is going to say, yes, this identity is allowed, and yes, send the file to CloudFront. This works as well for other edge locations, for example in São Paulo in Brazil, or Mumbai, or Melbourne. And so all around the world, your edge locations are going to serve cached content from your S3 bucket. So we can see how CloudFront can become super helpful as a CDN. Now, what if we have an ALB or EC2 as an origin? The security changes a little bit. We have our EC2 instance or instances, and they must be public because they must be publicly accessible from an HTTP standpoint. And we have our users all around the world, so they will access our edge location, and our edge location will access our EC2 instance. As you can see, the traffic traverses the security group. So the security group must allow the IPs of the CloudFront edge locations into the EC2 instance. And for this, there is a list of public IPs for edge locations that you can get on this website.

And the idea is that the security group must allow all these public IPs of the edge locations so that CloudFront can fetch content from your EC2 instances. So that makes sense. What if we use an ALB as an origin? Now we have a security group for the ALB, and the ALB must be public to be accessible by CloudFront. But the backend EC2 instances can now be private. In terms of security groups, the one for the EC2 instances needs to allow the security group of the load balancer. We’ve seen this extensively. And the edge locations, which are again public locations, need to access your ALB through the public network. So that means the security group for your ALB must allow the public IPs of the edge locations, the same public IPs as we had before. So, two different architectures, same concept, but now we better understand network security for S3, and for ALB or EC2, behind CloudFront.
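To make the security group rule concrete, here is a minimal Python sketch of the check it effectively performs: is the source IP inside one of the edge location ranges? It assumes the list is the JSON document AWS publishes at `https://ip-ranges.amazonaws.com/ip-ranges.json`, where CloudFront entries carry `"service": "CLOUDFRONT"`. The prefixes in `SAMPLE_IP_RANGES` are documentation-only addresses made up for this example, not real CloudFront ranges, and the function names are hypothetical.

```python
import ipaddress
import json
from typing import List, Union

IPNetwork = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Made-up sample in the same shape as the published ip-ranges.json file.
# The prefixes are reserved documentation ranges, NOT real CloudFront IPs.
SAMPLE_IP_RANGES = json.loads("""
{
  "prefixes": [
    {"ip_prefix": "192.0.2.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
    {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "EC2"},
    {"ip_prefix": "203.0.113.0/25", "region": "GLOBAL", "service": "CLOUDFRONT"}
  ]
}
""")


def cloudfront_networks(ip_ranges: dict) -> List[IPNetwork]:
    """Keep only the prefixes that belong to the CLOUDFRONT service."""
    return [
        ipaddress.ip_network(entry["ip_prefix"])
        for entry in ip_ranges["prefixes"]
        if entry["service"] == "CLOUDFRONT"
    ]


def security_group_allows(source_ip: str, allowed: List[IPNetwork]) -> bool:
    """Mimic an inbound rule set: allow only sources inside an allowed range."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in allowed)
```

In this sketch, a request from `192.0.2.10` would be let through, while one from `198.51.100.7` (an EC2 range in the sample) would not.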

Now, CloudFront is a CDN, but it also has some really nice features. One of them is geo restriction. So you can restrict who can access your distribution. You can provide a whitelist, where you’re saying, okay, users from this list of approved countries, and only this list, can go to CloudFront. Or you can use a blacklist, where you’re saying, okay, users from these countries are not allowed to access our distribution. The country is determined using a third-party GeoIP database, against which the incoming IP is matched to figure out the country. So the use case for geo restriction is when copyright laws require you to prevent access to your content, and you want to prove to regulators that you are indeed restricting content access from, say, France, if you have content in America.
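The whitelist and blacklist logic can be sketched in a few lines of Python. Everything here is hypothetical: `GEOIP_TABLE` is a tiny stand-in for the third-party GeoIP database, the IPs are documentation addresses, and the handling of an IP with no known country (denied by a whitelist, allowed by a blacklist) is an assumption of this sketch, not documented CloudFront behavior.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical stand-in for a GeoIP database: IP address -> country code.
GEOIP_TABLE: Dict[str, str] = {
    "192.0.2.10": "US",
    "198.51.100.20": "FR",
    "203.0.113.30": "AU",
}


def country_of(ip: str) -> Optional[str]:
    """Look up the country for an IP; None when the IP is unknown."""
    return GEOIP_TABLE.get(ip)


def is_allowed(ip: str, mode: str, countries: FrozenSet[str]) -> bool:
    """Apply a whitelist (only listed countries) or blacklist (all but listed)."""
    country = country_of(ip)
    if mode == "whitelist":
        return country in countries
    if mode == "blacklist":
        return country not in countries
    raise ValueError(f"unknown mode: {mode!r}")
```

For the lecture’s example, blacklisting `{"FR"}` would block the French address while still serving the American and Australian ones.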

Okay? Now you may be asking yourself what the difference really is between CloudFront and something like S3 cross-region replication. CloudFront uses a global edge network, and files are going to be cached for a TTL, a time to live, maybe for a day. So it’s great when you have static content that must be available everywhere around the world, and maybe you are okay if that content is a little bit outdated. Now, S3 cross-region replication must be set up for each region in which you want replication to happen, and the files will be updated in near real time. It’s going to be read-only, so it’s going to help you with read performance.

So S3 cross-region replication is great if you have dynamic content that needs to be available at low latency in a small number of regions. Hope that makes sense. Hope that’s very clear: CloudFront is for caching globally, and S3 cross-region replication is for replicating into select regions. All right, so that’s it for this lecture. I will see you in the next lecture for some hands-on.

  2. CloudFront with S3 – Hands On

Okay, so we are going to create an S3 bucket, and we’ll create a CloudFront distribution in front of that bucket, to distribute the content of that bucket anywhere around the globe with low latency. Then we’re going to create what’s called an origin access identity. This is a CloudFront user that will be accessing our S3 bucket, and we’ll limit the S3 bucket so it can only be accessed using this identity. So effectively, we’ll make sure that no one can access our S3 bucket unless they go through CloudFront. And why would we do this? Well, we can do this for many reasons: for monitoring, maybe because you have cookies, maybe because of some policies, et cetera. And that is a very popular exam question.

So let’s go see how we do this right now. Let’s go ahead and create a bucket, and I’ll call it my-content-through-cloudfront. That’s a horrible name, but let’s go ahead with it. Click on Next, and then I will keep everything as is. Click on Next and Create bucket. Okay, so my bucket has been created, and I’m just going to upload a few files into it. I’m going to upload coffee.jpg, beach.jpg, and index.html. Click on Upload. And here we go, my files are uploaded. So now we’re going to go straight into CloudFront and see how CloudFront works. In CloudFront, what I have to do is create a distribution. So let’s go ahead and create a distribution, and we’re going to get started with a Web type of distribution. All right, the origin domain name is going to be our bucket name, so here it’s my-content-through-cloudfront.s3.amazonaws.com. The origin path we’re going to leave empty, and the origin ID we’re just going to leave as is.

So this is basically going to give you a description of the origin. All right, next we’re going to restrict the bucket access. And this is what’s very important here, the Restrict Bucket Access option. If I say No, there are no further options. If I say Yes, you see that a lot more options appear. Basically, this is for when we want our users to always access our S3 content using only the CloudFront URLs, not the Amazon S3 URLs. So it’s super important: if we want that to happen, yes, we need to restrict the bucket access. And then I can select an origin access identity. Here we could use an existing identity, but we don’t have any.

Or we can just go ahead and create a new one, and you can name it. We’ll call it “access identity demo”, for example. And then finally, we need to grant read permissions for that identity on our S3 bucket. The choice is either no, you will update the permissions yourself, or yes, please update the bucket policy for me. We’ll just say yes, please update the bucket policy for me, because we’re a bit lazy. All right, so now we’re good. Now let’s scroll down. We could say we are okay with HTTP and HTTPS, but maybe we want to redirect HTTP to HTTPS so that we force encrypted connections to CloudFront. So we’ll select the second option. Maybe we’ll allow only GET and HEAD, but we could allow more methods. I can scroll down and see there are tons of parameters, but for now I’ll just leave them as is, and at the very bottom we’ll click on Create Distribution. Now, this distribution can take a lot of time to be created, and this is usually the case with CloudFront. Right now the state is Enabled, but the status is In Progress, and it can take maybe ten minutes to get created. So I’m going to pause the video until then.
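The effect of the viewer protocol choice made above can be sketched as a small Python function. This is a simplified model, not CloudFront’s implementation: `viewer_response` is a hypothetical helper, the `200` result just stands for “request is served”, and the domain in the usage note is a placeholder. The policy names (`allow-all`, `redirect-to-https`, `https-only`) are the values the CloudFront API uses for this setting.

```python
from typing import Optional, Tuple

POLICIES = ("allow-all", "redirect-to-https", "https-only")


def viewer_response(
    scheme: str,
    host: str,
    path: str,
    policy: str = "redirect-to-https",
) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect location) for a GET or HEAD viewer request."""
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    if scheme == "https" or policy == "allow-all":
        # The request is accepted as is (200 stands in for "served").
        return 200, None
    if policy == "redirect-to-https":
        # Plain HTTP is redirected to the same URL over HTTPS.
        return 301, f"https://{host}{path}"
    # https-only: plain HTTP requests are refused.
    return 403, None
```

With the second option selected in the console, `viewer_response("http", "d111111abcdef8.cloudfront.net", "/beach.jpg")` would yield a redirect to the HTTPS version of the same URL.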

So while this is getting created, what’s really important to see is that we now have an origin access identity that has been created. You can see that there is our comment, there is an ID, so E8Y6, et cetera, et cetera, and there is an Amazon S3 canonical user ID for it. So we automatically created an origin access identity while we were creating our distribution. Now let’s go back to our bucket and go to Permissions and Bucket Policy.

Well, we can see that a policy was created automatically for us saying, okay, our user, the CloudFront origin access identity starting with E8, the one we have right here, is allowed to do a GetObject on anything within this bucket. So it’s really neat, because now we understand that this user that was created by CloudFront has access to our bucket. And we could even add a statement to deny anything that is not coming from this principal. So we could edit that bucket policy. We won’t do it here, but we could update it to say anything that doesn’t come from this user will be denied. And so effectively we would restrict our bucket access to only this origin access identity. It’s super important for you to understand this, because that could be an exam question. So now we’re going to wait for the distribution to be deployed.
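For reference, a bucket policy of the kind described above can be built and printed with a short Python sketch. The bucket name and the `EXAMPLEOAIID` identifier are placeholders (the real ID is only partly shown in the lecture), `oai_read_policy` is a hypothetical helper, and the policy CloudFront generates for you may differ in details such as the `Version`, `Id`, or `Sid` fields.

```python
import json


def oai_read_policy(bucket_name: str, oai_id: str) -> dict:
    """Build a bucket policy letting one origin access identity read every object."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                "Principal": {
                    # Principal format used for CloudFront origin access identities.
                    "AWS": (
                        "arn:aws:iam::cloudfront:user/"
                        f"CloudFront Origin Access Identity {oai_id}"
                    )
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own bucket name and OAI ID.
    policy = oai_read_policy("my-content-through-cloudfront", "EXAMPLEOAIID")
    print(json.dumps(policy, indent=2))
```

The sketch only contains the Allow statement; the optional Deny statement mentioned in the lecture is left out.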

Okay, so my CloudFront distribution is done, and now I should be able to access, for example, my beach.jpg file through my URL. So I take the CloudFront URL, and as you can see, we get an Access Denied. This is due to the fact that we’ve been redirected to the S3 bucket. It comes from a DNS issue, and we’ll have to wait about three hours for it to be fixed. But in the meantime, what we’ll do is make the files public in our S3 bucket to fix this temporarily. So the one thing we have to do is go to our S3 management console and make these files public. But if I want to make, for example, my coffee.jpg public, so I right-click, Make Public, and click on Make Public, I get an error: it says Failed, and I get Access Denied.

So why is my access denied? Well, because there is a setting in Properties, sorry, in Permissions: you have a public access setting. Here you can change the public access settings for this bucket, and we’re basically going to untick everything. This will allow us to set some objects to be public. So I click on Save, I click on Confirm, and that will allow us to make some objects public. So let’s go back to my coffee.jpg. I right-click, Make Public, and Make Public. Now it was a success. And so if I go back to my CloudFront URL and request coffee.jpg, I should be able, here we go, to see my coffee. So it’s pretty cool, this worked. Similarly, I can look at beach.jpg and make it public. So Make Public, Make Public. Here we go. And then I’ll go back to my CloudFront URL and request beach.jpg.

And here we go, the beach is appearing as well. Finally, you may have seen that when I accessed my distribution using the domain name, I was being redirected directly to an S3 URL instead of being served through the domain name. This has to do with a DNS propagation issue. If you want to read more about it, there is a question on Stack Overflow called “AWS CloudFront redirecting to S3 bucket”, and the answer explains why. The idea is that you need to wait three to four hours for the DNS to propagate properly before you get CloudFront access to your images and your files directly using the CloudFront URL instead of the S3 URL. So it’s just something you should know. It’s not a bug, it’s just something to be aware of. It’s a temporary redirect, which will be fixed when the DNS has propagated on the AWS side. So I have waited about a day now, and if I click my domain name and go to beach.jpg, as you can see, my CloudFront URL takes me directly to beach.jpg and I’m not redirected to the S3 bucket. So it’s a really cool thing. Now that the DNS has propagated, I am served properly, only through CloudFront. So now I can go back to S3 and make that file, beach.jpg, not public again. I’ll click on Public Access, change the ACL of that object, and remove Read object. Now this file is private, and I can go back to coffee.jpg and do the same in the permissions: remove public access. Excellent.
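One way to tell the two situations apart (served through CloudFront versus temporarily redirected to S3) is to look at the status code and the `Location` header of the response. The Python sketch below does only that classification on values you pass in; it makes no network call. `classify_response` is a hypothetical helper, the hostname test is a rough heuristic for S3 endpoints, and the region in the usage note is made up.

```python
from typing import Optional
from urllib.parse import urlsplit


def classify_response(status: int, location: Optional[str] = None) -> str:
    """Classify a response received from a CloudFront URL.

    Returns "served", "redirected-to-s3", or "other".
    """
    if 200 <= status < 300:
        return "served"
    if 300 <= status < 400 and location:
        host = (urlsplit(location).hostname or "").lower()
        # Rough heuristic: S3 endpoints look like <bucket>.s3.<...>.amazonaws.com
        # or <bucket>.s3-<region>.amazonaws.com.
        if host.endswith(".amazonaws.com") and (".s3." in host or ".s3-" in host):
            return "redirected-to-s3"
    return "other"
```

For example, a redirect whose `Location` is `https://my-content-through-cloudfront.s3-eu-west-1.amazonaws.com/beach.jpg` would be classified as `"redirected-to-s3"`, which is the temporary state described above.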

So now no files in my bucket have public access. They’re all private. And now I can go back to my Permissions and the public access settings, and I can re-tick all these settings that prevent me from making anything public. So I’ll confirm this. Click Confirm. And so the idea now is that all these files are private, and only the bucket policy will allow CloudFront, through the origin access identity, to access my files. We can verify it: I can just refresh this page, and it’s working. I can go to coffee.jpg, it’s working too. And I can go to index.html.

And as you can see, I also see my coffee picture. So it’s really cool, because we’re using CloudFront, and the idea is that we have all this content being cached at the edges around the world, and my S3 bucket is only accessible through CloudFront. And that’s a very, very popular exam question. So remember, the important thing here is the origin access identity, this ID right here, which is put into a bucket policy allowing only my CloudFront user to access my S3 bucket. Therefore I’m really protected: I can access CloudFront from anywhere around the world, and only through CloudFront can I access my S3 bucket. So that’s it for this lecture. I hope you enjoyed it, and I will see you in the next lecture.