Integration of AWS, Terraform and Jenkins

Satyam Singh
15 min read · Jun 22, 2020

What is Terraform?

Terraform is an open-source infrastructure as code software tool created by HashiCorp. It enables users to define and provision a datacenter infrastructure using a high-level configuration language known as HashiCorp Configuration Language, or optionally JSON.

This project is divided into two parts: the first covers AWS and Terraform, and the second covers integrating the setup with Jenkins.

Part 1 : AWS and Terraform

First of all, we need to specify the provider used in our Terraform code; in our case it is AWS. I have specified the access key, the secret key and the region in which the infrastructure is created, though for security you can create a named profile and reference that instead.

provider "aws" {
region = "ap-south-1"
access_key = "****************"
secret_key = "****************"
}

Note: Never upload Terraform code with credentials explicitly specified to any public platform such as GitHub, as it would pose a huge risk to your account's security.
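For example, if a named profile already exists in ~/.aws/credentials (an assumption here; the profile name is only illustrative), the provider block can reference it instead of embedding the keys:

provider "aws" {
  region  = "ap-south-1"
  profile = "default"   # assumes this profile exists in ~/.aws/credentials
}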

Here, the availability zone "ap-south-1c" is excluded, as the instance type (specified later in the aws_instance resource) is not available in this particular availability zone.

data "aws_availability_zones" "task_az" {
exclude_names = ["ap-south-1c"]
}

Instead of manually creating a key in the AWS console and specifying it in the instance, key generation can be automated by creating a TLS private key. Here we use the RSA algorithm to generate the private key needed for the key pair that gives access to the EC2 instance. Under the aws_key_pair resource, public_key is specified, and its value is obtained from the tls_private_key resource.

resource "tls_private_key" "tlskey" {
algorithm = "RSA"
}

resource "aws_key_pair" "tkey" {
key_name = "task-key"
public_key = tls_private_key.tlskey.public_key_openssh
}

Note: If not specified, the size of a TLS private key generated using the RSA algorithm is 2048 bits.
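If a stronger key is needed, the same resource can request one explicitly via the optional rsa_bits argument; the following is only an illustration of overriding that default, not part of the original setup:

resource "tls_private_key" "tlskey" {
  algorithm = "RSA"
  rsa_bits  = 4096   # default is 2048 when omitted
}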

For the purpose of automation, we can create the VPC using the Terraform resource aws_vpc; here the CIDR block (a range of IP addresses that uniquely identifies the network and the individual devices in it) is mandatory to specify.

resource "aws_vpc" "vpc" {
cidr_block = "10.1.0.0/16"
enable_dns_support = true
enable_dns_hostnames = true
tags = {
"Name" = "task_vpc"
}
}

A subnet defines a range of IP addresses within the VPC and can be created using the Terraform resource aws_subnet. The two parameters that need to be specified are vpc_id (obtained from the aws_vpc resource created before) and cidr_block. Also, map_public_ip_on_launch has been set to true so that a public IP is assigned, which is needed for the SSH connection to the EC2 instance.

The first availability zone, excluding the one that has been blacklisted, is also specified.

resource "aws_subnet" "subnet_public" {
vpc_id = aws_vpc.vpc.id
cidr_block = "10.1.0.0/16"
map_public_ip_on_launch = "true"
availability_zone = data.aws_availability_zones.task_az.names[0]
tags = {
"Name" = "task_subnet"
}
}

An Internet Gateway performs network address translation (NAT) for EC2 instances that have been assigned public IPv4 addresses. It can be created using the Terraform resource aws_internet_gateway, and the required parameter is vpc_id, obtained from the aws_vpc resource created before.

resource "aws_internet_gateway" "igw" {
vpc_id = aws_vpc.vpc.id
tags = {
"Name" = "task_ig"
}
}

A VPC comes with an implicit router, and a route table is used to control the direction of network traffic. It can be created using the Terraform resource aws_route_table; the required parameter is vpc_id from the aws_vpc resource. The route block is optional and is used for specifying a list of route objects; if used, its required parameter is cidr_block.

resource "aws_route_table" "rtb_public" {
vpc_id = aws_vpc.vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.igw.id
}
tags = {
"Name" = "task_route"
}
}

A subnet in a VPC must be associated with a route table, as the route table controls the routing of the subnet. The association is created using the Terraform resource aws_route_table_association; the required parameters are route_table_id, obtained from the route table generated by the aws_route_table resource above, and subnet_id, whose value we get from the aws_subnet resource above.

resource "aws_route_table_association" "rta_subnet_public" {
subnet_id = aws_subnet.subnet_public.id
route_table_id = aws_route_table.rtb_public.id
}

A security group in AWS controls inbound and outbound traffic and can be created using the Terraform resource aws_security_group. The ingress and egress blocks are optional but can be specified as needed; in our case we open port 22 with the TCP protocol to enable SSH and port 80 to enable HTTP, while protocol "-1" in egress indicates all protocols. Under ingress and egress, the required parameters are from_port, to_port and protocol.

Ingress specifies inbound rules, which define the traffic allowed into the EC2 instances and on which ports, whereas egress specifies outbound rules, which define the traffic allowed to leave the EC2 instances, on which ports and to which destinations.

resource "aws_security_group" "sg_80" {
name = "sg_80"
vpc_id = aws_vpc.vpc.id

ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}

egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}

ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}

egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "task_sg"
}
}

EC2 instances provide compute, memory and networking resources and can be created using the Terraform resource aws_instance; the required parameters are ami and instance_type.

An AMI, or Amazon Machine Image, provides the information required for launching an instance, whereas the instance type, predefined here as "t2.micro", is a combination of CPU, memory, storage and networking capacity chosen to match the requirements of the users or clients.

Also, availability_zone, key_name, subnet_id and vpc_security_group_ids have been specified, whose values are obtained from the aws_availability_zones data source and the aws_key_pair, aws_subnet and aws_security_group resources respectively.

resource "aws_instance"  "myinstance"  {
ami = "ami-0447a12f28fddb066"
instance_type = "t2.micro"
availability_zone = data.aws_availability_zones.task_az.names[0]
key_name = aws_key_pair.tkey.key_name
subnet_id = aws_subnet.subnet_public.id
vpc_security_group_ids = [ aws_security_group.sg_80.id ]

tags = {
Name = "tfos"
}
}

After launching the EC2 instance, the provisioner and connection are set up under a null resource, as both need to be declared inside a resource (the connection block could alternatively be declared inside the provisioner). In the connection block, the type, user, private key (obtained from the tls_private_key resource) and host (the public IP obtained from the aws_instance resource) are defined, and the null resource depends on the EC2 instance created by aws_instance.

After the connection is set up, the instance is prepared for the project using the "remote-exec" provisioner, under which httpd and git are installed and the httpd server is started and enabled.

resource "null_resource" "op_after_creation"  {

depends_on = [
aws_instance.myinstance
]

connection {
type = "ssh"
user = "ec2-user"
private_key = tls_private_key.tlskey.private_key_pem
host = aws_instance.myinstance.public_ip
}

provisioner "remote-exec" {
inline = [
"sudo yum install httpd git -y",
"sudo systemctl restart httpd",
"sudo systemctl enable httpd"
]
}
}

An EBS volume is durable, block-level storage that can be attached to EC2 instances and protects against the failure of a single component. It can be created using the Terraform resource aws_ebs_volume; the required parameter is availability_zone, obtained here from the EC2 instance created by aws_instance. The resource depends on the null resource op_after_creation, and size is defined in GiB, 2 GiB in this case.

resource "aws_ebs_volume" "myebs" {
depends_on = [
null_resource.op_after_creation
]
availability_zone = aws_instance.myinstance.availability_zone
size = 2

tags = {
Name = "webPageStore"
}
}

The EBS volume can be attached to the EC2 instance using the Terraform resource aws_volume_attachment, which depends on the aws_ebs_volume resource. The required parameters are device_name, whose value in this case is "/dev/sdf", volume_id, obtained from the EBS volume created by aws_ebs_volume, and instance_id, obtained from the EC2 instance created by aws_instance.

Note: The reason force_detach is set to true is the absence of partitioning on the EBS volume, which otherwise makes it difficult to destroy the volume while destroying the infrastructure. It is not a good practice, though, as it can result in data loss.

resource "aws_volume_attachment" "ebs_att" {
depends_on = [
aws_ebs_volume.myebs
]
device_name = "/dev/sdf"
volume_id = aws_ebs_volume.myebs.id
force_detach = true
instance_id = aws_instance.myinstance.id
}

As soon as the EBS volume is attached to the EC2 instance, the null_resource op_after_attach starts executing, as it depends on the aws_volume_attachment resource. The connection setup is the same as in the previous null_resource, and the "remote-exec" provisioner is used here as well, but with a different purpose: the EBS volume attached to the EC2 instance is formatted and then mounted to the document root of the httpd server, i.e. /var/www/html.

After that, all the content inside the html directory is removed, since git clone refuses to clone into a target directory that already contains any file or directory.

resource "null_resource" "op_after_attach"  {

depends_on = [
aws_volume_attachment.ebs_att
]


connection {
type = "ssh"
user = "ec2-user"
private_key = tls_private_key.tlskey.private_key_pem
host = aws_instance.myinstance.public_ip
}

provisioner "remote-exec" {
inline = [
"sudo mkfs.ext4 /dev/xvdf",
"sudo mount /dev/xvdf /var/www/html",
"sudo rm -rf /var/www/html/*",
"sudo git clone https://github.com/satyamcs1999/terraform_aws_jenkins.git /var/www/html/"
]
}
}

A VPC endpoint ensures that the data transferred between the VPC and S3 stays within the Amazon network, thereby helping to shield the instances from internet traffic. It can be created using the Terraform resource aws_vpc_endpoint; the required parameters are service_name and vpc_id.

service_name should be specified in the format "com.amazonaws.<region>.<service>", whereas the value of vpc_id is obtained from the aws_vpc resource created above.

resource "aws_vpc_endpoint" "s3" {
vpc_id = aws_vpc.vpc.id
service_name = "com.amazonaws.ap-south-1.s3"
}

The VPC endpoint is associated with the route table so that traffic from instances in the subnet can be routed through the endpoint. The association is created using the Terraform resource aws_vpc_endpoint_route_table_association; the required parameters are route_table_id and vpc_endpoint_id, whose values are obtained from aws_route_table and aws_vpc_endpoint respectively.

resource "aws_vpc_endpoint_route_table_association" "verta_public" {
route_table_id = aws_route_table.rtb_public.id
vpc_endpoint_id = aws_vpc_endpoint.s3.id
}

S3, an abbreviation of Simple Storage Service, is a public cloud storage service providing object-level storage through S3 buckets, which are similar to file folders and hold data together with its metadata. A bucket can be created using the Terraform resource aws_s3_bucket; it depends on the null_resource op_after_attach, and there are no required parameters as such, except that if the website block is used, index_document becomes required.

In this case, "t1-aws-terraform" has been declared as the bucket, the ACL (Access Control List) of the bucket has been set to "public-read", and force_destroy has been set to true so that the bucket can be deleted along with the objects within it without error. Under website, index_document has been set to "index.html".

resource "aws_s3_bucket" "task_bucket" {

depends_on = [
null_resource.op_after_attach
]
bucket = "t1-aws-terraform"
acl = "public-read"
force_destroy = "true"
website{
index_document = "index.html"
}

tags = {
Name = "t1-terraform"
}
}

CodePipeline is a fully managed continuous delivery service that helps automate the release pipeline. It can be created using the Terraform resource aws_codepipeline; the required parameters are name (task_codepipeline in this case), role_arn (which grants AWS CodePipeline permission to make calls to AWS services on our behalf), artifact_store and stage (at least two).

Under artifact_store, the required parameters are location, whose value is obtained from aws_s3_bucket, and type, which in this case is S3.

Under stage, the required parameters are name and action; the stage names in this case are "Source" and "Deploy". Under action, the required parameters for both stages are name, category, owner, provider and version. In addition, output_artifacts and input_artifacts have been specified in the "Source" and "Deploy" stages respectively.

This overall setup creates a continuous delivery pipeline between the GitHub repository and the S3 bucket, and the action parameters have been given values accordingly.

Note: The role referenced by role_arn here has the AdministratorAccess policy attached, which grants broad permission to make calls on our behalf.
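If you prefer to create that role with Terraform as well, a minimal sketch could look like the following (the resource names and the AdministratorAccess attachment are assumptions for illustration, not part of the original code); its ARN could then be referenced as aws_iam_role.codepipeline_role.arn instead of the hard-coded value.

resource "aws_iam_role" "codepipeline_role" {
  name = "task-codepipeline-role"

  # Allow the CodePipeline service to assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "codepipeline.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# Broad permissions, matching the note above; a tightly scoped policy is preferable in practice
resource "aws_iam_role_policy_attachment" "codepipeline_admin" {
  role       = aws_iam_role.codepipeline_role.name
  policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
}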

resource "aws_codepipeline" "task_codepipeline" {
name = "task_codepipeline"
role_arn = "arn:aws:iam::**********:role/sats"
artifact_store {
location = aws_s3_bucket.task_bucket.bucket
type = "S3"
}
stage {
name = "Source"

action {
name = "Source"
category = "Source"
owner = "ThirdParty"
provider = "GitHub"
version = "1"
output_artifacts = ["source_output"]

configuration = {
Owner = "thespecguy"
Repo = "terraform_aws_jenkins"
Branch = "master"
OAuthToken = "****************************"
}
}
}

stage {
name = "Deploy"

action {
name = "Deploy"
category = "Deploy"
owner = "AWS"
provider = "S3"
version = "1"
input_artifacts = ["source_output"]

configuration = {
BucketName = "t1-aws-terraform"
Extract = "true"
}
}
}
}

A waiting period between two resources can be created using the Terraform resource time_sleep, which has no required parameters as such. In our case, the waiting period is defined with the create_duration parameter, and the resource depends on the execution of aws_codepipeline.

The reason for the waiting period is the time S3 takes to replicate data across multiple servers; if an object within the bucket is accessed before the replication completes, an error such as "NoSuchKey" is returned.

resource "time_sleep" "waiting_time" {
depends_on = [
aws_codepipeline.task_codepipeline
]
create_duration = "5m"
}

As soon as the waiting period is over, the "local-exec" provisioner runs a command on the local system. In this case, an AWS CLI command makes a specific object publicly accessible, since public access to the bucket doesn't automatically make the objects within it public; the permission has to be granted separately for each object. Here, the object "freddie_mercury.jpg" has been given "public-read" access.

resource "null_resource" "codepipeline_cloudfront" {

depends_on = [
time_sleep.waiting_time
]
provisioner "local-exec" {
command = "/usr/local/bin/aws s3api put-object-acl --bucket t1-aws-terraform --key freddie_mercury.jpg --acl public-read"
}
}

CloudFront is a fast Content Delivery Network (CDN) for secure delivery of data, videos, applications and APIs to customers globally with low latency and high transfer speed. It can be created using the Terraform resource aws_cloudfront_distribution; the required parameters are default_cache_behavior, enabled, origin, restrictions and viewer_certificate, and it depends on the complete execution of the null_resource codepipeline_cloudfront. Alongside the required parameters, is_ipv6_enabled is set to "true", enabling IPv6 for the distribution.

Under origin, both domain_name, whose value is obtained from aws_s3_bucket, and origin_id, in the format "S3-<bucket name>", are required parameters. enabled is set to "true" so that the distribution accepts end-user requests for content.

Under default_cache_behavior, the required parameters are allowed_methods, which specifies the HTTP methods CloudFront processes and forwards to Amazon S3; cached_methods, which caches the responses to requests made with the specified HTTP methods; target_origin_id, which specifies the origin CloudFront routes requests to and has the same format as origin_id; forwarded_values, which specifies how CloudFront handles query strings, cookies and headers; and viewer_protocol_policy, which specifies the protocol viewers can use to access the files in the origin specified by target_origin_id when a request matches the path pattern. Under forwarded_values, the required parameters are cookies, which specifies how CloudFront handles cookies, and query_string, which indicates whether the query string should be forwarded to the origin.

Alongside the required parameters in default_cache_behavior, min_ttl, max_ttl and default_ttl have been used, which specify the minimum, maximum and default TTL (Time to Live) for the cached content.

Under restrictions, there is a nested geo_restriction block whose required parameter is restriction_type, which restricts the distribution of content by country.

Under viewer_certificate, cloudfront_default_certificate is set to "true", which lets viewers use HTTPS to request the objects.

resource "aws_cloudfront_distribution" "task_cloudfront_distribution" {
depends_on = [
null_resource.codepipeline_cloudfront
]
origin {
domain_name = aws_s3_bucket.task_bucket.bucket_domain_name
origin_id = "S3-t1-aws-terraform"
}

enabled = true
is_ipv6_enabled = "true"

default_cache_behavior {
allowed_methods = ["GET", "HEAD", "OPTIONS"]
cached_methods = ["GET", "HEAD","OPTIONS"]
target_origin_id = "S3-t1-aws-terraform"

forwarded_values {
query_string = "false"

cookies {
forward = "none"
}
}
viewer_protocol_policy = "redirect-to-https"
min_ttl = 0
default_ttl = 3600
max_ttl = 86400
}
restrictions {
geo_restriction {
restriction_type = "none"
}
}
viewer_certificate {
cloudfront_default_certificate = "true"
}
}

As soon as the CloudFront distribution is set up, the null_resource cloudfront_url_updation runs. Its connection to the EC2 instance is set up the same way as in the previous null resources, and the "remote-exec" provisioner is used again, but for a different purpose: updating the image source in the HTML img tag with the domain_name obtained from the CloudFront distribution created by the aws_cloudfront_distribution resource.

The update is performed using "sed", a Linux command.

resource "null_resource" "cloudfront_url_updation" {
depends_on = [
aws_cloudfront_distribution.task_cloudfront_distribution
]
connection {
type = "ssh"
user = "ec2-user"
private_key = tls_private_key.tlskey.private_key_pem
host = aws_instance.myinstance.public_ip
}

provisioner "remote-exec"{
inline = [
"sudo sed -ie 's,freddie_mercury.jpg,https://${aws_cloudfront_distribution.task_cloudfront_distribution.domain_name}/freddie_mercury.jpg,g' /var/www/html/index.html"
]
}
}

An EBS snapshot is a point-in-time, incremental copy of an EBS volume. It can be created using the Terraform resource aws_ebs_snapshot; it depends on the complete execution of the null_resource cloudfront_url_updation, and the required parameter is volume_id, whose value is obtained from the aws_ebs_volume resource.

resource "aws_ebs_snapshot" "task_snapshot" {
depends_on = [
null_resource.cloudfront_url_updation
]
volume_id = aws_ebs_volume.myebs.id

tags = {
Name = "Task 1 snapshot"
}
}

The public_ip generated by aws_instance can be exposed as an output using the output block in Terraform, so that the web page set up inside the EC2 instance can be accessed; it depends on the execution of the aws_ebs_snapshot resource.

output "instance_public_ip" {
depends_on = [
aws_ebs_snapshot.task_snapshot
]
value = aws_instance.myinstance.public_ip
}

Part 2 : Integration with Jenkins

Job 1 : Generation of Public URL using ngrok

First of all, set up ngrok, which uses tunneling to provide a public URL. The command to start ngrok is as follows:

./ngrok http 8080

Here, the port number specified, i.e. 8080, is the default port number for Jenkins.

The generated web URL is then used to set up the webhook in GitHub.

Job 2 : Setting up Webhook in GitHub

First, select the repository and then select Settings in the top right corner.

Then, select Webhooks from the list of options on the left-hand side.

Then click on Add Webhook at the top right.

Then, in Payload URL, specify the URL in the format "<generated URL>/github-webhook/" and, under Content type, select "application/json".

Hence, the webhook setup in GitHub has been completed successfully.

Job 3 : Setting up Jenkins

In the command line, the command for starting Jenkins is as follows:

systemctl start jenkins

Then, using the ifconfig command, find the IP address of your system's network card.

After that, open that IP address along with port number 8080, the default port for Jenkins, and the Jenkins login screen would appear.

Enter Jenkins using the respective Username and Password.

Select "New Item".

Enter the name of the job, select "Freestyle project", and then click OK.

Job 4 : Jenkins Job Setup

To set up Jenkins with GitHub, place the URL of the respective repository in the "Repository URL" field under Git in the Source Code Management section.

To set the build trigger to the webhook created earlier, select "GitHub hook trigger for GITScm polling".

Under Build, select "Execute shell".

Then, add the code for setting up the CI/CD pipeline of AWS and Terraform with Jenkins; a sketch of this shell step follows the command descriptions below.

terraform init installs the required plugins to build the infrastructure

terraform plan shows the execution plan, i.e. the order in which the resources would be created

terraform apply sets up the complete infrastructure

terraform destroy destroys the complete infrastructure

aws configure set is used for setting up security credentials
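As a reference, the "Execute shell" build step could look something like the following sketch; the credential placeholders and the use of -auto-approve are assumptions, not the exact script from the original job.

# Configure AWS credentials for the Jenkins user (values are placeholders)
aws configure set aws_access_key_id "<access-key>"
aws configure set aws_secret_access_key "<secret-key>"
aws configure set default.region ap-south-1

# Build the infrastructure described in the Terraform code
terraform init
terraform plan
terraform apply -auto-approve

# To tear everything down instead:
# terraform destroy -auto-approve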

Note

To learn how to create a GitHub OAuth token, check this link:

depends_on is used in Terraform because execution doesn't take place in a strictly sequential manner; if a resource that depends on other resources were created first, it would break the setup of the infrastructure, so depends_on is used to maintain a proper order of execution.

tags are used in Terraform for defining keys and values and associating them with resources.

Thank You !!!

GitHub Repository(Mentioned Above) :

GitHub Repository(Contains Terraform code used above with README):

LinkedIn :
