{"id":26303,"date":"2019-03-18T15:51:05","date_gmt":"2019-03-18T20:51:05","guid":{"rendered":"https:\/\/centricconsulting.com\/?p=26303"},"modified":"2021-12-15T00:15:55","modified_gmt":"2021-12-15T05:15:55","slug":"part-2-scripting-gitlab-and-jenkins-installs-using-terraform-on-aws_devops","status":"publish","type":"post","link":"https:\/\/centricconsulting.com\/blog\/part-2-scripting-gitlab-and-jenkins-installs-using-terraform-on-aws_devops\/","title":{"rendered":"Part 2: Scripting GitLab and Jenkins Installs using Terraform on AWS"},"content":{"rendered":"
Part two of a four-part series<\/a>.<\/em><\/p>\n In Part 1<\/u><\/strong><\/a>,<\/u><\/strong> I demonstrated how to build the core network infrastructure in AWS using an automation best practice:\u00a0Infrastructure-as-Code.<\/p>\n With a few strokes of the keyboard, we created a fully subnetted virtual private network with all the routing we will need to begin deploying our DevOps infrastructure.<\/p>\n In this blog, I\u2019ll build on the existing infrastructure, adding open source continuous integration tools and the supporting infrastructure to provide a highly available deployment.<\/strong><\/p>\n I\u2019ll continue to use Terraform as the tool of choice to script the deployment of a Jenkins master server, Jenkins slaves within an autoscaling group and a highly available GitLab repository behind a load balancer.<\/p>\n To build out the application servers and associated infrastructure, we continue to build additional Terraform scripts which will provision the EC2 instances, application load balancers, RDS instances, Redis clusters, EFS (NFS) volumes, SSH key pairs, security groups and KMS encryption keys.<\/p>\n User Data<\/a> shell scripts and templates will also be created, which will install software and configure the applications and external resources. External resources, required for high availability, include multiple EFS volumes, a Redis cache cluster and a PostgreSQL database connection. I\u2019ll discuss what each service is and why we need it as we move along.<\/p>\n The main.tf script will utilize a Terraform template file to customize GitLab. The template file \u201cgitlab_application_user_data.tpl\u201d contains generic configuration commands. In main.tf we will reference that template and pass in variables obtained from creating other resources, such as the database instance name, user ID, password and URL.<\/p>\n First, we create the database, then configure the EC2 instance to use values from the database creation process. 
We use the Terraform template construct to pass variables into the EC2 configuration script.<\/strong><\/p>\n This script will also build the user_data for the EC2 instance using two objects: the rendered template (with interpolated variables) and a rendered shell script. The combined scripts will install and configure GitLab on an EC2 instance.<\/p>\n Templates are a powerful advanced feature of Terraform that can be used to pass Terraform outputs into your EC2 instance configuration scripts. For more information on Terraform template files, see the Terraform documentation page<\/a>.<\/strong><\/p>\n <\/a><\/p>\n The git.sh script, referenced in the main.tf script above, provides a simple bash script to perform the initial configuration of the GitLab EC2 instance.<\/p>\n The function of this script is to:<\/strong><\/p>\n We\u2019ll dive into greater detail on EFS volumes later. For now, just think of them as NFS file shares that help configure our GitLab servers with shared storage in a high availability configuration.<\/strong><\/p>\n <\/a><\/p>\n GitLab requires some additional configuration to disable its internal Redis and PostgreSQL services and reconfigure it to connect to the AWS Redis cluster and PostgreSQL RDS database, as well as to make some configuration changes to enable HA.<\/strong><\/p>\n I discussed the use of Terraform templates earlier when building out the main.tf script. The following Terraform template will be used.<\/p>\n It requires several database variables that are generated within other Terraform scripts as input variables, as described earlier. Create this file in your \u201ctemplates\u201d sub-folder so it can be found by the reference to it in the main.tf script.<\/p>\n This template file is a cloud-init script, which is how you initialize your EC2 instances upon first boot. 
To learn more about cloud-init scripts, see the cloud-init documentation here<\/a>.<\/p>\n The function of this script is to:<\/strong><\/p>\n <\/a><\/p>\n <\/p>\n This script is responsible for building out all of the security groups that control inbound and outbound network access to our services.<\/p>\n A brief description of the six security groups (SGs):<\/strong><\/p>\n <\/a><\/p>\n <\/a><\/p>\n <\/a><\/p>\n <\/a><\/p>\n In this script we are creating our PostgreSQL database. To configure our GitLab instance to connect to this external database, I referenced output variables from this resource earlier.<\/strong><\/p>\n Look back to see where I reference aws_db_instance.gitlab_postgres.address<\/em><\/strong>. The connection string to connect to this database is referenced by the address<\/a><\/em><\/strong> attribute. We can choose whether or not we want multi-AZ by setting the variable \u2018multi_az\u2019 to either \u201ctrue\u201d or \u201cfalse\u201d in our variables.tf file, which you can find in Part 1 of the blog series.<\/p>\n <\/a><\/p>\n GitLab requires a Redis cache. You can either use the built-in Redis cache or an external cache. In order to meet our objective of building a highly available infrastructure, we will need at least two GitLab servers and therefore will need an external, shared Redis cluster.\u00a0<\/strong><\/p>\n Here is where we create a two-node AWS Redis cluster that we can attach our GitLab servers to. Note that we create an ElastiCache subnet group with a list of two private subnets.\u00a0 This controls how many copies of the Redis cluster are built and in which subnets they are deployed.<\/p>\n For high availability, we place a cache in each of two subnets that are spread across two distinct availability zones. 
We reference the output variable<\/a> aws_elasticache_replication_group.gitlab_redis.primary_endpoint_address<\/em><\/strong> when configuring our GitLab instances to direct them to utilize this Redis cluster.<\/p>\n <\/a><\/p>\n When we launch an EC2 instance in this environment, we want the ability to run AWS CLI commands (from an instance) when running our cloud_init scripts. This will allow us to pull information about other AWS resources and use it to customize our configuration. One example where this comes in handy is when we install GitLab.<\/strong><\/p>\n During the installation, we set an environment variable containing the external URL used to access GitLab. Because our GitLab will be sitting behind an application load balancer, we can use the AWS CLI to get the public URL of our load balancer.<\/p>\n We can set the EXTERNAL_URL environment variable prior to installing GitLab, and the configuration files will get automatically updated with the correct external URL of the GitLab instance (i.e., the ALB public DNS name).<\/p>\n The iam.tf script creates a read-only IAM access role, associated with our EC2 launch configuration, and allows any EC2 instance launched with this launch configuration to assume the role, thereby granting it access to list any resource in the AWS account. 
All of our EC2 instances will be launched using this configuration in order to grant this access.<\/p>\n In a production environment, you would want to tighten up security to conform to the least-privilege best practice.<\/strong><\/p>\n <\/a><\/p>\n To demonstrate another best practice of securing all data at rest, this script will generate two KMS keys used to secure data in the Elastic File System (EFS) volumes for the Jenkins and GitLab data.<\/p>\n See the efs.tf script below for how these keys are used to secure the EFS filesystems.<\/strong><\/p>\n <\/a><\/p>\n As part of building a highly available architecture, we need to build a series of EFS filesystems, which are roughly equivalent to CIFS or NFS file shares.\u00a0Both the GitLab servers and the Jenkins Master servers will mount these EFS filesystems and store all of their filesystem data on them, making our EC2 instances effectively immutable.<\/strong><\/p>\n Any of the servers can be terminated without losing any data. Our autoscaling groups will detect any failed servers and simply deploy another. As part of the deployment, the existing EFS filesystems will be mounted prior to starting the application, and our service is restored without any data loss.<\/p>\n The EFS filesystems are created in this efs.tf script and mounted as part of the EC2 cloud_init scripts.<\/p>\n <\/a><\/p>\n <\/a><\/p>\n <\/a><\/p>\n When we deploy our EC2 instances, we need an EC2 key pair installed. This allows SSH login to the instance. This script will take a pre-built key from your workstation and install it on every EC2 instance deployed during this exercise.<\/p>\n Ensure you update your variables.tf file to point to a valid SSH public key that you have access to from your workstation. 
You will need to generate an SSH key on your workstation to ensure you can access the EC2 instances and bastion host.<\/p>\n Generating SSH keys is beyond our scope, but you can find many resources on the web that will walk you through the process using ssh-keygen or PuTTY. We reference this key in the EC2 launch configurations discussed next. The $key_path is defined in the variables.tf file from Part 1 of this blog<\/a>.<\/strong><\/p>\n <\/a><\/p>\n The elb.tf script is responsible for setting up our Application Load Balancers (ALBs), the target groups and the launch configurations for our Jenkins Master and GitLab instances.\u00a0 The ALBs provide the front-end access point to our application by providing a public IP address and DNS name.<\/strong><\/p>\n Behind the load balancers we have target groups, where our traffic is load balanced to a pair of application servers. When creating the load balancers, you will note that we are sure to specify at least two subnets that are in different availability zones (AZs) to ensure we do not lose access to all of our EC2 instances if there is a service interruption in a single availability zone.<\/p>\n This configuration is a bit complex, and I will not dive into every detail of it. This script is one of the keys to the HA architecture. We have experts on hand to help with designing and implementing a high availability architecture.<\/p>\n <\/a><\/p>\n <\/a><\/p>\n <\/a><\/p>\n <\/a><\/p>\n As a part of the EC2 instances and EC2 launch configurations, scripts are run to configure the Jenkins Master and Jenkins Slave instances, install software and connect to external data sources such as Redis, EFS and PostgreSQL RDS.<\/p>\n The following scripts should be created in a \u201cscripts\u201d sub-folder. Alternatively, you can place them elsewhere and update the Terraform scripts to point to the new location. 
They will be executed as part of the launch configuration during the first boot of the EC2 instances to configure the software.<\/strong><\/p>\n The function of this script is to:<\/strong><\/p>\n <\/a><\/p>\n <\/a><\/p>\n Jenkins slaves are used to offload Jenkins jobs from the Jenkins Master server. The installation and configuration of the slaves is very similar to the master, with one exception: the slaves will not have an EFS volume attached, as they do not need to save any configuration. A failed slave will simply be replaced by a new slave.<\/strong><\/p>\n The function of this script is to:<\/p>\n <\/a><\/p>\n <\/a><\/p>\n If you have made it this far, have all of these scripts in place (including all of the scripts from Part 1 of the series) and are ready to deploy the new resources, run another terraform plan. Fix any errors reported and continue to run terraform plan until the scripts pass all the pre-flight checks.\u00a0<\/strong><\/p>\n Then you can run terraform apply to add all of the new resources to your AWS account and install the applications.<\/p>\n When complete, you should have one bastion host, one GitLab server, one Jenkins Master server, one or more Jenkins Slave servers (depending on your variables.tf settings), a PostgreSQL RDS instance, a Redis cluster, multiple EFS filesystems, an Application Load Balancer with a public DNS name, an EC2 launch configuration, two target groups, two KMS keys, an EC2 key pair, security groups for EC2 and RDS, and all of the networking, subnets and routing from Part 1 of the series.<\/p>\n Architecturally, it should resemble this drawing:<\/p>\nLet\u2019s get started\u2026<\/h2>\n
main.tf<\/h4>\n
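 The original screenshot of this script did not survive. A minimal, illustrative sketch of what main.tf might contain (all resource and variable names here are assumptions, written in the Terraform 0.11 interpolation syntax current at the time of writing): it renders the template with database and Redis outputs, then concatenates the result with git.sh to form the instance user_data.<\/p>\n<pre>
data "template_file" "gitlab_user_data" {
  template = "${file("${path.module}/templates/gitlab_application_user_data.tpl")}"

  vars {
    db_endpoint = "${aws_db_instance.gitlab_postgres.address}"
    db_user     = "${var.db_user}"
    db_password = "${var.db_password}"
    redis_host  = "${aws_elasticache_replication_group.gitlab_redis.primary_endpoint_address}"
  }
}

resource "aws_instance" "gitlab" {
  ami                    = "${var.gitlab_ami}"
  instance_type          = "${var.gitlab_instance_type}"
  subnet_id              = "${var.private_subnet_ids[0]}"
  key_name               = "${aws_key_pair.ec2key.key_name}"
  iam_instance_profile   = "${aws_iam_instance_profile.ec2_profile.name}"
  vpc_security_group_ids = ["${aws_security_group.gitlab_sg.id}"]

  # user_data combines the rendered template with the git.sh shell script
  user_data = "${data.template_file.gitlab_user_data.rendered}${file("${path.module}/scripts/git.sh")}"
}
<\/pre>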
git.sh<\/h4>\n
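 The original listing is lost; the following is only a sketch of the kind of bash script described above. It assumes the rendered template that precedes it in user_data has already exported an EFS_DNS variable, and that the ALB is named gitlab-alb (both assumptions).<\/p>\n<pre>
#!/bin/bash
# Illustrative sketch only. Mount the shared EFS volume so every GitLab
# node sees the same repository data, then install GitLab CE.
yum install -y nfs-utils
mkdir -p /var/opt/gitlab
mount -t nfs4 -o nfsvers=4.1 "$EFS_DNS:/" /var/opt/gitlab

# Use the AWS CLI (via the read-only role from iam.tf) to discover the
# public DNS name of the load balancer and set EXTERNAL_URL before install.
EXTERNAL_URL="http://$(aws elbv2 describe-load-balancers --names gitlab-alb \
  --query 'LoadBalancers[0].DNSName' --output text --region us-east-1)"
export EXTERNAL_URL

curl -s https://packages.gitlab.com/install/repositories/gitlab/gitlab-ce/script.rpm.sh | bash
yum install -y gitlab-ce
<\/pre>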
\n
gitlab_application_user_data.tpl<\/h4>\n
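 As a rough sketch of the template described above (variable names like db_endpoint and redis_host are assumptions and must match the vars block in main.tf), the rendered script disables the bundled services and points GitLab at the external PostgreSQL and Redis endpoints:<\/p>\n<pre>
#!/bin/bash
# ${db_endpoint}, ${db_user}, ${db_password} and ${redis_host} are
# interpolated by Terraform before this script reaches cloud-init.
echo "postgresql['enable'] = false" >> /etc/gitlab/gitlab.rb
echo "redis['enable'] = false" >> /etc/gitlab/gitlab.rb
echo "gitlab_rails['db_host'] = '${db_endpoint}'" >> /etc/gitlab/gitlab.rb
echo "gitlab_rails['db_username'] = '${db_user}'" >> /etc/gitlab/gitlab.rb
echo "gitlab_rails['db_password'] = '${db_password}'" >> /etc/gitlab/gitlab.rb
echo "gitlab_rails['redis_host'] = '${redis_host}'" >> /etc/gitlab/gitlab.rb
gitlab-ctl reconfigure
</pre>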
\n
\n
sg.tf<\/h4>\n
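 The six security groups were shown as screenshots. One representative example, purely illustrative (names and ports are assumptions), allowing public HTTP into the GitLab ALB:<\/p>\n<pre>
resource "aws_security_group" "gitlab_alb_sg" {
  name   = "gitlab-alb-sg"
  vpc_id = "${aws_vpc.main.id}"

  # Allow inbound HTTP from anywhere to the load balancer
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
<\/pre>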
\n
rds.tf<\/h4>\n
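 A hedged sketch of the PostgreSQL RDS resource discussed above (identifier, sizing and variable names are assumptions; the resource name matches the aws_db_instance.gitlab_postgres.address reference in the text):<\/p>\n<pre>
resource "aws_db_instance" "gitlab_postgres" {
  identifier        = "gitlab-postgres"
  engine            = "postgres"
  instance_class    = "${var.db_instance_class}"
  allocated_storage = 20
  name              = "gitlabhq_production"
  username          = "${var.db_user}"
  password          = "${var.db_password}"

  # Toggle multi-AZ from variables.tf (see Part 1)
  multi_az               = "${var.multi_az}"
  db_subnet_group_name   = "${aws_db_subnet_group.gitlab.name}"
  vpc_security_group_ids = ["${aws_security_group.rds_sg.id}"]
  skip_final_snapshot    = true
}
<\/pre>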
redis.tf<\/h4>\n
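 A sketch of the two-node ElastiCache cluster described above (node type and names are illustrative assumptions; the replication group name matches the primary_endpoint_address reference in the text):<\/p>\n<pre>
# Subnet group spanning two private subnets in two distinct AZs
resource "aws_elasticache_subnet_group" "gitlab_redis" {
  name       = "gitlab-redis-subnets"
  subnet_ids = ["${var.private_subnet_ids[0]}", "${var.private_subnet_ids[1]}"]
}

resource "aws_elasticache_replication_group" "gitlab_redis" {
  replication_group_id          = "gitlab-redis"
  replication_group_description = "Shared Redis cluster for GitLab HA"
  engine                        = "redis"
  node_type                     = "cache.t2.small"
  number_cache_clusters         = 2
  automatic_failover_enabled    = true
  subnet_group_name             = "${aws_elasticache_subnet_group.gitlab_redis.name}"
  security_group_ids            = ["${aws_security_group.redis_sg.id}"]
}
<\/pre>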
iam.tf<\/h4>\n
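 The read-only role and instance profile might be sketched as follows (resource names are assumptions; the AWS-managed ReadOnlyAccess policy stands in for whatever policy the original used):<\/p>\n<pre>
data "aws_iam_policy_document" "ec2_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "ec2_read_role" {
  name               = "ec2-read-role"
  assume_role_policy = "${data.aws_iam_policy_document.ec2_assume.json}"
}

resource "aws_iam_role_policy_attachment" "read_only" {
  role       = "${aws_iam_role.ec2_read_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
}

# Attached to EC2 instances so cloud_init scripts can call the AWS CLI
resource "aws_iam_instance_profile" "ec2_profile" {
  name = "ec2-read-profile"
  role = "${aws_iam_role.ec2_read_role.name}"
}
<\/pre>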
kms.tf<\/h4>\n
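 The two KMS keys described above can be sketched like this (descriptions and the deletion window are illustrative):<\/p>\n<pre>
resource "aws_kms_key" "gitlab_efs_key" {
  description             = "KMS key encrypting the GitLab EFS filesystem"
  deletion_window_in_days = 7
}

resource "aws_kms_key" "jenkins_efs_key" {
  description             = "KMS key encrypting the Jenkins EFS filesystem"
  deletion_window_in_days = 7
}
<\/pre>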
efs.tf<\/h4>\n
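 A sketch of one of the encrypted EFS filesystems and its mount targets (one per private subnet/AZ); names are assumptions, and the Jenkins filesystem would follow the same pattern with the other KMS key:<\/p>\n<pre>
resource "aws_efs_file_system" "gitlab_data" {
  creation_token = "gitlab-data"
  encrypted      = true
  kms_key_id     = "${aws_kms_key.gitlab_efs_key.arn}"
}

# One mount target per private subnet so either AZ can reach the data
resource "aws_efs_mount_target" "gitlab_data" {
  count           = 2
  file_system_id  = "${aws_efs_file_system.gitlab_data.id}"
  subnet_id       = "${element(var.private_subnet_ids, count.index)}"
  security_groups = ["${aws_security_group.efs_sg.id}"]
}
<\/pre>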
keypair.tf<\/h4>\n
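 The key pair resource itself is small; a sketch (the key name is an assumption, and $key_path comes from variables.tf in Part 1):<\/p>\n<pre>
resource "aws_key_pair" "ec2key" {
  key_name   = "devops-key"
  public_key = "${file(var.key_path)}"
}
<\/pre>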
elb.tf<\/h4>\n
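 The GitLab half of this script might look roughly like the following sketch (names are assumptions; the Jenkins ALB, the launch configurations and the autoscaling groups follow the same pattern and are omitted for brevity):<\/p>\n<pre>
resource "aws_lb" "gitlab_alb" {
  name = "gitlab-alb"

  # Two public subnets in two different AZs for high availability
  subnets         = ["${var.public_subnet_ids[0]}", "${var.public_subnet_ids[1]}"]
  security_groups = ["${aws_security_group.gitlab_alb_sg.id}"]
}

resource "aws_lb_target_group" "gitlab" {
  name     = "gitlab-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = "${aws_vpc.main.id}"
}

resource "aws_lb_listener" "gitlab_http" {
  load_balancer_arn = "${aws_lb.gitlab_alb.arn}"
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = "${aws_lb_target_group.gitlab.arn}"
  }
}
<\/pre>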
jenkins-master.sh<\/h4>\n
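 A hedged sketch of the master bootstrap script (package sources and the EFS_DNS variable, assumed exported by the preceding user_data, are illustrative): mount the Jenkins EFS volume first so configuration survives instance replacement, then install and start Jenkins.<\/p>\n<pre>
#!/bin/bash
# Illustrative sketch only.
yum install -y nfs-utils java-1.8.0-openjdk
mkdir -p /var/lib/jenkins
mount -t nfs4 -o nfsvers=4.1 "$EFS_DNS:/" /var/lib/jenkins  # EFS_DNS assumed set earlier

# Install Jenkins from the official stable repository
wget -O /etc/yum.repos.d/jenkins.repo https://pkg.jenkins.io/redhat-stable/jenkins.repo
rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key
yum install -y jenkins
chown -R jenkins:jenkins /var/lib/jenkins
service jenkins start
<\/pre>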
\n
jenkins-slave.sh<\/h4>\n
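 The slave bootstrap is simpler, since there is no EFS mount. A rough sketch (region, tag names and the agent-connection details are assumptions; real agent setup also involves an agent secret, omitted here):<\/p>\n<pre>
#!/bin/bash
# Illustrative sketch only: install Java and build tooling.
yum install -y java-1.8.0-openjdk git

# Use the read-only role from iam.tf to find the master by its Name tag.
MASTER_IP=$(aws ec2 describe-instances --region us-east-1 \
  --filters "Name=tag:Name,Values=jenkins-master" \
            "Name=instance-state-name,Values=running" \
  --query "Reservations[0].Instances[0].PrivateIpAddress" --output text)

# Download the agent jar from the master and connect as a JNLP agent.
curl -O "http://$MASTER_IP:8080/jnlpJars/agent.jar"
java -jar agent.jar -jnlpUrl "http://$MASTER_IP:8080/computer/$(hostname)/slave-agent.jnlp"
<\/pre>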
\n