Brief:

In this post, we will deploy Nginx as an AMI instance in Amazon's Elastic Compute Cloud (EC2) and document the steps needed to configure and optimize Nginx for serving static pages to a global audience of users.

Planning and Preparation:

Nginx is a powerful web server that can be deployed in combination with other services, such as FastCGI or Apache backends, to provide scalable and efficient web infrastructures. AWS makes it extremely fast and convenient to implement your own Nginx instances. You can deploy the Linux variant of your choice and install Nginx on top of it, or you can deploy the Nginx AMI appliance developed by Nginx Inc., available in the Amazon Marketplace for an additional licensing fee.

[Image: Nginx AMI in the Amazon Marketplace]

A few factors to consider when making this decision:

  • Customization: For those who are comfortable with Linux and interested in tweaking operating-system parameters to get the best performance, a custom implementation makes the most sense, as it gives you the most control. 
  • Cost: The key to running an effective cloud services infrastructure is a tight implementation that wastes few resources, with the help of cloud features such as Auto Scaling. That said, the Nginx AMI carries a license fee (US $0.13/hr for an m1.medium instance at the time of writing), which should be factored into your total cost of ownership.

* Prior benchmark testing at loads of up to 30,000 transactions per second revealed insignificant performance differences between a custom-built AMI and the Nginx Marketplace AMI.

Installation:

For this post, we will implement the Nginx AMI, which can be provisioned either from the AWS Management Console or scripted and launched via Amazon's EC2 API tools. We first need to visit the Nginx AMI appliance page in the Amazon Marketplace to accept the terms and conditions.
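If you prefer to script the launch, a minimal sketch using Amazon's EC2 API tools might look like the following (the AMI ID, key pair and security group names are placeholders; substitute your own values):

  ec2-run-instances ami-xxxxxxxx -t m1.medium -k my-keypair -g Webserver --region us-west-1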

[Image: EC2 console]

After the installation is complete, we can verify that the host is set up via the following steps:

  1. Open a browser and connect to the Public DNS name of the EC2 host. We should be able to view the default welcome page.
    [Image: Nginx welcome page]
  2. We can also SSH into the host and run the following command to verify its status. The default user is ec2-user.
    /etc/init.d/nginx status
    We should get a response like "nginx (pid  1169) is running..."
  3. Next, we check for available updates and run sudo yum update to apply them
  4. We can view the default configuration files of Nginx here:
    /etc/nginx/conf.d/default.conf
    /etc/nginx/nginx.conf
  5. We should configure nginx to start automatically at reboot
    sudo chkconfig nginx on
  6. Some basic commands:
    1. Start the nginx service: sudo service nginx start
    2. Reload the nginx configuration: sudo nginx -s reload

Alternatively, I like to use Chef to automate Nginx installations. The Chef-managed installation currently supports the Ubuntu 10.04 and 12.04 and CentOS 5.8 and 6.3 operating systems. This support page gives you more information regarding the implementation.
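As a rough sketch, assuming the community nginx cookbook and an existing Chef server (the node name webserver1 below is hypothetical), the workflow looks something like this:

  # Download the community cookbook into your repository, then add it to a node's run list
  knife cookbook site install nginx
  knife node run_list add webserver1 'recipe[nginx]'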

Component Configuration:

Our next task is to configure Nginx and the components required to serve our web content. For this post, we will use Nginx to host static web content.

  1. Configure Remote Access
    1. I provisioned a security group in AWS named Webserver, enabling SSH, HTTP and HTTPS traffic from all IPs, as sketched below. Depending on how you secure your deployment, you may prefer to deploy a single management host with SSH access and then allow SSH into the web server only from that host.
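      As a sketch, the equivalent EC2 API tools commands would look like this (the group name and description are examples):
      # Create the security group and open SSH, HTTP and HTTPS from all IPs
      ec2-create-group Webserver -d "Web server security group"
      ec2-authorize Webserver -P tcp -p 22 -s 0.0.0.0/0
      ec2-authorize Webserver -P tcp -p 80 -s 0.0.0.0/0
      ec2-authorize Webserver -P tcp -p 443 -s 0.0.0.0/0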
  2. HTTP Server Components
    1. We should make sure that we install Nginx with the latest components.
    2. Nginx should be configured with only the required components in order to minimize its memory footprint. When building from source, we can run the following configure command:
      ./configure --prefix=/webserver/nginx --without-mail_pop3_module --without-mail_imap_module  --without-mail_smtp_module --with-http_ssl_module  --with-http_stub_status_module  --with-http_gzip_static_module
    3. The Nginx AMI image is automatically configured at startup to serve a default index.html page at the location /usr/share/nginx/html
    4. To configure additional components, you can run the nginx-setup command. You will be asked to select which components to install, after which the script will install all prerequisite packages.
    5. After completion, the web application will be installed in the default location /var/www/default.
  3. Load Static content
    1. Our static content is stored on an EBS volume created from a snapshot, which we can access as follows:
      1. Attach the volume to the instance in the EC2 console
      2. Run fdisk -l to identify which device is our EBS volume (in this case /dev/xvdf1)
      3. Create a new directory for this EBS volume sudo mkdir /mnt/ec2snap
      4. Set permissions to access this directory sudo chmod 0777 /mnt/ec2snap
      5. Mount the device into this folder sudo mount /dev/xvdf1 /mnt/ec2snap -t ntfs
    2. Our static content can now be copied to our default folder location
      1. First, we rename the default files created at the time of installation
        sudo mv /usr/share/nginx/html/index.html /usr/share/nginx/html/index_old.html
      2. Then we perform the copy and set necessary permissions on the file
        sudo cp /mnt/ec2snap/html/index.html /usr/share/nginx/html
        sudo chmod 644 /usr/share/nginx/html/index.html
      3. Now we should test that our configuration is working by browsing to the server's public DNS name.
      4. Lastly, we should also unmount the EBS volume
        sudo umount /mnt/ec2snap
  4. Backup snapshot
    1. At this point, it’s wise to quickly run a snapshot before delving into the configuration files. (Command Ref.)
      ec2-create-snapshot --aws-access-key <YOUR_ACCESS_KEY> --aws-secret-key <YOUR_SECRET_KEY> --region us-west-1 vol-f41285d5 -d "backup-Nginx-$(date +%Y%m%d)"
      [Image: Creating the snapshot]

Nginx Optimization:

Nginx allows administrators to perform a considerable number of tweaks to optimize performance based on our underlying system resources. We’ve listed a number of basic tweaks here. Make sure that you thoroughly test these settings before deploying into a production environment.

  1. CPU and Memory Utilization – Nginx is already very efficient in how it utilizes CPU and memory. However, we can tweak several parameters based on the type of workload that we plan to serve. As we are primarily serving static files, we expect our workload to be less CPU-intensive and more disk-oriented.
    1. worker_processes – We can configure the number of single-threaded worker processes to be 1.5 to 2 times the number of CPU cores to take advantage of disk bandwidth (IOPS); see the sample configuration at the end of this sub-list.
    2. worker_connections – We can define how many connections each worker can handle. We can start with a value of 1024 and tune from there based on test results. The ulimit -n command returns the per-process file-descriptor limit, which caps the usable value of worker_connections.
    3. SSL Processing – SSL processing in Nginx is fairly processor-hungry, so if your site serves pages via SSL, you need to re-evaluate the worker_processes-to-CPU ratio. You can also turn off Diffie-Hellman key exchange and move to a quicker cipher if you're not subject to PCI standards. (Example: ssl_ciphers RC4:HIGH:!aNULL:!MD5:!kEDH;)
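    As a sketch, on a single-core m1.medium serving static files, the top of nginx.conf might look like this (the figures are illustrative starting points; validate them against ulimit -n and your own load tests):
      worker_processes 2;            # roughly 1.5-2x the core count for disk-bound workloads
      events {
          worker_connections 1024;   # starting value; bounded by ulimit -n
      }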
  2. Disk Performance – To minimize IO bottlenecks on the Disk subsystem, we can tweak Nginx to minimize disk writes and ensure that Nginx does not resort to on-disk files due to memory limitations.
    1. Buffer Sizes – Buffer sizes define how much data Nginx holds in memory for a request. A buffer size that is too low forces Nginx to spill request data to temporary files on disk, which introduces additional latency due to disk read/write IO response times.
      1. client_body_buffer_size: Specifies the client request body buffer size, used to handle POST data. If the request body is larger than the buffer, the whole body or a part of it is written to a temporary file.
      2. client_header_buffer_size: Sets the header buffer size for the request header from the client. For the overwhelming majority of requests, a buffer size of 1K is completely sufficient.
      3. client_max_body_size: Assigns the maximum accepted body size of a client request, as indicated by the Content-Length request header. If the size exceeds the configured value, the client receives a "Request Entity Too Large" (413) error.
      4. large_client_header_buffers: Assigns the maximum number and size of buffers used to read large client request headers. The request line cannot be bigger than one buffer, or nginx returns a "Request URI too large" (414) error. The longest header line must also fit within one buffer, or the client gets a "Bad request" (400) error. These parameters can be configured as follows:
        client_body_buffer_size 8K;
        client_header_buffer_size 1k;
        client_max_body_size 2m;
        large_client_header_buffers 2 1k;
    2. Access/Error Logging – Access logs record every request for a file and quickly consume valuable disk I/O. Error logging should not be set to too low (too verbose) a level unless we intend to capture every event; the warn level is sufficient for most production environments. We can also configure logs to buffer writes in chunks (e.g. 8 KB, 32 KB or 128 KB) before flushing to disk.
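      For example (the log paths are illustrative; test the buffer size and level against your own workload):
      access_log /var/log/nginx/access.log combined buffer=32k;
      error_log /var/log/nginx/error.log warn;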
    3. Open File Cache – The open_file_cache directive caches open file descriptors along with file metadata such as size and modification time.
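      A hedged example of these directives (the values are illustrative starting points):
      open_file_cache max=2000 inactive=20s;   # cache up to 2000 descriptors, drop entries idle for 20s
      open_file_cache_valid 60s;               # re-check cached entries every 60 seconds
      open_file_cache_min_uses 2;              # only cache files requested at least twice
      open_file_cache_errors on;               # cache file-not-found errors as well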
    4. OS File Caching – We can define parameters around the size of the cache used by the underlying server OS to cache frequently accessed disk sectors. Caching the web server content will reduce or even eliminate disk I/O.
  3. Network I/O and latency – There are several parameters we can tweak to optimize how efficiently the server manages network bandwidth under peak loads.
    1. Time outs – Timeouts determine how long the server maintains a connection and should be configured optimally to conserve resources on the server.
      1. client_body_timeout: Sets the read timeout for the client request body. The timeout applies only when the body is not received in one read step. If the client sends nothing within this time, nginx returns a "Request time out" (408) error.
      2. client_header_timeout: Sets the timeout for reading the client request header. The timeout applies only when the header is not received in one read step. If the client sends nothing within this time, nginx returns a "Request time out" (408) error.
      3. keepalive_timeout: The first parameter sets the timeout for keep-alive connections with the client; the server closes connections after this time. The optional second parameter sets the time value in the Keep-Alive: timeout=time response header, which can convince some browsers to close the connection themselves, so that the server does not have to. Without this parameter, nginx does not send a Keep-Alive header (though this is not what makes a connection "keep-alive"). The author of Nginx claims that 10,000 idle connections use only about 2.5 MB of memory.
      4. send_timeout: Sets the response timeout to the client. The timeout applies not to the entire transfer but to the interval between two read operations; if the client takes nothing within this time, nginx shuts down the connection. These parameters can be configured as follows:

        client_body_timeout   10;
        client_header_timeout 10;
        keepalive_timeout     15;
        send_timeout          10;
    2. Data compression – We can use gzip to compress our static data, reducing the size of the TCP payloads that must traverse the network to reach the client. Serving precompressed files via the gzip static module (built with --with-http_gzip_static_module, as in our configure command above) also avoids the CPU cost of compressing large files on the fly. It is enabled with the following parameters:
      gzip on;
      gzip_static on;
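      A few companion gzip directives are also worth setting; the values below are illustrative starting points rather than recommendations:
      gzip_comp_level 2;                                              # low compression level trades payload size for CPU
      gzip_min_length 1024;                                           # skip compressing very small responses
      gzip_types text/css application/javascript application/json;    # compress common text types beyond text/html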
    3. TCP Session parameters – The TCP_* parameters of Nginx and of the underlying OS network stack:
      1. TCP Maximum Segment Lifetime (MSL) – The MSL defines how long the server should wait for stray packets after closing a connection; this value defaults to 60 seconds on a Linux server.
    4. Increase System Limits – Specific parameters, such as the maximum number of open files and the number of ports available for serving connections, can be increased; a sketch follows this list.
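A minimal sketch of the kernel and shell settings referred to above (the values are illustrative and should be load-tested before production use):

  # /etc/sysctl.conf
  fs.file-max = 100000                        # raise the system-wide open file limit
  net.ipv4.ip_local_port_range = 1024 65535   # widen the ephemeral port range
  net.ipv4.tcp_fin_timeout = 30               # reclaim closing sockets faster than the 60s default

  # /etc/security/limits.conf
  nginx  soft  nofile  65536
  nginx  hard  nofile  65536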

Nginx configuration file:

Prior to rolling out any changes into production, it’s a good idea to first test our configuration files.

  1. We can run the command nginx -t to test our config file. We should make sure that we receive an 'OK' result before restarting the service.
    [Image: Testing the configuration]
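A convenient habit is to chain the test and the reload, so that a configuration error never takes the running server down:

  sudo nginx -t && sudo service nginx reload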

Conclusion:

In this post, we revealed how easy it is to set up an Nginx server on Amazon AWS. The reference links below provide a wealth of additional information on how to deploy Nginx under varying scenarios; please give them a read. Ook!

Road Chimp, signing off.

Reference:

http://nginx.org/en/docs/howto_setup_development_environment_on_ec2.html
http://www.lifelinux.com/how-to-optimize-nginx-for-maximum-performance/
Optimizing Nginx for High Traffic Loads
https://calomel.org/nginx.html
http://nginxcp.com/forums/Forum-help-and-support
http://wiki.nginx.org/Pitfalls
Configure start-stop script in Nginx
Configuring PHP-FPM on an Nginx AMI
nginx Installation On Amazon Linux AMI
Custom CentOS: http://www.idevelopment.info/data/AWS/AWS_Tips/AWS_Management/AWS_10.shtml
Chef Configuration for Nginx
Github Cookbooks for Nginx in Chef
http://kovyrin.net/2006/05/18/nginx-as-reverse-proxy/lang/en
http://www.kegel.com/c10k.html
http://forum.directadmin.com/showthread.php?p=137288
Optimising NginX, Node.JS and networking for heavy workloads
Configure Larger System Open File Limits