Building the next big thing on the web? It is never a good idea to serve your static files (style sheets, JavaScript and images) from your main application server. This task should be offloaded to another server for three reasons:

  1. Serving static files ties up resources in your app server that could be spent on other tasks.
  2. App servers are hardly ever optimized for simple file serving.
  3. Browsers can fetch resources in parallel from the two servers.  This can improve the load time significantly.

Most major websites, such as Facebook and Best Buy, use this concept. Some opt to use commercial services called Content Delivery Networks (CDNs) to accomplish this task. Two such companies are Akamai and Limelight. You can see this for yourself by checking the source of images on Best Buy’s site, for instance. Images have URLs such as:

http://www.bestbuyon.com/sites/default/files/COD_MW3_300x145.jpg

If you look up the domain bestbuyon.com, you will see that it is actually being served by multiple Akamai servers.

dig www.bestbuyon.com

;; QUESTION SECTION:
;www.bestbuyon.com.		IN	A

;; ANSWER SECTION:
www.bestbuyon.com.	86357	IN	CNAME	on.bestbuy.com.edgesuite.net.
on.bestbuy.com.edgesuite.net. 21557 IN	CNAME	a1982.b.akamai.net.
a1982.b.akamai.net.	20	IN	A	65.32.34.146
a1982.b.akamai.net.	20	IN	A	65.32.34.128

;; AUTHORITY SECTION:
b.akamai.net.		1755	IN	NS	n9b.akamai.net.
b.akamai.net.		1755	IN	NS	n3b.akamai.net.
b.akamai.net.		1755	IN	NS	n1b.akamai.net.
b.akamai.net.		1755	IN	NS	n5b.akamai.net.
b.akamai.net.		1755	IN	NS	n0b.akamai.net.
b.akamai.net.		1755	IN	NS	n4b.akamai.net.
b.akamai.net.		1755	IN	NS	n2b.akamai.net.
b.akamai.net.		1755	IN	NS	n8b.akamai.net.
b.akamai.net.		1755	IN	NS	n6b.akamai.net.
b.akamai.net.		1755	IN	NS	n7b.akamai.net.

These CDN services can be fairly expensive and may be beyond the budget of a startup or a side project. In my past experience, the starting price was around $700/mo. The good news is that you can create a less ambitious version of a CDN yourself using free, open-source tools. The setup I describe below can handle boat-loads of traffic. If you start outgrowing it, then you already have a successful website and should be earning enough to pay for a CDN!

Resizing Images on the fly

I have found over the years that it is always best to upload original images and resize them on the fly as needed. This can be useful in multiple scenarios:

  1. Think of an application like Facebook. Users upload images of all different shapes and sizes. You have to resize each image to fit a certain box area before displaying it on the web page.
  2. You have your website all set up and running. Six months later, you decide to redesign the pages. Do you recreate all your images from scratch? The better solution is to let our caching image server do the work.

The catch is that image resizing is a heavy task.  You don’t want to do that on every request.  Instead, the image server must cache the resized image for subsequent calls.  In our case, resized images are saved to disk and served directly from there.
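
The cache-on-first-request idea can be sketched in a few lines. This is only an illustration in Python, not the article's PHP script: `get_cached` and `fake_resize` are hypothetical names, and the "resize" is a stand-in that returns fixed bytes so the example runs without an imaging library.

```python
import os
import tempfile

def get_cached(cache_path, generate):
    """Return the bytes stored at cache_path, calling generate() only on a miss."""
    if os.path.exists(cache_path):
        # Cache hit: the expensive work is skipped entirely.
        with open(cache_path, "rb") as fh:
            return fh.read()
    # Cache miss: do the expensive work once and persist the result.
    data = generate()
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    with open(cache_path, "wb") as fh:
        fh.write(data)
    return data

calls = []

def fake_resize():
    # Hypothetical stand-in for the real (slow) image resize.
    calls.append(1)
    return b"resized-bytes"

cache_root = tempfile.mkdtemp()
target = os.path.join(cache_root, "cache", "150x200-0", "originals", "myface.jpg")

first = get_cached(target, fake_resize)   # miss: resize runs and the file is written
second = get_cached(target, fake_resize)  # hit: read straight from disk
print("resize calls:", len(calls))
```

In the setup described below, the "hit" branch never reaches any script at all, because the web server finds the file on disk and serves it itself.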

Tools of the trade – Linux, Nginx and PHP

For this project, we will use three open-source tools. They are listed below in the order in which they have to be installed. If you are Linux-savvy, the whole project should take less than an hour.

Any Linux Distribution

Install any recent version of Linux. I have used Ubuntu 10.04 LTS Server for my work, but you can use any flavor you like.

Nginx

Nginx (pronounced engine-x) is an amazing, lightweight web server that can serve files at blazing speeds while utilizing very few resources. In fact, I run it on an Amazon EC2 micro instance, which only gives me 640MB of RAM. The server design is asynchronous and event-driven. Unlike Apache, it does not dedicate a thread or process to each connection, so it scales a lot better. A nice side bonus is that the config files are a lot more sensible too.

prompt$ sudo apt-get install nginx

Next, set up a website using the following configuration. Change the domain name and directory location to suit your setup:

server {
        listen   80;
        server_name  static.example.com;
        root   /var/www/static.example.com/;

        # resized images: serve the cached file if present, otherwise resize it
        location ~ ^/cache/(.*)$
        {
                try_files $uri @resize;
                expires 4h;
        }

        location / {
                expires 4h;
        }

        location ~ \.php(.*)$  {
                fastcgi_pass 127.0.0.1:8000;
                fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
                fastcgi_index index.php;
                include /etc/nginx/fastcgi_params;
        }

        # hand the full request path to resize.php, which expects
        # path=/cache/<width>x<height>-<mode>/<path of original>
        location @resize {
                rewrite ^(/cache/.*)$ /resize.php?path=$1;
                include /etc/nginx/fastcgi_params;
                fastcgi_pass 127.0.0.1:8000;
        }
}

The above configuration performs the following actions:

  1. If the requested image exists on disk, serve it directly.
  2. If the image does not exist on disk and the URL does not contain /cache/, return 404.
  3. If the image does not exist on disk and the URL contains /cache/, call the PHP script via FastCGI, which creates a resized version of the image, saves it to disk and returns it to the caller.

As you can see, with the above setup, the image is resized just once. Future requests are served directly from disk with no overhead of a call to the PHP script via FastCGI. Additionally, expires 4h sets the cache expiration in the browser to 4 hours. You could set this much higher, even to 10 years or so, the idea being that a visitor to your website isn’t re-downloading the common images when going from page to page.
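
To make the three cases concrete, here is a small Python model of the decision. It is a sketch of the behaviour described above, not of nginx internals; `route` is a hypothetical helper and the set `disk` stands in for the files under the document root.

```python
def route(uri, files_on_disk):
    """Classify a request according to the three cases listed above."""
    if uri in files_on_disk:
        return "serve from disk"
    if uri.startswith("/cache/"):
        return "resize via PHP"
    return "404"

# Hypothetical starting state: only the original image exists.
disk = {"/originals/astro.jpg"}
resized = "/cache/300x150-2/originals/astro.jpg"

print(route("/originals/astro.jpg", disk))    # original exists
print(route("/originals/missing.jpg", disk))  # nothing to serve, not a cache URL
print(route(resized, disk))                   # first request for a resized image

# The PHP script saves its output, so the same URL is a plain file afterwards.
disk.add(resized)
print(route(resized, disk))
```

The last two calls show why the script only runs once per size: after the first request the resized file is just another static file.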

PHP

We will use a short PHP script to perform image resizing. This script makes use of the Imagick library, so make sure that is installed as well.

prompt$ sudo apt-get install php5-cgi php5-imagick

Create a file called /var/www/static.example.com/resize.php and enter the following content:

<?php
        # the opening tag above must be the very first thing in the file
        ini_set("memory_limit","80M");

        # prevent creation of new directories
        $is_locked = false;

        # figure out requested path and actual physical file paths
        $orig_dir = dirname(__FILE__);
        $path = $_GET['path'];
        $tokens = explode("/", $path);
        $file = "/".implode('/', array_slice($tokens,3));
        $orig_file = $orig_dir.$file;

        if (!file_exists($orig_file))
        {
                header("Status: 404 Not Found");
                echo "<br/>PATH=$path<br/>ORIGFILE=$orig_file";
                return 0;
        }

        # check if new directory would need to be created
        $save_path = "$orig_dir/cache/$tokens[2]$file";
        $save_dir = dirname($save_path);

        if(!file_exists($save_dir) && $is_locked)
        {
                header("Status: 403 Forbidden");
                echo "Directory creation is forbidden.";
                return 0;
        }

        # parse out the requested image dimensions and resize mode
        $x_pos = strpos($tokens[2], 'x');
        $dash_pos = strpos($tokens[2], '-');
        $target_width = substr($tokens[2], 0, $x_pos);
        $target_height = substr($tokens[2], $x_pos+1, $dash_pos-$x_pos-1);
        $mode = substr($tokens[2], $dash_pos+1);

        $new_width = $target_width;
        $new_height = $target_height;
        $image = new Imagick($orig_file);
        list($orig_width, $orig_height, $type, $attr) = getimagesize($orig_file);

        # preserve aspect ratio, fitting image to specified box
        if ($mode == "0")
        {
                $new_height = $orig_height * $new_width / $orig_width;
                if ($new_height > $target_height)
                {
                        $new_width = $orig_width * $target_height / $orig_height;
                        $new_height = $target_height;
                }
        }
        # zoom and crop to exactly fit specified box
        else if ($mode == "2")
        {
                // crop to get desired aspect ratio
                $desired_aspect = $target_width / $target_height;
                $orig_aspect = $orig_width / $orig_height;

                if ($desired_aspect > $orig_aspect)
                {
                        $trim = $orig_height - ($orig_width / $desired_aspect);
                        $image->cropImage($orig_width, $orig_height-$trim, 0, $trim/2);
                        error_log("HEIGHT TRIM $trim");
                }
                else
                {
                        $trim = $orig_width - ($orig_height * $desired_aspect);
                        $image->cropImage($orig_width-$trim, $orig_height, $trim/2, 0);
                }
        }

        # mode 1 (stretch to fit) is automatic fall-through as image will be blindly resized
        # in following code to specified box
        $image->resizeImage($new_width, $new_height, imagick::FILTER_LANCZOS, 1);

        # save and return the resized image file
        if(!file_exists($save_dir))
                mkdir($save_dir, 0777, true);

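        $image->writeImage($save_path);
        # note: no Content-Type header is set here, so this first response goes out
        # with PHP's default type (text/html); later requests are served by nginx
        echo file_get_contents($save_path);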

        return true;
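
The arithmetic in modes 0 and 2 is easier to follow with numbers plugged in. The Python sketch below mirrors the formulas from the script above; `fit_box` and `crop_for_fill` are hypothetical names, and the 1200x800 source image is an assumed example, not one from the article.

```python
def fit_box(orig_w, orig_h, target_w, target_h):
    """Mode 0: scale to fit inside the box while preserving aspect ratio."""
    new_w = target_w
    new_h = orig_h * new_w / orig_w
    if new_h > target_h:
        # Too tall at full width, so let the height be the limiting side.
        new_w = orig_w * target_h / orig_h
        new_h = target_h
    return new_w, new_h

def crop_for_fill(orig_w, orig_h, target_w, target_h):
    """Mode 2: crop region (width, height, x, y) to take before resizing."""
    desired_aspect = target_w / target_h
    orig_aspect = orig_w / orig_h
    if desired_aspect > orig_aspect:
        # Box is wider than the image: trim rows, half from top and half from bottom.
        trim = orig_h - orig_w / desired_aspect
        return orig_w, orig_h - trim, 0, trim / 2
    # Box is narrower than the image: trim columns, half from each side.
    trim = orig_w - orig_h * desired_aspect
    return orig_w - trim, orig_h, trim / 2, 0

# Assumed example: a 1200x800 original requested as 300x150.
print(fit_box(1200, 800, 300, 150))        # mode 0 result size
print(crop_for_fill(1200, 800, 300, 150))  # mode 2 crop before the final resize
```

For this example, mode 0 yields a 225x150 image (smaller than the box in one dimension), while mode 2 first crops the original to 1200x600, starting 100 pixels from the top, and then resizes that to exactly 300x150.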

You will need to start PHP as a FastCGI server listening on port 8000. You can download the /etc/init.d/php-fastcgi and /etc/default/php-fastcgi scripts from the following location:

http://blog.codefront.net/2007/06/11/nginx-php-and-a-php-fastcgi-daemon-init-script/

Once up and running, you should see an output similar to this:

prompt$ ps -ef | grep php
www-data 25255 31528  0 Nov10 ?        00:00:52 /usr/bin/php-cgi -q -b 127.0.0.1:8000
www-data 25295 31528  0 Nov10 ?        00:00:54 /usr/bin/php-cgi -q -b 127.0.0.1:8000
www-data 31528     1  0 Nov06 ?        00:00:00 /usr/bin/php-cgi -q -b 127.0.0.1:8000

At this point, all necessary software has been installed.   Next, we’ll look at the directory structures and how to invoke image resizing.

How to use our super-duper setup

Accessing an original file:

http://static.example.com/originals/myface.jpg

Accessing a resized file:

http://static.example.com/cache/150x200-0/originals/myface.jpg

As you can see, it is not difficult to construct the URL of a resized image if you know the location of the original. In the example above, 150x200 are the maximum dimensions of the returned image (note the plain letter x, not a multiplication sign). The last number, ‘0’, is the resizing mode:

Mode 0 Sizes the image to fit the box while preserving the aspect ratio. The returned image will be of the specified dimensions or smaller.
Mode 1 Stretches the image to fit the specified dimensions. Aspect ratio is not preserved.
Mode 2 Zooms and crops the image to fill the specified dimensions.
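
If you generate these URLs from application code, a tiny helper keeps the format in one place. The following Python sketch is an assumption about how you might do that; `resized_url` and `parse_resized_path` are hypothetical names, and the parser simply mirrors the URL layout described above.

```python
def resized_url(base, original_path, width, height, mode):
    """Build the URL of a resized image from the path of its original."""
    return f"{base}/cache/{width}x{height}-{mode}{original_path}"

def parse_resized_path(path):
    """Split /cache/<width>x<height>-<mode>/<original path> into its parts."""
    tokens = path.split("/")          # ['', 'cache', '150x200-0', 'originals', ...]
    size, mode = tokens[2].split("-", 1)
    width, height = size.split("x", 1)
    original = "/" + "/".join(tokens[3:])
    return int(width), int(height), mode, original

url = resized_url("http://static.example.com", "/originals/myface.jpg", 150, 200, 0)
print(url)
print(parse_resized_path("/cache/150x200-0/originals/myface.jpg"))
```

The first call reproduces the example URL above. A size written with a multiplication sign (300×200) instead of the letter x would fail to parse here, which is the same mistake that makes such a request miss the cache on the server.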

Verify Setup

Once set up, your directory structure should look something like the one below. Make sure the files are owned by the same user as the one running nginx (usually www-data).

root@myserver:/var/www/static.example.com$ tree
.
├── cache
│   ├── 600x300-2
│   │   └── originals
│   │       └── astro.jpg
│   └── 600x400-0
│       └── originals
│           └── astro.jpg
├── originals
│   └── astro.jpg
└── resize.php

It is fairly simple to test our setup. You can request an image at varying sizes and ensure they are returned. You should also check the HTTP response headers to make sure that the PHP script is not being called every time.

Original: http://static.example.com/originals/astro.jpg

Resized Mode 0: http://static.example.com/cache/300x200-0/originals/astro.jpg

Resized Mode 1: http://static.example.com/cache/300x100-1/originals/astro.jpg

Resized Mode 2: http://static.example.com/cache/300x100-2/originals/astro.jpg

HEAD output on first call to resize (served by PHP):

prompt$ HEAD http://static.example.com/cache/300x150-2/originals/astro.jpg
200 OK
Connection: close
Date: Fri, 11 Nov 2011 05:17:45 GMT
Server: nginx/0.7.65
Content-Type: text/html
Client-Date: Fri, 11 Nov 2011 05:17:45 GMT
Client-Peer: 184.72.219.132:80
Client-Response-Num: 1
X-Powered-By: PHP/5.3.2-1ubuntu4.7

HEAD output on subsequent calls (served directly by nginx):

prompt$ HEAD http://static.example.com/cache/300x150-2/originals/astro.jpg
200 OK
Cache-Control: max-age=14400
Connection: close
Date: Fri, 11 Nov 2011 05:19:00 GMT
Accept-Ranges: bytes
Server: nginx/0.7.65
Content-Length: 44024
Content-Type: image/jpeg
Expires: Fri, 11 Nov 2011 09:19:00 GMT
Last-Modified: Fri, 11 Nov 2011 05:17:45 GMT
Client-Date: Fri, 11 Nov 2011 05:19:00 GMT
Client-Peer: 184.72.219.132:80
Client-Response-Num: 1
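
The two outputs above can be told apart mechanically. Here is a minimal Python sketch, assuming you have the header text as a string; `parse_headers` and `served_by_php` are hypothetical helpers, and the sample headers are copied from the outputs above.

```python
from email.utils import parsedate_to_datetime

def parse_headers(text):
    """Turn 'Name: value' lines into a dict with lower-cased names."""
    headers = {}
    for line in text.strip().splitlines():
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip()
    return headers

def served_by_php(headers):
    # PHP announces itself with X-Powered-By; files served by nginx lack it.
    return "x-powered-by" in headers

first_call = parse_headers("""200 OK
Content-Type: text/html
X-Powered-By: PHP/5.3.2-1ubuntu4.7
""")

later_call = parse_headers("""200 OK
Cache-Control: max-age=14400
Date: Fri, 11 Nov 2011 05:19:00 GMT
Content-Type: image/jpeg
Expires: Fri, 11 Nov 2011 09:19:00 GMT
""")

# "expires 4h" shows up twice: as max-age in seconds and as Expires minus Date.
max_age = int(later_call["cache-control"].split("=")[1])
lifetime = parsedate_to_datetime(later_call["expires"]) - parsedate_to_datetime(later_call["date"])

print(served_by_php(first_call), served_by_php(later_call))
print(max_age, lifetime.total_seconds())
```

Both numbers on the last line come out as 14400 seconds, which is the 4 hours configured with expires 4h.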

One last thing

The PHP script in charge of resizing images creates directories as needed. For example, if an image was requested with the following URL:

/cache/300x150-2/originals/astro.jpg

The script would create the /cache/300x150-2 directory if it doesn’t already exist. Malicious users could exploit this to fill up your disk simply by requesting a particular file with constantly varying dimensions. In general, when you design a website, you use images at fixed dimensions. These don’t change until you alter the design. So during the cruise phase, it is best to set $is_locked = true in the script so that it doesn’t create new directories.
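
The effect of the lock can be sketched as follows. This Python snippet only models the check the script performs before writing (refuse when the target directory is missing and the lock is on); `may_write` is a hypothetical name and the directories live in a temporary folder.

```python
import os
import tempfile

def may_write(save_dir, is_locked):
    """Refuse only when the directory is missing AND the lock is on."""
    return os.path.exists(save_dir) or not is_locked

root = tempfile.mkdtemp()

# A size the site already uses: its cache directory exists.
known = os.path.join(root, "cache", "300x150-2", "originals")
os.makedirs(known)

# A size nobody has requested before: no directory yet.
unknown = os.path.join(root, "cache", "301x150-2", "originals")

print(may_write(known, is_locked=True))     # existing sizes keep working
print(may_write(unknown, is_locked=True))   # new sizes are refused
print(may_write(unknown, is_locked=False))  # unlocked: directory may be created
```

With the lock on, an attacker cycling through dimensions gets refused for every size that is not already in use, so the cache cannot grow beyond the sizes your design needs.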

Summary (or for the impatient)

  1. Install Linux
  2. sudo apt-get install nginx php5-cgi php5-imagick
  3. Create nginx virtualhost, start php5-cgi process
  4. Create resize.php in the directory structure shown above
  5. Test

15 Responses to How to build a scalable, caching, resizing image server

  1. pieter says:

    Dear Sumit Birla,

    We have set up the same environment as in your example. But when we type the caching URL we get the following error:

    /cache/300×200-0/originals/myface.jpg” failed (2: No such file or directory) …. request: “GET /cache/300%C3%97200-0/originals/myface.jpg HTTP/1.1”

    Please can you help us.?

    Grz

    Pieter

    • Sumit Birla says:

      Can you verify that the file at the location “/cache/300×200-0/originals/myface.jpg” actually exists?

      “GET /cache/300%C3%97200-0/originals/myface.jpg HTTP/1.1″ the %C3%97 should be an ‘x’ instead?

  2. Gregor Kuhlmann says:

    Exactly what I was looking for! With the help of your article, I got a similar setup running in no time.

  3. We take a very similar approach to serving images resized, cropped and with effects applied at Jux.com… I built an image API in Ruby that handles this. Code is up on Github. Thanks for the detailed post!

    http://magickly.jux.com/docs.html

  4. Jim says:

    Awesome post…however I too am getting a 404 on just the /cache urls. The files are there, looks like something with the rewrite. I am on 1.2.1 of nginx.

  5. Keith says:

    Does this setup also check if the existing original image has been modified, so somehow checking the last modified date and if its different then to regenerate the thumbnail

    • Sumit Birla says:

      No, it doesn’t check if the original file has been modified. I usually empty the cache folder to rebuild the resized images. The reason for not having a check is to avoid running any kind of code/script when the files are being served.

  8. sendy says:

    great post Sumit..

    how about like this

    HTTP/1.1 200 OK =>
    Server => nginx admin
    Date => Mon, 25 Feb 2013 14:11:58 GMT
    Content-Type => image/png
    Content-Length => 67534
    Connection => close
    Vary => Accept-Encoding
    X-Powered-By => PHP/5.2.17
    Cache-Control => max-age=864000, must-revalidate
    Last-Modified => Mon, 25 Feb 2013 14:11:05 GMT
    X-Cacheable => YES
    Accept-Ranges => bytes
    X-Varnish => 1792383513 1792383466
    Via => 1.1 varnish
    age => 0
    X-Cache => HIT
    X-Cache-Hits => 1

  9. Anonymous says:

    Here you go my scalable friend…

    <?php

    $curls = 0;

    for ($i = 10000, $iMax = 50000; $i /dev/null 2>&1 &');
    echo $url . "\n";

    $curls = (int) trim(exec('ps auxw | grep "curl -I" | wc -l'));
    while ($curls > 50) {
    echo "Uuhhhh too many curls ($curls)\n";
    sleep(5);
    $curls = (int) trim(exec('ps auxw | grep "curl -I" | wc -l'));
    }
    }

  10. andy says:

    I’m trying to install this on my server. I’ve had more trouble than it’s worth, i’m a complete newbie to nginx and you missed a few steps.

    Just so you know i couldnt get nginx using php with what you had, i kept getting a 502 error.

    This page helped solve it:
    http://wildlyinaccurate.com/solving-502-bad-gateway-with-nginx-php-fpm

    I’m now getting:

    “PHP message: PHP Fatal error: Class ‘Imagick’ not found in /

    Which seems to suggest i need to install a pear package, but i’m now getting

    running: phpize
    sh: 1: phpize: not found
    ERROR: `phpize’ failed

    So this is a bust for me, it looked so promising. Can you suggest a solution to my imagick problem?

    • Sumit Birla says:

      Depending on which linux distribution you are using, it may just be a matter of installing the right package. On Debian/Ubuntu, it would be ‘sudo apt-get install php5-imagick’ You will have to restart your php-fastcgi server after installing the imagick package. Hope this fixes your problem.

  12. naytro says:

    you should use X-Accel-Redirect instead of file_get_contents
