Why Bootdev — Dynamic CDN

In the old days, we put images, css / js, woff etc any assets to CDN, so that clients can download somewhere that geographically optimised.

Around end of 2012 – early 2013, new idea comes out like we should CDN everything, so that we can reduce the complex architecture with memcache, page cache cluster (Varnish) or even microccache etc. Just 1 layer cache and having everything to CDN. Like architecture below.

dnamic cdn

Your website domain name will directly point to CDN with CNAME. And then the CDN will point to your load balancer address or your web server. So it is like a proxy. When you do create, it will 100% bypass CDN and goes to web server, when u do UPDATE, the CDN will update from web server then invalidate itself, when you do DELETE, the CDN will bypass request to web server and invalidate its cache. When you read, it read from the CDN, not from web server. So the CRUD action can be completed.

You will need your own cache invalidation strategy, like send update to cloudfront / using versioning object or url.

Here is a sample conf of how we bypass some URLs to go to web server, and making Drupal works.

E1EB6A9C-17EC-446D-AD59-80B471A4F962 62367506-DDC3-4E5C-8F05-24E2D20DBBBB

With AWS cloudfront, you can bypass header ORIGIN, so that you can preform CORs actions. Also you can use similar header bypass feature to detect mobile/PC. With such architecture well setup, theoretically, you can have unlimited PV, as your server wont be really hitted. Your bound will be write DB bound only, which is not a concern in most case.

If you don’t want to understand all these, but want to lower your cost and have higher traffic and faster response, contact bootdev at founders@bootdev.com ! We can deploy Dynamic CDN to you in minutes no matter you are using AWS or not. We can point our CloudFront account to your server, it can be Azure, Linode, or any bare-meter. It just need to be Drupal, and you can enjoy the best performance ever.

ref: https://media.amazonwebservices.com/blog/cloudfront_dynamic_web_sites_full_1.jpg

Advertisements

Why BootDev — Nginx Config

We had spent many effort on Nginx configuration. Different with Apache + mod_php5, Nginx + php_fpm need much detail configuration and nginx is Drupal module dependent. It means some Drupal module require support from Nginx configuration. Like CDN module / Advagg module.

There is a github about Drupal + Nginx, but that will be too much and you will require to filter the necessary part in your project.

Here i share the main nginx configure

server {
  server_name *.compute.amazonaws.com;
  root   /opt/source/app;
  access_log  /var/log/nginx/access.log;
  error_log  /var/log/nginx/error.log;

  #include /etc/nginx/apps/drupal/drupal.conf;
  #Cache everything by default
  set $no_cache 0;
  #Don't cache POST requests
  if ($request_method = POST)
  {
    set $no_cache 1;
  }

  #Don't cache if the URL contains a query string
  if ($query_string != "")
  {    
    set $no_cache 1;
  }

  #Don't cache the following URLs
  if ($request_uri ~* "/(administrator/|login.php)")
  {
    set $no_cache 1;
  }

  #Don't cache if there is a cookie called PHPSESSID
  if ($http_cookie = "PHPSESSID")
  {
    set $no_cache 1;
  }

  # Enable compression, this will help if you have for instance advagg module
  # by serving Gzip versions of the files.
  gzip_static on;

  location = /favicon.ico {
    log_not_found off;
    access_log off;
  }

  location = /robots.txt {
    allow all;
    log_not_found off;
    access_log off;
  }

  # This matters if you use drush
  location = /backup {
    deny all;
  }

  # Very rarely should these ever be accessed outside of your lan
  location ~* \.(txt|log)$ {
    deny all;
  }

  location ~ \..*/.*\.php$ {
    return 403;
  }

  location / {
    # This is cool because no php is touched for static content
    try_files $uri @rewrite;
  }

  location @rewrite {
    # Some modules enforce no slash (/) at the end of the URL
    # Else this rewrite block wouldn't be needed (GlobalRedirect)
    rewrite ^/(.*)$ /index.php?q=$1;
  }

  location ~ \.php$ {
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    #NOTE: You should have "cgi.fix_pathinfo = 0;" in php.ini
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_intercept_errors on;
    fastcgi_pass unix:/var/run/php-fpm-www.sock;
    fastcgi_read_timeout 40;
    fastcgi_cache MYAPP;
    fastcgi_cache_valid 200 301 30s;
    fastcgi_cache_bypass $no_cache;
    fastcgi_no_cache $no_cache;

    # Set cache key to include identifying components
    fastcgi_cache_valid 302     1m;
    fastcgi_cache_valid 404     1s;
    fastcgi_cache_min_uses 1;
    fastcgi_cache_use_stale error timeout invalid_header updating http_500;
    fastcgi_ignore_headers Cache-Control Expires;
    fastcgi_pass_header Set-Cookie;
    fastcgi_pass_header Cookie;

    ## Add a cache miss/hit status header.
    add_header X-Micro-Cache $upstream_cache_status;

    ## To avoid any interaction with the cache control headers we expire
    ## everything on this location immediately.
    expires epoch;

    ## Cache locking mechanism for protecting the backend of too many
    ## simultaneous requests.
    fastcgi_cache_lock on;
  }

  # Catch image styles for D7.
  location ~ ^/sites/.*/files/ {
    try_files $uri @rewrite;
  }

  # Catch image styles for AmazonS3 D7.
  location ~ ^/system/files/styles/ {
    try_files $uri @rewrite;
  }

  location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg)$ {
    expires max;
    log_not_found off;
  }

  location ~* \.(eot|ttf|woff|svg) {
    add_header Access-Control-Allow-Origin *;
    try_files $uri @rewrite;
  }

  ##   
  # Advanced Aggregation module CSS
  ##   
  # http://drupal.org/project/advagg.
  ##    
  location ^~ /sites/default/files/advagg_css/ {
    expires max;
    add_header ETag '';
    add_header Last-Modified 'Wed, 20 Jan 1988 04:20:42 GMT';
    add_header Accept-Ranges '';
    add_header Access-Control-Allow-Origin *;
    
    location ~* /sites/default/files/advagg_css/css__[[:alnum:]-_]+\.css$ {
      access_log off;
      try_files $uri @drupal;
    }
  }

  ###
  ### CDN Far Future expiration support.
  ###
  location ^~ /cdn/farfuture/ {
    tcp_nodelay   off;
    access_log    off;
    log_not_found off;
    etag          off;
    gzip_http_version 1.0;
    if_modified_since exact;
    location ~* ^/cdn/farfuture/.+\.(?:css|js|jpe?g|gif|png|ico|bmp|svg|swf|pdf|docx?|xlsx?|pptx?|tiff?|txt|rtf|class|otf|ttf|woff|eot|less)$ {
      expires max;
      add_header X-Header "CDN Far Future Generator 1.0";
      add_header Cache-Control "no-transform, public";
      add_header Last-Modified "Wed, 20 Jan 1988 04:20:42 GMT";
      rewrite ^/cdn/farfuture/[^/]+/[^/]+/(.+)$ /$1 break;
      try_files $uri @nobots;
    }
    location ~* ^/cdn/farfuture/ {
      expires epoch;
      add_header X-Header "CDN Far Future Generator 1.1";
      add_header Cache-Control "private, must-revalidate, proxy-revalidate";
      rewrite ^/cdn/farfuture/[^/]+/[^/]+/(.+)$ /$1 break;
      try_files $uri @nobots;
    }
    try_files $uri @nobots;
  }

}

The idea of this config file is to support CDN far future, CDN, advagg, Drupal image styling, AmazonS3 and microcache modules. You need to catch different url pattern for different purpose.

For microcache, we put the cache into memory and expire every 30s, so that in each 30s, only the 1st visitor hit your site will generate by PHP. 2nd – N users will hit microcahe. In this approach, we can support high traffic website and then same time avoid handling cache invalidation problem.

Here I also share the cache config which we put nginx cache into memory which release better performance.

fastcgi_cache_path /dev/shm/microcache levels=1:2 keys_zone=MYAPP:5M max_size=256M inactive=2h;
fastcgi_cache_key "$scheme$request_method$host$request_uri";
add_header X-Cache $upstream_cache_status;
map $http_cookie $cache_uid {
  default nil; # hommage to Lisp :)
  ~SESS[[:alnum:]]+=(?<session_id>[[:alnum:]]+) $session_id;
}
map $request_method $no_cache {
  default 1;
  HEAD 0;
  GET 0;
}

You can read the comment inside the config file for more detail explanation.

This config requires to work with another PHP-FPM config, so that the memory is optimized. Then, you can estimate how many request per second that your server can serve. And i will talk about it next time.

Why bootdev — CDN configuration

CDN illustration

Why bootdev CDN config

One of BootDev’s website backend feature is CDN auto-deploy. The idea is we deploy the right way of using Amazon Cloudfront (CDN) automatically. You don’t need to investigate and do lots of work in Drupal to make things done.

From our hard earned experience, Drupal CDN configuration summarize as 2

  1. File storage + CDN (Drupal AmazonS3 module + CNAME with Cloudfront)
  2. Directly CDN (Drupal CDN Module)

1. Drupal AmazonS3 module

This module is great. It helps to move Drupal file system on Cloud. The advantage is you don’t need big local storage. But, you will be headache of using this. Some examples are

  • Generating XMLsitemap, the cron job will hang as it is not using local storage
  • new image style, you will need to turn off AmazonS3 module to add new style
  • Features module operations take longer, especially with image field operations
  • Multi-upload / insert image to editor, as the style generation link will be shown at the first time (/system/styles/files/….) will be different when the same style shown second time (CDN url). If you are not notice about it, your image will keep generate and waste lots lots of CPU.
  • Nginx Support, you will need to add AmazonS3 Nginx support (try url /system/stles/files)

For the multi-upload problem, you will need to insert images when u “edit” it rather than create. But, it will complain by your content editor. Another solution is running a quick fix database script to change the url in text body after image generated.

For the others, share some code snap as below, you can put this in your Drupal settings.php or config in admin page. Of coz, you will need to set your AWS right way like setting right S3 bucket as the Cloudfront origin, map to Route53 CNAME of your domain and put it on Cloudfront CNAME. That’s why you need bootdev if you are not understand what I’m talking about.

$conf[‘aws_key’] = ‘your key’;
$conf[‘aws_secret’] = ‘your secret’;
$conf[‘amazons3_bucket’] = ‘your bucket for AmazonS3 module’;
$conf[‘amazons3_cache’] = ‘1’;
$conf[‘amazons3_cloudfront’] = ‘1’;
$conf[‘amazons3_cname’] = ‘1’;
$conf[‘amazons3_domains’] = ‘cdn1.xxx.com
cdn2.xxx.com
your mutiple CNAME, to increase frontend performance.
You will need this patch https://www.drupal.org/node/2044307#comment-9384777 ‘;

For support nginx, so that image styles can be generated, you will need this

  # Catch image styles for D7.
location ~ ^/sites/.*/files/ {
try_files $uri @rewrite;
}

# Catch image styles for AmazonS3 D7.
location ~ ^/system/files/styles/ {
try_files $uri @rewrite;
}

2. Drupal CDN module

OK, now images are served. But what about the css/js/other static resources ? You will need CDN module.

You will need to config AWS to pull your data which we are not going to explain here howto. After that, you can have your CDN module to alter your site url, so that for examples: .css .js .xx will alter to CDN url.

Here is some basic config which u can put into your Drupal settings.php

$conf[‘cdn_basic_mapping’] = ‘http://d2fqrtkrfnbzdb.cloudfront.net| .css .js
http://yourlocalsite.com|.ttf .woff .svg .eot’;
$conf[‘cdn_mode’] = ‘basic’;

//CDN conf (Improve frontend performance)
$conf[‘cdn_farfuture_status’] = ‘1’;

d2fqrtkrfnbzdb.cloudfront.net is your CDN url, you can have other CNAME. You will see i put yourlocalsite.com as a CDN, which is not a good way. Why we are doing this is, we want to keep the webfront files serve as local. So that it won’t have CORS problem. Otherwise your font won’t work. Another approach is uploading your font to S3 manually and then manually set S3 CORS header and then set it as ClountFront origin. But, it make things complex and the font will leave the git repo management, which harder to deploy. So, we suggest to serve it locally, if you do not a very very large site.

You can also add .png next to .css / .js to serve your logo/icons or other resources. And Far future conf help to expire your resource at browser later and suggested to turn on.

To support far future, you will need below in nginx config, so that your CDN can curl the far future path. Reference: https://www.drupal.org/node/2380397

###
### CDN Far Future expiration support.
###
location ^~ /cdn/farfuture/ {
tcp_nodelay   off;
access_log    off;
log_not_found off;
etag          off;
gzip_http_version 1.0;
if_modified_since exact;
location ~* ^/cdn/farfuture/.+\.(?:css|js|jpe?g|gif|png|ico|bmp|svg|swf|pdf|docx?|xlsx?|pptx?|tiff?|txt|rtf|class|otf|ttf|woff|eot|less)$ {
expires max;
add_header X-Header “CDN Far Future Generator 1.0”;
add_header Cache-Control “no-transform, public”;
add_header Last-Modified “Wed, 20 Jan 1988 04:20:42 GMT”;
rewrite ^/cdn/farfuture/[^/]+/[^/]+/(.+)$ /$1 break;
try_files $uri @nobots;
}
location ~* ^/cdn/farfuture/ {
expires epoch;
add_header X-Header “CDN Far Future Generator 1.1”;
add_header Cache-Control “private, must-revalidate, proxy-revalidate”;
rewrite ^/cdn/farfuture/[^/]+/[^/]+/(.+)$ /$1 break;
try_files $uri @nobots;
}
try_files $uri @nobots;
}

After you do all above, you can have CDN run smoothly and also cloud storage + CDN for Drupal 🙂 Or use bootdev, one click, all set.

P.S.

Why cloudfront / CDN ?

  1. Faster your content to audience
  2. Lower your AWS network traffic cost
  3. Available world wide
  4. Lower your Server CPU/Memory as requests share to CDN

Why BootDev — For website builders’ Backend as a service (wBaaS)

Recently, we launched a project, bootdev which deploys a configured Drupal site (Drucloud) into a pre given architecture into users’ personal AWS account. We call it website backend as a service (wBaaS). With this approach, we can deliver our hard earned experience to other business owner or developers without REWORK of what we did.

What we deliver is just configuration and knowledge. After the deployment, we dont host it, users’ host their site in their own AWS. We also provide a technical foundation / playground for people to add features /  experience best practice(s).

Currently available with Drupal 7.

With just 1 click, you can have pre configured like:

  • Caching
    • Nginx micro-cache
    • Cache pre-warmer
    • PHP APC
    • Memcache (AWS ElastiCache)
    • MySQL Query Cache
  • Database
    • AWS RDS configuration for Drupal
    • Multi-AZ
    • Secure by under AWS VPC
  • Web server
    • Nginx Drupal configuration
      • advagg css / js compression support
      • Image style catching
      • Micro cache
      • AWS Cloudfront CDN support (far future expire)
      • Cache control headers
    • PHP-FPM
      • APC
      • Max client configurations under EC2 m3.large
  • SOLR Search
    • SOLR index cron job
    • Drupal SOLR integration
  • DevOps
    • Cloudformation
    • Chef
    • Auto-scaling
    • Git deploy with bitbucket private repo + deploy key
  • Maintenance
    • System maintenance cron job
    • xmlsitemap
    • social network stat
  • Email
    • SPAM email control
    • Mass mail support (AWS SES support)
  • MAP
    • Google Map integration
  • Social
    • Social network meta-tag
  • File handling
    • Drupal S3 integration
    • Push to CDN
    • Separate file handling with other Drupal services
  • CDN
    • Drupal CDN configuration
    • Suport CORs
  • Server architecture
    • 2* EC2 m3.large web server
    • 1* EC2 m1.small chef server
    • 1* Mutli-AZ m3.large RDS
    • 1* EC2 m3.large SOLR server
    • 2* ElastiCache node m1.medium
    • Cloudfront CDN
    • Amazon S3
    • Auto-scaling with ELB
    • 2* EC2 m1.small GlusterFS
    • Inside VPC

In coming topics, I will explain why you need each of those configurations to make your site awesome.

recommend_package_datasheet(1)

Email pipe

Hi all,

If you have a email server on Linux server, you can pipe email to a backend script easily.

This allows you to parse email or control server in email.

Take an example for PHP

echo 'test: "php -q |/your/script.php"' >> /etc/aliases

This command will need php cli support, in Ubuntu, enter

sudo apt-get install php5-cli

And then create a file called script.php start with #!

  1. #!/usr/bin/php
  2. <?php
  3. // read in email from stdin
  4. $fd = fopen("php://stdin", "r");
  5. $email = "";
  6. while (!feof($fd)) {
  7. $email .= fread($fd, 1024);
  8. }
  9. fclose($fd);

The email content will be in $email, you can do whatever you like

If you are thinking of email command, add a line like

  1. exec ($email);

If you are thinking of calling Drush without managing drupal init externally, you can 

  1. exec ("drush your-custom-command " .$email);


Of coz, it needs certain privilege.

Enjoy ~

Keith

Drupal Adding apache solr index for field-collection

If images / content are bounded by field collection module, you can try approach below to index to solr

enjoy

 

//Index product images

$field_collection_items = field_get_items(‘node’, $node, ‘field_image_set’);

$field_collection_item_ids = array();

$i = 0;

foreach ($field_collection_items as $field_collection_item){

$field_collection_item_ids[$i] = $field_collection_item[‘value’];

$i++;

}

$field_collection_item_fields = entity_load(‘field_collection_item’, $field_collection_item_ids);

$image_fields = array();

$i = 0;

foreach ($field_collection_item_fields as $field_collection_item_field){

$image_fields[$i] = field_get_items(‘field_collection_item’, $field_collection_item_field, ‘field_image’);

$i++;

}

 

$i = 0;

foreach ($image_fields as $image_field){

foreach ($image_field as $image)

$path = file_create_url($image[‘uri’]);

$document->setMultiValue(‘sm_field_image_’.$i++, $path); //Set multiple field_image values

}