Why Bootdev — Controlling Cache Headers

According to the AWS documentation, when you want to cache objects in the browser, whether they come from S3 or from CloudFront, and at the same time support CORS resources such as fonts, you can use the MaxAgeSeconds parameter: http://docs.aws.amazon.com/AmazonS3/latest/dev/cors.html

In all the tests I tried, Chrome does not really respect MaxAgeSeconds; you still need the traditional Cache-Control: max-age=xxx AND Expires headers. When using AWS CloudFront as your edge cache / CDN, and especially when adding S3 as your origin, you need to take special care of your cache headers.

You can use the API / CLI / console UI to change the cache headers in the metadata section of an S3 object.

[Screenshot: S3 object metadata settings]
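For example, with the AWS CLI you can rewrite an object's metadata in place (a sketch; the bucket name, one-year max-age, Expires date and content type are placeholder assumptions):

  # copy the object onto itself, replacing its metadata with explicit cache headers
  aws s3 cp s3://your-bucket/fonts/Neutra2Display-Titling.woff \
            s3://your-bucket/fonts/Neutra2Display-Titling.woff \
    --metadata-directive REPLACE \
    --cache-control "max-age=31536000" \
    --expires "2030-01-01T00:00:00Z" \
    --content-type "application/font-woff"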

And in your bucket's permissions, set the CORS configuration:

[Screenshot: S3 bucket CORS configuration]
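The same kind of CORS rule can also be pushed with the AWS CLI instead of the console (a sketch; the bucket name and allowed origin are placeholders):

  # allow GET from your site; MaxAgeSeconds lets browsers cache the CORS preflight response
  aws s3api put-bucket-cors --bucket your-bucket --cors-configuration '{
    "CORSRules": [{
      "AllowedOrigins": ["http://xxxx.example.com"],
      "AllowedMethods": ["GET"],
      "AllowedHeaders": ["*"],
      "MaxAgeSeconds": 3000
    }]
  }'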

Once you have succeeded in setting those up, you can test your settings with curl -I and an Origin header. If you use Chrome to test, REMEMBER:

  1. DON'T click refresh
  2. DON'T press CMD+R
  3. Click another link on your website to test instead

Otherwise, you will end up with lots of confusion!

Run the command:

curl -I http://xxxxxx.example.com/fonts/Neutra2Display-Titling.woff -H "Origin: xxxx.example.com"

[Screenshots: curl output showing the X-Cache header]

The first time you will see "X-Cache: Miss from cloudfront". If it is your production site URL, you may ask why, since many people should already be visiting this object. Because the request headers differ from a normal browser's, CloudFront treats it as a new object. So, no worries.

The second time you curl, you will see "Hit from cloudfront". So with this setup your resource (a font in this case) will be cached on CloudFront for a long time, and once it is downloaded to the browser it will be cached locally for as long as Cache-Control: max-age specifies.

P.S. CloudFront respects Cache-Control, so how long your browser caches = how long your object stays on CloudFront.

With MaxAgeSeconds only, your resource is kept in the browser via 304 responses.

With the Cache-Control and Expires headers, your resource stays at 200 (from cache).

Question: so what does MaxAgeSeconds do here? Is there any special requirement where we would always want a 304 rather than a 200 (from cache)? I need someone to answer this for me as well 🙂

Why Bootdev — Dynamic CDN

In the old days, we put images, CSS / JS, woff files and other assets on a CDN, so that clients could download them from somewhere geographically optimised.

Around the end of 2012 / early 2013, a new idea came out: put everything behind the CDN, so that we can get rid of the complex architecture of memcache, a page cache cluster (Varnish) or even micro-caching. Just one cache layer, with everything on the CDN, like the architecture below.

[Diagram: dynamic CDN architecture]

Your website domain name points directly to the CDN via a CNAME, and the CDN then points to your load balancer address or your web server, so it acts like a proxy. When you CREATE, the request bypasses the CDN completely and goes to the web server; when you UPDATE, the CDN fetches the update from the web server and then invalidates itself; when you DELETE, the CDN passes the request through to the web server and invalidates its cache; and when you READ, you read from the CDN, not from the web server. So all the CRUD actions can be completed.
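A quick way to confirm that a page is really being served through CloudFront after the CNAME switch (a sketch; the hostname is a placeholder):

  # CloudFront adds an X-Cache header (Hit/Miss) and a Via header to responses it serves
  curl -I http://www.example.com/ | grep -iE 'x-cache|via'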

You will need your own cache invalidation strategy, such as sending invalidations to CloudFront or versioning the object or URL.
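For example, an explicit invalidation can be pushed with the AWS CLI (a sketch; the distribution ID and paths are placeholders):

  # invalidate an updated node page and the front page on CloudFront
  aws cloudfront create-invalidation \
    --distribution-id E1XXXXXXXXXXXX \
    --paths "/node/123" "/"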

Here is a sample conf of how we pass some URLs through to the web server to make Drupal work.

[Screenshots: sample CloudFront configuration for the bypassed URLs]

With AWS CloudFront, you can forward the Origin header, so that you can perform CORS actions. You can also use the same header-forwarding feature to detect mobile vs desktop. With such an architecture set up well, you can theoretically serve unlimited page views, since your server will hardly be hit. Your bottleneck becomes database writes only, which is not a concern in most cases.

If you don't want to understand all of this, but want to lower your cost and get higher traffic and faster responses, contact Bootdev at founders@bootdev.com! We can deploy a Dynamic CDN for you in minutes, whether you are using AWS or not. We can point our CloudFront account at your server; it can be on Azure, Linode, or any bare-metal box. It just needs to be Drupal, and you can enjoy the best performance ever.

ref: https://media.amazonwebservices.com/blog/cloudfront_dynamic_web_sites_full_1.jpg

fastcgi with apache, virtualhost per user setup

I'm showing a working conf for a per-user fastcgi VirtualHost with its own php.ini on Apache 2.

I googled some references, and many of them do not work with Apache, e.g. putting FcgidInitialEnv inside the VirtualHost. Below is a tested, working conf for fastcgi with Apache, with one VirtualHost per user via suExec.

For installing the dependencies on CentOS 6, see the references below. ***Remember to add the EPEL repo.

http://ruleoftech.com/2013/using-php-fpm-with-apache-2-on-centos
http://mcdee.com.au/fastcgi-with-php-apache-centos6/
http://kuaileyongshi.blog.51cto.com/1480649/648789

Beware: suexec does not allow a uid < 500, so you cannot use apache (CentOS) or www-data (Ubuntu) in SuexecUserGroup.
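You can check the limits compiled into your suexec binary (a sketch; the path is the CentOS default):

  # AP_UID_MIN / AP_GID_MIN show the lowest uid/gid suexec will accept
  /usr/sbin/suexec -V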

Just add the Apache conf below in /etc/httpd/conf.d/anyfilename.conf; this conf runs a PHP wrapper via FCGIWrapper.

The wrapper can then specify which php.ini to load.

<IfModule fcgid_module>
  FcgidInitialEnv PHPRC "/php"
</IfModule>

# using mod_fcgid and user specific php.ini
# You can copy this part, add user and create more virtualhost on demand
<VirtualHost *:80>
  ServerName fcgi1.keithyau.com
  DocumentRoot "/var/www/keithyau"
  SuexecUserGroup keithyau keithyau

  <Directory /var/www/keithyau>
    Options +FollowSymLinks +ExecCGI
    AllowOverride All
    AddHandler fcgid-script .php
    FCGIWrapper /var/www/change-this/fcgi1/fcgi1-wrapper .php
    Order allow,deny
    Allow from all
  </Directory>
</VirtualHost>

The fcgi1-wrapper is something like this. PHPRC is the folder containing the custom php.ini, where you can add more settings.

  #!/bin/sh
  PHPRC=/var/www/keithyau/fcgi1
  export PHPRC
  export PHP_FCGI_MAX_REQUESTS=5000
  export PHP_FCGI_CHILDREN=8
  exec /var/www/html/fastcgi-bin/php-cgi

[Screenshot: phpinfo() output showing the loaded php.ini]

We can configure things like disable_functions in this php.ini. Finally, check the loaded configuration with phpinfo().
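A quick way to check which php.ini was actually loaded, without opening a browser (a sketch; the docroot and hostname are taken from the VirtualHost above, and info.php is a hypothetical test file):

  # drop a phpinfo() page into the vhost docroot
  echo '<?php phpinfo();' > /var/www/keithyau/info.php

  # the "Loaded Configuration File" row should point at the php.ini under PHPRC
  curl -s http://fcgi1.keithyau.com/info.php | grep -i 'Loaded Configuration File'

  # remove the test file afterwards
  rm /var/www/keithyau/info.php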

Complete solution to bypass GFW (SQUID + SSH Tunnel + AWS)

I used a single solution to bypass the GFW for a long time: a VPN, an SSH tunnel, or SQUID. Each of them has pros and cons, and sometimes things are not smooth. In particular, I want to watch videos that are only licensed for free in China, but also keep using Facebook, and a VPN can't help with that.

As the GFW grows more powerful, an SSH tunnel or SQUID alone sometimes stops working. Recently I have been using the combination of SSH tunnel + SQUID as my solution, and it works well.

For junior technical people, the idea is a bit complex. What I am doing is using SSH to encrypt the SQUID requests (mapping the remote SQUID port to a newly created local port through the SSH tunnel).

There is a free-tier AWS EC2 instance available for my 2012 subscription, so I use AWS for my remote server.

SSH Tunnel
  localhost:8080  =====(SSH tunnel, over port 22)=====>  localhost:3128 (remote SQUID on AWS)
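On Linux or macOS, the tunnel above can be created with a single ssh command (a sketch; the key file, user, and hostname are placeholders for your own EC2 instance):

  # bind local port 8080 and forward it through SSH to port 3128 (SQUID) on the remote host
  ssh -i your-aws-key.pem -N -L 8080:localhost:3128 ec2-user@your-ec2-host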

 

With this setup, I can use FoxyProxy (Google Chrome) or AutoProxy (Firefox) to forward all requests from the browser to the locally bound SQUID port (8080). So I can enjoy HTTP/HTTPS requests (but not SOCKS5), and those requests are encrypted, so the GFW won't drop them by filtering keywords.

To establish the tunnel above, the easiest way on a Windows system is to use Bitvise, with the settings shown below (of course, with your AWS key and the ability to log in).

[Screenshot: Bitvise tunnel settings]

Enjoy

Keith

Email pipe

Hi all,

If you have an email server on a Linux box, you can pipe email to a backend script easily.

This allows you to parse email or control the server by email.

Take PHP as an example:

echo 'test: "|/usr/bin/php -q /your/script.php"' >> /etc/aliases
newaliases   # rebuild the alias database so the new alias takes effect

This needs PHP CLI support; on Ubuntu, install it with

sudo apt-get install php5-cli

Then create a file called script.php that starts with a #! line:

  #!/usr/bin/php
  <?php
  // read in the email from stdin
  $fd = fopen("php://stdin", "r");
  $email = "";
  while (!feof($fd)) {
    $email .= fread($fd, 1024);
  }
  fclose($fd);

The email content will be in $email; you can do whatever you like with it.

If you are thinking of running commands by email, add a line like

  exec($email);

If you are thinking of calling Drush without managing the Drupal bootstrap externally, you can

  exec("drush your-custom-command " . $email);


Of course, it needs certain privileges.

Enjoy ~

Keith

Drupal: Adding an Apache Solr index for field collections

If images / content are bound by the Field Collection module, you can try the approach below to index them into Solr.

enjoy

 

  // Index product images
  $field_collection_items = field_get_items('node', $node, 'field_image_set');

  $field_collection_item_ids = array();
  $i = 0;
  foreach ($field_collection_items as $field_collection_item) {
    $field_collection_item_ids[$i] = $field_collection_item['value'];
    $i++;
  }

  $field_collection_item_fields = entity_load('field_collection_item', $field_collection_item_ids);

  $image_fields = array();
  $i = 0;
  foreach ($field_collection_item_fields as $field_collection_item_field) {
    $image_fields[$i] = field_get_items('field_collection_item', $field_collection_item_field, 'field_image');
    $i++;
  }

  $i = 0;
  foreach ($image_fields as $image_field) {
    foreach ($image_field as $image) {
      $path = file_create_url($image['uri']);
      $document->setMultiValue('sm_field_image_' . $i++, $path); // Set multiple field_image values
    }
  }

 

Adding an index to Apache Solr via a Drupal hook

1. Add the index with setMultiValue:

  function scp_solr_search_apachesolr_update_index(&$document, $node) {
    $document->setMultiValue('sm_field_xxxx', $node->field_xxxx['und'][0]['value']);
  }

2. Modify the query:

  function scp_solr_search_apachesolr_modify_query(&$query, $caller) {
    $query->params['fl'] .= ',sm_field_xxxx';
  }

3. Update the Solr index and done!

* scp_solr_search is my module name. Enjoy!