Tips for Optimizing & Updating MySQL

Your database is one of the most important aspects of your server software stack.  Call me old fashion, but I still rely on MySQL as my database of choice and don’t see myself changing any time soon.  Over the past 7 years of running FantasySP, I’ve learned quite a bit about maintaining MySQL and making sure my server is running as smooth as possible.

Keeping MySQL Up to Date

The most obvious way to ensure MySQL performs its best is to make sure that your server software is up to date.  Updating to the latest minor release of MySQL should be one of the safer routes to go in.  These fix bugs, security, and minor performance improvements.  You should update to the minor release versions on a pretty regular basis.

A major release version upgrade is much more involved.  If you see that MySQL is nearing its end of life, then it’s time you start planning.   I have done 2 major upgrades from 5.1 -> 5.5, and then from 5.5 -> 5.6.  Chances are you will encounter problems with query performance with a major release update.  MySQL Query Cache Optimizer can behave very differently depending on what major release version you are using.   Queries that ran fine on 5.4 will run poorly on 5.5.

When Upgrading MySQL to a New Major Release

In order to safely upgrade to a newer MySQL major release version you should first make sure that you are running something like NewRelic to monitor your query performance.  Get a good baseline for typical application performance so you can compare it to when you upgrade.

To be safe, you could be running the old & new version of MySQL on your development box if you can.  This way you can plug in two queries with the same dataset on two different version of MySQL and get an idea of performance beforehand.  I usually don’t do this and just deal with the shitstorm that ensues, but some of you may want to play it safe.

Usually when a query performs badly it is because the newer version of MySQL decided to ignore the indices that were previously created for a table.  I have seen this happen quite a few times.  If you are running a vanilla WordPress installation then you probably won’t have to deal with this, but custom apps will probably encounter this problem.  You’ll be able to see bad query performance when you use EXPLAIN on your troublesome query. Full table scans lead to terrible performance, so don’t be alarmed if this happens.

To fix this you may have to develop a better index to speed up performance, or you can force MySQL to use a specific index with the following query:

SELECT * FROM  table_name USE INDEX (names) WHERE last = ‘something’

Another reason an updated major release of MySQL runs poorly could be that your configuration file is outdated.  Sometimes they may change the names of something or remove it and MySQL will start with Query Cache disabled or default settings.  Make sure you save a backup of your original my.conf file.

Useful Scripts to See Performance

Using NewRelic is great, but there are two popular database scripts to help with diagnosing performance issues.  You have probably heard of them, but I just want to double check you have them.  One is mysqltuner.pl and the other is tuning-primer.sh.

They will tell you how your cache is performing, if you have full table scans, how many prunes are happening per day, if your query cache is big enough, etc.  All very useful stuff to determine performance.  I am assuming you know how to deal with fixing those issues, but if you don’t you can Google them or buy a MySQL book or two to learn more.

Software to Manage & Run MySQL Queries

My development environment of choice is Windows.  With that in mind, I like to run a local MySQL application to make my life easier to diagnose queries, create tables, indexes, synchronize data, etc.

My software of choice is SQLyog and can’t say enough about how awesome it is.  If you deal with custom MySQL queries a lot, then I strongly suggest using an application similar to this to make your life easier.

Pruning Old Data

MySQL databases tend to be like old attics where you just keep saving shit even if you don’t actually need it.  If you have data from 2011 that you don’t need then make sure you get rid of it.

If you properly prune your database on an annual basis then you can extend the life of your current hardware stack, save money, and have much better performance.  A smaller database means less space for MyISAM indexes, which means you need less memory for caching.  (Assuming you use MyISAM)

So the big question is, how do you prune old rows in a table with 20 GIG worth of data without using DELETE queries? We all know that DELETE queries are way too expensive to run at this scale.  The solution?  Create a new temporary table with the same indices and table names.  Then copy the data you need from the old table to the new table.  Once your done, DROP the original table and rename your new table to the old table.

I learned this trick from Sean at the Clicky blog and trimmed the size of my database in half.

 

Google PageSpeed Service Review

I’ve been on the lookout for a service that would reliably speed up various web projects such as FantasySP and Top IAmA.  The faster the pages load, the happier my users will be.  I first started out using Cloudflare, but their service has been getting worse and worse.  Their pay features aren’t worth the money and using Cloudflare will most likely slow down response times.  A winning combination!

Then I heard Google PageSpeed was in invite only beta. (Not to be confused with mod_pagespeed, which is an Apache module that offers very similar features).  Google PageSpeed Service is very similar to cloudflare and all traffic will be passed through their servers via a DNS change.  I was finally accepted into the program and have spent the past few days tweaking my sites for best performance.

Though before we get to the results, let’s first go over why FantasySP tends to load a little slow to begin with.

What’s really slowing down my site are advertisements and third party javascript. No surprise there, right?  I offer my members a banner free experience, but the average joe has to load a page that is slower than I’d like.  To measure front-end performance I use NewRelic.  Prior to using Google PageSpeed Service, the average web page loads anywhere from 6 to 8 seconds.

Enter Google PageSpeed Service

I started out using Recommended Settings, which are the things you are probably familiar with: remove whitespace, combine CSS, minfiy javascript/css, use Google’s CDN, optimize images etc.  I decided to go all in with Google even though I already used Amazon Cloudfront as my CDN.  Google adds some additional goodies such as converting images to inline DATA, further reducing HTTP requests. It also automatically converts all references to images in stylesheets to run off their CDN, even if it’s not local to my server.  (Goodbye Amazon Cloudfront?).

Google PageSpeed went live on Christmas day, the 25th.

NewRelic: Browser performance
NewRelic: Browser performance

Immediately all pages started to load in under 5 seconds.  Best of all, no functionality on the site was broken, and I did not detect any additional latency by using Google’s service.

On December 26th I decided to enable additional experimental features that are labeled “high risk”.  I also enabled a new low risk feature called prefetch DNS resolve.

The best way to solve slow javascript is to defer javascript, which should prevent javascript from hanging and slowing down page rendering.  Google claims this is “high risk” and may break things, so I made sure to do thorough testing during a day that would have less traffic than normal.

Once defer javascript was enabled, you’ll notice that DOM Processing decreased even further, whereas page rendering actually increased.  Unfortunately I cannot explain why that is. (Experts, feel free to chime in).   So the next question might be, is deferring javascript even worth it according to that graph?  The difference between the 25th and 26th do not seem to matter?

Let’s have a look at what deferring javascript does in a browser not known for speed: Internet Explorer 9.  By enabling Google PageSpeed on December 25th, the pageload times decreased to under 6 seconds for the first time in seven days.  On the 26th, deferring javascript decreased pageload times to around 3 seconds.  Keep in mind that number includes banner advertisements.

Load Times in Internet Explorer 9
Load Times in Internet Explorer 9

 

Clearly deferring javascript helps with IE9, but what about the other browsers?

Windows firefox
Windows firefox
Mac Chrome 23
Mac Chrome 23
Windows Chrome 23
Windows Chrome 23

In every browser, you’ll notice that DOM processing and network times decreased, whereas page rendering times increased.  My theory is that since the javascript is deferred, the dom is processed a lot faster, but that doesn’t mean the page is fully rendered.  Again, feel free to chime in here with your own explanation.  I *think* the end user will feel as though the pages load a lot faster, despite the additional page rendering.

Unfortunately I am unable to test Android and iPhone browser performance because these users are directed to a different sub domain. Plus, I don’t think Google supports these browsers for deferring Javascript.  Older browser performance from IE8/IE7 remain unchanged because many of Google’s optimizations are for modern browsers only.

According to my testing, the previous bottlenecks like advertisements and rendering Google Charts no longer slow down pageloads.

Next up is to see performance from Google Webmaster Tools.  When I used Cloudflare, Googlebot noticed huge performance issues.  Cloudflare caused my 200 ms response times to double and even triple.  Will the same results appear for Google’s Service?

Google Webmaster Tools
Google Webmaster Tools

As you can see, passing traffic through Google PageSpeed servers does cause a penalty of about 100 ms in response times.  Their servers are doing an awful lot of optimizations behind the scenes, so this is not at all surprising.  The trade off is that the end user gets entire seconds shaved off their load times.  I think I’ll take that any day.

More Advanced Optimization Techniques

Of course, I am not satisfied there and wanted to further push the boundaries of what Google PageSpeed Service can do.  Next up is to cache and prioritize visible content.  Cloudflare has something similar, but they call it railgun. Railgun also requires you to run an extra process to send HTTP data back and forth to Cloudflare to show them which HTML content is stale so it can be cached.  I have no idea how well Railgun performs since no one has actually reviewed the service.

Google’s cache and prioritize visible content works a bit different.  You do not need to run any additional service on your server.  Instead they parse the HTML based on a set of rules you specify and rely on using Javascript to speed up rendering times.  Their website says: “Non-cacheable portions are stripped from the HTML, and the rest of the page is cached on PageSpeed servers. Visible content of the page is prioritized on the network and the browser so it doesn’t have to compete with the rest of the page.”  Visible content means, content that the end user actually can see on their browser.  I do not know if it is user specific, or they just base this off the most common resolution.  So anything above the fold will load immediately, whereas content below the fold may load via a Javascript call.

If deferring javascript works on your site then this feature should work once configured correctly.    Here is their full video explaining the feature:

I went ahead and applied FULL caching to Top IAmA with no rules.  This site has no advertisements and very little javascript.  Response times here are a best case scenario   Every page that gets rendered can be cached for 30 minutes.  This means that Google only needs to fetch a page once every 30 minutes, otherwise it serves it straight from their servers.  Googlebot shows the results:

Googlebot crawl speed Top IAmA
Googlebot crawl speed Top IAmA

You’ll notice two things about this graph… 1) The response times to the far left are terrible and 2) The response times under Google PageSpeed with cached content are extremely fast.  Response times went from 200ms to around 50 ms.  The response times to the far left are a result of Cloudflare’s Full Caching features with Rocket Loader enabled. As I mentioned earlier, avoid Cloudflare at all costs.  The response times in the middle of the graph are from my server.

I did attempt to add Google’s cache and prioritize visible content to FantasySP.  Once I tallied up all of the IDs and classes needed for user specific data I previewed the feature site-wide.  After a some tweaking of class names, everything seemed to be working without any issues.  Although I soon ran into occasional Javascript errors under IE10 that would halt rendering the full page.

This shows you how fragile this feature is because a single javascript error will cause rendering to fail.

The error for IE10 shows as:

SCRIPT5007: Unable to get property ‘childNodes’ of undefined or null reference
f2ae86053f8f64f57a4ef28a17bd0669-blink.js, line 30 character 224

Every other browser seemed to have no issues whatsoever.  IE10 seemed to load the fastest, with Chrome, Safari, and Firefox not too far behind.  I applied this technique to a certain subset of pages on FantasySP that I feel confident will have no errors and can use the speed boost.  Have a look at this Russel Wilson or Matt Ryan page.  During pageload, watch the console to see how different sections of the page load via javascript calls from blink.js. If you don’t see it on the first pageload, then refresh to see the cached version appear.

Google Analytics

The final set of data will be looking at “Avg. Document Content Loaded Time (sec)” metric in Google Analytics under Site Speed -> DOM Timings.  Google Analytics explains this metric as: “Average time (in seconds) that the browser takes to parse the document and execute deferred and parser-inserted scripts (DOMContentLoaded), including the network time from the user’s location to your server.”

I broke up the data by browser and they include: Chrome, Internet Explorer, Safari, and Firefox.  Internet Explorer comes in just a hair faster than Chrome.  Although Safari seems to be worse.

DOMContentLoaded
DOMContentLoaded

Google Analytics confirms what the NewRelic graphs show.  Users will perceive pageloads as much faster with Javascript being deferred.

Conclusion

According to NewRelic browser metrics and testing, Google PageSpeed Service offers significant improvements to load times.  It is relatively easy to set up and is better than any other competing service.  Sites both big and small should use at least some of the services provided.

Google PageSpeed Service with advanced optimization techniques is a perfect solution when it comes to simple WordPress blogs that use Disqus commenting.  All you have to do is add an ignore rule for /wp-admin/* and enjoy a much faster blog that should be able to handle an almost unlimited amount of visits and traffic.  I’d love to see results from anyone else out there willing to try.

Overall I’d say using Google PageSpeed is a no brainer.  Test it out and let me know your thoughts.

Update #1: Over at the hackernews thread, Jeff Kaufman, who works with the PageSpeed team gave some very insightful comments that are a must read.

Update #2: This blog you are currently reading now uses Google PageSpeed Service with advanced caching techniques, such as Google’s cache and prioritize visible content.

Fix the WordPress SQL_CALC_FOUND_ROWS Bug

Anyone who has a WordPress blog with a lot of posts will eventually encounter an extremely slow query. I refer to this as the SQL_CALC_FOUND_ROWS Bug. If you have slow-query-log enabled then a query similar to this might have shown up before:

SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts WHERE 1=1 AND wp_posts.ID NOT IN (44682, 44657, 44630) AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish') GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 24580, 5

Just how detremential is this to your blogs performance?  Well thanks to newrelic, I can show you:

SQL_CALC_FOUND_ROWS Bug

Unfortunately I don’t know why WordPress runs this query.  What I do know is that it apparently only shows up on index.php.  What you probably care about is how to fix this problem.   I’ve located a possible workaround thanks to this open ticket at wordpress.org.  The diff log on the changes are listed as well.  I went ahead and applied these code changes to wp-includes/query.php.  The fixed query.php can be found here.

How are the results so far?  Inconclusive.  I just applied this patch and nothing broke so far, which is always a plus.  I suggest you give it a try and see how your blog responds in a development environment.  If I STILL spot slowdowns in the revised query, then I will update this post and let you know.

Some of you might be asking, does this affect your version of WordPress? The answer is yes.  I am running WordPress 3.2.1.

Please do post your thoughts, concerns, or comments to help others out.

Does Cloudflare live up to the hype?

I have given Cloudflare a try for the past week or so for FantasySP.com and want to pass along my thoughts.  I saw a few reviews that were quite astonishing and decided to find out if they are accurate.  I have been reading great things about it and couldn’t resist trying it out.

Will Cloudfront Increase Pages Per Visit?

Unfortunately no.  The only reason you’ll see an increase in pages per visit is if you change the setting to add your Google Analytics code to every page.  I decided to use this myself and look at the results:

Pages/Visit

Notice the bump in pages/visit?  Almost double.  That is when I added the Analytics code to every page and it completely skewed everything out of whack.   Meanwhile GetClicky and awstats both showed them to be at normal levels.  So clearly this Cloudflare setting should not be used and anyone who shows these type of drastic results are just lying to themselves.

I obviously don’t think Cloudflare adds this setting to artificially increase analytics stats.  Although, I can’t explain why this happens, since they claim its fine to add without removing the old code.

Will Cloudflare Decrease Bounce Rate?

Probably not.  If anything, it could have a slight increase in your bounce rate depending on how much bad bot/spam traffic you were getting.  Although, a bad bot might even help your bounce rate because they are often times so abusive and load many pages.  If Google Analytics shows a drastic decrease in bounce rate then it is due to the Cloudflare setting to add the script to every page.  Have a look at my bounce rate:

Bounce Rate

Sorry folks, Cloudflare is no miracle bounce rate solution. :(

Will Cloudflare Increase the Speed of the Site?

Yes, even using their free service!  Cloudflare has a whole list of features that show how they can decrease the time it takes for a page to load.  They will cache content, similar to a CDN and provide optimized routes to your site, and block abusive traffic.  If you pay for their service then you will see even better benefits (none of which I have tested).

So I guess the real question is, does Cloudflare make FantasySP faster?  Yes and no.  I have most of my static content already on Amazon’s Cloudfront.  My DNS before Cloudflare was Amazon’s Route 53.  So how much of a benefit am I really getting when some of Cloudflare’s services were already taken care of?  Here is a look at Googlebot’s time it takes to crawl FantasySP.com:

Googlebot Crawl Rate

According to Googlebot, the site isn’t faster, perhaps slightly slower if anything.  It could be that Amazon’s Route 53 provides equally as good routing for end users and I don’t see much of a difference.  Either way, there isn’t enough evidence to make a call on this.  What really matters is if there is a difference for the end user in a real browser.  For my site, javascript is what is holding back the end user experience.  Which brings up my next point, Rocket Loader.

Does Rocket Loader Improve Speeds?

To test Rocket Loader’s performance, I will be using NewRelic’s End User performance tracking.  I have been using this for well over a month and it should show any trends with end user speeds.

As you can see, there is a large decrease in pageload time and goes from 8 seconds to 6! Awesome right?  Well, unfortunately I noticed that Rocket Loader was screwing up some of the rendering of advertisements that were on the site.  Thus making my site’s javascript unusable.  Rocket Loader works fine with Google AdSense, but AdSense doesn’t generally hang as much because of their recent optimizations.

What I really wanted Rocket Loader to do, was to improve performance of the other ad networks that I use.  These are the ones causing all the slowndown.  Alas, it wasn’t meant to be.  Clearly it was making a difference, but it is a bit buggy at this point.  My next step is to selectively use Rocket Loader on certain javascript components and see if it will make a difference in overall load time without breaking things.  Stay Tuned for that!

So Where Does This Leave Us?

After reading all of this, you might think I’d come to the conclusion that Cloudflare isn’t worth the trouble.  Exactly the opposite.  Cloudflare delivers on a lot of the promises of the service.  Just be sure you don’t make mistakes like enabling Google Analytics on every page.  That feature just doesn’t make any sense to me.  (I’d love to hear your thoughts on this.)

If you aren’t using a CDN or a customized DNS service, then switching to Cloudflare should have an impact on quicker load times.  Even if you do have a CDN and a customized DNS service, the fact that Cloudflare blocks bad threats and saves your server resources and prevents lots of spam.  It’s also nice to see trending data for search engine bots and outbound links.

I’d love to try out Cloudflare’s pay only services to see if they improve upon speeds, but they do not offer a trial for such a thing.  Perhaps down the line I will test these as well.  I also think Rocket Loader will continue to improve and will truly make a difference for some users out there.

What do you guys think?

My Review of New Relic, with some Performance Numbers

When optimizing your web application or server, the process can be a bit daunting.  In fact, its usually something that is never fully checked off your to-do-list.  The reason being is that you’ll never really know how your application will scale until it starts to show its growing pains.  Just when you thought that you’ve addressed the performance issues, a few weeks or months go by and new ones will crop up.

Before New Relic came along, I used to cat the slow-query-log.  I would address any queries that I saw took a while to execute.   Then there is the MySQL Performance Tuning Primer Script.  Also a great tool and will give you a good idea on how MySQL is performing as a whole.  I would continue to look at these things, but the load averages kept climbing and I could not figure out what the problem was.  I also couldn’t figure out which pages took the longest to load.

Enter New Relic

I actually considered throwing more hardware at the problem, since I could not figure it out through the usual means.  Then New Relic came along and I decided to give it a try. Here is my site performance graph since I first started using New Relic:

Performance

You’ll notice that when it was first turned on, the database was clearly the issue. The problem was UPDATING a table that has almost 2 million rows. (Duh.  Apparently, it was way more costly than I ever imagined)  Incrementing the view count on one of the main tables was the root cause to 90% of my problems.  I went ahead and created dedicated tables to hold these stats  and load averages eventually settled to around .20 across the board.  MySQL was no longer an issue.  FantasySP was silky smooth, once again!

The best part about New Relic is that once you set it up, you don’t have to spend time SSH’ing your box to find performance issues.  Everything you need is much easier to read, not to mention there is a performance breakdown based on each page/script.  Now I can easily see the pages, ajax requests, and background tasks that take the longest to complete.  All of these can be easily optimized now.  Their interface isn’t perfect, but it’s improving every day.

Impact on Google Bot & Organic Search

Of course when you optimize your website you do so for the end user experience.  The added plus to this is that Google Bot gets a huge benefit from it too.  I’ve been reading over and over about how load time is now a ranking factor.  Sure, but it’s not that big of a deal, right?  Well, lets take a look at how Google Bot responded to these tweaks:

FantasySP.com | Pages Crawled Per Day
FantasySP.com | Time Spend Downloading
m.FantasySP.com | Pages Crawled Per Day
m.FantasySP | Time Spend Downloading

Clearly, the faster the response time, the more pages that can be crawled in a given day.  The average response time for FantasySP.com is now at around 100ms.  The average response time for Mobile FantasySP is at 72ms.  Google Bot quickly realized that the site is performing faster and ramped up its crawl rate.

So you might be wondering, has the faster and more responsive site resulted in more search engine traffic from Google?  So far, the answer is yes.  By about 5 – 10% the past 7 straight days. Coincidence?  I doubt it.  Getting a bump in SERPs for being a considerably faster site is very real.  There is no question that the easier it is for Google Bot to crawl, the better off I’ll be in the long run.

Conclusion

New Relic has just released a whole bunch of new changes such as Real User Monitoring , an improved install process, and a much more affordable pricing structure.  It doesn’t matter if your a company of 15 developers or just one developer working on a side project. . . New Relic will be worth it.  In fact, for smaller sites you should be able to take advantage of New Relic’s 14 day professional trial and be done optimizing by the time your trial is finished.  However, I’ve been addicted to the stats and will eventually subscribe to their “Standard Package”.

New Relic essentially takes the guesswork out of optimization.  Now you can easily track down costly plugins in WordPress and even compare performance numbers from week to week.  This is by far the best tool you can use to make your site or app run faster.  Spend the time and install it, or submit a support ticket and have your host do it.

I can go on and on about New Relic and you will probably get a bit lost in the data for a few days.  So go ahead and give it a try.