Your database is one of the most important parts of your server software stack. Call me old-fashioned, but I still rely on MySQL as my database of choice and don’t see myself changing any time soon. Over the past 7 years of running FantasySP, I’ve learned quite a bit about maintaining MySQL and making sure my server runs as smoothly as possible.
Keeping MySQL Up to Date
The most obvious way to ensure MySQL performs its best is to keep your server software up to date. Updating to the latest minor release of MySQL is one of the safer routes to take: these releases fix bugs, patch security holes, and bring minor performance improvements. You should apply minor release updates on a pretty regular basis.
A major release upgrade is much more involved. If you see that your MySQL version is nearing its end of life, then it’s time to start planning. I have done 2 major upgrades: from 5.1 -> 5.5, and then from 5.5 -> 5.6. Chances are you will encounter query performance problems with a major release upgrade. The MySQL query optimizer can behave very differently depending on which major release you are running, so queries that ran fine on 5.1 may run poorly on 5.5.
When Upgrading MySQL to a New Major Release
In order to safely upgrade to a newer MySQL major release version you should first make sure that you are running something like NewRelic to monitor your query performance. Get a good baseline for typical application performance so you can compare it to when you upgrade.
To be safe, you could run the old and new versions of MySQL side by side on your development box if you can. This way you can plug the same query and dataset into two different versions of MySQL and get an idea of performance beforehand. I usually don’t do this and just deal with the shitstorm that ensues, but some of you may want to play it safe.
Usually when a query performs badly it is because the newer version of MySQL decided to ignore the indexes that were previously created for a table. I have seen this happen quite a few times. If you are running a vanilla WordPress installation then you probably won’t have to deal with this, but custom apps probably will. You’ll be able to spot the bad query plan when you run EXPLAIN on your troublesome query. Full table scans lead to terrible performance, so don’t be surprised if that turns out to be the culprit.
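If you want to see this for yourself, run EXPLAIN against the slow query. The table and column names here are placeholders based on the example below, not an actual schema:

EXPLAIN SELECT * FROM table_name WHERE last = 'something';

If the type column comes back as ALL and the key column is NULL, MySQL is scanning the whole table instead of using one of your indexes.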
To fix this you may have to develop a better index to speed up performance, or you can force MySQL to use a specific index with the following query:
SELECT * FROM table_name USE INDEX (names) WHERE last = 'something'
Another reason an upgraded major release of MySQL runs poorly could be that your configuration file is outdated. Variables are sometimes renamed or removed between versions, and MySQL will silently fall back to defaults or start with the Query Cache disabled. Make sure you save a backup of your original my.cnf file.
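A quick sanity check after an upgrade is to ask MySQL what settings it actually loaded. These two statements work on any 5.x release:

SHOW VARIABLES LIKE 'query_cache%';
SHOW GLOBAL STATUS LIKE 'Qcache%';

If query_cache_size comes back as 0, your old settings did not survive the upgrade and MySQL is running with the query cache disabled.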
Useful Scripts to See Performance
Using NewRelic is great, but there are two popular database scripts to help with diagnosing performance issues. You have probably heard of them, but I just want to double check you have them. One is mysqltuner.pl and the other is tuning-primer.sh.
They will tell you how your cache is performing, if you have full table scans, how many prunes are happening per day, if your query cache is big enough, etc. All very useful stuff to determine performance. I am assuming you know how to deal with fixing those issues, but if you don’t you can Google them or buy a MySQL book or two to learn more.
Software to Manage & Run MySQL Queries
My development environment of choice is Windows. With that in mind, I like to run a local MySQL application to make my life easier to diagnose queries, create tables, indexes, synchronize data, etc.
My software of choice is SQLyog, and I can’t say enough about how awesome it is. If you deal with custom MySQL queries a lot, then I strongly suggest using an application like this to make your life easier.
Pruning Old Data
MySQL databases tend to be like old attics where you just keep saving shit even if you don’t actually need it. If you have data from 2011 that you don’t need then make sure you get rid of it.
If you properly prune your database on an annual basis then you can extend the life of your current hardware stack, save money, and have much better performance. A smaller database means less space for MyISAM indexes, which means you need less memory for caching. (Assuming you use MyISAM)
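If you aren’t sure which tables deserve the attention, information_schema can rank them by size. Replace your_database with your actual schema name:

SELECT table_name,
       ROUND(data_length / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.TABLES
WHERE table_schema = 'your_database'
ORDER BY data_length + index_length DESC;

The index_mb column is the one that matters most for MyISAM key caching.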
So the big question is, how do you prune old rows from a table with 20 GB worth of data without using DELETE queries? We all know that DELETE queries are way too expensive to run at this scale. The solution? Create a new table with the same structure and indexes as the original. Then copy the data you need from the old table into the new one. Once you’re done, DROP the original table and rename the new table to the old name.
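As a sketch, the whole trick looks something like this. The stats table and the date cutoff are hypothetical; substitute your own table and whatever “data I still need” means for you:

CREATE TABLE stats_new LIKE stats;  -- same columns and indexes as the original
INSERT INTO stats_new SELECT * FROM stats WHERE created_at >= '2012-01-01';
RENAME TABLE stats TO stats_old, stats_new TO stats;  -- atomic swap
DROP TABLE stats_old;  -- reclaim the space

RENAME TABLE swaps both names in a single atomic statement, so queries never see a missing table during the switch.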
I learned this trick from Sean at the Clicky blog and trimmed the size of my database in half.
I’ve been on the lookout for a service that would reliably speed up various web projects such as FantasySP and Top IAmA. The faster the pages load, the happier my users will be. I first started out using Cloudflare, but their service has been getting worse and worse. Their paid features aren’t worth the money, and using Cloudflare will most likely slow down response times. A winning combination!
Then I heard Google PageSpeed Service was in an invite-only beta. (Not to be confused with mod_pagespeed, which is an Apache module that offers very similar features.) Google PageSpeed Service is very similar to Cloudflare: all traffic passes through their servers via a DNS change. I was finally accepted into the program and have spent the past few days tweaking my sites for best performance.
Before we get to the results, let’s first go over why FantasySP tends to load a little slowly to begin with.
Enter Google PageSpeed Service
Google PageSpeed went live on Christmas day, the 25th.
Immediately all pages started to load in under 5 seconds. Best of all, no functionality on the site was broken, and I did not detect any additional latency by using Google’s service.
On December 26th I decided to enable additional experimental features that are labeled “high risk”. I also enabled a new low risk feature called prefetch DNS resolve.
According to my testing, the previous bottlenecks like advertisements and rendering Google Charts no longer slow down pageloads.
Next up is to see performance from Google Webmaster Tools. When I used Cloudflare, Googlebot noticed huge performance issues. Cloudflare caused my 200 ms response times to double and even triple. Will the same results appear for Google’s Service?
As you can see, passing traffic through Google PageSpeed servers does carry a penalty of about 100 ms in response times. Their servers are doing an awful lot of optimizations behind the scenes, so this is not at all surprising. The trade-off is that the end user gets entire seconds shaved off their load times. I think I’ll take that any day.
More Advanced Optimization Techniques
Of course, I am not satisfied there and wanted to further push the boundaries of what Google PageSpeed Service can do. Next up is to cache and prioritize visible content. Cloudflare has something similar, which they call Railgun. Railgun also requires you to run an extra process that sends HTTP data back and forth to Cloudflare to show them which HTML content is stale so it can be cached. I have no idea how well Railgun performs, since no one has actually reviewed the service.
You’ll notice two things about this graph… 1) The response times to the far left are terrible and 2) The response times under Google PageSpeed with cached content are extremely fast. Response times went from 200ms to around 50 ms. The response times to the far left are a result of Cloudflare’s Full Caching features with Rocket Loader enabled. As I mentioned earlier, avoid Cloudflare at all costs. The response times in the middle of the graph are from my server.
The error for IE10 shows as:
SCRIPT5007: Unable to get property 'childNodes' of undefined or null reference
f2ae86053f8f64f57a4ef28a17bd0669-blink.js, line 30 character 224
The final set of data will be looking at “Avg. Document Content Loaded Time (sec)” metric in Google Analytics under Site Speed -> DOM Timings. Google Analytics explains this metric as: “Average time (in seconds) that the browser takes to parse the document and execute deferred and parser-inserted scripts (DOMContentLoaded), including the network time from the user’s location to your server.”
I broke up the data by browser: Chrome, Internet Explorer, Safari, and Firefox. Internet Explorer comes in just a hair faster than Chrome, while Safari fares noticeably worse.
According to NewRelic browser metrics and testing, Google PageSpeed Service offers significant improvements to load times. It is relatively easy to set up and is better than any other competing service. Sites both big and small should use at least some of the services provided.
Google PageSpeed Service with advanced optimization techniques is a perfect solution when it comes to simple WordPress blogs that use Disqus commenting. All you have to do is add an ignore rule for /wp-admin/* and enjoy a much faster blog that should be able to handle an almost unlimited amount of visits and traffic. I’d love to see results from anyone else out there willing to try.
Overall I’d say using Google PageSpeed is a no brainer. Test it out and let me know your thoughts.
Anyone who has a WordPress blog with a lot of posts will eventually encounter an extremely slow query. I refer to this as the SQL_CALC_FOUND_ROWS Bug. If you have slow-query-log enabled then a query similar to this might have shown up before:
SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts WHERE 1=1 AND wp_posts.ID NOT IN (44682, 44657, 44630) AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish') GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 24580, 5
Just how detrimental is this to your blog’s performance? Well, thanks to New Relic, I can show you:
How are the results so far? Inconclusive. I just applied this patch and nothing broke so far, which is always a plus. I suggest you give it a try and see how your blog responds in a development environment. If I STILL spot slowdowns in the revised query, then I will update this post and let you know.
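For anyone who wants to experiment without hunting down the patch itself, the usual shape of the fix is to drop SQL_CALC_FOUND_ROWS and split the work into two cheaper queries, something along these lines (based on the query shown above):

-- Page query without the row-count bookkeeping:
SELECT wp_posts.* FROM wp_posts
WHERE wp_posts.ID NOT IN (44682, 44657, 44630)
  AND wp_posts.post_type = 'post'
  AND wp_posts.post_status = 'publish'
ORDER BY wp_posts.post_date DESC
LIMIT 24580, 5;

-- Separate count, which the optimizer can often satisfy from an index:
SELECT COUNT(*) FROM wp_posts
WHERE wp_posts.post_type = 'post'
  AND wp_posts.post_status = 'publish';

The second query avoids forcing MySQL to materialize the entire result set just to count it.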
Some of you might be asking, does this affect your version of WordPress? The answer is yes. I am running WordPress 3.2.1.
Please do post your thoughts, concerns, or comments to help others out.
I have given Cloudflare a try on FantasySP.com for the past week or so and want to pass along my thoughts. I had been reading great things about it, saw a few reviews that were quite astonishing, and decided to find out if they were accurate.
Will Cloudflare Increase Pages Per Visit?
Unfortunately no. The only reason you’ll see an increase in pages per visit is if you change the setting to add your Google Analytics code to every page. I decided to use this myself and look at the results:
Notice the bump in pages/visit? Almost double. That is when I added the Analytics code to every page, and it completely skewed everything out of whack. Meanwhile, GetClicky and awstats both showed traffic at normal levels. So clearly this Cloudflare setting should not be used, and anyone who shows these kinds of drastic results is just lying to themselves.
I obviously don’t think Cloudflare adds this setting to artificially inflate analytics stats. Still, I can’t explain why this happens, since they claim it’s fine to add without removing the old code.
Will Cloudflare Decrease Bounce Rate?
Probably not. If anything, it could slightly increase your bounce rate, depending on how much bad bot/spam traffic you were getting. Then again, a bad bot might even help your bounce rate, because bots are often so abusive that they load many pages. If Google Analytics shows a drastic decrease in bounce rate, then it is due to the Cloudflare setting that adds the script to every page. Have a look at my bounce rate:
Sorry folks, Cloudflare is no miracle bounce rate solution. :(
Will Cloudflare Increase the Speed of the Site?
Yes, even using their free service! Cloudflare has a whole list of features that show how they can decrease the time it takes for a page to load. They will cache content like a CDN, provide optimized routes to your site, and block abusive traffic. If you pay for their service then you will see even better benefits (none of which I have tested).
So I guess the real question is, does Cloudflare make FantasySP faster? Yes and no. I have most of my static content already on Amazon’s Cloudfront. My DNS before Cloudflare was Amazon’s Route 53. So how much of a benefit am I really getting when some of Cloudflare’s services were already taken care of? Here is a look at Googlebot’s time it takes to crawl FantasySP.com:
Does Rocket Loader Improve Speeds?
To test Rocket Loader’s performance, I will be using NewRelic’s End User performance tracking. I have been using this for well over a month and it should show any trends with end user speeds.
So Where Does This Leave Us?
After reading all of this, you might think I’d come to the conclusion that Cloudflare isn’t worth the trouble. Exactly the opposite. Cloudflare delivers on a lot of the promises of the service. Just be sure you don’t make mistakes like enabling Google Analytics on every page. That feature just doesn’t make any sense to me. (I’d love to hear your thoughts on this.)
If you aren’t using a CDN or a customized DNS service, then switching to Cloudflare should have a noticeable impact on load times. Even if you do have a CDN and a customized DNS service, Cloudflare still blocks bad threats, saves your server resources, and prevents lots of spam. It’s also nice to see trending data for search engine bots and outbound links.
I’d love to try out Cloudflare’s pay only services to see if they improve upon speeds, but they do not offer a trial for such a thing. Perhaps down the line I will test these as well. I also think Rocket Loader will continue to improve and will truly make a difference for some users out there.
Optimizing your web application or server can be a bit daunting. In fact, it’s usually something that is never fully checked off your to-do list. The reason is that you’ll never really know how your application will scale until it starts to show its growing pains. Just when you think you’ve addressed the performance issues, a few weeks or months go by and new ones crop up.
Before New Relic came along, I used to cat the slow-query-log and address any queries that took a while to execute. Then there is the MySQL Performance Tuning Primer Script, also a great tool that gives you a good idea of how MySQL is performing as a whole. I kept looking at these things, but the load averages kept climbing and I could not figure out what the problem was. I also couldn’t figure out which pages took the longest to load.
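If you want to replicate that pre-New-Relic workflow, the slow query log can be switched on at runtime on MySQL 5.1 and later; the one-second threshold here is just a starting point:

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- log anything slower than 1 second
SHOW VARIABLES LIKE 'slow_query_log_file';  -- where to point cat or tail

Just remember to add the same settings to my.cnf if you want them to survive a restart.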
Enter New Relic
I actually considered throwing more hardware at the problem, since I could not figure it out through the usual means. Then New Relic came along and I decided to give it a try. Here is my site performance graph since I first started using New Relic:
You’ll notice that when it was first turned on, the database was clearly the issue. The problem was UPDATING a table that has almost 2 million rows. (Duh. Apparently, it was way more costly than I ever imagined.) Incrementing the view count on one of the main tables was the root cause of 90% of my problems. I went ahead and created dedicated tables to hold these stats, and load averages eventually settled to around .20 across the board. MySQL was no longer an issue. FantasySP was silky smooth once again!
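For anyone curious, the dedicated stats table pattern is simple. Everything here is a hypothetical sketch, not my actual schema:

CREATE TABLE page_views (
    item_id INT UNSIGNED NOT NULL PRIMARY KEY,
    views   INT UNSIGNED NOT NULL DEFAULT 0
);

-- Increment without touching the wide 2-million-row table:
INSERT INTO page_views (item_id, views) VALUES (123, 1)
    ON DUPLICATE KEY UPDATE views = views + 1;

The hot UPDATE now hits a tiny two-column table, so the main table’s indexes and caches are left alone.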
The best part about New Relic is that once you set it up, you don’t have to spend time SSH’ing into your box to find performance issues. Everything you need is much easier to read, not to mention there is a performance breakdown for each page/script. Now I can easily see the pages, ajax requests, and background tasks that take the longest to complete, and all of them can be optimized. Their interface isn’t perfect, but it’s improving every day.
Impact on Google Bot & Organic Search
Of course, when you optimize your website, you do so for the end user experience. The added plus is that Google Bot gets a huge benefit from it too. I’ve been reading over and over about how load time is now a ranking factor. Sure, but it’s not that big of a deal, right? Well, let’s take a look at how Google Bot responded to these tweaks:
Clearly, the faster the response time, the more pages that can be crawled in a given day. The average response time for FantasySP.com is now at around 100ms. The average response time for Mobile FantasySP is at 72ms. Google Bot quickly realized that the site is performing faster and ramped up its crawl rate.
So you might be wondering, has the faster and more responsive site resulted in more search engine traffic from Google? So far, the answer is yes: up about 5-10% for each of the past 7 days. Coincidence? I doubt it. Getting a bump in the SERPs for being a considerably faster site is very real. There is no question that the easier it is for Google Bot to crawl, the better off I’ll be in the long run.
New Relic has just released a whole bunch of new changes, such as Real User Monitoring, an improved install process, and a much more affordable pricing structure. It doesn’t matter if you’re a company of 15 developers or just one developer working on a side project... New Relic will be worth it. In fact, for smaller sites you should be able to take advantage of New Relic’s 14-day professional trial and be done optimizing by the time your trial is finished. However, I’ve been addicted to the stats and will eventually subscribe to their “Standard Package”.
New Relic essentially takes the guesswork out of optimization. Now you can easily track down costly plugins in WordPress and even compare performance numbers from week to week. This is by far the best tool you can use to make your site or app run faster. Spend the time and install it, or submit a support ticket and have your host do it.
I can go on and on about New Relic and you will probably get a bit lost in the data for a few days. So go ahead and give it a try.