A case study of performance optimisation

I run what could quite possibly be the highest trafficked ColdFusion/MySQL website out there, and while our web hosting benefactors have been gracious, we really don't have the computing power needed to competently handle the million or so page views we get each day. So I spend a lot of time looking for ways to do more with less, server-wise.

There's a lot you can do to identify bottlenecks and poorly performing code in your environment. Even after you've picked the low hanging fruit, there can be a lot of small wins that mean a great deal particularly when the server is running hot 18 hours a day. Some of the techniques I use include:

The case study

Back in January 2007, we upgraded to ColdFusion 7 and MySQL 5... and immediately started seeing a whole bunch of queries running in the MySQL process list which I didn't write. SHOW FULL COLUMNS FROM tablename? I wanted to know where these mysterious queries were coming from, as we were having performance problems with the new configuration and I was sure this was the cause... or at least the symptom.

Step one — make sure it's not my fault. Does it happen on a freshly installed copy of ColdFusion? I had one on my local computer, and it was causing the same symptom: whenever a query is run, SHOW FULL COLUMNS followed like a shadow. Similarly, a freshly installed copy of MySQL with a carefully recreated database schema didn't change anything.

Step two — try regressing versions. My gut assumption was that ColdFusion 7 was innocent, as I had recently upgraded a swath of machines to version 7 at work and didn't see these queries popping up on our MSSQL servers. After many hours of installing and reinstalling, it turned out that it was neither program's fault: it was the combination! Regress either back to their previous versions and the problem went away.

Step three — identify the culprit. After tracing network activity between the servers, it quickly became apparent that these queries were being generated on ColdFusion's end... but how? As I was restoring the configuration for the umpteenth time, the answer hit me: Connector/J! I had been dutifully swapping versions of this jar every time I changed versions of MySQL. So I tried connecting to MySQL 5 with an older version of the connector, and magically, the rogue queries disappeared.

Now this presented an awkward situation. The previous connector did fix things, but it didn't support a number of MySQL 5 features I wanted to take advantage of. And I don't like being forced into avoiding upgrades; certainly not for something as trivial as this. So I downloaded the Java source code to see if I couldn't figure this one out for myself.

One download, unzip and search later, I had my answer. The query string existed in the getCollation() method of com.mysql.jdbc.Field, which in turn was exclusively called by isCaseSensitive() in com.mysql.jdbc.ResultSetMetaData. Immediately I made the logical connection to ColdFusion 7's new result attribute for <cfquery>, which returns such metadata as, you guessed it, case sensitivity.

It turns out ColdFusion was asking Connector/J for the metadata on every field, which in turn triggered a SHOW FULL COLUMNS query for every varchar and text column returned. Since I was using case insensitive collation exclusively throughout my database schema (and I wasn't using the result attribute anywhere), I was able to recompile the connector with the isCaseSensitive() method neutered. Problem solved.

I filed a bug report with MySQL, and it has been resolved as of version 5.0.7 of Connector/J.

14 comments

This issue was raised during CF8 development last year. It no longer occurs in CF8 and there is a workround for CF7 as well.

Refer to:
http://tjordahl.blogspot.com/2007/07/mysql-and-coldfusion-8.html and
http://tjordahl.blogspot.com/2007/07/faster-mysql-jdbc-driver-for-coldfusion.html

CF8 performance takes a hit due to Java 6 problems:
http://compoundtheory.com/?action=displayPost&ID=270

Some other tips:

- if your application uses session management limit the session time for agents that do not accept cookies (most likely bots). For example set the session time to 5 seconds for these agents. You can do this if you use Application.cfc and free up heaps of memory - especially witha high traffic/popular site that presumably gets lots of bot traffic.

- makes sure you turn off client management via the CF admin even if your application does not use client management. Otherwise CF will happily write to your registry with every request.
http://carehart.org/blog/client/index.cfm/2006/10/4/bots_and_spiders_and_poor_CF_performance

- use getTickcount() to calcuate and then output the page/request processing time in a html comment at the bottom of you page.

- If you are using Application.cfc add a session counter to the application scope and increment it via onSessionStart() and decrement via onSessionEnd(). Also output this in an html comment. This together with the request processing time gives you instant info on how busy the server was when processing a specific page.

@Johan

I use a different trick with bots. Rather than give them a short session time (e.g. 5 seconds), I force them to all use the same session by fixing url.cfid/url.cftoken before cfapplication runs. This works well for me, as I can receive as many as 10 spider hits every 5 seconds, and I won't pay any session create/destroy penalties for the life of the application.

@Steven Erat

Yep, I raised it in the Adobe Pre-release forums for Scorpio. :-)
Glad to hear it has been fixed. (I never tested it in the release version.)

Simon - fixing cfid/cftoken for bot sessions - didn't of think of that approach. Nice. How do you reliably determine if request agent is a bot? Be bad news if valid agents (real users) started sharing the same session.

I did an analysis of user agent strings that were hitting robots.txt, and built a few regular expressions to identify them. If an otherwise valid agent decided to include "Googlebot" or "Yahoo! Slurp" in its string, tough titties.

I've disabled a number of site features for robots, the most important being the ability to log in.

Guess that is fair enough if someone wants to masquerade as a bot they get what they deserve! However it does not address the reverse case where an automated agent masquerades as a valid agent. The short session approach will still work here.

To clarify there are 3 session times: 5 minutes (or longer based on your site requirements) for agents that accept cookies (test for CFID cookie). 5 seconds for "bot" agents that are caught by a basic regex checking for things like bot, crawl, spider, reader etc. 90 seconds for all other agents.

So you're basing that on the assumption that the second hit will come within 90 seconds. I presume you've tested and confirmed that you can change the timeout after a session has been established (seems reasonable to expect but you never know).

If I had the time I would do a more extensive analysis on the performance curve for creating, holding and destroying sessions. For example, how much more onerous is the 1,000th session compared to the 1st? How frequently is the session array swept?

Correct - you should review your site stats to get useful values but 90s is pretty good for most sites as users will either have left or click to a second page within that time. Similarly the 5 minute value should be based on reviewing the stats, however the shorter you can keep it the better.

I am not sure how frequently CF processes onSessionEnd() but it works well. I load tested with 200 clients with/without cookies and session memory usage stayed pretty constant.

BTW - I found this article on why app servers crash (not new) quite insteresting: http://www.linuxjournal.com/article/4878

Be nice to have a programmable throttle solution in CF in future - I think client access via APIs an effective/easy way to throttle requests will be essential. if you have any ideas on that I'd be interested.

Based on your suggestion, I've modified my application to vary the sessiontimeout based on the user's cookies. Previously it was a static 30 minutes. Now it's:

No cookies: 3 mins
Session cookie: 15 mins
Login cookie: 30 mins

Let us know how it goes - at a minimum you should free up a lot of memory previously consumed due to bots.

Great conversation, folks. Always nice to see entries that help remind people of core troubleshooting techniques.

The bot problem (its affect on sessions and clients) is one that I think affects many more sites than most realize.

Thanks, Johan, for pointing out my blog entry on it. BTW, you say "CF will happily write to your registry with every request" but of course if one changes client vars to use a database instead, then the I/O goes there--but it can be just as burdensome, like driving with the parking break on.

While Simon mentions SeeFusion (and we should add FusionReactor as well, fusion-reactor.com, and the built-in monitor in CF8 Enterprise or Developer), note that none of these tools (in my experience) will detect and observe this problem of queries to the db based on client variable i/o. They only observe the queries driven by CFML code in a given request.

As Simon notes, you may better detect things like this by watching the queries getting into the DB from CF (using SQL Server profiler, or MySQL's tools). Keep up the good work, Simon.

Hi
Can i get a bundle Telstra bundle offer $109.95 per month?
Thanks!
Rakesh










(no HTML)