Using Memcache with MySQL and PHP
Memcache is a great caching tool available for nearly every scripting or programming environment. I use it with PHP to speed up some applications I have written by completely avoiding asking the database for information. I wanted a very clean way of implementing this in my various projects, with as little change to the existing code as possible. Since memcache is an OBJECT caching system, you can’t simply drop your mysql_query resource into memcache, ask for it at another time, and get your results back … Which is how I had originally thought it worked the first time I read about it. The good news is that it can also store Strings, Integers, Floats, Arrays etc … Below I have a few other things I’ve learned whilst using memcache as well as the caching script I use for database queries.
What I’ve learned:
- memcache is great for storing slow queries that return small data sets [1 - 50 results, depending on the average row weight]
- memcache is not so great for any query that returns large data sets [100 - ∞, depending on the average row weight]
… in fact, I’ve found that memcache can occasionally be slower than running high-yield queries again [that is, if you have your MySQL server caching queries as well]. It all really boils down to benchmarking the scripts yourself: test every situation with and without the cache! [the more realistic the DB load, the better]
Caching is faster? Yay! Cache it!
DB query is faster? Yay! DON’T cache it!
So, basically, this isn’t a plug-and-play solution for ALL slow page loads / queries, just some of them.
Here’s the code I use to cache MySQL queries:
<?php
# Connect to memcache:
global $memcache;
$memcache = new Memcache;
# Gets key / value pair into memcache ... called by mysql_query_cache()
function getCache($key) {
global $memcache;
return ($memcache) ? $memcache->get($key) : false;
}
# Puts key / value pair into memcache ... called by mysql_query_cache()
function setCache($key,$object,$timeout = 60) {
global $memcache;
return ($memcache) ? $memcache->set($key,$object,MEMCACHE_COMPRESSED,$timeout) : false;
}
# Caching version of mysql_query()
function mysql_query_cache($sql,$linkIdentifier = false,$timeout = 60) {
if (($cache = getCache(md5("mysql_query" . $sql))) !== false) {
$cache = false;
$r = ($linkIdentifier !== false) ? mysql_query($sql,$linkIdentifier) : mysql_query($sql);
if (is_resource($r) && (($rows = mysql_num_rows($r)) !== 0)) {
for ($i=0;$i<$rows;$i++) {
$fields = mysql_num_fields($r);
$row = mysql_fetch_array($r);
for ($j=0;$j<$fields;$j++) {
if ($i === 0) {
$columns[$j] = mysql_field_name($r,$j);
}
$cache[$i][$columns[$j]] = $row[$j];
}
}
if (!setCache(md5("mysql_query" . $sql),$cache,$timeout)) {
# If we get here, there isn't a memcache daemon running or responding
}
}
}
return $cache;
}
?>
The function mysql_query_cache() will return an array filled with the results. Since I don’t use this for large result sets, I don’t free the MySQL resource … you may want to free the resource after it’s been used if you get larger data sets.
Like I had mentioned before, I wanted the code to be as easy-to-use as possible when using it. So, I’ve set up a before and after test scenario showing how to retrofit your code with the new caching code:
<?php
$sql = "
SELECT `dataID`, `dataTitle`
FROM `tbldata`
WHERE `dataTypeID` BETWEEN 2 AND 2093
AND `dataStatusID` IN (1,2,3,4)
AND `dataTitle` LIKE '%something%'
ORDER BY `dataDate` DESC
LIMIT 10
";
# Before: [without memcache]
$rSlowQuery = mysql_query($sql);
# $rSlowQuery is a MySQL resource
$rows = mysql_num_rows($rSlowQuery);
for ($i=0;$i<$rows;$i++) {
$dataID = intval(mysql_result($rSlowQuery,$i,"dataID"));
$dataTitle = mysql_result($rSlowQuery,$i,"dataTitle");
echo "<a href=\"/somewhere/$dataID\">$dataTitle</a><br />\n";
}
# After: [with memcache]
$rSlowQuery = mysql_query_cache($sql);
# $rSlowQuery is an array
$rows = count($rSlowQuery);
for ($i=0;$i<$rows;$i++) {
$dataID = intval($rSlowQuery[$i]["dataID"]);
$dataTitle = $rSlowQuery[$i]["dataTitle"];
echo "<a href=\"/somewhere/$dataID\">$dataTitle</a><br />\n";
}
?>
Easy, huh? Run print_r() on the returned array to get an idea of how the array is structured if need be.
As always, if you have a better, more efficient, objective, more adaptable solution than mine, please leave a comment! I am 100% open to constructive criticism :-)
cool, new to memcached and was looking for somthing like this. question though, how can you tell if data is being served by memcached or the db?
vin
July 3, 2008
You could add a row to the array it returns… something like:
$cache["from"] = "db";and set it up logically so that it will populate this row based upon what the function does.
pureform
July 3, 2008
it working fine. I would like to know how can i manipulate the cache
panneerelvam
August 12, 2008
Hi, new to all of this, so please forgive the newspeak. I have a daemon running (I think) on the default port.
I needed to add $memcache->connect(’localhost’, ‘11211′); below the line $memcache = new Memcache; in order to not fall into the ‘no daemon running’ case in your script.
Does this sound correct?
Ben
September 23, 2008
That’s correct … You’ll need to instantiate the Memcached object before running the connect() method on it.
pureform
September 25, 2008
One issue I am having…
When I use this to check for the existance of one item in the database, for example SELECT name_id FROM names WHERE name = ‘Bill’; and Bill is not in my test dB.
count($rSlowQuery); returns a value of one, even if nothing is found. Doing print_r on the the above displays nothing, yet count returns a value of 1… any ideas?
Paul
October 16, 2008
[...] found a nice solution on http://pureform.wordpress.com/2008/05/21/using-memcache-with-mysql-and-php/ and I decided to work this out for my [...]
Bastiaans Blog » Blog Archive » Caching MySQL queries with memcache in PHP
November 17, 2008
[...] PHP Caching (Wikipedia), memcached & PHP, Official Ruby on Rails caching guide, Excellent Article on Rails caching, Caching with Smarty, [...]
Broadcasting Adam » 8 Tips for Lightning Fast Websites
January 5, 2009
I don’t understand a part of this code, someone have the answer ?
You don’t put the ‘mysql_query($sql);’ in a if statement like :
if($we_have_this_request_in_cache){
show from cache
}else{
ask the bd
}
Maybe i’m wrong but why the mysql_query is not executed each time the page reload ?
Thanks
By the way, thank you for sharing this page:)
Sebastien
March 6, 2009
$fields = mysql_num_fields($r);
Should be outside the for loop..
Guest
March 6, 2009
I think the code is just not working .. They are some logical error in the php. The request always ask for the database and never touch the memcached.
The second time we reload the page, is more fast but we just hit the MySQL cache, not the memcached itself…
It need a little modification to work, but not too much.
I think ;)
Sebastien
March 6, 2009
I just have one comment about the “is memcache faster?” question. Whether memcache is faster or not it is often more scalable. So if it takes .01 seconds to get something from the database and .015 seconds to get it from memcache it might take .015 seconds to get it at 100/sec. from memcache whereas the database will start to take longer at higher concurrency. It’s something to consider.
Stephen
April 29, 2009
[...] Using Memcache with MySQL and PHP (tags: php mysql cache howto code emon) [...]
links for 2009-05-03 « Talkabout
May 3, 2009
Was going to comment on “is memcache faster?” but see Stephen beat me to it. Getting data from memcached should always be faster than getting it from the database, if it is not then something is wrong with your setup. It uses non-blocking network I/O – and is of course very scalable. That said not all data is suited for caching. Memcached is a very good tool for the right job.
ficuscr
June 19, 2009
In my test, it wasn’t faster to always pull from the cache… and it wasn’t tenths or hundredths of seconds, it was several seconds difference. I can’t remember the exact queries I was caching, but it never hurts to benchmark your code and adjust accordingly instead of always caching no matter what.
Now that I think about it, the non-cache condition I found may have applied to your thoughts on “not all data is suitable for caching”… I think it may have been for a pretty large result set [200 - 300K]. But still, benchmarking is always a good idea :-)
pureform
June 19, 2009