Getting started with Memcached and MySQL

 

Preamble

I know Memcached since a while but I have never taken time to test it a bit. In principle Memcached is a simple in-memory container storing key,value information. My idea was to test it in one of its popular deployment, means with an application server and a database server. My database server will be a MySQL Community Server 5.6.16 and my application server will be a small Java application running under Eclipse on my desktop.

Internally Memcached has below limitation (to be taken into account when implementing it in your Java code:

  • The key is an arbitrary string up to 250 bytes in length. No space or newlines for ASCII mode
  • The value associated with a key has a 1 megabyte limit on size (can be increased with -I parameter)

For the Java Memcached client you will often see mentioned two projects:

I have used Memcached 1.4.17, Connector/J 5.1.28, spymemcached 2.10.3, Java 1.7.0_51 and all my virtual test servers are running Oracle Linux Server release 6.5 64 bits. In below my database server is server1.domain.com with non-routable IP address 192.168.56.101 and my Memcached server is server3.domain.com with non-routable IP address 192.168.56.103.

Each Virtual Machine has 1 core of my Intel(R) Xeon(R) CPU E5507 @ 2.27GHz and 512MB of RAM.

Memcached installation

As usual either you compiled the latest release from source downloaded on official web site or, lazy approach, you pick the one available with your Linux distribution (maybe not the latest one by the way):

[root@server3 tmp]# yum -y install memcached.x86_64
Loaded plugins: security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package memcached.x86_64 0:1.4.4-3.el6 will be installed
--> Finished Dependency Resolution
 
Dependencies Resolved
 
===============================================================================================================================================================================================================
 Package                                         Arch                                         Version                                            Repository                                               Size
===============================================================================================================================================================================================================
Installing:
 memcached                                       x86_64                                       1.4.4-3.el6                                        public_ol6_latest                                        68 k
 
Transaction Summary
===============================================================================================================================================================================================================
Install       1 Package(s)
 
Total download size: 68 k
Installed size: 147 k
Downloading Packages:
memcached-1.4.4-3.el6.x86_64.rpm                                                                                                                                                        |  68 kB     00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : memcached-1.4.4-3.el6.x86_64                                                                                                                                                                1/1
  Verifying  : memcached-1.4.4-3.el6.x86_64                                                                                                                                                                1/1
 
Installed:
  memcached.x86_64 0:1.4.4-3.el6
 
Complete!

Configuration file (listening port, maximum number of connection and cache size in MB):

[root@server3 init.d]# cat /etc/sysconfig/memcached
PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="192"
OPTIONS=""

Start it:

[root@server3 init.d]# service memcached start
Starting memcached:                                        [  OK  ]
[root@server3 init.d]# service memcached status
memcached (pid  7026) is running...
[root@server3 init.d]# ps -ef | grep 7026
498       7026     1  0 12:01 ?        00:00:00 memcached -d -p 11211 -u memcached -m 64 -c 1024 -P /var/run/memcached/memcached.pid
root      7042  1837  0 12:01 pts/0    00:00:00 grep 7026

I finally realized that installed release is 1.4.4 while latest release on official web site is 1.4.17. So I decided to try to compile it from source and it went well with simply:

./configure --prefix=/usr
make
make test
make install

Remark:
I don’t know why but if you go on in the download section of Google Project Hosting Memcached web site then the latest available release is 1.14.15 while if you go on http://memcached.org/ then the latest available release is 1.4.17…

I just had to install libevent devel with:

yum install libevent-devel.x86_64

Doing this is not at all clean at the running release is not any more the one of the installed package but I wanted to still benefit from start-up script that’s why I decided to overwrite binaries of existing package with old version. Never do this on a production system…

Then for the test database I was thinking to something complex with a kind of sales table and doing summary of sales per day or whatever. I have finally decided to go for something much simpler and used the excellent wed site generatedata.com. Well in fact I had to install their script locally on my desktop using UwAmp as I wanted to generate more than 100 rows (current online web site limitation with no account).

I generated below customers test table with 30,000 rows, I deliberately avoid the creation of a primary key to generate more workload on my MySQL instance:

DROP TABLE `customers`;
 
CREATE TABLE `customers` (
  `id` mediumint,
  `firstname` varchar(255) default NULL,
  `lastname` varchar(255) default NULL
);
 
INSERT INTO `customers` (`id`,`firstname`,`lastname`) VALUES (1,"Illana","Franklin");
INSERT INTO `customers` (`id`,`firstname`,`lastname`) VALUES (2,"Jade","Jefferson");
.
.

My MySQL instance configuration is as below, deliberately the InnoDB pool is small to avoid having all the rows in cache:

[mysqld]
user=mysql
basedir=/mysql/software/mysql01
datadir=/mysql/data01/mysql01
port=3316
pid_file=/mysql/software/mysql01/conf/mysql01.pid
log_error=/mysql/dump/mysql01/mysql01.err
default-storage-engine=innodb
tmpdir=/tmp
binlog_format=row
server-id=1
log_bin=/mysql/logs/mysql01/mysql-bin.log
socket=/mysql/software/mysql01/conf/mysql01.sock
sql_mode='TRADITIONAL,PIPES_AS_CONCAT'
slow_query_log_file=/mysql/dump/mysql01/mysql01-slow.log
slow_query_log=1
long_query_time=5
max_allowed_packet=100M
default-storage-engine=innodb
 
innodb_buffer_pool_size=10M
innodb_flush_log_at_trx_commit=2
innodb_log_buffer_size=12M
innodb_log_file_size=50M
innodb_log_files_in_group = 4
innodb_thread_concurrency=0
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=2
innodb_flush_method=O_DIRECT
 
[mysql]
default-character-set=utf8
socket=/mysql/software/mysql01/conf/mysql01.sock
 
[client]
default-character-set=utf8
socket=/mysql/software/mysql01/conf/mysql01.sock

Memcached testing

As mentioned above Memcached is storing in memory key,value pairs. There are multiple way to store information I initially thought of id column of my customer table for Memcached key and a string concatenating first name and last name columns for the Memcached value. But I did not find this too much generic and decided to use the most generic way that is also described in Memcached official website. So to hash the whole query with a kind of MD5 function to be use it for the Memcached key and the Java SQL resultset with minimal transformation for the Memcached value (this resultset must be below 1 MB in size). Doing this it could be used for any other test scenario…

Using spymemcached the two most used Memcached client functions are:

  • set to store a key,value pair in Memcached server
  • get to retrieve a value associated with a key in Memcached server

I tried the straight approach with:

c.set(MD5encode(query1), 3600, rs);

Where rs is a classical ResultSet from java.sql.ResultSet Java class but got the annoying below error:

Exception in thread "main" java.lang.IllegalArgumentException: Non-serializable object
	at net.spy.memcached.transcoders.BaseSerializingTranscoder.serialize(BaseSerializingTranscoder.java:110)
	at net.spy.memcached.transcoders.SerializingTranscoder.encode(SerializingTranscoder.java:162)
	at net.spy.memcached.MemcachedClient.asyncStore(MemcachedClient.java:296)
	at net.spy.memcached.MemcachedClient.set(MemcachedClient.java:884)
	at jdbcdemo4.jdbcdemo4.main(jdbcdemo4.java:99)
Caused by: java.io.NotSerializableException: com.mysql.jdbc.JDBC4ResultSet
	at java.io.ObjectOutputStream.writeObject0(Unknown Source)
	at java.io.ObjectOutputStream.writeObject(Unknown Source)
	at net.spy.memcached.transcoders.BaseSerializingTranscoder.serialize(BaseSerializingTranscoder.java:105)
	... 4 more

Believe me or not but I have spent hell of time on this until I found two very nice references on how to handle it (How to serialize ResultSet in java and Storing MySQL result set in memcached). I mean generic solution on how to handle it because you find countless solution that are tightly linked to the format of your resultset, which again I did not want to remain as much generic as possible…

In short the trick is to use CatchedRowSetImpl class object that serialize ResultSet and have the same methods…

To write my code what was missing is a function to MD5 hash my query text to use it as a Memcached key, Google was my friend and I found many references. I decided to use the one of How to generate MD5 Hash in Java – String Byte Array digest Example as it was most concise one.

Then the principle is quite simple, I generate a random query which select first and last name of customers table for a random id. I MD5 encode the query and check if the result is in Memcached. If yes then I get it and display its value. If not then we have to execute the query on MySQL instance and store its result in Memcached (I have used to maximum retention time, 30 days, as I was testing for multiple days. You should not do this by default…). I execute 1,000 queries per run.

I have also added a execution time computation to have some figures to compare as well as percentage of query results found and not found in Memcached. Of course the implementation in your Java code is not transparent at all and everything must be (re)develop with the idea to use Memcached before going to the MySQL instance. Finally Java code gives something like:

package jdbcdemo4;
 
// MD5
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
 
// Spymemcached
import net.spy.memcached.MemcachedClient;
import net.spy.memcached.AddrUtil;
import net.spy.memcached.BinaryConnectionFactory;
 
//import com.mysql.jdbc.jmx.LoadBalanceConnectionGroupManager;
import com.mysql.jdbc.Driver;
import java.util.Properties;
import java.sql.ResultSet;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Random;
import com.sun.rowset.CachedRowSetImpl;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
 
public class jdbcdemo4 {
 
	private static String MD5encode(String string1)
	{
		String md5=null;
 
		if (string1 == null) return null;
 
		try {
			MessageDigest digest1 = MessageDigest.getInstance("MD5");
			byte[] hash = digest1.digest(string1.getBytes());
			StringBuilder sb = new StringBuilder(2*hash.length);
 
			digest1.update(string1.getBytes());
			for(byte b : hash)
			{
				sb.append(String.format("%02x", b&0xff));
			}
			md5=sb.toString();
		}
		catch (NoSuchAlgorithmException e)
		{
			e.printStackTrace();
		}
		return md5;
	}
 
	public static void main(String[] args) throws Exception {
		Driver driver = new Driver();
		Properties props = new Properties();
		ResultSet rs;
		Connection conn = null;
		String JDBC_URL = "jdbc:mysql://192.168.56.101:3316", query1, query1_md5;
		Random generator = new Random();
		CachedRowSetImpl crsi=new CachedRowSetImpl();
		MemcachedClient c=new MemcachedClient(new BinaryConnectionFactory(), AddrUtil.getAddresses("192.168.56.103:11211"));
		Object object1=null;
		int in_memcached=0, not_in_memcached=0, id;
 
		long start_time = System.currentTimeMillis();
		props.put("user","yjaquier");
		props.put("password","secure_password");
 
		System.out.println("------------ MySQL Connector/J JDBC pooling testing ------------\n");
		System.out.println("Trying connection...\n");
 
		// Get JDBC connection
		try {
			conn = driver.connect(JDBC_URL, props);
		  }
		catch (SQLException e) {
			e.printStackTrace();
			System.exit(1);
		}
 
		// Looping and executing queries and try to get first the result from Memcached
		for(int i=1; i <= 1000; i++) {
			id=generator.nextInt(30000) + 1;
			System.out.print("\nQuery "+i+" for id="+id+", ");
			query1="SELECT firstname,lastname FROM test.customers where id="+id;
			query1_md5=MD5encode(query1);
			Future<Object> f=c.asyncGet(query1_md5);
			try {
				object1=f.get(5, TimeUnit.SECONDS);
			}
			catch (TimeoutException e) {
				f.cancel(false);
				System.out.println("Memcached timeout...");
			}
			if (object1==null) {
				System.out.print("Query result not in Memcached, ");
				not_in_memcached++;
				rs = conn.createStatement().executeQuery(query1);
				crsi.populate(rs);
				rs.close();
				c.set(query1_md5, 60*60*24*30, crsi);
				while (crsi.next()) {
					System.out.print("customer : " + crsi.getString("firstname")+", "+crsi.getString("lastname"));
				}
				crsi.close();
			}
			else  {
				System.out.print("Query result in Memcached, ");
				in_memcached++;
				crsi = (CachedRowSetImpl)object1;
				crsi.beforeFirst();
				while (crsi.next()) {
					System.out.print("customer : " + crsi.getString("firstname")+", "+crsi.getString("lastname"));
				}
				crsi.close();
			}
		}
 
		System.out.println("\nExiting...");
		if (conn != null) {
			conn.close();
		}
 
		c.shutdown();
 
		long end_time = System.currentTimeMillis();
 
		System.out.println("Executed in " + (end_time - start_time) + " milliseconds");
		System.out.println("Percentage in Memcached "+Math.round(in_memcached*100/(in_memcached+not_in_memcached))+"% ("+in_memcached+"), not in Memcached "+
							Math.round(not_in_memcached*100/(in_memcached+not_in_memcached))+"% ("+not_in_memcached+")");
	}
}

Performance results

For the first run allmost all results are not in Memcached and we have the worst execution time:

Executed in 31321 milliseconds
Percentage in Memcached 1% (17), not in Memcached 98% (983)

I have done multiple run and it is constantly between 31 and 32 seconds (flush the Memcached server after each run unless obviously the time is decreasing, see below).

For MySQL server monitoring I have used MONyog that is included in our SkySQL contract. Here below are the MySQL server CPU details and InnoDB statistics:

memcached01
memcached01
memcached02
memcached02

If we execute it over and over more and more query results are in Memcached, of course I have a small number of possible query results and they can all fit in my Memcached server (192 MB). In two below charts we easily see the decrease in MySQL server CPU and InnoDB usage:

memcached03
memcached03
memcached04
memcached04

It decreases till you reach a perfect run where all results are found in Memcached:

Executed in 3214 milliseconds
Percentage in Memcached 100% (1000), not in Memcached 0% (0)

I have also done multiple run and it is constantly between 3.2 and 3.3 seconds so around 10 times improvement. But the most important thing is the decrease of load on the MySQL server, I’m not displaying any charts are all is flat when query results are all in Memcached. I would even say that even with the same response time it would be satisfactory. The Memcached server has a CPU load of 35-40% (1 virtual CPU see above VMs sizing).

After this testing I realized that I did not deactivate MySQL query cache:

mysql> show variables like 'query%';
+------------------------------+---------+
| Variable_name                | Value   |
+------------------------------+---------+
| query_alloc_block_size       | 8192    |
| query_cache_limit            | 1048576 |
| query_cache_min_res_unit     | 4096    |
| query_cache_size             | 1048576 |
| query_cache_type             | OFF     |
| query_cache_wlock_invalidate | OFF     |
| query_prealloc_size          | 8192    |
+------------------------------+---------+
7 rows in set (0.00 sec)

But realized after few query cache efficiency charts that query cache is not in action even if query cache size is non-zero because a new parameter has been added and is set to OFF by default (since MySQL 5.6.8): query_cache_type. So in short query cache is allocated but not activated. You can dynamically activate is by changing query_cache_type versus setting to off query cache size and not being able to dynamically activate it…

The question could be on the competition between MySQL Query cache and Memcached. The Memcached FAQ has a quite strong position on this:

What about the MySQL query cache?
The MySQL query cache can be a useful start for small sites. Unfortunately it uses many global locks on the mysql database, so enabling it can throttle you down. It also caches queries per table, and has to expire the entire cache related to a table when it changes, at all. If your site is fairly static this can work out fine, but when your tables start changing with any frequency this immediately falls over.

Memory is also limited, as it requires using a chunk of what’s directly on your database.

Looking for monitoring, performance information (hitrate, ..) ?:

[root@server3 bin]# telnet server3 11211
Trying 192.168.56.103...
Connected to server3.
Escape character is '^]'.
stats
STAT pid 8534
STAT uptime 736
STAT time 1393843959
STAT version 1.4.17
STAT libevent 1.4.13-stable
STAT pointer_size 64
STAT rusage_user 0.047000
STAT rusage_system 0.668000
STAT curr_connections 10
STAT total_connections 12
STAT connection_structures 11
STAT reserved_fds 20
STAT cmd_get 1000
STAT cmd_set 982
STAT cmd_flush 0
STAT cmd_touch 0
STAT get_hits 18
STAT get_misses 982
STAT delete_misses 0
STAT delete_hits 0
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT touch_hits 0
STAT touch_misses 0
STAT auth_cmds 0
STAT auth_errors 0
STAT bytes_read 5069673
STAT bytes_written 147252
STAT limit_maxbytes 201326592
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 0
STAT hash_power_level 16
STAT hash_bytes 524288
STAT hash_is_expanding 0
STAT malloc_fails 0
STAT bytes 5049018
STAT curr_items 982
STAT total_items 982
STAT expired_unfetched 0
STAT evicted_unfetched 0
STAT evictions 0
STAT reclaimed 0
END

Not very sexy, as described in Memcached official documentation for better reporting I have tried two self contained project (means not a plugin for an existing monitoring solution like Cacti or Nagios) and tried successfully:

For PhpMemcacheAdmin 1.2.2:

memcached05
memcached05

For PHP-based control panel 1.1.2.3:

memcached06
memcached06

References

2 thoughts on “Getting started with Memcached and MySQL

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>