Quite frequently I run into question like this “I’m using SQL_NO_CACHE but my query is still much faster second time I run it, why is that ?
The answer to this question is simple – because SQL_NO_CACHE only bypasses query cache but it has no change on other caches, which are
MySQL Caches – Innodb Buffer Pool and Key Buffer are best example though Falcon, PBXT and other storage engines have similar buffers. There is also table_cache both MySQL side and Internal Innodb one which can affect query execution speed.
OS Caches Operation Systems typically cache file IO unless you explicitely bypass it by using O_DIRECT flag or mounting file system in direct IO mode.
Hardware Caches State of CPU cache may affect query execution speed but only lightly, the hardware IO cache may however cause dramatic difference. Hardware RAID cache is the one but more important SAN caches which can be pretty big.
So How can you bypass these caches?
For MySQL Caches you can restart MySQL and this is the only way to clean all of the caches. You can do FLUSH TABLES to clean MySQL table cache (but not Innodb table meta data) or you can do “set global key_buffer_size=0; set global key_buffer_size=DEFAULT” to zero out key buffer but there is no way to clean Innodb Buffer Pool without restart.
For OS Caches on Linux you can use drop caches control available in new Linux Kernels. You could also remount file system in question and the safest thing is of course to reboot.
For Hardware Caches it is more hardware specific. Typically doing some IO will flush cashes but you can’t be sure as you do not know what policies exactly do they employ. For RAID hardware caches reboot of the box is also enough however SAN caches may survive longer. Though few of us have SAN available for performance benchmarking 🙂
So does that mean that there is no way to profile a query in mysql? I can’t be expected to reboot my system every time I want to tweak a query and test if it runs faster!
… In the same sense as with any other database. Practically everything has caches.
To do real profiling you need to profile real query mix which will have each query having appropriate cache/hit ratio not running one query in the loop and assuming results will be fine.
I have a web app which doesn’t have a very large client base but each user can potentially query a large chunk of data. So the most important thing is when the user queries a given chunk the first time. Once it’s cached it’s of course orders of magnitude faster, but it’s the initial query that I need to optimize. I imagined that having the ability to profile a query should be pretty basic in a rdbms.
JM,
Think about it practically – DBMS works through OS using its caches. To get query uncached you need to clean these caches. for MyISAM you can do SET GLOBAL key_buffer_size=0 and when set it back to default. You can also do cat 3 > /proc/sys/vm/drop_caches on Linux to clear caches however you do not have anything for Innodb.
In practice however fully empty case is not that interesting because query is likely to have something cached, dictionary information etc. I would have Slow Query log enabled with Percona Patches so you can see how many IOs query does etc. When query optimization usually can happen analytical way – if you see from EXPLAIN query scans 1000000 rows you do not need it to be IO bound to optimize it you can just estimate how it would be 🙂