A couple of months ago FreshBooks published a post about getting great performance improvements by lowering the table_cache variable, so I decided to investigate what is really happening here.
The “common sense” approach to tuning caches is to make them as large as you can if you have enough resources (such as memory). With MySQL, however, common sense does not always work – we’ve seen performance issues with a large query_cache_size, and increasing sort_buffer_size and read_buffer_size may not give you better performance either. I found this also applies to some other buffers.
Even with previous experience of surprising behavior, I did not expect such a table_cache issue – LRU cache management is a classic problem, and there are scalable algorithms to deal with it. I would have expected Monty to implement one of them.
To run the test I created 100,000 empty tables, each containing a single integer column and no indexes, and then ran SELECT * FROM tableN in a loop. Each table in this case is accessed only once, so on any but the first run every access requires a table replacement in the table cache based on LRU logic.
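A minimal sketch of this setup (not the exact scripts used for the benchmark; table names like t0…tN are my own):

```python
# Generate the SQL for the benchmark: 100,000 empty single-integer-column
# MyISAM tables, plus one SELECT per table so each table is touched exactly
# once per pass. The output can be piped to the mysql client, e.g.:
#   python gen_bench.py | mysql test

def create_statements(n=100000):
    """CREATE TABLE statements for n empty one-column tables."""
    return ["CREATE TABLE t%d (i INT) ENGINE=MyISAM;" % k for k in range(n)]

def select_statements(n=100000):
    """One SELECT per table; with table_cache < n, every access after the
    first pass forces an LRU replacement in the table cache."""
    return ["SELECT * FROM t%d;" % k for k in range(n)]
```

Writing the statements to a file and timing the mysql client run against it is enough to reproduce tables/sec numbers of the kind discussed here.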
MySQL Sandbox helped me to test this with different servers easily.
I did test on CentOS 5.3, Xeon E5405, 16GB RAM and EXT3 file system on the SATA hard drive.
MySQL 5.0.85 created 100,000 tables in around 3 min 40 sec, which is about 450 tables/sec – this indicates “fsync” is lying on this test system, as the default sync_frm option is used.
With the default table_cache=64, accessing all tables takes 12 seconds, which is almost 8,500 tables/sec – a great speed. We can note significant writes to disk during this read-only benchmark. Why? Because for MyISAM tables the table header has to be modified each time the table is opened. In this case the performance was so great because all 100,000 tables’ data (the first block of the index) was placed close together on disk as well as fully cached, which made updates to the headers very fast. In production systems with table headers not in the OS cache you will often see significantly lower numbers – 100 tables/sec or less.
With a significantly larger table_cache=16384 (and an appropriately adjusted number of open files) the same operation takes 660 seconds, which is 151 tables/sec – around 50 times slower. That is quite a slowdown. The load becomes very CPU bound in this case, and it looks like some of the table_cache algorithms do not scale well.
The absolute numbers are also very interesting – 151 tables/sec is not that bad as an absolute number. So if your table cache behaves in the “normal” way and a large table_cache can bring your miss rate (opened_tables) down to 10/sec or less, you should use one. However, if you have so many tables that you still see 100+ misses/sec, while your data (at least the table headers) is well cached so the cost of a table cache miss is not very high, you may be better off with a significantly reduced table cache size.
The next step for me was to see if the problem was fixed in MySQL 5.1 – in this version table_cache was significantly redone and split in table_open_cache and table_definition_cache and I assumed the behavior may be different as well.
MySQL 5.1.40
I started testing with default table_open_cache=64 and table_definition_cache=256 – the read took about 12 seconds very close to MySQL 5.0.85.
As I increased table_definition_cache to 16384 the result remained the same, so this variable is not causing the bottleneck. However, increasing table_open_cache to 16384 causes the scan to take about 780 sec, which is a bit worse than MySQL 5.0.85. So the problem is not fixed in MySQL 5.1; let’s see how MySQL 5.4 behaves.
MySQL 5.4.2
MySQL 5.4.2 has a higher default table_open_cache, so I took it down to 64 to compare apples to apples. It performs the same as MySQL 5.0 and MySQL 5.1 with a small table cache.
With table_open_cache increased to 16384 the test took 750 seconds so the problem exists in MySQL 5.4 as well.
So the problem is real, and it is not fixed even in the performance-focused MySQL 5.4. As we can see, large table_cache (or table_open_cache) values can indeed cause significant performance problems. Interestingly enough, InnoDB has a very similar task of managing its own cache of file descriptors (set by innodb_open_files). As time allows I should test whether Heikki knows how to implement LRU properly so it does not have problems with large numbers. We’ll see.
Hi Peter,
Interesting! We have a setup with a few tens of thousands of MyISAM tables (it looks like you tested with InnoDB?).
It’s been a few years since I tested, but we only got decent performance after setting the table cache really high. I wonder if it makes a difference that your tables were empty?
– ask
Ask,
In fact I tested MyISAM. In Innodb table_cache is not so much relevant as it keeps all tables open anyway.
How many misses per second are you getting, and how large is your data? The point is that from a CPU standpoint a large table cache scales pretty badly; however, even with a large number of tables in the cache it still may be cheaper to invalidate a table than to really open it.
In the case of MyISAM, closing an old table and opening a new one means 2 write operations (modifying the index header), which will also cause 2 reads (because a key block is typically just 1K) if the OS cache is under pressure. This means that in the IO-bound case larger caches make sense.
There is one more difference – I tested with uniform distribution to tables – in real life you will most likely see skewed distribution with some tables accessed more than others, which is completely different picture.
In the common case a large table cache is indeed better. I just wanted to show there are indeed cases, like FreshBooks, which benefit from a cache decrease.
I believe in their case the miss ratio did not change dramatically as table_cache was increased – so they got the problem and little benefit.
Hi Peter,
very interesting. I’d like to see some more data on how the scaling performs, e.g. table_cache values of 1000, 4000 and 8000 or so. That way we can get a better feel for what might be an optimal number. I would not be surprised if e.g. 1000 or less would still perform fast, maybe even faster than 64?
Walter
Walter,
You surely are welcome to repeat the test with different table cache sizes and different data access distribution.
I predict that in this workload (because it always causes misses as long as table_cache is less than 100,000) there will be no gains from miss rate reduction as table_cache increases, and I expect a semi-linear regression with increased table cache size.
Is it possible to “play around” with the table_cache setting without restarting MySQL?
I currently use 2048 on an R900 / 64GB; we have 180 tables and about 300 to 400 queries per second. So far performance is not an issue, but it’s always good to stay ahead of possible bottlenecks 🙂
I played with table_open_cache in 5.1 a few months ago for the same reason. I have about 1900 tables distributed over 32 databases which are randomly accessed.
I also started with a size of 64 and increased the exponent of the power of two with every test. So I tested 64, 128, 256, 512, 1024… The best value for my setup was 512 compared with the opened_tables rate. Perhaps it’s a coincidence, but for my setup I selected a value of approximately 1/4 of the number of tables.
But it depends on the usage of the tables. In such a benchmark, as Peter has done it, all tables are in use. In the real world, not all tables are constantly being accessed.
To repeat Peter’s conclusion, use a small value of table(_open)_cache. If you have a high number of (re-)opened tables, increase the value and see what’s happening.
Peter,
Thanks for digging further into this one! At FreshBooks we’re still running a tiny table cache — 128 on our 32GB R900s. But historical problems with our schema means that we’re looking at millions of tables, unfortunately, which is why the “make it much, much larger” approach didn’t work for us.
I suspect that for a reasonably huge number of tables (10k’s?) you can still come up with a table cache that caches enough of them that you get more benefits from the cached ones being held open than from the cost of the LRU algorithm — in our case, though, we had an *unreasonable* number and the cache would never be “large enough” — and that’s the case where smaller is better.
Alex: SET GLOBAL table_cache = [value]
Cheers,
-Rich from FreshBooks
Nice work. I think the penalty for reopening InnoDB tables includes the cost of resampling index cardinality and that can do several disk reads per index. I hope Percona fixes this because I would rather not learn that code myself.
Mark,
Reopening tables in InnoDB never happens, as it always keeps tables open (I guess you meant it does this because the cost of re-sampling is too high). innodb_dict_size_limit in the Percona patches allows changing this behavior. I agree we should change it.
Rich,
Thanks. You confirmed my expectations. Somebody asked me how many tables FreshBooks had and I guessed millions; it turns out I was close.
Indeed, the math is what one would expect – increasing table_cache in such a case would decrease the miss ratio (just a bit) for the small expense of a bit of memory (maybe some 8KB per table cache entry). So it comes as a huge surprise when this small decrease in miss ratio comes at such a huge increase in CPU time.
Also,
On setting table_cache online: most of the time you can decrease it online; however, increasing it may cause issues because MySQL will not change the open files limit in this case.
Hi Peter,
On my database schema I have many tables – the exact number is 853 – and all these tables are accessed all the time. Going by what the manual explains, all connected threads will use the same table that was already opened, i.e., the one in the cache, and this is configured by the value of table_cache.
If I have 1000 active users or simultaneous connections on my server (MySQL), is letting the cache swap tables faster than configuring a high value for the table_cache variable? What about comparing Open_tables and Opened_tables?
If I have misunderstood, please correct me.
Thanks.
Wagner,
In your case the number of tables is tiny and I trust you can fit them all in the cache easily – you can use a large enough table cache. On the other hand, I must say I did not test how table_cache size affects hits. I’m assuming only misses/invalidations cause the slowdown, but I have not tested it.
I don’t think 853 tables is tiny. Why so many? Are some of them duplicates providing duplicate functionality to different groups of users? Why not combine like data in a single table for all users? Or am I missing something?
Maybe this number is tiny compared with the number in Peter’s test, which was 100,000.
Regards.
I think table_cache values between 1,000 and 5,000 are of much higher practical relevance for vast majority of MySQL users (those that are not hosting providers etc).
I see, you didn’t bother to submit a bug report about it.
Am I the only reader of your blog who wants this fixed?
http://bugs.mysql.com/49177
@Sergei – I want it fixed and I shared that with MySQL.
@Peter – whether or not Innodb keeps a file descriptor open for the tablespace, index cardinality stats (~8 leaf blocks per index) are sampled when MySQL creates a table handler instance for an InnoDB table. That is the cost I mentioned above.
And as you can see from the analysis in the bug report, this is not a table cache problem, but a MyISAM one.
Great Post, Thank you all for your info.
Question:
After kicking off mk-table-checksum in the middle of the night I start getting errors such as:
100414 0:21:52 [ERROR] Error in accept: Too many open files
100414 0:21:52 [ERROR] /usr/local/mysql/bin/mysqld: Can’t open file: ‘./wc_miehr_posasmp/panel_decert.frm’ (errno: 24)
100414 0:21:52 [ERROR] /usr/local/mysql/bin/mysqld: Can’t open file: ‘./wc_miehr_posasmp/panel_decert.frm’ (errno: 24)
[root@wcdb /db]# perror 24
OS error code 24: Too many open files
I have this many MyISAM tables:
[root@xxx /db/mydata]# find -L . -name “*.MYI”|wc
13713 13713 558523
Number of possible needed fd’s: 13713 * 2 = 27426
I have this many InnoDB tables:
[root@xxx /db/mydata]# find -L . -name “*.ibd”|wc
31014 31014 1144412
Innodb is handled differently right? So 31014 table handles are always opened and would not be the cause of the errno: 24?
If I add the two, 31014 + 27426 = 58440 os file descriptors.
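The descriptor arithmetic above can be sketched as follows (a sketch under stated assumptions: 2 descriptors per open MyISAM table, .MYD plus .MYI, and one per .ibd file with innodb_file_per_table; whether all of them are held at once depends on table_cache and innodb_open_files):

```python
# Worst-case file descriptors needed if every table is open at once.
def required_descriptors(myisam_tables, innodb_tables):
    # MyISAM: one fd for the data file (.MYD) + one for the index (.MYI);
    # InnoDB with innodb_file_per_table: one fd per .ibd tablespace.
    return myisam_tables * 2 + innodb_tables

needed = required_descriptors(13713, 31014)  # table counts from this server
open_files_limit = 40000
# needed (58440) exceeds the limit, so errno 24 under mk-table-checksum
# (which touches every table) is exactly what this arithmetic predicts.
exceeds_limit = needed > open_files_limit
```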
So –
1) Do I get errno: 24 because I have the open_files_limit set to 40000?
2) Could you explain the difference between MyISAM and InnoDB and the difference in how the OS handles opened files?
3) Would converting the MyISAM tables to InnoDB fix or help the os ulimit max open tables issue?
Thanks for your help!
Server info:
Server version: 5.0.79-enterprise-log MySQL Enterprise Server (Commercial)
my.cnf:
[mysqld]
server-id = 2
port = 3306
basedir = /usr/local/mysql
datadir = /db/mydata
tmpdir = /db-logs/ibdlogs/dbtmp
pid-file = /db/mysql.pid
log-error = /db/mysql.err
log-slow-queries = /db/slow-queries.log
log-bin = /db-logs/bin-logs/mysql-logs/mysql-bin
expire_logs_days = 30 # Expire logs after in X days.
long_query_time = 2
thread_cache_size = 100
max_connections = 400
table_cache = 20000
open_files_limit = 40000
max_binlog_size = 500M
max_allowed_packet = 20M
tmp_table_size = 100M
max_heap_table_size = 100M
query_cache_size = 100M
default_table_type = MYISAM
log_long_format
# Innodb
innodb_status_file = 1
innodb_file_per_table = 1
innodb_data_home_dir = /db2/innodb
innodb_log_group_home_dir = /db-logs/ibdlogs
innodb_data_file_path = ibdata1:10M:autoextend
innodb_buffer_pool_size = 30G
innodb_log_buffer_size = 8M
innodb_additional_mem_pool_size = 8M
innodb_log_file_size = 256M
innodb_log_files_in_group = 2
innodb_thread_concurrency = 16
innodb_open_files = 20000
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2
transaction_isolation = READ-COMMITTED
we recently faced a similar issue so I wrote a tutorial here –> http://www.geeksww.com/tutorials/database_management_systems/mysql/installation/mysql_tablecache_informationschema_and_open_files_limit.php
This has bitten me once or twice too. It still seems to be an issue in the most recent Percona release of MySQL 5.5.14 too.
A patch has been written, have you thought about including it in Percona?
http://bugs.mysql.com/bug.php?id=49177
http://lists.mysql.com/commits/121507
I have opened only one table, but if I run SHOW GLOBAL STATUS LIKE ‘Opened_tables’ it shows 100 opened tables.
Sudeep, that is because you used Opened_tables, try:
show global status like ‘Open_tables’;
And you will most likely see only one.
The table cache used to be scanned linearly; has that ever been fixed? I see that 5.6.8 now defaults to 2000 — without a fix, this could lead to some degradation?
Peter, your blog mentions the defunct version 5.4. I’m here because the antique mysql_tuner is pointing to this blog. Can you add some updated information?
Some useful computed metrics (from GLOBAL STATUS):
Open_tables / table_open_cache
open_files_limit / table_open_cache
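A small sketch of those two ratios (the sample numbers below are made up for illustration):

```python
# Two quick health checks computed from SHOW GLOBAL STATUS /
# SHOW VARIABLES values.

def cache_fill_ratio(open_tables, table_open_cache):
    """Open_tables / table_open_cache -- close to 1.0 means the cache is
    full and evictions (Opened_tables growth) are likely."""
    return open_tables / table_open_cache

def fd_headroom(open_files_limit, table_open_cache):
    """open_files_limit / table_open_cache -- MyISAM can need ~2
    descriptors per cached table, so a ratio under 2 risks errno 24."""
    return open_files_limit / table_open_cache

fill = cache_fill_ratio(1980, 2000)   # 0.99: cache is saturated
headroom = fd_headroom(5000, 2000)    # 2.5: comfortable headroom
```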
I have changed table_open_cache = 15000 in my.cnf and then restarted MySQL.
But MySQL Tuner still says that I have table_open_cache = 400.
Why does it not change?
Oh, i just found solution about my previous question
http://stackoverflow.com/questions/19705738/mysql-table-cache-refuses-to-change-value/28936292
Since this article was written in 2009, what about newer versions of MySQL (5.5, 5.6 and 5.7-dev)?
+1 Does this still exist in MySQL 5.6+?
+1 or MySQL 5.7 / MariaDB 10.1?
Absolutely, someone should repeat these tests in 5.6 / 5.7!
Carlos, this only affected MyISAM and you shouldn’t be using it now, but for your information, it was fixed in MySQL 5.7.9.
I use mariadb 10.4.12. The problem is not fixed.