MySQL Tutorials - Herong's Tutorial Examples - v4.43, by Herong Yang
Workaround on InnoDB "SELECT COUNT(*)" Problem
This section provides a tutorial example showing workarounds of the 'SELECT COUNT(*)' performance problem. count(fid) is much faster, if 'fid' is defined as a secondary index.
After learning how secondary index works on InnoDB tables, now we can look at some workarounds on the "SELECT COUNT(*)" performance problem.
1. Review 'Body' pages cached in the buffer.
mysql> select index_name, data_size/number_records as record_size, -> count(*) as pages, sum(number_records) as records -> from information_schema.innodb_buffer_page -> where table_name rlike '`Body`' -> group by index_name, data_size/number_records -> order by index_name; | index_name | record_size | pages | records | +------------+-------------+-------+---------+ | PRIMARY | 14.0000 | 1 | 801 | | PRIMARY | 1812.0000 | 801 | 6400 | | SECOND | 14.0000 | 6 | 6400 | | SECOND | 18.0000 | 1 | 6 | | THIRD | 14.0000 | 6 | 6400 | | THIRD | 18.0000 | 2 | 12 |
2. Run "SELECT COUNT(*)" with the primary index.
mysql> select count(*) from Body; | 6400 | mysql> call GetReads(); | @read | @last | +-------+----------+ | 2405 | 71090813 | mysql> select count(*) from Body use index (PRIMARY); | 6400 | mysql> call GetReads(); | @read | @last | +-------+----------+ | 2405 | 71059434 |
As expected, InnoDB engine performs "SELECT COUNT(*)" statement on an InnoDB table with a full table scan, reading all pages from the table. It actually reads all pages 3 times: 2405 > 3*801!
3. Compare "SELECT COUNT(*)" performance with no index, and different indexes. I see no differences.
mysql> select count(*) from Body USE INDEX (); @read = 2405 mysql> select count(*) from Body use index (PRIMARY); @read = 2405 mysql> select count(*) from Body use index (SECOND); @read = 2405 mysql> select count(*) from Body use index (THIRD); @read = 2405 mysql> select count(*) from Body; @read = 2405 mysql> explain select count(*) from Body; +-------------+-------+---------------+-------+----------+-------------+ | select_type | type | possible_keys | key | filtered | Extra | +-------------+-------+---------------+-------+----------+-------------+ | SIMPLE | index | NULL | THIRD | 100.00 | Using index | +-------------+-------+---------------+-------+----------+-------------+
4. Compare "SELECT COUNT(*)", "SELECT COUNT(fid)" and "SELECT COUNT(fid)" statements.
mysql> select count(*) from Body; @read = 2405 mysql> select count(id) from Body; @read = 2405 mysql> select count(id) from Body use index (); @read = 2405 mysql> select count(fid) from Body; @read = 72 mysql> select count(fid) from Body use index (SECOND); @read = 72 mysql> select count(fid) from Body use index (); @read = 867
What a surprise! InnoDB is able take advantages of a secondary index, if we count unique values of "fid" column, which is equivalent to the total number of rows.
In the above test, "SELECT COUNT(fid)" took only 72-page reading, comparing 2405-page reading for "SELECT COUNT(*)". That is a 97% reduction of execution time.
In theory, InnoDB could perform "SELECT COUNT(fid)" on the secondary index only, without using the primary index at all. It could do a full table scan of all 6 data pages of the secondary index and get the job done.
5. Add a secondary index as a shadow of the primary index.
mysql> create UNIQUE index SHADOW on Body (id); mysql> show index from Body; | Non_unique | Key_name | Column_name | Null | Index_type | +------------+----------+-------------+------+------------+ | 0 | PRIMARY | id | | BTREE | | 0 | SHADOW | id | | BTREE | | 1 | SECOND | fid | YES | BTREE | | 0 | THIRD | fid | YES | BTREE | mysql> select index_name, data_size/number_records as record_size, -> count(*) as pages, sum(number_records) as records -> from information_schema.innodb_buffer_page -> where table_name rlike '`Body`' -> group by index_name, data_size/number_records -> order by index_name; | index_name | record_size | pages | records | +------------+-------------+-------+---------+ | PRIMARY | 14.0000 | 1 | 801 | | PRIMARY | 1812.0000 | 801 | 6400 | | SHADOW | 9.0000 | 4 | 6400 | | SHADOW | 13.0000 | 2 | 8 | | SECOND | 14.0000 | 6 | 6400 | | SECOND | 18.0000 | 1 | 6 | | THIRD | 14.0000 | 6 | 6400 | | THIRD | 18.0000 | 2 | 12 |
5. Run "SELECT COUNT(id)" statement with SHADOW index. I see no improvement.
mysql> select count(id) from Body use index (SHADOW); @read = 2405 mysql> select count(id) from Body use index (PRIMARY); @read = 2405
What we have learned so far:
Table of Contents
MySQL Introduction and Installation
Introduction of MySQL Programs
Perl Programs and MySQL Servers
Java Programs and MySQL Servers
Character Strings and Bit Strings
Table Column Types for Different Types of Values
Using DDL to Create Tables and Indexes
Using DML to Insert, Update and Delete Records
Using SELECT to Query Database
Window Functions for Statistical Analysis
Use Index for Better Performance
Transaction Management and Isolation Levels
Defining and Calling Stored Procedures
Variables, Loops and Cursors Used in Stored Procedures
System, User-Defined and Stored Procedure Variables
Storage Engines in MySQL Server
►InnoDB Storage Engine - Primary and Secondary Indexes
Primary Key Index on InnoDB Table
InnoDB Primary Key Index on Large Table
"SELECT COUNT(*)" on InnoDB Table
Secondary Index on InnoDB Table
Performance of Index Range on InnoDB Table
►Workaround on InnoDB "SELECT COUNT(*)" Problem
Performance Tuning and Optimization
Installing MySQL Server on Linux