Daily Archives: April 28, 2015

MySQL million rows test: insertion

These days I have started to use MySQL to query tables with millions of rows. To get a better feel for MySQL's performance, I ran this test. Basically, I created several tables with different types and numbers of indices, tried to insert as many rows as possible into each within a fixed time, and compared the results.
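The timed-insertion loop can be sketched roughly as below. This is a minimal sketch, not the actual test harness (the real test used my Java code against MySQL); it uses Python's built-in SQLite as a stand-in database, mirrors `itable1` (two int columns, no index), and shortens the 170-second run.

```python
import sqlite3
import time

# Stand-in for MySQL: an in-memory SQLite database with the same
# two-int-column layout as `itable1` (no index).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE itable1 (Id1 INTEGER, id2 INTEGER)")

DURATION = 0.5  # the post ran each table for 170 s; shortened here
BATCH = 1000    # rows per batched INSERT

deadline = time.monotonic() + DURATION
inserted = 0
while time.monotonic() < deadline:
    rows = [(i, i * 2) for i in range(inserted, inserted + BATCH)]
    conn.executemany("INSERT INTO itable1 (Id1, id2) VALUES (?, ?)", rows)
    inserted += BATCH
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM itable1").fetchone()[0]
print(f"inserted {count} rows in {DURATION} s")
```

Repeating the same loop against each indexed table gives a rows-per-second figure to compare across schemas.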

Environment:
Windows 8.1
Memory: 12 GB
MySQL 5.1.73-community
CPU: AMD A10, 2.1 GHz

index table 1 (no index)
CREATE TABLE `itable1` (
`Id1` int(11) DEFAULT NULL,
`id2` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='insertion test, no index'

index table 2 (1 int index)
CREATE TABLE `itable2` (
`Id1` int(11) DEFAULT NULL,
`id2` int(11) DEFAULT NULL,
KEY `idx1` (`Id1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='insertion test, 1 int index'

index table 3 (2 int indices)
CREATE TABLE `itable3` (
`Id1` int(11) NOT NULL DEFAULT '0',
`id2` int(11) DEFAULT NULL,
KEY `idx1` (`Id1`),
KEY `idx2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='insertion test, 2 int indices'

index table 4 (10 int indices)
CREATE TABLE `itable4` (
`Id1` int(11) DEFAULT NULL,
`id2` int(11) DEFAULT NULL,
`id3` int(11) DEFAULT NULL,
`id4` int(11) DEFAULT NULL,
`id5` int(11) DEFAULT NULL,
`id6` int(11) DEFAULT NULL,
`id7` int(11) DEFAULT NULL,
`id8` int(11) DEFAULT NULL,
`id9` int(11) DEFAULT NULL,
`id10` int(11) DEFAULT NULL,
KEY `idx1` (`Id1`),
KEY `idx2` (`id2`),
KEY `idx3` (`id3`),
KEY `idx4` (`id4`),
KEY `idx5` (`id5`),
KEY `idx6` (`id6`),
KEY `idx7` (`id7`),
KEY `idx8` (`id8`),
KEY `idx9` (`id9`),
KEY `idx10` (`id10`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='insertion test, 10 int indices'

index table 5 (2 varchar indices)
CREATE TABLE `itable5` (
`Id1` varchar(255) DEFAULT NULL,
`id2` varchar(255) DEFAULT NULL,
KEY `idx1` (`Id1`),
KEY `idx2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='insertion test, 2 varchar indices'

I ran each insertion test for 170 seconds, and the results are below:

Conclusion:
1. The more indices a table has, the slower insertion becomes.
2. The more indices a table has, the more disk space it takes up.
*3. Insertion speed into an indexed table stays roughly constant as the number of rows increases.

Point 3 was beyond my expectation. MySQL uses a B+ tree, so inserting one row takes O(log N) time, and continuously inserting N rows takes log 1 + log 2 + … + log N = log(N!) time. log(N!) is approximately N log N, so I expected the cumulative time to be a curve, not a line. Weird…
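The arithmetic above can be checked numerically. A minimal sketch: log(N!) really is close to N·log N, but the per-row cost only grows like log N, which barely changes over the measured range. That is one plausible reason the cumulative curve looks linear in practice.

```python
import math

# Cumulative cost of N successive B+-tree insertions:
#   log 1 + log 2 + ... + log N = log(N!)
# Stirling's approximation gives log(N!) ~ N*log(N) - N.
n = 1_000_000
exact = math.lgamma(n + 1)       # lgamma(n+1) = ln(n!)
stirling = n * math.log(n) - n   # leading terms of Stirling's formula
print(f"ln(N!) = {exact:,.0f}   N*ln(N) - N = {stirling:,.0f}")

# Why the measured curve still looks linear: the per-row cost grows
# only like log N, and log N barely changes between 0.5M and 1M rows.
ratio = math.log(1_000_000) / math.log(500_000)
print(f"per-row cost at 1M rows vs 0.5M rows: x{ratio:.3f}")
```

Over a doubling of the table size, the per-row cost rises by only about 5%, which is well within measurement noise for a wall-clock benchmark.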

For my Java code, please click here: https://www.github.com/allenlipeng47/MillionTableTest

I think the two tables below give a general feeling for the time and space cost of indices.

MySQL million rows test: select, join

These days I have started to use MySQL to query tables with millions of rows. To get a better feel for MySQL's performance, I ran this test.
Basically, I have two big tables and one small table. Each big table has 100M rows; the small table has 1M rows. I tested select and join operations among them, on indexed and unindexed fields.

Environment:
Windows 8.1
Memory: 12 GB
MySQL 5.1.73-community
CPU: AMD A10, 2.1 GHz

big table1:
CREATE TABLE `btable1` (
`indexId1` int(11) NOT NULL AUTO_INCREMENT,
`indexId2` int(11) DEFAULT NULL,
`unIndexVarchar` varchar(255) DEFAULT NULL,
PRIMARY KEY (`indexId1`),
KEY `indexId2` (`indexId2`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COMMENT='100 million rows, 2 indexed fields'

big table 2:
CREATE TABLE `btable2` (
`indexId` int(11) NOT NULL AUTO_INCREMENT COMMENT 'indexed primary key',
`unIndexId` int(11) DEFAULT NULL,
`unIndexVarchar` blob,
PRIMARY KEY (`indexId`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COMMENT='100 million rows, 1 indexed field'

small table:
CREATE TABLE `stable` (
`indexId` int(11) NOT NULL AUTO_INCREMENT,
`unIndexId` int(11) DEFAULT NULL,
`unIndexVarchar` blob,
PRIMARY KEY (`indexId`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COMMENT='small table, has 1 million items'

Then, I did the following test among these tables:
1. SmallTable count
select count(*) from stable
result: 1M; time: 2s

2. BigTable count
select count(*) from btable1
result: 100M; time: 110s

3. BigTable join SmallTable. 
select count(*) from btable1 join stable on btable1.indexId1 = stable.unIndexId
result: 1M; time: 9.11s

4. BigTable1 join SmallTable join BigTable2
select count(*) from btable1 join stable on btable1.indexId1 = stable.indexId join btable2 on btable2.unIndexId = stable.indexId
result: 1M; time: 14s

5. BigTable1 join BigTable2 join SmallTable
select count(*) from btable1 join btable2 on btable1.indexId1 = btable2.indexId join stable on btable1.indexId1 = stable.indexId
result: 1M; time: 8.8s

6. SmallTable count, unindexed condition
select count(*) from stable where stable.unIndexVarchar='AK'
result: 1451, time: 2.9s

7. BigTable join SmallTable, unindexed condition
select count(*) from btable1 join stable on btable1.indexId1 = stable.indexId where stable.unIndexVarchar='KA'
result: 1471, time: 3s

8. BigTable join SmallTable, unindexed condition
select count(*) from btable1 join stable on btable1.indexId1 = stable.indexId where btable1.unIndexVarchar='IO'
result: 1543, time: 4.3s

9. BigTable join SmallTable, unindexed condition
select count(*) from btable1 join stable on btable1.indexId1 = stable.unIndexId where stable.unIndexVarchar='AK'

Conclusion:
2, 3 show that joining a big table with a small table is more efficient than counting the big table alone.
3, 4 show that joining in a small table that narrows the result helps cut down the work on the big table.
4, 5 show that join order doesn't matter; MySQL can optimize the SQL query.
1, 6 show no big difference between these two queries; both take O(n) time.
7, 8 show that a condition on the small table is more efficient than a condition on the big table.
9 takes longer than 3, which shows that a condition on an unindexed column is inefficient.
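The last point can also be seen directly from the query planner. The sketch below uses Python's built-in SQLite and its EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the table mirrors `stable`. A condition on the indexed primary key becomes an index search, while the same condition on an unindexed column forces a full table scan.

```python
import sqlite3

# Stand-in for `stable`: indexed primary key plus two unindexed columns.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE stable (
    indexId INTEGER PRIMARY KEY,
    unIndexId INTEGER,
    unIndexVarchar TEXT)""")

def plan(query):
    """Return the planner's description of how a query will run."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " ".join(r[-1] for r in rows)  # last column holds the detail text

indexed = plan("SELECT * FROM stable WHERE indexId = 42")
unindexed = plan("SELECT * FROM stable WHERE unIndexId = 42")
print("indexed:  ", indexed)    # SEARCH via the primary-key index
print("unindexed:", unindexed)  # SCAN of the whole table
```

MySQL's `EXPLAIN select ...` gives the analogous information (`type: ref`/`const` for an indexed lookup versus `type: ALL` for a full scan), which is a cheap way to predict which of the queries above will be slow before running them.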