Tag Archive: mysql


Benchmarks for DATE operations in MySQL

This article compares the relative speed of extracting the date part of a value in MySQL with LEFT() and with the DATE() function.

LEFT() is faster than DATE(). To prove this, I inserted two million un-indexed sequential values into a table and selected the minimum and maximum values. Both queries are table scans, so it does read through all the records. The table below lists the time in seconds for MAX() on my computer. I tested with three data types: DATE, TIMESTAMP and DATETIME.

I don’t know why it’s faster to use LEFT() than DATE(). I would assume the reverse to be true, but clearly it’s not, at least on the systems I’ve tested.

The time for MIN() is one or two milliseconds faster than MAX(), probably because the values are sequential and only one assignment is performed, whereas the MAX() query must perform two million assignments.

HandlerSocket mySQL’s NoSQL, PHP and Webscale

HandlerSocket is a server plug-in that interfaces with InnoDB directly bypassing for the most part the core mySQL server all together. Using Handler socket you do not connect to the traditional port 3306 or use the mySQL protocol to communicate with the mySQL server, you use 9998 reads, 9999 writes and talks a different protocol (much smaller) to an epoll-based service. This awesome addition means that you can have many many connections with very little overhead. The core mySQL server does a good job, but can do better. I’m sure the reason why epoll is not in the core engine is probably due to licensing issues or some other reason I am not aware of, like it would be a big deal to add it. 

Why bypass the server? Really bypassing the sql parser is what was desired. The sql parser ads a significant amount of slowdown at a huge transaction rate, additionally mutex contention is involved prior to reaching the storage engine with malloc overhead (although Monty has fixed this in MariaDB). Even though you can easily get 45-70K selects a second through the mySQL SQL parser layer for InnoDB primary key lookups, this number falls short of Memcache’s 600K Gets per second or various other NoSQL solutions that keep data in memory. Continue reading “HandlerSocket mySQL’s NoSQL, PHP and Webscale” »

Most of high scale web applications use MySQL + memcached. Many of them use also NoSQL like TokyoCabinet/Tyrant. In some cases people have dropped MySQL and have shifted to NoSQL. One of the biggest reasons for such a movement is that it is said that NoSQL performs better than MySQL for simple access patterns such as primary key lookups. Most of queries from web applications are simple so this seems like a reasonable decision.
Like many other high scale web sites, we at DeNA(*) had similar issues for years. But we reached a different conclusion. We are using “only MySQL”. We still use memcached for front-end caching (i.e. preprocessed HTML, count/summary info), but we do not use memcached for caching rows. We do not use NoSQL, either. Why? Because we could get much better performance from MySQL than from other NoSQL products. In our benchmarks, we could get 750,000+ qps on a commodity MySQL/InnoDB 5.1 server from remote web clients. We also have got excellent performance on production environments.
Maybe you can’t believe the numbers, but this is a real story. In this long blog post, I’d like to share our experiences.
(*) For those who do not know.. I left Oracle in August 2010. Now I work at DeNA, one of the largest social game platform providers in Japan. Continue reading “Using MySQL as a NoSQL – A story for exceeding 750,000 qps on a commodity server” »

I wanted to compare the following DBs, NoSQLs and caching solutions for speed and connections. Tested the following

My test had the following criteria

  • 2 client boxes
  • All clients connecting to the server using Python
  • Used Python’s threads to create concurrency
  • Each thread made 10,000 open-close connections to the server
  • The server was
    • Intel(R) Pentium(R) D CPU 3.00GHz
    • Fedora 10 32bit
    • Intel(R) Pentium(R) D CPU 3.00GHz
    • 2.6.27.38-170.2.113.fc10.i686 #1 SMP
    • 1GB RAM
  • Used a md5 as key and a value that was saved
  • Created an index on the key column of the table
  • Each server had SET and GET requests as a different test at same concurrency

Results please !

Work sheet

throughput set

throughput get

I wanted to simulate a situation where I had 2 servers (clients) serving my code, which connected to the 1 server (memcached, redis, or whatever). Another thing to note was that I used Python as the client in all the tests, definately the tests would give a different output had I used PHP. Again the test was done to check how well the clients could make and break the connections to the server, and I wanted the overall throughput after making and breaking the connections. I did not monitor the response times. I didnt change absolutely any parameters for the servers, eg didn’t change the innodb_buffer_pool_size or key_buffer_size.

MySQL

MySQL lacked the whole scene terribly, I monitored the MySQL server via the MySQL Administrator and found that hardly there were any conncurrent inserts or selects, I could see the unauthenticated users, which meant that the client had connected to MySQL and was doing a handshake using MySQL authentication (using username and password). As you could see I didn’t even perform the 40 and 60 thread tests.

I truncated the table before I swtiched my tests from MyISAM to InnoDB. And always started the tests from lesser threads. My table was as follows

CREATE TABLE `comp_dump` (
  `k` char(32) DEFAULT NULL,
  `v` char(32) DEFAULT NULL,
  KEY `ix_k` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

NoSQL

For Tokyo Tyrant I used a file.tch as the DB, which is a hash database. I also tried MongoDB as u may find if u have opened the worksheet, But the server kept failing or actually the mongod failed after coming at an unhandled Exception. I found something similar over here. I tried 1.0.1, 1.1.3 and the available Nightly build, but all failed and I lost my patience.

Now what

If you need speed just to fetch a data for a given combination or key, Redis is a solution that you need to look at. MySQL can no way compare to Redis and Memcache. If you find Memcache good enough, you may want to look at Tokyo Tyrant as it does a synchronous writes. But you need to check for your application which server/combination suits you the best. In Marathi there is a saying “मेल्या शिवाय स्वर्ग दिसत नाही”, which means “You can’t see heaven without dieing” or need to do your hard work, can’t escape that ;)

I’ve attached the source code used to test, if anybody has any doubts, questions feel free to ask

Attachment Size
throughput-get.png 8.57 KB
throughput-set.png 8.65 KB
worksheet.png 42.36 KB
comparision.tar.gz 7.46 KB

MySQL Connections

MySQL will allow n number of connections at a given point of time, To find out that n no of connections run the following command.

mysql> show variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 100   |
+-----------------+-------+
1 row in set (0.00 sec)

So this server will allow max of 100 connections at any given point of time.

Before we change the following variable, its necessary to understand how MySQL uses connections and tables. Consider that for every connection on your server, you are querying 3 tables. ie. MySQL will open 3 file descriptors for each connection (1 for each table)

MySQL Doc
…the table is opened independently by each concurrent thread. …

Considering your site has a about 300 connections per/second, 200 of those connections will be dropped since your setting is 100. So changing the setting to 300, MySQL will open (300 x 3 = 900) file descriptors. A file descriptor is nothing but a connection to that file (programmers,sysads will know what I mean). Now its necessary to check what does your OS suuport, normally OS will support about X no. of file descriptors at a given point of time. So increasing your connection settings will not help solve the issue. You need to consult your OS’s manual before changing the settings on a production server.

Fine, Lets assume that every param is under the limit, we can tweak around, As in MySQL almost every parameter can be controlled by configuration file /etc/my.cnf in Linux or [MySQL INSTALL DIR]\my.ini in Windows.

Open the file and add the following under mysqld section max_connections=300. So it should be something like this.

[mysqld]
max_connections=300
...

Restart MySQL Server after changing the file. You can alternatively run a Command while the server is running, but the setting won’t be maintained if the server restarts. The following command can be executed

mysql> set global max_connections=300;

References

  1. Show Variables
  2. Server System Variables
  3. Server Parameters
  4. Too many connections
  5. Table Cache

MySQL Connection Errors

There could be many reasons why a connection to MySQL server can fail, like

  • Networking Problem
  • Server itself could be down
  • Authentication Problems
  • Maximum Connection Errors allowed.

Of all the errors, this thread will discuss Maximum Connection Errors.
This particular parameter max_connect_errors defines the no. of connection errors a particular host can make before it is banned. Yes Banned! This is a feature that MySQL provides to limit erroneous clients.

By default 10 maximum connection errors are allowed per host. You can check using foll. command.

mysql> show variables like 'max_connect_errors';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| max_connect_errors | 10    |
+--------------------+-------+
1 row in set (0.02 sec)

You can set the above variable in the same manner as setting max_connections, ie in my.cnf file.

[mysqld]
max_connect_errors=100
...

Once particular host exceeds the no. of errors, a particular command is required to be given to the server to reset the host connect errors and indirectly allow the hosts to connect to the server. This command flush hosts will help you do that.

In our scenario, where there are 300 connections per/second, and each connection-query is taking some processing time before connection is closed, There could be a time where the Server is busy, and starts to queue processes, in such a case the connection would be still held until its query is processed. So when a new connection is required, the server may reject it considering the connections that are currently active. ie. We’d set max_connections to 300, so every new connection request could be denied adding to the connection errors, which could mean the host could be blocked. Considering a Web Server + MySQL Server scenario, the web server’s host could be banned until a flush hosts is fired at the MySQL Server.

So in such a case, it is necessary to increase the no. of simultaneous connections that can be made to the server by changing the max_connections parameter. Still if the load on MySQL server is high, and you are getting Connection errors, its time to load balance the Server.

MySQL Replication Manager

MySQL replication manager has ability to

  1. Start, Stop Slave
  2. Set Status of Slave

Download

License: GNU General Public License

MySQL: Efficient Stored Procedure Editing

This is not about how to write a Stored Procedure (SP), But how to efficiently write a SP.

MySQL Query Browser

The easiest way to write a SP is to use a MySQL Query Browser, Just select the database and right click the dB and “Create Stored Routite…” This shall help you do easily modify, edit and create procedures.

Command Line

Unfortunately, not many of the dB guys could have access to GUI and create/edit access permissions, they may have to rely on the CLI. This is where create / edit of SPs is the most tedious. Lets set something in our system, before we log into MySQL.

  • In your ~/.bashrc add the following…
    export EDITOR=vim VISUAl=vim
  • Create / Edit ~/.vimrc file, and add the following…
    set noai
    set nonu
  • Log into the MySQL CLI, and set the following command
    delimiter //

The Developer now is ready on the server side to create/edit the procs. On the client side I assume the Developer has a Notepad equivalent text editor, or even better, I suggest Kate (if you are on Linux box).

In the editor…

  • Write an if exists drop procedure statement…
  • With few carriage returns following…, write the whole content (SP)
  • If a specific user/s need execute, write the grant execute permission after few carriage returns…

The dev is now ready to Create the proc..

  1. Paste the if exists drop statement and execute
  2. In the mysql prompt, select the dB and enter the following prompt
    \e

    . This is basically to go into editor mode. Press “i”, to get into the insert mode and paste the actual SP Content

  3. Press Escape key, to go back in normal mode and then press “:wq” to save and exit
  4. Enter “//” (the command delimiter) and press Enter, that should either create the proc, or return with the errors.
  5. Execute the Execute Grant statements

Done, Simple.

PS: Copy Paste @ rescue yet again !!! :D

My previous post Redis, Memcache, Tokyp Tyrant, MySQL comparison. The MySQL was taking a huge time for doing a reverse DNS lookup.

I turned on the skip-name-resolve parameter in the my.cnf and the Throughput of MySQL grew considerably, almost more than double.

Here are the new results.

GET

SET

worksheet

MyISAM vs InnoDB

Nothing much has changed in the above test. Except for the fact InnoDB starts leading the way when there are high number of concurrent Inserts/Updates or Writes on the table. As seen from the “Set” graph InnoDB starts closing for MyISAM’s write efficiency around 30 concurrent requests and then by 60 concurrent requests its already ahead in throughput of writes – 1284/s against 825/s. Further I had put a watch on processlist and was watching the processess, there were times during MyISAM when the inserts took over 6seconds to finish, which also means that if you are in a need of an application which requires quicker response during heavy loads / heavy concurrency… You need to check the MyISAM vs. InnoDB scenario really closely. At low concurrency MyISAM is well ahead in writes, and in Reads, both MyISAM and InnoDB perform equally well.

Again you need to make sure that you check ur test conditions really well before just taking InnoDB for granted.

Attachment Size
throughput-get2.png 7.96 KB
throughput-set2.png 8.71 KB
worksheet-2.png 23.31 KB
comparision.ods 29.02 KB

Mysql Replication: Same Server, Rewrite database

MySQL support same-server replication into another database, Its quite a weired requirement, but in reality weired is common.

Consider a server 192.168.5.70, which has 2 databases db1 and db2
Now we shall set up replication for two tables on db1, ie. table1 and table2.

Here is the my.cnf

[mysqld]
server-id=1
#### Replication ####
report-host=master-is-slave-host
log-bin=192.168.5.70-binlog
relay-log=192.168.5.70-relaylog

replicate-same-server-id=1

binlog-do-db=db1

# Note.... On rewrite, the  command is changed into buffer
# so the replicate-do-db and replicate-do-table should have the
# re-written db name.
replicate-rewrite-db=db1->db2
replicate-do-table=db2.table1
replicate-do-table=db2.table2

Lets look at it carefully.

replicate-same-server-id=1
This is to tell the slave the commands that it has to execute will be having the same server-id as its own.

binlog-do-db=db1
To log only database db1

replicate-rewrite-db=db1->db2
Let the slave know that whatever command is for db1 it has to execute on db2

replicate-do-table=db2.table1
replicate-do-table=db2.table2

Note how the table in db2 is replicated, as against the general thinking of db1.table1, MySQL could be using some buffered string and then evaluating the replicate-do-table rule.

Powered by WordPress | Theme: by 85ideas. Editor by Khoanguyen