Tag Archive: redis


I wanted to compare several databases, NoSQL stores and caching solutions (MySQL with MyISAM and InnoDB, Redis, Memcached, Tokyo Tyrant and MongoDB) for speed and connection handling.

My test had the following criteria

  • 2 client boxes
  • All clients connecting to the server using Python
  • Used Python’s threads to create concurrency
  • Each thread made 10,000 open-close connections to the server
  • The server was
    • Intel(R) Pentium(R) D CPU 3.00GHz
    • Fedora 10 32bit
    • Intel(R) Pentium(R) D CPU 3.00GHz
    • 2.6.27.38-170.2.113.fc10.i686 #1 SMP
    • 1GB RAM
  • Used an MD5 hash as the key, with a value saved against it
  • Created an index on the key column of the table
  • Each server was tested with SET and GET requests as separate tests at the same concurrency
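
The criteria above can be sketched roughly as follows. This is a minimal illustration and not the attached test code: `FakeServer` is a hypothetical in-memory stand-in so the sketch runs without memcached, Redis or MySQL, and a real client would open and close a connection around each request.

```python
import hashlib
import threading
import time


class FakeServer:
    """Hypothetical in-memory stand-in for the server under test."""

    def __init__(self):
        self.data = {}
        self.lock = threading.Lock()

    def set(self, key, value):
        with self.lock:
            self.data[key] = value


def worker(server, thread_id, requests_per_thread):
    for i in range(requests_per_thread):
        # MD5 hex digest as the key, as in the test criteria.
        key = hashlib.md5(f"{thread_id}-{i}".encode()).hexdigest()
        # A real client would connect here, send SET, then disconnect.
        server.set(key, key)


def run_set_test(server, n_threads, requests_per_thread):
    """Run the SET test and return (total requests, requests per second)."""
    threads = [
        threading.Thread(target=worker, args=(server, t, requests_per_thread))
        for t in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    total = n_threads * requests_per_thread
    return total, total / elapsed
```

Calling `run_set_test(FakeServer(), 20, 10000)` would mirror one concurrency level of the SET test; the GET test would follow the same shape with reads instead of writes.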

Results please !

Work sheet

throughput set

throughput get

I wanted to simulate a situation where I had 2 servers (clients) serving my code, which connected to 1 server (memcached, Redis, or whatever). Another thing to note is that I used Python as the client in all the tests; the tests would definitely have given different output had I used PHP. Again, the test was done to check how well the clients could make and break connections to the server, and I wanted the overall throughput after making and breaking the connections. I did not monitor response times. I didn’t change any server parameters at all, e.g. I didn’t change innodb_buffer_pool_size or key_buffer_size.

MySQL

MySQL lagged terribly across the board. I monitored the MySQL server via MySQL Administrator and found that there were hardly any concurrent inserts or selects. I could see unauthenticated users, which meant that the client had connected to MySQL and was still doing the handshake using MySQL authentication (username and password). As you can see, I didn’t even perform the 40 and 60 thread tests.

I truncated the table before I switched my tests from MyISAM to InnoDB, and always started the tests from the lowest thread count. My table was as follows:

CREATE TABLE `comp_dump` (
  `k` char(32) DEFAULT NULL,
  `v` char(32) DEFAULT NULL,
  KEY `ix_k` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

NoSQL

For Tokyo Tyrant I used a file.tch as the DB, which is a hash database. I also tried MongoDB, as you may have noticed if you opened the worksheet, but the server kept failing; more precisely, mongod died after hitting an unhandled exception. I found something similar over here. I tried 1.0.1, 1.1.3 and the available nightly build, but all failed and I lost my patience.

Now what

If you need speed just to fetch data for a given combination or key, Redis is a solution you need to look at. MySQL can in no way compare to Redis and Memcache. If you find Memcache good enough, you may want to look at Tokyo Tyrant, as it does synchronous writes. But you need to check which server/combination suits your application best. In Marathi there is a saying, “मेल्या शिवाय स्वर्ग दिसत नाही”, which means “You can’t see heaven without dying”: you need to do your own hard work, you can’t escape that ;)

I’ve attached the source code used for the tests. If anybody has any doubts or questions, feel free to ask.

Attachment Size
throughput-get.png 8.57 KB
throughput-set.png 8.65 KB
worksheet.png 42.36 KB
comparision.tar.gz 7.46 KB

Redis vs Memcached

This month’s Linux Journal has an article about Redis.  I read about it while sitting on the shitter, because that’s about all that Linux Journal is useful for.  The article itself was crap, but the introduction to this product was at least tempting.

The basics:  take memcached and add a disk backing, replication, virtual memory, and some cool additional data structures.  Hashes, lists, sets, sorting, joining, transactions, yay!  I instantly got a geek boner scanning the feature list.  My boner quickly faded to about half mast when I saw the C clients that are available.  No consistent hashing?  Wait, no server hashing at all? Yet they have a gay Ruby client, fully featured?  C is the lowest common denominator when it comes to language support.  Start there, then add high level bindings using this low level library.  libmemcached got it right.  You start on the bottom and work your way up to higher level bindings.  It seems that Redis took the wrong approach to this which will result in every language having a different client implementation.  This leads to inconsistencies between languages, which is bad news.  That on first glance gave me that headed-for-the-toilet feeling for this project.

With a half-mast boner, I decided to do some benchmark comparisons.  Perhaps I’ll see some real numbers that might arouse me.  Stand back, I’m going to do some science!

The Test Setup

I used the same machine as client and server for both tests to avoid network adding a skew to the results.  (This should stress the algorithms themselves)  The hardware itself doesn’t matter, it’s an apples to apples comparison.  The software does however:

  • Redis 2.0.0 rc4
  • Memcached 1.4.5 (release)
  • Credis (libcredis client) 0.2.2
  • Libmemcached 0.31  (I know this is a little outdated, but for this test it works fine)
  • Memcache client benchmark app: mc_stress.c
  • Redis client benchmark app: redis_stress.c

The clients I wrote are just simple iterations of setting and getting keys and timing the results.  Since WP sucks for posting source code snippets I had to stuff these into PDF files.  Feel free to recreate my tests and post your results, since that’s what science is all about.

Results

This test varies the key size with a static value of only 3 bytes.  (Using 100k unique keys)  I’m guessing this will stress the internal hashing algorithms used for processing the key names.  As you can see here, Redis came up short, running 20-26% slower than memcache.

Same test, but showing the Multi-GET performance.  Redis clearly has some kind of problem here, and I believe it might be due to requesting a large number of keys in a single MGET operation.  100,000 keys seemed to be the upper limit… but in every case, Redis came up short.

This test stresses different value sizes.  Using a small 10 byte key length, I used varying value lengths up to around 16KB.  Even though memcached advertises a 1MB value limitation, performance drops significantly over 8192 bytes.  SET speed was reduced by 100 times and, depending on the test, GET speed was either significantly reduced or reduced by 100 times as well.  I suspect that playing around with slab sizes might assist with the speed here, but that becomes quite the pain in the ass.  The main point:  if you use memcache, do not stuff large values into a single key.  Hash them out into namespaces, you’ll be much better off.
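
One possible reading of "hash them out into namespaces" is to split a large value into chunks stored under a shared key prefix. The sketch below is an assumption about what that could look like, not code from the benchmark: the `key:count` / `key:N` naming and the 8192 byte chunk size are made up for illustration, and a plain dict stands in for the memcache client.

```python
CHUNK_SIZE = 8192  # assumption: stay at the size where the slowdown began


def set_chunked(store, key, value, chunk_size=CHUNK_SIZE):
    """Store `value` (bytes) as several small entries under one namespace."""
    chunks = [value[i:i + chunk_size] for i in range(0, len(value), chunk_size)]
    # Record how many chunks exist so the reader knows what to fetch.
    store[f"{key}:count"] = str(len(chunks)).encode()
    for n, chunk in enumerate(chunks):
        store[f"{key}:{n}"] = chunk
    return len(chunks)


def get_chunked(store, key):
    """Reassemble a value written by set_chunked, or return None if absent."""
    count = store.get(f"{key}:count")
    if count is None:
        return None
    return b"".join(store[f"{key}:{n}"] for n in range(int(count)))
```

With a real memcache client the per-chunk reads would be a natural fit for a multi-get, at the cost of losing atomicity across the chunks.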

Adding to this, Redis boasts a 1GB value size.  We would need to graph another relationship of transfer rate to determine if there is an inefficiency in handling larger values, or if it’s just due to the size of the value itself.  Memcache shows a similar trend as value size increases, but again Redis is 20-40% slower than memcache.

Here is the same test, showing Multi-GET performance.  Similar trend as key length test.

Conclusion

After crunching all of these numbers and screwing around with the annoying intricacies of OpenOffice, I’m giving Redis a big thumbs down.  My initial sexual arousal from the feature list is long gone.  Granted, Redis might have its place in a large architecture, but certainly not a replacement to memcache.  When your site is hammering 20,000 keys per second and memcache latency is heavily dependent on delivery times, it makes no business sense to transparently drop in Redis.   The features are neat, and the extra data structures could be used to offload more RDBMS activity… but 20% is just too much to gamble on the heart of your architecture.

Maybe sometime in the future Redis will be up to par with Memcached performance.  Or, it could be the extra VM and disk backing is inherently flawed to begin with.  In that case, Memcached will always win, and Redis will be only useful in a niche market somewhere between DB caching and NoSQL fanboys.

This is a follow-up to my previous post, Redis, Memcache, Tokyo Tyrant, MySQL comparison. In that test MySQL was spending a huge amount of time doing reverse DNS lookups.

I turned on the skip-name-resolve parameter in my.cnf and the throughput of MySQL grew considerably, to more than double.

Here are the new results.

GET

SET

worksheet

MyISAM vs InnoDB

Nothing much has changed in the above test, except that InnoDB starts leading the way when there is a high number of concurrent inserts/updates (writes) on the table. As seen in the “Set” graph, InnoDB starts closing in on MyISAM’s write efficiency at around 30 concurrent requests, and by 60 concurrent requests it is already ahead in write throughput: 1284/s against 825/s. I also kept a watch on the processlist, and there were times with MyISAM when inserts took over 6 seconds to finish. So if your application needs quick responses under heavy load or heavy concurrency, you need to check the MyISAM vs. InnoDB scenario really closely. At low concurrency MyISAM is well ahead in writes, and in reads MyISAM and InnoDB perform equally well.

Again, make sure you check your test conditions really well before just taking InnoDB for granted.

Attachment Size
throughput-get2.png 7.96 KB
throughput-set2.png 8.71 KB
worksheet-2.png 23.31 KB
comparision.ods 29.02 KB

Seeing that Redis v2.0 has just been released and Oren Eini (aka @ayende) has just checked in performance optimizations that show a 2x speed improvement for raw writes in RavenDB, I thought it was a good time to do a benchmark pitting these 2 popular NoSQL data stores against each other.

Benchmarks Take 2 – Measuring write performance

For the best chance of an apples-to-apples comparison I just copied RavenDB’s benchmarks solution project and modified it slightly, only to slot in the equivalent Redis operations. The modified solution is available here. Redis was also configured to run in its ‘safest mode’, where it keeps an append-only transaction log with the fsync option so an operation does not complete until its transaction log entry is written to disk. This is so we can get Redis to closely match RavenDB’s default behaviour. Enabling this behaviour in Redis is simply a matter of uncommenting the lines below in redis.conf:

appendonly yes
appendfsync always

To use this new configuration simply run ‘redis-server.exe /path/to/redis.conf’ on the command line.
The other change I made for this new set of benchmarks was to remove batching from the Redis benchmark, since it’s accidental complexity that is neither required nor useful for the Redis Client.

Here are the benchmarks with these new changes in place:

Which for this scenario show that:

Redis is 11.75x faster than RavenDB

Note: The benchmarks here are of Redis running on a Windows Server through the Linux API emulation layer - Cygwin. Expect better results when running Redis on Unix servers where it is actively developed and optimized for. It is understood that the Cygwin version of redis-server is 4-10x slower than the native Linux version so expect results to be much better in production.

I attribute the large discrepancy between Redis and RavenDB to the fact that Redis doesn’t use batches, so it only pays the ‘fsync penalty’ once instead of once per batch.
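
To make the ‘fsync penalty’ concrete, here is a small sketch of an append-only log that counts how often it forces data to disk. It is an illustration only and not how Redis implements its log: `append_log` and `demo` are hypothetical names, and the sync policy here is count-based (every N writes) purely so the example is deterministic.

```python
import os
import tempfile


def append_log(path, entries, fsync_every):
    """Append entries to a log file, calling fsync after every
    `fsync_every` writes. Returns the number of fsync calls made."""
    fsyncs = 0
    pending = 0
    with open(path, "ab") as log:
        for entry in entries:
            log.write(entry + b"\n")
            pending += 1
            if pending >= fsync_every:
                log.flush()
                os.fsync(log.fileno())  # block until the OS has written to disk
                fsyncs += 1
                pending = 0
        if pending:
            # Sync whatever is left over at the end.
            log.flush()
            os.fsync(log.fileno())
            fsyncs += 1
    return fsyncs


def demo(n_entries=100):
    """Write the same entries under two policies and compare fsync counts."""
    entries = [f"SET key{i} value{i}".encode() for i in range(n_entries)]
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "always.log")
        path_b = os.path.join(tmp, "batched.log")
        always = append_log(path_a, entries, fsync_every=1)
        batched = append_log(path_b, entries, fsync_every=25)
        with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
            same_contents = fa.read() == fb.read()
    return always, batched, same_contents
```

Both policies leave identical file contents behind; what differs is how many times the writer had to stop and wait for the disk, and how much data could be lost if the process died between syncs.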

The ‘appendfsync always’ mode is not an optimal configuration for a single process, since Redis has to block while it waits for the transaction log entry to be written to disk. A saner configuration would be ‘appendfsync everysec’, which writes to the transaction log asynchronously. Running the same benchmark using the default configuration yields the following results:

Which is a 39% improvement over the previous benchmarks where now:

Redis is 16.9x faster than RavenDB

Which, unless I hear otherwise, should make this the fastest NoSQL solution available for .NET or Mono clients.

Measuring raw write performance using Redis is a little unfair, since it has a batch operation, MSET, specifically optimized for this task. But that is just good practice: whenever you cross a process boundary you should be batching requests to minimize the number of calls, minimizing latency and maximizing performance.
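
The batching argument can be put into rough numbers with a simple cost model. This is a back-of-the-envelope sketch, not a measurement: the `round_trip_ms` and `per_item_ms` values in the usage note are made-up assumptions, and only the entity count (5,163) comes from the benchmark.

```python
def total_time_ms(n_items, batch_size, round_trip_ms, per_item_ms):
    """Estimate total time when each call pays a fixed round-trip cost
    plus a per-item processing cost. Returns (milliseconds, calls)."""
    calls = -(-n_items // batch_size)  # ceiling division
    return calls * round_trip_ms + n_items * per_item_ms, calls
```

For example, with an assumed 0.1 ms round trip and 0.01 ms per item, `total_time_ms(5163, 1, 0.1, 0.01)` pays the round trip 5,163 times, while `total_time_ms(5163, 5163, 0.1, 0.01)` pays it once; the per-item cost is the same in both cases.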

Even though performance is important, it’s not the only metric when deciding which NoSQL database to use. If you have a lot of querying and reporting requirements that you don’t know up front, then a document database like RavenDB, MongoDB or CouchDB is a better choice. Likewise, if you have minimal querying requirements and performance is important, then you would be better suited to using Redis. Either way, having a healthy array of vibrant choices available benefits everybody.

Notes about these benchmarks

Since these benchmarks just write entities in large batches to a local Redis or RavenDB instance using a single client, I don’t consider this to be indicative of a *real-world* test but rather a measure of raw write performance, i.e. how fast each client can persist 5,163 entities in its respective datastore.

A better *real-world* test would be one that accesses the server over the network using multiple concurrent clients that were benchmarking typical usage of a real-world application rather than just raw writes as done here.

So why is Redis so fast?

Based on the comments below there appears to be some confusion as to what Redis is and how it works. Redis is a high-performance data structures server written in C that operates predominantly in-memory and routinely persists to disk and maintains an Append-only transaction log file for data integrity – both of which are configurable.

For redundancy each instance has built-in support for replication, so you can turn any Redis instance into a slave of another, which can also be trivially configured at runtime. It also features its own Virtual Machine implementation, so if your dataset exceeds your available memory, infrequently used values are swapped out to disk whilst the hot values remain in memory.

Like other high-performance network servers, e.g. Nginx (the world’s fastest HTTP server), Node.js (a popular, very efficient web framework for JavaScript), Memcached, etc., it achieves maximum efficiency by having each Redis instance run in a single process where all IO is asynchronous and no time is wasted context-switching between threads. To learn more about this architecture, check out Douglas Crockford’s (of JavaScript and JSON fame) informative speech comparing event loops vs threading for simulating concurrency.

It achieves concurrency by being really fast and achieves integrity by having all operations atomic. You are not just limited to the available transactions either as you can compose any combination of Redis commands together and process them atomically in a single transaction.
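
As a rough illustration of composing commands into a single transaction, the sketch below encodes a MULTI ... EXEC block in the Redis multi-bulk request format so you can see what would go over the wire. The helper names `encode_command` and `encode_transaction` are hypothetical, the keys are made up, and nothing here talks to a server; a real client library would do this for you.

```python
def encode_command(*args):
    """Encode one command in the multi-bulk request format:
    *<argc> CRLF, then $<length> CRLF <bytes> CRLF for each argument."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def encode_transaction(commands):
    """Wrap a list of command tuples in MULTI / EXEC so the server
    queues them and runs them as one atomic unit."""
    wire = [encode_command("MULTI")]
    wire.extend(encode_command(*command) for command in commands)
    wire.append(encode_command("EXEC"))
    return b"".join(wire)
```

For instance, `encode_transaction([("INCR", "counter"), ("LPUSH", "log", "incremented")])` yields one byte string that can be sent in a single write, with the server applying both commands together when it reaches EXEC.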

Effectively if you wanted to create the fastest NoSQL data store possible you would design it just like Redis and Memcached. Big kudos to @antirez for his continued relentless pursuit of optimizations resulting in Redis’s stellar performance.

The Redis Client,  JSON and the Redis Admin UI

Behind the scenes the Redis Client automatically stores the entities as JSON string values in Redis. Thanks to the ubiquitous nature of JSON I was easily able to develop a Redis Admin UI which provides a quick way to navigate and introspect your data in Redis. The Redis Admin UI runs on both .NET and Linux using Mono – A live demo is available here.

Download Benchmarks

The benchmarks (minus the dependencies) are available in ServiceStack’s svn repo.

I also have a complete download with including all dependencies available here:
http://servicestack.googlecode.com/files/NoSqlPerformance.zip (18MB)

Gaining in Popularity

Redis is sponsored by VMware, has a vibrant community behind it and has been gaining a lot of popularity lately. Already with a library for every popular language in active use today, it is gaining momentum outside its Linux roots, with Twitter now starting to make use of it as well as popular .NET shops like the StackOverflow team taking advantage of it.

Unlike RavenDB and MongoDB, which are document-oriented data stores, Redis is a ‘data structures’ server which, although it lacks some of the native querying functionality found in document DBs, encourages you to leverage its comp-sci data structures to maintain your own custom indexes to satisfy all your querying needs.
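
A minimal sketch of what maintaining your own index can look like: each entity is stored as a JSON string, and a set per field value holds the ids that match. `TinyStore` is a hypothetical in-memory stand-in that mimics a few Redis string and set operations (SET, GET, SADD, SINTER), and the key names and user fields are made up for the example.

```python
import json


class TinyStore:
    """Hypothetical in-memory stand-in for Redis strings and sets."""

    def __init__(self):
        self.strings = {}
        self.sets = {}

    def set(self, key, value):
        self.strings[key] = value

    def get(self, key):
        return self.strings.get(key)

    def sadd(self, key, member):
        self.sets.setdefault(key, set()).add(member)

    def sinter(self, *keys):
        groups = [self.sets.get(k, set()) for k in keys]
        return set.intersection(*groups) if groups else set()


def save_user(store, user):
    # Store the entity itself as JSON under its own key...
    store.set(f"urn:user:{user['id']}", json.dumps(user))
    # ...and add its id to one index set per queryable field value.
    store.sadd(f"idx:user:country:{user['country']}", user["id"])
    store.sadd(f"idx:user:lang:{user['lang']}", user["id"])


def find_users(store, country, lang):
    """Answer 'country AND lang' by intersecting the two index sets."""
    ids = store.sinter(f"idx:user:country:{country}", f"idx:user:lang:{lang}")
    return [json.loads(store.get(f"urn:user:{i}")) for i in sorted(ids)]
```

The trade-off is that the application has to keep the index sets in step with the entities on every write, which is the work a document database's query engine would otherwise do for you.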

Try Redis in .NET

If these results have piqued your interest in Redis I invite you to try it out. If you don’t have a linux server handy, you can still get started by trying one of the windows server builds.

Included with ServiceStack is a feature-rich C# client which provides a familiar and easy to use C# API which like the rest of Service Stack runs on .NET and Linux with Mono.

Useful resources for using the C# .NET Client

I also have some useful documentation to help you get started:
Designing a NoSQL Database using Redis
A refactored example showing how to use Redis behind a repository pattern
Painless data migrations with schema-less NoSQL datastores and Redis
How to create custom atomic operations in Redis
Publish/Subscribe messaging pattern in Redis
Achieving High Performance, Distributed Locking with Redis

NoSQL for Time Series Data Benchmark

Speed of processing is an important aspect in choosing a data store for time series data. The faster the better.

So how fast is it when NoSQL meets time series data? The following is what I found.

Test env

Hardware/OS

  • Ubuntu 2.6.32-19-generic SMP 64bit
  • Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
  • 2G memory
  • 5400RPM hard disk

we have compared:

ruby client:

  • tokyocabinet-ruby-1.30
  • mongo 1.0.1 with BSON_ext
  • redis 1.0.4

benchmark code

I pushed it to github in case you want to roll your own results.

Results

write 1M records

build index

read last 30 days ohlc by symbol

read all ohlc by symbol

Storage size

The results clearly show that Tokyo Cabinet BDB was the winner.

Notes

This wasn’t a full benchmark for NoSQL database, but more specific for time series data.

Introducing the Redis Admin UI

Confident that I’ve optimized ServiceStack’s JSON web services performance enough with the adoption of my latest efforts developing .NET’s fastest JSON Serializer, I’m now turning my attention towards creating apps that take advantage of it.

I’m a firm believer that performance is one of, if not the, most important features in developing an app that most users will love and use on a regular basis.  It’s the common trait amongst all the apps and websites I regularly use, and is why I’m continually seeking software components and/or techniques that can help make my software run faster or, whenever there is no alternative, developing them myself. Having said this, I’m not a complete perf maniac and find that it’s important to strike a balance between productivity, utility and performance, which is what has effectively kept me tied to the C# language for all my server development.

Redis, Sweet Redis

One of the exciting movements in recent times is the introduction of the NoSQL suite of data persistence solutions. There are numerous impressive NoSQL solutions out there, but the one that I have been most interested in is Redis which, from the project’s website:

is an advanced key-value store. It is similar to memcached but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All this data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server side union, intersection, difference between sets, and so forth.

I found this fascinating since it provides an extremely fast data-store (that gets routinely persisted) supporting rich data-structures that can be safely accessed by multiple app servers concurrently since all operations are atomic. Sweet just what I always wanted – although to make it productive I developed a C# Redis Client that apart from supporting Redis’s entire feature-set also provides a high-level typed API that can persist any .NET POCO Type which gets persisted as JSON in Redis.

Admin tab showing redis instance info

Aggregate view of complex types

View single complex type

Redis Web Services

In order to be able to access Redis from a web page, some JSON web services are in order. I could’ve just implemented the services required by the Admin UI, although I wanted to flex some ServiceStack muscle, so I decided to create web services for all of Redis’s operations, which on final count totalled nearly 100 web services that I ended up knocking out over a single weekend. One of the benefits of using ServiceStack to develop your web services is that you get SOAP, XML, JSON and JSV endpoints for free. So after spending the next couple of days creating unit tests to provide 100% coverage, the back-end was complete, giving Redis CouchDB-like powers by allowing it to be accessed from any HTTP client.

Those interested in the Redis Web Services component can check out a live preview. The complete list of available web services is available here:

http://www.servicestack.net/RedisAdminUI/Public/Metadata

And some examples on how to call them:
http://www.servicestack.net/RedisAdminUI/Public/Json/SyncReply/GetServerInfo (JSON)
http://www.servicestack.net/RedisAdminUI/Public/Xml/SyncReply/SearchKeys?Pattern=urn:c* (XML)

Ajax UI

With the web services in place, it is now possible to build pure static html/js/css ajax apps talking directly to the servers’ JSON data services – with no other web framework required!

The closure-library although not as terse or as initially productive as jQuery really shines in building large applications. It has a good framework for developing and re-using JavaScript classes and modules and comes with a set of rich, well-tested, cross-browser-compatible widgets. So within a couple of weeks of hacking on the client I was able to churn out a fairly useful featureset:

  • A TreeView displaying a hierarchical view of the filtered redis keyset
  • Deep linking support so you can refresh, save or send a link of the entry you’re looking at
  • Back and forward button support
  • A tabular, aggregate view of all your ‘grouped keys’
  • An auto-complete filter to filter the tabular data
  • Updating and deleting of string values
  • Identifying the type, viewing and deleting of all keys
  • An admin interface to view redis server stats and the ability to destroy and rebuild the entire redis instance

Restrictions and Assumptions

In order to provide a useful generic UI I’ve had to make a few assumptions about the conventions used. Coincidentally these also happen to be the same conventions that ServiceStack’s C# Redis Client uses when storing data :-) .

  1. Key parts should be separated with a ‘:’
  2. Keys within the same group are expected to be of the same type
  3. Complex types are stored as JSON

There are likely to be others I’ve subconsciously used so I’ll make an effort to keep this list of assumptions live.
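
To show how these conventions make a generic UI possible, here is a small sketch that turns a flat list of keys into a nested tree and decodes JSON values. It is an illustration of the idea only, not the Admin UI's actual code (which is JavaScript): the function names and the sample keys are assumptions.

```python
import json


def build_key_tree(keys, sep=":"):
    """Nest keys by their ':'-separated parts, e.g. for a tree view."""
    tree = {}
    for key in keys:
        node = tree
        for part in key.split(sep):
            node = node.setdefault(part, {})
    return tree


def group_of(key, sep=":"):
    """The 'group' of a key is everything before its last part."""
    return key.rsplit(sep, 1)[0] if sep in key else key


def decode_value(raw):
    """Complex types are stored as JSON; fall back to the raw string."""
    try:
        return json.loads(raw)
    except ValueError:
        return raw
```

Because keys within the same group are expected to hold the same type, everything under a group such as `urn:customer` can be decoded the same way and shown as rows of one table.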

Download and installation

Like the rest of ServiceStack the Redis Admin UI is Open source released under the liberal new BSD licence.

In keeping with tradition with most of my software, the Redis Admin UI works cross-platform on Windows with .NET and Linux and OSX using Mono (Live demo is hosted on CentOS/Nginx).
I’ve had an attempt at some basic installation instructions that are included in the download and viewable online.

The latest version is hosted on ServiceStack’s code project site at the following url:

http://servicestack.googlecode.com/files/RedisAdminUI.zip

The Admin UI is highly customizable and very hackable since it’s written entirely in JavaScript, so if you are interested in customizing the UI for your own purposes I invite you to get started by downloading the development version from svn trunk.
