Archive for November, 2010


New Tools in AS3

This one took a little longer than I’d hoped – life’s too busy right now! In this third part of my series on resource management in ActionScript 3 I will be focusing on a few new tools in AS3 / Flex2 that let you track and manage memory more effectively. There are only a couple of new “official” features that are specifically geared towards resource management, but they are very useful. These are supplemented by a handy unofficial feature, and of course there are lots of language features that can help you in more generic ways.

System.totalMemory
This is a simple tool, but it’s important because it marks the first run-time profiling tool developers have had in Flash. It allows you to monitor how much memory is in use by the Flash player at run-time. This gives you some ability to profile your own work during development without a system monitor. More importantly, it lets you preemptively deal with major memory leaks in your content before they cause a serious issue for your user. It’s better to throw an error and abort your application than to bog down the user’s system or even stall it completely.

Here’s a simple example of what this could look like:

import flash.system.System;
import flash.net.navigateToURL;
import flash.net.URLRequest;
import flash.utils.setInterval;
import flash.utils.clearInterval;

// check our memory every 1 second:
var checkMemoryIntervalID:uint = setInterval(checkMemoryUsage, 1000);

var showWarning:Boolean = true;
var warningMemory:uint = 1000*1000*500;
var abortMemory:uint = 1000*1000*625;

function checkMemoryUsage():void {
   if (System.totalMemory > warningMemory && showWarning) {
      // show an error to the user warning them that we're running out of memory and might quit
      // try to free up memory if possible
      showWarning = false; // so we don't show an error every second
   } else if (System.totalMemory > abortMemory) {
      // save current user data to an LSO for recovery later?
      abort();
   }
}
function abort():void {
   // stop checking, so we only navigate away once:
   clearInterval(checkMemoryIntervalID);
   // send the user to a page explaining what happened:
   navigateToURL(new URLRequest("memoryError.html"));
}

This could obviously be enhanced in a number of ways, but hopefully demonstrates the basic concept well.
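
One possible enhancement is to pull the threshold logic out into something that can be exercised without a running player. The sketch below is a language-neutral illustration written in Python rather than ActionScript; the `MemoryWatchdog` name and the sample readings are made up for the example, and it checks the abort threshold first, which is a design choice of this sketch rather than something the example above does.

```python
from dataclasses import dataclass

# Same thresholds as the ActionScript example above.
WARNING_BYTES = 1000 * 1000 * 500
ABORT_BYTES = 1000 * 1000 * 625


@dataclass
class MemoryWatchdog:
    """Hypothetical helper: remembers whether the one-off warning was shown."""
    warned: bool = False

    def check(self, total_memory: int) -> str:
        # Abort takes priority, so a sudden spike is never treated as a mere warning.
        if total_memory > ABORT_BYTES:
            return "abort"
        if total_memory > WARNING_BYTES and not self.warned:
            self.warned = True  # warn only once
            return "warn"
        return "ok"


watchdog = MemoryWatchdog()
readings = [100_000_000, 510_000_000, 520_000_000, 700_000_000]  # invented samples
actions = [watchdog.check(reading) for reading in readings]
print(actions)
```

Keeping the decision in a pure function like this makes it easy to feed in fake readings during development, while the timer callback only has to pass in the real value.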

It is important to note that totalMemory is a shared value within a single process. A single process may be just one browser window, or all open browser windows, depending on the browser, the OS, and how the windows were opened (ex. in OSX all Safari windows share a single process and totalMemory value, whereas it is much more convoluted in Windows).

Weak References
One of the features I’m really happy to see implemented in AS3 is weak references. These are references to objects that are not counted by the Garbage Collector in determining an object’s availability for collection. That is, if the only references remaining to an object are weak, that object will be removed by the GC on its next pass. Unfortunately, weak references are only supported in two contexts. The first is event listeners, which is great because event listeners are one of the most common references that cause problems with garbage collection. I strongly recommend always using weak references with listeners by passing true as the fifth parameter of your addEventListener calls:

someObj.addEventListener("eventName",listenerFunction,useCapture,priority,weakReference);
stage.addEventListener(MouseEvent.CLICK,handleClick,false,0,true);
// the reference back to handleClick (and this object) will be weak.

To learn more, read my article on weakly referenced listeners.

The other place that weak references are supported is in the Dictionary object. Simply pass true as the first parameter when you instantiate a new Dictionary to have it use weak references as its keys:

var dict:Dictionary = new Dictionary(true);
dict[myObj] = myOtherObj;
// the reference to myObj is weak, the reference to myOtherObj is strong

To learn more about using dictionaries in ActionScript 3, read my article about the Dictionary Object in AS3. One of the cool things about having weak reference support in Dictionary is that we can hook into it to use weak references in other contexts. For example, I used Dictionary to create WeakReference and WeakProxyReference classes that can be used to create weak references anywhere.
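
For readers who want to poke at the weak-key behaviour outside of Flash, here is a rough analogy in Python (not ActionScript). Python's `weakref.WeakKeyDictionary` behaves much like `new Dictionary(true)`: keys are held weakly and values strongly. The `Thing` class and variable names are invented for the illustration, and the immediate cleanup shown relies on CPython's reference counting; the Flash GC gives no such timing guarantee.

```python
import gc
import weakref


class Thing:
    """Plain class; its instances can be weakly referenced and used as keys."""

    def __init__(self, name):
        self.name = name


registry = weakref.WeakKeyDictionary()  # keys are weak, values are strong

my_obj = Thing("key")
my_other_obj = Thing("value")
registry[my_obj] = my_other_obj

size_while_key_alive = len(registry)

del my_obj     # drop the only strong reference to the key
gc.collect()   # CPython frees it at once; the call covers other runtimes

size_after_key_dropped = len(registry)
print(size_while_key_alive, size_after_key_dropped)
```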

WeakReference class
WeakReference takes advantage of Dictionary to allow you to hold a weak reference to any object within any context. It has a small amount of overhead for instantiation and access, so I would only recommend using it for potentially large objects that may not be properly freed. This should not replace writing code that cleans up properly after itself, but it can help you to ensure large data objects are freed properly for garbage collection.

I based the ActionScript 3 WeakReference class on the WeakReference class for Java. To use it, you simply instantiate a new WeakReference, passing the referent (object you wish to reference) as the first parameter. You then store a strong reference to the instance of WeakReference, not the referent, and access the referent via the get() method of WeakReference.

import com.gskinner.utils.WeakReference;
var dataModelReference:WeakReference;
function registerModel(data:BigDataObject):void {
   dataModelReference = new WeakReference(data);
}
...
function doSomething():void {
   // get a local, typed reference to the data:
   var dataModel:BigDataObject = dataModelReference.get() as BigDataObject;
   // call methods, or access properties of the data object:
   dataModel.doSomethingElse();
}

For well architected code, this is a good solution, because it lets you maintain type safety, and is non-ambiguous. For those who just want to hack code together quickly, I also put together the WeakProxyReference class (which was a great learning experience for the new Proxy object).
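
The general idea of building a weak reference holder on top of a weak-keyed dictionary can be sketched as follows. This is a Python analogy, not the source of the downloadable class: the internals shown here are an assumption about how such a wrapper might work, and `BigDataObject` is a stand-in. Note the check for a missing referent before use, since the object may already have been collected.

```python
import gc
import weakref


class WeakReference:
    """Illustrative wrapper: stores the referent only as a weak dictionary key."""

    def __init__(self, referent):
        self._holder = weakref.WeakKeyDictionary()
        self._holder[referent] = True

    def get(self):
        # Return the referent if it is still alive, otherwise None.
        for referent in self._holder.keys():
            return referent
        return None


class BigDataObject:
    def do_something_else(self):
        return "working"


data = BigDataObject()
data_model_reference = WeakReference(data)

# Take a local strong reference, and check it before use.
model = data_model_reference.get()
result_while_alive = model.do_something_else() if model is not None else None

del data, model   # release every strong reference
gc.collect()

result_after_release = data_model_reference.get()
print(result_while_alive, result_after_release)
```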

WeakProxyReference class
WeakProxyReference uses the new Proxy class to transparently wrap a weakly referenced object. It works the same as WeakReference, except that you can call methods on the weak reference object directly and have them passed to the referent. The main issues are the loss of type safety, and slightly more ambiguous code. Note that it will still throw appropriate run-time errors (ex. if you try to access a non-existent property on the object), but not compile-time errors.

import com.gskinner.utils.WeakProxyReference;
var dataModel:Object; // note that it is untyped, and not named as a reference
function registerModel(data:BigDataObject):void {
   dataModel = new WeakProxyReference(data);
}
function doSomething():void {
   // don't need to get() the referent, you can access members directly on the reference:
   dataModel.doSomethingElse();
   dataModel.length++;
   delete(dataModel.someProperty);

   // if you do need access to the referent, you need to use the weak_proxy_reference namespace:
   var bdo:BigDataObject = dataModel.weak_proxy_reference::get() as BigDataObject;
}
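
Python ships a comparable transparent wrapper, `weakref.proxy`, which makes the trade-off easy to try out. The sketch below is an analogy in Python rather than the ActionScript class itself, with an invented `BigDataObject`. Member access is forwarded to the referent, and once the referent is gone any access raises `ReferenceError` at run time, much like the loss of compile-time checking described above.

```python
import gc
import weakref


class BigDataObject:
    def __init__(self):
        self.length = 0

    def do_something_else(self):
        return "working"


data = BigDataObject()
data_model = weakref.proxy(data)   # untyped, transparent weak wrapper

# Members are accessed directly on the proxy and forwarded to the referent.
call_result = data_model.do_something_else()
data_model.length += 1
length_seen_on_referent = data.length

del data       # drop the only strong reference
gc.collect()

try:
    data_model.do_something_else()
    proxy_still_usable = True
except ReferenceError:
    # The referent has been collected, so the proxy can no longer forward calls.
    proxy_still_usable = False

print(call_result, length_seen_on_referent, proxy_still_usable)
```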

You can download WeakReference and WeakProxyReference at the end of this article.

Unsupported Way to Force GC
In my previous article on Garbage Collection in AS3 I said that the GC is indeterminate – that there is no way to know when it will run. That is not entirely true: there is a trick that will let you force the Flash player to carry out a full GC pass. This trick can be really handy for exploring Garbage Collection, and testing your applications during development, but it should never be deployed in production code because it can wreak havoc with processor load. It is also officially unsupported, so you cannot rely on it to work in updated versions of the player.

To force an immediate GC mark/sweep, all you have to do is call connect() on two LocalConnections with the same name. This will throw an error, so you’ll have to wrap it in a try/catch block.

try {
   new LocalConnection().connect('foo');
   new LocalConnection().connect('foo');
} catch (e:*) {}
// the GC will perform a full mark/sweep on the second call.

Again, this should only be used as a development aid. It should never be used in production code! There is an example of this in use in the demo FLAs for the WeakReference class, which you can download at the end of this article.

Summary
ActionScript 3 has substantially increased the amount of work developers must do to manage resources in their applications. While we have only been provided with a few new tools to help us with this, they are better than nothing and signify that Adobe is at least aware of the issue. Pairing these new tools with effective strategies and approaches (the subject of my next article) should allow you to successfully manage resources in Flash 9 and Flex 2 projects.

Download WeakReference and WeakProxyReference.

A big thank you to Thomas Reilly for proofreading this article, and ensuring I’m not misleading the rest of the Flash world too badly. Any errors are mine, not his.

Resource Management Issues in FP9

ActionScript 3 has empowered Flash developers with faster code execution and a ton of API enhancements. Unfortunately, it has also led to the need for a much higher level of developer responsibility than ever before. In order to prepare and educate developers on how to deal with some of this new responsibility, I am writing a series of articles on resource management in AS3, Flex 2, and Flash 9. The first of these articles discussed the mechanics of the Garbage Collector in Flash Player 9. This article will focus on the implications some of the new features of AS3 have on resource management, and the potential headaches they could cause you even in simple projects. The next article in the series will introduce some of the new tools we have at our disposal to deal with these issues.

The biggest change in AS3 that affects resource management is the new display list model. In Flash Player 8 and below, when a display object was removed from the screen (with removeMovieClip or unloadMovie), it and all of its descendants were immediately removed from memory, and halted all code execution. Flash Player 9 introduces a much more flexible display list model, where display objects (Sprites, MovieClips, etc) are treated the same as normal objects. This means that developers can now do really cool things like reparenting (moving a DO from one display list to another), and instantiating display objects from loaded SWFs. Unfortunately, it also means that display objects are now treated the same as every other object by the Garbage Collector, which raises a whole slew of interesting (and possibly non-obvious) issues.

Issue 1: Dynamic Content
One of the more obvious issues is related to Sprites (or other DOs) that you instantiate dynamically, then wish to remove at a later time. Because display objects no longer live and die on the display list, when you remove the object from the stage it continues to exist in memory. If you have not cleaned up all other references to the clip, including the object’s listeners, it may never be removed. If you have done a good job of cleaning up all references, the clip will be removed from memory the next time the GC runs a sweep, which is at some indeterminate point in the future based loosely on memory usage (see my previous article on Garbage Collection in AS3 for more information).

It is very important to note that not only will the display object continue to use memory, it will also continue to execute any “idle” code, such as Timers, enterFrames, and listeners outside its scope. A couple of examples may help illustrate this issue:

  1. You have a game sprite that subscribes to its own enterFrame event. Every frame it moves and carries out some calculations to determine its proximity to other game elements. In AS3, even after you remove it from the display list and null all references to it, it will continue to run that code every frame until it is removed by garbage collection. You must remember to explicitly remove the enterFrame listener when the sprite is removed.
  2. Consider a MovieClip that follows the mouse by subscribing to the stage’s mouseMove event (which is the only way to achieve this effect in the new event model). Unless you remember to remove the listener, the clip will continue to execute code every time the mouse is moved, even after the clip is “deleted”. By default, the clip will execute forever, as a reference to it exists from the stage for event dispatch (we will look at how to avoid this in the next article).

Now imagine the implications of instantiating and removing a bunch of sprites before the GC does a sweep, or if you failed to remove all references. You could inadvertently max out the CPU fairly easily, slowing your application or game to a crawl, or even stalling the users’ computers entirely. There is NO WAY to force the Flash Player to kill a display object and stop it executing. You must do this manually when it is removed from the display. I will examine strategies to manage this task in a future article.
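
The mechanics behind both examples can be modelled in a few lines. The following is a toy simulation in Python, not Flash code: `Dispatcher`, `GameSprite` and `dispose` are hypothetical names, and the dispatcher simply stands in for the stage holding a strong reference to each listener. It shows a sprite that keeps running after being taken off the display list, and stops only once its listener is removed.

```python
class Dispatcher:
    """Stand-in for the stage: holds strong references to its listeners."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def remove_listener(self, callback):
        self._listeners.remove(callback)

    def dispatch(self):
        for callback in list(self._listeners):
            callback()


class GameSprite:
    def __init__(self, stage):
        self.frames_processed = 0
        self._stage = stage
        stage.add_listener(self.on_enter_frame)

    def on_enter_frame(self):
        self.frames_processed += 1   # the per-frame "idle" work

    def dispose(self):
        # Explicit cleanup: the step that is easy to forget.
        self._stage.remove_listener(self.on_enter_frame)


stage = Dispatcher()
display_list = [GameSprite(stage)]

stage.dispatch()                 # frame 1, sprite is on the display list
sprite = display_list.pop()      # "removed" from the display (kept here only to inspect it)
stage.dispatch()                 # frame 2
stage.dispatch()                 # frame 3
frames_after_removal = sprite.frames_processed

sprite.dispose()                 # remove the listener
stage.dispatch()                 # frame 4
frames_after_dispose = sprite.frames_processed
print(frames_after_removal, frames_after_dispose)
```

The counter keeps climbing after the sprite leaves the display list because the dispatcher still holds its callback; only the explicit `dispose()` call stops the work.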

Here’s a simple example (Flash Player 9 required). Click the “create” button to create a new Sprite instance. The sprite instance will start outputting a counter. Click remove and note how the output continues, despite the fact that all references to the sprite have been nulled. You can create multiple instances to see how this issue compounds over the life of an application. Source code is available at the end of this article.

Issue 2: Loaded Content
If you consider that the contents of loaded SWFs are also now treated the same as every other object, it is easy to imagine some of the problems that you can encounter with loaded content. Just as with other display objects, there is no way to explicitly remove a loaded SWF and its contents from memory, or to stop it from executing. Calling Loader.unload simply nulls the loader’s reference to the SWF; it still has to be picked up by a GC sweep (assuming all other references to it have been properly cleared).

Consider the following two scenarios:

  1. You build a shell that loads your experimental Flash pieces. This experimental work is cutting edge, and pushes the CPU to the limits. A user clicks a button to load one experiment, views it, then clicks a button to load a second experiment. Even if all references are cleared to the first experiment, it will continue to run in the background, which will likely max the processor out when the second experiment starts running at the same time.
  2. A client commissions you to build an application that loads AS3 SWFs created by other developers. These developers add listeners to the stage, or otherwise create external references to their own content. You now have no way of unloading their content: it will live in memory and consume CPU until the user quits your application. Even if their content does not have any external references, it will continue to execute indefinitely until the next garbage collection sweep.

For security sensitive projects, it is very important to understand that if you load third party content, you have no way of controlling when it is unloaded, or what it executes. It would be VERY easy to write a SWF that once loaded into your application continued to execute in the background and captured / transmitted user input / interactions, even after being unloaded.

Another simple example. Exactly the same scenario as in Issue 1, but loading a SWF each time instead of using dynamic instantiation.

Issue 3: The Timeline
This one is mostly an issue for those playing with Flash 9 public alpha, and it is important to remember that it is alpha. Hopefully this gets resolved before beta or final.

The timeline in AS3 is code driven. It dynamically instantiates and removes display objects as the playhead moves. This means it is subject to the same problems as those listed in issue 1. Clips that are removed from the screen due to timeline updates will remain in memory, and continue to execute any idle code on them until they are picked up by the GC. Not really expected behaviour for Flash developers, and certainly not for Flash designers (who shouldn’t have to think about GC at all).

Here’s another quick example. Same concept, but this time it is jumping between two frames to instantiate / remove a clip.

What is Adobe thinking? Or alternatively, why is this an issue?
Flash Developers are likely looking at this, and thinking WTF, this is a nightmare!? On the other hand, Java developers are probably looking at it and saying “so what?”. This disparity is understandable – Flash developers are not used to having to do manual resource management beyond basic best practices (ex. kill references when you’re done), whereas Java developers have been through all this before. These issues are par for the course for most modern memory managed languages, and unfortunately there is no way to completely avoid them.

On the other hand, Flash raises a lot of challenges that are rare in other languages (including Flex for the most part). Flash content tends to have a lot of idle / reactive code execution, whereas Java and Flex are mostly interactive (ie. CPU intensive code usually only executes based on a user interaction). Flash projects load external content from third party sources (possibly with poor coding standards) far more often as well. Flash developers also have fewer tools, profilers and frameworks to utilize. Finally, Flash developers generally come from a much less formal programming background – most Flash developers I know have backgrounds in Music, Art, Business, Philosophy or just about anything but programming. This diversity results in AWESOME creativity and content, but does not really prepare the community for dealing with resource management issues.
Summary
Resource management is going to be an important part of AS3 development. Ignoring the issue could result in sluggish content, and potentially stalling users’ systems completely. There is no longer any way to explicitly remove a display object from memory and stop its code from executing, which means we have a responsibility to clean up properly after our objects. Over the next few weeks I will outline some tools and strategies for tackling these issues. Hopefully as a community we can establish best practices and frameworks to make this transition easier.

You can download the source code for the demos by clicking here.

I will continue to update these articles with the latest information and community input as it becomes available, so you may want to check back occasionally.

The FP9 Garbage Collector

I’ve been playing around with AS3 for a while now, and am really excited by its capabilities. The raw execution speed by itself will create so many possibilities. Toss in E4X, Sockets, ByteArrays, new display list model, RegEx, a formalized event and error model, and a few dozen other features for flavour, and you have a pretty heady brew.

With great power comes great responsibility, and this will be very true for AS3. A side effect of all this new control is that the Garbage Collector is no longer able to make as many assumptions about what it should tidy up automatically for you. This means that Flash developers moving to AS3 will need to develop a very strong understanding of how the GC operates, and how to work with it effectively. Building even seemingly simple games or applications without this knowledge can easily result in SWFs that can leak like a sieve, hogging all of a system’s resources (CPU/RAM), and causing the user’s system to hang (potentially even forcing them to hard reboot their computer).

Over the next few weeks (or even months), I will be writing a series of articles on this topic. I will look at the underlying mechanics of the Garbage Collector, discuss the issues you are likely to face, examine the new tools available to you for handling resource management in AS3, and offer solutions/code to help you circumvent many of the common problems you will face.

I’ll begin at the beginning, and look at the Garbage Collector in the Flash 9 player.

About the Garbage Collector
The garbage collector is a behind-the-scenes process that is responsible for deallocating the memory used by objects that are no longer in use by the application. An inactive object is one that no longer has any references to it from other active objects. In order to understand this, it is very important to realize that when working with non-primitive types (anything other than Boolean, String, Number, uint, int), you are always passing around a reference to the object, not the object itself – deleting a variable removes the reference, not the object. This is easily demonstrated:

// create a new object, and put a reference to it in a:
var a:Object = {foo:"bar"};
// copy the reference to the object into b:
var b:Object = a;
// remove the reference to the object in a (delete only works on
// dynamic properties in AS3, so declared variables are set to null):
a = null;
// check to see that the object is still referenced by b:
trace(b.foo); // traces "bar", so the object still exists.

If I were to delete “b” as well in the example above, it would leave my object with no active references and free it for garbage collection. The AS3 GC uses two methods for locating objects with no active references: Reference counting and mark sweeping.
Reference Counting
Reference counting is one of the simplest methods for keeping track of active references, and has been around in Flash since AS1. When you create a reference to an object its reference count is incremented. When you delete a reference, its reference count is decremented. If the reference count of an object reaches zero, it is marked for deletion by the GC. For example:

var a:Object = {foo:"bar"};
// the object now has a reference count of 1 (a)
var b:Object = a;
// now it has a reference count of 2 (a & b)
a = null; // declared variables are cleared with null rather than delete
// back to 1 (b)
b = null;
// down to 0, the object can now be deallocated by the GC

Reference counting is simple, doesn’t carry a huge CPU overhead, and works well in most situations. Unfortunately it really falls down when it comes to circular referencing. This is when objects cross-reference each other (directly, or indirectly via other objects). Even if the application is no longer actively using the objects, their reference counts remain above zero, so they are never removed. Here’s a quick demo:

var a:Object = {};
// create a second object, and reference the first object:
var b:Object = {foo:a};
// make the first object reference the second as well:
a.foo = b;
// clear both active application references (set to null, since
// delete only works on dynamic properties in AS3):
a = null;
b = null;

In the above example, both of my active application references have been deleted. I no longer have any way of accessing the two objects from my application, but their reference counts are both 1 because they reference each other. This can also be much more complex (a references c which references b which references a, etc), and is hard to deal with in code. Flash player 6 and 7 suffered from problems related to circular referencing in XML objects – each XML node referenced both its children and its parent, so they were never deallocated. Fortunately, player 8 added a new GC technique called mark sweeping.
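
The same failure mode can be observed directly in CPython, which also pairs reference counting with a separate cycle collector. The sketch below is a Python analogy and says nothing about Flash Player internals; `Node` and `probe` are names invented for the example. With the cycle collector switched off, the two objects survive on reference counts alone, and a manual collection pass is needed to reclaim them.

```python
import gc
import weakref


class Node:
    pass


gc.disable()   # leave only reference counting active for the moment

a = Node()
b = Node()
a.foo = b      # a references b
b.foo = a      # b references a: a circular reference

probe = weakref.ref(a)   # lets us observe the object without keeping it alive

del a
del b
# No application references remain, but each object still has a count of 1.
alive_with_refcounting_only = probe() is not None

gc.collect()   # run the cycle collector by hand
alive_after_cycle_collection = probe() is not None

gc.enable()
print(alive_with_refcounting_only, alive_after_cycle_collection)
```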
Mark Sweeping
The second strategy employed by the AS3 (and fp8) GC to find inactive objects is mark sweeping. The player starts at the root node of your application (which is conveniently the “root” in AS3), and walks through every reference on it, marking each object it finds. It then iterates through each of the marked objects, marking their children. It continues this recursively until it has traversed the entire object tree of your application, marking everything it finds. At the end of this process, it can safely assume that any objects in memory that are not marked no longer have any active references to them, and can be safely deallocated. You can see how this works in the diagram below (green references were followed during mark sweeping, green objects are marked, white objects will be deallocated).

Mark sweeping is very accurate, but because it has to traverse your entire object structure, it is also costly in terms of CPU usage. Flash player 9 reduces this cost by carrying out iterative mark sweeping (ie. it occurs over a number of frames, instead of all at once), and by only having it run occasionally.
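
To make the traversal concrete, here is a toy mark-and-sweep pass in Python over a hand-written object graph. It is a simplified model, not the player's implementation: the heap is just a dictionary of names, the object names are invented, and the whole pass runs in one go rather than being spread over several frames as described above.

```python
def mark_and_sweep(heap, roots):
    """Toy collector. `heap` maps an object name to the names it references."""
    # Mark phase: follow references outward from the roots.
    marked = set()
    pending = list(roots)
    while pending:
        name = pending.pop()
        if name in marked:
            continue
        marked.add(name)
        pending.extend(heap[name])

    # Sweep phase: anything that was never marked is unreachable.
    swept = sorted(name for name in heap if name not in marked)
    for name in swept:
        del heap[name]
    return swept


heap = {
    "root": ["menu", "player"],
    "menu": [],
    "player": ["root"],   # a reference back to the root is harmless
    "a": ["b"],           # a and b only reference each other,
    "b": ["a"],           # so reference counting alone would keep them
}

collected = mark_and_sweep(heap, ["root"])
survivors = sorted(heap)
print(collected, survivors)
```

Because reachability is decided from the root, the circular pair is collected even though each member still has a non-zero reference count.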

Deferred GC and Indeterminacy
A *very* important thing to understand about the Garbage Collector in FP9 is that its operations are deferred. Your objects will not be removed immediately when all active references are deleted; instead they will be removed at some indeterminate time in the future (from a developer standpoint). The GC uses a set of heuristics that look at RAM allocation and the size of the memory stack (among other things) to determine when to run. As a developer, you must accept the fact that you will have no way of knowing when (or even if) your inactive objects will get deallocated. You must also be aware that inactive objects will continue to execute indefinitely (until the GC deallocates them), so code will keep running (ex. enterFrames), sounds will keep playing, loads will keep happening, events will keep firing, etc.

It’s very important to remember that you have no control over when your objects will be deallocated, so you must make them as inert as possible when you are finished with them. Strategies to manage this will be the focus for a future article.

There are a number of conceivable reasons to set up a RamDisk on your Media Center PC. The most plausible is the performance gained by using a RamDisk rather than a hard disk to store the temporary files created for time-shifting live TV. Instead of writing and erasing this data on your hard disk, which is a fair bit slower than RAM and suffers unnecessary wear, we can set up a virtual drive that uses RAM. The downside of a RamDisk is that its data must be saved to the hard drive before shutting down or it will be lost forever. Also, with Windows 7 Media Center, the files we are going to use this drive for are quite large and require somewhat extreme amounts of RAM. For instance, an hour of SD programming requires 3-4 GB and an hour of HD programming approximately 7-8 GB. This means you’ll need around 16 GB of RAM on top of your base system RAM to keep a 2 hour buffer for HD programming.

If you meet this steep level of entry, then let’s get started. The first thing you’ll need is a copy of QSoft’s RamDisk. Navigating their website can be somewhat tricky, so we’ve provided copies of the RamDisk x64 Evaluation and Preferences program for download. A copy of the non-time-limited software costs as little as $12, which is extraordinary as well. The installation process is somewhat difficult for novice users, so let’s go over it in detail. Extract the software to your hard drive and open your Control Panel. Next go to Hardware and Sound, then open Device Manager. Continue reading “Setting Up A RamDisk Scratch Drive for Windows 7 Media Center” »

1. Totals
1.1. Handles

Handle: a kind of smart pointer. A handle is an abstract concept used to describe how an application accesses blocks of memory.
The Windows API uses many handles to represent the many objects in Windows; they are the link between the operating system and user space. Continue reading “Hiểu hơn Tab Performance trong Task Manager để sử dụng RAM hiệu quả” (Understanding the Performance Tab in Task Manager to Use RAM Effectively) »

If you have a home network and are running Windows 7 and have XP on other PC(s) you might want to share files between them.  Today we will look at the steps to share files and hardware devices like a printer.

Sharing Files In Windows 7 and XP

Sharing folders between two Windows 7 machines with the new HomeGroup feature is an easy process, but the HomeGroup feature is not compatible with Vista or XP.  For this tutorial we are using Windows 7 x64 RC1 and XP Professional SP3 connected through a basic Linksys home wireless router. Continue reading “Share Files and Printers between Windows 7 and XP” »

While preparing a different post related to this topic, I noticed that administrative shares were not working on my Windows 7 workgroup computer. Once the computer became part of a domain, they started working for the domain administrator, but in a workgroup, no luck. After several tries and searches, the following method helped me fix the issue.

By default, all local drives (partitions) are shared for administrators to access over the network, even when they have not been explicitly shared. These shares are called admin or administrative shares in Windows 7. Normally they can be accessed by typing the computer’s IP address or name followed by the partition letter and a dollar sign ($), as shown below. Continue reading “Fix it Now, Administrative Shares (C$) Not Working in Windows 7 Workgroup Computer” »

On 24 August 1998, a very special funeral was held in Gia Tường (Jiaxiang) County, Shandong Province (China). The deceased was a girl of just 16 named Thẩm Xuân Linh.

Yet she received the most solemn rites the village could offer. Her older brothers wore the mourning robes that are worn only at the funeral of one’s own father. They knelt for a long time before their sister’s coffin, and everyone in the village wore a mourning band.

But no one knew that this sixteen-year-old girl in fact had no blood ties to the surviving family or to the people of the village. She was only the stepmother’s daughter from an earlier marriage, and her name did not even appear in the village’s household register. Continue reading “Chuyện cô gái 16 tuổi làm cảm động cả đất trời” (The story of the 16-year-old girl who moved heaven and earth) »

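I wanted to compare a few DBs, NoSQL stores and caching solutions for speed and connection handling. The ones that come up in the discussion below are MySQL (MyISAM and InnoDB), Memcached, Redis, Tokyo Tyrant and MongoDB.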

My test had the following criteria

  • 2 client boxes
  • All clients connecting to the server using Python
  • Used Python’s threads to create concurrency
  • Each thread made 10,000 open-close connections to the server
  • The server was
    • Intel(R) Pentium(R) D CPU 3.00GHz
    • Fedora 10 32bit
    • 2.6.27.38-170.2.113.fc10.i686 #1 SMP
    • 1GB RAM
  • Used a md5 as key and a value that was saved
  • Created an index on the key column of the table
  • Each server had SET and GET requests as a different test at same concurrency
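
A harness along these lines can be sketched as below. This is a minimal illustration under stated assumptions, not the attached source: `FakeConnection` is a hypothetical in-memory stand-in for a real client (it never touches the network), the value stored is simply the key, and the thread and iteration counts are scaled down from the 10,000 connections per thread used in the tests.

```python
import hashlib
import threading
import time


class FakeConnection:
    """Hypothetical stand-in for a real client connection (no network involved)."""

    store = {}
    lock = threading.Lock()

    def set(self, key, value):
        with FakeConnection.lock:
            FakeConnection.store[key] = value

    def close(self):
        pass


def worker(thread_id, iterations, counts):
    for i in range(iterations):
        # An md5 hex digest is used as the key, as in the criteria above.
        key = hashlib.md5(f"{thread_id}-{i}".encode()).hexdigest()
        conn = FakeConnection()   # open
        conn.set(key, key)        # one SET per connection
        conn.close()              # close
    counts[thread_id] = iterations


def run_set_test(thread_count, iterations):
    counts = {}
    threads = [
        threading.Thread(target=worker, args=(n, iterations, counts))
        for n in range(thread_count)
    ]
    started = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - started
    total = sum(counts.values())
    throughput = total / elapsed if elapsed > 0 else float("inf")
    return total, throughput


total_requests, requests_per_second = run_set_test(thread_count=4, iterations=1000)
print(total_requests, round(requests_per_second))
```

Swapping `FakeConnection` for a real client would turn the loop into an open, SET, close cycle against a live server, which is the pattern the throughput figures are meant to measure.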

Results please!

[Figures: work sheet; throughput (SET); throughput (GET)]

I wanted to simulate a situation where I had 2 servers (clients) serving my code, which connected to 1 server (memcached, redis, or whatever). Another thing to note is that I used Python as the client in all the tests; the tests would definitely have given different output had I used PHP. Again, the test was done to check how well the clients could make and break connections to the server, and I wanted the overall throughput after making and breaking the connections. I did not monitor the response times. I didn’t change any parameters on the servers, e.g. I didn’t change innodb_buffer_pool_size or key_buffer_size.

MySQL

MySQL lagged terribly throughout. I monitored the MySQL server via MySQL Administrator and found that there were hardly any concurrent inserts or selects. I could see unauthenticated users, which meant that the client had connected to MySQL and was doing the handshake using MySQL authentication (username and password). As you can see, I didn’t even perform the 40 and 60 thread tests.

I truncated the table before I switched my tests from MyISAM to InnoDB, and always started the tests with the fewest threads. My table was as follows:

CREATE TABLE `comp_dump` (
  `k` char(32) DEFAULT NULL,
  `v` char(32) DEFAULT NULL,
  KEY `ix_k` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

NoSQL

For Tokyo Tyrant I used a file.tch as the DB, which is a hash database. I also tried MongoDB, as you may have found if you opened the worksheet, but the server kept failing; more precisely, mongod failed after hitting an unhandled exception. I found something similar over here. I tried 1.0.1, 1.1.3 and the available nightly build, but all failed and I lost my patience.

Now what

If you need speed just to fetch data for a given combination or key, Redis is a solution that you need to look at. MySQL can in no way compare to Redis and Memcache. If you find Memcache good enough, you may want to look at Tokyo Tyrant, as it does synchronous writes. But you need to check for your own application which server/combination suits you best. In Marathi there is a saying “मेल्या शिवाय स्वर्ग दिसत नाही”, which means “You can’t see heaven without dying”: you need to do your own hard work, you can’t escape that ;)

I’ve attached the source code used for the tests. If anybody has any doubts or questions, feel free to ask.

Attachment Size
throughput-get.png 8.57 KB
throughput-set.png 8.65 KB
worksheet.png 42.36 KB
comparision.tar.gz 7.46 KB

HBase Architecture 101 – Storage

One of the more hidden aspects of HBase is how data is actually stored. While the majority of users may never have to bother with it, you may have to get up to speed when you want to learn what the various advanced configuration options at your disposal mean. “How can I tune HBase to my needs?” and other similar questions are certainly interesting once you get over the (at times steep) learning curve of setting up a basic system. Another reason for wanting to know more is that, if for whatever reason disaster strikes, you have to recover a HBase installation.

In my own efforts getting to know the respective classes that handle the various files I started to sketch a picture in my head illustrating the storage architecture of HBase. But while the ingenious and blessed committers of HBase easily navigate back and forth through that maze I find it much more difficult to keep a coherent image. So I decided to put that sketch to paper. Here it is.

Please note that this is not a UML or call graph but a merged picture of classes and the files they handle. It is by no means complete, but it focuses on the topic of this post. I will discuss the details below and also look at the configuration options and how they affect the low-level storage files.

The Big Picture

So what does my sketch of the HBase innards really say? You can see that HBase handles basically two kinds of file types. One is used for the write-ahead log and the other for the actual data storage. The files are primarily handled by the HRegionServers. But in certain scenarios even the HMaster will have to perform low-level file operations. You may also notice that the actual files are in fact divided up into smaller blocks when stored within the Hadoop Distributed Filesystem (HDFS). This is also one of the areas where you can configure the system to handle larger or smaller data better. More on that later.

The general flow is that a new client contacts the Zookeeper quorum (a separate cluster of Zookeeper nodes) first to find a particular row key. It does so by retrieving the server name (i.e. host name) that hosts the -ROOT- region from Zookeeper. With that information it can query that server to get the server that hosts the .META. table. Both of these two details are cached and only looked up once. Lastly it can query the .META. server and retrieve the server that has the row the client is looking for.

Once it has been told where the row resides, i.e. in what region, it caches this information as well and contacts the HRegionServer hosting that region directly. So over time the client has a pretty complete picture of where to get rows from without needing to query the .META. server again.
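The lookup chain and its caching can be sketched with a toy model. This is only an illustration under stated assumptions: ToyCluster, ToyClient, the server names srv1 to srv4 and the two regions are all hypothetical, and real HBase clients talk to Zookeeper and the catalog tables over the network rather than through method calls.

```python
class ToyCluster:
    """Hypothetical stand-in for Zookeeper, -ROOT- and .META. (invented data)."""

    def __init__(self):
        self.calls = {"zookeeper": 0, "root": 0, "meta": 0}
        self._root_server = "srv1"            # server hosting -ROOT-
        self._meta_server = "srv2"            # server hosting .META.
        # (start key, end key, hosting server); "" as end key means "no upper bound"
        self._regions = [("", "m", "srv3"), ("m", "", "srv4")]

    def ask_zookeeper(self):
        self.calls["zookeeper"] += 1
        return self._root_server

    def ask_root(self, root_server):
        self.calls["root"] += 1
        return self._meta_server

    def ask_meta(self, meta_server, row):
        self.calls["meta"] += 1
        for start, end, server in self._regions:
            if start <= row and (end == "" or row < end):
                return (start, end, server)
        raise KeyError(row)


class ToyClient:
    """Caches every answer so each lookup step happens at most once."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.meta_server = None
        self.region_cache = []

    def locate(self, row):
        # 1. A cached region that covers the row needs no remote lookup at all.
        for start, end, server in self.region_cache:
            if start <= row and (end == "" or row < end):
                return server
        # 2. Zookeeper -> -ROOT- -> .META. server, looked up only once.
        if self.meta_server is None:
            root_server = self.cluster.ask_zookeeper()
            self.meta_server = self.cluster.ask_root(root_server)
        # 3. Ask .META. which server hosts the region containing the row.
        region = self.cluster.ask_meta(self.meta_server, row)
        self.region_cache.append(region)
        return region[2]


if __name__ == "__main__":
    cluster = ToyCluster()
    client = ToyClient(cluster)
    for row in ("apple", "zebra", "banana"):
        print(row, "->", client.locate(row))
    print(cluster.calls)
```

The third lookup ("banana") is answered from the region cache, which is why the .META. counter stays at two.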

Note: The HMaster is responsible to assign the regions to each HRegionServer when you start HBase. This also includes the “special” -ROOT- and .META. tables.

Next, when the HRegionServer opens the region, it creates a corresponding HRegion object. When the HRegion is “opened” it sets up a Store instance for each HColumnFamily for every table as defined by the user beforehand. Each of the Store instances can in turn have one or more StoreFile instances, which are lightweight wrappers around the actual storage file called HFile. A HRegion also has a MemStore and a HLog instance. We will now have a look at how they work together but also where there are exceptions to the rule.

Stay Put

So how is data written to the actual storage? The client issues a HTable.put(Put) request to the HRegionServer, which hands the details to the matching HRegion instance. The first step is to decide whether the data should first be written to the “Write-Ahead-Log” (WAL) represented by the HLog class. The decision is based on the flag set by the client using the Put.writeToWAL(boolean) method. The WAL is a standard Hadoop SequenceFile (although it is currently being discussed whether that should be changed to a more HBase-suitable file format) and it stores HLogKey instances. These keys contain a sequential number as well as the actual data and are used to replay not yet persisted data after a server crash.

Once the data is written (or not) to the WAL it is placed in the MemStore. At the same time it is checked whether the MemStore is full, and in that case a flush to disk is requested. When the request is served by a separate thread in the HRegionServer it writes the data to an HFile located in the HDFS. It also saves the last written sequence number so the system knows what was persisted so far. Let’s have a look at the files now.
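The write path can be modelled in a few lines. This is a toy sketch, not HBase code: ToyRegion and its attributes are hypothetical, the MemStore limit is counted in rows rather than bytes, and the flush runs synchronously here whereas HBase serves it from a separate thread.

```python
class ToyRegion:
    """Hypothetical model of WAL -> MemStore -> flushed store file."""

    def __init__(self, flush_size):
        self.flush_size = flush_size      # assumption: limit counted in rows
        self.wal = []                     # (sequence number, row, value)
        self.memstore = {}
        self.store_files = []             # each flush produces one sorted file
        self.sequence = 0
        self.last_flushed_sequence = 0

    def put(self, row, value, write_to_wal=True):
        self.sequence += 1
        if write_to_wal:                  # mirrors the client-side WAL flag
            self.wal.append((self.sequence, row, value))
        self.memstore[row] = value
        if len(self.memstore) >= self.flush_size:
            self.flush()

    def flush(self):
        # Persist the MemStore as a sorted, immutable "file" and remember
        # the last sequence number that is now safely stored.
        self.store_files.append(dict(sorted(self.memstore.items())))
        self.memstore.clear()
        self.last_flushed_sequence = self.sequence

    def unflushed_wal_entries(self):
        """Entries that would have to be replayed after a crash."""
        return [e for e in self.wal if e[0] > self.last_flushed_sequence]


if __name__ == "__main__":
    region = ToyRegion(flush_size=3)
    for row, value in (("a", "1"), ("b", "2"), ("c", "3"), ("d", "4")):
        region.put(row, value)
    region.put("e", "5", write_to_wal=False)
    print(region.store_files)
    print(region.unflushed_wal_entries())
```

In this model the row written with write_to_wal=False sits only in the MemStore, so a crash before the next flush would lose it, which is the trade-off the flag represents.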

Files

HBase has a configurable root directory in the HDFS but the default is /hbase. You can simply use the DFS tool of the Hadoop command line tool to look at the various files HBase stores.

$ hadoop dfs -lsr /hbase/docs

drwxr-xr-x - hadoop supergroup 0 2009-09-28 14:22 /hbase/.logs
drwxr-xr-x - hadoop supergroup 0 2009-10-15 14:33 /hbase/.logs/srv1.foo.bar,60020,1254172960891
-rw-r--r-- 3 hadoop supergroup 14980 2009-10-14 01:32 /hbase/.logs/srv1.foo.bar,60020,1254172960891/hlog.dat.1255509179458
-rw-r--r-- 3 hadoop supergroup 1773 2009-10-14 02:33 /hbase/.logs/srv1.foo.bar,60020,1254172960891/hlog.dat.1255512781014
-rw-r--r-- 3 hadoop supergroup 37902 2009-10-14 03:33 /hbase/.logs/srv1.foo.bar,60020,1254172960891/hlog.dat.1255516382506

-rw-r--r-- 3 hadoop supergroup 137648437 2009-09-28 14:20 /hbase/docs/1905740638/oldlogfile.log

drwxr-xr-x - hadoop supergroup 0 2009-09-27 18:03 /hbase/docs/999041123
-rw-r--r-- 3 hadoop supergroup 2323 2009-09-01 23:16 /hbase/docs/999041123/.regioninfo
drwxr-xr-x - hadoop supergroup 0 2009-10-13 01:36 /hbase/docs/999041123/cache
-rw-r--r-- 3 hadoop supergroup 91540404 2009-10-13 01:36 /hbase/docs/999041123/cache/5151973105100598304
drwxr-xr-x - hadoop supergroup 0 2009-09-27 18:03 /hbase/docs/999041123/contents
-rw-r--r-- 3 hadoop supergroup 333470401 2009-09-27 18:02 /hbase/docs/999041123/contents/4397485149704042145
drwxr-xr-x - hadoop supergroup 0 2009-09-04 01:16 /hbase/docs/999041123/language
-rw-r--r-- 3 hadoop supergroup 39499 2009-09-04 01:16 /hbase/docs/999041123/language/8466543386566168248
drwxr-xr-x - hadoop supergroup 0 2009-09-04 01:16 /hbase/docs/999041123/mimetype
-rw-r--r-- 3 hadoop supergroup 134729 2009-09-04 01:16 /hbase/docs/999041123/mimetype/786163868456226374
drwxr-xr-x - hadoop supergroup 0 2009-10-08 22:45 /hbase/docs/999882558
-rw-r--r-- 3 hadoop supergroup 2867 2009-10-08 22:45 /hbase/docs/999882558/.regioninfo
drwxr-xr-x - hadoop supergroup 0 2009-10-09 23:01 /hbase/docs/999882558/cache
-rw-r--r-- 3 hadoop supergroup 45473255 2009-10-09 23:01 /hbase/docs/999882558/cache/974303626218211126
drwxr-xr-x - hadoop supergroup 0 2009-10-12 00:37 /hbase/docs/999882558/contents
-rw-r--r-- 3 hadoop supergroup 467410053 2009-10-12 00:36 /hbase/docs/999882558/contents/2507607731379043001
drwxr-xr-x - hadoop supergroup 0 2009-10-09 23:02 /hbase/docs/999882558/language
-rw-r--r-- 3 hadoop supergroup 541 2009-10-09 23:02 /hbase/docs/999882558/language/5662037059920609304
drwxr-xr-x - hadoop supergroup 0 2009-10-09 23:02 /hbase/docs/999882558/mimetype
-rw-r--r-- 3 hadoop supergroup 84447 2009-10-09 23:02 /hbase/docs/999882558/mimetype/2642281535820134018
drwxr-xr-x - hadoop supergroup 0 2009-10-14 10:58 /hbase/docs/compaction.dir

The first set of files are the log files handled by the HLog instances, which are created in a directory called .logs underneath the HBase root directory. There is a subdirectory for each HRegionServer, and within it a log for each HRegion.

Next there is a file called oldlogfile.log, which you may not even see on your cluster. These files are created by one of the exceptions I mentioned earlier as far as file access is concerned. They are the result of so-called “log splits”. When the HMaster starts and finds a log file that is no longer handled by a HRegionServer, it splits the log, copying the HLogKey instances to the new regions they should be in. It places them directly in the region’s directory in a file named oldlogfile.log. When the respective HRegion is instantiated, it reads these files, inserts the contained data into its local MemStore, and starts a flush to persist the data right away and delete the file.

Note: Sometimes you may see left-over oldlogfile.log.old files (yes, there is another .old at the end). These are caused by the HMaster repeatedly trying to split the log and finding that another split log was already in place. At that point you have to consult the HRegionServer or HMaster logs to see what is going on and whether you can remove those files. I found at times that they were empty and could therefore safely be removed.
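Conceptually a log split is a group-by on the region each edit belongs to. The following is a rough sketch under that assumption only; split_log, the tuple layout of an edit, and the sample rows are hypothetical, while the region directory names are borrowed from the listing above.

```python
from collections import defaultdict


def split_log(entries, table="docs", root="/hbase"):
    """Group (region, sequence, row, value) edits into one file per region.

    Returns a mapping from the oldlogfile.log path inside each region
    directory to that region's edits, ordered by sequence number.
    """
    per_region = defaultdict(list)
    for region, sequence, row, value in entries:
        per_region[region].append((sequence, row, value))
    return {
        f"{root}/{table}/{region}/oldlogfile.log": sorted(edits)
        for region, edits in per_region.items()
    }


if __name__ == "__main__":
    log = [
        ("999041123", 7, "docA", "text/xml"),
        ("999882558", 8, "docZ", "text/html"),
        ("999041123", 9, "docB", "text/xml"),
    ]
    for path, edits in split_log(log).items():
        print(path, edits)
```

Each region can then replay only its own file when it is opened, instead of scanning the whole server log.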

The next set of files are the actual regions. Each region name is encoded using a Jenkins Hash function and a directory created for it. The reason to hash the region name is because it may contain characters that cannot be used in a path name in DFS. The Jenkins Hash always returns legal characters, as simple as that. So you get the following path structure:

/hbase/<tablename>/<encoded-regionname>/<column-family>/<filename>
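As a small illustration of that layout, a path from the listing above can be taken apart into its four components. StoreFilePath and parse_store_file_path are hypothetical helper names, and the default root of /hbase is simply the default mentioned earlier.

```python
from typing import NamedTuple


class StoreFilePath(NamedTuple):
    table: str
    region: str        # the encoded (hashed) region name
    family: str        # column family directory
    filename: str      # the data file itself


def parse_store_file_path(path, root="/hbase"):
    """Split <root>/<tablename>/<encoded-regionname>/<column-family>/<filename>."""
    prefix = root.rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError(f"{path!r} is not below {root!r}")
    parts = path[len(prefix):].split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"{path!r} does not look like a store file path")
    return StoreFilePath(*parts)


if __name__ == "__main__":
    print(parse_store_file_path(
        "/hbase/docs/999041123/contents/4397485149704042145"))
```

Paths such as the .regioninfo file sit one level higher and are deliberately rejected by this helper.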

In the root of the region directory there is also a .regioninfo file holding meta data about the region. This will be used in the future by an HBase fsck utility (see HBASE-7) to be able to rebuild a broken .META. table. A first usage of the region info can be seen in HBASE-1867.

In each column-family directory you can see the actual data files, which I explain in the following section in detail.

Something that I have not shown above are split regions with their initial daughter reference files. When a data file within a region grows larger than the configured hbase.hregion.max.filesize then the region is split in two. This is done initially very quickly because the system simply creates two reference files in the new regions now supposed to host each half. The name of the reference file is an ID with the hashed name of the referenced region as a postfix, e.g. 1278437856009925445.3323223323. The reference files hold only a little information: the key the original region was split at and whether it is the top or bottom reference. Of note is that these references are then used by the HalfHFileReader class (which I also omitted from the big picture above as it is only used temporarily) to read the original region data files. Only upon a compaction are the original files rewritten into separate files in the new region directory. This also removes the small reference files as well as the original data file in the original region.

And this concludes the file dump. The last thing you see is a compaction.dir directory in each table directory. These are used when splitting or compacting regions as noted above. They are usually empty and serve as a scratch area to stage the new data files before swapping them into place.
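The idea of a reference file can be sketched as a file name to parse plus a filter over the parent's rows. This is a toy model only: parse_reference_name and read_half are hypothetical helpers, and the mapping of "top" to the rows at or above the split key (and "bottom" to the rows below it) is an assumption of this sketch.

```python
def parse_reference_name(name):
    """Split '<file id>.<hashed name of the referenced region>'."""
    file_id, _, region = name.partition(".")
    if not file_id or not region:
        raise ValueError(f"{name!r} is not a reference file name")
    return file_id, region


def read_half(sorted_rows, split_key, top):
    """Return one half of the parent's rows without rewriting anything.

    Assumption: the "top" reference covers rows >= split_key and the
    "bottom" reference covers rows < split_key.
    """
    if top:
        return [row for row in sorted_rows if row >= split_key]
    return [row for row in sorted_rows if row < split_key]


if __name__ == "__main__":
    print(parse_reference_name("1278437856009925445.3323223323"))
    parent_rows = ["docA", "docB", "docC", "docD"]
    print(read_half(parent_rows, "docC", top=False))
    print(read_half(parent_rows, "docC", top=True))
```

Both daughters read the same parent data through such a filter until a compaction writes each half into its own file.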

HFile

So we are now at a very low level of HBase’s architecture. HFiles (kudos to Ryan Rawson) are the actual storage files, specifically created to serve one purpose: store HBase’s data fast and efficiently. They are apparently based on Hadoop’s TFile (see HADOOP-3315) and mimic the SSTable format used in Google’s BigTable architecture. The previous use of Hadoop’s MapFiles in HBase proved not to be good enough performance-wise. So what do the files look like?

The files have a variable length; the only fixed blocks are the FileInfo and Trailer blocks. As the picture shows, it is the Trailer that has the pointers to the other blocks. It is written at the end of persisting the data to the file, finalizing the now immutable data store. The Index blocks record the offsets of the Data and Meta blocks. Both the Data and the Meta blocks are actually optional, but you would most likely always find data in a data store file.
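The "trailer last, read trailer first" idea can be shown with a toy container. To be clear, this is not the real HFile byte layout: the field widths, the TRAILER and INDEX_ENTRY structs, and the function names are all invented for illustration; only the ordering (data blocks, file info, index, trailer) follows the description above.

```python
import io
import struct

# Hypothetical fixed-size trailer: index offset, file info offset, block count.
TRAILER = struct.Struct(">QQI")
# Hypothetical index entry: block offset, block length.
INDEX_ENTRY = struct.Struct(">QI")


def write_toy_file(blocks, fileinfo):
    """Write data blocks, then file info, then the index, then the trailer."""
    out = io.BytesIO()
    index = []
    for block in blocks:
        index.append((out.tell(), len(block)))
        out.write(block)
    fileinfo_offset = out.tell()
    out.write(struct.pack(">I", len(fileinfo)) + fileinfo)
    index_offset = out.tell()
    for offset, length in index:
        out.write(INDEX_ENTRY.pack(offset, length))
    # The trailer goes last; once written, the file is treated as immutable.
    out.write(TRAILER.pack(index_offset, fileinfo_offset, len(index)))
    return out.getvalue()


def read_toy_file(data):
    """Start at the trailer and follow its pointers to everything else."""
    index_offset, fileinfo_offset, count = TRAILER.unpack_from(
        data, len(data) - TRAILER.size)
    blocks = []
    for i in range(count):
        offset, length = INDEX_ENTRY.unpack_from(
            data, index_offset + i * INDEX_ENTRY.size)
        blocks.append(data[offset:offset + length])
    (info_length,) = struct.unpack_from(">I", data, fileinfo_offset)
    start = fileinfo_offset + 4
    return blocks, data[start:start + info_length]


if __name__ == "__main__":
    raw = write_toy_file([b"block-one", b"block-two"], b"MAX_SEQ_ID_KEY=1")
    print(read_toy_file(raw))
```

Because the reader only needs the last few bytes to find everything else, the writer can stream blocks of any size without knowing the final layout in advance.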

How is the block size configured? It is driven solely by the HColumnDescriptor which in turn is specified at table creation time by the user or defaults to reasonable standard values. Here is an example as shown in the master web based interface:

{NAME => 'docs', FAMILIES => [{NAME => 'cache', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'contents', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, ...

The default is "64KB" (or 65536 bytes). Here is what the HFile JavaDoc explains:

"Minimum block size. We recommend a setting of minimum block size between 8KB to 1MB for general usage. Larger block size is preferred if files are primarily for sequential access. However, it would lead to inefficient random access (because there are more data to decompress). Smaller blocks are good for random access, but require more memory to hold the block index, and may be slower to create (because we must flush the compressor stream at the conclusion of each data block, which leads to an FS I/O flush). Further, due to the internal caching in Compression codec, the smallest possible block size would be around 20KB-30KB."

So each block with its prefixed "magic" header contains either plain or compressed data. What that looks like is covered in the next section.

One thing you may notice is that the default block size for files in DFS is 64MB, which is 1024 times what the HFile default block size is. So the HBase storage files blocks do not match the Hadoop blocks. Therefore you have to think about both parameters separately and find the sweet spot in terms of performance for your particular setup.
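The arithmetic behind that comparison is easy to check. The sketch below uses the two default sizes quoted above and file sizes taken from the directory listing and the dump in this post; the helper name blocks_needed is hypothetical, and because the HFile setting is described as a minimum block size, real block counts can deviate from this simple estimate.

```python
HDFS_BLOCK_SIZE = 64 * 1024 * 1024   # 64MB default DFS block size
HFILE_BLOCK_SIZE = 64 * 1024         # 64KB default HFile block size


def blocks_needed(total_bytes, block_size):
    """Ceiling division: how many blocks of block_size cover total_bytes."""
    return -(-total_bytes // block_size)


if __name__ == "__main__":
    # One HDFS block spans this many default-sized HFile blocks.
    print(HDFS_BLOCK_SIZE // HFILE_BLOCK_SIZE)               # 1024
    # The 467410053-byte "contents" file from the listing above.
    print(blocks_needed(467410053, HDFS_BLOCK_SIZE))         # 7
    # 84055 bytes of data, as in the small "mimetype" file dumped in this post.
    print(blocks_needed(84055, HFILE_BLOCK_SIZE))            # 2
```

So a large store file spreads over several HDFS blocks, each of which in turn contains on the order of a thousand HFile blocks, and the two settings affect different layers.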

One option in the HBase configuration you may see is hfile.min.blocksize.size. It seems to be only used during migration from earlier versions of HBase (since it had no block file format) and when directly creating HFile during bulk imports for example.

So far so good, but how can you see if a HFile is OK or what data it contains? There is an App for that!

The HFile.main() method provides the tools to dump a data file:

$ hbase org.apache.hadoop.hbase.io.hfile.HFile
usage: HFile [-f <arg>] [-v] [-r <arg>] [-a] [-p] [-m] [-k]
-a,--checkfamily Enable family check
-f,--file File to scan. Pass full-path; e.g.
hdfs://a:9000/hbase/.META./12/34
-k,--checkrow Enable row order check; looks for out-of-order keys
-m,--printmeta Print meta data of file
-p,--printkv Print key/value pairs
-r,--region Region to scan. Pass region name; e.g. '.META.,,1'
-v,--verbose Verbose output; emits file and meta data delimiters

Here is an example of what the output will look like (shortened here):

$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -p -m -f \
hdfs://srv1.foo.bar:9000/hbase/docs/999882558/mimetype/2642281535820134018

Scanning -> hdfs://srv1.foo.bar:9000/hbase/docs/999882558/mimetype/2642281535820134018
...
K: \x00\x04docA\x08mimetype\x00\x00\x01\x23y\x60\xE7\xB5\x04 V: text\x2Fxml
K: \x00\x04docB\x08mimetype\x00\x00\x01\x23x\x8C\x1C\x5E\x04 V: text\x2Fxml
K: \x00\x04docC\x08mimetype\x00\x00\x01\x23xz\xC08\x04 V: text\x2Fxml
K: \x00\x04docD\x08mimetype\x00\x00\x01\x23y\x1EK\x15\x04 V: text\x2Fxml
K: \x00\x04docE\x08mimetype\x00\x00\x01\x23x\xF3\x23n\x04 V: text\x2Fxml
Scanned kv count -> 1554

Block index size as per heapsize: 296
reader=hdfs://srv1.foo.bar:9000/hbase/docs/999882558/mimetype/2642281535820134018, \
compression=none, inMemory=false, \
firstKey=US6683275_20040127/mimetype:/1251853756871/Put, \
lastKey=US6684814_20040203/mimetype:/1251864683374/Put, \
avgKeyLen=37, avgValueLen=8, \
entries=1554, length=84447
fileinfoOffset=84055, dataIndexOffset=84277, dataIndexCount=2, metaIndexOffset=0, \
metaIndexCount=0, totalBytes=84055, entryCount=1554, version=1
Fileinfo:
MAJOR_COMPACTION_KEY = \xFF
MAX_SEQ_ID_KEY = 32041891
hfile.AVG_KEY_LEN = \x00\x00\x00\x25
hfile.AVG_VALUE_LEN = \x00\x00\x00\x08
hfile.COMPARATOR = org.apache.hadoop.hbase.KeyValue\x24KeyComparator
hfile.LASTKEY = \x00\x12US6684814_20040203\x08mimetype\x00\x00\x01\x23x\xF3\x23n\x04

The first part is the actual data stored as KeyValue pairs, explained in detail in the next section. The second part dumps the internal HFile.Reader properties as well as the Trailer block details and finally the FileInfo block values. This is a great way to check if a data file is still healthy.
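The escaped key bytes in the dump can be decoded by hand. The sketch below assumes the key is laid out as a 2-byte row length, the row, a 1-byte family length, the family, the qualifier, an 8-byte timestamp and a 1-byte type; parse_key and ParsedKey are hypothetical names. The layout is at least consistent with the dump, since decoding hfile.LASTKEY yields the same row and timestamp that the lastKey line prints.

```python
import struct
from typing import NamedTuple


class ParsedKey(NamedTuple):
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int      # milliseconds since the epoch
    type_code: int      # 4 is printed as "Put" in the dump above


def parse_key(key):
    """Decode the key part of a KeyValue under the layout assumed above."""
    (row_length,) = struct.unpack_from(">H", key, 0)
    position = 2
    row = key[position:position + row_length]
    position += row_length
    family_length = key[position]
    position += 1
    family = key[position:position + family_length]
    position += family_length
    # Whatever remains before the trailing 9 bytes is the qualifier.
    qualifier = key[position:len(key) - 9]
    timestamp, type_code = struct.unpack_from(">qB", key, len(key) - 9)
    return ParsedKey(row, family, qualifier, timestamp, type_code)


if __name__ == "__main__":
    last_key = (b"\x00\x12US6684814_20040203\x08mimetype"
                b"\x00\x00\x01\x23x\xF3\x23n\x04")
    print(parse_key(last_key))
```

For this key the decoded row is US6684814_20040203, the family is mimetype, the qualifier is empty and the timestamp is 1251864683374, matching lastKey=US6684814_20040203/mimetype:/1251864683374/Put.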

KeyValue's

In essence each KeyValue in the HFile is simply a low-level byte array that allows for "zero-copy" access to the data, even with lazy or custom parsing if necessary. How are the instances arranged?


The structure starts with two fixed-length numbers indicating the size of the key and the value part. With that info you can offset into the array to, for example, get direct access to the value, ignoring the key (if you know what you are doing). Otherwise you can get the required information from the key part. Once parsed into a KeyValue object you have getters to access the details.
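Jumping straight to the value can be sketched as follows. The text only says the two length fields are fixed-length; treating them as 4-byte big-endian integers is an assumption of this example, and encode_kv, value_view and next_offset are hypothetical helpers rather than HBase API.

```python
import struct

LENGTHS = struct.Struct(">ii")   # assumed: key length, value length (4 bytes each)


def encode_kv(key, value):
    """Frame one entry as: key length, value length, key bytes, value bytes."""
    return LENGTHS.pack(len(key), len(value)) + key + value


def value_view(buffer, offset=0):
    """Return the value as a memoryview slice, i.e. without copying it."""
    key_length, value_length = LENGTHS.unpack_from(buffer, offset)
    start = offset + LENGTHS.size + key_length
    return memoryview(buffer)[start:start + value_length]


def next_offset(buffer, offset=0):
    """Offset of the entry that follows the one starting at offset."""
    key_length, value_length = LENGTHS.unpack_from(buffer, offset)
    return offset + LENGTHS.size + key_length + value_length


if __name__ == "__main__":
    data = encode_kv(b"k1", b"text/xml") + encode_kv(b"key2", b"v")
    print(bytes(value_view(data)))
    print(bytes(value_view(data, next_offset(data))))
```

The key bytes are skipped using only the two length fields, which is what makes this kind of access cheap.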

Note: One thing to watch out for is the difference between KeyValue.getKey() and KeyValue.getRow(). I think for me the confusion arose from referring to "row keys" as the primary key to get a row out of HBase. That would be the latter of the two methods, i.e. KeyValue.getRow(). The former simply returns the complete byte array part representing the raw "key" as colored and labeled in the diagram.

This concludes my analysis of the HBase storage architecture. I hope it provides a starting point for your own efforts to dig into the grimy details. Have fun!

Update: Slightly updated with more links to JIRA issues. Also added Zookeeper to be more precise about the current mechanisms to look up a region.

Update 2: Added details about region references.

Update 3: Added more details about region lookup as requested.
