

C# StringBuilder Mistake

StringBuilder code often hides a common mistake: concatenating strings with + inside the Append call. This makes the appends less efficient than they should be. Here we look at the mistake and at an easy way to fix it in the C# programming language.

~~~ StringBuilder mistake and benchmark using C# ~~~                 
    Based on .NET 3.5 SP1.
    It is faster to avoid concatenating strings inside Append calls. 

Before, with concat:      1841 ms
After, separate Appends:  1154 ms  [faster]

StringBuilder problem code

First, I have seen code like the following in the real world. In fact, some C# code that I installed on this web site for an advertising company made this mistake. Let’s look at the mistake first.

=== StringBuilder code with performance problem (C#) ===

static void Test1()
{
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < _iterations; i++)
    {
        string extra = i.ToString();
        string before = "okay" + extra;
        string url = "http://sometest.com/" + extra;
        string text = "Some Test" + extra;
        string after = "That's Good" + extra;

        builder.Append(@"<li style=""float: right; clear: none;
display: block; margin: 0; padding: 0;"">" +
            before + @" <a href=""" + url + @""">" + text + "</a> " + after + "</li>");
    }
}

It declares a StringBuilder, which is a good fit for this situation, but then it uses the plus operator inside the Append call. The five string declarations in the loop body are not important; they are there for the benchmark and because they were in the code I am fixing. Look at the Append call.

It’s a mess. I apologize for the syntax in the last part of the loop above, although you doubtless have code that looks like this or worse. The code builds an HTML string of list elements, and it has a hidden inefficiency. Read on for the improvement we should make.

Faster StringBuilder code

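When you use StringBuilder, don’t also use + in the same statement. The plus operator builds a temporary string on every iteration, and Append then copies those characters a second time into the builder. Eliminate the temporary strings for better performance.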

=== Faster StringBuilder implementation (C#) ===

static void Test2()
{
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < _iterations; i++)
    {
        string extra = i.ToString();
        string before = "okay" + extra;
        string url = "http://sometest.com/" + extra;
        string text = "Some Test" + extra;
        string after = "That's Good" + extra;

        builder.Append(@"<li style=""float: right; clear: none;
display: block; margin: 0; padding: 0;"">").Append(
            before).Append(@" <a href=""").Append(url).Append(@""">").Append(
            text).Append("</a> ").Append(after).Append("</li>");
    }
}

It doesn’t use +. We remove the plus operators and pass each string directly to Append. It also chains the Append calls: Append returns the StringBuilder, so this syntax is the same as calling builder.Append() over and over again, just shorter.

See StringBuilder Append Syntax.

Optimization results

The figures at the top of this article show the time for each method to run 1000 times, with 1000 iterations of the inner loop per run. Memory usage isn’t large here because the StringBuilder doesn’t grow very big in each of the 1000 runs.
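A timing harness along these lines could reproduce the comparison. This is only a sketch: the class name AppendBenchmark, its method names, the shortened HTML and the loop counts are assumptions for illustration, not the author's original benchmark code.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class AppendBenchmark
{
    // Builds list items with "+" inside Append (the slower pattern).
    public static string WithConcat(int iterations)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < iterations; i++)
        {
            string extra = i.ToString();
            builder.Append("<li>" + extra + "</li>");
        }
        return builder.ToString();
    }

    // Builds the same output with one Append call per part.
    public static string WithSeparateAppends(int iterations)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < iterations; i++)
        {
            string extra = i.ToString();
            builder.Append("<li>").Append(extra).Append("</li>");
        }
        return builder.ToString();
    }

    // Calls the method `runs` times and returns the elapsed milliseconds.
    public static long Time(Func<int, string> method, int iterations, int runs)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < runs; r++)
        {
            method(iterations);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int iterations = 1000; // assumed inner-loop count
        const int runs = 1000;       // assumed number of repetitions
        Console.WriteLine("Before, with concat:     {0} ms", Time(WithConcat, iterations, runs));
        Console.WriteLine("After, separate Appends: {0} ms", Time(WithSeparateAppends, iterations, runs));
    }
}
```

The absolute numbers depend on the machine and the .NET version, so only the relative difference between the two lines is meaningful.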

Summary

Here we looked at a common mistake found when using StringBuilder in real-world programs. You should avoid using the + concatenation operator in the same statements as StringBuilder. In the benchmark above, separate Append calls cut the running time by about 37% (1841 ms to 1154 ms). I think the best option is to use the syntax in the final example.

C# StringBuilder Secrets

You need to append strings together to build a longer string in the C# programming language. Some ways of appending cause severe performance problems, so measure performance and avoid them. Here we see how to start using StringBuilder, which requires the System.Text namespace. StringBuilder is usually used in a loop, and the examples here show several other usages as well.

::: String type:                        
    Has immutable buffers.
    Cannot be changed.
    Each operation returns a new string.
    Copies can increase memory pressure.

::: StringBuilder type:                 
    Has mutable buffers.
    Can be changed without copying.
    Usually used in loops.

Using StringBuilder

First, we see some of the essential methods on the StringBuilder type in the base class library. These methods, which append strings and lines, are enough to use it effectively in many programs. This example has no loop, so it is not a realistic program; it is for demonstration only.

=== Program that uses StringBuilder (C#) ===

using System;
using System.Text;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        // 1.
        // Declare a new StringBuilder.
        StringBuilder builder = new StringBuilder();

        // 2.
        // Append a string to the buffer.
        builder.Append("The list starts here:");

        // 3.
        // Append a line terminator.
        builder.AppendLine();

        // 4.
        // Chain the calls: append a string, then a line terminator.
        builder.Append("1 cat").AppendLine();

        // 5.
        // Get the StringBuilder's contents as a string.
        string innerString = builder.ToString();

        // Display with Debug.
        Debug.WriteLine(innerString);
    }
}

=== Output of the program ===

The list starts here:
1 cat

Description of the new keyword. The program creates the StringBuilder with the ‘new’ keyword. This is different from regular strings, which are usually created from literals. StringBuilder has several overloaded constructors.

Description of appending. The program calls the instance Append() method, which adds the contents of its argument to the buffer in the StringBuilder. Append has overloads for many types, and any argument that is not already a string is converted to its string representation.

Using AppendLine method. AppendLine does the same thing as Append(), except that it adds a newline at the end; called with no argument, it adds only the newline. Next, Append and AppendLine are chained: each returns the StringBuilder itself, which allows terse syntax. Finally, ToString returns the contents of the buffer as a string. You will almost always need to call ToString().
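The points about argument conversion and chaining can be checked with a short program. This is a minimal sketch; the class name AppendDemo and the sample values are made up for illustration.

```csharp
using System;
using System.Text;

static class AppendDemo
{
    public static string Build()
    {
        StringBuilder builder = new StringBuilder();

        // Append has overloads for many types; each value is converted to text.
        builder.Append("Count: ");
        builder.Append(3);    // int
        builder.Append(' ');  // char
        builder.Append(true); // bool, appended as "True"

        // Append returns the same StringBuilder instance, which is what
        // makes chained calls possible.
        StringBuilder same = builder.Append('.');
        Console.WriteLine(ReferenceEquals(builder, same)); // True

        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Build()); // Count: 3 True.
    }
}
```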

Memory pressure

StringBuilder will make your appends go faster usually, but there’s another benefit. In .NET, there is a concept of “memory pressure”, meaning that the more temporary objects created by your app, the more often garbage collection runs. StringBuilder creates fewer temporary objects and adds less memory pressure. This site contains an experiment that shows the memory usage of a StringBuilder after garbage collection occurs.

See StringBuilder Memory Use.

Replacing and inserting

Here we see how you can use StringBuilder to replace or insert characters in loops. First convert the string to a StringBuilder, and then call StringBuilder’s methods to do these operations. This is faster because the StringBuilder type uses character arrays internally, not unchangeable strings.

See Replace String Examples.

=== Program that uses Replace (C#) ===

using System;
using System.Text;

class Program
{
    static void Main()
    {
        StringBuilder builder = new StringBuilder(
            "This is an example string that is an example.");
        builder.Replace("an", "the"); // Replaces 'an' with 'the'.
        Console.WriteLine(builder.ToString());
        Console.ReadLine();
    }
}

=== Output of the program ===

This is the example string that is the example.
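Insert and Remove work in a similar way to Replace. The sketch below uses a made-up class name (InsertRemoveDemo), and its hard-coded indexes only suit the sample input "cat dog".

```csharp
using System;
using System.Text;

static class InsertRemoveDemo
{
    public static string Edit(string start)
    {
        StringBuilder builder = new StringBuilder(start);

        // Insert adds characters at an index; later characters shift right.
        builder.Insert(4, "and ");
        Console.WriteLine(builder.ToString()); // cat and dog

        // Remove deletes a range, given a start index and a length.
        builder.Remove(0, 4);
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Edit("cat dog")); // and dog
    }
}
```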

Question

When shouldn’t I use StringBuilder?

When you don’t have a loop, you should generally avoid StringBuilder. There is a lot of logic in StringBuilder that makes it slow for very small operations, and it can make your code more cumbersome. Here we look at some potential problems with using StringBuilder in programs.

When to consider char arrays. Character arrays, specified as char[], are vastly simpler, but your code must be more precise. If you know a maximum or absolute size of your output string, and your requirements are simple, use char arrays.
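As a rough illustration of the char array approach, the following sketch builds a string whose final size is known before the loop starts. The class CharArrayDemo and the dash-joining task are invented for this example.

```csharp
using System;

static class CharArrayDemo
{
    // Puts a dash between each pair of characters: "abc" becomes "a-b-c".
    public static string Dashed(string input)
    {
        if (input.Length == 0)
        {
            return string.Empty;
        }

        // The output size is known up front: n characters plus n - 1 dashes.
        char[] buffer = new char[input.Length * 2 - 1];
        for (int i = 0; i < input.Length; i++)
        {
            buffer[i * 2] = input[i];
            if (i < input.Length - 1)
            {
                buffer[i * 2 + 1] = '-';
            }
        }

        // One allocation for the final string.
        return new string(buffer);
    }

    static void Main()
    {
        Console.WriteLine(Dashed("abc")); // a-b-c
    }
}
```

The index arithmetic is where the extra precision is needed: an off-by-one error here throws or corrupts the output, whereas StringBuilder would simply grow.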

Consider this common mistake. Many developers make a very specific StringBuilder mistake that reduces speed by about 40%: using the + operator on strings inside an Append call. Doing so simply draws in the slowness of strings, resulting in many temporary allocations and copies on the managed heap.

See StringBuilder Mistake.

What about AppendFormat? AppendFormat takes a format string with substitution markers, like string.Format. Internally, string.Format in the .NET Framework is itself implemented with a StringBuilder and AppendFormat. It is usually faster to call Append repeatedly with all the required parts. However, the syntax of AppendFormat can be clearer to read and maintain in some programs. There is more on the AppendFormat method in the section “Using AppendFormat.”

Looping

Here we see how you can use StringBuilder in a simple loop. As noted, StringBuilder is almost always used in a loop, which can be a foreach, for, or while loop. This example uses a foreach loop.

=== Program that uses foreach (C#) ===

using System;
using System.Text;

class Program
{
    static void Main()
    {
        string[] items = { "Cat", "Dog", "Celebrity" };

        StringBuilder builder2 = new StringBuilder(
            "These items are required:").AppendLine();

        foreach (string item in items)
        {
            builder2.Append(item).AppendLine();
        }
        Console.WriteLine(builder2.ToString());
        Console.ReadLine();
    }
}

=== Output of the program ===

These items are required:
Cat
Dog
Celebrity

Using AppendFormat

The AppendFormat method on the StringBuilder type is very important and deserves a closer look. It can be used to add text to your StringBuilder based on a pattern. You can use substitution markers to fill fields in this pattern. This site has more detail about the AppendFormat method.

See StringBuilder AppendFormat Method.
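As a small sketch of the substitution markers, the program below produces the same text with AppendFormat and with repeated Append calls. The class name AppendFormatDemo and the sample sentence are assumptions; the invariant culture is passed only to keep the output predictable.

```csharp
using System;
using System.Globalization;
using System.Text;

static class AppendFormatDemo
{
    // One call with a pattern; {0} and {1} are substitution markers.
    public static string WithAppendFormat(string name, int count)
    {
        StringBuilder builder = new StringBuilder();
        builder.AppendFormat(CultureInfo.InvariantCulture,
            "{0} has {1} items.", name, count);
        return builder.ToString();
    }

    // The same text built from separate Append calls.
    public static string WithAppend(string name, int count)
    {
        StringBuilder builder = new StringBuilder();
        builder.Append(name).Append(" has ").Append(count).Append(" items.");
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(WithAppendFormat("Sam", 3)); // Sam has 3 items.
        Console.WriteLine(WithAppend("Sam", 3));       // Sam has 3 items.
    }
}
```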

Benchmark

This site has a separate article on benchmarks of StringBuilder versus string. Further on in this article, you can see a way of using StringBuilder more effectively as a parameter. The key to good performance here is to avoid string copies, thereby reducing allocations on the managed heap. The benchmarks here are based on the .NET Framework 3.5 SP1.

See StringBuilder Performance Test.

Testing for equality

StringBuilder defines an instance method Equals that compares the capacities and contents of two StringBuilders. Using this method lets you avoid writing error-prone comparison code yourself. There are some subtleties to Equals, however, which are looked into on this site.

See StringBuilder Equals Method.
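A quick sketch of Equals follows. The class name EqualsDemo is made up, and both builders are constructed the same way so that capacity does not affect the result. How capacity is treated may differ between runtime versions, which is one of the subtleties mentioned above.

```csharp
using System;
using System.Text;

static class EqualsDemo
{
    // Compares two builders that were constructed from the given strings.
    public static bool Compare(string x, string y)
    {
        StringBuilder a = new StringBuilder(x);
        StringBuilder b = new StringBuilder(y);
        return a.Equals(b);
    }

    static void Main()
    {
        Console.WriteLine(Compare("cat", "cat")); // True
        Console.WriteLine(Compare("cat", "dog")); // False

        // Two separate instances are never the same reference,
        // even when their contents match.
        StringBuilder first = new StringBuilder("cat");
        StringBuilder second = new StringBuilder("cat");
        Console.WriteLine(ReferenceEquals(first, second)); // False
    }
}
```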

Clearing data

How can you clear the data inside your StringBuilder that you have already appended? Sometimes it is best to allocate a new StringBuilder instance; other times, you can assign the Length property to zero or use the Clear method from the .NET Framework 4.0.

See StringBuilder Clear Method.
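The two in-place options can be sketched as follows. The class name ClearDemo is hypothetical, and the Clear call assumes .NET Framework 4.0 or later.

```csharp
using System;
using System.Text;

static class ClearDemo
{
    // Resets the builder by setting Length to zero.
    public static string ResetWithLength()
    {
        StringBuilder builder = new StringBuilder();
        builder.Append("first");
        builder.Length = 0; // discards the appended characters
        builder.Append("second");
        return builder.ToString();
    }

    // Resets the builder with Clear (requires .NET Framework 4.0 or later).
    public static string ResetWithClear()
    {
        StringBuilder builder = new StringBuilder();
        builder.Append("first");
        builder.Clear();
        builder.Append("second");
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(ResetWithLength()); // second
        Console.WriteLine(ResetWithClear());  // second
    }
}
```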

Using StringBuilder parameter

Here we see how you can use the StringBuilder type as a method parameter. This is a significant optimization because it avoids converting back and forth to strings. The examples show a StringBuilder parameter and the reuse of a single StringBuilder. Strings alone would be much slower in many cases.

=== Example 1: many StringBuilders created (C#) ===

using System;
using System.Text;

class Program
{
    static string[] _items = new string[]
    {
        "cat",
        "dog",
        "giraffe"
    };

    /// <summary>
    /// Append to a new StringBuilder and return it as a string.
    /// </summary>
    static string A1()
    {
        StringBuilder b = new StringBuilder();
        foreach (string item in _items)
        {
            b.AppendLine(item);
        }
        return b.ToString();
    }

    static void Main()
    {
        // In the benchmark, this call is made in a loop.
        A1();
    }
}

=== Example 2: one StringBuilder used as parameter (C#) ===

using System;
using System.Text;

class Program
{
    static string[] _items = new string[]
    {
        "cat",
        "dog",
        "giraffe"
    };

    /// <summary>
    /// Append to the StringBuilder param, void method.
    /// </summary>
    static void A2(StringBuilder b)
    {
        foreach (string item in _items)
        {
            b.AppendLine(item);
        }
    }

    static void Main()
    {
        // The builder is created once; in the benchmark, A2 is called in a loop.
        StringBuilder b = new StringBuilder();
        A2(b);
    }
}

=== Important differences ===

Version 1: Many StringBuilders created
           Increased memory pressure

Version 2: One StringBuilder created
           Less memory pressure
           Same result as before

=== Benchmark results (many iterations of methods) ===

Version 1: 5039 ms
Version 2: 3073 ms
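The two examples do not show the calling loop. One way it might look is sketched below; the class ReuseDemo, the Run methods and the call count are assumptions and not the author's benchmark code.

```csharp
using System;
using System.Text;

static class ReuseDemo
{
    static readonly string[] _items = { "cat", "dog", "giraffe" };

    // Version 1: a new StringBuilder per call, converted to a string each time.
    static string A1()
    {
        StringBuilder b = new StringBuilder();
        foreach (string item in _items)
        {
            b.AppendLine(item);
        }
        return b.ToString();
    }

    // Version 2: appends straight into the caller's StringBuilder.
    static void A2(StringBuilder b)
    {
        foreach (string item in _items)
        {
            b.AppendLine(item);
        }
    }

    // Calls A1 in a loop; every call allocates a builder and a string.
    public static string RunVersion1(int calls)
    {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < calls; i++)
        {
            result.Append(A1());
        }
        return result.ToString();
    }

    // Calls A2 in a loop; one builder is shared by all calls.
    public static string RunVersion2(int calls)
    {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < calls; i++)
        {
            A2(result);
        }
        return result.ToString();
    }

    static void Main()
    {
        // Both versions build identical text.
        Console.WriteLine(RunVersion1(100) == RunVersion2(100)); // True
    }
}
```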

Understanding ‘immutable’ strings

The word ‘immutable’ indicates that the data being pointed at is not changeable. To see an example of an immutable object in the C# language, try to assign to a character in a string. This causes a compile-time error, because the string type does not define a set accessor. However, character arrays can be changed. Internally, StringBuilder uses “mutable” char arrays for its buffer.

See String Append, Adding Strings Together.
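The difference can be seen in a few lines. This sketch uses a made-up class name (ImmutableDemo) and assumes a non-empty input string.

```csharp
using System;
using System.Text;

static class ImmutableDemo
{
    // Changes the first character by going through a char array.
    public static string ViaCharArray(string text, char first)
    {
        // text[0] = first; // Does not compile: the string indexer is read-only.
        char[] chars = text.ToCharArray(); // a mutable copy
        chars[0] = first;
        return new string(chars);
    }

    // Changes the first character through the StringBuilder indexer.
    public static string ViaBuilder(string text, char first)
    {
        StringBuilder builder = new StringBuilder(text);
        builder[0] = first; // the StringBuilder indexer has a setter
        return builder.ToString();
    }

    static void Main()
    {
        string original = "cat";
        Console.WriteLine(ViaCharArray(original, 'b')); // bat
        Console.WriteLine(ViaBuilder(original, 'r'));   // rat
        Console.WriteLine(original);                    // cat (unchanged)
    }
}
```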

Avoiding ArgumentOutOfRangeException

You may get this exception if you put too much data in your StringBuilder. The maximum number of characters in a StringBuilder is equal to Int32.MaxValue by default. If you see the exception, check for infinite loops or other serious problems.

See int.Max and Min Constants.
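The limit is hard to reach by accident, but it can be provoked with a deliberately tiny maximum. In this sketch the class name MaxCapacityDemo and the limit of 8 characters are assumptions chosen for the demonstration. Exact behavior close to the limit may vary between runtime versions, so the example overshoots it by a wide margin.

```csharp
using System;
using System.Text;

static class MaxCapacityDemo
{
    // Returns true if appending far past the maximum capacity throws.
    public static bool ThrowsWhenTooLarge()
    {
        // Hypothetical limits: initial capacity 4, maximum capacity 8.
        StringBuilder builder = new StringBuilder(4, 8);
        try
        {
            builder.Append(new string('x', 100)); // far more than 8 characters
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    static void Main()
    {
        // A default StringBuilder reports Int32.MaxValue as its maximum.
        Console.WriteLine(new StringBuilder().MaxCapacity == int.MaxValue); // True
        Console.WriteLine(ThrowsWhenTooLarge());
    }
}
```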

Members

Here we look at more members on the StringBuilder type in the .NET Framework base class library. This document only shows some of the StringBuilder members. The Replace, Insert and Remove methods are very important, even though they are less common than Append.

Visit msdn.microsoft.com.

Append
MSDN description:
"Appends the string representation of a specified object
to the end of this instance."

AppendFormat
Appends string using the formatting syntax available in string.Format.
This can improve code clarity.

EnsureCapacity
Ensures that the capacity is at least the specified value.
Rarely needed, but available as an optimization.

Insert
Very similar to the Replace method.
Used to add characters at an index.

Remove
Essentially the same as the Remove method on string.
Avoids character array copying.

Replace
Replaces one string of characters with another.
See the article on the Replace method.

ToString
Converts the buffer to a string.
In the .NET Framework version tested here (3.5 SP1), this often does not perform any additional allocations.

See StringBuilder ToString Method.
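Of the members listed, EnsureCapacity is probably the least familiar, so here is a brief sketch. The class name CapacityDemo and the value 1000 are illustrative, and the default capacity printed first depends on the runtime.

```csharp
using System;
using System.Text;

static class CapacityDemo
{
    // Grows a new builder to hold at least `minimum` characters
    // and returns the resulting capacity.
    public static int Reserve(int minimum)
    {
        StringBuilder builder = new StringBuilder();
        Console.WriteLine("Default capacity: {0}", builder.Capacity);

        // EnsureCapacity returns the new capacity, which is at least `minimum`.
        int capacity = builder.EnsureCapacity(minimum);
        Console.WriteLine("After EnsureCapacity: {0}", capacity);

        // The Length is still zero; only the buffer size changed.
        Console.WriteLine("Length: {0}", builder.Length);
        return capacity;
    }

    static void Main()
    {
        Reserve(1000);
    }
}
```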

Summary

Here we saw ways you can effectively use StringBuilder in your C# application. StringBuilder can improve the performance of your program, even if you misuse it. However, by using it optimally, results are much better. Use StringBuilder as a parameter instead of calling ToString frequently. The article doesn’t show many genuine secrets, but the tips are valuable nonetheless.

Source: http://dotnetperls.com/stringbuilder-1

C# StringBuilder Performance Test

You are deciding whether to use StringBuilder instead of string when appending string data, and want to know the point at which StringBuilder becomes more efficient. Here we see benchmarks of StringBuilder and strings in the C# programming language, and discuss the important considerations when choosing between the two.

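[Graph: StringBuilder performance. Red line: StringBuilder default code; purple line: StringBuilder with capacity; blue line: string concat.]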

Benchmarking StringBuilder

Here we see the code styles that were benchmarked to produce the graph. This is an adjunct article to the author’s other StringBuilder work. The tests were tight loops that append a single string on each iteration. The number of strings appended was steadily increased to determine the slope of the graph.

=== StringBuilder default code [red line] ===
    Uses StringBuilder with no capacity.
    Loops over number of strings.
    ToString method used at end.

StringBuilder builder1 = new StringBuilder();
for (int i = 1; i < count1; i++)
{
    builder1.Append(s);
}
return builder1.ToString();

=== StringBuilder with capacity [purple line] ===
    Capacity value is accurate.

StringBuilder builder2 = new StringBuilder(count1 * maxLength);
for (int i = 1; i < count1; i++)
{
    builder2.Append(s);
}
return builder2.ToString();

=== String concat [blue line] ===
    Uses overloaded operator + for string.Concat.

string st1 = string.Empty;
for (int i = 1; i < count1; i++)
{
    st1 += s;
}
return st1;

StringBuilder benchmark results. My test tried to defeat compiler optimizations by appending different strings of the same length. For counts below 8 strings, the tests were repeated 20 times more. The graph at the top of this article shows the results.

The author’s interpretation. The benchmark indicates that using the string type with Concat is faster for one to four strings inclusive. Using StringBuilder with an accurate capacity is always faster than using StringBuilder without one.
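For readers who want to repeat the measurement, the three styles can be wrapped in a small harness such as the one below. It is a sketch under assumptions: the class ConcatVsBuilder, the MakeParts helper, the chosen counts and the repetition count are all invented here, and the loops append every string in the array.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class ConcatVsBuilder
{
    // Creates `count` different strings that all have the same length.
    public static string[] MakeParts(int count, int length)
    {
        string[] parts = new string[count];
        for (int i = 0; i < count; i++)
        {
            parts[i] = i.ToString().PadLeft(length, 'x');
        }
        return parts;
    }

    // StringBuilder with the default capacity.
    public static string WithBuilder(string[] parts)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string s in parts) builder.Append(s);
        return builder.ToString();
    }

    // StringBuilder with an accurate capacity.
    public static string WithCapacity(string[] parts, int maxLength)
    {
        StringBuilder builder = new StringBuilder(parts.Length * maxLength);
        foreach (string s in parts) builder.Append(s);
        return builder.ToString();
    }

    // Plain string concatenation with +=.
    public static string WithConcat(string[] parts)
    {
        string result = string.Empty;
        foreach (string s in parts) result += s;
        return result;
    }

    // Runs the action `repeats` times and returns the elapsed ticks.
    static long Time(Func<string> action, int repeats)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < repeats; r++) action();
        watch.Stop();
        return watch.ElapsedTicks;
    }

    static void Main()
    {
        const int length = 20;      // string length used in the article
        const int repeats = 100000; // assumed repetition count
        foreach (int count in new[] { 2, 4, 8, 16, 32 })
        {
            string[] parts = MakeParts(count, length);
            Console.WriteLine("{0,3} strings: builder {1}, capacity {2}, concat {3} (ticks)",
                count,
                Time(() => WithBuilder(parts), repeats),
                Time(() => WithCapacity(parts, length), repeats),
                Time(() => WithConcat(parts), repeats));
        }
    }
}
```

The crossover point it reports will depend on the runtime and hardware, so it may not match the one to four strings observed on .NET 3.5 SP1.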

Best practices

Here we look at what the author thinks is the best way to use StringBuilder. Always use StringBuilder when you have a loop of variable length. This will make your program usable in more scenarios than string Concat will.

An example of the guideline. For example, if you have a program that has three strings 99% of the time, but 30,000 strings 1% of the time, using StringBuilder will make the program usable in that 1%. As always, one of the most important parts of good programming is planning for errors and edge cases.

Warning for algorithmic inefficiencies. Obviously, if your program is lightning fast 99% of the time, but fails 1% of the time, your program is unacceptable for important work. Therefore, I think StringBuilder is preferable in most loops.

Weaknesses of benchmark

There are many problems with this test. The individual strings themselves were 20 characters long, which may not match many programs in the real world. I didn’t benchmark cases where there are hundreds of thousands of strings. However, it is clear to me that the trend of the lines would continue. The capacity settings I used were always 100% accurate, which in real-world programs might be unlikely. However, I feel the findings here encourage using capacities.

See Capacity Property.
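When all the input strings are available before the loop, an accurate capacity can be computed by adding up their lengths. This is a minimal sketch; the class ExactCapacityDemo is hypothetical and it assumes the array contains no null entries.

```csharp
using System;
using System.Text;

static class ExactCapacityDemo
{
    // Joins the parts using a StringBuilder sized to the exact total length.
    public static string JoinAll(string[] parts)
    {
        // First pass: add up the lengths to get the exact capacity.
        int total = 0;
        foreach (string part in parts)
        {
            total += part.Length;
        }

        // Second pass: append into a buffer that never needs to grow.
        StringBuilder builder = new StringBuilder(total);
        foreach (string part in parts)
        {
            builder.Append(part);
        }
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(JoinAll(new[] { "ab", "cde", "f" })); // abcdef
    }
}
```

The extra pass over the input costs little compared with the buffer resizes it avoids, although that trade-off has not been measured here.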

Summary

Here we saw some factors that influence when to use StringBuilder in your C# programs. StringBuilder is entirely an optimization: it offers no logical improvement over string Concat, only a different internal implementation. Sometimes it is okay to use simple string concatenations in small loops of four or fewer iterations. However, in edge cases this can be disastrous. Plan for your edge cases with StringBuilder.

Source: http://dotnetperls.com/stringbuilder-performance
