October 15, 2008

Not so fast, TStringBuilder

Last week Olaf Monien posted a TStringBuilder benchmark that surprised me, because in my experience the TStringBuilder class isn't that fast. Here is why.

TStringBuilder is a new class in Delphi 2009 that mimics the StringBuilder class of .NET. There have already been many references in various blogs (from CodeGear developers) hinting at the idea that using the TStringBuilder class is faster than concatenating strings. Last week Olaf Monien posted a TStringBuilder benchmark along that same line, and that surprised me, because in my experience the TStringBuilder class isn't that fast. So who is right?

I think we are both right. In his code (you can download the project from his blog) he adds a single character to the string or TStringBuilder:

SB.Append(' ');
s := s + ' ';

and the former is indeed faster. In my test: 3,151 for TStringBuilder vs. 3,837 for a plain concatenation. I was indeed surprised. Then I noticed there is a specific overload of the Append method that takes a Char parameter and has specifically optimized code. If you alter the original source code slightly, adding a new string variable that holds the single space, and append or concatenate that string, with the lines:

s2 := ' ';
SB.Append(s2);
s := s + s2;

The timing changes dramatically, with 11,170 for TStringBuilder vs. 3,791 for a plain concatenation (notice the latter does not change at all). And I've enabled the various string-processing optimizations Delphi 2009 provides (at least $StringChecks OFF, which makes a minor but noticeable difference, cutting about 500 ticks).
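The three cases compared above can be put side by side in a small console program. This is only a minimal sketch, not Olaf's original project: the program name, the LoopCount value, and the use of GetTickCount for timing are assumptions made for illustration, and the Char variable Ch is there just to make the choice of the Char overload explicit. The absolute numbers it prints are in milliseconds and are not meant to match the figures quoted here.

```pascal
program SBBench;

{$APPTYPE CONSOLE}
{$STRINGCHECKS OFF} // the Delphi 2009 string optimization mentioned above

uses
  Windows, SysUtils;

const
  LoopCount = 10000000; // hypothetical value, not taken from the original project

var
  SB: TStringBuilder;
  s, s2: string;
  Ch: Char;
  i: Integer;
  Start: Cardinal;
begin
  Ch := ' ';
  s2 := ' ';

  // Case 1: TStringBuilder using the Char overload of Append
  SB := TStringBuilder.Create;
  try
    Start := GetTickCount;
    for i := 1 to LoopCount do
      SB.Append(Ch);
    Writeln('SB.Append(Char):   ', GetTickCount - Start, ' ms');
  finally
    SB.Free;
  end;

  // Case 2: TStringBuilder using the string overload of Append
  SB := TStringBuilder.Create;
  try
    Start := GetTickCount;
    for i := 1 to LoopCount do
      SB.Append(s2);
    Writeln('SB.Append(string): ', GetTickCount - Start, ' ms');
  finally
    SB.Free;
  end;

  // Case 3: plain string concatenation
  s := '';
  Start := GetTickCount;
  for i := 1 to LoopCount do
    s := s + s2;
  Writeln('s := s + s2:       ', GetTickCount - Start, ' ms');
end.
```

GetTickCount has a coarse resolution, so the loop count needs to be large enough for the differences to show, and the order of the three cases can influence the results because of memory manager state.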

In other words, this demo shows that appending characters via TStringBuilder is slightly faster than string concatenation, while appending actual strings is significantly slower. In a series of tests with larger strings and more real-world situations I saw a smaller difference, but string concatenation seems to invariably win over TStringBuilder. So is this new class useless? Not at all. It lets you add various data types to a string, making the code much more readable and more .NET compatible; you can chain operations; and the time penalty is generally negligible. But don't use this class because it is faster... until someone rewrites it in assembly!
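The readability argument can be illustrated with a short sketch. The program name and the sample values (Count, ItemName) are hypothetical; the only things relied on are that Append is overloaded for several data types and returns the builder itself, which is what allows the calls to be chained.

```pascal
program SBChain;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  SB: TStringBuilder;
  Count: Integer;     // hypothetical sample data
  ItemName, s: string;
begin
  Count := 3;
  ItemName := 'items';

  // Plain concatenation: non-string values need an explicit conversion.
  s := 'Found ' + IntToStr(Count) + ' ' + ItemName + '.';
  Writeln(s);

  // TStringBuilder: Append picks the overload matching each argument type
  // and returns the builder, so the calls can be chained.
  SB := TStringBuilder.Create;
  try
    SB.Append('Found ').Append(Count).Append(' ').Append(ItemName).Append('.');
    Writeln(SB.ToString);
  finally
    SB.Free;
  end;
end.
```

Both Writeln calls should print the same text; the difference is only in how the source code reads.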





 

7 Comments

Not so fast, TStringBuilder 

"But don't use this class because it is faster... 
until someone rewrites it in assembly!"

Please do not fuel the myth that assembly code is 
inherently faster than pascal code. I am pretty sure 
that the code could be improved without resorting to 
assembler, and also writing it in assembler in a bad 
way could reduce performance even more.

And I take readable Pascal code over assembler code 
any time, even if it were slightly less efficient.
Comment by Thomas Mueller [http://www.dummzeuch.de] on October 15, 19:01

Not so fast, TStringBuilder 

Olaf’s Thoughts in his post shows that TStringBuilder
is significantly faster than String.
http://www.monien.net/blog/index.php/2008/10/delphi-2009-tstringbuilder
Comment by Alex on October 16, 09:58

Not so fast, TStringBuilder 

The performance "problem" with TStringBuilder is that
it internally just uses concatenation into a dynamic
array (which behaves the same way as string
concatenation), so it's just a string concatenation in
disguise with overhead most of the time, and minor
gains in corner cases.

As for readability, I don't think that code having
dozens of overloaded Append improves readability in
any way. The overloading especially means you don't
really have an idea of what is actually getting
concatenated, or how it will come out (IntToStr and
the rest may look ugly, but they provide information
to whoever reads the code, Append() does not).

TStringBuilder is a necessary evil in .Net because
string immutability implies very low concatenation
performance... up to the point of triggering
catastrophic collapses of the GC, as has been
demonstrated many times.
Comment by Eric on October 16, 12:41

Not so fast, TStringBuilder 

TStringBuilder is slightly faster on my computer with 
the following code. (2.2 seconds vs 2.6 seconds) 
However, if I double Limit, my computer runs out of 
memory so it probably is not a realistic example. If I 
reduce Limit by a factor of 10, TStringBuilder is 
slightly slower. In addition, if I move the 
TStringBuilder code before the concatenation code, 
TStringBuilder is slower.

var
  InitialString: string;
  FinalString: String;
  StartTime: TDateTime;
  Index: integer;
  ConcatenationTime: double;
  StringBuilder: TStringBuilder;
  Limit: integer;
  StringBuilderTime: double;
begin
  Limit := 10000000;
  InitialString := 'abcdefghijklmnopqrstuvwxyz';

  StringBuilder := TStringBuilder.Create;
  try
    StartTime := Now;
    for Index := 0 to Limit - 1 do
    begin
      StringBuilder.Capacity := Limit*Length(InitialString);
      StringBuilder.Append(InitialString)
    end;
    FinalString := StringBuilder.ToString;
    StringBuilderTime := (Now-StartTime)*24*3600;
    ShowMessage(FloatToStr(StringBuilderTime) + ' seconds');
  finally
    StringBuilder.Free;
  end;

  FinalString := '';
  StartTime := Now;
  for Index := 0 to Limit - 1 do
  begin
    FinalString := FinalString + InitialString;
  end;

  ConcatenationTime := (Now-StartTime)*24*3600;
  ShowMessage(FloatToStr(ConcatenationTime) + ' seconds');


end;
Comment by Richard B. Winston on October 17, 16:41

Not so fast, TStringBuilder 

Assembly code is not faster by itself. Highly 
optimized assembly code usually is, as FastMM and the 
FastCode project show.
Moreover, AFAIK the Delphi compiler still doesn't 
take advantage of newer processors, and thereby those 
features are only available by writing assembly code 
by hand.
Comment by Luigi D. Sandon on October 17, 19:39

Not so fast, TStringBuilder 

"And I take readable Pascal code over assembler code 
any time, even if it were slightly less efficient."

Come on Thomas, don't be that cruel with asm code, it
is readable too afaik! And it gives you the chance to
optimize the code as much as you can, especially in
these speed-critical routines.
Comment by Javier Santo Domingo on October 18, 04:16

Use Streams 

If you want performance on huge string concatenation,
use TMemoryStream (or, of course its neighbour
TFileStream). I recently rewrote SQL dump capabilities
in HeidiSQL and did some timing on this quite complex
code structure, basically showing the difference
between String and TStream concatenation:
http://www.heidisql.com/forum.php?t=4558
TStreams are slightly more difficult to handle but
performance makes up for that.
Comment by Anse [http://www.anse.de/] on January 8, 03:07

