Last month I was approached by a customer who was migrating code from XE2 and saw some incompatibilities (or actually some bugs) in the way regular expressions are processed in the TRegEx class. He pointed out that using:

  RegEx := TRegEx.Create(sPattern);
  Match := RegEx.Match(sInput);

would work properly, while forcing the expression to compile (for extra performance, if the same expression is applied multiple times), as in the following code, would basically break the regular expression engine:

  RegEx := TRegEx.Create(sPattern, [roCompiled]);
  Match := RegEx.Match(sInput);

This really looked like a bug, and I noticed that the difference was an extra match with the compiled expression (skipping it with a NextMatch call made things work). I asked the R&D team, and it turns out this change was made specifically to fix a real limitation (qc.embarcadero.com/wc/qcmain.aspx, submitted by Jan Goyvaerts, who originally wrote the library; it was fixed in XE6).

What happens is that older versions of TRegEx didn't support "empty matches", while newer ones do. To preserve the old behavior, the structure is created with the roNotEmpty flag turned on by default:

  constructor Create(const Pattern: string; Options: TRegExOptions = [roNotEmpty]);

So you can now disable the PCRE_NOTEMPTY option (roNotEmpty in TRegEx) by passing an empty set of options:

  RegEx := TRegEx.Create(sPattern, []);

However, if you pass any other value to the constructor, it replaces the default set, so "not empty" is no longer applied. Once you know about this change, the workaround is to add this option explicitly, so the "compiled" code above should read:

  RegEx := TRegEx.Create(sPattern, [roNotEmpty, roCompiled]);
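To see the difference for yourself, here is a minimal console sketch that lists every match for the three option sets discussed above. The pattern `\d*`, the input string, and the ShowMatches helper are my own illustrative choices (they are not from the customer's code); I picked a pattern that can match an empty string so that the extra empty matches should show up only when roNotEmpty is missing.

```delphi
program RegExNotEmptyDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

// Hypothetical helper: creates the regex with the given options and
// prints the position, length and text of every match it reports.
procedure ShowMatches(const Caption, Pattern, Input: string;
  Options: TRegExOptions);
var
  RegEx: TRegEx;
  Match: TMatch;
begin
  Writeln(Caption);
  RegEx := TRegEx.Create(Pattern, Options);
  Match := RegEx.Match(Input);
  while Match.Success do
  begin
    Writeln(Format('  index %d, length %d: "%s"',
      [Match.Index, Match.Length, Match.Value]));
    Match := Match.NextMatch;
  end;
end;

const
  // Illustrative values only: \d* can match zero characters
  sPattern = '\d*';
  sInput = 'abc123';

begin
  // Same as the default parameter value of the constructor
  ShowMatches('[roNotEmpty]', sPattern, sInput, [roNotEmpty]);
  // Passing another set drops roNotEmpty, so empty matches can appear
  ShowMatches('[roCompiled]', sPattern, sInput, [roCompiled]);
  // The workaround: request both options explicitly
  ShowMatches('[roNotEmpty, roCompiled]', sPattern, sInput,
    [roNotEmpty, roCompiled]);
end.
```

If things work as described in this post, the first and third listings should be identical, while the second should include additional zero-length matches.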

By the way, TRegEx support has also been enhanced in XE7 to use a newer version of the library (PCRE 8.35) on Windows and mobile, while on Mac and for the iOS simulator the TRegEx code binds to the version of the library available on the system. You can read about this at docwiki.embarcadero.com/RADStudio/XE7/en/What%27s_New_in_Delphi_and_C%2B%2BBuilder_XE7#PCRE_8.35_for_Windows_and_Mobile_Platforms.

Now it is debatable whether this is optimal (there might be subtle differences between versions, while using system libraries helps reduce the application footprint), but that's not the real point of this blog post.