Not unexpectedly, there is debate in the Delphi community around strings in the NextGen compiler, touching on string immutability, copy-on-write, the loss of the AnsiString type in Delphi's new iOS compiler, and the like. Some sources are:
- forums.embarcadero.com/thread.jspa (very long thread)
Before I start, let me clarify that this blog post is not the result of any internal debate or discussion, so it is not an official company position. It does reflect my understanding of the current scenario and it is a direct reaction, after reading some of the posts above (which do contain a lot of insight and good ideas).
Let me start with what I posted as a comment to the first blog entry above:
Very nice roundup about the features and advantages of strings in Delphi compared to other platforms. I agree on some speed loss compared to FastCode, but mostly because we moved away from assembly (given platform portability is very relevant for Delphi today). Also, TStringBuilder can be faster in the scenarios it beats the native memory manager in terms of pre-allocations, something FastMM4 does pretty well on Windows, the native iOS allocator doesn’t seem so smart doing.
Stay assured, there is no plan to change anything going forward unless there is a proven advantage. This is particularly true for immutable strings. They could provide some advantage (for example in heavy multi-threading, when managing string fragments, and in other scenarios)… but will also have disadvantages. At this point there is no definitive decisions on this. Strings in the current Delphi iOS compiler are NOT IMMUTABLE, there is only a warning indicating where things might change in the future. Native might change, not will certainly change.
The last part is key to the debate. The current implementation of strings is still centered around the COW (Copy-On-Write) mechanism, with some really minor differences. And you can change a string in each of the following ways in Delphi for iOS (aka, our NextGen compiler):
Delete(S, 1, 2); // not allowed
Insert(S, 1, 'He'); // not allowed
S := 'T'; // not allowed
I grabbed these three lines from the forum thread. The first and the latter (and their TStringHelper equivalents) are still and will be allowed in the future. The only question is if their performance will stay the same or decrease, which will make sense if most other string operations will become faster. As for the last statement, this triggers a warning in Delphi for iOS (a warning you can disable), indicating a potential future compatibility. This means there is an internal evaluation of truly immutable strings, no decision taken.
We more than welcome input, but better if based on some substance. Immutable strings are used in some languages to help the garbage collector, and that's something we don't need (given we don't have and don't plan adding a GC in Delphi). But they are used in other language that break away from the concept of linear strings (string s= array of characters) and use a concept of string fragments (string = tree of immutable string fragments, potentially shared by different strings). Again, this is just a research area, and there is nothing decided. Unless we prove significant advantages, we won't deviate from the current model.
So getting back to the thread, at one point someone suggested a petition around 4 requests:
- the loss of AnsiString (and single byte Chars)
- that strings may become immutable
- that strings are either 1-based or 0-based
- the re-implementation of RTL functions with rather suboptimal code
Let me comment in the reverse order:
- Improving RTL functions is always a goal. honestly, with fewer RTL functions (and fewer string types) it becomes easier to focus on those.
- For now, 1-based or 0-based is just a local compiler option, not sure if the request is to keep it this way...
- Immutability, as described above, is a potential future, which would be implemented only if there is a proven advantage
- The loss of AnsiString is the most controversial issue. The problem is that it is also the loss of UTF8String. And of AnsiChar. And of using pointers to directly manage character buffers. Which is not really lost, because this is what we are doing internally and suggesting externally. I do understand the request, but it seems to have different implications for different people (and string formats). Optimizing for English language is not a primary goal, as Delphi is used a lot around the world. Improving the management of raw dynamic arrays of bytes storing characters is certainly on the table.
Two more (short) comments related with other complains.
The first is around the fact we are supposedly "dumbing down" the language or making it "Java-like". I really disagree this is the case. Adding features shared by many modern languages doesn't imply dumbing down, but might have the effect of making the language easier to learn and use. Letting the power users go ahead with all the tricks they know of. Cleaning up the large collection of string types goes in this direction. The fact a type is not native implies that an operation requires a function call rather than an internal (magic) function call, but I don't see a huge difference in using a record with methods versus a native type, for any complex operation.
The second is the idea that we should help maintaining the code on very old versions of Delphi. This is quite debatable. We are adding language features across compilers (Delphi ships with 5 compilers today) for platform cross-platform compatibility. Worrying about how to write Delphi code that shines today and still works on Delphi 6 (a twelve years old development tool) would be like asking Microsoft to add WinRT support in Visual Basic 6.
Now having said this we are open to feedback, particularly constructive feedback and suggestions. And would someone want to start an open source project to implement a few cross-platform Delphi string types (AnsiString, UTF8String), I'll be willing to contribute.