November 28, 2012

DataSnap, Deployment, Performance, and More

My DataSnap webinar is running right now, but this blog post was prompted also by a performance post last week. Here are some comments and tips.

My DataSnap webinar is running right now (I'm listening to myself while blogging). It will be available as a replay, and you can still attend the second and third rounds this evening and during the night (European time). To sign up, enter your data at http://forms.embarcadero.com/forms/AMUSCA1211DelphiMulti-tierMCantu11-28.

There is an introductory video at http://www.embarcadero.com/rad-in-action/development-and-deployment-of-delphi-multi-tier-applications and the same site will soon host the webinar white paper, a rather detailed document of well over 40 pages, which doesn't repeat my past white papers but adds a lot of new content.

This blog post, however, was also prompted by a DataSnap performance post last week on robertocschneiders.wordpress.com/2012/11/22/datasnap-analysis-based-on-speed-stability-tests/. While I partially disagree with the testing scenario, it does highlight a few important elements. My disagreement comes from the fact that having a rather sophisticated architecture for processing complex data types, managing sessions, mapping methods, and a lot more will never compete on speed with "simpler" architectures. I'm not saying that simpler architectures are not more effective (and DataSnap can certainly learn from leaner approaches), but you have to compare solutions offering similar features.

There are other issues that show up, as I'll highlight in a second, but for most of them I know a lot of the fault is on our side: we should document the alternatives better, have wizards generating more optimized versions, and make it easy to pick different configurations. After this long preamble, let me cover a few technical points.

Thread Pooling

 

Regarding threading, creating a new thread for each incoming request is Indy’s IdHTTPServer default configuration, but you can tune it by adding code to the server main form, which creates and manages the Web server component. The OnCreate event handler of this form initializes the HTTP server with the code:

procedure TForm1.FormCreate(Sender: TObject);
begin
  FServer := TIdHTTPWebBrokerBridge.Create(Self);
end;

Now we can change the configuration to use a thread pool, pre-allocating a number of threads for the incoming concurrent requests:

procedure TForm1.FormCreate(Sender: TObject);
var
  SchedulerOfThreadPool: TIdSchedulerOfThreadPool;
begin
  FServer := TIdHTTPWebBrokerBridge.Create(Self);

  SchedulerOfThreadPool := TIdSchedulerOfThreadPool.Create(FServer);
  SchedulerOfThreadPool.PoolSize := 50;
  FServer.Scheduler := SchedulerOfThreadPool;

  FServer.MaxConnections := 150;
end;

As you can see in the last line above, you can also put a limit on the number of concurrent connections (MaxConnections): after reaching this limit, the server will stop accepting further requests and simply return an error. This value should probably be quite high, but it is relevant, as it will prevent the server from going down in case of an attack or simply an excessive workload. It is often better to return a clear error message (“too many connections”) than to simply fail to respond under the load.

Sessions Management

DataSnap creates sessions automatically, one for each request by a new client. Whether a client is new is determined by looking at extra headers, cookies, or query parameters. If nothing is found, a new session is created. So if you write a custom application for testing and make a plain HTTP call (maybe in a loop), you end up creating a server-side session for each request. These sessions will stay in memory for 20 minutes after the last request, by default.

What I have added to my testing code is a first call to create and return the session information, which is then copied to HTTP extra headers. Here are the lines I added before the calling loop:

        IdHTTP1.Get(strUrl);
        strSession := Copy(
          Idhttp1.Response.RawHeaders.Values ['Pragma'], 1, 30);
        IdHTTP1.Request.CustomHeaders.Clear;
        IdHTTP1.Request.CustomHeaders.AddValue('Pragma', strSession);
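The fixed length of 30 characters used above assumes the session token always has the same size. A reader comment on this post points out that the token is sometimes shorter and is followed by a comma, so a safer approach is to cut the value at the first comma. Here is a minimal sketch of that idea as a standalone console program; the function name ExtractSessionToken and the sample Pragma value are made up for illustration and are not part of DataSnap:

```pascal
program SessionTokenDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper (not a DataSnap API): returns the part of the
// Pragma header value that precedes the first comma, or the whole
// value if there is no comma.
function ExtractSessionToken(const PragmaValue: string): string;
var
  CommaPos: Integer;
begin
  CommaPos := Pos(',', PragmaValue);
  if CommaPos > 0 then
    Result := Copy(PragmaValue, 1, CommaPos - 1)
  else
    Result := PragmaValue;
end;

begin
  // Sample value invented for this example; real values come from
  // IdHTTP1.Response.RawHeaders.Values['Pragma']
  Writeln(ExtractSessionToken('dssession=123.456.789,dssessionexpires=1200000'));
  // prints: dssession=123.456.789
end.
```

In the testing code, the result of this function would replace the Copy call before the value is added to the custom headers.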

As an alternative you can add the sessionid in the URL:

  sURLSessionEquals = '?SESSIONID=';
  strUrl := strUrl + sURLSessionEquals + strSessionID;

If you want to make sure this code is effective you can look at the server memory consumption, but also do a direct test by adding this code to the server application:

  ShowMessage(IntToStr(TDSSessionManager.Instance.GetSessionCount));

This second change keeps memory consumption under control in the testing scenario, but again does not really affect the throughput. 

Another way to change the session management behavior is to ask the server to close the session from within a server method, by using one of the custom parameters of the "invocation metadata":

 

function TServerMethods1.ReverseString(Value: string): string;
begin
  GetInvocationMetadata.CloseSession := True;
  Result := System.StrUtils.ReverseString(Value);
end;

You might want to add this code to a specific "Close" method a client could call when leaving the application.
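Such a dedicated method could look like the following sketch. The method name CloseSession and its Boolean result are my assumptions for illustration (any public server method would do), while the CloseSession flag of the invocation metadata is the same one used in ReverseString above. The fragment belongs in the server methods unit and relies on the Data.DBXPlatform unit for GetInvocationMetadata:

```pascal
// Hypothetical server method (name and signature are assumptions):
// a client calls it once, right before leaving the application.
function TServerMethods1.CloseSession: Boolean;
begin
  // Ask DataSnap to drop the current session when this call completes,
  // rather than waiting for the session timeout
  GetInvocationMetadata.CloseSession := True;
  Result := True;
end;
```

With a method like this, regular calls keep reusing the session and only the final call releases it, instead of leaving it in memory until the timeout.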

Threading Crashes

Finally, the blog post mentioned above refers to a pretty severe crash in the server when two threads hit it at the same time. This has been confirmed... and fixed by the R&D team. Actually the main issue is fixed, but other related issues are still under investigation. The fix will appear in a coming Delphi XE3 update, but I can make at least the main workaround available if anyone is currently blocked.

As I showed in my webinar, with the fixes applied the server can receive a number of concurrent requests without problems. While I value speed and memory considerations, I think stability is even more important, and having improvements in the coming update is certainly relevant.

Winding Down

There were many interesting questions in the webinar (just ended) and I'll soon have the white paper and the demos to download, so expect other blog posts on this topic soon. I think DataSnap is relevant and one of my goals is to push it significantly in future versions of Delphi, so feedback is appreciated. 

 





 

24 Comments

DataSnap, Deployment, Performance, and More  

I have not done my testing yet, so I cannot really talk 
about the results and workaround, but I really 
appreciate the way the Project Manager is taking 
the community input into consideration. This is a step 
forward in managing Delphi.

Thanks Marco.
Comment by Fabrizio Bitti on November 28, 16:46

DataSnap, Deployment, Performance, and More  

Marco,

Thank you for addressing the issue so quickly. I really appreciate your 
response, as we're about to start a new development based on DataSnap 
and were really concerned by the results published.

Regards,
Mario
Comment by Mario Enriquez [http://www.multidomus.net] on November 28, 17:36

DataSnap, Deployment, Performance, and More  

It would be nice to have a complete sequence showing 
step by step:
1) creating a DataSnap server saying 'Hello world' and 
exposing the ReverseString function in several 
variations (Web, Windows service, ISAPI, etc.).
2) deploying this server on EC2 or Windows Azure
3) configuring IIS 7 to access the server (Web, 
Windows service, ISAPI, etc.)
4) creating a DataSnap client that communicates with 
the remote server.
A small video of 10 minutes = 1,000 new customers!
Comment by Didier on November 28, 18:16

Performance 

You can have a lot of features with no speed penalty.
WCF is one good example of it.
It depends mainly on the implementation architecture, 
and I suspect DataSnap was never meant to be fast nor 
scaling from its design.

I'm happy Embarcadero R&D found out the crashing issue 
of DataSnap, but it is a pity that it was observed so 
easily at first benchmarking.
The bottleneck and instability discovered with "hello 
world" do not inspire confidence in this 
platform. Bottlenecks are bottlenecks, as my captain 
said.

If DataSnap does not handle such simple requests, how 
would it resist a basic DoS attack?

With mORMot, we tried to develop a KISS but scaling 
RESTful/JSON framework, able to be the fastest for 
simple "Hello World" requests like in the blog post, 
but also still the fastest when all its features are 
enabled, like sessions, security (e.g. per-url 
signature and Windows authentication), logging, 
caching, balancing, interface-based services with 
automatic endpoints configuration, client-server ORM, 
and very fast database access with direct sending of 
the JSON buffers to the clients.
Some upcoming features, like event-driven processing 
(which does not exist as such in WCF), will be easily 
added, due to its modular architecture.

When comparing n-Tier solutions, we should compete 
with the existing and established platforms. In our 
case, it is WCF.
If we are not easier to configure/deploy, and faster 
to process, with at least similar features, and 
possibly some unique possibilities, Delphi-based 
solutions won't be worth considering.
ROI should be maximized.
Comment by A. Bouchez [http://synopse.info] on November 28, 18:43

DataSnap, Deployment, Performance, and More  

This somehow reminds me of the unfair comparison
between Delphi and JBuilder that I saw a long time ago
at a BorCon.
The comparison was the allocation of an object in both
languages, doing some math operation and releasing the
object, repeated 10000 times. Guess who was faster and
why? Does that mean that the fastest language was
really the fastest? What about CPU resources being
used by the RTE?
The test performed against DataSnap seems to be flawed
from its inception because it does not try to mimic what
the real application will do.
A proper evaluation will clearly point strong and weak
points of all frameworks. Thus allowing a more
informed decision on which one is better suited for
the job at hand.

  
Comment by Alan Fletcher [http://delphibistro.com] on November 28, 19:29

DataSnap, Deployment, Performance, and More  

Hello Marco.

I understand and respect your opinion. Surely the 
server I used in testing is not optimized, but we will 
work to optimize it, and then I'll redo the tests 
(with the fixes already made by Embarcadero).

I know the frameworks that I've tested have different 
features, for different purposes; I put it in the text. 
I also wrote that my goal was not to compare the frameworks, 
but rather to test the performance of DataSnap!

I want to make clear (maybe I gave the wrong 
impression) that I am not asking for a Wizard with more 
options where everything is configurable through clicks. 
None of the other servers I have developed has Wizards 
and they are still working perfectly. I think the 
documentation of DataSnap is very weak. I am not 
demanding that everything be easy to implement, I just need 
to have documentation that tells me how the framework 
works, so I can implement it.

Thanks for all the support that I'm receiving from you, 
Marco. It's a shame the support of Embarcadero (at 
least in Brazil) does not follow the example and does 
not help us.

Comment by Roberto Schneiders [http://robertocschneiders.wordpress.com] on November 28, 20:49

DataSnap, Deployment, Performance, and More  

 Hello Marco,

Thanks for your comment.

I have been learning Delphi programming for 3 
years, and I still haven't found a decent and easy 
way to have my Delphi client work with Azure or 
Amazon databases, in a sort of 2 or n-tier 
application.

So I didn't upgrade from XE2 to XE3. I believe my 
next upgrade will be when I can easily and quickly 
work with Cloud databases from a Delphi client on a 
PC or a Mac thanks to Delphi native components, easy 
methods + easy operation manual.

Best Regards, Paul
Comment by paul Z. on November 28, 23:02

DataSnap, Deployment, Performance, and More  

Paul, I think we have good support for Amazon and Azure in Delphi. 
What is exactly missing? Unless some of the features are in Enterprise 
only, but I don't think so...

-Marco
Comment by Marco Cantu [http://www.marcocantu.com] on November 29, 09:57

DataSnap, Deployment, Performance, and More  

I really appreciate the way you are dealing with 
DataSnap problems, Marco. I also don't agree with the 
testing scenario, but facing the fact that DataSnap is not perfect 
and needs some attention is the way to go. Keep up the 
good work!
Comment by Alexandre Machado [http://alexandrecmachado.blogspot.com] on November 29, 10:23

DataSnap, Deployment, Performance, and More  

[quote]
My disagreement comes from the fact that having a 
rather sophisticated architecture for processing 
complex data types, managing sessions, mapping 
methods, and a lot more will never compete on speed 
with "simpler" architectures.
[/quote]
I'm not sure if you have ever used Jersey but I can 
assure you that it can handle complex data types, 
managing sessions, mapping methods, and a lot more. 
And using small frameworks like Yammer's Dropwizard 
on top of it, development is fast and enjoyable. And 
that's coming from a programmer using and loving 
Delphi since 1995 that hates Java's verbosity and 
complexity.
The thing is that Java with Jersey (as mORMot and 
others) can handle Hello World and very high loads 
and complex scenarios without problem (as Yammer does 
internally with Jersey/Dropwizard). The main problem 
Java has is that the HotSpot JIT uses lots and lots 
of memory compared to native languages.
While I thank you for the way you are handling this,
I fully agree with A. Bouchez: "You can have a lot of 
features with no speed penalty".
So perhaps the benchmark is flawed, but your 
disagreement seems to be unfounded.

Comment by Carlos Sanchez on November 29, 14:56

DataSnap, Deployment, Performance, and More  

Seems I wasn't clear in my comment about features vs. speed. I'm 
saying I think a benchmark showing more real world scenarios (sessions, 
database access, complex data structures) would make more sense to 
test a solution in the real world.

I'm not saying that other frameworks would become slower in that case, 
as they'll easily keep a strong lead. But if my thread responds in a tenth 
of a millisecond when not accessing a DB and is 10 or 100 times slower 
when DB access is involved, some of these factors will become less relevant.

Truly, it is not that DataSnap will shine in JSON processing... nor that it 
provides the most complete architecture (I did extend it personally with 
many new features including scripting and ActiveRecords).

For the average business operation, once you have decent speed and rock 
solid stability (still not there), database mapping, JavaScript UI mapping, 
architectures, and flexibility in development are as important as raw 
speed.

That's the direction I see for DataSnap: getting better, not trying to 
compete speed-wise but rather on the ability to create business applications 
with different clients (mobile, web) quickly and reliably. That would be a good 
enough goal in my opinion.

-Marco
Comment by Marco Cantu [http://www.marcocantu.com] on November 29, 15:10

DataSnap, Deployment, Performance, and More  

Marco,

Thanks, I understand. I would love to use Delphi for 
back-end systems but so far the options have not been 
as good as what is found elsewhere. If Embarcadero 
can reach the goal you are stating it will be a 
really good option.
On the other hand, I think that there is something 
wrong with the development and marketing direction of 
Embarcadero. Delphi has had a back-end development 
framework since at least the original MIDAS but 
almost nobody outside of the Delphi community knows 
or cares about it. Look at what node.js has done 
since 2009 and it is easy to think that they are 
doing something good (not perfect as it has its own 
problems) and "we" are doing something wrong or not 
good enough.

Regards.
Comment by Carlos Sanchez on November 29, 15:29

DataSnap, Deployment, Performance, and More  

> But if my threads responds in a tenth 
> of a millisecond when not accessing a DB 
> and 10 or 100 times slower when DB access
> is involved

In that case the threading model becomes even more
crucial, as the longer it takes to query the DB, the
higher the number of concurrent requests happening at
the same time.

If the server is already burdened by inactive threads
handling inactive connections, it will scale much more
problematically than a server where threads are only
used for useful work.
Comment by Eric [http://delphitools.info] on November 29, 16:50

DataSnap, Deployment, Performance, and More  

There is another optimization which I found with Indy
HTTP Server which also applies to (Indy-based)
DataSnap REST servers: setting Server.KeepAlive to
True. This avoids the creation of new open sockets in the
operating system for every request (easily monitored
with TCPView). With HTTP KeepAlive, the number of used
sockets will be limited to the number of active clients. 

In the HTTP 1.1 specification, keep-alive is the default:

"A significant difference between HTTP/1.1 and earlier
versions of HTTP is that persistent connections are
the default behavior of any HTTP connection. That is,
unless otherwise indicated, the client SHOULD assume
that the server will maintain a persistent connection,
even after error responses from the server."

However, in DataSnap it is disabled by default. 

So I would like to suggest to add a recommendation to
the optimization tips to set Server.KeepAlive to True.
Comment by Michael Justin [http://mikejustin.wordpress.com/] on November 30, 08:59

DataSnap, Deployment, Performance, and More  

@Eric: I don't think you are right about this. When the 
database back-end is the bottleneck, it doesn't matter if 
your REST library can handle 1 zillion simultaneous 
requests... Maybe it won't kill your server, but if 
your server is receiving 1.000.000 requests per second 
and your back-end can only handle 100.000, your 
application won't work for 90% of your users anyway.
In the case of Roberto Schneiders's tests: If his back-
end, in a real test case scenario, can only handle, 
say, 50 requests per second (and considering that his 
app wouldn't request more than that), DataSnap could 
handle it (Of course, after applying the crash bug 
fix). 
Comment by Alexandre Machado [http://alexandrecmachado.blogspot.com] on November 30, 10:25

DataSnap, Deployment, Performance, and More  

By the way, with the new anonymous functions in Delphi 
it's possible to make a promise framework for async 
operations.

Like:
  db.runQuery(...).then (
    procedure(..);
    begin
      // do something when the query is ready
    end
  )

It would simplify server-side multithreading development.
Comment by Maxim [] on November 30, 17:45

DataSnap, Deployment, Performance, and More  

 @Maxim

As long as it comes with its own scheduler and 
something to convert the regular blocking calls from 
DB drivers and others into async calls. If it 
simulates async with 1 thread per promise it will have 
bad performance.
And as long as the programmer avoids callback hell 
http://callbackhell.com/
Comment by Carlos Sanchez on November 30, 23:19

DataSnap, Deployment, Performance, and More  

I'm hopeful that DataSnap will continue to receive
attention.  Most of the developers that were
significantly involved in dbx DataSnap have either
moved on to other projects (Jim -> LiveBindings) or
have left Embarcadero (Steve, Adrian, Mathew, others).

I was concerned that Embarcadero might think DataSnap
was good enough, and no resources were given to mature
and extend DataSnap further.

Also, please don't forget the TCP transport.  Many new
exciting features were added for HTTP and REST in
Delphi XE that were not supported by TCP.  Some of
that was corrected in XE2, like the session manager.

We, for one, have an architecture that expects each
client connection to be a separate thread.  This is
partly because in D2009, this was the only way to
identify each client connection.  If this changes, be
_SURE_ the change is well advertised and documented.

Thanks!
Comment by JonR on December 1, 16:04

DataSnap, Deployment, Performance, and More  

I am impressed by the way you've handled this problem
Marco. Admitting that something is wrong and fixing
the problem is the best way for building trust among
the customers. 
Comment by Unspoken on December 2, 13:04

DataSnap, Deployment, Performance, and More  

@Alexandre:
> When the database back-end is the bottleneck,
doesn't matter if
> your REST library can handle 1 zillion simultaneous 
requests...

Actually it does, for many reasons, like
1) being able to return a proper error message (rather
than just rejecting connections)
2) the concurrency of Enterprise-class databases is
much higher than the 150 cited here
3) long query times may not be a db back-end bottleneck,
but load balancing, network times, I/O fetches, etc.
4) multiple queries in a single transaction can last a
"long" time, from the POV of middleware while not
being bottlenecked on the db at all

Essentially, such a low load limit introduces
artificial limitations at a level that could (and
should) be essentially transparent in terms of
workload. It's bound to create problems where there
should be none.
Comment by Eric [http://delphitools.info] on December 3, 07:47

DataSnap, Deployment, Performance, and More  

@ Carlos
Why do you think one-thread-per-promise is not enough?
If threads and connections are pooled?
What is inside node.js? Is it so difficult to 
implement?
Comment by Maxim [] on December 4, 06:01

DataSnap, Deployment, Performance, and More  

@Maxim

Yes! That should be enough if the pool size can be 
adjusted at startup time. I forgot about pooling 
threads and just thought about 1 thread per promise.

On the other hand I prefer the style of coding that 
solutions like Python's gevent bring to the table. No 
callbacks, just "synchronous" coding thanks to monkey 
patching.
 
When I spoke about node.js I was thinking of the 
marketing and projects built around it (the entire 
ecosystem); not the actual implementation which by 
the way, has good performance. I mean, in less than 3 
years they went from new kid on the block to being 
used in big enough projects instead of Java and with 
good results.

Yes, this is getting off topic so I leave it as it is.
Comment by Carlos Sanchez on December 4, 23:23

DataSnap, Deployment, Performance, and More  

Hi Marco, could you provide the source code for the 
jQuery Mobile sample? I'd like to know how to convert 
JSON array data to a list.

Best regards,
Alexandre.
Comment by Alexandre [http://www.korp.com.br] on December 7, 17:27

DataSnap, Deployment, Performance, and More  

In the section "Sessions Management" you determine the 
session id with this code:

strSession := Copy(Idhttp1.Response.RawHeaders.Values 
['Pragma'], 1, 30);

Notice that the session id is sometimes shorter, from 
what I can see, so the copy includes the "," that 
follows the session id.

I think the following code would be better:

strSession := Copy(Idhttp1.Response.RawHeaders.Values 
['Pragma'], 1, Pos(',', 
Idhttp1.Response.RawHeaders.Values ['Pragma'])-1);

Comment by Peter Sawatzki on February 2, 21:41

