Tuesday, September 07, 2010

AXIS2 - ConfigurationContextFactory - watch out!

The story: 
Well, it was quite an unusual thing to see. The production system was showing signs of multiple idle Timer threads being created, and rapidly. The more the load increased, the more Timer threads were created. That is not good, and the thread dumps and JVM statistics indicated that something was wrong.

This abnormal thread creation could lead to dangerous thread starvation problems, even on a system with high capacity (in every respect).

Spotting the problem? Almost.
After some research, a colleague spotted that the creation of these timers was related to a subsystem responsible for performing web service requests to an external system. The subsystem was built around the AXIS2 Web Service Framework. With a quick code review he saw that there was indeed a case where the web service stubs were not reliably calling their clean up methods! So we provided this fix, happy.

Lesson One:
When using web service stubs in your code (let's say in the context of AXIS2), always include finally blocks that perform the stub clean up.


Of course, if you have any other custom clean up code, make sure you call it too!
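A minimal sketch of the pattern from Lesson One. FakeStub is a hypothetical stand-in so the example runs without the AXIS2 jars; it is not our production code. As far as I know, the generated AXIS2 stubs inherit a cleanup() method from org.apache.axis2.client.Stub, but check that against the version you are using.

```java
// Hypothetical stand-in for a generated web service stub (not a real AXIS2 class).
class FakeStub {
    private boolean cleanedUp = false;

    // Simulates a remote call; fails on a null request.
    String call(String request) {
        if (request == null) {
            throw new IllegalArgumentException("request must not be null");
        }
        return "response to " + request;
    }

    // Simulates releasing the resources held by the stub.
    void cleanup() {
        cleanedUp = true;
    }

    boolean isCleanedUp() {
        return cleanedUp;
    }
}

class StubCleanupDemo {
    // Invokes the stub and always cleans it up, whether the call succeeds or throws.
    static String invoke(FakeStub stub, String request) {
        try {
            return stub.call(request);
        } finally {
            stub.cleanup();
        }
    }

    public static void main(String[] args) {
        FakeStub ok = new FakeStub();
        System.out.println(invoke(ok, "ping"));
        System.out.println("cleaned up after success: " + ok.isCleanedUp());

        FakeStub failing = new FakeStub();
        try {
            invoke(failing, null);
        } catch (IllegalArgumentException e) {
            System.out.println("call failed: " + e.getMessage());
        }
        System.out.println("cleaned up after failure: " + failing.isCleanedUp());
    }
}
```

The point of the finally block is the second case: the clean up still runs when the call throws, which is exactly the path that is easy to forget.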

Problem solved, no?
Despite the fact that a potential bug had indeed been found, the system was still creating Timer threads, far too many. Debugging on the test environment proved it: the missing clean up was not the cause (in this case).

So what?
I really needed a starting point, so after acquiring the source of the AXIS2 engine plus the modules' source (an external partner library), I had to see where the problem was. Our code was certainly not trying to be smart, so the problem had to be either in the libraries or in some obvious thing in our code.

Problem : ConfigurationContextFactory
The error involved this class, but you cannot really blame the developer either. This class is responsible for creating a ConfigurationContext for the Axis2 engine. You may create it from different types of source (File, URI, etc.). Creating this context (depending on your case) is quite a heavy operation. The code reads multiple files and properties, and tries to do some smart things regarding initialization.

I could clearly see the mistake: the factory's create method was being called on every web service stub creation. The initial developer on our side thought that the Factory would have some sort of singleton implementation, and that it would cache the created instance and hand it back.

Unfortunately that is not the case. Every call to create produces a new instance, and this instance, upon initialization by the DeploymentEngine and DEPENDING on your configuration, instantiates scheduler threads that support the hot web service deployment features. (Check your axis config properties for this feature.) In our case this was enabled by the web service client (partner library).

One Solution (Lesson two):
Our problem was solved by implementing a context factory with a singleton and re-using that instance. As indicated in this forum thread and this one, this is the right way to go, but make sure no synchronization problems are introduced when multiple threads change the context dynamically.
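A minimal sketch of the singleton idea, not the exact code we shipped. HeavyContext is a hypothetical stand-in for the AXIS2 ConfigurationContext, and SharedContextHolder is a hypothetical name. In real code the instance would be built once by a ConfigurationContextFactory call (for example createConfigurationContextFromFileSystem) and then handed to every stub. The sketch uses the initialization-on-demand holder idiom, which relies on JVM class initialization to stay thread safe without explicit locking.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the expensive ConfigurationContext.
class HeavyContext {
    // Counts how many contexts were ever constructed.
    static final AtomicInteger CREATED = new AtomicInteger();

    HeavyContext() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical holder that builds the context once and re-uses it.
class SharedContextHolder {
    private SharedContextHolder() {
    }

    // The JVM initializes this nested class lazily and exactly once.
    private static class Lazy {
        static final HeavyContext INSTANCE = new HeavyContext();
    }

    static HeavyContext get() {
        return Lazy.INSTANCE;
    }
}

class SharedContextDemo {
    public static void main(String[] args) throws InterruptedException {
        // Several threads ask for the context at the same time.
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> SharedContextHolder.get());
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("contexts created: " + HeavyContext.CREATED.get());
    }
}
```

The holder only guarantees that the context is created once. It does nothing to protect later changes to the shared context, so the synchronization warning above still applies.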

The fix proved to be simple enough, but the problem it solved was quite a nasty one at the start. So watch out!
