Sunday, February 2, 2014

Memory leak detection in a real world application

    Some time ago I was performing a soak test of the application in order to minimize the possibility of problems after final production deployment. I discovered some interesting facts.
Before the execution of a soak test I enabled GC logging:
-Xloggc:gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
During test execution I enforced Full GCs (in 'zero load' periods - ) in order to check what is the current state of the memory occupation. You can enforce GC by means of JConsole (once you enable JMX connection) or jmap:
jmap histo:live pid
histo:live means that only live objects will be presented - non reachable objects will be removed during Full GC process. This is memory footprint which I obtained (some details were removed and the values were changed so please do not pay your attention to the values but to correlation and tendency):
3.922: [GC [PSYoungGen: 262656K->36090K(306176K)] 262656K->36162K(1005568K), 0.0536500 secs]
5.645: [GC [PSYoungGen: 298746K->43488K(568832K)] 298818K->46215K(1268224K), 0.0752860 secs]
13.123: [GC [PSYoungGen: 568800K->43504K(568832K)] 571527K->79558K(1268224K), 0.1472590 secs]
24.819: [GC [PSYoungGen: 568816K->43500K(655872K)] 604870K->100741K(1355264K), 0.1700250 secs]
32.260: [GC [PSYoungGen: 655852K->43488K(655872K)] 713093K->155914K(1355264K), 0.1583190 secs] 
929.341: [GC [PSYoungGen: 655840K->43515K(514560K)] 768266K->200665K(1213952K), 0.1068270 secs]
6475.346: [GC [PSYoungGen: 514555K->51800K(585216K)] 671705K->211738K(1284608K), 0.1081810 secs]

AFTER AROUND 2 HOURS JVM WAS WARMED UP (EVERY REGION OF THE APPLICATION WAS TOUCHED) SO I PERFORMED FULL GC:

6902.523: [Full GC [PSYoungGen: 544K->0K(595212K)] [ParOldGen: 427467K->161823K(699392K)] 
[PSPermGen: 154437K->153089K(307712K)], 1.2439520 secs]

AFTER SOME MORE TIME:

258654.523: [Full GC [PSYoungGen: 1453K->0K(595212K)] [ParOldGen: 487154K->165154K(699392K)]
[PSPermGen: 154437K->153089K(307712K)], 1.5154321 secs]

AND ONE MORE:

648976.523: [Full GC [PSYoungGen: 5447K->0K(595212K)] [ParOldGen: 427467K->176173K(699392K)]
[PSPermGen: 154437K->153089K(307712K)], 1.1874563 secs]
It does not look very good, does it? It seems that there is a small memory leak! After each Full GC cycle the application consumes more and more memory. I performed Full GC when there was no load so after some warm up phase the memory should stay more or less at the same level. In order to investigate what happened under the hood I performed a heap dump:
jmap -dump:file=dump.bin pid
and I used Eclipse Memory Analyzer to examine previously created file. Here is what I saw:


The details of memory leak detection, histogram and dominator tree are all very useful:


It seemed that connection objects were not GCed. Obviously it was not a connection leak (connection pool size was set to about 50 and application was working fine. Because of the fact that some connections could not be reused [e.g. they were closed by DB server], some new connections were established and their objects were accumulated). But why?
As Kirk Pepperdine claims - we should find the exact line in the code that is responsible for the problem. Basically, we have one! The variable m_binxmlprocs from XMLType is suspicious. In order to check that line I decompiled XMLType class and then everything became clear. The role of a static field m_binxmlprocs was to point to cache object. The key of the cache map was the connection object. But... wait a minute! They are kept in SoftHashMap so they will be removed once there is a lack of memory. It does not mean that there is any memory leak. It's Oracle XDB's implementation of SoftHashMap so we should be fine. However, if you do not trust that implementation you can always invoke XMLType.cleanCache() explicitly when the map reaches desired size just to sleep a little bit better ;).

3 comments :

  1. Hi, we've had similar problems with our application. I wouldn't be so quick to trust the Oracle SoftHashMap implementation if I were you! See here: http://piotr.bzdyl.net/2014/07/memory-leak-in-oracle-softhashmap.html

    I believe they have fixed this in recent patch sets but if you're on an older version of the JDBC drivers you still potentially have this problem.

    ReplyDelete
    Replies
    1. My intuition told me to clean these references periodically as I stated: "However, if you do not trust that implementation you can always invoke XMLType.cleanCache() explicitly when the map reaches desired size just to sleep a little bit better ;).".

      Delete
  2. شركة كشف تسربات المياة تقدم لك بعض الأمور التى اذا لاحظتها فلن تجد افضل من شركة كشف تسربات المياة بالمدينة المنورة لحل مشكلتك وهى ارتفاع فاتورة استهلاك المياه. وظهور تصدعات في أجزاء المبني,سماع صوت فى الانابيب ولايمكن التعرف عليه
    كشف تسربات المياه بالمدينة المنورة
    وايضا ظهوربقاع للمياه علي الحوائط والجدران وايضا تسريب للمياه ان الكشف عن تسربات المياه هو الحل الوحيد لمشكلة تسريب المياه والماء هو منبع واساس ومصدرالحياة فلكي نحافظ علي المياه يجب الكشف الدوري وبصفة مستمرة لان المياه لا يمكن الاستغناء عنها ولا يمكن العيش بدونها. عميلنا العزيز هل تعاني من مشكلة تسربات المياه ولجات لكل الطرق وبدون فأئدة وبدون جدوي من الان فصاعدا لا تقلق عميلنا العزيز فمع شركة كشف تسربات المياه التي سوف تقضي علي قلقك نهائيا من ارتفاع فواتير المياه ولن تعاني بعد الان من ظهور الروائح الكريهة اجهزة كشف تسربات المياه تتمكن من معالجة جميع اشكال التسريب نهائيا
    شركة كشف تسربات المياه بالمدينة المنورة

    و بصفة دائما كشف تسربات تتواجد بجوارك دائما وتتشرف وتسعد بخدمتكم دائما وبصفة مستمرة و دائمة.

    ReplyDelete