Sunday, February 2, 2014

Memory leak detection in a real world application

    Some time ago I was performing a soak test of the application in order to minimize the possibility of problems after final production deployment. I discovered some interesting facts.
Before the execution of a soak test I enabled GC logging:
-Xloggc:gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
During test execution I enforced Full GCs (in 'zero load' periods - ) in order to check what is the current state of the memory occupation. You can enforce GC by means of JConsole (once you enable JMX connection) or jmap:
jmap histo:live pid
histo:live means that only live objects will be presented - non reachable objects will be removed during Full GC process. This is memory footprint which I obtained (some details were removed and the values were changed so please do not pay your attention to the values but to correlation and tendency):
3.922: [GC [PSYoungGen: 262656K->36090K(306176K)] 262656K->36162K(1005568K), 0.0536500 secs]
5.645: [GC [PSYoungGen: 298746K->43488K(568832K)] 298818K->46215K(1268224K), 0.0752860 secs]
13.123: [GC [PSYoungGen: 568800K->43504K(568832K)] 571527K->79558K(1268224K), 0.1472590 secs]
24.819: [GC [PSYoungGen: 568816K->43500K(655872K)] 604870K->100741K(1355264K), 0.1700250 secs]
32.260: [GC [PSYoungGen: 655852K->43488K(655872K)] 713093K->155914K(1355264K), 0.1583190 secs] 
929.341: [GC [PSYoungGen: 655840K->43515K(514560K)] 768266K->200665K(1213952K), 0.1068270 secs]
6475.346: [GC [PSYoungGen: 514555K->51800K(585216K)] 671705K->211738K(1284608K), 0.1081810 secs]

AFTER AROUND 2 HOURS JVM WAS WARMED UP (EVERY REGION OF THE APPLICATION WAS TOUCHED) SO I PERFORMED FULL GC:

6902.523: [Full GC [PSYoungGen: 544K->0K(595212K)] [ParOldGen: 427467K->161823K(699392K)] 
[PSPermGen: 154437K->153089K(307712K)], 1.2439520 secs]

AFTER SOME MORE TIME:

258654.523: [Full GC [PSYoungGen: 1453K->0K(595212K)] [ParOldGen: 487154K->165154K(699392K)]
[PSPermGen: 154437K->153089K(307712K)], 1.5154321 secs]

AND ONE MORE:

648976.523: [Full GC [PSYoungGen: 5447K->0K(595212K)] [ParOldGen: 427467K->176173K(699392K)]
[PSPermGen: 154437K->153089K(307712K)], 1.1874563 secs]
It does not look very good, does it? It seems that there is a small memory leak! After each Full GC cycle the application consumes more and more memory. I performed Full GC when there was no load so after some warm up phase the memory should stay more or less at the same level. In order to investigate what happened under the hood I performed a heap dump:
jmap -dump:file=dump.bin pid
and I used Eclipse Memory Analyzer to examine previously created file. Here is what I saw:


The details of memory leak detection, histogram and dominator tree are all very useful:


It seemed that connection objects were not GCed. Obviously it was not a connection leak (connection pool size was set to about 50 and application was working fine. Because of the fact that some connections could not be reused [e.g. they were closed by DB server], some new connections were established and their objects were accumulated). But why?
As Kirk Pepperdine claims - we should find the exact line in the code that is responsible for the problem. Basically, we have one! The variable m_binxmlprocs from XMLType is suspicious. In order to check that line I decompiled XMLType class and then everything became clear. The role of a static field m_binxmlprocs was to point to cache object. The key of the cache map was the connection object. But... wait a minute! They are kept in SoftHashMap so they will be removed once there is a lack of memory. It does not mean that there is any memory leak. It's Oracle XDB's implementation of SoftHashMap so we should be fine. However, if you do not trust that implementation you can always invoke XMLType.cleanCache() explicitly when the map reaches desired size just to sleep a little bit better ;).

4 comments :

  1. Hi, we've had similar problems with our application. I wouldn't be so quick to trust the Oracle SoftHashMap implementation if I were you! See here: http://piotr.bzdyl.net/2014/07/memory-leak-in-oracle-softhashmap.html

    I believe they have fixed this in recent patch sets but if you're on an older version of the JDBC drivers you still potentially have this problem.

    ReplyDelete
    Replies
    1. My intuition told me to clean these references periodically as I stated: "However, if you do not trust that implementation you can always invoke XMLType.cleanCache() explicitly when the map reaches desired size just to sleep a little bit better ;).".

      Delete
  2. The best and the worst casino - jtmhub.com
    The best and the worst 의정부 출장샵 casino · 포항 출장안마 Starburst: The 나주 출장샵 biggest and worst casino in 화성 출장안마 the world · Slotomania: The best 구리 출장마사지 and the worst casino in the world · Slotomania: The best and the worst

    ReplyDelete
  3. Awesome! Thank you so much for sharing these kinds of information. Keep writing and updating something on Leak Detection Torrance.

    ReplyDelete