Long running Java process resource consumption monitoring , leak detection and GC tuning
For easily monitoring the JVM metrics there is no better tool than Java VisualVM or its older counterpart JConsole; These two tools comes with the Java JDK. So it is absolutely necessary that you copy a JDK version to some temp directory in which your server runs. {Make sure that you do not install the JDK / put the java executable in the path.}
Before going in JVM monitoring, it is essential to understand a little about Java's memory model.
Note - This article is written with HotSpot JVM in mind,which is one of the most commonly used one ; implemented by OpenJDK and Oracle - formerly from Sun. This is not the only one , notable being JRockit and others; and some details would change with a different JVM.
When you are running a Java Process in a Server you allocate the Memory for the Java Virtual Machine. This memory is split to process heap { what you give withe the Xmx Xms command}where it is again subdivided on generational basis (Young, Old) and the PermGen - which is gradually getting removed in Java 7 . Assume that you have not specified the stack size here.
Have a look at the common parameters usually given below for a JVM running in the server mode,and find the relevance of each. Here you see that the Java heap is given as 31g + Perm as 1g . So total JVM memory given is 32 g.
java -D[Server:testserver] -XX:PermSize=1024m -XX:MaxPermSize=1024m -Xms31232m -Xmx31232m -server -Xloggc:/opt/gclogs/testserver_gc.log -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=5M -XX:+PrintGCCause -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintClassHistogramAfterFullGC -XX:G1HeapRegionSize=32m -XX:HeapDumpPath=/opt/testserver/heapdump -XX:+HeapDumpOnOutOfMemoryError -XX:+DoEscapeAnalysis -XX:+UseCompressedOops -XX:+DisableExplicitGC -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:StringTableSize=3000000 -XX:OnOutOfMemoryError=kill -9 %p -XX:+PrintAdaptiveSizePolicy -XX:+ParallelRefProcEnabled -XX:GCHeapFreeLimit=15 -XX:GCTimeLimit=80 -Djavax.net.ssl.keyStore=/opt/certs/keystore.jks -Djava.net.preferIPv4Stack=true -Djavax.net.ssl.trustStore=/opt/certs/trustore.jks -Djava.io.tmpdir=/opt/process/tmp -Djava.util.prefs.systemRoot=/opt/oodee/jboss-eap/.java/.prefs
Output of top for the above
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13168 root 20 0 41.0g 33g 65m S 97.7 78.5 8124:53 java
The JVM itself needs memory for managing the Java process, threads, garbage collection etc. So in the top command for the above mentioned Java process you can see the Resident memory as 33g and virtual memory as 41 g. You need not be overly concerned about the Virtual memory. Linux is good in managing virtual memory. You need to keep a check only for the resident memory.
So the JVM process is asking Linux for 41g or 8g to do its work and really using(resident memory) about 33g currently.
The resident memory is the RAM occupied by objects in the Heap (biggest part), JVM internal data , meta data of classes and things like Stringtable -PermGen and Thread Stacks. You cannot map the resident memory directly only to heap usage of the application.
We can dig deeper and see how much heap is used by using tools like JConsole or JVisualVM.
There are also command line utilities from the JDK by which you can get the same information, However theses do not plot the values over time, so it is hard to get the trend from these utilities. However they are very useful for JVM heap/ garbage collection introspection,
You can also use nice utilities like JvmTop to get this information.
JvmTop 0.8.0 alpha - 11:35:39, amd64, 12 cpus, Linux 2.6.32-35, load avg 0.46 http://code.google.com/p/jvmtop PID MAIN-CLASS HPCUR HPMAX NHCUR NHMAX CPU GC VM USERNAME #T DL 13168 oss-modules.jar 10172m 31232m 627m 1072m 8.80% 0.00% O7U71 root 1343Currently the process is only using 10 g of the 31g allocated heap. So the rest of the 20g is permgen + threads *stacksize + jvms internal data.
As you can see JVM is no light weight process. Once upon a time when RAM was dear this would have been an issue; not now, at least not unless you start a lot of JVMs in one node.
These are jstack- to get Stack trace -for debugging CPU consuming threads, or thread locks; jmap to get the heap dump or heap histogram and the jstat utility which gives the generational usage, and also the live GC events as they happen in the JVM.
./jstat -gccapacity 13168 20000
NGCMN NGCMX NGC S0C S1C EC OGCMN OGCMX OGC OC PGCMN PGCMX PGC PC YGC FGC
0.0 31981568.0 19333120.0 0.0 655360.0 18677760.0 0.0 31981568.0 12648448.0 12648448.0 1048576.0 1048576.0 1048576.0 1048576.0 1745 1
0.0 31981568.0 19333120.0 0.0 655360.0 18677760.0 0.0 31981568.0 12648448.0 12648448.0 1048576.0 1048576.0 1048576.0 1048576.0 1745 1
./jstat -gccause 13168 20000
S0 S1 E O P YGC YGCT FGC FGCT GCT LGCC GCC
0.00 100.00 30.70 67.18 58.30 1745 1513.719 1 20.213 1533.932 G1 Evacuation Pause No GC
0.00 100.00 32.46 67.18 58.30 1745 1513.719 1 20.213 1533.932 G1 Evacuation Pause No GC
./jstat -gcutil 13168 20000
S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 100.00 34.56 67.18 58.30 1745 1513.719 1 20.213 1533.932
Now moving on to the common use of the tools for other purposes.For heap dumps
su proceeuser
/bin/jmap -dump:format=b,file=/tmp/2930javaheap.hrpof 2930(pid of process)
For dumping a big heap make sure you have that much disk space , efficient FTP and powerful machine to analyze; Else anything over maybe 12 to 16 GB is useless
Eclipse MAT is a good tool to analyze the heap dump; Also if you cannot transfer ,you can run Jhat on the heap on the same node and you will be able to use the web browser to browse the results generated ;
Depending on the heap size and host system resources these tools take some time to analyze
/jdk/jdk1.6.0_38/bin/jhat -J-d64 -stack false -refs false -port 9191 /tmp/45133heap.bin
/jdk/jdk1.6.0_38/bin/jhat -J-d64 -port 9191 /tmp/45133heap.bin
A lighter version of this is class histogram; With the live option it will trigger a full GC and collect the class histogram; So after all tasks are done if you take a histogram and repeat it for one or two cycles and see which are the instances which go up ; can be very helpful in identifying leaks (see below a python script which can help you compare two histograms)
/jdk/jdk1.6.0_38/bin/jmap -histo:live 60030 > /tmp/60030istolive1330.txt
Output will be of the form
015-06-07T10:58:00.653+0300: 179686.836: [Class Histogram (after full gc):
num #instances #bytes class name
----------------------------------------------
1: 6591148 2747828352 [C
2: 298130 237694584 [B
3: 6572509 157740216 java.lang.String
4: 5097003 151094560 [Ljava.lang.Object;
5: 3810491 121935712 java.util.HashMap$Entry
6: 617568 89067520
7: 617568 79061744
Another important tool; which will help you see the GC process ongoing in the live JRE is jstat; This also has very low overhead
jboss-as@backendnode root]$ /jdk/jdk1.6.0_38/bin/jstat -gccause 60030 (jboss server pid) 5000 (will repeat every 5 seconds)
S0 S1 E O P YGC YGCT FGC FGCT GCT LGCC GCC
0.00 0.00 9.54 15.05 46.86 4076 668.310 29 164.007 832.317 Heap Inspection Initiated GC No GC
Further; certain run time parameters can be added to the JRE to get more information; Some of these can be used in production; GC logs are one such ; Currently it is configured to be printed to opt/gclogs
Do a ls –lrt and see the last modified file and tail it to see the GC logging happening
Note – JDK has the Xrunhorf with options that I have been trying out in standalone clients; This prints after application stop or after cntrl break;Any queries please do post; Here is a Python script to compare two histogram dumps;
histogramparser.py
__author__ = 'acp' import re import fileinput import operator import sys objectschanged={} def create_object_list(line2,mapofObjects,instance): container = line2.split() #print(container) if container[3] in mapofObjects: val=mapofObjects[container[3]] #print(container[3],"difference=",abs(int(container[2])-val)) objectschanged[container[3]]=(int(container[instance])-val) else: mapofObjects[container[3]]=int(container[instance]) #Call the main function print("A Python Script to parse the Jmap generated Histograms ") print("Author - Alex.Punnen") print("Usage- jmaphistoparser.py filename1 filename2") #print ("Where M is the difference in memory and I difference in isntance and fo") instance=1 #for instance difference #instance=2 #for memeory difference mapofObjects={} content=[] for line in fileinput.input(): content.append(line) count=0 #skip lines while len(content) >0 : line=content.pop() if(':' in line): count+=1 create_object_list(line,mapofObjects,instance) print("------------------------Objects Changed Between Heaps-------------------------------------------") sorted_x = sorted(objectschanged.items(), key=operator.itemgetter(1)) for(x,y) in sorted_x: print(x,y)
Comments