Overview:

Five garbage collectors (GC) for the Java Virtual Machine (JVM) are covered here. Here are some quick lessons learned on each GC engine:

  1. G1: the default for Java 9 and newer. It’s the best choice for real-time applications with rapid vertical scaling. Some tests show it to be slower than Parallel GC, although Parallel is known to oversubscribe its allotted RAM limits, which slows the application down.
  2. Parallel: the default for Java 8 and older. This engine stops the application and does all of its collection work at once, which results in seemingly random lags. It’s intended for applications where throughput, not real-time responsiveness, is the focus.
  3. ConcMarkSweep (CMS): designed to eliminate the long pauses associated with the full GC of the Parallel and Serial collectors. Like G1, it uses multiple background threads to scan and clear the heap. CMS was deprecated in JDK 9 and removed in JDK 14.
  4. Serial: good for single virtual CPU machines. It is rarely chosen for multi-core servers nowadays.
  5. Shenandoah: available since JDK 12 (as an experimental feature at first), a very low-latency GC that operates mostly concurrently with the application. It’s most appropriate for gambling, finance, and latency-sensitive interactive apps. This engine consumes more CPU than Parallel GC.
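
To confirm which of these collectors a running JVM actually selected, the standard java.lang.management API exposes one bean per collector phase. The sketch below maps bean names to the collectors in the list above. The name strings are the ones HotSpot builds commonly report and are an assumption here, so an "unknown" result simply means the raw name should be inspected. GcDetector and classify are hypothetical names used only for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcDetector {
    // Maps a GC bean name to one of the collectors discussed above.
    // Assumption: these name strings match what common HotSpot builds report.
    static String classify(String beanName) {
        if (beanName.startsWith("G1")) return "G1";
        if (beanName.startsWith("PS ")) return "Parallel";
        if (beanName.equals("ParNew") || beanName.equals("ConcurrentMarkSweep")) return "CMS";
        if (beanName.equals("Copy") || beanName.equals("MarkSweepCompact")) return "Serial";
        if (beanName.startsWith("Shenandoah")) return "Shenandoah";
        return "unknown";
    }

    public static void main(String[] args) {
        // Print every collector bean of the current JVM with its classification.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " -> " + classify(gc.getName()));
        }
    }
}
```

Running this once with the server's real java.args is a quick way to verify that -XX:+UseG1GC took effect.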

Practical Examples:

As with SQL Server performance tuning, Java memory allocation should be configured manually according to the host’s available RAM to ensure machine uptime and consistent performance. As a rule of thumb, servers with 12 GB or less should keep about 4 GB free for the operating system and give the rest to the JVM (hence the 8G heap in the 12 GB examples below), while servers with more than 12 GB should allocate 80% of available memory to the JVM. The most important points are explicitly selecting G1 as the garbage collector, fixing the heap size by setting the minimum equal to the maximum, raising the G1 reserve, etc.
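
The sizing rule can be sketched as a small calculation. This sketch assumes the rule is read the way the examples below apply it: a 12 GB host gets an 8 GB heap (4 GB left to the operating system) and a 16 GB host gets 13107 MB (80% of 16384 MB). HeapSizing, suggestedHeapMb and heapFlags are hypothetical names, and the floor of half the RAM for very small hosts is an added assumption, not something the rule states.

```java
final class HeapSizing {
    static final long MB_PER_GB = 1024;

    // Suggests a heap size in MB for a host with the given amount of RAM in MB.
    static long suggestedHeapMb(long hostRamMb) {
        if (hostRamMb > 12 * MB_PER_GB) {
            // More than 12 GB: give the JVM 80% of RAM (integer division rounds down).
            return hostRamMb * 80 / 100;
        }
        // 12 GB or less: leave 4 GB to the OS.
        // Assumption: never go below half of RAM on very small hosts.
        return Math.max(hostRamMb - 4 * MB_PER_GB, hostRamMb / 2);
    }

    // Builds matching -Xms/-Xmx flags so the heap never has to grow at runtime.
    static String heapFlags(long hostRamMb) {
        long heapMb = suggestedHeapMb(hostRamMb);
        return "-Xms" + heapMb + "m -Xmx" + heapMb + "m";
    }

    public static void main(String[] args) {
        System.out.println(heapFlags(12 * MB_PER_GB));
        System.out.println(heapFlags(16 * MB_PER_GB));
    }
}
```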

12GB of RAM:

java.args=-server -Xms8G -Xmx8G -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=200 -XX:+UnlockExperimentalVMOptions -XX:+DisableExplicitGC -XX:+AlwaysPreTouch -XX:G1NewSizePercent=30 -XX:G1MaxNewSizePercent=40 -XX:G1HeapRegionSize=8M -XX:G1ReservePercent=20 -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4 -XX:InitiatingHeapOccupancyPercent=15 -XX:G1MixedGCLiveThresholdPercent=90 -XX:G1RSetUpdatingPauseTimePercent=5 -XX:SurvivorRatio=32 -XX:+PerfDisableSharedMem -XX:MaxTenuringThreshold=1 nogui ... OTHER ARGS ...

16GB of RAM:

java.args=-server -Xms13107m -Xmx13107m -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=200 -XX:+UnlockExperimentalVMOptions -XX:+DisableExplicitGC -XX:+AlwaysPreTouch -XX:G1NewSizePercent=40 -XX:G1MaxNewSizePercent=50 -XX:G1HeapRegionSize=16M -XX:G1ReservePercent=15 -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4 -XX:InitiatingHeapOccupancyPercent=20 -XX:G1MixedGCLiveThresholdPercent=90 -XX:G1RSetUpdatingPauseTimePercent=5 -XX:SurvivorRatio=32 -XX:+PerfDisableSharedMem -XX:MaxTenuringThreshold=1 nogui ... OTHER ARGS ...

12GB of RAM with ColdFusion, IIS, Windows Server 2016:

java.args=-server -Xms8G -Xmx8G -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=200 -XX:+UnlockExperimentalVMOptions -XX:+DisableExplicitGC -XX:+AlwaysPreTouch -XX:G1NewSizePercent=30 -XX:G1MaxNewSizePercent=40 -XX:G1HeapRegionSize=8M -XX:G1ReservePercent=20 -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4 -XX:InitiatingHeapOccupancyPercent=15 -XX:G1MixedGCLiveThresholdPercent=90 -XX:G1RSetUpdatingPauseTimePercent=5 -XX:SurvivorRatio=32 -XX:+PerfDisableSharedMem -XX:MaxTenuringThreshold=1 --add-opens=java.rmi/sun.rmi.transport=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/sun.util.cldr=ALL-UNNAMED --add-opens=java.base/sun.util.locale.provider=ALL-UNNAMED -Xbatch -Djdk.attach.allowAttachSelf=true -Dcoldfusion.home={application.home} -Duser.language=en -Dcoldfusion.rootDir={application.home} -Dcom.sun.xml.bind.v2.bytecode.ClassTailor.noOptimize=true -Dcoldfusion.libPath={application.home}/lib -Dorg.apache.coyote.USE_CUSTOM_STATUS_MSG_IN_HEADER=true -Dcoldfusion.jsafe.defaultalgo=FIPS186Random -Dorg.eclipse.jetty.util.log.class=org.eclipse.jetty.util.log.JavaUtilLog -Djava.util.logging.config.file={application.home}/lib/logging.properties -Djava.locale.providers=COMPAT,SPI -Dsun.font.layoutengine=icu -Dcoldfusion.classPath={application.home}/lib/updates,{application.home}/lib,{application.home}/lib/axis2,{application.home}/gateway/lib/,{application.home}/wwwroot/WEB-INF/cfform/jars,{application.home}/wwwroot/WEB-INF/flex/jars,{application.home}/lib/oosdk/lib,{application.home}/lib/oosdk/classes
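
Argument lines this long are easy to get wrong, so it can help to read the flags back from the running JVM and check them. The sketch below parses a list of arguments into name/value pairs and checks that -Xms and -Xmx match. FlagCheck, parse and heapIsFixed are hypothetical names; only RuntimeMXBean.getInputArguments() is a standard API call.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FlagCheck {
    // Collects -Xms/-Xmx and -XX: options into a name -> value map.
    static Map<String, String> parse(List<String> args) {
        Map<String, String> flags = new LinkedHashMap<>();
        for (String a : args) {
            if (a.startsWith("-Xms") || a.startsWith("-Xmx")) {
                flags.put(a.substring(1, 4), a.substring(4));          // Xms -> 8G
            } else if (a.startsWith("-XX:+") || a.startsWith("-XX:-")) {
                flags.put(a.substring(5), a.charAt(4) == '+' ? "true" : "false");
            } else if (a.startsWith("-XX:") && a.contains("=")) {
                int eq = a.indexOf('=');
                flags.put(a.substring(4, eq), a.substring(eq + 1));    // name -> value
            }
        }
        return flags;
    }

    // True when -Xms and -Xmx are both present and written identically.
    // Note: this compares the text, so 8G and 8192m would not count as equal.
    static boolean heapIsFixed(Map<String, String> flags) {
        String xms = flags.get("Xms");
        return xms != null && xms.equalsIgnoreCase(flags.get("Xmx"));
    }

    public static void main(String[] args) {
        // Flags the current JVM was actually started with.
        Map<String, String> flags = parse(ManagementFactory.getRuntimeMXBean().getInputArguments());
        flags.forEach((name, value) -> System.out.println(name + " = " + value));
        System.out.println("heap fixed: " + heapIsFixed(flags));
    }
}
```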

Explanations:

-Xms: sets the initial heap size. Setting it equal to -Xmx prevents pauses caused by heap expansion
-Xmx: sets the upper bound on the heap size, which makes garbage collection more predictable
-XX:+UseG1GC: selects the Garbage-First (G1) collector explicitly instead of relying on the JVM’s default choice. The Garbage-First (G1) collector is a server-style garbage collector, targeted for multi-processor machines with large memories. It meets garbage collection (GC) pause time goals with a high probability, while achieving high throughput. The G1 garbage collector is fully supported in Oracle JDK 7 update 4 and later releases. (Source:
-XX:+ParallelRefProcEnabled: processes references with multiple threads, which reduces young and old GC times
-XX:MaxGCPauseMillis: sets a target for the maximum GC pause time that the collector tries to meet; it is a goal, not a guarantee. The G1 default of 200 ms is adequate for most systems. When this value is set lower than 200, GC runs more aggressively and less efficiently, which can steal cycles without yielding considerable benefit
-XX:+UnlockExperimentalVMOptions: required to activate experimental parameters
-XX:+AlwaysPreTouch: with this option the JVM touches every single byte of the max heap size with a '0', resulting in the memory being allocated in physical memory in addition to being reserved in the internal data structure (virtual memory). Pretouching is single threaded, so it is expected behavior that it delays JVM startup. The trade-off is that it reduces page access time later, as the pages will already be loaded into memory. (Source: https://access.redhat.com/search/#/knowledgebase)
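-XX:+DisableExplicitGC: makes the JVM ignore explicit System.gc() calls from application or library code, so collections are triggered only by the collector selected with -XX:+UseG1GC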
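-XX:ParallelGCThreads: sets the number of threads used during the parallel (stop-the-world) GC phases, which include parallel reference processing
-XX:ConcGCThreads: sets the number of threads used for concurrent GC work, such as concurrent marking, that runs alongside the application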
-XX:InitiatingHeapOccupancyPercent: the heap occupancy, as a percentage of the total heap size, that triggers a concurrent GC cycle. Please note that a value of 0 denotes 'constant GC cycles', and the default value is 45
-XX:G1NewSizePercent: percentage of the heap to use as the minimum for the young generation size. The default value is 5 percent. This is an experimental flag, which is why -XX:+UnlockExperimentalVMOptions is set
-XX:G1MaxNewSizePercent: percentage of the heap to use as the maximum for the young generation size. The default value is 60 percent. This is also an experimental flag
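-XX:G1HeapRegionSize: sets the size of a G1 heap region (a power of two, at least 1 MB). Objects larger than half a region are allocated as 'humongous' objects, so setting this value higher reduces humongous allocations and the old-generation fragmentation they cause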
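-XX:G1ReservePercent: sets the percentage of the heap kept free as a reserve, which reduces the risk of to-space overflow when objects are copied during a collection. The default value is 10 percent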
-XX:G1HeapWastePercent: percentage of the heap that you are willing to leave unreclaimed. G1 does not start a mixed garbage collection when the reclaimable space is below this percentage
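-XX:G1MixedGCCountTarget: sets the target number of mixed garbage collections after a marking cycle to collect old regions with at most G1MixedGCLiveThresholdPercent live data
-XX:G1MixedGCLiveThresholdPercent: sets the live-data occupancy threshold for an old region to be included in a mixed garbage collection; regions with more live data than this are skipped (source: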
-XX:G1RSetUpdatingPauseTimePercent: percentage of the allowed maximum pause time that G1 may spend updating remembered sets during a pause
-XX:SurvivorRatio: controls the size of the survivor spaces as a ratio of eden to one survivor space, so a higher value means smaller survivor spaces. During a 'young' collection, every live object is copied. The object may be copied to one of the survivor spaces. For each object being copied, the GC algorithm increments its age (the number of collections it has survived). If the age is above the current tenuring threshold, the object is copied to what is known as the 'old' space. The object can also be copied to the old space directly if the survivor space gets full; this is called an 'overflow.' If the survivor spaces are too small (the ratio is set too high), copying will overflow into the old generation. If they are too large (the ratio is set too low), some of the space will sit empty. The JVM default is 8; the examples here use 32, as that is known to keep the spaces half-filled.
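-XX:+PerfDisableSharedMem: stops the JVM from writing its performance counters to a memory-mapped file, which reduces worst-case pause latencies. The trade-off is that tools such as jps and jstat can no longer read those counters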
-XX:MaxTenuringThreshold: specifies for how many minor GC cycles an object will stay in the survivor spaces until it finally gets tenured into the old space
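
Since most of the G1 options above are percentages of the heap, it can help to turn them into absolute sizes before settling on values. The sketch below does that arithmetic for the 12GB example (8G heap, 8M regions, young generation 30% to 40%, reserve 20%, initiating occupancy 15%). G1Budget, percentOf and regionCount are hypothetical names, and G1 sizes things in whole regions at runtime, so these figures are approximations only.

```java
final class G1Budget {
    // Returns the given percentage of the heap in MB (integer division rounds down).
    static long percentOf(long heapMb, int percent) {
        return heapMb * percent / 100;
    }

    // Number of G1 regions for a heap and region size, both given in MB.
    static long regionCount(long heapMb, int regionSizeMb) {
        return heapMb / regionSizeMb;
    }

    public static void main(String[] args) {
        long heapMb = 8 * 1024;  // -Xms8G -Xmx8G
        int regionSizeMb = 8;    // -XX:G1HeapRegionSize=8M

        System.out.println("regions:                 " + regionCount(heapMb, regionSizeMb));
        System.out.println("humongous above (MB):    " + regionSizeMb / 2);
        System.out.println("young gen minimum (MB):  " + percentOf(heapMb, 30)); // G1NewSizePercent
        System.out.println("young gen maximum (MB):  " + percentOf(heapMb, 40)); // G1MaxNewSizePercent
        System.out.println("reserve (MB):            " + percentOf(heapMb, 20)); // G1ReservePercent
        System.out.println("concurrent cycle at (MB): " + percentOf(heapMb, 15)); // InitiatingHeapOccupancyPercent
    }
}
```

Swapping in 13107 for the heap and 16 for the region size gives the corresponding figures for the 16GB example.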