Garbage collection doesn’t mean no memory management

Managed languages rely on a garbage collector. The Wikipedia article on garbage collection describes it as follows:

In computer science, garbage collection (GC) is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory used by objects that are no longer in use by the application. Garbage collection was invented by John McCarthy around 1959 to solve the problems of manual memory management in Lisp.

In this post, we have a look at:

  • when is garbage collection triggered?
  • how does the collector know that objects are no longer in use?

This post is about the Java platform, but the same principles probably apply to the .NET platform.

Triggers of garbage collection

As you know, you can invoke garbage collection explicitly in your code:

System.gc();

The javadoc of this static method is:

Runs the garbage collector.

Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.

But this explicit call is not recommended. FindBugs reports it as a bug:

Dm: Explicit garbage collection; extremely dubious except in benchmarking code (DM_GC)
Code explicitly invokes garbage collection. Except for specific use in benchmarking, this is very dubious.

In the past, situations where people have explicitly invoked the garbage collector in routines such as close or finalize methods has led to huge performance black holes. Garbage collection can be expensive. Any situation that forces hundreds or thousands of garbage collections will bring the machine to a crawl.

So, we have to rely on the mechanism provided by the platform. The Java HotSpot Virtual Machine uses generational collection, which means that the memory is divided into generations. Each generation holds objects of a different age. This approach is used because it exploits the weak generational hypothesis:

  • Most objects die young (they stop being referenced shortly after they are allocated)
  • Few references from older to younger objects exist

In HotSpot, the generations are:

  • Young generation (green)
  • Old generation (blue)

Memory - Credits: www.wilsonmar.com

The memory management whitepaper describes the life of an object through generations as:

Memory in the Java HotSpot virtual machine is organized into three generations: a young generation, an old generation, and a permanent generation. Most objects are initially allocated in the young generation. The old generation contains objects that have survived some number of young generation collections, as well as some large objects that may be allocated directly in the old generation. The permanent generation holds objects that the JVM finds convenient to have the garbage collector manage, such as objects describing classes and methods, as well as the classes and methods themselves.

The young generation consists of an area called Eden plus two smaller survivor spaces [...]. Most objects are initially allocated in Eden. (As mentioned, a few large objects may be allocated directly in the old generation.) The survivor spaces hold objects that have survived at least one young generation collection and have thus been given additional chances to die before being considered “old enough” to be promoted to the old generation. At any given time, one of the survivor spaces (labeled From in the figure) holds such objects, while the other is empty and remains unused until the next collection.

So, a minor garbage collection is triggered when Eden is full, and it can promote surviving objects to the old generation. A major garbage collection is triggered when the old (tenured) generation is full. Note that a call to System.gc() also requests a major GC.
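The path of an object through Eden, the survivor spaces and the old generation can be sketched with a toy model. This is only an illustration of the description above, not how HotSpot is implemented: the class GenerationalToy, its fields, the fixed tenuring threshold and the live flag (which stands in for "still reachable from a root") are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a young generation (Eden + survivor space) and an old generation. */
final class GenerationalToy {

    /** A simulated object; 'live' stands in for "still reachable from a root". */
    static final class Obj {
        final String name;
        boolean live = true;
        int age = 0; // number of minor collections survived so far

        Obj(String name) { this.name = name; }
    }

    final int edenCapacity;
    final int tenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> survivor = new ArrayList<>(); // the occupied survivor space ("From")
    final List<Obj> old = new ArrayList<>();
    int minorCollections = 0;

    GenerationalToy(int edenCapacity, int tenuringThreshold) {
        this.edenCapacity = edenCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    /** Allocates in Eden; a full Eden triggers a minor collection first. */
    Obj allocate(String name) {
        if (eden.size() >= edenCapacity) {
            minorCollect();
        }
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies live young objects to the empty survivor space or promotes them. */
    void minorCollect() {
        minorCollections++;
        List<Obj> to = new ArrayList<>(); // the empty survivor space ("To")
        List<Obj> young = new ArrayList<>(eden);
        young.addAll(survivor);
        for (Obj o : young) {
            if (!o.live) {
                continue; // garbage is simply not copied
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o); // old enough: promoted to the old generation
            } else {
                to.add(o);
            }
        }
        eden.clear();
        survivor = to; // the two survivor spaces swap roles
    }

    /** One long-lived object plus eight short-lived ones, Eden of size 4. */
    static GenerationalToy demo() {
        GenerationalToy heap = new GenerationalToy(4, 2);
        heap.allocate("keeper"); // stays live for the whole run
        for (int i = 0; i < 8; i++) {
            heap.allocate("temp" + i).live = false; // dies right after allocation
        }
        return heap;
    }

    public static void main(String[] args) {
        GenerationalToy heap = demo();
        System.out.println("minor collections: " + heap.minorCollections); // prints 2
        System.out.println("promoted: " + heap.old.get(0).name);           // prints keeper
    }
}
```

In this run the long-lived object survives the first minor collection in a survivor space and is promoted on the second one, while the temporaries never leave the young generation.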

What we need to remember is that both minor and major collections consume CPU (there is more than one garbage collection strategy, but that is the subject of another post). A major GC takes longer than a minor GC. If you don’t choose the right heap settings (options -Xms, -Xmx, etc.), your application can trigger too many garbage collections and its overall performance will drop significantly.
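To see how often collections actually run, the standard java.lang.management API exposes one bean per collector. The sketch below is a minimal example: the class name GcStats, its helper methods and the allocation loop are assumptions made for illustration, and the collector names it prints depend on the JVM and on the GC strategy in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcStats {

    /** Sums the collection counts of all collectors known to this JVM. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means "undefined" for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Builds one descriptive line per collector (name, count, accumulated time). */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.printf("max heap: %d MiB%n", rt.maxMemory() / (1024 * 1024));
        System.out.printf("current heap: %d MiB%n", rt.totalMemory() / (1024 * 1024));

        long before = totalCollections();
        long checksum = 0;
        // Allocate many short-lived arrays so that Eden fills up repeatedly.
        for (int i = 0; i < 200_000; i++) {
            byte[] chunk = new byte[10_000];
            chunk[0] = (byte) i;
            checksum += chunk[0];
        }
        System.out.println("checksum: " + checksum);
        System.out.println("collections during loop: " + (totalCollections() - before));
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Running it with different -Xms and -Xmx values is one way to observe how the heap size changes the number of collections.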

Live objects

How does the garbage collector determine that an object is no longer referenced? One of the most widely used algorithms is mark-and-sweep, which relies on the following principle to determine whether an object is live or not:

The objects that a program can access directly are those objects which are referenced by local variables on the processor stack as well as by any static variables that refer to objects. In the context of garbage collection, these variables are called the roots . An object is indirectly accessible if it is referenced by a field in some other (directly or indirectly) accessible object. An accessible object is said to be live. Conversely, an object which is not live is garbage.
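The reachability rule quoted above can be sketched on a simulated heap. This is a simplified illustration, not the collector used by HotSpot: the Node class, the explicit root list and the method names are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

final class MarkSweepDemo {

    /** A simulated heap object: a name plus its outgoing references ("fields"). */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) { this.name = name; }
    }

    /** Mark phase: collects every object reachable from the roots. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {          // first visit only
                pending.addAll(n.refs); // follow its fields
            }
        }
        return live;
    }

    /** Sweep phase: removes unmarked objects from the heap and returns their names. */
    static List<String> sweep(List<Node> heap, Set<Node> live) {
        List<String> reclaimed = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext();) {
            Node n = it.next();
            if (!live.contains(n)) {
                reclaimed.add(n.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    static List<String> demo() {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        a.refs.add(b); // a -> b, and a is a root
        c.refs.add(d); // c and d reference each other,
        d.refs.add(c); // but nothing reachable refers to them
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        List<Node> roots = Arrays.asList(a);
        return sweep(heap, mark(roots));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [c, d]
    }
}
```

The objects c and d reference each other, yet they are reclaimed because no root leads to them: what matters is reachability, not whether some reference to the object still exists.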

Java memory leaks are often caused by adding an object reference to a collection and never removing it. If your application has a cache, it’s better to hold its entries through weak references rather than strong references. The Wikipedia article on weak references defines them as:

In computer programming, a weak reference is a reference that does not protect the referent object from collection by a garbage collector. An object referenced only by weak references is considered unreachable (or “weakly reachable”) and so may be collected at any time. Weak references are used to avoid keeping in memory referenced but unneeded objects.
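A minimal sketch of a cache built on weak references, using java.util.WeakHashMap, which holds its keys weakly. The class name WeakCacheDemo and the cached value are assumptions made for the example. Note that the moment an entry disappears is up to the collector, so the code does not rely on it.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class WeakCacheDemo {

    /** While the key is strongly referenced, the cached entry must stay available. */
    static String lookupWhileKeyHeld() {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "expensive result");
        System.gc();           // only a hint; the key is still strongly reachable here
        return cache.get(key); // using the key keeps it reachable across the GC request
    }

    public static void main(String[] args) {
        System.out.println("while key is held: " + lookupWhileKeyHeld());

        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "expensive result");
        key = null;  // drop the only strong reference to the key
        System.gc(); // a hint: the entry may or may not be gone afterwards
        System.out.println("entries after dropping the key: " + cache.size()
                + " (0 is likely, but not guaranteed)");
    }
}
```

With a plain HashMap the second entry would stay in memory for as long as the map itself is reachable, which is exactly the kind of leak described above. The value must not hold a strong reference to its own key, otherwise the entry can never be collected.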

Conclusion

In this post, garbage collectors are not explained in depth: the algorithms are not described precisely, and it is just an abstract view that every developer should know. To summarize:

  • when the heap size is not suited to your application, too many garbage collections occur and they consume CPU time
  • most memory leaks are caused by strong references that are never released: when relevant, use weak/soft references instead.