After reading many articles about garbage collector and its algorithms in Java, I decided to write my own experience in brief about various garbage collector algorithms.
What is Garbage Collection?
JVM’s heap stores all objects created by running java application irrespective of their scope. Garbage Collection is s process of automatically freeing objects that are no longer referenced by the program.
When an object is no longer referenced by the program, the heap space it occupies can be recycled so that the space is made available for new objects. In this process, the garbage collector must run finalize method of the objects being freed.
In addition, garbage collector may also combat heap fragmentation. Heap fragmentation occurs when new objects are created, allocated and unreferenced objects are freed, such that freed portions of heap memory are left in between portions occupied by live objects. Requests to allocate new objects may have to be filled by extending the size of the heap even though there is enough total unused space in existing heap.
The advantage of garbage collection is that it helps ensure program integrity. Java programmers cannot crash the JVM by incorrectly freeing memory.
The disadvantage of garbage collection is that JVM has to keep track of objects which are being referenced by the executing program, finalize and free unreferenced objects. This activity will likely require more CPU time than would have been required if program explicitly freed unnecessary memory.
Garbage Collector Algorithms :
There are two tasks for garbage collector algorithms :
a) Detect garbage objects.
b) Must reclaim the heap space used by garbage objects and make that space available to the program.
Below are the approaches to distinguishing live objects from garbage :
1. Reference Counting : A reference count is maintained for each object in the heap. When the object is first created and a reference to it is assigned, the object’s reference count is incremented. When an object is destroyed, any objects referenced by that object also have their reference counts decreased.
Advantage – A reference counting collector can run in small chunks of time with execution of program.
Disadvantage – Reference counting does not detect cycles like two or more objects that refer to one another. For example, a parent object that has a reference to a child object that has reference back to parent.
Another disadvantage is the overhead of incrementing and decrementing count every time.
2. Tracing Collectors : This algorithm, trace out the graph of the object references starting with root nodes. The basic tracing algorithm is called Mark and Sweep. In Mark phase, the garbage collector traverses the tree of references and marks each object it encounters. In sweep phase, unmarked objects are freed and the resulting memory is made available to the executing program. In JVM, sweep phase must include finalization of objects.
There are two strategies commonly used by Tracing Collectors / Mark and Sweep Collectors to combat heap fragmentation.
a) Compacting : Compacting collectors slide live over free memory space towards one end of the heap so that the other end of the heap becomes one large contiguous free area. All references to the moved objects are updated to refer to the new location.
Updating references can be made simpler by adding a level of indirection. Instead of referring directly to objects in the heap, object references refer to the table of object handles. The object handle refers to the actual objects on the heap. When an object is moved, only object handle must be updated with new location. It adds a performance overhead even though it simplifies.
b) Copying: It uses stop and copy algorithm. Heap is divided into two regions. Only one region is used at anytime. Once a region is full, the program execution is stopped and all live objects are copied to the other region as they are encountered by the traversal. Once copy procedure is done, program execution begins.
Disadvantage of Stop and Copy is that all live objects must be copied at every garbage collection.
3. Generational Collectors : This groups objects by age and garbage collecting younger objects more often than older objects. Heap is divided into two or more sub heaps, each of which serves one generation of objects. the youngest generation is garbage collected more often. Once object survives this, it is moved to older generation (next generation).
4. Adaptive Collectors : This algorithm monitors current situation on heap and adjusts its garbage collection technique. It may switch from one algorithm to another on the fly or may divide heap into sub heaps and use different algorithms on different heaps simultaneously.
Garbage collectors usually stop the program while collecting garbage objects. They can cause long pauses sometimes during the execution of the program. One approach is to use algorithms that collect incrementally. Rather than attempting to find and discard all unreachable objects at each invocation, just attempts to find and discard a portion of the unreachable objects. A common incremental collector is generational collector.