Monday, January 2, 2012

CLR and Garbage Collection

The Microsoft .NET Framework CLR reserves space for each object instantiated in the system. Because memory is not infinite, the CLR needs to get rid of objects that are no longer in use so that their space can be used for other objects.

The first step in garbage collection is identifying the objects that can be wiped out. To accomplish this step, the CLR keeps track of which objects the application still references. If an object has no more references, i.e. there is no way that the object could be referred to by the application, the CLR treats that object as garbage. During garbage collection, the CLR reclaims the memory of all garbage objects.

The CLR can clean up all managed resources, but has absolutely no knowledge of how to clean up unmanaged resources. Thus, if your object holds an internal unmanaged resource, the CLR cannot reclaim it. For this purpose, .NET provides a feature called finalization. Finalization allows a managed object to gracefully clean up its unmanaged resources before it is reclaimed by the CLR. Thus, before reclaiming a garbage object that defines one, the CLR calls its Finalize method.

Points to remember when using the Finalize method:
1. You should not access other finalizable managed objects in the Finalize method, as they might already have been finalized by the CLR. You never know in which order the CLR finalizes and reclaims objects.
2. Making the garbage collector execute a Finalize method before reclaiming an object could hurt the performance of your application.
3. You have no control over when Finalize will be called.
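
The points above can be illustrated with a minimal sketch. The `NativeBuffer` class is hypothetical and exists only for this example; it assumes the unmanaged resource is a block of memory obtained from `Marshal.AllocHGlobal`. Pairing the finalizer with `IDisposable` is one common way to avoid waiting for the collector, not the only one.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory (illustration only).
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _handle;   // unmanaged resource the CLR knows nothing about

    public NativeBuffer(int bytes)
    {
        _handle = Marshal.AllocHGlobal(bytes);
    }

    public bool IsReleased
    {
        get { return _handle == IntPtr.Zero; }
    }

    // C# finalizer syntax; the compiler emits this as an override of Finalize.
    ~NativeBuffer()
    {
        // Touch only the unmanaged handle here, never other finalizable objects.
        Release();
    }

    // Deterministic cleanup, so callers need not wait for the collector.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);   // the finalizer has nothing left to do
    }

    private void Release()
    {
        if (_handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_handle);
            _handle = IntPtr.Zero;
        }
    }
}

public static class FinalizerDemo
{
    public static void Main()
    {
        using (var buffer = new NativeBuffer(1024))
        {
            Console.WriteLine(buffer.IsReleased);   // prints "False" while in use
        }
        // Leaving the using block called Dispose, so the memory is already freed.
    }
}
```

If a caller forgets to dispose the buffer, the finalizer still frees the memory, but only at some later, unpredictable time.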

The garbage collector is exposed through the static System.GC class. You can force a garbage collection by calling GC.Collect(), which makes the collector identify the garbage objects and schedule their finalization. (Doing so could hurt the performance of your application.)
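
As a rough sketch of forcing a collection, the following uses a `WeakReference` to observe whether an unreferenced array was reclaimed. The class and method names are made up for this example, and the result can vary with the runtime and build configuration, so treat the printed value as an observation rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ForcedCollectionDemo
{
    // Allocate in a separate, non-inlined method so that no local variable
    // in the caller keeps the array alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateGarbage()
    {
        return new WeakReference(new byte[1024]);
    }

    public static bool CollectAndCheck()
    {
        WeakReference weak = AllocateGarbage();
        GC.Collect();                    // force a collection of all generations
        GC.WaitForPendingFinalizers();   // wait for the finalizer thread to finish
        return weak.IsAlive;             // usually false once the array is reclaimed
    }

    public static void Main()
    {
        Console.WriteLine("Still alive after GC.Collect: " + CollectAndCheck());
    }
}
```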

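Generations are a mechanism implemented by the garbage collector in order to improve performance: objects are grouped by age, so that the newest objects (generation 0) can be collected without examining the whole heap.
A collection occurs when generation 0 is completely full.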
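
To watch generations at work, a small sketch like the one below can help. It relies on `GC.GetGeneration`, `GC.MaxGeneration` and `GC.CollectionCount`; the class name is invented for this example. A freshly allocated small object normally starts in generation 0 and is promoted when it survives a collection, but the exact numbers depend on the runtime.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one object before and after a forced collection.
    public static int[] ObserveGenerations()
    {
        var survivor = new object();
        int before = GC.GetGeneration(survivor);

        GC.Collect();   // survivor is still referenced below, so it is not garbage

        int after = GC.GetGeneration(survivor);
        return new[] { before, after };
    }

    public static void Main()
    {
        int[] generations = ObserveGenerations();
        Console.WriteLine("Max generation: " + GC.MaxGeneration);
        Console.WriteLine("Before collection: generation " + generations[0]);
        Console.WriteLine("After collection: generation " + generations[1]);
        Console.WriteLine("Gen 0 collections so far: " + GC.CollectionCount(0));
    }
}
```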

** If no more memory is available for the heap, then the new operator throws an OutOfMemoryException.

How does the garbage collector know if the application is using an object or not? As you might imagine, this isn't a simple question to answer.

Every application has a set of roots. Roots identify storage locations, which either refer to objects on the managed heap or are set to null. For example, all the global and static object pointers in an application are considered part of the application's roots. In addition, any local variable/parameter object pointers on a thread's stack are considered part of the application's roots. Finally, any CPU registers containing pointers to objects in the managed heap are also considered part of the application's roots. The list of active roots is maintained by the just-in-time (JIT) compiler and the common language runtime (CLR), and is made accessible to the garbage collector's algorithm.

When the garbage collector starts running, it makes the assumption that all objects in the heap are garbage. In other words, it assumes that none of the application's roots refer to any objects in the heap. It then walks the roots and builds a graph of all objects reachable from them.
Figure 2 Allocated Objects in Heap
Figure 2 shows a heap with several allocated objects where the application's roots refer directly to objects A, C, D, and F. All of these objects become part of the graph. When adding object D, the collector notices that this object refers to object H, and object H is also added to the graph. The collector continues to walk through all reachable objects recursively.
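
The walk described above can be modelled in a few lines. This is a toy sketch and not how the CLR is implemented: objects are plain string names, the `Mark` helper is hypothetical, and an explicit stack replaces the recursion. Roots A, C, D, F and the D-to-H reference come from the text; objects B and E are assumptions added so that some garbage exists.

```csharp
using System;
using System.Collections.Generic;

public static class MarkPhaseSketch
{
    // Returns the name of every object reachable from the roots.
    public static HashSet<string> Mark(
        IEnumerable<string> roots,
        IDictionary<string, string[]> references)
    {
        var reachable = new HashSet<string>();
        var pending = new Stack<string>(roots);
        while (pending.Count > 0)
        {
            string current = pending.Pop();
            if (!reachable.Add(current))
            {
                continue;   // already part of the graph
            }
            string[] targets;
            if (references.TryGetValue(current, out targets))
            {
                foreach (string target in targets)
                {
                    pending.Push(target);   // follow references held by this object
                }
            }
        }
        return reachable;
    }

    public static void Main()
    {
        var references = new Dictionary<string, string[]>
        {
            { "D", new[] { "H" } },   // from the text: D refers to H
            { "B", new[] { "E" } },   // assumed garbage: nothing reaches B or E
        };
        var roots = new[] { "A", "C", "D", "F" };

        var live = new List<string>(Mark(roots, references));
        live.Sort(StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", live));   // prints "A, C, D, F, H"
    }
}
```

Anything not returned by `Mark` (here B and E) is what the collector would treat as garbage.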


Once all the roots have been checked, the garbage collector's graph contains the set of all objects that are somehow reachable from the application's roots; any objects that are not in the graph are not accessible by the application, and are therefore considered garbage. The garbage collector now walks through the heap linearly, looking for contiguous blocks of garbage objects (now considered free space). The garbage collector then shifts the non-garbage objects down in memory (using the standard memcpy function that you've known for years), removing all of the gaps in the heap. Of course, moving the objects in memory invalidates all pointers to the objects. So the garbage collector must modify the application's roots so that the pointers point to the objects' new locations. In addition, if any object contains a pointer to another object, the garbage collector is responsible for correcting these pointers as well. Figure 3 shows the managed heap after a collection.


Figure 3 Managed Heap after Collection


After all the garbage has been identified, all the non-garbage has been compacted, and all the non-garbage pointers have been fixed up, the NextObjPtr (the pointer that marks where the next object will be allocated) is positioned just after the last non-garbage object. At this point, the new operation is tried again and the resource requested by the application is successfully created.
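
The compaction and pointer fix-up just described can be sketched with a toy heap. Everything here is an assumption made for illustration: `HeapObject`, its fields, the byte offsets and the `Compact` helper are hypothetical, and a real collector moves raw memory rather than updating a list. The returned value plays the role of NextObjPtr.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of an object on the heap (illustration only).
public sealed class HeapObject
{
    public string Name;
    public int Address;                 // offset of the object in the toy heap
    public int Size;
    public bool Reachable;              // result of the earlier graph walk
    public int[] Fields = new int[0];   // addresses of objects this one refers to
}

public static class CompactionSketch
{
    // Slides reachable objects down over the gaps, fixes up roots and fields,
    // and returns the offset where the next allocation would start.
    public static int Compact(List<HeapObject> heap, int[] roots)
    {
        var forwarding = new Dictionary<int, int>();   // old address -> new address
        int next = 0;
        foreach (HeapObject obj in heap)               // heap is ordered by address
        {
            if (!obj.Reachable) continue;
            forwarding[obj.Address] = next;
            obj.Address = next;
            next += obj.Size;
        }
        heap.RemoveAll(o => !o.Reachable);             // the gaps are gone

        for (int i = 0; i < roots.Length; i++)
        {
            roots[i] = forwarding[roots[i]];           // fix the application's roots
        }
        foreach (HeapObject obj in heap)
        {
            for (int i = 0; i < obj.Fields.Length; i++)
            {
                obj.Fields[i] = forwarding[obj.Fields[i]];   // fix object-to-object pointers
            }
        }
        return next;
    }

    public static void Main()
    {
        var c = new HeapObject { Name = "C", Address = 24, Size = 16, Reachable = true, Fields = new[] { 40 } };
        var heap = new List<HeapObject>
        {
            new HeapObject { Name = "A", Address = 0, Size = 16, Reachable = true },
            new HeapObject { Name = "B", Address = 16, Size = 8, Reachable = false },
            c,
            new HeapObject { Name = "D", Address = 40, Size = 8, Reachable = true },
        };
        int[] roots = { 0, 24 };   // the roots refer to A and C

        int nextObjPtr = Compact(heap, roots);
        Console.WriteLine("Roots now point to " + roots[0] + " and " + roots[1]);   // 0 and 16
        Console.WriteLine("C refers to " + c.Fields[0]);                            // 32
        Console.WriteLine("Next allocation at " + nextObjPtr);                      // 40
    }
}
```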


Keep in mind that GCs only occur when the heap is full and that, until then, the managed heap is significantly faster than a C-runtime heap.
Notice also that two classic memory-management bugs no longer exist. First, it is not possible to leak resources, since any resource not accessible from your application's roots can be collected at some point. Second, it is not possible to access a resource that is freed, since the resource won't be freed if it is reachable. If it's not reachable, then your application has no way to access it.



