手记

garbage collection brief introduction

Before get started, I want to tell you a joke:

A few weeks ago, a conceited idiot told others that he was planning to design and implement a new programming language IN HALF A YEAR when he didn’t even understand how the runtime manages the memory! What is the conceited idiot doing now? He is writing the article you are reading.

So today, I want to talk about garbage collection.

language without automatic garbage collection.

As you may know, not all programming languages have garbage collection. The well-known languages C and C++ do not have garbage collection. Programmers have to manage the memory manually which always lead to runtime error.For instance, forget to free memory may lead to memory leak, and free a memory twice will cause error too.

Base on the fact that programmers are getting lazier and lazier and manual memory management require programmers to pay more attention , more languages tend to offer automatic garbage collection.

garbage collection: what is ’garbage’

A seemingly silly question is “What is so called garbage?”. Objects can’t be called “garbage” until we no longer use them or they are out of reach.

Basically, there are three different ways to collect garbage:

  • reference count
  • mark and sweep
  • stop and copy

For better performance, you can use more than one way at the same time.

reference count

The easiest way to collect garbage is collecting by reference count. The basic idea are as follows:

we add a reference count on each object to track how many references exist on the object.
the initial value of a newly created object is 0.
Increase it’s count when creating a reference to this object and decrease it’s count when removing a reference to this object
reclaim this object when it’s count is 0.
As u can see, it is easy to implement and we do need to stop the runtime to collect garbage, But unfortunately it has a very serious problem : it can not reclaim objects in reference cycle.

mark and sweep

As the name suggested, Mark-and-sweep has two steps: mark and swap:

[mark] step one: mark all objects that can be reached from the root .
[sweep] step two: reclaim unmarked objects and unmark the marked objects
It sounds like a good idea, right? Unlike reference count, this algorithm can precisely reclaim all garbage objects! But wait, There is a serous problem too: it has to stop the runtime running when collecting garbage at least in the marking phase!

stop and copy

Basic idea:

divide memory into too same size regions, one to store objects, the other one prepare to store objects copied from the previous region.
allocate objects in one region
when the region is full copy all reachable objects to the other region and modify all pointers.
repeat 2 and 4.
Personally I do not think it is an efficient garbage collection algorithm, coz it wastes MORE than HALF of memory!!!

summery

One thing we should know is that most of objects have very short life cycle. So modern garbage collections tend to manage lang-life objects and short-life objects separately. And since mark-and-sweep and stop-and-copy will stop-the-world when collecting garbage, it’s a good idea to partition memory into multiple sections which will reduce the stop-the-world time.

3人推荐
随时随地看视频
慕课网APP