跳过正文

系编-GC

·584 字· loading · loading ·
Masterlong
作者
Masterlong
熬夜,但是世界之夜
目录
系统级编程 - 这篇文章属于一个选集。
§ 6: 本文

Garbage Collector
#

  • 追踪内存分配
    • bitmap位图标志
    • 链表
  • 动态内存管理
    • EMM: explicit
      • 隐式空闲分配:使用块的大小
        • 跟踪:
          • BST, sort by size
          • 一系列不同size编组的list
          • 边界标记:tag之 从而合并空闲块
      • 显示~存在指针,速度++存储++
    • AMM: automatic

自动内存管理
#

  • Storage management is still a hard problem in modern programming
  • C and C++ programs have many storage bugs
    • forgetting to free unused memory
    • dereferencing a dangling pointer
    • overwriting parts of a data structure by accident
    • and so on…
  • Storage bugs are hard to find
    • a bug can lead to a visible effect far away in time and program text from the source

Classical GC algorithm
#

  • 将内存视作一个有向图
    • 每个内存块视作图中的节点
    • 每个指针是图中的一条边
    • 根节点:全局变量、局部变量、参数、寄存器……作为指针指向堆中的内存块
  • img
  • 这些不可达的节点就是“Garbage”,需要被我们回收

Mark & Sweep Collection
#

把所有可达节点做上标记,没有标记的就被sweep:

Mark and Sweep Collecting标记清除法(McCarthy,1960)

  • Can build on top of malloc/free package
    • Allocate using malloc until you “run out of space”
  • When out of space:
    • Use extra mark bit in the head of each block
    • Mark: Start at roots and sets mark bit on all reachablememory
    • Sweep: Scan all blocks and free blocks that are not marked
  • Conservative/保守的Garbage Collector
    • 有些是Garbage,但是可能没有标识出来
void GC()
{
    HaltAllProcessing();
    ObjectCollection roots = GetRoots();
    for(int i = o; i <roots.Count(); ++i)
    	Mark(roots[i]);
    Sweep();
}

Root Enumeration

  • The root enumeration lists all references held in registers, global or static fields, local variables on the stack, function arguments on stack, etc…
  • The runtime system needs to provide ways for the GC to collect the list of roots. E.g. in the case of .NET maintains the information on the active roots and provides APls to the GC to enumerate them.

img

基于DFS:

ptr mark (ptr p){
    if (!is_ptr(p) ) return;
    // do nothing if not pointer
    if(markBitset(p) ) return;
    // check if already marked
    setMarkBit(p);
    // set the mark bit
    for (i=0; i < length(p) ; i++)// mark all children
        mark(p[i]);
    return;
}

ptr sweep (ptr p, ptr end) {
    while (p < end){
        if markBitset(p)
        	clearMarkBit() ;
        else if (allocateBitset(p))
        	free(p);
        P+=length (p);
    }
}

Advantage

  • The advantage of mark and sweep is that you may never have to run it (it only runs when the heap is full).
  • The other advantage is that it finds and frees all unused memory blocks.

Disadvantage

  • Because the Mark and Sweep cycle locks out other processing while it runs,the GC can make a noticeable dentin a programs performance.
  • Fragmentation

Copy collection
#

Basic idea: use 2 heaps

  • One used by program
  • The other unused until GC time

GC:

  • Start at the roots & traverse the reachable data
  • Copy reachable data from the active heap (from-space) to the other heap (to-space)
  • Dead objects are left behind in from space • Heaps switch role

img

img

快but占空间

引用计数
#

img

image-20231102195747541

Reference Counting problems

  • Cost of reference counting
    • Too many ref-count increments and decrements
  • Fail to reclaim circular structures
    • Not always effective

环状引用探测不到&不用一直扫描停机

Generational GC
#

Empirical(根据经验的)observation:

  • If an object has been reachable for a long time, it is likely to remain so

Empirical observation:

  • In many languages, most objects died young

Conclusion:

  • we save work by scanning the young objects frequently and the old objects infrequently

达尔文的进化论:新生物种是最容易被淘汰(?

回收策略:

Assign objects to different generations G0, G1,…

  • G0 contains young objects, most likely to be garbage
  • G0 scanned more often than G1
  • Common case is two generations (new, tenured)

– No need to traverse the entire reference tree

– Older generations are garbage collected infrequently

– The youngest generation can be collected in 5 to 50ms (ms stands for milliseconds, 1ms = 0.001 seconds). – The youngest generation, which is where most of the work is done, contains the highest proportion of garbage. By focusing the work here, the overall system is more efficient.

系统级编程 - 这篇文章属于一个选集。
§ 6: 本文

相关文章

系编-Buffer overflow attack
·50 字· loading · loading
系编-Decoding
·47 字· loading · loading
系编-Function Call
·82 字· loading · loading