Add your comments here...
Given that the GC can not know the types in some ranges of memory (e.g. external libraries for which there is now RTTI availabe), it has to be able to cope with maybe-pointers anyway. A moving GC is only allowed to move allocations for which it can be sure to know all references. Unions are just another example of this.
-- Marc Schütz
If you do bit-tagging, you lose the guarantee that pointers are always 4-/8-byte aligned, thereby effectively raising the probability of an arbitrary value in memory ranges without RTTI looking like a pointer by a factor of 4/8.
-- Marc Schütz
The following from the main page can not be correct:
"Things that are reliable and can be done: Use a union to share storage with a pointer..."
It should be obvious that a moving garbage collector can not know whether it is safe to change a pointer that's a union member when it points at a block of memory that has been moved. The value could actually, for instance, be an integer that just happens to alias to that memory block.
From what I glean, this is safe for now because the current collector doesn't move memory.
What I really want to see is an extension that allows tagged unions, with callbacks that allow the GC to interpret those unions. However to do this the compiler and GC will have, somehow, to deal with the fact that in some cases, the value of a union and its tag can not be changed in an atomic operation. Ie, there is an interaction between multithreaded execution, GC and tagging.
I expect that if D ever does get a compacting GC, its use will have to be optional because, as CLI programmer's know, movable objects can't be sent to native DLLs without marshaling. Unions aren't the only thing that will break if D gets a compacting GC.
Because of that, perhaps having a combination of GCs would be a better approach. Choose, on creation of an object, whether it will spend its lifetime in pinned memory or movable memory.
Other approached include pinning objects sometime after creation (a real pain, that can destroy the performance of a GC) or copying objects between movable and nonmovable memory invisibly (which would either require GC immediately to fix up pointers or a forwarding object, and a either a complete lack of inlining optimizations of calls to that type or special code that always checks for a fowarding ptr).
If you add the caveat that tagging a pointer can't make it point past the end of the object it points at, then I see no reason that bit-taggings shouldn't already work in D. After all, assuming that the object pointed at is longer than 1 byte, (obj *)((int)ptr|1) at most increments the pointer by 1.
I'm almost sure there's no alignment requirements for the GC, so tagging should work.
Bit tagging can be important to the run times of some sorts of languages. If you don't support bit-tagging then D becomes closed to some implementing some very useful programs, such as the run time systems for typeless languages.
Bit-tagging is actually supported, isn't it?
The high-efficiency approach to GC is nice to see, especially since few seem to care about performance these days.
One possible caveat to auto-relocation (heap compaction) may be cache invalidation. If a chunk of mem was in-cache, and then relocated, it will likely no longer be in-cache (unless it was one of the last chunks to be moved).
Also, this means it's not reliable to store pointers to heap mem, in other mem blocks. Offsets work, but there are existing implementations that specifically store ptrs, and consider them valid as long as the referred-to heap mem is valid (e.g., some lists).
Thus, it seems some method of "pinning" (similar to what .Net does) might be worthwhile, if not already implemented.
P.S. - I have been longing for a broadly-accepted language like D for quite some time! Beatiful design!
Q: Can I allocate memory with malloc, add a root to the GC, and trust that the GC will be able to clean it up, or is mixing memory allocation systems a no-no?
A: The GC can't free memory it didn't allocate. The "addRoot" function just takes a pointer to a GC resource and adds it to a list that is scanned by the GC. This is useful when a pointer is passed to an "external resource" that the GC doesn't scan and so without calling addRoot the memory would be collected. The addRange function adds an entire chunk of memory to the scan list.
I made 2 other GC versions.
- Use boehm GC instead of original gc.
- Note: Unfortunely, boehm gc is slower than original GC
- GC do not scan array of element size<4 such as char
- Note: GC spend time is 1/3 with big size array, but bit slow down with small size array
- Corresponding page in the D Specification:
- GCW (Garbage Collection Workgroup)
- Tech tip: Instantiating class objects elsewhere than the gc heap
- Copying/heap compacting GC
- gc and dll functions
- Can D do Hard-RT?
- GC and destruction events
- another gc question
- The C heap and D.
- What percentage of memory allocations contain pointers?
- Transitioning to a type aware Garbage Collector