Last update March 7, 2012

Gdc Hacking



Table of contents of this page
The GDC Hackers Guide   
Why GDC   
GCC Versions   
Binary Releases   
Quicklinks   
GDC call for contributors   
Installing GDC (in the simplest way possible)   
GCC Structure   
GDC Structure   
DMD Front End   
GDC bindings between DMD and GCC   
Intermediate Representation   
Extensions to DMD Frontend   

The GDC Hackers Guide    

This page is meant as a resource for all of us who want to help Walter develop the D language by developing a modified DMD frontend that can make use of GCC's middle and back ends. To do this, we must learn how to understand and edit the GDC/GCC sources.

The paperwork is complete and efforts are currently underway to merge GDC into the official GCC codebase, which represents a great step forward for D. However, more than ever, GDC is in need of contributors to keep it up to date with both the D frontend and the trunk development of the GCC backend.

The primary development repository for GDC can be found at https://bitbucket.org/goshawk/gdc. Development of GDC is generally discussed in the D.gnu newsgroup on the DigitalMars website (http://digitalmars.com/NewsGroup.html), and on the Freenode IRC network in #d.gdc (irc://chat.freenode.net/#d.gdc).

Why GDC    

There are many advantages to adding a D frontend to GCC, and most of them stem from the fact that the GCC codebase has been the focus of extensive development over several decades, and that the GCC middle and back ends are designed such that multiple languages can easily take advantage of them. This means that a D frontend that can make use of the GCC middle and backend code will gain many advantages that would require years of development to match in the DMD codebase.

Firstly, GCC targets many more platforms than DMD. Currently tested platforms include:

  • x86 Linux - Working
  • x86_64 Linux - Working
  • x86 Windows - Mostly Working
  • x86_64 Windows - Mostly Working
  • ARM Linux - Mostly Working
  • OSX - ?
Secondly, GCC has a very well-developed optimization framework that can generally generate more performant code than DMD, particularly when taking advantage of more recent CPU features such as SIMD instructions (both directly and through automatic vectorization).

Thirdly, GDB is primarily developed to debug code generated by GCC, so debugging code generated by GCC will generally result in a better user experience.

GCC Versions    

Supported GCC versions: https://bitbucket.org/goshawk/gdc/wiki/Home#!supported-gcc-versions

Binary Releases    

GDC binaries are generally not distributed for Linux, as the details of building and installing it are quite distribution-specific. However, due to the difficulty of building GDC on Windows, Daniel Green has been building and posting binaries at https://bitbucket.org/goshawk/gdc/downloads.

Quicklinks    

  • GDC ("gdc development reloaded")
  • GCC (the GNU Compiler Collection)
  • GDB (the GNU Debugger)
Possibly out of date:

GDC call for contributors    

GDC is currently developed by a very small group (as the commit history on Bitbucket shows). While the addition of a D frontend to GCC represents a great step forward for D, it also represents an informal promise by the D community to keep the D frontend up to date with the latest GCC development.

Thanks to Bitbucket and Mercurial, contributing to GDC is as easy as forking the repository and submitting a pull request (although this workflow will likely change when the merge is officially complete), and filing a bug report is a simple web form.

If you use GDC, we encourage you to try to contribute, whether by submitting pull requests or bug reports. In the past, GDC has nearly died due to poor communication and lack of development. Avoiding those issues is easier than ever before, but GDC will always need a community that's willing to give back.

Installing GDC (in the simplest way possible)    

https://bitbucket.org/goshawk/gdc/wiki/Home#!installation

GCC Structure    

Here we gather some texts which can help out in order to understand GCC/GDC. GCC is very complex, and unless we acquire good documentation many will surely give up very soon (if anyone knows of some good books, add them too).

I will give a short overview of the structure of GCC (for the newbies). GCC is a compiler for many languages and many targets, so it is divided into pieces.

  • frontend - Turn the source code into an internal representation (GENERIC).
  • middleend - Convert the GENERIC to GIMPLE and perform optimizations.
  • backend - Lower GIMPLE to RTL and turn it into target-specific assembly instructions.
What we know as "GDC" is only an implementation of the frontend part of GCC. The middleend uses callbacks to interface with the frontend. GDC lives in its own subfolder of the "core" GCC source tree, (srcdir)/gcc/d/, and all changes to the language must be made within that subfolder. GCC has other frontends, such as c (C), cp (C++), java (Java), objc (Objective-C), Fortran and Ada. One can look at these for advice, but one probably shouldn't... (One exception: the "c++" package is currently also required to build GDC, since the bundled recls library uses it.)
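
As a rough illustration of the split described above, here is a toy C++ sketch in which a language-independent driver reaches the frontend only through a table of callbacks. Every name in it (FrontendHooks, d_parse_file, gimplify, emit_asm, compile, run_pipeline_example) is made up for this page, and the "representations" are plain strings. GCC's real hook mechanism and tree types are far more involved, so treat this as a mental model only.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for the representations named above.
// Illustrative types only, not GCC's real data structures.
struct Generic { std::string text; };
struct Gimple  { std::string text; };

// Hypothetical table of callbacks that a frontend fills in.
struct FrontendHooks {
    const char *name;
    Generic (*parse_file)(const std::string &source);
};

// A pretend D frontend: wraps the source in a GENERIC-like node.
Generic d_parse_file(const std::string &source) {
    return Generic{"generic(" + source + ")"};
}

// Language-independent middle end: lowers GENERIC to GIMPLE.
Gimple gimplify(const Generic &g) {
    return Gimple{"gimple(" + g.text + ")"};
}

// Language-independent backend: emits pretend assembly from GIMPLE.
std::string emit_asm(const Gimple &g) {
    return "asm(" + g.text + ")";
}

// The driver only sees the hook table, never the frontend itself.
std::string compile(const FrontendHooks &hooks, const std::string &source) {
    Generic tree = hooks.parse_file(source);
    return emit_asm(gimplify(tree));
}

// Usage example: call this from main() to exercise the sketch.
void run_pipeline_example() {
    FrontendHooks d_hooks = {"D", d_parse_file};
    assert(compile(d_hooks, "int x = 3;") == "asm(gimple(generic(int x = 3;)))");
}
```

The point of the design is that adding a language means supplying another hook table, while gimplify and emit_asm stay untouched.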

Note that GDC is currently not an official language for GCC, but a "third party" addition. As such, it is similar to GPC (GNU Pascal Compiler), see http://www.gnu-pascal.de/

Work is underway to merge GDC into the official GCC codebase, at which point this will no longer be the case.

The frontend contains the lexer and parser - these together turn the source file into GENERIC. The GDC frontend relies heavily on the DMD sources to perform this work, and you will find the entire DMD sources in a subfolder.
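
To make the lexer/parser division concrete, here is a minimal sketch in C++ of a lexer feeding a recursive-descent parser that builds a small expression tree. It handles only integers, '+' and '*', and all names (Token, Node, lex, Parser, show, parse_to_string) are hypothetical. DMD's lexer.c and parse.c are vastly larger, and the real frontend produces DMD's own AST before GDC converts it to GENERIC.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token and tree types for illustration; not DMD's classes.
struct Token {
    enum Kind { Number, Plus, Star, End };
    Kind kind;
    std::string text;
};

struct Node {
    std::string op;                  // "+", "*", or a literal such as "42"
    std::unique_ptr<Node> lhs, rhs;  // both null for a literal
};

// Lexer: split the character stream into tokens.
std::vector<Token> lex(const std::string &src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isdigit(c)) {
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back({Token::Number, src.substr(start, i - start)});
        } else {
            if (c == '+') out.push_back({Token::Plus, "+"});
            else if (c == '*') out.push_back({Token::Star, "*"});
            ++i;  // whitespace and unknown characters are skipped in this toy
        }
    }
    out.push_back({Token::End, ""});
    return out;
}

// Parser: consume tokens and build a tree, giving '*' higher precedence.
class Parser {
public:
    explicit Parser(std::vector<Token> toks) : toks_(std::move(toks)), pos_(0) {}

    // expr := term ('+' term)*
    std::unique_ptr<Node> expr() {
        std::unique_ptr<Node> left = term();
        while (toks_[pos_].kind == Token::Plus) {
            ++pos_;
            std::unique_ptr<Node> right = term();
            left = binary("+", std::move(left), std::move(right));
        }
        return left;
    }

private:
    // term := number ('*' number)*
    std::unique_ptr<Node> term() {
        std::unique_ptr<Node> left = leaf();
        while (toks_[pos_].kind == Token::Star) {
            ++pos_;
            std::unique_ptr<Node> right = leaf();
            left = binary("*", std::move(left), std::move(right));
        }
        return left;
    }

    std::unique_ptr<Node> leaf() {
        std::unique_ptr<Node> n(new Node);
        if (toks_[pos_].kind == Token::Number) n->op = toks_[pos_++].text;
        else n->op = "?";  // toy error handling: placeholder, no diagnostics
        return n;
    }

    static std::unique_ptr<Node> binary(const std::string &op,
                                        std::unique_ptr<Node> l,
                                        std::unique_ptr<Node> r) {
        std::unique_ptr<Node> n(new Node);
        n->op = op;
        n->lhs = std::move(l);
        n->rhs = std::move(r);
        return n;
    }

    std::vector<Token> toks_;
    std::size_t pos_;
};

// Print the tree in prefix form so the nesting is visible.
std::string show(const Node &n) {
    if (!n.lhs) return n.op;
    return "(" + n.op + " " + show(*n.lhs) + " " + show(*n.rhs) + ")";
}

std::string parse_to_string(const std::string &src) {
    Parser p(lex(src));
    return show(*p.expr());
}

// Usage example: call this from main() to exercise the sketch.
void run_parser_example() {
    assert(parse_to_string("1 + 2 * 3") == "(+ 1 (* 2 3))");
}
```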

Sadly, GCC is in a very poor state as far as code readability is concerned. Complex macros and source code generators litter the middle and backends. The source is well commented, but that really doesn't help... Well, I'll let you find that out for yourselves.

The documentation (that I have read) is very hard to understand, so if anyone has any good resources or tips, write them here. Happy hacking!

GDC Structure    

DMD Front End    

File            Function
aav.c           Associative array
access.c        Access check (private, public, package ...)
aliasthis.c     Implements the alias this D symbol.
argtypes.c      Convert types for argument passing (e.g. char is passed as ubyte).
array.c         Dynamic array
arrayop.c       Array operations (e.g. a[] = b[] + c[]).
async.c         Asynchronous input
attrib.c        Attributes, i.e. storage class (const, @safe ...), linkage (extern(C) ...), protection (private ...), alignment (align(1) ...), anonymous aggregate, pragma, static if and mixin.
builtin.c       Identify and evaluate built-in functions (e.g. std.math.sin)
cast.c          Implicit cast, implicit conversion, and explicit cast (cast(T)), combining types in binary expressions, integer promotion, and value range propagation.
class.c         Class declaration
clone.c         Define the implicit opEquals, opAssign, postblit and destructor for struct if needed, and also define the copy constructor for struct.
cond.c          Evaluate compile-time conditionals, i.e. debug, version, and static if.
constfold.c     Constant folding
cppmangle.c     Mangle D types according to Intel's Itanium C++ ABI.
dchar.c         Convert UTF-32 character to UTF-8 sequence
declaration.c   Miscellaneous declarations, including typedef, alias, variable declarations including the implicit this declaration, type tuples, ClassInfo, ModuleInfo and various TypeInfos.
delegatize.c    Convert an expression expr to a delegate { return expr; } (e.g. in lazy parameter).
doc.c           Ddoc documentation generator (NG:digitalmars.D.announce/1558)
dsymbol.c       D symbols (i.e. variables, functions, modules, ... anything that has a name).
dump.c          Defines the Expression::dump method to print the content of the expression to console. Mainly for debugging.
entity.c        Defines the named entities to support the "\&Entity;" escape sequence.
enum.c          Enum declaration
expression.c    Defines the bulk of the classes which represent the AST at the expression level.
func.c          Function declaration, also includes function/delegate literals, function alias, (static/shared) constructor/destructor/postblit, invariant, unittest and allocator/deallocator.
gnuc.c          Implements functions missing from GCC, specifically stricmp and memicmp.
hdrgen.c        Generate headers (*.di files)
identifier.c    Identifier (just the name).
idgen.c         Make id.h and id.c for defining built-in Identifier instances. Compile and run this before compiling the rest of the source. (NG:digitalmars.D/17157)
impcnvgen.c     Make impcnvtab.c for the implicit conversion table. Compile and run this before compiling the rest of the source.
imphint.c       Import hint, e.g. prompting to import std.stdio when using writeln.
import.c        Import.
init.c          Initializers (e.g. the 3 in int x = 3).
inline.c        Compute the cost and perform inlining.
interpret.c     All the code which performs compile-time function evaluation (CTFE).
json.c          Generate JSON output
lexer.c         Lexically analyze the source (e.g. separate keywords from identifiers)
lstring.c       Length-prefixed UTF-32 string.
macro.c         Expand DDoc macros
mangle.c        Mangle D types and declarations
mars.c          Parse the command-line arguments (also displays command-line help)
module.c        Read modules.
mtype.c         All D types.
opover.c        Apply operator overloading
optimize.c      Optimize the AST
parse.c         Parse tokens into AST
rmem.c          Storage allocator implemented on top of the standard C allocation package.
root.c          Basic functions (deal mostly with strings, files, and bits)
scope.c         Scope
speller.c       Spellchecker
statement.c     Handles while, do, for, foreach, if, pragma, staticassert, switch, case, default, break, return, continue, synchronized, try/catch/finally, throw, volatile, goto, and label
staticassert.c  static assert.
stringtable.c   String table
struct.c        Aggregate (struct and union) declaration.
template.c      Everything related to templates.
todt.c          Generate data structures to initialize static variables added to the object file.
toobj.c         Generate the object file for Dsymbol and declarations except functions.
traits.c        __traits.
typinf.c        Get TypeInfo from a type.
unialpha.c      Check if a character is a Unicode alphabetic character.
unittests.c     Run functions related to unit tests.
utf.c           UTF-8.
version.c       Handles version
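
For readers new to compiler internals, here is a small C++ sketch of what a pass such as constant folding (constfold.c in the table above) does in principle: it walks an expression tree bottom-up and replaces operations on literals with their result. The Expr type and the helpers lit, var, bin, fold and show are hypothetical and exist only for this example. DMD's real Expression classes and folding rules cover many more cases (types, overflow, floating point and so on).

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical expression node; DMD's real Expression classes are richer.
struct Expr {
    enum Kind { IntLit, Var, Add, Mul };
    Kind kind;
    long value;                       // used when kind == IntLit
    std::string name;                 // used when kind == Var
    std::shared_ptr<Expr> lhs, rhs;   // used when kind is Add or Mul
};
typedef std::shared_ptr<Expr> ExprPtr;

ExprPtr lit(long v) {
    ExprPtr e = std::make_shared<Expr>();
    e->kind = Expr::IntLit;
    e->value = v;
    return e;
}

ExprPtr var(const std::string &n) {
    ExprPtr e = std::make_shared<Expr>();
    e->kind = Expr::Var;
    e->name = n;
    return e;
}

ExprPtr bin(Expr::Kind k, ExprPtr l, ExprPtr r) {
    ExprPtr e = std::make_shared<Expr>();
    e->kind = k;
    e->lhs = l;
    e->rhs = r;
    return e;
}

// Fold bottom-up: when both operands end up as literals, replace the
// operation with its computed value. Overflow is ignored in this toy.
ExprPtr fold(const ExprPtr &e) {
    if (e->kind == Expr::IntLit || e->kind == Expr::Var) return e;
    ExprPtr l = fold(e->lhs);
    ExprPtr r = fold(e->rhs);
    if (l->kind == Expr::IntLit && r->kind == Expr::IntLit) {
        long v = (e->kind == Expr::Add) ? l->value + r->value
                                        : l->value * r->value;
        return lit(v);
    }
    return bin(e->kind, l, r);
}

// Render the tree as text so the effect of folding is visible.
std::string show(const ExprPtr &e) {
    switch (e->kind) {
    case Expr::IntLit: return std::to_string(e->value);
    case Expr::Var:    return e->name;
    case Expr::Add:    return "(" + show(e->lhs) + " + " + show(e->rhs) + ")";
    default:           return "(" + show(e->lhs) + " * " + show(e->rhs) + ")";
    }
}

// Usage example: x + 2 * 3 folds to x + 6; the variable blocks further folding.
void run_constfold_example() {
    ExprPtr e = bin(Expr::Add, var("x"), bin(Expr::Mul, lit(2), lit(3)));
    assert(show(e) == "(x + (2 * 3))");
    assert(show(fold(e)) == "(x + 6)");
}
```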

GDC bindings between DMD and GCC    

File               Function
asmstmt.cc         Builds inline assembler and extended inline assembler statements.
d-apple-gcc.c      Deprecated - stub functions for any dependencies that can't be linked in from Apple-GCC objects.
d-asm-i386.h       Implements D Inline assembler for x86 and x86_64.
d-bi-attrs.h       Supported GCC function and type attributes.
d-builtins2.cc     Handles importing of special modules (i.e. gcc.builtins, core.vararg) in the runtime library, anything related to builtin intrinsics of GDC.
d-builtins.c       Handles GCC backend init routines for building all common and builtin trees of GCC.
d-codegen.c        Code generation utilities, emit instructions, static chain/closure creation and passing, expand frontend builtins.
d-convert.cc       Convert between basic D types, and conversions to boolean value for conditions.
d-c-stubs.cc       Deprecated - stub functions for any dependencies that can't be linked in from GCC objects.
d-decls.cc         Based on tocsym.c - builds and returns back end reference to a declaration or object.
d-dmd-gcc.h        Contains declarations used by the modified DMD front-end to interact with GCC-specific code.
d-gcc-complex_t.h  Same as DMD's complex_t, but uses GCC's REAL_VALUE_TYPE-based real_t instead of long double.
d-gcc-includes.h   Headers included from GCC.
d-gcc-real.cc      Object-oriented layer for interacting with GCC's REAL_VALUE_TYPE-based real_t.
d-gcc-tree.h       Declaration of tree and tree_node for files that cannot include d-gcc-includes.h
d-glue.cc          Builds GCC trees for all functions, statements, and expressions. Also converts D types into GCC types.
d-gt.c             For linking with the GCC garbage collector
d-incpath.c        Adds import paths for frontend to scan.
d-irstate.cc       Contains the core functionality of IRState class in d-codegen.cc
d-lang.cc          Implementation of GCC back-end callbacks and data structures. Main entry point for the D compiler (cc1d) to compile sources.
d-objfile.cc       Setup and emit global variables and functions to send to GCC backend for processing.
d-spec.c           The GDC frontend driver for processing command-line options passed to the main application.
dt.cc              Implements backend functions called from todt.c in the DMD frontend.
d-todt.c           Implements methods removed from todt.c because they require special treatment for GDC.
d-tree.def         All GDC specific tree codes are defined here.
lang.opt           All GDC specific command-line flags are defined here
symbol.cc          Implements Symbol class for d-decls.cc.

Intermediate Representation    

%% To be written here: briefly describe how GDC builds tree representations of D types, expressions, etc.

Extensions to DMD Frontend    

%% To be written here: describe in more detail areas where GDC splits away from DMD frontend.