cmd/link: rearrange compiler/linker generated data for clarity #76038

@ianlancetaylor

A Go executable contains information generated by the compiler and linker that is not program code or data. This information currently appears in several different locations in the executable. I think we should try to rearrange it to make it easier for users to understand.

The information I am discussing includes at least the following:

  • buildinfo, which includes module dependencies and build settings. In ELF this is found in the .go.buildinfo section. It can be found without a section table by searching for a magic string, which is "\xff Go buildinf:". It contains a 32-byte header followed by a pair of strings. It's written out by (*Link).buildinfo in cmd/link/internal/ld/data.go. It's parsed by readRawBuildInfo in debug/buildinfo/buildinfo.go.
  • The go:buildinfo.ref symbol is a pointer-sized symbol that refers to the buildinfo data, to keep the C linker from removing the .go.buildinfo section. This symbol appears in the .rodata section. It's written out by (*Link).buildinfo in cmd/link/internal/ld/data.go.
  • moduledata, which has pointers to most of the other generated information. In ELF this is found at the symbol runtime.firstmoduledata (for an executable). There is no way to find it if the symbol table is missing. The format is simply the struct runtime.moduledata. It appears in the .noptrdata section. It is written out by (*Link).symtab in cmd/link/internal/ld/symtab.go.
  • pcheader, which points to more of the generated information. The moduledata points to the pcheader. The format is the struct runtime.pcHeader. In ELF it appears at the start of the .gopclntab section. It starts with a four-byte magic number, 0xfffffff1. It is written out by (*pclntab).generatePCHeader in cmd/link/internal/ld/pcln.go.
  • The function offset table, which has one entry for each function and method in the executable, plus one more entry that records the end of the last function. This can be found via a slice in moduledata. The format is runtime.functab, which is just a pair of uint32 offsets. The number of functions can also be found in the PC header, although as noted the slice has one more entry. This appears in the .gopclntab section. It is written out by writePCToFunc in cmd/link/internal/ld/pcln.go.
  • The PC to function lookup table. This divides the address space devoted to functions into buckets, where each bucket covers 4096 bytes. See runtime.findfuncbucket for details. It can be found via a pointer in the moduledata. The size doesn't seem to be recorded; it must be computed based on minpc and maxpc in moduledata (or the addresses in the function offset table). It appears in the .rodata section. It is written out by (*Link).findfunctab in cmd/link/internal/ld/pcln.go.
  • The function table. This records information for each function. It can be found via the function offset table, or by the pclntable field in moduledata or the pclnOffset field in pcheader. The format starts with runtime._func, and is followed by variable length arrays containing pcdata and funcdata offsets. This appears in the .gopclntab section. It is written out by writeFuncs in cmd/link/internal/ld/pcln.go.
  • The function name table. This is simply the name of each function as a series of NUL terminated strings. The function table has an offset into this table for each function. This can also be found as the funcnametab field of moduledata, and the funcnameOffset field of pcheader. It appears in the .gopclntab section. It is written out by (*pclntab).generateFuncnametab.
  • The compilation unit table, which maps file numbers for a compilation unit to offsets into the file name table. See runtime.funcfile for how this is used. Basically it lets each function store small numbers when mapping from PC to file name. This can be found via the cutab field of moduledata or the cuOffset field of pcheader. The format is just a slice of uint32 values. It appears in the .gopclntab section. It is written out by (*pclntab).generateFilenameTabs.
  • The file name table, which contains the actual file names, as a series of NUL terminated strings. Entries are found via the compilation unit table. This can also be found via the filetab field of moduledata or the filetabOffset field of pcheader. It appears in the .gopclntab section. It is written out by (*pclntab).generateFilenameTabs.
  • The pcdata information, also known as pctab. The function table pcdata offsets point here, as does the function table pcsp field. The table can also be found via the pctab field of moduledata or the pctabOffset field of pcheader. The format of this table is complex and is described at http://go.dev/s/go12symtab in the PC-Value Table Encoding appendix. It appears in the .gopclntab section. The data is created by the compiler.
  • The funcdata information. The function table funcdata offsets point here, and it can also be found via the gofunc field of moduledata. The format of this table varies depending on the exact funcdata information being recorded. It appears in the .rodata section (not the .gopclntab section). There doesn't seem to be a way to know the complete size of the data. The data is created by the compiler, though the linker seems to compute some of the inline symbol information.
  • Type descriptors used for the reflect package. Besides direct references from code, this can be found via the typelinks field of moduledata, which indexes into the types (and etypes) pointers in moduledata. These appear in the .rodata section.
  • The type links used by the reflect package to find type descriptors. This is a sequence of int32 offsets into the types section, sorted by type string. This appears in the .typelink section. It is written by (*Link).typelink.
  • Type descriptors have sizes that depend on their kind. Many types also have a GC bitmask. The size of each GC bitmask depends on the type. These appear in the .rodata section.
  • Larger types will compute the GC bitmask at runtime as needed. For these types, the program will contain a pointer that is filled out as needed (by runtime.getGCMaskOnDemand). These pointers appear in the .noptrdata section (though they could be .noptrbss; there is a TODO in dgcptrmaskOnDemand in cmd/compile/internal/reflectdata/reflect.go).
  • The data and BSS sections have their own GC information. These can be found via the gcdata and gcbss fields of moduledata. They appear in the .rodata section. These are GC programs, not bitmasks; they are expanded into bitmasks by runtime.progToPointerMask.
  • Interface tables computed by the compiler. These can be found via the itablinks field of moduledata, which is a slice of pointers to runtime.itab structs. The itablinks slice appears in the .itablink section. The itabs themselves appear in the .rodata section. The itabs are variable length, as they have a list of pointers to methods.
  • FIPS checking information used by crypto/internal/fips140/check to verify the checksum of FIPS sections when in FIPS mode. This can be found at the go:fipsinfo symbol. It appears in the .go.fipsinfo section. Currently it is 120 bytes.
  • Function descriptors are created whenever a top-level function or method expression is converted to a function type. A value of function type is a pointer to memory such that the first word of that memory is the address of the function code. Remaining words are closure or method pointers, but a top-level function or method expression doesn't have any of those. So a function descriptor is just a word of memory pointing to the function code. These appear in the .rodata section. They are created by WriteFuncSyms in cmd/compile/internal/staticdata/data.go.
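
To make the buildinfo item concrete, here is a rough sketch of locating the data by scanning for the magic string, with no section table involved. The magic string and the 32-byte header size come from the description above. The rest is an assumption for illustration: that header byte 15 holds flags, that flag bit 0x2 means the two strings follow the header inline, and that each string is prefixed with a uvarint length. The names findBuildInfo, readVarString, and syntheticImage are hypothetical. The real parser is readRawBuildInfo in debug/buildinfo, which also checks alignment and handles the pointer form that this sketch skips.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// Values taken from the issue text.
const (
	buildInfoMagic      = "\xff Go buildinf:"
	buildInfoHeaderSize = 32
)

// Assumption (not stated in the issue): bit 0x2 of header byte 15 means the
// two strings are stored inline, directly after the header.
const flagInlineStrings = 0x2

var errNotFound = errors.New("buildinfo: no usable header found")

// readVarString decodes one uvarint-length-prefixed string and returns the
// remaining bytes. ok is false if the data is truncated or malformed.
func readVarString(b []byte) (s string, rest []byte, ok bool) {
	n, w := binary.Uvarint(b)
	if w <= 0 || n > uint64(len(b)-w) {
		return "", nil, false
	}
	return string(b[w : w+int(n)]), b[w+int(n):], true
}

// findBuildInfo scans raw file bytes for the magic string and decodes the
// pair of strings that follows the header. Candidates that do not use the
// inline-string layout are skipped.
func findBuildInfo(image []byte) (version, modinfo string, err error) {
	off := 0
	for {
		i := bytes.Index(image[off:], []byte(buildInfoMagic))
		if i < 0 {
			return "", "", errNotFound
		}
		start := off + i
		off = start + 1 // resume after this candidate if it is rejected
		if len(image)-start < buildInfoHeaderSize {
			return "", "", errNotFound
		}
		hdr := image[start:]
		if hdr[15]&flagInlineStrings == 0 {
			continue // pointer form needs address-to-offset mapping; not handled here
		}
		v, rest, ok := readVarString(hdr[buildInfoHeaderSize:])
		if !ok {
			continue
		}
		m, _, ok := readVarString(rest)
		if !ok {
			continue
		}
		return v, m, nil
	}
}

// syntheticImage builds a fake file image with a header and two strings, so
// the sketch can be exercised without a real executable.
func syntheticImage(version, modinfo string) []byte {
	hdr := make([]byte, buildInfoHeaderSize)
	copy(hdr, buildInfoMagic)
	hdr[14] = 8 // pointer size (assumed position)
	hdr[15] = flagInlineStrings
	img := append([]byte("unrelated bytes before the header"), hdr...)
	img = binary.AppendUvarint(img, uint64(len(version)))
	img = append(img, version...)
	img = binary.AppendUvarint(img, uint64(len(modinfo)))
	img = append(img, modinfo...)
	return img
}

func main() {
	v, m, err := findBuildInfo(syntheticImage("go1.25", "path\texample.com/demo"))
	fmt.Println(v, m, err)
}
```

For real binaries, buildinfo.ReadFile in debug/buildinfo is the supported way to read this data; the sketch only shows why a section table is not strictly needed.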
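
The note that the size of the PC to function lookup table is not recorded can be illustrated with a little arithmetic. The 4096-byte bucket size comes from the description above. The per-bucket layout used here (a uint32 index followed by 16 one-byte sub-bucket entries, so 20 bytes per bucket) is my reading of runtime.findfuncbucket and should be treated as an assumption. The function names are hypothetical.

```go
package main

import "fmt"

const (
	bucketBytes = 4096 // address range covered by one bucket (from the issue text)
	subbuckets  = 16   // assumption: sub-buckets per bucket in runtime.findfuncbucket
	entrySize   = 4 + subbuckets // assumption: uint32 idx plus 16 one-byte entries
)

// findfuncTabSize estimates the number of buckets and the table size in bytes
// for the text range [minpc, maxpc), rounding up to a whole bucket.
func findfuncTabSize(minpc, maxpc uint64) (nbuckets, sizeBytes uint64) {
	if maxpc <= minpc {
		return 0, 0
	}
	nbuckets = (maxpc - minpc + bucketBytes - 1) / bucketBytes
	return nbuckets, nbuckets * entrySize
}

// bucketIndex returns which bucket and sub-bucket a PC falls into.
// The caller must ensure pc >= minpc.
func bucketIndex(minpc, pc uint64) (bucket, sub uint64) {
	off := pc - minpc
	return off / bucketBytes, off % bucketBytes / (bucketBytes / subbuckets)
}

func main() {
	n, size := findfuncTabSize(0x401000, 0x403001)
	fmt.Println("buckets:", n, "bytes:", size)

	b, s := bucketIndex(0x401000, 0x402100)
	fmt.Println("bucket:", b, "sub-bucket:", s)
}
```

A tool that wants to read this table from a binary would therefore need minpc and maxpc from moduledata first, which is part of why recording the size explicitly might be worth considering.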

Metadata

    Labels

    Implementation: Issues describing a semantics-preserving change to the Go implementation.
    NeedsFix: The path to resolution is known, but the work has not been done.
    compiler/runtime: Issues related to the Go compiler and/or runtime.
