Under The Hood

July 1997

Code for this article: Hood0797.exe (3KB)

Matt Pietrek is the author of Windows 95 System Programming Secrets (IDG Books, 1995). He works at NuMega Technologies Inc., and can be reached at mpietrek@tiac.com or at http://www.wheaty.net.

In this column, I usually discuss technologies that are new, or at least haven’t already been covered extensively. However, with more and more developers joining the ranks of Win32® programmers, topics that are old hat to veterans remain a mystery to newer programmers. The subject of linkers falls into this category. Before you Visual Basic® programmers head for the exits, be advised that Visual Basic 5.0 uses a linker. In fact, it uses the same linker that Visual C++® 5.0 does. Visual Basic 5.0 does a good job of hiding this fact, but if you snoop around you’ll see that it produces OBJ files and sends them off to the Microsoft linker.

What is a linker? How does it work? This month I’ll shed some light on these questions. As part of researching this column, I went back to my old sources of information. Interestingly, it seems that much of what I’ll describe here is either out of print or no longer on the MSDN CD-ROM, even though linker technology affects nearly every Windows programmer.

For the purpose of this column, I’ll consider Microsoft’s LINK.EXE to be the standard linker. (Other linkers, such as Borland’s TLINK32, may have slight differences in behavior from what I describe here.) In a future column, I’ll go a step further and examine some of the more useful and interesting switches in the Microsoft linker. First, I’ll give you an overly simplistic definition of a linker, and then refine it later. A linker’s job is to take one or more object modules (typically OBJ files) and combine them into an executable file (that is, an EXE or DLL). However, this raises the question: what is an object module?

An object module is the output from a program that takes human-readable text and translates it into machine code and data that a CPU can understand. For C++, the C++ compiler reads a C++ source file. For assembly language, an assembler (for instance, MASM) reads an assembly language (ASM) file that contains direct equivalents for the code and data bytes that the CPU uses. In Visual Basic 5.0, the input files are the FRM, BAS, and CLS files from your project. This concept holds true for most other languages, such as Fortran.

The primary components of an object module are machine code and data. The raw bytes that make up the code and data are stored in contiguous blocks called sections. For example, Microsoft compilers put their machine code into a section called .text, and data goes into a section called .data. These names have no special meaning other than as a reminder of the intended use of a particular section. Other compilers can (and do) use different names for their sections. If you’ve done MS-DOS® or 16-bit Windows® programming, you can substitute the word “segment” for “sections” in the preceding description, and much of what I’ll say still applies. If you have Visual C++ installed on your system, you can see sections within an OBJ file yourself with the DUMPBIN program. Execute the command

 DUMPBIN <objname>

where <objname> is the name of any OBJ file. Figure 1 gives a rundown of the most commonly encountered sections. You can see the sections from a typical compile of a C++ program by running DUMPBIN on an OBJ, such as CHKSTK.OBJ from the Visual C++ \LIB directory:

 Dump of file CHKSTK.OBJ
 File Type: COFF OBJECT
      Summary
            0 .data
           2F .text

The fancy name for the output of a compiler or assembler is a compilation unit. However, most of us just think of them as OBJ files. The linker’s most important job is to collect all the compilation units and combine all the sections from the different compilation units. Of course, if things were really this simple, the linker would be nothing more than a fancy program for concatenating blobs of data. The complicated work of a linker comes from processing fixups. More on this later.

You may be wondering how the linker decides to arrange the code and data sections from the various OBJs in the final executable. It turns out that the linker has an elaborate set of rules that must be followed. In fact, the duties of a linker are so complicated that it makes two passes through its input files. The first pass allows the linker to see what it will be working with and collect its thoughts. In the second pass, the linker applies all the rules to write out the executable file.

While I won’t attempt to describe every aspect of every linker rule here, there are a couple of rules that cover the majority of linker behavior. The primary rule is that the linker must put all of the code and data from every specified OBJ file into the executable. If you give the linker three OBJ files, then the code and data from all three OBJ files must somehow be incorporated into the executable. However, the linker doesn’t simply take the raw sections from each OBJ file and string them end to end in the executable. Rather, the linker combines (concatenates) all sections with the same name. For example, if the three OBJ files each had a .text section, the resulting executable will have a single .text section, comprised of the three individual .text sections concatenated together in the order in which they were encountered.

Figure 2 A.OBJ, B.OBJ, and C.OBJ

Another rule observed by the linker is that the sequence of sections in the executable file is dictated by the order in which the linker processes the sections. The linker works its way through the list of OBJ files in exactly the order specified on the command line. However, the primary rule of combining sections with the same name takes precedence.

Figure 2 shows three OBJ files, A.OBJ, B.OBJ, and C.OBJ. In each file are three sections—all have _text and _data sections, but in different positions within each OBJ. They also all have a third section unique to their source file (that is, a_asm, b_asm, and c_asm). Let’s say you invoked LINK, passing it the command line


 A.OBJ B.OBJ C.OBJ

The order of segments (and how sections with the same name are combined) is shown in the right-hand side of Figure 2. You can download the source and OBJ files from the link at the top of this article. This way, you can experiment with different linker command lines—for example “Link B.OBJ A.OBJ C.OBJ”—even if you don’t have MASM or a compatible assembler.
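The two rules above (combine sections with the same name, and otherwise keep the order of first encounter) can be sketched in a few lines of C++. This is only an illustration of the idea, not how LINK is implemented; the Section and ObjFile types and the CombineSections function are names made up for this sketch, and a string stands in for the raw section bytes.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of a section: a name plus its raw contents.
struct Section {
    std::string name;
    std::string bytes;  // stand-in for the raw code or data bytes
};

// An OBJ file is modeled as nothing more than an ordered list of sections.
using ObjFile = std::vector<Section>;

// Walk the OBJs in command-line order. A section whose name has already
// been seen is appended to the existing output section; a new name starts
// a new output section at the end of the image.
std::vector<Section> CombineSections(const std::vector<ObjFile>& objs) {
    std::vector<Section> image;
    for (const ObjFile& obj : objs) {
        for (const Section& sec : obj) {
            bool merged = false;
            for (Section& out : image) {
                if (out.name == sec.name) {
                    out.bytes += sec.bytes;  // rule 1: same name, concatenate
                    merged = true;
                    break;
                }
            }
            if (!merged) {
                image.push_back(sec);        // rule 2: order of first encounter
            }
        }
    }
    return image;
}
```

Feeding this function models of A.OBJ and B.OBJ in that order yields the _text and _data sections first (because A.OBJ introduced them), each holding A's contribution followed by B's, with a_asm and b_asm after them.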

With these two rules in mind, you’re a good way toward knowing how the linker does its job under MS-DOS and 16-bit Windows. The Win32 linker adds several twists to what I just described, though. For starters, there’s the $ section name rule. If a section name contains a $ in it (for example, .idata$4), the $ and everything that follows will be stripped off in the executable file. However, before the linker strips down the name, it combines the sections with names that match up to the $. The name portion after the ‘$’ is used in arranging the OBJ sections in the executable. These sections are sorted alphabetically, based on the portion of the name after the $. For example, three sections called foo$c, foo$a, and foo$b will be combined into a single section called foo in the executable. The data in this section will start with foo$a’s data, continue with foo$b’s, and end with foo$c’s. This automatic combining of sections with $ in their name is used in a variety of ways. You’ll see one example later when I discuss imported functions. It’s also used to create the data tables needed for static initialization of C++ constructors and destructors.
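The $ rule is easy to model as well. The sketch below is an assumption-laden illustration rather than the linker's real algorithm: Contribution, SplitSectionName, and MergeDollarSections are hypothetical names, strings stand in for raw bytes, and the std::map result says nothing about where the merged sections land in the executable.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one OBJ section contribution.
struct Contribution {
    std::string name;   // for example "foo$b" or ".idata$4"
    std::string bytes;  // stand-in for the raw section data
};

// Split "foo$b" into {"foo", "b"}. A name without '$' has an empty suffix.
std::pair<std::string, std::string> SplitSectionName(const std::string& name) {
    const std::size_t pos = name.find('$');
    if (pos == std::string::npos) {
        return {name, std::string()};
    }
    return {name.substr(0, pos), name.substr(pos + 1)};
}

// Order contributions by the text after the '$', then concatenate them
// under the name that precedes the '$'. stable_sort keeps contributions
// with equal suffixes in the order they were encountered.
std::map<std::string, std::string> MergeDollarSections(std::vector<Contribution> parts) {
    std::stable_sort(parts.begin(), parts.end(),
                     [](const Contribution& a, const Contribution& b) {
                         return SplitSectionName(a.name).second <
                                SplitSectionName(b.name).second;
                     });
    std::map<std::string, std::string> merged;
    for (const Contribution& part : parts) {
        merged[SplitSectionName(part.name).first] += part.bytes;
    }
    return merged;
}
```

Given foo$c, foo$a, and foo$b in that order, the result contains a single entry named foo whose contents are a's data, then b's, then c's.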

Besides the $ combination rule, the Win32 linker has a few other special cases up its sleeve. Sections with the code attribute are given special preference and put first in the executable file. Following any code, the linker puts any uninitialized data sections—comprised of global data for which an initial value wasn’t specified at the time of compilation (for instance, int i; declared as a global variable in C++). Next comes initialized data (including the .data section), as well as linker-generated data sections such as .reloc.

Uninitialized data is usually put into a section called .bss by a compiler. It’s rare to see a .bss section in an executable file these days, though. The Microsoft linker merges the .bss section into .data, which is the main initialized data section used by compilers. But wait, there’s another catch! This only happens if the executable is for a subsystem other than Posix, and the subsystem version is greater than 3.5. Other sections that contain uninitialized data are left alone by the linker (that is, they aren’t merged).

Working backwards from the end of the executable, if there is a .debug section in the OBJs, it’s placed last in the executable. In the absence of a .debug section, the linker tries to put the .reloc section last because, in most cases, the Win32 loader won’t need to read the relocation information. Cutting down the amount of the executable that needs to be read decreases the load time. I’ll describe relocations later.

Yet another exception to the basic two rules that comes up under Win32 is removable sections. These sections exist in an OBJ file, but the linker doesn’t copy them into the executable. These sections typically have the IMAGE_SCN_LNK_REMOVE and IMAGE_SCN_LNK_INFO attributes (see WINNT.H), and are named .drectve. Microsoft compilers spit them out to pass on information to the linker. If you look at an OBJ that was compiled with Visual C++, you’ll see that the data in the .drectve section probably looks something like this:


 -defaultlib:LIBC -defaultlib:OLDNAMES

If this data looks suspiciously like command-line arguments to the linker, you’re on the right track. You can see other evidence of this when you use the __declspec(dllexport) modifier with C++. For example:


 void __declspec(dllexport) ExportMe( void ){…}

will cause the .drectve section to also contain:


 -export:_ExportMe

Sure enough, if you look at the list of command-line options to LINK, -export is one of them.

Fixups and Relocations

Why can’t compilers simply generate executable files directly from the source file, thereby eliminating the need for a linker? The primary reason is that most programs don’t consist of a single source file. Compilers specialize in taking a single source file and producing a raw, machine-code equivalent. Because a source file may contain references to code or data external to the source file, a compiler can’t generate exactly the right code to call that function or access that variable. Instead, the compiler’s only option is to include extra information in the output file that describes the external code or data. The term for this description of external code and data references is a fixup. Putting it bluntly, the code that the compiler generates for accessing external functions and variables is incorrect, and must be fixed up later.

Consider a call to a function named Foo in C++:


 //…
 Foo();
 //…

The exact bytes emitted from a 32-bit C++ compiler will be this:


 E8 00 00 00 00

The 0xE8 is the CALL instruction opcode. The next DWORD should contain the offset to the Foo function, measured from the end of the CALL instruction (that is, from the address of the instruction that follows it). It’s pretty clear that Foo probably isn’t zero bytes away from the CALL instruction. Simply put, this code wouldn’t work as expected if you were to execute it. The code is broken, and needs to be fixed up.

In the above example, the linker needs to replace the DWORD following the CALL opcode with the correct address of Foo. In the executable file, the linker will write a DWORD with the relative address of Foo. How does the linker know this needs to be done? A fixup record tells it so. How does the linker know where function Foo is? The linker knows about all symbols in an executable because it’s responsible for arranging and combining the components of the executable.
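The arithmetic behind this kind of patch is short enough to show. The following is a minimal sketch, assuming the DWORD being patched starts out as zero (as in the bytes above) and that the linker has already decided where the section and Foo will live; ApplyRel32 and its parameters are illustrative names, not part of any real linker interface.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Patch a 32-bit relative reference in place.
//   section       - raw bytes of the code section
//   sectionBase   - address the section will occupy (assumed already known)
//   fixupOffset   - offset of the DWORD to overwrite within the section
//   targetAddress - final address of the symbol being called
void ApplyRel32(std::vector<std::uint8_t>& section,
                std::uint32_t sectionBase,
                std::uint32_t fixupOffset,
                std::uint32_t targetAddress) {
    assert(fixupOffset + 4 <= section.size());

    // The CPU adds the displacement to the address of the byte that
    // follows the DWORD, which is the start of the next instruction.
    const std::uint32_t nextInstruction = sectionBase + fixupOffset + 4;

    // Unsigned subtraction wraps modulo 2^32, which produces the correct
    // two's complement displacement for backward calls as well.
    const std::uint32_t displacement = targetAddress - nextInstruction;

    // x86 stores the DWORD in little-endian byte order.
    for (std::uint32_t i = 0; i < 4; ++i) {
        section[fixupOffset + i] =
            static_cast<std::uint8_t>(displacement >> (8 * i));
    }
}
```

For instance, if the CALL sits at the hypothetical address 0x00401000 and Foo ends up at 0x00401020, the DWORD at offset 1 becomes 0x0000001B, because the displacement is measured from 0x00401005.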

Now, about those fixup records. For Intel-based OBJ files, there are only three types of fixup records that are normally encountered. The first are 32-bit relative fixups, known as REL32 fixups. (For the curious, they correspond to the IMAGE_REL_I386_REL32 #define in WINNT.H.) In the above example of a call to function Foo, there will be a REL32 fixup record, and it will have the offset of the DWORD that the linker needs to overwrite with the appropriate value. If you were to run


 DUMPBIN /RELOCATIONS

for the OBJ created by the above code, you’d see something like this:


                                Symbol    Symbol
  Offset    Type    Applied To  Index     Name
  ————————  ————    ——————————  ——————    ——————
  00000004  REL32    00000000      7      _Foo

In English, this fixup record says that the linker needs to calculate the relative offset to function Foo, and write that value to offset four in the section. Since this fixup record is only needed by the linker prior to creating the executable, it’s discarded and doesn’t appear in the executable. Why then do most executables contain a section called .reloc? This is where the second type of fixup comes into play. Consider the following program:


 int i;
 int main()
 {
     i = 0x12345678;
 }

Visual C++ generates this instruction for the assignment in the executable:


 MOV DWORD PTR [00406280],12345678

What’s interesting is the [00406280] part of the instruction. It’s referencing a fixed location in memory, and assumes that the DWORD containing the variable i is 0x6280 bytes above the load address of the executable file, which is at 0x400000 by default. Now, consider what happens if the executable can’t be loaded at the default load address. Instead, let’s say that the Win32 loader loads it 2MB higher in memory (that is, the executable loads at 0x600000). When this happens, the [00406280] part of the instruction needs to be adjusted to 0x00606280.

It’s for just such occasions that DIR32 (Direct 32) fixups are used in OBJ files. They signify locations where the actual (direct) address of something needs to be plugged in. By implication, they also signify the locations where the load address of the executable file is significant. When creating the executable, the linker takes the DIR32 fixups from the OBJs and creates the .reloc section. Before this happens though, running DUMPBIN /RELOCATIONS on the OBJ shows:


                                Symbol   Symbol
  Offset    Type    Applied To  Index    Name
  ————————  —————   ——————————  ——————   ——————
  00000005  DIR32    00000000      4     _i

The fixup record says that the linker needs to calculate the direct 32-bit address of the variable _i, and write that value to offset five in the section.

The .reloc section in an executable is basically a series of addresses in the executable where the difference between the default and actual load address needs to be accounted for. By default, the linker creates the executable so that the .reloc section isn’t needed by the Win32 loader. However, when the Win32 loader needs to load an executable somewhere other than its preferred load address, the .reloc section allows all the direct references to code and data to be updated.
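What the loader does with that information can be sketched as follows. This is a simplified model under stated assumptions: the real .reloc section packs its entries into per-page blocks, whereas here the relocations are flattened into a plain list of offsets, and ApplyBaseRelocations is a made-up name for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Adjust every direct 32-bit address in a loaded image by the difference
// between where the image was actually loaded and where it preferred to be.
//   image         - the image bytes as laid out in memory
//   relocOffsets  - offsets (from the start of the image) of each DWORD
//                   holding a direct address; flattened for simplicity
void ApplyBaseRelocations(std::vector<std::uint8_t>& image,
                          const std::vector<std::uint32_t>& relocOffsets,
                          std::uint32_t preferredBase,
                          std::uint32_t actualBase) {
    const std::uint32_t delta = actualBase - preferredBase;
    if (delta == 0) {
        return;  // loaded at the preferred address: nothing to patch
    }
    for (std::uint32_t offset : relocOffsets) {
        assert(offset + 4 <= image.size());

        // Read the little-endian DWORD stored at this location.
        std::uint32_t value = 0;
        for (std::uint32_t i = 0; i < 4; ++i) {
            value |= static_cast<std::uint32_t>(image[offset + i]) << (8 * i);
        }

        value += delta;  // slide the address by the load-address difference

        // Write the adjusted DWORD back.
        for (std::uint32_t i = 0; i < 4; ++i) {
            image[offset + i] = static_cast<std::uint8_t>(value >> (8 * i));
        }
    }
}
```

With a preferred base of 0x400000 and an actual base of 0x600000, a stored address of 0x00406280 becomes 0x00606280, matching the adjustment described earlier for the MOV instruction.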

The third type of fixups commonly found in Intel OBJs, DIR32NB (Direct 32, No Base), are used for debug information. One of the secondary jobs of the linker is to create debug information that includes the names of functions and variables, along with their addresses. Since only the linker knows where all the functions and variables will end up, the DIR32NB fixup is used to indicate spots in the debug information where the address of a function or variable is needed. The key difference between DIR32 and DIR32NB fixups is that the values patched in for DIR32NB fixups don’t include the default load address of the executable.

Libraries

In some circumstances, it’s worthwhile to combine two or more OBJs together into a single file, which can then be given to the linker. The classic example of this is the C++ runtime library (RTL). The C++ RTL is made up of numerous source files that are compiled, and the resulting OBJs are combined into a library. For Visual C++, the standard, single threaded, static version of the runtime library is called LIBC.LIB. There are other variations for debugging (for example, LIBCD.LIB) and multithreading (LIBCMT.LIB).

Library files usually have the .LIB extension. They consist of a library header, followed by the raw data of the contained OBJs. The library header informs the linker which symbols (functions and variables) can be found in the following OBJs, as well as which OBJ a given symbol resides in. You can see the contents of a library via the DUMPBIN /LINKERMEMBER switch. Without going into the details of why, you’ll find DUMPBIN’s output more readable if you specify :1 or :2 afterwards. For example, using PENTER.LIB from Visual C++ 5.0 with the command


 DUMPBIN /LINKERMEMBER:1 PENTER.LIB

produces this snippet of output:


     6 public symbols
       180 _DumpCAP@0
       180 _StartCAP@0
       180 _StopCAP@0
       180 _VERSION
       180 mcount
       180 penter

The 180 in front of each symbol name indicates that the symbol (for instance, _DumpCAP@0) can be found in an OBJ file beginning 0x180 bytes into the library. As you can see, PENTER.LIB only has one OBJ in it. More complicated LIB files will have multiple OBJs, so the offsets preceding the symbol names will be different.

Unlike OBJs passed on the command line, the linker does not have to include every OBJ in a library into the final executable. Quite the opposite, in fact. The linker won’t include any OBJ code or data from a library OBJ unless there’s a reference to at least one symbol from that OBJ. Put another way, explicitly named OBJs on the linker command line fly first class, and are always included in the executable. OBJs from LIB files fly standby, and are only included in the executable if referenced.

A symbol in a library can be referenced (and hence, its OBJ included) in two ways. First, there can be a direct reference to a symbol from one of the explicit OBJ files. For example, if I were to call the C runtime’s printf function from a source file I wrote, there would be a reference (and a fixup) generated for it in my OBJ file. When creating the executable, the linker would search its LIB files for the OBJ containing the printf code, and include the OBJ it finds.

Second, there can be an indirect reference. Indirect means an OBJ included via the first method contains references to symbols in yet another OBJ file in the library. This second OBJ may in turn reference symbols in a third OBJ file in the library. One of the linker’s toughest jobs is to track down and include every OBJ that has a referenced symbol, even if that symbol is located via 49 levels of indirection.
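Chasing direct and indirect references amounts to a worklist walk over the library. The sketch below illustrates the idea under simplifying assumptions: it models a single library, LibMember and SelectLibraryMembers are hypothetical names, and each member simply lists the symbols it defines and the symbols it refers to.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one OBJ stored inside a library.
struct LibMember {
    std::string name;                     // for example "first.obj"
    std::vector<std::string> defines;     // symbols this OBJ provides
    std::vector<std::string> references;  // symbols this OBJ needs
};

// Starting from the symbols left undefined by the explicit OBJs, return the
// names of every library member that must be pulled into the executable.
std::set<std::string> SelectLibraryMembers(
        const std::vector<std::string>& undefinedSymbols,
        const std::vector<LibMember>& library) {
    // Plays the role of the library header: symbol -> member that defines it.
    std::map<std::string, std::size_t> directory;
    for (std::size_t i = 0; i < library.size(); ++i) {
        for (const std::string& symbol : library[i].defines) {
            directory.emplace(symbol, i);
        }
    }

    std::set<std::string> included;
    std::vector<std::string> pending(undefinedSymbols);
    while (!pending.empty()) {
        const std::string symbol = pending.back();
        pending.pop_back();

        const auto found = directory.find(symbol);
        if (found == directory.end()) {
            continue;  // not in this library; some other input must supply it
        }
        const LibMember& member = library[found->second];
        if (!included.insert(member.name).second) {
            continue;  // this OBJ was already pulled in
        }
        // Indirect references: the new OBJ's own needs join the worklist.
        for (const std::string& needed : member.references) {
            pending.push_back(needed);
        }
    }
    return included;
}
```

A member that nothing refers to never enters the result, no matter how many symbols it defines, which is the standby behavior described above.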

When looking for a symbol, the linker searches the LIB files in the order it encountered them on the command line. However, once a symbol is found in a library, that library becomes the preferred library, and is given first crack at all future symbols. The library loses its favored status once a symbol isn’t found in the library. In this case, the next library in the linker list is searched. (For a more technically detailed description, see the Microsoft Knowledge Base article Q31998.)

Let’s now address the issue of import libraries. Structurally, import libraries are no different than regular libraries. When resolving symbols, the linker doesn’t know the difference between an import library and a regular library. The key difference is that there’s no compilation unit (for example, source file) that corresponds to each OBJ in the import library. Instead, the linker itself produces the import library, based upon the symbols that are exported from an executable being built. Put another way, when the linker creates the exports table in an executable, it also creates the corresponding import library to reference those symbols. This point leads nicely to my next topic, the imports table.

Creating the Imports Table

One of the most fundamental features that Win32 rests upon is the ability to import functions from other executables. All of the information about the imported DLLs and functions resides in a table in the executable known as the imports table. When it’s in a section all by itself, this section is named .idata.

Since imports are so vital to Win32 executables, it may seem strange that the linker doesn’t have any special knowledge of import tables. Put another way, the linker doesn’t know or care whether a function you’ve called resides in another DLL, or within the same executable. The way that this is accomplished is all very clever. By simply following the section combining and symbol resolution rules described above, the linker creates the imports table, seemingly unaware of the special significance of the table.

Let’s look at some fragments from an import library to see how the linker accomplishes this feat. Figure 3 shows portions of running DUMPBIN on the USER32.LIB import library. Pretend that you’ve called the ActivateKeyboardLayout API. A fixup record for _ActivateKeyboardLayout@8 can be found in your OBJ file. From the USER32.LIB header, the linker determines that this function can be found in the OBJ at offset 0xEA14 in the file. At this point, the linker is committed to including the contents of this OBJ in the finished executable (see Figure 3).

From Figure 3, you can see that a variety of sections from the OBJ will be brought in, including .text, .idata$5, .idata$4, and .idata$6. In the .text section is a JMP instruction (the 0xFF 0x25 opcode). From the COFF symbol table at the end of Figure 3, you can see that _ActivateKeyboardLayout@8 resolves to this JMP instruction in the .text section. Thus, the linker hooks up your CALL to ActivateKeyboardLayout to the JMP instruction in the .text section of the import library’s OBJ.

The linker combines the .idata$XXX sections into a single .idata section in the executable. Now recall that the linker has to follow the rule for combining sections with a $ in their name. If there are other imported functions brought in from USER32.LIB, their .idata$4, .idata$5 and .idata$6 sections will also be thrown into the mix. The net result is that all the .idata$4 sections create one array, while all the .idata$5 sections create another array. If you’re familiar with the term “import address table,” this process is how that table is created.

Finally, notice that the raw data for the .idata$6 section contains the string ActivateKeyboardLayout. This is how the names of imported functions make it into the import table. The important point is that creating the import table isn’t a big deal for the linker. It’s just doing its job, following the rules I described earlier.

Creating the Exports Table

Besides creating an import table for executables, a linker is also responsible for creating the opposite: the exports table. Here, the linker’s job is both harder and easier. In pass one, the linker collects information about all the exported symbols, builds the export table, and writes it to a section called .edata in an OBJ file. This OBJ file is standard in all respects, except that it uses an extension of .EXP rather than .OBJ. That’s right, you can use DUMPBIN to examine the contents of those EXP files that seem to accumulate in the presence of DLLs that you build.

During its second pass, the linker’s job is almost trivial. It simply treats the EXP as a regular OBJ file. This in turn means that the .edata in the OBJ will be included in the executable. Sure enough, if you see an .edata section in an executable, it’s the export table. These days, though, finding an .edata section is increasingly rare. It seems that if the executable uses the Win32 console or GUI subsystems, the linker automatically merges the .edata section with the .rdata section, if one is present.

Wrap Up

Obviously, a linker has many more jobs than I’ve described here. For example, producing certain types of debug information (such as CodeView info) is a major piece of a linker’s total work. However, creating debug information isn’t an absolutely mandatory job for the linker, so I haven’t spent any time describing it. Likewise, a linker should be able to create a MAP file listing the public symbols that were included in the executable, but again it’s not a mandatory function of a linker.

While I’ve covered a lot of complex ground, at its heart a linker is simply a tool for combining multiple compilation units into a functioning executable. The first cornerstone is in combining sections; the second is in resolving references (fixups) between the combined sections. Throw in a dash of knowledge about system-specific data structures such as the exports table, and you’ve covered the basics of this powerful and essential tool.