In Visual Studio 2019 Preview 2 we made two changes to improve linker throughput: the compiler back-end now prunes debug information that is unrelated to the code or data emitted into the binary, and certain hash implementations in the PDB engine have been changed. Together these reduced link time by more than 2x for one large AAA game title.
Debug Info Pruning
The compiler back-end now prunes the debug info of any user-defined types (UDTs) that are not referenced by any symbol record. This reduces the size of the OBJ sections that hold debug info: .debug$S, which holds debug records for symbols, and, if /Z7 is used, .debug$T, which holds debug records for types. When /Zi or /ZI is used, the compiler writes debug info for types into a PDB file, which is usually shared by the compilations of all source files under one directory. In this case we don’t prune types from the compiler-generated PDB; we only remove S_UDT records from the .debug$S sections if the underlying UDTs are not referenced by any symbol. With smaller debug sections in OBJs and LIBs, there is less type merging and symbol processing to do when generating the PDB. That speeds up linking, because PDB generation usually takes the majority of link time. The linker also makes heavy use of memory-mapped file I/O, so smaller OBJs and LIBs relieve pressure on virtual memory, which is crucial for link speed when working on big binaries like those in game development.
Type pruning in the compiler is not free: it degrades compilation throughput, especially when the compiler needs to generate a PDB under /Zi or /ZI and the PDB server (mspdbsrv.exe) is in use, for example because /MP is used or because a smart build system kicks off multiple compilations targeting the same PDB file at the same time. Since linking is usually the biggest bottleneck in build throughput, type pruning is on by default when mspdbsrv.exe is not used in compilation. We think this is a good tradeoff, since compilations can easily be done in parallel, and in the edit-build-debug cycle, where usually only a small portion of the source files need to be recompiled, link time dominates overall build time. If you want to force pruning on when mspdbsrv.exe is involved, add the compiler option /d2prunedbinfo.
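The pruning idea can be pictured as a reachability pass over type records, starting from the types that symbol records point at. The sketch below is only a simplified model under stated assumptions: `TypeRecord`, `markReferencedTypes`, and `countPruned` are hypothetical names, and the real compiler back-end and CodeView record layout are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of a type record: it may refer to other
// type records by index (for example, a struct referring to member types).
struct TypeRecord {
    std::vector<std::size_t> referencedTypes;
};

// Marks every type record reachable from a symbol record.
// symbolTypeRefs holds the type index that each symbol record points at.
// Returns keep[i] == true for the records that must stay.
std::vector<bool> markReferencedTypes(const std::vector<TypeRecord>& types,
                                      const std::vector<std::size_t>& symbolTypeRefs) {
    std::vector<bool> keep(types.size(), false);
    std::vector<std::size_t> worklist;

    // Roots: types referenced directly by a symbol record.
    for (std::size_t root : symbolTypeRefs) {
        assert(root < types.size());
        if (!keep[root]) {
            keep[root] = true;
            worklist.push_back(root);
        }
    }

    // Follow type-to-type references until nothing new is found.
    while (!worklist.empty()) {
        std::size_t current = worklist.back();
        worklist.pop_back();
        for (std::size_t next : types[current].referencedTypes) {
            assert(next < types.size());
            if (!keep[next]) {
                keep[next] = true;
                worklist.push_back(next);
            }
        }
    }
    return keep;
}

// Counts the records that were never marked, i.e. the ones that would be pruned.
std::size_t countPruned(const std::vector<bool>& keep) {
    std::size_t pruned = 0;
    for (bool kept : keep) {
        if (!kept) ++pruned;
    }
    return pruned;
}
```

For example, with four type records where record 0 refers to record 1 and record 2 refers to record 3, a single symbol pointing at record 0 keeps records 0 and 1, and records 2 and 3 are candidates for pruning.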
Type and Global Symbol Hash Improvement in PDB
The PDB file stores various hashes of types, both to make it easy to add new type records to an existing PDB file and to support type queries at debug or profile time. The PDB file format has been around for more than 25 years, and many tools built by Microsoft and other companies work with PDBs. Although the type hashes in today’s PDB handle large numbers of types inefficiently, we don’t want to simply switch to a more efficient hash with a different structure, because that would break compatibility with the PDB format. Instead, in Preview 2 we use xxhash to check whether a given type is unique. When type merging is done and it is time to commit everything to the PDB file on disk, we rebuild the hashes used in today’s PDB file and write them out. xxhash is extremely fast. Although it doesn’t meet the security requirements of cryptographic applications, it is a good-quality hash function, and we use it here only for uniqueness checking.
Similarly, the linker now communicates the number of public symbols to the PDB engine, so that the engine can set up a hash table with a sufficient number of buckets, which results in far fewer hash collisions. As with type merging, we need to convert the in-memory version of the hash into the on-disk format before committing it to the PDB.
In Preview 2 the improvements to the internal PDB hashes only take effect when a PDB is generated from scratch. Reading records out of an existing PDB and rebuilding the fast in-memory hashes is expensive, and that overhead offsets the possible gain from processing types and symbols with fast hashes.
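The two-phase approach described above (a fast in-memory hash for uniqueness checks during merging, then a rebuild of the compatible hash at commit time) might be sketched roughly as follows. This is an illustration under assumptions, not the PDB engine's code: `TypeMerger` and its members are hypothetical, `std::hash` stands in for xxhash, and `placeholderLegacyHash` is an invented placeholder rather than the hash the PDB format actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical model: a type record is represented by its serialized bytes.
class TypeMerger {
public:
    // Returns the index of the record, adding it only if it has not been seen.
    // The unordered_map uses std::hash here, standing in for xxhash.
    std::uint32_t addOrFind(const std::string& recordBytes) {
        std::uint32_t nextIndex = static_cast<std::uint32_t>(records_.size());
        auto result = indexByRecord_.insert(std::make_pair(recordBytes, nextIndex));
        if (result.second) {
            records_.push_back(recordBytes);  // new unique record
        }
        return result.first->second;
    }

    std::size_t uniqueCount() const { return records_.size(); }

    // Commit step: rebuild a bucket table in an assumed legacy layout
    // (fixed bucket count, each bucket listing type indices). Done once,
    // after merging, so the slower hash is not on the hot path.
    std::vector<std::vector<std::uint32_t>> buildLegacyBuckets(std::size_t bucketCount) const {
        assert(bucketCount > 0);
        std::vector<std::vector<std::uint32_t>> buckets(bucketCount);
        for (std::size_t i = 0; i < records_.size(); ++i) {
            std::size_t bucket = placeholderLegacyHash(records_[i]) % bucketCount;
            buckets[bucket].push_back(static_cast<std::uint32_t>(i));
        }
        return buckets;
    }

private:
    // Placeholder only; NOT the hash used by the real PDB format.
    static std::uint32_t placeholderLegacyHash(const std::string& bytes) {
        std::uint32_t h = 0;
        for (unsigned char c : bytes) {
            h = h * 31u + c;
        }
        return h;
    }

    std::unordered_map<std::string, std::uint32_t> indexByRecord_;
    std::vector<std::string> records_;
};
```

In this model, adding the same record twice returns the same index and leaves the unique count unchanged, and the legacy bucket table is produced only once, when `buildLegacyBuckets` is called at the end.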
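To see why knowing the symbol count up front helps, here is a small experiment that counts collisions in a table with a fixed number of buckets. It is only a sketch: `countCollisions` and `makeNames` are hypothetical helpers, `std::hash` is a stand-in for whatever hash the PDB engine uses, and the symbol names are synthetic.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Counts how many insertions land in a bucket that is already occupied,
// for a table whose bucket count is fixed before any insertion.
std::size_t countCollisions(const std::vector<std::string>& names,
                            std::size_t bucketCount) {
    assert(bucketCount > 0);
    std::vector<bool> occupied(bucketCount, false);
    std::hash<std::string> hasher;  // stand-in hash, an assumption
    std::size_t collisions = 0;
    for (const std::string& name : names) {
        std::size_t bucket = hasher(name) % bucketCount;
        if (occupied[bucket]) {
            ++collisions;
        } else {
            occupied[bucket] = true;
        }
    }
    return collisions;
}

// Generates synthetic, distinct symbol names: sym0, sym1, ...
std::vector<std::string> makeNames(std::size_t count) {
    std::vector<std::string> names;
    names.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        names.push_back("sym" + std::to_string(i));
    }
    return names;
}
```

With 10,000 names and only 64 buckets, at least 9,936 insertions must collide, because at most 64 can land in an empty bucket. Sizing the table from the known count, for instance `countCollisions(names, 2 * names.size())`, leaves most buckets free and should give a far lower count.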
Results
Here is a comparison between the latest Visual Studio 2017 release (15.9 Update) and Visual Studio 2019 Preview 2. We built one AAA game title and Google Chrome. In each table below, the first two data rows give link time in seconds and the last row gives the total size of the linker input in bytes:
AAA Game Title | |||
Link time (seconds) | VS 2017 15.9 Update (base) | VS 2019 Preview 2 (diff) | base/diff (higher is better) |
/DEBUG:full | 392.1 | 163.3 | 2.4 |
/DEBUG:fastlink | 72.3 | 31.2 | 2.32 |
Input size (bytes) | 12,882,624,412 | 8,131,565,290 | 1.58 |
Google Chrome (x64 release build) | |||
Link time (seconds) | VS 2017 15.9 Update (base) | VS 2019 Preview 2 (diff) | base/diff (higher is better) |
/DEBUG:full | 126.8 | 71.9 | 1.76 |
/DEBUG:fastlink | 30.3 | 21.5 | 1.41 |
Input size (bytes) | 5,858,077,238 | 5,442,644,550 | 1.08 |
Google Chrome (x86 debug build) | |||
Link time (seconds) | VS 2017 15.9 Update (base) | VS 2019 Preview 2 (diff) | base/diff (higher is better) |
/DEBUG:full | 232.6 | 106.9 | 2.18 |
/DEBUG:fastlink | 43.8 | 38.8 | 1.13 |
Input size (bytes) | 8,384,258,922 | 7,962,819,862 | 1.05 |
We don’t see as large a reduction in linker input size for Chrome as for the AAA game title, because Chrome is compiled with /Zi, for which the compiler writes types into a PDB file, while the AAA game title is compiled with /Z7, for which type records are written into .debug$T sections in OBJs and unreferenced ones are pruned. We also see that full PDB link time tends to benefit more from the improvements than fastlink PDB link time. This is because fastlink PDB generation doesn’t involve type merging or the creation of global symbols, so the two hash improvements don’t apply. Type pruning by the compiler benefits both kinds of linking by reducing the raw amount of debug-record work that the linker has to do to produce a PDB.
Closing Remarks
We know build throughput is important to developers, and we are continuing to improve our toolset’s performance. Over the next few releases we will work on reducing the compile-time cost of pruning unreferenced types and on further improving the various PDB internal hashes. If you have feedback or suggestions for us, let us know. We can be reached via the comments below, via email (visualcpp@microsoft.com), via Help -> Report a Problem in the Product in the Visual Studio IDE, or via Developer Community. You can also find us on Twitter (@VisualC) and Facebook (msftvisualcpp).
The post Linker Throughput Improvement in Visual Studio 2019 appeared first on C++ Team Blog.