// Version 1.40
// October 22nd, 2002 - .NET (VC7, _MSC_VER=1300) support!
// Version 1.30
// Nov 24th, 2000
// Version 1.20
// Jun 9th, 2000
// Version 1.10
// Jan 23rd, 2000
// Version 1.00
// May 20th, 1999
// Todd C. Wilson, Fresh Ground Software
// (todd@nopcode.com)
// This header file will kick in settings for Visual C++ 5 and 6 that will (usually)
// result in smaller exe's.
// The "trick" is to tell the compiler to not pad out the function calls; this is done
// by not using the /O1 or /O2 option - if you do, you implicitly use /Gy, which pads
// out each and every function call. In one single 500k dll, I managed to cut out 120k
// by this alone!
// The other two "tricks" are telling the Linker to merge all data-type segments together
// in the exe file. The relocation, read-only (constants) data, and code (.text)
// sections can almost always be merged. Each section merged can save 4k in exe space,
// since each section is padded out to 4k chunks. This is very noticeable with smaller
// exes, since you could have only 700 bytes of data, 300 bytes of code, 94 bytes of
// strings - padded out, this could be 12k of runtime, for 1094 bytes of stuff! For larger
// programs, this is less overall, but can save at least 4k.
// Note that if you're using MFC static or some other 3rd party libs, you may get poor
// results with merging the readonly (.rdata) section - the exe may grow larger.
// To use this feature, define _MERGE_RDATA_ in your project or before this header is used.
// With Visual C++ 5, the program uses a file alignment of 512 bytes, which results
// in a small exe. Under VC6, the program instead uses 4k, which is the same as the
// section size.
// The reason (from what I understand) is that 4k is the chunk size of
// the virtual memory manager, and that WinAlign (an end-user tuning tool for Win98)
// will re-align the programs on this boundary. The problem with this is that all of
// Microsoft's system exes and dlls are *NOT* tuned like this, and using 4k causes serious
// exe bloat. This is very noticeable for smaller programs.
// The "trick" for this is to use the undocumented FILEALIGN linker parm to change the
// padding from 4k to 1/2k, which results in a much smaller exe - anywhere from 20%-75%
// depending on the size. Note that this is the same as using /OPT:NOWIN98, which *is*
// a previously documented switch, but was left out of the docs for some reason in VC6 and
// all of the current MSDN's - see KB:Q235956 for more information.
// Microsoft does say that using the 4k alignment will "speed up process loading",
// but I've been unable to notice a difference, even on my P180, with a very large (4meg) exe.
// Please note, however, that this will probably not change the size of the COMPRESSED
// file (either in a .zip file or in an install archive), since this 4k is all zeroes and
// gets compressed away.
// Also, the /ALIGN:4096 switch will "magically" do the same thing, even though this is the
// default setting for this switch. Apparently this sets the same values as the above two
// switches do. We do not use this in this header, since it smacks of a bug and not a feature.
// Thanks to Michael Geary <Mike@Geary.com> for some additional tips!
//
// Notes about using this header in .NET
// First off, VC7 does not allow a lot of the linker command options in pragma's. There is no
// honest or good reason why Microsoft decided to make this change, it just doesn't.
// So that is why there are a lot of <1300 #if's in the header.
// If you want to take full advantage of the VC7 linker options, you will need to do it on a
// PER PROJECT BASIS; you can no longer use a global header file like this to make it better.
// Items I strongly suggest putting in all your VC7 project linker options command line settings:
//   /ignore:4078 /RELEASE
// Compiler options:
//   /GL (Whole Program Optimization)
// If you're making an .EXE and not a .DLL, consider adding in:
//   /GA (Optimize for Windows Application)
// Some items to consider using in your VC7 projects (not VC6):
// Link-time Code Generation - whole code optimization. Put this in your exe/dll project link settings.
//   /LTCG:NOSTATUS
// The classic no-padding and no-bloat linker switch:
//   /opt:nowin98
//
// (C++ command line options: /GL and /GA for .exe files)
// (Link command line options: /ignore:4078 /RELEASE /LTCG:NOSTATUS /opt:nowin98)
//
// Now, notes on using these options in VC7 vs VC6.
// VC6 consistently, for me, produces smaller code from the exact same C++ sources,
// with or without this header. On average, VC6 produces 5% smaller binaries compared
// to VC7 compiling the exact same project, *without* this header. With this header, VC6
// will make a 13k file, while VC7 will make a 64k one. VC7 is just bloaty, pure and
// simple - all those managed/unmanaged C++ runtimes, and the CLR stuff must be getting
// in the way of code generation. However, template support is better, so there.
// Both VC6 and VC7 show the same kind of end result savings - larger binary output
// will shave about 2% off, whereas smaller projects (support DLL's, cpl's,
// activex controls, ATL libs, etc) get the best result, since the padding is usually
// more than the actual usable code. But again, VC7 does not compile down as small as VC6.
//
// The argument can be made that doing this is a waste of time, since the "zero bytes"
// will be compressed out in a zip file or install archive. Not really - it doesn't matter
// if the data is a string of zeroes or ones or 85858585 - it will still take room (20 bytes
// in a zip file, 29 bytes if only *4* of those 4k bytes are not the same) and time to
// compress that data and decompress it. Also, 20k of zeros is NOT 20k on disk - it's the
// size of the cluster slop - for Fat32 systems, 20k can be 32k, NTFS could make it 24k if you're
// just 1 byte over (round up). Most end users do not have the dual P4 Xeon systems with
// two gigs of RDram and a Raid 0+1 of Western Digital 120gig Special Editions that all
// worthy developers have (all six of us), so they will need any space and LOADING TIME
// savings they can get; taking an extra 32k or more out of your end user's 64megs of
// ram on Windows 98 is Not a Good Thing.
//
// Now, as an ADDED BONUS at NO EXTRA COST TO YOU! Under VC6, using the /merge:.text=.data
// pragma will cause the output file to be un-disassembleable! (is that a word?) At least,
// with the normal tools - WinDisam, DumpBin, and the like will not work. Try it - use the
// header, compile release, and then use DUMPBIN /DISASM filename.exe - no code!
// Thanks to Gëzim Pani <gpani@siu.edu> for discovering this gem - for a full writeup on
// this issue and the ramifications of it, visit www.nopcode.com for the Aggressive Optimize
// article.
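// Illustration only, not part of the original header: a minimal sketch of the
// round-up arithmetic behind the section padding and cluster slop figures quoted
// in the comments above. The helper names (round_up, padded_separately,
// padded_merged) are made up for this example, PE headers are ignored, and the
// 32k FAT32 cluster size is an assumption (cluster size depends on the volume).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round size up to the next multiple of align.
inline std::size_t round_up(std::size_t size, std::size_t align)
{
    assert(align != 0);
    return ((size + align - 1) / align) * align;
}

// Bytes taken by three chunks when each one is padded on its own,
// as happens with three separate sections.
inline std::size_t padded_separately(std::size_t a, std::size_t b, std::size_t c,
                                     std::size_t align)
{
    return round_up(a, align) + round_up(b, align) + round_up(c, align);
}

// Bytes taken when the same chunks are merged first and padded once,
// which is the effect the /merge switches are after.
inline std::size_t padded_merged(std::size_t a, std::size_t b, std::size_t c,
                                 std::size_t align)
{
    return round_up(a + b + c, align);
}
```

// With the 700/300/94 byte figures from above, padded_separately(700, 300, 94, 4096)
// gives 12288 (the "12k"), padded_merged(700, 300, 94, 4096) gives 4096, and
// padded_separately(700, 300, 94, 512) gives 2048. For the cluster slop,
// round_up(20 * 1024 + 1, 4096) gives 24576 and round_up(20 * 1024, 32768) gives 32768.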

#ifndef _AGGRESSIVEOPTIMIZE_H_
#define _AGGRESSIVEOPTIMIZE_H_

// C4711: function selected for automatic inline expansion (informational only)
#pragma warning(disable:4711)

#ifdef NDEBUG
// /Og (global optimizations), /Os (favor small code), /Oy (no frame pointers)
#pragma optimize("gsy",on)

#if (_MSC_VER<1300)
#pragma comment(linker,"/RELEASE")
#endif

// Note that merging the .rdata section will result in LARGER exe's if you are using
// MFC (esp. static link). If this is desirable, define _MERGE_RDATA_ in your project.
#ifdef _MERGE_RDATA_
#pragma comment(linker,"/merge:.rdata=.data")
#endif // _MERGE_RDATA_

#pragma comment(linker,"/merge:.text=.data")
#if (_MSC_VER<1300)
// In VC7, this causes problems with the relocation and data tables, so best to not merge them
#pragma comment(linker,"/merge:.reloc=.data")
#endif

// Merging sections with different attributes causes a linker warning, so
// turn off the warning. From Michael Geary. Undocumented, as usual!
#if (_MSC_VER<1300)
// In VC7, you will need to put this in your project settings
#pragma comment(linker,"/ignore:4078")
#endif

// With Visual C++ 5, you already get the 512-byte alignment, so you will only need
// it for VC6, and maybe later.
#if _MSC_VER >= 1000

// Option #1: use /filealign
// Totally undocumented! And if you set it lower than 512 bytes, the program crashes.
// Either leave at 0x200 or 0x1000
//#pragma comment(linker,"/FILEALIGN:0x200")

// Option #2: use /opt:nowin98
// See KB:Q235956 or the READMEVC.htm in your VC directory for info on this one.
// This is our currently preferred option, since it is fully documented and unlikely
// to break in service packs and updates.
#if (_MSC_VER<1300)
// In VC7, you will need to put this in your project settings
#pragma comment(linker,"/opt:nowin98")
#else

// Option #3: use /align:4096
// A side effect of using the default align value is that it turns on the above switch.
// Does nothing under VC7 that /opt:nowin98 doesn't already give you
// #pragma comment(linker,"/ALIGN:512")
#endif

#endif // _MSC_VER >= 1000

#endif // NDEBUG

#endif // _AGGRESSIVEOPTIMIZE_H_
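
// Illustration only, not part of the original header: one way to check whether the
// 512-byte file alignment discussed above actually made it into a built exe or dll
// is to read the SectionAlignment and FileAlignment fields from its PE header.
// This is a sketch under stated assumptions: the image has already been loaded
// into a byte buffer by the caller, and PeAlignment, read_u32 and
// read_pe_alignment are hypothetical names invented for this example. The field
// offsets follow the published PE/COFF layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct PeAlignment {
    bool ok;                     // false if the buffer does not look like a PE image
    std::uint32_t sectionAlign;  // in-memory section alignment
    std::uint32_t fileAlign;     // on-disk alignment (0x200 or 0x1000 in this header's terms)
};

// Read a little-endian 32-bit value; the caller guarantees off + 4 <= buf.size().
inline std::uint32_t read_u32(const std::vector<unsigned char>& buf, std::size_t off)
{
    return  static_cast<std::uint32_t>(buf[off])
         | (static_cast<std::uint32_t>(buf[off + 1]) << 8)
         | (static_cast<std::uint32_t>(buf[off + 2]) << 16)
         | (static_cast<std::uint32_t>(buf[off + 3]) << 24);
}

inline PeAlignment read_pe_alignment(const std::vector<unsigned char>& image)
{
    PeAlignment result = { false, 0, 0 };

    // DOS stub: "MZ" magic, with the PE header offset (e_lfanew) stored at 0x3C.
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return result;
    const std::size_t pe = read_u32(image, 0x3C);

    // Need the signature (4 bytes), the COFF header (20 bytes) and the first
    // 40 bytes of the optional header, which end right after FileAlignment.
    if (pe > image.size() || image.size() - pe < 4 + 20 + 40)
        return result;
    if (image[pe] != 'P' || image[pe + 1] != 'E' || image[pe + 2] != 0 || image[pe + 3] != 0)
        return result;

    // SectionAlignment sits at offset 32 and FileAlignment at offset 36 of the
    // optional header, in both the PE32 and PE32+ layouts.
    const std::size_t opt = pe + 4 + 20;
    result.sectionAlign = read_u32(image, opt + 32);
    result.fileAlign = read_u32(image, opt + 36);
    result.ok = true;
    return result;
}
```

// If the settings in this header took effect, fileAlign should read 0x200 rather
// than 0x1000; DUMPBIN /HEADERS reports the same two fields.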