Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 10:37 am
by sledgehammer_999
This is a thread to discuss possible optimizations to gcc 4.7.x builds regarding size. I encourage anyone with experience in gcc to contribute here.
Dayman, I hope you have something to contribute, since you did gcc builds in the past.

The purpose of this is to find a way to make qbittorrent as small as possible with gcc (mingw). qBittorrent has to be linked statically against every external lib (including the runtime) except the ones provided by the OS (with Windows XP as the baseline).

MSVC has a feature called "Link Time Code Generation" which effectively reduces the binary size a lot. Currently MSVC2008 built qbittorrent is at 15.2MB.

The gcc equivalent is supposed to be "Link Time Optimization". I was under the impression that it would help with the final binary size, but it didn't. I compiled zlib, openssl, boost, libtorrent and qt from source with lto enabled. I did the same for qbittorrent. And the results are:
1. With LTO ~30MB
2. Without LTO ~20MB

So lto actually increases the binary size. Or I did something really stupid in the process.

Some info about qt configuration. In mkspecs\win32-g++\qmake.conf I changed some vars to the following:

Code:

QMAKE_CFLAGS_RELEASE	= -Os -momit-leaf-frame-pointer -flto -static-libgcc
QMAKE_CXXFLAGS		= $$QMAKE_CFLAGS -msse -msse2 -mmmx -m3dnow
QMAKE_CXXFLAGS_RELEASE	= $$QMAKE_CFLAGS_RELEASE -static-libstdc++
QMAKE_LFLAGS_RELEASE	= -Wl,-s -lwinpthread -static-libgcc -static-libstdc++ -flto

Note: the binary isn't truly "static". It links to libwinpthread-1.dll instead of the static lib for some reason.
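
One way to confirm which DLLs the finished binary still imports is to read its PE import table. This is only a minimal sketch, assuming binutils' objdump is on the PATH in MSYS; the helper names and the qbittorrent.exe path in the usage comment are placeholders, not part of the build described above.

```shell
# Extract DLL names from `objdump -p` output (reads stdin).
# Each imported DLL appears on a line of the form "DLL Name: foo.dll".
filter_dll_names() {
    sed -n 's/^[[:space:]]*DLL Name: *//p' | LC_ALL=C sort -u
}

# List the DLLs a PE binary imports (hypothetical helper).
list_dlls() {
    objdump -p "$1" | filter_dll_names
}

# Usage (the path is a placeholder):
#   list_dlls release/qbittorrent.exe
```

If libwinpthread-1.dll shows up in that list, the thread library is still being linked dynamically.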

I use gcc 4.7.2 from Mingw builds

I also can provide compilation instructions/help for the libs involved if anyone is interested.

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 11:17 am
by Dayman
Unfortunately -flto has never worked for me (probably because I did mingw64 builds) - always fails during openssl linking.

I was building with "-Os -mtune=generic -mmmx -msse -msse2 -fomit-frame-pointer -fpredictive-commoning -pipe -m64 -Wl,-O1 -Wl,--as-needed -flto -fuse-linker-plugin"
Should probably try w/o -fuse-linker-plugin, since it's useless w/o -fwhole-program

Moreover, binaries generated by gcc are always bigger than those created by MSVC.

My fully dynamic AMD64 build with MSVC is 26 Megs vs. 31.4 Megs with mingw64.
AMD64 MSVC binaries can be packed (official builds are packed with UPX, but UPX still doesn't support PE32+/AMD64) with mpress and will occupy just 9.5 Megs.

I will try doing an -flto build again; maybe I'll be able to overcome the link-time problems.

By the way, using /O1 instead of /O2 may help reduce size when doing MSVC builds.


I did build openssl with mingw64. Now I remember: it dies during make test. Just dies. Segfault (or whatever this is called on Windows) when launching destest. Does your `make test` work fine?

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 11:42 am
by sledgehammer_999
[quote="Dayman"]
Unfortunately -flto has never worked for me (probably because I did mingw64 builds) - always fails during openssl linking.
[/quote]

I had openssl issues when linking qbt with openssl. I overcame them. If you try to build qbt again with gcc, post the issues to see if they are the same.

[quote="Dayman"]Moreover, binaries generated by gcc are always bigger than those created by MSVC.[/quote]

That's why I thought LTO would be an improvement on that. MSVC builds without LTCG produce bigger binaries than builds with LTCG.

[quote="Dayman"]AMD64 MSVC binaries can be packed (official builds are packed with UPX,[/quote]

No they are not. I don't UPX the qbt binary because it cannot start the application for some reason. Maybe because I have everything statically linked in. On the other hand the gcc static binary can be UPX'ed just fine. :S

[quote="Dayman"]
I did build openssl with mingw64. Now I remember: it dies during make test. Just dies. Segfault (or whatever this is called on Windows) when launching destest. Does your `make test` work fine?
[/quote]

I never did a "make test" because MSYS is painfully slow compared to the MSVC "make test". I will try it though.

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 12:05 pm
by Dayman
[quote="Dayman"]My fully dynamic AMD64 build with MSVC is 26 Megs vs. 31.4 Megs with mingw64.[/quote]

Disregard this. I found out that my libtorrent.dll wasn't stripped (3 Megs of useless symbols).

So all in all those builds are even, but since they're dynamic and i put gcc stdlib in qBt dir GCC actually wins lol. mingw64 supports DEP btw, but no ASLR.

-fpredictive-commoning helps shave off some space too (usually a bunch of kilobytes, but still better than nothing)
    Perform predictive commoning optimization, i.e., reusing computations (especially memory loads and stores) performed in previous iterations of loops.

    This option is enabled at level -O3.
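
Because the option belongs to -O3, it has to be passed explicitly when building with -Os. GCC can report what a given -O level switches on through `-Q --help=optimizers`; the sketch below wraps that in two small helpers. The helper names are made up for illustration, and it assumes gcc is on the PATH.

```shell
# Report whether an optimizer flag is on at a given -O level.
# Relies on `gcc -Q <level> --help=optimizers`, which prints one line per
# option ending in [enabled] or [disabled].
opt_status() {   # usage: opt_status -Os -fpredictive-commoning
    gcc -Q "$1" --help=optimizers | pick_flag "$2"
}

# Pull the status column for one flag out of that listing (reads stdin).
pick_flag() {
    awk -v flag="$1" '$1 == flag { print $NF }'
}

# Usage:
#   opt_status -Os -fpredictive-commoning
#   opt_status -O3 -fpredictive-commoning
```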

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 12:13 pm
by sledgehammer_999
[quote="Dayman"]mingw64 supports DEP btw, but no ASLR.[/quote]

I have no idea what these are or do. :p

[quote="Dayman"]
-fpredictive-commoning helps shave off some space too (usually a bunch of kilobytes, but still better than nothing)
    Perform predictive commoning optimization, i.e., reusing computations (especially memory loads and stores) performed in previous iterations of loops.

    This option is enabled at level -O3.
[/quote]

Interesting. But I hesitate to try anything because LTO linkage takes ages.

Sidenote: make test fails on me with:
md2test.c:1:10: error: expected '=', ',', ';', 'asm' or '__attribute__' before '.' token

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 12:22 pm
by Dayman
[quote="sledgehammer_999"]
Sidenote: make test fails on me with:
md2test.c:1:10: error: expected '=', ',', ';', 'asm' or '__attribute__' before '.' token
[/quote]

That's because those test files are supposed to be symlinks, but Windows doesn't support them.
If you unpack the original openssl archive with tar from within MSYS, it will handle this situation itself.
If you unpack with anything else, you'll have to fix it yourself (snippet from my build script):

Code:

# Fix tests when OpenSSL was unpacked on Windows without symlink support.
# Each stub file then holds only the name of its link target (dummytest.c),
# so rewrite that name into an #include of the real file.
cd test
find . -name "*.c" -print | while read -r i
do
    if [ "$(cat "$i")" = "dummytest.c" ]
    then
        sed -i -e 's/\(dummytest\.c\)/#include "\1"/' "$i"
    fi
done
cd ..
make build_tests
make test

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 1:14 pm
by sledgehammer_999
OK. I untarred from MSYS and got "ALL TESTS SUCCESSFUL". Yoohoo!

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 1:35 pm
by Dayman
I've tested libtorrent a bit.

'Control group 1' ("-Os -mtune=generic -mmmx -msse -msse2 -fomit-frame-pointer -fpredictive-commoning -pipe -m64 -Wl,-O1 -Wl,--as-needed -s"): 2.764 MB
'Control group 2' ("-O2 -mtune=generic -mmmx -msse -msse2 -fomit-frame-pointer -fpredictive-commoning -pipe -m64 -Wl,-O1 -Wl,--as-needed -s"): 3.914 MB
The flags below were changed relative to control group 1:
"-flto": 2.433 MB
"-fuse-linker-plugin": 2.433 MB
"-fwhole-program": 2.388 MB
"-ffunction-sections -fdata-sections": 2.443 MB
"-Wl,--gc-sections": 2.443 MB
"-shared-libgcc": 2.443 MB (has no use in this case, openssl on the other hand links to stdlib statically by default)
"-Wl,--relax": 2.443 MB
"-Wl,--sort-common": 2.443 MB
"replaced -Os with -O2": 3.512 MB

Conclusion: lto reduces binary size by around 15%. A lot of flags do not work in gcc windows port (flags taken from http://wiki.wxwidgets.org/Reducing_Executable_Size and man ld)
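
For reference, the relative savings can be worked out from the sizes listed above. A small sketch, assuming awk is available in MSYS; the function name is made up for illustration.

```shell
# Percentage size reduction from a baseline size to a new size (same units).
pct_reduction() {
    LC_ALL=C awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

pct_reduction 2.764 2.433   # control group 1 vs. "-flto"
pct_reduction 2.764 2.388   # control group 1 vs. "-fwhole-program"
```

With these figures the two calls print 12.0 and 13.6, so the measured saving is in the same ballpark as the rounded figure in the conclusion.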

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 3:22 pm
by sledgehammer_999
[quote="Dayman"]
I've tested libtorrent a bit.

'Control group 1' ("-Os -mtune=generic -mmmx -msse -msse2 -fomit-frame-pointer -fpredictive-commoning -pipe -m64 -Wl,-O1 -Wl,--as-needed -s"): 2.764 MB
'Control group 2' ("-O2 -mtune=generic -mmmx -msse -msse2 -fomit-frame-pointer -fpredictive-commoning -pipe -m64 -Wl,-O1 -Wl,--as-needed -s"): 3.914 MB
The flags below were changed relative to control group 1:
"-flto": 2.433 MB
"-fuse-linker-plugin": 2.433 MB
"-fwhole-program": 2.388 MB
"-ffunction-sections -fdata-sections": 2.443 MB
"-Wl,--gc-sections": 2.443 MB
"-shared-libgcc": 2.443 MB (has no use in this case, openssl on the other hand links to stdlib statically by default)
"-Wl,--relax": 2.443 MB
"-Wl,--sort-common": 2.443 MB
"replaced -Os with -O2": 3.512 MB

Conclusion: lto reduces binary size by around 15%. A lot of flags do not work in gcc windows port (flags taken from http://wiki.wxwidgets.org/Reducing_Executable_Size and man ld)
[/quote]

Hmm, obviously I did something wrong for the binary size to increase.
Interestingly, my google search yielded the same wxwidgets page.

This is what I am gonna do now.
I will make successive posts documenting what I do to compile each library. So you can have an idea what I am doing, and maybe pinpoint some error on my part.

I will begin with zlib and openssl.

I will try to use these options throughout all libs:
Compiler: -Os -s -mmmx -msse -msse2 -fwhole-program -flto  -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning
Linker: -Wl,--gc-sections -Wl,--relax -Wl,--sort-common -Wl,-s -Wl,--as-needed -static-libgcc -static-libstdc++ -fwhole-program -flto

(Maybe -fwhole-program is not needed in the linker options.)
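
Since the same two flag sets get pasted into every library's build, it may help to keep them in one place in the MSYS shell. A sketch only; the variable names SIZE_CFLAGS and SIZE_LDFLAGS are made up here and are not used by any of the build systems involved.

```shell
# Size-oriented flag sets from this post, defined once.
SIZE_CFLAGS="-Os -s -mmmx -msse -msse2 -fwhole-program -flto -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning"
SIZE_LDFLAGS="-Wl,--gc-sections -Wl,--relax -Wl,--sort-common -Wl,-s -Wl,--as-needed -static-libgcc -static-libstdc++ -fwhole-program -flto"

# Print them in a form that can be pasted into a Makefile or a b2 command line.
printf 'cflags="%s"\n' "$SIZE_CFLAGS"
printf 'linkflags="%s"\n' "$SIZE_LDFLAGS"
```

Keeping one definition makes it less likely that one library ends up built with a slightly different set than the others.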

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 3:43 pm
by sledgehammer_999
Zlib 1.2.7

Untarred through MSYS.

cd into the source folder and, following the comments at the beginning of win32/Makefile.gcc, open that file in an editor (not Notepad) and replace CFLAGS and LDFLAGS with:

Code:

CFLAGS = $(LOC) -Os -s -mmmx -msse -msse2 -fwhole-program -flto  -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning
LDFLAGS = $(LOC) -Wl,--gc-sections -Wl,--relax -Wl,--sort-common -Wl,-s -Wl,--as-needed -static-libgcc -static-libstdc++ -fwhole-program -flto

And then issue in MSYS:

Code:

cp contrib/asm?86/match.S ./match.S
make LOC=-DASMV OBJA=match.o -fwin32/Makefile.gcc

Manually copy libz.a to mingw's lib folder, and zconf.h and zlib.h to mingw's include folder.

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 3:56 pm
by sledgehammer_999
Openssl 1.0.1c

Issue this in MSYS:

Code:

./config threads no-shared no-zlib-dynamic zlib

Open Makefile in an editor, find "CFLAG" and append the compiler options mentioned above to the end of it. Later options obviously override earlier ones.

Then you need to pass the LDFLAGS, but there isn't any such variable in the Makefile. So we set an environment variable in MSYS and start compiling.

Code:

LDFLAGS="-Os -s -mmmx -msse -msse2 -fwhole-program -flto  -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning"
make

Then empty LDFLAGS in MSYS:

Code:

LDFLAGS=""

Manually copy the "openssl" folder found in "include" to mingw's "include" folder. Then manually copy libcrypto.a and libssl.a to mingw's lib folder.
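
One detail worth double-checking in the LDFLAGS step: in a POSIX shell, a plain assignment on its own line stays local to the shell unless the variable is exported or was already in the environment, so a child process such as make may never see it. The sketch below shows the difference, using `sh -c` as a stand-in for make; DEMO_FLAGS is a made-up variable name.

```shell
unset DEMO_FLAGS
DEMO_FLAGS="-flto"                                  # plain assignment, not exported
plain=$(sh -c 'printf "%s" "$DEMO_FLAGS"')          # the child does not inherit it

# An assignment placed in front of the command is passed to that command only.
percmd=$(DEMO_FLAGS="-flto" sh -c 'printf "%s" "$DEMO_FLAGS"')

echo "plain assignment: [$plain]"
echo "per-command:      [$percmd]"
```

Applied to the step above, that would mean writing the assignment and make on one line, or using export. Whether the OpenSSL Makefile then actually picks up LDFLAGS is a separate question that is not verified here.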

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 5:08 pm
by sledgehammer_999
Boost 1.51.0

I assume you have a working b2.exe in the PATH.
In MSYS:

Code:

b2 -q --with-system --toolset=gcc variant=release link=static runtime-link=static cflags="-Os -s -mmmx -msse -msse2 -fwhole-program -flto  -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning" cxxflags="-Os -s -mmmx -msse -msse2 -fwhole-program -flto  -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning" linkflags="-Wl,--gc-sections -Wl,--relax -Wl,--sort-common -Wl,-s -Wl,--as-needed -static-libgcc -static-libstdc++ -fwhole-program -flto"

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 5:12 pm
by sledgehammer_999
Libtorrent 0.16.4

I assume you have a working b2.exe in the PATH and have set BOOST_ROOT correctly (or pass it to b2 with -sBOOST_ROOT= ).
In MSYS:

Code:

 b2 -q --without-python --toolset=gcc variant=release link=static runtime-link=static encryption=openssl logging=none geoip=static dht-support=on boost=source character-set=unicode boost-link=static cflags="-Os -s -mmmx -msse -msse2 -fwhole-program -flto  -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning" cxxflags="-Os -s -mmmx -msse -msse2 -fwhole-program -flto  -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning" linkflags="-Wl,--gc-sections -Wl,--relax -Wl,--sort-common -Wl,-s -Wl,--as-needed -static-libgcc -static-libstdc++ -fwhole-program -flto"

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 5:42 pm
by sledgehammer_999
Qt 4.8.3

Untar it and replace the following lines in mkspecs\win32-g++\qmake.conf.
Note: it probably helps to rename the source folder to something else like "qt-mingw472-4.8.3". Rename it before you compile the source, because the path is hardcoded into qmake.exe.

Code:

QMAKE_CFLAGS_RELEASE	= -Os -s -mmmx -msse -msse2 -fwhole-program -flto -ffunction-sections -fdata-sections -fomit-frame-pointer -fpredictive-commoning
QMAKE_LFLAGS_RELEASE	= -Wl,-s -Wl,--gc-sections -Wl,--relax -Wl,--sort-common -Wl,-s -Wl,--as-needed -static-libgcc -static-libstdc++ -fwhole-program -flto -Wl,-static
QMAKE_RCC = $$[QT_INSTALL_BINS]$${DIR_SEPARATOR}rcc.exe

The last line is needed because qmake passes the wrong rcc.exe path to the compiler by default (this only happens with gcc) and the qbittorrent compilation errors out.

In MSYS:

Code:

configure.exe -release -opensource -static -fast -system-zlib -no-qt3support -no-opengl -no-openvg -no-dsp -no-vcproj -no-dbus -no-phonon -no-phonon-backend -no-multimedia -no-audio-backend -no-webkit -no-script -no-scripttools -no-declarative -no-declarative-debug -arch windows -qt-style-windowsxp -nomake examples -nomake demos -platform win32-g++ -openssl-linked
bin/qmake.exe projects.pro QT_BUILD_PARTS="libs translations"
mingw32-make

Re: Trying out new gcc 4.7.x builds with lto and size optimizations

Posted: Sun Oct 21, 2012 5:51 pm
by Dayman
[quote="sledgehammer_999"]

Code:

QMAKE_RCC = $$[QT_INSTALL_BINS]$${DIR_SEPARATOR}rcc.exe

The last line is needed because qmake passes the wrong rcc.exe path to the compiler by default (this only happens with gcc) and the qbittorrent compilation errors out.
[/quote]
Nice, now I can stop fixing the Makefiles generated for qBt with

Code:

:: rcc fails to launch with Qt 4.8.3; Makefile is not using full path for rcc.exe (compare to moc.exe); Replacing with relative path (rcc.exe must be in %PATH%)
sed -i -e "s/bin\\rcc\.exe/rcc\.exe/" .\src\Makefile.Release

and patch the Qt confs directly instead.
The other bad thing that happens is that make install fails for Qt with gcc; I'm currently fixing it with this:

Code:

diff -ru Qt/src/winmain/Makefile.Release Qt2/src/winmain/Makefile.Release
--- Qt/src/winmain/Makefile.Release	2012-06-29 16:07:30.708911400 +0400
+++ Qt2/src/winmain/Makefile.Release	2012-06-29 16:08:11.803261800 +0400
@@ -212,7 +212,7 @@
 install_target: first FORCE
 	@$(CHK_DIR_EXISTS) t:$(INSTALL_ROOT)\_outdir\Qt\Qt64_qbt_G\lib $(MKDIR) t:$(INSTALL_ROOT)\_outdir\Qt\Qt64_qbt_G\lib 
 	-$(INSTALL_FILE) "..\..\lib\qtmain.prl" "t:$(INSTALL_ROOT)\_outdir\Qt\Qt64_qbt_G\lib\qtmain.prl"
-	@$(CHK_DIR_EXISTS) t:$(INSTALL_ROOT)/_outdir/Qt/Qt64_qbt_G/lib/pkgconfig $(MKDIR) t:$(INSTALL_ROOT)/_outdir/Qt/Qt64_qbt_G/lib/pkgconfig 
+	@$(CHK_DIR_EXISTS) t:$(INSTALL_ROOT)\_outdir\Qt\Qt64_qbt_G\lib\pkgconfig $(MKDIR) t:$(INSTALL_ROOT)\_outdir\Qt\Qt64_qbt_G\lib\pkgconfig 
 	-$(INSTALL_FILE) "..\..\lib\pkgconfig\qtmain.pc" "t:$(INSTALL_ROOT)\_outdir\Qt\Qt64_qbt_G\lib\pkgconfig\qtmain.pc"
 	-$(INSTALL_FILE) "$(DESTDIR_TARGET)" "t:$(INSTALL_ROOT)\_outdir\Qt\Qt64_qbt_G\lib\$(TARGET)"

Wrong slashes are generated for the qtmain target (and only this target). I had no motivation to make a better fix (this one is unportable).


Also: Building Qt with -flto -fuse-linker-plugin -fwhole-program in attachment.
[me=Dayman]remembers building the wxWidgets monolithic library with gcc: ld ate 3 Gigs and then failed with "can't allocate memory". I hope current mingw versions are built with LARGEADDRESSAWARE in mind.[/me]