Optimisation Level

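Flag    Description
-O0     Disables all optimisations; useful for debugging.
-O1     Enables basic optimisations that reduce code size and execution time without greatly increasing compile time.
-Os     Enables the -O2 optimisations except those that tend to increase the binary size.
-O2     Enables nearly all optimisations that do not trade size for speed; the safest level when optimising for speed.
-O3     Enables aggressive optimisations on top of -O2.
-Ofast  Available from GCC 4.6; enables -O3 plus optimisations that violate strict standards compliance, such as -ffast-math.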

The most common optimisation levels are -Os and -O2. Anything more aggressive than that is more likely to expose latent bugs or to break code that relies on strictly standards-compliant behaviour. The numbered optimisation levels are cumulative, so specifying -O3 includes the optimisations from -O2.
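To check which kind of optimisation a file was actually built with, one option is to look at the macros that GCC predefines. The following is a minimal sketch, assuming GCC or a compatible compiler such as Clang; the function name optimisation_summary is made up for illustration.

```c
/* Hypothetical helper: reports which optimisation-related macros the
 * compiler predefined for this translation unit. */
const char *optimisation_summary(void)
{
#if defined(__FAST_MATH__)
    /* Defined when -ffast-math is active, which -Ofast turns on. */
    return "fast-math enabled (-Ofast or -ffast-math)";
#elif defined(__OPTIMIZE_SIZE__)
    /* Defined when optimising for size, e.g. with -Os. */
    return "optimising for size (e.g. -Os)";
#elif defined(__OPTIMIZE__)
    /* Defined at every optimisation level above -O0. */
    return "optimising (-O1 or higher)";
#else
    return "not optimising (-O0)";
#endif
}
```

Printing the returned string from main in builds made with -O0, -O2, -Os and -Ofast should show a different message for each. The macros do not distinguish -O1, -O2 and -O3 from one another.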

Note that you will find many instances along the following lines:

-O2 -fomit-frame-pointer

However, in current GCC releases -fomit-frame-pointer is already enabled at -O1, and therefore at -O2, on most targets, which makes specifying the flag redundant. When in doubt, consult the official optimisation options manual.

Specifying the Architecture, CPU Tune and CPU Type

The safest -march for code that will run on the build machine is -march=native, which makes the compiler detect the processor's features and use them during compilation. It is supported from GCC 4.2 onwards and removes the need to supply an -march manually. Note that the resulting binaries may not run on CPUs that lack those features.

However, if you need to specify a -march, -mcpu or -mtune, then a good procedure is as follows:

  • determine which CPU you have. On Linux, you can issue cat /proc/cpuinfo and read the flags line to see the features supported by your CPU. A description of those features can be found on the cpuflags page.
  • issue gcc -v in a terminal to determine which compiler version you have.
  • browse the manual page of gcc corresponding to your version and locate the architecture you are on. For example, for i386 and x86_64 with gcc version 4.2.4 we find the manual page listing the modifiers.
  • finally, choose a -march, -mcpu, or -mtune that closely matches the CPU features you have. It is important to specify a processor whose feature set is the same as, or a subset of, the features of your CPU.
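The feature check in the first step can also be done from C. The following is a minimal sketch that relies on the __builtin_cpu_supports builtin, available in GCC 4.8 and later and in Clang on x86; the function name print_cpu_features and the selection of probed features are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: prints the x86 SIMD features reported by the
 * running CPU and returns how many of the probed features are present.
 * On non-x86 targets it prints nothing and returns 0. */
int print_cpu_features(FILE *out)
{
    int count = 0;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    /* Initialise the CPU model data used by __builtin_cpu_supports. */
    __builtin_cpu_init();
    /* The builtin requires a string literal, hence the macro. */
#define PROBE(name)                            \
    do {                                       \
        if (__builtin_cpu_supports(name)) {    \
            fprintf(out, "%s ", name);         \
            count++;                           \
        }                                      \
    } while (0)
    PROBE("mmx");
    PROBE("sse");
    PROBE("sse2");
    PROBE("sse3");
    PROBE("ssse3");
    PROBE("sse4.1");
    PROBE("sse4.2");
#undef PROBE
    fputc('\n', out);
#else
    (void)out; /* nothing to probe on this target */
#endif
    return count;
}
```

Calling print_cpu_features(stdout) from main lists the features that the chosen -march value must not exceed.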

As an example, suppose we have an Intel i7 and gcc version 4.2.4. Looking at the list on the manual page above, we find:

pentium4, pentium4m
    Intel Pentium4 CPU with MMX, SSE and SSE2 instruction set support. 
prescott
    Improved version of Intel Pentium4 CPU with MMX, SSE, SSE2 and SSE3 instruction set support. 
nocona
    Improved version of Intel Pentium4 CPU with 64-bit extensions, MMX, SSE, SSE2 and SSE3 instruction set support. 

Since we are running a 64-bit processor, and because the Intel i7 supports the MMX, SSE, SSE2 and SSE3 instruction sets, the most suitable choice in gcc version 4.2.4 is nocona. However, prescott and even pentium4 will be fine for 32-bit builds, since they lack the 64-bit extensions.

64-bit vs 32-bit

To build 64-bit code use the flag -m64; to build 32-bit code use -m32.
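A quick way to confirm which mode a build used is to check the pointer width. The following is a minimal sketch; the function name pointer_bits is made up for illustration.

```c
#include <limits.h>

/* Hypothetical helper: width of a pointer in bits for the current build.
 * Typically 64 when built with -m64 and 32 when built with -m32. */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}
```

Building with -m32 on a 64-bit system usually requires the 32-bit (multilib) development libraries to be installed.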

Streaming SIMD Extensions (SSE)

The following is the list of SSE flags for the gcc compiler.

-msse2 -msse3 -mssse3 -msse4.1 -msse4.2

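They do not all need to be listed in the CFLAGS: each flag implies the earlier extensions (for example, -msse4.2 also enables SSE4.1, SSSE3, SSE3 and SSE2), so specifying the most advanced one your CPU supports is enough.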
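Which SSE level ended up enabled can be checked through the macros GCC predefines for each extension. The following is a minimal sketch, assuming GCC or Clang on x86; the function name highest_sse_level is made up for illustration.

```c
/* Hypothetical helper: names the highest SSE level the compiler was
 * allowed to use for this translation unit, based on the predefined
 * macros __SSE2__ through __SSE4_2__. */
const char *highest_sse_level(void)
{
#if defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE4_1__)
    return "SSE4.1";
#elif defined(__SSSE3__)
    return "SSSE3";
#elif defined(__SSE3__)
    return "SSE3";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none of SSE2 to SSE4.2";
#endif
}
```

Built with only -msse4.1, the function should report SSE4.1, and the lower-level macros are defined as well.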

3DNow!

For older AMD-based systems that support 3DNow!:

-m3dnow

can be added in order to enable multimedia extensions.

Intel i5 and i7

Both support at least SSE4.1:

-O2 -m64 -flto -msse4.1 -mfpmath=sse -ffast-math -funroll-loops

For processors based on Nehalem or later, which also support SSE4.2, replace -msse4.1 with -msse4.2. Note that -ffast-math trades strict IEEE floating-point compliance for speed.

Intel Atom N2xx and Z5xx

-O2 -flto -mssse3 -mfpmath=sse

fuss/gcc_compile_flags.txt · Last modified: 2017/02/22 18:30 (external edit)
