Get the documentation of gcc command-line arguments

gcc has extensive documentation [1] in the Texinfo format (.texi file extension). Texinfo [2] can be converted into many different output formats, such as HTML, PDF, and generic XML.

I would like to write a quick-and-dirty script to extract the description of each gcc command-line argument. I could then index the descriptions and use them in an app to show the relevant piece of documentation whenever one of those arguments is found (e.g., -faggressive-loop-optimizations).
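As a rough illustration of the lookup side of that idea, here is a minimal Python sketch. The DOCS dictionary and the find_documented_options function are hypothetical names, and the single entry is filled in by hand with a sentence from the gcc manual; a real index would be built from the extracted documentation.

```python
import shlex

# Hypothetical index: option name -> description (filled by hand here).
DOCS = {
    "-faggressive-loop-optimizations": (
        "This option tells the loop optimizer to use language constraints "
        "to derive bounds for the number of iterations of a loop."
    ),
}


def find_documented_options(command_line, docs):
    """Return (option, description) pairs for documented options in a command."""
    found = []
    for token in shlex.split(command_line):
        # Try the full token first, then the part before '=' for
        # options that carry a value (for example -std=c99).
        for candidate in (token, token.split("=", 1)[0]):
            if candidate in docs:
                found.append((candidate, docs[candidate]))
                break
    return found


hits = find_documented_options(
    "gcc -O2 -faggressive-loop-optimizations main.c", DOCS
)
for option, description in hits:
    print(option, "->", description)
```

Tokens that are not in the index (such as -O2 or main.c here) are simply skipped.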

Where are the documentation files?

The .texi files are scattered across the project, usually in doc/ folders. For example, the gcc documentation is in:

gcc-4.9.4/gcc/doc

The complete documentation should be generated with:

./configure
make html

Unfortunately, the Makefile fails on my laptop, and there is no option to generate XML output.

Instructions for generating the docs are at the bottom of this page:

gcc-4.9.4/INSTALL/finalinstall.html
If you would like to generate online HTML documentation, do ‘cd objdir; make html’ and HTML will be generated for the gcc manuals in objdir/gcc/HTML.

To get the gcc object directory, you need to start building the compiler itself.

Another way?

The documentation can be generated manually:

$ cd gcc-4.9.4/gcc/doc
$ nano install.texi2html
# Comment out the last line, which removes the 'gcc-vers.texi' file.
$ ./install.texi2html
$ cp HTML/gcc-vers.texi gcc-vers.texi
./bugreport.texi:88: warning: undefined flag: BUGURL
$ makeinfo -I . -I include --xml -o gcc.xml gcc.texi

The complete documentation will be in a single XML file of about 4.2 MB.

Likewise, the HTML documentation can be generated with:

$ makeinfo -I . -I include --html -o gcc_html gcc.texi

XML format

Example of XML documentation block for the flag -faggressive-loop-optimizations:

<table commandarg="code" spaces=" " endspaces=" ">
    [...]
    <tableentry>
        <tableterm>
            <item spaces=" ">
                <itemformat command="code">-faggressive-loop-optimizations</itemformat>
            </item>
        </tableterm>
        <tableitem>
            <indexcommand command="opindex" index="op" spaces=" ">
                <indexterm index="op" number="681" incode="1">faggressive-loop-optimizations</indexterm>
            </indexcommand>
            <para>
                This option tells the loop optimizer to use language constraints to
                derive bounds for the number of iterations of a loop.  This assumes that
                loop code does not invoke undefined behavior by for example causing signed
                integer overflows or out-of-bound array accesses.  The bounds for the
                number of iterations of a loop are used to guide loop unrolling and peeling
                and loop exit test optimizations.
                This option is enabled by default.
            </para>
        </tableitem>
    </tableentry>
    [...]
</table>
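Given this structure, a first extraction pass could look like the Python sketch below. It uses the standard xml.etree.ElementTree module on a copy of the block above (with the [...] placeholders removed so that it parses). The function name extract_options is hypothetical. The sketch also assumes that every option lives in a tableentry with one tableterm and one tableitem, which I have only checked on this sample; the full gcc.xml may additionally reference entities from the Texinfo DTD that ElementTree does not resolve on its own.

```python
import xml.etree.ElementTree as ET

# Copy of the block above, without the "[...]" placeholders.
SAMPLE = """<table commandarg="code" spaces=" " endspaces=" ">
  <tableentry>
    <tableterm>
      <item spaces=" ">
        <itemformat command="code">-faggressive-loop-optimizations</itemformat>
      </item>
    </tableterm>
    <tableitem>
      <indexcommand command="opindex" index="op" spaces=" ">
        <indexterm index="op" number="681" incode="1">faggressive-loop-optimizations</indexterm>
      </indexcommand>
      <para>
        This option tells the loop optimizer to use language constraints to
        derive bounds for the number of iterations of a loop.  This assumes that
        loop code does not invoke undefined behavior by for example causing signed
        integer overflows or out-of-bound array accesses.  The bounds for the
        number of iterations of a loop are used to guide loop unrolling and peeling
        and loop exit test optimizations.
        This option is enabled by default.
      </para>
    </tableitem>
  </tableentry>
</table>"""


def extract_options(root):
    """Map each option name to the text of its <para> elements."""
    options = {}
    for entry in root.iter("tableentry"):
        term = entry.find("tableterm")
        item = entry.find("tableitem")
        if term is None or item is None:
            continue
        # Collapse runs of whitespace so each paragraph is one line.
        paragraphs = [
            " ".join("".join(para.itertext()).split())
            for para in item.findall("para")
        ]
        description = "\n".join(paragraphs)
        # One entry can list several spellings of the same option.
        for fmt in term.iter("itemformat"):
            name = "".join(fmt.itertext()).strip()
            if name.startswith("-"):
                options[name] = description
    return options


options = extract_options(ET.fromstring(SAMPLE))
print(options["-faggressive-loop-optimizations"])
```

Keeping only names that start with "-" is a crude filter to skip tables that document something other than command-line options.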

The corresponding block in the HTML documentation:

<dl>
    [...]
    <dt>
        <code>-faggressive-loop-optimizations</code>
    </dt>
    <dd>
        <a name="index-faggressive-loop-optimizations-793"></a>
        This option tells the loop optimizer to use language constraints to
        derive bounds for the number of iterations of a loop.  This assumes that
        loop code does not invoke undefined behavior by for example causing signed
        integer overflows or out-of-bound array accesses.  The bounds for the
        number of iterations of a loop are used to guide loop unrolling and peeling
        and loop exit test optimizations.
        This option is enabled by default.
        <br>
    </dd>
    [...]
</dl>
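The same extraction can be attempted on the HTML output. The sketch below uses Python's standard html.parser module on a copy of the block above (without the [...] placeholders). The class name DefinitionListParser is hypothetical, and the code assumes a flat dl list; nested definition lists, which may well occur in the full manual, are not handled.

```python
from html.parser import HTMLParser

# Copy of the block above, without the "[...]" placeholders.
SAMPLE = """<dl>
<dt><code>-faggressive-loop-optimizations</code></dt>
<dd><a name="index-faggressive-loop-optimizations-793"></a>
This option tells the loop optimizer to use language constraints to
derive bounds for the number of iterations of a loop.  This assumes that
loop code does not invoke undefined behavior by for example causing signed
integer overflows or out-of-bound array accesses.  The bounds for the
number of iterations of a loop are used to guide loop unrolling and peeling
and loop exit test optimizations.
This option is enabled by default.
<br></dd>
</dl>"""


class DefinitionListParser(HTMLParser):
    """Collect <dt>/<dd> pairs from a flat <dl> (no nested lists)."""

    def __init__(self):
        super().__init__()
        self.options = {}
        self._names = []       # <dt> terms waiting for their <dd>
        self._current = None   # "dt", "dd" or None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return
        # Collapse runs of whitespace into single spaces.
        text = " ".join("".join(self._buffer).split())
        if tag == "dt":
            self._names.append(text)
        else:
            for name in self._names:
                self.options[name] = text
            self._names = []
        self._current = None


parser = DefinitionListParser()
parser.feed(SAMPLE)
parser.close()
html_options = parser.options
print(html_options["-faggressive-loop-optimizations"])
```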

Note that the index identifiers do not seem to match between the XML and HTML documentation (681 in the XML versus 793 in the HTML).
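Since the numbers differ, one way to line up the two outputs is to ignore them and key on the option name instead. The Python sketch below strips the "index-" prefix and the trailing number from the HTML anchor name, which for this sample leaves the same text as the XML indexterm. The function name option_key_from_anchor is hypothetical, and the anchor pattern is inferred from this single example; other makeinfo versions may encode anchors differently.

```python
import re

# Pattern inferred from "index-faggressive-loop-optimizations-793".
ANCHOR_RE = re.compile(r"^index-(?P<name>.+)-(?P<number>\d+)$")


def option_key_from_anchor(anchor_name):
    """Return the anchor name without its prefix and number, or None."""
    match = ANCHOR_RE.match(anchor_name)
    return match.group("name") if match else None


html_key = option_key_from_anchor("index-faggressive-loop-optimizations-793")
xml_key = "faggressive-loop-optimizations"  # text of the XML <indexterm>
print(html_key == xml_key)
```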

Anything else?

According to the Texinfo documentation [3], the XML output can be converted back to the original Texinfo format with the txixml2texi command.