221 lines
8.8 KiB
HTML
221 lines
8.8 KiB
HTML
|
||
|
||
<!DOCTYPE html>
|
||
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
|
||
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
|
||
<head>
|
||
<meta charset="utf-8">
|
||
|
||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||
|
||
<title>Compiler optimizations</title>
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
<script type="text/javascript" src="_static/js/modernizr.min.js"></script>
|
||
|
||
|
||
<script type="text/javascript" id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
|
||
<script type="text/javascript" src="_static/jquery.js"></script>
|
||
<script type="text/javascript" src="_static/underscore.js"></script>
|
||
<script type="text/javascript" src="_static/doctools.js"></script>
|
||
<script type="text/javascript" src="_static/language_data.js"></script>
|
||
|
||
<script type="text/javascript" src="_static/js/theme.js"></script>
|
||
|
||
|
||
|
||
|
||
<link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
|
||
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
|
||
<link rel="index" title="Index" href="genindex.html" />
|
||
<link rel="search" title="Search" href="search.html" />
|
||
</head>
|
||
|
||
<body class="wy-body-for-nav">
|
||
|
||
|
||
<div class="wy-grid-for-nav">
|
||
|
||
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
||
<div class="wy-side-scroll">
|
||
<div class="wy-side-nav-search" >
|
||
|
||
|
||
|
||
<a href="index.html" class="icon icon-home"> Point Cloud Library
|
||
|
||
|
||
|
||
</a>
|
||
|
||
|
||
|
||
|
||
<div class="version">
|
||
1.12.0
|
||
</div>
|
||
|
||
|
||
|
||
|
||
<div role="search">
|
||
<form id="rtd-search-form" class="wy-form" action="search.html" method="get">
|
||
<input type="text" name="q" placeholder="Search docs" />
|
||
<input type="hidden" name="check_keywords" value="yes" />
|
||
<input type="hidden" name="area" value="default" />
|
||
</form>
|
||
</div>
|
||
|
||
|
||
</div>
|
||
|
||
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
|
||
|
||
|
||
|
||
|
||
|
||
|
||
<!-- Local TOC -->
|
||
<div class="local-toc"><ul>
|
||
<li><a class="reference internal" href="#">Compiler optimizations</a></li>
|
||
</ul>
|
||
</div>
|
||
|
||
|
||
</div>
|
||
</div>
|
||
</nav>
|
||
|
||
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
|
||
|
||
|
||
<nav class="wy-nav-top" aria-label="top navigation">
|
||
|
||
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
|
||
<a href="index.html">Point Cloud Library</a>
|
||
|
||
</nav>
|
||
|
||
|
||
<div class="wy-nav-content">
|
||
|
||
<div class="rst-content">
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
<div role="navigation" aria-label="breadcrumbs navigation">
|
||
|
||
<ul class="wy-breadcrumbs">
|
||
|
||
<li><a href="index.html">Docs</a> »</li>
|
||
|
||
<li>Compiler optimizations</li>
|
||
|
||
|
||
<li class="wy-breadcrumbs-aside">
|
||
|
||
|
||
|
||
</li>
|
||
|
||
</ul>
|
||
|
||
|
||
<hr/>
|
||
</div>
|
||
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
|
||
<div itemprop="articleBody">
|
||
|
||
<div class="section" id="compiler-optimizations">
|
||
<span id="id1"></span><h1>Compiler optimizations</h1>
|
||
<p>Using excessive compiler optimizations can really hurt your compile-time
|
||
performance, and there’s a question whether you really need these optimizations
|
||
everytime you recompile to prototype something new, or whether you can live
|
||
with a less optimal binary for testing things. Obviously once your tests
|
||
succeed and you want to deploy your project, you can simply re-enable the
|
||
compiler optimizations. Here’s a few tests that we did a while back with
|
||
<a class="reference external" href="http://pcl.ros.org/">pcl_ros</a>:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">-</span><span class="n">j1</span><span class="p">,</span> <span class="n">RelWithDebInfo</span> <span class="o">+</span> <span class="n">O3</span> <span class="p">:</span> <span class="mi">3</span><span class="n">m20</span><span class="o">.</span><span class="mi">376</span><span class="n">s</span> <span class="o">-</span><span class="n">j1</span><span class="p">,</span> <span class="n">RelWithDebInfo</span> <span class="p">:</span> <span class="mi">2</span><span class="n">m48</span><span class="o">.</span><span class="mi">064</span><span class="n">s</span>
|
||
<span class="o">-</span><span class="n">j1</span><span class="p">,</span> <span class="n">Debug</span> <span class="p">:</span> <span class="mi">2</span><span class="n">m0</span><span class="o">.</span><span class="mi">452</span><span class="n">s</span>
|
||
<span class="o">-</span><span class="n">j2</span><span class="p">,</span> <span class="n">Debug</span> <span class="p">:</span> <span class="mi">1</span><span class="n">m8</span><span class="o">.</span><span class="mi">151</span><span class="n">s</span>
|
||
<span class="o">-</span><span class="n">j4</span><span class="p">,</span> <span class="n">Debug</span> <span class="p">:</span> <span class="mi">0</span><span class="n">m42</span><span class="o">.</span><span class="mi">846</span><span class="n">s</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>In general, we got used to enable all compiler optimizations possible. In PCL
|
||
pre-0.4, this is how the CMakeLists.txt file looked like:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">add_definitions</span><span class="p">(</span><span class="o">-</span><span class="n">Wall</span> <span class="o">-</span><span class="n">O3</span> <span class="o">-</span><span class="n">DNDEBUG</span> <span class="o">-</span><span class="n">pipe</span> <span class="o">-</span><span class="n">ffast</span><span class="o">-</span><span class="n">math</span> <span class="o">-</span><span class="n">funroll</span><span class="o">-</span><span class="n">loops</span> <span class="o">-</span><span class="n">ftree</span><span class="o">-</span><span class="n">vectorize</span> <span class="o">-</span><span class="n">fomit</span><span class="o">-</span><span class="n">frame</span><span class="o">-</span><span class="n">pointer</span> <span class="o">-</span><span class="n">pipe</span> <span class="o">-</span><span class="n">mfpmath</span><span class="o">=</span><span class="n">sse</span> <span class="o">-</span><span class="n">mmmx</span> <span class="o">-</span><span class="n">msse</span> <span class="o">-</span><span class="n">mtune</span><span class="o">=</span><span class="n">core2</span> <span class="o">-</span><span class="n">march</span><span class="o">=</span><span class="n">core2</span> <span class="o">-</span><span class="n">msse2</span> <span class="o">-</span><span class="n">msse3</span> <span class="o">-</span><span class="n">mssse3</span> <span class="o">-</span><span class="n">msse4</span><span class="p">)</span>
|
||
<span class="c1">#add_definitions(-momit-leaf-frame-pointer -fomit-frame-pointer -floop-block -ftree-loop-distribution -ftree-loop-linear -floop-interchange -floop-strip-mine -fgcse-lm -fgcse-sm -fsched-spec-load)</span>
|
||
<span class="n">add_definitions</span> <span class="p">(</span><span class="o">-</span><span class="n">Wall</span> <span class="o">-</span><span class="n">O3</span> <span class="o">-</span><span class="n">Winvalid</span><span class="o">-</span><span class="n">pch</span> <span class="o">-</span><span class="n">pipe</span> <span class="o">-</span><span class="n">funroll</span><span class="o">-</span><span class="n">loops</span> <span class="o">-</span><span class="n">fno</span><span class="o">-</span><span class="n">strict</span><span class="o">-</span><span class="n">aliasing</span><span class="p">)</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>Obviously, not all those flags were enabled by default, but we were definitely
|
||
playing around with them, and sometimes committing them to the repository,
|
||
which led to increase compilation times for some of the projects that needed to
|
||
precompile/use PCL.</p>
|
||
<p>In general there is no good rule of thumb here, but we decided to disable these
|
||
excessive optimizations by default, and rely on CMake’s <em>RelWithDebInfo</em> by
|
||
default. You should do the same too when you prototype.</p>
|
||
</div>
|
||
|
||
|
||
</div>
|
||
|
||
</div>
|
||
<footer>
|
||
|
||
|
||
<hr/>
|
||
|
||
<div role="contentinfo">
|
||
<p>
|
||
© Copyright
|
||
|
||
</p>
|
||
</div>
|
||
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/rtfd/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
|
||
|
||
</footer>
|
||
|
||
</div>
|
||
</div>
|
||
|
||
</section>
|
||
|
||
</div>
|
||
|
||
|
||
|
||
<script type="text/javascript">
|
||
jQuery(function () {
|
||
SphinxRtdTheme.Navigation.enable(true);
|
||
});
|
||
</script>
|
||
|
||
|
||
|
||
|
||
|
||
|
||
</body>
|
||
</html> |