<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://camyang.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://camyang.com/" rel="alternate" type="text/html" hreflang="en" /><updated>2026-05-13T22:21:51+00:00</updated><id>https://camyang.com/feed.xml</id><title type="html">blank</title><entry><title type="html">JAX, Static-Shape Programming and Polyhedron</title><link href="https://camyang.com/blog/2025/hedrax/" rel="alternate" type="text/html" title="JAX, Static-Shape Programming and Polyhedron" /><published>2025-08-20T00:00:00+00:00</published><updated>2025-08-20T00:00:00+00:00</updated><id>https://camyang.com/blog/2025/hedrax</id><content type="html" xml:base="https://camyang.com/blog/2025/hedrax/"><![CDATA[<blockquote>
  <p>Rectangles are fine. Weird shapes are fun.</p>
</blockquote>

<p>JAX wants <strong>static shapes</strong>.<br />
Your loops, alas, are sometimes <strong>not rectangles</strong>.</p>

<p>This post tours the pain –&gt; coping strategies –&gt; a tiny helper I wrote called <strong>HedraX</strong>, 
which lets you index arbitrary <strong>polyhedral</strong> domains in JAX without summoning five GPTs.</p>

<p>This is <strong>Part 1</strong>: we’ll build intuition with hand-rolled code and end with HedraX’s <strong>table indexer</strong>.<br />
In <strong>Part 2</strong>, I’ll show a more “closed-form” approach HedraX can auto-generate for suitable domains.</p>

<hr />

<h2 id="rectangles-are-boring">Rectangles are Boring</h2>

<p>In JAX, we often translate a Python-like loop like</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">N</span><span class="p">):</span>
    <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">N</span><span class="p">):</span>
        <span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="nf">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span>
</code></pre></div></div>

<p>into</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">a</span> <span class="o">=</span> <span class="n">jax</span><span class="p">.</span><span class="nf">vmap</span><span class="p">(</span>
      <span class="n">jax</span><span class="p">.</span><span class="nf">vmap</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">in_axes</span><span class="o">=</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span>
      <span class="n">in_axes</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
    <span class="p">)(</span><span class="n">jnp</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="n">N</span><span class="p">),</span> <span class="n">jnp</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="n">N</span><span class="p">))</span>
</code></pre></div></div>

<p>This translation works fine for rectangular domains. 
But suppose we want the <strong>lower triangle</strong>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">N</span><span class="p">):</span>
    <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">):</span>
        <span class="n">a</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="nf">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span>
</code></pre></div></div>

<p>At first glance this looks “dynamic” because the bound of the inner loop over <code class="language-plaintext highlighter-rouge">j</code> depends on <code class="language-plaintext highlighter-rouge">i</code>. Can we do this in JAX with static shapes?</p>

<p>The answer is <em>yes</em>.</p>

<hr />

<h2 id="closed-form-triangles">The Heroic (but Fragile) Closed-Form for Triangles</h2>

<p>Although the domain isn’t rectangular, it <strong>is</strong> <em>statically sized</em>: counting the diagonal, it has <code class="language-plaintext highlighter-rouge">N * (N + 1) // 2</code> points.</p>

<p>We can biject a linear index <code class="language-plaintext highlighter-rouge">k</code> to <code class="language-plaintext highlighter-rouge">(i, j)</code> and iterate over <code class="language-plaintext highlighter-rouge">k</code>.</p>

<p>Picture the domain as a triangle and assign each point a linear index <code class="language-plaintext highlighter-rouge">k</code> by walking the rows <code class="language-plaintext highlighter-rouge">i</code> in order and, within each row, the columns <code class="language-plaintext highlighter-rouge">j</code>:</p>

<div style="text-align: center; align-items:center; justify-content:center;">
<script type="text/tikz">
\begin{document}
\begin{tikzpicture}[>=stealth]
% ggplot-like background and styling
\begin{scope}
\clip (-0.6,-0.6) rectangle (5.7,5.7);
\draw[fill=gray!10, draw=gray!30] (-0.6,-0.6) rectangle (5.7,5.7);
\foreach \x in {0,...,5} {\draw[white, line width=0.8pt] (\x,-0.6) -- (\x,5.7);} % vertical grid
\foreach \y in {0,...,5} {\draw[white, line width=0.8pt] (-0.6,\y) -- (5.7,\y);} % horizontal grid
\end{scope}

% axes
\draw[->, thick, gray!60] (-0.2,0) -- (5.7,0) node[below right] {$i$};
\draw[->, thick, gray!60] (0,-0.2) -- (0,5.7) node[above left] {$j$};
\foreach \t in {0,...,5} {
  \draw[gray!60] (\t,0) -- ++(0,-0.08) node[below] {\t};
  \draw[gray!60] (0,\t) -- ++(-0.08,0) node[left] {\t};
}

% N = 6 domain (i = 0..5, j = 0..i)
% triangle fill and border
\fill[blue!40, opacity=0.25] (0,0) -- (5,0) -- (5,5) -- cycle;
\draw[blue!60!black, line width=1pt] (0,0) -- (5,0) -- (5,5) -- cycle;

% integer lattice points in domain and k-labels
% k = T_i + j where T_i = i(i+1)/2
\foreach \i in {0,...,5} {
  \pgfmathtruncatemacro{\Ti}{\i*(\i+1)/2}
  \foreach \j in {0,...,\i} {
    \pgfmathtruncatemacro{\k}{\Ti + \j}
    \fill[gray!20!black] (\i,\j) circle (1.5pt);
    \node[anchor=south west, inner sep=1pt, text=gray!40!black, scale=0.8] at (\i,\j) {\scriptsize $\,\,\k$};
  }
}

% diagonal guide (for visualization)
\draw[blue!60!black, dashed] (0,0) -- (5,5);
\end{tikzpicture}
\end{document}
</script>

</div>

<p>And here is the JAX code that implements this idea:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Lower triangle: j in [0, i] (including the diagonal)
# k ranges over 0 .. T_N - 1, where T_m = m(m+1)/2
</span><span class="k">def</span> <span class="nf">body</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">k</span><span class="p">):</span>
    <span class="c1"># Solve for row i from k using the quadratic formula
</span>    <span class="n">i</span> <span class="o">=</span> <span class="n">jnp</span><span class="p">.</span><span class="nf">floor</span><span class="p">((</span><span class="n">jnp</span><span class="p">.</span><span class="nf">sqrt</span><span class="p">(</span><span class="mf">8.0</span> <span class="o">*</span> <span class="n">k</span> <span class="o">+</span> <span class="mf">1.0</span><span class="p">)</span> <span class="o">-</span> <span class="mf">1.0</span><span class="p">)</span> <span class="o">/</span> <span class="mf">2.0</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="n">jnp</span><span class="p">.</span><span class="n">int32</span><span class="p">)</span>
    <span class="n">Ti</span> <span class="o">=</span> <span class="p">(</span><span class="n">i</span> <span class="o">*</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span> <span class="o">//</span> <span class="mi">2</span>        <span class="c1"># T_i
</span>    <span class="n">j</span> <span class="o">=</span> <span class="p">(</span><span class="n">k</span> <span class="o">-</span> <span class="n">Ti</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="n">jnp</span><span class="p">.</span><span class="n">int32</span><span class="p">)</span> <span class="c1"># j in [0, i]
</span>    <span class="n">a</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">at</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">].</span><span class="nf">set</span><span class="p">(</span><span class="nf">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">a</span><span class="p">,</span> <span class="bp">None</span>

<span class="n">K</span> <span class="o">=</span> <span class="n">N</span> <span class="o">*</span> <span class="p">(</span><span class="n">N</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">//</span> <span class="mi">2</span>
<span class="n">a</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">lax</span><span class="p">.</span><span class="nf">scan</span><span class="p">(</span><span class="n">body</span><span class="p">,</span> <span class="n">a0</span><span class="p">,</span> <span class="n">jnp</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="n">K</span><span class="p">))</span>
</code></pre></div></div>

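<p>This works and is reasonably fast, but the math is bespoke, and the floating-point square root can land on the wrong row once <code class="language-plaintext highlighter-rouge">k</code> is large enough that <code class="language-plaintext highlighter-rouge">8 * k + 1</code> is no longer exactly representable. 
You also won’t want to re-derive a closed-form quadratic formula for every odd-shaped loop you meet.</p>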
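
<p>To sanity-check the index math without JAX, here is a minimal plain-Python sketch of the same unranking. It swaps the floating-point square root for the exact integer <code class="language-plaintext highlighter-rouge">math.isqrt</code>, and the helper name <code class="language-plaintext highlighter-rouge">unrank_triangle</code> is made up for this illustration:</p>

```python
import math

def unrank_triangle(k):
    """Map linear index k to (i, j) with 0 <= j <= i, walking rows i in order."""
    # Largest i with i * (i + 1) / 2 <= k, using an exact integer square root.
    i = (math.isqrt(8 * k + 1) - 1) // 2
    # Offset of k inside row i.
    j = k - i * (i + 1) // 2
    return i, j

N = 6
K = N * (N + 1) // 2

# Reference enumeration: the loop nest written out directly.
expected = [(i, j) for i in range(N) for j in range(i + 1)]
decoded = [unrank_triangle(k) for k in range(K)]

print(decoded == expected)
```

<p>Comparing against the plain loop nest like this is a cheap way to catch off-by-one mistakes before porting a formula into a scanned JAX body.</p>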

<hr />

<h2 id="precompute-route">The “fine, I’ll just precompute it” Route</h2>

<p>Another approach: <strong>explicitly enumerate</strong> the valid lattice points into a table and scan over that table.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">jax</span>
<span class="kn">import</span> <span class="n">jax.numpy</span> <span class="k">as</span> <span class="n">jnp</span>
<span class="kn">from</span> <span class="n">jax</span> <span class="kn">import</span> <span class="n">lax</span>

<span class="k">def</span> <span class="nf">build_coords_triangle</span><span class="p">(</span><span class="n">N</span><span class="p">):</span>
    <span class="c1"># Lower triangle (including the diagonal)
</span>    <span class="c1"># Store linear addresses k = i * N + j
</span>    <span class="n">pts</span> <span class="o">=</span> <span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">N</span><span class="p">)</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)]</span>
    <span class="k">return</span> <span class="n">jnp</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">pts</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">jnp</span><span class="p">.</span><span class="n">int32</span><span class="p">)</span>

<span class="n">addresses</span> <span class="o">=</span> <span class="nf">build_coords_triangle</span><span class="p">(</span><span class="n">N</span><span class="p">)</span>  <span class="c1"># shape: (K,)
</span><span class="k">def</span> <span class="nf">body</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">k</span><span class="p">):</span>
    <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="n">k</span> <span class="o">//</span> <span class="n">N</span><span class="p">,</span> <span class="n">k</span> <span class="o">%</span> <span class="n">N</span>
    <span class="n">a</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">at</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">].</span><span class="nf">set</span><span class="p">(</span><span class="nf">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">a</span><span class="p">,</span> <span class="bp">None</span>

<span class="n">a</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">lax</span><span class="p">.</span><span class="nf">scan</span><span class="p">(</span><span class="n">body</span><span class="p">,</span> <span class="n">a0</span><span class="p">,</span> <span class="n">addresses</span><span class="p">)</span>
</code></pre></div></div>

<p>This is conceptually simple but:</p>
<ul>
  <li>adds an <strong>address table</strong> (memory),</li>
  <li>adds an <strong>extra read</strong> per iteration,</li>
  <li>still asks you to <strong>hand-enumerate</strong> the domain.</li>
</ul>

<p>What if your domain is… less cozy?</p>

<hr />

<h2 id="general-polyhedra">From Triangles to “Whatever”</h2>

<p>Consider the polygonal domain
\(\mathcal{D} 
= \{ (i, j) \in \mathbb{Z}^2 \mid \; 5j - i - 8 \ge 0,\; -3i - 6j + 39 \ge 0,\; 4i + j - 10 \ge 0 \},\)
which looks like this:</p>

<div style="text-align: center; align-items:center; justify-content:center;">
<script type="text/tikz">
\begin{document}
\begin{tikzpicture}[>=stealth]
% ggplot-like background
\begin{scope}
\clip (-0.6,-0.6) rectangle (8.6,8.6);
\draw[fill=gray!10, draw=gray!30] (-0.6,-0.6) rectangle (8.6,8.6);
\foreach \x in {0,...,8} {\draw[white, line width=0.8pt] (\x,-0.6) -- (\x,8.6);} % vertical grid
\foreach \y in {0,...,8} {\draw[white, line width=0.8pt] (-0.6,\y) -- (8.6,\y);} % horizontal grid
\end{scope}

% axes
\draw[->, thick, gray!60] (-0.2,0) -- (8.6,0) node[below right] {$i$};
\draw[->, thick, gray!60] (0,-0.2) -- (0,8.6) node[above left] {$j$};
\foreach \t in {0,...,8} {
  \draw[gray!60] (\t,0) -- ++(0,-0.08) node[below] {\t};
  \draw[gray!60] (0,\t) -- ++(-0.08,0) node[left] {\t};
}

% A rotated obtuse triangle (obtuse angle at A)
\def\Ax{2} \def\Ay{2}
\def\Bx{7} \def\By{3}
\def\Cx{1} \def\Cy{6}

% fill and border
\fill[orange!40, opacity=0.25] (\Ax,\Ay) -- (\Bx,\By) -- (\Cx,\Cy) -- cycle;
\draw[orange!60!black, line width=1pt] (\Ax,\Ay) -- (\Bx,\By) -- (\Cx,\Cy) -- cycle;

% axis-aligned bounding rectangle (hull), dashed border, no fill
\draw[gray!70, dashed, line width=0.9pt] (1,2) rectangle (7,6);

% integer lattice points inside the triangle
\foreach \x in {0,...,8} {
  \foreach \y in {0,...,8} {
    % signed areas via 2D cross products
    \pgfmathsetmacro{\sone}{(\Bx-\Ax)*(\y-\Ay) - (\By-\Ay)*(\x-\Ax)}
    \pgfmathsetmacro{\stwo}{(\Cx-\Bx)*(\y-\By) - (\Cy-\By)*(\x-\Bx)}
    \pgfmathsetmacro{\sthree}{(\Ax-\Cx)*(\y-\Cy) - (\Ay-\Cy)*(\x-\Cx)}
    % same-sign check (include boundary)
    \pgfmathtruncatemacro{\signone}{\sone >= 0 ? 1 : -1}
    \pgfmathtruncatemacro{\signtwo}{\stwo >= 0 ? 1 : -1}
    \pgfmathtruncatemacro{\signthree}{\sthree >= 0 ? 1 : -1}
    \ifnum\signone=\signtwo\relax
      \ifnum\signone=\signthree\relax
        \fill[gray!20!black] (\x,\y) circle (1.5pt);
      \fi
    \fi
  }
}

\end{tikzpicture}
\end{document}
</script>
</div>

<p>How would we write a <code class="language-plaintext highlighter-rouge">build_coords_triangle</code>-style function for this domain? 
It’s not obvious.</p>

<p>A simple approach is to <strong>bound</strong> the domain by a rectangle and reject points outside the domain, as shown
by the dashed rectangle in the figure above.</p>

<p>But, in higher dimensions:</p>
<ul>
  <li>bounding boxes get tedious,</li>
  <li>rejection gets expensive.</li>
</ul>
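
<p>For the 2-D example above, bounding-box rejection can be sketched in plain Python as follows. The box limits are read off the dashed rectangle in the figure, and the names <code class="language-plaintext highlighter-rouge">in_domain</code>, <code class="language-plaintext highlighter-rouge">box</code> and <code class="language-plaintext highlighter-rouge">points</code> are only illustrative:</p>

```python
def in_domain(i, j):
    # The three half-plane constraints that define D in the text.
    return (5 * j - i - 8 >= 0
            and -3 * i - 6 * j + 39 >= 0
            and 4 * i + j - 10 >= 0)

# Bounding box taken from the dashed rectangle: i in [1, 7], j in [2, 6].
I_MIN, I_MAX, J_MIN, J_MAX = 1, 7, 2, 6

# Enumerate every lattice point of the box, then keep the ones inside D.
box = [(i, j) for i in range(I_MIN, I_MAX + 1) for j in range(J_MIN, J_MAX + 1)]
points = [p for p in box if in_domain(*p)]

print(len(points), "of", len(box), "box points are in D")
```

<p>Even in this small case most of the box is rejected, and the wasted fraction tends to grow with the number of dimensions.</p>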

<hr />

<h2 id="introducing-hedrax">Introducing HedraX</h2>

<p>Happily, the problem of <strong>parametric polyhedral enumeration</strong> has been studied to death <a class="citation" href="#verdoolaege2007">(Verdoolaege et al., 2007; Klöckner, 2014; Verdoolaege, 2010)</a>.<br />
It powers <strong>polyhedral compilation</strong> in systems like LLVM/MLIR.</p>

<p>I wrapped just enough of that machinery into a tiny helper: <a href="https://github.com/thisiscam/hedrax"><strong>HedraX</strong></a>, specifically built for the use case of static-shape programming in JAX.<sup id="fnref:islpy"><a href="#fn:islpy" class="footnote" rel="footnote" role="doc-noteref">1</a></sup></p>

<p><strong>TL;DR:</strong> Tell HedraX your domain; it builds the address table for you and gives you an <code class="language-plaintext highlighter-rouge">unravel</code> to recover multi-indices.</p>

<h4 id="the-triangle-example">The Triangle Example</h4>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">hedrax</span> <span class="k">as</span> <span class="n">hdx</span>
<span class="kn">from</span> <span class="n">jax</span> <span class="kn">import</span> <span class="n">lax</span>

<span class="n">addresses</span><span class="p">,</span> <span class="n">unravel</span> <span class="o">=</span> <span class="n">hdx</span><span class="p">.</span><span class="nf">compile_table_indexer</span><span class="p">(</span>
    <span class="sh">"</span><span class="s">[N] -&gt; { [i, j] : 0 &lt;= j &lt;= i &lt; N }</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">N</span><span class="o">=</span><span class="mi">10</span>
<span class="p">)</span>

<span class="k">def</span> <span class="nf">body</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">k</span><span class="p">):</span>
    <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="o">=</span> <span class="nf">unravel</span><span class="p">(</span><span class="n">k</span><span class="p">)</span>
    <span class="n">a</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">at</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">].</span><span class="nf">set</span><span class="p">(</span><span class="nf">f</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">a</span><span class="p">,</span> <span class="bp">None</span>

<span class="n">a</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">lax</span><span class="p">.</span><span class="nf">scan</span><span class="p">(</span><span class="n">body</span><span class="p">,</span> <span class="n">a0</span><span class="p">,</span> <span class="n">addresses</span><span class="p">)</span>
</code></pre></div></div>

<p>Crazy domain? Just change the set:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">addresses</span><span class="p">,</span> <span class="n">unravel</span> <span class="o">=</span> <span class="n">hdx</span><span class="p">.</span><span class="nf">compile_table_indexer</span><span class="p">(</span>
    <span class="sh">"</span><span class="s">[N] -&gt; { [i, j] : 5j - i - 8 &gt;= 0 and -3i - 6j + 39 &gt;= 0 and 4i + j - 10 &gt;= 0 }</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">N</span><span class="o">=</span><span class="mi">10</span>
<span class="p">)</span>
</code></pre></div></div>

<h4 id="the-gpt-unicorn">The GPT Unicorn</h4>

<p>With the table indexer in HedraX, you can even do <strong>unions</strong> of polyhedra.</p>

<p>For example, here is a ChatGPT-generated unicorn built as a union of convex polyhedra:</p>




  <div
  class="jupyter-notebook"
  style="position: relative; width: 100%; margin: 0 auto;">
  <div class="jupyter-notebook-iframe-container">
    <iframe
      src="/assets/jupyter/unicorn_domain.ipynb.html"
      style="position: absolute; top: 0; left: 0; border-style: none;"
      width="100%"
      height="100%"
      onload="this.parentElement.style.paddingBottom = (this.contentWindow.document.documentElement.scrollHeight + 10) + 'px'"></iframe>
  </div>
</div>



<div style="text-align: center; font-size: 1.5em; margin: 1em 0;">
<em>Voilà!</em>
</div>

<h4 id="what-about-the-quadratic-solving-approach">What About the Quadratic-Solving Approach?</h4>

<p><code class="language-plaintext highlighter-rouge">hdx.compile_table_indexer</code> automates the <a href="#precompute-route">“precompute the table” route</a> from earlier.<br />
It doesn’t produce the neat closed-form mapping of the <a href="#closed-form-triangles">closed-form approach</a>, but in <strong>Part 2</strong> I’ll show how HedraX can derive such closed forms automatically when the domain admits them.</p>

<hr />
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:islpy">
      <p>Under the hood, HedraX owes a lot to <a href="https://github.com/inducer/islpy">islpy</a> <a class="citation" href="#islpy">(Klöckner, 2014)</a>, a Python binding for the <a href="https://github.com/libratbag/isl">isl</a> library for manipulating parametric polyhedra. <a href="#fnref:islpy" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><category term="blog" /><category term="jax" /><category term="polyhedral-compilation" /><category term="hedrax" /><summary type="html"><![CDATA[Rectangles are fine. Weird shapes are fun.]]></summary></entry><entry><title type="html">On The Computability of Parametric Inversion</title><link href="https://camyang.com/blog/2024/computable-parametric-inversion/" rel="alternate" type="text/html" title="On The Computability of Parametric Inversion" /><published>2024-12-20T00:00:00+00:00</published><updated>2024-12-20T00:00:00+00:00</updated><id>https://camyang.com/blog/2024/computable-parametric-inversion</id><content type="html" xml:base="https://camyang.com/blog/2024/computable-parametric-inversion/"><![CDATA[<p>Parametric inversion, introduced in <a class="citation" href="#TavaresLezama2016">(Tavares &amp; Solar-Lezama, 2016)</a>,
generalizes the classical notion of function inversion to non-invertible functions by introducing a parameterized function that selects specific elements from the preimage of a given function.
This approach enables inverting functions that are not bijective, opening up new possibilities for practical applications.</p>

<p>In <a class="citation" href="#TavaresLezama2016">(Tavares &amp; Solar-Lezama, 2016)</a>, the authors mentioned that</p>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In contrast to a conventional inverse, a parametric inverse always exists.
</code></pre></div></div>

<p>While this holds mathematically, the computability of parametric inverses presents additional challenges.
Computability is essential for algorithmic applications, determining whether such inverses can be constructed and utilized effectively.
In this document, I examine the computability of parametric inverses within the framework of <strong>Type-2 Computability</strong> <a class="citation" href="#weihrauch2000computable">(Weihrauch, 2000)</a>.
I show that a computable function need not have a computable parametric inverse, illustrating a limitation of this concept.</p>

<hr />

<h3 id="parametric-inversion">Parametric Inversion</h3>

<p><strong>Definition</strong> For a function $f: X \to Y$, a function $ f^{-1} : Y \times \Theta \to X $ is a <strong>parametric inverse</strong> of $f$ if, for all $y \in Y$:</p>

\[\{ f^{-1}(y, \theta) \mid \theta \in \Theta \} = \set{ x \in X \mid f(x) = y }.\]

<p>Here, $\theta$ serves as a parameter to select specific elements from the preimage of $f$, ensuring full coverage of the preimage for each $y$.</p>

<p>Mathematically, a parametric inverse always exists for any function $f$. To construct one:</p>

<ol>
  <li>For each $y$, choose an arbitrary element $x^*_y \in \set{ x \in X \mid f(x) = y }$.</li>
  <li>Let $\Theta = X$.</li>
  <li>
    <p>Define a trivial parametric inverse as:</p>

\[f^{-1}(y, \theta) =
\begin{cases}
   \theta &amp; \text{if } f(\theta) = y, \\
   x^*_y &amp; \text{otherwise}.
\end{cases}\]
  </li>
</ol>
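
<p>As a quick illustration, the construction can be spelled out on a small finite domain. This is only a sketch under assumptions of my choosing: $X = \{-3, \dots, 3\}$, $f(x) = x^2$, and $y$ ranging over the image of $f$ so that every preimage is non-empty.</p>

```python
X = range(-3, 4)  # hypothetical finite domain

def f(x):
    return x * x

# Step 1: pick one representative x*_y for every y in the image of f
# (here: the first preimage met while scanning X).
representative = {}
for x in X:
    representative.setdefault(f(x), x)

# Steps 2-3: Theta = X. Return theta when it is a preimage of y,
# otherwise fall back to the representative of y.
def parametric_inverse(y, theta):
    return theta if f(theta) == y else representative[y]

# Check the defining property: sweeping theta covers exactly the preimage.
for y in representative:
    covered = {parametric_inverse(y, theta) for theta in X}
    preimage = {x for x in X if f(x) == y}
    assert covered == preimage
```
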

<p>However, when considering continuous (or computable) functions, ensuring that the parametric inverse is also continuous (or computable) becomes more complex.
This distinction leads to interesting limitations, as shown in the example below.</p>

<hr />

<h3 id="a-computable-function-without-a-computable-parametric-inverse">A Computable Function Without a Computable Parametric Inverse</h3>

<p>Consider the <strong>ReLU function</strong>, $\mathrm{relu}: \mathbb{R} \to \mathbb{R}$, defined as:</p>

\[\mathrm{relu}(x) = \max(x, 0).\]

<p>The ReLU function is continuous and computable because $\max(\cdot, \cdot)$ is continuous and computable <a class="citation" href="#weihrauch2000computable">(Weihrauch, 2000, theorem 4.3.2)</a>.</p>

<p>However, any parametric inverse $\mathrm{relu}^{-1}$ is discontinuous, as shown below.</p>

<h5 id="proof-of-discontinuity">Proof of Discontinuity</h5>

<ol>
  <li>For $x = a$, where $a &lt; 0$, we have $\mathrm{relu}(a) = 0$.</li>
  <li>By the definition of parametric inverse, there exists $\theta^* \in \Theta$ such that:
    <ul>
      <li>$\mathrm{relu}^{-1}(0, \theta^*) = a$, and</li>
      <li>$\mathrm{relu}^{-1}(y, \theta^*) = y$ for all $y &gt; 0$, since the preimage of any $y &gt; 0$ is the singleton $\{y\}$.</li>
    </ul>
  </li>
</ol>

<p>Together, these imply a jump discontinuity at $y = 0$: $\mathrm{relu}^{-1}(y, \theta^*) \to 0$ as $y \to 0^+$, yet $\mathrm{relu}^{-1}(0, \theta^*) = a &lt; 0$,
as shown in the figure:</p>

<div style="text-align: center; align-items:center; justify-content:center;">
<script type="text/tikz">
\begin{document}
\begin{tikzpicture}[>=stealth, scale=.8]

% Draw the axes
\draw[->] (-1, 0) -- (2, 0) node[below] {$y$}; % Horizontal axis (Y-axis)
\draw[->] (0, -1) -- (0, 2) node[left] {$x$}; % Vertical axis (X-axis)

% Draw the function for y > 0
\draw[thick, red, domain=0.09:2] plot (\x, \x);

\draw[thick, red] (0, 0) circle (.1); % Open circle for undefined

% Draw dot at (0, 0)
\draw[thick, red, fill=red] (0, -.5) circle (1pt);
\draw[thick, red] (0, -.5) node[left] {$a$};

\end{tikzpicture}
\end{document}
</script>

</div>

<p>Since every computable function on $\mathbb{R}$ is continuous <a class="citation" href="#weihrauch2000computable">(Weihrauch, 2000, theorem 4.3.1)</a>, the discontinuity implies that $\mathrm{relu}^{-1}$ is not computable.
Thus, the $\mathrm{relu}$ example demonstrates a computable function lacking a computable parametric inverse.</p>

<hr />

<h3 id="when-is-parametric-inversion-computable">When Is Parametric Inversion Computable?</h3>

<p>Functions defined on <strong>computably enumerable sets</strong> (e.g., integers $\mathbb{Z}$, rationals $\mathbb{Q}$, finite-length strings $\Sigma^*$) admit computable parametric inverses. A simple construction is as follows:</p>

<p>Let $f: X \to Y$ be a computable function mapping between computably enumerable domains $X$ and $Y$.
Let $\Theta = \mathbb{N}$. Define $f^{-1} : Y \times \Theta \to X$ with the following pseudo-code:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">make_parametric_inverse</span><span class="p">(</span><span class="n">f</span><span class="p">:</span> <span class="n">Callable</span><span class="p">[[</span><span class="n">X</span><span class="p">],</span> <span class="n">Y</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="n">Callable</span><span class="p">[[</span><span class="n">Y</span><span class="p">,</span> <span class="n">Naturals</span><span class="p">],</span> <span class="n">X</span><span class="p">]:</span>
    <span class="k">def</span> <span class="nf">parametric_inverse</span><span class="p">(</span><span class="n">y</span><span class="p">:</span> <span class="n">Y</span><span class="p">,</span> <span class="n">theta</span><span class="p">:</span> <span class="n">Naturals</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">X</span><span class="p">:</span>
        <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">X</span><span class="p">:</span>  <span class="c1"># Assume X is computably enumerable
</span>            <span class="k">if</span> <span class="nf">f</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">==</span> <span class="n">y</span><span class="p">:</span>  <span class="c1"># Equality testing is computable for computably enumerable Y
</span>                <span class="k">if</span> <span class="n">theta</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
                    <span class="k">return</span> <span class="n">x</span>
                <span class="n">theta</span> <span class="o">-=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="n">parametric_inverse</span>
</code></pre></div></div>

<p>This algorithm ensures computability but is inefficient, relying on exhaustive enumeration.</p>
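
<p>A runnable version of the pseudo-code might look like the sketch below. The enumeration <code class="language-plaintext highlighter-rouge">integers</code> and the example function $x \mapsto x \bmod 3$ are assumptions chosen for illustration; passing the enumeration in as an argument stands in for the pseudo-code's <code class="language-plaintext highlighter-rouge">for x in X</code>.</p>

```python
from itertools import count

def integers():
    """Enumerate Z as 0, 1, -1, 2, -2, ..."""
    yield 0
    for n in count(1):
        yield n
        yield -n

def make_parametric_inverse(f, enumerate_domain):
    def parametric_inverse(y, theta):
        # Return the theta-th preimage of y in enumeration order.
        # Caveat: this loops forever if y has at most theta preimages.
        remaining = theta
        for x in enumerate_domain():
            if f(x) == y:
                if remaining == 0:
                    return x
                remaining -= 1
    return parametric_inverse

# Every residue class mod 3 is infinite, so each call below terminates.
inv = make_parametric_inverse(lambda x: x % 3, integers)
print([inv(1, theta) for theta in range(4)])
```

<p>Restarting the enumeration on every call keeps the sketch stateless, at the cost of redoing the search from scratch each time.</p>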

<p>For general domains, computability appears to hinge on additional structure of the function,
such as a computable enumeration of the function’s local optima, a topic worth further exploration.</p>

<hr />

<h3 id="conclusion">Conclusion</h3>

<p>Parametric inverses provide a flexible framework for “inverting” non-bijective functions, with guaranteed existence mathematically.
However, their computability depends on the function’s domain and properties.
As shown, computable functions on reals may lack computable parametric inverses due to discontinuities in the inverse.
On the other hand, functions on discrete domains offer a more favorable computability landscape.
This nuanced interplay highlights interesting directions for further study within computability theory.</p>]]></content><author><name></name></author><category term="blog" /><category term="computability" /><summary type="html"><![CDATA[Parametric inversion, introduced in (Tavares &amp; Solar-Lezama, 2016), generalizes the classical notion of function inversion to non-invertible functions by introducing a parameterized function that selects specific elements from the preimage of a given function. This approach enables inverting functions that are not bijective, opening up new possibilities for practical applications.]]></summary></entry><entry><title type="html">Estimating Fluid Velocity and Diffusion from Temperature Measurements (in Theory)</title><link href="https://camyang.com/blog/2024/fluid-sensor/" rel="alternate" type="text/html" title="Estimating Fluid Velocity and Diffusion from Temperature Measurements (in Theory)" /><published>2024-08-15T00:00:00+00:00</published><updated>2024-08-15T00:00:00+00:00</updated><id>https://camyang.com/blog/2024/fluid-sensor</id><content type="html" xml:base="https://camyang.com/blog/2024/fluid-sensor/"><![CDATA[<h1 id="background">Background</h1>

<p>My father, a sensor engineer, recently posed an intriguing question to me: How can we estimate the velocity and thermal diffusion coefficient of a running fluid using only temperature measurements?</p>

<p>While I have taken some computational physics classes, I am not an expert in fluid dynamics or sensor design. However, based on some fundamental physics principles,
we can sketch out a theoretical approach that might just work.</p>

<hr />

<h2 id="basic-setup">Basic Setup</h2>

<p>The temperature of a fluid in a long insulated pipe should basically become stationary after a while, assuming the fluid is flowing at a constant velocity.
So, to be able to get a signal from the temperature, we need to introduce a heat source at a specific point, say $x = 0$ in the pipe.</p>

<p>A good idea is to drive the heat source with a periodic signal, say $f(t) = A(1 + \sin(\omega_d t))$, so that our temperature sensors can pick up the signal at the same frequency $\omega_d$ and analyze the temperature distribution at that frequency.
Intuitively, this should reduce the chance that the temperature sensors pick up noise from the environment, as long as we pick a unique drive frequency.</p>

<h2 id="the-1-d-heat-partial-differential-equation">The 1-D Heat Partial Differential Equation</h2>

<p>To get started, ChatGPT told me to consider the classic one-dimensional heat equation, which describes how temperature evolves over time in a moving fluid in an infinitely long pipe.</p>

\[\frac{\partial T}{\partial t} = \alpha \frac{\partial^2 T}{\partial x^2} - v \frac{\partial T}{\partial x} + \frac{f(t)}{\rho c} \delta(x)\]

<p>Where:</p>

<ul>
  <li>$\alpha$ is the thermal diffusivity.</li>
  <li>$v$ is the velocity of the fluid (what we’re trying to estimate).</li>
  <li>$f(t)$ is the heat source, which we’ll assume is $f(t) = A(1 + \sin(\omega_d t))$.</li>
  <li>$\rho$ and $c$ are the fluid density and specific heat capacity.</li>
  <li>$\delta(x)$ is a Dirac delta function to model the fact that the heat source is located at a point $x = 0$.</li>
</ul>

<p>I’ll admit, there are practical challenges to implementing this in the real world, especially when it comes to building a sensor. But for now, let’s stick with the math!</p>

<hr />

<h3 id="solving-the-equation-in-the-frequency-domain">Solving the Equation in the Frequency Domain</h3>

<p>To make things easier, we switch from the time domain to the frequency domain by applying the Fourier transform. This lets us look at the system’s response at the driving frequency $\omega_d$. The Fourier transform of the temperature $T(x, t)$ at this frequency is $\hat{T}(x)$:</p>

\[\hat{T}(x, \omega) = \int_{-\infty}^{\infty} T(x, t) e^{-i \omega t} \, dt\]

<p>Since we’re interested in the response to the heat source’s drive frequency $\omega_d$, we consider the Fourier transform at this specific frequency, $\hat{T}(x) = \hat{T}(x, \omega_d)$.</p>

<p>By applying the Fourier transform to the heat equation, we get the following equation at frequency $\omega_d$:</p>

\[i \omega_d \hat{T}(x) = \alpha \frac{d^2 \hat{T}}{dx^2} - v \frac{d \hat{T}}{dx} - \frac{A}{\rho c} i \pi \delta(0) \delta(x)\]

<p>For $x \neq 0$, the delta function vanishes, so we are left with the homogeneous part of the equation:</p>

\[\alpha \frac{d^2 \hat{T}}{dx^2} - v \frac{d \hat{T}}{dx} - i \omega_d \hat{T}(x) = 0\]

<p>This is a second-order ordinary differential equation that we can solve. The general solution is:</p>

\[\hat{T}(x) = C_1 e^{\lambda_1 x} + C_2 e^{\lambda_2 x}\]

<p>Where the constants $\lambda_1$ and $\lambda_2$ are:</p>

\[\lambda_{1,2} = \frac{v \pm \sqrt{v^2 + 4 \alpha i \omega_d}}{2 \alpha}\]
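
<p>As a quick numerical sanity check, here is a minimal Python sketch that evaluates $\lambda_{1,2}$ and plugs them back into the characteristic equation $\alpha \lambda^2 - v \lambda - i \omega_d = 0$. The function name <code>characteristic_roots</code> and the parameter values are made up for illustration; they are not measurements.</p>

```python
import cmath


def characteristic_roots(v: float, alpha: float, omega_d: float):
    """Return (lambda_1, lambda_2), the roots of alpha*lam^2 - v*lam - i*omega_d = 0."""
    # cmath.sqrt returns the principal root, whose real part is non-negative.
    root = cmath.sqrt(v**2 + 4j * alpha * omega_d)
    return (v + root) / (2 * alpha), (v - root) / (2 * alpha)


# Hypothetical values, chosen only to keep the numbers readable.
v, alpha, omega_d = 0.01, 1e-4, 1.0
lam1, lam2 = characteristic_roots(v, alpha, omega_d)

# Residual of the characteristic equation for each root (should be ~0).
residuals = [alpha * lam**2 - v * lam - 1j * omega_d for lam in (lam1, lam2)]
print(lam1, lam2)
print(residuals)
```

<p>For positive $\alpha$ and $\omega_d$, the real part of $\lambda_1$ comes out positive and that of $\lambda_2$ negative, which is the property the boundary conditions rely on.</p>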

<hr />

<h3 id="applying-boundary-conditions">Applying Boundary Conditions</h3>

<p>Now we apply the boundary conditions. Since we want the solution to remain bounded as $x \to \infty$, we must set $C_1 = 0$ for $x &gt; 0$. Similarly, for $x &lt; 0$, we set $C_2 = 0$ to avoid a diverging solution as $x \to -\infty$.</p>

<p>At $x = 0$, the temperature distribution must be continuous, so we require that $\hat{T}(0^-)$ equals $\hat{T}(0^+)$. This gives us the final form of the solution:</p>

<p>\begin{equation}
\label{eq:solution}
\hat{T}(x) = C \exp{\left(\frac{v - \text{sign}(x) \sqrt{v^2 + 4 \alpha i \omega_d}}{2 \alpha} x\right)}
\end{equation}</p>

<p>Where $C$ is a constant that depends on the heat source amplitude $A$ and the fluid properties $\rho$ and $c$. The sign function $\text{sign}(x)$ is $1$ for $x &gt; 0$ and $-1$ for $x &lt; 0$.</p>

<hr />

<h3 id="measuring-temperature-and-solving-for-v-and-alpha">Measuring Temperature and Solving for $v$ and $\alpha$</h3>

<p>A good idea here is to place two sensors at different locations and measure $\hat{T}(x_0)$ and $\hat{T}(x_1)$.
This way, we can cancel out the constant $C$! This effectively means we use two sensors to “denoise” the signal, removing effects due to the initial conditions and the heat source’s waveform (in practice, the waveform cannot be perfectly sinusoidal).</p>

<p>So, let us place two temperature sensors at different positions along the flow, say at $x_0$ and $x_1$. Once we have temperature measurements at these two locations, we take their discrete Fourier transforms, giving us two complex values $M_0 = \hat{T}(x_0)$ and $M_1 = \hat{T}(x_1)$.</p>

<p>We take the log of the ratio between these measurements:</p>

\[K = \log{\frac{M_1}{M_0}}\]
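
<p>A rough sketch of this measurement step, using synthetic traces instead of real sensor data: the helper <code>fourier_component</code> is a hypothetical name, and the gain and phase lag of the second trace are invented numbers. The sampling window covers a whole number of drive periods, so the constant offset drops out of the sum.</p>

```python
import cmath
import math


def fourier_component(samples, dt, omega):
    """Riemann-sum approximation of the Fourier integral at a single frequency."""
    return dt * sum(s * cmath.exp(-1j * omega * k * dt) for k, s in enumerate(samples))


# Synthetic stand-ins for the two sensor traces (hypothetical numbers):
# sensor 1 sees the same oscillation as sensor 0, attenuated and delayed.
omega_d = 2 * math.pi * 0.5      # drive frequency, rad/s
dt, n = 0.01, 2000               # 20 s of samples = 10 full drive periods
gain, lag = 0.6, 0.4             # amplitude ratio and phase lag (rad)
trace0 = [1.0 + math.sin(omega_d * k * dt) for k in range(n)]
trace1 = [1.0 + gain * math.sin(omega_d * k * dt - lag) for k in range(n)]

M0 = fourier_component(trace0, dt, omega_d)
M1 = fourier_component(trace1, dt, omega_d)
K = cmath.log(M1 / M0)           # principal branch: imaginary part in (-pi, pi]
print(K.real, math.log(gain))    # real part: log of the amplitude ratio
print(K.imag, -lag)              # imaginary part: minus the phase lag
```

<p>In this toy setup the real part of $K$ recovers the log of the amplitude ratio and the imaginary part recovers the (negated) phase lag between the two traces.</p>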

<p>This $K$ value is a complex number, which consists of two real quantities, and we can use it to solve for two parameters of interest $v$ and $\alpha$ by setting up the equations from the solution \eqref{eq:solution}:</p>

\[K = \frac{v - \text{sign}(x_1) \sqrt{v^2 + 4 \alpha i \omega_d}}{2 \alpha} x_1 - \frac{v - \text{sign}(x_0) \sqrt{v^2 + 4 \alpha i \omega_d}}{2 \alpha} x_0\]

<p>Depending on whether $x_0$ and $x_1$ are both positive, or if one is negative, the solution process will vary slightly. For example, if both positions are positive, then $K = \lambda_2 (x_1 - x_0)$, and the real and imaginary parts of the characteristic equation for $\lambda_2$ form a system of linear equations, which can be solved for $v$ and $\alpha$:</p>

\[\begin{aligned}
  v      &amp; = \frac{a^2 - b^2}{b^3 + a^2 b} (x_1 - x_0) \omega_d \\
  \alpha &amp; = \frac{a}{b^3 + a^2 b} (x_1 - x_0)^{2} \omega_d
\end{aligned}\]

<p>where $a = \Re(K)$ and $b = \Im(K)$. If the sensors are on opposite sides of the source, the solution is a bit more complex and becomes a quadratic system, but it’s still solvable by Mathematica:</p>

\[\begin{aligned}
  \label{eq:solution-np}
    v      &amp; = \frac{(x_0 + x_1)^2 \left( b^2 (x_1 - x_0) - \lvert a \rvert \sqrt{\left( -4b^2 x_0 x_1 + a^2 (x_0 + x_1)^2 \right)}  \right)}{b \left( b^2 (x_0 - x_1)^2 + a^2 (x_0 + x_1)^2 \right) } \omega_d \\
    \alpha &amp; = \frac{(x_0 + x_1)^2 \left( a^2  (x_0 + x_1)^2 + \lvert a \rvert (-x_0 + x_1) \sqrt{\left( -4b^2 x_0 x_1 + a^2 (x_0 + x_1)^2 \right)} \right)}{2ab \left( b^2 (x_0 - x_1)^2 + a^2 (x_0 + x_1)^2 \right)} \omega_d
\end{aligned}\]
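
<p>For the simpler same-side case ($x_0, x_1 &gt; 0$), the round trip can be sketched in a few lines of Python: compute $K$ from known $v$ and $\alpha$, then recover them by solving the two real equations obtained from $\alpha \lambda^2 - v \lambda - i \omega_d = 0$ with $\lambda = K / (x_1 - x_0)$. The function names and the numbers are assumptions for illustration only.</p>

```python
import cmath


def forward_K(v, alpha, omega_d, x0, x1):
    """K = log(M1/M0) predicted for two sensors downstream of the source."""
    lam2 = (v - cmath.sqrt(v**2 + 4j * alpha * omega_d)) / (2 * alpha)
    return lam2 * (x1 - x0)


def recover_v_alpha(K, omega_d, x0, x1):
    """Solve the 2x2 linear system in (alpha, v) by Cramer's rule."""
    lam = K / (x1 - x0)
    p, q = lam.real, lam.imag
    # Real part:      alpha * (p^2 - q^2) - v * p = 0
    # Imaginary part: alpha * (2 p q)     - v * q = omega_d
    a11, a12, b1 = p * p - q * q, -p, 0.0
    a21, a22, b2 = 2 * p * q, -q, omega_d
    det = a11 * a22 - a12 * a21
    alpha = (b1 * a22 - a12 * b2) / det
    v = (a11 * b2 - a21 * b1) / det
    return v, alpha


# Hypothetical ground truth and sensor positions (SI units).
v_true, alpha_true, omega_d = 0.01, 1e-4, 1.0
x0, x1 = 0.01, 0.03
K = forward_K(v_true, alpha_true, omega_d, x0, x1)
v_est, alpha_est = recover_v_alpha(K, omega_d, x0, x1)
print(v_est, alpha_est)
```

<p>With noise-free input the estimates match the values that generated $K$; with real measurements, the quality of $K$ would dominate the error.</p>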

<hr />

<h1 id="practical-challenges-and-some-guesswork">Practical Challenges (And Some Guesswork)</h1>

<h2 id="sensor-placement">Sensor Placement</h2>

<p>One obvious challenge is placing the sensors at the right positions. They need to be at known distances from the heat source, and they should be sensitive enough to pick up small temperature changes at the drive frequency $\omega_d$. Also, we need to make sure the sensors are sampling fast enough to avoid aliasing (more on that below).</p>

<h2 id="drive-frequency-and-aliasing">Drive Frequency and Aliasing</h2>

<p>From some simulations, I found a constraint on the drive frequency $\omega_d$ that needs to be satisfied. Specifically, the following condition must hold:</p>

\[\frac{\omega_d}{\pi} = 2 f_d &lt; \frac{v}{||x_1| - |x_0||}\]

<p>This essentially means that the drive frequency should be less than half the fluid velocity divided by the distance between the sensors.
Though I haven’t fully worked out the details, I am speculating that this is due to Shannon’s sampling theorem.
If the drive frequency is too high compared to the fluid velocity, we’ll run into aliasing issues, which could throw off the measurements.</p>
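
<p>The bound itself is easy to evaluate. The sketch below uses hypothetical helper names and made-up numbers. It also prints the quantity $\omega_d \, ||x_1| - |x_0|| / v$, which is roughly the phase lag between the sensors when advection dominates diffusion; at the bound it equals $\pi$, where the principal value of a complex logarithm would wrap. That reading is only a guess and is not worked out here.</p>

```python
import math


def max_drive_frequency(v, x0, x1):
    """Upper bound on f_d (Hz) from 2 * f_d being below v / ||x1| - |x0||."""
    return v / (2.0 * abs(abs(x1) - abs(x0)))


def advective_phase_lag(f_d, v, x0, x1):
    """omega_d * ||x1| - |x0|| / v, in radians (advection-dominated estimate)."""
    return 2.0 * math.pi * f_d * abs(abs(x1) - abs(x0)) / v


# Hypothetical flow speed (m/s) and sensor positions (m).
v, x0, x1 = 0.05, 0.01, 0.03
f_max = max_drive_frequency(v, x0, x1)
print(f_max)                                  # bound on the drive frequency, Hz
print(advective_phase_lag(f_max, v, x0, x1))  # equals pi at the bound
```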

<h2 id="real-world-imperfections">Real-World Imperfections</h2>

<p>The model I’ve used assumes everything is happening in one dimension, but in real life, the system could be more complex. There could be heat dissipating into the environment, non-uniform fluid flow, or other external factors affecting the results. These real-world complications would introduce some uncertainty into the measurements.</p>

<p>I’m not an expert in sensor design, but the theory behind it is sound, and with proper calibration, it seems possible to make this work. However, it’s worth noting that further work would be needed to handle the real-world deviations from the idealized model.</p>

<hr />

<h1 id="conclusion">Conclusion</h1>

<p>Using temperature measurements to estimate the velocity and thermal diffusivity of a flowing fluid is doable in theory, even though the practical aspects (like sensor design) might be trickier. By leveraging the 1-D heat equation with some Fourier analysis, we can get reasonable estimates for $v$ and $\alpha$. If nothing else, it’s a fun application of physics!</p>]]></content><author><name></name></author><category term="fluid dynamics" /><category term="heat equation" /><category term="math" /><category term="modeling" /><summary type="html"><![CDATA[A theoretical exploration on estimating fluid velocity and thermal diffusivity using temperature measurements.]]></summary></entry><entry><title type="html">Estimating Fluid Velocity and Diffusion from Temperature Measurements (Part 2, Simulation)</title><link href="https://camyang.com/blog/2024/fluid-sensing-part2/" rel="alternate" type="text/html" title="Estimating Fluid Velocity and Diffusion from Temperature Measurements (Part 2, Simulation)" /><published>2024-08-15T00:00:00+00:00</published><updated>2024-08-15T00:00:00+00:00</updated><id>https://camyang.com/blog/2024/fluid-sensing-part2</id><content type="html" xml:base="https://camyang.com/blog/2024/fluid-sensing-part2/"><![CDATA[<h1 id="introduction">Introduction</h1>




<div
  class="jupyter-notebook"
  style="position: relative; width: 100%; margin: 0 auto;">
  <div class="jupyter-notebook-iframe-container">
    <iframe
      src="/assets/jupyter/freq_domain_direct_estimation.ipynb.html"
      style="position: absolute; top: 0; left: 0; border-style: none;"
      width="100%"
      height="100%"
      onload="this.parentElement.style.paddingBottom = (this.contentWindow.document.documentElement.scrollHeight + 10) + 'px'"></iframe>
  </div>
</div>]]></content><author><name></name></author><category term="blog" /><category term="fluid dynamics" /><category term="heat equation" /><category term="simulation" /><category term="python" /><category term="jax" /><category term="modeling" /><summary type="html"><![CDATA[Introduction]]></summary></entry><entry><title type="html">Torque Analysis of a Motorized Filament Rewinder</title><link href="https://camyang.com/blog/2023/motorized-rewinder-analysis/" rel="alternate" type="text/html" title="Torque Analysis of a Motorized Filament Rewinder" /><published>2023-12-10T00:00:00+00:00</published><updated>2023-12-10T00:00:00+00:00</updated><id>https://camyang.com/blog/2023/motorized-rewinder-analysis</id><content type="html" xml:base="https://camyang.com/blog/2023/motorized-rewinder-analysis/"><![CDATA[<h1 id="introduction">Introduction</h1>

<p>It has long been a goal of mine to build a motorized filament spool holder for my multi-material 3D printer. The idea is to have a motorized spool holder that can automatically rewind the unused filament back to its spool after a filament swap, so that the idling filament doesn’t get tangled mid-print.
There are various attempts in the open-source 3D printing community at building such a device, and I also made a simple prototype a while ago:</p>

<div class="row mt-3">
    <div class="col-sm mt-3 mt-md-0">
        



<figure>
  <picture>
    <!-- Auto scaling with imagemagick -->
    <!--
      See https://www.debugbear.com/blog/responsive-images#w-descriptors-and-the-sizes-attribute and
      https://developer.mozilla.org/en-US/docs/Learn/HTML/Multimedia_and_embedding/Responsive_images for info on defining 'sizes' for responsive images
    -->
    
      
        <source class="responsive-img-srcset" srcset="/assets/img/motorized_rewinder_1-480.webp 480w,/assets/img/motorized_rewinder_1-800.webp 800w,/assets/img/motorized_rewinder_1-1400.webp 1400w," type="image/webp" sizes="95vw" />
      
    
    <img src="/assets/img/motorized_rewinder_1.jpg" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();" />
  </picture>

  
</figure>

    </div>
    <div class="col-sm mt-3 mt-md-0">
        



<figure>
  <picture>
    <!-- Auto scaling with imagemagick -->
    <!--
      See https://www.debugbear.com/blog/responsive-images#w-descriptors-and-the-sizes-attribute and
      https://developer.mozilla.org/en-US/docs/Learn/HTML/Multimedia_and_embedding/Responsive_images for info on defining 'sizes' for responsive images
    -->
    
      
        <source class="responsive-img-srcset" srcset="/assets/img/motorized_rewinder_2-480.webp 480w,/assets/img/motorized_rewinder_2-800.webp 800w,/assets/img/motorized_rewinder_2-1400.webp 1400w," type="image/webp" sizes="95vw" />
      
    
    <img src="/assets/img/motorized_rewinder_2.jpg" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();" />
  </picture>

  
</figure>

    </div>
</div>
<div class="caption">
    Renderings of my motorized filament rewinder design. The motor is hidden inside the front roller.
</div>

<p>However, one of the key challenges in building such a device is to estimate the torque required to rewind the filament back to the spool.
This estimate is essential for choosing the appropriate motor and drive mechanism (e.g., diameter of the drive roller) for the rewinder.</p>

<p>In this post, I’ll analyze the torque requirements for a motorized filament rewinder and discuss the key factors that affect the torque.</p>

<p><em>Disclaimer: The analysis is based on a <a href="https://en.wikipedia.org/wiki/Spherical_cow">spherical-cow model</a> and may not capture all the complexities of the real-world rewinder. However, my hope is that it should provide a good starting point for my motorized filament rewinder.</em></p>

<h1 id="torque-analysis">Torque Analysis</h1>

<p>To carry out the torque analysis, I need to model the rewinding process.
My analysis here will ignore any friction that is outside the scope of the rewinder itself (e.g., friction in the filament path, air resistance, etc.).
The analysis will focus on the mass of the filament spool and the force required to rewind the filament back to the spool at a certain linear acceleration.</p>

<p>Thus, I consider the following disassembly of a typical filament spool to model the mass distribution of the spool:</p>

<div class="row justify-content-md-center">
      

<figure>
  
    <video src="/assets/video/spool_disassembly.webm" class="img-fluid rounded z-depth-1 mx-auto d-block" width="50%" height="auto" autoplay="" loop="" />

  
  
    <figcaption class="caption">Disassembly of a typical filament spool. </figcaption>
  
</figure>

</div>

<p>The disassembly consists of four parts:</p>

<ol>
  <li>The spool core</li>
  <li>Two spool disks</li>
  <li>The filament</li>
</ol>

<p>It happens that both the spool core and the filament are <em>hollow cylinders</em>.
With a mild approximation by ignoring the patterns on the disks, we can view the disks as “hollow cylinders” as well — the approximation is quite valid given that the disks are thin and weigh little in the entire spool.</p>

<p>The key to the analysis is to compute the moment of inertia of each part of the spool.
The moment of inertia of a hollow cylinder is given by:</p>

\[I = \frac{1}{2} m (r_1^2 + r_2^2)\]

<p>where $m$ is the mass of the cylinder, $r_1$ is the inner radius, and $r_2$ is the outer radius.</p>

<p>So I take out calipers and a kitchen scale to measure the dimensions and mass of these spool components:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spool_hole_radius</span> <span class="o">=</span> <span class="mf">54.7</span> <span class="o">/</span> <span class="mi">2</span> <span class="o">*</span> <span class="mf">1e-3</span> <span class="c1"># m
</span><span class="n">spool_rim_radius</span> <span class="o">=</span> <span class="mf">100e-3</span>           <span class="c1"># m
</span><span class="n">spool_disk_weight</span> <span class="o">=</span> <span class="mf">53e-3</span>       <span class="c1"># kg
</span><span class="n">spool_core_weight</span> <span class="o">=</span> <span class="mf">44e-3</span>           <span class="c1"># kg
</span><span class="n">spool_core_thickness</span> <span class="o">=</span> <span class="mf">3.5e-3</span>       <span class="c1"># m
</span><span class="n">full_spool_weight</span> <span class="o">=</span> <span class="mi">1</span>               <span class="c1"># kg
</span></code></pre></div></div>

<p>The outer radius of the filament component changes as filament is used up.
Assuming the filament is wound uniformly, we can compute this radius as a function of the remaining filament weight:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">min_filament_radius</span> <span class="o">=</span> <span class="n">spool_hole_radius</span> <span class="o">+</span> <span class="n">spool_core_thickness</span>
<span class="n">max_filament_radius</span> <span class="o">=</span> <span class="n">spool_rim_radius</span>  <span class="c1"># a full spool is assumed to be wound out to the rim</span>
<span class="k">def</span> <span class="nf">filament_radius_from_weight</span><span class="p">(</span><span class="n">filament_weight</span><span class="p">:</span> <span class="nb">float</span><span class="p">):</span>
  <span class="sh">"""</span><span class="s">Given the weight of the filament on the spool, returns the radius of the filament that is left on the spool.</span><span class="sh">"""</span>
  <span class="k">return</span> <span class="n">min_filament_radius</span> <span class="o">+</span> <span class="p">(</span><span class="n">max_filament_radius</span> <span class="o">-</span> <span class="n">min_filament_radius</span><span class="p">)</span> <span class="o">*</span> <span class="n">filament_weight</span> <span class="o">/</span> <span class="n">full_spool_weight</span>
</code></pre></div></div>

<p>The overall moment of inertia of the spool is the sum of the moments of inertia of the core, the two disks, and the filament.
It is a function of the weight of the filament on the spool:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">moments_of_inertia</span><span class="p">(</span><span class="n">filament_weight</span><span class="p">:</span> <span class="nb">float</span><span class="p">):</span>
  <span class="sh">"""</span><span class="s">
  Returns the moments of inertia (kg m^2) of the spool and filament together.
  </span><span class="sh">"""</span>
  <span class="c1"># the MOI of one side disk
</span>  <span class="n">disk_moi</span> <span class="o">=</span> <span class="mi">1</span><span class="o">/</span><span class="mi">2</span> <span class="o">*</span> <span class="n">spool_disk_weight</span> <span class="o">*</span> <span class="p">(</span><span class="n">spool_rim_radius</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">spool_hole_radius</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span>
  <span class="c1"># the MOI of the center ring
</span>  <span class="n">core_moi</span> <span class="o">=</span> <span class="mi">1</span><span class="o">/</span><span class="mi">2</span> <span class="o">*</span> <span class="n">spool_core_weight</span> <span class="o">*</span> <span class="p">((</span><span class="n">spool_hole_radius</span> <span class="o">+</span> <span class="n">spool_core_thickness</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">spool_hole_radius</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span>
  <span class="c1"># filament MOI
</span>  <span class="n">filament_moi</span> <span class="o">=</span> <span class="mi">1</span><span class="o">/</span><span class="mi">2</span> <span class="o">*</span> <span class="n">filament_weight</span> <span class="o">*</span> <span class="p">(</span><span class="nf">filament_radius_from_weight</span><span class="p">(</span><span class="n">filament_weight</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">min_filament_radius</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span>
  <span class="c1"># total MOI
</span>  <span class="k">return</span> <span class="n">disk_moi</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">+</span> <span class="n">core_moi</span> <span class="o">+</span> <span class="n">filament_moi</span>
</code></pre></div></div>

<p>Finally, we compute the torque required to rewind the filament, at a given filament weight and at a certain linear acceleration.
The key equation is:</p>

\[\tau = \frac{r_{\text{roller}}}{r_{\text{rim}} r_{\text{filament}} } I a\]

<p>where $\tau$ is the torque, $I$ is the moment of inertia, $a$ is the linear acceleration, $r_{\text{roller}}$ is the radius of the roller, $r_{\text{rim}}$ is the radius of the spool rim, and $r_{\text{filament}}$ is the radius of the filament on the spool.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">torque_for_acceleration</span><span class="p">(</span><span class="n">filament_weight</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span> <span class="n">acceleration</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span> <span class="n">roller_radius</span><span class="p">:</span> <span class="nb">float</span><span class="p">):</span>
  <span class="sh">"""</span><span class="s">
  Returns the torque (N·m) at the drive roller that is required to accelerate the filament at the given unload acceleration (m/s^2)
  </span><span class="sh">"""</span>
  <span class="n">filament_radius</span> <span class="o">=</span> <span class="nf">filament_radius_from_weight</span><span class="p">(</span><span class="n">filament_weight</span><span class="p">)</span>
  <span class="n">angular_acceleration</span> <span class="o">=</span> <span class="n">acceleration</span> <span class="o">/</span> <span class="n">filament_radius</span> <span class="c1"># rad/s^2
</span>  <span class="n">torque</span> <span class="o">=</span> <span class="nf">moments_of_inertia</span><span class="p">(</span><span class="n">filament_weight</span><span class="p">)</span> <span class="o">*</span> <span class="n">angular_acceleration</span>
  <span class="k">return</span> <span class="n">torque</span> <span class="o">*</span> <span class="n">roller_radius</span> <span class="o">/</span> <span class="n">spool_rim_radius</span>
</code></pre></div></div>

<h1 id="plotting">Plotting</h1>

<p>I can now plot the torque required to rewind the filament at a certain acceleration, given the weight of the filament on the spool.
The ideal acceleration I want to achieve is around 300 mm/s^2, and the roller radius in my rewinder design is 26 mm.
Plugging these values into the above function and plotting over the range of filament weights, I get the following torque curve:</p>

<div class="row mt-3">
    <div class="col-sm mt-3 mt-md-0">
        



<figure>
  <picture>
    <!-- Auto scaling with imagemagick -->
    <!--
      See https://www.debugbear.com/blog/responsive-images#w-descriptors-and-the-sizes-attribute and
      https://developer.mozilla.org/en-US/docs/Learn/HTML/Multimedia_and_embedding/Responsive_images for info on defining 'sizes' for responsive images
    -->
    
      
        <source class="responsive-img-srcset" srcset="/assets/jupyter/torque_curve-480.webp 480w,/assets/jupyter/torque_curve-800.webp 800w,/assets/jupyter/torque_curve-1400.webp 1400w," type="image/webp" sizes="95vw" />
      
    
    <img src="/assets/jupyter/torque_curve.png" class="img-fluid rounded z-depth-1 mx-auto d-block" width="50%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();" />
  </picture>

  
</figure>

    </div>
</div>

<p>So now I know that with my current design, I need a motor that can provide at least ~0.05 kg·cm of torque to rewind the filament!</p>
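
<p>For reference, here is a self-contained sketch that reruns the torque calculation end to end and converts the result to kg·cm. It reuses the measurements quoted earlier and assumes that a full spool is wound out to the rim radius; the helper names <code>hollow_cylinder_moi</code>, <code>roller_torque</code>, and <code>to_kgf_cm</code> are introduced here for illustration and are not taken from the notebook.</p>

```python
# Measurements quoted in the post (SI units).
spool_hole_radius = 54.7 / 2 * 1e-3   # m
spool_rim_radius = 100e-3             # m
spool_disk_weight = 53e-3             # kg, one disk
spool_core_weight = 44e-3             # kg
spool_core_thickness = 3.5e-3         # m
full_spool_weight = 1.0               # kg of filament on a full spool

min_filament_radius = spool_hole_radius + spool_core_thickness
max_filament_radius = spool_rim_radius  # assumption: a full spool reaches the rim


def hollow_cylinder_moi(mass, r_inner, r_outer):
    """Moment of inertia (kg m^2) of a hollow cylinder about its axis."""
    return 0.5 * mass * (r_inner**2 + r_outer**2)


def filament_radius(weight):
    """Outer radius of the wound filament, interpolated linearly in weight."""
    span = max_filament_radius - min_filament_radius
    return min_filament_radius + span * weight / full_spool_weight


def roller_torque(weight, acceleration, roller_radius):
    """Torque (N m) at the drive roller for a given linear acceleration (m/s^2)."""
    moi = (2 * hollow_cylinder_moi(spool_disk_weight, spool_hole_radius, spool_rim_radius)
           + hollow_cylinder_moi(spool_core_weight, spool_hole_radius, min_filament_radius)
           + hollow_cylinder_moi(weight, min_filament_radius, filament_radius(weight)))
    spool_torque = moi * acceleration / filament_radius(weight)
    return spool_torque * roller_radius / spool_rim_radius


def to_kgf_cm(torque_nm):
    """Convert N m to kgf cm (1 kgf = 9.80665 N)."""
    return torque_nm / 9.80665 * 100.0


# 300 mm/s^2 acceleration, 26 mm roller, filament weight swept from 0 to 1 kg.
weights = [i / 20 for i in range(21)]
peak = max(to_kgf_cm(roller_torque(w, 0.3, 26e-3)) for w in weights)
print(peak)
```

<p>Under these assumptions the peak lands just under 0.05 kg·cm and occurs at a full spool, where both the moment of inertia and the filament radius are largest.</p>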

<p>I can now also find out the range of roller radii given a particular motor torque and speed.
The speed of the motor is a function of the linear speed of the filament, the radius of the roller, and the radius of the filament on the spool:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">motor_speed</span><span class="p">(</span><span class="n">filament_weight</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span> <span class="n">speed</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span> <span class="n">roller_radius</span><span class="p">:</span> <span class="nb">float</span><span class="p">):</span>
  <span class="n">filament_radius</span> <span class="o">=</span> <span class="nf">filament_radius_from_weight</span><span class="p">(</span><span class="n">filament_weight</span><span class="p">)</span>
  <span class="n">spool_rotational_speed</span> <span class="o">=</span> <span class="n">speed</span> <span class="o">/</span> <span class="n">filament_radius</span>
  <span class="k">return</span> <span class="n">spool_rotational_speed</span> <span class="o">*</span> <span class="n">spool_rim_radius</span> <span class="o">/</span> <span class="n">roller_radius</span>
</code></pre></div></div>

<p>With the above, we can plot both the required motor torque and the motor speed for a given roller radius:</p>

<div class="row mt-3">
    <div class="col-sm mt-3 mt-md-0">
        



<figure>
  <picture>
    <!-- Auto scaling with imagemagick -->
    <!--
      See https://www.debugbear.com/blog/responsive-images#w-descriptors-and-the-sizes-attribute and
      https://developer.mozilla.org/en-US/docs/Learn/HTML/Multimedia_and_embedding/Responsive_images for info on defining 'sizes' for responsive images
    -->
    
      
        <source class="responsive-img-srcset" srcset="/assets/jupyter/motor_curve-480.webp 480w,/assets/jupyter/motor_curve-800.webp 800w,/assets/jupyter/motor_curve-1400.webp 1400w," type="image/webp" sizes="95vw" />
      
    
    <img src="/assets/jupyter/motor_curve.png" class="img-fluid rounded z-depth-1 mx-auto d-block" width="50%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();" />
  </picture>

  
</figure>

    </div>
</div>
<p>So for example, if the motor is rated at 0.035 kg·cm of torque at 500 RPM, I should be able to use a roller radius between ~17 mm and ~19 mm.</p>

<p>The code for this analysis is available in this <a href="https://colab.research.google.com/drive/1xIS6e5FFpnpZi6lFIzlSu9HtIOZoFKJn?usp=sharing">Google Colab notebook</a>.</p>]]></content><author><name></name></author><category term="3d-print" /><category term="torque" /><category term="motorized rewinder" /><category term="images" /><category term="links" /><category term="modeling" /><summary type="html"><![CDATA[An analysis of the torque required to rewind a filament spool using a motorized filament spool holder.]]></summary></entry><entry><title type="html">Enumerating Context-Free Languages and Minimizing Regular Expressions</title><link href="https://camyang.com/blog/2021/enumerating-cfgs-and-minimizing-regular-expressions/" rel="alternate" type="text/html" title="Enumerating Context-Free Languages and Minimizing Regular Expressions" /><published>2021-12-01T00:00:00+00:00</published><updated>2021-12-01T00:00:00+00:00</updated><id>https://camyang.com/blog/2021/enumerating-cfgs-and-minimizing-regular-expressions</id><content type="html" xml:base="https://camyang.com/blog/2021/enumerating-cfgs-and-minimizing-regular-expressions/"><![CDATA[<p>As I work on machine learning algorithms for combinatorial-optimization problems like compiler optimization,
one vastly simplified instance of this class of problems is <strong>learning to minimize regular expressions</strong>.
The problem is to learn a function that takes a regular expression as input and outputs a minimal equivalent regular expression that describes the same language.
Since this is machine learning, a good starting point is a dataset of regular expressions and their minimal equivalents that can be used directly for supervised learning.
To that end, this post describes my approach to generate a dataset of regular expressions and their minimal equivalents.</p>

<p>The key steps are:</p>

<ol>
  <li><strong>Enumerating all regular expressions</strong> up to a certain length $n$.</li>
  <li><strong>Finding the minimal equivalent</strong> for each regular expression by leveraging DFA minimization and hashing.</li>
</ol>

<hr />

<h3 id="background-on-regular-expressions-and-equivalence">Background on Regular Expressions and Equivalence</h3>

<p>A <strong>regular expression</strong> over an alphabet $\Sigma$ is a symbolic representation of a regular language, using operations like concatenation, union, and Kleene star. For example, the expression $(a|b)^*$ represents the language of all strings consisting of any number of $a$’s and $b$’s.</p>

<p>Mathematically, the set of all regular expressions over $\Sigma$ can be recursively defined as follows:</p>

<ul>
  <li>The empty set $\emptyset$, the empty string $\epsilon$, and any single character $a \in \Sigma$ are regular expressions.</li>
  <li>If $r_1$ and $r_2$ are regular expressions, then $r_1r_2$ (concatenation), $r_1 | r_2$ (union), and $r_1^*$ (Kleene star) are also regular expressions.</li>
</ul>

<p>The <strong>equivalence</strong> of two regular expressions $r_1$ and $r_2$ means that they describe the same language:</p>

\[L(r_1) = L(r_2)\]

<p>That is, the set of strings accepted by $r_1$ is identical to that accepted by $r_2$. Unlike general program equivalence (which is undecidable), regular expression equivalence is decidable, making it a good candidate for minimization tasks.</p>
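
<p>To make the definition concrete, here is a small sketch that compares two expressions on every string up to a fixed length, using Python’s built-in <code>re</code> module. It is only a bounded check, not a decision procedure, and the helper name <code>agree_up_to</code> is made up for this illustration.</p>

```python
import itertools
import re


def agree_up_to(r1: str, r2: str, alphabet: str = "ab", max_len: int = 6):
    """Compare two regexes on all strings over `alphabet` up to `max_len`.

    Returns (True, None) if they accept the same strings, otherwise
    (False, s) where s is the first string on which they disagree.
    """
    p1, p2 = re.compile(r1), re.compile(r2)
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = "".join(chars)
            if (p1.fullmatch(s) is None) != (p2.fullmatch(s) is None):
                return False, s
    return True, None


print(agree_up_to("(a|b)*", "(a*b*)*"))  # same language: no disagreement found
print(agree_up_to("(a|b)*", "a*b*"))     # different: "ba" is a counterexample
```

<p>A bounded check like this can only refute equivalence. Confirming it requires an exact method, such as comparing minimized DFAs.</p>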

<p>I encountered this problem while working on generating a dataset of regular expressions and their minimal versions. This post describes the methods I used to achieve that goal.</p>

<hr />

<h3 id="step-1-enumerating-regular-expressions">Step 1: Enumerating Regular Expressions</h3>

<p>The first step is to systematically generate all regular expressions up to a certain length $n$. This is a challenging combinatorial problem because the space of regular expressions grows exponentially with length.</p>

<p>To efficiently enumerate these expressions, we can represent them using a <strong>context-free grammar (CFG)</strong>. A CFG provides a formal mechanism to define the structure of regular expressions through production rules. For instance, a simplified CFG for regular expressions could look like this:</p>

\[S \rightarrow S + S \, | \, SS \, | \, S^* \, | \, (S) \, | \, a \, | \, b\]

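<p>where $S$ is a non-terminal symbol representing a regular expression, and $a, b$ are terminal symbols (characters from the alphabet). Union is written $+$ here rather than $|$, because $|$ already separates the alternatives of the production.</p>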
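
<p>One way to write this grammar down as plain data is sketched below. The names <code>PRODUCTIONS</code> and <code>summarize</code> are assumptions made for illustration. For each production, the sketch counts the terminals it contributes and the non-terminals that still need to be expanded, which is the information a size-bounded enumeration needs.</p>

```python
# Terminals are plain strings; the single non-terminal is the string "S".
NON_TERMINALS = {"S"}

# Each production is the tuple of symbols on its right-hand side.
PRODUCTIONS = {
    "S": [
        ("S", "+", "S"),  # union
        ("S", "S"),       # concatenation
        ("S", "*"),       # Kleene star
        ("(", "S", ")"),  # grouping
        ("a",),
        ("b",),
    ]
}


def summarize(rule):
    """Return (number of terminals, number of non-terminals) in a rule."""
    num_non_terminals = sum(1 for symbol in rule if symbol in NON_TERMINALS)
    return len(rule) - num_non_terminals, num_non_terminals


summaries = [summarize(rule) for rule in PRODUCTIONS["S"]]
```

<p>For example, the union rule contributes one terminal and leaves two occurrences of $S$ to expand.</p>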

<p>The key insight here is that we can enumerate all regular expressions up to a fixed length by expanding these CFG rules recursively.
This technique is based on <a href="https://arxiv.org/abs/1204.4982">Berstel and Brzozowski (2012)</a>, which provides a framework for enumerating regular expressions from a context-free grammar that defines them.</p>

<p>Formally, let $G = (V, \Sigma, P, S)$ be a CFG, where:</p>

<ul>
  <li>$V$ is the set of non-terminal symbols,</li>
  <li>$\Sigma$ is the set of terminal symbols (our alphabet),</li>
  <li>$P$ is the set of production rules,</li>
  <li>$S$ is the start symbol.</li>
</ul>

<p>The goal is to generate all strings in $L(G)$ (the language of the grammar) that have a length $\leq n$.
This is essentially done by recursively applying the production rules until we reach strings of terminal symbols.</p>

<p>Here’s the Python code implementing this recursive expansion:</p>

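<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">dataclasses</span>
<span class="kn">import</span> <span class="n">itertools</span>
<span class="kn">from</span> <span class="n">typing</span> <span class="kn">import</span> <span class="n">Iterator</span><span class="p">,</span> <span class="n">List</span>

<span class="c1"># NonTerminal, Production, String, EnumerateCFGInfo and partition are defined elsewhere (not shown).</span>
<span class="nd">@dataclasses.dataclass</span><span class="p">(</span><span class="n">frozen</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>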
<span class="k">class</span> <span class="nc">CFG</span><span class="p">:</span>
    <span class="n">start</span><span class="p">:</span> <span class="n">NonTerminal</span>
    <span class="n">productions</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">Production</span><span class="p">]</span>

<span class="k">def</span> <span class="nf">enumerate_cfg</span><span class="p">(</span><span class="n">cfg_info</span><span class="p">:</span> <span class="n">EnumerateCFGInfo</span><span class="p">,</span> <span class="n">size</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Iterator</span><span class="p">[</span><span class="n">String</span><span class="p">]:</span>
    <span class="sh">"""</span><span class="s">Enumerates all regular expressions up to a fixed length for a CFG.</span><span class="sh">"""</span>
    <span class="k">def</span> <span class="nf">expand_rec</span><span class="p">(</span><span class="n">symb</span><span class="p">:</span> <span class="n">NonTerminal</span><span class="p">,</span> <span class="n">size</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Iterator</span><span class="p">[</span><span class="n">String</span><span class="p">]:</span>
        <span class="k">if</span> <span class="n">size</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
            <span class="k">if</span> <span class="n">symb</span> <span class="ow">in</span> <span class="n">cfg_info</span><span class="p">.</span><span class="n">empty_non_terminals</span><span class="p">:</span>
                <span class="k">yield</span> <span class="nf">tuple</span><span class="p">()</span>
            <span class="k">return</span>
        <span class="k">for</span> <span class="n">rule</span><span class="p">,</span> <span class="n">weight</span><span class="p">,</span> <span class="n">num_non_terminals</span> <span class="ow">in</span> <span class="n">cfg_info</span><span class="p">.</span><span class="n">productions</span><span class="p">[</span><span class="n">symb</span><span class="p">]:</span>
            <span class="n">rem_size</span> <span class="o">=</span> <span class="n">size</span> <span class="o">-</span> <span class="n">weight</span>
            <span class="k">if</span> <span class="n">rem_size</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">:</span>
                <span class="k">for</span> <span class="n">ns</span> <span class="ow">in</span> <span class="nf">partition</span><span class="p">(</span><span class="n">rem_size</span><span class="p">,</span> <span class="n">num_non_terminals</span><span class="p">):</span>
                    <span class="k">yield</span> <span class="k">from</span> <span class="n">itertools</span><span class="p">.</span><span class="nf">product</span><span class="p">(</span><span class="o">*</span><span class="p">[</span><span class="nf">expand_rec</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="k">for</span> <span class="n">s</span><span class="p">,</span> <span class="n">n</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">rule</span><span class="p">,</span> <span class="n">ns</span><span class="p">)])</span>
    <span class="k">return</span> <span class="nf">expand_rec</span><span class="p">(</span><span class="n">cfg_info</span><span class="p">.</span><span class="n">grammar</span><span class="p">.</span><span class="n">start</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span>
</code></pre></div></div>

<p>This code systematically enumerates all possible regular expressions up to length $n$, using the grammar $G$.</p>
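
<p>The snippet above depends on helpers that are not shown, so here is a self-contained sketch of the same idea, specialized to the grammar from this section. It is not the original implementation. In particular, I am assuming that <code>partition</code> splits a size into an ordered sum of positive parts, which is what keeps the recursion finite for the rule $S \rightarrow SS$, and <code>expressions_of_size</code> is a hypothetical name.</p>

```python
import itertools
from functools import lru_cache

RULES = [("S", "+", "S"), ("S", "S"), ("S", "*"), ("(", "S", ")"), ("a",), ("b",)]


def partition(total, parts):
    """Yield every way to write `total` as an ordered sum of `parts` positive integers."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(1, total - parts + 2):
        for rest in partition(total - first, parts - 1):
            yield (first,) + rest


@lru_cache(maxsize=None)
def expressions_of_size(size):
    """Return all strings of exactly `size` symbols derivable from S (with duplicates)."""
    results = []
    for rule in RULES:
        slots = [i for i, symbol in enumerate(rule) if symbol == "S"]
        remaining = size - (len(rule) - len(slots))  # symbols left for the S slots
        if not slots:
            if remaining == 0:
                results.append("".join(rule))
            continue
        for sizes in partition(remaining, len(slots)):
            choices = [expressions_of_size(n) for n in sizes]
            for combo in itertools.product(*choices):
                filled = list(rule)
                for slot, text in zip(slots, combo):
                    filled[slot] = text
                results.append("".join(filled))
    return tuple(results)


size_two = expressions_of_size(2)
```

<p>Because the grammar is ambiguous, the same string can be derived in several ways (for instance, a concatenation of three characters can be split in two places), so wrapping the result in <code>set(...)</code> is a cheap way to drop duplicates. Calling the function for sizes $1, 2, \dots, n$ in turn yields the expressions in order of increasing length.</p>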

<hr />

<h3 id="step-2-minimizing-regular-expressions-using-dfa-and-hashing">Step 2: Minimizing Regular Expressions Using DFA and Hashing</h3>

<p>Once we can enumerate regular expressions, the next step is to find the <strong>minimal equivalent</strong> for each expression. This is done by:</p>

<ol>
  <li><strong>Converting the regular expression into a DFA (deterministic finite automaton)</strong>, which is a formal model for recognizing regular languages.</li>
  <li><strong>Minimizing the DFA</strong> to obtain the smallest automaton that recognizes the same language as the original expression. DFA minimization merges states that no input string can tell apart, and the result is unique up to renaming of states (<a href="https://en.wikipedia.org/wiki/DFA_minimization">DFA Minimization</a>).</li>
  <li><strong>Hashing the minimized DFA</strong> to uniquely identify the language of the expression.</li>
</ol>

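<p>In step 3, hashing the minimal DFA ensures that equivalent regular expressions (which describe the same language) get the same hash, provided the hash is computed on a canonical form, since the minimal DFA is unique only up to renaming of states.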
In practice, I used the <a href="https://ics.uci.edu/~eppstein/PADS/"><strong>PADS library</strong></a> for handling DFAs and regular languages.
<a href="https://github.com/thisiscam/PADS">Here</a> is my fork of the library that supports hashing the DFAs.</p>
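
<p>To make the minimize-then-hash idea concrete without depending on PADS, here is a minimal sketch. It is not the PADS API, and all names in it (<code>canonical_key</code>, <code>language_hash</code>, the <code>delta</code> dictionary layout) are assumptions. It takes a complete DFA, merges equivalent states with Moore-style partition refinement, and renumbers the resulting states in breadth-first order so that isomorphic automata produce the same key.</p>

```python
import hashlib
from collections import deque


def canonical_key(alphabet, start, accepting, delta):
    """Return a canonical, hashable description of the minimal DFA.

    `delta` maps (state, symbol) to the next state and must be total.
    """
    symbols = sorted(alphabet)

    # 1. Keep only the states reachable from the start state.
    reachable, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for a in symbols:
            nxt = delta[(state, a)]
            if nxt not in reachable:
                reachable.add(nxt)
                queue.append(nxt)

    # 2. Partition refinement: start from accepting / non-accepting and split
    #    blocks until states in a block agree on their successor blocks.
    block = {s: int(s in accepting) for s in reachable}
    while True:
        signature = {
            s: (block[s],) + tuple(block[delta[(s, a)]] for a in symbols)
            for s in reachable
        }
        ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        refined = {s: ids[signature[s]] for s in reachable}
        stable = len(set(refined.values())) == len(set(block.values()))
        block = refined
        if stable:
            break

    # 3. Renumber blocks in BFS order from the start block, visiting symbols
    #    in sorted order, so the numbering does not depend on state names.
    rep = {}
    for s in reachable:
        rep.setdefault(block[s], s)  # any representative of each block
    order, queue, rows = {block[start]: 0}, deque([block[start]]), []
    while queue:
        b = queue.popleft()
        row = []
        for a in symbols:
            nb = block[delta[(rep[b], a)]]
            if nb not in order:
                order[nb] = len(order)
                queue.append(nb)
            row.append(order[nb])
        rows.append((rep[b] in accepting, tuple(row)))
    return (tuple(symbols), tuple(rows))


def language_hash(alphabet, start, accepting, delta):
    """Hash the canonical form, so equal languages give equal digests."""
    key = canonical_key(alphabet, start, accepting, delta)
    return hashlib.sha256(repr(key).encode()).hexdigest()
```

<p>Two DFAs for the same language, even with different state names or redundant states, then map to the same digest, while the canonical key itself can be compared directly if hash collisions are a concern.</p>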

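<p>Here is the Python-style pseudocode for this algorithm. It relies on the expressions being enumerated in order of increasing length, so that the first expression seen for a given hash is a shortest one for that language:</p>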

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_dataset</span><span class="p">(</span><span class="n">max_length</span><span class="p">,</span> <span class="n">alphabet</span><span class="p">):</span>
    <span class="n">minimals</span> <span class="o">=</span> <span class="p">{}</span>  <span class="c1"># Maps the hash of a minimal DFA to the first expression seen for it
</span>    <span class="k">for</span> <span class="n">expr</span> <span class="ow">in</span> <span class="nf">enumerate_regular_expressions</span><span class="p">(</span><span class="n">max_length</span><span class="p">):</span>
        <span class="n">M</span> <span class="o">=</span> <span class="nf">convert_to_dfa</span><span class="p">(</span><span class="n">expr</span><span class="p">)</span>  <span class="c1"># Convert the regular expression to DFA
</span>        <span class="n">h</span> <span class="o">=</span> <span class="nf">hash</span><span class="p">(</span><span class="nf">minimize_dfa</span><span class="p">(</span><span class="n">M</span><span class="p">))</span>  <span class="c1"># Hash of the minimized DFA
</span>
        <span class="k">if</span> <span class="n">h</span> <span class="ow">in</span> <span class="n">minimals</span><span class="p">:</span>
            <span class="n">m</span> <span class="o">=</span> <span class="n">minimals</span><span class="p">[</span><span class="n">h</span><span class="p">]</span>  <span class="c1"># Retrieve the already stored minimal expression
</span>        <span class="k">else</span><span class="p">:</span>
            <span class="n">minimals</span><span class="p">[</span><span class="n">h</span><span class="p">]</span> <span class="o">=</span> <span class="n">expr</span>  <span class="c1"># Store the current expression as minimal
</span>            <span class="n">m</span> <span class="o">=</span> <span class="n">expr</span>

        <span class="k">yield</span> <span class="n">expr</span><span class="p">,</span> <span class="n">m</span>  <span class="c1"># Return the pair of original and minimal expression
</span></code></pre></div></div>
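
<p>The pairing logic can be tried out in isolation with a stand-in for the DFA hash. The sketch below is an approximation under stated assumptions: <code>bounded_fingerprint</code> (a hypothetical helper) identifies a language by the strings of length at most 4 that it contains, using Python's <code>re</code> syntax, so two languages that only differ on longer strings would wrongly collide. A real pipeline would use the hash of the minimal DFA instead.</p>

```python
import itertools
import re


def bounded_fingerprint(expr, alphabet="ab", max_len=4):
    """Approximate a language by the set of short strings it contains."""
    pattern = re.compile(expr)
    return frozenset(
        "".join(chars)
        for n in range(max_len + 1)
        for chars in itertools.product(alphabet, repeat=n)
        if pattern.fullmatch("".join(chars))
    )


def pair_with_shortest(exprs, key):
    """Yield (expr, first expression seen with the same key), shortest first."""
    first_seen = {}
    for expr in sorted(exprs, key=len):  # stable sort: ties keep input order
        yield expr, first_seen.setdefault(key(expr), expr)


pairs = dict(
    pair_with_shortest(
        ["(a|b)*", "a*", "(a*)*", "(a*b*)*", "a*|a*"], bounded_fingerprint
    )
)
```

<p>Here <code>(a*)*</code> and <code>a*|a*</code> are both paired with <code>a*</code>, and <code>(a*b*)*</code> is paired with <code>(a|b)*</code>, because the shorter expression is always visited first.</p>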

<hr />

<h3 id="conclusion">Conclusion</h3>

<p>By combining <strong>CFG-based enumeration</strong> with <strong>DFA minimization</strong>, we can generate a dataset of regular expressions (up to a specified length limit) and their minimal equivalents.
Of course, the size of the dataset grows exponentially with the maximum length, so a machine learning approach that relies on such a dataset is likely only applicable to the toy problems of minimizing short regular expressions.
Nonetheless, I found this approach to be a fun exercise in formal language theory and automata theory, and it provides a good starting point for exploring more complex problems in the future.</p>]]></content><author><name></name></author><category term="cfg," /><category term="formal-languages," /><category term="regular-expressions" /><summary type="html"><![CDATA[As I work on machine learning algorithms for combinatorial-optimization problems like compiler optimization, one vastly simplified version of this class of problems is learning to minimize regular expressions. The problem is to learn a function that takes a regular expression as input and outputs a minimal equivalent regular expression that describes the same language. Since this is machine learning, a good starting point is a dataset of regular expressions and their minimal equivalents that can be used directly for supervised learning. To that end, this post describes my approach to generating a dataset of regular expressions and their minimal equivalents.]]></summary></entry></feed>