<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>4by12 &#187; Mathematics</title>
	<atom:link href="http://4by12.com/blog/archives/category/mathematics/feed" rel="self" type="application/rss+xml" />
	<link>http://4by12.com/blog</link>
	<description>by Guy Gur-Ari</description>
	<lastBuildDate>Sat, 21 Aug 2010 00:01:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>A Short Note on the Poincaré Algebra</title>
		<link>http://4by12.com/blog/archives/131</link>
		<comments>http://4by12.com/blog/archives/131#comments</comments>
		<pubDate>Fri, 22 May 2009 17:03:02 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Physics]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/?p=131</guid>
		<description><![CDATA[As physicists, we learn that the Poincaré algebra has two Casimirs, p_μ p^μ and W_μ W^μ, together describing the mass and spin of a particle. A standard question is then, &#8220;why does the algebra have two Casimirs?&#8221; and the standard answer is, &#8220;because it is a rank 2 algebra, for instance taking {p_0, J_3} as the Cartan&#8221;. Well this seems [...]]]></description>
			<content:encoded><![CDATA[<p>As physicists, we learn that the Poincaré algebra has two Casimirs, <img src='/latexrender/pictures/ab38acaab90d464bdf88fa9c6fbfe333.png' title='p_\mu p^\mu' alt='p_\mu p^\mu' align=absmiddle> and <img src='/latexrender/pictures/59fc9404b7bfe27a4079ac851a4bdbfc.png' title='W_\mu W^\mu' alt='W_\mu W^\mu' align=absmiddle>, together describing the mass and spin of a particle. A standard question is then, &#8220;why does the algebra have two Casimirs?&#8221; and the standard answer is, &#8220;because it is a rank 2 algebra, for instance taking <img src='/latexrender/pictures/b06726df018922e6758f3214ead7da1e.png' title='\left\{ p_0,J_3 \right\}' alt='\left\{ p_0,J_3 \right\}' align=absmiddle> as the Cartan&#8221;. Well this seems wrong, since we can also take <img src='/latexrender/pictures/3e282892fd59f396c2f72049253e93a6.png' title='\left\{p_\mu\right\}_\mu' alt='\left\{p_\mu\right\}_\mu' align=absmiddle> as the Cartan, which is of dimension 4. </p>
<p>It is a standard result that all Cartan subalgebras of a (complex) semisimple Lie algebra have the same dimension, so what&#8217;s going on?</p>
<p>The answer is simple: Poincaré isn&#8217;t a <em>semisimple</em> Lie algebra. Therefore we have to be careful about how to define the rank. First, let&#8217;s see why Poincaré isn&#8217;t semisimple.</p>
<p><b>Definition.</b> If <img src='/latexrender/pictures/b2f5ff47436671b6e533d8dc3614845d.png' title='g' alt='g' align=absmiddle> is a complex Lie algebra, then an <em>ideal</em> in <img src='/latexrender/pictures/b2f5ff47436671b6e533d8dc3614845d.png' title='g' alt='g' align=absmiddle> is a complex subalgebra <img src='/latexrender/pictures/2510c39011c5be704182423e3a695e91.png' title='h' alt='h' align=absmiddle> of <img src='/latexrender/pictures/b2f5ff47436671b6e533d8dc3614845d.png' title='g' alt='g' align=absmiddle> such that, for all <img src='/latexrender/pictures/0f0aec368cf183b239ab385863abd4c1.png' title='X \in g' alt='X \in g' align=absmiddle> and <img src='/latexrender/pictures/d42f7caac220102a2e6a1724081e407e.png' title='H \in h' alt='H \in h' align=absmiddle>, <img src='/latexrender/pictures/ac35b9c3c2753689beabe9a005c524b4.png' title='[X,H] \in h' alt='[X,H] \in h' align=absmiddle>.</p>
<p>The brackets of a Lie algebra can be thought of as a product of two elements in that algebra. Then, an ideal (as always), is a sort of &#8216;zero&#8217;, making anything it multiplies a member of itself (just like <img src='/latexrender/pictures/7d114bf80931553e0acb29417fd29fbb.png' title='x \cdot 0 = 0' alt='x \cdot 0 = 0' align=absmiddle> for any <img src='/latexrender/pictures/9dd4e461268c8034f5c8564e155c67a6.png' title='x' alt='x' align=absmiddle>).</p>
<p><b>Definition.</b> A complex Lie algebra <img src='/latexrender/pictures/b2f5ff47436671b6e533d8dc3614845d.png' title='g' alt='g' align=absmiddle> is called <em>simple</em> if <img src='/latexrender/pictures/3c95306078da63d736c61a0819a7acbd.png' title='dim g \ge 2' alt='dim g \ge 2' align=absmiddle>, and the only ideals in <img src='/latexrender/pictures/b2f5ff47436671b6e533d8dc3614845d.png' title='g' alt='g' align=absmiddle> are <img src='/latexrender/pictures/b2f5ff47436671b6e533d8dc3614845d.png' title='g' alt='g' align=absmiddle> and <img src='/latexrender/pictures/49565e389414292f8fcf95678b9d3ab6.png' title='\left\{0\right\}' alt='\left\{0\right\}' align=absmiddle>.</p>
<p><b>Definition.</b> A complex Lie algebra is called <em>semisimple</em> if it&#8217;s (isomorphic to) a direct sum of simple Lie algebras.</p>
<p>We can now see why the Poincaré algebra isn&#8217;t simple. Translations, namely <img src='/latexrender/pictures/3e282892fd59f396c2f72049253e93a6.png' title='\left\{p_\mu\right\}_\mu' alt='\left\{p_\mu\right\}_\mu' align=absmiddle>, form a basis for an ideal: translations commute, and the commutator of a translation with a Lorentz generator <img src='/latexrender/pictures/5fa13b39f73a0dbe5285a81e9addde23.png' title='M_{\mu \nu}' alt='M_{\mu \nu}' align=absmiddle> is a linear combination of translations. It is also not semisimple: a semisimple Lie algebra has no nonzero abelian ideal, and the translations form exactly such an ideal. One can also picture this by considering the <img src='/latexrender/pictures/7b7281d259c4febeb99ab5f28cc43c74.png' title='SU(2) \times SU(2)' alt='SU(2) \times SU(2)' align=absmiddle> decomposition of the Lorentz subalgebra, then adding translations, which &#8216;link&#8217; the two components.</p>
<p>The Cartan can be defined for non-semisimple algebras, and it turns out it is the Cartan of the largest semisimple subalgebra. In the case of Poincaré, the largest semisimple subalgebra is the Lorentz subalgebra. So we can still define the rank to be the size of the Cartan, with this more general definition. I don&#8217;t know if the relation between the number of Casimirs and the rank still holds in this case, but at least for the Poincaré algebra it does turn out correctly, since the rank of the Lorentz algebra is 2.</p>
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/131/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Complex Square Root</title>
		<link>http://4by12.com/blog/archives/119</link>
		<comments>http://4by12.com/blog/archives/119#comments</comments>
		<pubDate>Wed, 13 Feb 2008 23:45:47 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/119</guid>
		<description><![CDATA[Proof that 1 = -1: This function should be taken outside and shot.]]></description>
			<content:encoded><![CDATA[<p>Proof that 1 = -1:</p>
<p><center><img src='/latexrender/pictures/2413efa13bef96fa4ca860c81ee2db5c.png' title=' \frac{1}{\sqrt{i}} = \sqrt{\frac{1}{i}} = \sqrt{-i} = \sqrt{-1} \sqrt{i} = i \sqrt{i}' alt=' \frac{1}{\sqrt{i}} = \sqrt{\frac{1}{i}} = \sqrt{-i} = \sqrt{-1} \sqrt{i} = i \sqrt{i}' align=absmiddle></center><br />
<center><img src='/latexrender/pictures/6eef5215009310a2a934be514e73ccbf.png' title=' \Rightarrow 1 = i \sqrt{i} \sqrt{i} = i^2 = -1 ' alt=' \Rightarrow 1 = i \sqrt{i} \sqrt{i} = i^2 = -1 ' align=absmiddle></center></p>
<p>This function should be taken outside and shot.</p>
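<p>For the curious, here is a small sketch of where the chain breaks, assuming the principal branch of the square root (which is what Python&#8217;s <code>cmath.sqrt</code> computes). The variable names are only for illustration. The suspicious step is splitting the root of a product into a product of roots:</p>

```python
import cmath

# Principal square roots, as computed by cmath.sqrt.
lhs = cmath.sqrt(-1j)                   # sqrt(-i)
rhs = cmath.sqrt(-1) * cmath.sqrt(1j)   # sqrt(-1) * sqrt(i)

print("sqrt(-i)         =", lhs)
print("sqrt(-1)*sqrt(i) =", rhs)

# On the principal branch the two sides differ by an overall sign,
# so their sum vanishes (up to rounding) instead of their difference.
print("sum of both sides =", lhs + rhs)
```

<p>So on this branch the identity sqrt(ab) = sqrt(a) sqrt(b) fails for a = -1, b = i, and that sign is exactly the factor of -1 in the &#8216;proof&#8217;.</p>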
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/119/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Legendre Transform</title>
		<link>http://4by12.com/blog/archives/117</link>
		<comments>http://4by12.com/blog/archives/117#comments</comments>
		<pubDate>Sat, 05 Jan 2008 02:19:04 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Physics]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/117</guid>
		<description><![CDATA[The Legendre transform is a simple and useful tool in some branches of physics such as Thermodynamics and Mechanics. In my experience though, this transform is usually explained in a very confusing way. I want to give my own derivation, which I hope is clear and straightforward. If you&#8217;re already familiar with the transform, you [...]]]></description>
			<content:encoded><![CDATA[<p>The Legendre transform is a simple and useful tool in some branches of physics such as Thermodynamics and Mechanics. In my experience though, this transform is usually explained in a very confusing way. </p>
<p>I want to give my own derivation, which I hope is clear and straightforward. If you&#8217;re already familiar with the transform, you can skip straight to <i>Derivation of the Legendre Transform</i>.</p>
<h2>Some Background</h2>
<p>Let&#8217;s start with the definition. If you have a function f(x), its Legendre transform is a function g(p), where p is thought of as f(x)&#8217;s derivative. g(p) is defined as:</p>
<p><center><img src='/latexrender/pictures/b658c0d439db87fed08de704df38bd59.png' title='g(p) = xp-f(x)' alt='g(p) = xp-f(x)' align=absmiddle></center></p>
<p>This definition is already confusing, because x is considered a function of p. This function is found by solving the following equation for x:</p>
<p><center><img src='/latexrender/pictures/7a338881435ffcc84748515586d1ef4f.png' title='p = \frac{df}{dx}(x)' alt='p = \frac{df}{dx}(x)' align=absmiddle></center></p>
<p>To clarify matters, let&#8217;s consider an example. Take <img src='/latexrender/pictures/d271cedde6675e55152d3c7a4236f775.png' title='f(x)=x^2' alt='f(x)=x^2' align=absmiddle>, then:</p>
<p><center><img src='/latexrender/pictures/b43bbe8703dac6d17a59bdcb2d2141b8.png' title='p=\frac{df}{dx}(x)=2x' alt='p=\frac{df}{dx}(x)=2x' align=absmiddle></center><br />
<center><img src='/latexrender/pictures/e0bd34610fdc577ac311571354bffce0.png' title='x=p/2' alt='x=p/2' align=absmiddle></center></p>
<p>The Legendre transform is then:<br />
<center><img src='/latexrender/pictures/a8bc3ada7dcf28cdb00c87ef4d4ef1c0.png' title='g(p) = x(p)p-f(x(p))=\frac{p}{2}p-(\frac{p}{2})^2=\frac{p^2}{4}' alt='g(p) = x(p)p-f(x(p))=\frac{p}{2}p-(\frac{p}{2})^2=\frac{p^2}{4}' align=absmiddle></center></p>
<p>Okay, what is it good for? The usual explanation starts with some geometric construction like the one you see on <a href="http://en.wikipedia.org/wiki/Legendre_transformation">Wikipedia</a>(*). Then, to explain why the transform works, you are shown the differential:</p>
<p><center><img src='/latexrender/pictures/e22c87959ef4568019722ad190dab438.png' title='df=\frac{df}{dx}dx=pdx' alt='df=\frac{df}{dx}dx=pdx' align=absmiddle></center><br />
<center><img src='/latexrender/pictures/5d5c6d67173005c7fa3e7b2c33fd2033.png' title='dg=d(px)-df=xdp+pdx-pdx=xdp' alt='dg=d(px)-df=xdp+pdx-pdx=xdp' align=absmiddle></center></p>
<p>And the conclusion is that g is indeed a function of p, which is f&#8217;s derivative. Mathematically speaking, this argument is quite unconvincing, because saying that g is a function of f&#8217;(x) has no meaning. g is a function of some real number, and this number has no intrinsic connection to any derivatives. </p>
<p>One possible relation could be that g and f agree if you give the corresponding arguments:</p>
<p><center><img src='/latexrender/pictures/08fc33f328166c9edf9133731fb4ef50.png' title='g(\frac{df}{dx}(x))=f(x)' alt='g(\frac{df}{dx}(x))=f(x)' align=absmiddle></center></p>
<p>but this is not the case! Some better explanation is obviously needed.</p>
<h2>Derivation of the Legendre Transform</h2>
<p>We define as before:<br />
<center><img src='/latexrender/pictures/75c78038c497b73beec28d5fb7f4bfc8.png' title='p(x)=\frac{df}{dx}(x)' alt='p(x)=\frac{df}{dx}(x)' align=absmiddle></center></p>
<p>And we are looking for a function g(p). The crucial point is this: We require that g&#8217;s derivative will correspond to x:<br />
<center><img src='/latexrender/pictures/c75ab3b8e7cc1fdc00dda6965f7313d9.png' title='x=\frac{dg}{dp}(p)' alt='x=\frac{dg}{dp}(p)' align=absmiddle></center></p>
<p>Rigorously, this means:</p>
<p><center><img src='/latexrender/pictures/eaf1fc9c5cb5dc1b2e411783c781fe1c.png' title='x=\frac{dg}{dp}(p(x))' alt='x=\frac{dg}{dp}(p(x))' align=absmiddle></center></p>
<p>Where p(x) is the function defined by the first requirement.</p>
<p>Now we have something we can use. We look at the function g(p(x)) and we calculate:</p>
<p><center><img src='/latexrender/pictures/1451fc7f5eb233435ab1a450122e8f0e.png' title='\frac{d}{dx}g(p(x))=\frac{dg}{dp}(p(x)) \cdot \frac{dp}{dx}(x) = x \cdot \frac{d^2f}{dx^2}(x) ' alt='\frac{d}{dx}g(p(x))=\frac{dg}{dp}(p(x)) \cdot \frac{dp}{dx}(x) = x \cdot \frac{d^2f}{dx^2}(x) ' align=absmiddle></center></p>
<p>Both requirements were used in the last transition. We now integrate this equation by dx:</p>
<p><center><img src='/latexrender/pictures/3a6a8467d6cba077f9d1c7c65807e1d8.png' title='\int \frac{d}{dx}g(p(x)) dx=\int x \cdot \frac{d^2f}{dx^2}(x) dx ' alt='\int \frac{d}{dx}g(p(x)) dx=\int x \cdot \frac{d^2f}{dx^2}(x) dx ' align=absmiddle></center></p>
<p>Integrating by parts on the right-hand side: </p>
<p><center><img src='/latexrender/pictures/453e22468691df320229e9ee1f251ac4.png' title='g(p(x))=x \cdot \frac{df}{dx}(x)-\int \frac{df}{dx}(x) dx = xp(x)-f(x) + C' alt='g(p(x))=x \cdot \frac{df}{dx}(x)-\int \frac{df}{dx}(x) dx = xp(x)-f(x) + C' align=absmiddle></center></p>
<p>Thus g(p) is defined up to a constant, and choosing C=0 we get the familiar Legendre transform:</p>
<p><center><img src='/latexrender/pictures/b81cb17580cf7054289d25293caaf606.png' title='g(p)=xp-f(x)' alt='g(p)=xp-f(x)' align=absmiddle></center></p>
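<p>As a quick sanity check of the requirement x = g&#8217;(p), here is a minimal numerical sketch using the earlier example f(x) = x^2. The helper names <code>legendre_transform</code> and <code>derivative</code> are made up for this illustration, and the inverse function x(p) is supplied by hand rather than solved for:</p>

```python
def legendre_transform(f, x_of_p):
    """Return g(p) = x(p)*p - f(x(p)), given f and the inverse x(p) of p = f'(x)."""
    def g(p):
        x = x_of_p(p)
        return x * p - f(x)
    return g

def derivative(func, p, h=1e-6):
    """Central-difference estimate of func'(p)."""
    return (func(p + h) - func(p - h)) / (2 * h)

# Example from the text: f(x) = x^2, so p = 2x and x(p) = p/2.
f = lambda x: x ** 2
x_of_p = lambda p: p / 2
g = legendre_transform(f, x_of_p)

p = 3.0
print("g(p)  =", g(p))               # closed form is p^2/4
print("dg/dp =", derivative(g, p))   # should be close to x(p)
print("x(p)  =", x_of_p(p))
```

<p>The numerical derivative of g agrees with x(p) = p/2, which is the second requirement of the derivation.</p>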
<h2>Motivation</h2>
<p>Finally, I want to comment on this second requirement that x=g&#8217;(p), which is how I came up with this derivation. In Thermodynamics, the important parameters of a problem always come in pairs: Pressure and Volume, Temperature and Entropy, Number of particles and the Chemical potential, and so on. They are paired by the First Law, which is conservation of energy:</p>
<p><center><img src='/latexrender/pictures/c0e068a2c0d723e3fecb7e8ab4b156a8.png' title='dU = TdS-PdV + \mu dN + ~ \cdots' alt='dU = TdS-PdV + \mu dN + ~ \cdots' align=absmiddle></center></p>
<p>So here, the internal energy U is a natural function of S,V,N. Its partial derivatives are T,-P, and mu. Of course it can depend on other variables and there are other derivatives, but these are the ones that are easiest to calculate.</p>
<p>The Legendre transform is used to get new forms of energy that are naturally dependent on other parameters. This is done by switching between pairs of variables. For instance, if you know the temperature T instead of the entropy S, you can transform like this:</p>
<p><center><img src='/latexrender/pictures/4011c60e02d111b524ddbd6dd1c2c14e.png' title='F = U-TS' alt='F = U-TS' align=absmiddle></center></p>
<p>F is called the &#8216;Helmholtz Free Energy&#8217;, and its natural parameters are T,V,N: </p>
<p><center><img src='/latexrender/pictures/8227eac1256b84954a5c271ce4f05591.png' title='dF = -SdT-PdV + \mu dN + ~ \cdots' alt='dF = -SdT-PdV + \mu dN + ~ \cdots' align=absmiddle></center></p>
<p>So T, which was a derivative before, is now a parameter. Equally important, S is now a derivative, which allows us to calculate S very easily. If we had a new function F=F(T,V,N) that wouldn&#8217;t allow simple calculation of S, it would be useless. This is what makes the Legendre transform so useful.</p>
<p>Having said that, we now see that we can define other transforms by changing this requirement. We still get functions that can be thought of as &#8216;functions of the derivative of f&#8217;. For example, require:</p>
<p><center><img src='/latexrender/pictures/183fb4e6bba69cbe5c3bbcb15e24cad3.png' title='ax+b=\frac{dg}{dp}(p)' alt='ax+b=\frac{dg}{dp}(p)' align=absmiddle></center></p>
<p>Where a,b are some arbitrary parameters. Then we get a new transform:</p>
<p><center><img src='/latexrender/pictures/6d1a23c978b7e257c3ccbe9781febdcf.png' title='g(p) = a(xp-f(x)) + bp' alt='g(p) = a(xp-f(x)) + bp' align=absmiddle></center></p>
<p>Perhaps this method can generate some other useful transforms.</p>
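<p>The modified transform can be checked the same way. This is only a sketch: the values of a and b and the helper names are arbitrary choices for the illustration, and f(x) = x^2 is reused from the earlier example:</p>

```python
def generalized_transform(f, x_of_p, a, b):
    """Return g(p) = a*(x(p)*p - f(x(p))) + b*p."""
    def g(p):
        x = x_of_p(p)
        return a * (x * p - f(x)) + b * p
    return g

def derivative(func, p, h=1e-6):
    """Central-difference estimate of func'(p)."""
    return (func(p + h) - func(p - h)) / (2 * h)

# f(x) = x^2 again, so x(p) = p/2.
f = lambda x: x ** 2
x_of_p = lambda p: p / 2
a, b = 3.0, -1.0
g = generalized_transform(f, x_of_p, a, b)

p = 4.0
# The two printed numbers should agree: dg/dp versus a*x + b.
print(derivative(g, p), a * x_of_p(p) + b)
```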
<p><small>(*) I just saw that Wikipedia has something that is related to my explanation under &#8216;Another definition&#8217;, although it goes through a different route and still uses the geometric requirements.</small></p>
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/117/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Fun With PDEs &#8211; Part 2</title>
		<link>http://4by12.com/blog/archives/100</link>
		<comments>http://4by12.com/blog/archives/100#comments</comments>
		<pubDate>Mon, 17 Sep 2007 03:42:36 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Physics]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/100</guid>
		<description><![CDATA[Last time I described the first pitfall I encountered when solving a PDE &#8212; an inherent instability in the partial derivatives. This time I&#8217;ll talk about the second pitfall, which is simpler conceptually, but has wider implications for programming in general. In my first implementation of the solution I used a simple method to calculate [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://4by12.com/blog/archives/93">Last time</a> I described the first pitfall I encountered when solving a PDE &#8212; an inherent instability in the partial derivatives. This time I&#8217;ll talk about the second pitfall, which is simpler conceptually, but has wider implications for programming in general.</p>
<p><!-- more --></p>
<p>In my first implementation of the solution I used a simple method to calculate the derivative:</p>
<p><center><img src='/latexrender/pictures/5a99aae890fe9ef5482881ed643463ad.png' title='\frac{\partial f}{\partial x}[j] = \frac{f[j+1]-f[j-1]}{2 \, dx}' alt='\frac{\partial f}{\partial x}[j] = \frac{f[j+1]-f[j-1]}{2 \, dx}' align=absmiddle></center></p>
<p>This is just the average of the right and left derivatives. I wanted to improve the accuracy of the derivative calculation for various reasons, so I turned to Savitzky-Golay filters, which were also described in the previous post. Using this method you get a digital filter that you apply to your data. If you&#8217;re not familiar with digital filters, you can think of a filter as an array of numbers <img src='/latexrender/pictures/5af756cddb7c4205b25dc611fb7532be.png' title='c_n, \, n=-l,\cdots,l' alt='c_n, \, n=-l,\cdots,l' align=absmiddle> which are applied to your function f as follows:</p>
<p><center><img src='/latexrender/pictures/9407c374a365816609eb54c168b72267.png' title='F[j] = \sum_{n=-l}^{l} f[j-n] * c_n' alt='F[j] = \sum_{n=-l}^{l} f[j-n] * c_n' align=absmiddle></center></p>
<p>
(I&#8217;m probably mixing up some of the signs here, but the intention is clear I&#8217;m sure). To calculate a derivative using S-G, you first apply their filter to your data, and then divide by the spatial resolution dx &#8212; the distance between adjacent points on the grid. It is worth noting that there is a highly efficient way of doing this calculation using FFT. For more details, see Numerical Recipes.</p>
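<p>To make the filter picture concrete, here is a minimal sketch of the direct (non-FFT) calculation. The function name <code>apply_filter</code> and the dictionary layout of the coefficients are assumptions made for this example. It uses the simple central difference from above, written as a 3-point filter, rather than real S-G coefficients:</p>

```python
def apply_filter(f, c, l, dx):
    """Return F[j]/dx for interior points, with F[j] = sum_n f[j-n] * c[n]."""
    result = {}
    for j in range(l, len(f) - l):
        total = 0.0
        for n in range(-l, l + 1):
            total += f[j - n] * c[n]
        result[j] = total / dx
    return result

# Central difference as a 3-point filter: (f[j+1] - f[j-1]) / 2.
# With the convention f[j-n] * c[n], the weight on f[j+1] is c[-1].
central = {-1: 0.5, 0: 0.0, 1: -0.5}

dx = 0.1
xs = [j * dx for j in range(11)]
f = [x ** 2 for x in xs]
dfdx = apply_filter(f, central, 1, dx)
print(dfdx[5], 2 * xs[5])   # both close to 1.0
```

<p>Only interior points are computed here, since the filter needs l neighbors on each side.</p>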
<p>I implemented S-G and tested it using a simple sin(x) function, and it worked flawlessly. But after I inserted it into my main program, the simulation started spewing out strange results. After some debugging I discovered something very strange: Using S-G, the derivative of a constant function wasn&#8217;t zero, and in one case reached <img src='/latexrender/pictures/cdff2a8afd293bffefd62721a5da5a34.png' title='10^{44}' alt='10^{44}' align=absmiddle>. How odd! Debugging further, I found that non-zero derivatives only happened when the initial function f had very large (and constant) values, on the order of <img src='/latexrender/pictures/5fd4cbc231cfac0e9c216bb2c684ee89.png' title='10^{22}' alt='10^{22}' align=absmiddle>, and dx was very small, about <img src='/latexrender/pictures/3b71137b0196a3cef54e8b7fe41a2883.png' title='10^{-10}' alt='10^{-10}' align=absmiddle>.</p>
<p>To continue debugging, I dropped the fancy FFT convolution and switched to straightforward calculation &#8212; the one that&#8217;s shown in the last equation. This finally revealed the problem: When multiplying each value of the function f by the factor <img src='/latexrender/pictures/6f58730f154756d9dc7efb13fc938933.png' title='c_n' alt='c_n' align=absmiddle> and summing, some of the least significant bits are lost, because the numbers don&#8217;t all have the same exponent. So even though the calculation is accurate, the limitation of the computer&#8217;s double precision causes a loss of data. When the convolution is done, you&#8217;re left with a small value that&#8217;s not zero. But then to get the derivative you divide by dx, a very small number, and this shoots that small error through the roof. The smaller dx, the larger the derivative of the constant function! So once again, decreasing dx actually caused a larger error.</p>
<p>What helped me solve this problem was noticing that the S-G coefficients are anti-symmetric, i.e. <img src='/latexrender/pictures/beaa2c86c046bba1e03049196b794230.png' title='c_{-n} = -c_n' alt='c_{-n} = -c_n' align=absmiddle>. In particular, the two coefficients in each anti-symmetric pair have the same magnitude and therefore the same exponent, so for a constant function their two products cancel exactly. Therefore I changed the summing order to sum pairs of anti-symmetric factors:</p>
<p><center><img src='/latexrender/pictures/030368e3f1eca0972f29ac06368059f6.png' title='F[j] = \sum_{n=1}^{l} (f[j-n] \cdot c_n + f[j+n] \cdot c_{-n}) + f[j] \cdot c_0' alt='F[j] = \sum_{n=1}^{l} (f[j-n] \cdot c_n + f[j+n] \cdot c_{-n}) + f[j] \cdot c_0' align=absmiddle></center></p>
<p>
This solved the problem: Constant functions now had a zero derivative, because each term in the sum was exactly zero, even in double precision. And the happy consequence was that this fix also solved the weird simulation results I started with.</p>
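<p>Here is a small sketch of the two summation orders. The coefficients are the standard 5-point least-squares slope weights (proportional to n/10), written with the sign convention of the sum above. They are an assumption for this illustration and not necessarily the filter used in the simulation. The magnitudes of f and dx mirror the ones mentioned earlier:</p>

```python
def filter_naive(f, j, c, l):
    """Direct sum F[j] = sum_{n=-l..l} f[j-n] * c[n]."""
    total = 0.0
    for n in range(-l, l + 1):
        total += f[j - n] * c[n]
    return total

def filter_paired(f, j, c, l):
    """Same sum, but each anti-symmetric pair is added together first."""
    total = f[j] * c[0]
    for n in range(1, l + 1):
        total += f[j - n] * c[n] + f[j + n] * c[-n]
    return total

# Assumed anti-symmetric coefficients: c[-n] == -c[n].
l = 2
c = {n: -n / 10.0 for n in range(-l, l + 1)}

f = [1e22] * 5    # large constant function
dx = 1e-10        # small grid spacing
j = 2

naive = filter_naive(f, j, c, l) / dx
paired = filter_paired(f, j, c, l) / dx
print("naive  derivative:", naive)
print("paired derivative:", paired)
```

<p>In the paired version each bracket is of the form a + (-a), which is exactly zero in floating point, so the derivative of a constant is exactly zero regardless of dx. The naive sum carries no such guarantee, because its partial sums can round differently depending on the order of the terms.</p>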
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/100/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fun with PDEs</title>
		<link>http://4by12.com/blog/archives/93</link>
		<comments>http://4by12.com/blog/archives/93#comments</comments>
		<pubDate>Tue, 01 May 2007 01:58:43 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Physics]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/93</guid>
		<description><![CDATA[I just finished working on a numerical simulation of a set of partial differential equations (PDE). I developed these equations for a physics research project I&#8217;m involved in. The equations did not seem to be solvable analytically, so I had to do it numerically. This was my first attempt at solving a PDE, and writing [...]]]></description>
			<content:encoded><![CDATA[<p>I just finished working on a numerical simulation of a set of partial differential equations (PDE). I developed these equations for a physics research project I&#8217;m involved in. The equations did not seem to be solvable analytically, so I had to do it numerically. This was my first attempt at solving a PDE, and writing the simulation turned out to be much more involved than with ordinary differential equations. Here are a couple of interesting pitfalls I encountered.</p>
<h2>PDE Primer</h2>
<p>In case you&#8217;re not familiar with the terminology, I&#8217;ll first explain what a PDE is. A simple equation contains one or more unknowns which represent numbers. For example: <img src='/latexrender/pictures/864a9b719133447ea244573cb054ddfb.png' title='x^2-x-1 = 0' alt='x^2-x-1 = 0' align=absmiddle>. An ordinary differential equation (ODE) is similar, except the unknown is a function rather than a number. Such an equation involves derivatives of the function. Here is an example:</p>
<p><center><img src='/latexrender/pictures/68d6a3d320e55151a2a773697c038965.png' title='f\prime(x) = f(x)' alt='f\prime(x) = f(x)' align=absmiddle></center></p>
<p>
The solution of this particular equation is <img src='/latexrender/pictures/3a1a0d583b0595d7e871ff3da1462c04.png' title='f(x) = c e^x' alt='f(x) = c e^x' align=absmiddle>, where <img src='/latexrender/pictures/4a8a08f09d37b73795649038408b5f33.png' title='c' alt='c' align=absmiddle> can be any number. Finally, a partial differential equation (PDE) involves a function that has two or more parameters, and includes partial derivatives of this function. For example, the following equation describes waves propagating through a medium:</p>
<p><center><img src='/latexrender/pictures/4a573760921c4497b5eff45969b866c9.png' title='\frac { \partial^2 f } { \partial t^2 } = v^2 \, \frac { \partial^2 f } { \partial x^2 }' alt='\frac { \partial^2 f } { \partial t^2 } = v^2 \, \frac { \partial^2 f } { \partial x^2 }' align=absmiddle></center></p>
<p>PDEs are very important in physics. In fact, many of the basic laws of nature are described as PDEs. Examples include Maxwell&#8217;s equations, Schrödinger&#8217;s equation, and Einstein&#8217;s field equations.</p>
<p>On to the simulation!</p>
<p><!-- more --></p>
<h2>Pitfall 1: Exploding Waves</h2>
<p>So, I built my model for the problem, derived the equations, and was ready to solve them. By &#8216;solving&#8217; I mean that I start out with the known function at time t=0, and I want to find out what that function is at a later time. My function initially looked like this:</p>
<p><center><img src="http://4by12.com/blog/wp-content/uploads/2007/05/rho0.gif"></center></p>
<p>Some thousands of time-steps later, it evolved into this:</p>
<p><center><img src="http://4by12.com/blog/wp-content/uploads/2007/05/rho1.gif"></center></p>
<p>So far so good, but then it completely exploded:</p>
<p><center><img src="http://4by12.com/blog/wp-content/uploads/2007/05/rho3_fix.gif"></center></p>
<p>Going back a bit in time, I was able to trace the beginning of this explosion:</p>
<p><center><img src="http://4by12.com/blog/wp-content/uploads/2007/05/rho2.gif"></center></p>
<p>And zooming in on the &#8216;wavy&#8217; part:</p>
<p><center><img src="http://4by12.com/blog/wp-content/uploads/2007/05/rho2_zoom.gif"></center></p>
<p>It looked as though waves were forming on my function, and then &#8216;exploding&#8217;. </p>
<h2>Inherent Instabilities</h2>
<p>I was certain I had a bug, but I couldn&#8217;t find it. While debugging, at one point I decreased the spatial resolution &#8212; using fewer points per unit of space to describe the function&#8230; and the problem was gone! So, <i>decreasing</i> the accuracy of my solution actually solved the instability&#8230; That was very weird.</p>
<p>When I mentioned this to a Ph.D. student at the lab, he said the problem sounded familiar. As it turns out, this is a universal problem with PDEs: if the time step is too large compared with the spatial resolution, the amplitude of small waves with short wavelengths quickly increases with time until they dominate the solution. This is due to the way numerical derivatives are calculated. The difficulty here is that the time step needs to be incredibly small, making the calculation infeasible. For some equations the situation is even worse, as they are unstable for any time step, no matter how small.</p>
<p>For simple PDEs, it is very easy to see this effect by taking the function f to be a wave, and watching what happens to the amplitude over time. You can see a derivation of this result <a href="http://farside.ph.utexas.edu/teaching/329/lectures/node79.html">here</a>. For a more in-depth discussion, Numerical Recipes is your friend. This method of analyzing equations is called von Neumann stability analysis.</p>
<h2>In Comes Lax</h2>
<p>Okay, so I found out I&#8217;m not alone, but what can be done to solve this problem? The first thing I tried was to calculate the derivative more accurately. There is a method called Savitzky-Golay, where you fit a polynomial to your function at each point, and calculate the polynomial&#8217;s derivative at that point. The brilliant thing is that this whole operation (fit + differentiate) can be done using a single convolution, which costs a meager O(n log n) of processing time.</p>
<p>So I implemented S-G, only to discover it doesn&#8217;t solve the problem. More on that in a future post.</p>
<p>As it turns out, there is an incredibly simple solution due to Lax, which says the following. When advancing the function value to the next time step, you do something like this for each position:</p>
<p><center><img src='/latexrender/pictures/3de923578d6c440441bbc686d59bc5bf.png' title='f[j] = f[j] + \frac { \partial f } { \partial t } [j] * dt' alt='f[j] = f[j] + \frac { \partial f } { \partial t } [j] * dt' align=absmiddle></center></p>
<p>The Lax method says that the f[j] on the right-hand side should be replaced by the average of its neighboring cells:</p>
<p><center><img src='/latexrender/pictures/08f7b12d31b3ef545da95e66ae44b89a.png' title='f[j] = \frac{f[j-1] + f[j+1]}{2} + \frac { \partial f } { \partial t } [j] \; dt' alt='f[j] = \frac{f[j-1] + f[j+1]}{2} + \frac { \partial f } { \partial t } [j] \; dt' align=absmiddle></center></p>
<p>
And that&#8217;s it! This replacement causes a numerical diffusion that &#8216;sedates&#8217; the unruly waves, causing them to decay instead of explode. The time step used in the simulation still needs to be below some value, but now it decreases linearly with the spatial distance dx, which is much better than before. So Lax saved the day &#8212; and that was the end of my first pitfall. This is getting to be quite a long post, so I&#8217;ll describe the second problem in another post. Cheers!</p>
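<p>To see the effect in isolation, here is a minimal sketch on a toy problem. It is not the equations from my project. It assumes the simple advection equation (the time derivative of f equals -v times its spatial derivative) on a periodic grid, with the central difference for the spatial derivative. The function names and the parameter c = v dt/dx are choices made for this example:</p>

```python
import math

def step_plain(u, c):
    """Plain update: u[j] plus dt times the time derivative, with c = v*dt/dx."""
    n = len(u)
    return [u[j] - 0.5 * c * (u[(j + 1) % n] - u[j - 1]) for j in range(n)]

def step_lax(u, c):
    """Lax update: u[j] is replaced by the average of its two neighbors."""
    n = len(u)
    return [0.5 * (u[j - 1] + u[(j + 1) % n])
            - 0.5 * c * (u[(j + 1) % n] - u[j - 1]) for j in range(n)]

def max_amplitude(step, steps=200, n=64, c=0.5):
    """Evolve a short-wavelength sine wave and return its final peak value."""
    u = [math.sin(2 * math.pi * 8 * j / n) for j in range(n)]
    for _ in range(steps):
        u = step(u, c)
    return max(abs(v) for v in u)

print("plain update:", max_amplitude(step_plain))
print("Lax update:  ", max_amplitude(step_lax))
```

<p>With the plain update the wave amplitude grows at every step, while with the Lax update each new value is a weighted average of its neighbors (as long as c is at most 1), so the amplitude can only decay.</p>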
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/93/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Test your Math skills</title>
		<link>http://4by12.com/blog/archives/91</link>
		<comments>http://4by12.com/blog/archives/91#comments</comments>
		<pubDate>Sat, 17 Mar 2007 16:04:09 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/91</guid>
		<description><![CDATA[Here&#8217;s a question that was asked in a recent oral exam for a Master&#8217;s degree in Mathematics. Let f : [0,1] → R be a real function such that f has a limit at each point. Does f have at least one continuity point? Everything you need to solve this question is covered in the first year or so of undergraduate math. [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a question that was asked in a recent oral exam for a Master&#8217;s degree in Mathematics.</p>
<p><i>Let <img src='/latexrender/pictures/50a35dc27db6c5698f3542f466e79dc4.png' title='f : [0,1] \rightarrow \mathbb R' alt='f : [0,1] \rightarrow \mathbb R' align=absmiddle> be a real function such that <img src='/latexrender/pictures/8fa14cdd754f91cc6554c9e71929cce7.png' title='f' alt='f' align=absmiddle> has a limit at each point. Does <img src='/latexrender/pictures/8fa14cdd754f91cc6554c9e71929cce7.png' title='f' alt='f' align=absmiddle> have at least one continuity point?</i></p>
<p>Everything you need to solve this question is covered in the first year or so of undergraduate math. Don&#8217;t continue reading if you want to try solving it yourself&#8230;</p>
<p><b>Solution.</b> We will show that <img src='/latexrender/pictures/8fa14cdd754f91cc6554c9e71929cce7.png' title='f' alt='f' align=absmiddle> has a continuity point. <img src='/latexrender/pictures/8fa14cdd754f91cc6554c9e71929cce7.png' title='f' alt='f' align=absmiddle> has a limit at each point, so let <img src='/latexrender/pictures/c8c1e57914ed97bd100bce0ba895659a.png' title='F : [0,1] \rightarrow \mathbb R' alt='F : [0,1] \rightarrow \mathbb R' align=absmiddle> be a function such that<br />
<center><img src='/latexrender/pictures/94b2e8aa70f0393b9401acc7a4fd78eb.png' title='\forall x \in [0,1]' alt='\forall x \in [0,1]' align=absmiddle> <img src='/latexrender/pictures/f0cde3e5d0ff626f7025dffb1a0d8657.png' title=' F(x) = lim_{y \rightarrow x} f(y)' alt=' F(x) = lim_{y \rightarrow x} f(y)' align=absmiddle></center><br />
<br />
<b>Lemma.</b> <img src='/latexrender/pictures/800618943025315f869e4e1f09471012.png' title='F' alt='F' align=absmiddle> is continuous.<br />
<br />
<b>Proof.</b> Choose some <img src='/latexrender/pictures/27178974789ff9dda86b01e57e78701c.png' title='x_0 \in [0,1]' alt='x_0 \in [0,1]' align=absmiddle>. Intuitively, <img src='/latexrender/pictures/4e3e5cd03740a068808c98275f7da9f4.png' title='F(x_0)' alt='F(x_0)' align=absmiddle> is <img src='/latexrender/pictures/8fa14cdd754f91cc6554c9e71929cce7.png' title='f' alt='f' align=absmiddle>&#8217;s limit there, so in a small enough neighborhood of <img src='/latexrender/pictures/3e0d691f3a530e6c7e079636f20c111b.png' title='x_0' alt='x_0' align=absmiddle>, <img src='/latexrender/pictures/50bbd36e1fd2333108437a2ca378be62.png' title='f(x)' alt='f(x)' align=absmiddle> will be close to <img src='/latexrender/pictures/4e3e5cd03740a068808c98275f7da9f4.png' title='F(x_0)' alt='F(x_0)' align=absmiddle>. Hence, the limits <img src='/latexrender/pictures/d76f2c4d6bdf142af5106c3f36e9e970.png' title='F(x)' alt='F(x)' align=absmiddle> will also be close to <img src='/latexrender/pictures/4e3e5cd03740a068808c98275f7da9f4.png' title='F(x_0)' alt='F(x_0)' align=absmiddle>, and thus <img src='/latexrender/pictures/800618943025315f869e4e1f09471012.png' title='F' alt='F' align=absmiddle> is continuous at <img src='/latexrender/pictures/3e0d691f3a530e6c7e079636f20c111b.png' title='x_0' alt='x_0' align=absmiddle>.<br />
<br />
Formally, let <img src='/latexrender/pictures/48cd3d9bc60a933438e6e151fcabafdf.png' title='\epsilon &gt; 0' alt='\epsilon &gt; 0' align=absmiddle>. Then there exists <img src='/latexrender/pictures/65b1b5bbfb6ebb856c8b898af8fc277a.png' title='\delta &gt; 0' alt='\delta &gt; 0' align=absmiddle> such that<br />
<br />
<center><img src='/latexrender/pictures/cb492d73a16b3a0a009373a371f0d436.png' title='\forall x \in (x_0-\delta,x_0+\delta) \setminus \{x_0\} \; |f(x)-F(x_0)|&lt;\epsilon' alt='\forall x \in (x_0-\delta,x_0+\delta) \setminus \{x_0\} \; |f(x)-F(x_0)|&lt;\epsilon' align=absmiddle></center>
<p>
<br />
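Letting y &#8594; x inside this open punctured interval, we find that for all such x (strictly with &#8804; in place of &lt;, which serves just as well)<br />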
<center><img src='/latexrender/pictures/6944c5d870161a5a4a194b089acd3bb1.png' title='|lim_{y \rightarrow x} f(y)-F(x_0)|&lt;\epsilon' alt='|lim_{y \rightarrow x} f(y)-F(x_0)|&lt;\epsilon' align=absmiddle></center>
<p>
<br />
or<br />
<center><img src='/latexrender/pictures/3514a6aff062bfaec510f258dfe88b58.png' title='|F(x)-F(x_0)|&lt;\epsilon' alt='|F(x)-F(x_0)|&lt;\epsilon' align=absmiddle></center><br />
And the Lemma is proved.<br />
<br />
Our purpose now is to show that there is a point <img src='/latexrender/pictures/9dd4e461268c8034f5c8564e155c67a6.png' title='x' alt='x' align=absmiddle> such that <img src='/latexrender/pictures/53a42891fcd04159b729b5a53e2fa861.png' title='f(x) = F(x)' alt='f(x) = F(x)' align=absmiddle>. Let&#8217;s count the points of <img src='/latexrender/pictures/8fa14cdd754f91cc6554c9e71929cce7.png' title='f' alt='f' align=absmiddle> that are &#8216;far away&#8217; from the limit <img src='/latexrender/pictures/800618943025315f869e4e1f09471012.png' title='F' alt='F' align=absmiddle>. Choose some <img src='/latexrender/pictures/b62ce1b42a86d3b781d62418bc90e05b.png' title='\epsilon&gt;0' alt='\epsilon&gt;0' align=absmiddle> and define the set:<br />
<br />
<center><img src='/latexrender/pictures/b175ac134cce6956a861e64da1a21307.png' title='A_\epsilon = \{ \, x \, | \, f(x) \, &gt; \, F(x) + \epsilon \, \}' alt='A_\epsilon = \{ \, x \, | \, f(x) \, &gt; \, F(x) + \epsilon \, \}' align=absmiddle></center><br />
<br />
<b>Lemma.</b> <img src='/latexrender/pictures/d6ccdc090b9363056607b1653af822b8.png' title='A_{\epsilon}' alt='A_{\epsilon}' align=absmiddle> is finite.<br />
<br />
<b>Proof.</b> Suppose <img src='/latexrender/pictures/d6ccdc090b9363056607b1653af822b8.png' title='A_{\epsilon}' alt='A_{\epsilon}' align=absmiddle> is infinite. Then, because <img src='/latexrender/pictures/ccfcd347d0bf65dc77afe01a3306a96b.png' title='[0,1]' alt='[0,1]' align=absmiddle> is compact, we can find a sequence <img src='/latexrender/pictures/fa04b846ba2bcccfcf4379c77615f235.png' title='\{x_n\} \subset A_\epsilon' alt='\{x_n\} \subset A_\epsilon' align=absmiddle> such that <img src='/latexrender/pictures/aa2a915c9393029669833940129d733c.png' title='x_n \longrightarrow x_0' alt='x_n \longrightarrow x_0' align=absmiddle> and <img src='/latexrender/pictures/25f69e7be32c1da74625c72434c81ef4.png' title='x_n \neq x_0' alt='x_n \neq x_0' align=absmiddle> for some <img src='/latexrender/pictures/27178974789ff9dda86b01e57e78701c.png' title='x_0 \in [0,1]' alt='x_0 \in [0,1]' align=absmiddle>.</p>
<p>Therefore, <img src='/latexrender/pictures/35b422c9f30b0513a06e988fa1cde74e.png' title='f(x_n) \longrightarrow F(x_0)' alt='f(x_n) \longrightarrow F(x_0)' align=absmiddle>. But for large enough <img src='/latexrender/pictures/7b8b965ad4bca0e41ab51de7b31363a1.png' title='n' alt='n' align=absmiddle>, we must have<br />
<center><img src='/latexrender/pictures/0b213ecbccf763fd05a05e1ebcf668e0.png' title='f(x_n) \, &gt; \, F(x_n) + \epsilon \, &gt; \, F(x_0) + {\epsilon \over 2}' alt='f(x_n) \, &gt; \, F(x_n) + \epsilon \, &gt; \, F(x_0) + {\epsilon \over 2}' align=absmiddle></center><br />
where we have used <img src='/latexrender/pictures/800618943025315f869e4e1f09471012.png' title='F' alt='F' align=absmiddle>&#8217;s continuity at <img src='/latexrender/pictures/3e0d691f3a530e6c7e079636f20c111b.png' title='x_0' alt='x_0' align=absmiddle>. So the values of f along the sequence stay more than &#949;/2 above <img src='/latexrender/pictures/4e3e5cd03740a068808c98275f7da9f4.png' title='F(x_0)' alt='F(x_0)' align=absmiddle> and cannot converge to it. Thus we reach a contradiction, and <img src='/latexrender/pictures/cdbeb806985445eb3a9b7c4053298ff3.png' title='A_\epsilon' alt='A_\epsilon' align=absmiddle> must be finite.<br />
<br />
Now let&#8217;s take <img src='/latexrender/pictures/8347f422640a124fbc645b76ae1594bc.png' title='A = \bigcup_{n=1}^{\infty} A_{1/n}' alt='A = \bigcup_{n=1}^{\infty} A_{1/n}' align=absmiddle>. Then from the lemma, A is a countable union of finite sets and is therefore a countable set. A is also exactly the set of points for which <img src='/latexrender/pictures/7d1aad5d65ad5fee2780510f86aa8615.png' title='f(x) \, &gt; \, F(x)' alt='f(x) \, &gt; \, F(x)' align=absmiddle>.</p>
<p>Likewise we can define<br />
<center><img src='/latexrender/pictures/eb506691dcc0f14366a91ea70f05fc2e.png' title='B_\epsilon = \{ \, x \, | \, f(x) \, &lt; \, F(x)-\epsilon \, \}' alt='B_\epsilon = \{ \, x \, | \, f(x) \, &lt; \, F(x)-\epsilon \, \}' align=absmiddle></center><br />
<center><img src='/latexrender/pictures/d2d65451240b737db06b59ccf45b1cc4.png' title='B = \bigcup_{n=1}^{\infty} B_{1/n}' alt='B = \bigcup_{n=1}^{\infty} B_{1/n}' align=absmiddle></center><br />
<br />
And together with A we find that<br />
<br />
<center><img src='/latexrender/pictures/4c2e92dc9062337a0a62761c72b1ec73.png' title='A \cup B = \{\, x \, | \, f(x) \, \neq \, F(x) \, \}' alt='A \cup B = \{\, x \, | \, f(x) \, \neq \, F(x) \, \}' align=absmiddle></center><br />
But <img src='/latexrender/pictures/b910c111ac8440bf4f4863bb5fc83aa8.png' title='A \cup B' alt='A \cup B' align=absmiddle> is countable while <img src='/latexrender/pictures/ccfcd347d0bf65dc77afe01a3306a96b.png' title='[0,1]' alt='[0,1]' align=absmiddle> is not, so <img src='/latexrender/pictures/c3a8a4a802cc9e5533914398c128c459.png' title='[0,1] \setminus (A \cup B) \neq \emptyset' alt='[0,1] \setminus (A \cup B) \neq \emptyset' align=absmiddle>, and there exists a point <img src='/latexrender/pictures/27178974789ff9dda86b01e57e78701c.png' title='x_0 \in [0,1]' alt='x_0 \in [0,1]' align=absmiddle> such that <img src='/latexrender/pictures/0c222d1748dfdd68d87a0e2b839ff40e.png' title='f(x_0) = F(x_0)' alt='f(x_0) = F(x_0)' align=absmiddle>. At this point the limit of <img src='/latexrender/pictures/8fa14cdd754f91cc6554c9e71929cce7.png' title='f' alt='f' align=absmiddle> equals its value, so <img src='/latexrender/pictures/3e0d691f3a530e6c7e079636f20c111b.png' title='x_0' alt='x_0' align=absmiddle> is a continuity point for <img src='/latexrender/pictures/8fa14cdd754f91cc6554c9e71929cce7.png' title='f' alt='f' align=absmiddle>.<br />
<br />
QED<br /></p>
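<p>To see the two lemmas at work on a concrete function, here is a minimal sketch using Thomae&#8217;s function: f(p/q) = 1/q for a rational p/q in lowest terms, and f(x) = 0 for irrational x. This example is not part of the exam question; it is a standard function whose limit is 0 at every point, so F = 0 everywhere. The names thomae and a_eps are made up for this illustration, and the code only looks at rational points, since f = F at the irrational ones.</p>

```python
from fractions import Fraction


def thomae(x: Fraction) -> Fraction:
    """Thomae's function at a rational point: f(p/q) = 1/q in lowest terms."""
    return Fraction(1, x.denominator)


def a_eps(eps: Fraction) -> set:
    """A_eps = {x in [0,1] : f(x) > F(x) + eps} for Thomae's f, where F = 0."""
    points = set()
    q = 1
    # f(p/q) = 1/q > eps forces q to be smaller than 1/eps,
    # so only finitely many denominators can contribute.
    while Fraction(1, q) > eps:
        for p in range(q + 1):
            x = Fraction(p, q)  # Fraction reduces to lowest terms automatically
            if thomae(x) > eps:
                points.add(x)
        q += 1
    return points


# The sets A_{1/n} grow with n but stay finite.
for n in (2, 4, 10):
    print(n, len(a_eps(Fraction(1, n))))
```

<p>For this f, every A_&#949; is finite, their union A is the set of rationals in [0,1], B is empty, and f is continuous exactly at the irrational points, as the proof predicts.</p>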
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/91/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Continuous Compound Interest</title>
		<link>http://4by12.com/blog/archives/19</link>
		<comments>http://4by12.com/blog/archives/19#comments</comments>
		<pubDate>Sat, 11 Nov 2006 18:01:06 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/19</guid>
		<description><![CDATA[Long time no blogging! A lot has happened since my last post. I got married, was out on a honeymoon, left my job and returned to school to finish my degree. But now I&#8217;m back, so read on and enjoy! We were in the office a while back, and we started talking about compound interest. [...]]]></description>
			<content:encoded><![CDATA[<p><i>
<p>Long time no blogging! A lot has happened since my last post. I got married, was out on a honeymoon, left my job and returned to school to finish my degree. But now I&#8217;m back, so read on and enjoy!</p>
<p></i></p>
<p>We were in the office a while back, and we started talking about compound interest. You can calculate compound interest over different time intervals: Yearly, monthly, weekly, and so on. What will happen if we take the limit as the interval approaches zero, i.e. use continuous time? How much interest will we get?</p>
<p><span id="more-19"></span></p>
<p>Let&#8217;s see. If M(t) is the money you have at time t, how much money will you have at t+dt for some interval dt? That&#8217;s M(t) plus an additional amount that&#8217;s proportional to M(t) &#8212; the interest. Let&#8217;s assume this additional amount is also proportional to dt (this seems like a reasonable assumption to make). So we get:</p>
<p><center><img src='/latexrender/pictures/7e6a97e749d97ca77a8ea92e9aff2882.png' title='M(t+dt)=M(t)+\alpha M(t) dt' alt='M(t+dt)=M(t)+\alpha M(t) dt' align=absmiddle></center></p>
<p></p>
<p>For some constant <img src='/latexrender/pictures/7b7f9dbfea05c83784f8b85149852f08.png' title='\alpha' alt='\alpha' align=absmiddle> which will determine the rate at which interest is accumulated.  Rearranging, we get:</p>
<p><center><img src='/latexrender/pictures/88bb5d57b30b013558a7940558c2e70c.png' title='{M(t+dt)-M(t) \over dt} = \alpha M(t)' alt='{M(t+dt)-M(t) \over dt} = \alpha M(t)' align=absmiddle></center></p>
<p></p>
<p>And the limit as dt approaches zero is:</p>
<p><center><img src='/latexrender/pictures/ec4c1d186c8feef27a94b4d9bda3d841.png' title='{dM(t) \over dt} = \alpha M(t)' alt='{dM(t) \over dt} = \alpha M(t)' align=absmiddle></center></p>
<p><center><img src='/latexrender/pictures/23b214f51be02a971c9ec9bd3caf97c0.png' title='\Rightarrow M(t)=M(0) e^{\alpha t}' alt='\Rightarrow M(t)=M(0) e^{\alpha t}' align=absmiddle></center></p>
<p></p>
<p>So this tells us continuous compound interest grows exponentially in time, which isn&#8217;t too surprising when you think about it.</p>
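<p>As a quick numerical check of this limit, here is a minimal sketch that compounds at a finite interval dt = 1/n and compares the result with the exponential formula. The function names and the sample numbers (100 units of money, &#945; = 0.05, one year) are assumptions chosen only for illustration.</p>

```python
import math


def compound(m0: float, alpha: float, t: float, n: int) -> float:
    """Money after time t when interest alpha*dt is added n times per unit time."""
    dt = 1.0 / n
    steps = round(t * n)  # number of compounding steps up to time t
    return m0 * (1.0 + alpha * dt) ** steps


def continuous(m0: float, alpha: float, t: float) -> float:
    """The continuous-time limit M(t) = M(0) * exp(alpha * t)."""
    return m0 * math.exp(alpha * t)


# Yearly, monthly, weekly, daily compounding versus the continuous limit.
for n in (1, 12, 52, 365):
    print(n, compound(100.0, 0.05, 1.0, n))
print("continuous", continuous(100.0, 0.05, 1.0))
```

<p>The discrete values increase with n and approach the continuous value from below.</p>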
<p>
After working this out at the office, that final equation was left on the whiteboard. A couple of days later, our product manager walked in and saw it. He immediately recognized it. Apparently, this formula was taught in his MBA program as the maximum amount of interest one can get&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/19/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hypercubes and Hamilton Circuits</title>
		<link>http://4by12.com/blog/archives/25</link>
		<comments>http://4by12.com/blog/archives/25#comments</comments>
		<pubDate>Sat, 23 Sep 2006 21:56:34 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/25</guid>
		<description><![CDATA[The Ising model is a simple model used to describe a magnet: It consists of a grid of spins. Each spin can be thought of as a tiny magnet, and can have the value +1 or -1 (called &#8216;up&#8217; and &#8216;down&#8217;). Two neighboring spins interact, having positive energy if they are in opposite directions (e.g. [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://en.wikipedia.org/wiki/Ising_model">Ising model</a> is a simple model used to describe a magnet: It consists of a grid of spins. Each spin can be thought of as a tiny magnet, and can have the value +1 or -1 (called &#8216;up&#8217; and &#8216;down&#8217;). Two neighboring spins interact, having positive energy if they are in opposite directions (e.g. up and down), and negative energy if they are in the same direction. So, neighboring spins &#8216;want&#8217; to be in the same direction, to minimize energy.</p>
<p>This model is used to calculate various properties of magnets. For example, suppose we want to compute the <em>average magnetization</em> of the grid at a given temperature. The magnetization of a specific state is just the sum of all the spin values in that state. So if all the spins are up and the grid size is 5&#215;5, the magnetization is 25. To calculate the average magnetization, we&#8217;d have to know the probability of each state. We&#8217;d then multiply each state&#8217;s magnetization by the probability of that state, and sum over all the states. The probability distribution is given by the <a href="http://en.wikipedia.org/wiki/Maxwell-Boltzmann_distribution">Maxwell-Boltzmann distribution</a>, an important result of statistical physics.</p>
<p>If the grid size is sufficiently small (up to about 5&#215;5), we can actually go over all the possible states and calculate an exact average value for the magnetization. What is the most efficient way of going over all the states? It is convenient to move from one state to the next by flipping spins one at a time. The most efficient algorithm would then visit every state while flipping just one spin in each step.</p>
<p>Let&#8217;s think of a state as a string of bits, 0 meaning a spin value of &#8216;down&#8217; and 1 meaning a spin value of &#8216;up&#8217;. We can think of this string as a binary number, and iterate over the states by incrementing this number. For example, if the grid size is 2&#215;2, then we can iterate as follows: 0000, 0001, 0010, 0011, 0100, &#8230; This method works, but on average it takes more than one spin flip to move to the next state. For example, going from 0111 to 1000 requires 4 flips. Since bit k (counting from 0 at the least significant end) flips once every 2^k increments, the average number of flips per step for a string of length n is approximately:</p>
<p><img src='/latexrender/pictures/79b981647f0cba258c46b5b1a0703211.png' title='&lt;flips/step&gt; \approx \frac{1}{2^n} \sum_{k=0}^{n-1} \frac{2^n}{2^k} = 2(1-2^{-n}) \to 2' alt='&lt;flips/step&gt; \approx \frac{1}{2^n} \sum_{k=0}^{n-1} \frac{2^n}{2^k} = 2(1-2^{-n}) \to 2' align=absmiddle></p>
<p>With the limit taken as <img src='/latexrender/pictures/9fcd9d5d39cca718980a307f659f2e54.png' title='n \to \infty' alt='n \to \infty' align=absmiddle>.</p>
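<p>The estimate can be checked by brute force. The following is a minimal sketch (the function name is made up for this illustration) that counts the bits that change on every increment from 0 to 2^n - 1 and averages over the steps.</p>

```python
def average_flips_binary(n: int) -> float:
    """Average number of bit flips per step when counting from 0 to 2**n - 1."""
    total = 0
    for state in range(2 ** n - 1):
        # XOR marks exactly the bits that differ between consecutive numbers.
        total += bin(state ^ (state + 1)).count("1")
    return total / (2 ** n - 1)


for n in (2, 4, 8, 16):
    print(n, average_flips_binary(n))
```

<p>The printed averages climb toward 2 as n grows, in line with the formula above.</p>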
<p>Let&#8217;s try to improve on this. In a previous post, I noted that we can think of a hypercube&#8217;s vertex as a string of bits. So iterating over all the spin states is like walking on a hypercube, visiting all of its vertices. We want to flip just one spin at each step; that would be the most efficient algorithm. In the hypercube analogy, this is the same as saying we are allowed to walk only on the graph&#8217;s edges, because edges only connect vertices that differ by exactly one bit. So now we&#8217;re trying to find an algorithm that walks the edges of a hypercube, visits all the vertices, and visits each vertex exactly once. This is almost the definition of a <a href="http://mathworld.wolfram.com/HamiltonianCircuit.html">Hamilton circuit</a>, except we don&#8217;t have to go back to the first vertex at the end (in other words, a Hamilton path is enough).</p>
<p>Trying it out for a 3D cube, it&#8217;s obviously very easy to do this: We can start at the top face and walk through the 4 vertices there. Then we can &#8216;climb down&#8217; to the bottom face, and finish walking through the bottom 4 vertices. We can generalize this concept inductively: Given such an algorithm for an (n-1)-hypercube, we can use it to build an algorithm for an n-hypercube: Start with the vertex 0000&#8230;0, and use the (n-1) algorithm to go over all the 0xxx&#8230;x vertices. Let&#8217;s say the algorithm finishes at vertex 0yyy&#8230;y. Next, &#8216;climb up&#8217; to vertex 1yyy&#8230;y, and run the (n-1) algorithm backwards to go through all the &#8216;upper&#8217; states, reaching 1000&#8230;0. We can even complete a full Hamilton cycle by climbing back down to 0000&#8230;0.</p>
<p>Of course, we have an algorithm for n=0, so inductively we get an algorithm for every n. This algorithm goes over all the states, using just one flip (or edge) for each step.</p>
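<p>The inductive construction can be sketched in a few lines. This is only an illustration under the conventions used above (states written as bit strings, first bit is the &#8216;new&#8217; dimension); the function name gray_path is made up.</p>

```python
def gray_path(n: int) -> list:
    """List all n-bit states so that consecutive states differ in exactly one bit."""
    if n == 0:
        return [""]  # the 0-hypercube is a single vertex with no bits
    prev = gray_path(n - 1)
    lower = ["0" + s for s in prev]            # walk the 0xxx...x half forwards
    upper = ["1" + s for s in reversed(prev)]  # climb up, then walk the same path backwards
    return lower + upper


print(gray_path(3))
```

<p>For n = 3 this visits 000, 001, 011, 010, 110, 111, 101, 100. The last state differs from the first in a single bit, so the path closes into a cycle.</p>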
<p>When I told my girlfriend about my discovery, she told me this was a well-known method called <a href="http://en.wikipedia.org/wiki/Gray_code">Gray code</a>. Party pooper. <img src='http://4by12.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/25/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Benford&#8217;s Law</title>
		<link>http://4by12.com/blog/archives/79</link>
		<comments>http://4by12.com/blog/archives/79#comments</comments>
		<pubDate>Thu, 14 Sep 2006 17:51:44 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/79</guid>
		<description><![CDATA[Benford discovered that in many sets of data, the leading digit is much more likely to be &#8216;1&#8217; than any other digit. Take, for example, the population counts of nations. The most significant digit (MSD) probabilities there are as follows: 1: 0.26, 2: 0.20, 3: 0.10, 4: 0.13, 5: 0.07, 6: 0.08, 7: 0.07, 8: 0.05, 9: 0.04. The MathWorld article offers an explanation [...]]]></description>
			<content:encoded><![CDATA[<p>Benford <a href="http://mathworld.wolfram.com/BenfordsLaw.html">discovered</a> that in many sets of data, the leading digit is much more likely to be &#8216;1&#8217; than any other digit. Take, for example, the <a href="https://www.cia.gov/cia/publications/factbook/rankorder/2119rank.html">population counts of nations</a>. The most significant digit (MSD) probabilities there are as follows:</p>
<p><TABLE border="1">
<TR><TH>Most significant digit (MSD)</TH><TH>Probability</TH></TR>
<TR><TD>1</TD><TD>0.26</TD></TR>
<TR><TD>2</TD><TD>0.20</TD></TR>
<TR><TD>3</TD><TD>0.10</TD></TR>
<TR><TD>4</TD><TD>0.13</TD></TR>
<TR><TD>5</TD><TD>0.07</TD></TR>
<TR><TD>6</TD><TD>0.08</TD></TR>
<TR><TD>7</TD><TD>0.07</TD></TR>
<TR><TD>8</TD><TD>0.05</TD></TR>
<TR><TD>9</TD><TD>0.04</TD></TR>
</TABLE></p>
<p>The MathWorld article offers an explanation in terms of distributions that are invariant under changes of the measurement unit. I have to say, I wasn&#8217;t entirely convinced by that explanation. I&#8217;d like to offer a different theory that might account for this phenomenon.</p>
<p><span id="more-79"></span></p>
<p>Suppose X is a real random variable with a probability density P(X). For example, X may represent the population of a nation. Further suppose that P(X) is <em>monotonically decreasing in X</em>, and that X ranges from 1 to +infinity. Now look at the most significant digit of X. I claim that, in this very common situation, &#8216;1&#8217; will be more likely than any other digit. Further, &#8216;1&#8217; will be more likely than &#8216;2&#8217;, which will be more likely than &#8216;3&#8217;, and so on.</p>
<p>To prove this claim, let&#8217;s consider different orders of 10. With 10^0, i.e. the range 1 through 10, what is the likelihood that 1 is the MSD (most significant digit)? That&#8217;s the likelihood that X is between 1 and 2, or P([1,2)). Similarly, the likelihood that digit 'k' is the MSD is P([k,k+1)). Because P(X) is monotonically decreasing, we have P([1,2)) > P([2,3)) > ... > P([9,10)).</p>
<p>Now let's look at 10^1. In the same manner, the range [10, 20) is more likely than [20, 30), and so on. And the same is true for any order of 10. Now, what is the probability of getting 1 as an MSD when we select X at random? That's the sum over all orders of 10:</p>
<p><img src='/latexrender/pictures/7dc4caf1e1f64d7ab6c70f5c09038f1b.png' title='P(msd=1) = \sum_{k=0}^{k=\infty} P([10^k,2*10^k))' alt='P(msd=1) = \sum_{k=0}^{k=\infty} P([10^k,2*10^k))' align=absmiddle> </p>
<p>And the probability of getting 2 as the MSD? That&#8217;s</p>
<p><img src='/latexrender/pictures/9569f9e96b39bece01c25bba54c3b2d6.png' title='P(msd=2) = \sum_{k=0}^{k=\infty} P([2*10^k,3*10^k))' alt='P(msd=2) = \sum_{k=0}^{k=\infty} P([2*10^k,3*10^k))' align=absmiddle>. </p>
<p>We already showed that, for any k, <img src='/latexrender/pictures/25abbf099d77c49566d060d79d2501c2.png' title='P([10^k,2*10^k)) &gt; P([2*10^k,3*10^k))' alt='P([10^k,2*10^k)) &gt; P([2*10^k,3*10^k))' align=absmiddle>, hence <img src='/latexrender/pictures/c5a1f22b14011785c68f5b2cf7f835f5.png' title='P(msd=1) &gt; P(msd=2)' alt='P(msd=1) &gt; P(msd=2)' align=absmiddle>. In the same manner, <img src='/latexrender/pictures/3cdc38fb54a5e9895e2b059e5914f699.png' title='P(msd=2) &gt; P(msd=3) &gt; P(msd=4)' alt='P(msd=2) &gt; P(msd=3) &gt; P(msd=4)' align=absmiddle> and so on. QED</p>
<p>So &#8216;1&#8217; is more likely than &#8216;2&#8217;, but can this account for the difference we see in Benford&#8217;s law? (There, &#8216;1&#8217; is more likely than &#8216;2&#8217; by a factor of about 2). It&#8217;s hard to say, because we don&#8217;t have the probability distributions for the datasets he used. But we can look at another, well-known distribution, the <em>power law distribution</em>: </p>
<p><img src='/latexrender/pictures/afac3b2043b489821b4efa7e96a7cc45.png' title='P(X) \sim X^{-(1+\alpha)}' alt='P(X) \sim X^{-(1+\alpha)}' align=absmiddle>. </p>
<p>This distribution occurs in many natural phenomena, including, for example, the <a href="http://www.sst.ph.ic.ac.uk/people/k.christensen/research/unified.html">sizes of earthquakes</a>. It is also a monotonically decreasing distribution (for &#945; &gt; 0), so it fits the bill. Let&#8217;s calculate the likelihood of getting an MSD of &#8216;m&#8217;:</p>
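<p>P(msd=m) \sim \sum_{k=0}^{\infty} \int_{m*10^k}^{(m+1)*10^k} X^{-(1+\alpha)} dX = \frac{1}{\alpha} \sum_{k=0}^{\infty} 10^{-k\alpha} \left[ m^{-\alpha}-(m+1)^{-\alpha} \right]<br />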
<img src='/latexrender/pictures/00ca9406d3a1eaf6507073cd6d3d8952.png' title='\sim m^{-\alpha}-(m+1)^{-\alpha}' alt='\sim m^{-\alpha}-(m+1)^{-\alpha}' align=absmiddle></p>
<p>Now let&#8217;s plug in some numbers. Let&#8217;s take <img src='/latexrender/pictures/0fde0dd885c03a79d5b057e0521bbc6c.png' title='P(X)=X^{-2}' alt='P(X)=X^{-2}' align=absmiddle> (that is, &#945; = 1) and calculate the probability ratio between an MSD of &#8216;1&#8217; and &#8216;2&#8217;:</p>
<p><img src='/latexrender/pictures/3f1c9ce91c0e200971c4fd37033aec18.png' title='\frac{P(msd=m)}{P(msd=m+1)}=\frac{m+2}{m} \Longrightarrow \frac{P(msd=1)}{P(msd=2)}=3' alt='\frac{P(msd=m)}{P(msd=m+1)}=\frac{m+2}{m} \Longrightarrow \frac{P(msd=1)}{P(msd=2)}=3' align=absmiddle></p>
<p>Which is even greater than the ratio of about 2 that appears on <a href="http://mathworld.wolfram.com/BenfordsLaw.html">MathWorld</a>.</p>
<p>As a final note, using <a href="http://en.wikipedia.org/wiki/Zipf's_law">Zipf&#8217;s law</a> I checked a couple of datasets in the <a href="https://www.cia.gov/cia/publications/factbook/index.html">CIA fact book</a>, and they are <em>not</em> distributed according to a power law. So what I&#8217;ve shown isn&#8217;t a direct explanation of Benford&#8217;s law for these datasets. But it does help explain why Benford&#8217;s law appears in many naturally occurring phenomena.</p>
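<p>The same calculation can be done numerically for any exponent. Below is a minimal sketch that evaluates the power-law result m^(-&#945;) - (m+1)^(-&#945;), normalizes it over the nine digits, and prints it next to Benford&#8217;s log10(1 + 1/m). The function names are made up for this illustration, and &#945; &gt; 0 is assumed so that the density can be normalized on [1, +infinity).</p>

```python
import math


def msd_probabilities(alpha: float) -> list:
    """P(msd = m) for m = 1..9 under a density proportional to X**-(1 + alpha)."""
    weights = [m ** -alpha - (m + 1) ** -alpha for m in range(1, 10)]
    total = sum(weights)  # the common factor from the sum over powers of 10 cancels here
    return [w / total for w in weights]


def benford_probabilities() -> list:
    """Benford's law: P(msd = m) = log10(1 + 1/m)."""
    return [math.log10(1 + 1 / m) for m in range(1, 10)]


power_law = msd_probabilities(1.0)  # alpha = 1, i.e. P(X) = X**-2
benford = benford_probabilities()
for m in range(1, 10):
    print(m, round(power_law[m - 1], 3), round(benford[m - 1], 3))
print("ratio of 1 to 2:", power_law[0] / power_law[1])
```

<p>With &#945; = 1 the ratio between &#8216;1&#8217; and &#8216;2&#8217; comes out as 3, matching the formula above. For small &#945; the ratio approaches log(2)/log(1.5), roughly 1.71, which is the ratio in Benford&#8217;s law itself.</p>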
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/79/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Last Exam of the Semester</title>
		<link>http://4by12.com/blog/archives/65</link>
		<comments>http://4by12.com/blog/archives/65#comments</comments>
		<pubDate>Thu, 27 Jul 2006 17:45:21 +0000</pubDate>
		<dc:creator>Guy Gur Ari</dc:creator>
				<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Personal]]></category>

		<guid isPermaLink="false">http://4by12.com/blog/archives/65</guid>
		<description><![CDATA[Took my last test of the semester today, in Topology. It came after four days of intensive studying. Sometime during the second day I realized that I had underestimated the breadth of the material, and that I needed more time. So I went into a blitz that involved little sleep and much caffeine. I can [...]]]></description>
			<content:encoded><![CDATA[<p>
Took my last test of the semester today, in Topology. It came after four days of intensive studying. Sometime during the second day I realized that I had underestimated the breadth of the material, and that I needed more time. So I went into a blitz that involved little sleep and much caffeine.
</p>
<p>
I can say now that all that studying paid off, as I was able to swing back at almost all the topological curveballs <a href="http://www.ma.huji.ac.il/~erezla/">Prof. Lapid</a> threw at us. There was one question that stumped me, though. Here it is. If you can solve it, more power to ya (and please let me know the answer!)
</p>
<p>Let <img src='/latexrender/pictures/07054044bc4312d02476b50658c442b2.png' title='f:S^2 \rightarrow Y' alt='f:S^2 \rightarrow Y' align=absmiddle> be a <a href="http://en.wikipedia.org/wiki/Continuity_%28topology%29">continuous,</a> <a href="http://en.wikipedia.org/wiki/Injective_function">injective</a> function from the sphere <img src='/latexrender/pictures/5ad83b44f7458dc7e77258c700e8a861.png' title='S^2' alt='S^2' align=absmiddle> (that&#8217;s the unit sphere in <img src='/latexrender/pictures/bd99f1d9cb677df79ade058e005b84a8.png' title='R^3' alt='R^3' align=absmiddle>) to some <a href="http://en.wikipedia.org/wiki/Metrizable">metrizable space</a> Y. Is the image <img src='/latexrender/pictures/f8f1046e9aede510d58c395deeb7a32a.png' title='f(S^2)' alt='f(S^2)' align=absmiddle> necessarily <a href="http://en.wikipedia.org/wiki/Homeomorphism">homeomorphic</a> to <img src='/latexrender/pictures/5ad83b44f7458dc7e77258c700e8a861.png' title='S^2' alt='S^2' align=absmiddle>?
</p>
<p>Anyway, while studying I stumbled on a fun little something involving polarizing sunglasses. More on that in a future post.</p>
]]></content:encoded>
			<wfw:commentRss>http://4by12.com/blog/archives/65/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
