<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Lost in Technopolis</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/" />
    <link rel="self" type="application/atom+xml" href="http://www.newartisans.com/atom.xml" />
    <id>tag:www.newartisans.com,2009-01-16://1</id>
    <updated>2010-07-15T16:50:38Z</updated>
    <subtitle>A journal of technical discovery, and sometimes, just pure amazement.</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 4.261</generator>

<entry>
    <title>A word on Haskell Monads and C++</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2010/07/a-word-on-haskell-monads-and-c.html" />
    <id>tag:www.newartisans.com,2010://1.3387</id>

    <published>2010-07-15T12:20:06Z</published>
    <updated>2010-07-15T16:50:38Z</updated>

    <summary></summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="haskellcfp" label="haskell c++ fp" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        

        <![CDATA[<p>After spending a good while trying to understand monads in Haskell, and why the Haskell world is so fascinated by them, I finally understand why they aren&#8217;t as exciting to other languages, or why they are completely missing from languages like C++: because they&#8217;re mostly already there.</p>

<p>At its simplest, <em>a monad is an abstraction of a value which knows how to apply functions to that value, returning a new monad</em>.  In other words, it&#8217;s a way to turn values into little packages that wrap additional functionality around that value.  Sounds a lot like what an object does&#8230;</p>

<p>But this doesn&#8217;t tell you what&#8217;s exciting about them, from Haskell&#8217;s point of view.  Another way of looking at them, without going into the wheres and whys, is this: In a lazily-evaluated, expression-based language, monads let you express sequenced, interdependent computation.</p>

<p>Consider the following two code examples.  First, in C++:</p>

<pre><code>#include &lt;iostream&gt;
int main() {
  std::cout &lt;&lt; "Hello, world!"
            &lt;&lt; "  This is a sample"
            &lt;&lt; " of using a monad in C++!"
            &lt;&lt; std::endl;
  return 0;
}</code></pre>

<p>And the same code in Haskell:</p>

<pre><code>module Main where
main :: IO ()
main = do putStr "Hello, world!"
          putStr "  This is a sample"
          putStr " of using a monad in C++!"
          putStr "\n"</code></pre>

<p>What the IO monad in the second example is doing is making the sequenced evaluation of the print statements possible using a nice, normal-looking syntax.  The C++ code doesn&#8217;t need monads to do this, because it already embodies the concept of abstracted values (here, the iostream passed between insertion operators) and sequenced computation (because it&#8217;s not lazy).</p>

<p>To compare Monads with C++:</p>

<ol>
<li><p>Monads are abstractions of values.  So are most C++ objects.</p></li>
<li><p>Monads permit functions to be applied to the &#8220;contained&#8221; value, returning a new version of the monad.  C++ objects provide methods, where the mutated object is the new version.</p></li>
<li><p>Monads provide a way to encapsulate values in new monads.  C++ objects have constructors.</p></li>
</ol>

<p>As another example, consider the case where you have to call five functions on an integer, each using the return value of the last:</p>

<pre><code>j(i(h(g(f(10)))))</code></pre>

<p>This is an identical operation in both Haskell and C++.  But what if the return value of each function wasn&#8217;t an integer, but an &#8220;object&#8221; that could either be an integer, or an uninitialized value?  In most languages, there&#8217;s either a type, or syntax, for this concept:</p>

<pre><code>C++      boost::optional&lt;int&gt;
C#       int?
Java     Integer
Haskell  Maybe Int</code></pre>

<p>If each function returns one of these, but takes a real integer, it means we have to check the &#8220;null&#8221; status of each return value before calling the next function.  In C++ this leads to a fairly common idiom:</p>

<pre><code>if (boost::optional&lt;int&gt; x1 = f(10))
  if (boost::optional&lt;int&gt; x2 = g(*x1))
    if (boost::optional&lt;int&gt; x3 = h(*x2))
      if (boost::optional&lt;int&gt; x4 = i(*x3))
        j(*x4);</code></pre>

<p>Note that not only are these calls sequential, but due to the meaning of optionality, they are also inherently short-circuiting.  If <code>f</code> returns <code>none</code>, none of the other functions get called.</p>

<p>Haskell can do this type of thing natively as well, and it looks similar:</p>

<pre><code>case f 10 of
  Nothing -&gt; Nothing
  Just x1 -&gt; 
    case g x1 of
      Nothing -&gt; Nothing
      Just x2 -&gt; 
        case h x2 of
          Nothing -&gt; Nothing
          Just x3 -&gt; 
            case i x3 of
              Nothing -&gt; Nothing
              Just x4 -&gt; j x4</code></pre>

<p>But it&#8217;s ugly as sin.  In C++, we can be evil and flatten things out using basic features of the language, assuming we pre-declare the variables:</p>

<pre><code>(   (x1 = f(10))
 &amp;&amp; (x2 = g(*x1))
 &amp;&amp; (x3 = h(*x2))
 &amp;&amp; (x4 = i(*x3))
 &amp;&amp; (x5 = j(*x4)), x5)</code></pre>

<p>Or you can eliminate the use of temporaries altogether by creating a wrapper class:</p>

<pre><code>template &lt;typename T&gt; struct Maybe {
  boost::optional&lt;T&gt; value;

  Maybe() {}
  Maybe(const T&amp; t) : value(t) {}
  Maybe(const Maybe&amp; m) : value(m.value) {}

  Maybe operator&gt;&gt;(boost::function&lt;Maybe&lt;T&gt;(const T&amp;)&gt; f) const {
    return value ? f(*value) : *this;
  }
};</code></pre>

<p>If we change our functions to return <code>Maybe&lt;int&gt;</code> instead of just <code>boost::optional&lt;int&gt;</code>, it allows us to write this:</p>

<pre><code>f(10) &gt;&gt; g &gt;&gt; h &gt;&gt; i &gt;&gt; j</code></pre>

<p>Which in Haskell is written almost the same way:</p>

<pre><code>f 10 &gt;&gt;= g &gt;&gt;= h &gt;&gt;= i &gt;&gt;= j</code></pre>

<p>But where Haskell needs Monads to make this type of thing reasonable and concise, C++ doesn&#8217;t.  We get passing around of object state between function calls as part of the core language, and there are many different ways to express it.  However, if you confined C++ to function definitions and return statements only &#8211; where all function arguments were pass-by-value &#8211; then things like Monads would become an essential technique for passing knowledge between calls.</p>

<p>So it&#8217;s not that you can&#8217;t use Monads in C++, it&#8217;s just that they require enough extra machinery, and aren&#8217;t unique enough compared to core features of the language, that there isn&#8217;t the same level of motivation for them as there is in Haskell, where they can really add to the expressiveness of code.</p>
]]>
    </content>
</entry>

<entry>
    <title>A C++ gotcha on Snow Leopard</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/10/a-c-gotcha-on-snow-leopard.html" />
    <id>tag:www.newartisans.com,2009://1.3382</id>

    <published>2009-10-30T09:35:32Z</published>
    <updated>2009-11-10T08:12:39Z</updated>

    <summary>I&#8217;ve seen this issue mentioned in some random and hard to reach places on the Net, so I thought I&#8217;d re-express it here for those who find Google sending them this way. 


...Whenever a string is destroyed, the standard library checks whether that string&#8217;s address matches the empty string&#8217;s: if so, it does nothing; if not, it calls free. 


...If a library that  does  have fully dynamic strings enabled (aka the standard library) receives an empty string from code which does not have it enabled (aka, the app you just built), it will try to free it and your application will crash. 


...Since my standard library *is* compiled with fully dynamic strings, the destructor for basic_string doesn&#8217;t recognize that it&#8217;s the &#8220;special&#8221; empty string, so it tries to free it.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="applications" label="Applications" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="c" label="C++" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="mac" label="Mac" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="optimization" label="Optimization" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>I&#8217;ve seen this issue mentioned in some random and hard to reach places on the Net, so I thought I&#8217;d re-express it here for those who find Google sending them this way.</p>
]]>
        <![CDATA[<p>On Snow Leopard, Apple decided to build g++ and the standard C++ library with &#8220;fully dynamic strings&#8221; enabled.  What this means for you relates to the empty string.</p>

<p>When fully dynamic strings are off (as was true in Leopard), there exists a single global variable representing the empty string.  This variable lives in the data segment of <code>libstdc++</code>, and so it does not exist on the heap.  Whenever a string is destroyed, the standard library checks whether that string&#8217;s address matches the empty string&#8217;s: if so, it does nothing; if not, it calls <code>free</code>.</p>

<p>With fully dynamic strings on, there is no global empty string.  All strings are on the heap, and once their reference count goes to zero, they get deallocated.  Where this creates a problem is if you mix and match code.  If a library that <em>does</em> have fully dynamic strings enabled (aka the standard library) receives an empty string from code which does not have it enabled (aka, the app you just built), it will try to free it and your application will crash.</p>

<p>Here&#8217;s a reproducible case for this issue using Boost:</p>

<pre><code>#include &lt;string&gt;
#include &lt;sstream&gt;
#include &lt;boost/variant.hpp&gt;

int main()
{
  std::ostringstream buf;
  boost::variant&lt;bool, std::string&gt; data;
  data = buf.str();
  data = false;
  return 0;
}</code></pre>

<p>In this case &#8211; which really happened to me &#8211; I created an empty string by calling <code>ostringstream::str()</code>.  Since I don&#8217;t have fully dynamic strings on, its address is in data space, not on the heap.  I pass this string to <code>boost::variant</code>, which makes a copy of that address.  Later, when the variant is reassigned <code>false</code>, it calls <code>~basic_string</code> to destroy the string.  Since my standard library <em>is</em> compiled with fully dynamic strings, the destructor for <code>basic_string</code> doesn&#8217;t recognize that it&#8217;s the &#8220;special&#8221; empty string, so it tries to free it.</p>

<p>The solution to this problem is three-fold:</p>

<ol>
<li><p>You must be using the <code>g++</code> that comes with Xcode, or if you build your own (say, via MacPorts), you must configure it using <code>--enable-fully-dynamic-string</code>.  I&#8217;ve already submitted a patch to this effect to the MacPorts crew.</p></li>
<li><p>All libraries must be compiled with <code>-D_GLIBCXX_FULLY_DYNAMIC_STRING</code>.</p></li>
<li><p>Your own code must be compiled with <code>-D_GLIBCXX_FULLY_DYNAMIC_STRING</code>.</p></li>
</ol>

<p>You&#8217;ll know if this issue is biting you by looking at a stack trace in gdb.  You&#8217;ll see a crash somewhere inside basic_string&#8217;s <code>_M_destroy</code> (which calls <code>free</code>).  Move up the trace a bit and check whether the string it&#8217;s trying to free is 0 bytes long.</p>

<p>To recap: what&#8217;s happened is that an empty string constructed by code built without fully dynamic strings got deallocated by code built with them.  That is, most likely you, or a library you built, handed an empty <code>std::string</code> to the system library.</p>
]]>
    </content>
</entry>

<entry>
    <title>Branch policies with Git</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/10/branch-policies-with-git.html" />
    <id>tag:www.newartisans.com,2009://1.3381</id>

    <published>2009-10-29T03:05:57Z</published>
    <updated>2009-10-29T06:23:55Z</updated>

    <summary>When the master branch is at a state where I want to finally release it, I merge with --no-ff, so the merge gets represented as a single commit on the maint branch.  

...Since most development work happens on &#8220;next&#8221;, each time next is stable I merge into master, using --no-ff to keep the merge commits together.  

... Note that no commits are ever made directly to master, unless I&#8217;ve seriously broken something that needs to be addressed sooner than the next merge from &#8220;next&#8221;.  

... Then there are the various local-only topic branches that live on my machine, in which I develop highly unstable code relating to one feature or another, awaiting the day when it becomes stable enough to be merged into &#8220;next&#8221;.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="git" label="Git" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>I&#8217;ve been managing my <a href="http://wiki.github.com/jwiegley/ledger">Ledger</a> project with Git for some time now, and I&#8217;ve finally settled into a comfortable groove concerning branches and where to commit stuff.</p>
]]>
        <![CDATA[<p>Essentially I use four branches, in increasing order of commit frequency.  Each branch has its own policy and purpose, which are described below.</p>

<h2 id="maint">maint</h2>

<p>Every release of Ledger is made from the maint branch, and every commit on that branch is potentially a release.  This means that no commit is made until some serious vetting takes place.  When the master branch is at a state where I want to finally release it, I merge with <code>--no-ff</code>, so the merge gets represented as a single commit on the maint branch.  Then I tag the release and make a distribution tarball.</p>

<p>It&#8217;s possible after a release that patches need to get applied to maint, and a point release made.  Once this is done, the applicable patches are either merged into master, or if the two diverge too greatly I will begin cherry-picking instead.  Once cherry-picking starts, no more merges into master will occur until after the next release merge happens in maint.</p>

<p>The purpose of maint is to provide the most stable release possible to the public.</p>

<h2 id="master">master</h2>

<p>Master is where most people get the latest source code from, so it is kept reasonably stable.  There is a commit hook which guarantees that all commits to this branch build and pass the test suite.  Since most development work happens on &#8220;next&#8221;, each time next is stable I merge into master, using <code>--no-ff</code> to keep the merge commits together.  I also use <code>--no-commit</code>, so the merge must pass the commit hook in order to go in.</p>

<p>Note that no commits are ever made directly to master, unless I&#8217;ve seriously broken something that needs to be addressed sooner than the next merge from &#8220;next&#8221;.  In that case, I&#8217;ll cherry pick this commit into master afterward.  Merges only happen into master from next, and only from master into maint.</p>

<p>The purpose of master is to provide reasonably stable development snapshots to the public.</p>

<h2 id="next">next</h2>

<p>The next branch is where I commit most often, and while I try to keep it functional, this is not always the case.  I don&#8217;t run unit tests here for every commit, just before every push (mostly).  Most of my friends follow this branch, because it updates very often.</p>

<p>The purpose of next is to provide potentially unstable, frequent development snapshots to the public.</p>

<h2 id="test">test</h2>

<p>The test branch comes in and out of existence, and should only ever be pulled using <code>pull --rebase</code>.  It contains trial commits that I want someone to test out.  It&#8217;s a delivery branch, and after it&#8217;s been used I either delete it or ignore it until the next time it&#8217;s necessary.</p>

<p>The purpose of test is to communicate patch candidates to a particular person at a particular time.</p>

<h2 id="topic">topic</h2>

<p>Then there are the various local-only topic branches that live on my machine, in which I develop highly unstable code relating to one feature or another, awaiting the day when it becomes stable enough to be merged into &#8220;next&#8221;.</p>
]]>
    </content>
</entry>

<entry>
    <title>Response to PG&apos;s &quot;How to Do Philosophy&quot;</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/05/response-to-pgs-how-to-do-philosophy.html" />
    <id>tag:www.newartisans.com,2009://1.3380</id>

    <published>2009-05-13T19:04:27Z</published>
    <updated>2009-05-13T22:50:18Z</updated>

    <summary>The &#8220;practical man&#8221; knows well the value of practical things and he is an expert at perfecting the animal life; but it takes more than a well-fed stomach to bring true content.  

... If a philosopher is anything, I say he is someone who forgoes all else to discover and adventure in  that  world, and to learn what effect immaterial consequences should have on our material life, if all is to be as it ought. 


...What Plato used his method for was to approach noesis: to know the &#8220;real real&#8221;, to have a direct apprehension of reality freed from mortal conceptions; to &#8220;remember&#8221; the soul&#8217;s birth and origin; to return our perception of the world to an original, direct perception of Truth itself.  

...There are human endeavors which are little more than words or pigments on paper, that come to life only through the eye of an appreciative heart and mind.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="productivity" label="Productivity" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>Back in late 2007, Paul Graham put up an essay titled &#8220;How to Do Philosophy&#8221;, in which Mr. Graham hoped to elucidate where Philosophy went wrong and why the field, as now practiced, must be renovated to remain useful.  In fact, he goes so far as to suggest that much of philosophy has no benefit whatsoever:</p>

<blockquote>
  <p>The proof of how useless some of their answers turned out to be is how little effect they have.  No one after reading Aristotle&#8217;s Metaphysics does anything differently as a result.</p>
</blockquote>

<p>If I may, as a student of philosophy, I would like to offer my response to this argument, whose tenets have been repeated many times throughout Philosophy&#8217;s history.</p>
]]>
        <![CDATA[<h2 id="thespiritofphilosophy">The spirit of philosophy</h2>

<p>As far back as Plato&#8217;s Republic (and most likely long before then) there have been debates on the merit of philosophy.  In Plato&#8217;s book it is between Socrates and Glaucon, who fears that men may waste their time in fruitless contemplation:</p>

<blockquote>
  <p>Socrates: I am amused, I said, at your [Glaucon&#8217;s] fear of the world, which makes you guard
  against the appearance of insisting upon useless studies; and I quite admit
  the difficulty of believing that in every man there is an eye of the soul
  which, when by other pursuits lost and dimmed, is by these purified and
  re-illumined; and is more precious far than ten thousand bodily eyes, for
  by it alone is truth seen&#8230;.</p>
</blockquote>

<p>Earlier Socrates had said something similar, and in briefer terms:</p>

<blockquote>
  <p>Socrates: Then must not a further admission be made?</p>
  
  <p>Glaucon: What admission?</p>
  
  <p>Socrates: That the knowledge at which geometry aims is knowledge of the eternal, and
  not of aught perishing and transient.</p>
  
  <p>Glaucon: That, he replied, may be readily allowed, and is true.</p>
  
  <p>Socrates: Then, my noble friend, geometry will draw the soul towards truth, and
  create the spirit of philosophy, and raise up that which is now unhappily
  allowed to fall down.</p>
</blockquote>

<p>This &#8220;spirit of philosophy&#8221; is held by Socrates over and over again to be precious beyond compare: a light to illumine every aspect of life.  If a lantern is something you can design, hold and weigh, then this light is its intangible counterpart, granting the lamp its purpose.  It is the &#8220;why&#8221; to the lantern&#8217;s &#8220;what&#8221; and &#8220;how&#8221;.  It can neither be designed, nor held, nor weighed, but must be enkindled.  And only then does the lamp come aglow&#8230;</p>

<h2 id="theharponpracticality">The harp on practicality</h2>

<p>I understand the need for practical results in a material world, but results are meaningless deprived of context.  If we boil things down to their material essence, then what we do we do for survival: develop resources to protect and prolong life.  But is surviving enough?  Don&#8217;t people also seek meaning from what they do?  Certainly I don&#8217;t enjoy programming merely to make a paycheck; I have to feel something <em>more</em> to keep me motivated year after year.</p>

<p>The harp on practicality levied against philosophy overstresses the &#8220;what&#8221; against the &#8220;why&#8221;.  Mr. Graham debates how to make philosophy useful again, but I think he has lost the point of it: useful in terms of what?  Does usefulness have a &#8220;why&#8221;?  Who is to define the best &#8220;use&#8221; of anything, so that usefulness may be measured?  Thus, there is a conundrum at the center of his argument: How can any man judge philosophy who has not discovered what it aims to impart?</p>

<p>Anyone can understand the concept of practicality.  Even children connect the ideas of work and output.  It&#8217;s why we hate cleaning our room, because it takes so much work yet we gain so little from it.  But what is practical is not the same as what is essential.  Happiness, most of us know, is not found in more money, more power, or by more efficient processes.  There is only one outcome in this life which is inevitable, and curiously neither industry nor indolence has any effect on its timing or nature.  But whereas the practical man fears death as the end of opportunity, perhaps the philosopher sees it differently:</p>

<blockquote>
  <p>Socrates: The philosopher desires death &#8211; which the wicked world will insinuate that
  he also deserves:  and perhaps he does, but not in any sense which they are
  capable of understanding.  Enough of them: the real question is, What is
  the nature of that death which he desires?  Death is the separation of soul
  and body &#8211; and the philosopher desires such a separation.  He would like to
  be freed from the dominion of bodily pleasures and of the senses, which are
  always perturbing his mental vision.  He wants to get rid of eyes and ears,
  and with the light of the mind only to behold the light of truth&#8230;.</p>
</blockquote>

<p>So the question is raised: Is there more than just this world?  I don&#8217;t necessarily mean physical death, either.  For there is a world of purely material pursuits and achievement &#8211; a world we share in common with animals &#8211; and there is a world of inspiration, abstraction, and fantasy, which only men participate in.  The &#8220;practical man&#8221; knows well the value of practical things and he is an expert at perfecting the animal life; but it takes more than a well-fed stomach to bring true content.  If not so, then cows should be our kings.</p>

<p>If a philosopher is anything, I say he is someone who forgoes all else to discover and adventure in <em>that</em> world, and to learn what effect immaterial consequences should have on our material life, if all is to be as it ought.</p>

<h2 id="thebaneofmethod">The bane of method</h2>

<p>Not everyone who reads Plato, of course, comes away with mystical opinions.  Just as there are those who eschew philosophy entirely and ignore its delights, so there are some who accept it but half-way.  They see that philosophy prescribes a method and they fall in love with that method, dedicating the whole of their pursuit to refining it.  Yes, Plato did stress the necessity of dialectic, but his stress had a purpose in mind.  Not a material or practical goal &#8211; hardly even a &#8220;useful&#8221; one in immediate terms &#8211; but a personal and soulful one.</p>

<p>Philosophy is ever so much more than method.  In fact, the love of method has resulted in a few branches of philosophy which are hardly philosophy at all, but the art of analysis.  What Plato used his method for was to approach noesis: to know the &#8220;real real&#8221;, to have a direct apprehension of reality freed from mortal conceptions; to &#8220;remember&#8221; the soul&#8217;s birth and origin; to return our perception of the world to an original, direct perception of Truth itself.  Through this experience of true perception our breasts and minds would dilate, and every pursuit would become infused with the vibrating principle of Life.</p>

<h2 id="missingthepoint">Missing the point</h2>

<p>This is why, when I read essays like Mr. Graham&#8217;s, I find myself thinking that his own success and momentum have caused him to miss the point.  Philosophy is not meant to be practical.  It is not meant to have a use.  It does not exist to make us more productive girls and boys.  It is a diet of words to feed our soul by way of stimulating our mind.  It is not a roast-beef sandwich, but more the substance of an ethereal longing.</p>

<p>Some will ask, what is this thing that is words and nothing more?  To them I reply: Then what is poetry?  There are human endeavors which are little more than words or pigments on paper, that come to life only through the eye of an appreciative heart and mind.  Does a man read Shakespeare and ask what profit he has gained?  If he does then he cannot see the point.  What he gains is immaterial &#8211; literally and figuratively &#8211; but may in the long run be immensely valuable.  It depends on what he saw, how well he saw, and the breadth of his vision.</p>

<p>It is no different with Philosophy.  Consider it an artform, or a method of tuning the soul through delicate adjustments of the mind.  When one tunes a violin there is no melody played; that comes after.  The fruit of philosophy is the philosopher&#8217;s life itself.  It is how it changes the man that matters, not the changes he can prove to you from day to day.</p>

<p>So if you are accustomed to reading balance sheets and preparing quarterly projections, perhaps you are ill-equipped to judge philosophy.  But if you measure the smile of a happy engineer against the despair of an endless, daily grind, maybe then you will have found the weight of philosophy&#8217;s fruit.</p>
]]>
    </content>
</entry>

<entry>
    <title>Journey into Haskell, part 6</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/journey-into-haskell-part-6.html" />
    <id>tag:www.newartisans.com,2009://1.3376</id>

    <published>2009-03-26T13:00:00Z</published>
    <updated>2009-03-23T05:28:45Z</updated>

    <summary>   Create a list of primes &#8220;as you go&#8221;, considering a number prime if it can&#8217;t be divided by any number already considered prime. 


... However, although my straightforward solution worked on discrete ranges, it couldn&#8217;t yield a single prime when called on an infinite range &#8212; something I&#8217;m completely unused to from other languages, except for some experience with the SERIES library in Common Lisp. 


... But when I suggested this on  #haskell , someone pointed out that you can&#8217;t reverse an infinite list.  

...This time when I put  primes [1..]  into GHCi it printed out prime numbers immediately, but visibly slowed as the accumulator grew larger. </summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="fp" label="FP" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<blockquote>
  <p>Create a list of primes &#8220;as you go&#8221;, considering a number prime if it can&#8217;t be divided by any number already considered prime.</p>
</blockquote>

<p>However, although my straightforward solution worked on discrete ranges, it couldn&#8217;t yield a single prime when called on an infinite range &#8211; something I&#8217;m completely unused to from other languages, except for some experience with the SERIES library in Common Lisp.</p>
]]>
        <![CDATA[<h2 id="anincompletesolution">An incomplete solution</h2>

<p>Looking similar to something I might have written in Lisp, I came up with this answer:</p>

<pre><code>primes = reverse . foldl fn []
    where fn acc n
              | n `dividesBy` acc = acc
              | otherwise         = (n:acc)
          dividesBy x (y:ys)
              | y == 1         = False
              | x `mod` y == 0 = True
              | otherwise      = dividesBy x ys
          dividesBy x [] = False</code></pre>

<p>But when I suggested this on <a href="irc://irc.freenode.net/haskell">#haskell</a>, someone pointed out that you can&#8217;t reverse an infinite list.  That&#8217;s when a light-bulb turned on: I hadn&#8217;t learned to think in infinites yet.  Although my function worked fine for discrete ranges, like <code>[1..100]</code>, it crashed on <code>[1..]</code>.</p>

<p>So back to the drawing board, later to come up with this infinite-friendly version:</p>

<pre><code>primes :: [Int] -&gt; [Int]
primes = fn []
    where fn _ [] = []
          fn acc (y:ys)
              | y `dividesBy` acc = fn acc ys
              | otherwise         = y : fn (y:acc) ys

          dividesBy _ [] = False
          dividesBy x (y:ys)
              | y == 1         = False
              | x `mod` y == 0 = True
              | otherwise      = dividesBy x ys</code></pre>

<p>Here the accumulator still grows with each prime found, but each prime is emitted as soon as it is discovered, so results come back in order and are only computed as they are demanded.  This time when I put <code>primes [1..]</code> into GHCi it printed out prime numbers immediately, but visibly slowed as the accumulator grew larger.</p>
]]>
    </content>
</entry>

<entry>
    <title>Journey into Haskell, part 5</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/journey-into-haskell-part-5.html" />
    <id>tag:www.newartisans.com,2009://1.3375</id>

    <published>2009-03-24T13:00:00Z</published>
    <updated>2009-03-23T05:28:22Z</updated>

    <summary>I actually imported the full source into HackPorts, ripped out its  List.hs  file, renamed it to my  Main.hs  file, and then began changing it from a function that prints out a list of available packages, to one that writes the data into properly formatted  Portfile  entries. 


...  As it does this, it fetches the current version&#8217;s tarball over HTTP, and uses OpenSSL (directly, through FFI) to generate MD5, SHA1 and RIPEMD160 checksums of the tarball image.  


...As a stub, I have them all depending on  port:ghc , but I think there&#8217;s sufficient information in the Cabal package info to figure out what the right dependencies should be, both among the Hackage packages themselves and against any external libraries (like OpenSSL). 


...Whereas  map  takes a list of values and returns a list of values,  mapM  takes a list of values and returns a list of actions that get invoked in sequence in the current Monad (in this case,  IO ). </summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="applications" label="Applications" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="fp" label="FP" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>Haskell may be difficult to start out with, but once things start rolling, they roll fast.  Yesterday (real world time, these blog entries are staggered) I had started the first lines of HackPorts, but now things are getting close to done for the first version.  It&#8217;s not that I&#8217;ve written much code, but that it was simple to integrate with other people&#8217;s code.</p>
]]>
        <![CDATA[<h2 id="borrowingallican">Borrowing all I can</h2>

<p>The first thing I wanted to do was avoid dealing with any of Hackage&#8217;s data formats, so I cribbed everything I could from the <code>cabal-install</code> package.  I actually imported the full source into HackPorts, ripped out its <code>List.hs</code> file, renamed it to my <code>Main.hs</code> file, and then began changing it from a function that prints out a list of available packages, to one that writes the data into properly formatted <code>Portfile</code> entries.</p>

<p>The code does the following bits of work:</p>

<ol>
<li><p>Talks to <code>cabal-install</code> and Cabal to get a list of all known packages on Hackage.</p></li>
<li><p>For every package, creates a directory named <code>haskell/$package</code>, and then writes information about that package into <code>haskell/$package/Portfile</code>.</p></li>
<li><p>As it does this, it fetches the current version&#8217;s tarball over HTTP, and uses OpenSSL (directly, through FFI) to generate MD5, SHA1 and RIPEMD160 checksums of the tarball image.</p></li>
</ol>

<p>And voilà, a directory populated with 1136 Portfile entries.  What&#8217;s missing now is the external dependency mapping.  As a stub, I have them all depending on <code>port:ghc</code>, but I think there&#8217;s sufficient information in the Cabal package info to figure out what the right dependencies should be, both among the Hackage packages themselves and against any external libraries (like OpenSSL).</p>

<h2 id="whatilearned">What I learned</h2>

<p>As for my Haskell education, I learned about using Haskell&#8217;s very nice FFI mechanism, and gained a lot more experience using the IO Monad.  An example of using FFI to call out to OpenSSL:</p>

<pre><code>{-# OPTIONS -#include "openssl/md5.h" #-}

foreign import ccall "openssl/md5.h MD5" c_md5
    :: Ptr CChar -&gt; CULong -&gt; Ptr CChar -&gt; IO (Ptr Word8)</code></pre>

<p>I now have access to a <code>c_md5</code> function, which goes directly over to the C library to do its work.  Not too shabby!</p>
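
<p>The same mechanism can be tried without OpenSSL installed.  Here is a minimal, self-contained sketch that assumes only the C math library (which GHC normally links already); the Haskell-side name <code>c_sin</code> is chosen just for illustration:</p>

<pre><code>{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble)

-- Bind the C function sin() from math.h.  It has no side effects,
-- so it can be imported as a pure function rather than an IO action.
foreign import ccall unsafe "math.h sin" c_sin :: CDouble -&gt; CDouble

main :: IO ()
main = print (c_sin 0)</code></pre>

<p>A function like <code>MD5</code>, which writes into a caller-supplied buffer, has to stay in <code>IO</code>, which is why the import above returns an <code>IO</code> value.</p>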

<p>As for the IO Monad, here is the <code>main</code> function for HackPorts:</p>

<pre><code>main :: IO ()
main = do
  createDirectoryIfMissing True "haskell"
  pkgs &lt;- allPackages verbose
  mapM writePortfile pkgs
  putStrLn "Hackage has been exported to MacPorts format in haskell/"</code></pre>

<p>The trickiest part for me was understanding how <code>mapM</code> differs from <code>map</code>.  Whereas <code>map</code> applies a function to each value in a list and returns a list of results, <code>mapM</code> applies a function that returns an action to each value, and yields a single action that runs those actions in sequence in the current Monad (in this case, <code>IO</code>) and collects their results.</p>
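
<p>A small sketch may make the difference concrete.  The function <code>announce</code> below is a hypothetical stand-in for <code>writePortfile</code>; it is not part of HackPorts:</p>

<pre><code>-- Hypothetical stand-in for writePortfile: prints a message and
-- returns the length of the package name.
announce :: String -&gt; IO Int
announce name = do
  putStrLn ("writing " ++ name)
  return (length name)

main :: IO ()
main = do
  -- map only builds a list of actions; none of them has run yet.
  let actions = map announce ["zlib", "HTTP"]
  print (length actions)
  -- mapM runs the actions in order and collects their results.
  lens &lt;- mapM announce ["zlib", "HTTP"]
  print lens
  -- mapM_ does the same but throws the results away, which suits
  -- functions that are called only for their effects.
  mapM_ announce ["mtl"]</code></pre>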
]]>
    </content>
</entry>

<entry>
    <title>How laziness changes thinking in Haskell</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/functional-yet-lazy.html" />
    <id>tag:www.newartisans.com,2009://1.3374</id>

    <published>2009-03-22T06:06:16Z</published>
    <updated>2009-03-23T21:10:30Z</updated>

    <summary>If it were a file I was checksumming, I could memory map the file and pass around a byte pointer, and the OS would take care of lazily reading in the bytes for me as needed.  

...In C++ I&#8217;d have to switch from passing a vector to passing an istream iterator, but in Haskell, I don&#8217;t care what algorithm is populating my list, only that it  is  a list, and that I know how to work it. 


...Based on the behavior of the program, I&#8217;m led to believe it happened near what the code was actually  doing &#8212; but in fact the problem may have started long, long before, except that laziness deferred the trigger to a later time. 


... I still think the benefits can outweigh the difficulties &#8212; especially when it comes to parallelism, and avoiding unnecessary computations, and allowing code to safely traverse infinite series &#8212; but it definitely requires a level of algorithmic consciousness on the part of the engineer which seems quite a bit higher than with imperative languages.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="fp" label="FP" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>As I explore Haskell, I&#8217;m discovering that one of its trickiest aspects is not structuring things functionally, but the lazy evaluation.  It turns out lazy evaluation comes with both great benefits, and significant difficulties.  I&#8217;d like to point a few of these out, as they&#8217;re becoming clearer to me.</p>
]]>
        <![CDATA[<h2 id="benefits">Benefits</h2>

<p>One of the great benefits of lazy evaluation is that your code <em>doesn&#8217;t need to account for the scale of an operation</em>.  Let&#8217;s take a simple example: checksumming a large file whose contents are being read, on-demand, over HTTP.</p>

<p>In C++, if I wanted to checksum a large file read over HTTP, I couldn&#8217;t buffer it in memory because I don&#8217;t know how large it might get.  Nor do I want my checksumming code to know anything about HTTP, or where the data comes from.  The answer here is to use I/O streams.  By passing a generic <code>istream</code> interface around, I can hide any knowledge of where the data came from.  The checksumming algorithm just reads data from the stream as it needs to, and the HTTP layer downloads more bytes as required (typically caching to avoid constant network access).</p>

<p>However, there&#8217;s a downside to this: the checksumming code now knows something about I/O.  In actuality, a checksumming algorithm only cares about the bytes being checksummed and little else.  It shouldn&#8217;t have to know about I/O, or strings, or any of the details of where data comes from or how it&#8217;s structured.  It should ideally receive a pointer to an arbitrarily large sequence of 8-bit bytes, and return a fixed-size checksum representing a fingerprint of those bytes.</p>

<p>Yet this naive approach can&#8217;t really be done in C++.  If it were a file I was checksumming, I could memory map the file and pass around a byte pointer, and the OS would take care of lazily reading in the bytes for me as needed.  But for a file being accessed over HTTP this would require first downloading the file and then checksumming it, when I specifically wanted to &#8220;checksum as I go&#8221;.  Who knows, maybe I&#8217;ll discover a reason to stop summing beyond a certain point and I&#8217;d like to stop downloading at that point as well.</p>

<p>Well, just as memory mapping gives me lazy access to the contents of a file, a language with lazy evaluation gives me lazy access to the results of any algorithm &#8211; including downloading data over HTTP.  With Haskell, I can indeed write my checksum algorithm as if it receives a giant byte buffer, and the language takes care of downloading only as much data as I&#8217;ve accessed (plus caching).  This simplifies my checksumming code, and reduces the amount of knowledge that has to be passed around, such as a &#8220;stream&#8221; as opposed to a generic, 8-bit pointer.<a href="#fn:1" id="fnref:1" class="footnote">1</a></p>

<p>This simplification lets you design your algorithm as if in an ideal world.  You want to process a bunch of numbers?  Work on a list.  What&#8217;s that you say, the numbers are coming in from a socket and you don&#8217;t know when they will end?  Doesn&#8217;t matter, just work on a list.  In C++ I&#8217;d have to switch from passing a vector to passing an istream iterator, but in Haskell, I don&#8217;t care what algorithm is populating my list, only that it <em>is</em> a list, and that I know how to work with it.</p>
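
<p>As a rough sketch of the idea, here is a toy checksum written against a lazy <code>ByteString</code>.  The <code>checksum</code> function and its byte-summing scheme are assumptions made for illustration (not MD5, and not code from any project mentioned here), and standard input stands in for the HTTP download:</p>

<pre><code>import qualified Data.ByteString.Lazy as BL
import Data.Word (Word32)

-- Hypothetical toy checksum: the sum of all bytes, wrapping at 2^32.
-- It sees nothing but a sequence of bytes.
checksum :: BL.ByteString -&gt; Word32
checksum = BL.foldl' (\acc byte -&gt; acc + fromIntegral byte) 0

main :: IO ()
main = do
  -- getContents is lazy: chunks are read from stdin only as the
  -- fold consumes them, so the input is never held in memory whole.
  bytes &lt;- BL.getContents
  print (checksum bytes)</code></pre>

<p>The pure function never learns whether its bytes came from a file, a pipe, or a socket; only <code>main</code> knows.</p>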

<h2 id="detriments">Detriments</h2>

<p>For all its beauty, laziness has three costs I&#8217;ve run into so far.  The first is that it lets you very easily write functioning algorithms with horrible performance characteristics.  This happens because laziness causes a promise<a href="#fn:2" id="fnref:2" class="footnote">2</a> to be constructed, which takes memory and time to do.  Sometimes, the cost of the underlying operation is far less than the memory cost of carrying a promise around to do that operation at a later point.  This isn&#8217;t true of a slow operation like reading from a socket, but it&#8217;s certainly true of something trivial like summing two integers.  It means one has to be aware of promises, when they&#8217;re constructed, and when it&#8217;s more beneficial to force evaluation always versus the benefits to be had from deferring a computation whose result may never be needed.</p>

<p>The second is that when a poorly performing algorithm dies, it dies when its value is used, not when the promises are made.  This can make it look like the consumer is to blame, when really it&#8217;s the producer.  Here is a trivial example:</p>

<pre><code>mysum = foldl (+) 0

main = print (mysum [1..1000000])</code></pre>

<p>Although <code>foldl</code> is tail-recursive, so we aren&#8217;t blowing stack through recursive calls, it still blows stack because it builds up a huge, nested structure of promises that only gets evaluated once print is called to render it as a string.  That is, the return value from mysum itself is no problem, it&#8217;s just a lazy computation against a large list.  But then print needs the result, so it asks mysum to fulfill its promise.  This in turn causes mysum to churn through the large list of integers, building up the return value as it goes.</p>

<p>However, and here is where the surprise comes in: foldl doesn&#8217;t actually compute those values as it walks the input list.  No, even these are done lazily, because it can&#8217;t know how many of those values will actually be needed.  We may know from looking at the code that it will need them all, but it doesn&#8217;t know.  So it constructs something in memory looking like this:</p>

<pre><code>((((((((0+1)+2)+3)+4)+5)+6)+7)+...)</code></pre>

<p>And so on, all the way to the last integer.  Only when <code>mysum</code> is done constructing promises across the entire input list, and the promise structure is returned, will it actually get evaluated by summing the integers together and finalizing each promise.  If you pick an input list large enough, there goes available memory.</p>

<p>The trick here is that the stack fault won&#8217;t occur in foldl, or in mysum.  It will occur in print, where the need to resolve the promise results in the call to mysum actually being made, which then calls foldl, which then starts building thunks until memory is gone.  In this trivial example there&#8217;s very little code or time distance between the problem and its cause, but in real world code there may be enormous gaps between them.</p>
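
<p>For this particular example, the usual remedy is the strict left fold <code>foldl'</code> from <code>Data.List</code>, which forces the accumulator at every step so that no chain of promises builds up.  A minimal sketch (whether the lazy version actually faults depends on the GHC version, optimization level and stack limit in use):</p>

<pre><code>import Data.List (foldl')

-- Same sum as before, but the running total is evaluated at each
-- step instead of being deferred until the end.
mysum :: [Integer] -&gt; Integer
mysum = foldl' (+) 0

main :: IO ()
main = print (mysum [1..1000000])   -- 500000500000</code></pre>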

<p>In consequence of this I learned that it&#8217;s <a href="http://www.haskell.org/ghc/docs/latest/html/users_guide/ghci-debugger.html#tracing">hard to get GHC to produce stack traces</a> for you when there&#8217;s a runtime error.  Your code can be going on its merry way, when suddenly there&#8217;s a stack fault.  But that&#8217;s all you see: a stack fault indicating something went wrong.  Where did it go wrong?  Based on the behavior of the program, I&#8217;m led to believe it happened near what the code was actually <em>doing</em><a href="#fn:3" id="fnref:3" class="footnote">3</a> &#8211; but in fact the problem may have started long, long before, except that laziness deferred the trigger to a later time.</p>

<p>So, even though laziness can delay costs and abstract how data is determined, by the same token it also delays errors and abstracts blame.  In C++ if I pass in an I/O stream and there&#8217;s a crash reading from it, I know to look at my stream code.  But in Haskell if I get a stack fault simply by processing a list, how am I to know what&#8217;s wrong?  It&#8217;s not going to be in the List code, and probably not in the code walking the list, but in code which promised to produce the list potentially a long time ago.</p>

<p>I still think the benefits can outweigh the difficulties &#8211; especially when it comes to parallelism, and avoiding unnecessary computations, and allowing code to safely traverse infinite series &#8211; but it definitely requires a level of algorithmic consciousness on the part of the engineer which seems quite a bit higher than with imperative languages.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1"><p>And if I do have to include state with this raw, lazy data, but I don&#8217;t want the algorithm to know anything about it?  That&#8217;s where the Monad steps in.  Say instead of checksumming a file, I&#8217;m parsing an expression.  There are a lot of details that go along with parsing that have little to do with interpreting the next bit of text, such as token position, error context, backtracking information, etc.  I want to be able to write a routine that parses a number very simply, without knowing about all those details.  It&#8217;s the Monad that manages this extra information.  You can <a href="http://en.wikibooks.org/wiki/Haskell/Practical_monads#Parsing_monads">read more here</a>.<a href="#fnref:1" class="reversefootnote">&#160;&#8617;</a></p></li>

<li id="fn:2"><p>Promises are what get turned into real values when data is finally needed.<a href="#fnref:2" class="reversefootnote">&#160;&#8617;</a></p></li>

<li id="fn:3"><p>If you use profiling libraries along with <code>-prof -auto-all</code>, you can get a much clearer picture of what was executing at the time of the fault.<a href="#fnref:3" class="reversefootnote">&#160;&#8617;</a></p></li>

</ol>
</div>
]]>
    </content>
</entry>

<entry>
    <title>Journey into Haskell, part 4</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/journey-into-haskell-part-4.html" />
    <id>tag:www.newartisans.com,2009://1.3373</id>

    <published>2009-03-21T08:18:23Z</published>
    <updated>2009-03-23T05:03:10Z</updated>

    <summary>In the meantime, I&#8217;ve picked a toy project that also has a taste of usefulness: a script to convert the Hackage database into MacPorts Portfiles, respecting inter-package and external library dependencies.  

...The impure part takes a command-line argument, interprets it as a  FilePath  (an impure type, since it must concern itself with operating system-dependent naming conventions), and reads the contents of the file at that location.  

... This division into pure and impure has an interesting side-effect (no pun intended):  Most of a program&#8217;s code is written in isolation of its context of usage .  

... Too many times I&#8217;ve tried to use a utility&#8217;s code as a &#8220;library&#8221;, only to find  it was so caught up in its idea of how it should be used, it had never bothered to abstract its core principles into a set of &#8220;pure&#8221; functions, independent from that intent.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="fp" label="FP" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>I&#8217;ve been reading <a href="http://book.realworldhaskell.org/">Real World Haskell</a> now, after having finished the delightful <a href="http://learnyouahaskell.com/">Learn You a Haskell</a> Tutorial.  I&#8217;m up to chapter 6, about to dive into Typeclasses.  In the meantime, I&#8217;ve picked a toy project that also has a taste of usefulness: a script to convert the Hackage database into MacPorts Portfiles, respecting inter-package and external library dependencies.  I call it <a href="http://github.com/jwiegley/hackports">HackPorts</a>, of course.</p>
]]>
        <![CDATA[<h2 id="requirements">Requirements</h2>

<p>This translation should require two things:</p>

<ol>
<li><p>The Cabal package, for reading information about all packages known to it.  This avoids writing a custom parser, or using HTTP to crawl the online Hackage database.</p></li>
<li><p>A mapping file of external dependency names to MacPorts port names.  This is for dependencies on things like <code>libbz2</code>, where the script will need to be taught how MacPorts names that library.  This is likely to be the most labor-intensive step, having nothing to do with Haskell.</p></li>
</ol>

<h2 id="initialexperiences">Initial experiences</h2>

<p>Haskell makes a concerted point about separating &#8220;pure&#8221; from &#8220;impure&#8221; code.  Anything which talks to the outside world, such as reading and writing files, is impure.  Anything which can be expressed in terms of standard data types &#8211; or compositions thereof &#8211; is pure.</p>

<p>Take for example a program to count lines in a file.  The pure part of the code receives a giant string, splits it into lines at line boundaries, counts those lines, and returns an integer.  The impure part takes a command-line argument, interprets it as a <code>FilePath</code> (an impure type, since it must concern itself with operating system-dependent naming conventions), and reads the contents of the file at that location.  The program flows by passing the file contents as a string to the pure code, and receiving an integer to be printed on the output device.</p>
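
<p>A minimal sketch of that line counter might look like the following; the name <code>countLines</code> and the usage message are made up for illustration:</p>

<pre><code>import System.Environment (getArgs)

-- Pure part: a string goes in, a line count comes out.
countLines :: String -&gt; Int
countLines = length . lines

-- Impure part: read the argument, read the file, print the result.
main :: IO ()
main = do
  args &lt;- getArgs
  case args of
    [path] -&gt; do
      contents &lt;- readFile path
      print (countLines contents)
    _ -&gt; putStrLn "usage: countlines FILE"</code></pre>

<p>Here <code>countLines</code> can be loaded into GHCi and tried on any string without touching a file at all.</p>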

<p>This division into pure and impure has an interesting side-effect (no pun intended): <em>Most of a program&#8217;s code is written in isolation of its context of usage</em>.  Take Cabal, as a case in point here.  Part of Cabal deals with downloading information from the Web, reading and writing package files, and executing external commands, like <code>make</code>.  But another part of Cabal is concerned only with the structure of package files, and determining the total set of dependencies required for building a package.  These latter details can be discussed in complete isolation from what is done with that information.</p>

<p>As a result &#8211; and I&#8217;m not sure whether the Cabal authors designed it this way or not &#8211; Cabal is naturally part &#8220;program&#8221;, and part API.  I was able to start taking apart package files almost instantly, with extremely little code.  Here&#8217;s a toy program to print out a package&#8217;s maintainer, if given the path to a <code>.cabal</code> file:</p>

<pre><code>import System.Environment (getArgs)

import Distribution.Verbosity (verbose)
import Distribution.PackageDescription
import Distribution.PackageDescription.Parse (readPackageDescription)

main = do
  args &lt;- getArgs
  pkg  &lt;- readPackageDescription verbose (head args)
  print . maintainer . packageDescription $ pkg</code></pre>

<p>Now, I do suppose it&#8217;s just as easy to do a similar thing in Python&#8217;s distutils, for example:</p>

<pre><code>import sys

from distutils.extension import *

exts = read_setup_file(sys.argv[1])
print exts[0].language       # print the ext 'language'</code></pre>

<p>What excites me is that Haskell uniquely encourages the separation of algorithm and application &#8211; the isolation of context-dependent knowledge into as small a region of a program as possible.</p>

<p>Too many times I&#8217;ve tried to use a utility&#8217;s code as a &#8220;library&#8221;, only to find it was so caught up in its idea of how it should be used, it had never bothered to abstract its core principles into a set of &#8220;pure&#8221; functions, independent from that intent.  This happens, for example, with the version control system Git.  Although many have wanted a <code>libgit.a</code> for accessing Git&#8217;s data structures directly from other languages, none exists.  One is forced to either shell out to the <code>git</code> command, or write another implementation to interface with the &#8220;pure&#8221; side of what Git does.</p>
]]>
    </content>
</entry>

<entry>
    <title>Updated site to use Blueprint CSS again</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/updated-site-to-use-blueprint-css-again.html" />
    <id>tag:www.newartisans.com,2009://1.3372</id>

    <published>2009-03-20T08:27:44Z</published>
    <updated>2009-03-23T05:04:32Z</updated>

    <summary>Recently I changed how the content on this site was generated, from using the standalone OS X application  RapidWeaver , to the server-side publishing platform  Movable Type .    During that transition I changed the site&#8217;s style to the minimalist default offered by MT, which uses its own CSS for column layout and typography. 


...I used the superb application  CSSEdit  to help me massage Movable Type&#8217;s style into something compatible with Blueprint&#8217;s own typography and layout. 


...I&#8217;m aware code examples were being truncated on the right side before, but this should be corrected now.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="web" label="Web" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>Recently I changed how the content on this site was generated, from using the standalone OS X application <a href="http://www.realmacsoftware.com/rapidweaver/">RapidWeaver</a>, to the server-side publishing platform <a href="http://www.movabletype.org/">Movable Type</a>.  During that transition I changed the site&#8217;s style to the minimalist default offered by MT, which uses its own CSS for column layout and typography.</p>

<p>Tonight I finally got around to switching the site back to <a href="http://github.com/joshuaclayton/blueprint-css/">blueprint-css</a>, which I very much prefer.  I used the superb application <a href="http://macrabbit.com/cssedit/">CSSEdit</a> to help me massage Movable Type&#8217;s style into something compatible with Blueprint&#8217;s own typography and layout.</p>

<p>I hope the result is pleasing.  If anyone sees strange artifacts or display issues, please <a href="mailto:johnw@newartisans.com">let me know</a>.  I&#8217;m aware code examples were being truncated on the right side before, but this should be corrected now.  More on Haskell to come soon!</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Journey into Haskell, part 3</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/journey-into-haskell-part-3.html" />
    <id>tag:www.newartisans.com,2009://1.3371</id>

    <published>2009-03-19T00:59:06Z</published>
    <updated>2009-03-23T05:05:26Z</updated>

    <summary>The task at hand is to write a wrapper script for  /usr/bin/ld  that drops arguments beginning with  -Wl,-rpath, .  

... Here  ld-wrapper  is expected to return its arguments separated by  NUL  characters so I can feed it to  xargs , and from there to  /usr/bin/ld .    I&#8217;m sure there&#8217;s an easy, all-in-one way to do this with Haskell, I just haven&#8217;t reached that chapter yet. 


... I wanted to do this with an &#8220;inverse grep&#8221; instead of  select , but couldn&#8217;t find a way to grep for the opposite of a pattern.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="fp" label="FP" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>Today I needed a wrapper script to drop arguments from a command line.  I instinctively reached for <code>bash</code>, but then thought it would be a good exercise for my infant Haskell knowledge.</p>
]]>
        <![CDATA[<h2 id="thetask">The task</h2>

<p>The task at hand is to write a wrapper script for <code>/usr/bin/ld</code> that drops arguments beginning with <code>-Wl,-rpath,</code>.  Since it must deal with arguments containing spaces, and I didn&#8217;t want to get into executing external programs with Haskell just yet, I wrappered the wrapper:</p>

<pre><code>#!/bin/bash
"$(dirname "$0")"/ld-wrapper "$@" | xargs -0 /usr/bin/ld</code></pre>

<p>Here <code>ld-wrapper</code> is expected to return its arguments separated by <code>NUL</code> characters so I can feed it to <code>xargs</code>, and from there to <code>/usr/bin/ld</code>.  I&#8217;m sure there&#8217;s an easy, all-in-one way to do this with Haskell, I just haven&#8217;t reached that chapter yet.</p>

<h2 id="haskellversion">Haskell version</h2>

<p>Anyway, here is the Haskell script:</p>

<pre><code>import Data.List
import System.Environment

main = do
  args &lt;- getArgs
  putStr $ intercalate "\0"
         $ filter (not . isPrefixOf "-Wl,-rpath") args</code></pre>

<p>Pretty basic: it filters the input arguments, keeping each one which does not begin with the sought-for string, and joins the list together using <code>NUL</code> as the separator.</p>
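
<p>For what it&#8217;s worth, an all-in-one version that needs no shell wrapper can be sketched with <code>rawSystem</code> from <code>System.Process</code>, which passes each argument to the program untouched, spaces and all.  This is an untimed sketch rather than the script used for the measurements in this post:</p>

<pre><code>import Data.List (isPrefixOf)
import System.Environment (getArgs)
import System.Exit (exitWith)
import System.Process (rawSystem)

main :: IO ()
main = do
  args &lt;- getArgs
  -- Keep every argument that does not start with the rpath option.
  let kept = filter (not . isPrefixOf "-Wl,-rpath,") args
  -- Run the real linker directly and pass its exit code through.
  code &lt;- rawSystem "/usr/bin/ld" kept
  exitWith code</code></pre>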

<h2 id="rubyversion">Ruby version</h2>

<p>As a quick sanity check, I wrote the same thing in Ruby, since it has facilities for being just as succinct:</p>

<pre><code>print ARGV.select {
  |y| !y.start_with?("-Wl,-rpath")
}.join("\0") + "\0"</code></pre>

<p>I wanted to do this with an &#8220;inverse grep&#8221; instead of <code>select</code>, but couldn&#8217;t find a way to grep for the opposite of a pattern.</p>

<p>What&#8217;s interesting is that the Ruby version is marginally faster than the compiled Haskell one.  For filtering 40,000 arguments, here are the averaged run-times over 20 invocations:</p>

<table>
<col />
<col />
<thead>
<tr>
	<th>Language</th>
	<th>Average run time (seconds)</th>
</tr>
</thead>
<tbody>
<tr>
	<td>Haskell</td>
	<td>0.00774523019791</td>
</tr>
<tr>
	<td>Ruby</td>
	<td>0.00551697015762</td>
</tr>
</tbody>
</table>

<p>My guess is that Haskell is creating 40,000 different strings in memory as it constructs the final result, while Ruby is pasting one together as it goes, but I don&#8217;t know for sure.</p>

<p><strong>UPDATE</strong>: If I compile the Haskell version with <code>-O2</code>, it becomes a hair faster than Ruby, at 0.0049 seconds compared to 0.0055.  If I switch to lazy bytestrings, it drops just a hair to 0.0048.</p>
]]>
    </content>
</entry>

<entry>
    <title>Journey into Haskell, part 2</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/journey-into-haskell-part-2.html" />
    <id>tag:www.newartisans.com,2009://1.3370</id>

    <published>2009-03-18T07:07:37Z</published>
    <updated>2009-03-23T05:07:28Z</updated>

    <summary>It creates a Schroedinger type which has two data constructors: an Opened constructor which takes a Probable object &#8212; that is, whose Live or Dead state is known &#8212; and an Unopened constructor which takes a random generator, and an object without a particular state, such as a Cat. 


... If, however, you bind a function to an Opened box with a Live thing, it will apply the function to what&#8217;s in the box &#8212; in this case, the Cat itself.  

... Here is the meat of this example, its reason for being, all contained within this one line: If you bind a function to an Unopened box, it gets bound in turn to an Opened box containing a Cat whose fate has been decided by the dice.  

... This is fairly linear: it gets a random generator from the operating system, then creates an Unopened box and returns it, which gets printed.   print  does its work by calling  show  on the Schroedinger type, since it was derived from  Show  earlier.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="fp" label="FP" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>Everybody talks about Monads when they mention Haskell, so I got a bit ahead of myself and wanted to see something of what they&#8217;re about.  No, don&#8217;t worry, I&#8217;m not aspiring to yet another Monad tutorial.  I feel I have a ways to go before I&#8217;m ready to craft my own light-saber.</p>

<p>I did read about 10 Monad articles on the Web, and found myself more confused when I came out than when I went in.  Today&#8217;s exercise took about 5-6 hours of pure frustration, before a kind soul on IRC finally set me straight.  It sure is difficult when getting past a single compiler error takes you <em>hours</em>.</p>
]]>
        <![CDATA[<h2 id="thatbedeviledcat">That bedeviled cat</h2>

<p>Most geeks know about Schrödinger&#8217;s cat, the fated beast who, when put into a box with a random source tied to a deadly gas trigger, remains in a state of quantum superposition in which he&#8217;s neither alive nor dead until someone opens the box to look.</p>

<p>Well, people kept saying that Monads are like &#8220;computational containers&#8221;, so I wanted to model the following:</p>

<ol>
<li>There is a Schroedinger Monad into which you can put a Cat.</li>
<li>When you create the Monad, it is Unopened, and the Cat has no state.</li>
<li>You also pass in a random generator from the outside world.  This involves another Monad, the IO Monad, because randomness relates to the &#8220;world outside&#8221;.</li>
<li>As long as you don&#8217;t use the monad object, the Cat is neither Dead nor Live.</li>
<li>As soon as you peek into the box, or use it in any calculation, the Cat&#8217;s fate is decided by a roll of the dice.</li>
</ol>

<p>When I run the program ten times in a row, here&#8217;s what I get:</p>

<pre><code>Opened (Live (Cat "Felix"))
Opened Dead
Opened Dead
Opened Dead
Opened Dead
Opened (Live (Cat "Felix"))
Opened Dead
Opened Dead
Opened (Live (Cat "Felix"))
Opened (Live (Cat "Felix"))</code></pre>

<p>Let&#8217;s look at the code, and where I had troubles writing it.</p>

<h2 id="aflipofthecoin">A flip of the coin</h2>

<p>The first function flips a coin and returns True or False to represent Heads or Tails:</p>

<pre><code>import System.Random

flipCoin :: StdGen -&gt; Bool
flipCoin gen = fst $ random gen</code></pre>

<p>The sugar <code>fst $ random gen</code> is just shorthand for <code>fst (random gen)</code>.  There is no difference; I was just playing with syntax.  You do need to pass in a valid random generator, of type StdGen, for the function to work.</p>

<h2 id="cats">Cats</h2>

<pre><code>data Cat = Cat String deriving Show
data Probable a = Dead | Live a deriving Show</code></pre>

<p>These two types let me make Cats out of Strings, along with a Probable type which models a Live thing or a Dead thing.  It treats all Dead things as equal.  I can create a Live Cat with:</p>

<pre><code>felix = Live (Cat "Felix")</code></pre>

<p>Following my &#8220;fun with syntax&#8221; up above, I could also have written:</p>

<pre><code>felix = Live $ Cat "Felix"</code></pre>

<p>It doesn&#8217;t matter which.  The <code>$</code> operator is function application, just like the space between a function and its argument, but with the lowest possible precedence, so that parentheses aren&#8217;t needed around the argument.  Without either the parens or the <code>$</code>, it would look like I was calling <code>Live</code> with two separate arguments: <code>Cat</code> and <code>"Felix"</code>.</p>
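
<p>A small sketch of the same idea with two applications in a row; the string is an arbitrary example and only Prelude functions are used:</p>

<pre><code>main :: IO ()
main = do
  -- Explicit parentheses around each argument
  print (length (words "one two three"))
  -- The same expression, with ($) standing in for the parentheses
  print $ length $ words "one two three"</code></pre>

<p>Both lines print <code>3</code>, since <code>$</code> changes only how the expression is grouped.</p>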

<h2 id="flippingacat">Flipping a Cat</h2>

<pre><code>flipCat :: StdGen -&gt; a -&gt; Probable a
flipCat gen cat = if flipCoin gen 
                  then Live cat
                  else Dead</code></pre>

<p>When I have a Cat, I can subject it to a coin toss in order to get back a Live Cat or a Dead one.  I should probably have called this function <code>randomGasTrigger</code>, but hey.</p>

<p>The type of the function says that it expects a random generator (for <code>flipCoin</code>), some thing, and returns a Probable instance of that thing.  The Probable means &#8220;can be Live or Dead&#8221;, according to how I defined the type above.  The rest of the function is pretty clear, since it looks a lot like its imperative cousin would have.</p>

<h2 id="bringinginschroedinger">Bringing in Schroedinger</h2>

<pre><code>data Schroedinger a
    = Opened (Probable a)
    | Unopened StdGen a deriving Show</code></pre>

<p>This type declaration is more complicated.  It creates a Schroedinger type which has two data constructors: an Opened constructor which takes a Probable object &#8211; that is, whose Live or Dead state is known &#8211; and an Unopened constructor which takes a random generator, and an object without a particular state, such as a Cat.</p>

<p>Some values I could create with this type:</p>

<pre><code>felix   = Opened (Live (Cat "Felix")) -- lucky Felix
poorGuy = Opened Dead                 -- DOA
unknown = Unopened (mkStdGen 100) (Cat "Felix")</code></pre>

<p>In the third case, the idea is that his fate will be determined by the random generator created with <code>mkStdGen 100</code>.  However, I want a <em>real</em> random source, so I&#8217;m going to get one from the environment later.</p>

<h2 id="herecomesthemonad">Here comes the Monad</h2>

<pre><code>instance Monad Schroedinger where
    Opened Dead &gt;&gt;= _ = Opened Dead
    Opened (Live a) &gt;&gt;= f = f a
    Unopened y x &gt;&gt;= f = Opened (flipCat y x) &gt;&gt;= f
    return x = Opened (Live x)</code></pre>

<p>As complex as Monads sound on the Web, they are trivial to define.  Maybe it&#8217;s a lot like binary code: nothing could be simpler than ones and zeroes, yet consider that <em>all</em> complexity expressible by computers, down to video, audio, programming languages, and reading this article, is contained within the possibilities of those two digits.  Yeah.  Monads are a little like that.</p>

<p>This useless Monad just illustrates how to define one, so let&#8217;s cut it apart piece by piece.  By the way, I didn&#8217;t author this thing, I just started it.  Much of its definition was completed by folks on IRC, who had to wipe the drool from my face toward the end.</p>

<pre><code>instance Monad Schroedinger where</code></pre>

<p>Says that my Schroedinger type now participates in the joy and fun of Monads!  He can be discussed at parties with much auspiciousness.</p>

<pre><code>    Opened Dead &gt;&gt;= _ = Opened Dead</code></pre>

<p>The <code>&gt;&gt;=</code> operator is the &#8220;bind&#8221; function.  It is what runs when you bind a function to a Monad, which is like applying the function to what the Monad holds.  This line says that if you apply a function to an Opened box containing a Dead thing, what you&#8217;ll get back is an Opened box with a Dead thing.</p>

<pre><code>    Opened (Live a) &gt;&gt;= f = f a</code></pre>

<p>If, however, you bind a function to an Opened box with a Live thing, it will apply the function to what&#8217;s in the box &#8211; in this case, the Cat itself.  The function <code>f</code> is assumed to return another instance of the Schroedinger type, most likely containing the same cat or some transformed version of it.</p>

<pre><code>    Unopened y x &gt;&gt;= f = Opened (flipCat y x) &gt;&gt;= f</code></pre>

<p>Here is the meat of this example, its reason for being, all contained within this one line: If you bind a function to an Unopened box, it gets bound in turn to an Opened box containing a Cat whose fate has been decided by the dice.  That&#8217;s all.  The reason I used a Monad to do this is to defer the cat&#8217;s fate until someone actually looked inside the container.</p>

<pre><code>    return x = Opened (Live x)</code></pre>

<p>Lastly, if someone returns a cat from a box, assume it&#8217;s an Opened box with a Live Cat.  I don&#8217;t honestly understand why this is necessary, but it seems Opened Dead cats are handled by the binding above, as shown by the output from my program.  I&#8217;ll have to figure this part out soon&#8230;</p>
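
<p>A caveat for newer compilers: since GHC 7.10, every Monad must also be a Functor and an Applicative, so the instance above is rejected on its own.  What follows is a minimal sketch of the extra boilerplate, not the code from the original program.  It assumes the <code>random</code> package is installed, folds <code>flipCoin</code> into <code>flipCat</code> for brevity, and borrows <code>liftM</code> and <code>ap</code> from <code>Control.Monad</code>, which simply reuse <code>&gt;&gt;=</code> and <code>return</code>:</p>

<pre><code>import Control.Monad (ap, liftM)
import System.Random

data Cat = Cat String deriving Show
data Probable a = Dead | Live a deriving Show

data Schroedinger a
    = Opened (Probable a)
    | Unopened StdGen a deriving Show

flipCat :: StdGen -&gt; a -&gt; Probable a
flipCat gen cat = if fst (random gen) then Live cat else Dead

-- Functor and Applicative are derived from the Monad instance below
instance Functor Schroedinger where
    fmap = liftM

instance Applicative Schroedinger where
    pure x = Opened (Live x)   -- plays the role of return
    (&lt;*&gt;)  = ap

instance Monad Schroedinger where
    Opened Dead     &gt;&gt;= _ = Opened Dead
    Opened (Live a) &gt;&gt;= f = f a
    Unopened y x    &gt;&gt;= f = Opened (flipCat y x) &gt;&gt;= f

main :: IO ()
main = do
  gen &lt;- getStdGen
  print (Unopened gen (Cat "Felix") &gt;&gt;= return)</code></pre>

<p>Here <code>return</code> is left to its default, which is <code>pure</code>, so it still produces an Opened box with a Live thing in it.</p>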

<h2 id="themainfunction">The main function</h2>

<p>The last part of the example is the main routine:</p>

<pre><code>main = do
  gen &lt;- getStdGen
  print (do
          box &lt;- Unopened gen (Cat "Felix")
          -- The cat's fate is undecided
          return box)</code></pre>

<p>This is fairly linear: it gets a random generator from the operating system, then creates an Unopened box and returns it, which gets printed.  <code>print</code> does its work by calling <code>show</code> on the Schroedinger type, since that type was declared with <code>deriving Show</code> earlier.</p>
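
<p>The inner <code>do</code> block is only notation for a call to <code>&gt;&gt;=</code>.  Here is a sketch of that desugaring, using the standard Maybe monad as a stand-in so that it runs without any of the definitions above (<code>sugared</code> and <code>desugared</code> are illustrative names):</p>

<pre><code>sugared :: Maybe String
sugared = do
  box &lt;- Just "Felix"
  return box

-- What the do block above is rewritten into
desugared :: Maybe String
desugared = Just "Felix" &gt;&gt;= \box -&gt; return box

main :: IO ()
main = print (sugared == desugared)   -- prints True</code></pre>

<p>By the same rule, the block inside <code>print</code> is <code>Unopened gen (Cat "Felix") &gt;&gt;= \box -&gt; return box</code>, so the <code>Unopened</code> equation of the instance is the one that applies.</p>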

<p>Something I still don&#8217;t understand: at exactly which point does the flipping happen?  When <code>box</code> is returned?  When <code>show</code> gets called?  Or when <code>print</code> actually needs the value from <code>show</code> in order to pass it out to the IO subsystem?</p>

<h2 id="closingthoughts">Closing thoughts</h2>

<p>The full version of this code is <a href="http://ftp.newartisans.com/pub/haskell/schroedinger3.hs">on my server</a>.  There is also <a href="http://ftp.newartisans.com/pub/haskell/schroedinger.hs">a simpler version without Monads</a>.  I worked on the Monad version just to tweak my brain.  At least I can say I&#8217;m closer to understanding them than when I started.</p>
]]>
    </content>
</entry>

<entry>
    <title>Journey into Haskell, Part 1</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/journey-into-haskell-part-1.html" />
    <id>tag:www.newartisans.com,2009://1.3369</id>

    <published>2009-03-16T17:12:14Z</published>
    <updated>2009-03-23T05:08:10Z</updated>

    <summary>Having just begun my descent down the rabbit hole, into a Haskell land of hiding grins and late late hatters, I thought I&#8217;d try journaling what I discover on the way, so that maybe those who are merely curious could play the part of language voyeur.  

... This function starts out the list with 1, followed by 1, then it starts adding two lists together &#8212; provided by the same function before it&#8217;s even done!  

...I kept thinking it was something I had to return as I went along, not passed down to each deeper level &#8212; and then returned after I&#8217;d added to it.  

...If that first character is a colon, return a list with the current accumulator as the head, and recurse to process the rest of the string (and so on).</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="fp" label="FP" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>Having just begun my descent down the rabbit hole, I thought I&#8217;d try journaling about what I discover along the way, so that those who are merely curious can play the part of language voyeur.  I&#8217;ve always wanted to do that: to see how someone dives into Erlang or O&#8217;Caml or Forth &#8211; or Haskell.  Here&#8217;s your chance.</p>
]]>
        <![CDATA[<p>This is day 5 of the Haskell experience, and I&#8217;m having quite a bit of fun so far.  It&#8217;s definitely twisting my head into pretzel shapes.  I&#8217;ve spent hours getting less done than I could achieve with Python in moments.  The hope is that all this retraining will pay off further down the road.</p>

<h2 id="fibonacci">Fibonacci</h2>

<p>My first attempt was a Fibonacci function, which I failed at miserably.  Turns out I was unable to conceive of &#8220;lazy recursion&#8221;.  When I looked up the answer, it just seemed beautiful:</p>

<pre><code>fib = 1 : 1 : zipWith (+) fib (tail fib)</code></pre>

<p>This function starts out the list with 1, followed by 1, then it starts adding two lists together &#8211; provided by the same function before it&#8217;s even done!  In imperative land this would blow the stack in a heartbeat, but in Haskell it makes sense.  The recursive call to <code>fib</code> returns <code>1, 1, 2, 3, 5</code> and the recursive call to <code>tail fib</code> returns <code>1, 2, 3, 5</code>.  Add them together, and you get the sequence.</p>
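
<p>To see the two lists line up, here is a short sketch that prints a prefix of each one along with their element-wise sum.  The <code>[Integer]</code> signature is an addition for clarity:</p>

<pre><code>fib :: [Integer]
fib = 1 : 1 : zipWith (+) fib (tail fib)

main :: IO ()
main = do
  print (take 5 fib)                          -- [1,1,2,3,5]
  print (take 5 (tail fib))                   -- [1,2,3,5,8]
  print (take 5 (zipWith (+) fib (tail fib))) -- [2,3,5,8,13]</code></pre>

<p>The last line is exactly the part of the sequence that follows the two leading ones.</p>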

<p>There is also the traditional definition, which matches what you find in math textbooks:</p>

<pre><code>fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)</code></pre>

<p>If evaluated at the interactive prompt, the list version of <code>fib</code> (the one-liner, not the textbook function) will generate numbers forever, so you have to ask for just a few, like the first 20:</p>

<pre><code>take 20 fib</code></pre>
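
<p>Both definitions above are named <code>fib</code>, so they cannot live in the same file.  A sketch with the hypothetical names <code>fibs</code> (the lazy list) and <code>fibAt</code> (the textbook function) shows how each one is used:</p>

<pre><code>fibs :: [Integer]
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

fibAt :: Int -&gt; Integer
fibAt 0 = 0
fibAt 1 = 1
fibAt n = fibAt (n - 1) + fibAt (n - 2)

main :: IO ()
main = do
  print (take 10 fibs)        -- [1,1,2,3,5,8,13,21,34,55]
  print (map fibAt [1 .. 10]) -- [1,1,2,3,5,8,13,21,34,55]</code></pre>

<p>Only the list can be handed to <code>take</code>; the function has to be asked for one index at a time.</p>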

<p>So, things began with my face on the ground, which was humbling, but also refreshing that such a simple problem could floor me so easily.</p>

<h2 id="splittingstrings">Splitting strings</h2>

<p>The next problem I tried to tackle was splitting a string into substrings at each colon.  That is:</p>

<pre><code>"Hello:World"
  =&gt; ["Hello", "World"]</code></pre>

<p>Again, fail.  How shocking it was to spend over an hour on this and ultimately have to resort to Google.  The answer was pretty straightforward:</p>

<pre><code>splitAtColons :: String -&gt; [String]
splitAtColons = sac' []
    where sac' acc []       = [acc]
          sac' acc (':':xs) = acc : sac' [] xs
          sac' acc (x:xs)   = sac' (acc ++ [x]) xs</code></pre>

<p>What I missed was using an accumulator to collect the current string.  I kept thinking it was something I had to return as I went along, not passed down to each deeper level &#8211; and then returned after I&#8217;d added to it.  Here&#8217;s the breakdown:</p>

<pre><code>splitAtColons :: String -&gt; [String]</code></pre>

<p>Defines the type of the function as something which takes a <code>String</code> and returns a list of <code>String</code>.</p>

<pre><code>splitAtColons = sac' []</code></pre>

<p>This is essentially what I missed.  The definition of <code>splitAtColons</code> calls a sub-function, passing in an empty string (aka list) as the &#8220;accumulator&#8221;.</p>

<pre><code>    where sac' acc [] = [acc]</code></pre>

<p>If <code>sac'</code> sees an empty string (<code>[]</code>) &#8211; the end of the string currently being processed &#8211; return the accumulated string in its own list.</p>

<pre><code>          sac' acc (':':xs) = acc : sac' [] xs
          sac' acc (x:xs)   = sac' (acc ++ [x]) xs</code></pre>

<p>Otherwise, take apart the current string into its first character, <code>x</code>, and the remainder, <code>xs</code>.  If that first character is a colon, return a list with the current accumulator as the head, and recurse to process the rest of the string (and so on).  Otherwise, add the non-colon character to the current accumulator, and recurse to process the rest of the string.</p>
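
<p>For comparison, here is a sketch of the same split without an explicit accumulator, using <code>break</code> from the Prelude.  This is a variation, not the answer found above, and <code>splitOnColon</code> is a made-up name:</p>

<pre><code>splitOnColon :: String -&gt; [String]
splitOnColon s = case break (== ':') s of
    (chunk, [])       -&gt; [chunk]                     -- no colon left
    (chunk, _ : rest) -&gt; chunk : splitOnColon rest   -- drop the colon, go on

main :: IO ()
main = do
  print (splitOnColon "Hello:World")  -- ["Hello","World"]
  print (splitOnColon "a::b")         -- ["a","","b"]</code></pre>

<p>Here <code>break</code> does the job of the accumulator: it hands back everything before the first colon together with whatever follows it.</p>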

<h2 id="firstreactions">First reactions</h2>

<p>Moral of my first story: prepare to be humbled.  Google and IRC were a lifeline, and the people on <a href="irc://irc.freenode.net/haskell">#haskell</a> were both helpful and patient.  More soon.</p>
]]>
    </content>
</entry>

<entry>
    <title>The JVM, and costs vs. benefits</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/the-jvm-and-costs-vs-benefits.html" />
    <id>tag:www.newartisans.com,2009://1.3368</id>

    <published>2009-03-15T23:28:30Z</published>
    <updated>2009-03-23T05:17:34Z</updated>

    <summary>In a  recent entry  on differences between Haskell and Lisp, one of the Lisp community&#8217;s long-time members, Daniel Weinreb, asked about my stated aversion to JVM-based languages for everyday computing (some times referred to as &#8220;scripting&#8221;).  

...   The fact that you distinguish between server-side and client-side applications suggests to me that what you’re really talking about is start-up latency: you’re saying that a very small program written for the JVM nevertheless has a significant fixed overhead that causes perceived latency to the user. 

...   As a hypothetical question just to clarify your meaning: if there were a JVM implementation that started up instantly, so that the speed of execution of a small program would be the same as the speed of the same code appearing in the middle of a long-running server process, would that answer your objections? 


...Also, what you said about the JIT, and alternative VMs, can be supplemented by mentioning all the other JVM facilities that exist, like code coverage, performance and memory analysis, and live introspection; along with the ability to pick JVMs to run on phones, or satisfy real-time computing requirements.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="java" label="Java" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="jvm" label="JVM" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="lisp" label="Lisp" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="optimization" label="Optimization" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>In a <a href="/2009/03/hello-haskell-goodbye-lisp.html">recent entry</a> on differences between Haskell and Lisp, one of the Lisp community&#8217;s long-time members, Daniel Weinreb, asked about my stated aversion to JVM-based languages for everyday computing (sometimes referred to as &#8220;scripting&#8221;).  Specifically, it was asked in relation to Clojure, and why I haven&#8217;t been immediately taken by that language &#8211; despite its having so many features I respect and admire.</p>

<p>I wanted to respond to Daniel&#8217;s question in a separate blog entry, since this topic has come up so often, it seems, and deserves thought.  The JVM is a rich, mature platform, and you get so much for free by designing new languages on top of it.  The point of debate is: what are the costs, and are they always worth the asking price?</p>
]]>
        <![CDATA[<p>Daniel&#8217;s question was:</p>

<blockquote>
  <p>In your own case, you mention “tiny” and “fast-running” executables. I am not sure why “tiny” matters these days: disk space is very cheap, and the byte code used by the JVM is compact.  Common Lisp programs compiled with one of the major implementations, and programs written for the Java Virtual Machine, execute at very high speed.</p>
  
  <p>The fact that you distinguish between server-side and client-side applications suggests to me that what you’re really talking about is start-up latency: you’re saying that a very small program written for the JVM nevertheless has a significant fixed overhead that causes perceived latency to the user. Is that what you have in mind?[&#8230;]</p>
  
  <p>As a hypothetical question just to clarify your meaning: if there were a JVM implementation that started up instantly, so that the speed of execution of a small program would be the same as the speed of the same code appearing in the middle of a long-running server process, would that answer your objections?</p>
</blockquote>

<p>Hi Daniel, thank you for your <a href="/2009/03/hello-haskell-goodbye-lisp.html#comment-325">in-depth reply</a>.  As always, I enjoy reading what you&#8217;ve contributed to the Net&#8217;s compendium of thought on Lisp and related languages.</p>

<p>Your clarification was most accurate: When I said &#8220;scripting&#8221;, I was talking about a context of usage, not a particular language paradigm.  I like that Haskell seems to be just as appropriate for tiny, throw-away scripts as it is for large, long-running programs.</p>

<p>When it comes to the latter, I really have no objections at all to the JVM or its startup time.  I&#8217;m more than willing to wait 5 minutes for something to execute, if it will run for months at high efficiency.  I face this situation all the time at work, where we have a huge EJB application hosted on JBoss.  It may complicate debugging sometimes, but the costs are worth the benefits.  The sheer number of things that J2EE and JBoss manage on our behalf, compared to the small amount of code necessary to take advantage of them, is quite amazing.</p>

<p>What the JVM takes away, at least in 2009, is the choice of what those costs will be, and when I have to pay them.  I think one of C&#8217;s biggest attractions for a long time has been that most of its costs are a conscious decision.  If you favor startup time, or a small memory footprint, or fast execution, you can pretty much decide.  This makes it as appropriate for embedded apps, as it is for running an HTTP server, as it is for building operating systems and compilers.  With Java, despite all the things you get for &#8220;free&#8221;, it comes at the cost of other freedoms.  And sometimes, Java&#8217;s priorities are not mine.</p>

<p>So while I can and do use the JVM for server-side computation, it&#8217;s a bit heavyweight for small and simple tasks.  Common Lisp&#8217;s answer to this problem was an ingenious one.  Instead of building programs that you run over and over, it offers an &#8220;environment&#8221; in which code is iteratively evaluated, so that you actually grow and nurture a burgeoning set of functionality within a long-running VM.  I like this model <em>when appropriate</em>, and enjoy it, for example, in Emacs, which I can leave running for days on end while at the same time extending its functionality by writing new functions and customizing variables.</p>

<p>To answer your query then: yes, if JVM startup time could be eliminated, it would &#8220;free my hand&#8221;.  I very much respect the maturity and stability of the JVM libraries Groovy and Clojure have access to.  Also, what you said about the JIT, and alternative VMs, can be supplemented by mentioning all the other JVM facilities that exist, like code coverage, performance and memory analysis, and live introspection; along with the ability to pick JVMs to run on phones, or satisfy real-time computing requirements.  It&#8217;s a rich platform, no doubt.</p>

<p>But why do we never see complaints about languages that link to the C++ standard library, or Boost, or any other of the large frameworks that exist?  Because in those worlds, you don&#8217;t pay for what you don&#8217;t use.  It&#8217;s been a design philosophy behind C++ for years, and to good effect.  We might complain about the language, or its APIs, but you hardly notice if <em>other</em> projects use it, because largely, one can pretend it&#8217;s not even there.  Not so with the JVM.  Every time I start a Java application on my system, I feel it.  Run several of them at once, and even my 3GB laptop starts swapping.  Only with the JVM are such things a source of common complaint.</p>

<p>I&#8217;m hoping that some day, projects like LLVM will start to separate these two sides of development.  I want to be able to pick my language for its type safety, clarity, expressiveness, and joy of use; while at the same time I&#8217;d like to pick my VM for its security, footprint, handling of parallelism and messaging, and run-time appropriateness.  This would let me choose Lisp, Haskell, Python or C++, depending on the skillset of engineers available to me; and the JVM, .NET platform, or LLVM, depending on how I meant the code to be used.  Wouldn&#8217;t that be a powerful set of tools at one&#8217;s disposal?</p>
]]>
    </content>
</entry>

<entry>
    <title>Run times for Hello, World in 2009</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/run-times-for-hello-world-in-2009.html" />
    <id>tag:www.newartisans.com,2009://1.3367</id>

    <published>2009-03-15T14:50:14Z</published>
    <updated>2009-03-23T05:17:52Z</updated>

    <summary>Someone recently asked what my issue was regarding the JVM, since at the moment it prevents me from falling too much in love with  Clojure  &#8212; a language with the double-benefits of functional programming, and Lisp syntax and macros. 


...These may not seem like much time in the scheme of things, but psychologically it builds up on me when I have to run a particular script over and over and over again.    I&#8217;ve already noticed the pain with Groovy. 


...Ruby (1.9.1-p0) | 0.0196997523308 |</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="c" label="C++" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="java" label="Java" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="jvm" label="JVM" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="lisp" label="Lisp" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="optimization" label="Optimization" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>Someone recently asked what my issue was regarding the JVM, since at the moment it prevents me from falling too much in love with <a href="http://clojure.org">Clojure</a> &#8211; a language with the double-benefits of functional programming, and Lisp syntax and macros.</p>

<p>Well, below is my reason.  These times may not seem like much in the scheme of things, but psychologically it builds up on me when I have to run a particular script over and over and over again.  I&#8217;ve already noticed the pain with Groovy.</p>

<table>
<col />
<col />
<thead>
<tr>
	<th>Language</th>
	<th>Running time (seconds)</th>
</tr>
</thead>
<tbody>
<tr>
	<td>C</td>
	<td>0.00415675640106</td>
</tr>
<tr>
	<td>C++</td>
	<td>0.0043337225914</td>
</tr>
<tr>
	<td>Haskell (compiled)</td>
	<td>0.00494946241379</td>
</tr>
<tr>
	<td>Perl</td>
	<td>0.00773874521255</td>
</tr>
<tr>
	<td>Ruby (1.8.7)</td>
	<td>0.00913717746735</td>
</tr>
<tr>
	<td>Ruby (1.9.1-p0)</td>
	<td>0.0196997523308</td>
</tr>
<tr>
	<td>Python</td>
	<td>0.0269904136658</td>
</tr>
<tr>
	<td>ECL (Common Lisp)</td>
	<td>0.126332080364</td>
</tr>
<tr>
	<td>Java (JDK6)</td>
	<td>0.146584188938</td>
</tr>
<tr>
	<td>Haskell (interpreted)</td>
	<td>0.20009740591</td>
</tr>
<tr>
	<td>Groovy (JDK6)</td>
	<td>1.07791568041</td>
</tr>
</tbody>
</table>

<p>If you&#8217;d like to generate some of these timings for your own system, I have created a <a href="http://github.com/jwiegley/helloworld">Hello, world project on GitHub</a>.</p>
]]>
        

    </content>
</entry>

<entry>
    <title>Hello Haskell, Goodbye Lisp</title>
    <link rel="alternate" type="text/html" href="http://www.newartisans.com/2009/03/hello-haskell-goodbye-lisp.html" />
    <id>tag:www.newartisans.com,2009://1.3366</id>

    <published>2009-03-14T07:54:32Z</published>
    <updated>2009-03-23T05:18:10Z</updated>

    <summary>It makes it trivial to write DSLs, for example, since you all you need to do is model the syntax tree as a series of Lisp data structures, and then evaluate them directly.  

...Since it was designed at a time when there was One Processor to Rule them All, it didn&#8217;t go to great lengths to consider how its design might effect the needs of parallelism. 


...Even if the arguments  could  have been computed in parallel, there&#8217;s no way to know for sure that the evaluation of one argument doesn&#8217;t cause a side-effect which might interfere with another argument&#8217;s evaluation.  

...If then I do something in my function which needs some of those values, Haskell can start computing the ones it needs in parallel, waiting on the completion of the whole set before returning the final result.</summary>
    <author>
        <name>John Wiegley</name>
        
    </author>
    
    <category term="fp" label="FP" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="haskell" label="Haskell" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="lisp" label="Lisp" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="programming" label="Programming" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://www.newartisans.com/">
        <![CDATA[<p>As someone who has enjoyed the Lisp language (in several flavors) for about 15 years now, I wanted to express some of my reactions at recently discovering Haskell, and why it has supplanted Lisp as the apple of my eye.  Perhaps it will encourage others to explore this strange, wonderful world, where it looks like some pretty damn cool ideas are starting to peek over the horizon.</p>
]]>
        <![CDATA[<p>First, let me say that unlike many posts on the Lisp subject, I have nothing negative to report here.  It&#8217;s not that I haven&#8217;t had my share of ups and downs with Lisp, but that if you want to know about those, look around.  Most of what other bloggers have to say is dead on, so there&#8217;s little need to repeat it here.</p>

<p>I&#8217;ll just address some of the cooler aspects of Lisp, and how Haskell compares in response.</p>

<h1 id="elegantsyntax">Elegant syntax</h1>

<p>While many dislike Lisp&#8217;s abundant parentheses, I fell in love with them.  Perhaps it&#8217;s because I spend so much of my time working on compilers, and Lisp programs read like their parse trees.  This &#8220;code/data equivalence&#8221; is beautiful.  It makes it trivial to write DSLs, for example, since all you need to do is model the syntax tree as a series of Lisp data structures, and then evaluate them directly.  It removes the need for an intermediate parse-tree representation.</p>

<p>When I first approached Haskell, I was shocked at the amount of syntax I saw.  Operators abounded &#8211; more even than C &#8211; like: <code>-&gt;</code>, <code>=&gt;</code>, <code>::</code>, <code>$</code>, <code>$!</code>, etc, etc.  The more I looked, the more operators there seemed to be, until I began to feel as lost as when I read Perl code.</p>

<p>What I didn&#8217;t realize is that in Haskell, much of the syntax you see is just special function names.  There is very little &#8220;true&#8221; syntax going on; the rest is built on top of a highly expressive core.  Lisp looks clean because nearly all its operators are used like functions.  Haskell goes for an &#8220;infix optional&#8221; style, which allows you to call anything as either prefix or infix, provided you qualify the function name correctly:</p>

<pre><code>(/= (+ 1 2) 4)        ; Lisp reads very logically
((/=) ((+) 1 2) 4)    -- Haskell can look almost identical!
1 ^ 4                 -- this is the infix form of ((^) 1 4)</code></pre>
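
<p>The rule works in both directions: an operator becomes a prefix function when wrapped in parentheses, and a named function becomes an infix operator when wrapped in backticks.  A short sketch using only Prelude functions:</p>

<pre><code>main :: IO ()
main = do
  print ((/=) ((+) 1 2) 4)  -- operators called prefix, prints True
  print (1 + 2 /= 4)        -- the usual infix form, prints True
  print (div 7 2)           -- a named function called prefix, prints 3
  print (7 `div` 2)         -- the same function called infix, prints 3</code></pre>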

<p>Nothing can match Lisp&#8217;s rigorous purity, but once you see past the sugary veils, Haskell is pretty basic underneath as well.  Almost everything, for both languages, boils down to calling functions.</p>

<h1 id="macros">Macros</h1>

<p>Another beauty of Lisp is its macro facility.  I&#8217;ve not seen its like in any other language.  Because the forms of code and data are equivalent, Lisp&#8217;s macros are not just text substitution, <em>they allow you to modify code structure at compile-time</em>.  It&#8217;s like having a compiler construction kit as part of the core language, using types and routines identical to what you use in the runtime environment.  Compare this to a language like C++, where, despite the power of its template meta-language, it employs such a radically different set of tools from the core language that even seasoned C++ programmers often have little hope of understanding it.</p>

<p>But why is all this necessary?  Why do I need to be able to perform compile-time substitutions with a macro, when I can do the same things at runtime with a function?  It comes down to <strong>evaluation</strong>: Before a function is called in Lisp, each of its arguments must be evaluated to yield a concrete value.  In fact, it requires that they be evaluated in order<a href="#fn:1" id="fnref:1" class="footnote">4</a> before the function is ever called.</p>

<p>Say I wanted to write a function called <code>doif</code>, which evaluates its second argument only if the first argument evaluates to true.  In Lisp this requires a macro, because an ordinary function call would evaluate that argument in either case:</p>

<pre><code>(defun doif (x y) (if x y))       ; WRONG: both x and y have been evaluated already
(defmacro doif (x y) `(if ,x ,y)) ; Right: y is only evaluated if x is true</code></pre>

<p>What about Haskell?  Does it have a super-cool macro system too?  It turns out it doesn&#8217;t need to.  In fact, much of the coolness of Haskell is that you get so many things for free, as a result of its design.  The lack of needing macros is one of those:</p>

<pre><code>doif x y = if x then (Just y) else Nothing</code></pre>

<p>Because Haskell never evaluates anything unless you use it, there&#8217;s no need to distinguish between macros and functions.</p>
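
<p>One way to check this is to pass an argument that would crash the program if it were ever evaluated.  In this sketch the type signature and the <code>error</code> call are additions for the sake of the demonstration:</p>

<pre><code>doif :: Bool -&gt; a -&gt; Maybe a
doif x y = if x then Just y else Nothing

main :: IO ()
main = do
  print (doif True (1 + 1 :: Int))                     -- prints Just 2
  print (doif False (error "never evaluated" :: Int))  -- prints Nothing</code></pre>

<p>The second call never touches its second argument, so the <code>error</code> is never raised.</p>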

<h1 id="closures">Closures</h1>

<p>The next amazing thing Lisp taught me about was closures.  Closures are function objects which retain information from the scope they were constructed in.  Here&#8217;s a trivial example:</p>

<pre><code>(defun foo (x) (lambda (y) (+ x y)))

(let ((bar (foo 10)))
   (funcall bar 20))
  =&gt; 30</code></pre>

<p>In calling <code>foo</code>, I&#8217;ve created a function object which adds two numbers: the number that was originally passed to <code>foo</code>, plus whatever number gets passed to that closure in turn.  Now, I could go on and on about the possibilities of this mechanism, but suffice it to say it can solve some really difficult problems in simple ways.  It&#8217;s deceptively simple, in fact.</p>

<p>Does Haskell have all this closurey goodness?  You bet it does, in spades.</p>

<pre><code>foo x = (\y -&gt; x + y)        -- here \ means lambda
bar = foo 10
bar 20                       -- arguably cleaner syntax, no?
  =&gt; 30</code></pre>

<p>In fact, Haskell even one-ups Lisp by making <em>partial application</em> something as natural to use as an ordinary function call:</p>

<pre><code>foo = (+)
bar = foo 10
bar 20
  =&gt; 30</code></pre>

<p>This code doesn&#8217;t just make <code>foo</code> an alias for addition, which I could have done in Lisp as well.  It says that <code>foo</code> is a function expecting two arguments.  Then <code>bar</code> supplies one of those arguments, yielding a closure which references the 10 and expects another argument.  The final call provides the 20 to this closure and sets up the addition.  The fact that I&#8217;m evaluating it in the interpreter loop is what causes Haskell to perform the addition and show me the result.</p>

<p>This combination of lazy evaluation with partial application leads to expressive capabilities I&#8217;ve frankly never experienced before.  Sometimes it causes my head to spin a bit.</p>
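
<p>A small sketch of the two features working together, assuming the same <code>foo</code> as above (the type signature and the <code>main</code> wrapper are added here for illustration).  A partially applied <code>foo 10</code> is mapped over an infinite list, and laziness ensures that only the elements actually demanded are ever computed:</p>

<pre><code>foo :: Int -&gt; Int -&gt; Int
foo = (+)

main :: IO ()
main = do
  -- `foo 10` is a partially applied function; [1 ..] is an infinite list.
  -- Only the three elements demanded by `take` are ever computed.
  print (take 3 (map (foo 10) [1 ..]))   -- prints [11,12,13]</code></pre>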

<h1 id="parallelism">Parallelism</h1>

<p>One thing about Common Lisp is that it harkens back to a day when computers were much simpler &#8211; before multi-threading and multi-processor machines were both cheap and common.  Since it was designed at a time when there was One Processor to Rule them All, it didn&#8217;t go to great lengths to consider how its design might affect the needs of parallelism.</p>

<p>Let&#8217;s take function argument evaluation, as a simple example.  Because a function call in Lisp must evaluate all arguments, in order, the evaluation of those arguments cannot be parallelized automatically.  Even if the arguments <em>could</em> have been computed in parallel, there&#8217;s no way to know for sure that the evaluation of one argument doesn&#8217;t cause a side-effect which might interfere with another argument&#8217;s evaluation.  It forces Lisp&#8217;s hand into doing everything in the exact sequence laid down by the programmer.</p>

<p>This isn&#8217;t to say that things couldn&#8217;t happen on multiple threads, just that <em>Lisp itself can&#8217;t decide when it&#8217;s appropriate to do so</em>.  Parallelizing code in Lisp requires that the programmer explicitly demarcate boundaries between threads, and that he use global locks to avoid out-of-order side-effects.</p>

<p>With Haskell, the whole game is changed.  Functions aren&#8217;t allowed to have side-effects, and their value is not computed until needed.  These two design decisions lead to situations like the following: Say I&#8217;ve just called a function and passed it a bunch of arguments which are expensive to compute.  None of these operations need to be done in sequence, because none of them depend on the others for their value.  If I then do something in my function which needs some of those values, Haskell can start computing the ones it needs in parallel, waiting on the completion of the whole set before returning the final result.  This is a decision <em>the language itself can make</em>, as a by-product of its design.</p>
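
<p>One caveat worth stating: GHC does not parallelize pure code automatically by default, so in practice the programmer marks which values may be computed in parallel.  Purity is what makes that annotation safe, since it cannot change the result.  Here is a minimal sketch using <code>par</code> and <code>pseq</code> from <code>GHC.Conc</code>; the names <code>slowFib</code> and <code>sumBoth</code> are hypothetical stand-ins for expensive, independent work:</p>

<pre><code>import GHC.Conc (par, pseq)

-- A deliberately slow, pure function (hypothetical stand-in for expensive work).
slowFib :: Int -&gt; Integer
slowFib n
  | n &lt; 2     = toInteger n
  | otherwise = slowFib (n - 1) + slowFib (n - 2)

-- Sum two expensive values that do not depend on each other.
-- `par` hints that `a` may be evaluated in parallel; `pseq` evaluates `b`
-- on the current thread before the two are combined.
sumBoth :: Int -&gt; Int -&gt; Integer
sumBoth m n = a `par` (b `pseq` (a + b))
  where
    a = slowFib m
    b = slowFib n

main :: IO ()
main = print (sumBoth 20 21)   -- prints 17711</code></pre>

<p>To actually use more than one core, the program would need to be compiled with <code>-threaded</code> and run with <code>+RTS -N</code>.  Without those options the result is identical, just computed sequentially.</p>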

<h1 id="community">Community</h1>

<p>Lastly, the Haskell community is amazing.  Newbies, you are welcome here.  Their IRC channel is both a friendly and knowledgeable place, where newcomers are cherished and developed.</p>

<p>Likewise, the web resources and books I&#8217;ve read about Haskell so far have all been top-notch.  You get the feeling people are <em>fascinated</em> by the language, and eager to share their joy with others.  What a refreshing change.  Lisp may have a rich history, but I think Haskell is the one with the future.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1"><p>http://www.lispworks.com/documentation/HyperSpec/Body/03_ababc.htm<a href="#fnref:1" class="reversefootnote">&#160;&#8617;</a></p></li>

</ol>
</div>
]]>
    </content>
</entry>

</feed>
