John Fremlin's blog2010-06-16T18:42:34ZScala is not a better Java2010-06-16T18:42:34Z/scala-is-not-a-better-java<div class='blog-entry-story'><p>The <a href="http://www.scala-lang.org/">Scala</a> programming language has all the features of Java, and supports all sorts of fun things like closures, higher-kinded types and inline XML. If you're starting a new project for the JVM, shouldn't you just use Scala?</p><p>The Scala environment is very interesting and in 2.8 has the ability to pass around unboxed primitive types through to functions that are specified over multiple types &mdash; so that generic functions no longer have a massive performance penalty.</p><p>Scala has operator overloading, and even the ability to give the appearance of adding new methods to existing types via the implicit mechanism.</p><p>These things make the task of writing code much more convenient. There's less need to explicitly distinguish between arrays and other sequences and unboxed doubles and other numeric types. Papering over the division between arrays, primitive types and reference types in the JVM actually reduces to some extent the conceptual complexity of using it.</p><p>But these ideas are fundamentally opposed to the reason Java came into being. Scala positions itself as <q>multi-paradigm programming language designed to express common programming patterns in a concise, elegant</q> way. Java is culturally opposed to these ideas: it is deliberately <a href="http://java.sun.com/docs/white/langenv/Simple.doc.html">simple</a>. The <q>Java design team examined many aspects of the "modern" C and C++ languages to determine features that could be eliminated in the context of modern object-oriented programming</q> while still retaining some similarity to C++ syntax.</p><p>In particular, a founding principle was that everything <a href="http://java.sun.com/docs/white/langenv/Simple.doc2.html">should be very explicit</a>: <q>A major problem with C and C++ is the amount of context you need to understand another programmer's code: you have to read all related header files, all related #defines, and all related typedefs before you can even begin to analyze a program. In essence, programming with #defines and typedefs results in every programmer inventing a new programming language that's incomprehensible to anybody other than its creator, thus defeating the goals of good programming practices.</q> </p><p>This definition of a good programming practice is one that Scala vehemently opposes. The ability to write domain specific languages that integrate with the core is promoted as a positive.</p><p>In my opinion, the definition of a good programming practice should include whether or not it gets results. After the HotJava experiment failed perhaps it's time to admit that all three major webbrowsers (Mozilla Gecko, Webkit, and Internet Explorer) are all written in C++ for real reasons and not just force of habit.</p><p>Java was supposed to <a href="http://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf">start small and grow</a> as a reaction against the perceived causes for which languages like Lisp and Smalltalk were rejected. Scala is proud of the power of its higher-kinded types: Java is proud of removing multiple inheritance. They're not really competitors: the cultural gulf is huge. </p><p>Perhaps Scala will be able to drag the JVM into competition with numerous other platforms. It has a good set of web companies (for example, Foursquare) using it as a replacement for programmer-orientated scripting languages, and that's where Scala has a natural fit, albeit with strong-typing.</p><p>Java is deliberately not programmer-orientated. That's the point of using it &mdash; it was designed to restrict the kind of trouble programmers can get themselves into. If you're stuck with the JVM, I guess the question is: how much rope do you want to give your programmers? Scala is essentially the opposite answer to Java. </p></div>Three unique features of Lisp in 20102010-05-15T00:00:00Z/lisp-features<div class='blog-entry-story'><p>Lisp's (and especially Scheme's) greatness is its coherence &mdash; instead of expressions, statements, sequence points and so on, it just has expressions. In implementations, instead of separate tools for compiler, debugger, profiler, you generally have one tool and, if you have the source, can in a unified way examine and fluidly adapt any part. This is never going to fly in the Balkanized world of other languages, where the compiler is generally not implemented in the language itself.</p><p>Mainstream languages like JavaScript, Python, C#, C++ and Java have steadily adopted some of Lisp's ideas (e.g. garbage collection (1959)), but some, sadly, remain overlooked. Having moved back to programming in C++ and doing some C#, I'm constantly amazed that some really fundamental things from Lisp remain shrouded in mysterious brackets.</p><p>1. In my opinion, most basic and most overlooked are the benefits of <a href="http://www.gigamonkeys.com/book/variables.html#dynamic-aka-special-variables">usable global variables</a>. In Lisp you don't have to keep passing extra parameters to functions because a function they call needs them. Each special variable has its own stack (per-thread in most implementations, leading to performance compromises). So you can use global variables without worrying about affecting other contexts.</p><p>Historically, lexical scoping is relatively new to Lisp (last thirty years?), and as it is much cleaner as a default, old Lisp's dynamic or special scoping received a bad name. But in the right places it makes the difference between having to add a parameter to a chain of function calls or adding an unnecessary field to a class, and simply implementing the needed functionality.</p><p>2. <a href="http://www.nhplace.com/kent/Papers/Condition-Handling-2001.html">The condition system</a> by Kent Pitman is utterly fantastic. C++'s STL is vastly better thought out than Common Lisp's sequences; but the conditions system in Common Lisp is a leap ahead of C++'s exceptions. Instead of an exception automatically unwinding the stack, callers can choose to catch a specific exception (called a condition) and perform some action &mdash; like, say, popping up a dialog box asking the user to free up disk space and retry &mdash; all without affecting the control flow. You're free to unwind the stack if you like of course. In implementation, this is normally pretty much a library on top of the language using special variables and closures. . .</p><p>3. Which brings me on to <em>powerful</em> closures. <a href="http://en.wikipedia.org/wiki/C%2B%2B0x#Lambda_functions_and_expressions">C++</a> is getting lambda functions, but they're not as powerful as Common Lisp's. Python resists allowing you to modify variables in the enclosing scope. In a Common Lisp lambda, you can not only modify variables in your enclosing lexical scope, but also return-from the enclosing scope (provided of course that it's still on the stack). This can be implemented by an exception with a tag unique to the enclosing function in C++ and other languages. But it is immensely useful, and much more convenient to have the compiler insert the boilerplate for you, as it allows <a href="http://www.lispworks.com/documentation/lw50/CLHS/Body/s_throw.htm">all sorts of things</a>.</p><p>It would be great if some these ideas would be massaged into other languages. </p></div>Facebook app in tpd22010-05-08T21:30:19Z/facebook-app-in-tpd2<div class='blog-entry-story'><p>Played in the <a href="http://code.google.com/codejam">codejam</a> then got an <a href="http://wiki.developers.facebook.com/index.php/FBML">FBML</a> <a href="http://apps.facebook.com/jfappcanvas/">Facebook app</a> working &mdash; a small scale recreation of a program by my father, that asks progressively harder arithmetic problems. You score is saved between rounds and I guess I should start adding social friends links and things.</p><p>The <a href="/static-blog/facebook-app-in-tpd2/facebook.lisp">source is here</a>. It's rather hacky and needs an up-to-the-minute <a href="http://github.com/vii/teepeedee2">tpd2</a>. <a href="http://apps.facebook.com/jfappcanvas/">Try it out</a>! </p></div>The .NET CLR and failing to allocate2010-04-22T21:43:11Z/clr-and-the-vm-myth<div class='blog-entry-story'><p>I've never found a VM that handled running out of memory properly. Maybe Lispworks? I know that Allegro and SBCL can just crash horribly when you try to allocate too much. And Java tends to bring down my machine. I was a .NET evangelist to the all the Scala people I ran into, but now having used it, I have to eat my words.</p><p>Take a look at this C# (pseudo-code as I haven't .NET at home).</p><p><pre> var sb = new StringBuilder(); while(true){ try { sb.EnsureCapacity(1000000); break; }catch(OutOfMemoryException){ GC.Collect(); } } </pre></p><p>Even when there is plenty of memory free (2GB available on the machine and only 300MB in the program), if you rapidly repeatedly allocate this 20MB, sometimes it will throw an OutOfMemoryException. This would normally crash your program but you can just call GC.Collect() to get the GC to agree to allocate more heap for the program. It is insane that it throws OutOfMemoryException when there is so, so much space free to allocate! The <a href="http://blogs.msdn.com/ericlippert/archive/2009/06/08/out-of-memory-does-not-refer-to-physical-memory.aspx">fragmentation argument does not wash</a>, as the program could freely allocate more memory. And simply, why doesn't the allocator call GC.Collect instead of throwing this exception? Having <q>managed code</q> in a VM means that you can de-fragment your memory when you feel like it . . . </p><p>PS. And while <a href="http://msdn.microsoft.com/en-us/library/system.threading.threadpool.queueuserworkitem.aspx">ThreadPool.QueueUserWorkItem</a> is pretty great, <a href="http://www.davidarno.org/2007/12/18/parallelfor-quirk-can-leave-your-code-running-sequentially/">parallel for, isn't</a>. </p></div>A new frontpage2010-04-18T20:16:00Z/frontpage-with-headlines<div class='blog-entry-story'><p>Blogs normally show the last few posts on the front page. This is great if you are frequent visitor. However, if you are just stumbling onto the site for the first time, you will have to dig to find the best content &mdash; which is unlikely to be always the most recent.</p><p>Consequently, I've changed the blog frontpage to display the hottest articles in larger text sizes. The old <a href="*latest*">chronological order is still available</a> and the <a href="feed.atom">Atom feed</a> is unchanged.</p><p>Any ideas for how to make the front page prettier much appreciated! As always the <a href="http://github.com/vii/teepeedee2">source is on github</a>. </p></div>RE2 vs cl-irregsexp performance2010-03-17T00:00:00Z/re2-benchmark<div class='blog-entry-story'><p><a href="http://research.swtch.com/">Russ Cox</a> wrote a <a href="http://swtch.com/~rsc/regexp/regexp1.html">series</a> <a href="http://swtch.com/~rsc/regexp/regexp2.html">of</a> <a href="http://swtch.com/~rsc/regexp/regexp3.html">articles</a> on regexps and then released <a href="http://code.google.com/p/re2">RE2</a>. The idea is to make a general regexp engine with a time and space complexity that does not grow horribly when confronted with a pathological string/regexp combination.</p><p>This interested me as when I made <a href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a> I found that the performance of competing regexp engines varied wildly. The best was generally the Perl engine which has some very good special case paths.</p><p>Years ago, when I released cl-irregsexp I wanted a good headline. I looked a small example that would keep in the fast case for <a href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a>, but which would trip up other implementations. I ended up with searching for either indecipherable or undecipherable &mdash indecipherable|undecipherable. Funnily, in many implementations this is slower than searching for one string then the other.</p><p>If the freshly released <a href="http://code.google.com/p/re2">RE2</a> lives up to its interesting and useful claim that complex regexps combined with large strings cannot cause it to blow up, then it should easily handle this straightforward case.</p><p><style> .bar { background-color: #336699; display:block; display:inline-block;} </style> <table id=bench-table> <thead><tr><th>Implementation</th><th style="width: 100%">Time (s), smaller is better</th></tr></thead> <tbody> <tr><td>ruby.rb</td><td><span class=bar style="width: 58.40%">.</span>41.18</td></tr> <tr><td>python.py</td><td><span class=bar style="width: 51.76%">.</span>36.49</td></tr> <tr><td>pcre</td><td><span class=bar style="width: 45.56%">.</span>32.12</td></tr> <tr><td>perl.pl</td><td><span class=bar style="width: 34.82%">.</span>24.55</td></tr> <tr><td>cl-irregsexp.clozure</td><td><span class=bar style="width: 17.05%">.</span>12.02</td></tr> <tr><td>re2</td><td><span class=bar style="width: 14.31%">.</span>10.09</td></tr> <tr><td>cl-irregsexp.sbcl</td><td><span class=bar style="width: 8.85%">.</span>6.24</td></tr> </tbody></table></p><p>And it does, and handles it well, coming second after <a href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a> &mdash; taking only 60% more time than <a href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a> with SBCL 1.0.36.13 and actually significantly faster than <a href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a> on ClozureCL 1.4. This is the one thing that <a href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a> specialises in (it doesn't do anything else very well) and it's nice to see a decent competitor emerge.</p><p>Combining the silly implementation tricks in cl-irregsexp with Russ Cox's general idea for dealing with regexps as DFA and then compiling them to native code via SBCL (if only computed gotos were added) could make something better than either without being massively complex. The avoidance of exponential blow-up would be nice too. But generally for most practical use, the Perl engine's tuned but unnecessarily self-limiting toolchest will be hard to beat. </p></div>CPU throttling invalidating benchmarks2010-03-16T00:00:00Z/cpu-throttling<div class='blog-entry-story'><p>I was trying to update the benchmarks for <a href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a> and to my surprise I found that as my laptop heated up it was getting slower. The observation was that the more times I repeated a benchmark, the slower it became.</p><p>I'd set the cpu_scaling_governors to performance, and made sure that the max scaling frequency was okay. But the CPUs were entering throttling states, as shown by acpitool. </p><p>One of the thermal zones was right on the throttle temperature when I looked at it <pre> Thermal zone 2 : ok, 85 C Trip points : ------------- critical (S5): 130 C passive: 85 C: tc1=0 tc2=3 tsp=10 devices=CPU0 CPU1 </pre></p><p> The room I am in is unusually hot. I guess this may invalidate some of my benchmark results. Fortunately, people have repeated most of the HTTP ones to get similar numbers, but I shall have to run them again. With the heating off.</p><p>Does anybody know a way of checking whether throttling ever kicks in? I guess I shall poll the temperature every few seconds.</p><p>PS 20100318. Worked round it for long running benchmarks by forcing the CPU to be in the lowest possible speed, and moving to a colder room. </p></div>Programming languages don't have speeds2010-03-15T20:00:00Z/speed-and-programming-languages<div class='blog-entry-story'><p>Every two minutes, some ignorant fanboy for a particularly opinionated programming language will claim that it can be <q>fast as C</q> or even faster. Sadly, Common Lispers are not immune to this idiocy. And idiocy it is. Languages inherently don't have speeds.</p><p>Firstly, when comparing two languages with the same program, the results depend on how you structure the program and the compilers or language environments you choose. You have to compare best possible programs in each language to start with, which simply invalidates most of these arguments as the C program is usually very badly written.</p><p>Secondly, performance in real programs is more about communication than computations. The claim that a language can be fast as C because it can do floating point operations on data in registers (the typical argument), misses completely that modern performance is dominated by <a href="http://igoro.com/archive/gallery-of-processor-cache-effects/">data-locality considerations (caches, talking to other cores, etc.)</a>.</p><p>Thirdly, C compilers are generally much better at optimizing than compilers for novelty languages. Making a compiler that can produce code as good as modern GCC is pretty tricky. Many novelty languages solve this by compiling down to C code. If I remember correctly, there was a point when <a href="http://www.bitc-lang.org/">BitC</a> had better scores than C on the <a href="http://shootout.alioth.debian.org/">ever changing programming language benchmark game</a> &mdash; despite compiling down to C. Again, this does not make logical sense.</p><p>The essential reason why these arguments are ridiculous is that computers have speeds, not languages. If you have a problem and you know that it requires reading through 100TB of data, then you know that you can't do it faster than the time you need to read through 100TB. All your program can do is slow that down. Unfortunately, it is true that many programming environments do slow you down, or make it very difficult to avoid being slowed down.</p><p>Once you have decided on an algorithm, then your computer determines how fast it can possibly be, and your programming environment may restrict that more. For example, in many cases the most efficient data-structure for a problem is a bit-array. A common operation needed is to find the first set bit. X86 computers include the bsf/bsr instructions to quickly find that bit. How much trouble is it to use that instruction in your novelty programming language? In GCC, there is a semi-standard asm construct; that makes it a cinch to use an instruction like this. Under Common Lisp in SBCL, for example, you would have to mess about with the compiler and defknown.</p><p>Simply being unable to utilise the dedicated hardware (even very important things like vector instructions) is a common problem for less mainstream programming environments. But this is a distraction from the main point. Poor data layout in memory and unnecessary inter-core communication are the typical causes for inefficiency. </p><p>How much trouble is it to control how structures are laid out in memory in your chosen programming environment? The language specification probably doesn't say much about it and is therefore irrelevant. It's about the compiler/environment implementation, not the language. </p><p>Talking about the inefficiencies introduced into programs by a compiler implementation is much more interesting than talking about the (very few) cases where it does well, because that further constrains how you can structure your program. Finding the best algorithm is fantastic fun, but it's diminished by people trying to whitewash the inefficiencies introduced by their chosen programming environment.</p><p>Languages don't have speeds, computers do. </p></div>Operands to NOP on AMD642010-02-23T00:00:00Z/amd64-nopl<div class='blog-entry-story'><p>I was looking at a disassembly. It contained this <pre> nopl 0x0(%rax) </pre></p><p>What is the point of passing an operand to NOP? NOP is the instruction that does nothing. Yet not quite true: Intel's <a href="http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN/5701442">US Patent 5,701,442 <q>Method of modifying an instruction set architecture of a computer processor to maintain backward compatibility</q></a> suggests that they could opt to use more complex NOP instructions to provide hints like memory prefetch requests. On processors without the prefetch logic the operations would do nothing, but processors with prefetch would initiate memory requests to bring the data into cache. The extended NOPs taking operands that lie in the amd64 and x86 instruction sets are called hinting nops for this reason, but as far as I know they don't yet hint anything (see the PREFETCH instructions near the same opcode code points).</p><p>It turns out that these interesting nopls, nopw, etc. are generated by GAS (that is, when using GCC). As the instruction set is encoded in quite a uniform way to simplify the decode stage, there are many instructions that achieve nothing: for example, xchg %ax,%ax (the standard two-byte nop) or leal 0(%esi),%esi, a three-byte nop. Dedicating opcodes for longer NOPs is a sensible way to simplify the CPU's own optimizations.</p><p>Aligning jump targets in code to a 16-byte boundary to make sure that the target can be fetched in a single cacheline request is important. However the padding used to flow through to this aligned boundary should be as efficiently encoded as possible, and using only a single instruction to take up eight bytes &mdash; nopw 0L(%[re]ax,%[re]ax,1), is more efficient than repeatedly exercising the decode logic on eight one-byte nops. The code to do this is in <a href="http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/config/tc-i386.c?rev=1.426&content-type=text/x-cvsweb-markup&cvsroot=src">binutils/gas/config/tc-i386.c</a>:i386_align_code.</p><p>Goes to show how mature the AMD64 instruction set has become &mdash; and how far from the days of the Binary Coded Decimal nonsense.</p><p> </p></div>K9Mail SSL client certificates and keys2010-01-29T20:43:00Z/k9mail-ssl-client-certificates<div class='blog-entry-story'><p>Finally broke down and replaced my utterly untrusty Nokia N95 &mdash; fantastic but unreliable hardware, reliably broken software &mdash; with an Android phone. It arrived yesterday. The first thing I discovered is that Android does support client SSL certificates with email (the N95 did . . .).</p><p>So I hacked up <a href="http://groups.google.com/group/k-9-mail/browse_thread/thread/2a676995bdd78935">a patch</a> for <a href="http://code.google.com/p/k9mail/">K9mail</a> to include certificates actually in the source code. However, shouldn't be hard to modify it to store them on the SD-card or something.</p><p>Android development seems to be relatively painless compared to Symbian, and a league apart from the diabolically short-sighted Java MIDP &mdash; removing named constants from classes to save space? Aaaargh.</p><p>The thing which took longest was figuring out that I should use ddms debugger to see the full logs; the <a href="http://code.google.com/p/k9mail/wiki/LoggingErrors">instructions on the wiki</a> exclude logs emitted by K9 under not the k9 tag but the TrustManagerFactory tag. </p></div>