John Fremlin's blog2010-06-16T18:42:34ZScala is not a better Java2010-06-16T18:42:34Z/scala-is-not-a-better-java<div class='blog-entry-story'><p>The <a href="http://www.scala-lang.org/">Scala</a> programming language has all the features of
Java, and supports all sorts of fun things like closures,
higher-kinded types and inline XML. If you're starting a new project
for the JVM, shouldn't you just use Scala?</p><p>The Scala environment is very interesting and in 2.8 has the ability
to pass around unboxed primitive types through to functions that are
specified over multiple types — so that generic functions no
longer have a massive performance penalty.</p><p>Scala has operator overloading, and even the ability to give the
appearance of adding new methods to existing types via the implicit
mechanism.</p><p>These things make the task of writing code much more
convenient. There's less need to explicitly distinguish between arrays
and other sequences and unboxed doubles and other numeric
types. Papering over the division between arrays, primitive types and
reference types in the JVM actually reduces to some extent the
conceptual complexity of using it.</p><p>But these ideas are fundamentally opposed to the reason Java came into
being. Scala positions itself as <q>multi-paradigm programming
language designed to express common programming patterns in a concise,
elegant</q> way. Java is culturally opposed to these ideas: it is
deliberately <a
href="http://java.sun.com/docs/white/langenv/Simple.doc.html">simple</a>.
The <q>Java design team examined many aspects of the "modern" C and
C++ languages to determine features that could be eliminated in the
context of modern object-oriented programming</q> while still
retaining some similarity to C++ syntax.</p><p>In particular, a founding principle was that everything <a
href="http://java.sun.com/docs/white/langenv/Simple.doc2.html">should
be very explicit</a>: <q>A major problem with C and C++ is the amount
of context you need to understand another programmer's code: you have
to read all related header files, all related #defines, and all
related typedefs before you can even begin to analyze a program. In
essence, programming with #defines and typedefs results in every
programmer inventing a new programming language that's
incomprehensible to anybody other than its creator, thus defeating the
goals of good programming practices.</q> </p><p>This definition of a good programming practice is one that Scala
vehemently opposes. The ability to write domain specific languages
that integrate with the core is promoted as a positive.</p><p>In my opinion, the definition of a good programming practice should
include whether or not it gets results. After the HotJava experiment
failed perhaps it's time to admit that all three major webbrowsers
(Mozilla Gecko, Webkit, and Internet Explorer) are all written in C++
for real reasons and not just force of habit.</p><p>Java was supposed to <a
href="http://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf">start
small and grow</a> as a reaction against the perceived causes for
which languages like Lisp and Smalltalk were rejected. Scala is proud
of the power of its higher-kinded types: Java is proud of removing
multiple inheritance. They're not really competitors: the cultural
gulf is huge. </p><p>Perhaps Scala will be able to drag the JVM into competition with
numerous other platforms. It has a good set of web companies (for
example, Foursquare) using it as a replacement for
programmer-orientated scripting languages, and that's where Scala has
a natural fit, albeit with strong-typing.</p><p>Java is deliberately not programmer-orientated. That's the point of
using it — it was designed to restrict the kind of trouble
programmers can get themselves into. If you're stuck with the JVM, I
guess the question is: how much rope do you want to give your
programmers? Scala is essentially the opposite answer to Java.
</p></div>Three unique features of Lisp in 20102010-05-15T00:00:00Z/lisp-features<div class='blog-entry-story'><p>Lisp's (and especially Scheme's) greatness is its coherence —
instead of expressions, statements, sequence points and so on, it just
has expressions. In implementations, instead of separate tools for
compiler, debugger, profiler, you generally have one tool and, if you
have the source, can in a unified way examine and fluidly adapt any
part. This is never going to fly in the Balkanized world of other
languages, where the compiler is generally not implemented in the
language itself.</p><p>Mainstream languages like JavaScript, Python, C#, C++ and Java have
steadily adopted some of Lisp's ideas (e.g. garbage collection
(1959)), but some, sadly, remain overlooked. Having moved back to
programming in C++ and doing some C#, I'm constantly amazed that some
really fundamental things from Lisp remain shrouded in mysterious
brackets.</p><p>1. In my opinion, most basic and most overlooked are the benefits of
<a
href="http://www.gigamonkeys.com/book/variables.html#dynamic-aka-special-variables">usable
global variables</a>. In Lisp you don't have to keep passing extra
parameters to functions because a function they call needs them. Each
special variable has its own stack (per-thread in most
implementations, leading to performance compromises). So you can use
global variables without worrying about affecting other contexts.</p><p>Historically, lexical scoping is relatively new to Lisp (last thirty
years?), and as it is much cleaner as a default, old Lisp's dynamic or
special scoping received a bad name. But in the right places it makes
the difference between having to add a parameter to a chain of
function calls or adding an unnecessary field to a class, and simply
implementing the needed functionality.</p><p>2. <a
href="http://www.nhplace.com/kent/Papers/Condition-Handling-2001.html">The
condition system</a> by Kent Pitman is utterly fantastic. C++'s STL is
vastly better thought out than Common Lisp's sequences; but the
conditions system in Common Lisp is a leap ahead of C++'s
exceptions. Instead of an exception automatically unwinding the stack,
callers can choose to catch a specific exception (called a condition)
and perform some action — like, say, popping up a dialog box
asking the user to free up disk space and retry — all without
affecting the control flow. You're free to unwind the stack if you
like of course. In implementation, this is normally pretty much a
library on top of the language using special variables and
closures. . .</p><p>3. Which brings me on to <em>powerful</em> closures. <a
href="http://en.wikipedia.org/wiki/C%2B%2B0x#Lambda_functions_and_expressions">C++</a>
is getting lambda functions, but they're not as powerful as Common
Lisp's. Python resists allowing you to modify variables in the
enclosing scope. In a Common Lisp lambda, you can not only modify
variables in your enclosing lexical scope, but also return-from the
enclosing scope (provided of course that it's still on the
stack). This can be implemented by an exception with a tag unique to
the enclosing function in C++ and other languages. But it is immensely
useful, and much more convenient to have the compiler insert the
boilerplate for you, as it allows <a
href="http://www.lispworks.com/documentation/lw50/CLHS/Body/s_throw.htm">all
sorts of things</a>.</p><p>It would be great if some these ideas would be massaged into other
languages.
</p></div>Facebook app in tpd22010-05-08T21:30:19Z/facebook-app-in-tpd2<div class='blog-entry-story'><p>Played in the <a href="http://code.google.com/codejam">codejam</a>
then got an <a href="http://wiki.developers.facebook.com/index.php/FBML">FBML</a> <a
href="http://apps.facebook.com/jfappcanvas/">Facebook app</a> working
— a small scale recreation of a program by my father, that asks
progressively harder arithmetic problems. You score is saved between
rounds and I guess I should start adding social friends links and
things.</p><p>The <a href="/static-blog/facebook-app-in-tpd2/facebook.lisp">source is here</a>. It's
rather hacky and needs an up-to-the-minute <a
href="http://github.com/vii/teepeedee2">tpd2</a>. <a
href="http://apps.facebook.com/jfappcanvas/">Try it out</a>!
</p></div>The .NET CLR and failing to allocate2010-04-22T21:43:11Z/clr-and-the-vm-myth<div class='blog-entry-story'><p>I've never found a VM that handled running out of memory
properly. Maybe Lispworks? I know that Allegro and SBCL can just crash
horribly when you try to allocate too much. And Java tends to bring
down my machine. I was a .NET evangelist to the all the Scala people I
ran into, but now having used it, I have to eat my words.</p><p>Take a look at this C# (pseudo-code as I haven't .NET at home).</p><p><pre>
var sb = new StringBuilder();
while(true){
try {
sb.EnsureCapacity(1000000);
break;
}catch(OutOfMemoryException){
GC.Collect();
}
}
</pre></p><p>Even when there is plenty of memory free (2GB available on the machine
and only 300MB in the program), if you rapidly repeatedly allocate
this 20MB, sometimes it will throw an OutOfMemoryException. This would
normally crash your program but you can just call GC.Collect() to get
the GC to agree to allocate more heap for the program. It is insane
that it throws OutOfMemoryException when there is so, so much space
free to allocate! The <a
href="http://blogs.msdn.com/ericlippert/archive/2009/06/08/out-of-memory-does-not-refer-to-physical-memory.aspx">fragmentation
argument does not wash</a>, as the program could freely allocate more
memory. And simply, why doesn't the allocator call GC.Collect instead
of throwing this exception? Having <q>managed code</q> in a VM means
that you can de-fragment your memory when you feel like it . . . </p><p>PS. And while
<a
href="http://msdn.microsoft.com/en-us/library/system.threading.threadpool.queueuserworkitem.aspx">ThreadPool.QueueUserWorkItem</a>
is pretty great,
<a
href="http://www.davidarno.org/2007/12/18/parallelfor-quirk-can-leave-your-code-running-sequentially/">parallel
for, isn't</a>.
</p></div>A new frontpage2010-04-18T20:16:00Z/frontpage-with-headlines<div class='blog-entry-story'><p>Blogs normally show the last few posts on the front page. This is
great if you are frequent visitor. However, if you are just stumbling
onto the site for the first time, you will have to dig to find the
best content — which is unlikely to be always the most recent.</p><p>Consequently, I've changed the blog frontpage to display the hottest
articles in larger text sizes. The old <a
href="*latest*">chronological order is still available</a> and the
<a href="feed.atom">Atom feed</a> is unchanged.</p><p>Any ideas for how to make the front page prettier much appreciated! As
always the <a href="http://github.com/vii/teepeedee2">source is on
github</a>.
</p></div>RE2 vs cl-irregsexp performance2010-03-17T00:00:00Z/re2-benchmark<div class='blog-entry-story'><p><a href="http://research.swtch.com/">Russ Cox</a> wrote a <a
href="http://swtch.com/~rsc/regexp/regexp1.html">series</a> <a
href="http://swtch.com/~rsc/regexp/regexp2.html">of</a> <a
href="http://swtch.com/~rsc/regexp/regexp3.html">articles</a> on
regexps and then released <a
href="http://code.google.com/p/re2">RE2</a>. The idea is to make a
general regexp engine with a time and space complexity that does not
grow horribly when confronted with a pathological string/regexp
combination.</p><p>This interested me as when I made <a
href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a> I
found that the performance of competing regexp engines varied
wildly. The best was generally the Perl engine which has some very
good special case paths.</p><p>Years ago, when I released cl-irregsexp I wanted a good headline. I
looked a small example that would keep in the fast case for <a
href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a>,
but which would trip up other implementations. I ended up with
searching for either indecipherable or undecipherable &mdash
indecipherable|undecipherable. Funnily, in many implementations this
is slower than searching for one string then the other.</p><p>If the freshly released <a href="http://code.google.com/p/re2">RE2</a>
lives up to its interesting and useful claim that complex regexps
combined with large strings cannot cause it to blow up, then it should
easily handle this straightforward case.</p><p><style>
.bar { background-color: #336699;
display:block;
display:inline-block;}
</style>
<table id=bench-table>
<thead><tr><th>Implementation</th><th style="width: 100%">Time (s), smaller is better</th></tr></thead>
<tbody>
<tr><td>ruby.rb</td><td><span class=bar style="width: 58.40%">.</span>41.18</td></tr>
<tr><td>python.py</td><td><span class=bar style="width: 51.76%">.</span>36.49</td></tr>
<tr><td>pcre</td><td><span class=bar style="width: 45.56%">.</span>32.12</td></tr>
<tr><td>perl.pl</td><td><span class=bar style="width: 34.82%">.</span>24.55</td></tr>
<tr><td>cl-irregsexp.clozure</td><td><span class=bar style="width: 17.05%">.</span>12.02</td></tr>
<tr><td>re2</td><td><span class=bar style="width: 14.31%">.</span>10.09</td></tr>
<tr><td>cl-irregsexp.sbcl</td><td><span class=bar style="width: 8.85%">.</span>6.24</td></tr>
</tbody></table></p><p>And it does, and handles it well, coming second after <a
href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a>
— taking only 60% more time than <a
href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a>
with SBCL 1.0.36.13 and actually significantly faster than <a
href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a>
on ClozureCL 1.4. This is the one thing that <a
href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a>
specialises in (it doesn't do anything else very well) and it's nice
to see a decent competitor emerge.</p><p>Combining the silly implementation tricks in cl-irregsexp with Russ
Cox's general idea for dealing with regexps as DFA and then compiling
them to native code via SBCL (if only computed gotos were added) could
make something better than either without being massively complex. The
avoidance of exponential blow-up would be nice too. But generally for
most practical use, the Perl engine's tuned but unnecessarily
self-limiting toolchest will be hard to beat.
</p></div>CPU throttling invalidating benchmarks2010-03-16T00:00:00Z/cpu-throttling<div class='blog-entry-story'><p>I was trying to update the benchmarks for <a
href="http://common-lisp.net/project/cl-irregsexp/">cl-irregsexp</a>
and to my surprise I found that as my laptop heated up it was getting
slower. The observation was that the more times I repeated a
benchmark, the slower it became.</p><p>I'd set the cpu_scaling_governors to performance, and made sure that
the max scaling frequency was okay. But the CPUs were entering
throttling states, as shown by acpitool. </p><p>One of the thermal zones was right on the throttle temperature when I
looked at it
<pre>
Thermal zone 2 : ok, 85 C
Trip points :
-------------
critical (S5): 130 C
passive: 85 C: tc1=0 tc2=3 tsp=10 devices=CPU0 CPU1
</pre></p><p>
The room I am in is unusually hot. I guess this may invalidate some
of my benchmark results. Fortunately, people have repeated most of the
HTTP ones to get similar numbers, but I shall have to run them
again. With the heating off.</p><p>Does anybody know a way of checking whether throttling ever kicks in?
I guess I shall poll the temperature every few seconds.</p><p>PS 20100318. Worked round it for long running benchmarks by forcing the CPU to
be in the lowest possible speed, and moving to a colder room.
</p></div>Programming languages don't have speeds2010-03-15T20:00:00Z/speed-and-programming-languages<div class='blog-entry-story'><p>Every two minutes, some ignorant fanboy for a particularly opinionated
programming language will claim that it can be <q>fast as C</q> or
even faster. Sadly, Common Lispers are not immune to this idiocy. And
idiocy it is. Languages inherently don't have speeds.</p><p>Firstly, when comparing two languages with the same program, the
results depend on how you structure the program and the compilers or
language environments you choose. You have to compare best possible
programs in each language to start with, which simply invalidates most
of these arguments as the C program is usually very badly written.</p><p>Secondly, performance in real programs is more about communication
than computations. The claim that a language can be fast as C because
it can do floating point operations on data in registers (the typical
argument), misses completely that modern performance is dominated by
<a
href="http://igoro.com/archive/gallery-of-processor-cache-effects/">data-locality
considerations (caches, talking to other cores, etc.)</a>.</p><p>Thirdly, C compilers are generally much better at optimizing than
compilers for novelty languages. Making a compiler that can produce
code as good as modern GCC is pretty tricky. Many novelty languages
solve this by compiling down to C code. If I remember correctly, there
was a point when <a href="http://www.bitc-lang.org/">BitC</a> had
better scores than C on the <a
href="http://shootout.alioth.debian.org/">ever changing programming
language benchmark game</a> — despite compiling down to
C. Again, this does not make logical sense.</p><p>The essential reason why these arguments are ridiculous is that
computers have speeds, not languages. If you have a problem and you
know that it requires reading through 100TB of data, then you know
that you can't do it faster than the time you need to read through
100TB. All your program can do is slow that down. Unfortunately, it is
true that many programming environments do slow you down, or make it
very difficult to avoid being slowed down.</p><p>Once you have decided on an algorithm, then your computer determines
how fast it can possibly be, and your programming environment may
restrict that more. For example, in many cases the most efficient
data-structure for a problem is a bit-array. A common operation needed
is to find the first set bit. X86 computers include the bsf/bsr
instructions to quickly find that bit. How much trouble is it to use
that instruction in your novelty programming language? In GCC, there is
a semi-standard asm construct; that makes it a cinch to use an
instruction like this. Under Common Lisp in SBCL, for example, you
would have to mess about with the compiler and defknown.</p><p>Simply being unable to utilise the dedicated hardware (even very
important things like vector instructions) is a common problem for
less mainstream programming environments. But this is a distraction
from the main point. Poor data layout in memory and unnecessary
inter-core communication are the typical causes for inefficiency. </p><p>How much trouble is it to control how structures are laid out in
memory in your chosen programming environment? The language
specification probably doesn't say much about it and is therefore
irrelevant. It's about the compiler/environment implementation, not
the language. </p><p>Talking about the inefficiencies introduced into programs by a
compiler implementation is much more interesting than talking about
the (very few) cases where it does well, because that further
constrains how you can structure your program. Finding the best
algorithm is fantastic fun, but it's diminished by people trying to
whitewash the inefficiencies introduced by their chosen programming
environment.</p><p>Languages don't have speeds, computers do.
</p></div>Operands to NOP on AMD642010-02-23T00:00:00Z/amd64-nopl<div class='blog-entry-story'><p>I was looking at a disassembly. It contained this
<pre>
nopl 0x0(%rax)
</pre></p><p>What is the point of passing an operand to NOP? NOP is the instruction
that does nothing. Yet not quite true: Intel's <a
href="http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN/5701442">US
Patent 5,701,442 <q>Method of modifying an instruction set
architecture of a computer processor to maintain backward
compatibility</q></a> suggests that they could opt to use more complex
NOP instructions to provide hints like memory prefetch requests. On
processors without the prefetch logic the operations would do nothing,
but processors with prefetch would initiate memory requests to bring
the data into cache. The extended NOPs taking operands that lie in the
amd64 and x86 instruction sets are called hinting nops for this
reason, but as far as I know they don't yet hint anything (see the
PREFETCH instructions near the same opcode code points).</p><p>It turns out that these interesting nopls, nopw, etc. are generated by
GAS (that is, when using GCC). As the instruction set is encoded in
quite a uniform way to simplify the decode stage, there are many
instructions that achieve nothing: for example, xchg %ax,%ax (the
standard two-byte nop) or leal 0(%esi),%esi, a three-byte
nop. Dedicating opcodes for longer NOPs is a sensible way to
simplify the CPU's own optimizations.</p><p>Aligning jump targets in code to a 16-byte boundary to make sure that
the target can be fetched in a single cacheline request is
important. However the padding used to flow through to this aligned
boundary should be as efficiently encoded as possible, and using only
a single instruction to take up eight bytes — nopw
0L(%[re]ax,%[re]ax,1), is more efficient than repeatedly exercising
the decode logic on eight one-byte nops. The code to do this is in <a
href="http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/config/tc-i386.c?rev=1.426&content-type=text/x-cvsweb-markup&cvsroot=src">binutils/gas/config/tc-i386.c</a>:i386_align_code.</p><p>Goes to show how mature the AMD64 instruction set has become —
and how far from the days of the Binary Coded Decimal nonsense.</p><p>
</p></div>K9Mail SSL client certificates and keys2010-01-29T20:43:00Z/k9mail-ssl-client-certificates<div class='blog-entry-story'><p>Finally broke down and replaced my utterly untrusty Nokia N95 —
fantastic but unreliable hardware, reliably broken software —
with an Android phone. It arrived yesterday. The first thing I
discovered is that Android does support client SSL certificates with
email (the N95 did . . .).</p><p>So I hacked up <a
href="http://groups.google.com/group/k-9-mail/browse_thread/thread/2a676995bdd78935">a
patch</a> for <a href="http://code.google.com/p/k9mail/">K9mail</a> to
include certificates actually in the source code. However, shouldn't
be hard to modify it to store them on the SD-card or something.</p><p>Android development seems to be relatively painless compared to
Symbian, and a league apart from the diabolically short-sighted Java
MIDP — removing named constants from classes to save space?
Aaaargh.</p><p>The thing which took longest was figuring out that I should use ddms
debugger to see the full logs; the <a
href="http://code.google.com/p/k9mail/wiki/LoggingErrors">instructions
on the wiki</a> exclude logs emitted by K9 under not the k9 tag but
the TrustManagerFactory tag.
</p></div>