Posted 2010-08-17 00:00:00 GMT
After more than fifty years, and fifteen years of being an ANSI standard, Lisp still has a feature advantage over platforms like .NET and Java. It's sad. Behind me at work sits a Smalltalker and the pain is shared.
As SGML was in some ways superior to XML and DSSSL was massively more capable than CSS, there's no natural force compelling progress to be in the direction of technical advancement.
The strange difficulty faced by the .NET platform in programmatically generating a class without resorting to manual IL generation and the assembly builder is symptomatic of the unnecessary balkanisation between compile and run time. F# includes a compiler that can be called programmatically, but bizarrely mishmashes two type systems together (the ML type-system which finds order of definition significant and the .NET type system which works class at a time and supports overloading) without any serious attempt to reconcile them.
On the JVM, which is the intellectual heir to much of the Lisp legacy, you have the ASM bytecode framework. What about real macros?
With real macros, you can leverage the program you've already written and build on it. Emitting bytecode requires a conceptual shift, new debugging tools, and so on. It demands a (presumably slightly differently featured) parallel implementation of numerous aspects of the domain already done in the more expressive overlying language. That is unnecessarily burdensome, as the C# to IL compiler is hardly a vast technological feat that needs to be guarded for its competitive advantage. Shouldn't it be available at all stages of the programming process?
The entropy and complexity of the computing universe increases naturally; things don't necessarily improve — hard and thankless work is needed.
Posted 2010-06-16 18:42:34 GMT
The Scala programming language has all the features of Java, and supports all sorts of fun things like closures, higher-kinded types and inline XML. If you're starting a new project for the JVM, shouldn't you just use Scala?
The Scala environment is very interesting and in 2.8 has the ability to pass around unboxed primitive types through to functions that are specified over multiple types — so that generic functions no longer have a massive performance penalty.
Scala has operator overloading, and even the ability to give the appearance of adding new methods to existing types via the implicit mechanism.
These things make the task of writing code much more convenient. There's less need to explicitly distinguish between arrays and other sequences and unboxed doubles and other numeric types. Papering over the division between arrays, primitive types and reference types in the JVM actually reduces to some extent the conceptual complexity of using it.
But these ideas are fundamentally opposed to the reason Java came into being. Scala positions itself as a multi-paradigm programming language designed to express common programming patterns in a concise, elegant way. Java is culturally opposed to these ideas: it is deliberately simple.
The Java design team examined many aspects of the "modern" C and C++ languages to determine features that could be eliminated in the context of modern object-oriented programming while still retaining some similarity to C++ syntax.
In particular, a founding principle was that everything should be very explicit: A major problem with C and C++ is the amount of context you need to understand another programmer's code: you have to read all related header files, all related #defines, and all related typedefs before you can even begin to analyze a program. In essence, programming with #defines and typedefs results in every programmer inventing a new programming language that's incomprehensible to anybody other than its creator, thus defeating the goals of good programming practices.
This definition of a good programming practice is one that Scala vehemently opposes. The ability to write domain specific languages that integrate with the core is promoted as a positive.
In my opinion, the definition of a good programming practice should include whether or not it gets results. After the HotJava experiment failed, perhaps it's time to admit that the three major web browsers (Mozilla Gecko, WebKit, and Internet Explorer) are all written in C++ for real reasons and not just force of habit.
Java was supposed to start small and grow as a reaction against the perceived causes for which languages like Lisp and Smalltalk were rejected. Scala is proud of the power of its higher-kinded types: Java is proud of removing multiple inheritance. They're not really competitors: the cultural gulf is huge.
Perhaps Scala will be able to drag the JVM into competition with numerous other platforms. It has a good set of web companies (for example, Foursquare) using it as a replacement for programmer-orientated scripting languages, and that's where Scala has a natural fit, albeit with strong-typing.
Java is deliberately not programmer-orientated. That's the point of using it — it was designed to restrict the kind of trouble programmers can get themselves into. If you're stuck with the JVM, I guess the question is: how much rope do you want to give your programmers? Scala is essentially the opposite answer to Java.
'Give your programmers'? Who is choosing the platform here? Sounds like it's not the programmers, and that's a deal-breaker.
Posted 2010-06-16 21:21:46 GMT by
A programming language that isn't programmer-oriented--what a great idea. So was COBOL.
Posted 2010-06-17 03:31:26 GMT by
The companies adopting Scala are not using it as a replacement for scripting languages, they're using it to write complex high performance systems: foursquare uses it for their website, LinkedIn has many of its systems in Scala, Novell Pulse is entirely written in Scala, etc. Twitter replaced its backend with a Scala solution. There are plenty of examples...
Maybe Java 1.0 started with the goal that everything must be explicit, but that's not the current Java, it has its own share of warts.
Let's not even consider autoboxing and generics, start with something as simple as "Hi "+ 2.
1) there's operator overloading: the same operator "+" does something different than 2+2
2) The "int" is converted to a string
3) The string concatenation creates internally a StringBuffer and concatenates it... hardly explicit.
But I think the cultural gulf is clearly evident at the end: you consider that a language must restrict what programmers can do with it. I believe that more powerful languages give you more freedom to solve complex problems in a more effective way.
If you can't trust your programmers, you have a bigger problem and I'm pretty sure they can create unmaintainable code in Java or any other language...
Posted 2010-06-17 04:08:18 GMT by
I have seen many poorly written Java programs in my years. However, unlike terribly-written C++ or even Ruby programs, this awful code was at least sort-of understandable and wasn't much of a battle to rewrite properly. In my opinion it's not very easy to write a totally unmaintainable program in Java... unless of course you compile it to bytecode, obfuscate the hell out of it and then JAD it back to source, throwing away the original. John, I'm afraid you might be right about Scala. That bothers me. I was really hoping that Scala would be the successor to Java and I'd be able to migrate my half-million-line cash cow over to it piece by piece (once they straighten out the IDE issues of course) but even with me being the primary developer I think Scala might cause more harm than good to its future.
Posted 2010-06-17 04:18:26 GMT by
Perhaps the title of this post should be "Java is simpler than Scala, IMHO" rather than "Scala is not a better Java".
Posted 2010-06-17 11:13:20 GMT by
I really like Java for the simplicity reasons you list. It is a great language to develop in a complex 'enterprise' environment. Perhaps Scala can attract some of the strays - e.g. Python - back to using strong typing.
Posted 2010-06-17 12:44:04 GMT by
How about the anonymous classes introduced afterwards, Generics introduced afterwards in Java? Even more so, how about the new closure functionality in Java? Don't these enhancements contradict Java's staying simple rule then?
Posted 2010-06-17 19:46:04 GMT by
While the complexity of Scala can easily lead to programs with all the readability of, say, Perl (and I don't mean that in a good way), Java's simplicity is no panacea. I've spent more than a decade coding in Java, and I've seen my share of hard-to-read code that the simplicity of the Java language did nothing to ameliorate. Examples include (but are not limited to):
* Spider webs of anonymous inner classes, interwoven with poorly named variables and enormous undocumented methods, leading to a vast unreadable mess.
* Duplicated code, because Java interfaces can't contain executable code (unlike Scala traits or Ruby modules) and because the subclasses were already inheriting from different parent classes.
* Extended string classes that must wrap the entire java.lang.String class, just to add a few methods that java.lang.String doesn't have (versus shoving those new methods into a static class, C-style), all because String is final and cannot be extended. Scala also addresses this problem. Hell, C# addresses this problem, too.
* IOC frameworks, like Spring, which, misused, lead to hard-to-debug large-scale systems that are wired together and interconnected in non-obvious ways.
* Class after class after class with single-line setters and getters galore, all obscuring the actual business logic that serves as the reason the class was written in the first place--and all violating the uniform access principle. Scala, Python, Ruby and C# all address this problem quite nicely, leading to a reduction of extraneous code which, in turn, enhances readability and maintainability by obscuring LESS of the actual business logic.
Java's simplicity did nothing to prevent or ameliorate these problems.
Yes, there's something to be said for simple languages. But sometimes, oversimplification causes as many problems as it alleviates.
Posted 2010-06-17 23:06:22 GMT by
Someone already said it, but I'll say it again: the complexity of Scala can easily lead to programs with all the readability of Perl (in a bad way). In this way, Scala is definitely worse than Java. Sure Scala has lots of cool features and fixes some of the warts of Java - but at what cost?
Posted 2010-06-19 04:17:28 GMT by
Complexity comes from many places. I think the real world experience is not predictable and the results depend much on the culture created.
I spent yesterday debugging a production issue with a Java system that copies data from one database to another. It does this with two programs, one of which is a Web Service server and the other a client. Both ends use complex stored procedures and the web service uses a complex authentication and authorization system based on Kerberos tickets etc, which, after hours of work, we determined was the cause of the fault. The web service has exactly one client.
Let's not forget Java is a market failure as a client side language, which was its original target.
Many features missing from Java are more about not enough time than a deliberate omission.
I think the apparent ease of maintaining Java programs compared to the C/C++ programs I maintained before, comes down to mainly that Java protects its abstractions.
The failure to re-compile a Java class or fix a header results in a clear message. A similar failure in C/C++ results in corruption which shows up much later. The other main feature that helps is garbage collection, which removes many memory leaks and corruptions.
Scala's additional features will reduce the need for tools outside of the language (spring config etc.) and this will help make programs more maintainable. This must be balanced against whether use and mis-use of power reduces maintainability. My money is that Scala is a better Java in the real world.
Posted 2010-06-22 23:20:51 GMT by
"Lets not forget Java is a market failure as a client side language, which was its original target."
Java (originally Oak) was targeted at embedded.
Posted 2010-07-01 03:30:44 GMT by
"Java (originally Oak) was targeted at embedded."
and at this Java has been extremely successful. cellphones, kindle, blu-ray, smartcards, etc. oh my.
Posted 2010-07-19 18:31:48 GMT by
Posted 2010-05-15 00:00:00 GMT
Lisp's (and especially Scheme's) greatness is its coherence — instead of expressions, statements, sequence points and so on, it just has expressions. In implementations, instead of separate tools for compiler, debugger, profiler, you generally have one tool and, if you have the source, can in a unified way examine and fluidly adapt any part. This is never going to fly in the Balkanized world of other languages, where the compiler is generally not implemented in the language itself.
Mainstream languages like JavaScript, Python, C#, C++ and Java have steadily adopted some of Lisp's ideas (e.g. garbage collection (1959)), but some, sadly, remain overlooked. Having moved back to programming in C++ and doing some C#, I'm constantly amazed that some really fundamental things from Lisp remain shrouded in mysterious brackets.
1. In my opinion, most basic and most overlooked are the benefits of usable global variables. In Lisp you don't have to keep passing extra parameters to functions because a function they call needs them. Each special variable has its own stack (per-thread in most implementations, leading to performance compromises). So you can use global variables without worrying about affecting other contexts.
Historically, lexical scoping is relatively new to Lisp (last thirty years?), and as it is much cleaner as a default, old Lisp's dynamic or special scoping received a bad name. But in the right places it makes the difference between having to add a parameter to a chain of function calls or adding an unnecessary field to a class, and simply implementing the needed functionality.
2. The condition system by Kent Pitman is utterly fantastic. C++'s STL is vastly better thought out than Common Lisp's sequences; but the condition system in Common Lisp is a leap ahead of C++'s exceptions. Instead of an exception automatically unwinding the stack, callers can choose to catch a specific exception (called a condition) and perform some action (say, popping up a dialog box asking the user to free up disk space and retry), all without affecting the control flow. You're free to unwind the stack if you like, of course. In implementation, this is normally pretty much a library on top of the language using special variables and closures. . .
3. Which brings me on to powerful closures. C++ is getting lambda functions, but they're not as powerful as Common Lisp's. Python resists allowing you to modify variables in the enclosing scope. In a Common Lisp lambda, you can not only modify variables in your enclosing lexical scope, but also return-from the enclosing scope (provided of course that it's still on the stack). This can be implemented by an exception with a tag unique to the enclosing function in C++ and other languages. But it is immensely useful, and much more convenient to have the compiler insert the boilerplate for you, as it allows all sorts of things.
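A minimal sketch of how points 1 and 2 fit together, written in Python rather than Lisp: a dynamically scoped handler stack (standing in for a special variable, here a `contextvars.ContextVar`) plus closures as handlers gives a small condition system in which the handler runs at the point of the problem, before any unwinding. The names `handler_bind`, `signal`, `DiskFull` and `write_block` are made up for illustration and are not Common Lisp's actual API.

```python
import contextvars
from contextlib import contextmanager

# Dynamically scoped stack of (condition_type, handler) pairs, innermost last.
# This plays the role of a Lisp special variable.
_handlers = contextvars.ContextVar("handlers", default=())


@contextmanager
def handler_bind(condition_type, handler):
    """Install handler for condition_type for the dynamic extent of the block."""
    token = _handlers.set(_handlers.get() + ((condition_type, handler),))
    try:
        yield
    finally:
        _handlers.reset(token)  # restore the previous binding on the way out


def signal(condition):
    """Run matching handlers where the problem occurred, without unwinding.

    A handler returns a value to tell the signaller how to carry on,
    returns None to decline, or raises to unwind the stack after all.
    """
    for condition_type, handler in reversed(_handlers.get()):
        if isinstance(condition, condition_type):
            result = handler(condition)
            if result is not None:
                return result
    raise condition  # nobody handled it: fall back to ordinary unwinding


class DiskFull(Exception):
    pass


def write_block(disk, needed):
    """Hypothetical low-level writer that asks its callers what to do."""
    while disk["free"] < needed:
        action = signal(DiskFull(needed))
        if action != "retry":
            raise RuntimeError(f"unknown action {action!r}")
    disk["free"] -= needed
    return "written"


def demo():
    disk = {"free": 10}
    log = []

    def free_up_space(condition):
        # Stands in for the dialog box; write_block's frame is still live here.
        log.append("handler ran")
        disk["free"] += 100
        return "retry"

    with handler_bind(DiskFull, free_up_space):
        result = write_block(disk, 50)
    return result, log, disk["free"]
```

Note that `write_block` never takes a "what to do when the disk is full" parameter: the policy travels through the dynamic binding, which is the convenience point 1 describes.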
It would be great if some of these ideas were massaged into other languages.
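The emulation mentioned in point 3, an exception carrying a tag unique to the enclosing function, can be sketched as follows. This is an illustration in Python under my own naming (`call_with_return`, `return_from` and `find_first_negative` are hypothetical helpers), and it shows exactly the boilerplate a Lisp compiler would insert for you.

```python
class _ReturnFrom(Exception):
    """Carries a value back to the block identified by tag."""

    def __init__(self, tag, value):
        super().__init__()
        self.tag = tag
        self.value = value


def call_with_return(body):
    """Call body(return_from); return_from(value) exits this call with value."""
    tag = object()  # unique per activation, like a block that is on the stack

    def return_from(value):
        raise _ReturnFrom(tag, value)

    try:
        return body(return_from)
    except _ReturnFrom as exc:
        if exc.tag is tag:
            return exc.value
        raise  # belongs to some other enclosing block, keep unwinding


def find_first_negative(rows):
    """Search nested lists, leaving both loops from inside a closure."""

    def body(return_from):
        def check(x):
            if x < 0:
                return_from(x)  # exits the whole search, not just check()

        for row in rows:
            for x in row:
                check(x)
        return None

    return call_with_return(body)
```

Because each activation gets its own tag, nested blocks do not intercept each other's returns; the closure only works while its block is still on the stack, as the text says.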
Is C++'s STL comparable to crhodes's user-extensible sequences?
(http://www.doc.gold.ac.uk/~mas01cr/papers/ilc2007/sequences-20070301.pdf)
Posted 2010-05-16 00:43:08 GMT by
The user-extensible sequences are cool. But the STL is much bigger than that http://www.stlport.org/resources/StepanovUSA.html
Posted 2010-05-26 21:35:00 GMT by
One of the reasons many languages are lacking cool features that already had a long history before these languages came into being might be the fact that most language writers have to rewrite all that stuff when they start a new language.
Btw, Java did not include closures and generics for the simple reason that they didn't have time for it (see http://www.artima.com/weblogs/viewpost.jsp?thread=173229 ). In addition to the language Java itself, those guys had to write the whole JVM, a GC and libraries for networking, multi-threading and so forth.
In the meantime Java got generics, and end of this year they will probably get proper closures. Whether these features - which are nice as such - will help Java survive is a different question.
The emergence of quality "platforms" like LLVM or even higher level ones like the JVM and maybe dot.net could really bring about cool new languages, or great features in existing ones, as language writers finally have time for it. Currently we are seeing 2 languages really gaining steam on the JVM: Clojure and Scala.
In the case of Clojure, this one guy Rich Hickey could never have written such an already pretty mature and stable language with its unique combination of features (STM, sequence abstraction, persistent data structures) in such a short period of time had it not been for the JVM - which gave him a GC, Unicode, multi-threading and JIT compilation, tons of runtime optimization tricks like escape analysis etc. for free, not to mention the zillions of existing Java libraries.
So my hope is that this extra abstraction layer "platform" will provide us with great new (or old) features in languages that run on top of them.
Posted 2010-06-17 12:15:02 GMT by
Posted 2010-05-08 21:30:19 GMT
Played in the codejam then got an FBML Facebook app working: a small-scale recreation of a program by my father that asks progressively harder arithmetic problems. Your score is saved between rounds and I guess I should start adding social friends links and things.
The source is here. It's rather hacky and needs an up-to-the-minute tpd2. Try it out!
Posted 2010-04-22 21:43:11 GMT
I've never found a VM that handled running out of memory properly. Maybe Lispworks? I know that Allegro and SBCL can just crash horribly when you try to allocate too much. And Java tends to bring down my machine. I was a .NET evangelist to all the Scala people I ran into, but now having used it, I have to eat my words.
Take a look at this C# (pseudo-code as I haven't .NET at home).
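var sb = new StringBuilder();
while (true) {
    try {
        // Ask for a large buffer; this is the call that sometimes throws.
        sb.EnsureCapacity(1000000);
        break;
    } catch (OutOfMemoryException) {
        // Force a collection so the GC agrees to grow the heap, then retry.
        GC.Collect();
    }
}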
Even when there is plenty of memory free (2GB available on the machine and only 300MB in the program), if you rapidly and repeatedly allocate this 20MB, sometimes it will throw an OutOfMemoryException. This would normally crash your program but you can just call GC.Collect() to get the GC to agree to allocate more heap for the program. It is insane that it throws OutOfMemoryException when there is so, so much space free to allocate! The fragmentation argument does not wash, as the program could freely allocate more memory. And simply, why doesn't the allocator call GC.Collect instead of throwing this exception? Having managed code in a VM means that you can de-fragment your memory when you feel like it . . .
PS. And while ThreadPool.QueueUserWorkItem is pretty great, parallel for isn't.
Posted 2010-04-18 20:16:00 GMT
Blogs normally show the last few posts on the front page. This is great if you are a frequent visitor. However, if you are just stumbling onto the site for the first time, you will have to dig to find the best content, which is unlikely always to be the most recent.
Consequently, I've changed the blog frontpage to display the hottest articles in larger text sizes. The old chronological order is still available and the Atom feed is unchanged.
Any ideas for how to make the front page prettier are much appreciated! As always the source is on github.
Posted 2010-03-17 00:00:00 GMT
Russ Cox wrote a series of articles on regexps and then released RE2. The idea is to make a general regexp engine with a time and space complexity that does not grow horribly when confronted with a pathological string/regexp combination.
This interested me as when I made cl-irregsexp I found that the performance of competing regexp engines varied wildly. The best was generally the Perl engine which has some very good special case paths.
Years ago, when I released cl-irregsexp I wanted a good headline. I looked for a small example that would stay in the fast case for cl-irregsexp, but which would trip up other implementations. I ended up with searching for either indecipherable or undecipherable: indecipherable|undecipherable. Funnily, in many implementations this is slower than searching for one string then the other.
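A rough way to try the comparison yourself, sketched with Python's `re` module. The haystack here is an invented filler string, not the input used for the timings in this post, so treat any numbers it prints as illustrative only.

```python
import re
import time


def find_with_alternation(text):
    """Position of the first match of the alternation, or -1."""
    match = re.search(r"indecipherable|undecipherable", text)
    return match.start() if match else -1


def find_with_two_searches(text):
    """Same answer using two plain substring searches."""
    positions = [p for p in (text.find("indecipherable"),
                             text.find("undecipherable")) if p >= 0]
    return min(positions) if positions else -1


def time_it(func, text, repeats=5):
    """Total wall-clock seconds for repeats calls of func(text)."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Assumed test input: filler text with the needle right at the end.
    haystack = "lorem ipsum dolor sit amet " * 50_000 + "undecipherable"
    assert find_with_alternation(haystack) == find_with_two_searches(haystack)
    print("alternation :", time_it(find_with_alternation, haystack))
    print("two searches:", time_it(find_with_two_searches, haystack))
```

The two functions agree because neither word is a prefix of the other, so the leftmost regexp match starts at the smaller of the two substring positions.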
If the freshly released RE2 lives up to its interesting and useful claim that complex regexps combined with large strings cannot cause it to blow up, then it should easily handle this straightforward case.
| Implementation | Time (s), smaller is better |
|---|---|
| ruby.rb | 41.18 |
| python.py | 36.49 |
| pcre | 32.12 |
| perl.pl | 24.55 |
| cl-irregsexp.clozure | 12.02 |
| re2 | 10.09 |
| cl-irregsexp.sbcl | 6.24 |
And it does, and handles it well, coming second after cl-irregsexp — taking only 60% more time than cl-irregsexp with SBCL 1.0.36.13 and actually significantly faster than cl-irregsexp on ClozureCL 1.4. This is the one thing that cl-irregsexp specialises in (it doesn't do anything else very well) and it's nice to see a decent competitor emerge.
Combining the silly implementation tricks in cl-irregsexp with Russ Cox's general idea for dealing with regexps as DFA and then compiling them to native code via SBCL (if only computed gotos were added) could make something better than either without being massively complex. The avoidance of exponential blow-up would be nice too. But generally for most practical use, the Perl engine's tuned but unnecessarily self-limiting toolchest will be hard to beat.
Posted 2010-03-16 00:00:00 GMT
I was trying to update the benchmarks for cl-irregsexp and to my surprise I found that as my laptop heated up it was getting slower. The observation was that the more times I repeated a benchmark, the slower it became.
I'd set the cpu_scaling_governors to performance, and made sure that the max scaling frequency was okay. But the CPUs were entering throttling states, as shown by acpitool.
One of the thermal zones was right on the throttle temperature when I looked at it:
Thermal zone 2 : ok, 85 C
Trip points :
-------------
critical (S5): 130 C
passive: 85 C: tc1=0 tc2=3 tsp=10 devices=CPU0 CPU1
The room I am in is unusually hot. I guess this may invalidate some of my benchmark results. Fortunately, people have repeated most of the HTTP ones to get similar numbers, but I shall have to run them again. With the heating off.
Does anybody know a way of checking whether throttling ever kicks in? I guess I shall poll the temperature every few seconds.
PS 20100318. Worked round it for long running benchmarks by forcing the CPU to be in the lowest possible speed, and moving to a colder room.
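For the polling idea, a minimal sketch that reads the Linux sysfs thermal zones, assuming the kernel exposes `/sys/class/thermal/thermal_zone*/temp` in millidegrees Celsius. The 85 C default is taken from the passive trip point in the acpitool output above; the function names are my own. Polling can miss a short throttling episode between samples, so it only gives a lower bound on how often the limit was reached.

```python
import glob
import os
import time


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone."""
    temps = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "temp")) as f:
                temps[os.path.basename(zone)] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # zone vanished or reported something unparseable
    return temps


def zones_at_or_above(temps, limit_c):
    """Names of the zones whose reading has reached limit_c."""
    return sorted(name for name, t in temps.items() if t >= limit_c)


def poll(limit_c=85.0, interval_s=5.0, rounds=3, base="/sys/class/thermal"):
    """Sample rounds times and collect every reading that reached limit_c."""
    hot = []
    for i in range(rounds):
        temps = read_zone_temps(base)
        for name in zones_at_or_above(temps, limit_c):
            hot.append((time.time(), name, temps[name]))
        if i + 1 < rounds:
            time.sleep(interval_s)
    return hot
```

Running `poll` in a background thread for the duration of a benchmark and discarding any run where it returns a non-empty list would be one way to use it.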
Posted 2010-03-15 20:00:00 GMT
Every two minutes, some ignorant fanboy for a particularly opinionated programming language will claim that it can be as fast as C or even faster. Sadly, Common Lispers are not immune to this idiocy. And idiocy it is. Languages inherently don't have speeds.
Firstly, when comparing two languages with the same program, the results depend on how you structure the program and the compilers or language environments you choose. You have to compare best possible programs in each language to start with, which simply invalidates most of these arguments as the C program is usually very badly written.
Secondly, performance in real programs is more about communication than computations. The claim that a language can be as fast as C because it can do floating point operations on data in registers (the typical argument) misses completely that modern performance is dominated by data-locality considerations (caches, talking to other cores, etc.).
Thirdly, C compilers are generally much better at optimizing than compilers for novelty languages. Making a compiler that can produce code as good as modern GCC is pretty tricky. Many novelty languages solve this by compiling down to C code. If I remember correctly, there was a point when BitC had better scores than C on the ever changing programming language benchmark game — despite compiling down to C. Again, this does not make logical sense.
The essential reason why these arguments are ridiculous is that computers have speeds, not languages. If you have a problem and you know that it requires reading through 100TB of data, then you know that you can't do it faster than the time you need to read through 100TB. All your program can do is slow that down. Unfortunately, it is true that many programming environments do slow you down, or make it very difficult to avoid being slowed down.
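The lower bound in the 100TB example is simple arithmetic, sketched here. The 1 GB/s sustained read rate is an assumed figure chosen only to make the numbers concrete; substitute whatever your hardware actually delivers.

```python
TB = 10**12
GB = 10**9


def min_read_seconds(total_bytes, bytes_per_second):
    """No program can finish sooner than the time needed to read the data."""
    return total_bytes / bytes_per_second


def slowdown(measured_seconds, total_bytes, bytes_per_second):
    """How many times slower a real run was than the hardware bound."""
    return measured_seconds / min_read_seconds(total_bytes, bytes_per_second)


# Assumed: 100 TB of input read at a sustained 1 GB/s.
bound_hours = min_read_seconds(100 * TB, 1 * GB) / 3600
```

With those assumed figures the bound is 100,000 seconds, a little under 28 hours; a program that takes 56 hours has a slowdown of 2, whatever language it is written in.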
Once you have decided on an algorithm, then your computer determines how fast it can possibly be, and your programming environment may restrict that more. For example, in many cases the most efficient data-structure for a problem is a bit-array. A common operation needed is to find the first set bit. X86 computers include the bsf/bsr instructions to quickly find that bit. How much trouble is it to use that instruction in your novelty programming language? In GCC, there is a semi-standard asm construct that makes it a cinch to use an instruction like this. Under Common Lisp in SBCL, for example, you would have to mess about with the compiler and defknown.
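To make the operation concrete, here is what "find the first set bit" computes, sketched in Python with a bit-array held in an arbitrary-precision integer. This is the portable arithmetic trick, not the bsf/bsr instructions themselves; Python gives no direct route to those, which is rather the point being made above.

```python
def first_set_bit(x):
    """Index of the lowest set bit of a positive int (what bsf computes)."""
    if x <= 0:
        raise ValueError("need a positive integer")
    # x & -x isolates the lowest set bit; its bit_length is that index + 1.
    return (x & -x).bit_length() - 1


def last_set_bit(x):
    """Index of the highest set bit of a positive int (what bsr computes)."""
    if x <= 0:
        raise ValueError("need a positive integer")
    return x.bit_length() - 1


def set_bits(x):
    """Yield the indices of all set bits, lowest first."""
    while x:
        i = first_set_bit(x)
        yield i
        x &= x - 1  # clear the lowest set bit
```

For the bit pattern 0b101000, `first_set_bit` gives 3 and `last_set_bit` gives 5.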
Simply being unable to utilise the dedicated hardware (even very important things like vector instructions) is a common problem for less mainstream programming environments. But this is a distraction from the main point. Poor data layout in memory and unnecessary inter-core communication are the typical causes for inefficiency.
How much trouble is it to control how structures are laid out in memory in your chosen programming environment? The language specification probably doesn't say much about it and is therefore irrelevant. It's about the compiler/environment implementation, not the language.
Talking about the inefficiencies introduced into programs by a compiler implementation is much more interesting than talking about the (very few) cases where it does well, because that further constrains how you can structure your program. Finding the best algorithm is fantastic fun, but it's diminished by people trying to whitewash the inefficiencies introduced by their chosen programming environment.
Languages don't have speeds, computers do.
> Languages inherently don't have speeds.
Shockingly lots and lots of programmers understand that - but still find it a convenient way to express what is really being discussed.
> ever changing programming language benchmark game
The programming language versions change to keep up-to-date.
The contributed programs change when someone notices an opportunity to improve performance.
The measurements change surprisingly little (mostly when a program is contributed that exploits multi core).
Posted 2010-03-16 01:11:33 GMT by
Regarding the shootout, http://groups.google.com/group/comp.lang.lisp/msg/5489247d2f56a848 "the lifecycle of a shootout benchmark" by Juho Snellman explains what I was complaining about.
If you can point out a programming language performance comparison that is well done, I'd be very happy to read it!
Posted 2010-03-16 16:21:20 GMT by
maybe u know http://shootout.alioth.debian.org/
Posted 2010-03-18 07:44:13 GMT by
> Juho Snellman explains what I was complaining about
Juho Snellman wrote in 2006 about the wholesale changes that happened in 2005.
Here we are in 2010 - perhaps you could complain about something that happened in this decade.
Posted 2010-03-20 17:05:11 GMT by
Every twenty minutes, some ignorant fanboy for a particularly slow programming language will claim that languages inherently don't have speeds.
Posted 2010-05-12 03:28:55 GMT by
Posted 2010-02-23 00:00:00 GMT
I was looking at a disassembly. It contained this
nopl 0x0(%rax)
What is the point of passing an operand to NOP? NOP is the instruction that does nothing. Yet that is not quite true: Intel's US Patent 5,701,442, Method of modifying an instruction set architecture of a computer processor to maintain backward compatibility, suggests that they could opt to use more complex NOP instructions to provide hints like memory prefetch requests. On processors without the prefetch logic the operations would do nothing, but processors with prefetch would initiate memory requests to bring the data into cache. The extended NOPs taking operands that lie in the amd64 and x86 instruction sets are called hinting nops for this reason, but as far as I know they don't yet hint anything (see the PREFETCH instructions near the same opcode code points).
It turns out that these interesting nopls, nopw, etc. are generated by GAS (that is, when using GCC). As the instruction set is encoded in quite a uniform way to simplify the decode stage, there are many instructions that achieve nothing: for example, xchg %ax,%ax (the standard two-byte nop) or leal 0(%esi),%esi, a three-byte nop. Dedicating opcodes for longer NOPs is a sensible way to simplify the CPU's own optimizations.
Aligning jump targets in code to a 16-byte boundary to make sure that the target can be fetched in a single cacheline request is important. However the padding used to flow through to this aligned boundary should be as efficiently encoded as possible, and using only a single instruction to take up eight bytes — nopw 0L(%[re]ax,%[re]ax,1), is more efficient than repeatedly exercising the decode logic on eight one-byte nops. The code to do this is in binutils/gas/config/tc-i386.c:i386_align_code.
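The padding arithmetic can be sketched like this. It only works out how many bytes are needed and how to split them into as few NOPs as possible; the eight-byte maximum is taken from the example above, and the function names are mine. It does not reproduce the actual byte patterns or the logic in i386_align_code.

```python
def padding_to_alignment(offset, alignment=16):
    """Bytes needed to advance offset to the next multiple of alignment."""
    if alignment < 1:
        raise ValueError("alignment must be positive")
    return -offset % alignment


def nop_plan(padding, max_nop_len=8):
    """Split padding into the fewest NOP instructions (lengths only)."""
    if padding < 0 or max_nop_len < 1:
        raise ValueError("padding must be >= 0 and max_nop_len >= 1")
    full, rest = divmod(padding, max_nop_len)
    return [max_nop_len] * full + ([rest] if rest else [])
```

For code ending at offset 0x4005f3, `padding_to_alignment` gives 13 bytes and `nop_plan(13)` gives `[8, 5]`: two instructions to decode instead of thirteen one-byte nops.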
Goes to show how mature the AMD64 instruction set has become — and how far from the days of the Binary Coded Decimal nonsense.