John Fremlin's blog

Dynamic Lisp blog entry demo: rationalize

Posted 2014-03-13 07:00:00 GMT

Given a floating point number, how to go to its representation as a rational?

1.0471976 = 1/3π


Thanks to the RATIONALIZE function in Common Lisp. See the SBCL source.
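For readers without a Lisp to hand, here is a rough Python sketch of the same idea. It is not the algorithm RATIONALIZE uses: it swaps in `Fraction.limit_denominator` from the standard library, which finds the closest fraction with a bounded denominator. The function names and the default bound of 1000 are my own assumptions for illustration.

```python
import math
from fractions import Fraction


def simplest_rational(x, max_denominator=1000):
    """Closest fraction to x whose denominator is at most max_denominator.

    Assumption: a bounded-denominator search stands in for RATIONALIZE here.
    """
    return Fraction(x).limit_denominator(max_denominator)


def as_multiple_of_pi(x, max_denominator=1000):
    """Express x as (some fraction) * pi by rationalizing x / pi."""
    return simplest_rational(x / math.pi, max_denominator)
```

Calling `as_multiple_of_pi(1.0471976)` recovers the fraction 1/3, matching the example above.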

Updated to not show d0 at the end

Posted 2014-03-13 08:03:49 GMT by John Fremlin

wat

Posted 2014-03-19 07:01:40 GMT by Anonymous from 108.15.111.28

<script type=javascript>alert("what")</script>

Posted 2014-03-19 07:03:31 GMT by Anonymous from 108.15.111.28


Enterprise software wheel of life

Posted 2013-09-26 03:42:00 GMT

I've worked at big companies for a while and when planning a software project you need to figure out how to be an organisational team player and fit with all those other teams and their roadmaps. Here's a handy guide to how well another team's project will help yours:

| Official state | Apparent suitability for your project, after meeting the other team | Actual state |
|---|---|---|
| Development | 100% fit, can accommodate your capricious feature requests, designed to scale while consistently providing low latency, beautiful UX in the next quarter/half | Non-existent, vaporware, not used for anything |
| Production | Team tells you to go away until next quarter/half, will not discuss your use-case | Bug-ridden mess failing at its first use-case |
| Deprecated | Unmaintained, so no team to talk to | Years of consistent operation for real use-cases, could easily do yours if it weren't scheduled to be retired in the next quarter/half |

The kicker being, of course, that the next quarter/half never seems to come around.

There is a ring of truth to this, perhaps. Why?

I would say the cause is primarily the typical incentive structure. When proposing a project you need to make the business case, and once that is locked in (the production stage) you don't want to compromise it by taking on something else, as the first use-case determines how you'll be evaluated. And once the system is working, you will get many requests to fix tough issues that have little wider benefit but are important to the users of the system, which strains your relationship with them, so it's time to create another project.

How to fix it? Do not emphasize project ownership (outcomes ownership instead), reward engineers who are willing to get their hands dirty across traditional team boundaries, and let them participate in evaluating the performance of the people who work on those other teams. In nearly every business there are teams with conflicting goals, and often directly conflicting, but it is possible to foster a culture of technical collaboration despite that.

A little thought at the beginning of a project in terms of its design can have a huge effect on the lives of everybody who has to work with it, and a little thought about the way people are included even more so. It's too natural for engineers, and engineering managers, to think there are no major feature requests for their system, simply because they have never spent time with the people who interact with the system every day.


Crazyflie with Leap Motion controller

Posted 2013-09-23 05:28:00 GMT

Connected up a crazyflie quadcopter with a Leap Motion. Kind of fun because you can fly the thing by waving your hand in the air!

There was an issue that prevented takeoff — the Leap would often lose visual identification of my fingers and the software would cut the thrust to zero in that case.

The fix is pretty simple, just turn off the accidental reading protection:

             # Protect against accidental readings. When tilting the had
             # fingers are sometimes lost so only use 4.
             if (len(hand.fingers) < 4):
-                self._dcb(0,0,0,0)
+                print('lost fingers')

I feel the next step is to build a sort of hover control into the device as the pitch and yaw from the Leap are also quite noisy, so they need to be smoothed, which means that the human pilot will not be able to make fine adjustments.
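As a rough illustration of the smoothing trade-off, here is a minimal sketch of an exponential moving average filter. It is not part of the Crazyflie client or the Leap SDK; the class name and the `alpha` parameter are assumptions of mine. A small `alpha` suppresses noise but makes the output lag behind the hand, which is exactly why fine adjustments get lost.

```python
class ExponentialSmoother:
    """Exponential moving average for a noisy scalar reading (hypothetical helper)."""

    def __init__(self, alpha):
        # alpha near 1 follows the raw input closely; alpha near 0 smooths heavily.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.value = None

    def update(self, reading):
        """Fold one new reading into the running average and return it."""
        if self.value is None:
            # The first reading initialises the filter.
            self.value = reading
        else:
            self.value = self.alpha * reading + (1.0 - self.alpha) * self.value
        return self.value
```

One smoother per axis (pitch, roll, yaw) would be fed with each new frame, and the smoothed value passed on in place of the raw one.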

Many thanks to Davey, Mike and Ye for devices and help!


C++'s optional is unmovable

Posted 2013-08-12 04:31:00 GMT

std::optional is a proposal for the C++ standard mirroring Haskell's Data.Maybe.

This elegantly avoids the problem of passing around pointers to doubles or having weird flag values.

In a way, a std::unique_ptr with a nullptr contents is also an odd flag value, and in fact that flag value might already have some contextual significance (e.g. an unused slot in a finite pool), so it would be fine to fit a unique_ptr into a std::optional. But sadly it does not support such types that only have constructors from rvalues, originating as it does from boost::optional which predates move semantics.

Would be great to get this fixed (should be possible just by modifying the library proposal). Even better would be if dealing with such rvalue constructed types in downstream templates did not need so much explicit machinery!

I believe this should already work (as of the latest draft of C++1y, N3690). It's normal for papers to change between what is in the pre-meeting mailing (which N3527 comes from) and the form actually voted into the draft standard (which was N3672). The latter certainly appears to support move-only types. The former also seems to, but it doesn't emphasize it as much.

There's a reference implementation of the std::optional proposal at https://github.com/akrzemi1/Optional -- it'd be interesting to see how smoothly std::optional<std::unique_ptr<T>> works with that.

Posted 2013-08-12 06:52:28 GMT by Richard Smith


A code for code review

Posted 2013-05-06 05:49:00 GMT

In 2001, before I started university, I interned at a company making radio controlled heating valves: why not use code review I asked? Palpably, the quality of technical decisions in open source software like the Linux kernel was much better for discussion around direction — sometimes descending into frankly ad hominem insults but resulting at least in some degree of consideration of alternatives. On the other hand, who wants a layer of bureaucracy? And so we opted not to.

Since coming to Facebook, where code reviews are strongly encouraged and almost enforced, I've done more review than coding — about three to one — which is personally a little frustrating as writing code is more fun. But one reason I do so many reviews is that it is not always easy to get changes in: there are large swathes of the code-base, lying unmaintained, where proposed changes can go unreviewed forever and finding someone who is able to spend the time to consider the ramifications of a modification is often tricky.

What are the duties of a reviewer? There is a school of thought which suggests that these do not extend to verifying the software for correctness. I would disagree: except where the description of how the change was tested is an outright fabrication, the reviewer is responsible for independently assessing the correctness of both the implementation and the assumptions underlying it, including a duty to insist on a proper plan for empirically observing the behaviour of the program. Beyond that, the reviewer should consider the consequences of the change for the wider ecosystem (does it increase load on another system or impose technical debt in terms of fragility to subsequent changes), and should consider alternative approaches. The issue of coding style, especially superficial formatting, should not be the main focus of discussion.

The duties of the coder, the reviewee, comprise foremost a duty to ensure a proper review, which means submitting comprehensible (and therefore small) patches to a reviewer who is capable of understanding their consequences — and sometimes this means insisting on additional consideration of some subtlety that the author may have missed.

The question of how strongly opinions should be expressed in the discussion of a patch is largely a personal preference and in some open source communities vitriolic and scathing remarks are not uncommon (Linus Torvalds being infamous for this). My personal opinion is that the delivery of the message is less important than the content, and the reasoning behind it, which should be made clear. And if the reviewer expresses concerns, the onus is on the reviewee, as supplicant, to placate those, or alternatively to find another more convivial reviewer, rather than to try to bully a change through the process. However, civility and a lighthearted sense of humour are most pleasant to work with!

Sadly, in moments of highest pressure the review process is most circumvented: when the change is very large or even beyond a few hundred lines it is most time-consuming to review, so it becomes tempting to skip the process: but this is exactly when consideration of alternatives can have the greatest benefit. Similarly, when there is a very proximate deadline of some sort it is tempting to short-circuit the review, but exactly then are bugs and wrong decisions most damaging, as there is by definition little time to observe and correct them. Reviews here are most essential and I feel that an additional process requirement of a third pair of eyes might actually be beneficial.

At the end of the day, it's almost certainly easier and quicker to rewrite some code than debug it years later. Good code review means better code, better mutual understanding, better systems and therefore better morale.

I recently had a review of more than 80 files. It was pleasant to read and I learned a lot. Very impressive quality, that changed how I write code myself now. Reviews are very important to educate developers. And reviewers must question everything they read.

P!

Posted 2014-05-01 06:07:40 GMT by Anonymous from 70.36.55.52



A little guide to liblinear logistic regression

Posted 2013-05-05 22:00:00 GMT

The C++ library and toolset liblinear is awesome for sparse large (20M row+) logistic regression — using past data to predict probabilities of an occurrence.

Unfortunately, it has a few gotchas that can catch you out when using the train and predict functionality.

- feature interactions must be computed before passing the data to the package, and text feature labels have to be turned into packed feature indices.

- feature indices are numbered starting from 1, not 0 (the first feature has index 1). If using the C++ interface, to indicate the end of features for a row use a feature_node with index = -1.

- only solver mode 0 (L2 regularisation) and solver mode 6 (L1 regularisation) are for logistic regression, the others are for SVM.

- to benefit from regularisation, scale features appropriately (e.g. divide by standard deviation) or else features that have a wide range of values will be penalised.

- the C parameter controlling the degree of regularisation decreases regularisation the larger it becomes. To get more regularisation make it smaller (e.g. 0.001). To get sparse feature selection, use solver 6 (L1 regularisation penalty) with small C.
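The data preparation steps in the list above can be sketched roughly as follows. This is plain Python that writes the sparse text format read by the train tool (`label index:value ...` with ascending indices); it does not call liblinear itself, and the helper names and the `stddev` argument are my own assumptions.

```python
def add_interaction(features, a, b):
    """Return a copy of features with the product of a and b added as a new feature."""
    combined = dict(features)
    combined[a + "*" + b] = features[a] * features[b]
    return combined


def build_index(feature_names):
    """Map text feature labels to packed indices, starting from 1 rather than 0."""
    return {name: i for i, name in enumerate(sorted(set(feature_names)), start=1)}


def format_row(label, features, index, stddev=None):
    """Format one training row, dividing each value by its standard deviation if given."""
    stddev = stddev or {}
    pairs = sorted(
        (index[name], value / stddev.get(name, 1.0))
        for name, value in features.items()
    )
    return " ".join([str(label)] + ["%d:%g" % pair for pair in pairs])
```

In practice the index would be built once over the whole training set and reused unchanged when formatting rows for prediction.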

This is a great package. Thanks to Dean for much advice, and many thanks to the authors of it at the Machine Learning and Data Mining Group at NTU in Taiwan!


A lisp study in mathematical bugs

Posted 2013-01-21 21:30:00 GMT

On the way to yesterday's Bay Area Lisp meet-up, which was fascinating and had great talks by many speakers and a very generous giveaway of memorabilia by Paul McJones, I made a little game of balls bouncing around — when they collide the ball with the greatest mojo wins. Thanks to a couple of suggestions from Ron it turned into something quite fun.

The collisions of the balls were calculated by stepping the motion for one frame and then checking for overlap; this is manifestly unfair in a plethora of circumstances. On the way back to San Francisco, I set about improving the game by solving the quadratic equation for the moment of collision of two balls moving with constant velocities (it is in general quadratic as it can be formulated as a polynomial in time and there are two solutions: the first intersection when the balls start overlapping and the final intersection when they eventually pass through each other).

I wished to return the solutions from the following expression

(solve-for time
  (- (^2 (+ (ball-r a) (ball-r b)))
     (+ (^2 (+ (- (ball-x a) (ball-x b)) (* time (- (ball-vx a) (ball-vx b)))))
        (^2 (+ (- (ball-y a) (ball-y b)) (* time (- (ball-vy a) (ball-vy b))))))))

Ric gave a talk at the meetup about the importance of choosing the correct point of abstraction for a software project. In this case I decided not to rearrange the equation by hand; I simply wrote out an equation for the distance between the edges of the balls, employing an abstraction in the form of the unwritten solve-for macro. To solve the quadratic one could write a numeric function like Newton-Raphson and that would be one potential implementation of the solve-for macro, but there is an elementary analytic technique called completing the square which is preferable in the case of a quadratic. I wished to implement the rearrangement of terms automatically as doing it by hand is error prone and the result is difficult to interpret or modify.

To that end I implemented the analytic solve-for on the train home. And most of the time was spent in trying to debug the essentially correct approach. My initial assessment was that the main task would be the rearrangement of forms to collect the coefficients of the powers of time in the expanded equation. This in the end went well; it is easy to test step by step after all. Where I stumbled again and again was in the reliable discovery of solutions once the coefficients had been determined.

How can this be? The formula is just (-b ± √(b² - 4ac)) / 2a for the solutions to the quadratic ax² + bx + c = 0. The case where a is zero has to be handled separately, and I did handle it separately, but only when the coefficient a could be statically determined to be always 0. When it became zero at run time because the balls were moving at the same velocity, the solver would crash.
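For comparison, here is a hand-expanded sketch of the same calculation in Python, including the run-time check for a = 0. It is not the output of the solve-for macro; the function name and argument order are assumptions of mine. Writing dx, dy for the position differences and dvx, dvy for the velocity differences, the condition (dx + t·dvx)² + (dy + t·dvy)² = (ra + rb)² expands to a quadratic in t.

```python
import math


def collision_times(ax, ay, avx, avy, ar, bx, by, bvx, bvy, br):
    """Times at which two balls moving at constant velocity just touch.

    Returns [] if they never touch, otherwise [first_contact, last_contact].
    """
    dx, dy = ax - bx, ay - by
    dvx, dvy = avx - bvx, avy - bvy

    # Coefficients of a*t^2 + b*t + c = 0.
    a = dvx * dvx + dvy * dvy
    b = 2.0 * (dx * dvx + dy * dvy)
    c = dx * dx + dy * dy - (ar + br) ** 2

    if a == 0:
        # Same velocity: b is zero as well, so the gap never changes.
        return []

    discriminant = b * b - 4.0 * a * c
    if discriminant < 0:
        # The balls pass each other without touching.
        return []

    root = math.sqrt(discriminant)
    return [(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)]
```

For a ball of radius 1 at the origin moving right at unit speed towards a stationary ball of radius 1 at x = 10, this gives contact at t = 8 and separation at t = 12.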

The bug which confused me the most was that this code

(case (signum ,b2-4ac)
  (-1 ...)
  (0 ...)
  (1 ...))

dealing with the cases of the balls never touching, touching but never overlapping, and overlapping then parting, did not function as I expected despite my test cases. It transpires, to my surprise, that signum is defined to return not a member of the set of fixnums -1, 0, 1 but these numbers in the same type as the original input, which causes this case statement to fall through when passed a float. As most CPUs provide enough information on comparison to distinguish these three eventualities in a single instruction it is rather sad not to be able to exploit it idiomatically. To discover this bug I wrote an alternative to the solve-for that used a simple bisection searching for a change in sign to triangulate the source of my confusion.
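A bisection cross-check of this kind might look roughly like the following Python sketch. It is not the code from the game; the function name, the tolerance and the error handling are assumptions of mine. It needs only a function whose sign differs at the two ends of the interval.

```python
def bisect_sign_change(f, lo, hi, tolerance=1e-9):
    """Locate a zero of f between lo and hi by repeatedly halving the interval.

    Requires f(lo) and f(hi) to have opposite signs (or one of them to be zero).
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if (f_lo > 0) == (f_hi > 0):
        raise ValueError("f has the same sign at both ends of the interval")

    while hi - lo > tolerance:
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_mid == 0:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            # The sign change lies in the upper half.
            lo, f_lo = mid, f_mid
        else:
            # The sign change lies in the lower half.
            hi = mid
    return (lo + hi) / 2.0
```

Comparing its answer with the analytic solution over an interval known to contain the first contact is a cheap way to tell whether a bug is in the rearrangement or in the root-finding.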

Finally, having discovered all these cases, I consider that I should have abstracted out the coefficient solver as a function that takes coefficients and returns the solutions, rather than implementing it inline in the solve-for macro, which should have stayed focused on the task of doing the rearrangement, which it achieved very successfully.

The mistake I made was in not understanding deeply enough the correct prototype or function signature for the coefficient solver function: after reflection and discussion with my friend Richard Smith, I believe it should take in as input the symbolic representation of the equation with all the constants resolved to the velocities of the balls in question, and then as output return an object which can respond to queries for the next solution at a time greater than or equal to a given time. This would allow one to handle tougher functions that may oscillate very rapidly and cross zero an infinite number of times in a finite interval.


vii-secure-autorun: secure encrypted autorun for Linux

Posted 2013-01-09 06:25:00 GMT

Media center computers are ideally without keyboard. However sometimes they need a software patch. Servers need neither input devices nor screen. How to run commands on them if they lose their Internet connexion? And how to do so securely?

Here I present vii-secure-autorun, a system for running commands from removable media like USB drives and DVDs, with the guarantee that only code from trusted sources can affect the machine. With this udev rule it will attempt to mount and check the signature on any ext2 filesystem labeled vii-secure-auto:

ACTION=="add", ENV{ID_FS_LABEL}=="vii-secure-auto", ENV{ID_FS_TYPE}=="ext2", ENV{UDISKS_PRESENTATION_HIDE}:="1", RUN+="/etc/vii-secure-autorun/vii-secure-autorun signed-execute-dev $env{DEVNAME}"

Of course, by simply automounting the removable filesystem it may be possible to exploit bugs in the filesystem drivers and so on, so caveat emptor.

vii-secure-autorun signed-execute-dev /dev/sda1 — mount the device and execute the code on it, umount it, etc.

gpg --export | vii-secure-autorun import-keys — add keys to the trusted keychain

vii-secure-autorun package-sign directory — make a tarball of the files in the directory and sign it; the file that will be executed on unpacking is vii-secure-autorun-exec

Hope it's useful, it is to me!


Lisp interview at Lisp, the Universe and Everything

Posted 2013-01-03 14:26:00 GMT

Proud to be included in Vsevolod Dyomkin's lisp hacker interview series.

I enjoyed answering the questions!


Mail setup for a private domain

Posted 2012-12-31 23:00:00 GMT

Don't like the idea of storing mail in the cloud? Me neither.

My setup: postfix (mailserver), clamav-milter (antivirus), spamass-milter (antispam), grossd (greylisting).

Configuration: etckeeper diff.

Add to /etc/postfix/main.cf

+smtpd_milters = unix:clamav/clamav-milter.ctl, unix:spamass/spamass.sock

