Visiting JPL

I’m at JPL today and tomorrow, visiting a colleague (and hopefully collaborator) at NASA, and attending the first day of the AIRS Science Meeting.

AIRS stands for the Atmospheric Infrared Sensor, an instrument that sits aboard NASA’s Aqua satellite. Along with other sensors onboard Aqua, AIRS provides for detailed atmospheric sensing, especially of greenhouse gases that account for climate change.

In fact, the AIRS science meeting is ground central for some of the climate change work that we mostly hear about through a political lens. It’s fascinating to sit in the JPL cafeteria and listen to atmosphere scientists talk about some of the fundamental problems in climate science (clouds are apparently a BIG mystery in terms of how they influence global temperature).

So what’s a computer scientist doing in this super-charged atmosphere ? The instrument produces gargantuan amounts of data, and there are all kinds of interesting data analysis questions that need to be addressed at scale. What makes these questions interesting is that unlike in much of mining, there is a physical ground truth that we can use to validate any interesting patterns that a mining process might uncover.

Published in: on March 27, 2007 at 7:26 am Comments (2)

An ending…

They say, “all good things come to an end”. What they don’t say is, “if good things didn’t end, you wouldn’t realize how good they were”.

Lance announces today that he’s shutting down the Computational Complexity blog. As far as I know, he’s the ur-TCS blogger, and was the direct inspiration for my starting the Geomblog. I’ve always had the Complexity blog on my must-read list, and have felt both pressured (to write better posts) and inspired (by the rich content) every time I’ve seen a new posting.

Another aspect of Lance’s blog that I envy is the rich commenter community: he was able to create a forum for discussion of TheoryCS, on matters both technical and non-technical, that will be hard to replace.

So thanks for all the posts, Lance, and hopefully you’ll be back one day !

Published in: on March 25, 2007 at 11:47 pm Leave a Comment

4th Carnival of mathematics up

Published in: on at 6:03 am Leave a Comment

Job announcements

For people who don’t subscribe to compgeom-announce:

Published in: on March 21, 2007 at 7:34 am Leave a Comment

Meta-posts

I was browsing my referrer logs and found a callback to Lance’s blog. In the comments for the post, I see this comment (emphasis mine):

Here’s one that I’ve found very vexing: Let A(d) denote the least surface area of a cell that tiles R^d via the translations in Z^d. Improve on the following bounds:

sqrt(pi e / 2) sqrt(d)
(It’s a little more natural to consider A(d)/2 than A(d).)

The lower bound is by considering a volume-1 sphere, the upper bound by monkeying around a little with a cube’s corner.

Any proof of A(d)/2 >= 10sqrt(d) or A(d)/2
For motivation: Either pretend I posted on Suresh’s blog (so it’s just a problem in geometry); or — it’s related to finding the best rate for parallel repetition of a simple family of 2P1R games.

It’s a meta post !!! And by the way, there’s no such thing as “JUST a problem in geometry”. Harumph !!

;)

Published in: on at 7:30 am Leave a Comment

On ranking journals

Via Cosma Shalizi comes a new way of ranking journals developed by Jevin West, Ben Althouse, Martin Rosvall, Ted Bergstrom, and Carl Bergstrom. It works on the PageRank principle: take a random paper from a journal and do a random walk on the citations, using the stationary distribution to rank journals. The result is eigenfactor.org, a website that catalogues many journals and other publications, and provides both a ranking by their “pagerank” but also a “value for money” that might be useful when deciding which journals to purchase for a library.

Impact measurement for journals has become a popular parlor game, as well as impact factors like the ‘h-index‘ for individual researchers. There are all kinds of problems with these measurements in general, and Eigenfactor does provide a way of eliminating some of the usual problems with measuring impact across multiple communities with different citation mechanisms, community sizes, and so on.

Eigenfactor has a few top 10 lists for different areas (science, social science, etc): here’s my informal list of top ranked computer science algorithms (and friends) journals, ranked by article impact factor (all scores are percentiles over the universe of 6000+ journals considered):

  • SIAM Review: 98.35
  • J. ACM: 97.91
  • IEEE Trans. Inf. Theory: 95.05
  • Machine Learning: 93.92
  • SICOMP: 93.04
  • JCSS: 90.93
  • Journal of Algorithms: 90.31*
  • DCG: 85.29
  • CGTA: 81.96
  • Algorithmica: 79.13

* The ACM Transactions on Algorithms, which absorbed most of the editorial board of J. Alg, is too new to show up. This ranking should probably reflect the historical relevance of J. Alg as well as its current state.

Published in: on at 3:53 am Comments (12)

John Backus, 1924-2007

John Backus died on Saturday. He was the father of FORTRAN, Turing Award winner, and one of the names behind the Backus-Naur form (Peter Naur, 2005 Turing Award winner, was the other).

No matter how much you might make fun of FORTRAN today, the fact is that it remains the language of choice for many scientific computing packages. Even if you’re writing in C, you’re probably interfacing with routines first written in FORTRAN.

His Turing Award lecture is also a remarkable read. Titled, “Can programming be liberated from the Von Neumann style“, it laid the groundwork for functional programming, and anticipated many of the stream programming paradigms that we see today, from GPU programs to Google’s MapReduce system. I’m always humbled when I read articles from so many years ago that have such resonance today; I can only hope that anything that I do can still have value even ten years later.

Published in: on March 20, 2007 at 2:55 am Leave a Comment

Anonymizing PDF

File this under ‘late-night LaTeX griping’:

Is there any way of stripping metadata from a PDF file ? I’m writing a referee report for a journal, and used PDFLaTeX to create the report. When I scan it in acroread, there’s all kinds of meta data that could identify me.

Now pdftk is a useful package that can strip out some of the simple metadata like ‘creator’. However, pdftex adds “Advanced” fields, and one of them is the full pathname of the original LaTeX file. If your filesystem (UNIX) is anything like mine, then a part of that pathname is the /<username>/ section, which in many instances is an almost unique identifier. This also happens with dvipdfm, which uses the gs backend to create the PDF file, and with ps2pdf. pdftk cannot strip out these fields, because it doesn’t appear to see them.

I suspect that if I owned a copy of the very-not-free Acrobat, I could meddle around with this metadata. Obviously I could submit the review as a Postscript file, but in general I prefer to maintain PDF. Further this problem also occurs if I want to do due diligence when submitting to conferences with double blind review, and sometimes I don’t have the option to use PS.

Published in: on March 17, 2007 at 7:28 am Comments (18)

Dr. Karp and The TCS Lens

or How I learned to stop worrying and love complexity.

[ed. note: This note was prompted by Aaron Clauset's series of articles titled 'Unreasonable effectiveness' (the title itself a riff on Eugene Wigner's famous essay). Further inspiration came from reading the various summaries of Karp's 'algorithmic lens' idea.]

I believe in the gospel of St. Bernard. We are about to enter the age of the algorithm. The only question is: what kind of algorithm will it be ?

We’re all familiar with the steps. Define the problem. Design an algorithm. Analyze the algorithm. Rinse and repeat, till we have a matching lower bound, or till we hit an even harder problem. Algorithms are designed from the top down, by a coffee-swilling researcher with a whiteboard, with the elegance of a swiss watch, or the clumsiness of a sledgehammer. Some of them are so beautiful they make you weep. Some are algorithms “in name only”; no one in their right minds would ever dream of implementing one of these beasts.

The algorithmic lens flips this on its head: complexity (and the solution) emerges from a set of simple rules. It’s like Spore; set up the rules, start the game, and see what happens. The ingenuity here is not in the construction of a solution, but in the design of the rules. It’s declarative, not procedural. It’s bottom up, not top down. It’s distributed, not centralized.

It has the flavor of (computational) complexity: I’ll give you a resource; a guess, a random coin, a scratch pad, an AskJeeves clone. You tell me what I can do with how little. But it’s still a lens. Why ? Things we’re familiar with show up as distributed, bottom up, emergent constructions: A Voronoi diagram is a battle between equally (or unequally) matched forces. A Steiner tree is a soap bubble diagram. A clustering algorithm is a local “I’ll go to my closest friend”; a regular language is the consequence of amnesiac entities bouncing around between states.

It’s unsettling: we aren’t the masters of our domain any more. We’re not GOD, intelligently designing beautiful watches. We’re just the rule-makers for a complex system. It’s like being the parent of a young child !

What we can do is this: find out which rules yield cause what solutions to emerge. We study social networks, and watch complexity emerge from a set of simple pairwise interactions. In other words, the algorithm IS the social network. You can’t get much more Web 2.0 than that…

If you’ve heard this before, it’s because you have: things like the Ising model, phase transitions, statistical mechanics, Conway’s Game of Life, Stephen Wolfram’s New Kind of Science. So what makes these concepts so exciting now ?

Maybe it’s all about scale. We preach that immense scale requires theoretical analysis; the N0 in the “for all n > N0″ of O() notation is finally here. But maybe linear time algorithms, streaming algorithms, and small space sketching are all soon-to-be-futile attempts to stave off the inevitable horrible failure of programs to manage the complexity of the data we deal with. Suddenly, local algorithms that evolve continuing solutions look quite appealing, as well adaptive.

It’s an exciting time to do algorithms, whether you’re designing watches, or setting up an ecosystem.

Published in: on March 15, 2007 at 7:00 am Leave a Comment

A movie about a font is my kind of movie

One of the curses of making web pages and writing documents is an unhealthy obsession with fonts. People who’ve written papers with me have heard my paens to the Euler math font with barely suppressed irritation, and I often have to stop myself from attempting to call out names of familiar fonts when walking down the street :) .

Sounds like this is the movie for me:

Helvetica, which had its world premiere at the conference, presents the life story of something all of us encounter on a daily (or even hourly) basis. Created in 1957 by the Swiss modernist designer Max Miedinger as a response to the cluttered typography and design of the postwar era, Helvetica’s clean neutrality and balanced use of the empty space surrounding letters quickly made it a go-to font for public signage, advertising, corporate logos and works of modernist design around the world

[...]

Filmmaker Gary Hustwitt revels in his fascination with something so commonplace that it blends almost entirely into a context-less background, becoming a detective of sorts to unveil the myriad everyday places Helvetica is hiding (“It’s a disease,” Hustwitt said of his obsessive font-spotting).

Published in: on March 14, 2007 at 11:07 pm Comments (1)