Individualism, parallel processing and Condor

Thank goodness it’s someone else’s job to get deep down into the technical nitty-gritty and make things go. It’s my job to DREAM and make huge semantic leaps. And that’s how I assert my own individuality on the universe, make my mark. Or dent, in the words of Steve Jobs, re-purposed by Hugh MacLeod.

Recently, I was talking to someone interesting and important in the APL community about what it takes to be an APL programmer these days. He said, “It’s more than just being proficient in Array programming. You need to be able to listen and understand what the customer wants, and do THAT.”

And this reminded me of another conversation, between two former APLers who are both now very successful in business and technology outside of the Array language community. One says to the other, “It took me a long time to realize it’s important to do what my boss wants me to do…” “Yeah!” says the other. “Me too!”

Oh! Individualism! The Array programmers’ tragic flaw.  Where other languages require a whole army to assemble around a problem, for us it’s just one. We don’t need to assemble.  APL makes us independent.

But what if we want to tackle really big problems? And I mean huge, gigantic, impossible-to-imagine-they-are-so-big problems? I mean problems like parallel processing problems.

What the heck is a parallel processing problem anyway?

I asked Peter Keller if he would explain to me a parallel processing problem, because I read a lot of discourse on the subject, but only see very small hints about why I should care.  Not enough, really, for me to sink my teeth into.

As it turns out, his answer was really cool.  He’s working on the Condor project, which is not an Array language project, but it could be.   Here’s an excerpt of what he wrote to me:

In the use cases that I know of, and for which my small contributions are most likely to be used, it would be data processing for high energy physics.

Basically, modern particle accelerators (like the large hadron collider) produce interesting particle event data in the gigabits per second range for ten or more years straight. This data gets stored and routed to entire countries or political organizations to be processed on the vast physics grids to look for statistical correlations in the data or to see if it matches predicted behavior.

The mathematical models are very complex, sometimes being in pipelines of dozens to hundreds of programs, and have to be run upon billions (possibly trillions?) of events whose subsequent data, in varying sizes of kilobytes to terabytes, needs to be moved to the right place, etc, etc, etc. Each scientific research group has their own set of mathematical models, each with their own data pipelines, and there are many groups. The processing of a single event may take 5 minutes or much longer like an hour depending upon the type of analysis being performed on it.

These pipelines already exist… and each one can take 6+ months in real time to run on hundreds to thousands of computers (all going up and down, networks failing, disks filling up, etc, etc, etc). Condor and other batch schedulers are the means by which these disparate workflows are executed (often at the same time in a pool of machines). We provide a robust layer to get the work completed in the face of all kinds of failures. We try very hard to make it that one or two people can manage hundreds of parallel workflows on thousands of machines with little to no human intervention.
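To make that “robust layer” idea a little more concrete for myself, here is a toy sketch in Python. It is not Condor’s code or anything like its real interface; the names `run_with_retries` and `flaky_analysis` are made up for illustration, and the “failure” is just a pretend machine going down twice before a job gets through. The only point is the shape of the idea: failed work goes back in the queue instead of needing a human.

```python
from collections import deque


def run_with_retries(jobs, run_job, max_attempts=5):
    """Run every job, re-queueing failures until each succeeds or runs out of attempts."""
    pending = deque((job, 1) for job in jobs)
    results = {}
    gave_up = []
    while pending:
        job, attempt = pending.popleft()
        try:
            results[job] = run_job(job)
        except RuntimeError:
            if attempt < max_attempts:
                # Put the job at the back of the queue and try again later.
                pending.append((job, attempt + 1))
            else:
                gave_up.append(job)
    return results, gave_up


# Hypothetical stand-in for one analysis step on one event.
# It fails on the first two attempts for each event, then succeeds.
_attempts_seen = {}


def flaky_analysis(event_id):
    _attempts_seen[event_id] = _attempts_seen.get(event_id, 0) + 1
    if _attempts_seen[event_id] < 3:
        raise RuntimeError("machine went down")
    return event_id * 2


results, gave_up = run_with_retries([1, 2, 3], flaky_analysis)
print(results, gave_up)
```

A real batch scheduler presumably also has to worry about which machine runs what, moving the data around, and surviving its own crashes, none of which this sketch attempts.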

High! Energy! Physics! I want to understand high energy physics!

Peter also suggested that I go ask the big physics project entities why they are not using APL or any of APL’s Array language descendants. I expect that asking is the next best thing to being there. So, I will.

Now I shall keep a close eye on my stats to see if I’ve caught your attention.

Oh! I’m still getting hits on my naked Austrian post, by the way.

