Archive for the 'Apache' Category

Pacta sunt servanda

Pacta sunt servanda (“pacts must be respected”) is probably the first or second thing you learn when you enter law school in Italy. Looks like someone at Sun could use some good old Latin to understand why what they’re doing to Java is wrong in so many ways.

I hope the community as a whole understands what the real issue is: the problem is not Sun trying to restrict Java use through questionable strings attached to the Java Compatibility Kit. The problem is not Harmony being at risk of losing the “Java” branding. The problem is not even OpenJDK itself being potentially impaired by the Field of Use restrictions Sun is trying to make the community swallow, which are most probably incompatible with the GPL itself.

All these are serious issues which should be addressed, yet there is a much more prominent conundrum which needs attention from all of us. The whole JCP is at serious risk, as the very basic contract among the different parties (the JSPA) has been breached by the most prominent kid on the block, the one who started it all: if this issue is not resolved, it won’t be just a problem for Harmony, as everyone would feel entitled to attach Open Source-incompatible ties to the TCK or the RI for a given Java spec.

Just look around and consider how much Open Source software relies on JCP-provided specs: from XML processing to JDBC, from servlet containers to portals, the virtuous cycle of interoperable implementations by means of good specs, solid TCKs and usable RIs will be broken into pieces, as spec leads would be able to restrict usage of TCKs, effectively hampering or even prohibiting Open Source implementations.

Let me delve into more Latin: Harmony here is just a casus belli. The stakes are much, much higher.

No Apachecon US for me…

The speaker list is out, and the long-awaited acceptance email didn’t come in. My bad for providing just business-related talks: Apachecon is and should be a mostly-geek event, and the competition for the few business slots is understandably high. Time to flesh out some tech ideas for the next edition…

Looks like I’m going to miss my first Apachecon in four years: given both Ugo and Andrew have been accepted, Sourcesense is going to be both more than well represented and in need of someone at the helm, which means I will probably stay home and enjoy some local work for a change. I will deeply miss hanging around with the other ASF guys, but there will for sure be other chances to meet up. Have fun in Austin, guys!

ApacheCon Dublin

Apachecon Europe 2006 just started, and the first keynote from Mark Shuttleworth is over. I’m exercising my new camera and posting a few pictures on Flickr (you might want to look for the “apachecon2006” tag to see more from other attendees, or even “apacheconeu2006”, “apachecon” or whatever tag the next guy will decide to use).

Meanwhile, I’m frantically finishing up slides for my two talks, and getting some stuff ready for the Open Development BOF I just submitted, hoping a few people will show up to discuss the whole concept and how to approach it. If you feel like chiming in, head to the Ulster room tomorrow at 9PM, we’ll have some fun!

OSBC and Apachecon: ubiquity machine, anyone?

I just realized that the first OSBC Europe is going to happen on the very same days as Apachecon EU. This sucks in so many ways: I know the target audience is somewhat different, but I really don’t think I’m the only one who was planning to attend both.

I’ve been looking forward to OSBC Europe with a lot of anticipation, and we were also planning to have a booth or something, yet seeing that the OSBC producers didn’t give a damn about colliding with one of the major Open Source events out there is not a good sign at all. I’m sure there were plenty of good reasons to choose those dates, yet I really don’t feel comfortable having to choose between the two, and I’m sure I’m not alone. Matt, can you share any insights about this mishap?

The “real Open Source” blogfest

Touching a nerve is always interesting: useful discussion is bound to happen when a lot of people with different backgrounds talk about controversial matters. I guess it’s time to put a few more irons in the fire responding to the feedback I’ve got so far.

Matt: I guess I deserve to be called an elitist after labeling most of the OSS business arena as a bunch of baitware-based suckers. I wouldn’t have stretched the paradigm so far as to include racism in the picture (the Aryan bit was unfortunate in my opinion), but that might just be a problem of language/cultural barrier. Oh, and I don’t give a damn about soccer, despite being Italian. :)

More to the point, I think that as of today there is no way to describe Open Source apart from using the very minimal common definition: a set of licenses with some common principles in terms of non-discriminatory access to software, designed to ease access to source code. As such, Open Source is legitimately up for grabs by anyone willing to comply with a few legal requirements: it’s a very pragmatic concept, which worked really well to turn the software industry tables and is still able to make a lot of difference thanks to its simplicity.

The devil is in the details, though: the easy-to-understand concept behind Open Source isn’t able to differentiate enough between the value of community-developed software and a different way to perform software distribution. The former aims to provide quality solutions via a peer-based production system, achieving notable goals such as avoiding lock-ins while at it, while the latter is able to provide “just” source code at best, and aims to actually lead to lock-ins at worst.

As much as some value source code per se, I’m more and more inclined towards leaving that camp: thanks to the Open Source and Free Software movement, availability of source code isn’t a big deal anymore and tends to be taken for granted even in notable and traditionally proprietary solutions. As Matt himself correctly pointed out in the past, there are quite a few Open Source behemoths out there (OO.o, Firefox) that can’t be touched with a ten-foot pole by the average developer, so what’s the real deal with Open Source then, apart from bare availability? I don’t really think that the net effect of having tons of freely available source code is going to make any difference that matters in the end.

Also, I’m not buying what Matt, Matthew and Ugo are saying about some sort of Darwinian selection being able to discriminate the good from the bad (assuming there is actually “good” and “bad” – I just tend to think we have different objectives): it’s hard enough to move the CIO masses beyond the “Open Source means Linux” meme, go figure explaining why they should care to consider the difference between Open Source built within the virtuous cycle of community based development and Open Source as a pure distribution model of conceptually proprietary and closed to participation code. This is why I really think we need to be more vocal about it, possibly with a new term or brand that clearly specs out what we really perceive as the real value around open development.

Last but not least, I have been invited to check out the Free Software definition and consider it as an alternative. Well, thanks for the heads-up, but I’m still a pragmatic guy: I think that there is still a lot of room between the social implications of the Free Software Foundation guidelines (which I might buy as a natural consequence, not as a given precondition) and the practical effect of healthy communities providing great software because it just makes sense. I remain unsold on forcing freedom down the throat of anyone: technical merits and shared itches are still the best community builders around.

Now, where do I sign up for that Open Source panel? :)

Open Source rants

The nice people at the Apache Marketing Blog have been quoting a rant of yours truly where I’m speculating about what I don’t like in the current Open Source trends. I plan to post a more detailed wrap-up later, but for now it’s good to see I’m not alone!

Maven2 is sweet!

I know this will be no surprise for many of you, yet I’m writing this post for the skeptics still around: if you’re developing Java applications and you’re not considering moving to Maven 2, well, think again. I’ve been in the skeptical camp for far too long and while I don’t plan to enter the zealot crowd anytime soon (there are still a few rough edges), I’m definitely sold on the idea.

After a few years spent juggling jars, keeping Ant skeleton files around and trying to put together best practices and guidelines, I’ve become sick of build processes that, no matter what you do, always end up in spaghetti code that makes Postscript shine as a more manageable alternative. Maven’s standardization works indeed, and what you lose in terms of flexibility is more than paid off in terms of clarity and maintainability. True enough, there is some black magic lying around, and I’m too old/not brave enough to see what has been downloaded in my local jar repository, but the overall result is just astonishing.

What makes Maven great, besides the sound ideas behind it, is the number of great plugins that are made possible thanks to standardization. If you’re an Eclipse user like me, you’ll just love the Eclipse plugin that generates project files automagically, referencing jars in your local repository. And if you’re into web applications like me, you’ll find yourself asking how on earth you managed to survive without the Jetty6 plugin around, which makes webapp development a breeze with a mere handful of configuration lines.

Bottom line: if you haven’t given Maven2 a try yet, this is a very good time to take it for a spin. I, for one, am not looking back.

Wild pipeline API thoughts

(Note: this is a long post, and most certainly the syntax highlighter will make it look funny on your aggregator. You might want to visit the web page to get a better grasp of it)

In 2006 it will be roughly 6 years since I started juggling with XML pipelines. As my few fellow readers might remember, I’m starting to hate XML languages with a passion, but once again this doesn’t mean I don’t like XML anymore and, even more, this doesn’t mean my love for Cocoon is fading. I’m still convinced that pipeline-based processing is the way to go: the road to complex yet maintainable results clearly goes through decomposing the problem into a set of easy steps to be performed sequentially and incrementally.

I’m also still convinced that XML is here to stay, for a number of valid reasons, yet I think that the overall scenario has changed since the original Cocoon vision. XML is possibly the most important player out there, but didn’t manage to pursue its Borgish ambition to assimilate everything else: there is a growing party of people who are realizing how the idea that everything could (and should!) be represented as XML is pretentious at best, and stupid at worst.

This leaves us, however, with two important concepts: we need pipelines, and we need to steer clear of XML when it doesn’t make sense. To achieve the first goal what we need is a generic, easy and intuitive pipeline API. And it should be a programmatic API, because we need pipelines everywhere, and we need them to be easy enough to grasp for the average programmer (think Facade on steroids): what bugs me with the currently available pipeline APIs is how they tend to be clumsy and counter-intuitive. Think SAX as the perfect example of why we need a more generic and easier pipeline API and machinery: in the SAX world if you want to pipe events from foo to bar you just do this:

[java]
foo.setContentHandler(bar);
[/java]

Things however get complicated when baz and boo enter the picture. Now you have to:

[java]
baz.setContentHandler(boo);
bar.setContentHandler(baz);
foo.setContentHandler(bar);
[/java]

Which, counter-intuitively, means building the pipeline starting from the last component and going all the way to the first one. In addition to that, those statements are usually interspersed in code that contains other statements such as creation and configuration of the various pieces. Moreover, from a functional point of view the above code could be rewritten as:

[java]
foo.setContentHandler(bar);
bar.setContentHandler(baz);
baz.setContentHandler(boo);
[/java]

Which, even if it looks better from the user’s point of view (the pipeline steps are now ordered), has no relation to the underlying model. In fact, you could actually obfuscate stuff when considering switching jobs:

[java]
foo.setContentHandler(bar);
baz.setContentHandler(boo);
bar.setContentHandler(baz);
[/java]

The above lines will still work as expected, but I dare anyone to understand who sends events to whom in a real life scenario where those statements might be ten lines away from each other. To me, this just doesn’t sound right.
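To make the point concrete, here is a small self-contained sketch that wires four stages in both the "back to front" and the scrambled order and traces who receives the event. It relies on the standard `org.xml.sax.helpers.XMLFilterImpl`, which forwards each event to whatever content handler it has been given; the `TracingStage` and `SaxWiringDemo` names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class, not part of SAX or Cocoon.
class SaxWiringDemo {

    // A pass-through stage: records its own name, then forwards the event downstream.
    static class TracingStage extends XMLFilterImpl {
        private final String name;
        private final List<String> trace;

        TracingStage(String name, List<String> trace) {
            this.name = name;
            this.trace = trace;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            trace.add(name);
            super.startElement(uri, localName, qName, atts);
        }
    }

    // Wires foo -> bar -> baz -> boo in one of two statement orders,
    // fires a single event into foo and returns the order of the stages it visited.
    static List<String> run(boolean scrambled) {
        List<String> trace = new ArrayList<>();
        TracingStage foo = new TracingStage("foo", trace);
        TracingStage bar = new TracingStage("bar", trace);
        TracingStage baz = new TracingStage("baz", trace);
        TracingStage boo = new TracingStage("boo", trace);

        if (scrambled) {
            foo.setContentHandler(bar);
            baz.setContentHandler(boo);
            bar.setContentHandler(baz);
        } else {
            baz.setContentHandler(boo);
            bar.setContentHandler(baz);
            foo.setContentHandler(bar);
        }

        try {
            foo.startElement("", "doc", "doc", new AttributesImpl());
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [foo, bar, baz, boo]
        System.out.println(run(true));  // [foo, bar, baz, boo]
    }
}
```

Both runs visit the stages in the same order, which is exactly the problem: the order of the wiring statements tells the reader nothing about the order of the pipeline.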

Now enter Cocoon and see how, in its declarative sitemap, it shines from the user’s point of view:

[xml]

[/xml]

This just sounds right: pipeline steps are listed sequentially, as they should be, and everyone now understands who’s first and who’s next. But, heck, this is a domain-specific language by all means, moreover written in XML. No way this stuff can be reused in a different context, and the “strong typing” nature of the Cocoon pipeline (where everything starts with a Generator and ends with a Serializer, assuming not only that the whole world will talk XML but actually that the whole world will talk SAX) makes things even more difficult.

Finally, consider what the Unix geniuses have brought us:

[code]
$ grep index.html access.log | awk '{ print $1 }' | sort | uniq | wc -l
[/code]

I think there’s no better comment for the above solution than this:

“A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.” (Antoine de Saint-Exupery)

Expressing the pipeline concept with just one character (the | sign) is a clear indicator of what could be achieved when thinking about simplicity: the concept is powerful, yet the user space view of it is as simple as it can get (admittedly, a bit opaque, but it doesn’t take much to get used to it).
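For readers less at home in the shell, the one-liner counts the distinct clients that requested index.html. A rough Java equivalent, written with the `java.util.stream` API of current JDKs, reads in the same left-to-right order. The `UniqueVisitors` class and the sample log lines are invented for the example, and `contains` does a literal match where grep would treat the dot as a wildcard.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper mirroring the shell pipeline step by step.
class UniqueVisitors {

    static long count(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains("index.html"))  // grep index.html
                .map(line -> line.trim().split("\\s+")[0])     // awk '{ print $1 }'
                .distinct()                                    // sort | uniq
                .count();                                      // wc -l
    }

    public static void main(String[] args) {
        // Made-up access log entries, first field is the client address.
        List<String> sample = Arrays.asList(
                "10.0.0.1 - - \"GET /index.html HTTP/1.1\" 200",
                "10.0.0.2 - - \"GET /index.html HTTP/1.1\" 200",
                "10.0.0.1 - - \"GET /index.html HTTP/1.1\" 304",
                "10.0.0.3 - - \"GET /about.html HTTP/1.1\" 200");
        System.out.println(count(sample)); // 2
    }
}
```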

So, what do the above snippets bring to us? In this quest for a simple pipeline API we learn that simplicity is key and that the principle of least surprise suggests that the pipeline declaration should happen at once and in an ordered way. Sticking to Java, this leaves us with something like

[java]
pipeline.setupPipeline(List components);
[/java]

or (uglier, but sometimes just effective enough):

[java]
pipeline.setupPipeline(PipelineComponent[] components);
[/java]

Actually I’d much rather see the setup happening during the construction phase, but for the sake of interface design I’ll leave the convenience method for now. This means that our Pipeline interface becomes something like:

[java]
interface Pipeline extends PipelineComponent {

    // Declare all the stages at once, in processing order.
    void setupPipeline(List<PipelineComponent> components);

    // Start processing.
    void start();

}
[/java]

Easy and effective: whoever knows the pipeline concept should be able to grasp how this API works in minutes. We need to talk about the PipelineComponent interface though, and this is related to the next wild idea: pipeline machinery.
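As a rough idea of how little code such an interface demands, here is a minimal sketch of one possible implementation. Since PipelineComponent has not been defined yet, the sketch assumes a bare-bones version with `setNext` and `receive`, plus a `Producer` sub-interface so that `start()` has something to kick off; those two interfaces, `SimplePipeline` and `PipelineDemo` are all assumptions made for the example, and plain strings stand in for whatever really flows through the pipe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Assumed contract: the post does not define PipelineComponent at this point.
interface PipelineComponent {
    void setNext(PipelineComponent next);
    void receive(String item);
}

// Assumption: the first stage is a producer that start() can trigger.
interface Producer extends PipelineComponent {
    void produce();
}

interface Pipeline extends PipelineComponent {
    void setupPipeline(List<PipelineComponent> components);
    void start();
}

class SimplePipeline implements Pipeline {
    private PipelineComponent first;
    private PipelineComponent last;

    @Override
    public void setupPipeline(List<PipelineComponent> components) {
        if (components.isEmpty()) {
            throw new IllegalArgumentException("a pipeline needs at least one component");
        }
        // Wire each stage to its successor, in declaration order.
        for (int i = 0; i < components.size() - 1; i++) {
            components.get(i).setNext(components.get(i + 1));
        }
        first = components.get(0);
        last = components.get(components.size() - 1);
    }

    @Override
    public void start() {
        if (!(first instanceof Producer)) {
            throw new IllegalStateException("the first component must be a Producer");
        }
        ((Producer) first).produce();
    }

    // A pipeline is itself a component, so it can sit inside another pipeline.
    @Override
    public void setNext(PipelineComponent next) { last.setNext(next); }

    @Override
    public void receive(String item) { first.receive(item); }
}

class PipelineDemo {
    static List<String> run() {
        final List<String> out = new ArrayList<>();
        Producer reader = new Producer() {
            private PipelineComponent next;
            public void setNext(PipelineComponent next) { this.next = next; }
            public void receive(String item) { next.receive(item); }
            public void produce() { receive("hello"); receive("world"); }
        };
        PipelineComponent upper = new PipelineComponent() {
            private PipelineComponent next;
            public void setNext(PipelineComponent next) { this.next = next; }
            public void receive(String item) { next.receive(item.toUpperCase()); }
        };
        PipelineComponent sink = new PipelineComponent() {
            public void setNext(PipelineComponent next) { /* end of the line */ }
            public void receive(String item) { out.add(item); }
        };

        Pipeline pipeline = new SimplePipeline();
        pipeline.setupPipeline(Arrays.asList(reader, upper, sink));
        pipeline.start();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [HELLO, WORLD]
    }
}
```

The whole wiring lives in one loop inside `setupPipeline`, so the caller only ever states the order of the stages once.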

(warning: shaky ground ahead, this is the part which needs *much* more thinking)

In the OO world things aren’t quite as simple as in a CLI environment, where all you have are basic pipeline contracts such as “whatever byte stream comes from the left side is pumped to the right side”. We have objects here, and I don’t want this generic pipeline API to be strongly typed as in being able to work just with – say – SAX events or other XML gibberish. What I want is an API which is able to work with many formats in a way that’s transparent to the user: this means that the various pipeline stages should be able to express their contracts in terms of required input and output format. It’s up to the pipeline machinery to provide adapters and bridges so that, say, a pipeline component working with SAX events might be able to cooperate with another component working with streams. This could be accomplished either through some kind of PipelineDescriptor, with annotations or just through different interfaces: whatever keeps things simple makes me happy.

Another nice solution comes (again!) from a chat with Sylvain reminding me of the IAdaptable approach in Eclipse. This solution fits like hand in glove with a world of interchangeable and heterogeneous pipeline component stages, even though I have a few concerns thinking about the added complexity for pipeline component writers in implementing an Adaptable strategy: if the first and foremost objective of this API is simplicity, then writing components should be as easy as possible.

Anyway, the final outcome of all this would be something like this during the pipeline assembly phase:

[java]
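// two adjacent stages, assigned while walking the component list
PipelineComponent current;
PipelineComponent next;

if (next.accepts(current)) {

    // next can consume what current produces: link them directly
    current.setNext(next);

} else {

    // otherwise ask for an adapter bridging the two formats
    PipelineComponent adapter = next.getAdapter(current.getClass());

    current.setNext(adapter);
    adapter.setNext(next);

}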
[/java]

With this mechanism in place, in theory, the pipeline is much more versatile: sticking to the XML world it would be possible to build pipelines using whatever mix of SAX, DOM, StAX, AXIOM and YouNameWhat. Moreover, it would be easy enough to provide adapters to the stream world, tees and nested pipelines (there is a reason for Pipeline to extend PipelineComponent after all).
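The assembly step above can be turned into a loop over the whole component list. The following is only a sketch under heavy assumptions: `TypedComponent` is a made-up stand-in for PipelineComponent that describes its input and output formats as plain strings, and `getAdapter` takes the previous component itself rather than its class, simply because that keeps the toy model short.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for PipelineComponent: formats are just labels such as "sax" or "stream".
class TypedComponent {
    final String name;
    final String in;   // format this stage consumes
    final String out;  // format this stage produces
    TypedComponent next;

    TypedComponent(String name, String in, String out) {
        this.name = name;
        this.in = in;
        this.out = out;
    }

    // True when this stage can consume the previous stage's output as it is.
    boolean accepts(TypedComponent previous) {
        return previous.out.equals(in);
    }

    // Assumption: the consuming side knows how to bridge into its own input format.
    TypedComponent getAdapter(TypedComponent previous) {
        return new TypedComponent(previous.out + "-to-" + in, previous.out, in);
    }

    void setNext(TypedComponent next) {
        this.next = next;
    }
}

class PipelineAssembler {

    // Links the stages in order, slipping in an adapter wherever two neighbours disagree.
    // Returns the first stage of the wired chain.
    static TypedComponent assemble(List<TypedComponent> stages) {
        for (int i = 0; i < stages.size() - 1; i++) {
            TypedComponent current = stages.get(i);
            TypedComponent next = stages.get(i + 1);
            if (next.accepts(current)) {
                current.setNext(next);
            } else {
                TypedComponent adapter = next.getAdapter(current);
                current.setNext(adapter);
                adapter.setNext(next);
            }
        }
        return stages.get(0);
    }

    // Renders the wired chain as "a -> b -> c" for inspection.
    static String describe(TypedComponent first) {
        StringBuilder sb = new StringBuilder();
        for (TypedComponent c = first; c != null; c = c.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(c.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TypedComponent reader = new TypedComponent("reader", "none", "sax");
        TypedComponent xslt = new TypedComponent("xslt", "sax", "sax");
        TypedComponent gzip = new TypedComponent("gzip", "stream", "stream");
        // prints: reader -> xslt -> sax-to-stream -> gzip
        System.out.println(describe(assemble(Arrays.asList(reader, xslt, gzip))));
    }
}
```

The user still lists three stages; the fourth one, the SAX-to-stream bridge, is the machinery’s business.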

Of course I expect the pipeline machinery to do much more than just adaptation: caching, logging, monitoring and management are vital to the pipeline deployer. But the real point of this effort is, once again, simplicity. Once I’m able to do this:

[java]
// Get the default pipeline implementation
Pipeline pipeline = PipelineFactory.getPipeline();

// Set up the pipeline with an array of PipelineComponents
pipeline.setupPipeline(new PipelineComponent[] { reader, transform1, transform2, streamAdapter });

// grab the InputStream from the last component
InputStream result = streamAdapter.getInputStream();

// start processing
pipeline.start();

// enjoy results
result.read()…
[/java]

or this:

[java]
// This time we use SAX events straight away
pipeline.setupPipeline(new PipelineComponent[] { reader, transform1, transform2 });

// connect to the pipeline result
transform2.setContentHandler(myContentHandler);

// start processing and handle events coming in
pipeline.start();
[/java]

or, why not:

[java]
pipeline.setupPipeline(new PipelineComponent[] { reader, transform1, transform2 });

anotherPipeline.setupPipeline(new PipelineComponent[] { something, pipeline, somethingElse });

anotherPipeline.start();
[/java]

Then I could do this from within Cocoon:

[javascript]
function handlePage() {

var pipeline = cocoon.newPipeline([ file("something.xml"), xslt("foo.xsl"), forms(), i18n() ]);

cocoon.sendPipelineAndWait(pipeline);
}
[/javascript]

but also, when Cocoon is not an option:

[xml]
<%@ taglib uri="http://jakarta.apache.org/taglibs/pipeline" prefix="pipeline" %>

[/xml]

Conclusion: if you managed to survive this far, well, congrats and thanks for sticking: it’s been a bumpy ride and there are certainly a ton of rough edges, but the more I think about it, the more I’m convinced that a simple, painless and easy to use Pipeline API could be an invaluable tool. I’d love to use the incredible experience of Cocoon in building solid pipelines to factor out a new and fresh approach that allows anyone to enjoy the power of pipeline-based processing: it’s not going to be easy, but the goal is indeed worth the effort. Now, finding the time to make it happen is a totally different question…

(Unusual) hacking fun

These post-Christmas days are a bit less hectic (but hey, just a tad) than what I’m normally used to surviving, so I’m having some good old hacking fun (you know, the kind of stuff you don’t know how much you’ve been missing until you return to it).

The excuses for firing up that IDE again were multiple: I wanted to take Maven 2 for a spin and, after my recent rant about pipeline languages vs. APIs, I wanted to try some concepts out and see where they might lead. I’m still far away from a solution, but so far I have a few points to make:

  • Maven2 is really, really nice. I’m still reluctant to say that it rocks since I need to see how it behaves with complex stuff, but so far it has way exceeded my expectations. After a few years spent juggling megabytes of jars even for the simplest stuff, seeing that my (automagic, just mvn assembly:assembly) distribution of what I’ve got so far is *just* 4K despite having 15 different library dependencies makes me so happy that I might break into tears any moment now.
  • E4X looks terrific. It’s so immensely powerful that it sparks a lot of weird ideas on how to exploit every single bit of it in my quest to reduce XML-induced overtyping and messy stuff. Now, if only I could convince Rhino to use my (Java) DOM trees directly in E4X instead of having to go through string serialization every time (yuck!) I’d be a very, very, happy puppy (suggestions are welcome, of course).
  • JSON is another neat piece of technology worth visiting. A suggestion from Sylvain revealed how my quest for a comfy pipeline API might be soon over if I manage to bend it a little bit to my needs, and so far it looks promising indeed.

I’m almost positive work stuff will make me drown any moment now, so that I’ll be forced to quit these nice experiments, but I definitely want to pursue the above technologies, see if and how they might help in the OSS stuff I’m involved in and, last but not least, bring them to our projects where it makes sense. Moreover, as my formal New Year’s resolution, I want to commit myself to some hacking on a regular basis: yes, old lawyer farts can code!

Your presentation sucks, your presentation rocks!

I’m an easily bored kind of guy: if you want me to pay attention to what you’re saying, you’d better make sure that the topic is interesting enough AND that you’re presenting it the right way.

This gets extremely important if I’m at a conference and I’m following your talk: there is a good chance I can get Internet access while you’re talking, and if you don’t manage to grab my attention, I will happily switch to surfing and e-mail. So, in the spirit of being constructive, here are a few random suggestions for you to make sure I listen up.

Slides: the root of all evil. I’m strongly convinced that a good talk doesn’t need slides at all, except for graphs, code snippets and nice pictures; however, I also realize that slides are what the audience is expecting nowadays. That being said, I have a whole sleeve of problems with slideware, but let’s stick to the main points:

  • first of all, here is some news for you: we all can read. If your presentation is all about reading slides, then thanks but I can do that myself in a fraction of the time it’s taking you. Not to mention that what you’re saying is no news anymore.
  • for the same reason, mile-long slides should be avoided. Stick to a few bullet points, and make them interesting enough for me to want to hear from you what that catchy phrase was about.
  • do your homework, and study your slides. It sucks so badly when you advance to the next slide and stop for a few seconds to actually read it. Actually, you should start talking about the next slide before it hits my eyes: have me expect something, and I’ll be all ears.

The way you present: in most cases, if attendees’ thoughts were floating as comic balloons, you would see a bold and loud “BORING!” flashing all over the room. Ok, this is technical stuff so you shouldn’t act like a clown, but still there are a few tricks that keep us from snoring:

  • walk around: don’t do your presentation sitting on a table or standing behind a conference desk, as if you were nailed to it. If you move around, the audience will have to follow you, and coincidentally might even hear a word or two of what you’re saying.
  • if you walk around (and you should) do a favor to us all, and buy yourself a wireless presenting mouse: it’s just a few bucks, but it will radically change your audience’s experience. Changing slides shouldn’t require walking to your notebook and clicking a mouse: we’ll get bored in no time flat, especially if you’re the kind of guy who needs to read what the next slide is about. I know it’s just a second or two, but it’s more than enough to kill the attention threshold.
  • use your body: have some gestures accompany your talk. Clap your hands, raise your arms, snap your fingers, squat, duck, tilt: anything will do. Let us know you’re a human being, with moving parts.
  • change your tone of voice: 50-60 minutes are way too long to pay attention to what sounds like automatic Text-To-Speech output. Shout, whisper and talk: the audience will be with you.
  • look at me. Actually, make sure you look everyone in the eye in turn. If you stare at the end of the room, people won’t feel you’re having a conversation with them, and will start wondering where to go for dinner.
  • interact with the audience. Ask for a show of hands, and ask the audience a few open questions. But be careful with it: keep in mind that the audience came to the room to hear something from you, not the opposite. Also, if you’re giving a talk to an international audience, know that you might get less feedback because people are shy about speaking out in a foreign language. And there is little, if anything, worse than an open question with no answers.

Finally, you: given all the points above, ask yourself if you’re the speaker kind at all. This has nothing to do with technical background: I’m sure you know your stuff. But do ask yourself:

  • do you have a sound knowledge of the language you’re speaking, if that’s not your native one? A telling sign is knowing a few jokes and being able to leverage them. If you’re barely able to write short emails and/or you have a frightening accent that won’t let anyone but your fellow countrymen understand what you’re saying, the answer is probably not.
  • do you feel comfortable standing in front of a crowded room? If you’re somewhat shy or easily feel under pressure, that will show up: you’ll speak with a feeble tone, you’ll start muttering stuff, and the audience will turn its attention to something more interesting like counting people in the room and performing statistics over their hair colour.

Presentation is something of an art form: like it or not, technical content is not enough to have people walk out enthusiastically from your room thinking they learnt something or that they should definitely give your stuff a try: the way you’re presenting makes the difference. Make sure you’re entertaining: your audience will thank you, and they will come back to your next talk.