Planet Apache

January 05, 2010

Steve Loughran—AI-hard zombie routing algorithms

Bryan points to this fantastic paper on the AI problems of the current era of zombie-shootup games.

Lovely. And you thought being a zombie was easy?

by Steve Loughran at January 05, 2010 08:59 AM

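Edward J. Yoon—Succeeding as a SW engineer

Last year was exhausting for me.
There were probably several reasons, but one of them is that by attending ApacheCon '09
I used up a whole year's worth of annual leave at the start of the year. Heh.
(You can't expect any support for attending this kind of seminar...)

Three reasons SW developers fail to succeed in Korea

.. This was written by someone named Ryu Han-seok, and I can't help but strongly agree.

At work I was given something called a Skill Set survey,
and all I can do is agree with the "sloppy developer management" point!

Why?

Having your Java/C++ skills evaluated by people who can't even solve the Google CodeJam qualification round
can only be a 'sad thing', can't it?
Just walk around in a Java T-shirt and you're a Java expert. Heh.

For this reason, succeeding as a SW engineer seems really hard.
It can't help but be hard: the developers I've seen around me at work are, without exception, hopeless.
Look closely and they are all just sitting there spinning around in for loops.
(Of course, maybe they see me the same way? Heh.)
Only the self-interest calculations of the sly old snakes in their heads (multi cores) are running in parallel with dazzling algorithms.
(Is that where they spend all the knowledge they've learned so far? Heh.)

But opportunity will come.
Because the world is changing.
There is one condition, though.

"You have to act as Han-seok says." Heh.
Here's hoping 2010 is a year of running in style...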

by Edward J. Yoon at January 05, 2010 06:37 AM

Jon Scott Stevens—sardine - a very partial webdav client for java

For ages now, I've had the need for a webdav client for java that just implemented the features that I needed and wasn't an overcomplicated undocumented mess (cough jackrabbit cough).

Primarily, I just needed the ability to get directory listings, put files on the server and get an InputStream to a remote file. I also wanted it to use a recent version of Apache HttpComponents.

Since nothing like this exists, I created one... it is called Sardine. Hopefully someone else will find this smelly fish useful. Enjoy.

by Jon Scott Stevens at January 05, 2010 01:51 AM

Ben Hyde—Nexus One

Rumor is that Google will offer for sale to the general public a phone based on the Android platform, real soon now. Here is what I hope they do. I hope they announce a slew of phones, not just one. That they offer a shop where you can pick any of N Android phones. In my opinion it would be a mistake to anoint one Android phone as more or less worthy than the other offerings. Further I think that if they want to break the tie between the phone and the cellular service they need to break up the distribution channel in some manner. There are plenty of tiny electronics manufacturing firms all over Asia who can build these things, but they can’t distribute them. No doubt Google wants to fix that.

I think they ought to announce a premier phone that sets a bar for these firms to aspire to, while also putting a bit of a fright into the other smart phone platforms. I expect that phone to have some Google branding juice. But what does that signify? Does it signal that Google is now a cell phone maker (in the eyes of consumers)? I hope not, I hope it only signals that the Google added value is inside (i.e. the proprietary applications). When Sony or LG makes an Android-based phone they will want to have that bit of tattoo on their phones, and then it certainly won’t mean “phone from Google.”

All this assumes that Google’s overarching goal is to make cellular internet access more of a commodity than it is now. That goal, to commoditize one of their key complements, is far far more important than having a profitable phone business. For that to happen you have to disrupt the distribution channel that currently keeps phones, cellular service, and long term contracts in a death grip. Building an alternate distribution channel is key.

Oh and, I’m wondering if the Apple product that is the topic of many rumors will be, ah, pliable.

by bhyde at January 05, 2010 12:51 AM

January 04, 2010

Dave Johnson—The Jazz Connection

Here's something I've been closely involved with during my entire IBM career (almost 9 months now): making software development more social by integrating Rational Team Concert and Lotus Connections.

In case you don't know, Team Concert is Rational's "complete agile collaborative development environment" with integrated source code control, issue tracking, build management and very slick Eclipse and web-based client UIs -- it's a collaborative environment for software developers. Lotus Connections is IBM's comprehensive social software suite with blogs (Roller based!), wikis, social bookmarking, forums, file sharing, social networking and more -- an environment for more general collaboration.

IBM partner Mainsoft has developed an integration between Team Concert and Connections and it's now available as a tech preview. The product makes it easy for developers to hook a software development project up to Lotus Connections, and enables software developers to collaborate with the much wider community of folks involved with a software project including end users, subject matter experts, executives and other stakeholders. As you can see from the list of features, it's a pretty tight integration.

If you want to learn more about the integration, check out the links I referenced above. There's also a short podcast available at Developer Works and there will be sessions at Lotusphere 2010 this month and (with luck) at Rational's Innovate 2010 Conference in June.

by roller at January 04, 2010 11:27 PM

Bryan Pendleton—Ken Watts praises IDEA

I happened to be over on the IntelliJ site looking at some information about IDEA, and I happened to notice that the featured IDEA testimonial is currently from Ken Watts.

I think that their web site chooses the featured testimonial somewhat randomly, so to see Ken's testimonial you need to search for his name on the testimonials page, probably.

I worked for Ken years ago and have nothing but praise for him, so I suppose this is sort of a testimonial for a testimonial-giver :)

by Bryan at January 04, 2010 10:39 PM

Bryan Pendleton—AI in gaming

I'm back at work after an enjoyably-hectic multi-week holiday.

One of the things I did over my holiday was to play a computer game, which I haven't done in a while. This particular game was King's Bounty: The Legend. I've been enjoying this game immensely, as did my father. In fact, I've been enjoying it enough that I might consider giving Armored Princess a try, once it is released (is it already released?).

Anyway, one of the things that fascinates me about computer games is the programming challenge of implementing a computer opponent. I've followed the work in this area for many years, and have even taken a few tries at programming computer opponents myself, once in the context of a railroad strategy game called 1830, another time in the context of random dungeon creation algorithms in a NetHack-style dungeon game. Of course, this is a huge and deep field, filled with lots of fascinating sub-problems.

So it was a delight to stumble across this fascinating presentation by one of the game designers at Valve, discussing some of the ideas and concepts behind the automated computer opponents and automated computer team-mates in the Left 4 Dead game.

The game itself may not be your cup of tea (it's a shoot-em-up game with a horror/zombie theme), but the presentation is mostly about issues that translate well across a wide variety of games, such as how to model unpredictability, how to construct co-operating actors, and how to provide re-playability.

If you're at all interested in what goes into the underlying logic of providing computer player behaviors in a modern game, I'm sure you'll enjoy reading through the presentation.

by Bryan at January 04, 2010 10:23 PM

Dan Diephouse—Selection Bias in Software Sales

The other day I heard someone in a particular niche of the software industry say, “whenever we run into our competitors we always win.” I about choked on the food I was eating at the time. I like the company and they make a good product, but what made this claim particularly astounding was that I’ve heard it from two of their competitors as well.

I’ve heard a similar thing from someone in sales in our company as well. While I believe Mule, Tcat, etc are great products, I surely don’t believe that we just have a marketing problem. If only people would get in front of our product then we’d always make a sale! Yeah, right.

Back to the first example though, how in the world can this be true for all three companies? Well, it comes from what is called a selection bias. Surely you’ve heard of it? Selection bias is when you have a bias in your conclusion based on who you are sampling from. For instance, let’s say you were trying to determine President Obama’s approval rating. If you said, hmmm, I’ll just take samples from urban centers – there are a lot of people around and it’s really convenient – that would be a selection bias. You’d be missing out on large swaths of republican, red state people who are much less likely to approve of Obama.

Similarly I believe that the 3 companies are coming to a wrong conclusion based on who they’re sampling from. This is almost too obvious to state, but when a lead comes in to a company they have already gone through many different gates:

Hearing about your product
Learning what your product is
Evaluating your product
Contacting sales
Buying your product

I’m guessing that these 3 companies actually are only basing their data on those who made it to the last two points. Folks may have given up though on any one of the first three. They may not be hearing about your product. Your product could be described wrong for their use case. Your product may not work out of the box correctly. In any of these cases, you’ll never ever hear about it and you’ll be selling yourself short in the market.
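
The gates above can be sketched as a simple funnel. This is only an illustration: the pass-through percentages are made up (the post gives no figures), and the names `GATES`, `funnel_counts` and `win_rates` are hypothetical.

```python
# Hypothetical pass-through percentages for each gate; the post gives no figures.
GATES = [
    ("hearing about your product", 30),
    ("learning what your product is", 50),
    ("evaluating your product", 40),
    ("contacting sales", 50),
    ("buying your product", 80),
]


def funnel_counts(market_size, gates=GATES):
    """Return the number of prospects left after each gate, in order."""
    counts = []
    remaining = market_size
    for _name, percent in gates:
        remaining = remaining * percent // 100  # integer arithmetic keeps the sketch exact
        counts.append(remaining)
    return counts


def win_rates(market_size, gates=GATES):
    """Compare the win rate sales observes with the win rate over the whole market."""
    counts = funnel_counts(market_size, gates)
    contacted_sales, bought = counts[-2], counts[-1]
    return {
        "seen_by_sales": bought / contacted_sales,
        "whole_market": bought / market_size,
    }


if __name__ == "__main__":
    rates = win_rates(10_000)
    print(f"win rate among leads that reached sales: {rates['seen_by_sales']:.1%}")
    print(f"win rate across the whole market:        {rates['whole_market']:.1%}")
```

With these invented numbers, sales would see a very high win rate while only a small fraction of the market ever buys, which is the gap the selection bias hides.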

Further reading

There’s actually a rather interesting field of research on this as it relates to cosmology. Nick Bostrom in particular has written a great book on it called Anthropic Bias (first 5 chapters are free). He asks questions about how we can reason with our limited perspective – we can’t see the grand scheme of things in the universe after all, we can only see from our planet earth. So how can we make judgments about the probability of life? Or the probability of our species surviving after we know how to produce nuclear bombs? I find it a rather interesting topic. He also has a bunch of other great articles like Global Catastrophic Risks and Where Are They? Why I hope that the search for extraterrestrial life finds nothing.

by Dan Diephouse at January 04, 2010 09:26 PM

Adrian Sutton—More Build Systems and Lots of Links

I’ve been doing a bit more Googling and seem to have hit onto a few key articles that tie into a web of articles around build systems. There’s certainly a lot more options than I’d originally thought.

Build Tools

Gant – Not really a build tool itself but an interesting library for scripting ant tasks in Groovy.
Gradle – Came out of the work on Gant and provides a full build tool with Groovy scripting and leveraging Ant tasks quite heavily under the hood. Uses Ivy for dependency management and promises good things for multi-project builds. Most interestingly though it has transitive dependency support without the need for remote repositories or pom/ivy files.
Schmant – Aims to be comparable to Ant in features but nicer and easier to work with. Uses Java 6 scripting to let you use a wide range of languages to script the build, but the sample build files look a little complex still. Most interesting is the TaskExecutor support for running different build tasks in parallel threads – not sure if it’s easier to use than ant’s parallel task though.
Apache Buildr – rake for Java I guess. Could also be described as maven done right – the build files are kind of POM like, but are actually full ruby classes. I played with this one a bit and it’s very impressive, though its transitive dependency support is still a bit immature.

Maven Info

Also stumbled across some good Maven articles:

How Atlassian uses Maven
Don Brown “Fixing Maven 2” and Don’s blog starting from Making Maven 2 not suck

by Adrian Sutton at January 04, 2010 05:37 PM

Adrian Sutton—On Build Systems

Recently, the subject of build tools and systems has come up again at Ephox and it appears the topic is rising up again around the internet. As part of this I’ve been reading up on and playing with a bunch of build tools to get a feel for their benefits and limitations, so it seemed worthwhile writing up what I find as I go along.

The Projects

Which build tool suits best clearly depends on the type of project you’re working with. I’m currently playing with three quite different projects:

EditLive! – a big, old code base with a few dependencies and a quite complex ant-based build process. While the primary code base is Java, the distribution includes a bunch of JavaScript and other ancillary files, documentation and is packaged up in three or four different distribution packages.
“EPipes” – a brand new, very simple Java library with no dependencies (other than JUnit). Ant and Maven build scripts at the moment.
“E2” – a newish internal web app with quite a few dependencies (including a complex transitive dependency tree), a few sub-projects and the need to include EditLive!

One important thing to note here is that I’m not trying to solve deployment as part of the build script – I’m only interested in building the software ready to be deployed where it’s required, with no server management etc. [1]

The Problems

Complexity

Simple projects have nice simple build scripts, but as you add multiple output formats and various other filtering steps into the build process the build scripts grow in complexity and become hard to understand. The build script becomes a software project in its own right and chews up large amounts of engineering time.

Sub-modules and Internal Dependencies

Breaking code up into separate modules is one of the most powerful ways to reduce complexity and increase maintainability, but if those modules are all just thrown into the same project and built together, it’s very common for interdependencies to leak through the code because there’s nothing to enforce the separation. What you want to be able to do is break the project up into separate sub-projects which are built in isolation so that the build fails if you add unwanted dependencies.

However, that means that instead of building one project, you’re now building multiple projects to get the final output and duplication gets introduced into the build system. In most cases you also want to be able to make changes across the modules without doing a full release or even committing. For example, you may want to add a new configuration option to one of your sub-modules that the main code base requires. Ideally you should be able to try that approach out in your own sandbox without having to commit, build and release the sub-module first.

Internal dependencies are a slight variant on this – they are at least conceptually developed by a different team, so you don’t need to make concurrent changes, but you do need to depend on versions that potentially aren’t publicly released yet. The real challenge here is that they may use a different build system and may not be available in or meet the assumptions of any particular dependency management scheme. Including EditLive! usually presents this kind of challenge.

Transitive Dependencies

When you depend on a library it usually requires a bunch of other libraries in turn. It’s usually pretty easy to grab all of the required libraries and shove them in at the start, but when you later go to upgrade that library or remove the dependency on it, you need to know which libraries it pulled in and whether or not they are still required, which versions are required etc.
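
As a rough illustration of the bookkeeping involved, here is a minimal sketch that works out which libraries stop being needed when a direct dependency is removed. The graph and library names are hypothetical, and versions are ignored entirely; real tools such as Ivy or Maven track far more than this.

```python
def required_libraries(direct, graph):
    """Return every library reachable from the direct dependencies."""
    needed = set()
    pending = list(direct)
    while pending:
        lib = pending.pop()
        if lib in needed:
            continue
        needed.add(lib)
        pending.extend(graph.get(lib, ()))
    return needed


def no_longer_needed(direct, graph, removed):
    """Libraries that can be dropped once `removed` is no longer a direct dependency."""
    before = required_libraries(direct, graph)
    after = required_libraries([d for d in direct if d != removed], graph)
    return before - after


if __name__ == "__main__":
    # Hypothetical dependency graph: library -> libraries it requires.
    graph = {
        "web-framework": ["logging", "xml-parser"],
        "http-client": ["logging", "codec"],
        "xml-parser": [],
        "logging": [],
        "codec": [],
    }
    direct = ["web-framework", "http-client"]
    print(sorted(no_longer_needed(direct, graph, "http-client")))
```

In this made-up graph, removing http-client also frees codec, but logging has to stay because web-framework still pulls it in.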

Build Reliability

We write all kinds of automated tests for our code to ensure it works correctly, but most of the time our build scripts are completely untested. As complexity increases the chances of errors in the build scripts increases pretty dramatically. The challenge is that the build script is what runs the tests, so who tests the tester?

Spin-Up

Creating a new project or a new sub-project takes too long or it doesn’t get started with the right set of quality controls (e.g. it’s not running checkstyle yet, or not checking code coverage etc).

Repeatability

The single most important aspect of a build system is that it is 100% repeatable. An interesting exception at Ephox is that we never do two builds with the same version number. It was originally a poor man’s repeatability – if you want consistency only conduct the test once. We’ve kept it around even though we have a very reliable build system because a) it’s a bit of a safety blanket and b) code signing certificates have an expiry date that we can’t avoid, so the jar signature might be valid on one build and invalid on the other even though we do everything exactly the same.
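
One way to check repeatability in practice is to build the same source twice and compare the outputs byte for byte. The sketch below is an assumption about how such a check might look, not something Ephox is described as doing; `digest_tree` and `differing_files` are hypothetical names.

```python
import hashlib
import os


def digest_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digests[os.path.relpath(path, root)] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def differing_files(build_a, build_b):
    """Return the relative paths that are missing from one build or differ in content."""
    a, b = digest_tree(build_a), digest_tree(build_b)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))
```

A byte-level comparison is strict: archives such as jars embed timestamps, so a check like this will tend to flag them unless the build normalises those.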

False Problems

There are a few false problems that people bring up a lot when discussing build systems. Usually these are actually contributing factors rather than actual problems, sometimes they’re just personal preference showing through.

XML is not a Programming Language

This has to be the most common false problem. It’s pretty clear that there are XML dialects which are in fact programming languages but that’s not really what the complaint is about. The real point here is that the build script is either too complex or too verbose. It might also be a complaint about productivity when editing the build script. It’s important to dig into these particular complaints because understanding what the real problem is leads to finding the right solution. If productivity is the problem, better editing tools are often the best solution. Often ant’s use of XML is being confused with the declarative intent of ant. In other words, with ant your build script isn’t a programming language and isn’t trying to be – you’re just trying to use it wrong (and perhaps the tool isn’t the right choice for you).

Downloading the Internet

This is commonly levelled at maven since it uses its own dependency management system to reduce the initial size of its download. Complaining that maven is so complex it has to download extra modules just to run clean is a red herring – it could just as easily have included those modules with the initial download and been just as complex without needing to download anything. Ant for instance is “so complex” it requires you to program your own clean.

That said, accessing the internet can be a real problem in various ways. Does the build work the same way on the internal intranet as it does when working from home? What happens if the internet connection is unreliable? Does downloading stuff mean that the build isn’t reproducible? Once you identify the real problems, most build tools provide solutions to them in various ways. For example, all of those problems apply to maven by default, but can actually be solved [2].

The Build is Too Slow

This can be a real problem if your build tool happens to be the bottle-neck but that’s fairly unlikely. More likely is that your unit tests are too slow, or that it’s too hard to make the build run in a distributed fashion, or that the complexity has introduced bugs into the build script and it’s doing work that’s either not needed or has already been done once before.

My Build is a Unique and Special Snowflake

It’s easy to believe that your software project is somehow special and there’s no other software that’s built like it out there. More likely though, it’s a pretty close variant on a theme rather than something completely unique to itself. Down this path lies building your own build tool – from scratch at the extreme. A custom built build tool can be the right choice, but it’s no cakewalk. Just look at the pain many newly open-sourced projects have had because they have an unusual build process. Huge amounts of time are spent teaching people how to set up the build environment and compile the project before they can start contributing. That cost is encountered with every new team member even if you never open-source your code. Plus, instead of one project to build you now have two, and that is unlikely to reduce the complexity.

So dig into this false problem further and really understand it:

Are you mixing in deployment stuff into the build and would that be better split out?
Is there something about your project you could or should change to make it easier to build?
Can you build custom modules for your build system rather than trying to do everything in the build script? For example, build an ant task or a maven plugin to perform particularly custom tasks.
Would you be better running an external script as part of the build rather than replacing the whole build process?

The key theme here is to identify the parts of your project that follow common patterns and the parts that are more unusual, then leverage existing tools for the common parts. How much is common and how much is unusual will affect which tool is right, but you can usually avoid having to write a completely custom build system.

The Options I See So Far

There are a lot of build tools out there these days, but here are the ones I’ve found so far that are worth investigating:

ant – All our systems are built around it so far but we need to find better ways of using it to solve the problems we’re seeing.
Maven – The other Java build system. Maven has a real love it or hate it thing so it’s hard to know. I see huge potential with Maven and have huge concerns as well. I can’t put much faith in what I’ve read against Maven though because the articles always seem to lack:
- A sense of rationality. Huge diatribes are easy to find, careful analysis is much less common.
- Using a private maven repository. This is a must for Maven but is usually just mentioned off-hand as a solution that was put in the too hard basket rather than really tried.
- Building custom plugins for custom parts of the build. People who complain maven “just can’t do” something haven’t looked at building a plugin for it, or calling out to an external script etc.
Rake – Rake doesn’t seem to really know anything about building Java projects, so it would have to be combined with another library, possibly Raven, but I need to look further into how best to use Rake with Java projects and into Rake in general.

Anything else that would be worth looking into would be good to know as well. I’m deliberately ignoring make since the build process needs to run cross-platform and while that’s possible in make, it’s a pretty big challenge.

1 – I tend to be of the view that your build and deployment systems should be separate even if they do wind up using the same tools. That way you separate the configuration from your actual code since the deployment step does the configuration rather than it being baked into what gets built. It’s also a pretty good split-point to help reduce complexity and lets you pick a deployment tool that best fits rather than having to use the same tool for both. ↩

2 – whether the effort required to solve them is worth it or not depends on the particulars of the project and potentially how many projects that effort can be amortised over; it usually needs to be done per company rather than per project. ↩

by Adrian Sutton at January 04, 2010 05:15 PM

Daniel Kulp—Apache CXF: 2009 in Numbers

Claus Ibsen wrote up a nice blog post about how Apache Camel did this year using some easily obtained metrics. I thought it was kind of interesting so I wanted to do the same for CXF.

Number of posts on the users list: 6614

Number of posts on the dev list: 1661

Number of messages on commits list (svn commits and wiki changes and such): 3893

Number of JIRA issues raised in 2009: 595

Number of JIRA issues resolved in 2009: 763

Number of JIRA issues raised in 2009, but still unresolved: 51 (only 19 are bugs, 3 are bugs in JAXB where we need a new release of JAXB and 5 were logged in the last 2 weeks while most devs were on vacation, so really 11 bugs unresolved).

The JIRA stats are kind of interesting. Put in a graph:

CXF issues created/resolved graph for 2009

You can kind of see that the last 4 months or so, the CXF community really tried to go through all the old JIRA entries and resolve as many as possible. The result should be a much more stable and bug free product.

by Daniel Kulp at January 04, 2010 03:53 PM

Ross Gardler—Treading the thin line between Free, Proprietary and Open Source Software

For quite some time OSS Watch have been trying to put together an article examining Microsoft’s approach to open source. Today we welcomed the new year with the publication of “Microsoft: an end to open hostilities?”

This has been a very hard piece to write. We felt we needed to talk to as many people as possible, and we needed to sift through significant amounts of Fear, Uncertainty and Doubt along with unnecessarily emotional responses.

Things weren’t made any easier by the fact that every time we felt ready to publish, something else happened that seemed to change the story somewhat and we had to return to our sources for more observations.

During our research for this article OSS Watch have been accused, by an OSI board observer and ASF Member, of being “surrogates” for Microsoft, whilst Tony Hey (Corporate Vice President of External Research, Microsoft) privately expressed concern that OSS Watch was “encouraging academics to use the GPL.” Simultaneously, various free software representatives have pointed out how “naive” they believed us to be by even considering the idea that Microsoft may have genuine intentions with respect to engaging with the free and open source community.

As a non-advocacy advisory service we tend to think that if all sides in a debate believe we are in the wrong, yet all are still talking to us, we are probably doing something right. Certainly none of them can claim us as their own.

Given all this input what did we conclude?

Well, as you would expect, the conclusion is far from clear. On the one side we have Stallman’s (Free Software Foundation) view that “these free programs are meant specifically to prevent the world from freeing itself from non-free software”. On the other side we have Erenkrantz’s (The Apache Software Foundation) view that “every positive and constructive engagement Microsoft has with the open source community (and vice versa) … will continue to chip away at the old perceptions”.

Furthermore, whilst Microsoft may be making concessions to open source and are happy to play with open source when it suits their needs they are also willing to use other methods where it best suits their business. For example, on patents Darren Strange (Head of Open Source Engagement, Microsoft UK) says “Patents drive innovation and they drive openness actually.”

Our own conclusion is that “Microsoft is not simply an unchanging monolith.” The article demonstrates that things within Microsoft are changing. Naturally they are changing in ways that benefit Microsoft as a business, but the good news is that some of these changes also benefit the world of free and open source software.

Over the years I have often quoted Gandhi when looking at Microsoft and their relationship with Free and Open Source Software: “First they ignore you, then they ridicule you, then they fight you, then you win”. FOSS has not “won” yet, but the frontline is moving and it is open source software that is winning.

by Ross Gardler at January 04, 2010 02:18 PM

Rob Davies—Apache ActiveMQ 2009 in Numbers

So Claus started this with his blog on Apache Camel - 2009 in numbers - so I thought it would be interesting to do the same with ActiveMQ (does not include ActiveMQ C#, C++ or Stomp sub-projects). The numbers are not as good as Camel - but still good:

Number of posts to the ActiveMQ user forum in 2009: 4440

Number of posts to the ActiveMQ dev forum in 2009: 4447

Number of commits in 2009:

by Rob Davies at January 04, 2010 01:03 PM

Edward J. Yoon—Apache HTTPD and Tomcat Easy & Fast installation guide

1) Install httpd and httpd-devel using yum.

# yum install httpd httpd-devel

2) Download the latest tomcat and tomcat connectors

# wget http://mirror.khlug.org/apache/tomcat/tomcat-6/v6.0.20/bin/apache-tomcat-6.0.20.tar.gz
# wget http://www.apache.org/dist/tomcat/tomcat-connectors/jk/source/jk-1.2.28/tomcat-connectors-1.2.28-src.tar.gz

3) Compile tomcat connectors

# tar -xzvf tomcat-connectors-1.2.28-src.tar.gz
# cd tomcat-connectors-1.2.28-src/native
# ./configure --with-apxs=/usr/sbin/apxs
# make
# su -c 'make install'

4) Configurations

4-1) Add jk module to httpd.conf.

# vi /etc/httpd/conf/httpd.conf

LoadModule jk_module modules/mod_jk.so
# JkMount /*.jsp ajp13

<IfModule jk_module>
  JkWorkersFile conf/workers.properties
  JkLogFile logs/mod_jk.log
  JkLogLevel error
</IfModule>

4-2) Set the Tomcat/JDK home paths.

# vi /etc/httpd/conf/workers.properties

workers.tomcat_home=/usr/local/src/tomcat6
workers.java_home=/usr/java/jdk_1.6.0.17

4-3) Edit $TOMCAT_HOME/conf/server.xml

...
    <Host name="centos.com"  appBase="/home/test" unpackWARs="true" autoDeploy="true" xmlValidation="false" xmlNamespaceAware="false">
    <Context path="" docBase="" debug="1" allowLinking="true" reloadable="true"></Context>
...

5) Change SELinux policy

If you see something along the lines of:

Sep 15 10:56:57 fc5test kernel: audit(1158314217.408:259): avc:
denied  { name_connect } for  pid=2245 comm="httpd"
dest=8009 scontext=system_u:system_r:httpd_t:s0
tcontext=system_u:object_r:port_t:s0 tclass=tcp_socket

then there is some work to do. Note, the “avc: denied” message references httpd (the Apache daemon) and port 8009 as the destination for a tcp socket connection (this is the Tomcat port from earlier). To allow Apache to perform network connects, you can do the following:

setsebool -P httpd_can_network_connect=1

This will allow Apache to perform network connections and will store this change in the booleans.local file in /etc/selinux/targeted/modules/active so it will be reloaded at next boot.
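
Before blaming SELinux it can help to confirm that something is actually listening on the AJP port. The following is a minimal sketch, assuming Tomcat runs on the same host; `can_connect` is a hypothetical helper. Note that it runs as your own user rather than as httpd, so a successful connection only shows that the port is open, not that the SELinux policy lets httpd reach it.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8009 is the destination port shown in the audit message above.
    print("AJP port reachable:", can_connect("127.0.0.1", 8009))
```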

by Edward J. Yoon at January 04, 2010 08:16 AM

January 03, 2010

Nick Kew—Virgin hadoop/hdfs/C++

I’ve just been playing with the C++ API to Hadoop’s HDFS. All on my newly-installed virgin Linux box, so no baggage! I encountered a few problems, all of which proved straightforward to fix, but which may highlight issues of possible interest to their documentation folks. Recording here while it’s relatively fresh in the mind.

1: I’ve downloaded hadoop, now how do I install it? The docs and README tell me nothing; there’s no INSTALL. I played with quick start in the download directory, but obviously that’s not something you want to keep on doing! Fortunately someone on the wiki tells me: I just move the whole caboodle to /usr/local and set up the paths. And a dedicated hadoop user as suggested there makes sense.

2: Now “hadoop” works and emits a usage message, but as soon as I try to do something it fails. OK, my virgin linux box doesn’t have a JVM installed; just go ahead and install it. The fact that “hadoop” had produced the usage message had led me to suppose it was installed: misleading until “file hadoop” revealed it to be a script!

3: How do I tell hadoop where to keep its filesystem? Quickstart tells me

Format a new distributed-filesystem:
$ bin/hadoop namenode -format

and the wiki is similar. But neither of them tell me where in my filesystem it’ll start writing! If I have to RTFM for that without a clue where in TFM to start, it rather defeats the purpose of a quick start! OK, run it as my newly-minted hadoop user, so filesystem protections protect me from anything I can’t wipe-and-start-again if it seems to be writing to lots of places I don’t want.

Turns out it created stuff in /tmp (which is fine for now, though I think some of what it created is supposed to be persistent). Also lots of log files, in hadoop’s logs dir – which is also fine just so long as I know where they are! Takes a bit more browsing the wiki to find how to configure it – at yahoo’s tutorial pages!

4: Where are all the files? Lots of find and locate required ‘cos they’re not under Hadoop’s /src and /lib directories, and there isn’t an /include! The C++ API has its own directory as an apparent afterthought.

5: Trial and error required to compile the HelloWorld C sample program. I ended up with the following makefile to record paths. Not a problem, but perhaps the docs page could use it:

CFLAGS=         -g -O0 -Wall -c
INCLUDES=       -I /usr/local/hadoop/src/c++/libhdfs \
                -I /usr/lib/jvm/java-6-sun-1.6.0.15/include/ \
                -I /usr/lib/jvm/java-6-sun-1.6.0.15/include/linux/
LDPATH=         -L /usr/local/hadoop/c++/Linux-i386-32/lib/ \
                -L /usr/lib/jvm/java-6-sun-1.6.0.15/jre/lib/i386/client
LIBS=           -lhdfs -ljvm

sample:         sample.o
                $(CC) -o sample sample.o $(LDPATH) $(LIBS)

sample.o:       sample.c
                $(CC) $(CFLAGS) $(INCLUDES) sample.c

6: Finally, I needed to set library path and CLASSPATH. Throwing the kitchen sink at the latter, as recommended in the scanty docs, I end up with the ugly but functional:

export PATH=/usr/local/hadoop/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/hadoop/c++/Linux-i386-32/lib/:/usr/lib/jvm/java-6-sun-1.6.0.15/jre/lib/i386/client:$LD_LIBRARY_PATH
export CLASSPATH=/usr/local/hadoop/hadoop-0.20.1-ant.jar:/usr/local/hadoop/hadoop-0.20.1-core.jar:/usr/local/hadoop/hadoop-0.20.1-examples.jar:/usr/local/hadoop/hadoop-0.20.1-test.jar:/usr/local/hadoop/hadoop-0.20.1-tools.jar:/usr/local/hadoop/lib/commons-cli-1.2.jar:/usr/local/hadoop/lib/commons-codec-1.3.jar:/usr/local/hadoop/lib/commons-el-1.0.jar:/usr/local/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/local/hadoop/lib/commons-logging-1.0.4.jar:/usr/local/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/local/hadoop/lib/commons-net-1.4.1.jar:/usr/local/hadoop/lib/core-3.1.1.jar:/usr/local/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/local/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/local/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/local/hadoop/lib/jets3t-0.6.1.jar:/usr/local/hadoop/lib/jetty-6.1.14.jar:/usr/local/hadoop/lib/jetty-util-6.1.14.jar:/usr/local/hadoop/lib/junit-3.8.1.jar:/usr/local/hadoop/lib/kfs-0.2.2.jar:/usr/local/hadoop/lib/log4j-1.2.15.jar:/usr/local/hadoop/lib/oro-2.0.8.jar:/usr/local/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/local/hadoop/lib/slf4j-api-1.4.3.jar:/usr/local/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/local/hadoop/lib/xmlenc-0.52.jar
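The jar list above could also be generated rather than typed. A rough sketch, assuming the jars sit directly in the Hadoop install directory and its lib/ subdirectory as in the export above; build_classpath is a hypothetical helper, not something Hadoop ships, and HADOOP_HOME is used here only as a convenience variable defaulting to the path from this post:

```shell
# Print a colon-separated classpath made of every .jar file
# found directly inside the directories given as arguments.
build_classpath() {
    classpath=""
    for dir in "$@"; do
        for jar in "$dir"/*.jar; do
            [ -e "$jar" ] || continue          # skip a glob that matched nothing
            classpath="${classpath:+$classpath:}$jar"
        done
    done
    printf '%s\n' "$classpath"
}

# Assumed install location; override HADOOP_HOME if yours differs.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
CLASSPATH=$(build_classpath "$HADOOP_HOME" "$HADOOP_HOME/lib")
export CLASSPATH
```

The result has the same kitchen-sink shape as the hand-written version, but it picks up whatever jar versions are actually installed.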

by niq at January 03, 2010 11:59 PM

James Duncan—Lightroom Graduated Filter Quick Demo

by James Duncan Davidson at January 03, 2010 11:55 PM

Justin Mason—SAY2K10 Doh

Happy new year! Or maybe not. Doh.

Over a year ago, Lee Maguire noticed that a contributed SpamAssassin rule, FH_DATE_PAST_20XX, was naively written — simply to match any date in the year 2010 or later — and would start to false-positive on all mail in 14 months. We made the trivial fix to avoid this (for at least 10 years, by which point the rule would have obsoleted itself through normal means), and I committed it to SVN.
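To make the failure mode concrete, here is a small sketch of how a year-matching pattern of this kind behaves. The post does not quote the rule text, so both patterns below are illustrative assumptions, as is the date_matches helper; grep -E stands in for SpamAssassin's own regex matching:

```shell
# Illustrative patterns only; the real rule text is not quoted in this post.
naive_pattern='20[1-9][0-9]'      # fires on any year from 2010 to 2099
patched_pattern='20[2-9][0-9]'    # fires only from 2020 onwards

# Succeed if the Date header value (second argument) matches the pattern (first).
date_matches() {
    printf '%s\n' "$2" | grep -Eq "$1"
}

date_header='Fri, 01 Jan 2010 00:05:12 +0000'

if date_matches "$naive_pattern" "$date_header"; then
    echo "naive rule fires"            # printed: a legitimate 2010 date is flagged
fi
if ! date_matches "$patched_pattern" "$date_header"; then
    echo "patched rule stays quiet"    # printed: the same date passes
fi
```

A pattern patched this way only postpones the problem by a decade, which fits the "at least 10 years" remark above.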

Problem solved, right? Nope. I’d committed to trunk, but in a moment of inattention had forgotten to backport the fix to the stable release branch, 3.2.x, as well. Nobody else noticed the mistake, and several months later, boom:

Bugger.

Annoyingly, the GA had assigned this rule 3.5 points in the 3.2.0 rescoring run. This meant that the effective default threshold had been lowered from 5.0 points to 1.5, which produced a 2% false positive rate during the first 13 hours of the new year.

After that point, the fix was pushed to the sa-update channel, and anyone who runs sa-update regularly (as they should!) was brought back to normal filtering behaviour.

The rule is superfluous anyway, since it overlaps with a better-written “eval” rule, DATE_IN_FUTURE_96_XX. Accordingly, the most likely scenario is that it’ll be removed.

Personally, I see a few lessons from this:

Obviously, I need to pay more attention. This is easier said than done though, since SpamAssassin has nothing to do with my day job anymore; it’s a spare-time thing nowadays, and that’s a rare resource, unfortunately. :( But still, a chastening result, and I’m very sorry for my part in this screwup.
We need more active committers on Apache SpamAssassin. If we’d had more eyes, the fact that I’d forgotten to backport the fix might have been spotted. We’re definitely in a better situation now in this regard than we were 6 months ago, so that’s good.
IMO, this is a good demonstration of how too many simple rules are risky; without careful vetting and moderation, it’s easy for a bad one to slip past. Perhaps we need to move more towards a DNSBL/network-rule driven approach, although this has its downsides too. Still thinking about this.
It’d be good to fix the GA so that it wouldn’t assign such high points to simple rules like this, without some indication that a human has vetted them and believes them trustworthy.

Daryl posted a good comment on /.:

Clearly we dropped the ball on this one. As far as I know it’s our first big rule screw up in the project’s 10 years. If you’re going to screw up you might as well do it well.

+1 to that!

And to everyone who had to clean up the fallout and spend a holiday recovering lost mails from spam folders… sorry :(

by Justin at January 03, 2010 11:38 PM

Dennis Byrne—Presenting Memory Barriers at SpeakerConf 2010

The one thing I'd like to see more of on conference circuits is the basics. I often find myself more interested in concepts that have been around for decades than in the latest and greatest framework. This year at SpeakerConf I'll be doing another talk on concurrency and I'll be getting closer to the metal. Memory barriers, or fences, are a set of processor instructions used to apply ordering limitations on memory operations. Without memory barriers every mutex, actor or synchronization point in your application is broken; so consider this talk to be relevant to all languages. Looking forward to another great lineup this year:

Steve Vinoski, Neal Ford, Stuart Halloway, Obie Fernandez, Brian Marick, Philippe Hanrigou, Dave Hoover, Pramod Sadalage, Oren Eini, George Malamidis, Matt Deiters, Amanda Laucher, Michael Nygard, Fred George, Dave Thomas, Aslak Hellesoy, Pat Farley, Eric Yew and Robert Martin


by Not Dennis Byrne at January 03, 2010 08:35 PM

Claus Ibsen—Apache Camel - 2009 in numbers

I hope we are all safely into 2010 by now.

Just a quick post on some of the numbers for the Apache Camel project in 2009.

Number of posts on Camel user forum in 2009: 5634
Number of commits in 2009: 4947 (commit mails sent to the public forum).
Number of tickets created in 2009: 1115 (of which 123 are still unresolved).
Number of tickets resolved in 2009: 1447
Number of tickets updated in 2009 but still unresolved: 149
Number of unresolved tickets not updated in 2009 or later: 2

I am sure you can dig up more numbers if you have the time to hunt them down.

From the numbers I notice that we resolved more tickets in 2009 than were created, which means we are up-to-date with the issues reported. In fact there are only 146 (6%) open tickets currently. And for people reporting tickets we are surely bound to take a look, as there are only 2 tickets which were not updated in 2009 at all. These 2 tickets are old tickets about a Groovy DSL and supporting Spring property placeholders. The latter is not possible due to limitations in Spring Framework itself. The former is not in much demand. And you can already use Groovy for Camel DSL. So in case you have any issue with Camel we have a positive track record of looking into it.

Are there any numbers you would like to look at? Ohloh has some numbers for LOC and whatnot. It does not, however, have a solid history, as Ohloh could not cope with the Apache top-level move we did in early 2009, so some of the metrics are not accurate.

by Claus Ibsen at January 03, 2010 05:47 PM

Robert Burrell Donkin—Prize Crowdsourcing: Never Underestimate The Power Of The Cowell

A Guardian commentator ~~demonstrates that a little ignorance goes a long way in satire~~ comments:

It is Mike Myers's Austin Powers villain, of course, who has been cryogenically frozen for so long that he hijacks some nuclear weapons and attempts to hold the world's leaders to ransom for "one meeellion dollars", only to see them dissolve into laughter – a reaction one would imagine would be replicated by the world's greatest technological innovators were David Cameron to turn up and offer them £1m to design and develop a tool to change the face of democracy itself. A meeellion quid, love? I think they can do you a mouse mat for that.

Ah - but this is a Prize, and competitions are often quite different...

An alternative perspective (and the full email):

Cripes. HM’s Loyal Opposition has announced – if elected – a £1m prize for an online platform for large-scale crowdsourcing.
This almost comes onto the radar of big IT suppliers. It’s massive for smart little NGOs; it would have funded about a decade of early MySociety work.

Given a good idea, it should be possible for anyone with motivation to learn enough skills from the internet to assemble a solution from open source components at minimal cost. I'd like to see more procurement opened up to real competition in this way...

Ben Hyde—The Tale of the Traitorous Story

Some time ago I posted about a fun book on Depression-era “stamp scrip”, a variant of local currency that, to hear tell, actually works. In that posting I quoted a story about the surprising, amusing, helpful role that a high-velocity currency can play. Today I happened upon the same story again, but I was bemused that the story has been repurposed. I think it is fair to say that the original story’s purpose was to illustrate why we need more currency in circulation, while the new version appears to be arguing that you shouldn’t trust the currency system. Apparently this story lacks much loyalty to either side in the debate.

Here are the two versions, first the older one:

Charles Zylstra, the enterprising man who first introduced Stamp Scrip to America (in a small western town) tells this story. A travelling salesman stopped at a hotel and handed the clerk a hundred dollar bill to be put in the safe, saying he would call for it in twenty-four hours. The clerk, whose name was A, owed $100 to B and clandestinely he used this bill for the liquidation of his debt, thinking that before the expiration of 24 hours he could collect $100 from his own debtor, whose name was Z. So this 100 dollar bill went to B, who, greatly surprised, used it to pay his own 100 dollar debt to one C, who (equally surprised) . . . and so on, and so on, all the way down to Z, who, with much pleasure, returned the bill to A, the clerk, who, in the morning, restored it to the salesman. And then did A, the clerk, stand petrified with horror to see the salesman light a cigar with it. ”Counterfeit,” said the salesman, “a fake gift from a crazy friend, Abner; but he didn’t put it over, did he?”

Apparently they used to write in paragraphs, but the modernized version is done up in newspaper standard style; one sentence per paragraph.

It’s a slow day in a little east Texas town. The sun is beating down, and the streets are deserted. Times are tough, everybody is in debt, and everybody lives on credit.

On this particular day a rich tourist from back east is driving through town. He stops at the motel and lays a $100 bill on the desk saying he wants to inspect the rooms upstairs in order to pick one to spend the night.

As soon as the man walks upstairs, the owner grabs the bill and runs next door to pay his debt to the butcher.

The butcher takes the $100 and runs down the street to retire his debt to the pig farmer.

The pig farmer takes the $100 and heads off to pay his bill at the supplier of feed and fuel.

The guy at the Farmer’s Co-op takes the $100 and runs to pay his debt to the local prostitute, who has also been facing hard times and has had to offer her “services” on credit.

The hooker rushes to the hotel and pays off her room bill with the hotel owner.

The hotel proprietor then places the $100 back on the counter so the rich traveler will not suspect anything.

At that moment the traveler comes down the stairs, picks up the $100 bill, states that the rooms are not satisfactory, pockets the money, and leaves town.

No one produced anything. No one earned anything. However, the whole town is now out of debt and now looks to the future with a lot more optimism.

And that, ladies and gentlemen, is how the United States of America Government is conducting business today.

It’s fun to think about all the little changes. Things happen faster these days, for example. I like the addition of weather. And a prostitute! How liberated.

by bhyde at January 03, 2010 03:45 PM

Edward J. Yoon—My Hometown.

My hometown is Chunju, the capital of North Cholla Province. (I drove along snowy roads from Bundang to Chunju to call on my parents.)

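My hometown is Chunju. By North Cholla standards, at least, it counts as a city.
I drove through the snow (in a rear-wheel-drive car, no less) to pay my New Year's respects.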

I was the first to arrive, so I could park inside the yard.

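I was the first of the family to arrive, so I got to park in the yard. :)
(For the record, I'm the youngest son, so I rank lowest in the family.)

P.S. It's a four-story building, but my parents use only one shop unit on the ground floor and rent out everything else,
so at first I pitched a tent on the roof, then ended up spending the night with my older brother in the storage shed you can see in front.
It was so cold I thought my face would freeze stiff.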

My parents.

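My parents.
Every time I go to Chunju I notice that its layout is very similar to Bundang's.
In Bundang terms, this spot would be somewhere near the Tancheon stream in Yatap.
They are spending their retirement leisurely cycling along the Jeonjucheon stream.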

by Edward J. Yoon at January 03, 2010 01:15 PM

January 02, 2010

Yoav Shapira—Happy new year! 2009 recap, 2010 resolutions

2009 was an awesome year for me. It really flew by quickly.

I was working at our little internet marketing shop, HubSpot, which grew and is not so little anymore. I am proud of the work we're doing here, psyched with the people we're hiring, and honestly happy to come in to work every single day. For 2010, I hope for more of the same: continued growth, excitement, and challenges.

Alli was in school at Harvard, where she learned a lot, met a ton of great classmates, and generally improved herself. Her year went by fast as well.

We had a good amount of time to hang out with our families, visiting both Israel and Florida multiple times. The flights were good, the journeys memorable, each including a number of special events like my sister's wedding.

We also had a big trip, to China, which we had been looking forward to for a while. In fact, tonight we're celebrating the start of 2010 with many of the China trip friends. I also got to cross off a nice "life todo list" item during this summer, as part of the China trip: to travel around the world in one trip.

Personally, I did not manage to lose as much weight as I wanted to. So that remains a goal, a challenge, and a resolution for 2010. I have about 12 pounds to lose. My new approach in that area involves more natural sports and competition. I just started playing squash with a colleague last week, resumed pickup basketball last month, and will continue playing ultimate frisbee. I may also resume soccer when the weather is better: I haven't played soccer since elementary school!

I am not going to list my favorite music, films, or restaurants. There were plenty of good experiences, as readers of this blog know.

For 2010, it's onwards and upwards! Happy new year, everyone ;)

Yoav Shapira—Movie review: Up in the Air

Last night Alli and I saw Up in the Air, a new movie with George Clooney as a much-traveling consultant who lays people off. It was fun, entertaining, and had at least one or two surprising twists. Overall the movie was actually deeper / more interesting / more engaging than I thought it would be.

Ceki Gulcu—Not allowed to stand up before landing

According to new flight safety regulations, passengers are not allowed to stand up and go to the restroom one hour prior to landing of the aircraft. How about handcuffing all passengers and tying them to their seats so that they do not budge? Better yet, how about restricting air travel on a need-to-fly basis? Surely, that would improve security... The USA, as a civilized western democracy, should

by Ceki at January 02, 2010 07:55 PM

Rich Bowen—That's Hard

Over the last decade or so, I've had a pretty consistent cycle on learning new technologies, which I'd really like to break out of. I start by believing that something is hard - probably too hard for me to learn. And then ... well, I stay there for an absurdly long time. Later on, usually out of necessity, or just because I'm tired of thinking that it's hard, I force myself to try, really hard, even though I probably won't be able to do it. And, lo and behold, it's usually a lot easier than I expected.

One of the unfortunate consequences of this is that I put off, for ridiculously long times, things that I wish to accomplish.

Examples of this include mod_perl, mod_rewrite, C programming, Apache modules, and jQuery. But there are others. I do this a lot.

Two things have brought this starkly to my attention recently. But they both amount to the same thing - complaining about something for *years* before finally doing something about it. And I think that in the geek world we get an awful lot of joy out of complaining about things. But fixing them is so much more full of satisfaction, it's surprising that we often put it off so long.

The first of these was the PHP documentation. I have complained for years that the "how to install php on Apache" documentation was wrong. Yes, I'm ashamed to admit it. Years. Probably five. Perhaps more. And during that whole time, people would come on to the Apache IRC channels, ask "why isn't this working", and I'd say "Because the PHP docs are wrong. Silly PHP people." All that time, of course, I was ignoring a simple truth. I'm one of those PHP people. I use PHP. For most of those years I made my living with PHP. I was one of those annoying people who complains without doing anything. That's embarrassing.

I have repented of my ways, and I'm about half-way through fixing the offending documentation, not only to make it more accurate and less prone to error, but to strengthen the ties between the two communities, which are hurt by jerks like me who spend too much time saying "that's a PHP problem" and too little time saying "let's see how we can fix this."

The other incident is within the Apache HTTPd project. I have been a committer on Apache HTTPd since September 2001, and have done some work on the documentation. But during that time I have said, many times, that I merely write about what other people do, and have declined to accept any responsibility when things don't work quite as people want them to.

Granted, writing Apache module C code is a lot harder than patching Docbook. But it's not impossible. I've just convinced myself that it's Too Hard For Me, even though I wrote the foreword for Nick's excellent Apache Modules book.

Now, I did try once, long ago, to patch one of the Apache modules. mod_access, I believe. And because I had no idea what I was doing - and, more importantly, because I didn't test it once I was done - my change was reverted within five minutes of my committing it. Mostly, I don't talk about that, because that was entirely my fault, and was a stupid mistake. Mistakes are fine. Not testing changes is not excusable.

And I also gave a whiny presentation back in 2006, at ApacheCon in Stuttgart, entitled "Why I Hate Apache." It was amusing, and was better received than it deserved. And, to give credit where it's due, almost all of the things I complained about have since been fixed. None of them were fixed by me. That's a pity. The talk would have been much more effective if it had been accompanied by patches.

But now, because I finally got off my bum and decided to do something, I actually have two changes in Apache, which will be part of the next release. The QSD flag in RewriteRule, and a teensy tweak in mod_autoindex's CSS support (not yet complete). Neither of these is earth-shaking, but they're the first step, I hope, over a hurdle that I set for myself for no particularly good reason.

That was a lot of talk to get to my points.

1) Don't decide that something's too hard before you've tried it. That's not only selling yourself short, it's being selfish with your time and talents.

2) Complaining about stuff without doing anything about it is just adding to the problem. Filing bugs is important and good, but just whining and complaining is a strong disincentive for anyone else to fix it.

So if there's a theme to my goals (resolutions, if you like) for the new year, it's this: Stop complaining and do something about it.

by rbowen at January 02, 2010 05:02 PM

Ben Hyde—ugly

Oh man that is ugly, and as the column on the right shows household net worth is even worse.

by bhyde at January 02, 2010 04:33 PM

Ben Hyde—Eccentric goals

People are writing blog posts about plans, goals, resolutions and such. I am reminded that I like a plan that manifests some eccentricity by this guy, who resolved a year ago to spend a year without getting into a car. Nice. After the revolution all media outlets will be required to report on some happy local who has just succeeded in keeping an exceptionally interesting resolution. Give us all hope we can pull it off. Help to model this fine behavior.

If you see more examples, please pass them along.

A public service announcement: You’ll get 40% more done by keeping your plans secret! So if you’re writing up resolutions and goals, have at it; but consider tearing up the list, and don’t show anybody!

by bhyde at January 02, 2010 04:10 PM

Rich Bowen—New Year's Goals

I made some resolutions last year, but they were more goals than resolutions. This year, I have a list of rather specific goals - things I want to accomplish in the coming 360 days or so. Some of them are rather vague and unmeasurable, like learning or improving a particular skill, (how do I know when I'm good enough?) while some are pretty specific To-Do items.

As I was writing out this list, I found that more than half of the items involve writing in some way. Most of it is technical writing, but some of it is non-technical. And I've spent much of the last two weeks working on the mod_rewrite-related parts of these goals. I've worked on the Rewrite Flags document, and gone through the module documentation several times, all of which is a prelude to working on the new edition of my book. And I'm now one day behind on modrewrite.org.

Other goals involve tasks around the house, such as putting up yet more shelves for our overflowing book collection, and finally cleaning out the garage and attic. But most of the goals involve, in some way, staring at a computer screen.

No, I'm not going to list the goals. I read somewhere, several times this past year, that announcing your goals is the first step down the road to forgetting about them. And several of the skill-improving goals are skills that I'm a little reticent to admit I haven't already mastered.

by rbowen at January 02, 2010 01:11 PM

Sanjiva Weerawarana—Delivering a complete middleware platform under the Apache license

Let me start by wishing everyone a wonderful 2010!

Right from the get-go, WSO2 was designed to be a company that built a complete middleware platform. We set out to target the big guys who have a complete story, except with two key fundamental differences: our technical approach and our business model.

Our technical approach is of course based on Web services and SOA. For the first time in the history of computing, Web services have offered a lingua franca for how systems interact with each other. There were of course many previous attempts, but one camp or the other of the technology industry didn't agree and so there was no "English" of the computer world. Web services has changed that with every major and minor vendor supporting interoperability via Web services (XML, HTTP, SOAP and the rest of WS-*).

SOA, despite the much ballyhooed story of its demise at the beginning of 2009, is not only alive and well, but is in fact kicking butt. SOA is fundamentally an approach for how to build large scale composite systems. As an approach, it mimics the real world's service-oriented economy. As such, SOA is a fundamental concept, not some vendor-driven theory. That said, SOA, like any other technology, has had to live through the Gartner Hype Cycle. If anything, instead of being the year SOA died, 2009 became the year it came out of the trough and started climbing up towards the plateau of productivity.

Of course the fall into the trough was not without reason for SOA and Web services. Much of it was driven by middleware vendors not delivering anything new, anything valuable in the form of SOA middleware. Many of them simply took their existing middleware and rebranded it as the shiny new SOA gimmick. Well that of course doesn't work and the cracks in the story will force you down to the trough ... and it did.

WSO2 is unique in having started from nothing and set off on a path to build a complete middleware platform with Web services and SOA in its heart. The result is simply orders of magnitude less complexity, much better performance and overall greater productivity and lower TCO. These are not random claims from me - these have all come from our users and customers.

We now call it Lean Enterprise Middleware. Try it and see - you'll be shocked at how lean it is, how productive it is and how much money you can save by replacing your legacy or pretend open source middleware stack with ours.

Now let's talk about the business model. Right from the beginning, we made a strong commitment to releasing all of our software under the Apache license and to not attempt any bait-n-switch type acts. Believe me, that took a lot of hard work to keep going .. investors for example have a major issue with the Apache license. Why? Well because you can take any of our software and do whatever you want with it and never ever pay us. We have no legal recourse to making you pay (as dual license business models do) nor any way to force you to pay for the good stuff (as many "commercial open source" companies do). Instead, we rely on delivering real, measurable value to our customers without forcing them to pay us. Our customers love us because they pay for the value we deliver to them, not because we are using the law to force them to pay for the software they use.

When I say you can do whatever, I mean whatever - recently one of our competitors sold a support contract for one of our own pieces of software! Yes, that is possible. In this case the one who will pay the eventual price is the customer who did the stupid thing of buying support from someone who has nothing to do with the software! Remember Oracle's Unbreakable Linux? Well that didn't break Red Hat and neither will this act - it just shows how low some people will go to make a buck.

So today you can download an entire enterprise middleware platform from us without registering, without paying, without any risk of bait-n-switch for absolutely no cost. How can we afford to do that and become a successful business? We have many many customers who happily pay us to provide maintenance, provide help and in general to be their technology partner. So having thousands and thousands of free non-paying users is not a problem for us - that's free marketing and helps us save the world from the ugliness that is IBM, Oracle, etc. middleware.

WSO2 is delivering on the promise to build lean enterprise middleware and deliver 100% of it as open source under the Apache license. Oh yes, we also offer it as various cloud offerings - virtual machines, or online services.

We are the ONLY vendor offering a complete enterprise middleware platform 100% open source under the Apache license.

That was all wrapped up in 2009, a tremendous year for us. In an environment of economic uncertainty, not only did we meet our targets but we beat them. We have been doubling revenue each year and this year was no different. We are on a roll :-).

Looking towards 2010, we have more work to do to make our enterprise middleware platform simply untouchable by anyone else. We're already far ahead of our competitors with our WSO2 Carbon powered platform, but we have several things planned to further leave our competitors in the dust. As I wrote in an earlier blog, we practice open development - so if you want to be part of it come on over and join us on architecture@wso2.org!

by Sanjiva Weerawarana at January 02, 2010 07:03 AM

January 01, 2010

Jon Scott Stevens—Which build tool to use...

This blog posting is mostly spot on.

I guess I need to try out Rake, but the idea of using Ruby to solve a Java problem seems like a bad idea too. Ant, with all its warts, is still my preference after all these years.

My favorite quote from that posting:

"Maven builds are an infinite cycle of despair that will slowly drag you into the deepest, darkest pits of hell (where Maven itself was forged)."

I couldn't agree more with everything the author said. I've been there. If you are using Maven to do builds, you should have your head checked (or even better, commit privileges revoked).

by Jon Scott Stevens at January 01, 2010 11:44 PM

December 31, 2009

Howard M. Lewis Ship—Clojure 1.1 is out ... plus videos about new features

So, Clojure 1.1 is now available, with lots of cool new features, including transients, pre & post conditions, futures, promises and a boat load of other stuff. Rich Hickey has put together release notes.

Meanwhile, if you are curious about some of these new features, check out this series of videos by Sean Devlin.

I'll be speaking about Clojure and Tapestry at CodeMash 2.0.1.0 this year, January 13-15.

by Howard at December 31, 2009 11:33 PM

Daniel Kulp—Apache CXF has another book!

Just a week or so after the first book for Apache CXF was released, a second book now appears. Developing Web Services with Apache CXF and Axis2 (3rd Edition) is an update to Kent Ka Iok Tong’s book about Axis2, but it now covers CXF as well.

Being on holidays, I haven’t had time to look much at it (I’m only 1/4 of the way through the first book), but a quick glance through provided me a nice surprise. I kind of expected that it would be geared more toward Axis2 (since it’s a 3rd edition of the Axis2 book) with small “Now here’s how you do the same thing with CXF” type sections. However, it looks to be completely the opposite. The initial examples, screen shots and other material in the text use CXF and JAX-WS, followed by a small “Now here’s how you do the same thing with Axis2” section.

Anyway, after the holidays, I hope to review both books in more detail.

by Daniel Kulp at December 31, 2009 07:51 PM

Nick Kew—Interesting Times

2009 is ending. In some places it’s already 2010; here we have between five and six hours to go. And we live in interesting times.

Featuring large on the 2010 calendar is our election, and what the new government will do with the economic disaster and the legacy of fiscal incontinence on a mindblowing scale. The current government is more bankrupt even than the country (and I don’t just mean financially). The only real alternative – the Tories – don’t look promising. Neither does the third party, the libdems. I expect the small parties to benefit, and alas the xenophobic [BN|UKI]P may well outperform the greens among the minor parties, as they’re seen as a more powerful protest vote.

Nationally I have to support the Tories, as the best chance to end thirteen years of the most blatantly corrupt government in our history (no, this has nothing to do with MPs’ expenses). Not, I hasten to add, with any enthusiasm: rather with my nose firmly held and screaming “none of the above“, but alas, that’s not an option.

But they’re setting themselves up for a huge fall. By being far too timid on the economy, they’re walking right into a whirlwind of blame for the coming collapse. Four years of cold turkey followed by another Labour government is a truly ghastly prospect. I want to hear a credible plan now! Don’t pretend it’ll be painless for the majority. Don’t pretend a bloated NHS can be ringfenced. Don’t pretend all that debt can be swept under the carpet indefinitely. Tell us the worst now, so you have a mandate for what you have to do! Because if you play labour’s game and downplay the problem, you’ll deserve (as well as get) the blame for killing off “the recovery” and plunging us into a deeper recession.

The most interesting prospect I can see to tackle the broken economy is Philip Hammond, though that’s based on very little knowledge. Perhaps a higher-profile role for him (who needs Osborne?) would be a good start. Googling him for a link, I see the Adam Smith Institute have another interesting idea.

by niq at December 31, 2009 06:38 PM

Sander Temme—OK Apple, Where Is It?

Apple says: “Apple will support Microsoft Windows 7 (Home Premium, Professional, and Ultimate) with Boot Camp in Mac OS X Snow Leopard before the end of the year. This support will require a software update to Boot Camp.”

My VMWare VM is running Windows 7 on the Boot Camp partition, but I’m waiting for this new version of Boot Camp so I can boot Windows 7 directly on the metal. It’s the end of the year. Where’s my update?

by Sander at December 31, 2009 05:48 PM

Arjé Cahn—2010: the year the E-Reader will become the content managers’ favorite productivity tool

Flipping through my usability notes, I noticed one of our Hippo CMS users mentioned the following about proof-reading:

“There is no printing facility for content in Hippo CMS so everything has to be done on screen. Even if we could print, we can only see a single field, not the whole article, so it's a bit pointless trying to proofread this way. This leads to long periods of time staring at a screen. It's also very difficult to spot typos on screen, leading to potential loss of quality in copy.”

Exactly. Often enough, I too spot typing errors only after printing out my text (like this blog post) and walking through it again while commuting on the train. The letters start to dance before my eyes when I’m behind a monitor for too long. Sometimes it helps a lot to just put your text aside, do something else for a while, and then return to it at a quiet moment to walk through what you’ve written. So, why not implement printing functionality in Hippo CMS, to allow authors to take their text with them? Everybody got so used to printing out Word documents anyway that it baffles me that this idea had never occurred to me before.

So I put the print function on our roadmap.

But somehow, it feels wrong. What happened to the paperless office? It’s almost 2010 (it’s the 31st of December, 11:45 in the morning, as I write this), Copenhagen happened only a few weeks ago, I’ve been driving the most economical car on earth for years, and I use the printer as little as possible. I don’t want to have a tree cut down for me to be able to proof-read my blog post!

Enter the e-reader. 2009 was the year of the e-reader. A number of those devices already boast a keyboard and the ability to add annotations to texts, store them and synchronize them with your PC. And every modern writer nowadays carries an e-reader to read their books anyway, right?

So here’s the thought: let’s make a tool that allows you, after a long day of writing, to take all the texts you’ve worked on with you on your e-reader. You grab your reader again when you sit on the train, where you walk through all passages for typos and make annotations. The following day, you import all those changes back into Hippo, or maybe you’ve already sent them to the content repository over wireless email.

I don’t know whether the latest generation of e-readers is already open enough to share annotations with a content management system. Maybe we’ll have to wait for that to happen in 2010. But at least I found a very good excuse to rush downtown and treat myself to another gadget to go try it out!

Brian McCallister—Bulkheads

Bulkheads are used in ships to create separate watertight compartments which serve to limit the effect of a failure – ideally preventing the ship from sinking. The bold vertical lines in Samuel Halpern’s diagram illustrate them:

If water breaks through the hull in one compartment, the bulkheads prevent it from flowing into other compartments, limiting the scope of the failure.

This same concept is useful in the architecture of large systems for the same reason – limiting the scope of failure.

Take a very simple system, say something that easily partitions by user, like a wish list of some kind. We can put bulkheads in between sets of app servers talking to distinct databases. In this system a given app server only talks to the database in its partition.

Given this setup, if a single app server goes berserk and starts lashing out with a TCP hatchet at everything it talks to, no matter how angry it gets it only takes out a vertical slice of the system; the rest goes about business happily.
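To make the partition-by-user idea concrete, here is a minimal sketch in Java. The class names, the hash-based assignment and the host names are all assumptions for illustration; the post does not say how users are mapped to partitions. The point it shows is only that a user always resolves to one partition, and that partition carries its own app servers and its own database.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: routes each user to exactly one vertical slice.
class PartitionRouter {

    /** One vertical slice: its own app servers and its own database. */
    static final class Partition {
        final String name;
        final List<String> appServers;
        final String database;

        Partition(String name, List<String> appServers, String database) {
            this.name = name;
            this.appServers = appServers;
            this.database = database;
        }
    }

    private final List<Partition> partitions;

    PartitionRouter(List<Partition> partitions) {
        if (partitions.isEmpty()) {
            throw new IllegalArgumentException("need at least one partition");
        }
        this.partitions = partitions;
    }

    /** Assumption: users are assigned by hashing the user id. */
    Partition partitionFor(String userId) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(userId.hashCode(), partitions.size());
        return partitions.get(index);
    }

    public static void main(String[] args) {
        PartitionRouter router = new PartitionRouter(Arrays.asList(
                new Partition("p0", Arrays.asList("app0a", "app0b"), "db0"),
                new Partition("p1", Arrays.asList("app1a", "app1b"), "db1")));
        Partition p = router.partitionFor("some-user");
        System.out.println("some-user -> " + p.name + " using " + p.database);
    }
}
```

A failure inside one partition's app servers or database then affects only the users hashed to that partition. Simple modulo hashing reshuffles users when partitions are added, so a real system would likely use a lookup table or consistent hashing instead.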

If we take a slightly fancier system (ie, slightly more realistic) we can see we develop (mostly) identical vertical slices:

On a ship we’d call the groups compartments, but we’ll call them clusters because each vertical bunch of stuff forms a logical unit which can be thought of as one thing (say, a cluster!). In this setup, if one of the caches started blackholing requests, the damage done (hopefully just a small latency bump, up to a reasonable timeout) would stop at the bulkheads around the cluster. Yea!

If we look closely at the slightly fancier system, we note that a cluster consists of:

3 App Servers
2 Caches
2 Log Servers
4 Somethings
1 Database

Typically, you can use clusters as units by which to add capacity, and the exact contents of the cluster will be determined by finding the limiting element (usually the one which needs to maintain lots of state), on the most constrained axis of scale embodied in the cluster, and sizing out the rest of the elements based on their capacity relative to the limiting element. Add to this enough capacity to handle spikes, provide acceptable redundancy, and voila, you have a cluster. In theory.

In practice, some things simply do not work well with hard boundaries like this. In this example, note that the load balancers are not part of a cluster, but span clusters – they need to as they are responsible for determining which cluster can handle a given request!

It gets worse: notice that we have two log servers per cluster. Given a reasonable number of clusters, say 25, that amounts to 50 log servers. A single log server (in this case) is capable of servicing about 1000 app servers, but logs are really important so we need to run them redundantly, hence two per cluster. Given 25 clusters and three app servers per cluster (75 app servers in total), a single log server has plenty of capacity, yet we have 50 for fault isolation in this setup. The accountants are not happy.
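The arithmetic behind the accountants' unhappiness can be sketched as below. The figures (25 clusters, 3 app servers and 2 log servers per cluster, roughly 1000 app servers per log server) come from the text; the class and method names are hypothetical, and the "shared" figure assumes the same two-fold redundancy applied once to the whole system rather than per cluster.

```java
// Hypothetical sketch comparing per-cluster log servers with a shared, redundant set.
class LogServerCount {

    /** Log servers needed when every cluster carries its own copies. */
    static int perClusterTotal(int clusters, int logServersPerCluster) {
        return clusters * logServersPerCluster;
    }

    /** Log servers needed when capacity is shared across clusters (assumed design). */
    static int sharedTotal(int clusters, int appServersPerCluster,
                           int appServersPerLogServer, int redundancy) {
        int appServers = clusters * appServersPerCluster;
        // Integer ceiling division: servers needed purely for capacity.
        int forCapacity = (appServers + appServersPerLogServer - 1) / appServersPerLogServer;
        return forCapacity * redundancy;
    }

    public static void main(String[] args) {
        System.out.println("per-cluster: " + perClusterTotal(25, 2));      // 50
        System.out.println("shared:      " + sharedTotal(25, 3, 1000, 2)); // 2
    }
}
```

The gap between 50 and 2 is the price paid for fault isolation in the strictly partitioned design.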

Another variant on the inefficiency problem is the Somethings. Their utilization is very bursty. Under average conditions one is enough for each cluster, but they occasionally (once a week or so) burst to four times that activity, so we have four in each cluster. Now, we know the usage pattern is such that bursts don’t overlap, so we’d really like to have shared burst capacity rather than per-cluster capacity, but that would breach the bulkhead.

This leads to system classification based upon the scope of failure directly causable by a given component (how is that for abstract sounding?). If we revise our system to make the accountants and engineers happier (centralized things are generally easier to build) it might look like this:

The system is now much more efficient, but we have increased the risk of a cascading failure taking down more, or all, of the system if any of the load balancers or log servers are affected, or if two Somethings burst at the same time (I am sure you have heard someone say, “but that can never happen!” before, right?).

We’ll call services which span clusters Class A services, and services which are fully contained within a cluster Class B services. Working with Class A services is harder than working with Class B services because you need to be extra special super really really careful that the Class A service cannot take down your Class B service.

A famous example, going back to the naval usage of bulkhead, of a Class A service would be the passenger decks (the E deck in particular) on the Titanic.

The Titanic was built to suffer failure in a number of its clusters^W compartments, but in fact it suffered failure in too many, and the cluster-spanning passenger decks allowed the water to cascade across bulkheads, leading to tragedy.

by Brian McCallister at December 31, 2009 07:00 AM

Chris Pepper—Parental Controls: Ubuntu Netbook Remix vs. Mac OS X

Julia had my old PowerBook for a while. She liked that it was (externally) the same as my MacBook Pro, and it was fine for the Flash edutainment sites she uses, such as http://www.starfall.com/, http://www.cyberkids.com/, http://pbskids.org/, & http://www.poissonrouge.com/.

Using Mac OS X's Parental Controls, I limited her to a half-hour per day (although most days she doesn't use the computer), and I set her up with the Simple Finder. It didn't work perfectly -- Parental Controls prevented us parents (administrators) from configuring/fixing certain things, and the timer granularity wasn't really sufficient (she got a 15-minute warning in a half-hour session, which was just distracting, and we couldn't set anything between 30 and 60 minutes), but it worked pretty well. Unfortunately, the PowerBook finally died -- it couldn't retain access to the AirPort network, kept losing the clock (which broke Parental Controls), and eventually stopped booting entirely.

Amy and I agreed that buying a fixed desktop computer didn't make sense, but a MacBook with AppleCare would cost over $1,000. Fortunately, I found a (purple) Eee PC netbook for $229 at Amazon, with 512mb RAM, 4gb flash, and a 9" 1024*600 LCD. It came with Xandros Linux (Windows was a non-starter). I upgraded to 1gb, still well under $300. I wiped Xandros in favor of Ubuntu Netbook Remix 9.04, which is quite nice. It's basically a smaller Ubuntu distribution with a launcher instead of a static background. The keyboard is quite awkward for an adult to use, but it works fine with a spare USB keyboard & mouse. My attempt to upgrade to 9.10 failed -- Ubuntu uses a new image format, and doesn't have Mac instructions; I tried on a Windows VM but it didn't work, and hasn't been a priority. The built-in upgrade option doesn't work because there isn't enough free space on the 4gb flash drive. 9.04 works fine, though, so I'm not fussed about figuring out a workaround for the upgrade -- I'm sure they'll get working Mac instructions eventually.

After installing openssh-server and setting up passwords, it's easy to manage from Terminal and X11.app on my Mac.

But the tough question was how to recreate the parental controls. On Linux, it seems fairly straightforward to run a network proxy to filter out 'bad stuff', but as far as I can tell, there is no such thing as a good site blacklist. Since Julia isn't yet 7, I think for now we can just explain that we'll keep an eye on her computer usage (browser history), and keep an eye on her when she's using the computer (it's staying out of her bedroom, for instance).

The other tough part is the time limit. Fortunately, the included Keyboard Preferences has a section called "Typing Break", which I have set to lock the screen after 30 minutes, and unlock 840 minutes later. That should provide a reasonable control, although I have already thought of at least 3 different ways around it. When I kill the program to release the lock via ssh (so she can finish what she's doing), it doesn't come back next time, and I haven't investigated how to restart it yet...

As a backup, I have configured the computer to send me email every 10 minutes when it's on, which should provide a reasonable cross-check:

pepper@julia:~$ crontab -l
# m h  dom mon dow   command
*/10    *   *   *   *   (uptime ; last | head -3) | mail -s "julia netbook is on" pepper

by reppep at December 31, 2009 05:00 AM

Ted Leung—2009 in Photography

Here’s a roundup of what I saw through the lens in 2009.

January

This year I did a lot more work with local and regional dancers. Here’s a danceseattle rehearsal shot.

February

I continued my role as the official photographer for Bainbridge Island Chinese Connection’s Chinese New Year Celebration.

March

I caught this shot of Guido van Rossum at PyCon by being in the right place at the right time. My camera was lying on the table next to me when Guido suddenly grabbed the Django Pony and started running down the aisle. He was moving fast enough that I had to snap off a bunch of frames to catch him in focus.

April

April was a busy dance month. The Olympic Performance Group put on “The Toymaker’s Doll” (also known as Coppelia).

danceseattle had their first ever performance.

For the first time in a long time, I actually was able to show up to a Seattle Flickr Garage shoot.

May

May is when the weather in the Seattle area starts to get decent, so I was able to get some nature subjects in front of the lens.

June

I headed to San Francisco for JavaOne in early June.

I finished out June with Bainbridge Ballet’s end of year recital.

July

The Bainbridge Island Fourth of July Parade is always a family and photographic staple.

Also in July, we had the first guinea pig born in our house.

August

I made it to a second Seattle Flickr garage shoot.

September

Senior pictures for Bainbridge High School are due at the end of September, and I did 4 sessions in the space of 12 days or so.

October

School was in full swing in October, and one of the science lessons that Julie did with the girls involved extracting DNA using Bacardi 151 rum.

November

This year was the 10th Anniversary of the Apache Software Foundation (and my involvement with it). I did take a few shots while I was at ApacheCon.

I was also fortunate enough to get a slot at J. Mark Wallace’s US Meetup Tour when it hit Seattle.

December

Photographically, December is dominated by the Olympic Performance Group’s production of the Nutcracker.

by Ted Leung at December 31, 2009 03:58 AM

Rich Bowen—The Princess and the Frog

(Addendum to what I already posted earlier on Facebook ...)

This evening Disney ensured that I will never again watch one of their movies in the theater. I clearly can no longer trust that a Disney G-rated movie is safe for my children to watch. Instead, I'll wait for it to come out on video, and then preview it before showing it to the kids.

Voodoo and tarot and talismans and demons and deals with the devil, and a guarantee that the kids will have nightmares for weeks to come.

Good job, Disney. I'm sure Walt would be so very proud.

Time was when if it was made by Disney, you could be assured that it would be safe, heartwarming, and perhaps even have a good message behind it. Then came Hunchback, and now this monstrosity, and suddenly we're in a world where Walt Disney is no longer a guarantee of quality and family-friendly entertainment.

I guess Roy died just in time.

by rbowen at December 31, 2009 01:48 AM

December 30, 2009

Deepal Jayasinghe—How to disable service listing in Axis2

A number of users have requested a way to enable or disable service listing in Axis2. By default, Axis2 lists all the services in the system when you go to the following URL:

http://localhost:8080/axis2/services/listServices

However, there are situations where we do not want to expose our services publicly; in such situations the following comes in handy.

To enable or disable service listing, use the following parameter in axis2.xml (WEB-INF/conf/axis2.xml).

<parameter name="disableServiceList">true</parameter>

true – service listing disabled
false – service listing enabled

Adding this does not prevent services from being listed in the administration window. To stop that, you need to change the default username and password, which you can do by changing the following two parameters.

<parameter name="userName">admin</parameter>
<parameter name="password">axis2</parameter>

You can download the fix here; replace axis2-kernel.jar (in WEB-INF/lib) with it.

by Deepal Jayasinghe at December 30, 2009 11:07 PM

Matthias Wessendorf—JSF 2 and CDI – a nice combo!

In JSF 2.0 there is (optional) support for annotating JSF managed beans, via the Faces Managed Bean Annotation Specification. Both Apache MyFaces and the Sun RI implement this specification.

With the advent of two JSRs, 299 (CDI) and 330 (dependency injection), there is an ongoing discussion about the right annotation for the “JSF beans”.

I agree that a JSF 2.0 application should avoid using the @ManagedBean annotation. I prefer using what 299/330 offer. The good news is that even Spring (in version 3.0) supports the JSR 330 (@Inject) specification, including @Named.

Using the two specifications, a simple bean could look like:

...
import javax.enterprise.context.RequestScoped;
import javax.inject.Named;

@Named
@RequestScoped
public class HelloWorldBean
{
...
}
...

For CDI there is a neat convenience annotation (@Model) that combines the two above (@Named and @RequestScoped)!

Too bad that Spring does not support CDI (JSR 299), so @Model is not usable there… However, making something of your own (e.g. @SpringModelBean) is not hard, as you just have to combine @Named with Spring's @Scope("request") annotation…

One concern may be that 299 is only usable inside a full-blown Java EE 6 container, like Glassfish 3. But that is not the case. The implementations of the 299 specification from Apache (OpenWebBeans) and JBoss (Weld) offer support for servlet containers, like Tomcat.

I added a new (very) simple HelloWorld that combines CDI (299/330) and JSF 2.0 to my FacesGoodies project. The source is located here. For the implementations I picked Apache MyFaces 2.0 (the first alpha release is out; a beta is coming soon) and Apache OpenWebBeans (which recently left the Apache Incubator). The two standards combine very well. I will add more complex demos to FacesGoodies soon; this (trivial) HelloWorld demo is just a starting point.
Note: Currently you have to build OpenWebBeans yourself to get the latest fixes from the trunk (or use the M3 release from the incubator repository).

Another benefit: my demo uses the JettyMavenPlugin, and with CDI you can use the jetty:run goal. The annotations for the JSF managed bean facility require jetty:run-exploded, which is kinda odd… See, another reason to use CDI with JSF 2.0.

Apache offers great support for these with its MyFaces and OpenWebBeans projects! Check em out!

HINT: You need to build Apache OpenWebBeans in order to run this example. Do the SVN checkout from this location. And run mvn install in its root-folder!

by matthiaswessendorf at December 30, 2009 03:46 PM

Nick Kew—Bullied by Visa

I’ve banked with Nationwide for over 20 years. During that time, I’ve been generally well-pleased with the service they offer. From time to time the ‘industry’ has ganged up to impose new charges on customers: for example, annual charges to hold a creditcard, charges to withdraw money from each other’s cashpoint machines, or charges to use your card outside the UK. Nationwide has always remained resolutely free of such things. Furthermore, they don’t seem to cock up, and they’re the biggest UK bank to have escaped the crisis of the last couple of years without having to recapitalise (or much worse). All in all, a huge relief compared to other banks I’ve used.

So when they messed up yesterday, my first inclination was to blame the merchant I was trying to use (Nokia). This is part of shopping before VAT rises, and I was ordering some new kit to the value of over £500. I wanted to query a couple of points, so I placed the order by ‘phone. There followed an email confirming my order. Five minutes later another email from billing@nokia:

Your order (No. 900937209) has been cancelled because we were unable to process your payment on the credit card that you provided. We apologize for any inconvenience this may cause. Please visit our online store at http://shop.nokia.co.uk/nokia-uk to replace this order. Prior to re-attempting the order, we recommend that you contact your credit card company.

Sounds like a maxed out creditcard or something? Nope, it’s about £5000 short of my limit, and is paid in full by direct debit every month. Thinking the man who took my order might’ve cocked up, I went online and retried.

Same again.

OK, there’s a local Nationwide agency. Not a full branch, but a little room in an estate agent. They know me there. I marched down there intending to give them a hard time until they’d sorted it.

They were closed. Harumph!

Leaving a message is most likely going to miss the boat for 15% VAT. Nothing for it, have to use the published ‘phone numbers and hope someone replies. They did, and they were able to sort it out. They also told me the Nokia purchase had put a security block on my card, which is what they had to remove! After that I was able to place the order last night.

But hang on! This is a purchase of physical goods. That means there’s a shipping address. The fact it’s the same as the billing address (which hasn’t changed recently) should be a pretty good indicator that it’s really me, not a fraudster. What happens next time I need to settle a £500 hotel bill somewhere abroad, perhaps in a remote timezone when there’s no one there to answer the phone? Am I at risk of the same thing happening? What’s the use of a creditcard if I can’t rely on being able to use it?

My strong suspicion is that this is because Nokia isn’t using phished by visa. To me that’s a plus: I’m placing an order with them, and all is transparent and open (the quirks of Nokia’s system are another story, but no showstopper). I’m guessing this kind of block might be becoming routine for online retailers who decline to be bullied into it. Grrr

Postscript: as I write, I just had a phone call from the man I originally placed the phone order with, to tell me the order had failed. Of course I already knew, but it’s good that he took the trouble.

Bah, Humbug.

by niq at December 30, 2009 12:27 PM

David N. Welton—Detecting BlackBerry JDE Version

Recently, I went back and added some preprocessor code (it's pretty much necessary in the world of J2ME) to ensure that Hecl would compile with older versions of the BlackBerry JDE. However, I also faced a problem: how to figure out what version of the JDE we're using. It could be my latest cold clouding my mind, but I couldn't find a simple way to do this. It never seems to be simple with the BlackBerry platform, unfortunately.

I did, however, finally find a nice way to obtain this information programmatically: the bin/rapc.jar file, which ships with the JDE, contains a file called app.version, which, indeed, contains the version of the JDE in use. I hacked up this code to read it and print it out:
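A minimal sketch of that approach in Java follows. It is not the code from the original post: the class and method names are hypothetical, and it assumes that app.version sits at the root of rapc.jar and is plain text. It uses only the JDK's java.util.zip classes, since a jar is a zip archive.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical sketch: prints the text of app.version found inside rapc.jar.
class JdeVersionSketch {

    /** Reads a text entry from a jar/zip file and returns it trimmed. */
    static String readEntry(String jarPath, String entryName) throws IOException {
        try (ZipFile jar = new ZipFile(jarPath)) {
            ZipEntry entry = jar.getEntry(entryName);
            if (entry == null) {
                throw new IOException(entryName + " not found in " + jarPath);
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                StringBuilder text = new StringBuilder();
                String line;
                while ((line = reader.readLine()) != null) {
                    text.append(line).append('\n');
                }
                return text.toString().trim();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument is the JDE installation directory.
        String jdeHome = args.length > 0 ? args[0] : ".";
        System.out.println(readEntry(jdeHome + "/bin/rapc.jar", "app.version"));
    }
}
```

A build script could capture the printed value and use it to set preprocessor symbols for the JDE version in use.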

by David N. Welton at December 30, 2009 12:15 PM

Edward J. Yoon—Chipettes Hot n Cold

Happy new year guys...

by Edward J. Yoon at December 30, 2009 11:58 AM

Chris Pepper—Canon T1i Tips

I downloaded the T1i manual & product guide from Canon's support site, and put them on my iPhone for reference. The paper manuals are small, so the PDFs are quite readable on the iPhone.

According to Canon, due to the T1i's smaller-than-35mm APS-C image sensor, my 55-250mm telephoto is equivalent to an 88-400mm 35mm lens.
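The conversion is a simple multiplication by the sensor's crop factor. The sketch below assumes the commonly quoted 1.6x factor for Canon APS-C sensors, which matches the 88-400mm figure above; the class and method names are hypothetical.

```java
// Hypothetical sketch: 35mm-equivalent focal length for a Canon APS-C body.
class CropFactor {

    /** Assumption: 1.6x crop factor for Canon APS-C sensors. */
    static final double CANON_APS_C = 1.6;

    /** Returns the 35mm-equivalent focal length, rounded to the nearest mm. */
    static long equivalentFocalLength(int focalLengthMm) {
        return Math.round(focalLengthMm * CANON_APS_C);
    }

    public static void main(String[] args) {
        System.out.println(equivalentFocalLength(55) + "-"
                + equivalentFocalLength(250) + "mm equivalent"); // 88-400mm equivalent
    }
}
```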

In normal use, Medium/Fine (3,456*2,304) is indistinguishable from Large/Fine (4,752*3,168), although I haven't yet tried Normal JPEG compression, which will probably also be just fine at about half the on-disk size.

I haven't used 1920*1080@20fps video (the T1i can't do 1080@30fps) -- instead I use 1280*720@30fps (222mb/minute, which should fit 73min on an otherwise empty 16gb card), and fortunately I am satisfied with both video & audio quality, despite the tiny on-body microphone (no audio input available). For stills, I'm not very conscious of the resolution setting (even though it's easy to see on the rear LCD display), so I need to get used to raising it to L for long/landscape shots and returning to M for normal/close photos.

Unlike the SD800 IS, the T1i is somewhat awkward in portrait orientation -- especially with the LumaLoop's lanyard hanging in front of my face. I like the LumaLoop, though.

I use BetterHTMLExport for exporting galleries to the web. Today I hit a new problem: my private December photo gallery is 849 photos, mostly a mix of uncropped Large & Medium T1i photos. The whole thing (including full-resolution images) is 3.2gb, and for some reason Mac OS X sees my private Samba share as an 8gb volume with 2.8gb free (it's actually 634gb with 198gb free). I've noticed this before, but it never mattered except when copying large OS installers up to my archive. Today I had to export to a local gallery folder and then copy it up to the server, because iPhoto refused to let me export 3.2gb to a volume with (apparently) 2.8gb free. Not BHE's fault, but annoying!

Update: I tested in-camera JPEG compression. Large/Normal and Medium/Fine are both very good, while Medium/Normal isn't quite as legible at 100%. Interestingly, Large/Normal images are smaller than Medium/Fine, so I'll use L/N. Bonus: I won't have to switch resolutions between landscapes & head shots.

As Ken Rockwell points out, video is compromised by how well the in-body microphone picks up noise from lens movement. So 'autofocus' (which must be manually triggered by hitting the AF button) and zooming are both quite disruptive.

by reppep at December 30, 2009 03:00 AM

December 29, 2009

Nick Kew—New toy

I have a new toy: an Acer Revo box, which I’m using as a desktop. Ideally I’d've liked something ARM-powered (for low power consumption), but the Acer has an Atom processor, which seems to be the best available in the real world without having to DIY hardware.

It’s a lovely box: tiny size (smaller than a laptop, due to the latter having a screen), sleek to look at, and blissfully quiet. And Acer evidently believe in people who dual-boot: the machine was supplied with three disc partitions, of which one was formatted but unused by the inevitable windows installation. So that’s somewhere to install a real OS without losing the windoze games supplied (I have yet to play them, but …).

I’ve now got around to installing Linux on it. This required a bit of reading TFM, as it has neither a floppy nor a CDROM drive, so I had to figure out making a bootable installation image on a USB stick. I selected a kubuntu image, and after some faff with the install (the installer wanted to do something strange with the partitions, so I ran fdisk by hand instead) I have a working kubuntu. Some more faff getting the display to work correctly (cursing the absence of xorg.conf, and installing a non-free nvidia driver), and it’s up and running. Wow, it’s been quite a few years since a linux install didn’t “just work” without my having to do anything!

And I’m reminded just how long it is since I used KDE: I’ve run gnome on both linux and solaris variants for some years. It seems really strange now, and I’m missing gnome’s nice little dock for my favourite apps. Time will tell if I stick with it or switch!

by niq at December 29, 2009 11:59 PM

Justin Mason—Links for 2009-12-29

Body By Victoria – Secure Computing: Sec-C : Dr. Neal Krawetz brings the science on detecting Photoshop retouching
(tags: pixels images forensics jpeg photoshop fake analysis detection)
jwz – How to use Facebook with a feed reader : “Justin Mason likes this”
(tags: jwz facebook feeds rss atom howto syndication)

by dailylinks at December 29, 2009 11:05 PM

Jeroen Reijn—Content management and the semantic web

I came across the term 'semantic web' a couple of years ago, when one of the original creators of Apache Cocoon went off to work on the SIMILE Project at MIT. I didn't pay much attention to the concept of 'semantic web' back then, because I had just started learning Apache Cocoon and still had a lot to learn.
But over the last couple of months I've been doing some research on the currently available standards for providing semantic data on the web with a strong focus on RDFa.

Content management

Working at Hippo, a CMS vendor based in the Netherlands & USA, makes me think in content and publishing strategies. Publishing information to the web is one of our core businesses, but I've learned over the last couple of months that we can enrich our publishing platform even more by providing semantic data. I started my journey by looking at whether other CMS vendors are paying attention to semantic web standards. I noticed that only a few of the enormous number of content management vendors actually put effort into providing semantic web functionality for their end-users. I think that's a shame, because it can enrich your pages a lot.
This post should give you an insight on how you could create a website with embedded meta data (with Hippo), but let's first start with some basics.

What's the idea behind the semantic web?

The current web is very well suited to being read by people like you and me. Computers, however, can only analyze the words on a page; they cannot see the semantics of a piece of information on that page the way people do.
If you make the information on your page machine-readable, a computer can analyze your page and extract much more information from it than just a piece of text. That's where semantic web standards can help out.
Standards for providing semantic data on the web are not new, and some of them have been available for quite some time. Probably the two best known are RDF and Microformats. Recently, however, RDFa has been getting a lot of attention from Google, Yahoo and now also the UK government.

What is RDFa?

RDFa is short for “Resource Description Framework in attributes”. This sounds a bit descriptive, but it means that RDFa provides a set of XHTML attributes, which in turn provide a way of translating visual data on a page into machine-readable hints. So let's take a look at an example of how a simple web page is currently structured.

<html>
  <body>
    <h1>Content management and the semantic web</h1>
    <h2>Jeroen Reijn</h2>
    <p>some information</p>
  </body>
</html>

As you can see in the above XHTML fragment, we have a page with a title, a subtitle and a small snippet of text inside the body of the page. When this HTML fragment is rendered in the browser, a visitor will recognize the headings as the title and author of the article on the page. A machine, however, would need a bit more information to be sure the content can be identified as a title and an author. That's where RDFa can help out. By using vocabularies, you can give meaning to specific pieces of content on a page.
Let's see what the above XHTML fragment would look like if we used RDFa.

<html>
  <body xmlns:dc="http://purl.org/dc/elements/1.1/"> 
    <h1 property="dc:title">Content management and the semantic web</h1>
    <h2 property="dc:creator">Jeroen Reijn</h2>
    <p>some information</p>
  </body>
</html>

As shown in the example, the Dublin Core vocabulary is added to the page first. This is important to be able to use the properties inside the vocabulary later on. Once the vocabulary is in place, we can give meaning to fragments on the page. In the HTML fragment above the h1 is marked as the Dublin Core title property and the h2 as the Dublin Core creator property. With these properties in place a machine, like a search engine crawler, can now also store this as additional meta data of the page.
One of the main advantages of RDFa is that your content can be processed in a more efficient way, which in turn may help your page rank higher than it would have before.
Big search engines like Google and Yahoo already scan your website for RDFa embedded information, so why not use it?
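To illustrate what "machine-readable" means here, below is a minimal sketch in Java that pulls the property attributes out of the RDFa fragment above using only the JDK's DOM parser. The class and method names are hypothetical, and this is not a real RDFa processor: it does not resolve the dc prefix against the vocabulary URI or handle other RDFa attributes, it simply shows that the annotated values can be picked up without guessing from the layout.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: collects "property" attribute values and element text.
class RdfaPropertySketch {

    /** Returns property name -> text content, in document order. */
    static Map<String, String> extractProperties(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> result = new LinkedHashMap<>();
        NodeList all = doc.getElementsByTagName("*"); // every element in the page
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            if (element.hasAttribute("property")) {
                result.put(element.getAttribute("property"),
                        element.getTextContent().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
                + "<h1 property=\"dc:title\">Content management and the semantic web</h1>"
                + "<h2 property=\"dc:creator\">Jeroen Reijn</h2>"
                + "<p>some information</p></body></html>";
        for (Map.Entry<String, String> e : extractProperties(page).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Run against the first, unannotated fragment, the same code finds nothing, which is exactly the gap RDFa fills.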

How to use RDFa in your (hippo) website?

Hippo CMS is a content (centered) management system and it differs from other CMS's in such a way that the information inside the Hippo CMS content repository is not stored or identified as pages, but rather as content. In most cases even reusable content. To be more precise: information stored inside the content repository is stored as JCR nodes and/or properties.
Since the data is just content and not bound to any front-end technology, you can either publish it as XML, (X)HTML with some help from the Hippo Site Toolkit (HST) or any other format you might like.
Now let's take the above HTML fragment as an example and see what this would look like on a content level. One of the most important things to mention here is that a JCR repository has the concept of nodetype definitions, in which you configure what your data model looks like. You could compare it with, for instance, an XML Schema or DTD for a piece of XML, but then for the nodes and properties available in a JCR repository.

Let's first start with our content definition or in content management terms the document type. We will need three fields:

Title
Author
Body (rich-text field)

If you would create a document type with the Hippo CMS template editor, the resulting nodetype definition will end up looking like this:


<'myproject'='http://www.myproject.org/nt/myproject/1.0'>
<'hippostd'='http://www.onehippo.org/jcr/hippostd/nt/2.0'>
<'hippo'='http://www.onehippo.org/jcr/hippo/nt/2.0'>

[myproject:text] > hippostd:publishable, hippostd:publishableSummary, hippo:document
- myproject:title (string)
- myproject:author (string)
+ myproject:body (hippostd:html)

As you can see all three fields are available and can be used later on by any client that can read from the Java content repository. To be able to render this type of information as XHTML, we will be using the Hippo Site Toolkit. The Hippo Site Toolkit uses the concept of mapping JCR nodes to simple Java beans, to be able to have an easier development cycle without having to learn the entire JCR API.

A Java bean representation of the JCR 'myproject:text' nodetype will look like this:

import org.hippoecm.hst.content.beans.Node; 

import org.hippoecm.hst.content.beans.standard.HippoDocument;
import org.hippoecm.hst.content.beans.standard.HippoHtml;


@Node(jcrType="myproject:text")
public class TextBean extends HippoDocument{

    public String getTitle() {
        return getProperty("myproject:title");
    }
    
    public String getAuthor() {
        return getProperty("myproject:author");
    }

    public HippoHtml getBody(){
        return getHippoHtml("myproject:body");
    }

}

As you can see, the Java bean is quite straightforward and easy to read.
Now if we want to render the information on a webpage, we can use for instance JSP's with expression language to get the information from the Java bean. The JSP needed for outputting the RDFa enabled webpage can be as simple as this:

<%@ page language="java" %>
<%@ taglib uri="http://www.hippoecm.org/jsp/hst/core" prefix='hst'%>
<html>
  <body xmlns:dc="http://purl.org/dc/elements/1.1/"> 
    <h1 property="dc:title">${document.title}</h1>
    <h2 property="dc:creator">${document.author}</h2>
    <hst:html hippohtml="${document.body}"/>
  </body>
</html>

As you can see it's that easy to use RDFa inside your website if you have a template independent CMS like Hippo.

It gets even better

Using RDFa for simple text can already be a great improvement for your website, but support for other RDFa vocabularies is added on a regular basis. Google recently announced support for RDFa enabled pages with videos (or media) on them. You can provide extra information for your media files to the Google crawler, like the URL to the thumbnail that belongs to your video, which can be presented when your video is found as one of the results in a search performed at Google. The possibilities are enormous, so I can see a lot of good things coming from using RDFa in the near future.

I think the role that content management systems can have for RDFa should not be underestimated, since most websites these days are backed by some sort of content management system.

For more information on RDFa see:

RDFa for HTML authors, by Steven Pemberton
RDFa.info - A site containing news about RDFa
Google supporting FaceBook Share and RDFa for videos

by Jeroen Reijn at December 29, 2009 09:45 PM

Grant Ingersoll—Manning: Mahout in Action

Very cool, Manning already has up the first 6 chapters of Mahout in Action.

by grant_ingersoll at December 29, 2009 08:33 PM

Jeroen Reijn—Using Daemon modules with Hippo CMS 7

Recently I was working on a new Hippo CMS 7 based project, where I was in need of a repository component that could run in the background and perform some scheduled tasks.

While talking to some colleagues about what I had to do, they pointed me to a built-in solution for adding repository components, which are initiated at startup.

It was actually very simple to implement this feature, so I'll try to describe how you can achieve the same solution in some very small steps.

The first thing you will need to do is create a Java class that implements the DaemonModule interface. As an example I've created the BackgroundModule as shown below.

package com.onehippo.repository;

import javax.jcr.RepositoryException;
import javax.jcr.Session;

import org.hippoecm.repository.ext.DaemonModule;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

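public class BackgroundModule implements DaemonModule {

  private static final Logger log = LoggerFactory.getLogger(BackgroundModule.class);

  // Session handed over by the repository; kept per instance rather than static.
  private Session session;

  public void initialize(Session session) throws RepositoryException {
    this.session = session;
    log.info("BackgroundModule started");
  }

  public void shutdown() {
    // Release the session when the repository shuts the module down.
    if (session != null) {
      session.logout();
    }
  }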

}

You might wonder how the repository knows about these daemon modules? Well the trick is that the repository goes through all 'MANIFEST.MF' files, which it can find on the classpath. If the MANIFEST.MF file contains an entry for the property 'Hippo-Modules', it will be added to the list of available modules. Once finished finding all modules it will start to initialize each of them and pass on an authorized JCR session, so you will be able to work with all information inside the repository.

I always use Maven 2 while working with CMS 7. Maven 2 has some useful utilities and it can help you out with adding the correct manifest entry. In my pom.xml I added some configuration for the maven-jar-plugin that adds my module to the manifest.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <manifest>
        <addDefaultImplementationEntries>true</addDefaultImplementationEntries>
      </manifest>
      <manifestEntries>
        <Hippo-Modules>com.onehippo.repository.BackgroundModule</Hippo-Modules>
      </manifestEntries>
    </archive>
  </configuration>
</plugin>

If you need to add more than one module, you can do so by separating the module class names with a space.
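To illustrate the lookup described above, here is a rough sketch of how a space-separated 'Hippo-Modules' entry could be read from a manifest with the standard java.util.jar.Manifest class. This is not the repository's actual code; the class name HippoModulesManifest and the second module name org.example.B are assumptions made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Manifest;

// Hypothetical helper; the real repository code may work differently.
class HippoModulesManifest {

    // Returns the class names listed under the Hippo-Modules main attribute, split on whitespace.
    static List<String> moduleNames(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            String value = manifest.getMainAttributes().getValue("Hippo-Modules");
            List<String> names = new ArrayList<>();
            if (value != null) {
                for (String name : value.trim().split("\\s+")) {
                    if (!name.isEmpty()) {
                        names.add(name);
                    }
                }
            }
            return names;
        } catch (IOException e) {
            throw new IllegalStateException("Unreadable manifest", e);
        }
    }

    public static void main(String[] args) {
        // org.example.B is a made-up second module, only here to show the separator.
        String manifest = "Manifest-Version: 1.0\n"
                + "Hippo-Modules: com.onehippo.repository.BackgroundModule org.example.B\n\n";
        System.out.println(moduleNames(manifest));
    }
}
```

A manifest without the entry simply yields an empty list, so jars that do not ship a module are ignored.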

For the project I was doing, I also made use of Quartz triggers, so my module would execute once in a while instead of just after initialization of the repository.
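I won't show the Quartz setup here, but the idea of running a task periodically can be sketched with nothing more than the JDK's ScheduledExecutorService. Please read this as a stand-in for illustration only: it is not Quartz, and the class name PeriodicTaskSketch and its method are invented for this example. In a daemon module you would start the scheduler in initialize() and stop it in shutdown().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// JDK-only stand-in for a scheduled background job (not Quartz).
class PeriodicTaskSketch {

    // Runs the task every periodMillis until it has run the requested number of times
    // (or ten seconds have passed), then stops the scheduler and reports the run count.
    static int runPeriodically(Runnable task, int times, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(times);
        scheduler.scheduleAtFixedRate(() -> {
            if (done.getCount() > 0) {
                task.run();
                runs.incrementAndGet();
                done.countDown();
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();   // comparable to what shutdown() of a module should do
        }
        return runs.get();
    }

    public static void main(String[] args) {
        int runs = runPeriodically(() -> System.out.println("background work"), 3, 100);
        System.out.println("ran " + runs + " times");
    }
}
```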

The concept of these modules is quite powerful, so I hope this can help you to get started with writing your own Daemon modules.

by Jeroen Reijn at December 29, 2009 07:54 PM

Tim Bish—RC-2 of Apache.NMS and Apache.NMS.ActiveMQ now available

I've just finished work on the RC-2 bundles of Apache.NMS and Apache.NMS.ActiveMQ. Release candidate 2 resolves several issues found since the RC-1 version was posted. The bundles are available here

This release adds the Inactivity Monitor to NMS.ActiveMQ, which is useful for detecting broken connections quickly and allows the failover transport to recover your connection faster. The inactivity monitor is disabled by default in this RC while we continue to test it; you can enable it by adding "transport.useInactivityMonitor=true" to your connection URI. Several bugs related to transactions and message redelivery were also addressed in this release.

by Tim at December 29, 2009 07:50 PM

Jeroen Reijn—Apache Camel: open source integration framework

I'm currently working on a project where we are looking at creating an integration layer for external applications to connect to our back-end applications. In our case, one of the back-end applications is Hippo CMS 7's repository.

I've been reading up on ESB's like Apache ServiceMix and Synapse, but even though both projects look very interesting, they actually are a bit too much for what I want to do. There was one project though that seems to be exactly what I want: Apache Camel.

About Apache Camel

Apache Camel is an open source Java framework that focuses on making integration easier. One of the great things is that Camel comes with a lot of default components and connectors.
Even though I was quite new to the integration concept, I was able to get my first Camel project up and running within 30 minutes or so, which I think is quite fast. All you need is a bit of Java/Spring knowledge to get going.

The basic concepts

While using an integration framework like Camel, you will have to keep four key terms in mind:

Endpoint: where the message comes in or leaves the integration layer
Route: how a message goes from endpoint A to endpoint B
Filter: the chained components that are involved in the process of handling a message that comes from endpoint A and goes to endpoint B. It could be that the content of the message needs to be transformed from SOAP to for instance ATOM.
Pipe: the way the message travels from endpoint A through filters to endpoint B
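The four terms above can be sketched in a few lines of plain Java. This is not Camel code, just an illustration of the pipes-and-filters idea; the class name PipesAndFilters and the two example filters are made up for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java illustration of pipes and filters; Camel itself is not used here.
class PipesAndFilters {

    // A route: the message enters at one endpoint, passes through each filter in order,
    // and the result is what leaves at the other endpoint.
    static String route(String message, List<Function<String, String>> filters) {
        String current = message;
        for (Function<String, String> filter : filters) {
            current = filter.apply(current);   // the "pipe" hands each output to the next filter
        }
        return current;
    }

    public static void main(String[] args) {
        Function<String, String> trim = String::trim;
        Function<String, String> wrap = s -> "<entry>" + s + "</entry>";
        // prints <entry>hello</entry>
        System.out.println(route("  hello  ", Arrays.asList(trim, wrap)));
    }
}
```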

One of the things I'm looking at Camel for is using it to convert RSS feed entries into JCR nodes. If I would create an endpoint diagram, which would describe my route, it would look something like the image below.

With Camel, the endpoints and routes can be configured in a few lines of Java code or with Spring XML configuration. I started out with the Spring XML configuration and it was actually quite easy to get going. Here is an example where I poll my own RSS feed and store the items into a mock 'feeds' object.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-2.5.xsd
http://camel.apache.org/schema/spring
http://camel.apache.org/schema/spring/camel-spring.xsd">

  <camelContext xmlns="http://camel.apache.org/schema/spring">
    <route>
      <from uri="rss://http://blog.jeroenreijn.com/feeds/posts/default?alt=rss" />
      <to uri="mock:feeds"/>
    </route>
  </camelContext>

</beans>

As you can see, that's just a couple of lines of code. It's really that simple to do things in Camel. Of course this configuration does not end up in a JCR repository, but as an example I think it's quite easy to grasp. For those of you who want to play around with Camel as well, I'll try to explain all the steps I took to get a working web application example from here on. As I'm using Maven 2 for building my projects, you should be able to reproduce my setup quite easily.

Setting up your maven project

First off we'll start with adding the Camel dependencies to our Maven project descriptor (pom.xml).

<dependencies>
  <dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-core</artifactId>
    <version>${camel-version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-spring</artifactId>
    <version>${camel-version}</version>
  </dependency>
  <dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-core</artifactId>
    <version>${spring-version}</version>
  </dependency>
  <dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-web</artifactId>
    <version>${spring-version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-rss</artifactId>
    <version>${camel-version}</version>
  </dependency>
</dependencies>

As you can see, I explicitly added the camel-rss component, so that my Camel application knows how to handle RSS feeds. Camel does not have its own RSS parser, but uses Rome in the background for handling the RSS feeds. The Camel project is set up in such a way that you can include any component you want by adding the needed component dependency to your pom.xml. If you're thinking about using Camel, make sure you check out the components page, which shows you all of the currently available components.

Camel uses Spring, so we need to add the Spring ContextLoaderListener to the local web.xml in src/main/webapp/WEB-INF/.

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/j2ee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee
http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd"
version="2.4">

  <listener>
    <listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
  </listener>
</web-app>

The last step in our process is defining our endpoints. In my case I chose to use the Spring XML configuration for defining my endpoints.

Add a file called applicationContext.xml to your src/main/webapp/WEB-INF/ folder.
Once the file is created you should be able to define your routes like this:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-2.5.xsd
http://camel.apache.org/schema/spring
http://camel.apache.org/schema/spring/camel-spring.xsd">

  <camelContext xmlns="http://camel.apache.org/schema/spring">
    <route>
      <from uri="rss://http://blog.jeroenreijn.com/feeds/posts/default?alt=rss" />
      <to uri="mock:feeds"/>
    </route>
  </camelContext>

</beans>

In this example I'm using my own RSS feed, but you can of course use any feed url you like.
For testing purposes you can add a log4j.properties file in src/main/resources/, so you can see the output of the Camel RSS component in your console. Here is the configuration I used while writing this blog post.


# The logging properties used for eclipse testing, We want to see debug output on the console.
log4j.rootLogger=INFO, out

log4j.logger.org.apache.camel=DEBUG

# uncomment the following line to set the Spring framework log level
# log4j.logger.org.springframework=INFO

# Console appender used by the root logger
log4j.appender.out=org.apache.log4j.ConsoleAppender
log4j.appender.out.layout=org.apache.log4j.PatternLayout
log4j.appender.out.layout.ConversionPattern=[%30.30t] %-30.30c{1} %-5p %m%n

Well that's it. Now the only thing you will need to do is fire up an application container, like Jetty and see what's going on in the console.

$ mvn jetty:run

If Jetty is running and everything is set up correctly, you should be able to see some debug information come by that looks like:


  SyndFeedImpl.author=noreply@blogger.com (Jeroen Reijn)
  SyndFeedImpl.authors=[]
  SyndFeedImpl.title=Jeroen Reijn
  SyndFeedImpl.description=
  SyndFeedImpl.feedType=rss_2.0
  SyndFeedImpl.encoding=null
  SyndFeedImpl.entries[0].contributors=[]

As you will see the RSS feed is parsed and converted into a SyndFeed object.
From there on you can make use of this object and perform any operation on it.

I must admit that while playing around with Camel and RSS feeds, I noticed that the RSS (and Atom) component did not handle extra request parameters correctly, so I added a patch in the Camel JIRA, hoping it will be included in the next release of Camel.
If you have issues with the RSS component and request parameters, you might want to try to build the Camel SVN trunk and apply my patch (CAMEL-1496).
This is only necessary if you want to parse a feed that has for instance a unique id as request parameter added to the feed URL.

Well, that's it! This post will get a follow-up, where I will show you how to use Camel to actually store the RSS feed entries into a JCR repository.

Here are a couple of good articles to read before starting with Camel:

This blog post was inspired by an article over at Gridshore, where Jettro wrote a post on using Spring Integration as an integration framework. Since I'm pretty much Apache minded, I have been looking around for other open source integration frameworks within the ASF, which brought me to Apache Camel.

by Jeroen Reijn at December 29, 2009 07:30 PM

December 28, 2009

Howard M. Lewis Ship—Securing Tapestry pages with Annotations, Part 1

Everyone wants all sorts of integrations for Tapestry with other frameworks, but sometimes rolling your own is actually easier. Let's start with securing access to pages, a subject that still keeps coming up on the mailing list. I thought I'd show a little bit about how I tackle this problem generally.

People have been asking for a single definitive solution for handling security ... but I don't see any single solution satisfying even the majority of projects. Why? Because there are simply too many variables. For example, are you using LDAP, OpenAuth or some ad-hoc user registry (in your database)? Are pages accessible by default, or inaccessible by default? Are you using role-based security? How do you represent roles then? Creating a single solution that's pluggable enough for all these possibilities seems like an insurmountable challenge ... but perhaps we can come up with a toolkit so that you can assemble your own custom solution (more on that later).

One approach to security could be to define a base class, ProtectedPage, that enforced the basic rules (you must be logged in to use this page). You can accomplish such a thing using the activate event handler ... but I find such an approach clumsy. Anytime you can avoid inheritance, you'll find your code easier to understand, easier to manage, easier to test and easier to evolve.

Instead, let's pursue a more declarative approach, where we use an annotation to mark pages that require that the user be logged in. We'll start with these ground rules:

Pages are freely accessible by anyone, unless they have a @RequiresLogin annotation
Any static resource (in the web context directory) is accessible to anybody
There's already some kind of UserAuthentication service that knows if the user is currently logged in or not, and (if logged in) who they are, as a User object

So, we need to define a RequiresLogin annotation, and we need to enforce it, by preventing any access to the page unless the user is logged in.

That poses a challenge: how do you get "inside" Tapestry to enforce this annotation? What you really want to do is "slip in" a little bit of your code into existing Tapestry code ... the code that analyzes the incoming request, determines what type of request it is (a page render request vs. a component event request), and ultimately starts calling into the page code to do the work.

This is a great example of the central design of Tapestry and its IoC container: natively supporting this kind of extensibility. Through the use of service configurations it's possible to do exactly that: slip a piece of code into the middle of that default Tapestry code. The trick is to identify where. This image gives a rough map to how Tapestry handles incoming requests:

Tapestry Request Processing

In fact, there's a specific place for this kind of extension: the ComponentRequestHandler pipeline service¹. As a pipeline service, ComponentRequestHandler has a configuration of filters, and adding a filter to this pipeline is just what we need.

Defining the Annotation

First, let's define our annotation:

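import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target( { ElementType.TYPE })
@Retention(RetentionPolicy.RUNTIME)
@Documented
public @interface RequiresLogin {
}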

This annotation is designed to be placed on a page class to indicate that the user must be logged in to access the page. The retention policy is important here: it needs to be visible at runtime for our runtime code to see it and act on its presence.
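To see why the retention policy matters, here is a small self-contained sketch that uses only the JDK. The annotation and class names in it (VisibleAtRuntime, ClassFileOnly, AccountPage) are invented for the illustration and are not part of Tapestry or of the application code in this article.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Illustration only: compares RUNTIME retention with CLASS retention.
class RetentionDemo {

    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)   // available to reflection while the application runs
    @interface VisibleAtRuntime { }

    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.CLASS)     // stored in the class file, but not exposed to reflection
    @interface ClassFileOnly { }

    @VisibleAtRuntime
    @ClassFileOnly
    static class AccountPage { }

    // The same kind of check a request filter would perform on a page class.
    static boolean isMarked(Class<?> pageClass) {
        return pageClass.isAnnotationPresent(VisibleAtRuntime.class);
    }

    public static void main(String[] args) {
        System.out.println(AccountPage.class.isAnnotationPresent(VisibleAtRuntime.class)); // true
        System.out.println(AccountPage.class.isAnnotationPresent(ClassFileOnly.class));    // false
    }
}
```

With anything other than RUNTIME retention, a check based on isAnnotationPresent() would silently treat every page as unprotected.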

An annotation by itself does nothing ... we need the code that checks for the annotation.

Creating a ComponentRequestFilter

Filters for the ComponentRequestHandler pipeline are instances of the interface ComponentRequestFilter:

/**
 * Filter interface for {@link org.apache.tapestry5.services.ComponentRequestHandler}.
 */
public interface ComponentRequestFilter
{
    /**
     * Handler for a component action request which will trigger an event on a component and use the return value to
     * send a response to the client (typically, a redirect to a page render URL).
     *
     * @param parameters defining the request
     * @param handler    next handler in the pipeline
     */
    void handleComponentEvent(ComponentEventRequestParameters parameters, ComponentRequestHandler handler)
            throws IOException;

    /**
     * Invoked to activate and render a page. In certain cases, based on values returned when activating the page, a
     * {@link org.apache.tapestry5.services.ComponentEventResultProcessor} may be used to send an alternate response
     * (typically, a redirect).
     *
     * @param parameters defines the page name and activation context
     * @param handler    next handler in the pipeline
     */
    void handlePageRender(PageRenderRequestParameters parameters, ComponentRequestHandler handler) throws IOException;
}

Our implementation of this filter will check the page referenced in the request to see if it has the annotation. If the annotation is present and the user has not yet logged in, we'll redirect to the Login page. When a redirect is not necessary, we delegate to the next handler in the pipeline²:

public class RequiresLoginFilter implements ComponentRequestFilter {

  private final PageRenderLinkSource renderLinkSource;

  private final ComponentSource componentSource;

  private final Response response;

  private final AuthenticationService authService;

  public RequiresLoginFilter(PageRenderLinkSource renderLinkSource,
      ComponentSource componentSource, Response response,
      AuthenticationService authService) {
    this.renderLinkSource = renderLinkSource;
    this.componentSource = componentSource;
    this.response = response;
    this.authService = authService;
  }

  public void handleComponentEvent(
      ComponentEventRequestParameters parameters,
      ComponentRequestHandler handler) throws IOException {

    if (dispatchedToLoginPage(parameters.getActivePageName())) {
      return;
    }

    handler.handleComponentEvent(parameters);

  }

  public void handlePageRender(PageRenderRequestParameters parameters,
      ComponentRequestHandler handler) throws IOException {

    if (dispatchedToLoginPage(parameters.getLogicalPageName())) {
      return;
    }

    handler.handlePageRender(parameters);
  }

  private boolean dispatchedToLoginPage(String pageName) throws IOException {

    if (authService.isLoggedIn()) {
      return false;
    }

    Component page = componentSource.getPage(pageName);

    if (! page.getClass().isAnnotationPresent(RequiresLogin.class)) {
      return false;
    }

    Link link = renderLinkSource.createPageRenderLink("Login");

    response.sendRedirect(link);

    return true;
  }
}

The above code makes a bunch of assumptions and simplifications. First, it assumes the name of the page to redirect to is "Login". It also doesn't try to capture any part of the incoming request to allow the application to continue after the user logs in. Finally, the AuthenticationService is not part of Tapestry ... it is something specific to the application.

You'll notice that the dependencies (PageRenderLinkSource, etc.) are injected through constructor parameters and then stored in final fields. This is the preferred, if more verbose, approach. We could also have used no constructor and non-final fields with an @Inject annotation (it's largely a style choice, though constructor injection with final fields gives a stronger guarantee of thread safety).

The class on its own is not enough, however: we have to get Tapestry to actually use this class.

Contributing the Filter

The last part of this is hooking the above code into the flow. This is done by making a contribution to the ComponentRequestHandler service's configuration.

Service contributions are implemented as methods of a Tapestry module class, such as AppModule:

  public static void contributeComponentRequestHandler(
      OrderedConfiguration<ComponentRequestFilter> configuration) {
    configuration.addInstance("RequiresLogin", RequiresLoginFilter.class);
  }

Contributing modules contribute into an OrderedConfiguration: after all modules have had a chance to contribute, the configuration is converted into a List that's passed to the service implementation.

The addInstance() method makes it easy to contribute the filter: Tapestry will look at the class, see the constructor, and inject dependencies into the filter via the constructor parameters. It's all very declarative: the code needs the PageRenderLinkSource, so it simply defines a final field and a constructor parameter ... Tapestry takes care of the rest.

You might wonder why we need to specify a name ("RequiresLogin") for the contribution? The answer addresses a somewhat rare but still important case: multiple contributions to the same configuration that have some form of interaction. By giving each contribution a unique id, it's possible to set up ordering rules (such as "contribution 'Foo' comes after contribution 'Bar'"). Here, there is no need for ordering because there aren't any other filters (Tapestry provides this service and configuration, but doesn't make any contributions of its own into it).
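If the pipeline idea is still a bit abstract, the following plain-Java sketch may help. It is emphatically not Tapestry's implementation: the Handler and Filter interfaces, the build() method and the "page name starts with Secure" rule are all simplifications invented for this example. It shows how an ordered list of filters can be wrapped around a terminator so that each filter either stops the request or delegates to the next handler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a filter pipeline; not Tapestry's actual code.
class FilterPipelineSketch {

    interface Handler { void handle(String pageName, List<String> log); }

    interface Filter { void handle(String pageName, List<String> log, Handler next); }

    // Wraps the terminator with each filter, last to first, so the first filter in the list runs first.
    static Handler build(List<Filter> filters, Handler terminator) {
        Handler next = terminator;
        for (int i = filters.size() - 1; i >= 0; i--) {
            final Filter filter = filters.get(i);
            final Handler delegate = next;
            next = (page, log) -> filter.handle(page, log, delegate);   // bridge from filter to handler
        }
        return next;
    }

    // Sends one request through a pipeline with a single login filter and returns what happened.
    static List<String> render(String pageName, boolean loggedIn) {
        Filter requiresLogin = (page, log, next) -> {
            // Hypothetical rule standing in for the @RequiresLogin annotation check.
            if (!loggedIn && page.startsWith("Secure")) {
                log.add("redirect:Login");
                return;                       // stop here, the rest of the pipeline is skipped
            }
            next.handle(page, log);
        };
        Handler terminator = (page, log) -> log.add("render:" + page);
        List<String> log = new ArrayList<>();
        build(Arrays.asList(requiresLogin), terminator).handle(pageName, log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(render("SecureAccount", false));
        System.out.println(render("Index", false));
    }
}
```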

Improvements and Conclusions

This is just a first pass at security. For my clients, I've built more elaborate solutions, that include capturing the page name and activation context to allow the application to "resume" after the login is complete, as well as approaches for automatically logging the user in as needed (via a cookie or other mechanism).

Other improvements would be to restrict access to pages based on some set of user roles; again, how this is represented both in code and annotations, and in the data model is quite up for grabs.

My experience with different clients really underscores what a fuzzy world security can be: there are so many options for how you represent, identify and authenticate the user. Even basic decisions and underpinnings are subject to interpretation; for example, one of my clients wants all pages to require login unless a specific annotation is found. Perhaps over time enough of these use cases can be worked out to build the toolkit I mentioned earlier.

Even so, the amount of code to build a solid, custom security implementation is still quite small ... though the trick, as always, is writing just the right code and hooking it into Tapestry in just the right way.

I expect to follow up this article with part 2, which will expand on the solution a bit more, addressing some more of the real world constraints my customers demand. Stay tuned!

¹ In fact, this service and pipeline were created in Tapestry 5.1 specifically to address this use case. In Tapestry 5.0, this approach required two very similar filter contributions to two similar pipelines.

² If there are multiple filters, you'd think that you'd delegate to the next filter. Actually you do, but Tapestry provides a bridge: a wrapper around the filter that uses the main interface for the service. In this way, each filter delegates to either the next filter, or the terminator (the service implementation after all filters) in a uniform manner. More details about this are in the pipeline documentation.

by Howard at December 28, 2009 11:04 PM

Anton Tagunov—No disconnection without fair trial please

I have signed http://petitions.number10.gov.uk/dontdisconnectus/
against unfair disconnections (UK)

by Anton Tagunov at December 28, 2009 08:58 PM

David Blevins—EJB 3.1 goes final

The EJB 3.1 and Java EE 6 specifications finally closed this month and are up for download. On a personal level, I'd like to say that EJB 3.1 has been the most productive I've been on a specification, thanks in no small way to the truly amazing group we had. For many of us it is a labor of love.

First a major thanks to Ken Saks. Ken, for a first-time spec lead you did an outstanding job and handled group input like a pro. It's difficult when in a position of authority to both have an opinion and collect the opinions of others. Where one sits in that continuum defines them as a spec lead, sets the tone for the group, and shapes the spec itself. You struck a balance that was just right. You started conversations with proposals that weren't too firm, yet were clear in the goal. As well, you actively encouraged the most input from the group without letting conversations drag on too long with no clear conclusion. Hats off to you, Ken.

As well the deepest appreciation to my fellow Expert Group members. There were many, but a special thanks to Reza Rahman, Evan Ireland, Soloman Barghouthi, Carlo de Wolf, Gavin King, Florent Benoit, Adam Bien, and Kim Wonseok. It was an absolute pleasure working with you guys over the last two years. The specification wouldn't have turned out nearly as well without you. It truly was a great group. Let me leave my professionalism aside for a moment and simply say, you all rock.

In reference to the specification itself there are areas to which I feel more personally attached. The EJBs in .wars functionality and of course the Embedded EJB Container API are the highest among them. If you like them, please let me know. They have been a labor of love for me for quite some time. I have high hopes for these in regards to the lightweight EJB front and hope that people find great use in them. The Embedded EJB Container API in particular I see as just the beginning and I'm excited to see what innovations we can bring in EJB.next once more vendors have had the chance to implement it.

Other areas I find particularly exciting are @Schedule, @Singleton and @Asynchronous. These were a challenge and took a considerable amount of the group's time. @Schedule for the challenge in creating an API that is expressive yet simple. @Singleton for the locking and startup ordering. @Asynchronous for the utter simplicity of it that it was hard to know when to stop. In all of the above, I hope we found that magical line that gives enough functionality without cutting us off from adding more bits in the future. Time will tell.

Of course thanks to all the people who provided feedback to the group directly or indirectly through their projects. I know I collected a good amount of spec feedback from the OpenEJB users, as did other EG members from their users/customers. This is perhaps the most valuable and least thanked group. So on behalf of myself and, I'm sure, all of the other EG members, a wholehearted thanks. You have our deepest gratitude.

With that I raise a glass to EJB 3.1! I look forward to working with everyone again in EJB.next! Salud!

by David Blevins at December 28, 2009 02:59 PM

Gavin McDonald—Vostro 1720 first impressions

I’ve only been using it for a day really, and this post is more a test of my WLW setup, but first impressions of the Dell Vostro and Windows 7 are pretty good.

What I’m looking forward to more is actually wiping my old Dell Inspiron 6000 and putting either Ubuntu or FreeBSD on it, undecided which.

(The keyboard on this Vostro is a little springy, better get used to that.)

by gmcdonald at December 28, 2009 08:32 AM

December 27, 2009

Dennis Byrne—I Join DRW Trading

Well I'm back in Chicago to work for DRW Trading. It was tough leaving ThoughtWorks but this was an opportunity I just couldn't pass up.

The interview process at DRW is the toughest I've seen in my career. It's the only company that has ever put me through eight interviews (six technical). I love interviewing. One of my dream jobs is to be a professional interviewer. It's also why I like presenting. I love the last ten minutes of a presentation because I never know what someone is going to shoot at me.

When I interview I'm not just there to do my best at answering questions and asking questions. I also want to see if the interviewers are asking the right questions. If I get an offer after getting through the interviews without being quizzed on unit testing, iterative delivery or continuous integration ... I know I'm probably going to regret taking the offer.

Other bonus points for me were the catered breakfast and lunch, a lot of health benefits I'll never use, and an iPhone. I don't like working in structured environments with a lot of rules, so I was very relieved by the fact that there is no dress code at DRW. Still very impressed at the compensation package they put together for me right in the middle of a global recession. I'm also looking forward to working with fellow ThoughtQuitters Jay Fields, Mike Ward, Mike Rettig, Bobby Norton, Peter Ryan, etc. I'm also excited to get more opportunity to work in a heavy math environment with tools like Retlang or Jetlang.

Anyways, tomorrow I start a new job and boy do I have a lot to learn about trading. I'll be investing a lot less in my technical knowledge portfolio and a lot more in the domain, at least for the short term. Besides, software is about to enter a pretty boring period. Concurrency and mobile are obviously going to matter more, but I keep telling people "the next big thing in software is going to be a little thing in hardware". I've got my eye on gumstix and GPU development.

I am certainly missing San Francisco though. I spent a little more than a year at my last two clients out in the bay area and I sure will miss the weather. I'll also miss the travel opportunities I got with consulting. Living on the road always fit well with me. Relocating back to Chicago was pretty easy this time because everything I own fits into a single trunk and a single piece of luggage. Now that I will have a permanent address, or at least a permanent city, I have to go buy things like dishes and chairs.

Well, that's about it for me. If any of you are in Chicago and you want to meet up just to talk about technology or trading, shoot me an email.

by Not Dennis Byrne at December 27, 2009 11:04 PM

Rich Bowen—Closing Guantanamo

Help me understand this, people. You thought it was a good idea when he promised to close Gitmo. You did realize, didn't you, that the folks inside would have to go somewhere else, right? Why is this a surprise? Or was the expectation that he'd throw open the doors and give them a friendly pat on the bum as they walked out?

As I understood it, the promise was not a statement that those folks shouldn't be incarcerated, but rather that an off-shore facility with limited access to the press was an ideal situation for abuse. It seems pretty obvious that if you close a prison, the folks in it have to go somewhere else.

Seems to me like just another opportunity for manufactured outrage. Please keep in mind that if this move surprises you, you're admitting that you didn't think about the promise itself for more than about a half-second. And yet none of these news stories say "I told you so." Why is that?

by rbowen at December 27, 2009 02:02 PM

Juan Jose Pablos—php binary not found. Make sure php5 or php exists in PATH

If you hit this error on Debian lenny, it means that you do not have the Command Line Interface package installed. To fix it (assuming that you are using php5):

aptitude install php5-cli
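
To confirm the diagnosis before installing anything, a quick check along these lines may help. It is only a sketch: the function name find_php_cli is made up for illustration, and it assumes a POSIX shell where command -v is available. It looks for the same two binary names the error message mentions, php5 first and then php.

```shell
# find_php_cli is a hypothetical helper name, not part of any package.
# It prints the first PHP CLI binary found on PATH, trying php5 before php
# (the two names listed in the error message), and fails if neither exists.
find_php_cli() {
    for candidate in php5 php; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

if php_bin=$(find_php_cli); then
    echo "PHP CLI found: $php_bin"
else
    echo "No PHP CLI on PATH; on Debian lenny, try: aptitude install php5-cli"
fi
```

If the second message is printed, the CLI package is most likely missing and the aptitude command above should resolve it.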

by Juan Jose Pablos at December 27, 2009 01:18 PM
