Archive for the 'Software Development' Category

The Best Open Source Searching Platform

Monday, July 10th, 2006

Erik Hatcher uses Solr. Should that be good enough? He uses Ruby on Rails and Solr together to support http://www.nines.org/.

So what is Solr?

In a nutshell, Solr is a wrapper around Lucene that provides all of Lucene’s functionality as a web service.

Here is the description from ApacheCon:
Apache Solr, a Lucene based full-text search server, with XML/HTTP interfaces, declarative specification of data types and text analysis with a schema, extensive caching, index replication, and a web admin interface. Solr is optimized for high volume low latency web traffic and has support for faceted browsing and dynamic results grouping.
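
To make the XML/HTTP interface concrete, here is a minimal sketch of how a client might build a query URL for Solr's select handler. The host, port, and base path are assumptions for illustration, and `build_select_url` is a hypothetical helper, not part of Solr.

```python
from urllib.parse import urlencode

# Assumption: a Solr instance reachable at this base URL (illustrative only).
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(query, rows=10, base=SOLR_BASE):
    """Build a URL for Solr's HTTP select handler (hypothetical helper)."""
    # q is the query string, rows is the number of results to return.
    params = urlencode({"q": query, "rows": rows})
    return f"{base}/select?{params}"


print(build_select_url("title:lucene"))
# → http://localhost:8983/solr/select?q=title%3Alucene&rows=10
```

Any language that can issue an HTTP GET and parse the XML response can use the same URL.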

What’s so great about Solr?

  • Replication: Solr can replicate its index and still guarantee read access. Very nice for high availability and scalability.
  • Language Independent: Solr turns searching with Lucene into a web service, so you can access a Lucene collection from any language.
  • Support for Faceted Searching: Solr provides the infrastructure for faceted search with a plug-in module and open bit sets.
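
The bit set idea behind faceted search can be sketched in a few lines. This is not Solr's implementation, only an illustration of the technique: each facet value keeps a bit set of matching document ids, and a facet count is the number of bits in the intersection with the query's hits. The "genre" field and document ids are made up.

```python
def to_bitset(doc_ids):
    """Pack a collection of integer document ids into an int used as a bit set."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def facet_counts(query_hits, facet_bitsets):
    """Count, per facet value, how many query hits carry that value."""
    return {
        value: bin(query_hits & bits).count("1")  # popcount of the intersection
        for value, bits in facet_bitsets.items()
    }


# Hypothetical toy index: six documents, faceted by a made-up "genre" field.
genre = {
    "poetry": to_bitset([0, 1, 4]),
    "fiction": to_bitset([2, 3]),
    "letters": to_bitset([5]),
}
hits = to_bitset([1, 2, 4, 5])  # documents matching some query

print(facet_counts(hits, genre))
# → {'poetry': 2, 'fiction': 1, 'letters': 1}
```

Because the intersection is a single bitwise AND, the counts stay cheap even when a query matches many documents.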

Where does Solr shine?

Solr works well with highly structured schemas, where the stored data is well known. Solr works great as a full-text search engine with an update capability. It doesn’t work well as a front-end database; managing the system for rapid updates is just too complex.

Nutch is another application which uses Lucene. IMHO it looks like a great tool for unstructured search when you need to scan through and index a lot of different documents.

Red-Piranha is another project which may interest some, but I’m not sure it’s active any more :(

Continuous Integration

Sunday, July 9th, 2006

People look at me like I’m crazy when I tell them that they might want to try an optimistic release cycle.

Most organizations have a pessimistic release cycle. Projects are developed on isolated branches. Once completed, the project code is usually integrated down onto the release branch and tested.

Different organizations have different shades of pessimism. Some reserve their pessimism for high-value sections of code, others reserve it for complex projects, and others reserve it for constantly changing applications.

I would like to propose an optimistic model for developing and deploying code on applications which are constantly changing. Too often managers want to create a stable release candidate that they use to fix and regress bugs. When bug fixes and development are combined on a single branch, it is impossible to tell the difference between bugs created during the development of new features and bugs created by bug fixes. On this subject I say a bug is a bug, and it is far better to find and fix problems right after they occur. Constant integration and constant execution of repeatable tests find bugs right away. Doing development on separate branches lengthens the time between creating a bug and fixing it. Let’s face it: code scheduled for release at some future date doesn’t get a lot of quality oversight. Oh, and did I mention that integration is a pain in the butt? I’ve seen code integration projects introduce more thorny bugs than the feature development!

Here is a description of a process which has worked well in the past. Web application code is released every week, with a full release followed by a patch release. User facing web applications get new code every week, but the service layers and older applications have less frequent updates.

The business unit has a single set of Java code which is compiled (via CruiseControl) every time an update is made. The compilation also runs unit tests and API tests, and produces jars. Each application then manages its own dependencies, and during development it pulls in the jars it needs. Tools like Ivy and Sage are helpful for managing dependencies.

All development is done on the head. Two days before deployment a release branch is created. Next, a release candidate is created from this branch and deployed to a staging/integration environment. Then automated tests are run using Symbioware. Bugs are filed, fixed, and hopefully regressed. The release candidate then becomes the production release.
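
The timing of this cycle can be sketched as a small calendar calculation. The two-day lead and the weekly cadence come from the description above; the function names and the example deployment date are hypothetical.

```python
from datetime import date, timedelta

BRANCH_LEAD_DAYS = 2  # from the text: the branch is cut two days before deployment


def branch_cut_date(deploy_date):
    """Date on which the release branch is created for a given deployment."""
    return deploy_date - timedelta(days=BRANCH_LEAD_DAYS)


def weekly_schedule(first_deploy, weeks):
    """Return (branch_cut, deploy) date pairs for consecutive weekly releases."""
    deploys = (first_deploy + timedelta(weeks=i) for i in range(weeks))
    return [(branch_cut_date(d), d) for d in deploys]


# Hypothetical example: weekly deployments starting Thursday, July 13th, 2006.
for cut, deploy in weekly_schedule(date(2006, 7, 13), 3):
    print(f"branch {cut.isoformat()} -> deploy {deploy.isoformat()}")
```

Everything committed to the head before the cut date rides along in that week's release candidate, which is what makes the model optimistic.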

Once live, you may fix-forward by patching the branch and redeploying, or you may roll back to the previous release. A patch release is a scheduled fix-forward. Code changes to the release branch are also made to the head.

I would definitely consider this approach an optimistic one. No one can say for sure what is in the development branch at any given moment. The release branch is a snapshot of development. So every week unwanted “features” may make it into the release candidate, but these are usually found before going live. In my experience this process produces only 2 rollbacks a year. I consider this optimistic process pretty good. In addition, it really saves time by eliminating lengthy integration work.

Of course, there are exceptions to every rule and some long lived hairy development projects on static code are best done on a separate branch, but for frequently changing applications the integration phase only creates more bugs!

This process worked with a fairly simple setup. The web application tier was stateless, and it only served 16MM pages a day. It didn’t support multiple data centers active at the same time, so there was a limited need to synchronize message passing. I’m fairly certain that this release model could be extended to more complex environments by doing the following:

  • provide additional capability to commit, tag, and track code in sets of files
  • encourage constant project integration on a daily basis
  • encourage repeatable full functional testing during development

Symbioware next generation testing

Saturday, June 10th, 2006

Why I love Symbioware & you should too http://symbioware.com/overview.html

Testing web applications is tough. It’s so hard that I encourage most developers to keep as much code out of the webapp as possible. For Java this means turning off scriptlets and using JSTL as much as possible. Ruby has a pretty good setup already, but making your code testable via modules is a must. With PHP, templating with Smarty and other add-ons is a good idea.

Why is it so tough? First, you need to spin up a web container and add test harnesses to talk to the webapp. Many times the only option is to use HTTP to exercise the test cases, and using HTTP means scraping web pages. Not fun.
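
To give a feel for what scraping involves, here is a minimal sketch that pulls the form field names out of a page using Python's standard `html.parser`. The page content and the `InputNameCollector` class are made up for illustration; a real test would fetch the page over HTTP first.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collect the name attribute of every <input> element in a page."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


# Hypothetical response body standing in for a fetched login page.
page = (
    '<form action="/login">'
    '<input name="email">'
    '<input name="password" type="password">'
    "</form>"
)

collector = InputNameCollector()
collector.feed(page)
print(collector.names)
# → ['email', 'password']
```

Even this small check is tied to the markup: rename a field or restructure the form and the test has to change with it.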

Second, using HTTP means the testing client receives active scripting. There are almost no clients out there which can handle active scripts. This is a shame, because a good web-enabled application may have lots of good features in active script.

Finally, there is the ever-changing presentation. A form which says email today may say username tomorrow. The testing code, which is often written in XML or some scripting language, can’t keep up.

The solution comes in the form of products like Symbioware. It uses the IE browser to parse web pages, and it fully supports everything that IE does. That solves problems 1 & 2.

Symbioware has a great WYSIWYG interface. Creating new test cases is easy: you can see and interact with your web page. It is also easy to keep up with changes, because you see them. No more digging through code.

It’s fantastic. Higher quality means less time fixing bugs, which means more features faster.

Software Engineers without Analytics

Tuesday, May 30th, 2006

It frustrates me to see otherwise respectable software engineers working on problems without basic analytical skills. I can understand why it happens: we are in an age of enabled and empowered software developers who are constantly building new infrastructure. A mindset quickly develops: if it’s broken, rebuild it.

Consequently every developer is a creative architect.

Sometimes you need some analytic abilities. This week we are faced with an admittedly complex problem, yet no one understands the scope of the problem. Parts of the problem are poorly defined, and there are no metrics indicating the scope of the problem. I love to see metrics for problems, even not so good ones. They tell you if you are on the right track. After all, how else do you know if your solution is worth the time and energy you put in?

Another thing: simple explanations are often overlooked. A recent configuration change should always be a suspect, ruled out only after studious investigation. When you hear hoof beats it’s more likely a herd of horses than a stampede of zebras.

Martin Fowler and I agree

Monday, May 22nd, 2006

In a recent post Mr Fowler and I agree.

He breaks down code ownership, one of my favorite topics, into three segments, with Strong Ownership being his least favorite.

  • Strong Ownership
  • Weak Ownership
  • Community Ownership

I agree, Strong Ownership doesn’t work. Sure, people need to be held accountable for the code they write, but having an individual responsible for a section of code makes no sense. It doesn’t even pass the hit-by-a-bus test.

Community Ownership has always worked well for the projects and teams that I’ve managed. The team really comes together to put out the right product, and they make some high quality code. There is something to be said for peer pressure: you just don’t want to let down your team.