Archive for July, 2006

Money Saving Tips on Hardware

Wednesday, July 19th, 2006

Luckily most of you aren’t buying servers or trying to stretch your hardware budget. For those of you who are, here are some tips on saving money.

Actually there is only one tip: buy as little as possible from the original manufacturer. Original manufacturers like IBM or HP have pricing power, and they will charge you an arm and a leg for their brand-name products.

Here is how you apply this rule:

  • Buy the most minimal configuration from the manufacturer
  • Use a VAR for the purchase and buy additional items like memory and disk drives from the VAR
  • Have the VAR do the installation of the memory and disk drives

So for example, if you want an HP-DL385 with dual Opterons, 12GB of memory, and three 72GB drives, don’t buy it all from HP! Get an HP-DL385 with 2GB of memory and one tiny drive. Then have the VAR purchase the additional memory and drives, install them, and ship the server to you.

Shop around and find a good VAR. They are competitive, and unlike the manufacturer they don’t have pricing power.

This won’t work for all your hardware needs. Take an HP-DL585 for example. It can take up to 64GB of memory, but only specially certified memory from HP may be used.

This is why you can get a 2-way HP-DL385 with 16GB for $7,000, but a 2-way HP-DL585 with 64GB will cost at least $68,000. Four times the memory, more than nine times the cost.
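To make the arithmetic concrete, here is a quick sketch in Python that works out the ratios from the two prices quoted above. The dictionary layout and function name are just for illustration, and the per-GB figure divides the whole system price by the memory, so treat it as a rough comparison rather than a true memory price.

```python
# Prices and memory sizes are the figures quoted in the post.
configs = {
    "HP-DL385": {"price_usd": 7_000, "memory_gb": 16},
    "HP-DL585": {"price_usd": 68_000, "memory_gb": 64},
}


def cost_per_gb(config):
    """Whole-system price divided by installed memory (a rough measure)."""
    return config["price_usd"] / config["memory_gb"]


small = configs["HP-DL385"]
large = configs["HP-DL585"]

# How much more memory you get versus how much more you pay.
memory_ratio = large["memory_gb"] / small["memory_gb"]
price_ratio = large["price_usd"] / small["price_usd"]

for name, config in configs.items():
    print(f"{name}: ${cost_per_gb(config):,.2f} per GB")
print(f"memory ratio: {memory_ratio:.1f}x, price ratio: {price_ratio:.1f}x")
```

By this measure the bigger box costs well over twice as much per GB of memory.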

Offermatica Scales!

Monday, July 17th, 2006

So last Friday the folks at Offermatica invited me down for a visit. They were very nice, and I quickly discovered that my experiences with scaling were not representative. The company I was working with had sent their traffic to Offermatica’s staging servers! Doh!

That Friday while I was there, ESPN started running some tests. The additional traffic wasn’t causing any problems. ESPN is the 5th largest entertainment site according to Netratings, so that’s not too shabby.

The same day I was talking to a friend of mine who works at BabyCenter. He said Offermatica was working out great for them. Before, any A/B or multivariate testing had to go through the development team, which meant adding more work to the development queue. With Offermatica, the integration is done via JavaScript as a web service.

Now that the development team isn’t involved, the business analysts can take direct control of the testing and get the most out of their site.

Mozilla’s Millions

Monday, July 10th, 2006

Did you know that Mozilla makes tens of millions of dollars from that little Google search box atop every Mozilla browser? In the Bay Area people keep telling me that Mozilla will make $76MM this year. I believe them, but I can’t find a solid source.

This Register article is the closest I could get:
http://www.theregister.co.uk/2006/03/14/google_mozilla_tax/

Search impressions matter; they are even worth paying for.

The Best Open Source Searching Platform

Monday, July 10th, 2006

Erik Hatcher uses Solr. Shouldn’t that be good enough? He uses Ruby on Rails and Solr in conjunction to support http://www.nines.org/.

So what is Solr?

In a nutshell Solr is a wrapper around Lucene which provides all of Lucene’s functionality as a web service.

Here is the description from ApacheCon:
Apache Solr, a Lucene based full-text search server, with XML/HTTP interfaces, declarative specification of data types and text analysis with a schema, extensive caching, index replication, and a web admin interface. Solr is optimized for high volume low latency web traffic and has support for faceted browsing and dynamic results grouping.

What’s so great about Solr?

  • Replication: Solr can replicate its index while still guaranteeing read access. Very nice for high availability and scalability.
  • Language Independent: Solr makes searching with Lucene a web service, so you can access a Lucene collection from any language.
  • Support for Faceted Searching: Solr provides the infrastructure for faceted search with a plug-in module and open bit-sets.
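To give a feel for the language-independence point, here is a minimal sketch in Python that only builds the URL for a search request; any language that can issue an HTTP GET could do the same. The host, port, and the `title` field are assumptions for illustration (they depend on your installation and schema), and `solr_select_url` is a made-up helper name. The `select` handler with `q` and `rows` parameters is Solr’s standard query interface.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, query, rows=10):
    """Build a query URL for Solr's select handler.

    base_url is wherever your Solr instance lives (an assumption here);
    query uses Lucene query syntax; rows caps the number of results.
    """
    params = {"q": query, "rows": rows}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local instance and a hypothetical "title" field.
url = solr_select_url("http://localhost:8983/solr", "title:lucene", rows=5)
print(url)
```

Fetching that URL with any HTTP client returns the matching documents as XML, which is what makes the Lucene index usable from Ruby, Python, or anything else.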

Where does Solr shine?

Solr works well with highly structured schemas, where the data stored is well known. Solr works great as a full-text search engine with an update capability. It doesn’t work well as a front-end database; the complexity of managing the system for rapid updates is just too high.

Nutch is another application which uses Lucene. IMHO it looks like a great tool to support unstructured search when you need to scan through and index a lot of different documents.

Red-Piranha is another project which may interest some, but I’m not sure it’s active anymore :(

Continuous Integration

Sunday, July 9th, 2006

People look at me like I’m crazy when I tell them that they might want to try an optimistic release cycle.

Most organizations have a pessimistic release cycle. Projects are developed on isolated branches. Once completed, the project code is usually integrated down onto the release branch and tested.

Different organizations have different shades of pessimism. Some reserve their pessimism for high-value sections of code, others reserve it for complex projects, and others reserve it for constantly changing applications.

I would like to propose an optimistic model for developing and deploying code on applications which are constantly changing. Too often managers want to create a stable release candidate that they use to fix and regress bugs. Their argument is that when bug fixes and development are combined on a single branch, it is impossible to tell the difference between bugs created during the development of new features and bugs created by bug fixes. On this subject I say a bug is a bug, and it is far better to find and fix problems right after they occur. Constant integration and constant execution of repeatable tests find bugs right away. Doing development on separate branches lengthens the time between creating a bug and fixing it. Let’s face it: code scheduled for release at some future date doesn’t get a lot of quality oversight. Oh, and did I mention that integration is a pain in the butt? I’ve seen code integration projects introduce more thorny bugs than the feature development!

Here is a description of a process which has worked well in the past. Web application code is released every week, with a full release followed by a patch release. User-facing web applications get new code every week, but the service layers and older applications have less frequent updates.

The business unit has a single set of Java code which is compiled, via CruiseControl, every time an update is made. The compilation also runs unit tests and API tests, and it produces jars. Each application then manages its own dependencies, and during development it will pull in the jars it needs. Tools like Ivy and Sage are helpful for managing dependencies.

All development is done on the head. Two days before deployment, a release branch is created. Next, a release candidate is created from this branch and deployed to a staging/integration environment. Then automated tests are run using Sybioware. Bugs are filed, fixed, and hopefully regressed. The release candidate then becomes the production release.

Once live, you may fix-forward by patching the branch and redeploying. Or you may roll back to the previous release. A patch release is a scheduled fix-forward. Code changes to the release branch are also made to the head.
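Here is a toy model of that branching flow in Python, just to show how the head and the release branch relate. It is a sketch, not a real version control tool: the `Repository` class, its method names, and the change and branch names are all made up for illustration, and branches are simply lists of change labels.

```python
class Repository:
    """Toy model: the head and each release branch are lists of changes."""

    def __init__(self):
        self.head = []
        self.release_branches = {}

    def commit_to_head(self, change):
        # All day-to-day development lands on the head.
        self.head.append(change)

    def cut_release_branch(self, name):
        # The release branch is a snapshot of the head at this moment.
        self.release_branches[name] = list(self.head)

    def patch_release(self, name, fix):
        # A fix-forward goes onto the release branch and also onto the head.
        self.release_branches[name].append(fix)
        self.head.append(fix)


repo = Repository()
repo.commit_to_head("feature-a")
repo.commit_to_head("feature-b")
repo.cut_release_branch("week-28")      # two days before deployment
repo.commit_to_head("feature-c")        # development continues on the head
repo.patch_release("week-28", "fix-1")  # bug found in staging or production

print(repo.release_branches["week-28"])
print(repo.head)
```

The release branch never sees `feature-c`, while the head picks up `fix-1`, so next week’s snapshot already contains the fix.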

I would definitely consider this approach an optimistic one. No one can say for sure what is in the development branch at any given moment. The release branch is a snapshot of development. So every week unwanted “features” may make it into the release candidate, but these are usually found before going live. In my experience this process produces only 2 rollbacks a year. I consider this optimistic process pretty good. In addition, it really saves time by eliminating lengthy integration work.

Of course, there are exceptions to every rule, and some long-lived, hairy development projects on static code are best done on a separate branch. But for frequently changing applications, the integration phase only creates more bugs!

This process worked with a fairly simple setup. The web application tier was stateless, and it only served 16MM pages a day. It didn’t support multiple data centers active at the same time, so there was a limited need to synchronize message passing. I’m fairly certain that this release model could be extended to more complex environments by doing the following:

  • provide additional capability to commit, tag, and track code in sets of files
  • encourage constant project integration on a daily basis
  • encourage repeatable full functional testing during development

Offermatica’s Response

Thursday, July 6th, 2006

Yikes, 4th of July, Tour de France, and Wimbledon have conspired against my blogging. Mark from Offermatica posted a comment, which deserves a little more prominence. So I’ve posted it below.

Eric,

Thank you for appreciating the elegance of the Offermatica integration.

I’d like to clarify a few of your points that may be based on old information:
1. Offermatica is supporting clients that send over 100 Million requests a day.
2. Offermatica supports multiple conversion events across multiple pages and multiple user visits.
3. By default Offermatica uses first-party cookies

Mark
(Offermatica Engineering Team)

Takeaway

What’s the takeaway from all of this?

Well, first off, I could be wrong. My first-hand impressions and usage may be different from others’. I’m overwhelmingly positive about Offermatica, and I expect others would be as well, but my particular criticisms of the product may not be apparent to others.

Which brings me to my next point: these optimization tools are necessary and valuable. For that reason I recommend them wherever I go.

Finally, there should be a much bigger market for analytical tools. Every day people vote with their clicks on what they like and what works for them. It’s unfortunate that most of the debate bounces back and forth between those people who already understand the importance of good analytics.

So the same question remains: where is all the other optimization and analytics software for web sites? Why do some business owners continue to believe that their intuition is better than gathering real data from frequent experiments?

So here is my challenge to businesses that develop software to support web analytics. Please make amazing products which provide easy-to-understand reports for the simplest of tests. As an industry, please work together to create a body of standard metrics for testing different user experiences.

My challenge for business owners: please use the products listed on the blog. Make real projections for the products you launch, and use consistent measures in your business. Don’t be fooled by the recent tide of ad dollars, which has lifted all boats.