Archive for August, 2006

Ubuntu vs Debian Package Counts

Monday, August 14th, 2006

I found this amusing. Lucas Nussbaum has compared packages in Debian testing/unstable to Ubuntu Dapper/Breezy. Check out the stats. He loosely makes the claim that Debian has newer software than Ubuntu. For me the comparison doesn’t really say much, but the comments are especially funny.

Hard Drive Temperature RRD

Monday, August 14th, 2006

Berkeley doesn’t usually get too warm, maybe 85F at the hottest. This summer has been very warm, with temperatures in the upper 90s! I don’t have an air conditioner, and I’m not alone in my plight; none of my neighbors have one either. I was worried that the few computers I do have would melt down in my all too toasty office.

I found hddtemp, a simple program that reads the temperature of my drives and serves the readings on a TCP port. I made an RRD graph out of it and posted the results. I documented the scripts and such here. hddtemp also integrates really well with gkrellm, which saves the time of making and posting RRDs.
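Reading the daemon and turning its output into an RRD update can be sketched roughly as below. This is not the script linked above, just a minimal illustration. It assumes hddtemp is running in daemon mode on its default port 7634 and emits records of the form `|/dev/sda|MODEL|38|C|`; the function names and the sample device names are made up for the example.

```python
import socket

HDDTEMP_HOST = "127.0.0.1"  # assumption: the daemon runs on this machine
HDDTEMP_PORT = 7634         # hddtemp's default daemon port


def read_hddtemp(host=HDDTEMP_HOST, port=HDDTEMP_PORT, timeout=5.0):
    """Connect to the hddtemp daemon and return everything it sends."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:  # the daemon closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def parse_hddtemp(raw):
    """Parse '|/dev/sda|MODEL|38|C||/dev/sdb|MODEL|41|C|' into a dict."""
    readings = {}
    # Drive records are concatenated back to back, so '||' marks a boundary.
    for record in raw.strip().strip("|").split("||"):
        fields = record.split("|")
        if len(fields) != 4:
            continue  # skip anything that does not look like a drive record
        device, _model, value, unit = fields
        # A sleeping or unreadable drive reports a non-numeric value.
        readings[device] = (int(value), unit) if value.isdigit() else None
    return readings


def rrd_update_arg(readings, devices):
    """Build the 'N:v1:v2' argument for `rrdtool update`; 'U' means unknown."""
    values = []
    for device in devices:
        reading = readings.get(device)
        values.append(str(reading[0]) if reading else "U")
    return "N:" + ":".join(values)
```

From cron, something like `rrd_update_arg(parse_hddtemp(read_hddtemp()), ["/dev/sda", "/dev/sdb"])` would produce the string to hand to `rrdtool update`, with the device order matching the data sources in the RRD.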

Small RRD of disk temperature

Web App Architectures Part 2

Sunday, August 13th, 2006

Multi-service web platforms represent architectures which have several distinct functions. Sometimes these functions are tightly integrated together, sometimes they are loosely coupled services and sometimes they are both.

I really dislike the term N-Tier architecture. An N-Tier architecture is a tightly integrated set of distinct functions which are usually separated onto different tiers of machines. N-Tier implies more than three tiers. To me it hearkens back to the dark ages of build-it-all-yourself and over-engineering. I still get asked how much experience I have with N-Tier architectures, so I suppose someone cares about them. I always hope it’s a trick question, and my response is “not much.”

Diagram: N-Tier Application Stack

Service Oriented Architectures (SOA) mix it up a little bit, by taking a defined business or technical area and creating a separate application space. Many SOAs can live in the same container, or a single SOA can live on its own tier of machines.

Diagram: SOA Web App Stack

In a tightly integrated stack you might see the following:

Data and Query Layer - go get data from multiple sources, like your stock portfolio information from the user database, current stock price from the stock database, and a real time query to the clearing-house mainframe to check the status of your trades.

Data Object Layer - The different queries are marshaled into objects; often the object is implicitly tied to its data cache.

Business Logic Layer - Special rules, distinct from the data logic in the data layer. In the worst case there may be several distinct layers of business logic.

Presentation Layer - special controls for surfacing different workflows and different look and feel.
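To make the four layers concrete, here is a toy sketch of the stock portfolio example. Everything in it is hypothetical: the in-memory dictionaries stand in for the user and stock databases, the clearing-house query is left out, and the names are invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the user database and the stock database.
USER_DB = {"alice": {"ACME": 10, "INIT": 5}}
PRICE_DB = {"ACME": 12.50, "INIT": 40.00}


# Data and Query Layer: go get raw data from each source.
def query_holdings(user):
    return dict(USER_DB.get(user, {}))


def query_price(symbol):
    return PRICE_DB[symbol]


# Data Object Layer: marshal the separate queries into objects.
@dataclass
class Position:
    symbol: str
    shares: int
    price: float


def load_positions(user):
    holdings = query_holdings(user)
    return [Position(s, n, query_price(s)) for s, n in sorted(holdings.items())]


# Business Logic Layer: rules that sit on top of the data objects.
def portfolio_value(positions):
    return sum(p.shares * p.price for p in positions)


# Presentation Layer: surface the result to the user.
def render_portfolio(user):
    positions = load_positions(user)
    lines = [f"{p.symbol}: {p.shares} @ {p.price:.2f}" for p in positions]
    lines.append(f"Total: {portfolio_value(positions):.2f}")
    return "\n".join(lines)
```

Each layer only calls the one below it, which is the tight integration: changing how prices are fetched ripples up through every layer that depends on `Position`.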

In a loosely integrated stack, you are more likely to see the layers combined. So the Data Object Layer will not be separated from the Data Queries, and it may not even be separated from the Business Logic. With a loosely coupled system, lots of small vertical stacks are created, which isolates changes and promotes common APIs.

Without the need for tight integration, the need for a complex architecture disappears. The simplicity of a loosely coupled system is the result of pushing the dependencies onto the consumer of the service. Most consumers have simple needs, especially in the beginning. With a tightly integrated stack the dependencies are built into the system, and consumers are given a complex system in an attempt to encapsulate their needs and make the dependencies invisible.

Web App Architectures Part 1

Thursday, August 10th, 2006

I enjoy blogging, but I often talk about little tech and process tidbits that interest me. This week I’m in Seattle, and I’ve had the opportunity to talk to a lot of folks starting online businesses, building online businesses and running online businesses.

The interesting thing in Seattle is that it’s a Microsoft town, yet almost all of the business managers and founders of new online companies want tech people skilled in Open Source and Linux. This desire is driven by the feeling that open source is faster, more flexible, and more on the cutting edge. I agree with that, but I also see a lot of bloat, and too many tech choices to make. Just think of how many Java XML parsers there are (Jibx, PPP, JBO).

The problem is that lots of opinions about better technology are thrown out there, with no backdrop for comparison. Often technologies are discussed and evaluated for their unique or expressed purpose, but they need to be evaluated for how they fit into the overall stack of software.

At a 10,000 foot view, a basic web app has three parts:
Data - some structured information
Presentation - some web application to display the data
Manipulation - some apps to manipulate or personalize the data

Basic Web Architecture

This is a basic web app, which should cover more than 90% of all the sites out there. This breakdown isn’t going to describe everything, but let’s save something for parts 2 & 3.

The simplest representation of this is a flat HTML page.
Presentation - the HTML page and an Apache server
Manipulation - the tool would be a text editor like vi
Data - the file system.

A more complex system would have capabilities like user-generated content, search, administration tools, metadata, co-branded pages, and ad delivery. I’ll admit not every app falls cleanly into my three overgeneralized buckets, but this is just a basic web architecture which should describe almost all the sites out there.

Presentation
——————-
Search
Co-brands
Feeds
Google Maps
Ad Delivery
Reading Blogs

Manipulation
——————–
Editorial Tools
User Facing Tools
Writing Blogs

Data
———
Databases
File systems
In Memory Caches

Things that don’t fit in well are integrated APIs, where reading and writing are done via the same interface. The same application may handle both, and separating the two functions would be silly.

Messaging platforms (e.g. JMS) don’t really fit in either.

Like I said before, this isn’t an attempt to describe all web architectures, just the most basic variety.

Stupid SSH Hacks

Sunday, August 6th, 2006

Ever see lots of ssh requests in your auth log? Yeah, me too, and I’m sick of it.

Aug 2 22:21:49 host sshd[27593]: Invalid user web from 61.189.35.74

My ssh doesn’t accept passwords; you need a valid 1024-bit private key, plus a pass phrase, to get in. So I don’t think that a brute force attack on ssh will do much. Still, I wanted to put an end to it, and I was thinking of using swatch to monitor the logs and block IPs. Then I found an interesting post on using iptables to stop brute force SSH attacks.

I added just a handful of lines to my firewall, much easier than using some script. If you look at the post above you’ll find a more sophisticated setup.

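# Create a new chain for SSH connection attempts
iptables -N SSH_Brute_Force
# Record the source IP of every new connection to port 22, then jump to the chain
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -m recent --name SSH --set --rsource -j SSH_Brute_Force
# Good IPs: always let these through
iptables -A SSH_Brute_Force -s 12.223.68.45 -j RETURN
iptables -A SSH_Brute_Force -s 45.68.223.12 -j RETURN
# Fewer than 3 new connections in the last 60 seconds: let it through
iptables -A SSH_Brute_Force -m recent ! --rcheck --seconds 60 --hitcount 3 --name SSH --rsource -j RETURN
# Otherwise log the attempt and tarpit the connection
# (TARPIT is an add-on target, not part of stock iptables)
iptables -A SSH_Brute_Force -j LOG --log-prefix "SSH Brute Force Attempt: "
iptables -A SSH_Brute_Force -p tcp -j TARPIT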

-m recent matches packets and stores their source IP addresses in a list
--name SSH gives your list of IP addresses a name, otherwise DEFAULT is used
--set puts the source IP address on the list
--seconds the quiet period and the sample period
--hitcount number of hits required in the sample period to activate this rule
--rcheck checks whether the source IP address is currently on the list (the ! inverts the match)
RETURN jumps back to the original chain
TARPIT keeps the connection open and lets the client time out
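The way --set, --seconds and --hitcount interact is easier to see in a small simulation. The sketch below is only a model of the rules above, not how the kernel module is implemented: the class name is invented, time is passed in by hand, and the real module also caps how many timestamps it remembers per address.

```python
from collections import defaultdict, deque


class RecentList:
    """Toy model of the SSH_Brute_Force chain built on `-m recent`."""

    def __init__(self, seconds=60, hitcount=3, trusted=()):
        self.seconds = seconds        # --seconds: the sample window
        self.hitcount = hitcount      # --hitcount: hits needed to trip the rule
        self.trusted = set(trusted)   # the "good IPs" that always RETURN
        self.hits = defaultdict(deque)

    def new_connection(self, ip, now):
        """Return the verdict for a NEW connection from ip at time now (seconds)."""
        stamps = self.hits[ip]
        stamps.append(now)            # --set: the INPUT rule records every attempt
        if ip in self.trusted:
            return "RETURN"
        # Forget hits that fall outside the sample window.
        while stamps and now - stamps[0] > self.seconds:
            stamps.popleft()
        # '! --rcheck' RETURNs unless hitcount was reached inside the window.
        if len(stamps) >= self.hitcount:
            return "TARPIT"
        return "RETURN"
```

Because the attempt is recorded before it is checked, the third connection inside 60 seconds is the one that gets tarpitted, and an attacker who keeps hammering keeps refreshing the window.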

ATA over Ethernet (AoE) is Crap

Tuesday, August 1st, 2006

On July 31st an article titled iSCSI Killer was posted on Slashdot, which created quite a flurry of blog posts and activity. The Slashdot post references this Linux Journal article.

There has been a huge shift in storage technologies over the last year or two. It’s been very exciting, and there are a large number of very compelling vendors who offer some great products and services.

I’m not going to give a tutorial on networked storage, but ATA over Ethernet (AoE) is a lightweight, low-level network protocol which is NOT IP based. AoE can’t be routed since it isn’t IP based, and it gets punted over to all devices on the switch. The current mainstream network-based protocol is iSCSI. iSCSI wraps SCSI commands in TCP segments. iSCSI is an IP-based protocol which may be routed.

Yes, AoE seems like a lot of fun, and yes, it is a different option than iSCSI, but I still say it’s crap. First off, I’ve seen iSCSI work in high transaction environments with updates in bursts of 1GB in a few minutes. The clients have no problem. The 3PAR server had a TOE (TCP offload engine), but it was still a lot less expensive than the EMC/Veritas cluster we had. Screw 30% performance degradation from TCP/IP overhead, ’cause iSCSI with a TOE is still cheaper than what was available in 2003/2004.

iSCSI may be routed across switches, which is a huge deal. Why? Because for most networks I build out I have a few routers and lots of switches, often a switch for every other rack, and two routers for each ingress/egress. What good is 10TB of data if 90% of the machines in the co-lo can’t get to it!?

A large storage array sitting behind a single host, something that is very useful in a small business for disk-based backups, may be easily accomplished via iSCSI. Check out this article on building an iSCSI array. The article singles out open-e as the best of breed.

From my limited research it seems that open-e is a tweaked version of Debian (my favorite) running a flash based box with an IDE controller.

Note that most of the articles list a price point of $1/GB. Amazon’s S3 service charges $0.15 per GB per month, and $0.20 for each GB transferred. So you would reach $1/GB in a little under seven months of S3 service, even without transferring data! Goes to show that many times it makes sense to grow your own.
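The break-even arithmetic is simple enough to check with a few lines. This sketch uses only the prices quoted above; the function names are invented, and it ignores power, hosting and admin time for the home-grown array.

```python
STORAGE_PER_GB_MONTH = 0.15  # S3 storage price quoted above, dollars
TRANSFER_PER_GB = 0.20       # S3 transfer price quoted above, dollars
OWN_COST_PER_GB = 1.00       # one-time price point for a home-grown array


def s3_cost_per_gb(months, gb_transferred_per_gb_stored=0.0):
    """Cumulative S3 cost for one stored GB after the given number of months."""
    storage = months * STORAGE_PER_GB_MONTH
    transfer = gb_transferred_per_gb_stored * TRANSFER_PER_GB
    return storage + transfer


def break_even_months(gb_transferred_per_gb_stored=0.0):
    """Months of S3 storage after which you have spent OWN_COST_PER_GB."""
    remaining = OWN_COST_PER_GB - gb_transferred_per_gb_stored * TRANSFER_PER_GB
    return remaining / STORAGE_PER_GB_MONTH
```

With no transfers, `break_even_months()` comes out to roughly 6.7 months; uploading each GB once (one transfer) pulls it down to a bit over 5 months.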