Friday, January 14, 2005
What a mess....
Alright, I'm on a rant today. Here's the subject of my current rant. http://searchexchange.techtarget.com/originalContent/0,289142,sid43_gci1043858,00.html?track=NL-368&ad=501419
I guess part of the problem here is that the author of this article chose to quote a non-Microsoft employee. Ok - let's be fair; it's probably not the easiest thing in the world to get an official quote from Microsoft, but I think it would be nice to verify the information you gather.
So what's my big problem with this article? Well, two things come to mind. Let's take a closer look at the article, shall we?
First, Microsoft rules out a new Exchange data store. Whooooaaaaa! Stop the presses! Whaddya mean? Hasn't the plan always been to put Exchange on a SQL database? To be honest, I don't understand what the added benefits would be. SQL replication or log shipping? Maybe, but there is no guarantee those features would even be enabled or supported. SQL would have to be rewritten to be optimized for e-mail (not a trivial thing here), and I suspect it would take a couple of iterations to make it work optimally.
Part two of this conversation is: Why does everyone think that JET (ESE) is so bad? Part of the misconception probably comes from everyone associating JET with Access, which does in fact use a Jet database, but one based on the Jet Red version. Exchange 4.0 and 5.0 used Jet too, but it was Jet Blue. There is no comparison between Jet Red and Jet Blue - they are so totally different it isn't even funny. Exchange 5.5 used yet another revision of Jet Blue, referred to as ESE (Extensible Storage Engine), ESE97 to be exact. It has been optimized for Exchange, and works pretty darn well. Ok - this perhaps wasn't always the case, as Exchange 5.5 was more prone to corruption, but with ESE98 (the version that Exchange 2000 and 2003 use), this is much less of an issue. Plus, you have to remember that 99.9% of the time (a figure I made up on the spot of course), corruption is caused by hardware issues, such as bad memory or a failing RAID controller. Corruption in the stores that is actually caused by Exchange is much less common. Exchange 2003 SP1 even has a new feature called ECC (Error Correcting Code) Checksum, which is designed to further prevent potential problems by automatically correcting certain errors.
OK - the second problem I have with the article is the misinformation attributed to Brien Posey in the following passage:
"Exchange Server 2003 Enterprise Edition is designed for up to 38 GB, although performance takes a hit when you go beyond that, he said. But Microsoft has let IT administrators run a lot of parallel information stores to compensate."
I guess I must have missed the bus on this one, because I have never seen or heard anything like this. Others I know have said the same: this is just flat out wrong. Exchange is designed to scale into the terabytes, should an administrator so desire. Of course, the server hosting such a store would have to be properly sized, but that's another story. The only recommendation I have ever come across for database sizing says that you should size your databases based on your restore window. In other words, if you have SLAs that dictate your restore can take no longer than 6 hours, then you need to size your databases so that a restore fits in those 6 hours. That of course depends a lot on the backup hardware (DLT, SDLT, LTO, backup to disk, snapshot, etc.) and also to some degree on the software used to back up the information stores. BackupExec, for instance, will probably be able to achieve a higher throughput than, say, NTBackup, which is built into Windows.
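To put some rough numbers on the restore-window idea, here is a minimal back-of-the-envelope sketch in Python. The function name, the restore rate and the overhead allowance are all assumptions I picked for illustration; measure your own backup hardware and software before trusting any figure.

```python
def max_database_size_gb(restore_window_hours, restore_rate_gb_per_hour,
                         overhead_hours=0.0):
    """Estimate the largest database that can be restored inside the SLA window.

    restore_window_hours     -- restore time allowed by the SLA
    restore_rate_gb_per_hour -- measured restore throughput (an assumption here)
    overhead_hours           -- time set aside for things like locating media,
                                replaying logs and mounting the store (assumed)
    """
    usable_hours = restore_window_hours - overhead_hours
    if usable_hours <= 0:
        raise ValueError("overhead leaves no time for the restore itself")
    return usable_hours * restore_rate_gb_per_hour


# Hypothetical example: 6-hour SLA, 1 hour of overhead, 30 GB/hour restore rate.
print(max_database_size_gb(6.0, 30.0, overhead_hours=1.0))  # 150.0
```

With those made-up inputs the ceiling works out to 150 GB per database. Halve the restore rate and the ceiling halves too, which is why the backup hardware and software matter so much here.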
Now as the database grows, the disk I/O that it consumes does go up. Larger files need more I/O. At which point, however, do you say "performance takes a hit"? What are you comparing performance to? Is it from a user perspective, or from a server perspective? Are the databases stored on a RAID 5 volume or a RAID 1+0 (RAID 10) volume? Is Exchange being allocated a dedicated spindle/volume? What type of disks? 10,000 rpm? 15,000 rpm? DAS, NAS or SAN? These are all variables that must be taken into consideration. Of course if you are running a 100 GB database on a 3-disk RAID 5 volume, odds are your users aren't going to be happy. But then you also haven't done your homework on disk sizing. There just isn't any reason to make a blanket statement like that. The same rule is still true when using multiple databases. Many people are aware that Exchange 2000 and 2003 allow you to host multiple databases on the same server (a welcome change, I might add). In fact, you can have up to 20 databases on an Enterprise Edition server (4 storage groups, 5 databases per storage group). This is all well and good, but it also dictates proper server sizing. I don't see much benefit in creating 20 databases that are housed on the same RAID volume, and I would expect the disk I/O to be even greater than with one gigantic database. Maybe I'm off base there, I dunno. What I'm trying to say is that the same proper server sizing procedures apply here.
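For the disk sizing homework, a rough sketch of the usual spindle-count arithmetic might look like this in Python. The write penalties (4 back-end I/Os per write for RAID 5, 2 for RAID 10) are the commonly used rule-of-thumb values; the user count, the per-user IOPS, the read/write mix and the per-disk IOPS are numbers I made up for illustration, not anything from Microsoft.

```python
import math

# Rule-of-thumb back-end I/Os generated by one front-end write.
RAID_WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def spindles_needed(users, iops_per_user, read_fraction, raid_level,
                    iops_per_disk):
    """Estimate how many disks a volume needs to carry the database I/O load."""
    total_iops = users * iops_per_user
    reads = total_iops * read_fraction
    writes = total_iops - reads
    # Reads hit the disks once; each write costs the RAID level's penalty.
    backend_iops = reads + writes * RAID_WRITE_PENALTY[raid_level]
    return math.ceil(backend_iops / iops_per_disk)


# Hypothetical load: 2,000 users at 0.5 IOPS each, 75% reads, 150 IOPS per disk.
print(spindles_needed(2000, 0.5, 0.75, "raid5", 150))   # 12
print(spindles_needed(2000, 0.5, 0.75, "raid10", 150))  # 9
```

Under those assumptions the load calls for about a dozen disks on RAID 5, which goes some way to explaining why a 3-disk RAID 5 volume makes for unhappy users. Note that the number of databases never enters the calculation: splitting the same users across 20 databases on one volume doesn't make the I/O go away.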
Anyway, I think I've reached the end of my rant. As always, I'd appreciate your comments.