Random thoughts from an unusual company

Are you searching too much?

Gabriella Davis  11 March 2009 13:00:06
I spent a lot of yesterday on a customer support call that started out as a corrupt Windows folder containing an FT index and ended up with the indexer spiking to 60% CPU while trying to FT index a 550 MB mail file.  When it was still doing that 20 minutes later and the index had grown larger than the mail file itself, I stopped it and went to check what was happening.  This particular user, although he had only 6000 documents in his mail file, had 10 to 20 Word documents attached to many of the messages.  The indexer's attempt to FT index those attachments was causing the spike in CPU and the enormous resulting index size.  I recreated the index with the indexing of file attachments disabled, and it built in less than a minute, using 12% CPU and ending up at 23 MB.

So is it worth indexing file attachments?  It's the default setting when creating an FT index, and many admins just select the entire Mail directory and create FT indexes for every db in it, leaving the option for indexing attachments on.  But that not only radically increases the amount of indexing the server needs to do, it also takes up considerably more disk space.  All this, and the users may not even be aware the feature is in place.  Many users search for text in emails; they don't think to search for text in attached files because they don't realise FT indexing is that clever.  In fact I've had several support calls in the past from customers who thought the index was returning invalid results, because they didn't realise it had picked up the search word in the attached file even though it wasn't in the document itself.

If your users are only searching mail message content, then do yourself and your server a favour: rebuild the FT indexes you have, excluding the indexing of attachments.
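
If you want a rough idea of which mail files are worth looking at first, the sketch below compares each database with the size of its index on disk. It is only a sketch under some assumptions: that each index sits in a folder named after the database with a .ft extension next to the .nsf file, and that an index bigger than half its database is suspicious. The function names and the 0.5 threshold are mine, not anything built into Domino, so tune them to taste.

```python
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all files under path, including subfolders."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def oversized_indexes(mail_dir, ratio=0.5):
    """Return (db name, db bytes, index bytes) for every database whose
    index folder is larger than ratio * the size of the database file.

    Assumption: the index for name.nsf lives in a sibling folder name.ft.
    The default ratio of 0.5 is an arbitrary illustrative threshold.
    """
    flagged = []
    for nsf in sorted(Path(mail_dir).glob("*.nsf")):
        ft_dir = nsf.with_suffix(".ft")
        if not ft_dir.is_dir():
            continue  # no full text index on disk for this database
        db_bytes = nsf.stat().st_size
        ft_bytes = dir_size(ft_dir)
        if db_bytes and ft_bytes > ratio * db_bytes:
            flagged.append((nsf.name, db_bytes, ft_bytes))
    return flagged
```

Point oversized_indexes() at a copy of the mail directory listing, or at the directory itself during a quiet period, and treat what it returns as a list of candidates to investigate rather than a verdict.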

Oh, and if you ever have a Windows folder with a ftgi.p00 directory which is showing Access Denied to Administrator, System and cacls changes, the only way to delete that puppy without a 3rd party tool is to boot the server into Safe Mode first.  Trust me and the 4 hours I spent trying to do otherwise!
Comments

1  Ben Perales  18/03/2009 05:39:37  Are you searching too much?

Hi Gabriella,

"If your users are only searching mail message content then do yourself and your server a favour, rebuild the FT indexes you have excluding the indexing of attachments."

This is also happening to our mail servers. I'd like to know how to turn indexing of attachments off. Is there a switch or console command for this?

Looking forward to hearing from you.

Ben

2  Gab Davis  18/03/2009 17:05:23  Are you searching too much?

@Ben, the easiest thing to do is to delete the FT indexes on the server: right-click the top-level folder in Administrator and choose Full Text Index, then Delete (it helps to have Full Access Admin on when you do it). Then re-create them, making sure the option for indexing attached files on the 'Create' dialog isn't selected, and let it go ahead and rebuild.

Hope that helps