Problems with cron/search/indexing

Posts: 2
Joined: 10/11/2008

I'm having some recent search problems with my site.

I recently added about 2,000 products on top of the roughly 3,000 already in my store, bringing the total to about 5,000. After doing so I re-indexed the site and began running cron.php to add the new products to the search index.

The first run of cron takes me to 1% indexed... the second run jumps to around 93%, followed by 94, 95, and so on up to 100%. But at 100%, only about 5% of products actually come up when I search for them. So there seems to be a jump in the reindexing process.

I've tried every option in the setting for the number of items to index per cron run, and I still get the jump.

Is there something I'm missing, or something else I need to clear, to reindex my site?

Please help

Posts: 116
Joined: 04/23/2008

As a general point, from my experience and that of a few others in the forums, the default search engine in Drupal is not well suited to a large number of nodes.

Hence some people use:

- ApacheSolr and its Drupal interface (http://drupal.org/project/apachesolr)
I think the ubercart.org site uses this...

- Sphinx and its Drupal interface (http://drupal.org/project/sphinxsearch)
I use this one...

So long term, it may be something to consider...

But to answer your question, what I did in the past was to change Drupal's search.module so that I could increase the number of nodes indexed in one go to about 50,000. You have to experiment with your environment to get a sense of how many it can handle in a single run, so your number may be different...

You may also have to:
- clear Drupal's cache first,
- reset cron so that it can restart properly, i.e. DELETE FROM `variable` WHERE name = 'cron_semaphore'
- possibly reset the 'cron_last' variable too...
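
In case it helps, the reset steps above could look roughly like this in SQL. This is only a sketch and it rests on assumptions: a Drupal 5/6-style schema, no table prefix, and variables being cached in the `cache` table under the cid 'variables'. Check your own database and take a backup before running anything like it.

```sql
-- Sketch only: assumes a Drupal 5/6-style schema with no table prefix.
-- Back up the database before running.

-- Remove a stale cron lock and the last-run timestamp so cron can start cleanly.
DELETE FROM `variable` WHERE name IN ('cron_semaphore', 'cron_last');

-- Assumption: variables are cached in the `cache` table under cid 'variables',
-- so drop that entry to make Drupal reload them from the `variable` table.
DELETE FROM `cache` WHERE cid = 'variables';
```

After that, run cron.php again and watch whether the indexing percentage climbs steadily.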

Posts: 2357
Joined: 08/07/2007
Administrator, eLiTe!

For future reference, you shouldn't have to reindex your site when you add new content. Drupal should be able to find the new products without throwing away all of the data from the old products.