[Zend Lucene] Updating indexes [message #68233] Fri, 19 December 2008 18:56
HiDDeN
For me, the Lucene system has one problem: if I update or insert one row directly in the database, the indexes are not updated. So, the next time I do a search, it will not find my new rows.

How can I update the indexes with the new rows in the database? The cleanup task explained in Jobeet's day 17 doesn't help, because it's intended for deleted rows.
Re: [Zend Lucene] Updating indexes [message #68259 is a reply to message #68233] Fri, 19 December 2008 23:15
halfer
If by "directly" you mean outside of Propel, then yes, it would become out of date. You could have a daily scheduled index rebuild to deal with this, if writing to the db directly is unavoidable. But the alternative - obviously - is to do your updates with your ORM, and then have your own save method that ensures the index is kept right up to date.
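A minimal sketch of the "own save method" idea, assuming a Propel-style model whose generated base class provides save(). The names BaseArticle, Article and updateSearchIndex() are hypothetical placeholders, and the index update is reduced to a counter so that only the hook itself is shown:

```php
<?php
// Hypothetical stand-in for an ORM-generated base class (e.g. a Propel "Base" class).
class BaseArticle
{
    public function save($con = null)
    {
        // A real ORM would write the row here and return the affected row count.
        return 1;
    }
}

class Article extends BaseArticle
{
    public $indexUpdates = 0;

    // Every save that goes through the ORM also refreshes the search index.
    public function save($con = null)
    {
        $affectedRows = parent::save($con);
        $this->updateSearchIndex();

        return $affectedRows;
    }

    // Placeholder: a real version would delete the old Lucene document and add a new one.
    public function updateSearchIndex()
    {
        $this->indexUpdates++;
    }
}
```

Rows written straight to the database never pass through save(), which is why they would still need the scheduled rebuild.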


Remember Palestine
Re: [Zend Lucene] Updating indexes [message #71980 is a reply to message #68233] Tue, 10 February 2009 08:40
alxkn
Hello,

I have applied the Jobeet search to my application and noticed that when I make changes, the old indexed words are not removed from the index. I think these lines
if ($hit = $index->find('pk:'.$this->getId()))
  {
    $index->delete($hit->id);
  }

in the updateLuceneIndex() function have a bug.

Does anyone know how it could be fixed?

A.

[Updated on: Wed, 11 February 2009 00:24]


http://fmpsv.com/ --revolution in social networking
Re: [Zend Lucene] Updating indexes [message #72935 is a reply to message #71980] Thu, 19 February 2009 16:52
palt
Yeah, the tutorial has lots of bugs.

Try this one:

if ($hits = $index->find('id: '.$this->getId()))
{
  $index->delete($hits[0]->id);
}
Re: [Zend Lucene] Updating indexes [message #72940 is a reply to message #68233] Thu, 19 February 2009 17:01
halfer
If this is a bug, then please raise it on Trac. It's the only place where bugs will get noticed.
Re: [Zend Lucene] Updating indexes [message #72966 is a reply to message #68233] Fri, 20 February 2009 03:52
alxkn
Unfortunately, it did not help. In both cases $index->find() returns an empty result. That is to be expected, because the index contains nothing like pk:1 or id:1. It contains something like 1 .. pk ..

I do not know how to create a ticket, but this is certainly a bug.

[Updated on: Fri, 20 February 2009 05:18]


Re: [Zend Lucene] Updating indexes [message #72986 is a reply to message #68233] Fri, 20 February 2009 11:33
palt
Check that the field where you store your id is a Keyword field. The UnIndexed example is wrong: you search on this id, so the field must be indexed.

And if your IDs are numeric, you have to set the default text analyzer to
Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num_CaseInsensitive());
inside your search peer, because the default analyzer ignores numbers.

With these changes made to your code, you should be able to find your keys and even delete them.
Re: [Zend Lucene] Updating indexes [message #73031 is a reply to message #68233] Fri, 20 February 2009 19:59
alxkn
I have this
$doc = new Zend_Search_Lucene_Document();

// store job primary key URL to identify it in the search results
$doc->addField(Zend_Search_Lucene_Field::UnIndexed('pk', $this->getId()));

$doc->addField(Zend_Search_Lucene_Field::UnStored('title', $this->getTitle(), 'utf-8'));

// add job to the index
$index->addDocument($doc);
$index->commit();

for indexing. I am not sure what you mean by stored in Keyword. That analyzer line helps to index words containing numbers, but unfortunately it does not help to find ids. The ids are already in the .cfs files, but in this form (for id 31 and title fitness):
... 31..pk|title ... ...fitness

So $index->find('pk:31') returns nothing, but $index->find('fitness') works.

[Updated on: Fri, 20 February 2009 20:05]


Re: [Zend Lucene] Updating indexes [message #73032 is a reply to message #73031] Fri, 20 February 2009 20:02
palt
I mean that if you query pk with find('pk:'.$id), you have to declare this field as an indexed field. In the given example it is not, so you cannot search within this field.
So try declaring this field as "Keyword" and you should be able to find the pk.

I hope you can understand what I'm trying to explain.

In code, I mean:

replace this
$doc->addField(Zend_Search_Lucene_Field::UnIndexed('pk', $this->getId()));

with this
$doc->addField(Zend_Search_Lucene_Field::Keyword('pk', $this->getId()));

[Updated on: Fri, 20 February 2009 20:03]

Re: [Zend Lucene] Updating indexes [message #73033 is a reply to message #73032] Fri, 20 February 2009 20:12
alxkn
Awesome, it works. Thanks a lot. I also need to index emails, but they get split at the @ symbol. I would appreciate it if you could let me know how this can be fixed.

Thanks in advance.
A.


Re: [Zend Lucene] Updating indexes [message #73034 is a reply to message #73033] Fri, 20 February 2009 20:18
palt
Have a look at the field types; you need a field that is not tokenized. I don't have them in mind right now, but they are listed clearly in the Lucene documentation.

Cheers,
Peter
Re: [Zend Lucene] Updating indexes [message #74222 is a reply to message #68233] Sat, 07 March 2009 10:45
zero0x

Here's the solution:

Read the first few paragraphs to understand it: http://framework.zend.com/manual/en/zend.search.lucene.searching.html

And this is how to do it:

$term  = new Zend_Search_Lucene_Index_Term($this->getId(), 'pk');
$query = new Zend_Search_Lucene_Search_Query_Term($term);
$hits  = $index->find($query); // find() returns an array of hits

By doing this you skip the whole tokenization step and look for an exact match.

And it works.
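A sketch of how the term query could be combined with the delete step. It assumes the pk field was indexed as Keyword (an UnIndexed field cannot be searched) and that $index is a Zend_Search_Lucene index object; the function name deleteByPk() is a hypothetical helper, not part of the library:

```php
<?php
/**
 * Delete every document whose 'pk' field holds exactly $id.
 * $index is assumed to be a Zend_Search_Lucene index object (find() and delete()).
 * Returns the number of documents marked as deleted.
 */
function deleteByPk($index, $id)
{
    // Term query: the value is matched as-is, without passing through the analyzer.
    $term  = new Zend_Search_Lucene_Index_Term((string) $id, 'pk');
    $query = new Zend_Search_Lucene_Search_Query_Term($term);

    $deleted = 0;
    // find() returns an array of hits, so loop rather than reading ->id off the result.
    foreach ($index->find($query) as $hit) {
        $index->delete($hit->id);
        $deleted++;
    }

    return $deleted;
}
```

In a model, this would typically be called at the top of updateLuceneIndex(), before the fresh document is added with addDocument().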


ubuntu rules
Re: [Zend Lucene] Updating indexes [message #74295 is a reply to message #68233] Sun, 08 March 2009 02:55
alxkn
I tried this, but when I search for yahoo.com I get all users with these emails.
For the keyword abcd1 I get the users with the emails abcd@yahoo.com and abcd1@msn.com.


Re: [Zend Lucene] Updating indexes [message #74312 is a reply to message #68233] Sun, 08 March 2009 11:01
wissl
I believe that is because the analyzer splits strings at special characters like '@', '-', etc. If you do not want that to happen you will have to use a different analyzer, maybe a customized one...
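To make the splitting visible, here is a rough, library-free sketch of what a tokenizing analyzer does to an email address. The function tokenize() and both patterns are illustrative assumptions that only approximate letters-only and letters-plus-digits analyzers; they are not Zend's code:

```php
<?php
// Simplified stand-in for an analyzer: returns the lower-cased tokens matching $pattern.
// The patterns below approximate letters-only and letters-plus-digits tokenization.
function tokenize($text, $pattern)
{
    preg_match_all($pattern, strtolower($text), $matches);

    return $matches[0];
}

$lettersOnly      = '/[a-z]+/';
$lettersAndDigits = '/[a-z0-9]+/';

print_r(tokenize('abcd1@msn.com', $lettersOnly));      // abcd, msn, com
print_r(tokenize('abcd1@msn.com', $lettersAndDigits)); // abcd1, msn, com
print_r(tokenize('31', $lettersOnly));                 // no tokens at all
```

If a letters-only analyzer is in effect, abcd1 is reduced to abcd, which would explain why it matches both abcd@yahoo.com and abcd1@msn.com, and yahoo.com becomes the two tokens yahoo and com. Keeping the address in a non-tokenized field and matching it with a term query is one way to avoid the split.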
Re: [Zend Lucene] Updating indexes [message #76513 is a reply to message #71980] Wed, 08 April 2009 11:30
marduc
While we're at it, I have a similar question regarding this one:

alxkn wrote on Tue, 10 February 2009 08:40

if ($hit = $index->find('pk:'.$this->getId()))
  {
    $index->delete($hit->id);
  }

Say, if I add data to my index like this

$doc->addField(Zend_Search_Lucene_Field::Keyword('pk', $this->getId()));

# Index post data
$doc->addField(Zend_Search_Lucene_Field::UnStored('title',      $this->getTitle(), 'utf-8'));

$doc->addField(Zend_Search_Lucene_Field::UnStored('slug',       $this->getSlug(), 'utf-8'));

$doc->addField(Zend_Search_Lucene_Field::UnStored('content',    $this->getContent(), 'utf-8'));


If I use the previous snippet and only call $index->delete($hit->id), does this actually delete the contents of title, slug, and content, so that I can re-add the whole document? In other words, if the save() method for my post calls $index->delete($hit->id) before re-adding the document, will the title, slug, and content be updated?

I am asking out of curiosity, as both http://framework.zend.com/manual/en/zend.search.lucene.index-creation.html and this one suggest using a foreach loop to delete all related fields.

What did I miss?
Re: [Zend Lucene] Updating indexes [message #76518 is a reply to message #76513] Wed, 08 April 2009 12:50
marduc
So, I did some more testing on this, and to completely update the index for my posts when some of the mentioned fields change, I have to do this:

// remove an existing entry
if ($hits = $index->find('pk:'.$this->getId()))
{
  foreach ($hits as $hit)
  {
    $index->delete($hit->id);
  }
}


Any comments on this one?
Re: [Zend Lucene] Updating indexes [message #76734 is a reply to message #68233] Mon, 13 April 2009 12:44
freakx0
I have the same problem.
With the example search/indexing from Jobeet it doesn't work.

I can't find an entry via "$index->find('pk:'.$this->getId())".

I tried using Zend_Search_Lucene_Field::Keyword instead of Zend_Search_Lucene_Field::UnIndexed, but it doesn't work.

Any ideas, anybody?

dominik
Re: [Zend Lucene] Updating indexes [message #76777 is a reply to message #68233] Tue, 14 April 2009 09:15
wissl
Have you looked into the index files with Luke (http://www.getopt.org/luke/)? The first thing we need to know is whether the data is indexed correctly.
Re: [Zend Lucene] Updating indexes [message #76815 is a reply to message #68233] Tue, 14 April 2009 17:03
freakx0
Nice tool for debugging. But it doesn't help me with that problem.

I've added my index as an attachment.
