Package translate :: Package search :: Package indexing :: Module PyLuceneIndexer :: Class PyLuceneDatabase

Class PyLuceneDatabase


manage and use a PyLucene indexing database

Instance Methods

__init__(self, basedir, analyzer=None, create_allowed=True)
initialize or open an indexing database

__del__(self)
remove lock and close writer after losing the last reference

flush(self, optimize=False)
flush the content of the database - to force changes to be written to disk

_create_query_for_query(self, query)
generate a query based on an existing query object
Returns: PyLucene.Query

_create_query_for_string(self, text, require_all=True, analyzer=None)
generate a query for a plain term of a string query
Returns: PyLucene.Query

_create_query_for_field(self, field, value, analyzer=None)
generate a field query
Returns: PyLucene.Query

_create_query_combined(self, queries, require_all=True)
generate a combined query
Returns: PyLucene.Query

_create_empty_document(self)
create an empty document to be filled and added to the index later
Returns: PyLucene.Document

_add_plain_term(self, document, term, tokenize=True)
add a term to a document

_add_field_term(self, document, field, term, tokenize=True)
add a field term to a document

_add_document_to_index(self, document)
add a prepared document to the index database

begin_transaction(self)
PyLucene does not support transactions

cancel_transaction(self)
PyLucene does not support transactions

commit_transaction(self)
PyLucene does not support transactions

get_query_result(self, query)
return an object containing the results of a query
Returns: subclass of CommonEnquire

delete_document_by_id(self, docid)
delete a specified document

search(self, query, fieldnames)
return a list of the contents of specified fields for all matches of a query
Returns: list of dicts

_delete_stale_lock(self)

_writer_open(self)
open write access for the indexing database and acquire an exclusive lock

_writer_close(self)
close indexing write access and remove the database lock

_writer_is_open(self)
check if the indexing write access is currently open

_index_refresh(self)
re-read the indexer database

Inherited from CommonIndexer.CommonDatabase: delete_doc, get_field_analyzers, index_document, make_query, set_field_analyzers

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Variables
  QUERY_TYPE = PyLucene.Query
override this with the query class of the implementation
  INDEX_DIRECTORY_NAME = "lucene"
override this with a string to be used as the name of the indexing directory/file in the filesystem

Inherited from CommonIndexer.CommonDatabase: ANALYZER_DEFAULT, ANALYZER_EXACT, ANALYZER_PARTIAL, ANALYZER_TOKENIZE, field_analyzers

Properties

Inherited from object: __class__

Method Details

__init__(self, basedir, analyzer=None, create_allowed=True)
(Constructor)

initialize or open an indexing database

Any derived class must override __init__.

Parameters:
  • basedir (str) - the parent directory of the database
  • analyzer (int) - bitwise combination of analyzer flags to be used as the default analyzer for this database. Leave it as None to use the system default analyzer (self.ANALYZER_DEFAULT). See self.ANALYZER_TOKENIZE, self.ANALYZER_PARTIAL, ...
  • create_allowed (bool) - create the database, if necessary; default: True
Raises:
  • ValueError - the given location exists, but the database type is incompatible (e.g. created by a different indexing engine)
  • OSError - the database failed to initialize
Overrides: object.__init__
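
The analyzer parameter above is described as a bitwise combination of flags. The following is a minimal sketch of what such a combination looks like. The numeric values are made up for illustration; the real ANALYZER_* constants are defined on CommonIndexer.CommonDatabase and are not shown on this page. The helper describe_analyzer is hypothetical as well.

```python
# Hypothetical flag values, for illustration only. The real constants live
# on CommonIndexer.CommonDatabase and may use different values.
ANALYZER_EXACT = 0
ANALYZER_PARTIAL = 1 << 1
ANALYZER_TOKENIZE = 1 << 2


def describe_analyzer(flags):
    """Return the names of the flags set in a combined analyzer value."""
    names = []
    if flags & ANALYZER_PARTIAL:
        names.append("partial")
    if flags & ANALYZER_TOKENIZE:
        names.append("tokenize")
    # With no bits set, only the (assumed) zero-valued "exact" mode remains.
    return names or ["exact"]


# Combine two flags with bitwise OR, as the parameter description suggests.
combined = ANALYZER_TOKENIZE | ANALYZER_PARTIAL
```

A value built this way would be passed as the analyzer argument; individual flags can be tested again with bitwise AND, as describe_analyzer does.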

flush(self, optimize=False)

flush the content of the database - to force changes to be written to disk

some databases also support index optimization

Parameters:
  • optimize (bool) - should the index be optimized if possible?
Overrides: CommonIndexer.CommonDatabase.flush

_create_query_for_query(self, query)

generate a query based on an existing query object

basically this function should just create a copy of the original

Parameters:
  • query (PyLucene.Query) - the original query object
Returns: PyLucene.Query
resulting query object
Overrides: CommonIndexer.CommonDatabase._create_query_for_query

_create_query_for_string(self, text, require_all=True, analyzer=None)

generate a query for a plain term of a string query

basically this function parses the string and returns the resulting query

Parameters:
Returns: PyLucene.Query
resulting query object
Overrides: CommonIndexer.CommonDatabase._create_query_for_string

_create_query_for_field(self, field, value, analyzer=None)

generate a field query

this function creates a field->value query

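Parameters:
  • field (str) - name of the field
  • value (str) - the value to be matched in that field
  • analyzer (int) - the analyzer flags to be used; see self.ANALYZER_TOKENIZE, self.ANALYZER_PARTIAL, ...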
Returns: PyLucene.Query
resulting query object
Overrides: CommonIndexer.CommonDatabase._create_query_for_field

_create_query_combined(self, queries, require_all=True)

generate a combined query

Parameters:
  • queries (list of PyLucene.Query) - list of the original queries
  • require_all (bool) - boolean operator (True -> AND (default) / False -> OR)
Returns: PyLucene.Query
the resulting combined query object
Overrides: CommonIndexer.CommonDatabase._create_query_combined
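
The effect of require_all can be illustrated without PyLucene. The sketch below is not the PyLucene implementation: it models each query as a plain Python predicate over a set of terms, and the helper combine is a hypothetical stand-in for the documented AND/OR behaviour.

```python
def combine(queries, require_all=True):
    """Combine predicate-style queries with AND (default) or OR."""
    joiner = all if require_all else any
    return lambda doc: joiner(query(doc) for query in queries)


# Two toy "queries": each checks whether a term occurs in a document,
# where a document is modelled as a set of terms.
def has_cat(doc):
    return "cat" in doc


def has_dog(doc):
    return "dog" in doc


both = combine([has_cat, has_dog])                       # AND
either = combine([has_cat, has_dog], require_all=False)  # OR
```

Here both matches only documents containing every term, while either matches documents containing at least one of them.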

_create_empty_document(self)

create an empty document to be filled and added to the index later

Returns: PyLucene.Document
the new document object
Overrides: CommonIndexer.CommonDatabase._create_empty_document

_add_plain_term(self, document, term, tokenize=True)

add a term to a document

Parameters:
  • document (PyLucene.Document) - the document to be changed
  • term (str) - a single term to be added
  • tokenize (bool) - should the term be tokenized automatically
Overrides: CommonIndexer.CommonDatabase._add_plain_term

_add_field_term(self, document, field, term, tokenize=True)

add a field term to a document

Parameters:
  • document (PyLucene.Document) - the document to be changed
  • field (str) - name of the field
  • term (str) - term to be associated to the field
  • tokenize (bool) - should the term be tokenized automatically
Overrides: CommonIndexer.CommonDatabase._add_field_term

_add_document_to_index(self, document)

add a prepared document to the index database

Parameters:
  • document (PyLucene.Document) - the document to be added
Overrides: CommonIndexer.CommonDatabase._add_document_to_index

begin_transaction(self)

PyLucene does not support transactions

Thus this function just opens the database for write access. Call "cancel_transaction" or "commit_transaction" to close write access in order to remove the exclusive lock from the database directory.

Overrides: CommonIndexer.CommonDatabase.begin_transaction

cancel_transaction(self)

PyLucene does not support transactions

Thus this function just closes the database write access and removes the exclusive lock.

See 'begin_transaction' for details.

Overrides: CommonIndexer.CommonDatabase.cancel_transaction

commit_transaction(self)

PyLucene does not support transactions

Thus this function just closes the database write access and removes the exclusive lock.

See 'begin_transaction' for details.

Overrides: CommonIndexer.CommonDatabase.commit_transaction
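
Since the three transaction methods only open and close write access, a caller mainly has to make sure the lock is released on every code path. The sketch below shows one way to do that. FakeDatabase is a stand-in that mimics only the documented lock behaviour, and the helper update is hypothetical; neither is part of the module.

```python
class FakeDatabase:
    """Stand-in that mimics only the documented open/close behaviour."""

    def __init__(self):
        self.writer_open = False

    def begin_transaction(self):
        # Documented effect: open write access and hold the exclusive lock.
        self.writer_open = True

    def commit_transaction(self):
        # Documented effect: close write access and remove the lock.
        self.writer_open = False

    def cancel_transaction(self):
        # Documented as doing the same as commit_transaction.
        self.writer_open = False


def update(db, work):
    """Run work() between begin and commit, releasing the lock on errors."""
    db.begin_transaction()
    try:
        work()
    except Exception:
        db.cancel_transaction()
        raise
    db.commit_transaction()
```

Note that this page gives no indication that cancel_transaction rolls anything back; in this pattern it serves only to release write access after a failure.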

get_query_result(self, query)

return an object containing the results of a query

Parameters:
  • query (a query object of the real implementation) - a pre-compiled query
Returns: subclass of CommonEnquire
an object that allows access to the results
Overrides: CommonIndexer.CommonDatabase.get_query_result

delete_document_by_id(self, docid)

delete a specified document

Parameters:
  • docid (int) - the document ID to be deleted
Overrides: CommonIndexer.CommonDatabase.delete_document_by_id

search(self, query, fieldnames)

return a list of the contents of specified fields for all matches of a query

Parameters:
  • query (a query object of the real implementation) - the query to be issued
  • fieldnames (string | list of strings) - the name(s) of a field of the document content
Returns: list of dicts
a list of dicts containing the specified field(s)
Overrides: CommonIndexer.CommonDatabase.search
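
The shape of the return value (one dict per match, restricted to the requested fields) can be sketched in plain Python. This is an illustration under assumptions, not the module's code: matches are modelled as ordinary dicts, field values as plain strings (the page does not say whether the real values are strings or lists), and collect_fields is a hypothetical helper.

```python
def collect_fields(matches, fieldnames):
    """Build one dict per match containing only the requested fields."""
    # The documented parameter accepts a single name or a list of names.
    if isinstance(fieldnames, str):
        fieldnames = [fieldnames]
    return [
        {name: match.get(name) for name in fieldnames}
        for match in matches
    ]


# Toy matches standing in for the documents found by a query.
matches = [
    {"source": "hello", "target": "hallo", "location": "a.po"},
    {"source": "world", "target": "Welt", "location": "b.po"},
]

rows = collect_fields(matches, ["source", "target"])
```

Each entry of rows is a dict keyed by the requested field names, which mirrors the "list of dicts" return type given above.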