ham_runtime_statistics_dbdata_t Struct Reference

#include <hamsterdb_stats.h>

Data Fields

ham_runtime_statistics_opdbdata_t op [HAM_OPERATION_STATS_MAX]
ham_u32_t rescale_tracker
ham_key_t lower_bound
ham_u32_t lower_bound_index
ham_offset_t lower_bound_page_address
ham_bool_t lower_bound_set
ham_key_t upper_bound
ham_u32_t upper_bound_index
ham_offset_t upper_bound_page_address
ham_bool_t upper_bound_set

Detailed Description

Definition at line 335 of file hamsterdb_stats.h.


Field Documentation

Remember the upper and lower bound kays for this database; update them when we insert a new key, maybe even update them when we delete/erase a key.

These bounds are collected on the fly while searching (find()): they are stored in here as soon as a find() operation hits either the lower or upper bound of the key range stored in the database.

The purpose of storing these bounds is to speed up out-of-bounds key searches significantly: by comparing incoming keys with these bounds, we can immediately tell whether a key will have a change of being found or not, thus precluding the need to traverse the btree - which would produce the same answer in the end anyhow.

WARNING: having these key (copies) in here means we'll need to clean them up when we close the database connection, or we'll risk leaking memory in the key->data here.

NOTE #1: this is the humble beginning of what in a more sophisticated database server system would be called a 'histogram' (Oracle, etc.). Here we don't spend the effort to collect data for a full histogram, but merely collect info about the extremes of our stored key range.

NOTE #2: I'm pondering whether this piece of statistics gathering should be allowed to be turned off by the user 'because he knows best' where premium run-time performance requirements are at stake. Yet... The overhead here is a maximum of two key comparisons plus 2 key copies (which can be significant when we're talking about extended keys!) when we're producing find/insert/erase results which access a btree leaf node which is positioned at the upper/lower edge of the btree key range.

Hence, worst case happens for sure with tiny databases, as those will have ONE btree page only (root=leaf!) and the worst case is reduced to 1 key comparison + 1 key copy for any larger database, which spans two btree pages or more.

To further reduce the worst case overhead, we also store the within-btree-node index of the upper/lower bound key: when this does not change, there is no need to compare the key - unless the key is overwritten, which is a special case of the insert operation.

Warning:
The key data is allocated using the ham_db_t allocator and the key data must be freed before the related ham_db_t handle is closed or deleted.

Definition at line 406 of file hamsterdb_stats.h.

Definition at line 407 of file hamsterdb_stats.h.

Definition at line 408 of file hamsterdb_stats.h.

Definition at line 409 of file hamsterdb_stats.h.

Definition at line 338 of file hamsterdb_stats.h.

common rescale tracker as the rescaling is done on all operations data at once, so these remain 'balanced'.

Fringe case consideration: when there's, say, a lot of FIND going on with a few ERASE operations in between, is it A Bad Thing that the ERASE stats risc getting rescaled to almost nil then? Answer: NO. Because there's a high probability that the last ERASE btree leaf node isn't in cache anymore anyway -- unless it's the same one as used by FIND.

The reason we keep track of 3 different leaf nodes is only so we can supply good hinting in scanerios where FIND, INSERT and/or ERASE are mixed in reasonable ratios; keeping track of only a single btree leaf would deny us some good hinting for the other operations.

Definition at line 356 of file hamsterdb_stats.h.

Definition at line 410 of file hamsterdb_stats.h.

Definition at line 411 of file hamsterdb_stats.h.

Definition at line 412 of file hamsterdb_stats.h.

Definition at line 413 of file hamsterdb_stats.h.


The documentation for this struct was generated from the following file:

Generated on Thu Feb 11 22:04:58 2010 for hamsterdb Embedded Database by  doxygen 1.6.1