An epoch number is the total number of operations that originated
at a particular replica.
In Figure 2 from Operation tracking for each replica, the epoch number for boston_hub is 950.
The MultiSite synchronization scheme attempts to minimize the amount of data transmitted among replicas.
Each replica keeps track of these epoch numbers:
- Changes made in the current replica. The number of operations that
originated at the current replica.
- Changes at sibling replicas that have been imported to the current
replica. When syncreplica writes an operation from an update
packet to the current replica, it increments the epoch number that records
the number of operations originating at the sibling replica that have been
imported at the current replica.
- Estimates of the states of other replicas. For each other replica,
an estimate of its own changes and other replicas’ changes. The current replica
keeps track of the operations it has sent to other replicas, and assumes that
these operations are imported successfully.
Table 1 shows how
these epoch numbers fall into an epoch number matrix. Each replica maintains
its own such matrix, revising its rows as work occurs locally and as it exchanges
update packets with other replicas:
- When work occurs in the boston_hub replica, its own epoch number
is incremented.
- When the boston_hub replica receives an update from sanfran_hub,
it revises its own row (boston_hub) and the sanfran_hub row
in its epoch number matrix.
- When the boston_hub replica generates an update packet to be sent
to sanfran_hub, it revises the sanfran_hub row in its epoch
number matrix.
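
The three revision rules above can be illustrated with a small in-memory model. This is only a sketch for reasoning about the matrix, not how MultiSite stores or updates epoch numbers internally: the class name `EpochMatrix` and its methods are hypothetical, and the model assumes that an import carries only operations that originated at the sending replica.

```python
class EpochMatrix:
    """Hypothetical model of the epoch number matrix kept by one replica.

    rows[r][origin] is the number of operations that originated at
    `origin` and that this replica believes are present at replica `r`.
    """

    def __init__(self, local, replicas):
        self.local = local
        self.rows = {r: {origin: 0 for origin in replicas} for r in replicas}

    def record_local_work(self, count=1):
        # Work in the local replica increments its own epoch number.
        self.rows[self.local][self.local] += count

    def record_import(self, sender, last_op):
        # Importing an update from `sender` revises the local row ...
        own = self.rows[self.local]
        own[sender] = max(own[sender], last_op)
        # ... and the sender's row: the sender holds at least what it sent.
        theirs = self.rows[sender]
        theirs[sender] = max(theirs[sender], last_op)

    def record_export(self, target):
        # Exporting revises the target's row immediately, on the assumption
        # that the packet will be imported successfully.
        self.rows[target][self.local] = self.rows[self.local][self.local]


# Illustrative usage with the replica names from this topic.
m = EpochMatrix("boston_hub", ["boston_hub", "sanfran_hub"])
m.record_local_work(950)             # 950 operations originate at boston_hub
m.record_import("sanfran_hub", 504)  # import sanfran_hub operations through 504
m.record_export("sanfran_hub")       # send an update packet to sanfran_hub
print(m.rows)
```

Because `record_export` does not wait for any acknowledgment, the sanfran_hub row in this model is only an estimate, which mirrors the behavior described for the real matrix.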
A syncreplica -export command updates epoch numbers immediately. It does not wait for
acknowledgment from the importing replica that the packet has been received and applied correctly.
During normal MultiSite processing, manual intervention is not necessary to maintain the accuracy
of the epoch number matrices for the various replicas. However, if a packet is not applied
at the importing replica, the exporting replica's estimate no longer matches the actual state
of that replica, and manual intervention may be required.
Table 1. Two-row epoch number matrix at replica boston_hub
| | Operations originated at boston_hub | Operations originated at sanfran_hub |
|---|---|---|
| boston_hub’s record of its own state | 950 | 504 |
| boston_hub’s estimate of sanfran_hub’s state | 912 | 504 |
The contents of this matrix are reported by the lsepoch command at the boston_hub replica:
multiutil lsepoch -clan telecomm -site boston_hub -family PRODA -user bostonadmin -password secret
Multiutil: Estimates of the epochs from each site replayed at site ’boston_hub’ (@minuteman):
boston_hub: 950
sanfran_hub: 504
Multiutil: Estimates of the epochs from each site replayed at site ’sanfran_hub’ (@goldengate):
boston_hub: 912
sanfran_hub: 504
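
For scripting, the report can be read back into a nested mapping. The following sketch assumes the output always has the layout shown in the sample above (a header line that names the site, followed by one `name: number` line for each originating replica). That layout is inferred from this one example only, and the function name `parse_lsepoch` is hypothetical.

```python
import re

# Header line: ... replayed at site 'boston_hub' (@minuteman):
# The quote characters may be straight or typographic, so accept both.
HEADER = re.compile(r"replayed at site\s+['‘’](?P<site>[^'‘’]+)['‘’]")
# Entry line: boston_hub: 950
ENTRY = re.compile(r"^\s*(?P<origin>\w+):\s*(?P<epoch>\d+)\s*$")


def parse_lsepoch(text):
    """Return {site: {origin: epoch}} parsed from lsepoch-style output."""
    matrix = {}
    current = None
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            current = matrix.setdefault(header.group("site"), {})
            continue
        entry = ENTRY.match(line)
        if entry and current is not None:
            current[entry.group("origin")] = int(entry.group("epoch"))
    return matrix


SAMPLE = """\
Multiutil: Estimates of the epochs from each site replayed at site ’boston_hub’ (@minuteman):
boston_hub: 950
sanfran_hub: 504
Multiutil: Estimates of the epochs from each site replayed at site ’sanfran_hub’ (@goldengate):
boston_hub: 912
sanfran_hub: 504
"""

matrix = parse_lsepoch(SAMPLE)
print(matrix["sanfran_hub"]["boston_hub"])  # 912
```

Each outer key corresponds to one row of Table 1, and each inner key to one column.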
A syncreplica -export command entered at boston_hub uses this matrix as follows
to generate an update destined for sanfran_hub:
- At the boston_hub replica, the number of local operations is 950
(the number in the upper left corner of the matrix), and the estimate is that the sanfran_hub replica
has imported all operations through oplog ID 912 (the number in the lower left
corner).
- The update packet that the boston_hub replica sends to the sanfran_hub replica
includes boston_hub oplog entries 913-950. After the Boston administrator
invokes syncreplica -export, the sanfran_hub row is
updated:
multiutil lsepoch -clan telecomm -site boston_hub -family PRODA -user bostonadmin -password secret
Multiutil: Estimates of the epochs from each site replayed at site ’boston_hub’ (@minuteman):
boston_hub: 950
sanfran_hub: 504
Multiutil: Estimates of the epochs from each site replayed at site ’sanfran_hub’ (@goldengate):
boston_hub: 950
sanfran_hub: 504
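
The arithmetic behind this example can be sketched in a few lines. The function name `plan_export` and the dictionary layout are hypothetical, and the sketch considers only operations that originated at the exporting replica, as in the walkthrough above.

```python
def plan_export(matrix, local, target):
    """Return the (first, last) local oplog IDs to send to `target`.

    Returns None when the target is already estimated to be up to date.
    As a side effect, the target's row is revised immediately, without
    waiting for any acknowledgment that the packet was imported.
    """
    have = matrix[local][local]       # operations that originated locally
    believed = matrix[target][local]  # estimate of what the target has imported
    if believed >= have:
        return None
    matrix[target][local] = have      # optimistic update of the estimate
    return believed + 1, have


# State of the matrix at boston_hub before the export (Table 1).
matrix = {
    "boston_hub": {"boston_hub": 950, "sanfran_hub": 504},
    "sanfran_hub": {"boston_hub": 912, "sanfran_hub": 504},
}

print(plan_export(matrix, "boston_hub", "sanfran_hub"))  # (913, 950)
print(matrix["sanfran_hub"]["boston_hub"])               # 950
print(plan_export(matrix, "boston_hub", "sanfran_hub"))  # None
```

The second call returns None because the estimate was already advanced to 950 by the first export, whether or not the packet was actually applied at sanfran_hub.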