Difference between revisions of "Projects/Hierarchical iprop"

Revision as of 15:53, 19 January 2014

This is an early stage project for MIT Kerberos. It is being fleshed out by its proponents. Feel free to help flesh out the details of this project. After the project is ready, it will be presented for review and approval.


This project is targeted at release 1.13.


Description

This project adds the ability for incremental propagation slaves to act as masters for other slaves, forming a hierarchy of masters and slaves. This feature can be useful for controlling propagation load in environments with many slaves, or to address incomplete network connectivity between masters and slaves. The project is based on an existing implementation by Richard Basch.

Existing Design and Code

Since release 1.7, MIT krb5 has had support for incremental propagation for changes to principal information. The slave runs a kpropd daemon in standalone mode, which periodically polls for updates from kadmind on the master, using a GSSRPC-based protocol. Updates are normally transmitted to the slave as marshalled operations in the response to the polling RPC. If the slave drifts too far out of date or if there have been changes to policies on the master, then a full dump is performed from the master to the slave using kprop.

The iprop code experienced numerous changes between 1.11 and 1.12; this section describes the code as it exists in 1.12.

The master and slave store the state needed for incremental propagation in a memory-mapped file called the ulog, which contains a header and a circular array of fixed-size update entries. Each update entry has a timestamp and a serial number; serial numbers begin at 1 and increment for each update. The ulog header contains:

  • kdb_hmagic: A magic number (to detect corruption)
  • db_version_num: A version number (currently set to 1 and unused by readers)
  • kdb_num: The number of updates in the ulog
  • kdb_first_time: The timestamp of the ulog's first entry
  • kdb_last_time: The timestamp of the ulog's last entry
  • kdb_first_sno: The serial number of the ulog's first entry
  • kdb_last_sno: The serial number of the ulog's last entry
  • kdb_state: The stability state of the ulog
  • kdb_block: The entry size (normally 2048)
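As a reading aid, the header fields listed above can be modelled as a small record. This is only a sketch: the Python types, the placeholder magic value, the single-integer timestamps, and the header_size argument are assumptions made for illustration and do not reproduce the on-disk layout used by the C code.

```python
from dataclasses import dataclass


@dataclass
class UlogHeader:
    """Illustrative model of the ulog header; real field widths are not modelled."""
    kdb_hmagic: int = 0       # magic number (placeholder, not the real constant)
    db_version_num: int = 1   # currently 1, unused by readers
    kdb_num: int = 0          # number of updates in the ulog
    kdb_first_time: int = 0   # timestamp of the first entry
    kdb_last_time: int = 0    # timestamp of the last entry
    kdb_first_sno: int = 0    # serial number of the first entry
    kdb_last_sno: int = 0     # serial number of the last entry
    kdb_state: int = 0        # stability state (encoding not modelled)
    kdb_block: int = 2048     # entry size in bytes


def entry_offset(header_size: int, hdr: UlogHeader, index: int) -> int:
    """Byte offset of array slot `index`, assuming entries directly follow the header."""
    return header_size + index * hdr.kdb_block
```

Because the entries are fixed-size, locating a slot is a multiplication by kdb_block; the number of slots is not part of the header and has to come from configuration.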

The size of the circular entry array is determined by the iprop_master_ulogsize krb5.conf variable, which is not recorded in the header. The kdb_num, kdb_first_sno, and kdb_last_sno fields are redundant; this redundancy can be used to detect if the ulogsize parameter changed since the last time the ulog was updated.

When the ulog is initialized or reinitialized (with kproplog -R), the kdb_num, kdb_first_time, kdb_first_sno, and kdb_last_sno fields are set to 0, and the kdb_last_time field is set to the time of initialization. The first database update after ulog initialization increments the kdb_num, kdb_first_sno, and kdb_last_sno fields to 1, and sets the kdb_first_time and kdb_last_time fields to the time of the update.
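The header transitions described in this paragraph can be sketched as follows. The class and function names (Header, reset, apply_first_update) are hypothetical, timestamps are plain integers, and only the reset and the first update after it are modelled.

```python
from dataclasses import dataclass


@dataclass
class Header:
    kdb_num: int = 0
    kdb_first_sno: int = 0
    kdb_last_sno: int = 0
    kdb_first_time: int = 0
    kdb_last_time: int = 0


def reset(hdr: Header, now: int) -> None:
    """Model of (re)initialization: everything is zeroed except kdb_last_time."""
    hdr.kdb_num = 0
    hdr.kdb_first_sno = 0
    hdr.kdb_last_sno = 0
    hdr.kdb_first_time = 0
    hdr.kdb_last_time = now


def apply_first_update(hdr: Header, now: int) -> None:
    """Model of the first database update after a reset."""
    hdr.kdb_num += 1
    hdr.kdb_first_sno += 1
    hdr.kdb_last_sno += 1
    hdr.kdb_first_time = now
    hdr.kdb_last_time = now
```

After reset(hdr, 100) followed by apply_first_update(hdr, 150), the three counters are all 1 and both timestamps are 150.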

The kdb_first_sno and kdb_first_time fields are used by the following code operations on the master KDC:

  • ulog_reset sets both fields to zero.
  • ulog_add_update, on the first update after initialization, sets kdb_first_sno to 1 and kdb_first_time to the time of the update.
  • ulog_get_entries uses kdb_first_sno to detect if the slave's last update is no longer in the ulog. kdb_first_time is not used here; instead, the timestamp of the ulog entry for the slave's last update sno is checked against the slave's last update time.
  • kdb5_util load uses kdb_first_sno and kdb_first_time to detect if the current dump's last update serial number is still in the ulog.
  • kproplog prints both fields when displaying the header status.

The kdb_last_sno and kdb_last_time fields are used by the following code operations on the master KDC:

  • ulog_reset sets kdb_last_sno to zero and kdb_last_time to the current time.
  • ulog_add_update sets both fields to the serial number and timestamp of the update being added.
  • ulog_map detects if ulogentries has decreased after the ulog circled, by checking if kdb_last_sno > kdb_num but kdb_num != ulogentries. The ulog header is reinitialized in this case.
  • ulog_get_entries compares kdb_last_sno and kdb_last_time to the RPC request parameters to see if the slave is up to date. If the slave's serial number is higher, or if it matches kdb_last_sno but the timestamp does not match, the slave is told to perform a full resync because the master's ulog was reinitialized since the last slave update.
  • ulog_get_entries uses kdb_last_sno to count and bound the updates to send to the slave, and uses both fields to set the lastentry parameter of the RPC response.
  • kdb5_util dump checks whether kdb_last_sno is nonzero to detect if the ulog is empty (and therefore does not contain the current dump's last update).
  • kdb5_util dump writes both fields into the dump's iprop header.
  • kdb5_util load blocks non-iprop loads if kdb_last_sno is nonzero.
  • kproplog prints both fields when displaying the header status.
  • kproplog uses kdb_last_sno as the end bound of the loop when displaying entries, and to compute the starting index when a -e parameter is given.
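The ulog_get_entries checks described in the lists above can be summarized in a short decision function. This is a sketch under assumptions, not the C implementation: the names are hypothetical, entry timestamps are supplied as a dictionary keyed by serial number, and it is assumed that a slave whose last update has dropped out of the ulog, or whose timestamp disagrees with the matching ulog entry, is told to do a full resync.

```python
from enum import Enum


class Reply(Enum):
    UP_TO_DATE = "up to date"
    FULL_RESYNC = "full resync"
    SEND_UPDATES = "send updates"


def classify_poll(first_sno, last_sno, last_time, entry_times, slave_sno, slave_time):
    """Return (reply, serial numbers to send) for one polling request.

    entry_times maps each serial number still in the ulog to its timestamp.
    """
    if slave_sno > last_sno:
        # The slave is ahead of the master: the master's ulog was reinitialized.
        return Reply.FULL_RESYNC, []
    if slave_sno == last_sno:
        if slave_time != last_time:
            return Reply.FULL_RESYNC, []
        return Reply.UP_TO_DATE, []
    if slave_sno < first_sno:
        # The slave's last update is no longer in the ulog.
        return Reply.FULL_RESYNC, []
    if entry_times.get(slave_sno) != slave_time:
        # Assumed outcome: a mismatched entry timestamp forces a full resync.
        return Reply.FULL_RESYNC, []
    # kdb_last_sno bounds the updates sent to the slave.
    return Reply.SEND_UPDATES, list(range(slave_sno + 1, last_sno + 1))
```

For a ulog holding serial numbers 3 through 5, a slave reporting serial number 3 with the matching timestamp would be sent updates 4 and 5.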

The kdb_num field is used by the following code operations on the master KDC:

  • ulog_reset sets it to zero.
  • ulog_add_update increments it if it is less than ulogentries (i.e. if the ulog is not yet circling).
  • ulog_map detects if ulogentries has shrunk too much prior to the ulog circling, by checking if kdb_num > ulogentries. The ulog header is reinitialized in this case.
  • ulog_map detects if ulogentries might have grown by checking if kdb_num < ulogentries (which can also be true if the ulog simply has not circled yet). In this case, the file size is checked and the file is extended with zero bytes if required.
  • kproplog displays it as part of the header status.
  • kproplog checks if kdb_num is zero before printing update entries.
  • kproplog ignores the -e argument if it is equal to or greater than kdb_num.
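The ulog_map size checks from the kdb_last_sno and kdb_num lists can be sketched as a single function. The function name and the string results are hypothetical, and the order in which the conditions are tested is an assumption; the conditions themselves are the ones stated in the lists above.

```python
def check_ulog_size(kdb_num: int, kdb_last_sno: int, ulogentries: int) -> str:
    """Classify what a map-time size check would do with the current header."""
    if kdb_num > ulogentries:
        # ulogentries shrank too much before the ulog circled.
        return "reinitialize"
    if kdb_last_sno > kdb_num and kdb_num != ulogentries:
        # The ulog had circled under a different ulogentries value.
        return "reinitialize"
    if kdb_num < ulogentries:
        # Either the ulog has not circled yet or ulogentries grew;
        # the file size is checked and the file extended if required.
        return "check file size"
    return "ok"
```

On a master, kdb_last_sno exceeds kdb_num only once the ulog has circled, which is why the second test can infer a size change from header fields alone.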

Slaves do not use the update entry array and only make use of two header fields, kdb_last_sno and kdb_last_time, to track the most recently received update from the master. The following code operations use the two fields:

  • ulog_finish_update_slave, called from ulog_replay, sets both fields to the serial number and timestamp of the RPC response's lastentry parameter, or to zero if there was an error replaying the changes.
  • kdb5_util load sets the two fields to the values from the iprop header when an iprop dump is loaded.
  • kdb5_util load blocks non-iprop loads if kdb_last_sno is nonzero, signifying that the ulog received updates from the master at a previous time.
  • ulog_reset sets kdb_last_sno to 0 and kdb_last_time to the current time. This can happen if the administrator runs kproplog -R on the slave KDC, or when the slave database is first created.
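The slave-side handling of the two fields can be sketched as below. The dictionary representation and the function names are hypothetical; the behavior modelled is only what the list above states for ulog_finish_update_slave and for the non-iprop load check.

```python
def finish_update_slave(header: dict, lastentry_sno: int, lastentry_time: int,
                        replay_ok: bool) -> None:
    """Record the upstream position after a replay, or zero it on error."""
    if replay_ok:
        header["kdb_last_sno"] = lastentry_sno
        header["kdb_last_time"] = lastentry_time
    else:
        header["kdb_last_sno"] = 0
        header["kdb_last_time"] = 0


def non_iprop_load_allowed(header: dict) -> bool:
    """A nonzero kdb_last_sno means updates were received earlier, so block the load."""
    return header["kdb_last_sno"] == 0
```

Zeroing the fields after a failed replay leaves the slave reporting serial number 0 on its next poll, which no longer matches anything the master has sent.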

Proposed changes

Three changes are needed to allow hierarchical iprop:

1. On slaves, maintain a complete ulog. When update entries are received from the upstream master and processed by ulog_replay(), add them to the ulog in addition to modifying the KDB.

2. Add a -proponly option to kadmind so that it can be run on slaves as an iprop server.

3. Add a "-A server" option to kpropd to make it talk to a specified upstream master rather than the krb5.conf value of admin_server.

The ulog on a slave can be in states that are impossible on the master:

  • After receiving a full dump from upstream, the kdb_num, kdb_first_time, and kdb_first_sno fields are 0, but the kdb_last_sno and kdb_last_time fields reflect the most recent update from the upstream master when the dump file was created.
  • After a ulog reset or full dump, subsequent updates reflect the serial numbers of the updates received from upstream, instead of starting at 1. On the master, if kdb_num is less than ulogsize, kdb_first_sno is always 1 and kdb_last_sno is always equal to kdb_num. On a slave, updates can begin at a serial number other than 1, and in this case they can wrap around within the circular array before the array fills up and old entries are overwritten.
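The wrap-around case in the second item can be illustrated with a small calculation. The sketch assumes that the entry for serial number sno is stored in slot (sno - 1) modulo the array size; that mapping and the function names are assumptions made for illustration.

```python
def slot_for(sno: int, ulogentries: int) -> int:
    """Array slot for a serial number, assuming slot = (sno - 1) mod ulogentries."""
    return (sno - 1) % ulogentries


def slots_used(first_sno: int, count: int, ulogentries: int) -> list:
    """Slots written by `count` consecutive updates starting at first_sno."""
    return [slot_for(first_sno + i, ulogentries) for i in range(count)]
```

With five slots, three updates starting at serial number 1 occupy slots 0, 1, 2, while three updates starting at serial number 4 occupy slots 3, 4, 0. In the second case the entries wrap past the end of the array while two slots are still unused, so code that assumes an unfilled ulog starts at slot 0 would misread a slave's ulog.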

XXX what pieces of code need to change as a result? Partial list:

  • Detection of ulogsize change in ulog_map

XXX examine corner cases associated with full resyncs, upstream and downstream

Testing

Documentation

Mailing list discussions

Commits

Release notes