logo_kerberos.gif

Projects/replay cache collision avoidance

From K5Wiki
< Projects
Revision as of 14:40, 29 December 2008 by TomYu (talk | contribs) (Discussion: summarize some list discussion)

Jump to: navigation, search

An announcement has been sent to krbdev@mit.edu starting a review of this project. That review will conclude on January 12, 2009.

Comments can be sent to krbdev@mit.edu.

Background

The MIT Kerberos replay cache follows the guidelines of RFC 4120 Section 10. Security Considerations:

  Implementation note: If a client generates multiple requests to the
  KDC with the same timestamp, including the microsecond field, all but
  the first of the requests received will be rejected as replays.  This
  might happen, for example, if the resolution of the client's clock is
  too coarse.  Client implementations SHOULD ensure that the timestamps
  are not reused, possibly by incrementing the microseconds field in
  the time stamp when the clock returns the same time for multiple
  requests.

As computers have become faster and greater numbers of processes or threads are requiring network authentication to network services, it is becoming more likely that the microseconds field will match for multiple requests. It is also more difficult to avoid such collisions without introducing a significant performance hit. As a result, an ever increasing number of false positives are triggered even though the authenticators used by the various client processes differ.

The current FILE replay cache format consists of a file, starting with a two-byte big-endian version number. The rest of the file consists of records, each of which includes the following fields (in host byte order!):

 unsigned int   - client length == strlen(client principal name) + 1
 variable       - client principal name (NUL terminated C-string)
 unsigned int   - server length == strlen(server principal name) + 1
 variable       - server principal name (NUL terminated C-string)
 krb5_int32     - micro seconds
 krb5_timestamp - timestamp

These are the fields that are required by RFC 4120 Section 10. Security Considerations:

  Unless the application server provides its own suitable means to
  protect against replay (for example, a challenge-response sequence
  initiated by the server after authentication, or use of a server-
  generated encryption subkey), the server MUST utilize a replay cache
  to remember any authenticator presented within the allowable clock
  skew.  Careful analysis of the application protocol and
  implementation is recommended before eliminating this cache.  The
  replay cache will store at least the server name, along with the
  client name, time, and microsecond fields from the recently-seen
  authenticators, and if a matching tuple is found, the
  KRB_AP_ERR_REPEAT error is returned.

As stated the requirement is that at least the client and server principals along with the complete time stamp are required to be present. Additional data such as a hash of a canonical representation of the authenticator or the full clear text of a canonical representation of the authenticator would be permitted by RFC 4120 and could be used to avoid false positives.

Simply adding new fields to the MIT Kerberos replay cache record would not be an acceptable solution. As indicated in RFC 4120, it is imperative that all services sharing the same service principal share the same replay cache regardless of which Kerberos implementation is in use.

  If multiple servers (for example, different services on one machine,
  or a single service implemented on multiple machines) share a service
  principal (a practice that we do not recommend in general, but that
  we acknowledge will be used in some cases), either they MUST share
  this replay cache, or the application protocol MUST be designed so as
  to eliminate the need for it.  Note that this applies to all of the
  services.  If any of the application protocols does not have replay
  protection built in, an authenticator used with such a service could
  later be replayed to a different service with the same service
  principal but no replay protection, if the former doesn't record the
  authenticator information in the common replay cache.

As a result, the file format used by MIT Kerberos has been implemented in other Kerberos implementations. It would not be safe to change the replay cache file format in a manner that prevented the sharing of the replay cache among all of the implementations that can be expected to be deployed on a single machine. Whether they be from different versions of MIT Kerberos or implementations from different vendors.

The Proposal

In order to avoid false positives we require that the replay cache data structure store sufficient data to be able to distinguish between two authenticators while at the same time maintain compatibility with existing deployments.

One method of achieving this goal is to modify the replay cache library to store two records for each entry. Part of this modification will consist of extension records that may be stored in the guise of client or server principal names.

The first replay cache record will consist of the following:

 unsigned int   - length of authenticator-hash extension
     unsigned char  - null byte
     krb5_ui4       - extension magic number 0x48415348 ("HASH")
     krb5_int32     - hash algorithm number
     variable       - authenticator-hash
 unsigned int   - length of authenticator-cleartext extension
     unsigned char  - null byte
     krb5_ui4       - extension magic number 0x41555448 ("AUTH")
     variable       - cleartext authenticator
 krb5_int32     - microseconds
 krb5_timestamp - timestamp

For consistency, integer values remain in host byte order.

This record would be followed by the existing record consisting of the client and server principal names, the time stamp and the microseconds value. A complete entry therefore takes up two existing records.

The default hash method would be a MD5 hash. This will be fast to compare, and if there are collisions, the cleartext authenticators can be compared. It is not important that the hash be collision-resistant because a collision merely means that the full cleartext needs comparison.

How does this provide backward compatibility?

For new servers reading entries written by new servers, the comparisons are made based upon the hashes, then the authenticators. The client and server principal names and the timestamps become irrelevant. If there is an authenticator being used multiple times, it is a problem.

For new servers reading entries written by old servers, the string fields will be valid principal names and the time stamp value will not be a small integer. As a result, the server will know it is dealing with an old style entry and perform an old style check. This might lead to false positives but there is nothing we can do without additional information that is not available.

For old servers reading entries written by new servers. The new hash based entry will never match the incoming principal names and will therefore be skipped. The old style entries will be used as they are today.

For old servers reading entries written by old servers. The behavior will be the same as today.

Is this a long term fix for the problem?

Perhaps this is the best we can do. RFC 4120 Section 10. Security Considerations describes how the application server must behave when it is unsure that it has enough replay data to cover the allowable clock skew period:

  If a server loses track of authenticators presented within the
  allowable clock skew, it MUST reject all requests until the clock
  skew interval has passed, providing assurance that any lost or
  replayed authenticators will fall outside the allowable clock skew
  and can no longer be successfully replayed.  If this were not done,
  an attacker could subvert the authentication by recording the ticket
  and authenticator sent over the network to a server and replaying
  them following an event that caused the server to lose track of
  recently seen authenticators.

That implies that at the very least a change in file format would result in an outage.

In addition, a change in file format would make it impossible to share a replay cache among services built against different Kerberos implementations. While it is certainly possible to create a faster implementation that is built around B-trees, the new format could only be used if it was known that only MIT Kerberos would ever use that replay cache.

Implementation considerations

The krb5_rc_store function is an internal API. The krb5_donot_replay structure can gain a new member to point to the cleartext of the authenticator.

The key to detecting one of these new extension records is finding a replay cache entry that appears to have a zero-length string for client or server principal name, but whose counted length is greater than one. The search mechanism that krb5_rc_dfl_store uses will take into account the authenticator hash.

Old implementations will read the entire extension records in the new replay cache entry, but ignore the apparently zero-length strings.


Review

This section documents the review of the project according to Project policy. It is divided into multiple sections. First, approvals should be listed. To list an approval type

#~~~~

on its own line. The next section is for discussion. Use standard talk page conventions. In particular, sign comments with

--~~~~

and indent replies.

Members of Krbcore raising Blocking objections should preface their comment with {{project-block}}. The member who raised the objection should remove this markup when their objection is handled.

Approvals

Discussion

To be incorporated:

  • hashing of ciphertext is acceptable and advantageous for further reducing false positives
  • look for paired entries to distinguish between entries written with old implementation and entries written with new implementation
  • clarify behavior when an old implementation corrupts a new-style entry, considering that "leading null byte" format of a new entry could lead to old implementations truncating bytes following the null byte