Projects/Timestamps after 2038
This project will make MIT krb5 timestamps work on platforms with 64-bit time_t for times after January 2038, up through January 2106, by repurposing the negative number range of the krb5_timestamp type.
krb5.h defines the type krb5_timestamp as krb5_int32. The range of this type extends to time values up through 2038-01-19 03:14:07 UTC. On most 32-bit Unix platforms, the native time_t type also has this limitation, and numerous applications other than MIT krb5 will fail to work after the year 2037. On platforms with a 64-bit time_t type, the limited range of krb5_timestamp will only cause MIT krb5 to fail after 2037.
The krb5_timestamp type is used in numerous libkrb5 structures and function signatures. Changing the size of the type would be an incompatible ABI change, which would pose a disruption to downstream packagers. Changing the type to unsigned would have a more subtle impact on the API, but could still create problems for existing code.
In the GSSAPI, intervals are represented using OM_uint32. This is a limitation but not an important one, since GSSAPI does not make use of absolute timestamps.
(See also: [krbdev.mit.edu #8352])
Kerberos does not generally need to represent time values before the year 1970. Therefore, negative krb5_timestamp values can be taken to represent times between 2038 and 2106, as if the type were unsigned. To accomplish this, we can define a conversion function from krb5_timestamp to time_t as follows:
- If time_t is 32-bit, the conversion is the identity function.
- If the value is nonnegative, the conversion is the identity function.
- Otherwise, the conversion adds 2^32 to the value. (We might need to preserve the value -1 rather than converting it to 2^32-1, but the need for this is unclear.)
If we do not need to preserve the value -1, this conversion is as simple as (time_t)(uint32_t)timestamp, whether time_t is a 32-bit or 64-bit integer.
The inverse of this conversion is trivial; simply casting from time_t to krb5_timestamp will have the desired behavior, whether or not the value -1 needs to be preserved.
C language considerations
Conversion to a signed type from a value outside of the range of the signed type is specified in C99 section 18.104.22.168 as "the result is implementation-defined or an implementation-defined signal is raised". C99 section 3.4.1 defines implementation-defined behavior as "unspecified behavior where each implementation documents how the choice is made"; the result must be consistent for the same operands, and the compiler cannot simply assume that implementation-defined behavior never happens. We check in configure.in that conversion to signed types preserves the twos-complement bit representation and does not crash the program; therefore, we can safely rely (for example) on implicit conversions from time_t (if it is 64-bit) to krb5_timestamp to generate the appropriate negative value for times between years 2038 and 2106.
Arithmetic operations on a signed type which overflow or underflow are undefined by the C standard. Compilers frequently take advantage of the assumption that undefined behavior cannot occur when optimizing code, so we should try to avoid undefined behavior when operating on krb5_timestamp values. Casting from krb5_timestamp to uint32_t before adding or subtracting an offset should be an adequate workaround.
Conversions to an unsigned integer type are well-specified (C99 section 22.214.171.124), as are arithmetic operations on unsigned types (C99 section 6.2.5). Values that cannot be represented in the range of the unsigned type are reduced modulo that range.
Windows defaults to using a 64-bit time_t type unless _USE_32BIT_TIME_T is defined (which is not allowed on 64-bit Windows). We define this symbol internally when building Kerberos on 32-bit Windows, for reasons discussed in [krbdev.mit.edu #2883]. Although the code in win-mac.h is commented as being "to ensure backward compatibility of the ABI", time_t does not appear in our current ABI and the definition does not apply to external code (it only applies when k5-int.h is included).
Making sure we find all of the affected code and devise test cases for it will be challenging. Any code which performs arithmetic or comparison operations on timestamps or marshals them to or from a string form could be affected. Some useful search terms (aside from "krb5_timestamp" itself):
- krb5_deltat: also a signed 32-bit type; this type is frequently used to hold the difference between two timestamps
- krb5_ticket_times: this type holds the start and end times of tickets, which are frequent targets for arithmetic operations
- "times.": krb5_ticket times is embedded in several other structures under the field name "times" or "krb_times".
- time_t: to catch assignments of krb5_timestamp values to time_t values
In the GSS krb5 mechanism:
- krb5_gss_inquire_context(), krb5_gss_context_time(), accept_sec_context.c, and s4u_gss_glue.c perform subtraction on krb5_timestamp values to compute the lifetime result.
- acquire_cred.c computes a refresh time using arithmetic operations, and marshals it into a ccache config variable. It also compares cred expiry times to the current time and uses subtraction to produce a cred expiration time.
- iakerb.c and init_sec_context.c add time_req to the current time to produce a requested ticket end time.
- valid_times.c performs subtractions and comparison on timestamp values to determine if a ticket is currently valid.
- The in_clock_skew() macro in int-proto.h performs subtraction and comparison on timestamp values to determine if two timestamps are within the clock skew of each other. (This macro is used only once and implicitly depends on a local variable "context", so it could perhaps be eliminated. Or it could be adjusted to make the context explicit and used in more places.)
- get_in_tkt.c:verify_as_reply() computes an offset from two timestamps, or checks whether the ticket start time is within the clock skew of the current time.
- get_in_tkt.c:set_request_times() adds offsets to timestamps to produce request timestamps. It uses krb5int_add32() to avoid signed overflow, and should instead convert to unsigned.
- get_in_tkt.c:note_req_timestamp() computes an offset from two timestamps.
- get_creds.c:get_cached_local_tgt() compares timestamps to see if the TGT is expired.
- gic_pwd.c:warn_pw_expiry() compares and subtracts timestamps to determine whether the expiry time is less than a week from the current time.
- toffset.c:krb5_set_real_time() subtracts timestamps to compute a time offset.
- timeofday.c:krb5_check_clockskew() subtracts timestamps to determine whether the difference is within the allowed clockskew.
- ustime.c contains code to adjust a timestamp by the context offset amount.
- kt_file.c:more_recent() compares timestamps to make inferences about kvno wraparound.
- rc_dfl.c:alive() adds an interval to a timestamp and compares it to the current timestamp to determine if a replay record is still active.
- t_kerb.c converts a timestamp to a time_t in test_string_to_timestamp().
Timestamps are serialized in:
- The libkrb5 and libgssapi_krb5 ser_ framework: uses store_32_be/load_32_be (no changes needed)
- The ASN.1 framework: encoding routines operate on time_t; proper conversion needed in glue functions (encode_kerberos_time())
- send_tgs.c: uses the current timestamp as a nonce. RFC 4120 specifies the nonce as being unsigned 32-bit, while MIT krb5 and Heimdal encode it as signed 32-bit. The AS code uses a random nonce in the range 0..2^32-1; the TGS code should probably be changed to do this as well.
- The kadmin protocol: xdr_krb5_timestamp() uses xdr_int32() which uses xdr_long(), which checks value ranges. Negative values should survive a round trip but it might be cleaner to use xdr_u_int32().
- The DB2 KDB module: uses store_32_be/load_32_be (no changes needed)
- The LDAP KDB module: uses strftime()/strptime(); getstringtime() needs to properly convert from krb5_timestamp to time_t
- JSON serialization of credentials in krb5 GSS mech: encoding uses k5_json_array_fmt() and may be nonconformant (values are provided as int32_t arguments but are processed from argv list as int); negative values should survive a round trip but would ideally be encoded as positive numbers
- The PAC processing code: k5_time_to_seconds_since_1970() won't produce a timestamp after 2038. k5_seconds_since_1970_to_time() shouldn't need changes at it starts by converting to an unsigned type.
- The refresh timestamp in krb5 GSS (acquire_cred.c): currently marshalled into ASCII as a signed long; should be marshalled as an unsigned value.
- krb5_timestamp_to_string() and krb5_timestamp_to_sfstring(): need to properly convert from krb5_timestamp to time_t. krb5_string_to_timestamp() shouldn't need changes.
- getdate.y (used by kadmin for user input): does not operate on krb5_timestamp values, but returns an error if given a specification for a date after 2038, even if the result could be expressed in a time_t value.
- t_replay.c: reads timestamps from argv using atol(). On a platform where long is 32-bit and time_t is 64-bit (Windows, Linux x32), this won't work for times after 2038. Using atoll() (specified in C99, POSIX-2001) is a simple workaround.
- Is -1 used as a distinguished timestamp value anywhere in the code? If so, we must be careful to preserve the value -1 when converting from krb5_timestamp to time_t.