Projects/Parallel KDC
Problem
The KDC is a single-threaded daemon: once it receives a complete request from a client, it fully processes that request before receiving another. The performance consequences of this are threefold:
- Only one CPU services KDC requests, including cryptography operations.
- When the KDC is reading data from disk (such as the replay cache or a BDB database), it does nothing else.
- If the KDB module retrieves data from a remote source (such as an LDAP query), the KDC does nothing while waiting for a reply.
Most KDCs experience only moderate load and can service requests quickly. In some circumstances, higher performance may be required.
Candidate Solutions
There are four candidate solutions, the first of which is available today:
- The realm administrator can run multiple KDC processes on the same host, each listening on a different port, each accessing the same database. This is possible with the current implementation, and SRV records can be used to avoid the need for client configuration; however, it does not yield optimal performance. Each client request will select a port without knowing whether the KDC process servicing that port is busy, and will wait for a timeout before trying another port. Moreover, MIT krb5 client code does not implement randomization of equal-priority SRV records, so randomization of SRV responses by the DNS infrastructure would be necessary for load-balancing to occur, and such randomization is sometimes defeated by caching. Parallelism is limited to the number of KDC processes.
- We could make the KDC event-oriented. This approach would require refactoring the entire KDC code base and all KDB modules. The DAL would have to provide KDB modules access to the listen_and_process main loop, and all DAL requests would have to be structured with callbacks or other mechanisms to allow the answer to arrive after further iterations of the main loop. This approach would only solve the problem of allowing the KDC to perform work while waiting for remote data sources such as LDAP; it would not allow multiple CPUs to service KDC requests or allow the KDC to perform work while waiting for disk accesses to complete.
- We could make the KDC multithreaded. This approach would require eliminating all use of global state (in particular, the kdc_active_realm variable and all of the macros such as kdc_context which derive from it) and ensuring that all library code used by the KDC is thread-safe. Any mistakes in thread-safety might result in difficult-to-debug race conditions, some of which might have security consequences.
- We could make the KDC use a multi-process worker model. After setting up its initial state including listener sockets, the KDC would fork multiple subprocesses. The set of idle subprocesses would compete for UDP packets or incoming TCP connections on the listener sockets, invisibly to clients. Once a worker process has obtained a request, it would service it according to the current single-threaded logic. Parallelism would be limited to the number of worker processes.
This project proposes to implement the fourth option, as it requires minimal code changes and does not introduce much additional risk.
Design of Proposed Solution
A new option would need to be added to the getopt() loop in initialize_realms() to specify the number of worker processes. -w is a reasonable choice since it is currently unused. The -w and -n (nofork) options would be mutually exclusive.
Code to create the worker processes would be invoked from main() after the call to write_pid_file(). The parent process would act as a proxy for SIGTERM so that killing the pid in the pid file terminates all worker processes.
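The network socket code would likely need to set the listening sockets to non-blocking, since several idle workers may be woken for the same packet and only one of them will succeed in reading it. For the same reason, process_packet() would need to ignore EAGAIN errors instead of logging them.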
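One possible shape for the non-blocking receive path is sketched below. The names set_nonblocking(), recv_request(), and enum recv_status are hypothetical and do not correspond to the existing network code; the point is only that EAGAIN is reported as "nothing to do" rather than as an error.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>

enum recv_status { RECV_OK, RECV_NOTHING, RECV_ERROR };

/* Put a listener socket into non-blocking mode.  Return 0 or -1. */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Try to read one datagram.  If another worker already consumed it, the
 * read fails with EAGAIN (or EWOULDBLOCK); that is expected in a
 * multi-process KDC and is reported as RECV_NOTHING so the caller can
 * skip logging and return to its wait loop.
 */
static enum recv_status
recv_request(int fd, void *buf, size_t cap, struct sockaddr *from,
             socklen_t *fromlen, size_t *len_out)
{
    ssize_t n = recvfrom(fd, buf, cap, 0, from, fromlen);

    if (n < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            return RECV_NOTHING;
        return RECV_ERROR;      /* a genuine error; worth logging */
    }
    *len_out = (size_t)n;
    return RECV_OK;
}
```

A caller would log and count only RECV_ERROR results, treating RECV_NOTHING as a normal outcome of losing the race for a packet.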
The logging code would need to be examined to make sure that concurrent access to the same logging sinks would not create problems.
Additional attention to bug #1671 (no file locking used by replay cache) may be necessary to evaluate whether there is a security impact on a multi-process KDC, keeping in mind that allowing one replay to each independent KDC process is typically not considered a serious security threat in master/slave scenarios.
Testing Plan
In most test scenarios, requests are processed too quickly by the KDC to measure any difference in behavior from a multi-process worker model. It should be possible to test this by hand by temporarily modifying the BDB back end to sleep() for a minute when looking up a particular principal name such as "slowuser". While testing, note that libkrb5 will retry requests after a timeout, so a single "kinit slowuser" will cause multiple worker processes to block unless the retry loop is disabled in the client code.
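The temporary test hook might be as small as the sketch below. Everything here is hypothetical: lookup_delay() and delay_lookup_for_test() are illustrative names, and where such a call would be placed in the BDB back end's lookup path is left open.

```c
#include <string.h>
#include <unistd.h>

#define SLOW_PRINC "slowuser"   /* principal name that triggers the delay */
#define SLOW_DELAY 60           /* seconds to block the worker */

/*
 * Return the number of seconds to stall for this principal name.  The
 * name matches if it is exactly SLOW_PRINC or SLOW_PRINC followed by a
 * realm ("slowuser@REALM").
 */
static unsigned int
lookup_delay(const char *name)
{
    size_t n = strlen(SLOW_PRINC);

    if (strncmp(name, SLOW_PRINC, n) == 0 &&
        (name[n] == '\0' || name[n] == '@'))
        return SLOW_DELAY;
    return 0;
}

/* Test-only hook: block the calling worker when the name matches. */
static void
delay_lookup_for_test(const char *name)
{
    unsigned int d = lookup_delay(name);

    if (d > 0)
        sleep(d);
}
```

With a hook like this in place, a "kinit slowuser" would tie up one worker per client retry while requests for other principals continue to be served by the remaining workers.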
Automated testing of this functionality would be difficult; we would need a special stub KDB back end to cause worker processes to block, as well as a way to control the client retry loop.
Resources and Priority
Resources have not been allocated to this project. It is not currently of high priority since most KDC deployments do not experience more load than they can process with single-threaded servicing of requests. However, the limited amount of work required makes this project low-hanging fruit, so it may be possible to implement it on the margin.