Difference between revisions of "Projects/Reporting-friendly KDB dump format"

Revision as of 17:55, 22 July 2015

This is an early stage project for MIT Kerberos. It is being fleshed out by its proponents. Feel free to help flesh out the details of this project. After the project is ready, it will be presented for review and approval.

This project is targeted at release 1.14.

This is split from Projects/KDB reporting and bulk operations.

Background

Operators often want to perform reporting operations on KDB data, but the dump format is optimized for loading by kdb5_util, not for human reading or for reporting with simple scripts. For example, the number of columns on each principal dump line depends on the number of keydata entries associated with that principal. Also, some useful metadata, such as the modification date, is present only in a human-unfriendly hexadecimal format, as an artifact of being stored in the tl_data of the principal.
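To illustrate why the hexadecimal tl_data form is awkward for reporting, here is a minimal Python sketch that decodes a modification record. It assumes the payload is a 4-byte little-endian timestamp followed by a NUL-terminated principal name. That layout is an assumption made for this example and should be checked against the KDB source. The function name decode_mod_princ and the sample payload are hypothetical.

```python
import struct
from datetime import datetime, timezone


def decode_mod_princ(hex_payload):
    """Decode a hex-encoded modification record (layout is an assumption).

    Assumed layout: 4-byte little-endian seconds since the epoch, then the
    modifying principal's name terminated by a NUL byte.
    """
    raw = bytes.fromhex(hex_payload)
    (mod_time,) = struct.unpack("<I", raw[:4])
    name = raw[4:].split(b"\x00", 1)[0].decode("utf-8")
    return datetime.fromtimestamp(mod_time, tz=timezone.utc), name


# Hypothetical payload built for the example, not taken from a real dump.
sample = struct.pack("<I", 1437587700).hex() + b"admin/admin@EXAMPLE.COM\x00".hex()
when, who = decode_mod_princ(sample)
print(when.isoformat(), who)
```

A reporting-friendly dump would emit the date and principal name directly, so that scripts need no step like this.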

Format choice

Tab-separated value formats are probably acceptable to the largest variety of tools. These tools range from simple awk scripts to SQL databases. Comma-separated values are easy to add. Various quoting and escaping options exist in various dialects of CSV format. Opinions vary on whether tab-separated formats can use quoting, but in any case, most of the fields will not require quoting if tab-separated.

There are several conceptual database tables, each of which will have a different set of columns. To allow a single combined dump format, each dump line will have a first column that indicates the conceptual table to which it belongs. Another option is to provide command line options to select an individual conceptual table to dump, in which case the table name prefix would be omitted from each line. Headers should be optional, because some tools work better without them.
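As a rough sketch of how a consumer might handle the combined format, the following Python groups dump lines by the table name in the first column. The table names keyinfo and princlockout are borrowed from the cross reference later on this page. The exact column layout and the sample values are assumptions, since the format is not yet final.

```python
import csv
import io
from collections import defaultdict


def split_by_table(lines):
    """Group tab-separated dump lines by the table name in column one."""
    tables = defaultdict(list)
    # QUOTE_NONE treats quote characters as ordinary data, which matches
    # the expectation that most fields need no quoting.
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue  # skip blank lines
        tables[row[0]].append(row[1:])
    return dict(tables)


# Hypothetical combined dump without headers.
combined = io.StringIO(
    "keyinfo\tfoo@EXAMPLE.COM\t0\t1\taes128-cts-hmac-sha1-96\tnormal\t-1\n"
    "princlockout\tfoo@EXAMPLE.COM\t0\t0\t0\n"
    "keyinfo\tbar@EXAMPLE.COM\t0\t1\tdes-cbc-crc\tnormal\t-1\n"
)
tables = split_by_table(combined)
for name, rows in tables.items():
    print(name, len(rows))
```

If the dump is instead produced one table at a time, the prefix column is absent and each file can be fed to a tool directly.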

Examples

These are some examples of how commonly available tools could be used to manipulate output from one of these dumps.

$ cat keyinfo.txt
name	keyindex	kvno	enctype	salttype	salt
foo@EXAMPLE.COM	0	1	aes128-cts-hmac-sha1-96	normal	-1
bar@EXAMPLE.COM	0	1	aes128-cts-hmac-sha1-96	normal	-1
bar@EXAMPLE.COM	1	1	des-cbc-crc	normal	-1
$ sqlite3
sqlite> .mode tabs
sqlite> .import keyinfo.txt keyinfo
sqlite> select * from keyinfo where enctype like 'des-cbc-%';
bar@EXAMPLE.COM	1	1	des-cbc-crc	normal	-1
sqlite> .quit
$ awk -F'\t' '$4 ~ /des-cbc-/ { print }' keyinfo.txt
bar@EXAMPLE.COM	1	1	des-cbc-crc	normal	-1
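The same query can be sketched in Python using only the standard library. This mirrors the sqlite3 session above. The column types in the CREATE TABLE statement are an assumption, and the data is inlined so that the example is self-contained.

```python
import csv
import io
import sqlite3

# Same content as keyinfo.txt above, with the header line.
KEYINFO = (
    "name\tkeyindex\tkvno\tenctype\tsalttype\tsalt\n"
    "foo@EXAMPLE.COM\t0\t1\taes128-cts-hmac-sha1-96\tnormal\t-1\n"
    "bar@EXAMPLE.COM\t0\t1\taes128-cts-hmac-sha1-96\tnormal\t-1\n"
    "bar@EXAMPLE.COM\t1\t1\tdes-cbc-crc\tnormal\t-1\n"
)

reader = csv.DictReader(io.StringIO(KEYINFO), delimiter="\t",
                        quoting=csv.QUOTE_NONE)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE keyinfo (name TEXT, keyindex INTEGER, kvno INTEGER,"
    " enctype TEXT, salttype TEXT, salt TEXT)"
)
# Each row from DictReader is a mapping, so named placeholders work.
conn.executemany(
    "INSERT INTO keyinfo VALUES"
    " (:name, :keyindex, :kvno, :enctype, :salttype, :salt)",
    reader,
)

# Principals that still have single-DES keys.
des_principals = conn.execute(
    "SELECT DISTINCT name FROM keyinfo WHERE enctype LIKE 'des-cbc-%'"
).fetchall()
print(des_principals)
conn.close()
```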

Conceptual tables

This is an attempt to split KDB data fields into a more normalized schema that is easier to manipulate for reporting purposes.

Principal metadata

Principal name
Last modified by (principal name)
Last modification date
Last password change
Policy object name
Master key version used for this principal's key data

Principal keys

There will generally be multiple key data dump lines per principal. The order is significant (though typically it's only important within the active kvno), so there will need to be a key index number in case the user imports the dump into a data store that doesn't preserve the ordering of input lines (such as most relational databases).

Some open questions include whether to use numeric values or string representations for enctype or salt type. Maybe this can be a runtime option?

Principal name
Key index
Key version number (kvno)
Enctype
Salt type
Salt data as hex string (might be "-1" to denote no salt or normal/default salt)

A sample implementation of a "keyinfo" dump format is at https://github.com/tlyu/krb5/tree/keyinfo
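To show what the key index is for, here is a small Python sketch that restores the original key order after rows have passed through a store that does not preserve it, and that interprets the "-1" salt marker. The helper names keys_in_order and decode_salt are hypothetical, and the rows are made up for the example.

```python
def keys_in_order(rows):
    """Return keyinfo rows sorted by principal name, then by key index."""
    return sorted(rows, key=lambda r: (r["name"], int(r["keyindex"])))


def decode_salt(field):
    """Return salt bytes, or None for the "-1" (no/default salt) marker."""
    return None if field == "-1" else bytes.fromhex(field)


# Hypothetical rows, deliberately out of order.
rows = [
    {"name": "bar@EXAMPLE.COM", "keyindex": "1", "enctype": "des-cbc-crc",
     "salt": "-1"},
    {"name": "foo@EXAMPLE.COM", "keyindex": "0",
     "enctype": "aes128-cts-hmac-sha1-96", "salt": "-1"},
    {"name": "bar@EXAMPLE.COM", "keyindex": "0",
     "enctype": "aes128-cts-hmac-sha1-96", "salt": "73616c74"},
]
ordered = keys_in_order(rows)
for row in ordered:
    print(row["name"], row["keyindex"], row["enctype"],
          decode_salt(row["salt"]))
```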

Per-principal policy data

Principal name
Principal expiration date
Password expiration date
Max ticket lifetime
Max renewable ticket lifetime

Per-principal lockout data

These are the per-KDC (non-replicated) data that track failed logins due to incorrect passwords.

Principal name
Last success
Last failure
Count of failed attempts

Principal boolean attributes

This is currently a boolean flag word; it's probably best to make it a set of strings. This is a bit tricky because there are some flags that are of the form disallow_*.

Principal name
Attribute name if set (string form)
Boolean value (true/false or 1/0)

Feedback from operators indicates that having a row for each attribute, regardless of whether or not it's set, can be useful to satisfy auditors.

Unknown attributes can probably be printed as hexadecimal numbers, or optionally be ignored.
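The row-per-attribute idea can be sketched as follows in Python. The FLAG_NAMES table is a hypothetical placeholder: the names and bit values are for illustration only and should be taken from the KDB headers in a real implementation. The sketch emits one row per known attribute with a 1/0 value, and prints any unknown bits as hexadecimal numbers.

```python
# Hypothetical flag table for illustration; not authoritative values.
FLAG_NAMES = {
    0x01: "disallow_postdated",
    0x02: "disallow_forwardable",
    0x80: "requires_preauth",
}


def flag_rows(principal, flags, all_rows=True):
    """Expand a flag word into (principal, attribute, value) rows.

    With all_rows=True, unset attributes are reported with value 0, which
    matches the operator feedback about satisfying auditors.
    """
    rows = []
    known = 0
    for bit, name in sorted(FLAG_NAMES.items()):
        known |= bit
        is_set = bool(flags & bit)
        if is_set or all_rows:
            rows.append((principal, name, int(is_set)))
    # Report each unknown bit separately as a hexadecimal number.
    unknown = flags & ~known
    bit = 1
    while unknown:
        if unknown & bit:
            rows.append((principal, hex(bit), 1))
            unknown &= ~bit
        bit <<= 1
    return rows


for row in flag_rows("foo@EXAMPLE.COM", 0x82 | 0x10000):
    print("\t".join(str(field) for field in row))
```

This sketch reports disallow_* flags under their literal names. Whether to invert them into positive "allow" attributes is left open, as noted above.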

Principal string attributes

Principal name
Attribute name
Attribute value

Principal key history metadata

Principal name
kvno of kadmin/history key (admin_history_kvno)
old_key_len (internal ring buffer length)
old_key_next (internal ring buffer index)

Principal key history

This is very similar to the keyinfo/keydata table. There is some weird ring buffer stuff that we may or may not want to reflect in the dump.

Principal name
Key index
Key version number (kvno)
Enctype
Salt type
Salt data as hex string (might be "-1" to denote no salt or normal/default salt)

Password policy

Policy name
Min password life
Max password life
Min password length
Min password character classes
Password history length

Lockout policy

Policy name
Max failures
Failure count reset interval
Lockout duration

Ticket policy

Policy name
Max ticket lifetime
Max renewable ticket lifetime

Policy boolean attributes

As for principal boolean attributes

Policy allowed keysalts

(Is this an ordered list?)

Policy name
Enctype
Salt type

C structure cross reference

krb5_db_entry

magic: (not encoded)
len
mask: (not encoded?)
attributes: princflags
max_life: princpolicy
max_renewable_life: princpolicy
expiration: princpolicy
pw_expiration: princpolicy
last_success: princlockout
last_failed: princlockout
fail_auth_count: princlockout
n_tl_data: (tl_data)
n_key_data: keyinfo/keydata
e_length: (implicit)
e_data: princ_edata
princ: (redundant? not for consistency vs db key)
tl_data: (tl_data)
key_data: keyinfo/keydata

osa_princ_ent_rec

version
policy: princmeta
aux_attributes
old_key_len: oldkeymeta
old_key_next: oldkeymeta
old_keys: oldkeyinfo/oldkeydata
admin_history_kvno: oldkeymeta

tl_data cross reference

KRB5_TL_LAST_PWD_CHANGE: princmeta
KRB5_TL_MOD_PRINC: princmeta
KRB5_TL_KADM_DATA: (see osa_princ_ent_rec)
KRB5_TL_MKVNO: princmeta
