Difference between revisions of "Coding style"
(→WRITING C CODE: delete moved content) |
(→indent.pro settings) |
||
| (13 intermediate revisions by 4 users not shown) | |||
| Line 1: | Line 1: | ||
| − | The C language '''Coding style''' described here is based on the BSD coding style, with some additional elements from the GNU coding standards and the SunOS coding standards. |
||
| + | __NOTOC__ |
||
| + | The C language '''Coding style''' described here is based on the BSD coding style (Kernel Normal Form - KNF), with some additional elements from the GNU coding standards and the SunOS coding standards. |
||
* [[Coding style/Formatting|Formatting]] |
* [[Coding style/Formatting|Formatting]] |
||
* [[Coding style/Practices|Practices]] |
* [[Coding style/Practices|Practices]] |
||
| + | * [[Coding style/Style checker|Style checker]] |
||
| + | * [[Coding style/Version control practices|Version control practices]] |
||
| + | * [[Coding style/Transition_strategies|Transition strategies]] |
||
| + | * [[Coding style/Reindenting|Reindenting]] |
||
== External links == |
== External links == |
||
| Line 8: | Line 13: | ||
* [http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/src/share/misc/style?rev=HEAD NetBSD <code>/usr/share/misc/style</code>] [http://cvsweb.netbsd.org/bsdweb.cgi/src/share/misc/style (log)] |
* [http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/src/share/misc/style?rev=HEAD NetBSD <code>/usr/share/misc/style</code>] [http://cvsweb.netbsd.org/bsdweb.cgi/src/share/misc/style (log)] |
||
* [http://www.gnu.org/prep/standards/ GNU coding standards] |
* [http://www.gnu.org/prep/standards/ GNU coding standards] |
||
| − | * [http://opensolaris.org/os/community/ |
+ | * [http://opensolaris.org/os/community/on/cstyle.ms.pdf C Style and Coding Standards for SunOS] |
| − | == |
+ | == General information == |
| − | |||
| − | Old content to be moved elsewhere is below. |
||
| − | |||
| − | ---- |
||
| − | |||
| − | == WRITING C CODE == |
||
| − | |||
| − | The code in the krb5 source tree largely follows BSD KNF |
||
| − | (/usr/share/misc/style on NetBSD) except that it uses a four column |
||
| − | basic offset. The style described here is a synthesis of BSD KNF and |
||
| − | the GNU coding standards for the C language. The formatting described |
||
| − | in the "Formatting Your Source Code" section of the GNU coding |
||
| − | standards is mostly what we want, except we use BSD brace style and |
||
| − | BSD-ish conventions for the spacing around operators. |
||
| − | |||
| − | === Formatting style for C code === |
||
| − | |||
| − | (deleted moved content) |
||
| − | |||
| − | === Coding practices for C === |
||
| − | |||
| − | Assume, for most purposes, working ANSI/ISO C ('89, not '99) support, |
||
| − | both for internal use and for applications compiling against Kerberos |
||
| − | header files and libraries. Some exceptions are noted below. |
||
| − | |||
| − | Do not use assignments as truth values. Rather than this: |
||
| − | |||
| − | <pre> |
||
| − | /* bad style */ |
||
| − | if ((retval = krb5_foo())) |
||
| − | /* ... */; |
||
| − | </pre> |
||
| − | |||
| − | do this: |
||
| − | |||
| − | <pre> |
||
| − | /* better style */ |
||
| − | retval = krb5_foo(); |
||
| − | if (retval) |
||
| − | /* ... */; |
||
| − | </pre> |
||
| − | |||
| − | This makes the code easier to read, and also makes it easier to use |
||
| − | debuggers. It may be excusable to put assignments into the |
||
| − | conditional espression of a "while" statement, though, like: |
||
| − | |||
| − | <pre> |
||
| − | while ((ch = getopt(argc, argv, "abn")) != -1) |
||
| − | /* ... */; |
||
| − | </pre> |
||
| − | |||
| − | Using assignments as truth values in conditional expressions may make |
||
| − | code particularly impenetrable. |
||
| − | |||
| − | There are at least three types of "zero" known to C. These are the |
||
| − | integer zero (0), the null pointer constant (NULL), and the character |
||
| − | constant zero ('\0'). Yes, these are usually all technically the |
||
| − | integer zero. Use them in their correct contexts. (Purists will |
||
| − | point out that 0 is a valid null pointer constant; still, do not use 0 |
||
| − | to specify a null pointer constant. For further unconfusion, read the |
||
| − | section on null pointer constants in the C FAQ.) Do not use a lone |
||
| − | variable as a truth value unless it's of integer type. Thus: |
||
| − | |||
| − | <pre> |
||
| − | int i; |
||
| − | char *cp; |
||
| − | /* ... */ |
||
| − | if (i) |
||
| − | /* ... */; |
||
| − | if (cp != NULL) { |
||
| − | while (*cp != '\0') |
||
| − | /* ... */; |
||
| − | } |
||
| − | </pre> |
||
| − | |||
| − | Do not cast uses of NULL unless you're calling a function with a |
||
| − | variable number of arguments, in which case you should cast it to to |
||
| − | the appropriate pointer type. Likewise, do not cast the return value |
||
| − | from malloc() and friends; the prototype should declare them properly |
||
| − | as returning a void * and thus shouldn't require an explicit cast. |
||
| − | |||
| − | Do not assume that realloc(NULL, size) will do the right thing, or |
||
| − | that free(NULL) will do the right thing. ANSI guarantees that it |
||
| − | will, but some old libraries (hopefully becoming obsolete) don't. |
||
| − | Also, don't assume that malloc(0) will return a non-NULL pointer. |
||
| − | Typically, though, the output of malloc(0) will be safe to pass to |
||
| − | realloc() and free(). |
||
| − | |||
| − | In any case, reading the section in the C FAQ on null pointers is |
||
| − | highly recommended to remove confusion regarding null pointers in C, |
||
| − | since this is a subject of much confusion to even experienced |
||
| − | programmers. In particular, if you do not understand why using |
||
| − | calloc() to allocate a struct that contains pointer members or why |
||
| − | calling memset() to initialize such a struct to all-bytes-zero is |
||
| − | wrong, reread that section again. (Note that there are *lots* of |
||
| − | examples of code in the krb5 source tree that erroneously calls |
||
| − | memset() to zero a struct, and we should fix these somehow |
||
| − | eventually.) |
||
| − | |||
| − | Control flow statements that have a single statement as their body |
||
| − | should nevertheless have braces around their bodies if the body is |
||
| − | more than one line long, especially in the case of stacked multiple |
||
| − | if-else clauses; use: |
||
| − | |||
| − | <pre> |
||
| − | if (x) { |
||
| − | if (y) |
||
| − | foo(); |
||
| − | else |
||
| − | bar(); |
||
| − | } |
||
| − | </pre> |
||
| − | |||
| − | instead of: |
||
| − | |||
| − | <pre> |
||
| − | /* bad style */ |
||
| − | if (x) |
||
| − | if (y) |
||
| − | foo(); |
||
| − | else |
||
| − | bar(); |
||
| − | </pre> |
||
| − | |||
| − | which, while legible to the compiler, may confuse human readers and |
||
| − | make the code less maintainable, especially if new branches get added |
||
| − | to any of the clauses. |
||
| − | |||
| − | Also, you should almost never intersperse conditional compilation |
||
| − | directives with control flow statements, as some combination of |
||
| − | #define'd symbols may result in statements getting eaten by dangling |
||
| − | bits of control flow statements. When it is not possible to avoid |
||
| − | this questionable practice (you really should rewrite the relevant |
||
| − | code section), make use of redundant braces to ensure that a compiler |
||
| − | error will result in preference to incorrect runtime behavior (such as |
||
| − | inadvertantly providing someone with a root shell). |
||
| − | |||
| − | Do not intersperse conditional compilation directives with control |
||
| − | flow statements in such a way that confuses emacs cc-mode. Not only |
||
| − | does emacs get confused, but the code becomes more difficult to read |
||
| − | and maintain. Therefore, avoid code like this: |
||
| − | |||
| − | <pre> |
||
| − | /* bad style */ |
||
| − | if (x) { |
||
| − | f(); |
||
| − | } |
||
| − | #ifdef FOO |
||
| − | else if (y) { |
||
| − | #else |
||
| − | else { |
||
| − | #endif |
||
| − | g(); |
||
| − | } |
||
| − | </pre> |
||
| − | |||
| − | Put comments after conditional compilation directives such as "#else" |
||
| − | and "#endif". Make them correspond to the sense of the value that |
||
| − | controls the compilation of the section they are closing, i.e. |
||
| − | |||
| − | <pre> |
||
| − | #ifdef FOO |
||
| − | /* ... */ |
||
| − | #else /* !FOO */ |
||
| − | /* ... */ |
||
| − | #endif /* !FOO */ |
||
| − | </pre> |
||
| − | |||
| − | Also, in the case of more complex conditional compilation directives, |
||
| − | write the comments like this: |
||
| − | |||
| − | <pre> |
||
| − | #if defined(FOO) || defined(BAR) |
||
| − | /* ... */ |
||
| − | #else /* !(defined(FOO) || defined(BAR)) */ |
||
| − | /* ... */ |
||
| − | #endif /* !(defined(FOO) || defined(BAR)) */ |
||
| − | </pre> |
||
| − | |||
| − | If you are writing a do-while loop that has only one statement in its |
||
| − | body, put braces around it anyway, since the while clause may be |
||
| − | mistaken for a while loop with an empty body. Don't do this: |
||
| − | |||
| − | <pre> |
||
| − | /* bad style */ |
||
| − | do |
||
| − | foo(); |
||
| − | while (x); |
||
| − | </pre> |
||
| − | |||
| − | Instead, write this: |
||
| − | |||
| − | <pre> |
||
| − | /* better style */ |
||
| − | do { |
||
| − | foo(); |
||
| − | } while (x); |
||
| − | </pre> |
||
| − | |||
| − | While it is syntactically correct to call through a function pointer |
||
| − | without applying a dereference operator to it, do not write code that |
||
| − | does this. It is easier to see that the function call is actually |
||
| − | taking place through a function pointer if you write an explicit |
||
| − | dereference. However, do not explicitly take the address of a |
||
| − | function in order to assign it to a function pointer, since a function |
||
| − | name degrades into a pointer. Thus: |
||
| − | |||
| − | |||
| − | <pre> |
||
| − | int (*fp)(void); |
||
| − | int foofunc(void); |
||
| − | fp = foofunc; |
||
| − | x = (*fp)(); |
||
| − | </pre> |
||
| − | |||
| − | In general, do not take the address of an array. It does not return a |
||
| − | pointer to the first element; it returns a pointer to the array |
||
| − | itself. These are often identical when cast to an integral type, but |
||
| − | they are inherently of different types themselves. Functions that |
||
| − | take array types or pointers to array types as arguments can be |
||
| − | particularly trouble-prone. |
||
| − | |||
| − | If a function is declared to return a value, do not call "return" |
||
| − | without an argument or allow the flow of control to fall off the end |
||
| − | of the function. |
||
| − | |||
| − | Always declare the return type of a function, even if it returns int. |
||
| − | Yes, this means declaring main() to return int, since main() is |
||
| − | required to return int by the standard. If a function is not supposed |
||
| − | to return a value, declare it as returning void rather than omitting |
||
| − | the return type, which will default the return type to int. |
||
| − | |||
| − | Try to use ANSI C prototype-style function definitions in preference |
||
| − | to K&R style definitions. When using K&R style function definitions, |
||
| − | declare all the argument types, even those that are int, but beware of |
||
| − | any narrow types in the argument list. |
||
| − | |||
| − | Do not declare variables in an inner scope, e.g. inside the compound |
||
| − | substatement of an if statement, unless the complexity of the code |
||
| − | really demands that the variables be declared that way. In such |
||
| − | situations, the function could probably stand to be broken up into |
||
| − | smaller chunks anyway. Do not declare variables in an inner scope |
||
| − | that shadow ones in an outer scope, since this leads to confusion. |
||
| − | Also, some debugging environments, such as gdb under Solaris, can't |
||
| − | see variables declared in an inner scope, so declaring such variables |
||
| − | will make maintenance more difficult as well. |
||
| − | |||
| − | Parenthesize expressions that may be confusing, particularly where C's |
||
| − | precedences are broken. For example, the shift operators have lower |
||
| − | precedence than the +, -, *, /, and % operators. Perhaps the most |
||
| − | familiar C precedence quirk is that equality and relational operators |
||
| − | are of higher precedence than assignment operators. Less well known |
||
| − | is that the bitwise operators are of a lower precedence than equality |
||
| − | and relational operators. |
||
| − | |||
| − | The sizeof operator takes either a unary expression or a parenthesized |
||
| − | type name. It is not necessary to parenthesize the operand of sizeof |
||
| − | if it is applied to a unary expression, but still, always parenthesize |
||
| − | the operand of the sizeof operator. The sizeof operator does not |
||
| − | evaluate its operand if it is a unary expression, so usages such as |
||
| − | |||
| − | <pre> |
||
| − | s = sizeof(++foo); |
||
| − | </pre> |
||
| − | |||
| − | should be avoided for the sake of sanity and readability. |
||
| − | |||
| − | Don't pass around structures except by address. We may relax this |
||
| − | restriction for non-API function, though. |
||
| − | |||
| − | For new functions, input parameters should go before output parameters |
||
| − | in the call signature. There are exceptions, such as a context-like |
||
| − | parameter. |
||
| − | |||
| − | Every function should have block comment preceding it describing |
||
| − | briefly in complete sentences what it does, what inputs and outputs it |
||
| − | has, and what error codes it can return. It should also describe any |
||
| − | unsual aspects of the function. At some point we will want to put |
||
| − | some of this information into a machine-parsable form. |
||
| − | |||
| − | Macros should have all-uppercase names. If it is necessary to use |
||
| − | multiple statements, use braces, and wrap the whole thing in a |
||
| − | do-while(0) construct, such as |
||
| − | |||
| − | <pre> |
||
| − | #define FOOMACRO(x, y) do { \ |
||
| − | foo = (x) + (y); \ |
||
| − | f(y); \ |
||
| − | } while (0) |
||
| − | </pre> |
||
| − | |||
| − | Leave off the semicolon at the end of a function-like macro, so that |
||
| − | it can be mostly used like a call to a function without a return |
||
| − | value. Line up the backslashes to make it more readable. Use M-x |
||
| − | c-backslash-region in emacs to do neat lined-up backslashes. |
||
| − | Parenthesize uses of arguments in the replacement text of a macro in |
||
| − | order to prevent weird interactions. |
||
| − | |||
| − | Strive to make your code capable of compiling using "gcc -Wall |
||
| − | -Wmissing-prototypes -Wtraditional -Wcast-qual -Wcast-align |
||
| − | -Wconversion -Waggregate-return -pedantic" [XXX need to rethink this |
||
| − | somewhat] without generating any errors or warnings. Do not, however, |
||
| − | compile using the "-ansi" flag to gcc, since that can result in odd |
||
| − | behavior with header files on some systems, causing some necessary |
||
| − | symbols to not be defined. |
||
| − | |||
| − | === Namespaces === |
||
| − | |||
| − | The C standard reserves a bunch of namespaces for the implementation. |
||
| − | Don't stomp on them. For practical purposes, any identifier with a |
||
| − | leading underscore should not be used. (Technically, ^_[a-z].* are |
||
| − | reserved only for file scope, so should be safe for things smaller |
||
| − | than file scope, but it's better to be paranoid in this case.) |
||
| − | |||
| − | POSIX reserves typedef names ending with _t as well. |
||
| − | |||
| − | Recall that errno is a reserved identifier, and is permitted to be a |
||
| − | macro. Therefore, do not use errno as the name of a structure member, |
||
| − | etc. |
||
| − | |||
| − | Reserved namespaces are somewhat more restricted than this; read the |
||
| − | appropriate section of the C standard if you have questions. |
||
| − | |||
| − | If you're writing new library code, pick a short prefix and stick with |
||
| − | it for any identifier with external linkage. If for some reason a |
||
| − | library needs to have external symbols that should not be visible to |
||
| − | the application, pick another (related) prefix to use for the internal |
||
| − | globals. This applies to typedef names, tag names, and preprocessor |
||
| − | identifiers as well. |
||
| − | |||
| − | For the krb5 library, the prefix for public global symbols is "krb5_". |
||
| − | Use "krb5int_" as a prefix for library internal globals. Avoid using |
||
| − | "__" in symbol names, as it may confuse C++ implementations. There |
||
| − | are admittedly a number of places where we leak thing into the |
||
| − | namespace; we should try to fix these. |
||
| − | |||
| − | Header files should also not leak symbols. Usually using the upcased |
||
| − | version of the prefix you've picked will suffice, e.g. "KRB5_" as a |
||
| − | CPP symbol prefix corresponding to "krb5_". In general, do not define |
||
| − | macros that are lowercase, in order to avoid confusion and to prevent |
||
| − | namespace collisions. |
||
| − | |||
| − | The C standard only guarantees six case-insensitive characters to be |
||
| − | significant in external identifiers; this is largely regarded as |
||
| − | obsolescent even in 1989 and we will ignore it. It does, however, |
||
| − | only guarantee 31 case-sensitive characters to be signficant in |
||
| − | internal identifiers, so do not use identifiers that differ beyond the |
||
| − | 31st character. This is unlikely to be a problem, though. |
||
=== Aspects of C style in GNU coding std but not here === |
=== Aspects of C style in GNU coding std but not here === |
||
| Line 64: | Line 69: | ||
=== Emacs cc-mode style === |
=== Emacs cc-mode style === |
||
| − | Putting the following code in your .emacs file will result in mostly |
||
| + | Load the elisp file {{trunkref|src/util/krb5-c-style.el}} to get mostly |
||
| − | the right thing happening with respect to formatting style. Note that |
||
| + | the right thing to happen with respect to formatting style. <code>krb5-c-style.el</code> uses a heuristic to detect whether a file should have the "krb5" C coding style applied. Currently, it uses the combined presence of <code>c-basic-offset: 4</code> and <code>indent-tabs-mode: nil</code> as a signal to use the "krb5" style. See [[Coding_style/Transition strategies]] for some details. Also, if you are newly adding the file-local variable settings line to a file, use <code>M-x normal-mode</code> to reinitialize cc-mode with the new settings. |
||
| − | you may want to turn on auto-newline feature of cc-mode, though that |
||
| + | |||
| + | You may want to turn on auto-newline feature of cc-mode, though that |
||
seems to have some bugs with brace-elseif-brace handling at least in |
seems to have some bugs with brace-elseif-brace handling at least in |
||
| − | + | old versions of cc-mode (Emacs 20.3 or so). |
|
| − | (defconst krb5-c-style |
||
| + | You might also want to try (for Emacs 22 and later): |
||
| − | '("bsd" |
||
| + | |||
| − | (c-cleanup-list |
||
| + | (add-hook 'before-save-hook 'copyright-update) |
||
| − | brace-elseif-brace brace-else-brace defun-close-semi) |
||
| + | |||
| − | (c-comment-continuation-stars . "* ") |
||
| + | which will offer to update the year in the top-most copyright notice in a file when you save it, if it's not already current. |
||
| − | (c-electric-pound-behavior alignleft) |
||
| − | (c-hanging-braces-alist |
||
| − | (brace-list-open) |
||
| − | (class-open after) |
||
| − | (substatement-open after) |
||
| − | (block-close . c-snug-do-while) |
||
| − | (extern-lang-open after)) |
||
| − | (c-hanging-colons-alist |
||
| − | (case-label after) |
||
| − | (label after)) |
||
| − | (c-hanging-comment-starter-p) |
||
| − | (c-hanging-comment-ender-p) |
||
| − | (c-indent-comments-syntactically-p . t) |
||
| − | (c-label-minimum-indentation . 0) |
||
| − | (c-special-indent-hook))) |
||
| − | (defun krb5-c-hook () |
||
| − | (c-add-style "krb5" krb5-c-style t)) |
||
| − | (add-hook 'c-mode-common-hook 'krb5-c-hook) |
||
=== indent.pro settings === |
=== indent.pro settings === |
||
| Line 99: | Line 87: | ||
reasonable approximation to the C coding style described here, though |
reasonable approximation to the C coding style described here, though |
||
some manual cleanup may be necessary. Note that the gindent installed |
some manual cleanup may be necessary. Note that the gindent installed |
||
| − | in the gnu locker does not currently handle -psl correctly though. |
+ | in the gnu locker does not currently handle -nut or -psl correctly though. |
| + | <pre> |
||
-bap |
-bap |
||
-br |
-br |
||
| Line 113: | Line 102: | ||
-nbc |
-nbc |
||
-ncdb |
-ncdb |
||
| + | -ncs |
||
-ndj |
-ndj |
||
-nfc1 |
-nfc1 |
||
-lp |
-lp |
||
-npcs |
-npcs |
||
| + | -nut |
||
-psl |
-psl |
||
-sc |
-sc |
||
-sob |
-sob |
||
| + | </pre> |
||
| − | == MAKEFILES == |
||
| + | === vim/gvim editor settings === |
||
| + | These settings allow the vim or gvim editor to conform to the MITKC code style: |
||
| − | [XXX needs to be written] |
||
| + | <pre> |
||
| − | |||
| + | set shiftwidth=4 |
||
| − | == TEST SUITES == |
||
| + | set tabstop=8 |
||
| − | |||
| + | set softtabstop=4 |
||
| − | [XXX needs to be written] |
||
| + | set expandtab |
||
| + | set nosmartindent |
||
| + | set cindent |
||
| + | set cinoptions=p0,t0,+4,(0,u4,U1,:0 |
||
| + | set formatoptions=croq |
||
| + | set comments=sr:/*,mb:*,ex:*/,:// |
||
| + | set textwidth=79 |
||
| + | </pre> |
||
Latest revision as of 12:59, 2 February 2016
The C language Coding style described here is based on the BSD coding style (Kernel Normal Form - KNF), with some additional elements from the GNU coding standards and the SunOS coding standards.
External links
General information
Aspects of C style in GNU coding std but not here
- redundant parens to force extra indent of operators of different precedences
- redundant parens to force general extra indent of expressions that are broken between lines
- use of ^L characters to break up source files into pages
- nitpicking about capitalization in comments of variable names when their values are meant
- commenting usages of static variables
- casts to void
- separation of word in names with underscores vs case change
- enum vs #define'd integer constants
- 14 char filename limits, MS-DOS filename limits
- portability
- system library function quirks
- internationalization
- mmap()
Aspects of C style in BSD KNF but not here
- sorting of header files
- sorting of struct members
- separating struct tag decl and struct typedef
- sorting of var decl
- lining up var names in decls
- newline after decls
- usage of __P
- usage of getopt
- not initializing vars in decls
- stdarg/varargs handling
Emacs cc-mode style
Load the elisp file src/util/krb5-c-style.el (raw | annotated | history) to get mostly
the right thing to happen with respect to formatting style. krb5-c-style.el uses a heuristic to detect whether a file should have the "krb5" C coding style applied. Currently, it uses the combined presence of c-basic-offset: 4 and indent-tabs-mode: nil as a signal to use the "krb5" style. See Coding_style/Transition strategies for some details. Also, if you are newly adding the file-local variable settings line to a file, use M-x normal-mode to reinitialize cc-mode with the new settings.
You may want to turn on auto-newline feature of cc-mode, though that seems to have some bugs with brace-elseif-brace handling at least in old versions of cc-mode (Emacs 20.3 or so).
You might also want to try (for Emacs 22 and later):
(add-hook 'before-save-hook 'copyright-update)
which will offer to update the year in the top-most copyright notice in a file when you save it, if it's not already current.
indent.pro settings
The following settings for the indent program should produce a reasonable approximation to the C coding style described here, though some manual cleanup may be necessary. Note that the gindent installed in the gnu locker does not currently handle -nut or -psl correctly though.
-bap -br -ce -ci4 -cli0 -d0 -di8 -i4 -ip4 -l79 -nbc -ncdb -ncs -ndj -nfc1 -lp -npcs -nut -psl -sc -sob
vim/gvim editor settings
These settings allow the vim or gvim editor to conform to the MITKC code style:
set shiftwidth=4 set tabstop=8 set softtabstop=4 set expandtab set nosmartindent set cindent set cinoptions=p0,t0,+4,(0,u4,U1,:0 set formatoptions=croq set comments=sr:/*,mb:*,ex:*/,:// set textwidth=79
