php-src/CODING_STANDARDS.md
Tim Düsterhus 48971af482
Update CODING_STANDARDS for the acronym casing RFC (#14169)
* [skip ci] Update CODING_STANDARDS for the acronym casing RFC

see https://wiki.php.net/rfc/class-naming-acronyms

* Improve formatting in CODING_STANDARDS.md

Co-authored-by: Larry Garfield <larry@garfieldtech.com>

2024-05-08 20:35:44 +02:00

PHP coding standards

This file lists standards that any programmer adding or changing code in PHP should follow. The code base does not yet fully follow them, but new features are moving in that direction. Many sections have already been rewritten to comply with these rules.

Code implementation

  1. Document your code in source files and the manual. (tm)

  2. PHP is implemented in C99. The optional fixed-width integers from stdint.h (int8_t, int16_t, int32_t, int64_t and their unsigned counterparts) must be available.

  3. Functions that are given pointers to resources should not free them.

    For instance, the function int mail(char *to, char *from) should NOT free to or from.

    Exceptions:

    • The function's designated behavior is freeing that resource. E.g. efree()

    • The function is given a boolean argument that controls whether or not the function may free its arguments (if true, the function must free its arguments; if false, it must not).

    • Low-level parser routines that are tightly integrated with the token cache and the bison code for minimum memory copying overhead.
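
The ownership rule and its boolean exception can be sketched in plain C. This is a standalone illustration, not code from php-src: demo_count_chars() and demo_consume() are hypothetical names, and malloc()/free() are assumed here in place of the engine allocators so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: only reads `str`, so the caller keeps ownership. */
static size_t demo_count_chars(const char *str, char needle)
{
	size_t count = 0;

	assert(str != NULL);
	for (; *str != '\0'; str++) {
		if (*str == needle) {
			count++;
		}
	}

	return count;
}

/* Hypothetical helper covering the boolean exception: when `free_arg` is
 * true the function must free `str`; when it is false it must not. */
static size_t demo_consume(char *str, bool free_arg)
{
	size_t len;

	assert(str != NULL);
	len = strlen(str);
	if (free_arg) {
		free(str);
	}

	return len;
}
```

A caller that passes false to demo_consume() still owns the buffer and has to release it; a caller that passes true must not touch the pointer afterwards.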

  4. Functions that are tightly integrated with other functions within the same module, and rely on each other's non-trivial behavior, should be documented as such and declared static. They should be avoided if possible.

  5. Use definitions and macros whenever possible, so that constants have meaningful names and can be easily manipulated. Any use of a numeric constant to specify different behavior or actions should be done through a #define.
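
As a minimal sketch of this rule, the snippet below names its behavior-selecting constants instead of scattering bare numbers through the code. All DEMO_* names and demo_attempts_for_mode() are hypothetical and exist only for this illustration.

```c
/* Hypothetical names, chosen only for this illustration. */
#define DEMO_MODE_READ   1
#define DEMO_MODE_WRITE  2
#define DEMO_MAX_RETRIES 3

/* Returns how many attempts an operation in `mode` may make. */
static int demo_attempts_for_mode(int mode)
{
	switch (mode) {
		case DEMO_MODE_READ:
			return 1;
		case DEMO_MODE_WRITE:
			return DEMO_MAX_RETRIES;
		default:
			return 0;
	}
}
```

A call such as demo_attempts_for_mode(DEMO_MODE_WRITE) documents itself, whereas demo_attempts_for_mode(2) would not.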

  6. When writing functions that deal with strings, remember that PHP holds the length property of each string, and that it shouldn't be calculated with strlen(). Write your functions in such a way that they take advantage of the length property, both for efficiency and in order for them to be binary-safe. Functions that change strings and obtain their new lengths while doing so should return that new length, so it doesn't have to be recalculated with strlen() (e.g. php_addslashes()).
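
The following sketch illustrates both halves of this rule in standalone C. The demo_str struct is a hypothetical stand-in for a length-carrying string and does not reproduce the engine's own string type; demo_count_byte() and demo_rtrim() are likewise invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-carrying string. */
typedef struct {
	const char *val;
	size_t len;
} demo_str;

/* Counts occurrences of `needle` using the stored length, so embedded NUL
 * bytes do not cut the scan short (binary-safe). */
static size_t demo_count_byte(demo_str s, char needle)
{
	size_t i, count = 0;

	assert(s.val != NULL);
	for (i = 0; i < s.len; i++) {
		if (s.val[i] == needle) {
			count++;
		}
	}

	return count;
}

/* Removes trailing `ch` bytes from `buf` and returns the new length, so the
 * caller never has to call strlen() again. `buf` must hold at least
 * len + 1 bytes. */
static size_t demo_rtrim(char *buf, size_t len, char ch)
{
	assert(buf != NULL);
	while (len > 0 && buf[len - 1] == ch) {
		len--;
	}
	buf[len] = '\0';

	return len;
}
```

For a buffer containing an embedded NUL byte, strlen() would stop at that byte, while the length returned by demo_rtrim() still covers the whole string.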

  7. NEVER USE strncat(). If you're absolutely sure you know what you're doing, check its man page again, and only then, consider using it, and even then, try avoiding it.
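
One reason strncat() is easy to misuse is that its size argument limits how many bytes are appended from the source, not the size of the destination, and a terminating NUL is written after those bytes. A possible alternative, sketched below under the assumption that the caller tracks both the buffer size and the current length, is an explicit bounded append. demo_append() is a hypothetical helper, not a php-src function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends `src_len` bytes of `src` to `dst`, which
 * currently holds `*dst_len` bytes in a buffer of `dst_size` bytes.
 * Returns false and leaves `dst` untouched when the result, including the
 * terminating NUL, would not fit. */
static bool demo_append(char *dst, size_t dst_size, size_t *dst_len,
		const char *src, size_t src_len)
{
	assert(dst != NULL && dst_len != NULL && src != NULL);
	if (*dst_len >= dst_size || src_len > dst_size - *dst_len - 1) {
		return false;
	}

	memcpy(dst + *dst_len, src, src_len);
	*dst_len += src_len;
	dst[*dst_len] = '\0';

	return true;
}
```

Because every length is passed explicitly, the bounds check is visible at the call site and the new length is available without another strlen() call.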

  8. Use PHP_* macros in the PHP source, and ZEND_* macros in the Zend part of the source. Although the PHP_* macros are mostly aliases of the ZEND_* macros, using the matching prefix makes it clearer what kind of macro you are calling.

  9. When commenting out code using a #if statement, do NOT use 0 only. Instead, use "<git username here>_0". For example, #if FOO_0, where FOO is your git username foo. This makes it easier to track why code was commented out, especially in bundled libraries.
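
A minimal sketch of the convention, assuming a hypothetical git user named jdoe and a made-up function demo_answer(). JDOE_0 is never defined, so the preprocessor evaluates it as 0 and skips the block, while the name still records who disabled the code.

```c
/* Hypothetical example: "JDOE" stands for the git username "jdoe". */
static int demo_answer(void)
{
	int result = 42;

#if JDOE_0
	/* Disabled by jdoe while investigating a regression. */
	result = 0;
#endif

	return result;
}
```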

  10. Do not define functions that are not available. For instance, if a library is missing a function, do not define the PHP version of the function, and do not raise a run-time error about the function not existing. End users should use function_exists() to test for the existence of a function.

  11. Prefer emalloc(), efree(), estrdup(), etc. to their standard C library counterparts. These functions implement an internal "safety-net" mechanism that ensures the deallocation of any unfreed memory at the end of a request. They also provide useful allocation and overflow information while running in debug mode.

    In almost all cases, memory returned to the engine must be allocated using emalloc().

    The use of malloc() should be limited to cases where a third-party library may need to control or free the memory, or when the memory in question needs to survive between multiple requests.

  12. Functions in the "is" or "has" style, which return a "yes"/"no" answer, should have the return type bool. zend_result is an appropriate return type for functions that perform some operation that may succeed or fail.
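
The distinction can be sketched as follows. To keep the snippet compilable without the Zend headers, demo_result is a hypothetical stand-in for zend_result, and both functions are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for zend_result so the sketch compiles on its own; the DEMO_
 * names are hypothetical. */
typedef enum {
	DEMO_SUCCESS = 0,
	DEMO_FAILURE = -1
} demo_result;

/* "is" style: answers a yes/no question, so it returns bool. */
static bool demo_is_empty(const char *str, size_t len)
{
	return str == NULL || len == 0;
}

/* Performs an operation that may fail, so it returns a result code and
 * writes its output through `out` only on success. */
static demo_result demo_parse_digit(char c, int *out)
{
	assert(out != NULL);
	if (c < '0' || c > '9') {
		return DEMO_FAILURE;
	}
	*out = c - '0';

	return DEMO_SUCCESS;
}
```

Callers compare the result code against the named constants rather than treating it as a truth value, which keeps "the answer is no" and "the operation failed" clearly apart.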

User functions/methods naming conventions

  1. Function names for user-level functions should be enclosed within the PHP_FUNCTION() macro. They should be in lowercase, with words underscore delimited, with care taken to minimize the letter count. Abbreviations should not be used when they greatly decrease the readability of the function name itself:

    Good:

    str_word_count
    array_key_exists
    

    Ok:

    date_interval_create_from_date_string
    // Could be 'date_intvl_create_from_date_str'?
    get_html_translation_table
    // Could be 'html_get_trans_table'?
    

    Bad:

    hw_GetObjectByQueryCollObj
    pg_setclientencoding
    jf_n_s_i
    
  2. If a function is part of a "parent set" of functions, the parent's name should be included in the user function name, and should be clearly related to the parent program or function family. This should be in the form of parent_*:

    A family of foo functions, for example:

    Good:

    foo_select_bar
    foo_insert_baz
    foo_delete_baz
    

    Bad:

    fooselect_bar
    fooinsertbaz
    delete_foo_baz
    
  3. Names of internal functions used by user functions should be prefixed with _php_, followed by a word or an underscore-delimited list of words, in lowercase letters, that describes the function. If applicable, they should be declared static.

  4. Variable names must be meaningful. One letter variable names must be avoided, except for places where the variable has no real meaning or a trivial meaning (e.g. for (i=0; i<100; i++) ...).

  5. Variable names should be in lowercase. Use underscores to separate words.

  6. Method names follow the studlyCaps (also referred to as bumpy case or camel caps) naming convention, with care taken to minimize the letter count. The initial letter of the name is lowercase, and each letter that starts a new "word" is capitalized.

  7. Class names should be descriptive nouns in PascalCase and as short as possible. Each word in the class name should start with a capital letter, without underscore delimiters. The class name should be prefixed with the name of the "parent set" (e.g. the name of the extension) if no namespaces are used.

  8. Abbreviations and acronyms as well as initialisms should be avoided wherever possible, unless they are much more widely used than the long form (e.g. HTTP or URL). Abbreviations, acronyms, and initialisms should be treated like regular words, thus they should be written with an uppercase first character, followed by lowercase characters.

  9. Diverging from this policy is allowed to keep internal consistency within a single extension, if the name follows an established, language-agnostic standard, or for other reasons, if those reasons are properly justified and voted on as part of the RFC process.

    Good method names:

    connect()
    getData()
    buildSomeWidget()
    performHttpRequest()
    

    Bad method names:

    get_Data()
    buildsomewidget()
    getI()
    performHTTPRequest()
    

    Good class names:

    Curl
    CurlResponse
    HttpStatusCode
    Url
    BtreeMap // B-tree Map
    UserId // User Identifier
    Char // Character
    Intl // Internationalization
    Ssl\Certificate
    Ssl\Crl // Certificate Revocation List
    Ssl\CrlUrl
    

    Bad class names:

    curl
    curl_response
    HTTPStatusCode
    URL
    BTreeMap
    UserID // User Identifier
    CHAR
    INTL
    SSL\Certificate
    SSL\CRL
    SSL\CRLURL
    

Internal function naming conventions

  1. Functions that are part of the external API should be named php_modulename_function() to avoid symbol collision. They should be in lowercase, with words underscore delimited. Exposed API must be defined in php_modulename.h.

    PHPAPI char *php_session_create_id(PS_CREATE_SID_ARGS);
    

    Unexposed module functions should be static and should not be defined in php_modulename.h.

    static int php_session_destroy(void)
    
  2. The main module source file must be named modulename.c.

  3. A header file that is used by other source files must be named php_modulename.h.

Syntax and indentation

  1. Use K&R-style. Of course, we can't and don't want to force anybody to use a style he or she is not used to, but, at the very least, when you write code that goes into the core of PHP or one of its standard modules, please maintain the K&R style. This applies to just about everything, starting with indentation and comment styles and up to function declaration syntax. Also see Indentstyle.

  2. Be generous with whitespace and braces. Keep one empty line between the variable declaration section and the statements in a block, as well as between logical statement groups in a block. Maintain at least one empty line between two functions, preferably two. Always prefer:

    if (foo) {
        bar;
    }
    

    to:

    if(foo)bar;
    
  3. When indenting, use the tab character. A tab is expected to represent four spaces. It is important to maintain consistency in indentation so that definitions, comments, and control structures line up correctly.

  4. Preprocessor statements (#if and such) MUST start at column one. To indent preprocessor directives you should put the # at the beginning of a line, followed by any number of spaces.
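
A short sketch of nested directives written this way: each # stays in column one and the nesting is shown with spaces after it. The DEMO_* macros and demo_buffer_size() are hypothetical names used only for the illustration.

```c
/* Nested directives keep `#` in column one and indent with spaces after it. */
#ifndef DEMO_BUFFER_SIZE
# ifdef DEMO_SMALL_BUILD
#  define DEMO_BUFFER_SIZE 256
# else
#  define DEMO_BUFFER_SIZE 4096
# endif
#endif

static int demo_buffer_size(void)
{
	return DEMO_BUFFER_SIZE;
}
```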

  5. The length of constant string literals should be calculated via strlen() instead of using sizeof()-1 as it is clearer and any modern compiler will optimize it away. Legacy usages of the latter style exist within the codebase but should not be refactored, unless larger refactoring around that code is taking place.
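
The two styles compare as follows in a small standalone sketch. demo_has_php_prefix() is a hypothetical helper written for this example; the legacy form appears only in a comment for contrast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: checks whether `str` (of length `len`) starts with
 * the literal prefix "php_". */
static bool demo_has_php_prefix(const char *str, size_t len)
{
	/* Preferred: strlen() on the literal states the intent directly. */
	const size_t prefix_len = strlen("php_");

	/* Legacy style seen in older code: sizeof("php_") - 1 */

	assert(str != NULL);

	return len >= prefix_len && memcmp(str, "php_", prefix_len) == 0;
}
```

Both expressions yield the same value for a string literal; the strlen() form simply reads more naturally.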

Testing

  1. Extensions should be well tested using *.phpt tests. Read more in the qa.php.net documentation.

New and experimental functions

To reduce the problems normally associated with the first public implementation of a new set of functions, it has been suggested that the first implementation include a file labeled EXPERIMENTAL in the function directory, and that the functions follow the standard prefixing conventions during their initial implementation.

The file labeled EXPERIMENTAL should include the following information:

  • Any authoring information (known bugs, future directions of the module).
  • Ongoing status notes which may not be appropriate for Git comments.

In general, new features should go to PECL or experimental branches until there are specific reasons for directly adding them to the core distribution.

Aliases & legacy documentation

You may also have some deprecated aliases with close to duplicate names, for example, somedb_select_result and somedb_selectresult. For documentation purposes, these will only be documented by the most current name, with the aliases listed in the documentation for the parent function. For ease of reference, user-functions with completely different names, that alias to the same function (such as highlight_file and show_source), will be separately documented.

Backwards compatible functions and names should be maintained as long as the code can reasonably be kept as part of the codebase. See the README in the PHP documentation repository for more information on documentation.