$Id$ PHP 7.0 INTERNALS UPGRADE NOTES 0. Wiki Examples 1. Internal API changes e. New data types f. zend_parse_parameters() specs g. sprintf() formats h. HashTable API i. New portable macros for large file support j. New portable macros for integers k. get_class_entry object handler info l. get_class_name object handler info m. Other portable macros info n. ZEND_ENGINE_2 removal o. Updated final class modifier p. TSRM changes q. gc_collect_cycles() is now hookable r. PDO uses size_t for lengths 2. Build system changes a. Unix build system changes b. Windows build system changes ================ 0. Wiki Examples ================ The wiki contains multiple examples and further explanations of the internal changes. See: https://wiki.php.net/phpng-upgrading ======================== 1. Internal API changes ======================== e. New data types String Besides the old way of accepting the strings with 's', the new 'S' ZPP spec was introduced. It expects an argument of the type zend_string *. String lengths in it are bound to the size_t datatype. Integer types Integers do no more depend on the firm 'long' type. Instead a platform dependent integer type is used, it is called zend_long. That datatype is defined dynamically to guarantee the consistent 64 bit support. The zval field representing user land integer it bound to zend_long. Signed integer is defined as zend_long, unsigned integer as zend_ulong inside Zend. Other datatypes zend_off_t - portable off_t analogue zend_stat_t - portable 'struct stat' analogue These datatypes are declared to be portable across platforms. Thus, direct usage of the functions like fseek, stat, etc. as well as direct usage of off_t and struct stat is strongly not recommended. Instead the portable macros should be used. zend_fseek - portable fseek equivalent zend_ftell - portable ftell equivalent zend_lseek - portable lseek equivalent zend_fstat - portable fstat equivalent zend_stat - portable stat equivalent f. zend_parse_parameters() specs The new spec 'S' introduced, which expects an argument of type zend_string *. The 'l' spec expects a parameter of the type zend_long, not long anymore. The 's' spec expects parameters of the type char * and size_t, no int anymore. g. sprintf() formats New printf modifier 'p' was implemented to platform independently output zend_long, zend_ulong and php_size_t datatypes. That modifier can be used with'd', 'u', 'x' and 'o' printf format specs with spprintf, snprintf and the wrapping printf implementations. %pu is sufficient for both zend_ulong and php_size_t. the code using %p spec to output pointer address might need to be enclosed into #ifdef when it unlickily followed by 'd', 'u', 'x' or 'o'. The only exceptions are the snprintf and zend_sprintf functions yet, because in some cases they can use the implemenations available on the system, not the PHP one. With snprintf the macros ZEND_INT_FMT and ZEND_UINT_FMT should be used. h. HashTable API Datatype for array indexes was changed to zend_ulong, for string keys to zend_string *. i. New portable macros for large file support Function(s) Alias Comment stat, _stat64 zend_stat for use with zend_stat_t fstat, _fstat64 zend_fstat for use with zend_stat_t lseek, _lseeki64 zend_lseek for use with zend_off_t ftell, _ftelli64 zend_ftell for use with zend_off_t fseek, _fseeki64 zend_fseek for use with zend_off_t j. New portable macros for integers Function(s) Alias Comment snprintf with "%ld" or "%lld", _ltoa_s, _i64toa_s ZEND_LTOA for use with zend_long atol, atoll, _atoi64 ZEND_ATOL for use with zend_long strtol, strtoll, _strtoi64 ZEND_STRTOL for use with zend_long strtoul, strtoull, _strtoui64 ZEND_STRTOUL for use with zend_long abs, llabs, _abs64 ZEND_ABS for use with zend_long - ZEND_LONG_MAX replaces LONG_MAX where appropriate - ZEND_LONG_MIN replaces LONG_MIN where appropriate - ZEND_ULONG_MAX replaces ULONG_MAX where appropriate - SIZEOF_ZEND_LONG reworked SIZEOF_ZEND_LONG representing the size of zend_long datatype - ZEND_SIZE_MAX Max value of size_t - Z_L casts an integral constant to zend_long - Z_UL casts an integral constant to zend_ulong The macro ZEND_ENABLE_ZVAL_LONG64 reveals whether zval operates on 64 or 32 bit integer. k. The get_class_entry object handler is no longer available. Instead zend_object.ce is always used. l. The get_class_name object handler is now only used for displaying class names in debugging functions like var_dump(). It is no longer used in get_class(), get_parent_class() or similar. The handler is now obligatory, no longer accepts a `parent` argument and must return a non-NULL zend_string*, which will be released by the caller. m. Other portable macros info ZEND_SECURE_ZERO - zeroes chunk of memory ZEND_VALID_SOCKET - validates a php_socket_t variable ZEND_FASTCALL is defined to use __vectorcall convention on VS2013 and above ZEND_NORETURN is defined as __declspec(noreturn) on VS n. The ZEND_ENGINE_2 macro has been removed. A ZEND_ENGINE_3 macro has been added. o. Removed ZEND_ACC_FINAL_CLASS in favour of ZEND_ACC_FINAL, turning final class modifier now a different class entry flag. Update your extensions. p. TSRM changes The TSRM layer undergone significant changes. It is not necessary anymore to pass the tsrm_ls pointer explicitly. The TSRMLS_* macros should not be used in any new developments. The recommended way accessing globals in TS builds is by implementing it to use a local thread unique pointer to the corresponding data pool. These are the new macros that serve to achieve this goal: - ZEND_TSRMG - based on TSRMG and if static pointer for data pool access is implemented, will use it - ZEND_TSRMLS_CACHE_EXTERN - to be used in a header shared across multiple C/C++ files - ZEND_TSRMLS_CACHE_DEFINE - to be used in the main extension C/C++ file - ZEND_TSRMLS_CACHE_UPDATE - to be integrated at the top of the globals ctor or RINIT function - ZEND_ENABLE_STATIC_TSRMLS_CACHE - to be used in the config.[m4|w32] for enabling the globals access per thread unique pointer - TSRMLS_CACHE - expands to the variable name which links to the thread unique data pool Note that usually it will work without implementing the globals access per a thread unique pointer, however it might cause a slowdown. The reason is that the access model to the extension/SAPI own globals as well as the access to the core globals is determined in the compilation phase depending on whether ZEND_ENABLE_STATIC_TSRMLS_CACHE macro is defined. Due to the current compiler evolution state, the TSRMLS_CACHE pointer (when implemented) has to be present and updated in every binary unit which is not compiled statically into PHP. So any DLL, DSO, EXE, etc. has to take care about defining and updating it's TSRMLS cache. An update should happen before any globals was accessed. Porting an extension or SAPI is usually as easy as removing all the TSRMLS_* occurrences and integrating the macros mentioned above. However if tsrm_ls is used explicitly, its usage can be considered obsolete in most cases. Additionally, if an extension triggers its own threads, TSRMLS_CACHE shouldn't be passed to that threads directly. When working with CTOR/DTOR for the extension globals, only the data passed as parameters should be touched. For example, don't use code like free(MYEXT_G(vars)); inside globals destructor, as it possibly can destroy the data from a wrong thread. Instead use the pointer to the data delivered into the destructor. So the previous example could look like free(my_struct->vars); given the income was casted to (struct my_struct *). q. gc_collect_cycles() is now a function pointer, and can be replaced in the same manner as zend_execute_ex() if needed (for example, to include the time spent in the garbage collector in a profiler). The default implementation has been renamed to zend_gc_collect_cycles(), and is exported with ZEND_API. r. In accordance with general use of size_t as string length, all PDO API functions now use size_t for string length. s. Removed ZEND_REGISTER/FETCH_RESOURCE, use zend_fetch_resource and and zend_register_resource instead, zend_fetch_resource accepts a zend_resource * as first argument, if you still need to fetch a resource from zval, use zend_fetch_resource_ex. More details can be found in Zend/zend_list.c. t. Optimized strings concatenation. ZEND_ADD_STRING/VAR/CHAR are replaced with ZEND_ROPE_INIT, ZEND_ROPE_ADD, ZEND_ROPE_END. Instead of reallocation and copying string on each ZEND_ADD_STRING/VAR/CAHR, collect all the strings and then allocate and construct the resulting string once. ======================== 2. Build system changes ======================== a. Unix build system changes b. Windows build system changes - Besides Visual Studio, building with Clang or Intel Composer is now possible. To enable an alternative toolset, add the option --with-toolset=[vs,clang,icc] to the configure line. The default toolset is vs. Still clang or icc need the correct environment which involves many tools from the vs toolset. The toolset option is supported by phpize as well. AWARENESS The only recommended and supported toolset to produce production ready binaries is Visual Studio. Still other compilers can be used now for testing and analyzing purposes. - configure.js now produces response files which are passed to the linker and library manager. This solves the issues with the long command lines which can exceed the OS limit. - with clang toolset, an option --with-uncritical-warn-choke is available, which suppresses the most frequent false positive warnings - the --with-mp option will by default utilize all the available cores. It's enabled by default and can be disabled with the special "disable" keyword. ======================== 3. Module changes ======================== Session: - PS_MOD_UPDATE_TIMESTAMP() session save handler definition is added. New save handler should use PS_MOD_UPDATE_TIMESTAMP() definition. Save handler must not refer/change PS() variables directly from save handler. Use parameters. - PS_MOD()/PS_MOD_SID() macro exist only for transition. Beware these save handlers are less secure and slower than PS_MOD_UPDATE_TIMESTAMP(). - PS(invalid_session_id) was removed. It was never worked as it supposed. To report invalid session ID, use PS_VALIDATE_SID() handler. - PS_VALIDATE_SID() defines session ID validation handler. This handler must validate if requested session ID exists in session data storage or not. Do not access PS(id) directly, but use this handler and it's parameter. - PS_UPDATE_TIMESTAMP() defines timestamp updating handler. This handler must update session data timestamp for GC if it is needed. e.g. Memcache updates timestamp on read, so it does not need to update timestamp. Return SUCCESS simply for this case. - PS_CREATE_SID() should check session ID collision. Return NULL for failure. - More documentation can be found in ext/session/mod_files.c as comments. mod_files.c may be used as reference implementation.