Following is a merge of two letters I sent to php4beta@lists.php.net,
describing the changes in API between PHP 3.0 and PHP 4.0 (Zend).
This file is by no means thorough documentation of the PHP API,
and is intended for developers who are familiar with the PHP 3.0 API,
and want to port their code to PHP 4.0, or take advantage of its new
features. For highlights about the PHP 3.0 API, consult apidoc.txt.

Zeev

--------------------------------------------------------------------------

I'm going to try to list the important changes in API and programming
techniques that are involved in developing modules for PHP4/Zend, as
opposed to PHP3. Listing the whole PHP4 API is way beyond my scope here;
this is mostly a 'diff' from apidoc.txt, which you're all pretty familiar
with.

An important note that I neglected to mention yesterday - the php4 tree is
based on the php 3.0.5 tree, plus all 3.0.6 patches hand-patched into it.
Notably, it does NOT include any 3.0.7 patches. All of those have to be
reapplied, with extreme care - modules should be safe to patch (mostly),
but anything that touches the core or main.c will almost certainly require
changes in order to work properly.

[1] Symbol Tables

One of the major changes in Zend involves changing the way symbol tables
work. Zend enforces reference counting on all values and resources. This
required changes in the semantics of the hash tables that implement symbol
tables. Instead of storing pval in the hashes, we now store pval *. All
of the API functions in Zend were changed in a way that makes this change
completely transparent. However, if you've used 'low level' hash functions
to access or update elements in symbol tables, your code will require
changes. Following are two simple examples: one demonstrates the
difference between PHP3 and Zend when reading a symbol's value, and the
other demonstrates the difference when writing a value.

php3_read()
{
    pval *foo;

    /* PHP3 stores the pval itself, so the lookup yields a pval * */
    _php3_hash_find(ht, "foo", sizeof("foo"), (void **) &foo);
    /* foo->type is the type and foo->value is the value */
}


php4_read()
{
    pval **foo;

    /* Zend stores a pval *, so the lookup yields a pval ** */
    _php3_hash_find(ht, "foo", sizeof("foo"), (void **) &foo);
    /* (*foo)->type is the type and (*foo)->value is the value */
}

---

php3_write()
{
    pval newval;

    newval.type = ...;
    newval.value = ...;
    /* the hash copies sizeof(pval) bytes, so a stack variable is fine */
    _php3_hash_update(ht, "bar", sizeof("bar"), &newval, sizeof(pval), NULL);
}

php4_write()
{
    pval *newval = ALLOC_ZVAL();

    newval->refcount = 1;
    newval->is_ref = 0;
    newval->type = ...;
    newval->value = ...;
    /* the hash stores the pointer, so the element size is sizeof(pval *) */
    _php3_hash_update(ht, "bar", sizeof("bar"), &newval, sizeof(pval *), NULL);
}

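To see why the extra level of indirection matters, here is a minimal
stand-alone sketch in plain C. It does not use the Zend hash API at all:
toy_val, toy_table, toy_new(), toy_release(), toy_update() and toy_find()
are hypothetical names made up for this illustration. The only point is
that once a table stores pointers, two symbols can share a single reference
counted value, and a find call naturally hands back a pointer to the slot
(the pval ** shape shown above).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pval: a long payload plus a reference count. */
typedef struct {
    long lval;
    int refcount;
} toy_val;

#define TOY_SLOTS 8

/* Hypothetical symbol table: each slot stores a toy_val *, not a toy_val. */
typedef struct {
    const char *keys[TOY_SLOTS];
    toy_val *vals[TOY_SLOTS];
    int used;
} toy_table;

/* Allocate a value that starts out with a single reference. */
static toy_val *toy_new(long lval)
{
    toy_val *v = malloc(sizeof *v);

    assert(v != NULL);
    v->lval = lval;
    v->refcount = 1;
    return v;
}

/* Drop one reference; free on the last one. Returns 1 if the value was freed. */
static int toy_release(toy_val *v)
{
    if (--v->refcount == 0) {
        free(v);
        return 1;
    }
    return 0;
}

/* Store a pointer under key. The table takes over one reference from the caller. */
static void toy_update(toy_table *t, const char *key, toy_val *v)
{
    int i;

    for (i = 0; i < t->used; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            toy_release(t->vals[i]);   /* the old value loses this slot's reference */
            t->vals[i] = v;
            return;
        }
    }
    assert(t->used < TOY_SLOTS);
    t->keys[t->used] = key;
    t->vals[t->used] = v;
    t->used++;
}

/* Same shape as php4_read(): hands back a pointer to the slot. 0 on success. */
static int toy_find(toy_table *t, const char *key, toy_val ***slot)
{
    int i;

    for (i = 0; i < t->used; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            *slot = &t->vals[i];
            return 0;
        }
    }
    return -1;
}
```

If one toy_val is stored under both "foo" and "bar" (with its refcount
bumped for the second slot), a write through either slot is visible through
the other, and the value is freed only when the last reference is released.
With values copied into the table, as in PHP3, neither is possible.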

[2] Resources

One of the 'cute' things about the reference counting support is that it
completely eliminates the problem of resource leaking. A simple loop that
included '$result = mysql_query(...)' in PHP3 leaked unless the user
remembered to run mysql_free_result($result) at the end of the loop body,
and nobody really did. To take advantage of the automatic resource
deallocation upon destruction, there's virtually one small change you need
to make. For any resource that should destroy itself as soon as it's no
longer referenced (just about any resource I can think of), change the
type of the value you return from IS_LONG to IS_RESOURCE. The rest is
magic.

A special treatment is required for SQL modules that follow MySQL's
approach of having the link handle as an optional argument. Modules that
follow the MySQL module model store the last opened link in a global
variable, which they use in case the user neglects to explicitly specify a
link handle. Due to the way reference counting works, this global
reference is just like any other reference, and must increase that SQL link
resource's reference count (otherwise, it will be closed prematurely).
Simply put, when you set the default link to a certain link, increase that
link's reference count by calling zend_list_addref().
As always, the MySQL module is the one used to demonstrate 'new
technology'. You can look around it and look for IS_RESOURCE, as well as
zend_list_addref(), to see a clear example of how the new API should be used.

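The default link rule is easier to see in a toy model. The sketch below is
plain C and does not touch the real resource list: toy_resource,
toy_list_insert(), toy_list_addref(), toy_list_delete() and
toy_set_default_link() are hypothetical names, and the 'closed' flag only
exists so that the effect of the destructor can be observed. The assumption
is simply that a resource is destroyed when its reference count drops to
zero.

```c
#include <assert.h>

#define MAX_RES 16

/* Hypothetical resource list entry; not the real Zend list structure. */
typedef struct {
    int in_use;
    int refcount;
    int closed;      /* set by the "destructor" so the effect is observable */
} toy_resource;

static toy_resource toy_list[MAX_RES];
static int toy_default_link = -1;   /* plays the role of the module global */

/* Register a new resource. It starts with the single reference held by the
   variable that receives the handle. Returns the handle, or -1 if full. */
static int toy_list_insert(void)
{
    int id;

    for (id = 0; id < MAX_RES; id++) {
        if (!toy_list[id].in_use) {
            toy_list[id].in_use = 1;
            toy_list[id].refcount = 1;
            toy_list[id].closed = 0;
            return id;
        }
    }
    return -1;
}

/* Take one more reference to a live resource. */
static void toy_list_addref(int id)
{
    assert(id >= 0 && id < MAX_RES && toy_list[id].in_use);
    toy_list[id].refcount++;
}

/* Drop one reference; run the destructor when none remain. */
static void toy_list_delete(int id)
{
    assert(id >= 0 && id < MAX_RES && toy_list[id].in_use);
    if (--toy_list[id].refcount == 0) {
        toy_list[id].closed = 1;
        toy_list[id].in_use = 0;
    }
}

/* Make id the default link. The global counts as a reference of its own. */
static void toy_set_default_link(int id)
{
    toy_list_addref(id);            /* take the new reference first, so that
                                       setting the same link twice is safe */
    if (toy_default_link != -1) {
        toy_list_delete(toy_default_link);
    }
    toy_default_link = id;
}
```

In this model, a link that was made the default survives the script
variable going away (the count drops from 2 to 1), and is closed only when
another link replaces it as the default. Without the addref in
toy_set_default_link(), the first toy_list_delete() would close a link that
the module still intends to use.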

[3] Thread safety issues

I'm not going to say that Zend was designed with thread safety in mind, but
from some point, we've decided upon several guidelines that would make the
move to thread safety much, much easier. Generally, we've followed the PHP
3.1 approach of moving global variables to a structure, and encapsulating
all global variable references within macros. There are three main
differences:
1. We grouped related globals in a single structure, instead of grouping
   all globals in one structure.
2. We've used much, much shorter macro names to increase the readability
   of the source code.
3. Regardless of whether we're compiling in thread safe mode or not, all
   global variables are *always* stored in a structure. For example, you
   would never have a global variable 'foo'; instead, it'll be a property
   of a global structure, for example, compiler_globals.foo. That makes
   development much, much easier, since your code will simply not compile
   unless you remember to put the necessary macro around foo.

To write code that'll be thread safe in the future (when we release our
thread safe memory manager and work on integrating it), you can take a look
at zend_globals.h. Essentially, two sets of macros are defined, one for
thread safe mode, and one for thread unsafe mode. All global references
are encapsulated within ???G(varname), where ??? is the appropriate prefix
for your structure (for example, so far we have CG(), EG() and AG(), which
stand for the compiler, executor and memory allocator, respectively).
When compiling with thread safety enabled, each function that makes use of
a ???G() macro must obtain the pointer to its copy of the structure. It
can do so in one of two ways:
1. It can receive it as an argument.
2. It can fetch it.

Obviously, the first method is preferable since it's much quicker.
However, it's not always possible to send the structure all the way to a
particular function, or it may simply bloat the code too much in some
cases. Functions that receive the globals as an argument should look like
this:

  rettype functionname(???LS_D)   <-- a function with no arguments
  rettype functionname(type arg1, ..., type argn ???LS_DC)   <-- a function
                                                              with arguments

Calls to such functions should look like this:
  functionname(???LS_C)   <-- a function with no arguments
  functionname(arg1, ..., argn ???LS_CC)   <-- a function with arguments

LS stands for 'Local Storage', _C stands for Call and _CC stands for Call
Comma, _D stands for Declaration and _DC stands for Declaration Comma.
Note that there's NO comma between the last argument and ???LS_DC or
???LS_CC - the macro brings its own comma when one is needed.

If you do not pass the structure as an argument you can always fetch it by
using the appropriate ???LS_FETCH() macro within a function's variable
declarations.

In general, every module that makes use of globals should use this approach
if it plans to be thread safe.

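Here's a compact sketch of how a module could apply the same pattern to
its own globals. Everything named MY* or my_* is hypothetical (a made-up
module prefix, a made-up MY_THREAD_SAFE switch and a made-up
my_fetch_globals() helper); the real macros live in zend_globals.h and may
differ in detail. The sketch only shows how one set of sources compiles in
both modes.

```c
#include <assert.h>

/* Hypothetical module globals, grouped in one structure. */
typedef struct {
    int counter;
} my_globals_t;

#ifdef MY_THREAD_SAFE
/* Thread safe mode: every function reaches the structure through a pointer. */
# define MYLS_D        my_globals_t *my_globals
# define MYLS_DC       , MYLS_D
# define MYLS_C        my_globals
# define MYLS_CC       , MYLS_C
# define MYG(v)        (my_globals->v)
# define MYLS_FETCH()  my_globals_t *my_globals = my_fetch_globals()

/* Stand-in for a per-thread lookup. A real build would consult thread local
   storage here; this sketch just returns one static copy. */
static my_globals_t *my_fetch_globals(void)
{
    static my_globals_t the_copy;
    return &the_copy;
}
#else
/* Thread unsafe mode: one real global structure, and the argument macros
   expand to nothing (or to void for an empty argument list). */
static my_globals_t my_globals;
# define MYLS_D        void
# define MYLS_DC
# define MYLS_C
# define MYLS_CC
# define MYG(v)        (my_globals.v)
# define MYLS_FETCH()
#endif

/* Receives the globals as an argument: note there is no comma before MYLS_DC. */
static void my_add(int amount MYLS_DC)
{
    MYG(counter) += amount;
}

/* A function with no arguments of its own. */
static int my_get(MYLS_D)
{
    return MYG(counter);
}

/* An entry point that was not handed the structure, so it fetches it once
   and passes it down to the functions it calls. */
static int my_bump_twice(void)
{
    MYLS_FETCH();

    my_add(1 MYLS_CC);
    my_add(1 MYLS_CC);
    return my_get(MYLS_C);
}
```

Calling my_bump_twice() returns 2 the first time and 4 the second time,
whether or not MY_THREAD_SAFE is defined. The fetch is done once at the
entry point and the pointer is handed down from there, which is the
cheaper of the two forms described above.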

[4] Generalized INI support

The code comes to solve several issues:

a. Get rid of the ugly long block of code in main.c that reads values from
   the cfg_hash into php3_ini.
b. Get rid of php3_ini. The performance penalty of copying it around all
   the time in the Apache module probably wasn't too high, but
   psychologically, it annoyed me :)
c. Get rid of the ugly code in mod_php4.c, that also reads values from
   Apache directives and puts them into the php3_ini structure.
d. Generalize all the code so that you only have to add an entry in one
   single place and get it automatically supported in php3.ini, Apache, the
   Win32 registry, the runtime functions ini_get() and ini_alter(), and any
   future method we might have.
e. Allow users to easily override *ANY* php3.ini value, except for ones
   they're not supposed to, of course.

I'm happy to say that I think I pretty much reached all goals. php_ini.c
implements a mechanism that lets you add your INI entry in a single place,
with a default value in case there's no php3.ini value. What you get by
using this mechanism:

1. Automatic initialization from php3.ini if available, or from the
   default value if not.
2. Automatic support in ini_alter(). That means a user can change the
   value for this INI entry at runtime, without you having to add a single
   line of code, and definitely no additional function (for example, in
   PHP3, we had to add special dedicated functions, like
   set_magic_quotes_runtime() or the like - no need for that anymore).
3. Automatic support in Apache .conf files.
4. No need for a global php3_ini-like variable that'll store all that
   info. You can directly access each INI entry by name, at runtime. 'Sure,
   that's not revolutionary, it's just slow' is probably what some of you
   think - which is true, but you can also register a callback function
   that is called each time your INI entry is changed, if you wish to store
   it in a cached location for intensive use.
5. Ability to access the current active value for a given INI entry, and
   the 'master' value.

Of course, (2) and (3) are only applicable in some cases. Some entries
shouldn't be overridden by users at runtime or through Apache .conf files -
you can, of course, mark them as such.

So, enough hype, how does it work?

Essentially:

static PHP_INI_MH(OnChangeBar);    /* declare a message handler for a change
                                      in "bar" */

PHP_INI_BEGIN()
    PHP_INI_ENTRY("foo", "1", PHP_INI_ALL, NULL, NULL)
    PHP_INI_ENTRY("bar", "bah", PHP_INI_SYSTEM, OnChangeBar, NULL)
PHP_INI_END()

static PHP_INI_MH(OnChangeBar)
{
    a_global_var_for_bar = new_value;    /* cache the new value */
    return SUCCESS;
}

int whatever_minit(INIT_FUNC_ARGS)
{
    ...
    REGISTER_INI_ENTRIES();
    ...
}


int whatever_mshutdown(SHUTDOWN_FUNC_ARGS)
{
    ...
    UNREGISTER_INI_ENTRIES();
    ...
}

and that's it. Here's what it does. As you can probably guess, this code
registers two INI entries - "foo" and "bar". They're given defaults "1"
and "bah" respectively - note that all defaults are always given as
strings. That doesn't reduce your ability to use integer values, simply
specify them as strings. "foo" is marked so that it can be changed by
anyone at any time (PHP_INI_ALL), whereas "bar" is marked so it can be
changed only at startup, in php3.ini, presumably by the system
administrator (PHP_INI_SYSTEM).
When "foo" changes, no function is called. Access to it is done using the
macros INI_INT("foo"), INI_FLT("foo") or INI_STR("foo"), which return a
long, double or char * respectively (strings that are returned aren't
duplicated - if you need to manipulate them, you must duplicate them
first). You can also access the original value (the 'master' value, in
case it was overridden by a user) using another set of macros:
INI_ORIG_INT("foo"), INI_ORIG_FLT("foo") and INI_ORIG_STR("foo").

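One way to picture the third argument of an entry is as a set of bits, one
per stage at which a change may arrive. The following is only a sketch
under that assumption: the TOY_INI_* names and their values are invented
here, the per-directory stage is a guess at how Apache .conf overrides
might be classified, and toy_may_modify() is a hypothetical helper, not
something php_ini.c is known to contain.

```c
#include <assert.h>

/* Hypothetical stage flags; the values are made up for this sketch. */
#define TOY_INI_USER    (1 << 0)   /* changed by a script at runtime */
#define TOY_INI_PERDIR  (1 << 1)   /* changed by a per-directory server config */
#define TOY_INI_SYSTEM  (1 << 2)   /* changed at startup by the administrator */
#define TOY_INI_ALL     (TOY_INI_USER | TOY_INI_PERDIR | TOY_INI_SYSTEM)

/* An entry may be changed at a given stage only if it was registered with
   that stage's bit set. Returns non-zero when the change is allowed. */
static int toy_may_modify(int modifiable, int stage)
{
    return (modifiable & stage) != 0;
}
```

Under this model an entry registered with TOY_INI_ALL accepts a runtime
change, while one registered with TOY_INI_SYSTEM refuses it and accepts
changes only at startup.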

When "bar" changes, a special message handler is called, OnChangeBar().
Always declare those message handlers using PHP_INI_MH(), as their
signature might change in the future. Message handlers are called as soon
as an INI entry initializes or changes, and allow you to cache a certain
INI value in a quick C structure. In this example, whenever "bar" changes,
the new value is stored in a_global_var_for_bar, which is a global char *
pointer, quickly accessible from other functions. Things get a bit more
complicated when you want to implement a thread-safe module, but it's
doable as well.
Message handlers may return SUCCESS to acknowledge the new value, or
FAILURE to reject it. That enables you to reject invalid values for some
INI entries if you want. Finally, you can have a pointer passed to your
message handler - that's the fifth argument to PHP_INI_ENTRY(). It is
passed as mh_arg to the message handler.

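The accept/reject contract of a message handler can be tried out without
PHP at all. Below is a stand-alone sketch of the idea in plain C. None of
it is the real php_ini.c code: toy_ini_entry, toy_ini_register(),
toy_ini_alter(), toy_ini_int(), the "limit" entry and the TOY_SUCCESS /
TOY_FAILURE constants are all hypothetical. It assumes only what is
described above: values are kept as strings, a handler runs when an entry
initializes or changes, and a handler may veto a change.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SUCCESS 0
#define TOY_FAILURE (-1)

/* Hypothetical handler type: gets the proposed value and the entry's extra pointer. */
typedef int (*toy_ini_mh)(const char *new_value, void *mh_arg);

typedef struct {
    const char *name;
    const char *value;        /* current value, always a string */
    const char *orig_value;   /* the 'master' value */
    toy_ini_mh on_modify;     /* may be NULL */
    void *mh_arg;
} toy_ini_entry;

static long cached_limit;     /* plays the role of a_global_var_for_bar */

/* Accept only non-negative whole numbers, and cache the parsed value. */
static int on_change_limit(const char *new_value, void *mh_arg)
{
    char *end;
    long parsed = strtol(new_value, &end, 10);

    (void) mh_arg;            /* unused in this sketch */
    if (end == new_value || *end != '\0' || parsed < 0) {
        return TOY_FAILURE;
    }
    cached_limit = parsed;
    return TOY_SUCCESS;
}

static toy_ini_entry toy_entries[] = {
    { "foo",   "1",  "1",  NULL,            NULL },
    { "limit", "10", "10", on_change_limit, NULL },
};

#define TOY_COUNT (sizeof toy_entries / sizeof toy_entries[0])

static toy_ini_entry *toy_ini_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < TOY_COUNT; i++) {
        if (strcmp(toy_entries[i].name, name) == 0) {
            return &toy_entries[i];
        }
    }
    return NULL;
}

/* Run each handler once with the default, as happens when entries initialize. */
static void toy_ini_register(void)
{
    size_t i;

    for (i = 0; i < TOY_COUNT; i++) {
        if (toy_entries[i].on_modify) {
            toy_entries[i].on_modify(toy_entries[i].value, toy_entries[i].mh_arg);
        }
    }
}

/* Change an entry at runtime; the handler can veto the change.
   The string is not copied, so new_value must outlive the entry. */
static int toy_ini_alter(const char *name, const char *new_value)
{
    toy_ini_entry *e = toy_ini_lookup(name);

    if (e == NULL) {
        return TOY_FAILURE;
    }
    if (e->on_modify && e->on_modify(new_value, e->mh_arg) == TOY_FAILURE) {
        return TOY_FAILURE;   /* rejected: the old value stays in place */
    }
    e->value = new_value;
    return TOY_SUCCESS;
}

/* Rough analogue of reading an entry as a long on demand. */
static long toy_ini_int(const char *name)
{
    toy_ini_entry *e = toy_ini_lookup(name);

    return e ? strtol(e->value, NULL, 10) : 0;
}
```

After toy_ini_register(), cached_limit holds 10. Altering "limit" to "25"
succeeds and updates the cache, while altering it to "-3" fails and leaves
both the entry and the cache untouched. "foo" has no handler, so it is
simply read on demand with toy_ini_int(), at the cost of a lookup and a
string conversion each time.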

Remember that for certain values, there's really no reason to mess with a
callback function. A perfect example of this is the syntax highlight
colors, which no longer have a dedicated global C slot that stores them,
but instead are fetched from the php_ini hash on demand.

"As always", for a perfect working example of this mechanism, consult
functions/mysql.c. This module uses the new INI entry mechanism, and was
also converted to be thread safe in general, and in its php_ini support in
particular. Converting your modules to look like this for thread safety
isn't a bad idea (not necessarily now, but in the long run).