APACHE PORTABLE RUNTIME (APR) LIBRARY STATUS:   -*- coding: utf-8 -*-
Last modified at [$Date$]

Releases:
    2.0.0     : in development on trunk/
    1.5.0     : in development on branches/1.5.x/
    1.4.9     : in development on branches/1.4.x/
    1.4.8     : released June 21, 2013
    1.4.7     : not released
    1.4.6     : released Feb 14, 2012
    1.4.5     : released May 22, 2011
    1.4.4     : released May 9, 2011
    1.4.3     : not released
    1.4.2     : released April 3, 2010
    1.4.1     : not released
    1.4.0     : not released
    1.3.12    : released April 4, 2010
    1.3.11    : not released
    1.3.10    : not released
    1.3.9     : released September 23, 2009
    1.3.8     : released August 6, 2009
    1.3.7     : released July 23, 2009
    1.3.6     : released July 4, 2009
    1.3.5     : released June 5, 2009
    1.3.4     : not released
    1.3.3     : released August 14, 2008
    1.3.2     : released June 23, 2008
    1.3.1     : not released
    1.3.0     : released June 3, 2008
    1.2.12    : released November 25, 2007
    1.2.11    : released September 6, 2007
    1.2.10    : not released
    1.2.9     : released June 7, 2007
    1.2.8     : released December 4, 2006
    1.2.7     : released April 14, 2006
    1.2.6     : released March 25, 2006
    1.2.5     : not released
    1.2.4     : not released
    1.2.3     : not released
    1.2.2     : released October 11, 2005
    1.2.1     : released August 18, 2005
    1.2.0     : not released
    1.1.2     : no such version
    1.1.1     : released March 17, 2005
    1.1.0     : released January 25, 2005
    1.0.1     : released November 19, 2004
    1.0.0     : released September 1, 2004
    0.9.21    : in maintenance
    0.9.20    : released September 16, 2011
    0.9.19    : released October 17, 2010
    0.9.18    : released June 5, 2009
    0.9.17    : released November 25, 2007
    0.9.16    : released September 6, 2007
    0.9.15    : not released
    0.9.14    : released June 7, 2007
    0.9.13    : released December 4, 2006
    0.9.12    : released April 14, 2006
    0.9.11    : released March 30, 2006
    0.9.10    : not released
    0.9.9     : not released
    0.9.9     : not released
    0.9.7     : released October 11, 2005
    0.9.6     : released February 4, 2005
    0.9.5     : released November 16, 2004
    0.9.4     : released September 25, 2003
    0.9.3     : released April 3, 2003
    0.9.2     : released March 22, 2003
    0.9.1     : released September 11, 2002
    0.9.0     : released August 28, 2002

Bundled with httpd:
    2.0a9     : released December 12, 2000
    2.0a8     : released November 20, 2000
    2.0a7     : released October 8, 2000
    2.0a6     : released August 18, 2000
    2.0a5     : released August 4, 2000
    2.0a4     : released June 7, 2000
    2.0a3     : released April 28, 2000
    2.0a2     : released March 31, 2000
    2.0a1     : released March 10, 2000


RELEASE SHOWSTOPPERS:


CURRENT VOTES:


CURRENT test/testall -v EXCEPTIONS:

    Please add any platform anomalies to the following exception list.

    * 'testipsub' will tickle a Solaris 8 getaddrinfo() IPv6
      bug, causing the test to hang.  Configure with --disable-ipv6
      if using an unpatched Solaris 8 installation.

    * The 'testdso' tests will not work if configured with
      --disable-shared since the loadable modules cannot be built.

    * 'testdso' fails on older versions of OpenBSD due to
      dlsym(NULL, ...) segfaulting.

    * Win32 Not Implemented tests
        poll:      pollcb not implemented
        procmutex: lacks fork() support
        sock :     Sync behavior causes us to skip one test
        sockets:   tcp6_socket/udp6_socket skipped for no IPv6 adapter
        sockopt:   TCP isn't corkable
        users:     username: Groups from apr_uid_get not implemented

    * Win32 tests are known to fail when APR_HAVE_IPV6 is set but no
      IPv6 adapter is loaded (even loopback is sufficient).
      getaddrinfo() obnoxiously returns no results when looking up a
      fixed IPv4-IPv6 mixed-notation address, which reflects a Win32 bug.
        ipsub:     One test fails for IPv6 with no IPv6 adapter configured
        sock :     One test fails for IPv6 with no IPv6 adapter configured


ONGOING REMINDERS FOR STYLE/SUBSTANCE OF CONTRIBUTING TO APR:

    * Flesh out the test suite and make sure it passes on all platforms.
      We currently have about 450 functions in APR and 147 tests.  That
      means we have a large number of functions that we can't verify are
      actually portable.  This TODO includes finishing the migration to
      the unified test suite, and adding more tests to make the suite
      comprehensive.

    * Eliminate the TODO's and XXX's by using the doxygen @bug feature
      to allow us to better track the open issues, and provide historical
      bug lists that help porters understand what was wrong in the old
      versions of APR that they would be upgrading from.

    * Continue to review, deprecate and eliminate from 2.0 all namespace
      un-protected names throughout include/apr_foo.h headers.


RELEASE NON-SHOWSTOPPERS BUT WOULD BE REAL NICE TO WRAP THESE UP:

    * Need some architecture/OS specific versions of the atomic operations.
        progress: generic, solaris Sparc, FreeBSD5, linux, and OS/390 done
        need: AIX, AS400, HPUX

    * The new lock API is a full replacement for the old API, but is
      not yet complete on all platforms.  Components that are incomplete
      or missing include:
      NetWare: apr_proc_mutex_*() (Is proc_mutex unnecessary on NetWare?)
          * proc_mutex is not necessary on NetWare since the OS does
            not support processes.  The proc_mutex APIs actually
            redirect to the thread_mutex APIs. (bnicholes)
      OS/2: apr_thread_cond_*(), apr_proc_mutex_*()

      Less critical components that we may wish to add at some point:
      BeOS: apr_thread_rwlock_try*lock()
            apr_proc_mutex_trylock()
      Unix: apr_thread_rwlock_*() for platforms w/o rwlocks in pthread

    * Need to contemplate apr_strftime... platforms vary.  OtherBill
      suggested this solution (but has no time to implement):
        Document our list of 'supported' escapes.
        Run some autoconf/m4 magic against the complete list we support.
        Move the strftime re-implementation from time/win32 to time/unix.
        Add some APR_HAVE_STRFTIME magic to use the system fn, or fail
        over to time/unix/strftime.c.
      Message-ID: <025e01c1a891$bf41f660$94c0b0d0@v505>

    * Using reentrant libraries with non-threaded APR
        - Anecdotal evidence exists that suggests it is bad to mix
          reentrant and non-reentrant libraries and therefore we
          should always use the reentrant versions.
        - Unfortunately, on some platforms (AIX 4.2.1) defining the
          reentrant flag (-D_THREAD_SAFE) causes builds to fail, and
          so one would expect --disable-threads to fix this.
          Although this has been fixed for that particular version of
          AIX, it may be useful to only enable the reentrant versions
          when threads are enabled.

      How will we deal with this issue once APR becomes a standalone
      library?  It is perfectly legitimate to have apps needing both
      versions (threaded/reentrant and non-threaded/non-reentrant) on
      the same machine.

      Wrowe chuckles, uhm, it already is.  And it seems most have
      shifted to shipping threaded builds, of at least apr itself.

    * Pools debugging
        - Find a way to check whether a pool is used in multiple
          threads while the creation flags say it isn't.  IOW, when
          the pool was created with APR_POOL_FNEWALLOCATOR, but
          without APR_POOL_FLOCK.
          Currently, no matter what the creation flags say, we always
          create a lock.  Without it integrity_check() and
          apr_pool_num_bytes() blow up (because they traverse pools
          child lists that possibly belong to another thread, in
          combination with the pool having no lock).  However, this
          might actually hide problems like creating a child pool of
          a pool belonging to another thread.
          Maybe a debug function apr_pool_set_owner(apr_thread_t *)
          in combination with extra checks in integrity_check() will
          point out these problems.  apr_pool_set_owner() would need
          to be called every time the owner (the thread the pool is
          being used in) of the pool changes.

        - Implement apr_pool_join and apr_pool_lock.  Those functions
          are no-ops at the moment.

        - Add stats to the pools code.  We already have basic stats
          in debug mode.  Stats that tell us about wasted memory
          in the production code require more thought.

      Status: Sander Striker is looking into this (low priority)
      David says this is a 1.1 issue.

    * Get OTHER_CHILD support into Win32
      Status: Bill S. is looking into this

    * SysV semaphore support isn't usable by Apache when started as
      root because we don't have a way to allow the semaphore to be
      used by the configured User and Group.  Current work-around:
      change the initial permissions to 0666.  Needed code:  See
      1.3's http_main.c, SysV sem flavor of accept_mutex_init().
      Status: Jim will look into this
      Update: Apache deals with this itself, though it might be nice
              if APR could do something.

    * Build scripts do not recognise AIX 4.2.1 pthreads
      Justin says: "Is this still true?"

    * FirstBill says we need a new procattr, APR_CREATE_SUSPENDED (or
      something similar) to direct ap_create_process to create the
      process suspended.  We also need a call to wake up the suspended
      process.  This may not be able to be implemented everywhere though.
      Status: OtherBill asks, why?  What is the benefit, how is it
              portably implemented?  Unless this creates something
              tangible that mirrors another platform, then I'm -1.

    * Replace tables with a proper opaque ADT that has pluggable
      implementations (including something like the existing data type,
      plus hash tables for speed, with options for more later).
      Status: fanf is working on this.

    * add a version number to apr_initialize() as an extra failsafe against
      (APR) library version skew.
      MsgID:
      Status: Greg -1, Jeff +1, Ryan +1, Tony -0(?), david +1

    * add apr_crypt() and APR_HAS_CRYPT for apps to determine whether the
      crypt() function is available, and a way to call it (whether it is
      located in libc, libcrypt, or libufc)
      Justin says: Should apr_crypt() be in apr-util?
      Wrowe answers: of course!  It's called openssl DES_fcrypt ;-)

    * use os_(un)cork in network_io/unix/sendrecv.c for FreeBSD's
      sendfile implementation.

      david: The socket options stuff is now in and using it should
             reduce the number of syscalls that are required for
             os_cork and uncork, so the code should be reviewed to
             make use of the new calls.  If no-one beats me to it I'll
             get around to it soonish...

    * toss the per-Makefile setup of INCLUDES; shift to rules.mk.in
      rbb: This is a bad thing IMHO.  If we do this, then we can't use
           these makefiles for anything else.  For example, apr-util

    * add the rest of the pool accessor declare/impl macros.
      Justin says: Both thread and file have the accessors now.  Any others?
      Status: Greg volunteers

    * I think apr_open_stderr() and friends *should* dup() the
      descriptor.  That would allow the new/returned file to be closed
      (via pool cleanup or manually) without accidentally closing
      stderr/out.
      wrowe: votes -1, reasons directly manipulate this through APR

    * need to export (in code, not just build scripts) the shared
      library extension (e.g. ".so") for the platform.  clients need to
      use this to construct filenames to pass to apr_dso_load()
      -- note on Win32 we distinguish 'apache module' names from other
         'loadable module' names, so be careful with Apache's directive.
         AIX, HPUX may use similar (.so for a 'module's name while the
         defaults .a or .sl are used for libs.)

    * Possible gmtime_r replacement in explode_time
      On Solaris (and possibly others), the gmtime_r libc function obtains
      a mutex.  We have seen 21/25 threads being blocked in this mutex on
      a threaded httpd MPM when requesting static pages.  It may be worth
      it to hand optimize this since there is no real need for a mutex at
      the system level (straight arithmetic from what I can tell).  If you
      have access to the Solaris source code:
      osnet_volume/usr/src/lib/libc/port/gen/time_comm.c.

    * Add a way to query APR for what features it has at runtime
      (i.e. threads).
      Justin says: I'm not completely sold on this, but it has been
                   mentioned before and at least added to STATUS.

    * apr_xlate.h generates a bunch of compiler warnings.
      Jeff asks: which platform?
      Justin says: Solaris with Forte 6.1.

    * fcntl() oddness on Solaris.  Under high loads, fcntl() decides to
      return error code 46 (ENOLCK).

      httpd (prefork MPM) error log says (predictably):

      (46)No record locks available: couldn't grab the accept mutex

      All of the children report this and subsequently exit.  httpd is
      now hosed.  AFAICT, this does not look to be an out-of-fds error.

      Solaris's man page says:
        ENOLCK
            The cmd argument is F_SETLK, F_SETLK64, F_SETLKW, or
            F_SETLKW64 and satisfying the lock or unlock request
            would result in the number of locked regions in the
            system exceeding a system-imposed limit.

      Justin says: What is this system-imposed limit and how do we
                   change it?  This gives me more rationale for
                   switching the default interprocess lock mechanism
                   to pthread (if available).

      Explanation (from Kristofer Spinka):
      ============
      The system imposed default limit of outstanding lock requests is
      512.  You can verify this by, in a contemporary version of Solaris:

        # mdb -k
        > tune_t_flckrec/D
        tune_t_flckrec:
        tune_t_flckrec: 512

      This can be increased by adding the following to /etc/system:

        set tune_t_flckrec=1024

      and rebooting.

      Of course "1024" can be any reasonable limit, although we do not
      know what "reasonable" should be, so be conservative and only
      increase this as necessary.

    * Generate a good bug report to send to the FreeBSD hackers that
      details the problems we have seen with threads and system calls
      (specifically sendfile data is corrupted).  From our analysis so
      far, we don't think that this is an APR issue, but rather a FreeBSD
      kernel issue.  Our current solution is to just disable threads
      across the board on FreeBSD.
        MsgID: <20010828091959.D17570@ebuilt.com>
      Status: Fixed in -CURRENT.  MFC in about a week.  Continuing
              testing with threads on FreeBSD.

              FreeBSD PR kern/32684:
              http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/32684

    * There are some optimizations that can be done to the new
      apr_proc_*() functions (on UNIX).  One that may reduce pointer
      indirection would be to make the apr_proc_mutex_unix_lock_methods_t
      first-class members of the apr_proc_mutex_t structure.

    * Condition variables are tricky enough to use, and even trickier
      to implement properly.  We could really use a better test case
      for those subtle quirks that sometimes creep into CV
      implementations.

    * Once we are fully satisfied with the new lock API, we can begin
      to migrate the old API to be implemented on top of the new one,
      or just decide to get rid of it altogether.

    * FreeBSD returns 45 (EOPNOTSUPP) when the lockfile is on an NFS
      partition when you call fcntl(F_SETLKW).  It may be good if we can
      somehow detect this and error out when creating the lock rather
      than waiting for the error to occur when acquiring the lock.

    * Fix autoconf tests for strerror_r on BeOS and remove the hack in
      misc/unix/errorcodes.c to get error reporting working.  Committed
      as the solution is elusive at present.

    * implement APR_PROGRAM_ENV and APR_PROGRAM_PATH on BeOS, OS/2,
      NetWare

    * stat() on a few platforms (notably Solaris and AIX) succeeds for
      a non-directory even if a trailing '/' was specified in the
      name.  APR should perhaps simulate the normal -1/ENOTDIR behavior
      in APR routines which retrieve information about the file.
      Note: Win2K fails GetFileAttributesEx in this scenario.

      See OtherBill's comments in this message to dev@httpd.apache.org:
      Message-Id: <5.1.0.14.2.20020315080852.00bce168@localhost>

    * Identify and implement those protection bits that have general
      usefulness, perhaps hidden, generic read-only [immutable],
      effective current user permissions, etc.

    * dso getsym implementations are becoming very strict about
      returning a fn pointer vs. a data pointer; this should be split
      in apr_dso.


Interface Changes Postponed for APR 2.0:

    * apr_atomic_casptr() has the volatile qualifier in the wrong
      place: should take "pointer to volatile pointer to void", not
      "pointer to pointer to volatile void".

    * apr_socket_sendfile(): the offset parameter should not be
      pass-by-reference, or it should be updated to do something useful.

    * apr_password_get(): the bufsize parameter should not be
      pass-by-reference.

    * apr_allocator.h: apr_memnode_t's use of uint32_t's doesn't match
      well with allocation sizes being apr_size_t; possibly this can be
      improved by using apr_size_t throughout.

    * apr_hash_count() should take a const apr_hash_t * argument.

    * apr_ino_t should be an ino64_t in LFS builds.

    * possible type renames:
        apr_file_info_t            from apr_finfo_t
        apr_file_attrs_t           from apr_fileattrs_t
        apr_file_seek_where_t      from apr_seek_where_t
        apr_lock_mech_e            from apr_lockmech_e
        apr_time_interval_t        from apr_interval_time_t
        apr_time_interval_short_t  from apr_short_interval_time_t

    * wrowe writes: Looking at bug 32520, it occurs to me that when
      exploding times using the apr_time_exp_* functions, it would make
      more sense to split ->tm_usec into
        ->tm_msec   thousandths (milliseconds)
        ->tm_usec   millionths (microseconds)
      for most display purposes.  It's trivial to roll them together
      with the format string %03d%03d if that's what's desired, or
      display simply %02d.%03d if millisecond resolution is desired.
      It would also shrink the fields to ints so unpacking would be
      slightly slower, using them would be slightly faster, for what's
      likely to be little impact on performance.

    * The other-child API doesn't allow the apr_exit_why_e to be
      passed to the application's maintenance function.  The expected
      usage is that the application calls apr_proc_wait[_all_procs]()
      and is given back apr_exit_why_e and exit_code_or_signal_num,
      thus losing the original (on Unix, at least) representation which
      held both pieces of information in an int.  Both pieces of data
      should be available to the maintenance function so that it has
      the opportunity to take different actions.  An example would be
      to issue messages about probable misconfiguration when receiving
      a certain exit code and trying to restart otherwise.
      Thus, apr_proc_other_child_alert() should take an additional
      apr_exit_why_e parameter, as should the application-provided
      maintenance function.  The exit-why value would be ignored in the
      same circumstances as the existing status parameter:
      reason != APR_OC_REASON_DEATH.

    * apr_file_gets() should take an apr_size_t size parameter?

    * apr_table_vdo should not continue iterating through the keys
      list once the callback function returns non-zero; see JCW's
      comment in apr_tables.c.

    * The library SONAME should vary for the different library ABIs -
      i.e. LFS support, IPv6 support or not.

    * remove APR_POLL_LASTDESC from apr_datatype_e.

    * Almost every API in APR depends on pools, but pool semantics
      aren't a good match for a lot of applications.  We need to find
      a way to support alternate allocators polymorphically without
      a significant performance penalty.

    * apr_global_mutex_child_init and apr_proc_mutex_child_init aren't
      portable.  There are a variety of problems with the locking API
      when it is used with apr_proc_create instead of apr_proc_fork.
      First, _child_init doesn't take a lockmech_e parameter so it
      causes a segfault after the apr_proc_create, because the
      proc_mutex field hasn't been initialized.  When the lockmech_e
      parameter is added, it _still_ doesn't work, because some lock
      mechanisms expect to inherit from the parent process.  For
      example, SysV semaphores don't have a file to open, so the child
      process can't reacquire the lock.

      jerenkrantz says: This is not a showstopper and I believe the
      above analysis is slightly confusing.  The real problem here is
      that apr_*_mutex_child_init assumes a shared memory space - that
      is, the child processes have access to the parent apr_*_mutex_t
      pointer.  The children just call child_init on the original,
      inherited apr_*_mutex_t.  Unlike globalmutexchild in test,
      apr_*_mutex_create is *not* intended to be called from the child
      and subsequently call child_init.  Instead, apr_proc_create is
      intended to exec separate processes with disjoint memory
      addresses.  Currently, APR does not provide a cross-platform
      mechanism for joining an already existing lock.  A simple
      'apr_*_mutex_join' which is intended to be called from separate
      processes to an already-existing lock would solve this problem.
      child_init is not intended to be used this way.  Even with SysV
      semaphores, using IPC_PRIVATE should still work due to the
      parent-child relationship.  A strawman has been posted to dev@apr:
      Message-Id: <213031CF0406DE1AC426A411@[10.0.1.137]>

    * The return type of a thread function (void *) is inconsistent
      with the type used in apr_thread_exit()/apr_thread_join()
      (apr_status_t).  The thread function's return type should be
      changed to apr_status_t so that a return from the thread main
      function has the same effect as apr_thread_exit().
      See Message-Id: for thread discussing this.
      +1: BrianH, Aaron, david, jerenkrantz
      Status: Deferred to 2.0.0 (API change)
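
      Illustration for the thread-function item above.  This is a minimal
      sketch in plain C that does not use APR at all; status_t,
      STATUS_OK, STATUS_EINVAL, pack_status(), unpack_status(),
      worker_void() and worker_status() are hypothetical names made up
      for this note.  It only shows why an entry point returning void *
      has to smuggle an integer status through a pointer, while one
      returning the status type directly does not.

```c
#include <assert.h>   /* only needed by callers that check results */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for apr_status_t; NOT the APR definition. */
typedef int status_t;

#define STATUS_OK      0
#define STATUS_EINVAL  22

/* Pack an integer status into the void * that a thread entry point
 * returns.  The round trip through intptr_t is implementation-defined
 * in C, although it behaves as expected on common ABIs. */
static void *pack_status(status_t st)
{
    return (void *)(intptr_t)st;
}

/* Recover the integer status from a packed void * return value. */
static status_t unpack_status(void *ret)
{
    return (status_t)(intptr_t)ret;
}

/* Current shape: the entry point returns void *, so every return
 * statement has to pack the status by hand. */
static void *worker_void(void *arg)
{
    const int *input = arg;

    if (input == NULL || *input < 0)
        return pack_status(STATUS_EINVAL);
    return pack_status(STATUS_OK);
}

/* Proposed shape: the entry point returns the status type directly,
 * so a plain return carries the same value a joiner would receive. */
static status_t worker_status(void *arg)
{
    const int *input = arg;

    if (input == NULL || *input < 0)
        return STATUS_EINVAL;
    return STATUS_OK;
}
```

      With the first shape a caller has to write
      unpack_status(worker_void(&x)) to get at the status; with the
      second, worker_status(&x) already is the status.  Neither function
      creates a thread here; they are called directly to keep the sketch
      free of any threading library.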