APACHE AMSTERDAM ROADMAP:                                    -*-text-*-

ApacheCon HACKATHON:

  * Let's get one or two tables together at 10am Tuesday, May 1

Background reading on server architecture/performance:

  * Dan Kegel:
  * Tim Brecht:
  * Matt Welsh:
  * Jeff Darcy:
  * Others:

ROADMAP DISCUSSION POINTS:

Configuration

  * Pain points with 2.x:
    - config does not enable runtime changes [PaulQ]
    - config is not easy to programmatically extend [PaulQ]
    - config syntax is a dozen ad-hoc rules instead of one language [RoyF]
    - config lacking macros and parameters [JorgeS]
    - config file size is unwieldy, could benefit from an outline
      presentation (similar to ad-hoc syntax pain) [RoyF]
  * Build a cleaner configuration system based on an Object Model,
    rather than a file/specific directives.
  * Investigate using small scripting languages for complicated user
    configurable things, including Caching Rules, Rewrite Rules, and
    AAA/Require Rules.  Look at Varnish's VCL or lighttpd's mod_magnet
    for interesting examples:
  * Stick with NCSA syntax, or change to a
    - properties style of syntax?
    - XML syntax?
    - something else?
  * How about a simple configuration template mechanism, like:

      T.generic: {
        host.name:  $host
        host.admin: webmaster@$host
        host.root:  /home/customers/$host/www/
        ...
      }

      myhost = T.generic[$host = "myhost.com"]

Parent Startup

  * Separate SSL-enabled install (httpsd) from non-SSL [RoyF]

Parent-Worker Coordination (MPM)

  * Pain points:
    - large ISPs need per-customer (per-vhost) isolation [RoyF]
  * Refactor MPMs to split platform specific needs from the process and
    thread models.  A reasonable goal is to have one thread model that
    runs on both unix and winnt.
    - Decouple low-level I/O (disk and network and pipes) from
      concurrency models (multi-process, multi-threaded, event-driven
      async) and also from our protocol handlers. [AaronB]
  * Provide a generic IPC or Scoreboard mechanism with an easy to use
    API for modules and the core.
  * Break the 1:1 mapping of a worker to a single request.
  * Consider having three types of worker:
    1) triage workers to accept the connection, load check, read the
       request headers, and construct an internal request object + fd
       for request body;
    2) handler workers that each handle a defined prefix of URIs,
       test access and auth, perform the action, and construct a
       response object and brigade;
    3) delivery workers that translate the response to a message on
       the network connection and then return the connection to the
       event loop.

Worker Front-end

  * Include support for Waka, once an RFC/more details are available.
  * Make the http protocol a module, decoupled from the core.

Worker Middle-end

  * Async IO Core, including a generic Event Loop, allowing modules and
    protocols to register new events and when to be notified.
    - Consider using libevent for the event loop abstraction. [AaronB]
  * Use Serf (and its Bucket system) as the starting point for making
    filters, buckets and brigades work with the Async Core.
  * Make everything possible into a hook or use the provider model.
    Example: the way we determine if a connection can be kept alive is
    a monolithic function.  This should be a hook. [BrianA]

Worker Back-end

  * Investigate using Syslets for functions like stat():
  * Investigate providing a higher level module API, which hides the
    complexity of an async core.  Basically, a 'simple world view' for
    those that want to write a simple generator.
  * Promote and include an external-process communication method in the
    core.  This could be used to communicate with PHP, a JVM, Ruby or
    many other things that do not wish to be run inside a
    highly-threaded and async core.  We should optionally include a
    process management framework, to spawn these as needed, to make
    configuration for server administrators easier.
  * VFS-like layer so things like Subversion/mod_dav don't have to fake
    out the core handlers.
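
As a rough illustration of the keep-alive example under Worker
Middle-end, the Python sketch below expresses that decision as a hook
instead of one monolithic function.  Every name in it (Hook, DECLINED,
client_asked_to_close, too_many_requests, can_keep_alive) is
hypothetical and not an httpd API, and so are the request shape and the
limit of 100 requests.  The "first answer wins" semantics are an
assumption too, not a settled design.

```python
# Hypothetical sketch: a keep-alive decision as a hook instead of one
# monolithic function.  None of these names are httpd APIs.

DECLINED = None  # a hook function returns this when it has no opinion


class Hook:
    """Ordered list of callbacks; the first non-DECLINED answer wins."""

    def __init__(self, default):
        self.default = default
        self._funcs = []

    def register(self, func, order=0):
        # Lower order runs earlier; the sort is stable, so registration
        # order breaks ties.
        self._funcs.append((order, func))
        self._funcs.sort(key=lambda pair: pair[0])

    def run_first(self, *args):
        for _, func in self._funcs:
            answer = func(*args)
            if answer is not DECLINED:
                return answer
        return self.default


keepalive_hook = Hook(default=True)


def client_asked_to_close(request):
    # Assumed request shape: a dict with a lower-cased "headers" mapping.
    if request["headers"].get("connection", "").lower() == "close":
        return False
    return DECLINED


def too_many_requests(request):
    # Assumed per-connection limit, for illustration only.
    if request["requests_on_connection"] >= 100:
        return False
    return DECLINED


keepalive_hook.register(client_asked_to_close)
keepalive_hook.register(too_many_requests, order=10)


def can_keep_alive(request):
    return keepalive_hook.run_first(request)
```

With this shape a module could add its own keep-alive rule by calling
register() rather than patching the core function.
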
Platform Portability

  * Replace the generic MPM design with specific designs for each of
    the modern platforms, find a way to stay in kernel-land until a
    valid request is received (with load restrictions tested and ipfw
    applied automatically), transform the request into a waka message,
    and then dispatch that request to a process running under a userid
    matched to a prefix of the URI.
  * Replace APR with a library that only does portability, never
    allocates memory, and doesn't use pools (at least not directly)

General Performance

  * Reduce default memory footprint
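
Returning to the configuration template idea in the Configuration
section, the Python sketch below shows one way such a template could be
instantiated.  It is only an illustration under stated assumptions:
T_GENERIC and instantiate() are hypothetical names, string.Template
stands in for whatever parameter syntax is eventually chosen, and only
the three entries shown in the roadmap are modelled (the elided ones
are left out).

```python
from string import Template

# Hypothetical sketch of the T.generic template from the Configuration
# section.  Only the three entries shown in the roadmap are included.
T_GENERIC = {
    "host.name": "$host",
    "host.admin": "webmaster@$host",
    "host.root": "/home/customers/$host/www/",
}


def instantiate(template, **params):
    """Return a copy of the template with $name parameters substituted."""
    # substitute() raises KeyError if a parameter is missing, so an
    # incomplete instantiation fails loudly instead of silently.
    return {key: Template(value).substitute(params)
            for key, value in template.items()}


# Mirrors: myhost = T.generic[$host = "myhost.com"]
myhost = instantiate(T_GENERIC, host="myhost.com")
```

Here myhost["host.admin"] evaluates to "webmaster@myhost.com", and
calling instantiate(T_GENERIC) without a host raises KeyError.
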