================================================================================ Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ================================================================================ Notes of things to consider for the next major Tomcat release (probably 8.0.x but possibly 7.1.x). 1. Refactor the TLD parsing. TLDs are currently parsed twice. Once by Catalina looking for listeners and once by Jasper. 2. Refactor the XML parsing (org.apache.tomcat.util.xml ?) to remove duplicate XML parsing code in Catalina and Jasper such as the entity resolvers used for validation. 3. TLDs may have a many to many relationship between URIs and TLD files. This can result in the same TLD file being parsed many times. Refactor the TldLocationCache to cache the parsed nodes (will need to check for changes to TLD files). 4. TLD files should be included in the dependencies for JSP and Tag files. 5. Run the unused code detector and remove everything that isn't currently used. Add deprecation markers for the removed code to Tomcat 7.0.x - Complete for javax.* - Complete for o.a.[catalina|coyote|el].* - Remaining code in progress 6. Change the default URIEncoding on the connector to UTF-8. 7. Rip out all the JNDI code in resource handling and replace it with straight URLs (File or WAR). kkolinko: I think this proposal goes too far. There are several separate issues. There are: a) Internal API to define resources - BaseDirContext implementing aliases and resource jars, and there will be overlays in Servlet 3.1 - StandardContext.setResources() allowing an arbitrary DirContext implementation via element. Concerns: - Too many ways to configure it. b) Internal API to lookup resources - DirContext interface Concerns: - Unnecessary objects, e.g. NamingException instead of null. - Too many methods. Name vs. String. list() vs. listBindings(). - Limited API. As a workaround, there are non-standard methods that are implemented on BaseDirContext instead, e.g. getRealPath(), doListBindings(..). - All caching (ProxyDirContext) and aliases handling is performed on the root level only. Once I do a lookup that returns a DirContext, further lookups on it will bypass the caching and aliases. c) WebappClassLoader and its interaction with resources WebappClassLoader uses DirContext API to access resources (classes, jars). Note that it has to construct a classpath for Java compiler called by Jasper. The compiler cannot operate on a DirContext and needs access to actual files and JARs. Concerns: - There are problems with access to classes and JAR files in non-unpacked WARs. It is resolved by unzipping the files into the working directory (in WebappLoader#setRepositories()). Note that DirContext is not notified of this copying. StandardJarScanner does not know of those copies either. - There are problems when the classes directory is served from multiple locations It seems to be worked around by adding the path of the alternative classes directory to virtualClasspath of VirtualWebappLoader (as shown by example in config/context.html#Virtual_webapp), but it is likely that I miss something. - antiJARLocking support in WebappClassLoader creates copies of resources, but does not notify the DirContext. - WebappClassLoader.jarFiles is used to track JAR files and keep them open. These might be useful when looking for resources in those files. These might be useful for StandardJarScanner. d) StandardJarScanner Concerns: - It essentially scans the same JARs as accessed by WebappClassLoader. It might be better to access them via WebappClassLoader rather that through Servlet API methods. The scanner is used by Jasper as well, so there are some concerns to keep the components independent. e) URL returned by ServletContext.getResource() jndi:/hostName/contextPath/resourcePath Goodies: - Uniform URL space. Both files and directories can be represented, hiding details of aliases, resource jars, etc. - It hides implementation details. - Permissions can be expressed as JndiPermission. They do not depend on context version number. - Accessing a resource through such URL leverages the cache implemented in ProxyDirContext class. We do not access the file system directly, nor need to open a JAR file. - It can be created from a String if necessary. Such use relies on DirContextURLStreamHandler.bindThread(..) being called earlier in the same thread. Concerns: - Some components would like to get "real" resource URL from this one. Maybe it can be exposed through some special header name, DirContextURLConnection.getHeaderField(str)? How such real URL can be prepared? DirContext.getNameInNamespace()? BaseDirContext.getRealPath()? ((FileResourceAttributes)DirContext.getAttributes()).getCanonicalPath()? - A resource lookup is performed twice. The first time in ServletContext.getResource() (implemented in ApplicationContext.getResource()) to return null for non-existing paths. The second time in DirContextURLConnection.connect(). It is good that there is a cache in ProxyDirContext that saves time for repeated lookups. - Using URLs involves encoding/decoding. If there were some other API to access resources in a web application, I would prefer some opaque object that allows access to resource properties, but is converted to string/url form only on demand. f) Connection, created from jndi: URL DirContextURLStreamHandler, DirContextURLConnection Goodies: - DirContextURLConnection provides information about resource via methods such as getContentLength(), getHeaderField(str). Concerns: - It exposes DirContext through some APIs, such as DirContextURLConnection.getContent(). Is this feature going to be preserved for compatibility, or to be removed? - DirContextURLConnection.list(): a public method, that is not part of the usual URL API. So URL API is lacking. Maybe some other methods could be added. A possible candidate could be "isCollection()", instead of asking for getContentType() and checking its value against ResourceAttributes.COLLECTION_TYPE. Threads: http://markmail.org/thread/hqbmdn2qs6xcooko 8. Review the connector shutdown code for timing and threading issues particularly any that may result in a client socket being left open after a connector.stop(). 9. Remove the svn keywords from all the files. (Just Java files?) 10. Code to the interfaces in the o.a.catalina package and avoid coding directly to implementations in other packages. This is likely to require a lot of work. Maybe use Structure 101 (or similar) to help. 11. Merge Service and Engine 12. Java 7 updates - Use of <> operator where possible - Complete for javax.* - Complete for o.a.[catalina|coyote|el].* - Not started for remainder - Use of try with resources - Not started - Catching multiple exceptions - Not started 13. Fix all FindBugs warnings - Complete for javax.* - Complete for o.a.[catalina|coyote|el].* - Remaining code in progress 14. Review date formatting with a view to reducing duplication.