/*!
 * @file ./README
 *
 * @brief Introductory annotations.
 *
 * @section Control
 *
 * \$URL: https://svn.apache.org/path/name/README $ \$Id: README 0 09/28/05 dlydick $
 *
 * Copyright 2005 The Apache Software Foundation
 * or its licensors, as applicable.
 *
 * Licensed under the Apache License, Version 2.0 ("the License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
 * either express or implied.
 *
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 *
 * @version \$LastChangedRevision: 0 $
 *
 * @date \$LastChangedDate: 09/28/2005 $
 *
 * @author \$LastChangedBy: dlydick $
 *         Original code contributed by Daniel Lydick on 09/28/2005.
 *
 *
 * @section Reference
 *
 * @see LICENSE
 *
 * @see INSTALL
 *
 * @note: (In the following narratives, the normal documentation tags
 *        are @e not used so that this file may be used in a
 *        stand-alone fashion without the assistance of any document
 *        reader and without knowledge of such tags.)
 *
 * @note: See also the documentation page named "Main Page" for an
 *        overview from the JVM spec section number perspective.
 *
 * @todo  Need to verify which web document for the
 *        Java 5 class file definition is either "official",
 *        actually correct, or is the <em>de facto</em> standard.
 *
 * @verbatim


Apache Harmony Bootstrap Java Virtual Machine
=============================================

This implementation of the Java Virtual Machine has been written
as a (notice "a", not "the") reference implementation that comprises
almost all of the facilities of a JVM without any further changes.

Please see the file named 'INSTALL' in this same directory for
instructions on installing and building the program.


The goals of this effort are:

  (1) To provide a working Java Virtual Machine interpreter to the
      Apache Harmony project as a cornerstone for a Java runtime
      environment, especially considering the possibility of using
      it as a bootstrap JVM in the final project code base, where
      and if applicable.

  (2) To provide a working Java Virtual Machine to the Apache Harmony
      project as a starting point for architectural discussions
      to help the project get started in earnest, pulling itself
      up "by its bootstraps," as it were.  Thus a second reason to
      call this project the "bootstrap JVM."  :-|  :-O  :-)

  (3) To provide a highly modular implementation so pieces may be added,
      modified, and removed with minimal impact to other pieces.
      For example, the heap allocation component may be easily
      switched between three modes, 'simple', 'bimodal', and 'other'.
      This is accomplished by running 'config.sh' and changing it
      there.  In like manner, the garbage collection component or any
      future component so constructed may be easily replaced without
      any changes to other code.  Other components may not be _quite_
      as hermetically sealed, such as the class file loader or threading
      mechanism, but the structure is present to improve upon the
      implementations, as desired.

  (4) To provide a simple code implementation written in straight,
      vanilla ANSI 'C' using the ubiquitous GCC compiler so that the
      code may be easily and efficiently adapted and modified to perform
      all tasks without esoteric tools, and utilizing the experience
      and creativity of _all_ contributors.  The only suggestion is
      that each contributor working on source code learn how to use
      Doxygen, a simple, powerful, and highly configurable C/C++
      documentation utility.  There is an abundance of example commentary
      in the source showing how to use it for a wide variety of project
      documentation purposes.  Furthermore, its installation also
      includes extensive built-in documentation about itself.

  (5) To provide a very clear and concise code base for teaching JVM
      concepts to potential contributors.

  (6) To organize the code as a simple static library that can be linked
      into any larger body of code, and also to be able to connect JNI
      shared object files/dynamic load libraries to it easily.

  (7) To provide a sample main() program entry point for how to link
      and use the static library.

  (8) To provide a start on the native side of the JNI implementation
      of java.lang.* and java.io.* (etc.) that may be more fully
      fleshed out by project contributors as a way to start becoming
      familiar with the core code.  Also to provide a similar start on
      the JNI side for the same reasons, including a sample main()
      program there.

This implementation is NOT intended to be:

  (1) The final Apache Harmony JVM.

  (2) The authority or standard against which JVMs are measured,
      whether by the Apache Harmony project or otherwise.

  (3) The most efficient possible implementation.
      Issues of modularity, concise implementation, and ease of
      understanding how the code works take precedence over runtime
      efficiency issues, including CPU time, memory resources, and
      the like.

This contribution consists of about 52,000 lines of 'C', Java, shell
scripts, and data files.  It has been written, unit tested, and
_lightly_ integration tested.  It is being contributed partly with
testing in mind to familiarize contributors with the code as they
work with it and continue with integration testing, especially while
extending its feature set into a full-fledged, bullet-proof
work of art.  This file contains a list of items that should be
addressed, both features and bug fixes.  Furthermore, the source
code and therefore the pre-formatted documentation contain over
200 focused, specific enhancements, questions, and problems in
the @todo list that should be addressed during this process.

It is my hope that this contribution will be the seed that helps the
Apache Harmony project to sprout, mature, and bloom into a first-class
Java Virtual Machine that is worthy of the reputation of the
Apache brand.


Yours Truly



Daniel Lydick
September 28, 2005



---


Configuration
-------------
Run the 'config.sh' script in this directory to configure for
your environment.  It creates a './config' sub-directory with
a './config/config.h' header file containing top-level compile
parameters.  This file is always referenced in every source file
of the core JVM code by including "arch.h".  This script also
creates other files there that are useful with the 'build.sh'
scripts and that contain the normal compiler command line
parameters.  Eclipse project files are also available which
contain these same compile command line parameters.

Whatever Java JDK you are currently using is probably fine for
now when running this JVM.  Reading of JAR files is _not_ done via
the JAR classes (yet), but with your '$JAVA_HOME/bin/jar' utility.
The test classes may be compiled with your JDK's '$JAVA_HOME/bin/javac'
compiler.  All this is in an attempt to leverage the functionality
of your existing JDK to "bootstrap" this bootstrap JVM into existence.
Eventually, of course, all this will get replaced with Harmony
versions of all of these components.  A seed for these components
is found in the Java classes in 'jni/src/harmony/generic/0.0'.
Currently, these classes contain _only_ definitions for what is
termed "local native methods".  These are JNI method calls to
code that has intimate knowledge of the details of the JVM
implementation, such as 'java.lang.Object.wait()'.  (See
'jvm/src/native.c' for more information.)

Once 'config.sh' completes successfully, run 'build.sh' in any
or all directories where it is found to build that directory or
directory tree.  At the top level, 'build.sh all' will build the
entire project.  Call it as 'build.sh help' for options.

Eclipse project files are provided to do the same things with the
same options except that they do not compile the Java classes in
'jni/src/harmony/generic/0.0'.  (Notice that if you change
anything, it will need to be changed for both 'build.sh' and for
Eclipse if you want them both to work the same.)  The original
development of this code was on a Sun Ultra 5 with Solaris 9
running GCC 3.3.2, Gmake 3.80, GNU binutils 2.11.2, and GDB 6.0,
coordinated under Eclipse 3.0.2 with the C/C++ plugin CDT 2.1.1.
The JDK was Sun's 1.4.2_06-b03.  The source code was documented
with Doxygen 1.4.4, Solaris version.


Code organization
-----------------
(The following description is also found in 'jvm/src/jni'c for
display on the "Main Page" documentation.)
Several directories are provided within the source tree:

  jvm        Source code for JVM, including a main() wrapper.
             Builds binary file 'jvm/bin/bootjvm'.

  libjvm     For building 'jvm' as a statically linked
             library archive, less main() wrapper.  Builds
             library archive 'libjvm/lib/libjvm.a'.  Source
             code comes from the 'jvm' directory.

  main       A simple main() wrapper that links
             'libjvm/lib/libjvm.a' and builds binary
             file 'main/bin/bootjvm'.  Source code comes
             from the 'jvm' directory.

  jni        Source code for a sample JNI shared library
             'jni/harmony/generic/0.0/lib/bootjni.so'
             for linking with JNI code.  It still needs the
             build directives to be functional, as it
             currently links statically with a main() into
             a binary just like 'jvm'.  This directory
             contains a tree for JNI implementations from any
             supplier that wants to support the Harmony project.
             Currently, there is one JNI implementation here,
             found in 'jni/src/harmony/generic/0.0'.

  test       Builds numerous Java test classes in 'test/bin'
             for driving development work.

With the exception of the Java test classes in 'test/src',
all source code is found in 'jvm/src' and in the directory tree
'jni/src/vendor/product/version'.  The purpose of 'libjvm'
and 'main' is to demonstrate various possible organizations
for the source code, namely for building a static library archive
and for linking it.

The source code is about 3 MB in size.  The final size of all
parts of the compiled code tree is about 12 MB.  The full
documentation tree in 'doc.ORIG' is about another 55 MB when fully
installed.
It may be removed if desired in favor of maintaining
_only_ the working documentation in 'doc' as generated by 'build.sh dox'
at the top level, which will be the same size as 'doc.ORIG' if all
documentation formats are desired, or less if fewer documentation
formats are used.  For example, if only the HTML format is used,
the 'doc' directory will be about 19 MB in size.

Comments in the source will _always_ mention 'C' language "functions",
but Java language "methods", and _never_ the reverse.  This is part
of an attempt to separate the two compile and runtime domains.  See
also comments on this subject below under 'jrtypes.h' and the source
itself for additional comments about type definitions in the Java
and real machine domains.

Subsystem component abstraction
-------------------------------
The implementation of key Java concepts is performed in the following
source files with corresponding navigation macros:

  Component         Source and header   Navigation
  ---------         -----------------   ----------
  Java class        class.[ch]          CLASS() macro
                    classutil.c

  Java object       object.[ch]         OBJECT() macro
                    objectutil.c

  Java thread       thread.[ch]         THREAD() macro
                    threadstate.c
                    threadutil.c

  Java method       method.[ch]         METHOD() macro

  Java class        field.[ch]          FIELD() macro
    static field,
    object instance


  JVM registers     jvmreg.h            STACK(), et al

  Java native       native.[ch]         --
                    jlObject.[ch]       --
                    jlClass.[ch]        --
                    jlString.[ch]       --
                    jlThread.[ch]       --

  JVM spec
    class file      classfile.[ch]      Many, see cfmacros.h

  Heap modules      heap.h              HEAP_xxx() macros
                    heap_simple.c
                    heap_bimodal.c

  Garbage           gc.h                GC_xxx() macros
    collection      gc_stub.c
    modules



By simply changing out the respective source file and adjusting
the main navigation macro for that component, the implementation
can be changed drastically without affecting the other components.
(Larger implementations will have more than one source file, see
especially 'threadstate.c'.)  It is _highly_ unlikely that the
'classfile.[ch]' components will _ever_ change since this is under
the direct control of the Java specification, but the others are
under the control of this project and may be modified to suit its
needs as desired.


Support Scripts
===============
Following is a short description of each script file and other
support files:

config.sh   <--- * * * START HERE AFTER READING 'INSTALL' FILE * * *
---------
Introduce users to the project and how to set it up and configure it
for various CPU platforms and for various functional features.
This interactive shell script provides introductory material and
a description of which versions of which software tools are needed
to compile and document the project and how to administer the
pre-formatted documentation.

It then starts evaluating the existing Java JDK (as declared by the
JAVA_HOME environment variable) for existence of the proper tools and
verifies the name of the class library archive that will be used for
temporary access to JVM startup classes such as the root object
java.lang.Object.class.

Once this introductory evaluation is complete, it will ask questions
about how to configure the project for compile, runtime, and
distribution features, as well as which components to build and
to document.  Once the questions are answered, the project is
configured and optionally built using 'build.sh cfg'.

build.sh
clean.sh
common.sh
---------
Top-level build scripts that invoke build scripts of the same names
at the various levels in the directory tree.  The one named 'build.sh'
compiles the source code, while the one named 'clean.sh' removes the
effects of that build.  The shared file 'common.sh' is used by both
of these scripts.  Notice that nowhere in the tree except here at the
top level will the documentation build occur, as it is a global process
due to interdependencies of @link and @see directives, among others.

getsvndata.sh
getsvndups.sh
-------------
Show a list of all revisions of all source files compiled into an
object file, a library archive, or a linked binary with 'getsvndata.sh'.
Show a list of conflicting revisions using 'getsvndups.sh'.  Object
files may not have conflicts, neither may library archives.  Linked
binaries may or may not, depending on the particulars.

echotest.sh
-----------
Generic script support for 'echo -n' feature for shells that do not
support it natively.


jvm/build.sh
libjvm/build.sh
main/build.sh
test/build.sh
jni/src/harmony/generic/0.0/build.sh
jvm/clean.sh
libjvm/clean.sh
main/clean.sh
test/clean.sh
jni/src/harmony/generic/0.0/clean.sh
jvm/common.sh
libjvm/common.sh
main/common.sh
test/common.sh
jni/src/harmony/generic/0.0/common.sh
-------------------------------------
As at the top level, each relevant directory level has a build
script that compiles the source code ('build.sh') and one that
removes the output files from that build ('clean.sh').  These files
share a common file ('common.sh') also.
The output of 'libjvm' is stored
in a 'libjvm/lib' subdirectory, while the output of the other scripts
is stored in a '______/bin' subdirectory.

dox.sh
undox.sh
commondox.sh
------------
The logic behind the documentation build using Doxygen.  The output
is stored by 'dox.sh' into a 'doc' subdirectory at this level, while
'undox.sh' removes it.  They share a common file 'commondox.sh'.
In order to speed up the documentation build during development,
define the environment variable SUPPRESS_DOXYGEN_VERYCLEAN as any
non-null string.  See logic of 'dox.sh' for other comments.

dist-src.sh
dist-doc.sh
dist-bin.sh
-----------
Construct a distribution and store it above the top of
the directory tree as a gzipped tar file, where CONFIG_RELEASE_LEVEL
is the release level defined the last time that 'config.sh' was run:

  Type      Script        Output TAR file
  ----      ------        ---------------
  Source    dist-src.sh   ../../bootJVM-src-$CONFIG_RELEASE_LEVEL.tar.gz
   (plus
    docs)

  Docs      dist-doc.sh   ../../bootJVM-doc-$CONFIG_RELEASE_LEVEL.tar.gz

  Binary    dist-bin.sh   ../../bootJVM-bin-$CONFIG_RELEASE_LEVEL.tar.gz
   (plus
    docs)

A side effect of the source distribution (only) is the creation of a
file in this directory named 'bootJVM-docs.tar.gz'.  It is included
in the distribution as a part of the deliverables and contains
installable documentation files.  Its installation is managed with
'config.sh' where it references the "pre-formatted documentation."

Other Files
===========

INSTALL   <--- * * * START HERE * * *
-------
Instructions for installing the source code and building the
binaries and the documentation.

LICENSE
-------
The Apache Software Foundation license text used by the ASF
for all software distributions.

README
------
This file.

bootjvm.dox
dox_filter.sh
-------------
'bootjvm.dox' is the Doxygen directive file used to create
all project documentation, invoked by 'dox.sh'.

'dox_filter.sh' is the filter script declared in 'bootjvm.dox'
for filtering input files.  It is declared as the
'INPUT_FILTER=' parameter in 'bootjvm.dox' and is necessary
to properly format all files that are not explicitly '.c'
or '.h' or '.java' source files.

svnstat.sh
----------
Sample script from Doxygen documentation to display the status of
a file in SVN.  Not used in the project for any purpose.

test/.project
test/.classpath
---------------
Eclipse project files for the 'test' Java project.

jvm/.project
jvm/.cdtproject
jvm/.cdtbuild
libjvm/.project
libjvm/.cdtproject
libjvm/.cdtbuild
main/.project
main/.cdtproject
main/.cdtbuild
jni/.project
jni/.cdtproject
jni/.cdtbuild
-------------------
Eclipse project files for the several C/C++ projects.



Output areas
============

bootclasspath/
--------------
Directory containing default value of BOOTCLASSPATH environment
variable.  THIS ABSOLUTE PATH NAME IS COMPILED INTO SOURCE CODE
AND WILL NEED TO ULTIMATELY BE PHASED OUT.

config/
-------
Directory where all results from 'config.sh' are stored.  These
files are used by the various shell scripts and by Doxygen to
build various aspects of the project.

jvm/bin
main/bin
test/bin
jni/src/harmony/generic/0.0/bin
-------------------------------
Output area respectively from 'jvm/build.sh', 'main/build.sh',
'test/build.sh', and 'jni/src/harmony/generic/0.0/build.sh'.

libjvm/lib
----------
Output area from 'libjvm/build.sh'.

jni/bin
-------
Output area for the Eclipse 'jni' project.  Source distributions
(via 'dist-src.sh') cannot be performed until this directory has been
manually removed or an Eclipse 'clean' operation has been done for
the 'jni' project.


Output files
============

bootclasspath/*.class
bootclasspath/*/*.class
-----------------------
Java class files that are found in the BOOTCLASSPATH environment
variable.  They are extracted from your JDK's runtime JAR file
and are used to start up the JVM.  They will eventually get phased
out of the project when these classes have been developed by the
project team.  Whether or not replacements from the project are
stored here is TBD.

config/config.h
---------------
Although the other files stored in the 'config' directory are not
listed here, they are derived along with this file from 'config.sh'
to control the compile and run time features of the project.  This
file specifically can be used as a reference when running 'config.sh'
again to remember what settings were configured.  It will be very
obvious from that script as to which definitions here match the
questions.

jvm/bin/bootjvm
main/bin/bootjvm
test/bin/*.class
test/bin/*/*.class
jni/src/harmony/generic/0.0/bin/bootjvm
jni/src/harmony/generic/0.0/bin/*/*.class
-----------------------------------------
The output files respectively from the 'jvm', 'main', 'test',
and 'jni/src/harmony/generic/0.0' project builds.

libjvm/lib/libjvm.a
-------------------
Output file from the 'libjvm' project build.


Source code
===========
Following is a short description of each source file.  All function
names start with the name of their source file, 'filename_function()',
in keeping with the OO concept of packaging all related code and data
into the same source file.  See also 'jrtypes.h' for comments on naming
conventions for certain data types.

The JNI source code is grouped separately, as is the test suite.


jvm/src/arch.h
--------------
Configure the compilation of each source file with architectural
parameters, especially from the configuration script 'config.sh'.
Provide copyright information for the binary edition of each source
file.  This file MUST be included by all source files in
'jvm/include' and in 'jvm/src'.  Do NOT include it in 'jni'
source files.  There needs to be an equivalent for the Java code
written for these features (see to-do item in source).

jvm/src/argv.c
--------------
Parse the JVM command line.  For easy help, call program
with '-help' option.

jvm/src/attribute.c
jvm/src/attribute.h
-------------------
Handle class file attributes.  The type definition
'jvm_attribute_index' is the key to properly using class file
attributes.

jvm/src/bytegames.c
-------------------
Manipulate bytes for various purposes such as byte swapping, reading
2-byte structures from 1-byte addresses, or 4-byte or 8-byte structures
from 1- or 2-byte addresses, etc.  (These last are due to use of
structure packing, especially in parsing class file data and parsing
virtual opcode operands.)

jvm/src/cfattrib.c
------------------
Process the attribute fields of class file data.  Since attributes
are a large and specialized area of the JVM spec class file
definition, they have been broken out of 'classfile.c'.

jvm/src/cfmacros.h
------------------
Macros for navigating the class file structures of 'classfile.h'.

jvm/src/cfmsgs.c
----------------
Diagnostic messages for class file data.

jvm/src/class.c
jvm/src/class.h
---------------
Handle Java classes in the real machine implementation.
The type definition 'jvm_class_index' is the key to properly
using Java classes.  Notice that this index is implemented
in such a way that the underlying implementation could be
completely changed from a simple array of structures to something
unrelated and there would be only a few changes to some macros
necessary to support that new implementation.  Such an implementation
might call this type definition a 'jvm_class_hash' instead, for
example.  The design decision, however, was to try to maintain
separate structures for classes and for objects.  Each class does
have an object that contains its object-ish components.  This
object is also implemented to satisfy the spec requirement that
a class also have a class object.  See 'linkage.h' as to how to
navigate between classes, objects, and threads.
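
The index-plus-macro idea described for 'class.h' above can be
sketched roughly as follows.  This is only an illustration under
assumed names: 'demo_class_index', 'demo_class_table', DEMO_CLASS(),
and the structure members are hypothetical and are NOT taken from
'class.h'.  The point is that callers reach class data only through
the macro, so replacing the array with some other container would
mean changing the macro and the allocator, not the callers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the definitions from 'class.h'. */
typedef unsigned short demo_class_index;

#define DEMO_CLASS_NULL ((demo_class_index) 0) /* slot 0 never handed out */
#define DEMO_CLASS_MAX  8

typedef struct
{
    int  in_use;     /* non-zero once the slot has been allocated */
    char name[32];   /* class name, NUL-terminated */
} demo_class_entry;

/* Today the container is a plain array of structures... */
static demo_class_entry demo_class_table[DEMO_CLASS_MAX];

/* ...but callers only ever use this navigation macro. */
#define DEMO_CLASS(clsidx) (demo_class_table[(clsidx)])

/* Allocate a slot; return DEMO_CLASS_NULL when the table is full. */
static demo_class_index demo_class_alloc(const char *name)
{
    demo_class_index clsidx;

    assert(name != NULL);

    for (clsidx = 1; clsidx < DEMO_CLASS_MAX; clsidx++)
    {
        if (!DEMO_CLASS(clsidx).in_use)
        {
            DEMO_CLASS(clsidx).in_use = 1;
            strncpy(DEMO_CLASS(clsidx).name,
                    name,
                    sizeof(DEMO_CLASS(clsidx).name) - 1);
            DEMO_CLASS(clsidx).name[sizeof(DEMO_CLASS(clsidx).name) - 1]
                = '\0';
            return clsidx;
        }
    }

    return DEMO_CLASS_NULL;
}
```

A caller would hold on to the returned index and write, for example,
DEMO_CLASS(clsidx).name wherever it needs the class name.  Reserving
slot zero as the "null" value is a choice made for this sketch only.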

jvm/src/classfile.c
jvm/src/classfile.h
-------------------
Definition of JVM spec, version 2, section 4, for JDK 1.5, namely,
the class file structure, and its implementation.  The version of
the class file definition (section 4) is listed near the top of
the header file.  It is one of several that have been floating
around, but appears to meet the JDK 1.5 attribute extensions
rather exactly, and is _assumed_ (that is a VERY big assumption!)
to be class file 'major.minor' version '49.0', namely the JDK 1.5
class file format.


All symbol definitions are declared _exactly_ as shown in this
specification with the single exception of array declarations like,

    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count - 1];

Such definitions are instead declared as,

    u2 constant_pool_count;
    cp_info **constant_pool;

and an array of pointers to (cp_info) is allocated on the heap of
size 'constant_pool_count'.  (In this one case, element zero is
defined as not being used, so it is always a NULL pointer.)

For purposes of real machine word alignment, type 'cp_info' and
'attribute_info' have been embedded in a _slightly_ larger structure
to keep 2- and 4-byte member references on the correct real machine
address boundaries.  The embedding structures are called 'cp_info_dup'
and 'attribute_info_dup', respectively, as they duplicate the contents
of the smaller structure, if you will.

All symbol definitions that are not _explicitly_ found in the spec
are locally defined for use in this implementation.  They are prefixed
with the string "LOCAL_" to distinguish them from spec definitions.
For example, ACC_PUBLIC marks a class as public, but the local
definition ACC_EMPTY means no definition at all.
Since it is not
found in the spec, it is actually named LOCAL_ACC_EMPTY here.


jvm/src/classpath.c
jvm/src/classpath.h
-------------------
Manage and navigate CLASSPATH environment variable.

jvm/src/classutil.c
-------------------
Utilities for handling real machine class structures.

jvm/src/exit.c
jvm/src/exit.h
--------------
Exit codes for exit(3) and fatal error handling for JVM runtime.
This code implements non-local subroutine returns using the
standard library calls setjmp(3) and longjmp(3).  If you want
to understand this non-local return mechanism, you _must_ read
the man pages for these functions and study the examples.  They
are extremely useful in processing fatal error conditions and
state machine conditions under which there is no valid return
or repair of an invalid state or irreconcilable condition.

jvm/src/field.c
jvm/src/field.h
---------------
Handle Java virtual machine variables for both class static fields
and object instance fields.  The type definitions 'jvm_field_index'
and 'jvm_field_lookup_index' are the keys to properly using Java fields.

jvm/src/gc.h
------------
API for all garbage collection algorithms.

jvm/src/gc_stub.c
-----------------
Default implementation of garbage collection (a do-nothing
implementation).  This code is meant to be replaced by one
or more GC algorithms.  The GC insertion points in the code
body corporate should stand for all algorithms.  See 'heap_*.c'
and header file 'heap.h' for an example of implementing
multiple algorithms on a single API.

jvm/src/heap.h
--------------
API for all heap management algorithms.
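
The notion of "multiple algorithms on a single API" mentioned under
'gc_stub.c' above might look something like the following sketch.
Every name in it is hypothetical (the DEMO_HEAP_xxx() macros, the
'demo_heap_*' functions, and the DEMO_HEAP_TYPE_COUNTED compile
switch are NOT the contents of 'heap.h' or 'config.h'); it only
illustrates how callers written against a pair of macros are
unaffected when the implementation behind the macros is exchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical names throughout.  These are NOT the HEAP_xxx()
 * macros of 'heap.h', only an illustration of the pattern.
 */

/* Implementation A: thin wrapper around malloc(3) and free(3). */
void *demo_heap_simple_get(size_t size, int clrmem_flag)
{
    void *p = malloc(size);

    if ((p != NULL) && clrmem_flag)
    {
        memset(p, 0, size);   /* hand back a zeroed area on request */
    }
    return p;
}

void demo_heap_simple_free(void *p)
{
    free(p);
}

/* Implementation B: same contract, but it also counts live blocks. */
long demo_heap_counted_live = 0;

void *demo_heap_counted_get(size_t size, int clrmem_flag)
{
    void *p = clrmem_flag ? calloc(1, size) : malloc(size);

    if (p != NULL)
    {
        demo_heap_counted_live++;
    }
    return p;
}

void demo_heap_counted_free(void *p)
{
    if (p != NULL)
    {
        demo_heap_counted_live--;
        free(p);
    }
}

/*
 * The only names that callers use.  Defining the (hypothetical)
 * DEMO_HEAP_TYPE_COUNTED switch at compile time swaps the whole
 * implementation without touching any caller.
 */
#ifdef DEMO_HEAP_TYPE_COUNTED
#define DEMO_HEAP_GET(size, clr) demo_heap_counted_get((size), (clr))
#define DEMO_HEAP_FREE(p)        demo_heap_counted_free(p)
#else
#define DEMO_HEAP_GET(size, clr) demo_heap_simple_get((size), (clr))
#define DEMO_HEAP_FREE(p)        demo_heap_simple_free(p)
#endif

/* A caller written only against the API macros. */
int demo_heap_roundtrip(size_t nbytes)
{
    unsigned char *buf;
    size_t i;
    int all_zero = 1;

    assert(nbytes > 0);

    buf = (unsigned char *) DEMO_HEAP_GET(nbytes, 1);
    if (buf == NULL)
    {
        return -1;            /* allocation failed */
    }

    for (i = 0; i < nbytes; i++)
    {
        if (buf[i] != 0)
        {
            all_zero = 0;
        }
    }

    DEMO_HEAP_FREE(buf);
    return all_zero;          /* 1 when the area came back zeroed */
}
```

In this sketch the selection is made with a preprocessor symbol, which
is one plausible way for a configuration script to choose a module; a
table of function pointers would serve equally well if the choice had
to be made at run time.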

jvm/src/heap_simple.c
---------------------
Default implementation of heap allocation using malloc(3)
and free(3).  One option is presented to return an allocated
area that has been initialized to zeroes, which is useful for
structure initialization.

jvm/src/heap_bimodal.c
----------------------
An improvement over 'heap_simple.c' where a large memory block
is allocated from whence come all allocations up to a certain
size.  Beyond that, the 'heap_simple.c' algorithm is used.

jvm/src/jrtypes.c
jvm/src/jrtypes.h
-----------------
Map Java data types ('j') to real machine ('r') types.  Much of the
code uses these two letters prefixed to variable names to distinguish
between Java and real machine data types.  The other common prefix
used throughout the code is 'p' for pointer, such as '(char *) pstr'
or '(jint *) parray'.  However, this convention is for instances of
pointers to _any_ data type, and may be used for either the 'C' or
the Java domains.  Compiled constant definitions are found
in 'jrtypes.c' also.

jvm/src/jvalue.h
----------------
Java data type aggregate for storage of class static field
and object instance fields.  One type fits all.

jvm/src/jvm.c
jvm/src/jvm.h
-------------
The main JVM control structure for running the Java virtual
machine on this real machine.  There is virtually no global
storage in this code, and only sparing use of static (file scope)
storage.  Most everything else is either found here, is a
local variable, or is on the heap.  The code in 'jvm.c' initializes,
runs, and shuts down the JVM.
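
As an illustration of the "one type fits all" remark under
'jvm/src/jvalue.h' above, a single aggregate for every field value
could be sketched as a union.  The names 'demo_jvalue',
'demo_field_slot', and the helper functions are hypothetical and are
not taken from 'jvalue.h'; the member letters simply follow the
field descriptor characters of the class file format ('Z', 'B', 'C',
'S', 'I', 'J', 'F', 'D').  The sketch uses <stdint.h> from C99 for
fixed-width types, which is an assumption beyond plain ANSI 'C'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical aggregate, NOT the definition from 'jvalue.h'. */
typedef union
{
    uint8_t  z;     /* boolean */
    int8_t   b;     /* byte    */
    uint16_t c;     /* char    */
    int16_t  s;     /* short   */
    int32_t  i;     /* int     */
    int64_t  j;     /* long    */
    float    f;     /* float   */
    double   d;     /* double  */
    int32_t  ref;   /* hypothetical: object reference kept as an index */
} demo_jvalue;

/* A field slot pairs the value with its descriptor character. */
typedef struct
{
    char        basetype;   /* 'I', 'J', 'D', and so on */
    demo_jvalue value;
} demo_field_slot;

/* Build a slot holding a Java int. */
demo_field_slot demo_slot_int(int32_t v)
{
    demo_field_slot slot;

    slot.basetype = 'I';
    slot.value.i  = v;
    return slot;
}

/* Build a slot holding a Java long. */
demo_field_slot demo_slot_long(int64_t v)
{
    demo_field_slot slot;

    slot.basetype = 'J';
    slot.value.j  = v;
    return slot;
}

/* Read an int or long slot widened to 64 bits; other types give 0. */
int64_t demo_slot_as_long(const demo_field_slot *slot)
{
    assert(slot != NULL);

    switch (slot->basetype)
    {
        case 'I': return (int64_t) slot->value.i;
        case 'J': return slot->value.j;
        default:  return 0;
    }
}
```

Because every slot has the same size, a table of static fields or
instance fields can be a simple array of such slots regardless of the
declared type of each field.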

jvm/src/jvmcfg.c
jvm/src/jvmcfg.h
----------------
Top-level configuration of JVM parameters (not including what is
mostly compile-time setup with 'config.sh' and its 'config/config.h'
header file).  All sorts of things are defined here, from OS-dependent
directory delimiters to JVM command line parameter strings.  A number
of significant typedefs are also found here, as well as high and
low limits on data types, etc.  Compiled constant definitions are
found in 'jvmcfg.c' also for NULL and BAD definitions for major
typedefs.

jvm/src/jvmclass.h
------------------
Fully qualified class name strings (internal form) for a wide
variety of classes needed by the JVM at run time.  Of special
significance is a large number of error and exception classes.

jvm/src/jvmregs.h
-----------------
Definition of JVM program counter, stack, and stack navigation.

jvm/src/jvmutil.c
-----------------
Utilities for debug message levels, stack dumps, etc.

jvm/src/linkage.c
jvm/src/linkage.h
-----------------
The header contains linkages between class, object, and thread
structures.  The source file contains late binding linkage
functions to link field references and method references from
one class into their definitions in another class.

jvm/src/main.c
main/src/main.c  (symbolic link to jvm/src/main.c)
---------------
A sample main() entry point that calls the JVM from either a
library archive or by direct object linkage.

jvm/src/manifest.c
------------------
Read and process selected properties from a JAR manifest file.

jvm/src/method.c
jvm/src/method.h
----------------
Handle Java virtual machine methods.  The type definition
'jvm_method_index' is the key to properly using Java methods.

jvm/src/native.c
jvm/src/native.h
----------------
Support for JNI native methods and what are called local native
methods (those with intimate knowledge of the inner workings of
the JVM). Local native methods may, but are not required to, use
the JNI interface, but a significant shortcut is provided for more
direct connection between them and the JVM.

jvm/src/nts.c
jvm/src/nts.h
jvm/src/unicode.c
jvm/src/unicode.h
jvm/src/utf8.c
jvm/src/utf8.h
-----------------
Utilities for null-terminated strings, UTF8 strings, and Unicode
strings, including conversion back and forth, getting and putting,
length functions, and examination/conversion of Java class formatting
in various types. Notice that there are three types of strings
in this JVM: (1) 'C' style strings of ASCII characters terminated by
an ASCII '\0' (NUL) byte, also known as "pointer to real-machine
character" strings, or 'prchar' variables; (2) UTF8 strings as
implemented with the (CONSTANT_Utf8_info) type definition, particularly
as related to Java class files; (3) Unicode strings, which consist of a
(jchar)[] type array variable and a (u2) length variable.

jvm/src/object.c
jvm/src/object.h
----------------
Handle Java objects in the real machine implementation.
The type definition 'jvm_object_hash' is the key to properly
using Java objects. Notice that this hash is implemented
in such a way that the underlying implementation could be
completely changed from a simple array of structures to something
unrelated and there would be only a few changes to some macros
necessary to support that new implementation. See 'linkage.h'
for how to navigate between classes, objects, and threads.

jvm/src/objectutil.c
--------------------
Utilities for Java objects.
Support for synchronize()
and unsynchronize() is found here, namely, the object monitor
locks. Other utility functions may be placed here as appropriate.

jvm/src/opcode.c
jvm/src/opcode.h
----------------
The JVM spec, version 2, section 9, operation code list is found
in the header file directly as defined in the spec. The code
contains the JVM inner execution loop.

jvm/src/stdio.c
---------------
Standard input/output functions for stdout, stderr, buffer formatting,
etc. CAVEAT EMPTOR (let the buyer beware!): This file does _NOT_ use
structure packing because on the original implementation (Solaris 9
with GCC 3.3.2), the standard I/O library was _not_ compiled with
structure packing and caused strange runtime errors and SIGSEGV on
normal library calls. Therefore, most standard I/O was gathered into
this file, with the exception of 'manifest.c', which does not seem
to have the problem.

jvm/src/thread.c
jvm/src/thread.h
----------------
Handle JVM threads in the real machine implementation.
The type definition 'jvm_thread_index' is the key to
properly using threads. Notice that this index is
implemented in such a way that the underlying
implementation could be completely changed from a
simple array of structures to something unrelated
and there would be only a few changes to some macros
necessary to support that new implementation. See 'linkage.h'
for how to navigate between classes, objects, and threads.

jvm/src/threadstate.c
---------------------
Thread state machine individual states.
A lengthy narrative near the beginning of this file defines
the JVM thread state machine, which is then implemented in 'jvm.c'
by jvm_run().
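
The remark above, that replacing the table behind 'jvm_thread_index'
would need "only a few changes to some macros," can be illustrated
with a minimal sketch. Everything here is hypothetical: the width
of the typedef, the value of 'jvm_thread_index_null', the structure
members, and the SKETCH_THREAD() macro are assumptions, not the
definitions found in 'thread.h'.

```c
#include <stddef.h>

typedef unsigned int jvm_thread_index;      // width is assumed
#define jvm_thread_index_null ((jvm_thread_index) 0)

#define SKETCH_MAX_THREADS 8

typedef struct
{
    int         in_use;
    int         priority;
    const char *name;
} sketch_thread;

// Today: a simple array of structures
static sketch_thread sketch_thread_table[SKETCH_MAX_THREADS];

// The only place that knows how an index becomes a structure
#define SKETCH_THREAD(thridx) (sketch_thread_table[(thridx)])

// Claim a free entry; entry 0 is reserved as the NULL thread.
jvm_thread_index sketch_thread_new(const char *name, int priority)
{
    jvm_thread_index thridx;

    for (thridx = 1; thridx < SKETCH_MAX_THREADS; thridx++)
    {
        if (!SKETCH_THREAD(thridx).in_use)
        {
            SKETCH_THREAD(thridx).in_use   = 1;
            SKETCH_THREAD(thridx).priority = priority;
            SKETCH_THREAD(thridx).name     = name;
            return thridx;
        }
    }

    return jvm_thread_index_null;
}

// Give an entry back; the NULL thread is never released.
void sketch_thread_die(jvm_thread_index thridx)
{
    if (jvm_thread_index_null != thridx)
    {
        SKETCH_THREAD(thridx).in_use = 0;
    }
}
```

Because callers only ever hold an index and go through the macro,
the array could later become a hash table or a linked structure
and only the macro (and the allocation scan) would change.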

jvm/src/threadutil.c
--------------------
Utilities for normal operation of the thread state machine.

jvm/src/timeslice.c
-------------------
Real-time JVM time slice timer for limiting the amount of time
that a thread may run (see also 'jvm.c' and 'opcode.c') before
the next thread is given some time.

jvm/src/tmparea.c
-----------------
Private disk storage area for running the JVM. It is set up and
torn down by this code and is referenced through a getter function.

jvm/src/util.h
--------------
Miscellaneous support macros and function prototypes from various
of the source files. This file contains a list of which source
files have their prototypes listed here, and those that could
but do _not_ for specific reasons.


A WORD ABOUT FUNCTION PROTOTYPES
================================
Each and every source 'filename.c' that has a 'filename.h' associated
with it will locate its function prototypes in that header file.
The few exceptions are listed in 'jvm/src/util.h'.


A WORD ABOUT SOURCE FORMATTING
==============================
Each and every source file is formatted so that not one single line
goes beyond 72 characters in width except where absolutely unavoidable.
Furthermore, there is not one single TAB character (ASCII 9, 011,
0x09, \u0009) in the entire body of code. The only place it is found
is in the Eclipse '.project' and '.classpath' files. All indentions
are made at 4-column intervals.

The purpose of these two constraints is so that absolutely anyone with
any text editor may view the source in a standard 80-column text editor,
with line numbers, and be able to see the whole line without any line
wrapping.
This constraint makes the interchange of source code
between project contributors much simpler. Although there is a case
to be made for wider lines, many developers like 80 column constraints
anyway, or some other number. This choice of constraint should provide
maximum flexibility of interchange of source files amongst all
contributors.


JNI SOURCE CODE
===============

jni/src/harmony/generic/0.0/src/java/lang/Object.java
jni/src/harmony/generic/0.0/src/java/lang/Class.java
jni/src/harmony/generic/0.0/src/java/lang/String.java
jni/src/harmony/generic/0.0/src/java/lang/Thread.java
-----------------------------------------------------
Sample segments of the java.lang.* package that contain the
native declarations in these classes. (Not comprehensive
either for each class or for the package.)

jni/src/harmony/generic/0.0/include/java_lang_Object.h
jni/src/harmony/generic/0.0/include/java_lang_Class.h
jni/src/harmony/generic/0.0/include/java_lang_String.h
jni/src/harmony/generic/0.0/include/java_lang_Thread.h
------------------------------------------------------
JNI headers that correspond to the 'src/java/lang' equivalent
Java source files in the java.lang.* package that contain the
native declarations in these classes. (Not comprehensive
either for each class or for the package.)

jni/src/harmony/generic/0.0/src/java_lang_Object.c
jni/src/harmony/generic/0.0/src/java_lang_Class.c
jni/src/harmony/generic/0.0/src/java_lang_String.c
jni/src/harmony/generic/0.0/src/java_lang_Thread.c
--------------------------------------------------
JNI source code that corresponds to the 'src/java/lang' equivalent
Java source files in the java.lang.* package that contain the
implementation of the local native methods in these classes.
(Not comprehensive either for each class or for the package.)

jvm/include/jlObject.h
jvm/include/jlClass.h
jvm/include/jlString.h
jvm/include/jlThread.h
----------------------
Headers for connecting the JVM to the JNI interface of the above
classes. There are effectively two parallel sets of runtime
definitions in these files, one for generic JNI, one for the
core JVM code, namely this body of source. They are
_independent_ so there is NO coupling between an arbitrary
JDK's JNI implementation and this JVM. Please see comments
in these header files for more information.

jvm/src/jlObject.c
jvm/src/jlClass.c
jvm/src/jlString.c
jvm/src/jlThread.c
------------------
The native side of the JNI implementation of the above selected
core java.lang.* classes. The functions in these files are
referenced by the JNI target functions. For example, the
function java_lang_Class_isPrimative() in 'java_lang_Class.c'
will at some point call the 'jlClass_isPrimative()' function
in 'jlClass.c'.

jni/src/harmony/generic/0.0/src/sampleJNImain.c
-----------------------------------------------
Sample function that calls JNI functions through the JNI interface.
This is currently built as a straight binary. It needs to be built
as a .so/.dll shared object file.


TEST JAVA FILES
===============
This area is designed to be filled in with _many_ packages and
classes for testing the JVM. Here is a rudimentary but useful
first pass:


test/src/HelloWorld.java
------------------------
Just what it says. Running this main() method, however,
requires a _fully_ functional JVM with most of the basic JNI in
place.

test/src/harmony/jvm/test/PkgHelloWorld.java
--------------------------------------------
The same "hello world" program, but in a package.

test/src/harmony/jvm/test/MainArgs.java
---------------------------------------
A main() method that takes arguments from the command line.



KNOWN ISSUES AND SUGGESTED TO-DO ITEMS
======================================

(1) The default garbage collection algorithm is 'no collection'.
This means that an algorithm must be written in order to sustain
JVM execution with continuous heap availability. Other heap
implementations might also be written beyond the two that are
supplied.

(2) Anonymous strings (without a class static field reference in
(rclass) or an object instance reference in (robject)) may not get
completely unreferenced and deallocated, but could continue to
accumulate instance references. Depending on the GC algorithm, this
may or may not occur and/or may or may not present a problem over a
very long JVM session where reference structures grow without bound
or counts wrap around to zero.

(3) There has been no attempt made to process any special values
of floating point data, neither NAN, +/- infinity, overflow, or
anything else. The definitions per the class file spec are present,
of course, but nothing has been done with them. Special care should
be taken in fprintf() statements-- like the sysDbgMsg() calls in
'cfmsgs.c' -- to make sure that formats like "%lf" are followed and
precise. The same goes for 64-bit integers, (long long) in real
machine and (jlong) in JVM, that formats like "%lld" are followed
and precise.

(4) All reading of JAR files uses the 'jar' utility provided in
the $JAVA_HOME directory tree in '$JAVA_HOME/bin/jar'.
This will,
of course, need to be enhanced to simply read the JAR files directly.
To start with, the BOOTCLASSPATH could have them added to it as a
part of 'config.sh', then invoked from the JVM.

(5) The JNI interface in 'jni/src/harmony' contains only a cursory
JNI implementation. This will need to be filled out to the complete
suite of native methods needed by java.lang.*. The native side is
found in 'jvm/include' for the 'jni/src/harmony' source
and header files, while the implementation is in the
'jvm/src/jlClassNameXYZ.c' files (all beginning with 'jl' for
"java/lang" followed by the actual class name). For example,
'jlThread.c' contains the native portions of java.lang.Thread.
Certain key system calls like open(2) and exit(2) and socket(2)
are located in packages like 'java.io' and 'java.system' and
'java.rmi' (?), respectively. The local native method interface
will need to be extended to cover these system calls.
See also item (28) below.

(6) The JVM execution engine in particular will need thorough testing.
It has been unit tested, of course, but good, solid integration
testing is an excellent exercise for new contributors to learn the
code, so this item has been specifically reserved for contributors
for this purpose.

(7) There is apparently a typographical error in the spec section
4.5.7 describing 21-bit character encoding. Therefore, this format
has not been implemented pending proper resolution. It probably is
correct to assume that the spurious "-1" in the second byte of the
first triple should say "(bits 20-16)" instead of "(bits 20-16)-1".
However, this is presented as an exercise for contributors.

(8) There are a plethora of comments marked "@todo xxx" all over the
code.
In the Doxygen "Related Pages" section, they are gathered into
one comprehensive list. These items are presented as exercises for
contributors.

(9) The Solaris 32-bit implementation was chosen for development as a
very generic development and runtime environment. Testing partway
through indicated that a transition to a 64-bit runtime environment
could be compiled with little effort. However, there will have to
be some adjustments to get everything to work because that one test
produced a SIGSEGV during JVM initialization, and its cause was not
thoroughly explored. Every effort has been made to anticipate 64-bit
compilations, but that adjustment will need to be fully tested.

(10) Every effort has been made to use _only_ standard Unix and
Posix runtime library calls so the code could be ported to as many
platforms as possible. The first targets that come to mind are,
of course, Linux, HP-UX, and AIX, with CygWin in mind also, with
a few changes for system(3) scripts. (*** The code intentionally
does _NOT_ have the Windows version of some CONFIG_xxxx_SCRIPT
definitions so that the contributor will need to get it written,
possibly put into a .BAT file, then invoked through the existing
infrastructure in the code. ***)

(11a) There are a number of places here and there in the code where a
return value is not properly checked for having an error value, such
as 'jvm_class_index_null' or 'jvm_attribute_index_bad'. This
is typically _not_ a throwable error condition, like from heap
allocation or other error, but is an actual runtime condition
that is returned. After three full passes through the code,
the error code structure has morphed significantly from a fully
structured approach just to get the code started to a partial OO
approach, throwing errors through setjmp(3)/longjmp(3).
Someone
needs to go through the code in its _entirety_ and find out where
invalid results are intentionally returned instead of throwing an
error and _make sure_ that the calling functions check those results
properly.

(11b) The same goes for input parameters in various places
throughout the code for the following typedefs from 'jvmcfg.h':

    jvm_thread_index
    jvm_constant_pool_index
    jvm_interface_index
    jvm_class_index
    jvm_method_index
    jvm_field_index
    jvm_field_lookup_index
    jvm_attribute_index
    jvm_unicode_string_index
    jvm_object_hash

These will need to be reviewed for the same reasons as the error
return codes. Either they will index an array to its NULL element,
specifically (with corresponding NULL element named also),

    jvm_thread_index            (jvm_thread_index_null)
    jvm_constant_pool_index     (jvm_constant_pool_index_null)
    jvm_class_index             (jvm_class_index_null)
    jvm_object_hash             (jvm_object_hash_null)
    jvm_native_method_ordinal   (jvm_native_method_ordinal_null)

which, in the case of a 'jvm_constant_pool_index' will produce
a NULL pointer SIGSEGV, or it will index way off the end at the
max index for the data type, namely (with corresponding BAD
element name also),

    jvm_interface_index         (jvm_interface_index_bad)
    jvm_method_index            (jvm_method_index_bad)
    jvm_field_index             (jvm_field_index_bad)
    jvm_field_lookup_index      (jvm_field_lookup_index_bad)
    jvm_attribute_index         (jvm_attribute_index_bad)
    jvm_unicode_string_index    (jvm_unicode_string_index_bad)

A typical check might be,

    jvm_attribute_index
    attribute_find_in_field_by_cp_entry(jvm_class_index clsidx,...)
    {
        if (jvm_class_index_null == clsidx)
        {
            /* Somebody goofed * /
            exit_init_failure(EXIT_JVM_INTERNAL,
                              JVMCLASS_JAVA_LANG_INTERNALERROR);
            /*NOTREACHED* /
        }
    }

In all cases, a review of the code with formal input parameters
to test what goes in is simple-- there are quite a few places
where this _is_ done, it just is not complete. A similar
review of functions that pass these data types back _out_ will
provide the inverse test. Both are _mandatory_ for integrity
of the runtime environment.

(12) The threading algorithm uses a simple for() loop. It could
be easily generalized to using POSIX threads since there is already
one POSIX thread started in the current code. This thread would go
away when and if such threading were implemented. This thread is
implemented in 'timeslice.c'. It is a time slice real-time clock
timer that sends SIGALRM every so often on a tick boundary. This
signal is connected to a handler that sets a flag in each live
JVM thread and tells the JVM inner loop to quit after the current
operation code so the next thread can be activated. By unwinding
the thread activation for() loop, also known as the JVM outer loop,
and putting in place a POSIX thread for each JVM thread, the timer
would go away and the code would become a true multi-threaded
application.

The purpose of _this_ implementation as a for() loop is for simplicity
of implementation and for education of contributors who are new to
multi-threaded environments in general and the Java virtual machine
multi-threaded environment in particular. (Note that such unwinding
would take place exclusively in 'thread.c'. No other logic
_anywhere_ should be affected in more than a minor way beyond the
JVM outer loop in 'jvm.c'.)
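
The interplay of the time slice signal, the inner loop, and the
outer for() loop described in item (12) can be sketched as follows.
This is a simplified illustration, not the code in 'timeslice.c':
it uses a single flag instead of one per JVM thread, it raises
SIGALRM itself after a fixed number of "opcodes" so that the
behavior is repeatable, and every sketch_xxx name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

#define SKETCH_THREADS 3

// Set by the signal handler, polled by the inner loop
static volatile sig_atomic_t sketch_slice_expired;

static void sketch_tick(int signo)
{
    (void) signo;
    sketch_slice_expired = 1;
}

// Connect SIGALRM to the handler; returns 0 on success.
int sketch_timeslice_init(void)
{
    struct sigaction sa;

    sa.sa_handler = sketch_tick;
    sa.sa_flags   = 0;
    sigemptyset(&sa.sa_mask);

    return sigaction(SIGALRM, &sa, NULL);
}

// Inner loop: run "opcodes" until the time slice expires.  To
// stay repeatable, the tick is raised here after 'tick_after'
// opcodes; a real timer would deliver it asynchronously.
long sketch_inner_loop(long tick_after)
{
    long ops = 0;

    if (tick_after < 1)
    {
        tick_after = 1;
    }

    sketch_slice_expired = 0;

    while (!sketch_slice_expired)
    {
        ops++;                      // stands in for one opcode

        if (ops == tick_after)
        {
            raise(SIGALRM);
        }
    }

    return ops;
}

// Outer loop: give each thread one time slice per round.
long sketch_outer_loop(int rounds, long tick_after)
{
    long total = 0;
    int  round;
    int  thridx;

    for (round = 0; round < rounds; round++)
    {
        for (thridx = 0; thridx < SKETCH_THREADS; thridx++)
        {
            total += sketch_inner_loop(tick_after);
        }
    }

    return total;
}
```

Note that the inner loop always finishes the current "opcode" before
it looks at the flag, which is the behavior item (12) describes.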

(13) Commensurate with the above discussion on unwinding the JVM
outer loop into a multi-threaded application is the discussion of
write barriers. THERE ARE ABSOLUTELY NO WRITE BARRIERS IN THIS
DESIGN!!! This is both for simplicity (per above reasons) and so
that this _critical_ subject is given due consideration by the
whole team of JVM architecture contributors. This implementation
makes no claims toward being an authority on such a vital and
potentially sensitive subject.

(14) The 'field_info' area of the class file definition supports
the 'ConstantValue' attribute, but does not support 'Signature'.
This needs to be added for conformance to JDK 5 requirements.
It also should support 'Synthetic' and 'Deprecated'. The definitions
are present in 'classfile.h', they just need to get implemented.

(15) The 'method_info' area of the class file definition supports
the 'Code' and 'Exceptions' attributes, but does not support
'Signature'. This needs to be added for conformance to JDK 5
requirements. The definitions are present in 'classfile.h', they
just need to get implemented.

(16) When nearing the end of the initial development, I ran across
what is probably a memory configuration limit on my Solaris platform,
which I did not bother to track down, but rather worked around. It
seems that when calling malloc(3C) or malloc(3MALLOC), after 2,280
malloc() allocations and 612 free() invocations, there is something
under the covers that does a SIGSEGV, and it can happen in either
routine. I therefore extended the heap mechanism of 'heap_simple.c'
to allocate a large number of slots of 'n' bytes for small allocations
up to this size. Everything else still uses malloc(). In this way,
I was able to finish development on the JVM without arguing with heap
allocation.
This implementation is found in 'heap_bimodal.c'. (In
other words, I will let the team fix it!) Furthermore, I am not sure
that the real project wants a static 'n + 1' MB data area just hanging
around the runtime just because I did not take time to tune the system
configuration!

(17) In preparation for writing the actual JVM execution engine,
the 'bytegames.c' utilities bytegames_getrl8() and bytegames_putrl8()
and bytegames_swap8() and bytegames_mix8() were added. The 'util.h'
macros GETRL8() and PUTRL8() were likewise added. Other ways of
processing 64-bit values were obsoleted and the places where they were
used have been replaced by these functions and macros. This
substitution needs to be more rigorously tested. All places where
this substitution was made have been marked as commented-out code
plus new functionality with a @todo item commenting it.

(18) There needs to be an examination of heap usage to make _sure_
that all instances of HEAP_GET_xxxx() eventually have a matching
HEAP_FREE_xxxx() so there are no memory leaks. The existing code
has been carefully written to facilitate this requirement, but
it has not been rigorously regression tested. And since it is _well_
known that memory leaks are a system integrator's worst nightmare,
if the code is checked early on into the project, there will be
nothing to lose sleep over later on.

(19) A review of the non-local return logic in 'opcode.c' from threads
that throw a java.lang.Error, java.lang.Exception, and
java.lang.Throwable is in order. Errors should kill the JVM, but
Exceptions should allow it to proceed. What about Throwables?
The framework is in place for the project team to adjust to meet
the JVM spec.
Also, make sure that simply creating a new class
object, running its <init> method, and continuing with previous code
or quitting is the right way to process the condition. The existing
logic is from heuristic observation of a JVM, but an examination of
the spec should provide the definitive answer. Notice that
java.lang.Throwable and java.lang.StackTraceElement will need local,
native implementations. The exception framework needs to be filled in
in opcode_run() for exceptions thrown by Java virtual methods. All
handling of errors like java.lang.NoSuchMethodError is already taken
care of.

(20) The concept of the ThreadGroup is not implemented here (see JVM
spec section 2.16). This is given as an exercise to the project team.
The default implementation consists _ONLY_ of the method
java.lang.ThreadGroup.uncaughtException(). The rest of this class
must be implemented. This partial implementation may also need to
go away at that time. The basic reasoning for this is that the
functionality of java.lang.ThreadGroup does not seem to warrant
any native methods. Therefore, the whole of its functionality should
likely be implemented in a Java class library and should typically
behave properly without direct JVM core support.

(21) The sequence of loading and initializing the three critical
classes java.lang.Object, Class, and String in 'jvm_init()' is
somewhat fragile and probably not really sustainable. It might be
that the best thing to do is to either make classlib-specific
implementations of this initialization or to unload and reload these
classes once the original loading is done and let the normal <clinit>
be performed during this second attempt to load each of these classes.
The problem is one of chickens and eggs: Which comes first? The
logic is written and somewhat tested.
It needs to be firmed up.
An exercise for the team. Unloading and reloading support is already
provided with class_static_delete() and class_reload().

(22) The verification step (VM spec section 2.17.3) is given as an
exercise for the project team. This was done in the interest of time
and the fact that there may be third-party packages that may do this
nicely (BCEL?) without much further effort.

(23) The source code is documented using Doxygen in its normative
state. Although this tool can be _significantly_ contorted to perform
documentation of almost _any_ file type due to its capability to hook
into a user-specified input filter, this implementation has done
nothing outside of the usual bounds of the default product
configuration. With that said, there is _one_ small admission to make
here: Even though all of the code was written in ANSI 'C', the OO
concept of throwing exceptions is intrinsically bound up with the
runtime needs of a Java Virtual Machine. Therefore, one _small_
bending of the rules was permitted in the documentation-- use of the
C++ documentation keyword '@throws' (also known as '@exception' and
'@throw'). Functions that instantiate a java.lang.Throwable such as
'Error' or 'Exception' (and/or their subclasses) use the following
documentation syntax to present this behavioral feature to the
Doxygen output. The results will show up in the bolded section named
'Exceptions:' in the same way that parameters show up in the
'Parameters:' section and return values in the 'Returns:' section:

 *
 * @brief Function fn() doc header
 * ...
 * other doc info...
 * ...
 *
 * @param p1 what parm 1 does
 *
 * @returns what fn returns
 *
 * @throws JVMCLASS_some_kind_of_exception_string_name
 *         @link \#JVMCLASS_some_kind_of_exception_string_name
 *         brief description of how to make this happen @endlink.
 * /

*include "exit.h"
*include "jvmclass.h"

rettype fn(parmtype p1)
{
    ...
    if (problem_happened)
    {
        exit_throw_exception(JVMCLASS_some_kind_of_exception_string_name);
        /*NOTREACHED* /
    }

    ...

    return(normal_value);
}

Notice this function call is similar to exit_jvm(), but has the
added feature of initiating an exception, after which exit_jvm()
will be called with an appropriate exit code to shut down the JVM.


(25) When using the Eclipse C/C++ plugin, be aware of a bug in
that code: Sometimes it loses its brains and wants to debug the
test Java program instead of the C program. Look at the Run menu's
Debug dialog (Run|Debug) and you will notice that under "C/C++ Local
Application", the 'jvm' configuration is missing.
To correct
this, simply click on "C/C++ Local Application" and then click "New":

    Main:
        Project:                        jvm
        C/C++ Application:              bin/bootjvm
        Connect process input & output
            to a terminal:              yes
    Arguments:
        C/C++ Program Arguments:        (anything appropriate)
        Use default working directory:  yes
    Environment:
        (anything you like)
    Debugger:
        Debugger:                       GDB Debugger
        Stop in main() on startup:      yes (typically, no is okay)
        Main:
            GDB debugger:               gdb
            GDB command file:           <empty>
        Shared Libraries:
            (anything you like)
    Source:
        jvm                             yes
        test                            yes
    Common:
        Type of launch configuration:   local
        Display in favorites menu:      (anything appropriate)
        Launch in background:           yes

Now click 'Apply' and 'Close'. This should save it properly.
Now it may be used normally.


(24) A solid introductory project for someone is to go in to
'cfmsgs.c' and write an equivalent to cfmsgs_show_constant_pool()
for the field table and method table, suggested names being,
cfmsgs_show_fields_table() and cfmsgs_show_methods_table().
This method takes cfmsgs_typemsg() and displays each constant_pool[]
item. The same approach could be taken to these two utilities.
The first and most important use of these functions is in
classfile_loadclassdata().

In like manner, an attribute table display function needs to
be written to do the same thing for any attributes table.
Currently cfmsgs_atrmsg() shows the contents of a _single_
attribute, but there is nothing that can explicitly dump
an entire attribute table for fields, methods, or the (single)
class attribute table.

(25) While working on the above 'cfmsgs.c' display routines,
that same person should probably go in to classfile_loadclassdata()
and clean up the numerous individual cfmsgs_typemsg() calls for
inclusion in those routines. A few should probably still be
left in, such as 'cfmsgs_typemsg("this", pcfs, pcfs->this_class)'
which reports the 'this_class' member, but this process was followed
when writing cfmsgs_show_constant_pool() and it helped the readability
of the output _immensely_ to get rid of 'ad hoc' debug messages.

(26) The debug message level (DML) values need to be completely
overhauled for more effective use by developers.

(27) Does there need to be a tighter relationship between the
declaration of a native method via ACC_NATIVE when the class file
is loaded by classfile_loadclassdata() and the class resolution
logic of the linkage_resolve/unresolve_class() methods? What
about with thread_class_load() and related startup code that runs a
method?

(28) Although there have been _several_ significant passes at
desk-debugging 'threadstate.c' and 'threadutil.c' and in particular,
'jlThread.c', the JVM thread state machine needs a rigorous pass at
integration testing. The java.lang.Thread methods of 'jlThread.c'
and 'threadstate.c' are of particular concern, along with the hooks
in 'timeslice.c'. This testing has been left as an exercise
for the project team. Someone who has completed their Sun
Certified Java Developer certification and has good skills
in 'C' should be able to firm up this area of the code nicely. See
also item (5) above.

(29) In like manner to (28) above, unit testing of object monitor
locking has been left as an exercise for the project team.
Again,
someone who has completed their Sun Certified Java Developer
certification and is good with 'C' should be able to firm up
this area of the code nicely.

(30) The whole body of code needs to be reviewed for proper ordering
of PUSH() and POP() macros for (jlong) and (jdouble) variables.
The same goes for local variable access of the same. The way it
_should_ be done at this time is PUSH(MS), PUSH(LS) with corresponding
POP(LS), POP(MS). For local variables, var[n] := MS, var[n+1] := LS.
However, this needs proper scrutiny for completeness and correctness.

(31) A review needs to be made of the Eclipse project files. In
particular, do the C/C++ project settings need to be made independent
of the workspace settings for compile, link, and library archiver
options? Do the Java project settings need to be made similarly
independent? Should there be a workspace provided in the distribution
that has all these things set up? Currently, the projects are a bit
of a mixture of workspace and project settings. The 'build.sh'
scripts use a hard-coded set of GCC options that are derived from the
Eclipse setup. The 'config/*.gcc' and 'config/*.gccld' files reflect
this for the C source. The Java compilations in the 'build.sh'
scripts use the default Java compiler options. Does there need to be
a harmonization (sic) between the Eclipse and 'build.sh' settings?
Should they be manually maintained in Eclipse and 'config.sh'? These
are questions that need some review. Furthermore, this author is very
much a proponent of systematic use of 'gmake' as a premier project
build tool across the industry. It would be a really good idea for
use of 'gmake' to be reviewed. It is used internally by Eclipse,
but users should not be forced to use Eclipse just to get 'gmake'.
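
The word ordering given in item (30) above may be easier to review
with a small sketch in hand. The stack, the SKETCH_PUSH() and
SKETCH_POP() macros, and the (jlong) typedef below are assumptions
for illustration. They are not the PUSH() and POP() macros of this
code base, and no stack bounds checking is attempted.

```c
#include <stdint.h>

typedef int64_t jlong;      // assumed mapping of the Java type

// Hypothetical operand stack of 32-bit words
static uint32_t sketch_stack[16];
static int      sketch_sp;   // index of the next free word

#define SKETCH_PUSH(word) (sketch_stack[sketch_sp++] = (word))
#define SKETCH_POP()      (sketch_stack[--sketch_sp])

// PUSH(MS), PUSH(LS): the LS word ends up on top of the stack.
void sketch_push_jlong(jlong value)
{
    uint64_t bits = (uint64_t) value;

    SKETCH_PUSH((uint32_t) (bits >> 32));           // MS word
    SKETCH_PUSH((uint32_t) (bits & 0xffffffffu));   // LS word
}

// POP(LS), POP(MS): the exact reverse order.
jlong sketch_pop_jlong(void)
{
    uint32_t ls = SKETCH_POP();
    uint32_t ms = SKETCH_POP();

    return (jlong) (((uint64_t) ms << 32) | ls);
}

// Local variables: var[n] := MS, var[n+1] := LS.
void sketch_store_jlong(uint32_t *var, int n, jlong value)
{
    uint64_t bits = (uint64_t) value;

    var[n]     = (uint32_t) (bits >> 32);
    var[n + 1] = (uint32_t) (bits & 0xffffffffu);
}
```

A review along these lines would look for any place that pops the
MS word first or stores the LS word in var[n].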

(32) The inner loop of virtual instructions in opcode.c checks various
items at run time such as ACC_STATIC and ACC_FINAL and other items
that a class verifier should test. This is done so that the code
corresponds _directly_ with the definition of each instruction as
described in JVM spec section 6. Many of these tests (namely,
the examples above) should be moved to a byte code verifier because
they will not change from the time that the class is loaded until it
is unloaded. Obviously, checking them each time a virtual instruction
is run is _not_ efficient at run time! However, this implementation
is meant also to teach the principles and requirements of virtual
instructions, so the implementation of the JVM spec requirement was
done in the inner loop in opcode_run() as a sort of reference
implementation of each virtual instruction.

(33) One very good exercise for project team members to learn this
code will be to implement the exception logic as found in the
exception attribute of the program counter, namely jvm_pc.excpatridx.
This field is filled in correctly at method startup time, and there
should be enough hooks in the structure of the code to implement it
without much effort. What will be worth it is the learning process.
This logic will bear some strong similarities with LATE_CLASS_LOAD()
and class_load_resolve_clinit() when not using the system thread.

(34) It is similarly left as an exercise for the project team to
locate a method in a superclass or an interface if it is not found
in the current class. Currently, a request is made for a method
in the current class and a (jvm_method_index) is returned.
This is fine for JVM initialization when it is known which methods are
found where, but what _should_ happen in normal JVM runtime is that
the INVOKEVIRTUAL, INVOKEINTERFACE, INVOKESTATIC, etc., should check
if a method is in the current class. If not, check superclasses
until there are no more superclasses. If found, return a heap pointer
that contains the (jvm_class_index) and (jvm_method_index) of the located
method, otherwise return 'rnull', the NULL pointer. Class loading
does search for superclasses, so look for uses of 'pcfs->super_class'
or 'pcfs_recurse->super_class' for examples of how to do this.

(35) In the documentation, the @def and @struct tags were used
to introduce specific definition types. This practice somehow
was not carried through to all definitions. Similar cleanup should
be done for the following tags: @enum, @fn, @var, @typedef. None
of these were put in place, but would clean up the organization of
the output areas and add better indexing capabilities to the result
set. It also might ease the need for @link tags so that definitions
may be automatically tagged better, making room for cleanup of such
constructions in the narrative as:

    @link \#jvm_class_index jvm_class_index@endlink

to be simplified into,

    jvm_class_index

This would then ease the readability (in source code itself) of the
documentation. Notice that constructions such as,

    @link \#jvm_class_index a resulting class index@endlink

will not need cleanup since the target link is not the same as the
displayed text.

(36) An excellent tutorial for learning how objects are created using
non-default constructors is in jvm.c and class.c where string
parameters are loaded in from the argv[] command line and from
CONSTANT_String_info structures of the class file, respectively.
A suggestion is made in jvm.c as to how to accomplish this.

(37) After going back and forth as to whether or not a @throws
condition that happens in a subroutine should also be reported
back up the line in the function that called it, the result is
that sometimes it is reported and sometimes it is not. This needs
to be regularized. It is recommended to _not_ report it in the
calling function's documentation also since this is really not
an OO environment. Document it where it happens and let that stand.

@endverbatim
 *
 */
*/ /*
 * (Use #! and #/ with dox_filter.sh to fool Doxygen into
 * parsing this non-source text file for the documentation set.
 * Use the above open comment to force termination of parsing
 * since it is not a Doxygen-style 'C' comment.)
 *
 * EOF