
README

00001 /*!
00002  * @file ./README
00003  *
00004  * @brief Introductory annotations.
00005  *
00006  * @section Control
00007  *
00008  * \$URL: https://svn.apache.org/path/name/README $ \$Id: README 0 09/28/05 dlydick $
00009  *
00010  * Copyright 2005 The Apache Software Foundation
00011  * or its licensors, as applicable.
00012  * 
00013  * Licensed under the Apache License, Version 2.0 ("the License");
00014  * you may not use this file except in compliance with the License.
00015  * You may obtain a copy of the License at
00016  * 
00017  *     http://www.apache.org/licenses/LICENSE-2.0
00018  * 
00019  * Unless required by applicable law or agreed to in writing,
00020  * software distributed under the License is distributed on an
00021  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
00022  * either express or implied.
00023  * 
00024  * See the License for the specific language governing permissions
00025  * and limitations under the License.
00026  *
00027  *
00028  * @version \$LastChangedRevision: 0 $
00029  *
00030  * @date \$LastChangedDate: 09/28/2005 $
00031  *
00032  * @author \$LastChangedBy: dlydick $
00033  *         Original code contributed by Daniel Lydick on 09/28/2005.
00034  *
00035  *
00036  * @section Reference
00037  *
00038  * @see LICENSE
00039  *
00040  * @see INSTALL
00041  *
00042  * @note:   (In the following narratives, the normal documentation tags
00043  *          are @e not used so that this file may be used in a
00044  *          stand-alone fashion without the assistance of any document
00045  *          reader and without knowledge of such tags.)
00046  *
00047  * @note:   See also the documentation page named "Main Page" for an
00048  *          overview from the JVM spec section number perspective.
00049  *
00050  * @todo Need to verify which web document for the
00051  *       Java 5 class file definition is either "official",
00052  *       actually correct, or is the <em>de facto</em> standard.
00053  *
00054  * @verbatim
00055 
00056 
00057 Apache Harmony Bootstrap Java Virtual Machine
00058 =============================================
00059 
00060 This implementation of the Java Virtual Machine has been written
00061 as a (notice "a", not "the") reference implementation that comprises
00062 almost all of the facilities of a JVM without any further changes.
00063 
00064 Please see the file named 'INSTALL' in this same directory for
00065 instructions on installing and building the program.
00066 
00067 
00068 The goals of this effort are:
00069 
00070   (1) To provide a working Java Virtual Machine interpreter to the
00071       Apache Harmony project as a cornerstone for a Java runtime
00072       environment, especially considering the possibility of using
00073       it as a bootstrap JVM in the final project code base, where
00074       and if applicable.
00075 
00076   (2) To provide a working Java Virtual Machine to the Apache Harmony
00077       project as a starting point for architectural discussions
00078       to help the project get started in earnest, pulling itself
00079       up "by its bootstraps," as it were.  Thus a second reason to
00080       call this project the "bootstrap JVM."     :-|   :-O  :-)
00081 
00082   (3) To provide a highly modular implementation so pieces may be added
00083       and modified and removed with minimal impact to other pieces.
00084       For example, the heap allocation component may be easily
00085       switched between two modes, 'simple' and 'bimodal'.
00086       This is accomplished by running 'config.sh' and changing it
00087       there.  In like manner, the garbage collection component or any
00088       future component so constructed may be easily replaced without
00089       any changes to other code.  Other components may not be _quite_
00090       as hermetically sealed, such as the class file loader or threading
00091       mechanism, but the structure is present to improve upon the
00092       implementations, as desired.
00093 
00094   (4) To provide a simple code implementation written in straight,
00095       vanilla ANSI 'C' using the ubiquitous GCC compiler so that the
00096       code may be easily and efficiently adapted and modified to perform
00097       all tasks without esoteric tools, and utilizing the experience
00098       and creativity of _all_ contributors.  The only suggestion is
00099       that each contributor working on source code learn how to use
00100       Doxygen, a simple, powerful, and highly configurable C/C++
00101       documentation utility. There is an abundance of example commentary
00102       in the code showing how to use it for a wide variety of project
00103       documentation purposes.  Furthermore, its installation also
00104       includes extensive built-in documentation about itself.
00105 
00106   (5) To provide a very clear and concise code base for teaching JVM
00107       concepts to potential contributors.
00108 
00109   (6) To organize the code as a simple static library that can be linked
00110       into any larger body of code, and also to be able to connect JNI
00111       shared object files/dynamic load libraries to it easily.
00112 
00113   (7) To provide a sample main() program entry point for how to link
00114       and use the static library.
00115 
00116   (8) To provide a start on the native side of the JNI implementation
00117       of java.lang.* and java.io.* (etc.) that may be more fully
00118       fleshed out by project contributors as a way to start becoming
00119       familiar with the core code.  Also to provide a similar start on
00120       the JNI side for the same reasons, including a sample main()
00121       program there.
00122 
00123 This implementation is NOT intended to be:
00124 
00125   (1) The final Apache Harmony JVM.
00126 
00127   (2) The authority or standard against which JVM's are measured,
00128       whether by the Apache Harmony project or otherwise.
00129 
00130   (3) The most efficient possible implementation.  Issues of
00131       modularity, concise implementation, and ease of understanding
00132       how the code works take precedence over runtime efficiency
00133       issues, including CPU time, memory resources, and the like.
00134 
00135 This contribution consists of about 52,000 lines of 'C', Java, shell
00136 scripts, and data files.  It has been written, unit tested, and
00137 _lightly_ integration tested.  It is being contributed partly with
00138 testing in mind to familiarize contributors with the code as they
00139 work with it and continue with integration testing, especially while
00140 extending its feature set into a full-fledged, bullet-proof
00141 work of art.  This file contains a list of items that should be
00142 addressed, both features and bug fixes.  Furthermore, the source
00143 code and therefore the pre-formatted documentation contains over
00144 200 focused, specific enhancements, questions, and problems in
00145 the @todo list that should be addressed during this process.
00146 
00147 It is my hope that this contribution will be the seed that helps the
00148 Apache Harmony project to sprout, mature, and bloom into a first-class
00149 Java Virtual Machine that is worthy of the reputation of the
00150 Apache brand.
00151 
00152 
00153 Yours Truly
00154 
00155 
00156 
00157 Daniel Lydick
00158 September 28, 2005
00159 
00160 
00161 
00162 ---
00163 
00164 
00165 Configuration
00166 -------------
00167 Run the 'config.sh' script in this directory to configure for
00168 your environment.  It creates a './config' sub-directory with
00169 a './config/config.h' header file containing top-level compile
00170 parameters. This file is always referenced in every source file
00171 of the core JVM code by including "arch.h".  This script also
00172 creates other files there useful with the 'build.sh' scripts
00173 and contains normal compiler command line parameters.  Eclipse
00174 project files are also available which contain these same
00175 compile command line parameters.
00176 
00177 Whatever Java JDK you are currently using is probably fine for
00178 now when running this JVM.  Reading of JAR files is _not_ done via
00179 the JAR classes (yet), but with your '$JAVA_HOME/bin/jar' utility.
00180 The test classes may be compiled with your JDK's '$JAVA_HOME/bin/javac'
00181 compiler.  All this is in an attempt to leverage the functionality
00182 of your existing JDK to "bootstrap" this bootstrap JVM into existence.
00183 Eventually, of course, all this will get replaced with Harmony
00184 versions of all of these components.  A seed for these components
00185 is found in the Java classes in 'jni/src/harmony/generic/0.0'.
00186 Currently, these classes contain _only_ definitions for what is
00187 termed "local native methods".  These are JNI method calls to
00188 code that has intimate knowledge of the details of the JVM
00189 implementation, such as 'java.lang.Object.wait()'.  (See
00190 'jvm/src/native.c' for more information.)
00191 
00192 Once the 'config.sh' completes successfully, run 'build.sh' in any
00193 or all directories where it is found to build that directory or
00194 directory tree.  At the top level, 'build.sh all' will build the
00195 entire project.  Call it as 'build.sh help' for options.
00196 
00197 Eclipse project files are provided to do the same things with the
00198 same options except that it does not compile the Java classes in
00199 'jni/src/harmony/generic/0.0'.  (Notice that if you change
00200 anything, it will need to be changed for both 'build.sh' and for
00201 Eclipse if you want them both to work the same.)  The original
00202 development of this code was on a Sun Ultra 5 with Solaris 9
00203 running GCC 3.3.2, Gmake 3.80, GNU binutils 2.11.2, and GDB 6.0,
00204 coordinated under Eclipse 3.0.2 with the C/C++ plugin CDT 2.1.1.
00205 The JDK was Sun's 1.4.2_06-b03.  The source code was documented
00206 with Doxygen 1.4.4, Solaris version.
00207 
00208 
00209 Code organization
00210 -----------------
00211 (The following description is also found in 'jvm/src/jni'c for
00212 display on the "Main Page" documentation.)
00213 Several directories are provided within the source tree:
00214 
00215    jvm        Source code for JVM, including a main() wrapper.
00216               Builds binary file 'jvm/bin/bootjvm'.
00217 
00218    libjvm     For building 'jvm' as a statically linked
00219               library archive, less main() wrapper.  Builds
00220               library archive 'libjvm/lib/libjvm.a'.  Source
00221               code comes from the 'jvm' directory.
00222 
00223    main       A simple main() wrapper that links
00224               'libjvm/lib/libjvm.a' and builds binary
00225               file 'main/bin/bootjvm'.  Source code comes
00226               from the 'jvm' directory.
00227 
00228    jni        Source code for a sample JNI shared library
00229               'jni/harmony/generic/0.0/lib/bootjni.so'
00230               for linking with JNI code, but needs the
00231               build directives to be functional, as it
00232               currently links statically with a main() into
00233               a binary just like 'jvm'.  This directory
00234               contains a tree for JNI implementations from any
00235               supplier that wants to support the Harmony project.
00236               Currently, there is one JNI implementation here,
00237               found in 'jni/src/harmony/generic/0.0'.
00238 
00239    test       Builds numerous Java test classes in 'test/bin'
00240               for driving development work.
00241 
00242 With the exception of the Java test classes in 'test/src',
00243 all source code is found in 'jvm/src' and in the directory tree
00244 'jni/src/vendor/product/version'.  The purpose of 'libjvm'
00245 and 'main' is to demonstrate various possible organizations
00246 for the source code, namely for building a static library archive
00247 and for linking it.
00248 
00249 The source code is about 3 MB in size.  The final size of all
00250 parts of the compiled code tree is about 12 MB.  The full
00251 documentation tree in 'doc.ORIG' is about another 55 MB when fully
00252 installed.  It may be removed if desired in favor of maintaining
00253 _only_ the working documentation in 'doc' as generated by 'build.sh dox'
00254 at the top level, which will be the same size as 'doc.ORIG' if all
00255 documentation formats are desired, or less if fewer documentation
00256 formats are used.  For example, if only the HTML format is used,
00257 the 'doc' directory will be about 19 MB in size.
00258 
00259 Comments in the source will _always_ mention 'C' language "functions",
00260 but Java language "methods", and _never_ the reverse.  This is part
00261 of an attempt to separate the two compile and runtime domains.  See
00262 also comments on this subject below under 'jrtypes.h' and the source
00263 itself for additional comments about type definitions in the Java
00264 and real machine domains.
00265 
00266 Subsystem component abstraction
00267 -------------------------------
00268 The implementation of key Java concepts is performed in the following
00269 source files with corresponding navigation macros:
00270 
00271     Component        Source and header       Navigation
00272     ---------        -----------------       ----------
00273     Java class       class.[ch]              CLASS() macro
00274                      classutil.c
00275 
00276     Java object      object.[ch]             OBJECT() macro
00277                      objectutil.c
00278 
00279     Java thread      thread.[ch]             THREAD() macro
00280                      threadstate.c
00281                      threadutil.c
00282 
00283     Java method      method.[ch]             METHOD() macro
00284 
00285     Java class       field.[ch]              FIELD() macro
00286     static field,
00287     object instance
00288     field
00289 
00290     JVM registers    jvmreg.h                STACK(), et al
00291 
00292     Java native      native.[ch]             --
00293                      jlObject.[ch]           --
00294                      jlClass.[ch]            --
00295                      jlString.[ch]           --
00296                      jlThread.[ch]           --
00297 
00298     JVM spec
00299       class file     classfile.[ch]          Many, see cfmacros.h
00300 
00301     Heap modules     heap.h                  HEAP_xxx() macros
00302                      heap_simple.c
00303                      heap_bimodal.c
00304 
00305     Garbage          gc.h                    GC_xxx() macros
00306     collection       gc_stub.c
00307     modules
00308 
00309 
00310 
00311 By simply changing out the respective source file and adjusting
00312 the main navigation macro for that component, the implementation
00313 can be changed drastically without affecting the other components.
00314 (Larger implementations will have more than one source file, see
00315 especially 'threadstate.c'.)  It is _highly_ unlikely that the
00316 'classfile.[ch]' components will _ever_ change since this is under
00317 the direct control of the Java specification, but the others are
00318 under the control of this project and may be modified to suit its
00319 needs as desired.
00320 
00321 
00322 Support Scripts
00323 ===============
00324 Following is a short description of each script file and other
00325 support files:
00326 
00327 config.sh  <---  * * *   START HERE AFTER READING 'INSTALL' FILE   * * *
00328 ---------
00329 Introduce users to the project and how to set it up and configure it
00330 for various CPU platforms and for various functional features.
00331 This interactive shell script provides introductory material and
00332 a description of which versions of which software tools are needed
00333 to compile and document the project and how to administer the
00334 pre-formatted documentation.
00335 
00336 It then starts evaluating the existing Java JDK (as declared by the
00337 JAVA_HOME environment variable) for existence of the proper tools and
00338 verifies the name of the class library archive that will be used for
00339 temporary access to JVM startup classes such as the root object
00340 java.lang.Object.class .
00341 
00342 Once this introductory evaluation is complete, it will ask questions
00343 about how to configure the project for compile, runtime, and
00344 distribution features, as well as which components to build and
00345 to document.  Once the questions are answered, the project is
00346 configured and optionally built using 'build.sh cfg'.
00347 
00348 build.sh
00349 clean.sh
00350 common.sh
00351 ---------
00352 Top-level build scripts that invoke build scripts of the same names
00353 at the various levels in the directory tree.  The one named 'build.sh'
00354 compiles the source code, while the one named 'clean.sh' removes the
00355 effects of that build.  The shared file 'common.sh' is used by both
00356 of these scripts.  Notice that nowhere in the tree except here at the
00357 top level will the documentation build occur, as it is a global process
00358 due to interdependencies of @link and @see directives, among others.
00359 
00360 getsvndata.sh
00361 getsvndups.sh
00362 -------------
00363 Show a list of all revisions of all source files compiled into an
00364 object file, a library archive, or a linked binary with 'getsvndata.sh'.
00365 Show a list of conflicting revisions using 'getsvndups.sh'.  Object
00366 files may not have conflicts, neither may library archives.  Linked
00367 binaries may or may not, depending on the particulars.
00368 
00369 echotest.sh
00370 -----------
00371 Generic script support for 'echo -n' feature for shells that do not
00372 support it natively.
00373 
00374 
00375 jvm/build.sh
00376 libjvm/build.sh
00377 main/build.sh
00378 test/build.sh
00379 jni/src/harmony/generic/0.0/build.sh
00380 jvm/clean.sh
00381 libjvm/clean.sh
00382 main/clean.sh
00383 test/clean.sh
00384 jni/src/harmony/generic/0.0/clean.sh
00385 jvm/common.sh
00386 libjvm/common.sh
00387 main/common.sh
00388 test/common.sh
00389 jni/src/harmony/generic/0.0/common.sh
00390 -------------------------------------
00391 As at the top level, each relevant directory level has a build
00392 script that compiles the source code ('build.sh') and removes the
00393 output files from that build ('clean.sh').  These files share a
00394 common file ('common.sh') also.  The output of 'libjvm' is stored
00395 in a 'libjvm/lib' subdirectory, while the output of the other scripts
00396 is stored in a '______/bin' subdirectory.
00397 
00398 dox.sh
00399 undox.sh
00400 commondox.sh
00401 ------------
00402 The logic behind the documentation build using Doxygen.  The output
00403 is stored by 'dox.sh' into a 'doc' subdirectory at this level, while
00404 the 'undox.sh' removes it.  They share a common file 'commondox.sh'.
00405 In order to speed up the documentation build during development,
00406 define the environment variable SUPPRESS_DOXYGEN_VERYCLEAN as any
00407 non-null string.  See logic of 'dox.sh' for other comments.
00408 
00409 dist-src.sh
00410 dist-doc.sh
00411 dist-bin.sh
00412 -----------
00413 Construct a distribution and store it above the top of the
00414 directory tree as a gzipped tar file, where CONFIG_RELEASE_LEVEL is
00415 the release level defined the last time that 'config.sh' was run:
00416 
00417     Type    Script       Output TAR file
00418     ----    ------       ---------------
00419     Source  dist-src.sh  ../../bootJVM-src-$CONFIG_RELEASE_LEVEL.tar.gz
00420     (plus
00421       docs)
00422 
00423     Docs    dist-doc.sh  ../../bootJVM-doc-$CONFIG_RELEASE_LEVEL.tar.gz
00424 
00425     Binary  dist-bin.sh  ../../bootJVM-bin-$CONFIG_RELEASE_LEVEL.tar.gz
00426     (plus
00427       docs)
00428 
00429 A side effect of the source distribution (only) is the creation of a
00430 file in this directory named 'bootJVM-docs.tar.gz'.  It is included
00431 in the distribution as a part of the deliverables and contains
00432 installable documentation files.  Its installation is managed with
00433 'config.sh' where it references the "pre-formatted documentation".
00434 
00435 Other Files
00436 ===========
00437 
00438 INSTALL  <---  * * *   START HERE * * *
00439 -------
00440 Instructions for installing the source code and building the
00441 binaries and the documentation.
00442 
00443 LICENSE
00444 -------
00445 The Apache Software Foundation license text used by the ASF
00446 for all software distributions.
00447 
00448 README
00449 ------
00450 This file.
00451 
00452 bootjvm.dox
00453 dox_filter.sh
00454 -------------
00455 'bootjvm.dox' is the Doxygen directive file used to create
00456 all project documentation, invoked by 'dox.sh'.
00457 
00458 'dox_filter.sh' is the filter script declared in 'bootjvm.dox'
00459 for filtering input files.  It is declared as the
00460 'INPUT_FILTER=' parameter in 'bootjvm.dox' and is necessary
00461 to properly format all files that are not explicitly '.c'
00462 or '.h' or '.java' source files.
00463 
00464 svnstat.sh
00465 ----------
00466 Sample script from Doxygen documentation to display the status of
00467 a file in SVN.  Not used in the project for any purpose.
00468 
00469 test/.project
00470 test/.classpath
00471 ---------------
00472 Eclipse project files for the 'test' Java project.
00473 
00474 jvm/.project
00475 jvm/.cdtproject
00476 jvm/.cdtbuild
00477 libjvm/.project
00478 libjvm/.cdtproject
00479 libjvm/.cdtbuild
00480 main/.project
00481 main/.cdtproject
00482 main/.cdtbuild
00483 jni/.project
00484 jni/.cdtproject
00485 jni/.cdtbuild
00486 -------------------
00487 Eclipse project files for the several C/C++ projects.
00488 
00489 
00490 
00491 Output areas
00492 ============
00493 
00494 bootclasspath/
00495 --------------
00496 Directory containing default value of BOOTCLASSPATH environment
00497 variable.  THIS ABSOLUTE PATH NAME IS COMPILED INTO SOURCE CODE
00498 AND WILL NEED TO ULTIMATELY BE PHASED OUT.
00499 
00500 config/
00501 -------
00502 Directory where all results from 'config.sh' are stored.  These
00503 files are used by the various shell scripts and by Doxygen to
00504 build various aspects of the project.
00505 
00506 jvm/bin
00507 main/bin
00508 test/bin
00509 jni/src/harmony/generic/0.0/bin
00510 -------------------------------
00511 Output area respectively from 'jvm/build.sh', 'main/build.sh',
00512 'test/build.sh', and 'jni/src/harmony/generic/0.0/build.sh'.
00513 
00514 libjvm/lib
00515 ----------
00516 Output area from 'libjvm/build.sh'
00517 
00518 jni/bin
00519 -------
00520 Output area for the Eclipse 'jni' project.  Source distributions
00521 (via 'dist-src.sh') cannot be performed until this directory has been
00522 manually removed or an Eclipse 'clean' operation done for the
00523 'jni' project.
00524 
00525 
00526 Output files
00527 ============
00528 
00529 bootclasspath/*.class
00530 bootclasspath/*/*.class
00531 --------------
00532 Java class files that are found in the BOOTCLASSPATH environment
00533 variable.  They are extracted from your JDK's runtime JAR file
00534 and are used to start up the JVM.  They will eventually get phased
00535 out of the project when these classes have been developed by the
00536 project team.  Whether or not replacements from the project are
00537 stored here is TBD.
00538 
00539 config/config.h
00540 ---------------
00541 Although the other files stored in the 'config' directory are not
00542 listed here, they are derived along with this file from 'config.sh'
00543 to control the compile and run time features of the project.  This
00544 file specifically can be used as a reference when running 'config.sh'
00545 again to remember what settings were configured.  It will be very
00546 obvious from that script as to which definitions here match the
00547 questions.
00548 
00549 jvm/bin/bootjvm
00550 main/bin/bootjvm
00551 test/bin/*.class
00552 test/bin/*/*.class
00553 jni/src/harmony/generic/0.0/bin/bootjvm
00554 jni/src/harmony/generic/0.0/bin/*/*.class
00555 -----------------------------------------
00556 The output files respectively from the builds of the 'jvm',
00557 'main', 'test', and 'jni/src/harmony/generic/0.0'
00558 projects.
00559 
00560 libjvm/lib/libjvm.a
00561 -------------------
00562 Output file from the 'libjvm' project build.
00563 
00564 
00565 Source code
00566 ===========
00567 Following is a short description of each source file.  All function
00568 names start with the name of their source file, 'filename_function()'
00569 in keeping with the OO concept of packaging all related code and data
00570 into the same source file.  See also 'jrtypes.h' for comments on naming
00571 conventions for certain data types.
00572 
00573 The JNI source code is grouped separately, as is the test suite.
00574 
00575 
00576 jvm/src/arch.h
00577 --------------
00578 Configure the compilation of each source file with architectural
00579 parameters, especially from the configuration script 'config.sh'.
00580 Provide copyright information for the binary edition of each source
00581 file.  This file MUST be included by all source files in
00582 'jvm/include' and in 'jvm/src'.  Do NOT include it in 'jni'
00583 source files.  There needs to be an equivalent for the Java code
00584 written for these features (see to-do item in source).
00585 
00586 jvm/src/argv.c
00587 --------------
00588 Parse the JVM command line.  For easy help, call program
00589 with '-help' option.
00590 
00591 jvm/src/attribute.c
00592 jvm/src/attribute.h
00593 -------------------
00594 Handle class file attributes.  The type definition
00595 'jvm_attribute_index' is the key to properly using class file
00596 attributes.
00597 
00598 jvm/src/bytegames.c
00599 -------------------
00600 Manipulate bytes for various purposes such as byte swapping, reading
00601 2-byte structures from 1-byte addresses, or 4-byte or 8-byte structures
00602 from 1- or 2-byte addresses, etc.  (These last are due to use of
00603 structure packing, especially in parsing class file data and parsing
00604 virtual opcode operands).
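
The unaligned reads described above can be sketched as follows.  This is
only an illustration, not the code in 'bytegames.c': the function names
'sketch_get_u2()' and 'sketch_get_u4()' are hypothetical.  It relies on
the fact that class file data is stored in big-endian byte order, and it
assembles each value one byte at a time so that the source address never
needs to be aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a big-endian 2-byte value from any byte
 * address, aligned or not, by combining the bytes one at a time. */
static uint16_t sketch_get_u2(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(((uint16_t)p[0] << 8) | (uint16_t)p[1]);
}

/* Hypothetical helper: same idea for a big-endian 4-byte value. */
static uint32_t sketch_get_u4(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

Because no multi-byte load is ever issued against the buffer, the same
code behaves identically on big-endian and little-endian real machines.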
00605 
00606 jvm/src/cfattrib.c
00607 ------------------
00608 Process the attribute fields of class file data.  Since attributes
00609 are a large and specialized area of the JVM spec class file
00610 definition, they have been broken out of 'classfile.c'.
00611 
00612 jvm/src/cfmacros.h
00613 ------------------
00614 Macros for navigating the class file structures of 'classfile.h'.
00615 
00616 jvm/src/cfmsgs.c
00617 ----------------
00618 Diagnostic messages for class file data.
00619 
00620 jvm/src/class.c
00621 jvm/src/class.h
00622 ---------------
00623 Handle Java classes in the real machine implementation.
00624 The type definition 'jvm_class_index' is the key to properly
00625 using Java file classes.  Notice that this index is implemented
00626 in such a way that the underlying implementation could be
00627 completely changed from a simple array of structures to something
00628 unrelated and there would be only a few changes to some macros
00629 necessary to support that new implementation.  Such an implementation
00630 might call this type definition a 'jvm_class_hash' instead, for
00631 example.  The design decision, however, was to try to maintain a
00632 separate structure for classes than for objects.  Each class does
00633 have an object that contains its object-ish components.  This
00634 object is also implemented to satisfy the spec requirements that
00635 a class also have a class object.  See 'linkage.h' as to how to
00636 navigate between classes, objects, and threads.
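
A minimal sketch of the index-plus-navigation-macro idea follows.  All
names here ('sketch_class_index', 'sketch_class', 'SKETCH_CLASS()',
'sketch_class_alloc()') are hypothetical stand-ins and are not taken
from 'class.h'.  Reserving slot zero as a "no class" value is also an
assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an index type such as 'jvm_class_index' */
typedef unsigned short sketch_class_index;

/* Hypothetical, heavily reduced class table entry */
typedef struct {
    int         in_use;   /* nonzero when this slot holds a class */
    const char *name;     /* class name, for illustration only */
} sketch_class;

#define SKETCH_MAX_CLASSES 8

/* The underlying storage: a simple array of structures */
static sketch_class sketch_class_table[SKETCH_MAX_CLASSES];

/* Navigation macro: callers go through this and never touch the table
 * directly, so only this macro changes if the storage is replaced. */
#define SKETCH_CLASS(clsidx) (sketch_class_table[(clsidx)])

/* Claim the first free slot; slot 0 is reserved (an assumption of this
 * sketch) so that 0 can be returned to mean "no slot available". */
static sketch_class_index sketch_class_alloc(const char *name)
{
    sketch_class_index i;

    assert(name != NULL);
    for (i = 1; i < SKETCH_MAX_CLASSES; i++) {
        if (!SKETCH_CLASS(i).in_use) {
            SKETCH_CLASS(i).in_use = 1;
            SKETCH_CLASS(i).name   = name;
            return i;
        }
    }
    return 0;
}
```

If the table were later replaced by a hash or a list, code written
against 'SKETCH_CLASS(idx).name' would not need to change.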
00637 
00638 jvm/src/classfile.c
00639 jvm/src/classfile.h
00640 -------------------
00641 Definition of JVM spec, version 2, section 4, for JDK 1.5, namely,
00642 the class file structure, and its implementation.  The version of
00643 the class file definition (section 4) is listed near the top of
00644 the header file.  It is one of several that have been floating
00645 around, but appears to meet the JDK 1.5 attribute extensions
00646 rather exactly, and is _assumed_ (that is a VERY big assumption!)
00647 to be class file 'major.minor' version '49.0', namely the JDK 1.5
00648 class file format.
00649 
00650 
00651 All symbol definitions are declared _exactly_ as shown in this
00652 specification with the single exception of array declarations like,
00653 
00654     u2      constant_pool_count;
00655     cp_info constant_pool[constant_pool_count - 1];
00656 
00657 Such definitions are instead declared as,
00658 
00659     u2      constant_pool_count;
00660     cp_info **constant_pool;
00661 
00662 and an array of pointers to (cp_info) is allocated on the heap of
00663 size 'constant_pool_count'.  (In this one case, element zero is
00664 defined as not being used, so it is always a NULL pointer.)
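
The allocation just described might look roughly like the sketch below.
It is an illustration under stated assumptions, not the code in
'classfile.c': the structure 'sketch_classfile' and the functions
'sketch_cp_alloc()' and 'sketch_cp_free()' are hypothetical, and the
'cp_info' shown is reduced to a tag plus a one-byte payload.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint8_t  u1;
typedef uint16_t u2;

/* Reduced form of the spec's cp_info; the real payload varies by tag */
typedef struct {
    u1 tag;
    u1 info[1];
} cp_info;

/* Hypothetical holder for just the two members discussed above */
typedef struct {
    u2        constant_pool_count;
    cp_info **constant_pool;
} sketch_classfile;

/* Allocate 'count' pointers and set every one to NULL, so that the
 * unused element zero is a NULL pointer.  Returns 0 on success and
 * -1 on failure (a count of zero is rejected in this sketch). */
static int sketch_cp_alloc(sketch_classfile *pcfs, u2 count)
{
    u2 i;

    assert(pcfs != NULL);
    if (count == 0) {
        return -1;
    }
    pcfs->constant_pool = malloc((size_t)count * sizeof(cp_info *));
    if (pcfs->constant_pool == NULL) {
        return -1;
    }
    for (i = 0; i < count; i++) {
        pcfs->constant_pool[i] = NULL;
    }
    pcfs->constant_pool_count = count;
    return 0;
}

/* Release the pointer array and any entries that were filled in */
static void sketch_cp_free(sketch_classfile *pcfs)
{
    u2 i;

    assert(pcfs != NULL);
    for (i = 0; i < pcfs->constant_pool_count; i++) {
        free(pcfs->constant_pool[i]);   /* free(NULL) is a no-op */
    }
    free(pcfs->constant_pool);
    pcfs->constant_pool       = NULL;
    pcfs->constant_pool_count = 0;
}
```

Entries one and up would be filled in by the class file parser as it
reads each constant; that step is omitted here.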
00665 
00666 For purposes of real machine word alignment, type 'cp_info' and
00667 'attribute_info' have been embedded in a _slightly_ larger structure
00668 to keep 2- and 4-byte member references on the correct real machine
00669 address boundaries.  The embedding structures are called 'cp_info_dup'
00670 and 'attribute_info_dup', respectively, as they duplicate the contents
00671 of the smaller structure, if you will.
00672 
00673 All symbol definitions that are not _explicitly_ found in the spec
00674 are locally defined for use in this implementation.  They are prefixed
00675 with the string "LOCAL_" to distinguish them from spec definitions.
00676 For example, ACC_PUBLIC marks a class as public, but the local
00677 definition ACC_EMPTY means no definition at all.  Since it is not
00678 found in the spec, it is actually named LOCAL_ACC_EMPTY here.
00679 
00680 
00681 jvm/src/classpath.c
00682 jvm/src/classpath.h
00683 -------------------
00684 Manage and navigate CLASSPATH environment variable.
00685 
00686 jvm/src/classutil.c
00687 -------------------
00688 Utilities for handling real machine class structures.
00689 
00690 jvm/src/exit.c
00691 jvm/src/exit.h
00692 --------------
00693 Exit codes for exit(3) and fatal error handling for JVM runtime.
00694 This code implements non-local subroutine returns using the
00695 standard library calls setjmp(3) and longjmp(3).  If you want
00696 to understand this non-local return mechanism, you _must_ read
00697 the man pages for these functions and study the examples.  They
00698 are extremely useful in processing fatal error conditions and
00699 state machine conditions under which there is no valid return
00700 or repair of an invalid state or irreconcilable condition.
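
For readers new to this mechanism, here is a small sketch of a
non-local return using setjmp(3) and longjmp(3).  It is not the code in
'exit.c'; the names 'sketch_run()', 'sketch_fatal()',
'sketch_deep_work()', and the exit code value are all hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

/* Saved execution context that a fatal error jumps back to */
static jmp_buf sketch_exit_env;

/* Hypothetical nonzero exit code for an unrecoverable state */
#define SKETCH_EXIT_BAD_STATE 7

/* Abandon the current call chain and resume at the setjmp() site,
 * which then sees 'code' as the value of setjmp(). */
static void sketch_fatal(int code)
{
    assert(code != 0);   /* zero is reserved for the direct return */
    longjmp(sketch_exit_env, code);
}

/* Stands in for work nested arbitrarily deep in the call tree */
static void sketch_deep_work(int fail)
{
    if (fail) {
        sketch_fatal(SKETCH_EXIT_BAD_STATE);
    }
}

/* Arm the handler, run the work, and report how it ended */
static int sketch_run(int fail)
{
    switch (setjmp(sketch_exit_env)) {
    case 0:                       /* direct return: handler is armed */
        sketch_deep_work(fail);
        return 0;                 /* work finished normally */
    case SKETCH_EXIT_BAD_STATE:   /* arrived here through longjmp() */
        return SKETCH_EXIT_BAD_STATE;
    default:
        return -1;                /* unexpected code */
    }
}
```

Calling 'sketch_run(0)' takes the normal path, while 'sketch_run(1)'
unwinds from 'sketch_deep_work()' straight back to the switch statement
without returning through the intermediate functions.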
00701 
00702 jvm/src/field.c
00703 jvm/src/field.h
00704 ---------------
00705 Handle Java virtual machine variables for both class static fields
00706 and object instance fields.  The type definitions 'jvm_field_index'
00707 and 'jvm_field_lookup_index' are the keys to properly using Java fields.
00708 
00709 jvm/src/gc.h
00710 ------------
00711 API for all garbage collection algorithms.
00712 
00713 jvm/src/gc_stub.c
00714 -----------------
00715 Default implementation of garbage collection (a do-nothing
00716 implementation).  This code is meant to be replaced by one
00717 or more GC algorithms.  The GC insertion points in the code
00718 body corporate should stand for all algorithms.  See 'heap_*.c'
00719 and header file 'heap.h' for an example of implementing
00720 multiple algorithms on a single API.
00721 
00722 jvm/src/heap.h
00723 --------------
00724 API for all heap management algorithms.
00725 
00726 jvm/src/heap_simple.c
00727 ---------------------
00728 Default implementation of heap allocation using malloc(3)
00729 and free(3).  One option is presented to return an allocated
00730 area that has been initialized to zeroes, which is useful for
00731 structure initialization.
00732 
00733 jvm/src/heap_bimodal.c
00734 ----------------------
00735 An improvement over 'heap_simple.c' where a large memory block
00736 is allocated from whence come all allocations up to a certain
00737 size.  Beyond that, the 'heap_simple.c' algorithm is used.
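
A minimal sketch of the bimodal idea follows.  It assumes a static
array of fixed-size slots for small requests and malloc(3) for
everything else.  The slot size, slot count, and every 'demo_' name
are assumptions made for illustration and may not match what
'heap_bimodal.c' actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes and names, chosen only for illustration */
#define DEMO_SLOT_SIZE   64   /* largest "small" request, bytes */
#define DEMO_SLOT_COUNT 128   /* slots carved from the big block */

/* The extra union members only force a conservative alignment */
typedef union
{
    unsigned char bytes[DEMO_SLOT_SIZE];
    long double   align_ld;
    void         *align_p;
} demo_slot;

static demo_slot     demo_slots[DEMO_SLOT_COUNT];
static unsigned char demo_in_use[DEMO_SLOT_COUNT];

/* Return the slot number that 'p' points to, or -1 if none */
static int demo_slot_index(const void *p)
{
    int i;

    for (i = 0; i < DEMO_SLOT_COUNT; i++)
    {
        if (p == (const void *) demo_slots[i].bytes)
        {
            return i;
        }
    }
    return -1;
}

void *demo_heap_get(size_t len)
{
    int i;

    if (len <= DEMO_SLOT_SIZE)
    {
        for (i = 0; i < DEMO_SLOT_COUNT; i++)
        {
            if (!demo_in_use[i])
            {
                demo_in_use[i] = 1;
                return demo_slots[i].bytes;
            }
        }
    }

    /* Too large, or no slot left:  fall back to malloc(3) */
    return malloc(len);
}

void demo_heap_free(void *p)
{
    int i = demo_slot_index(p);

    if (i >= 0)
    {
        assert(demo_in_use[i]);   /* catch a double free */
        demo_in_use[i] = 0;
    }
    else
    {
        free(p);
    }
}
```

The linear slot search keeps the sketch short.  A free list would
be the obvious refinement if allocation speed mattered.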
00738 
00739 jvm/src/jrtypes.c
00740 jvm/src/jrtypes.h
00741 -----------------
00742 Map Java data types ('j') to real machine ('r') types.  Much of the
00743 code uses these two letters prefixed to variable names to distinguish
00744 between Java and real machine data types.  The other common prefix
00745 used throughout the code is 'p' for pointer, such as '(char *) pstr'
00746 or '(jint *) parray'.  However, this convention is for instances of
00747 pointers to _any_ data type, and may be used for either the 'C' or
00748 the Java domains.  Compiled constant definitions are found
00749 in 'jrtypes.c' also.
00750 
00751 jvm/src/jvalue.h
00752 ----------------
00753 Java data type aggregate for storage of class static field
00754 and object instance fields.  One type fits all.
00755 
00756 jvm/src/jvm.c
00757 jvm/src/jvm.h
00758 -------------
00759 The main JVM control structure for running the Java virtual
00760 machine on this real machine.  There is virtually no global
00761 storage in this code, and only sparing use of static (file scope)
00762 storage.  Most everything else is either found here, is a
00763 local variable, or is on the heap.  The code in 'jvm.c' initializes,
00764 runs, and shuts down the JVM.
00765 
00766 jvm/src/jvmcfg.c
00767 jvm/src/jvmcfg.h
00768 ----------------
00769 Top-level configuration of JVM parameters (not including what is
00770 mostly compile-time setup with 'config.sh' and its 'config/config.h'
00771 header file.)  All sorts of things are defined here, from OS-dependent
00772 directory delimiters to JVM command line parameter strings.  A number
00773 of significant typedefs are also found here, as well as high and
00774 low limits on data types, etc.  Compiled constant definitions are
00775 found in 'jvmcfg.c' also for NULL and BAD definitions for major
00776 typedefs.
00777 
00778 jvm/src/jvmclass.h
00779 ------------------
00780 Fully qualified class name strings (internal form) for a wide
00781 variety of classes needed by the JVM at run time.  Of special
00782 significance is a large number of error and exception classes.
00783 
00784 jvm/src/jvmregs.h
00785 -----------------
00786 Definition of JVM program counter, stack, and stack navigation.
00787 
00788 jvm/src/jvmutil.c
00789 -----------------
00790 Utilities for debug message levels, stack dumps, etc.
00791 
00792 jvm/src/linkage.c
00793 jvm/src/linkage.h
00794 -----------------
00795 The header contains linkages between class, object, and thread
00796 structures.  The source file
00797 contains late binding linkage functions to link field references
00798 and method references from one class into their definitions in
00799 another class.
00800 
00801 jvm/src/main.c
00802 main/src/main.c (symbolic link to jvm/src/main.c)
00803 ---------------
00804 A sample main() entry point that calls the JVM from either a
00805 library archive or by direct object linkage.
00806 
00807 jvm/src/manifest.c
00808 ------------------
00809 Read and process selected properties from a JAR manifest file.
00810 
00811 jvm/src/method.c
00812 jvm/src/method.h
00813 ----------------
00814 Handle Java virtual machine methods.  The type definition
00815 'jvm_method_index' is the key to properly using Java methods.
00816 
00817 jvm/src/native.c
00818 jvm/src/native.h
00819 ----------------
00820 Support for JNI native methods and what are called local native
00821 methods (those with intimate knowledge of the inner workings of
00822 the JVM).  Local native methods may, but are not required to, use
00823 the JNI interface, but a significant shortcut is provided for more
00824 direct connection between them and the JVM.
00825 
00826 jvm/src/nts.c
00827 jvm/src/nts.h
00828 jvm/src/unicode.c
00829 jvm/src/unicode.h
00830 jvm/src/utf8.c
00831 jvm/src/utf8.h
00832 -----------------
00833 Utilities for null-terminated strings, UTF8 strings, and Unicode
00834 strings, including conversion back and forth, getting and putting,
00835 length functions, and examination/conversion of Java class formatting
00836 in various types.  Notice that there are three types of strings
00837 in this JVM:  (1) 'C' style strings of ASCII characters terminated by
00838 an ASCII '\0' (NUL) byte, also known as "pointer to real-machine
00839 character" strings, or 'prchar' variables; (2) UTF8 strings as
00840 implemented with the (CONSTANT_Utf8_info) type definition, particularly
00841 as related to Java class files; (3) Unicode strings, which consist of a
00842 (jchar)[] type array variable and a (u2) length variable.
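
As a rough illustration of moving between the first and third
string types, the sketch below widens a NUL-terminated ASCII string
into a (jchar)[] style array with a separate length.  The typedefs
and the function name are hypothetical stand-ins and are not the
ones declared in 'nts.h' or 'unicode.h'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the (u2) and (jchar) typedefs */
typedef unsigned short demo_u2;
typedef demo_u2        demo_jchar;

/*
 * Widen a NUL-terminated 7-bit ASCII string into a jchar array.
 * Returns the number of characters stored, at most 'max'.
 * The output is NOT NUL-terminated:  the length is the result.
 */
demo_u2 demo_nts2unicode(const char *prchar,
                         demo_jchar *out,
                         demo_u2     max)
{
    demo_u2 len = 0;

    assert((NULL != prchar) && (NULL != out));

    while ((len < max) && ('\0' != prchar[len]))
    {
        out[len] = (demo_jchar) (unsigned char) prchar[len];
        len++;
    }
    return len;
}
```

For 7-bit ASCII input the UTF8 bytes are identical to the 'C'
string bytes, so only the Unicode form needs a wider element type.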
00843 
00844 jvm/src/object.c
00845 jvm/src/object.h
00846 ----------------
00847 Handle Java objects in the real machine implementation.
00848 The type definition 'jvm_object_hash' is the key to properly
00849 using Java objects.  Notice that this hash is implemented
00850 in such a way that the underlying implementation could be
00851 completely changed from a simple array of structures to something
00852 unrelated and there would be only a few changes to some macros
00853 necessary to support that new implementation.  See 'linkage.h'
00854 for how to navigate between classes, objects, and threads.
00855 
00856 jvm/src/objectutil.c
00857 --------------------
00858 Utilities for Java objects.  Support for synchronize()
00859 and unsynchronize() is found here, namely, the object monitor
00860 locks.  Other utility functions may be placed here as appropriate.
00861 
00862 jvm/src/opcode.c
00863 jvm/src/opcode.h
00864 ----------------
00865 The JVM spec, version 2, section 9, operation code list is found
00866 in the header file directly as defined in the spec.  The code
00867 contains the JVM inner execution loop.
00868 
00869 jvm/src/stdio.c
00870 ---------------
00871 Standard input/output functions for stdout, stderr, buffer formatting,
00872 etc.  CAVEAT EMPTOR (let the buyer beware!):  This file does _NOT_ use
00873 structure packing because on the original implementation (Solaris 9
00874 with GCC 3.3.2), the standard I/O library was _not_ compiled with
00875 structure packing and caused strange runtime errors and SIGSEGV on
00876 normal library calls.  Therefore, most standard I/O was gathered into
00877 this file, with the exception of 'manifest.c', which does not seem
00878 to have the problem.
00879 
00880 jvm/src/thread.c
00881 jvm/src/thread.h
00882 ----------------
00883 Handle JVM threads in the real machine implementation.
00884 The type definition 'jvm_thread_index' is the key to
00885 properly using threads.  Notice that this index is
00886 implemented in such a way that the underlying
00887 implementation could be completely changed from a
00888 simple array of structures to something unrelated
00889 and there would be only a few changes to some macros
00890 necessary to support that new implementation.  See 'linkage.h'
00891 for how to navigate between classes, objects, and threads.
00892 
00893 jvm/src/threadstate.c
00894 ---------------------
00895 Thread state machine individual states.
00896 A lengthy narrative near the beginning of this file defines
00897 the JVM thread state machine, which is then implemented in 'jvm.c'
00898 by jvm_run().
00899 
00900 jvm/src/threadutil.c
00901 --------------------
00902 Utilities for normal operation of the thread state machine.
00903 
00904 jvm/src/timeslice.c
00905 -------------------
00906 Real-time JVM time slice timer for limiting the amount of time
00907 that a thread may run (see also 'jvm.c' and 'opcode.c') before
00908 the next thread is given some time.
00909 
00910 jvm/src/tmparea.c
00911 -----------------
00912 Private disk storage area for running the JVM.  It is set up and
00913 torn down by this code and is referenced through a getter function.
00914 
00915 jvm/src/util.h
00916 --------------
00917 Miscellaneous support macros and function prototypes from various
00918 of the source files.  This file contains a list of which source
00919 files have their prototypes listed here, and those that could
00920 but do _not_ for specific reasons.
00921 
00922 
00923 A WORD ABOUT FUNCTION PROTOTYPES
00924 ================================
00925 Each and every source 'filename.c' that has a 'filename.h' associated
00926 with it will locate its function prototypes in that header file.
00927 The few exceptions are listed in 'jvm/src/util.h'.
00928 
00929 
00930 A WORD ABOUT SOURCE FORMATTING
00931 ==============================
00932 Each and every source file is formatted so that not one single line
00933 goes beyond 72 characters in width except where absolutely unavoidable.
00934 Furthermore, there is not one single TAB character (ASCII 9, 011,
00935 0x09, \u0009) in the entire body of code.  The only place it is found
00936 is in the Eclipse '.project' and '.classpath' files.  All indentions
00937 are made at 4-column intervals.
00938  
00939 The purpose of these two constraints is so that absolutely anyone with
00940 any text editor may view the source in a standard 80-column text editor,
00941 with line numbers, and be able to see the whole line without any line
00942 wrapping.  This constraint makes the interchange of source code
00943 between project contributors much simpler.  Although there is a case
00944 to be made for wider lines, many developers like 80 column constraints
00945 anyway, or some other number.  This choice of constraint should provide
00946 maximum flexibility of interchange of source files amongst all
00947 contributors.
00948 
00949 
00950 JNI SOURCE CODE
00951 ===============
00952 
00953 jni/src/harmony/generic/0.0/src/java/lang/Object.java
00954 jni/src/harmony/generic/0.0/src/java/lang/Class.java
00955 jni/src/harmony/generic/0.0/src/java/lang/String.java
00956 jni/src/harmony/generic/0.0/src/java/lang/Thread.java
00957 -----------------------------------------------------
00958 Sample segments of java.lang.* package that contain the
00959 native declarations in these classes.  (Not comprehensive
00960 either for each class or for the package.)
00961 
00962 jni/src/harmony/generic/0.0/include/java_lang_Object.h
00963 jni/src/harmony/generic/0.0/include/java_lang_Class.h
00964 jni/src/harmony/generic/0.0/include/java_lang_String.h
00965 jni/src/harmony/generic/0.0/include/java_lang_Thread.h
00966 ------------------------------------------------------
00967 JNI headers that correspond to the 'src/java/lang' equivalent
00968 Java source files in the java.lang.* package that contain the
00969 native declarations in these classes.  (Not comprehensive
00970 either for each class or for the package.)
00971 
00972 jni/src/harmony/generic/0.0/src/java_lang_Object.c
00973 jni/src/harmony/generic/0.0/src/java_lang_Class.c
00974 jni/src/harmony/generic/0.0/src/java_lang_String.c
00975 jni/src/harmony/generic/0.0/src/java_lang_Thread.c
00976 --------------------------------------------------
00977 JNI source code that corresponds to the 'src/java/lang' equivalent
00978 Java source files in the java.lang.* package that contain the
00979 implementation of the local native methods in these classes.
00980 (Not comprehensive either for each class or for the package.)
00981 
00982 jvm/include/jlObject.h
00983 jvm/include/jlClass.h
00984 jvm/include/jlString.h
00985 jvm/include/jlThread.h
00986 ----------------------
00987 Headers for connecting the JVM to the JNI interface of the above
00988 classes.  There are effectively two parallel sets of runtime
00989 definitions in these files, one for generic JNI, one for the
00990 core JVM code, namely this body of source.  They are
00991 _independent_ so there is NO coupling between an arbitrary
00992 JDK's JNI implementation and this JVM.  Please see comments
00993 in these header files for more information.
00994 
00995 jvm/src/jlObject.c
00996 jvm/src/jlClass.c
00997 jvm/src/jlString.c
00998 jvm/src/jlThread.c
00999 ------------------
01000 The native side of the JNI implementation of the above selected
01001 core java.lang.* classes.  The functions in these files are
01002 referenced by the JNI target functions.  For example, the
01003 function java_lang_Class_isPrimative() in 'java_lang_Class.c'
01004 will at some point call the 'jlClass_isPrimative()' function
01005 in 'jlClass.c'.
01006 
01007 jni/src/harmony/generic/0.0/src/sampleJNImain.c
01008 -----------------------------------------------
01009 Sample function that calls JNI functions through the JNI interface.
01010 This is currently built as a straight binary.  It needs to be built
01011 as a .so/.dll shared object file.
01012 
01013 
01014 TEST JAVA FILES
01015 ===============
01016 This area is designed to be filled in with _many_ packages and
01017 classes for testing the JVM.  Here is a rudimentary but useful
01018 first pass:
01019 
01020 
01021 test/src/HelloWorld.java
01022 ------------------------
01023 Just what it says.  Running this main() method, however,
01024 requires a _fully_ functional JVM with most of the basic JNI in
01025 place.
01026 
01027 test/src/harmony/jvm/test/PkgHelloWorld.java
01028 --------------------------------------------
01029 The same "hello world" program, but in a package.
01030 
01031 test/src/harmony/jvm/test/MainArgs.java
01032 ---------------------------------------
01033 A main() method that takes arguments from the command line.
01034 
01035 
01036 
01037 KNOWN ISSUES AND SUGGESTED TO-DO ITEMS
01038 ======================================
01039 
01040 (1) The default garbage collection algorithm is 'no collection'.
01041 This means that an algorithm must be written in order to sustain
01042 JVM execution with continuous heap availability.  Other heap
01043 implementations might also be written beyond the two that are
01044 supplied.
01045 
01046 (2) Anonymous strings (without a class static field reference in
01047 (rclass) or an object instance reference in (robject)) may not get
01048 completely unreferenced and deallocated, but could continue to
01049 accumulate instance references.  Depending on the GC algorithm, this
01050 may or may not occur and/or may or may not present a problem over a
01051 very long JVM session where reference structures grow without bound
01052 or counts wrap around to zero.
01053 
01054 (3) There has been no attempt made to process any special values
01055 of floating point data, whether NAN, +/- infinity, overflow, or
01056 anything else.  The definitions per the class file spec are present,
01057 of course, but nothing has been done with them.  Special care should
01058 be taken in fprintf() statements-- like the sysDbgMsg() calls in
01059 'cfmsgs.c' -- to make sure that formats like "%lf" are followed and
01060 precise.  The same goes for 64-bit integers, (long long) in real
01061 machine and (jlong) in JVM, that formats like "%lld" are followed
01062 and precise.
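
A small sketch of both concerns follows, assuming a C99 compiler
and IEEE 754 doubles.  Note that a (long long) argument needs the
"%lld" conversion, since "%ld" is only correct for (long).  The
'demo_' functions are hypothetical and do not exist in 'cfmsgs.c'
or anywhere else in this code.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper:  name the special value held in 'd' */
const char *demo_fp_class(double d)
{
    if (d != d)
    {
        return "NaN";           /* NaN never equals itself */
    }
    if (d > DBL_MAX)
    {
        return "+Infinity";
    }
    if (d < -DBL_MAX)
    {
        return "-Infinity";
    }
    return "finite";
}

/* Hypothetical helper:  format a 64-bit integer and a double */
int demo_format(char *buf, size_t len, long long jl, double jd)
{
    assert(NULL != buf);

    /* "%lld" matches (long long), "%f" matches (double) */
    return snprintf(buf, len, "%lld %f", jl, jd);
}
```

With this in place, demo_format() applied to 4294967296LL and 1.5
produces the string "4294967296 1.500000".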
01063 
01064 (4) All reading of JAR files uses the 'jar' utility provided in
01065 the $JAVA_HOME directory tree in '$JAVA_HOME/bin/jar'.  This will,
01066 of course, need to be enhanced to simply read the JAR files directly.
01067 To start with, the BOOTCLASSPATH could have them added to it as a
01068 part of 'config.sh', then invoked from the JVM.
01069 
01070 (5) The JNI interface in 'jni/src/harmony' contains only a cursory
01071 JNI implementation.  This will need to be filled out to the complete
01072 suite of native methods needed by java.lang.*.  The native side is
01073 found in 'jvm/include' for the 'jni/src/harmony' source
01074 and header files, while the implementation is in the
01075 'jvm/src/jlClassNameXYZ.c' files (all beginning with 'jl' for
01076 "java/lang" followed by the actual class name).  For example,
01077 'jlThread.c' contains the native portions of java.lang.Thread.
01078 Certain key system calls like open(2) and exit(2) and socket(2)
01079 are located in packages like 'java.io' and 'java.system' and
01080 'java.rmi' (?), respectively.  The local native method interface
01081 will need to be extended to cover these system calls.
01082 See also item (28) below.
01083 
01084 (6) The JVM execution engine in particular will need thorough testing.
01085 It has been unit tested, of course, but good, solid integration
01086 testing is an excellent exercise for new contributors to learn the
01087 code, so this item has been specifically reserved for contributors
01088 for this purpose.
01089 
01090 (7) There is apparently a typographical error in the spec section
01091 4.5.7 describing 21-bit character encoding.  Therefore, this format
01092 has not been implemented pending proper resolution.  It probably is
01093 correct to assume that the spurious "-1" in the second byte of the
01094 first triple should say "(bits 20-16)" instead of "(bits 20-16)-1".
01095 However, this is presented as an exercise for contributors.
01096 
01097 (8) There are a plethora of comments marked "@todo xxx" all over the
01098 code.  In the Doxygen "Related Pages" section, they are gathered into
01099 one comprehensive list.  These items are presented as exercises for
01100 contributors.
01101 
01102 (9) The Solaris 32-bit implementation was chosen for development as a
01103 very generic development and runtime environment.  Testing partway
01104 through indicated that a transition to a 64-bit runtime environment
01105 could be compiled with little effort.  However, there will have to
01106 be some adjustments to get everything to work because that one test
01107 produced a SIGSEGV during JVM initialization, and its cause was not
01108 thoroughly explored.  Every effort has been made to anticipate 64-bit
01109 compilations, but that adjustment will need to be fully tested.
01110 
01111 (10) Every effort has been made to use _only_ standard Unix and
01112 Posix runtime library calls so the code could be ported to as many
01113 platforms as possible.  The first targets that come to mind are,
01114 of course, Linux, HP-UX, and AIX, with CygWin in mind also, with
01115 a few changes for system(3) scripts.  (*** The code intentionally
01116 does _NOT_ have the Windows version of some CONFIG_xxxx_SCRIPT
01117 definitions so that the contributor will need to get it written,
01118 possibly put into a .BAT file, then invoked through the existing
01119 infrastructure in the code. ***)
01120 
01121 (11a) There are a number of places here and there in the code where a
01122 return value is not properly checked for having an error value, such
01123 as 'jvm_class_index_null' or 'jvm_attribute_index_bad'.  This
01124 is typically _not_ a throwable error condition, like from heap
01125 allocation or other error, but is an actual runtime condition
01126 that is returned.  After three full passes through the code,
01127 the error code structure has morphed significantly from a fully
01128 structured approach just to get the code started to a partial OO
01129 approach, throwing errors through setjmp(3)/longjmp(3).  Someone
01130 needs to go through the code in its _entirety_ and find out where
01131 invalid results are intentionally returned instead of throwing an
01132 error and _make sure_ that the calling functions check those results
01133 properly.
01134 
01135 (11b) The same goes for input parameters in various places
01136 throughout the code for the following typedefs from 'jvmcfg.h':
01137     jvm_thread_index
01138     jvm_constant_pool_index
01139     jvm_interface_index
01140     jvm_class_index
01141     jvm_method_index
01142     jvm_field_index
01143     jvm_field_lookup_index
01144     jvm_attribute_index
01145     jvm_unicode_string_index
01146     jvm_object_hash
01147 
01148 These will need to be reviewed for the same reasons as the error
01149 return codes.  Either they will index an array to its NULL element,
01150 specifically (with corresponding NULL element named also),
01151 
01152     jvm_thread_index (jvm_thread_index_null)
01153     jvm_constant_pool_index (jvm_constant_pool_index_null)
01154     jvm_class_index (jvm_class_index_null)
01155     jvm_object_hash (jvm_object_hash_null)
01156     jvm_native_method_ordinal (jvm_native_method_ordinal_null)
01157 
01158 which, in the case of a 'jvm_constant_pool_index' will produce
01159 a NULL pointer SIGSEGV, or it will index way off the end at the
01160 max index for the data type, namely (with corresponding BAD
01161 element name also),
01162 
01163     jvm_interface_index (jvm_interface_index_bad)
01164     jvm_method_index (jvm_method_index_bad)
01165     jvm_field_index (jvm_field_index_bad)
01166     jvm_field_lookup_index (jvm_field_lookup_index_bad)
01167     jvm_attribute_index (jvm_attribute_index_bad)
01168     jvm_unicode_string_index (jvm_unicode_string_index_bad)
01169 
01170 A typical check might be,
01171 
01172     jvm_attribute_index
01173        attribute_find_in_field_by_cp_entry(jvm_class_index clsidx,...)
01174     {
01175         if (jvm_class_index_null == clsidx)
01176         {
01177             /* Somebody goofed * /
01178             exit_init_failure(EXIT_JVM_INTERNAL,
01179                               JVMCLASS_JAVA_LANG_INTERNALERROR);
01180     /*NOTREACHED* /
01181         }
01182     }
01183 
01184 In all cases, a review of the code with formal input parameters
01185 to test what goes in is simple-- there are quite a few places
01186 where this _is_ done, it just is not complete.  A similar
01187 review of functions that pass these data types back _out_ will
01188 provide the inverse test.  Both are _mandatory_ for integrity
01189 of the runtime environment.
01190 
01191 (12) The threading algorithm uses a simple for() loop.  It could
01192 be easily generalized to using POSIX threads since there is already
01193 one POSIX thread started in the current code.  This thread would go
01194 away when and if such threading were implemented.  This thread is
01195 implemented in 'timeslice.c'.  It is a time slice real-time clock
01196 timer that sends SIGALRM every so often on a tick boundary.  This
01197 signal is connected to a handler that sets a flag in each live
01198 JVM thread and tells the JVM inner loop to quit after the current
01199 operation code so the next thread can be activated.  By unwinding
01200 the thread activation for() loop, also known as the JVM outer loop,
01201 and putting in place a POSIX thread for each JVM thread, the timer
01202 would go away and the code would become a true multi-threaded
01203 application.  
01204 
01205 The purpose of _this_ implementation as a for() loop is for simplicity
01206 of implementation and for education of contributors who are new to
01207 multi-threaded environments in general and the Java virtual machine
01208 multi-threaded environment in particular.  (Note that such unwinding
01209 would take place exclusively in 'thread.c'.  No other logic
01210 _anywhere_ should be affected in more than a minor way beyond the
01211 JVM outer loop in 'jvm.c'.)
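
The shape of the outer for() loop and the time slice flag can be
sketched as below.  This is a simplification made under several
assumptions:  a single global flag stands in for the flag in each
JVM thread, raise(3) stands in for the real-time timer, and all
'demo_' names are hypothetical rather than taken from 'jvm.c',
'opcode.c', or 'timeslice.c'.

```c
#include <assert.h>
#include <signal.h>

/* One global flag here.  The text describes one per JVM thread */
static volatile sig_atomic_t demo_slice_expired = 0;

/* Signal handler:  only sets the flag, nothing else */
static void demo_on_tick(int signo)
{
    (void) signo;
    demo_slice_expired = 1;
}

/* Inner loop:  run opcodes until done or the slice expires */
static int demo_inner_loop(int max_ops, int tick_at)
{
    int ops = 0;

    /* Re-arm every slice in case signal() resets the handler */
    signal(SIGALRM, demo_on_tick);
    demo_slice_expired = 0;

    while ((ops < max_ops) && !demo_slice_expired)
    {
        ops++;                      /* "execute" one opcode */
        if (ops == tick_at)
        {
            raise(SIGALRM);         /* stand-in for the timer */
        }
    }
    return ops;
}

/* Outer loop:  give each thread one slice, return total opcodes */
int demo_outer_loop(int nthreads, int max_ops, int tick_at)
{
    int thridx;
    int total = 0;

    assert(nthreads >= 0);

    for (thridx = 0; thridx < nthreads; thridx++)
    {
        total += demo_inner_loop(max_ops, tick_at);
    }
    return total;
}
```

The flag is tested only between opcodes, so an opcode is never
interrupted partway through.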
01212 
01213 (13) Commensurate with the above discussion on unwinding the JVM
01214 outer loop into a multi-threaded application is the discussion of
01215 write barriers.  THERE ARE ABSOLUTELY NO WRITE BARRIERS IN THIS
01216 DESIGN!!!  This is both for simplicity (per above reasons) and so
01217 that this _critical_ subject is given due consideration by the
01218 whole team of JVM architecture contributors.  This implementation
01219 makes no claims toward being an authority on such a vital and
01220 potentially sensitive subject.
01221 
01222 (14) The 'field_info' area of the class file definition supports
01223 the 'ConstantValue' attribute, but does not support 'Signature'.
01224 This needs to be added for conformance to JDK 5 requirements.
01225 It also should support 'Synthetic' and 'Deprecated'.  The definitions
01226 are present in 'classfile.h', they just need to get implemented.
01227 
01228 (15) The 'method_info' area of the class file definition supports
01229 the 'Code' and 'Exceptions' attributes, but does not support
01230 'Signature'.  This needs to be added for conformance to JDK 5
01231 requirements.  The definitions are present in 'classfile.h', they
01232 just need to get implemented.
01233 
01234 (16) When nearing the end of the initial development, I ran across
01235 what is probably a memory configuration limit on my Solaris platform,
01236 which I did not bother to track down, but rather worked around.  It
01237 seems that when calling malloc(3C) or malloc(3MALLOC), after 2,280
01238 malloc() allocations and 612 free() invocations, there is something
01239 under the covers that does a SIGSEGV, and it can happen in either
01240 routine.  I therefore extended the heap mechanism of 'heap_simple.c'
01241 to allocate a large number of slots of 'n' bytes for small allocations
01242 up to this size.  Everything else still uses malloc().  In this way,
01243 I was able to finish development on the JVM without arguing with heap
01244 allocation.  This implementation is found in 'heap_bimodal.c'.  (In
01245 other words, I will let the team fix it!)  Furthermore, I am not sure
01246 that the real project wants a static 'n + 1' MB data area just hanging
01247 around the runtime just because I did not take time to tune the system
01248 configuration!
01249 
01250 (17) In preparation for writing the actual JVM execution engine,
01251 the 'bytegames.c' utilities bytegames_getrl8() and bytegames_putrl8()
01252 and bytegames_swap8() and bytegames_mix8() were added.  The 'util.h'
01253 macros GETRL8() and PUTRL8() were likewise added.  Other ways of
01254 processing 64-bit values were obsoleted and the places where they were
01255 used have been replaced by these functions and macros.  This
01256 substitution needs to be more rigorously tested.  All places where
01257 this substitution was made have been marked as commented-out code
01258 plus new functionality with a @todo item commenting it.
01259 
01260 (18) There needs to be an examination of heap usage to make _sure_
01261 that all instances of HEAP_GET_xxxx() eventually have a matching
01262 HEAP_FREE_xxxx() so there are no memory leaks.  The existing code
01263 has been carefully written to facilitate this requirement, but
01264 it has not been rigorously regression tested.  And since it is _well_
01265 known that memory leaks are a system integrator's worst nightmare,
01266 if the code is checked early on into the project, there will be
01267 nothing to lose sleep over later on.
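
One simple way to start such an examination is to count the two
kinds of calls and compare the totals at shutdown.  The sketch
below wraps malloc(3) and free(3) directly.  The 'demo_' names are
hypothetical, and hooking the real HEAP_GET_xxxx() and
HEAP_FREE_xxxx() macros this way is an assumption, not something
the existing code is known to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counters, not part of 'heap.h' */
static long demo_get_count  = 0;
static long demo_free_count = 0;

void *demo_counted_get(size_t len)
{
    void *p = malloc(len);

    if (NULL != p)
    {
        demo_get_count++;
    }
    return p;
}

void demo_counted_free(void *p)
{
    if (NULL != p)
    {
        /* More frees than gets would mean a double free */
        assert(demo_free_count < demo_get_count);
        demo_free_count++;
        free(p);
    }
}

/* Nonzero at shutdown means something was never freed */
long demo_outstanding(void)
{
    return demo_get_count - demo_free_count;
}
```

A count alone says that a leak exists but not where it is.  It is
still a cheap first check to run at JVM shutdown.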
01268 
01269 (19) A review of the non-local return logic in 'opcode.c' from threads
01270 that throw a java.lang.Error, java.lang.Exception, and
01271 java.lang.Throwable is in order.  Errors should kill the JVM, but
01272 Exceptions should allow it to proceed.  What about Throwables?
01273 The framework is in place for the project team to adjust to meet
01274 the JVM spec.  Also, make sure that simply creating a new class
01275 object, running its <init> method, and continuing with previous code
01276 or quitting is the right way to process the condition.  The existing
01277 logic is from heuristic observation of a JVM, but an examination of
01278 the spec should provide the definitive answer.  Notice that
01279 java.lang.Throwable and java.lang.StackTraceElement will need local,
01280 native implementations.  The exception framework needs to be filled in
01281 in opcode_run() for exceptions thrown by Java virtual methods.  All
01282 handling of errors like java.lang.NoSuchMethodError is already taken
01283 care of.
01284 
01285 (20) The concept of the ThreadGroup is not implemented here (see JVM
01286 spec section 2.16).  This is given as an exercise to the project team.
01287 The default implementation consists _ONLY_ of the method
01288 java.lang.ThreadGroup.uncaughtException().  The rest of this class
01289 must be implemented.  This partial implementation may also need to
01290 go away at that time.  The basic reasoning for this is that the
01291 functionality of java.lang.ThreadGroup does not seem to warrant
01292 any native methods.  Therefore, the whole of its functionality should
01293 likely be implemented in a Java class library and should typically
01294 behave properly without direct JVM core support.
01295 
01296 (21) The sequence of loading and initializing the three critical
01297 classes java.lang.Object, Class, and String in 'jvm_init()' is
01298 somewhat fragile and probably not really sustainable.  It might be
01299 that the best thing to do is to either make classlib-specific
01300 implementations of this initialization or to unload and reload these
01301 classes once the original loading is done and let the normal <clinit>
01302 be performed during this second attempt to load each of these classes.
01303 The problem is one of chickens and eggs:  Which comes first?  The
01304 logic is written and somewhat tested.  It needs to be firmed up.
01305 An exercise for the team.  Unloading and reloading support is already
01306 provided with class_static_delete() and class_reload().
01307 
01308 (22) The verification step (VM spec section 2.17.3) is given as an
01309 exercise for the project team.  This was done in the interest of time
01310 and the fact that there may be third-party packages that may do this
01311 nicely (BCEL?) without much further effort.
01312 
01313 (23) The source code is documented using Doxygen in its normative
01314 state.  Although this tool can be _significantly_ contorted to perform
01315 documentation of almost _any_ file type due to its capability to hook
01316 into a user-specified input filter, this implementation has done
01317 nothing outside of the usual bounds of the default product
01318 configuration.  With that said, there is _one_ small admission to make
01319 here:  Even though all of the code was written in ANSI 'C', the OO
01320 concept of throwing exceptions is intrinsically bound up with the
01321 runtime needs of a Java Virtual Machine.  Therefore, one _small_
01322 bending of the rules was permitted in the documentation-- use of the
01323 C++ documentation keyword '@throws' (also known as '@exception' and
01324 '@throw').  Functions that instantiate a java.lang.Throwable such as
01325 'Error' or 'Exception' (and/or their subclasses) use the following
01326 documentation syntax to present this behavioral feature to the
01327 Doxygen output.  The results will show up in the bolded section named
01328 'Exceptions:' in the same way that parameters show up in the
01329 'Parameters:' section and return values in the 'Returns:' section:
01330 
01331  *
01332  * @brief Function fn() doc header
01333  * ...
01334  * other doc info...
01335  * ...
01336  *
01337  * @param p1 what parm 1 does
01338  *
01339  * @returns what fn returns
01340  *
01341  * @throws JVMCLASS_some_kind_of_exception_string_name
01342  *         @link \#JVMCLASS_some_kind_of_exception_string_name
01343  *         brief description of how to make this happen @endlink.
01344  * /
01345 
01346  *include "exit.h"
01347  *include "jvmclass.h"
01348 
01349 rettype fn(parmtype p1)
01350 {
01351     ...
01352     if (problem_happened)
01353     {
01354       exit_throw_exception(JVMCLASS_some_kind_of_exception_string_name);
01355 /*NOTREACHED* /
01356     }
01357 
01358     ...
01359 
01360     return(normal_value);
01361 }
01362 
01363 Notice this function call is similar to exit_jvm(), but has the
01364 added feature of initiating an exception, after which exit_jvm()
01365 will be called with an appropriate exit code to shut down the JVM.
01366 
01367 
01368 (25) When using the Eclipse C/C++ plugin, be aware of a bug in
01369 that code:  Sometimes it loses its brains and wants to debug the
01370 test Java program instead of the C program.  Look at the Run menu's
01371 Debug dialog (Run|Debug) and you will notice that under "C/C++ Local
01372 Application", the 'jvm' configuration is missing.  To correct
01373 this, simply click on "C/C++ Local Application" and then click "New":
01374 
01375 Main:
01376     Project:                         jvm
01377     C/C++ Application:               bin/bootjvm
01378     Connect process input & output
01379       to a terminal:                 yes
01380 Arguments:
01381     C/C++ Program Arguments:         (anything appropriate)
01382     Use default working directory:   yes
01383 Environment:
01384     (anything you like)
01385 Debugger:
01386     Debugger:                        GDB Debugger
01387     Stop in main() on startup:       yes (typically, no is okay)
01388     Main:
01389         GDB debugger:                gdb
01390         GDB command file:            <empty>
01391     Shared Libraries:
01392         (anything you like)
01393 Source:
01394     jvm                              yes
01395     test                             yes
01396 Common:
01397     Type of launch configuration:    local
01398     Display in favorites menu:       (anything appropriate)
01399     Launch in background:            yes
01400 
01401 Now click 'Apply' and 'Close'.  This should save it properly.
01402 Now it may be used normally.
01403 
01404 
01405 (24) A solid introductory project for someone is to go in to
01406 'cfmsgs.c' and write an equivalent to cfmsgs_show_constant_pool()
01407 for the field table and method table, suggested names being
01408 cfmsgs_show_fields_table() and cfmsgs_show_methods_table().
01409 That function uses cfmsgs_typemsg() to display each constant_pool[]
01410 item.  The same approach could be taken to these two utilities.
01411 The first and most important use of these functions is in
01412 classfile_loadclassdata().
01413 
01414 In like manner, an attribute table display function needs to
01415 be written to do the same thing for any attributes table.
01416 Currently cfmsgs_atrmsg() shows the contents of a _single_
01417 attribute, but there is nothing that can explicitly dump
01418 an entire attribute table for fields, methods, or the (single)
01419 class attribute table.
01420 
01421 (25) While working on the above 'cfmsgs.c' display routines,
01422 that same person should probably go in to classfile_loadclassdata()
01423 and clean up the numerous individual cfmsgs_typemsg() calls for
01424 inclusion in those routines.  A few should probably still be
01425 left in, such as 'cfmsgs_typemsg("this", pcfs, pcfs->this_class)'
01426 which reports the 'this_class' member, but this process was followed
01427 when writing cfmsgs_show_constant_pool() and it helped the readability
01428 of the output _immensely_ to get rid of 'ad hoc' debug messages.
01429 
01430 (26) The debug message level (DML) values need to be completely
01431 overhauled for more effective use by developers.
01432 
01433 (27) Does there need to be a tighter relationship between the
01434 declaration of a native method via ACC_NATIVE when the class file
01435 is loaded by classfile_loadclassdata() and the class resolution
01436 logic of the linkage_resolve/unresolve_class() methods?  What
01437 about with thread_class_load() and related startup code that runs a
01438 method?
01439 
01440 (28) Although there have been _several_ significant passes at
01441 desk-debugging 'threadstate.c' and 'threadutil.c' and in particular,
01442 'jlThread.c', the JVM thread state machine needs a rigorous pass at
01443 integration testing.  The java.lang.Thread methods of 'jlThread.c'
01444 and 'threadstate.c' are of particular concern, along with the hooks
01445 in 'timeslice.c'.  This testing has been left as an exercise
01446 for the project team.  Someone who has completed their Sun
01447 Certified Java Developer certification and has good skills
01448 in 'C' should be able to firm up this area of the code nicely.  See
01449 also item (5) above.
01450 
01451 (29) In like manner to (28) above, unit testing of object monitor
01452 locking has been left as an exercise for the project team.  Again,
01453 someone who has completed their Sun Certified Java Developer
01454 certification and is good with 'C' should be able to firm up
01455 this area of the code nicely.
01456 
01457 (30) The whole body of code needs to be reviewed for proper ordering
01458 of PUSH() and POP() macros for (jlong) and (jdouble) variables.
01459 The same goes for local variable access of the same.  The way it
01460 _should_ be done at this time is PUSH(MS), PUSH(LS) with corresponding
01461 POP(LS), POP(MS).  For local variables, var[n] := MS, var[n+1] := LS.
01462 However, this needs proper scrutiny for completeness and correctness.
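
The word ordering described in (30) can be sketched as follows.  This
is only an illustration under stated assumptions:  'demo_stack' and
every 'demo_*' function are hypothetical stand-ins, _not_ the
project's PUSH() and POP() macros or its real stack frame layout.
Only '//' comments are used so that the sketch can sit inside this
comment-wrapped file.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical operand stack of 32-bit words (illustration only).
#define DEMO_STACK_WORDS 16

typedef struct
{
    uint32_t word[DEMO_STACK_WORDS];
    int      sp;                       // index of the next free slot
} demo_stack;

static void demo_push(demo_stack *s, uint32_t w)
{
    assert(s->sp < DEMO_STACK_WORDS);  // no overflow handling in this sketch
    s->word[s->sp++] = w;
}

static uint32_t demo_pop(demo_stack *s)
{
    assert(s->sp > 0);                 // no underflow handling in this sketch
    return s->word[--s->sp];
}

// Push a 64-bit value as PUSH(MS), PUSH(LS).
static void demo_push_long(demo_stack *s, int64_t v)
{
    uint64_t u = (uint64_t) v;
    demo_push(s, (uint32_t) (u >> 32));           // MS word first
    demo_push(s, (uint32_t) (u & 0xffffffffu));   // LS word second
}

// Pop in the mirror order:  POP(LS), POP(MS).
static int64_t demo_pop_long(demo_stack *s)
{
    uint32_t ls = demo_pop(s);
    uint32_t ms = demo_pop(s);
    return (int64_t) (((uint64_t) ms << 32) | ls);
}

// Local variables use the same order:  var[n] := MS, var[n+1] := LS.
static void demo_store_long(uint32_t *var, int n, int64_t v)
{
    uint64_t u = (uint64_t) v;
    var[n]     = (uint32_t) (u >> 32);
    var[n + 1] = (uint32_t) (u & 0xffffffffu);
}

static int64_t demo_load_long(const uint32_t *var, int n)
{
    return (int64_t) (((uint64_t) var[n] << 32) | var[n + 1]);
}
```

A review pass could look for any push/pop or load/store pair that
does not mirror its partner in this way.  The same pattern would apply
to (jdouble) once its bits are viewed as a 64-bit integer.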
01463 
01464 (31) A review needs to be made of the Eclipse project files.  In
01465 particular, do the C/C++ project settings need to be made independent
01466 of the workspace settings for compile, link, and library archiver
01467 options?  Do the Java project settings need to be made similarly
01468 independent?  Should there be a workspace provided in the distribution
01469 that has all these things set up?  Currently, the projects are a bit
01470 of a mixture of workspace and project settings.  The 'build.sh'
01471 scripts use a hard-coded set of GCC options that are derived from the
01472 Eclipse setup.  The 'config/*.gcc' and 'config/*.gccld' files reflect
01473 this for the C source.  The Java compilations in the 'build.sh'
01474 scripts use the default Java compiler options.  Does there need to be
01475 a harmonization (sic) between the Eclipse and 'build.sh' settings?
01476 Should they be manually maintained in Eclipse and 'config.sh'?  These
01477 are questions that need some review.  Furthermore, this author is very
01478 much a proponent of systematic use of 'gmake' as a premier project
01479 build tool across the industry.  It would be a really good idea for
01480 use of 'gmake' to be reviewed.  It is used internally by Eclipse,
01481 but users should not be forced to use Eclipse just to get 'gmake'.
01482 
01483 (32) The inner loop of virtual instructions in opcode.c checks various
01484 items at run time such as ACC_STATIC and ACC_FINAL and other items
01485 that a class verifier should test.  This is done so that the code
01486 corresponds _directly_ with the definition of each instruction as
01487 described in JVM spec section 6.  Many of these tests (namely,
01488 the examples above) should be moved to a byte code verifier because
01489 they will not change from the time that the class is loaded until it
01490 is unloaded. Obviously, checking them each time a virtual instruction
01491 is run is _not_ efficient at run time!  However, this implementation
01492 is meant also to teach the principles and requirements of virtual
01493 instructions, so the implementation of the JVM spec requirement was
01494 done in the inner loop in opcode_run() as a sort of reference
01495 implementation of each virtual instruction.
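
One way the move suggested in (32) might look is sketched below.  It
assumes a hypothetical, already-decoded instruction list and a flat
array of field access flags; 'demo_insn' and
'demo_verify_static_use()' are invented for illustration and do not
exist in this code.  Only the numeric values of ACC_STATIC and of the
GETSTATIC and GETFIELD opcodes are taken from the JVM spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ACC_STATIC    0x0008   // field access flag, per the JVM spec
#define DEMO_OP_GETSTATIC  0xb2     // opcode values, per the JVM spec
#define DEMO_OP_GETFIELD   0xb4

// Hypothetical pre-decoded instruction (illustration only).
typedef struct
{
    uint8_t  opcode;
    uint16_t field_index;   // index into the hypothetical field_flags[]
} demo_insn;

// One-time pass, run when the class is loaded:  return 1 when every
// field access agrees with that field's ACC_STATIC flag, 0 otherwise.
static int demo_verify_static_use(const demo_insn *code,
                                  size_t           ninsn,
                                  const uint16_t  *field_flags,
                                  size_t           nfields)
{
    assert(code != NULL || ninsn == 0);

    for (size_t i = 0; i < ninsn; i++)
    {
        if (code[i].opcode != DEMO_OP_GETSTATIC &&
            code[i].opcode != DEMO_OP_GETFIELD)
        {
            continue;       // not a field access, nothing to check
        }

        if (code[i].field_index >= nfields)
        {
            return 0;       // operand points outside the field table
        }

        int is_static =
            (field_flags[code[i].field_index] & DEMO_ACC_STATIC) != 0;
        int wants_static = (code[i].opcode == DEMO_OP_GETSTATIC);

        if (is_static != wants_static)
        {
            return 0;       // static-ness mismatch, reject the class
        }
    }

    return 1;
}
```

Once such a pass has accepted a class, the inner loop would no longer
need to repeat the same test each time the instruction runs.  The
run-time checks could still be kept behind a compile-time option to
preserve the teaching value described above.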
01496 
01497 (33) One very good exercise for project team members to learn this
01498 code will be to implement the exception logic as found in the
01499 exception attribute of the program counter, namely jvm_pc.excpatridx.
01500 This field is filled in correctly at method startup time, and there
01501 should be enough hooks in the structure of the code to implement it
01502 without much effort.  What will be worth it is the learning process.
01503 This logic will bear some strong similarities with LATE_CLASS_LOAD()
01504 and class_load_resolve_clinit() when not using the system thread.
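
As a starting point for (33), here is a minimal sketch of a handler
search.  It assumes the logic ends up scanning the exception_table[]
of a method's Code attribute as laid out in the JVM spec (covered
range is start_pc inclusive to end_pc exclusive, and a catch_type of
zero catches anything).  All 'demo_*' names are hypothetical, and the
class-matching test is reduced to a stand-in callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Mirrors one exception_table[] entry of a Code attribute.
typedef struct
{
    uint16_t start_pc;     // first covered code offset (inclusive)
    uint16_t end_pc;       // end of covered range (exclusive)
    uint16_t handler_pc;   // where to resume when this entry matches
    uint16_t catch_type;   // constant_pool index, or 0 to catch anything
} demo_exception_entry;

// Hypothetical callback:  nonzero when the thrown object is acceptable
// to the class named by constant_pool[catch_type].
typedef int (*demo_catch_test)(uint16_t catch_type, const void *thrown);

// Stand-in matcher for the demo:  'thrown' points at a uint16_t class
// index and only an exact match counts (no subclass test).
static int demo_exact_match(uint16_t catch_type, const void *thrown)
{
    return catch_type == *(const uint16_t *) thrown;
}

// Return the handler_pc of the first entry that covers 'pc' and
// accepts the thrown object, or -1 when the method has no handler.
static int32_t demo_find_handler(const demo_exception_entry *table,
                                 size_t                      count,
                                 uint16_t                    pc,
                                 demo_catch_test             matches,
                                 const void                 *thrown)
{
    assert(table != NULL || count == 0);

    for (size_t i = 0; i < count; i++)
    {
        if (pc < table[i].start_pc || pc >= table[i].end_pc)
        {
            continue;      // this entry does not cover the failing pc
        }

        if (table[i].catch_type == 0 ||
            matches(table[i].catch_type, thrown))
        {
            return (int32_t) table[i].handler_pc;
        }
    }

    return -1;             // caller would unwind to the invoking frame
}
```

Entries are examined in table order, so the first match wins.  A real
implementation would replace demo_exact_match() with a proper
subclass test and would repeat the search in each calling frame until
a handler is found.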
01505 
01506 (34) It is similarly left as an exercise for the project team to
01507 locate a method in a superclass or an interface if it is not found
01508 in the current class.  Currently, a request is made for a method
01509 in the current class and a (jvm_method_index) is returned.  This is
01510 fine for JVM initialization when it is known which methods are
01511 found where, but what _should_ happen in normal JVM runtime is that
01512 the INVOKEVIRTUAL, INVOKEINTERFACE, INVOKESTATIC, etc., should check
01513 if a method is in the current class.  If not, check superclasses
01514 until none remain.  If found, return a heap pointer that
01515 contains the (jvm_class_index) and (jvm_method_index) of the located
01516 method, otherwise return 'rnull', the NULL pointer.  Class loading
01517 does search for superclasses, so look for uses of 'pcfs->super_class'
01518 or 'pcfs_recurse->super_class' for examples on how to do this.
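
The superclass walk described in (34) can be sketched as follows.
This is an illustration under stated assumptions, not the project's
code:  'demo_class', 'demo_method_ref' and 'demo_find_method()' are
hypothetical, methods are matched by name only (no descriptor), and
interfaces are not searched.  Instead of a heap pointer or 'rnull',
the sketch returns a flag and fills in a caller-supplied structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NO_CLASS (-1)   // marks the top of the hierarchy

// Hypothetical class table entry (illustration only).
typedef struct
{
    const char  *name;
    int          super_class;    // index into the table, or DEMO_NO_CLASS
    const char **methods;        // names of methods declared here
    int          method_count;
} demo_class;

typedef struct
{
    int class_index;             // stand-in for (jvm_class_index)
    int method_index;            // stand-in for (jvm_method_index)
} demo_method_ref;

// Search class 'start', then each superclass in turn.  Returns 1 and
// fills in *out when the method is found, otherwise returns 0.
static int demo_find_method(const demo_class *table,
                            int               start,
                            const char       *name,
                            demo_method_ref  *out)
{
    assert(table != NULL && name != NULL && out != NULL);

    for (int c = start; c != DEMO_NO_CLASS; c = table[c].super_class)
    {
        for (int m = 0; m < table[c].method_count; m++)
        {
            if (strcmp(table[c].methods[m], name) == 0)
            {
                out->class_index  = c;
                out->method_index = m;
                return 1;
            }
        }
    }

    return 0;                    // ran out of superclasses
}
```

The class nearest to 'start' wins, which gives the overriding
behavior expected of a virtual call.  Adding the interface search
would mean visiting each class's interface list as well.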
01519 
01520 (35) In the documentation, the @def and @struct tags were used
01521 to introduce specific definition types.  This practice somehow
01522 was not carried through to all definitions.  Similar cleanup should
01523 be done for the following tags:  @enum, @fn, @var, @typedef.  None
01524 of these were put in place, but they would clean up the organization of
01525 the output areas and add better indexing capabilities to the result
01526 set.  It also might ease the need for @link tags so that definitions
01527 may be automatically tagged better, making room for cleanup of such
01528 constructions in the narrative as:
01529 
01530     @link \#jvm_class_index jvm_class_index@endlink
01531 
01532 to be simplified into,
01533 
01534     jvm_class_index
01535 
01536 This would then ease the readability (in source code itself) of the
01537 documentation.  Notice that constructions such as,
01538 
01539     @link \#jvm_class_index a resulting class index@endlink
01540 
01541 will not need cleanup since the target link is not the same as the
01542 displayed text.
01543 
01544 (36) An excellent tutorial for learning how objects are created using
01545 non-default constructors is in jvm.c and class.c where string
01546 parameters are loaded in from the argv[] command line and from
01547 CONSTANT_String_info structures of the class file, respectively.
01548 A suggestion is made in jvm.c as to how to accomplish this.
01549 
01550 (37) After going back and forth as to whether or not a @throws
01551 condition that happens in a subroutine should also be reported
01552 back up the line in the function that called it, the result is
01553 that sometimes it is reported and sometimes it is not.  This needs
01554 to be regularized.  It is recommended to _not_ report it in the
01555 calling function's documentation also since this is really not
01556 an OO environment.  Document it where it happens and let that stand.
01557 
01558 @endverbatim
01559  *
01560  */
01561  */ /* 
01562  * (Use  #! and #/ with dox_filter.sh to fool Doxygen into
01563  * parsing this non-source text file for the documentation set.
01564  * Use the above open comment to force termination of parsing
01565  * since it is not a Doxygen-style 'C' comment.)
01566  *
01567  * EOF
01568 

Generated on Fri Sep 30 18:49:10 2005 by  doxygen 1.4.4