API Docs for SAX and DOM
 


SAXParser.hpp

/*
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 1999-2001 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Xerces" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache\@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation, and was
 * originally based on software copyright (c) 1999, International
 * Business Machines, Inc., http://www.ibm.com .  For more information
 * on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */

/*
 * $Log: SAXParser.hpp,v $
 * Revision 1.16  2001/06/03 19:26:20  jberry
 * Add support for querying error count following parse; enables simple parse without requiring error handler.
 *
 * Revision 1.15  2001/05/11 13:26:22  tng
 * Copyright update.
 *
 * Revision 1.14  2001/05/03 19:09:25  knoaman
 * Support Warning/Error/FatalError messaging.
 * Validity constraints errors are treated as errors, with the ability by user to set
 * validity constraints as fatal errors.
 *
 * Revision 1.13  2001/03/30 16:46:57  tng
 * Schema: Use setDoSchema instead of setSchemaValidation which makes more sense.
 *
 * Revision 1.12  2001/03/21 21:56:09  tng
 * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator into DTDValidator, DTDScanner, and DTDGrammar.
 *
 * Revision 1.11  2001/02/15 15:56:29  tng
 * Schema: Add setSchemaValidation and getSchemaValidation for DOMParser and SAXParser.
 * Add feature "http://apache.org/xml/features/validation/schema" for SAX2XMLReader.
 * New data field  fSchemaValidation in XMLScanner as the flag.
 *
 * Revision 1.10  2001/01/12 21:23:41  tng
 * Documentation Enhancement: explain values of Val_Scheme
 *
 * Revision 1.9  2000/08/02 18:05:15  jpolast
 * changes required for sax2
 * (changed private members to protected)
 *
 * Revision 1.8  2000/04/12 22:58:30  roddey
 * Added support for 'auto validate' mode.
 *
 * Revision 1.7  2000/03/03 01:29:34  roddey
 * Added a scanReset()/parseReset() method to the scanner and
 * parsers, to allow for reset after early exit from a progressive parse.
 * Added calls to new Terminate() call to all of the samples. Improved
 * documentation in SAX and DOM parsers.
 *
 * Revision 1.6  2000/02/17 03:54:27  rahulj
 * Added some new getters to query the parser state and
 * clarified the documentation.
 *
 * Revision 1.5  2000/02/16 03:42:58  rahulj
 * Finished documenting the SAX Driver implementation.
 *
 * Revision 1.4  2000/02/15 04:47:37  rahulj
 * Documenting the SAXParser framework. Not done yet.
 *
 * Revision 1.3  2000/02/06 07:47:56  rahulj
 * Year 2K copyright swat.
 *
 * Revision 1.2  1999/12/15 19:57:48  roddey
 * Got rid of redundant 'const' on boolean return value. Some compilers choke
 * on this and its useless.
 *
 * Revision 1.1.1.1  1999/11/09 01:07:51  twl
 * Initial checkin
 *
 * Revision 1.6  1999/11/08 20:44:54  rahul
 * Swat for adding in Product name and CVS comment log variable.
 *
 */

#if !defined(SAXPARSER_HPP)
#define SAXPARSER_HPP

#include <sax/Parser.hpp>
#include <internal/VecAttrListImpl.hpp>
#include <framework/XMLDocumentHandler.hpp>
#include <framework/XMLElementDecl.hpp>
#include <framework/XMLEntityHandler.hpp>
#include <framework/XMLErrorReporter.hpp>
#include <validators/DTD/DocTypeHandler.hpp>

class DocumentHandler;
class EntityResolver;
class XMLPScanToken;
class XMLScanner;
class XMLValidator;


class SAXParser :

    public Parser
    , public XMLDocumentHandler
    , public XMLErrorReporter
    , public XMLEntityHandler
    , public DocTypeHandler
{
public :
    // -----------------------------------------------------------------------
    //  Class types
    // -----------------------------------------------------------------------
    enum ValSchemes
    {
        Val_Never
        , Val_Always
        , Val_Auto
    };


    // -----------------------------------------------------------------------
    //  Constructors and Destructor
    // -----------------------------------------------------------------------

    SAXParser(XMLValidator* const valToAdopt = 0);

    ~SAXParser();


    DocumentHandler* getDocumentHandler();

    const DocumentHandler* getDocumentHandler() const;

    EntityResolver* getEntityResolver();

    const EntityResolver* getEntityResolver() const;

    ErrorHandler* getErrorHandler();

    const ErrorHandler* getErrorHandler() const;

    const XMLScanner& getScanner() const;

    const XMLValidator& getValidator() const;

    ValSchemes getValidationScheme() const;

    bool getDoSchema() const;

    int getErrorCount() const;

    bool getDoNamespaces() const;

    bool getExitOnFirstFatalError() const;

    bool getValidationConstraintFatal() const;


    // -----------------------------------------------------------------------
    //  Setter methods
    // -----------------------------------------------------------------------

    void setDoNamespaces(const bool newState);

    void setValidationScheme(const ValSchemes newScheme);

    void setDoSchema(const bool newState);

    void setExitOnFirstFatalError(const bool newState);

    void setValidationConstraintFatal(const bool newState);


    // -----------------------------------------------------------------------
    //  Advanced document handler list maintenance methods
    // -----------------------------------------------------------------------

    void installAdvDocHandler(XMLDocumentHandler* const toInstall);

    bool removeAdvDocHandler(XMLDocumentHandler* const toRemove);


    // -----------------------------------------------------------------------
    //  Implementation of the SAXParser interface
    // -----------------------------------------------------------------------

    virtual void parse(const InputSource& source, const bool reuseGrammar = false);

    virtual void parse(const XMLCh* const systemId, const bool reuseGrammar = false);

    virtual void parse(const char* const systemId, const bool reuseGrammar = false);

    virtual void setDocumentHandler(DocumentHandler* const handler);

    virtual void setDTDHandler(DTDHandler* const handler);

    virtual void setErrorHandler(ErrorHandler* const handler);

    virtual void setEntityResolver(EntityResolver* const resolver);


    // -----------------------------------------------------------------------
    //  Progressive scan methods
    // -----------------------------------------------------------------------

    bool parseFirst
    (
        const   XMLCh* const    systemId
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseFirst
    (
        const   char* const     systemId
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseFirst
    (
        const   InputSource&    source
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseNext(XMLPScanToken& token);

    void parseReset(XMLPScanToken& token);


    // -----------------------------------------------------------------------
    //  Implementation of the DocTypeHandler Interface
    // -----------------------------------------------------------------------

    virtual void attDef
    (
        const   DTDElementDecl& elemDecl
        , const DTDAttDef&      attDef
        , const bool            ignoring
    );

    virtual void doctypeComment
    (
        const   XMLCh* const    comment
    );

    virtual void doctypeDecl
    (
        const   DTDElementDecl& elemDecl
        , const XMLCh* const    publicId
        , const XMLCh* const    systemId
        , const bool            hasIntSubset
    );

    virtual void doctypePI
    (
        const   XMLCh* const    target
        , const XMLCh* const    data
    );

    virtual void doctypeWhitespace
    (
        const   XMLCh* const    chars
        , const unsigned int    length
    );

    virtual void elementDecl
    (
        const   DTDElementDecl& decl
        , const bool            isIgnored
    );

    virtual void endAttList
    (
        const   DTDElementDecl& elemDecl
    );

    virtual void endIntSubset();

    virtual void endExtSubset();

    virtual void entityDecl
    (
        const   DTDEntityDecl&  entityDecl
        , const bool            isPEDecl
        , const bool            isIgnored
    );

    virtual void resetDocType();

    virtual void notationDecl
    (
        const   XMLNotationDecl&    notDecl
        , const bool                isIgnored
    );

    virtual void startAttList
    (
        const   DTDElementDecl& elemDecl
    );

    virtual void startIntSubset();

    virtual void startExtSubset();

    virtual void TextDecl
    (
        const   XMLCh* const    versionStr
        , const XMLCh* const    encodingStr
    );


    // -----------------------------------------------------------------------
    //  Implementation of the XMLDocumentHandler interface
    // -----------------------------------------------------------------------

    virtual void docCharacters
    (
        const   XMLCh* const    chars
        , const unsigned int    length
        , const bool            cdataSection
    );

    virtual void docComment
    (
        const   XMLCh* const    comment
    );

    virtual void docPI
    (
        const   XMLCh* const    target
        , const XMLCh* const    data
    );

    virtual void endDocument();

    virtual void endElement
    (
        const   XMLElementDecl& elemDecl
        , const unsigned int    urlId
        , const bool            isRoot
    );

    virtual void endEntityReference
    (
        const   XMLEntityDecl&  entDecl
    );

    virtual void ignorableWhitespace
    (
        const   XMLCh* const    chars
        , const unsigned int    length
        , const bool            cdataSection
    );

    virtual void resetDocument();

    virtual void startDocument();

    virtual void startElement
    (
        const   XMLElementDecl&         elemDecl
        , const unsigned int            urlId
        , const XMLCh* const            elemPrefix
        , const RefVectorOf<XMLAttr>&   attrList
        , const unsigned int            attrCount
        , const bool                    isEmpty
        , const bool                    isRoot
    );

    virtual void startEntityReference
    (
        const   XMLEntityDecl&  entDecl
    );

    virtual void XMLDecl
    (
        const   XMLCh* const    versionStr
        , const XMLCh* const    encodingStr
        , const XMLCh* const    standaloneStr
        , const XMLCh* const    actualEncodingStr
    );


    // -----------------------------------------------------------------------
    //  Implementation of the XMLErrorReporter interface
    // -----------------------------------------------------------------------

    virtual void error
    (
        const   unsigned int                errCode
        , const XMLCh* const                msgDomain
        , const XMLErrorReporter::ErrTypes  errType
        , const XMLCh* const                errorText
        , const XMLCh* const                systemId
        , const XMLCh* const                publicId
        , const unsigned int                lineNum
        , const unsigned int                colNum
    );

    virtual void resetErrors();


    // -----------------------------------------------------------------------
    //  Implementation of the XMLEntityHandler interface
    // -----------------------------------------------------------------------

    virtual void endInputSource(const InputSource& inputSource);

    virtual bool expandSystemId
    (
        const   XMLCh* const    systemId
        ,       XMLBuffer&      toFill
    );

    virtual void resetEntities();

    virtual InputSource* resolveEntity
    (
        const   XMLCh* const    publicId
        , const XMLCh* const    systemId
    );

    virtual void startInputSource(const InputSource& inputSource);


    bool getDoValidation() const;

    void setDoValidation(const bool newState);


protected :
    // -----------------------------------------------------------------------
    //  Unimplemented constructors and operators
    // -----------------------------------------------------------------------
    SAXParser(const SAXParser&);
    void operator=(const SAXParser&);


    // -----------------------------------------------------------------------
    //  Protected data members
    //
    //  fAttrList
    //      A temporary implementation of the basic SAX attribute list
    //      interface. We use this one over and over on each startElement
    //      event to allow SAX-like access to the element attributes.
    //
    //  fDocHandler
    //      The installed SAX doc handler, if any. Null if none.
    //
    //  fDTDHandler
    //      The installed SAX DTD handler, if any. Null if none.
    //
    //  fElemDepth
    //      This is used to track the element nesting depth, so that we can
    //      know when we are inside content. This is so we can ignore char
    //      data outside of content.
    //
    //  fEntityResolver
    //      The installed SAX entity resolver, if any. Null if none.
    //
    //  fErrorHandler
    //      The installed SAX error handler, if any. Null if none.
    //
    //  fAdvDHCount
    //  fAdvDHList
    //  fAdvDHListSize
    //      This is an array of pointers to XMLDocumentHandlers, which is
    //      how we see installed advanced document handlers. There will
    //      usually not be very many at all, so a simple array is used
    //      instead of a collection, for performance. It will grow if needed,
    //      but that is unlikely.
    //
    //      The count is how many handlers are currently installed. The size
    //      is how big the array itself is (for expansion purposes). When
    //      count == size, it is time to expand.
    //
    //  fParseInProgress
    //      This flag is set once a parse starts. It is used to prevent
    //      multiple entrance or reentrance of the parser.
    //
    //  fScanner
    //      The scanner being used by this parser. It is created internally
    //      during construction.
    //
    // -----------------------------------------------------------------------
    VecAttrListImpl         fAttrList;
    DocumentHandler*        fDocHandler;
    DTDHandler*             fDTDHandler;
    unsigned int            fElemDepth;
    EntityResolver*         fEntityResolver;
    ErrorHandler*           fErrorHandler;
    unsigned int            fAdvDHCount;
    XMLDocumentHandler**    fAdvDHList;
    unsigned int            fAdvDHListSize;
    bool                    fParseInProgress;
    XMLScanner*             fScanner;
};


// ---------------------------------------------------------------------------
//  SAXParser: Getter methods
// ---------------------------------------------------------------------------
inline DocumentHandler* SAXParser::getDocumentHandler()
{
    return fDocHandler;
}

inline const DocumentHandler* SAXParser::getDocumentHandler() const
{
    return fDocHandler;
}

inline EntityResolver* SAXParser::getEntityResolver()
{
    return fEntityResolver;
}

inline const EntityResolver* SAXParser::getEntityResolver() const
{
    return fEntityResolver;
}

inline ErrorHandler* SAXParser::getErrorHandler()
{
    return fErrorHandler;
}

inline const ErrorHandler* SAXParser::getErrorHandler() const
{
    return fErrorHandler;
}

inline const XMLScanner& SAXParser::getScanner() const
{
    return *fScanner;
}

#endif
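
The header's comment on fAdvDHCount, fAdvDHList, and fAdvDHListSize describes a plain array of handler pointers that is expanded when count == size. The following is a minimal, standalone sketch of that bookkeeping only. It is not the SAXParser implementation: `Handler` and `HandlerList` are hypothetical names, and the initial capacity of 2 and the doubling growth policy are assumptions chosen for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for XMLDocumentHandler; only pointer identity matters.
struct Handler {};

// Sketch of a count/size-tracked array of handler pointers (assumed design).
class HandlerList
{
public :
    HandlerList() : fCount(0), fSize(2), fList(new Handler*[2]()) {}
    ~HandlerList() { delete [] fList; }

    // Append a handler, expanding the array first when count == size.
    void install(Handler* const toInstall)
    {
        if (fCount == fSize)
        {
            const std::size_t newSize = fSize * 2;   // assumed growth policy
            Handler** newList = new Handler*[newSize]();
            for (std::size_t i = 0; i < fCount; ++i)
                newList[i] = fList[i];
            delete [] fList;
            fList = newList;
            fSize = newSize;
        }
        fList[fCount++] = toInstall;
    }

    // Remove a handler if present, closing the gap; returns false if not found.
    bool remove(Handler* const toRemove)
    {
        for (std::size_t i = 0; i < fCount; ++i)
        {
            if (fList[i] == toRemove)
            {
                for (std::size_t j = i + 1; j < fCount; ++j)
                    fList[j - 1] = fList[j];
                fList[--fCount] = 0;
                return true;
            }
        }
        return false;
    }

    std::size_t count() const { return fCount; }
    std::size_t capacity() const { return fSize; }
    Handler* at(const std::size_t index) const { return fList[index]; }

private :
    // Unimplemented copy operations, mirroring the header's own convention.
    HandlerList(const HandlerList&);
    void operator=(const HandlerList&);

    std::size_t fCount;   // how many handlers are currently installed
    std::size_t fSize;    // how big the array itself is
    Handler**   fList;    // the array of handler pointers
};
```

Returning a bool from `remove` follows the shape of `removeAdvDocHandler` in the listing; what the real method does on a miss is not shown in this header, so that behavior is an assumption here.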


Copyright © 2000 The Apache Software Foundation. All Rights Reserved.