API Docs for SAX and DOM
 


SAXParser.hpp

/*
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 1999-2001 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Xerces" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache\@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation, and was
 * originally based on software copyright (c) 1999, International
 * Business Machines, Inc., http://www.ibm.com .  For more information
 * on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */

/*
 * $Log: SAXParser.hpp,v $
 * Revision 1.22  2001/12/05 22:09:02  tng
 * Update documentation for setExternalSchemaLocation and setExternalNoNamespaceSchemaLocation.
 *
 * Revision 1.21  2001/11/20 18:51:44  tng
 * Schema: schemaLocation and noNamespaceSchemaLocation to be specified outside the instance document.  New methods setExternalSchemaLocation and setExternalNoNamespaceSchemaLocation are added (for SAX2, two new properties are added).
 *
 * Revision 1.20  2001/08/01 19:11:02  tng
 * Add full schema constraint checking flag to the samples and the parser.
 *
 * Revision 1.19  2001/07/27 20:24:21  tng
 * put getScanner() back as they were there before, not to break existing apps.
 *
 * Revision 1.18  2001/07/16 12:52:09  tng
 * APIDocs fix: default for schema processing in DOMParser, IDOMParser, and SAXParser should be false.
 *
 * Revision 1.17  2001/06/23 14:13:16  tng
 * Remove getScanner from the Parser headers as this is not needed and Scanner is not internal class.
 *
 * Revision 1.16  2001/06/03 19:26:20  jberry
 * Add support for querying error count following parse; enables simple parse without requiring error handler.
 *
 * Revision 1.15  2001/05/11 13:26:22  tng
 * Copyright update.
 *
 * Revision 1.14  2001/05/03 19:09:25  knoaman
 * Support Warning/Error/FatalError messaging.
 * Validity constraints errors are treated as errors, with the ability by user to set
 * validity constraints as fatal errors.
 *
 * Revision 1.13  2001/03/30 16:46:57  tng
 * Schema: Use setDoSchema instead of setSchemaValidation which makes more sense.
 *
 * Revision 1.12  2001/03/21 21:56:09  tng
 * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator into DTDValidator, DTDScanner, and DTDGrammar.
 *
 * Revision 1.11  2001/02/15 15:56:29  tng
 * Schema: Add setSchemaValidation and getSchemaValidation for DOMParser and SAXParser.
 * Add feature "http://apache.org/xml/features/validation/schema" for SAX2XMLReader.
 * New data field  fSchemaValidation in XMLScanner as the flag.
 *
 * Revision 1.10  2001/01/12 21:23:41  tng
 * Documentation Enhancement: explain values of Val_Scheme
 *
 * Revision 1.9  2000/08/02 18:05:15  jpolast
 * changes required for sax2
 * (changed private members to protected)
 *
 * Revision 1.8  2000/04/12 22:58:30  roddey
 * Added support for 'auto validate' mode.
 *
 * Revision 1.7  2000/03/03 01:29:34  roddey
 * Added a scanReset()/parseReset() method to the scanner and
 * parsers, to allow for reset after early exit from a progressive parse.
 * Added calls to new Terminate() call to all of the samples. Improved
 * documentation in SAX and DOM parsers.
 *
 * Revision 1.6  2000/02/17 03:54:27  rahulj
 * Added some new getters to query the parser state and
 * clarified the documentation.
 *
 * Revision 1.5  2000/02/16 03:42:58  rahulj
 * Finished documenting the SAX Driver implementation.
 *
 * Revision 1.4  2000/02/15 04:47:37  rahulj
 * Documenting the SAXParser framework. Not done yet.
 *
 * Revision 1.3  2000/02/06 07:47:56  rahulj
 * Year 2K copyright swat.
 *
 * Revision 1.2  1999/12/15 19:57:48  roddey
 * Got rid of redundant 'const' on boolean return value. Some compilers choke
 * on this and its useless.
 *
 * Revision 1.1.1.1  1999/11/09 01:07:51  twl
 * Initial checkin
 *
 * Revision 1.6  1999/11/08 20:44:54  rahul
 * Swat for adding in Product name and CVS comment log variable.
 *
 */

#if !defined(SAXPARSER_HPP)
#define SAXPARSER_HPP

#include <sax/Parser.hpp>
#include <internal/VecAttrListImpl.hpp>
#include <framework/XMLDocumentHandler.hpp>
#include <framework/XMLElementDecl.hpp>
#include <framework/XMLEntityHandler.hpp>
#include <framework/XMLErrorReporter.hpp>
#include <validators/DTD/DocTypeHandler.hpp>

class DocumentHandler;
class EntityResolver;
class XMLPScanToken;
class XMLScanner;
class XMLValidator;


class SAXParser :

    public Parser
    , public XMLDocumentHandler
    , public XMLErrorReporter
    , public XMLEntityHandler
    , public DocTypeHandler
{
public :
    // -----------------------------------------------------------------------
    //  Class types
    // -----------------------------------------------------------------------
    enum ValSchemes
    {
        Val_Never
        , Val_Always
        , Val_Auto
    };


    // -----------------------------------------------------------------------
    //  Constructors and Destructor
    // -----------------------------------------------------------------------

    SAXParser(XMLValidator* const valToAdopt = 0);

    ~SAXParser();


    DocumentHandler* getDocumentHandler();

    const DocumentHandler* getDocumentHandler() const;

    EntityResolver* getEntityResolver();

    const EntityResolver* getEntityResolver() const;

    ErrorHandler* getErrorHandler();

    const ErrorHandler* getErrorHandler() const;

    const XMLScanner& getScanner() const;

    const XMLValidator& getValidator() const;

    ValSchemes getValidationScheme() const;

    bool getDoSchema() const;

    bool getValidationSchemaFullChecking() const;

    int getErrorCount() const;

    bool getDoNamespaces() const;

    bool getExitOnFirstFatalError() const;

    bool getValidationConstraintFatal() const;

    XMLCh* getExternalSchemaLocation() const;

    XMLCh* getExternalNoNamespaceSchemaLocation() const;


    // -----------------------------------------------------------------------
    //  Setter methods
    // -----------------------------------------------------------------------

    void setDoNamespaces(const bool newState);

    void setValidationScheme(const ValSchemes newScheme);

    void setDoSchema(const bool newState);

    void setValidationSchemaFullChecking(const bool schemaFullChecking);

    void setExitOnFirstFatalError(const bool newState);

    void setValidationConstraintFatal(const bool newState);

    void setExternalSchemaLocation(const XMLCh* const schemaLocation);

    void setExternalSchemaLocation(const char* const schemaLocation);

    void setExternalNoNamespaceSchemaLocation(const XMLCh* const noNamespaceSchemaLocation);

    void setExternalNoNamespaceSchemaLocation(const char* const noNamespaceSchemaLocation);


    // -----------------------------------------------------------------------
    //  Advanced document handler list maintenance methods
    // -----------------------------------------------------------------------

    void installAdvDocHandler(XMLDocumentHandler* const toInstall);

    bool removeAdvDocHandler(XMLDocumentHandler* const toRemove);


    // -----------------------------------------------------------------------
    //  Implementation of the SAX Parser interface
    // -----------------------------------------------------------------------

    virtual void parse(const InputSource& source, const bool reuseGrammar = false);

    virtual void parse(const XMLCh* const systemId, const bool reuseGrammar = false);

    virtual void parse(const char* const systemId, const bool reuseGrammar = false);

    virtual void setDocumentHandler(DocumentHandler* const handler);

    virtual void setDTDHandler(DTDHandler* const handler);

    virtual void setErrorHandler(ErrorHandler* const handler);

    virtual void setEntityResolver(EntityResolver* const resolver);


    // -----------------------------------------------------------------------
    //  Progressive scan methods
    // -----------------------------------------------------------------------

    bool parseFirst
    (
        const   XMLCh* const    systemId
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseFirst
    (
        const   char* const     systemId
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseFirst
    (
        const   InputSource&    source
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseNext(XMLPScanToken& token);

    void parseReset(XMLPScanToken& token);



    // -----------------------------------------------------------------------
    //  Implementation of the DocTypeHandler Interface
    // -----------------------------------------------------------------------

    virtual void attDef
    (
        const   DTDElementDecl& elemDecl
        , const DTDAttDef&      attDef
        , const bool            ignoring
    );

    virtual void doctypeComment
    (
        const   XMLCh* const    comment
    );

    virtual void doctypeDecl
    (
        const   DTDElementDecl& elemDecl
        , const XMLCh* const    publicId
        , const XMLCh* const    systemId
        , const bool            hasIntSubset
    );

    virtual void doctypePI
    (
        const   XMLCh* const    target
        , const XMLCh* const    data
    );

    virtual void doctypeWhitespace
    (
        const   XMLCh* const    chars
        , const unsigned int    length
    );

    virtual void elementDecl
    (
        const   DTDElementDecl& decl
        , const bool            isIgnored
    );

    virtual void endAttList
    (
        const   DTDElementDecl& elemDecl
    );

    virtual void endIntSubset();

    virtual void endExtSubset();

    virtual void entityDecl
    (
        const   DTDEntityDecl&  entityDecl
        , const bool            isPEDecl
        , const bool            isIgnored
    );

    virtual void resetDocType();

    virtual void notationDecl
    (
        const   XMLNotationDecl&    notDecl
        , const bool                isIgnored
    );

    virtual void startAttList
    (
        const   DTDElementDecl& elemDecl
    );

    virtual void startIntSubset();

    virtual void startExtSubset();

    virtual void TextDecl
    (
        const   XMLCh* const    versionStr
        , const XMLCh* const    encodingStr
    );


    // -----------------------------------------------------------------------
    //  Implementation of the XMLDocumentHandler interface
    // -----------------------------------------------------------------------

    virtual void docCharacters
    (
        const   XMLCh* const    chars
        , const unsigned int    length
        , const bool            cdataSection
    );

    virtual void docComment
    (
        const   XMLCh* const    comment
    );

    virtual void docPI
    (
        const   XMLCh* const    target
        , const XMLCh* const    data
    );

    virtual void endDocument();

    virtual void endElement
    (
        const   XMLElementDecl& elemDecl
        , const unsigned int    urlId
        , const bool            isRoot
    );

    virtual void endEntityReference
    (
        const   XMLEntityDecl&  entDecl
    );

    virtual void ignorableWhitespace
    (
        const   XMLCh* const    chars
        , const unsigned int    length
        , const bool            cdataSection
    );

    virtual void resetDocument();

    virtual void startDocument();

    virtual void startElement
    (
        const   XMLElementDecl&         elemDecl
        , const unsigned int            urlId
        , const XMLCh* const            elemPrefix
        , const RefVectorOf<XMLAttr>&   attrList
        , const unsigned int            attrCount
        , const bool                    isEmpty
        , const bool                    isRoot
    );

    virtual void startEntityReference
    (
        const   XMLEntityDecl&  entDecl
    );

    virtual void XMLDecl
    (
        const   XMLCh* const    versionStr
        , const XMLCh* const    encodingStr
        , const XMLCh* const    standaloneStr
        , const XMLCh* const    actualEncodingStr
    );


    // -----------------------------------------------------------------------
    //  Implementation of the XMLErrorReporter interface
    // -----------------------------------------------------------------------

    virtual void error
    (
        const   unsigned int                errCode
        , const XMLCh* const                msgDomain
        , const XMLErrorReporter::ErrTypes  errType
        , const XMLCh* const                errorText
        , const XMLCh* const                systemId
        , const XMLCh* const                publicId
        , const unsigned int                lineNum
        , const unsigned int                colNum
    );

    virtual void resetErrors();


    // -----------------------------------------------------------------------
    //  Implementation of the XMLEntityHandler interface
    // -----------------------------------------------------------------------

    virtual void endInputSource(const InputSource& inputSource);

    virtual bool expandSystemId
    (
        const   XMLCh* const    systemId
        ,       XMLBuffer&      toFill
    );

    virtual void resetEntities();

    virtual InputSource* resolveEntity
    (
        const   XMLCh* const    publicId
        , const XMLCh* const    systemId
    );

    virtual void startInputSource(const InputSource& inputSource);


    bool getDoValidation() const;

    void setDoValidation(const bool newState);


protected :
    // -----------------------------------------------------------------------
    //  Unimplemented constructors and operators
    // -----------------------------------------------------------------------
    SAXParser(const SAXParser&);
    void operator=(const SAXParser&);


    // -----------------------------------------------------------------------
    //  Protected data members
    //
    //  fAttrList
    //      A temporary implementation of the basic SAX attribute list
    //      interface. We use this one over and over on each startElement
    //      event to allow SAX-like access to the element attributes.
    //
    //  fDocHandler
    //      The installed SAX doc handler, if any. Null if none.
    //
    //  fDTDHandler
    //      The installed SAX DTD handler, if any. Null if none.
    //
    //  fElemDepth
    //      This is used to track the element nesting depth, so that we can
    //      know when we are inside content. This is so we can ignore char
    //      data outside of content.
    //
    //  fEntityResolver
    //      The installed SAX entity resolver, if any. Null if none.
    //
    //  fErrorHandler
    //      The installed SAX error handler, if any. Null if none.
    //
    //  fAdvDHCount
    //  fAdvDHList
    //  fAdvDHListSize
    //      This is an array of pointers to XMLDocumentHandlers, which is
    //      how we see installed advanced document handlers. There will
    //      usually not be very many at all, so a simple array is used
    //      instead of a collection, for performance. It will grow if needed,
    //      but that is unlikely.
    //
    //      The count is how many handlers are currently installed. The size
    //      is how big the array itself is (for expansion purposes). When
    //      count == size, it is time to expand.
    //
    //  fParseInProgress
    //      This flag is set once a parse starts. It is used to prevent
    //      multiple entrance or reentrance of the parser.
    //
    //  fScanner
    //      The scanner being used by this parser. It is created internally
    //      during construction.
    //
    // -----------------------------------------------------------------------
    VecAttrListImpl         fAttrList;
    DocumentHandler*        fDocHandler;
    DTDHandler*             fDTDHandler;
    unsigned int            fElemDepth;
    EntityResolver*         fEntityResolver;
    ErrorHandler*           fErrorHandler;
    unsigned int            fAdvDHCount;
    XMLDocumentHandler**    fAdvDHList;
    unsigned int            fAdvDHListSize;
    bool                    fParseInProgress;
    XMLScanner*             fScanner;
};


// ---------------------------------------------------------------------------
//  SAXParser: Getter methods
// ---------------------------------------------------------------------------
inline DocumentHandler* SAXParser::getDocumentHandler()
{
    return fDocHandler;
}

inline const DocumentHandler* SAXParser::getDocumentHandler() const
{
    return fDocHandler;
}

inline EntityResolver* SAXParser::getEntityResolver()
{
    return fEntityResolver;
}

inline const EntityResolver* SAXParser::getEntityResolver() const
{
    return fEntityResolver;
}

inline ErrorHandler* SAXParser::getErrorHandler()
{
    return fErrorHandler;
}

inline const ErrorHandler* SAXParser::getErrorHandler() const
{
    return fErrorHandler;
}

inline const XMLScanner& SAXParser::getScanner() const
{
    return *fScanner;
}

#endif
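
The data-member comments in the header describe the advanced document handler list as a plain array of pointers with a separate count and size, expanded when count == size. The following is a minimal, standalone sketch of that bookkeeping and is not the Xerces implementation. Handler and HandlerList are hypothetical names, and the initial size and doubling growth policy are assumptions chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for XMLDocumentHandler; the real interface is not
// reproduced here.
struct Handler
{
    virtual ~Handler() {}
};

// Hypothetical helper mirroring the fAdvDHCount / fAdvDHList / fAdvDHListSize
// trio: a raw array of non-owned pointers plus a count and a capacity.
class HandlerList
{
public :
    explicit HandlerList(const std::size_t initialSize = 2)
        : fCount(0)
        , fSize(initialSize)
        , fList(new Handler*[initialSize]())
    {
    }

    ~HandlerList()
    {
        // Only the array is released; the handlers themselves are not owned.
        delete [] fList;
    }

    void install(Handler* const toInstall)
    {
        assert(toInstall != nullptr);

        // When count == size the array is full, so expand before appending.
        // Doubling is an assumed growth policy.
        if (fCount == fSize)
        {
            const std::size_t newSize = (fSize == 0) ? 1 : fSize * 2;
            Handler** newList = new Handler*[newSize]();
            for (std::size_t i = 0; i < fCount; ++i)
                newList[i] = fList[i];
            delete [] fList;
            fList = newList;
            fSize = newSize;
        }
        fList[fCount++] = toInstall;
    }

    // Returns true if the handler was found and removed, false otherwise.
    bool remove(Handler* const toRemove)
    {
        for (std::size_t i = 0; i < fCount; ++i)
        {
            if (fList[i] == toRemove)
            {
                // Shift the later entries down to keep the array compact.
                for (std::size_t j = i + 1; j < fCount; ++j)
                    fList[j - 1] = fList[j];
                fList[--fCount] = nullptr;
                return true;
            }
        }
        return false;
    }

    std::size_t count() const { return fCount; }
    std::size_t capacity() const { return fSize; }

private :
    // Unimplemented copy operations, in the same spirit as the header.
    HandlerList(const HandlerList&);
    void operator=(const HandlerList&);

    std::size_t fCount;   // handlers currently installed
    std::size_t fSize;    // slots allocated in fList
    Handler**   fList;    // array of non-owned handler pointers
};
```

In this sketch, installing a third handler into a list created with two slots triggers one expansion, and removing a handler that is not installed simply returns false, which matches the bool return type declared for removeAdvDocHandler.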


Copyright © 2000 The Apache Software Foundation. All Rights Reserved.