Class HTMLSerializer
- All Implemented Interfaces:
DOMSerializer,Serializer,ContentHandler,DocumentHandler,DTDHandler,DeclHandler,LexicalHandler
- Direct Known Subclasses:
XHTMLSerializer
For usage instructions, see Serializer.
If an output stream is used, the encoding is taken from the output format (defaults to UTF-8). If a writer is used, make sure the writer uses the same encoding, where applicable, as the one specified in the output format.
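The writer case can be illustrated with JDK classes alone. This is a minimal sketch, not Xerces code: the class name `WriterEncodingSketch` and the choice of ISO-8859-1 as the encoding declared in the output format are assumptions made for illustration. It shows that the bytes produced depend on the charset the writer was built with, which is why the writer and the output format have to agree.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

class WriterEncodingSketch {
    // Encodes text through a Writer built for the given charset and returns the raw bytes.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, Charset.forName(charsetName))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String declared = "ISO-8859-1"; // assumed: the encoding named in the output format
        System.out.println(encode("caf\u00e9", declared).length); // writer matches the declaration
        System.out.println(encode("caf\u00e9", "UTF-8").length);  // writer disagrees with it
    }
}
```

If the two byte counts differ, a document that declares one encoding but was written with the other will be decoded incorrectly by its reader.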
The serializer supports both DOM and SAX. DOM serializing is done
by calling BaseMarkupSerializer.serialize(org.w3c.dom.Element) and SAX serializing is done by firing
SAX events and using the serializer as a document handler.
If an I/O exception occurs while serializing, the serializer
will not throw an exception directly, but only throw it
at the end of serializing (either DOM or SAX's DocumentHandler.endDocument()).
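The deferred-exception behaviour described above follows a common pattern: remember the first failure and surface it once at the end. The sketch below illustrates that pattern only; `DeferredIOSketch`, `print` and `finish` are hypothetical names, and this is not the Xerces implementation.

```java
import java.io.IOException;
import java.io.Writer;

class DeferredIOSketch {
    private final Writer out;
    private IOException pending; // first failure seen while writing, if any

    DeferredIOSketch(Writer out) {
        this.out = out;
    }

    // Writes text; a failure is remembered instead of thrown.
    void print(String text) {
        if (pending != null) {
            return; // stop writing after the first failure
        }
        try {
            out.write(text);
        } catch (IOException e) {
            pending = e;
        }
    }

    // Called once at the end of serializing; surfaces the remembered failure.
    void finish() throws IOException {
        if (pending != null) {
            throw pending;
        }
        out.flush();
    }
}
```

A caller can therefore issue many `print` calls without handling exceptions on each one, and needs a single `try`/`catch` around `finish`.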
For elements that are not specified as whitespace preserving, the serializer will potentially break long text lines at space boundaries, indent lines, and serialize elements on separate lines. Line terminators will be regarded as spaces, and spaces at the beginning of a line will be stripped.
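The line-breaking rules above can be sketched in a few lines of Java. This is an illustration under assumptions, not the serializer's actual algorithm: `LineBreakSketch.wrap` and its `width` parameter are hypothetical, indentation is left out, and runs of whitespace are collapsed to a single space as a simplification.

```java
import java.util.ArrayList;
import java.util.List;

class LineBreakSketch {
    // Wraps text to at most `width` columns, breaking only at whitespace.
    // Line terminators count as spaces and no line starts with a space.
    // A single word longer than `width` is kept whole on its own line.
    static List<String> wrap(String text, int width) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // input was empty or all whitespace
            }
            if (line.length() > 0 && line.length() + 1 + word.length() > width) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // The embedded newline and the spaces after it are treated as one break opportunity.
        for (String l : wrap("one two\n   three four", 9)) {
            System.out.println("[" + l + "]");
        }
    }
}
```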
XHTML is slightly different from HTML:
- Element/attribute names are lower case and case matters
- Attributes must specify value, even if empty string
- Empty elements must have '/' in empty tag
- Contents of SCRIPT and STYLE elements serialized as CDATA
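The first three differences in the list above can be shown side by side. The following is a minimal sketch, not the serializer's code: `XhtmlVsHtmlSketch.emptyTag` is a hypothetical helper, a `null` attribute value stands for an attribute given without a value, and attribute values are not escaped.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class XhtmlVsHtmlSketch {
    // Renders the tag of an empty element such as <br> or <input>.
    static String emptyTag(String name, Map<String, String> attrs, boolean xhtml) {
        StringBuilder sb = new StringBuilder("<");
        sb.append(xhtml ? name.toLowerCase(Locale.ROOT) : name);
        for (Map.Entry<String, String> attr : attrs.entrySet()) {
            String attrName = xhtml ? attr.getKey().toLowerCase(Locale.ROOT) : attr.getKey();
            sb.append(' ').append(attrName);
            String value = attr.getValue();
            if (value != null) {
                sb.append("=\"").append(value).append('"');
            } else if (xhtml) {
                sb.append("=\"\""); // XHTML: a value is required, even if empty
            }
        }
        // XHTML: the empty tag carries a '/'
        return sb.append(xhtml ? " />" : ">").toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("TYPE", "checkbox");
        attrs.put("CHECKED", null);
        System.out.println(emptyTag("INPUT", attrs, false));
        System.out.println(emptyTag("INPUT", attrs, true));
    }
}
```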
- Version:
- $Revision: 704573 $ $Date: 2008-10-14 21:41:22 +0530 (Tue, 14 Oct 2008) $
- Author:
- Assaf Arkin
-
Field Summary
Fields inherited from class org.apache.xml.serialize.BaseMarkupSerializer
_docTypePublicId, _docTypeSystemId, _encodingInfo, _format, _indenting, _prefixes, _printer, _started, fCurrentNode, fDOMError, fDOMErrorHandler, fDOMFilter, features, fStrBuffer -
Constructor Summary
Constructors (modifier, constructor, description):
- HTMLSerializer(): Deprecated. Constructs a new serializer.
- protected HTMLSerializer(boolean xhtml, OutputFormat format): Deprecated. Constructs a new HTML/XHTML serializer depending on the value of xhtml.
- HTMLSerializer(OutputStream output, OutputFormat format): Deprecated. Constructs a new serializer that writes to the specified output stream using the specified output format.
- HTMLSerializer(Writer writer, OutputFormat format): Deprecated. Constructs a new serializer that writes to the specified writer using the specified output format.
- HTMLSerializer(OutputFormat format): Deprecated. Constructs a new serializer.
-
Method Summary
Methods (modifier and type, method, description):
- void characters(char[] chars, int start, int length): Deprecated.
- protected void characters(String text): Deprecated. Called to print the text contents in the prevailing element format.
- void endElement(String tagName): Deprecated.
- void endElement(String namespaceURI, String localName, String rawName): Deprecated.
- void endElementIO(String namespaceURI, String localName, String rawName): Deprecated.
- protected String escapeURI: Deprecated.
- protected String getEntityRef(int ch): Deprecated. Returns the suitable entity reference for this character value, or null if no such entity exists.
- protected void serializeElement(Element elem): Deprecated. Called to serialize a DOM element.
- void setOutputFormat(OutputFormat format): Deprecated. Specifies an output format for this serializer.
- void setXHTMLNamespace(String newNamespace): Deprecated.
- protected void startDocument(String rootTagName): Deprecated. Called to serialize the document's DOCTYPE by the root element.
- void startElement(String namespaceURI, String localName, String rawName, Attributes attrs): Deprecated.
- void startElement(String tagName, AttributeList attrs): Deprecated.
Methods inherited from class org.apache.xml.serialize.BaseMarkupSerializer
asContentHandler, asDocumentHandler, asDOMSerializer, attributeDecl, checkUnboundNamespacePrefixedNode, cleanup, comment, comment, content, elementDecl, endCDATA, endDocument, endDTD, endEntity, endNonEscaping, endPrefixMapping, endPreserving, enterElementState, externalEntityDecl, fatalError, getElementState, getPrefix, ignorableWhitespace, internalEntityDecl, isDocumentState, leaveElementState, modifyDOMError, notationDecl, prepare, printCDATAText, printDoctypeURL, printEscaped, printEscaped, printText, printText, processingInstruction, processingInstructionIO, reset, serialize, serialize, serialize, serializeNode, serializePreRoot, setDocumentLocator, setOutputByteStream, setOutputCharStream, skippedEntity, startCDATA, startDocument, startDTD, startEntity, startNonEscaping, startPrefixMapping, startPreserving, surrogates, unparsedEntityDecl
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.xml.sax.ContentHandler
declaration
-
Field Details
-
XHTMLNamespace
Deprecated.
-
-
Constructor Details
-
HTMLSerializer
protected HTMLSerializer(boolean xhtml, OutputFormat format)
Deprecated. Constructs a new HTML/XHTML serializer depending on the value of xhtml. The serializer cannot be used without calling BaseMarkupSerializer.setOutputCharStream(java.io.Writer) or BaseMarkupSerializer.setOutputByteStream(java.io.OutputStream) first.
- Parameters:
xhtml - True if XHTML serializing
-
HTMLSerializer
public HTMLSerializer()
Deprecated. Constructs a new serializer. The serializer cannot be used without calling BaseMarkupSerializer.setOutputCharStream(java.io.Writer) or BaseMarkupSerializer.setOutputByteStream(java.io.OutputStream) first.
-
HTMLSerializer
public HTMLSerializer(OutputFormat format)
Deprecated. Constructs a new serializer. The serializer cannot be used without calling BaseMarkupSerializer.setOutputCharStream(java.io.Writer) or BaseMarkupSerializer.setOutputByteStream(java.io.OutputStream) first.
-
HTMLSerializer
public HTMLSerializer(Writer writer, OutputFormat format)
Deprecated. Constructs a new serializer that writes to the specified writer using the specified output format. If format is null, a default output format is used.
- Parameters:
writer - The writer to use
format - The output format to use, null for the default
-
HTMLSerializer
public HTMLSerializer(OutputStream output, OutputFormat format)
Deprecated. Constructs a new serializer that writes to the specified output stream using the specified output format. If format is null, a default output format is used.
- Parameters:
output - The output stream to use
format - The output format to use, null for the default
-
-
Method Details
-
setOutputFormat
Deprecated.
Description copied from interface: Serializer
Specifies an output format for this serializer. If the serializer has already been associated with an output format, it will switch to the new format. This method should not be called while the serializer is in the process of serializing a document.
- Specified by:
setOutputFormat in interface Serializer
- Overrides:
setOutputFormat in class BaseMarkupSerializer
- Parameters:
format - The output format to use
-
setXHTMLNamespace
Deprecated. -
startElement
public void startElement(String namespaceURI, String localName, String rawName, Attributes attrs) throws SAXException
Deprecated.
- Throws:
SAXException
-
endElement
Deprecated.
- Throws:
SAXException
-
endElementIO
Deprecated.
- Throws:
IOException
-
characters
Deprecated.
- Specified by:
characters in interface ContentHandler
- Specified by:
characters in interface DocumentHandler
- Overrides:
characters in class BaseMarkupSerializer
- Throws:
SAXException
-
startElement
Deprecated.
- Throws:
SAXException
-
endElement
Deprecated.
- Throws:
SAXException
-
startDocument
Deprecated. Called to serialize the document's DOCTYPE by the root element. The document type declaration must name the root element, but the root element is only known when that element is serialized, and not at the start of the document.
This method will check if it has not been called before (BaseMarkupSerializer._started), will serialize the document type declaration, and will serialize all pre-root comments and PIs that were accumulated in the document (see BaseMarkupSerializer.serializePreRoot()). Pre-root will be serialized even if this is not the first root element of the document.
- Throws:
IOException
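The deferral described for startDocument can be sketched as follows: content seen before the root element is buffered, and the DOCTYPE is written only once the root tag name is known. This is an illustration, not the Xerces source. `DeferredDoctypeSketch` and its methods are hypothetical, the `started` field only plays the role of `_started`, and the DOCTYPE is written without public or system identifiers.

```java
import java.util.ArrayList;
import java.util.List;

class DeferredDoctypeSketch {
    private final StringBuilder out = new StringBuilder();
    private final List<String> preRoot = new ArrayList<>(); // comments seen before the root
    private boolean started; // plays the role of the _started flag

    void comment(String text) {
        String markup = "<!--" + text + "-->\n";
        if (started) {
            out.append(markup);
        } else {
            preRoot.add(markup); // root not known yet, so hold it back
        }
    }

    void startElement(String tagName) {
        startDocument(tagName);
        out.append('<').append(tagName).append('>');
    }

    private void startDocument(String rootTagName) {
        if (!started) {
            // Only now is the root element name available for the DOCTYPE.
            out.append("<!DOCTYPE ").append(rootTagName).append(">\n");
            started = true;
        }
        // Buffered pre-root content is flushed whether or not the DOCTYPE was just written.
        for (String markup : preRoot) {
            out.append(markup);
        }
        preRoot.clear();
    }

    String result() {
        return out.toString();
    }
}
```

Calling `comment("generated")` and then `startElement("html")` yields the DOCTYPE first, the buffered comment second, and the root start tag last.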
-
serializeElement
Deprecated. Called to serialize a DOM element. Equivalent to calling startElement(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes), endElement(java.lang.String, java.lang.String, java.lang.String) and serializing everything in between, but better optimized.
- Specified by:
serializeElement in class BaseMarkupSerializer
- Parameters:
elem - The element to serialize
- Throws:
IOException - An I/O exception occurred while serializing
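The equivalence stated above (a start step, everything in between, then an end step) corresponds to a recursive walk over the DOM tree. The sketch below uses only the JDK's DOM API and is not the optimized Xerces routine: `ElementWalkSketch` is a hypothetical name, and attributes, escaping and indentation are deliberately left out.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ElementWalkSketch {
    // Emits the start tag, then the children in document order, then the end tag.
    static void walk(Element elem, StringBuilder out) {
        out.append('<').append(elem.getTagName()).append('>');      // start-element step
        for (Node child = elem.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) child, out);                          // nested element
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                out.append(child.getNodeValue());                    // characters step
            }
        }
        out.append("</").append(elem.getTagName()).append('>');     // end-element step
    }

    // Builds <p>Hello <b>world</b></p> in memory and walks it.
    static String demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            p.appendChild(doc.createTextNode("Hello "));
            Element b = doc.createElement("b");
            b.appendChild(doc.createTextNode("world"));
            p.appendChild(b);
            StringBuilder out = new StringBuilder();
            walk(p, out);
            return out.toString();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```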
-
characters
Deprecated.
Description copied from class: BaseMarkupSerializer
Called to print the text contents in the prevailing element format. Since this method is capable of printing text as CDATA, it is used for that purpose as well. White space handling is determined by the current element state. In addition, the output format can dictate whether the text is printed as CDATA or unescaped.
- Overrides:
characters in class BaseMarkupSerializer
- Parameters:
text - The text to print
- Throws:
IOException - An I/O exception occurred while serializing
-
getEntityRef
Deprecated.
Description copied from class: BaseMarkupSerializer
Returns the suitable entity reference for this character value, or null if no such entity exists. Calling this method with '&' will return "&amp;".
- Specified by:
getEntityRef in class BaseMarkupSerializer
- Parameters:
ch - Character value
- Returns:
Character entity name, or null
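A character-to-entity lookup with a null result for unmapped characters can be sketched as below. This is an assumption-laden illustration, not the serializer's table: `EntityRefSketch`, `entityName` and `escape` are hypothetical, only five entities are listed, and the lookup here returns the bare name while `escape` adds the surrounding `&` and `;`.

```java
import java.util.HashMap;
import java.util.Map;

class EntityRefSketch {
    private static final Map<Integer, String> NAMES = new HashMap<>();
    static {
        NAMES.put((int) '&', "amp");
        NAMES.put((int) '<', "lt");
        NAMES.put((int) '>', "gt");
        NAMES.put((int) '"', "quot");
        NAMES.put(0xA0, "nbsp");
    }

    // Returns the entity name for the character, or null if none is known.
    static String entityName(int ch) {
        return NAMES.get(ch);
    }

    // Replaces every character that has an entity name; others pass through unchanged.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            String name = entityName(cp);
            if (name != null) {
                sb.append('&').append(name).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a<b & c"));
    }
}
```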
-
escapeURI
Deprecated.
-