
Class org.w3c.tidy.Tidy

java.lang.Object
   |
   +----org.w3c.tidy.Tidy

public class Tidy
extends Object
implements Serializable

HTML parser and pretty printer

(c) 1998, 1999 (W3C) MIT, INRIA, Keio University. See Tidy.java for the copyright notice. Derived from HTML Tidy Release 26 Jul 1999.

Copyright (c) 1998 World Wide Web Consortium (Massachusetts Institute of Technology, Institut National de Recherche en Informatique et en Automatique, Keio University). All Rights Reserved.

Contributing Author(s):
Dave Raggett
Andy Quick (translation to Java)

The contributing author(s) would like to thank all those who helped with testing, bug fixes, and patience. This wouldn't have been possible without all of you.

COPYRIGHT NOTICE:
This software and documentation is provided "as is," and the copyright holders and contributing author(s) make no representations or warranties, express or implied, including but not limited to, warranties of merchantability or fitness for any particular purpose or that the use of the software or documentation will not infringe any third party patents, copyrights, trademarks or other rights.

The copyright holders and contributing author(s) will not be liable for any direct, indirect, special or consequential damages arising out of any use of the software or documentation, even if advised of the possibility of such damage.

Permission is hereby granted to use, copy, modify, and distribute this source code, or portions hereof, documentation and executables, for any purpose, without fee, subject to the following restrictions:

  1. The origin of this source code must not be misrepresented.
  2. Altered versions must be plainly marked as such and must not be misrepresented as being the original source.
  3. This Copyright notice may not be removed or altered from any source or altered source distribution.

The copyright holders and contributing author(s) specifically permit, without fee, and encourage the use of this source code as a component for supporting the Hypertext Markup Language in commercial products. If you use this source code in a product, acknowledgment is not required but would be appreciated.


Constructor Index

 o Tidy()

Method Index

 o getBreakBeforeBR()
 o getBurstSlides()
 o getCharEncoding()
 o getConfiguration()
 o getDocType()
 o getDropFontTags()
 o getErrfile()
 o getErrout()
 o getFixBackslash()
 o getHideEndTags()
 o getIndentAttributes()
 o getIndentContent()
 o getInputStreamName()
 o getLogicalEmphasis()
 o getMakeClean()
 o getNumEntities()
 o getOnlyErrors()
 o getQuoteAmpersand()
 o getQuoteMarks()
 o getQuoteNbsp()
 o getRawOut()
 o getShowWarnings()
 o getSlidestyle()
 o getSmartIndent()
 o getSpaces()
 o getStderr()
 o getTabsize()
 o getUpperCaseAttrs()
 o getUpperCaseTags()
 o getWrapAsp()
 o getWraplen()
 o getWrapScriptlets()
 o getWriteback()
 o getXHTML()
 o getXmlOut()
 o getXmlPi()
 o getXmlPIs()
 o getXmlTags()
 o main(String[])
Command line interface to parser and pretty printer.
 o parse(InputStream, OutputStream)
Parses InputStream in and returns the root Node.
 o setBreakBeforeBR(boolean)
BreakBeforeBR - whether to output a newline before <br>
 o setBurstSlides(boolean)
BurstSlides - create slides on each h2 element
 o setCharEncoding(int)
CharEncoding
 o setDocType(String)
DocType - user-specified doctype: omit | auto | strict | loose | fpi, where the fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for an fpi, include the double quotes in the string.
 o setDropFontTags(boolean)
DropFontTags - discard presentation tags
 o setErrfile(String)
Errfile - file name to write errors to
 o setErrout(PrintWriter)
 o setFixBackslash(boolean)
FixBackslash - fix URLs by replacing \ with /
 o setHideEndTags(boolean)
HideEndTags - suppress optional end tags
 o setIndentAttributes(boolean)
IndentAttributes - newline+indent before each attribute
 o setIndentContent(boolean)
IndentContent - indent content of appropriate tags
 o setInputStreamName(String)
InputStreamName - the name of the input stream (printed in the header information).
 o setLogicalEmphasis(boolean)
LogicalEmphasis - replace i by em and b by strong
 o setMakeClean(boolean)
MakeClean - remove presentational clutter
 o setNumEntities(boolean)
NumEntities - use numeric entities
 o setOnlyErrors(boolean)
OnlyErrors - if true, normal output is suppressed
 o setQuoteAmpersand(boolean)
QuoteAmpersand - output naked ampersand as &amp;
 o setQuoteMarks(boolean)
QuoteMarks - output " marks as &quot;
 o setQuoteNbsp(boolean)
QuoteNbsp - output non-breaking space as entity
 o setRawOut(boolean)
RawOut - avoid mapping values > 127 to entities
 o setShowWarnings(boolean)
ShowWarnings - whether warnings are shown; errors are always shown
 o setSlidestyle(String)
Slidestyle - style sheet for slides
 o setSmartIndent(boolean)
SmartIndent - whether text/block-level content affects indentation
 o setSpaces(int)
Spaces - default indentation
 o setTabsize(int)
Tabsize
 o setUpperCaseAttrs(boolean)
UpperCaseAttrs - output attributes in upper not lower case
 o setUpperCaseTags(boolean)
UpperCaseTags - output tags in upper not lower case
 o setWrapAsp(boolean)
WrapAsp - wrap within ASP pseudo elements
 o setWraplen(int)
Wraplen - default wrap margin
 o setWrapScriptlets(boolean)
WrapScriptlets - wrap within JavaScript string literals
 o setWriteback(boolean)
Writeback - if true, output tidied markup
 o setXHTML(boolean)
XHTML - output extensible HTML
 o setXmlOut(boolean)
XmlOut - create output as XML
 o setXmlPi(boolean)
XmlPi - add <?xml?> for XML docs
 o setXmlPIs(boolean)
XmlPIs - if true, PIs must end with ?>
 o setXmlTags(boolean)
XmlTags - treat input as XML

Constructors

 o Tidy
 public Tidy()

Methods

 o getConfiguration
 public Configuration getConfiguration()
 o getStderr
 public PrintWriter getStderr()
 o getErrout
 public PrintWriter getErrout()
 o setErrout
 public void setErrout(PrintWriter errout)
 o setSpaces
 public void setSpaces(int spaces)
Spaces - default indentation

See Also:
spaces
 o getSpaces
 public int getSpaces()
 o setWraplen
 public void setWraplen(int wraplen)
Wraplen - default wrap margin

See Also:
wraplen
 o getWraplen
 public int getWraplen()
 o setCharEncoding
 public void setCharEncoding(int charencoding)
CharEncoding

See Also:
CharEncoding
 o getCharEncoding
 public int getCharEncoding()
 o setTabsize
 public void setTabsize(int tabsize)
Tabsize

See Also:
tabsize
 o getTabsize
 public int getTabsize()
 o setErrfile
 public void setErrfile(String errfile)
Errfile - file name to write errors to

See Also:
errfile
 o getErrfile
 public String getErrfile()
 o setWriteback
 public void setWriteback(boolean writeback)
Writeback - if true, output tidied markup

See Also:
writeback
 o getWriteback
 public boolean getWriteback()
 o setOnlyErrors
 public void setOnlyErrors(boolean OnlyErrors)
OnlyErrors - if true, normal output is suppressed

See Also:
OnlyErrors
 o getOnlyErrors
 public boolean getOnlyErrors()
 o setShowWarnings
 public void setShowWarnings(boolean ShowWarnings)
ShowWarnings - whether warnings are shown; errors are always shown

See Also:
ShowWarnings
 o getShowWarnings
 public boolean getShowWarnings()
 o setIndentContent
 public void setIndentContent(boolean IndentContent)
IndentContent - indent content of appropriate tags

See Also:
IndentContent
 o getIndentContent
 public boolean getIndentContent()
 o setSmartIndent
 public void setSmartIndent(boolean SmartIndent)
SmartIndent - whether text/block-level content affects indentation

See Also:
SmartIndent
 o getSmartIndent
 public boolean getSmartIndent()
 o setHideEndTags
 public void setHideEndTags(boolean HideEndTags)
HideEndTags - suppress optional end tags

See Also:
HideEndTags
 o getHideEndTags
 public boolean getHideEndTags()
 o setXmlTags
 public void setXmlTags(boolean XmlTags)
XmlTags - treat input as XML

See Also:
XmlTags
 o getXmlTags
 public boolean getXmlTags()
 o setXmlOut
 public void setXmlOut(boolean XmlOut)
XmlOut - create output as XML

See Also:
XmlOut
 o getXmlOut
 public boolean getXmlOut()
 o setXHTML
 public void setXHTML(boolean xHTML)
XHTML - output extensible HTML

See Also:
xHTML
 o getXHTML
 public boolean getXHTML()
 o setRawOut
 public void setRawOut(boolean RawOut)
RawOut - avoid mapping values > 127 to entities

See Also:
RawOut
 o getRawOut
 public boolean getRawOut()
 o setUpperCaseTags
 public void setUpperCaseTags(boolean UpperCaseTags)
UpperCaseTags - output tags in upper not lower case

See Also:
UpperCaseTags
 o getUpperCaseTags
 public boolean getUpperCaseTags()
 o setUpperCaseAttrs
 public void setUpperCaseAttrs(boolean UpperCaseAttrs)
UpperCaseAttrs - output attributes in upper not lower case

See Also:
UpperCaseAttrs
 o getUpperCaseAttrs
 public boolean getUpperCaseAttrs()
 o setMakeClean
 public void setMakeClean(boolean MakeClean)
MakeClean - remove presentational clutter

See Also:
MakeClean
 o getMakeClean
 public boolean getMakeClean()
 o setBreakBeforeBR
 public void setBreakBeforeBR(boolean BreakBeforeBR)
BreakBeforeBR - whether to output a newline before <br>

See Also:
BreakBeforeBR
 o getBreakBeforeBR
 public boolean getBreakBeforeBR()
 o setBurstSlides
 public void setBurstSlides(boolean BurstSlides)
BurstSlides - create slides on each h2 element

See Also:
BurstSlides
 o getBurstSlides
 public boolean getBurstSlides()
 o setNumEntities
 public void setNumEntities(boolean NumEntities)
NumEntities - use numeric entities

See Also:
NumEntities
 o getNumEntities
 public boolean getNumEntities()
 o setQuoteMarks
 public void setQuoteMarks(boolean QuoteMarks)
QuoteMarks - output " marks as &quot;

See Also:
QuoteMarks
 o getQuoteMarks
 public boolean getQuoteMarks()
 o setQuoteNbsp
 public void setQuoteNbsp(boolean QuoteNbsp)
QuoteNbsp - output non-breaking space as entity

See Also:
QuoteNbsp
 o getQuoteNbsp
 public boolean getQuoteNbsp()
 o setQuoteAmpersand
 public void setQuoteAmpersand(boolean QuoteAmpersand)
QuoteAmpersand - output naked ampersand as &amp;

See Also:
QuoteAmpersand
 o getQuoteAmpersand
 public boolean getQuoteAmpersand()
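
The QuoteMarks, QuoteNbsp and QuoteAmpersand options each decide how one character is written. The following is a minimal sketch of the effect those descriptions suggest, not the library's implementation. The class and method names are hypothetical, the entity used for a non-breaking space is assumed to be &nbsp;, and every & is treated as naked (the sketch does not try to recognize existing entities).

```java
public class QuoteOptionsSketch {
    /** Escapes text according to three flags named after the Quote* options. */
    public static String escape(String text, boolean quoteMarks,
                                boolean quoteNbsp, boolean quoteAmpersand) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"' && quoteMarks) {
                out.append("&quot;");
            } else if (c == '\u00A0' && quoteNbsp) {
                out.append("&nbsp;"); // assumed entity name for U+00A0
            } else if (c == '&' && quoteAmpersand) {
                out.append("&amp;");
            } else {
                out.append(c); // option off, or an ordinary character
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "Fish & \"chips\"";
        System.out.println(escape(text, true, true, true));   // Fish &amp; &quot;chips&quot;
        System.out.println(escape(text, false, true, false)); // Fish & "chips"
    }
}
```

With the library itself, the equivalent switches would be set through setQuoteMarks, setQuoteNbsp and setQuoteAmpersand before calling parse.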
 o setWrapScriptlets
 public void setWrapScriptlets(boolean WrapScriptlets)
WrapScriptlets - wrap within JavaScript string literals

See Also:
WrapScriptlets
 o getWrapScriptlets
 public boolean getWrapScriptlets()
 o setSlidestyle
 public void setSlidestyle(String slidestyle)
Slidestyle - style sheet for slides

See Also:
slidestyle
 o getSlidestyle
 public String getSlidestyle()
 o setXmlPi
 public void setXmlPi(boolean XmlPi)
XmlPi - add <?xml?> for XML docs

See Also:
XmlPi
 o getXmlPi
 public boolean getXmlPi()
 o setDropFontTags
 public void setDropFontTags(boolean DropFontTags)
DropFontTags - discard presentation tags

See Also:
DropFontTags
 o getDropFontTags
 public boolean getDropFontTags()
 o setWrapAsp
 public void setWrapAsp(boolean WrapAsp)
WrapAsp - wrap within ASP pseudo elements

See Also:
WrapAsp
 o getWrapAsp
 public boolean getWrapAsp()
 o setFixBackslash
 public void setFixBackslash(boolean FixBackslash)
FixBackslash - fix URLs by replacing \ with /

See Also:
FixBackslash
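
As an illustration of what the FixBackslash description means, the sketch below replaces every backslash in a URL string with a forward slash. It is a stand-alone example with hypothetical names; which attributes the library actually applies this to is not stated here.

```java
public class FixBackslashSketch {
    /** Returns the URL with each '\' replaced by '/'. */
    public static String fixBackslash(String url) {
        return url.replace('\\', '/');
    }

    public static void main(String[] args) {
        // The Java literal "images\\logo.gif" is the text images\logo.gif
        System.out.println(fixBackslash("images\\logo.gif")); // images/logo.gif
    }
}
```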
 o getFixBackslash
 public boolean getFixBackslash()
 o setIndentAttributes
 public void setIndentAttributes(boolean IndentAttributes)
IndentAttributes - newline+indent before each attribute

See Also:
IndentAttributes
 o getIndentAttributes
 public boolean getIndentAttributes()
 o setDocType
 public void setDocType(String doctype)
DocType - user-specified doctype: omit | auto | strict | loose | fpi, where the fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for an fpi, include the double quotes in the string.

See Also:
docTypeStr, docTypeMode
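
The fpi form is easy to get wrong, because the double quotes are part of the value passed to setDocType. Below is a minimal sketch of a helper that adds them. The class and method names are hypothetical and not part of this package, and the keyword check is an exact match against the four keywords as written above; whether the library accepts other capitalizations is not stated here.

```java
import java.util.Arrays;
import java.util.List;

public class DocTypeArg {
    // The four keyword modes listed in the setDocType description.
    private static final List<String> KEYWORDS =
            Arrays.asList("omit", "auto", "strict", "loose");

    /** True if the value is one of the keyword modes rather than an fpi. */
    public static boolean isKeyword(String value) {
        return KEYWORDS.contains(value);
    }

    /** Wraps a formal public identifier in double quotes unless already quoted. */
    public static String quoteFpi(String fpi) {
        if (fpi.length() >= 2 && fpi.startsWith("\"") && fpi.endsWith("\"")) {
            return fpi; // already carries its quotes
        }
        return "\"" + fpi + "\"";
    }

    public static void main(String[] args) {
        String arg = quoteFpi("-//ACME//DTD HTML 3.14159//EN");
        System.out.println(arg); // "-//ACME//DTD HTML 3.14159//EN" (quotes included)
        // With the library on the classpath: tidy.setDocType(arg);
    }
}
```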
 o getDocType
 public String getDocType()
 o setLogicalEmphasis
 public void setLogicalEmphasis(boolean LogicalEmphasis)
LogicalEmphasis - replace i by em and b by strong

See Also:
LogicalEmphasis
 o getLogicalEmphasis
 public boolean getLogicalEmphasis()
 o setXmlPIs
 public void setXmlPIs(boolean XmlPIs)
XmlPIs - if true, PIs must end with ?>

See Also:
XmlPIs
 o getXmlPIs
 public boolean getXmlPIs()
 o setInputStreamName
 public void setInputStreamName(String name)
InputStreamName - the name of the input stream (printed in the header information).

 o getInputStreamName
 public String getInputStreamName()
 o parse
 public Node parse(InputStream in,
                   OutputStream out)
Parses InputStream in and returns the root Node. If out is non-null, pretty prints to OutputStream out.
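
Because parse works on streams, callers that hold markup in a String need a small adapter. The sketch below shows one way to wire that up. The parser is passed in as a function so that the example compiles and runs without this library on the classpath. The names ParseAdapterSketch, runOnString and copy are hypothetical. The sketch assumes Java 8 or later and UTF-8 on both sides, whereas the encoding the library actually uses is governed by CharEncoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;

public class ParseAdapterSketch {
    /**
     * Feeds html to the parser as an InputStream and returns whatever the
     * parser wrote to the OutputStream, decoded as UTF-8.
     */
    public static String runOnString(String html,
                                     BiConsumer<InputStream, OutputStream> parser) {
        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        parser.accept(in, out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Stand-in parser: copies the input to the output unchanged. */
    public static void copy(InputStream in, OutputStream out) {
        try {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // With org.w3c.tidy.Tidy available, the stand-in would be replaced by:
        //   Tidy tidy = new Tidy();
        //   tidy.setXHTML(true);
        //   String result = runOnString(html, tidy::parse);
        System.out.println(runOnString("<p>Hello", ParseAdapterSketch::copy));
    }
}
```

The method reference tidy::parse fits the BiConsumer shape because the returned root Node is simply discarded; callers that need the Node should call parse directly.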

 o main
 public static void main(String argv[])
Command line interface to parser and pretty printer.

