public class HtmlToPlainText extends Object
Note that this is a fairly simplistic formatter -- for real-world use you'll want to embrace and extend it.
To invoke from the command line, assuming you've downloaded the jsoup jar to your current directory:
java -cp jsoup.jar org.jsoup.examples.HtmlToPlainText url [selector]
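The `url [selector]` usage line implies one required argument and one optional one. As a rough illustration only, the sketch below shows one way such arguments could be validated and unpacked. `ArgsSketch`, `Parsed`, and `parse` are hypothetical names introduced here, and this is not the actual body of `main`.

```java
final class ArgsSketch {
    /** Hypothetical holder for the parsed arguments. */
    static final class Parsed {
        final String url;       // required first argument
        final String selector;  // optional second argument; null when omitted

        Parsed(String url, String selector) {
            this.url = url;
            this.selector = selector;
        }
    }

    /** Accepts exactly one or two arguments, mirroring "url [selector]". */
    static Parsed parse(String... args) {
        if (args.length < 1 || args.length > 2) {
            throw new IllegalArgumentException("usage: url [selector]");
        }
        String selector = args.length == 2 ? args[1] : null;
        return new Parsed(args[0], selector);
    }

    public static void main(String[] argv) {
        Parsed p = parse("https://example.com/", "div.content");
        System.out.println("url=" + p.url + ", selector=" + p.selector);
    }
}
```

In this sketch a missing selector is represented as `null`, which a caller could treat as "format the whole document".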
Modifier and Type | Class and Description |
---|---|
private class | HtmlToPlainText.FormattingVisitor |
Modifier and Type | Field and Description |
---|---|
private static int | timeout |
private static String | userAgent |
Constructor and Description |
---|
HtmlToPlainText() |
Modifier and Type | Method and Description |
---|---|
String | getPlainText(Element element) Format an Element to plain-text |
static void | main(String... args) |
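To give a feel for how a visitor-style formatter such as the one behind `getPlainText` might turn a tree into lightly formatted text, here is a minimal sketch that does not use jsoup at all. The `Node` type, the `Visitor` interface with `head`/`tail` callbacks, and the formatting rules are assumptions made for illustration; they are not taken from `FormattingVisitor` itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PlainTextSketch {
    /** Hypothetical minimal tree node: either an element (tag) or a text node (text). */
    static final class Node {
        final String tag;   // null for text nodes
        final String text;  // null for element nodes
        final List<Node> children = new ArrayList<>();

        private Node(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }

        static Node element(String tag, Node... kids) {
            Node n = new Node(tag, null);
            n.children.addAll(Arrays.asList(kids));
            return n;
        }

        static Node text(String text) {
            return new Node(null, text);
        }
    }

    /** Hypothetical visitor: head() runs on entering a node, tail() on leaving it. */
    interface Visitor {
        void head(Node node, int depth);
        void tail(Node node, int depth);
    }

    /** Depth-first traversal that calls the visitor around each node's children. */
    static void traverse(Visitor visitor, Node node, int depth) {
        visitor.head(node, depth);
        for (Node child : node.children) {
            traverse(visitor, child, depth + 1);
        }
        visitor.tail(node, depth);
    }

    /** Assumed rules: text is copied, list items get a bullet, p/br/h1-h6 end with a newline. */
    static String toPlainText(Node root) {
        StringBuilder out = new StringBuilder();
        traverse(new Visitor() {
            @Override
            public void head(Node node, int depth) {
                if (node.text != null) {
                    out.append(node.text);
                } else if (node.tag.equals("li")) {
                    out.append("\n * ");
                }
            }

            @Override
            public void tail(Node node, int depth) {
                if (node.tag != null
                        && (node.tag.equals("p") || node.tag.equals("br") || node.tag.matches("h[1-6]"))) {
                    out.append("\n");
                }
            }
        }, root, 0);
        return out.toString();
    }

    public static void main(String[] args) {
        Node doc = Node.element("div",
                Node.element("h1", Node.text("Title")),
                Node.element("p", Node.text("Hello "), Node.element("b", Node.text("world"))),
                Node.element("ul",
                        Node.element("li", Node.text("one")),
                        Node.element("li", Node.text("two"))));
        System.out.print(toPlainText(doc));
    }
}
```

Extending a formatter of this shape mostly means adding cases to `head` and `tail`, for example to emit link targets or table cell separators.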
private static final String userAgent
private static final int timeout
public HtmlToPlainText()
public static void main(String... args) throws IOException
Throws: IOException
public String getPlainText(Element element)
Parameters: element - the root element to format

WebARTS Library Licensed Under the GNU - General Public License. Other Libraries licensed under their respective Open Source Licenses