jsoup: Introduction and Downloads for the Chinese and Chinese-English Bilingual Documentation - 蒲公英云

jsoup documentation download links (including the jar, source code, and pom)

Component	Chinese documentation download link	Chinese-English bilingual documentation download link
jsoup-1.10.3.jar	jsoup-1.10.3-API文档-中文版.zip	jsoup-1.10.3-API文档-中英对照版.zip
jsoup-1.11.3.jar	jsoup-1.11.3-API文档-中文版.zip	jsoup-1.11.3-API文档-中英对照版.zip
jsoup-1.14.3.jar	jsoup-1.14.3-API文档-中文版.zip	jsoup-1.14.3-API文档-中英对照版.zip

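Introduction to jsoup

jsoup: Java HTML Parser

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and for extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors.

jsoup implements the WHATWG HTML5 specification and parses HTML to the same DOM as modern browsers do.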

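Scrape and parse HTML from a URL, file, or string
Find and extract data, using DOM traversal or CSS selectors
Manipulate HTML elements, attributes, and text
Clean user-submitted content against a safelist, to prevent XSS attacks
Output tidy HTML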
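
The capabilities listed above can be sketched in a few lines of Java. This is a minimal illustration, not code taken from the jsoup documentation: it assumes jsoup 1.14.3 is on the classpath, and the class name `JsoupFeatureSketch`, its helper methods, and the sample markup are hypothetical names chosen for this example. Fetching from a URL is left out so the sketch runs without network access.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Safelist;
import org.jsoup.select.Elements;

// Hypothetical demo class; assumes jsoup 1.14.3 on the classpath.
public class JsoupFeatureSketch {

    // Parse an HTML string and collect the text of every <a> element that has an href.
    static List<String> linkTexts(String html) {
        Document doc = Jsoup.parse(html);
        Elements links = doc.select("a[href]"); // CSS selector query
        return links.eachText();
    }

    // Set rel="nofollow" on every link with an href and return the body's inner HTML.
    static String addNofollow(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("a[href]").attr("rel", "nofollow");
        return doc.body().html();
    }

    // Clean untrusted markup against the basic safelist.
    static String sanitize(String untrusted) {
        return Jsoup.clean(untrusted, Safelist.basic());
    }

    public static void main(String[] args) {
        String html = "<p>See <a href=\"https://jsoup.org/\">jsoup</a> and <a>no link</a>.</p>";
        System.out.println(linkTexts(html));
        System.out.println(addNofollow(html));
        System.out.println(sanitize("<p>Hi<script>alert(1)</script></p>"));
    }
}
```

Each helper parses its own copy of the input so that the three steps (select, manipulate, clean) can be read and tried independently.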

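jsoup is designed to deal with all varieties of HTML found in the wild, from pristine and validating to invalid tag soup; in every case jsoup will create a sensible parse tree.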
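
As a rough illustration of the tag-soup handling described above, the sketch below feeds jsoup markup with unclosed tags and inspects the resulting tree. It assumes jsoup 1.14.3 on the classpath; the class name `TagSoupSketch`, the helper `paragraphCount`, and the sample input are made up for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical demo class; assumes jsoup 1.14.3 on the classpath.
public class TagSoupSketch {

    // Count the <p> elements in the tree jsoup builds from the given markup.
    static int paragraphCount(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("p").size();
    }

    public static void main(String[] args) {
        // No closing tags at all, and no <html>, <head>, or <body>.
        String soup = "<p>One<p>Two<b>bold";
        Document doc = Jsoup.parse(soup);

        // Print the normalized body so the repaired structure is visible.
        System.out.println(doc.body().html());
        System.out.println("paragraphs: " + paragraphCount(soup));
    }
}
```

The second `<p>` implicitly closes the first, as it would in a browser, so the parser produces two sibling paragraphs rather than a nested one.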


About the jsoup Chinese and Chinese-English bilingual documentation

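Summary: jsoup, org.jsoup, Chinese documentation, Chinese-English bilingual documentation, download, including the jar, the original API documentation, source code, the Maven dependency information file, the translated API documentation, java.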

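Usage: unzip the translated API documentation and open the "index.html" file in a browser to browse the full documentation.

The translation is written for readability: code and structure in the documentation are left unchanged, while comments and descriptions are translated.

The bilingual edition shows both languages side by side, so you can learn the technology and English at the same time.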

Packages covered

jsoup Java HTML Parser 1.14.3 API

Packages

Package	Description
org.jsoup	Contains the main `Jsoup` class, which provides convenient static access to the jsoup functionality.
org.jsoup.examples	Contains example programs and use of jsoup.
org.jsoup.helper	Package containing classes supporting the core jsoup code.
org.jsoup.internal	Util methods used by Jsoup.
org.jsoup.nodes	HTML document structure nodes.
org.jsoup.parser	Contains the HTML parser, tag specifications, and HTML tokeniser.
org.jsoup.safety	Contains the jsoup HTML cleaner, and safelist definitions.
org.jsoup.select	Packages to support the CSS-style element selector.

Classes covered

All Classes

Class	Description
Attribute	A single key + value attribute.
Attributes	The attributes of an Element.
CDataNode	A Character Data node, to support CDATA sections.
ChangeNotifyingArrayList<E>	Implementation of ArrayList that watches out for changes to the contents.
CharacterReader	CharacterReader consumes tokens off a string.
Cleaner	The safelist based HTML cleaner.
Collector	Collects a list of elements that match the supplied criteria.
CombiningEvaluator	Base combining (and, or) evaluator.
CombiningEvaluator.And
CombiningEvaluator.Or
Comment	A comment node.
Connection	The Connection interface is a convenient HTTP client and session object to fetch content from the web, and parse them into Documents.
Connection.Base<T extends Connection.Base<T>>	Common methods for Requests and Responses
Connection.KeyVal	A Key:Value tuple(+), used for form data.
Connection.Method	GET and POST http methods.
Connection.Request	Represents a HTTP request.
Connection.Response	Represents a HTTP response.
ConstrainableInputStream	A jsoup internal class (so don’t use it as there is no contract API) that enables constraints on an Input Stream, namely a maximum read size, and the ability to Thread.interrupt() the read.
DataNode	A data node, for contents of style, script tags etc, where contents should not show in text().
DataUtil	Internal static utilities for handling data.
Document	A HTML Document.
Document.OutputSettings	A Document’s output settings control the form of the text() and html() methods.
Document.OutputSettings.Syntax	The output serialization syntax.
Document.QuirksMode
DocumentType	A `<!DOCTYPE>` node.
Element	A HTML element consists of a tag name, attributes, and child nodes (including text nodes and other elements).
Elements	A list of `Element`s, with methods that act on every element in the list.
Entities	HTML entities, and escape routines.
Entities.EscapeMode
Evaluator	Evaluates that an element matches the selector.
Evaluator.AllElements	Evaluator for any / all element matching
Evaluator.Attribute	Evaluator for attribute name matching
Evaluator.AttributeKeyPair	Abstract evaluator for attribute name/value matching
Evaluator.AttributeStarting	Evaluator for attribute name prefix matching
Evaluator.AttributeWithValue	Evaluator for attribute name/value matching
Evaluator.AttributeWithValueContaining	Evaluator for attribute name/value matching (value containing)
Evaluator.AttributeWithValueEnding	Evaluator for attribute name/value matching (value ending)
Evaluator.AttributeWithValueMatching	Evaluator for attribute name/value matching (value regex matching)
Evaluator.AttributeWithValueNot	Evaluator for attribute name != value matching
Evaluator.AttributeWithValueStarting	Evaluator for attribute name/value matching (value prefix)
Evaluator.Class	Evaluator for element class
Evaluator.ContainsData	Evaluator for matching Element (and its descendants) data
Evaluator.ContainsOwnText	Evaluator for matching Element’s own text
Evaluator.ContainsText	Evaluator for matching Element (and its descendants) text
Evaluator.CssNthEvaluator
Evaluator.Id	Evaluator for element id
Evaluator.IndexEquals	Evaluator for matching by sibling index number (e = idx)
Evaluator.IndexEvaluator	Abstract evaluator for sibling index matching
Evaluator.IndexGreaterThan	Evaluator for matching by sibling index number (e > idx)
Evaluator.IndexLessThan	Evaluator for matching by sibling index number (e < idx)
Evaluator.IsEmpty
Evaluator.IsFirstChild	Evaluator for matching the first sibling (css :first-child)
Evaluator.IsFirstOfType
Evaluator.IsLastChild	Evaluator for matching the last sibling (css :last-child)
Evaluator.IsLastOfType
Evaluator.IsNthChild	css-compatible Evaluator for :eq (css :nth-child)
Evaluator.IsNthLastChild	css pseudo class :nth-last-child
Evaluator.IsNthLastOfType
Evaluator.IsNthOfType	css pseudo class nth-of-type
Evaluator.IsOnlyChild
Evaluator.IsOnlyOfType
Evaluator.IsRoot	css3 pseudo-class :root
Evaluator.Matches	Evaluator for matching Element (and its descendants) text with regex
Evaluator.MatchesOwn	Evaluator for matching Element’s own text with regex
Evaluator.MatchText
Evaluator.Tag	Evaluator for tag name
Evaluator.TagEndsWith	Evaluator for tag name that ends with
FieldsAreNonnullByDefault
FormElement	A HTML Form Element provides ready access to the form fields/controls that are associated with it.
HtmlToPlainText	HTML to plain-text.
HtmlTreeBuilder	HTML Tree Builder; creates a DOM from Tokens.
HttpConnection	Implementation of `Connection`.
HttpConnection.KeyVal
HttpConnection.Request
HttpConnection.Response
HttpStatusException	Signals that a HTTP request resulted in a not OK HTTP response.
Jsoup	The core public access point to the jsoup functionality.
ListLinks	Example program to list links from a URL.
Node	The base, abstract Node model.
NodeFilter	Node filter interface.
NodeFilter.FilterResult	Filter decision.
NodeTraversor	Depth-first node traversor.
NodeVisitor	Node visitor interface.
NonnullByDefault
Normalizer	Util methods for normalizing strings.
ParseError	A Parse Error records an error in the input HTML that occurs in either the tokenisation or the tree building phase.
ParseErrorList	A container for ParseErrors.
Parser	Parses HTML into a `Document`.
ParseSettings	Controls parser settings, to optionally preserve tag and/or attribute name case.
PseudoTextElement	Represents a `TextNode` as an `Element`, to enable text nodes to be selected with the `Selector` `:matchText` syntax.
QueryParser	Parses a CSS selector into an Evaluator tree.
ReturnsAreNonnullByDefault
Safelist	Safe-lists define what HTML (elements and attributes) to allow through the cleaner.
Selector	CSS-like element selector, that finds elements matching a query.
Selector.SelectorParseException
SerializationException	A SerializationException is raised whenever serialization of a DOM element fails.
StringUtil	A minimal String utility class.
StringUtil.StringJoiner	A StringJoiner allows incremental / filtered joining of a set of stringable objects.
Tag	HTML Tag capabilities.
TextNode	A text node.
TokenQueue	A character queue with parsing helpers.
UncheckedIOException
UnsupportedMimeTypeException	Signals that a HTTP response returned a mime type that is not supported.
Validate	Simple validation methods.
W3CDom	Helper class to transform a `Document` to a `org.w3c.dom.Document`, for integration with toolsets that use the W3C DOM.
W3CDom.W3CBuilder	Implements the conversion by walking the input.
Whitelist	Deprecated. As of release `v1.14.1`, this class is deprecated in favour of `Safelist`.
Wikipedia	A simple example, used on the jsoup website.
XmlDeclaration	An XML Declaration.
XmlTreeBuilder	Use the `XmlTreeBuilder` when you want to parse XML without any of the HTML DOM rules being applied to the document.
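
To make the `XmlTreeBuilder` entry above more concrete, here is a small sketch of parsing XML without HTML DOM rules. It assumes jsoup 1.14.3 on the classpath and goes through `Parser.xmlParser()`, which is backed by the XML tree builder; the class name `XmlParseSketch`, the helper `firstItemName`, and the sample XML are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

// Hypothetical demo class; assumes jsoup 1.14.3 on the classpath.
public class XmlParseSketch {

    // Parse the input as XML and return the "name" attribute of the first <item>,
    // or an empty string when there is no <item> element.
    static String firstItemName(String xml) {
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        Element first = doc.select("item").first();
        return first == null ? "" : first.attr("name");
    }

    public static void main(String[] args) {
        String xml = "<catalog><item name=\"a\"/><item name=\"b\"/></catalog>";
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());

        // With the XML parser no <html>, <head>, or <body> wrapper is added.
        System.out.println(doc.select("html").isEmpty());
        System.out.println(firstItemName(xml));
    }
}
```

The empty string passed as the second argument is the base URI; it only matters when relative URLs in the input need to be resolved.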