# domhandler [![Build Status](https://travis-ci.com/fb55/domhandler.svg?branch=master)](https://travis-ci.com/fb55/domhandler)

The DOM handler creates a tree containing all nodes of a page.
The tree can be manipulated using the [domutils](https://github.com/fb55/domutils)
or [cheerio](https://github.com/cheeriojs/cheerio) libraries and
rendered using [dom-serializer](https://github.com/cheeriojs/dom-serializer).

## Usage

```javascript
const handler = new DomHandler([ <func> callback(err, dom), ] [ <obj> options ]);
// const parser = new Parser(handler[, options]);
```

Available options are described below.

## Example

```javascript
const { Parser } = require("htmlparser2");
const { DomHandler } = require("domhandler");

const rawHtml =
    "Xyz <script language= javascript>var foo = '<<bar>>';</script><!--<!-- Waah! -- -->";
const handler = new DomHandler((error, dom) => {
    if (error) {
        // Handle error
    } else {
        // Parsing completed, do something
        console.log(dom);
    }
});
const parser = new Parser(handler);
parser.write(rawHtml);
parser.end();
```

Output:

```javascript
[
    {
        data: "Xyz ",
        type: "text",
    },
    {
        type: "script",
        name: "script",
        attribs: {
            language: "javascript",
        },
        children: [
            {
                data: "var foo = '<bar>';<",
                type: "text",
            },
        ],
    },
    {
        data: "<!-- Waah! -- ",
        type: "comment",
    },
];
```
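
Because the result is a plain array of nested nodes, it can be walked with ordinary recursion. The sketch below is only an illustration: `collectText` is a hypothetical helper, not part of `domhandler`, and the `dom` array is hand-written to mirror the output above rather than produced by the parser.

```javascript
// Hypothetical helper (not part of domhandler): concatenate the `data` of
// every text node in a tree shaped like the output above.
function collectText(nodes) {
    let text = "";
    for (const node of nodes) {
        if (node.type === "text") {
            text += node.data;
        } else if (Array.isArray(node.children)) {
            // Descend into elements; comments have no children and are skipped.
            text += collectText(node.children);
        }
    }
    return text;
}

// Hand-written nodes mirroring the output above.
const dom = [
    { data: "Xyz ", type: "text" },
    {
        type: "script",
        name: "script",
        attribs: { language: "javascript" },
        children: [{ data: "var foo = '<bar>';<", type: "text" }],
    },
    { data: "<!-- Waah! -- ", type: "comment" },
];

console.log(collectText(dom));
```

For everyday traversal and querying, the libraries mentioned at the top of this document are likely a better fit than a hand-rolled walker.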

## Option: `withStartIndices`

Add a `startIndex` property to nodes.
When the parser is used in a non-streaming fashion, `startIndex` is an integer
indicating the position of the start of the node in the document.
The default value is `false`.

## Option: `withEndIndices`

Add an `endIndex` property to nodes.
When the parser is used in a non-streaming fashion, `endIndex` is an integer
indicating the position of the end of the node in the document.
The default value is `false`.
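
One possible use of the two index options together is mapping a node back to the markup it came from. The sketch below makes several assumptions: `sourceOf` is a hypothetical helper, the nodes are hand-written with illustrative index values instead of coming from the parser, and `endIndex` is taken to point at the node's last character (inclusive). Check that last point against the version you have installed.

```javascript
// Hypothetical helper (not part of domhandler): return the slice of the
// original markup that a node covers, or null if indices were not recorded.
// Assumption: `endIndex` points at the node's last character (inclusive).
function sourceOf(html, node) {
    if (node.startIndex == null || node.endIndex == null) {
        return null;
    }
    return html.slice(node.startIndex, node.endIndex + 1);
}

const html = "<b>hi</b> there";

// Hand-written nodes with illustrative index values for the markup above.
const boldNode = { type: "tag", name: "b", startIndex: 0, endIndex: 8 };
const textNode = { type: "text", data: " there", startIndex: 9, endIndex: 14 };

console.log(sourceOf(html, boldNode));
console.log(JSON.stringify(sourceOf(html, textNode)));
```

With a real handler, the indices would be recorded by passing `{ withStartIndices: true, withEndIndices: true }` as the options argument.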

## Option: `normalizeWhitespace` _(deprecated)_

Replace all whitespace with single spaces.
The default value is `false`.

**Note:** Enabling this might break your markup.

For the following examples, this HTML will be used:

```html
<font>
	<br />this is the text
<font></font></font>
```

### Example: `normalizeWhitespace: true`

```javascript
[
    {
        type: "tag",
        name: "font",
        children: [
            {
                data: " ",
                type: "text",
            },
            {
                type: "tag",
                name: "br",
            },
            {
                data: "this is the text ",
                type: "text",
            },
            {
                type: "tag",
                name: "font",
            },
        ],
    },
];
```

### Example: `normalizeWhitespace: false`

```javascript
[
    {
        type: "tag",
        name: "font",
        children: [
            {
                data: "\n\t",
                type: "text",
            },
            {
                type: "tag",
                name: "br",
            },
            {
                data: "this is the text\n",
                type: "text",
            },
            {
                type: "tag",
                name: "font",
            },
        ],
    },
];
```
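
The difference between the two outputs can be approximated with a plain regular-expression replace on each text node's `data`. This is a sketch of the described behavior ("replace all whitespace with single spaces"), not a claim about how the library implements it; `collapseWhitespace` is a hypothetical name.

```javascript
// Approximation only: collapse every run of whitespace into one space.
function collapseWhitespace(data) {
    return data.replace(/\s+/g, " ");
}

// The two text nodes from the `normalizeWhitespace: false` example.
console.log(JSON.stringify(collapseWhitespace("\n\t")));
console.log(JSON.stringify(collapseWhitespace("this is the text\n")));
```

Since the option is deprecated, applying a transformation like this to the text nodes you care about after parsing may be the more predictable route.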

---

License: BSD-2-Clause

## Security contact information

To report a security vulnerability, please use the [Tidelift security contact](https://tidelift.com/security).
Tidelift will coordinate the fix and disclosure.

## `domhandler` for enterprise

Available as part of the Tidelift Subscription.

The maintainers of `domhandler` and thousands of other packages are working with Tidelift to deliver commercial support and maintenance for the open source dependencies you use to build your applications. Save time, reduce risk, and improve code health, while paying the maintainers of the exact dependencies you use. [Learn more.](https://tidelift.com/subscription/pkg/npm-domhandler?utm_source=npm-domhandler&utm_medium=referral&utm_campaign=enterprise&utm_term=repo)