Crawlomatic Multisite Scraper Post Generator Plugin for WordPress

What can you do with this plugin?

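Crawlomatic Multisite Scraper Post Generator Plugin for WordPress is a cutting-edge autoblogging plugin that uses website crawling and scraping to generate posts and turn your website into an autoblog or even a money-making machine!
Take content from almost any web page! You no longer need APIs that require sign-ups and give limited access, and you can also retrieve data from sites that offer no API. Schedule it once and let it auto-pilot your posts for you 24/7 like a master!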

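How does it work?

The plugin crawls the seed URLs you give it (crawling means it searches each web page for all the links it contains), then visits every crawled URL and extracts its content. The crawl is customizable: you can set the crawl depth, the crawl rate, and the maximum number of crawled articles, restrict crawling to links with a specific class or ID, and more.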
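The crawl described above can be sketched in Python as a breadth-first walk with a depth limit, a page limit, and a link-class filter. This is only an illustration under stated assumptions, not the plugin's own code: the names `crawl`, `LinkCollector`, and `FAKE_SITE` are hypothetical, and the pages come from an in-memory dictionary instead of real HTTP requests so the example runs offline.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, optionally only those with a given class."""

    def __init__(self, required_class=None):
        super().__init__()
        self.required_class = required_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.required_class and self.required_class not in classes:
            return
        if attr.get("href"):
            self.links.append(attr["href"])


def crawl(seed_url, fetch, max_depth=1, max_pages=10,
          required_class=None, delay_seconds=0.0):
    """Breadth-first crawl from seed_url; returns the visited URLs in visit order.

    fetch is any callable mapping a URL to HTML text (or None on failure).
    """
    queue = deque([(seed_url, 0)])
    seen = {seed_url}
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        if delay_seconds > 0:
            time.sleep(delay_seconds)  # crude crawl-rate limit between fetches
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the configured depth
        collector = LinkCollector(required_class)
        collector.feed(html)
        for href in collector.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                queue.append((absolute, depth + 1))
    return visited


# Hypothetical in-memory site standing in for real pages.
FAKE_SITE = {
    "https://example.com/": '<a class="post" href="/a">A</a><a href="/about">About</a>',
    "https://example.com/a": '<a class="post" href="/b">B</a>',
    "https://example.com/b": "<p>leaf</p>",
    "https://example.com/about": "<p>about</p>",
}

pages = crawl("https://example.com/", FAKE_SITE.get, max_depth=1, required_class="post")
print(pages)  # ['https://example.com/', 'https://example.com/a']
```

With `max_depth=1` and the class filter set to `post`, only the seed page and the one matching link are visited; the `/about` link is skipped by the filter and `/b` lies beyond the depth limit.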

Crawlomatic v2.0 update

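In the v2.0 update, a new live scraper shortcode was added to the plugin: [crawlomatic-scraper]. This feature makes the plugin an easy-to-implement web data extractor for WordPress. You can use it to display real-time data from any website directly in your posts, pages, or sidebar. It also caches the scraped content temporarily, so your site does not overuse resources. You can use the plugin to include real-time stock quotes, cricket or soccer scores, or any other generic content from the public domain!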

New features included in this update:

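Scraped output can be displayed through custom template tags, and through shortcodes in pages, posts, and sidebars (via a text widget).
Configurable caching of scraped data. A cache timeout, in minutes, can be defined for each scrape.
Configurable user agent for the scraper, which can be set for each scrape.
Configurable default settings such as enablement, user agent, timeout, cache, and error handling.
Multiple ways to query content: CSS selector, XPath, regex, or auto-detection.
A wide range of arguments for parsing content.
Option to pass POST arguments to the URL being scraped.
On-the-fly conversion of scraped content to a specified character encoding, so you can scrape data from sites that use a different charset.
Scrape pages can be created on the fly from dynamically generated URLs, so that the URL to scrape or the POST arguments depend on your page's GET or POST parameters.
Callback function for advanced parsing of scraped data.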
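The per-scrape cache timeout mentioned in the list above can be illustrated with a small time-to-live cache. This is a minimal Python sketch, assuming an in-memory store; the plugin's actual caching mechanism is not described here, and `ScrapeCache`, `get_or_scrape`, and `fake_scrape` are hypothetical names. A fake clock is injected so the expiry can be shown without waiting.

```python
import time


class ScrapeCache:
    """Keeps scraped results in memory until a per-entry timeout (in minutes) expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example can simulate time passing
        self._entries = {}       # url -> (expires_at_seconds, content)

    def get_or_scrape(self, url, scrape, timeout_minutes):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: serve the cached copy
        content = scrape(url)    # missing or expired: scrape again and store
        self._entries[url] = (now + timeout_minutes * 60, content)
        return content


calls = []


def fake_scrape(url):
    """Hypothetical scraper that records each call instead of using the network."""
    calls.append(url)
    return f"content of {url} (fetch #{len(calls)})"


fake_now = [0.0]
cache = ScrapeCache(clock=lambda: fake_now[0])
url = "https://example.com/quote"

first = cache.get_or_scrape(url, fake_scrape, timeout_minutes=5)
second = cache.get_or_scrape(url, fake_scrape, timeout_minutes=5)  # served from cache
fake_now[0] += 5 * 60 + 1                                          # just past the timeout
third = cache.get_or_scrape(url, fake_scrape, timeout_minutes=5)   # scraped again

print(len(calls))  # 2
```

The second request does not hit the source site at all, which is the point of the cache: a page that embeds live data can be viewed many times while the remote site is fetched only once per timeout window.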

Check the official documentation for the v2 update, browse the examples, and read the FAQ to build a well-optimized web scraper.

More information about the plugin

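You can scrape content from almost every website that opens in a browser. If the content is loaded with JavaScript, the plugin can be combined with PhantomJS to scrape the JavaScript-generated content.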

In addition, you can automatically generate an unlimited number of custom website crawls and scrapes.

Other plugin features:

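v2.5.5 update: automatically update scraped posts/pages/products if the source site changes, and unpublish (set to draft) posts/pages/products if the scraped URL is no longer available on the source site (optional feature, can be enabled or disabled)
v2.5.1 update: scrape WooCommerce product variations from other WooCommerce/Shopify stores
v2.5.0 update: scrape search engine results from Google or Bing for your custom keyword searches. See the tutorial video for this new feature.
v2.4.1 update: scrape product image galleries for WooCommerce products (for non-product post types, post attachments are created from the scraped images)
v2.3.5 update: execute your own JavaScript code on the scraped HTML and scrape the result. This feature is available only when scraping with a headless browser (Puppeteer/Tor/PhantomJS) or with HeadlessBrowserAPI
v2.2.1 update: crawl linked RSS feeds and scrape the articles listed in them
v2.2.0 update: use HeadlessBrowserAPI to scrape JavaScript-generated HTML content from any website on the Internet, without installing anything on your server (apart from this plugin) - tutorial video
v2.1.0 update: use the Tor Browser and Puppeteer to scrape .onion sites from the dark web! - tutorial video
v2.0.0 update: added the Live Scraper shortcode for more crawling control and scraping features: [crawlomatic-scraper]
v1.7.1 update: sitemap crawling support - tutorial video
v1.6.5 update: added Visual Content Selector support - tutorial video
v1.6.0 update: added the ability to take screenshots of scraped pages and use them in the generated post content - tutorial video
v1.5.2 update: ability to shorten outgoing (post source) links, and monetize them, using the Shorte.st link shortening service - shortened link example
v1.4.8 update: added JavaScript execution support for crawled pages. Requires PhantomJS installed on the server - how to install PhantomJS? - tutorial video
v1.4.4 update: added the ability to set multiple proxies for crawling pages. The plugin selects one at random on each page access
v1.4.0 update: added paged crawling (crawling for articles continues on the next page of the seed page).
v1.4.0 update: added the ability to import the prices of scraped products (WooCommerce compatible), plus automatic price modification for dropshipping - tutorial video
v1.4.0 update: added the ability to increase imported product prices by a fixed number or multiply them by a predefined number (great value for dropshipping!)
v1.2.8 update: added paginated post importing support (imported into a single crawled post). See: video.
v1.2.4 update: added the ability to set a proxy for crawling pages
v1.2.3 update: added an option to crawl the page from the Google cache when direct crawling fails (blocked)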
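Google Translate support: choose the language in which you want your posts published
Text spinner support: automatically modify generated text by replacing words with synonyms. Supports the built-in spinner, The Best Spinner, SpinRewriter, WordAI, TurkceSpin, and others. Great SEO value!
Customizable status for generated posts (published, draft, pending, private, trash)
Shortcode to list all posts generated by this plugin: [crawlomatic-list-posts type => 'any', order => 'ASC', 'orderby' => 'date', 'posts' => 50, 'category' => '', 'ruleid' => '']
Crawling and scraping can be set to respect the robots.txt file of websites and the robots HTML header of scraped pages
Automatically generate post categories or tags from Marketplace items
Manually add post categories or tags to items
Choose whether to update a post if it has already been published
Send custom cookies with requests to crawled web pages (authentication)
Generate posts, pages, or any custom post type
Embed videos from YouTube, Vimeo, Flickr, IGN, Ustream.tv, and DailyMotion using website crawling and scraping
Define publishing restrictions: do not publish posts without images, or posts whose title or content is too short or too long
Automatically generate the featured image for posts
Enable or disable comments, pingbacks, or trackbacks for generated posts
Customize post title and content (with a wide variety of relevant post shortcodes)
'Keyword Replacer Tool': lets you define keywords that are automatically replaced with your affiliate links wherever they appear in your site's content. For example, you can define the keyword 'codecanyon' and have it replaced with a link to http://www.codecanyon.net/?ref=user_name wherever it appears in your site's content.
'Random Sentence Generator Tool' (relevant sentences, as defined by you)
Option to automatically delete generated posts after a period of time
Detailed plugin activity logging
Scheduled rule runs
Custom field support for generated posts
Custom taxonomy support for generated posts
Unlimited crawled variable importing (import an unlimited number of parts of the crawled page)
Option to copy images locally or not
Ability to parse JSON data using regex
Option to add a canonical meta tag to generated posts
Maximum/minimum title length limits for posts
Maximum/minimum content length limits for posts
Add posts only if predefined required keywords are found in the title/content
Add posts only if predefined banned keywords are not found in the title/content
Save and restore the plugin's rule list from a file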
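The publishing restrictions in the list above (title and content length limits, required and banned keywords) amount to a simple filter applied before a post is created. The Python sketch below shows one way such a filter could work. It is an illustration only: `PostRules` and `should_publish` are hypothetical names, and it assumes that finding at least one required keyword is enough and that matching is case-insensitive, which the text above does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class PostRules:
    """Hypothetical container for the per-rule publishing limits."""
    min_title_length: int = 0
    max_title_length: int = 10_000
    min_content_length: int = 0
    max_content_length: int = 1_000_000
    required_keywords: list = field(default_factory=list)
    banned_keywords: list = field(default_factory=list)


def should_publish(title, content, rules):
    """Return True only if the post passes every length and keyword restriction."""
    if not rules.min_title_length <= len(title) <= rules.max_title_length:
        return False
    if not rules.min_content_length <= len(content) <= rules.max_content_length:
        return False
    text = f"{title}\n{content}".lower()
    # Assumption: one required keyword anywhere in the title or content is enough.
    if rules.required_keywords and not any(
        keyword.lower() in text for keyword in rules.required_keywords
    ):
        return False
    # Any banned keyword in the title or content rejects the post.
    if any(keyword.lower() in text for keyword in rules.banned_keywords):
        return False
    return True


rules = PostRules(
    min_title_length=10,
    required_keywords=["wordpress"],
    banned_keywords=["casino"],
)

print(should_publish("WordPress scraping tips", "How to schedule a crawl.", rules))  # True
print(should_publish("Short", "WordPress text", rules))                              # False
print(should_publish("WordPress casino bonus", "Some text", rules))                  # False
```

The second post is rejected because its title is shorter than the minimum, and the third because it contains a banned keyword even though the required keyword is present.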

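Testing this plugin

You can test the plugin using the 'Test Site Generator'. There you can try out the plugin's full functionality. Note that the generated test blog is deleted automatically after 24 hours.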

Plugin requirements

PHP DOM -> how to install it (if you don't have it, though you probably already do): http://php.net/manual/en/dom.setup.php
PHP 5.0 or higher
dom, mbstring, iconv, and json extensions (enabled by default)

For more information on how to configure the plugin, also see this one-hour tutorial video, which covers the plugin's full feature set.

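Need support?

Please check our knowledge base; it may contain the answer to your question or a solution to your problem. If it does not, email me at support@coderevolution.ro and I will reply as soon as possible.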

Changelog:

Version 1.0 Release Date: 2017-08-15

First version released!

Version 1.1 Release Date: 2017-08-16

Fixed some small issues

Version 1.2 Release Date: 2017-08-17

Added the ability to crawl pages by div class or ID

Version 1.2.1 Release Date: 2017-08-18

Fixed incompatibility with some WordPress installs

Version 1.2.2 Release Date: 2017-08-22

Added a shortcode to display posts generated by this plugin

Version 1.2.3 Release Date: 2017-08-30

Added an option to crawl the page from Google cache when direct crawling fails (blocked)

Version 1.2.4 Release Date: 2017-08-31

Added the ability to set proxies for crawling pages

Version 1.2.5 Release Date: 2017-09-04

Added canonicalization for generated articles

Version 1.2.6 Release Date: 2017-09-13

Made the plugin timezone aware

Version 1.2.7 Release Date: 2017-09-14

Fixed post date for non-GMT blogs

Version 1.2.8 Release Date: 2017-09-23

Added paginated post importing support

Version 1.2.9 Release Date: 2017-09-27

Bugfixes

Version 1.3.0 Release Date: 2017-09-28

Fixed rule restore

Version 1.3.1 Release Date: 2017-10-20

Fixed featured image generation

Version 1.3.2 Release Date: 2017-10-22

Added crawling helper

Version 1.3.3 Release Date: 2017-11-06

Fixed a memory issue

Version 1.3.4 Release Date: 2017-11-07

Bugfixes

Version 1.3.5 Release Date: 2017-12-14

Fixed class selector not working in all cases

Version 1.3.6 Release Date: 2017-12-18

Added the ability to specify a custom user agent for each crawled webpage

Version 1.3.7 Release Date: 2018-01-20

Added a new text spinner service: SpinRewriter

Version 1.3.8 Release Date: 2018-01-22

Plugin can now continuously import content

Version 1.3.9 Release Date: 2018-02-02

Fixed issue when multiple crawl classes were specified

Version 1.4.0 Release Date: 2018-02-22

Major update: added the ability to crawl imported product prices (WooCommerce compatible)
Added the ability to crawl serial content (paged crawling - crawling for articles will continue on the next page)

Version 1.4.1 Release Date: 2018-03-07

Bugfixes

Version 1.4.2 Release Date: 2018-03-21

Fixed a duplicate posting issue

Version 1.4.3 Release Date: 2018-03-22

Fixed a critical issue with multiple rule running

Version 1.4.4 Release Date: 2018-04-04

Added the ability to define multiple proxies. The plugin will select one at random at each page access

Version 1.4.5 Release Date: 2018-07-13

Updated built-in readability module

Version 1.4.6 Release Date: 2018-07-16

Critical bugfixes

Version 1.4.7 Release Date: 2018-07-19

Added the ability to not translate links

Version 1.4.8 Release Date: 2018-09-05

Added JavaScript execution support for crawled pages - requires PhantomJS installed on server

Version 1.4.9 Release Date: 2018-09-18

Bugfixes

Version 1.5.0 Release Date: 2018-09-24

Added the ability to add custom post taxonomies from crawled content
Added the ability to add unlimited crawled variables to posts' content/meta/taxonomies

Version 1.5.1 Release Date: 2018-10-16

Fixed issue when importing large pages

Version 1.5.2 Release Date: 2018-10-24

Added the ability to shorten links using Shorte.st

Version 1.5.3 Release Date: 2018-10-29

Fixed issue when importing paginated posts

Version 1.5.4 Release Date: 2018-11-06

Added the ability to strip HTML elements by tag name (div, a, span, etc.)

Version 1.5.5 Release Date: 2018-11-07

Added WooCommerce product category creation support

Version 1.5.6 Release Date: 2018-12-16

Added nested importing support - import mixed content into a single post, from multiple plugins created by CodeRevolution

Version 1.5.7 Release Date: 2018-12-16

Added the ability to define a list of URLs to skip from crawling and importing

Version 1.5.8 Release Date: 2019-01-08

Added the ability to import royalty free images for created posts

Version 1.5.9 Release Date: 2019-01-12

Added Gutenberg blocks support

Version 1.6.0 Release Date: 2019-02-01

Added the ability to make screenshots of scraped pages

Version 1.6.1 Release Date: 2019-02-06

Improved compatibility with some crawled pages

Version 1.6.2 Release Date: 2019-04-19

Security update

Version 1.6.3 Release Date: 2019-05-15

Fixed some recently found bugs with post pagination

Version 1.6.4 Release Date: 2019-05-17

Added support for TurkceSpin content spinner

Version 1.6.5 Release Date: 2019-05-27

Added a much demanded new feature: Visual Content Selector for assigning scraped page content
Added the ability to scrape pages from bottom to top
Added the ability to replace words in scraped content
Other minor bug fixes and functionality improvements

Version 1.6.6 Release Date: 2019-07-26

Fixed timeout issue with some crawled pages
Many small issues fixed and features improved

Version 1.6.7 Release Date: 2019-08-05

Fixed issue with Google Translate

Version 1.6.8 Release Date: 2019-11-15

WordPress 5.3 compatibility update

Version 1.6.9 Release Date: 2020-05-11

New features added for content templates
Bugfix update

Version 1.7.0 Release Date: 2020-07-21

Added support for scraping more sites

Version 1.7.1 Release Date: 2020-09-28

Added the ability to crawl sitemaps and to scrape posts linked in them
Added the ability to respect the directives set in the robots.txt files

Version 2.0.0 Release Date: 2020-12-08

Added a new shortcode and Gutenberg block alternative that will enable live scraping of any website
Major performance improvement
Fixed reported bugs

Version 2.1.0 Release Date: 2021-01-02

Added support for using the Tor Browser to crawl dark web sites! Scrape .onion websites like you would scrape any other public website!

Version 2.1.1 Release Date: 2021-01-04

Added the ability to crawl and scrape pages using POST requests (POST form submission scraping support)

Version 2.2.0 Release Date: 2021-01-14

Added support for HeadlessBrowserAPI to scrape JavaScript rendered content with ease

Version 2.2.1 Release Date: 2021-01-16

PHP 8 compatibility update
Added support for crawling links from RSS feeds

Version 2.2.2 Release Date: 2021-01-28

Fixed rare issue when saving importing rule settings on some PHP 8 configurations

Version 2.2.3 Release Date: 2021-02-01

Improved content extraction algorithm

Version 2.2.4 Release Date: 2021-02-17

Added the ability to not spin posts generated by specific rules

Version 2.2.5 Release Date: 2021-03-07

Added the ability to enter multiple URLs (one per line) to be crawled and scraped

Version 2.2.6 Release Date: 2021-03-07

Visual Selector improvements - now it will be able to use HeadlessBrowserAPI/Puppeteer/PhantomJS/Tor to visualize scraped content

Version 2.2.7 Release Date: 2021-04-02

Fixed rare issues when crawling links with URL parameters

Version 2.2.8 Release Date: 2021-04-07

Fixed rare issues with relative URL paths in crawled content

Version 2.2.9 Release Date: 2021-05-03

Added the ability to skip publishing of new posts if no images are found (separately, for each rule)

Version 2.3.0 Release Date: 2021-05-19

Added the ability to make screenshots of websites using the HeadlessBrowserAPI feature

Version 2.3.1 Release Date: 2021-06-10

Fixed content extracting/stripping in case of some websites with dynamically generated content

Version 2.3.2 Release Date: 2021-07-15

Added multiple Regex expression support (for content stripping and replacement)

Version 2.3.3 Release Date: 2021-07-18

Added SpinnerChief to the supported premium text spinners (SpinRewriter, The Best Spinner, WordAI, TurkceSpin)

Version 2.3.4 Release Date: 2021-07-19

Added Bing Translator support (next to Google Translator and DeepL Translator)

Version 2.3.5 Release Date: 2021-08-06

Added the ability to execute your own custom JavaScript on scraped pages when using headless browsers (PhantomJS/Puppeteer/Tor) or HeadlessBrowserAPI (XSS - cross site scripting feature) and scrape the resulting HTML content

Version 2.3.6 Release Date: 2021-08-30

Added the ability to set featured images of posts from website screenshots
Added the ability to remove HTML content (leave text only) of XPath matched content

Version 2.3.7 Release Date: 2021-09-02

Added the ability to set local storage objects when scraping websites (these are similar to cookies, their usage is supported only when using headless browsers or HeadlessBrowserAPI in conjunction with the plugin)

Version 2.3.8 Release Date: 2021-09-15

Added the ability to set the WPML language to created posts

Version 2.3.9 Release Date: 2021-10-19

WooCommerce product scraping related improvements

Version 2.4.0 Release Date: 2022-02-28

Added support for creating WooCommerce product attributes and assigning values to them from scraped data

Version 2.4.1 Release Date: 2022-03-05

Added the ability to scrape image galleries for WooCommerce products

Version 2.4.1.1 Release Date: 2022-03-21

Bugfix update

Version 2.4.2 Release Date: 2022-04-20

Fixed Google Translator problem caused by a recent Google API update

Version 2.5.0 Release Date: 2022-05-01

Crawlomatic can now scrape search engine results from Google and Bing - tutorial video: https://www.youtube.com/watch?v=h6fQeH9-X8c

Version 2.5.1 Release Date: 2022-05-06

Added the ability to scrape WooCommerce product variations from Shopify and other WooCommerce products
Added the ability to automatically detect product prices
Improved readability module
Fixes and improvements

Version 2.5.2 Release Date: 2022-06-14

Added the ability to translate posts a third time (acting like a Word Spinner, if the content is translated back to the original language)

Version 2.5.3 Release Date: 2022-06-23

Fixed WooCommerce price scraping related issue

Version 2.5.4 Release Date: 2022-09-12

Added the ability to scrape links from TXT files

Version 2.5.5 Release Date: 2022-10-14

Major update: post/page/product automatic updating if the scraped source URL changed

Version 2.5.6 Release Date: 2022-11-30

Major update: added support for Google News scraping

Version 2.5.7 Release Date: 2023-01-05

Added a new ability to HeadlessBrowserAPI to click on HTML elements by CSS selectors, enabling loading of Ajax content and bypassing Captchas which require a click

Version 2.5.8 Release Date: 2023-01-17

Added product regular price scraping feature to WooCommerce products - the regular price is the price displayed before the discount is applied. You can scrape this full price from the websites or add/multiply the original price to create it automatically

Version 2.5.9 Release Date: 2023-02-10

Fixed Google News scraping after recent changes

Version 2.6.0 Release Date: 2023-03-13

Added more DeepL languages
Multiline scraping expressions support added
Fixed all reported issues

Version 2.6.0.1 Release Date: 2023-04-13

Fixed reported bugs

Version 2.6.0.2 Release Date: 2023-05-10

Improved scraper auto detection

Version 2.6.0.3 Release Date: 2023-05-22

Fixed more reported bugs

Version 2.6.0.4 Release Date: 2023-06-13

Reworked backend, improved scraping speed

Version 2.6.0.5 Release Date: 2023-06-29

Scraped content now better matches source site styling

Version 2.6.0.6 Release Date: 2023-07-28

Fixed Google Translate integration, working with latest changes

Version 2.6.0.7 Release Date: 2023-10-18

Fixed PHP 8.2 related errors

Version 2.6.1 Release Date: 2024-02-15

Fixed an issue with rule saving

Version 2.6.2 Release Date: 2024-03-15

Visual selector fix for CSS issue happening in some cases

Version 2.6.3 Release Date: 2024-07-12

Bugfix release
Purchase code verification now required for the plugin to function

Version 2.6.4 Release Date: 2024-10-26

Content filtering improvements

Version 2.6.5 Release Date: 2024-10-31

Added support for automatic Magento product variation scraping

Version 2.6.6 Release Date: 2024-12-26

The plugin will detect if the same image was scraped before to the media library and will not scrape the same image twice, but will reuse the existing media library ID

Version 2.6.7 Release Date: 2025-03-28

Fixed reported issues

Version 2.6.8 Release Date: 2025-04-25

Supports WooCommerce product variation scraping from Rank Math JSON data from products

Version 2.6.8.1 Release Date: 2025-05-07

Fixed reported security issue

Version 2.6.8.2 Release Date: 2025-05-16

Fixed reported critical security issue

Version 2.6.9 Release Date: 2025-06-03

Fixed another security issue

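Are you already a customer?

If you have already bought this product and tried it out, please contact me in the item's comments section and give me feedback, so that I can make it a better WordPress plugin!

Tested with WordPress 6.8 and PHP 8.4!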

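Disclaimer
With this plugin you can get content from various websites that do not necessarily belong to you or are not under your control. If you take copyrighted material without the author's permission, the plugin's developers accept no responsibility for your actions. In addition, the plugin's developers have no control over the nature, content, and availability of those websites.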

Do you like our work and want more?
Check out this MEGA plugin bundle.