diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 00000000..9ab3c9b8 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,6 @@ +# https://docs.github.com/en/github/building-a-strong-community/configuring-issue-templates-for-your-repository#configuring-the-template-chooser +blank_issues_allowed: false # We have a blank template which assigns labels +contact_links: + - name: Questions about using feapder? + url: "https://github.com/Boris-code/feapder/discussions" + about: Please see our guide on how to ask questions \ No newline at end of file diff --git a/.github/workflows/workflow.yml b/.github/workflows/workflow.yml new file mode 100644 index 00000000..e69de29b diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000..63d42cb0 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,15 @@ +# 贡献指南 +感谢你的宝贵时间。你的贡献将使这个项目变得更好!在提交贡献之前,请务必花点时间阅读下面的入门指南。 + +## 提交 Pull Request +1. Fork [此仓库](https://github.com/Boris-code/feapder.git), +2. clone到本地,从 `develop` 创建分支,对代码进行更改。 +3. 请确保进行了相应的测试。 +4. 推送代码到自己Fork的仓库中。 +5. 在Fork的仓库中点击 Pull request 链接 +6. 点击「New pull request」按钮。 +7. 填写提交说明后,「Create pull request」。提交到`develop`分支。 + +## License + +[MIT](./LICENSE) diff --git a/README.md b/README.md index 2ce95aec..7bde6250 100644 --- a/README.md +++ b/README.md @@ -16,7 +16,8 @@ 读音: `[ˈfiːpdə]` -![Feapder](https://tva1.sinaimg.cn/large/008vxvgGly1h8byrr75xnj30u02f7k0j.jpg) +![feapder](http://markdown-media.oss-cn-beijing.aliyuncs.com/2023/09/04/feapder.jpg) + ## 文档地址 @@ -35,23 +36,30 @@ From PyPi: -通用版 +精简版 + +```shell +pip install feapder +``` +浏览器渲染版: ```shell -pip3 install feapder +pip install "feapder[render]" ``` 完整版: ```shell -pip3 install feapder[all] +pip install "feapder[all]" ``` -通用版与完整版区别: +三个版本区别: -1. 完整版支持基于内存去重 +1. 精简版:不支持浏览器渲染、不支持基于内存去重、不支持入库mongo +2. 浏览器渲染版:不支持基于内存去重、不支持入库mongo +3. 
完整版:支持所有功能 -完整版可能会安装出错,若安装出错,请参考[安装问题](question/安装问题) +完整版可能会安装出错,若安装出错,请参考[安装问题](docs/question/安装问题.md) ## 小试一下 @@ -99,13 +107,56 @@ FirstSpider|2021-02-09 14:55:14,620|air_spider.py|run|line:80|INFO| 无任务, 1. start_requests: 生产任务 2. parse: 解析数据 + +## 感谢以下代理赞助商 + +### Rapidproxy代理 + + + + + + + + + +### SWIFTPROXY + + + + + + + + + +### NovProxy + + + + + + + + + + +## 参与贡献 + +贡献之前请先阅读 [贡献指南](./CONTRIBUTING.md) + +感谢所有做过贡献的人! + + + + + ## 爬虫工具推荐 1. 爬虫在线工具库:http://www.spidertools.cn 2. 爬虫管理系统:http://feapder.com/#/feapder_platform/feaplat 3. 验证码识别库:https://github.com/sml2h3/ddddocr - ## 微信赞赏 如果您觉得这个项目帮助到了您,您可以帮作者买一杯咖啡表示鼓励 🍹 @@ -121,16 +172,16 @@ FirstSpider|2021-02-09 14:55:14,620|air_spider.py|run|line:80|INFO| 无任务, 知识星球:17321694 作者微信: boris_tm - QQ群号:485067374 + QQ群号:521494615 - + - 加好友备注:feapder \ No newline at end of file + 加好友备注:feapder diff --git a/docs/README.md b/docs/README.md index b9a814d3..08ccb6aa 100644 --- a/docs/README.md +++ b/docs/README.md @@ -16,7 +16,7 @@ 读音: `[ˈfiːpdə]` -![Feapder](https://tva1.sinaimg.cn/large/008vxvgGly1h8byrr75xnj30u02f7k0j.jpg) +![feapder](http://markdown-media.oss-cn-beijing.aliyuncs.com/2023/09/04/feapder.jpg) ## 文档地址 @@ -35,21 +35,29 @@ From PyPi: -通用版 +精简版 ```shell -pip3 install feapder +pip install feapder +``` + +浏览器渲染版: +```shell +pip install "feapder[render]" ``` 完整版: ```shell -pip3 install feapder[all] +pip install "feapder[all]" ``` -通用版与完整版区别: +三个版本区别: + +1. 精简版:不支持浏览器渲染、不支持基于内存去重、不支持入库mongo +2. 浏览器渲染版:不支持基于内存去重、不支持入库mongo +3. 完整版:支持所有功能 -1. 完整版支持基于内存去重 完整版可能会安装出错,若安装出错,请参考[安装问题](question/安装问题) @@ -78,7 +86,7 @@ class FirstSpider(feapder.AirSpider): if __name__ == "__main__": FirstSpider().start() - + ``` 直接运行,打印如下: @@ -107,30 +115,30 @@ FirstSpider|2021-02-09 14:55:14,620|air_spider.py|run|line:80|INFO| 无任务, 3. 验证码识别库:https://github.com/sml2h3/ddddocr -## 微信赞赏 + ## 学习交流 - - - - - - - +
知识星球 (Knowledge Planet): 17321694 Author's WeChat: boris_tm QQ group: 485067374
+ + + + + + - - - -
知识星球 (Knowledge Planet): 17321694 Author's WeChat: boris_tm QQ group: 521494615
-
+ + + + + 加好友备注:feapder \ No newline at end of file diff --git a/docs/_sidebar.md b/docs/_sidebar.md index ef55dce7..bef51b37 100644 --- a/docs/_sidebar.md +++ b/docs/_sidebar.md @@ -38,6 +38,7 @@ * [海量数据去重-dedup](source_code/dedup.md) * [报警及监控](source_code/报警及监控.md) * [监控打点](source_code/监控打点.md) + * [自定义下载器](source_code/custom_downloader.md) * 爬虫管理系统 * [简介及部署](feapder_platform/feaplat.md) diff --git a/docs/feapder_platform/feaplat.md b/docs/feapder_platform/feaplat.md index d69476e2..405f3e0c 100644 --- a/docs/feapder_platform/feaplat.md +++ b/docs/feapder_platform/feaplat.md @@ -26,6 +26,8 @@ ## 功能概览 +暂时不支持 苹果电脑的Apple芯片 + ### 1. 项目管理 添加/编辑项目 @@ -95,11 +97,16 @@ worker节点根据任务动态生成,一个worker只运行一个任务实例 ## 部署 -> 下面部署以centos为例, 其他平台docker安装方式可参考docker官方文档:https://docs.docker.com/compose/install/ +> 安装方式参考docker官方文档:https://docs.docker.com/compose/install/ ### 1. 安装docker -删除旧版本(可选,需要重装升级时执行) +#### 1.1 centos系统 + +> docker --version +> 作者的docker版本为 20.10.12,低于此版本的可能会存在问题 + +删除旧版本(可选,需要重装升级docker时执行) ```shell yum remove docker docker-common docker-selinux docker-engine @@ -118,14 +125,69 @@ yum install -y yum-utils device-mapper-persistent-data lvm2 && python2 /usr/bin/ curl -sSL https://get.daocloud.io/docker | sh ``` +启动docker服务 - -启动 ```shell systemctl enable docker systemctl start docker ``` +验证: 打开终端,输入 + +```shell +docker ps +``` + +#### 1.2 ubuntu系统 + +``` +sudo apt update +sudo apt install docker.io docker-compose +``` + +启动docker服务 + +```shell +sudo systemctl enable docker +sudo systemctl start docker +``` + +验证: 打开终端,输入 + +```shell +sudo docker ps +``` + +#### 1.3 window系统 + +访问下面的链接,下载Docker Desktop, 然后安装即可 + +https://docs.docker.com/desktop/setup/install/windows-install/ + + +运行安装好的Docker Desktop + +验证: 打开cmd终端,输入 + +```shell +docker ps +``` + +#### 1.4 mac系统 + +访问下面的链接,下载Docker Desktop, 然后安装即可 + +https://docs.docker.com/desktop/setup/install/mac-install/ + + +运行安装好的Docker Desktop + +验证: 打开终端,输入 +```shell +docker ps +``` + + ### 2. 
Install docker swarm docker swarm init @@ -133,7 +195,12 @@ systemctl start docker # If your Docker host has multiple network interfaces and therefore multiple IPs, you must specify the IP with --advertise-addr docker swarm init --advertise-addr 192.168.99.100 -### 3. Install docker-compose +### 3. Install docker-compose (optional) +Docker usually ships with docker compose. Run the command below first to check whether it is already available; if it is, no separate installation is needed +``` shell +docker compose +``` +If the `docker compose` command is not available, install it as follows ```shell sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose @@ -144,6 +211,9 @@ sudo chmod +x /usr/local/bin/docker-compose sudo curl -L "https://get.daocloud.io/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose sudo chmod +x /usr/local/bin/docker-compose ``` +After installing, run `docker-compose` to verify that it succeeded + +Note: `docker-compose` and `docker compose` are used the same way and are the same tool; the name simply differs between docker versions ### 4. Deploy the feaplat spider management system #### Prerequisites @@ -153,13 +223,16 @@ yum -y install git ``` #### 1. Download the project +> First pull and run the develop branch with the commands below. +> The master branch does not support urllib3>=2.0 and no longer runs, although existing users are unaffected. Once compatibility has been tested and existing users are confirmed unaffected, develop will be merged into master + github ```shell -git clone https://github.com/Boris-code/feaplat.git +git clone -b develop https://github.com/Boris-code/feaplat.git ``` gitee ```shell -git clone https://gitee.com/Boris-code/feaplat.git +git clone -b develop https://gitee.com/Boris-code/feaplat.git ``` #### 2. 
运行 @@ -168,6 +241,8 @@ git clone https://gitee.com/Boris-code/feaplat.git ```shell cd feaplat +docker compose up -d +或者 docker-compose up -d ``` @@ -242,28 +317,9 @@ docker node ls docker swarm leave ``` -## 拉取私有项目 - -拉取私有项目需在git仓库里添加如下公钥 - -``` -ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCd/k/tjbcMislEunjtYQNXxz5tgEDc/fSvuLHBNUX4PtfmMQ07TuUX2XJIIzLRPaqv3nsMn3+QZrV0xQd545FG1Cq83JJB98ATTW7k5Q0eaWXkvThdFeG5+n85KeVV2W4BpdHHNZ5h9RxBUmVZPpAZacdC6OUSBYTyCblPfX9DvjOk+KfwAZVwpJSkv4YduwoR3DNfXrmK5P+wrYW9z/VHUf0hcfWEnsrrHktCKgohZn9Fe8uS3B5wTNd9GgVrLGRk85ag+CChoqg80DjgFt/IhzMCArqwLyMn7rGG4Iu2Ie0TcdMc0TlRxoBhqrfKkN83cfQ3gDf41tZwp67uM9ZN feapder@qq.com -``` - -或在系统设置页面配置您的SSH私钥,然后在git仓库里添加您的公钥,例如: -![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/10/19/16346353514967.jpg) - -注意,公私钥加密方式为RSA,其他的可能会有问题 +## 使用 -生成RSA公私钥方式如下: -```shell -ssh-keygen -t rsa -C "备注" -f 生成路径/文件名 -``` -如: -`ssh-keygen -t rsa -C "feaplat" -f id_rsa` -然后一路回车,不要输密码 -![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/11/17/16371210640228.jpg) -最终生成 `id_rsa`、`id_rsa.pub` 文件,复制`id_rsa.pub`文件内容到git仓库,复制`id_rsa`文件内容到feaplat爬虫管理系统 +见 [FEAPLAT使用说明](feapder_platform/usage) ## 自定义爬虫镜像 @@ -355,18 +411,18 @@ SPIDER_IMAGE=my_feapder:1.0 ## 学习交流 - - - - - - - +
知识星球 (Knowledge Planet): 17321694 Author's WeChat: boris_tm QQ group: 750614606
+ + + + + + - - - -
知识星球 (Knowledge Planet): 17321694 Author's WeChat: boris_tm QQ group: 521494615
-
- - 加好友备注:feaplat + + + + + + + 加好友备注:feapder diff --git a/docs/feapder_platform/question.md b/docs/feapder_platform/question.md index 15c31f11..78de0f2f 100644 --- a/docs/feapder_platform/question.md +++ b/docs/feapder_platform/question.md @@ -94,7 +94,62 @@ INFLUXDB_PORT_UDP=8089 rm -f /etc/localtime ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime -# 校对时间 +# 校对时间 方式1 clock --hctosys +# 校对时间 方式2 +ntpdate 0.asia.pool.ntp.org ``` - \ No newline at end of file + +## 我搭建了个集群,如何让主节点不跑任务 + +在主节点上执行下面命令,将其设置成drain状态即可 + + docker node update --availability drain 节点id + + ## Network 问题 + +attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded + ![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2023/02/16/16765140608308.jpg) + +1. 确定当前节点是不是Drain节点:docker node ls + + ![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2023/02/16/16765145635622.jpg) + + 是则继续往下看,不是则在评论区留言 + +1. 修复 + + ``` + docker node update --availability active 节点id + docker node update --availability drain 节点id + ``` + +原因是Drain节点,不能为其分配网络资源,需要先改成active,然后启动,之后在改回drain + +**若不是以上情况,可能是network内的可分配的ip满了(老版本feaplat会有这个问题),那么可继续往下看** + +1. 先检查feaplat目录下的docker-compost.yaml,翻到最后,看network相关配置是否为如下。若不是,则改成下面这样的。若下面指定的11 ip段和主机有冲突,可以写12、13等 + + ``` + networks: + default: + name: feaplat + driver: overlay + attachable: true + ipam: + config: + - subnet: 11.0.0.0/8 + gateway: 11.0.0.1 + ``` + + 完整配置见:https://github.com/Boris-code/feaplat/blob/develop/docker-compose.yaml + + +2. 改完后,需要删除之前的network,使其重新创建,命令如下: + + ``` + docker service ls -q | xargs docker service rm # 注意 这个会停止掉所有任务。 + docker network rm feaplat # 删除网络 + docker compose rm # 删除之前feaplat运行环境 + docker compose up -d # 启动 + ``` \ No newline at end of file diff --git a/docs/feapder_platform/usage.md b/docs/feapder_platform/usage.md index 100cd423..20e7bb12 100644 --- a/docs/feapder_platform/usage.md +++ b/docs/feapder_platform/usage.md @@ -31,7 +31,7 @@ 1. 
准备项目,项目结构如下: ![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/10/16/16343707944750.jpg) -2. 压缩后上传: +2. 压缩后上传:(推荐使用 `feapder zip` 命令压缩) ![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/10/16/16343709590040.jpg) - 工作路径:上传的项目会被放到docker里的根目录下(跟你本机项目路径没关系),然后解压运行。因`feapder_demo.zip`解压后为`feapder_demo`,所以工作路径配置`/feapder_demo` - 本项目没依赖,可以不配置`requirements.txt` @@ -44,6 +44,30 @@ ![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/10/16/16343720862217.jpg) 可以看到已经运行完毕 + +## git方式拉取私有项目 + +拉取私有项目需在git仓库里添加如下公钥 + +``` +ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCd/k/tjbcMislEunjtYQNXxz5tgEDc/fSvuLHBNUX4PtfmMQ07TuUX2XJIIzLRPaqv3nsMn3+QZrV0xQd545FG1Cq83JJB98ATTW7k5Q0eaWXkvThdFeG5+n85KeVV2W4BpdHHNZ5h9RxBUmVZPpAZacdC6OUSBYTyCblPfX9DvjOk+KfwAZVwpJSkv4YduwoR3DNfXrmK5P+wrYW9z/VHUf0hcfWEnsrrHktCKgohZn9Fe8uS3B5wTNd9GgVrLGRk85ag+CChoqg80DjgFt/IhzMCArqwLyMn7rGG4Iu2Ie0TcdMc0TlRxoBhqrfKkN83cfQ3gDf41tZwp67uM9ZN feapder@qq.com +``` + +或在系统设置页面配置您的SSH私钥,然后在git仓库里添加您的公钥,例如: +![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/10/19/16346353514967.jpg) + +注意,公私钥加密方式为RSA,其他的可能会有问题 + +生成RSA公私钥方式如下: +```shell +ssh-keygen -t rsa -C "备注" -f 生成路径/文件名 +``` +如: +`ssh-keygen -t rsa -C "feaplat" -f id_rsa` +然后一路回车,不要输密码 +![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/11/17/16371210640228.jpg) +最终生成 `id_rsa`、`id_rsa.pub` 文件,复制`id_rsa.pub`文件内容到git仓库,复制`id_rsa`文件内容到feaplat爬虫管理系统 + ## 爬虫监控 diff --git a/docs/images/aliyun_sale.jpg b/docs/images/aliyun_sale.jpg deleted file mode 100644 index f7b42b1a..00000000 Binary files a/docs/images/aliyun_sale.jpg and /dev/null differ diff --git a/docs/images/qingguo.jpg b/docs/images/qingguo.jpg new file mode 100644 index 00000000..24331df2 Binary files /dev/null and b/docs/images/qingguo.jpg differ diff --git a/docs/index.html b/docs/index.html index a501a519..d1112896 100644 --- a/docs/index.html +++ b/docs/index.html @@ -75,8 +75,8 @@ function (hook) { var header = [ '

', - '', - '阿里云', + '', + '青果代理', '', '

' ].join('') @@ -88,8 +88,8 @@ ].join('') hook.afterEach(function (html) { // var isReadme = window.location.href.indexOf("README"); - var isReadme = 1 // 可以投放广告 - if (isReadme === -1) { + var isReadme = 0 // 可以投放广告 + if (isReadme === 1) { return header + html + footer } else { return html + footer @@ -117,7 +117,7 @@ - + diff --git "a/docs/question/\350\277\220\350\241\214\351\227\256\351\242\230.md" "b/docs/question/\350\277\220\350\241\214\351\227\256\351\242\230.md" index cbc84e3b..ade03f4d 100644 --- "a/docs/question/\350\277\220\350\241\214\351\227\256\351\242\230.md" +++ "b/docs/question/\350\277\220\350\241\214\351\227\256\351\242\230.md" @@ -21,7 +21,7 @@ delete_keys为需要删除的key,类型: 元组/bool/string,支持正则; 常用于清空任务队列,否则重启时会断点续爬,如写成`delete_keys=True`也是可以的 -1. 手动修改任务分数为小于当前时间搓的分数 +1. 手动修改任务分数为小于当前时间戳的分数 ![-w917](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/03/11/16154327722622.jpg) diff --git a/docs/source_code/Item.md b/docs/source_code/Item.md index 3aafe547..e48218b9 100644 --- a/docs/source_code/Item.md +++ b/docs/source_code/Item.md @@ -102,6 +102,26 @@ class SpiderDataItem(Item): self.title = self.title.strip() ``` +## 指定入库使用的pipelines + +```python + +from feapder import Item +from feapder.pipelines.csv_pipeline import CsvPipeline + + +class SpiderDataItem(Item): + + __pipelines__ = [CsvPipeline()] + + def __init__(self, *args, **kwargs): + # self.id = None + self.title = None +``` + +使用__pipelines__指定后,该item只会流经指定的pipelines处理 + + ## 更新数据 采集过程中,往往会有些数据漏采或解析出错,如果我们想更新已入库的数据,可将Item转为UpdateItem diff --git a/docs/source_code/UpdateItem.md b/docs/source_code/UpdateItem.md index a461fad4..3036628a 100644 --- a/docs/source_code/UpdateItem.md +++ b/docs/source_code/UpdateItem.md @@ -1,6 +1,6 @@ # UpdateItem -UpdateItem用于更新数据,继承至Item,所以使用方式基本与Item一致,下载只说不同之处 +UpdateItem用于更新数据,继承至Item,所以使用方式基本与Item一致,下面只说不同之处 ## 更新逻辑 @@ -70,4 +70,4 @@ item = item.to_UpdateItem() item.update_key = "title" ``` -**推荐方式1,直接改Item类,不用修改爬虫代码** \ No newline at end of file 
+**推荐方式1,直接改Item类,不用修改爬虫代码** diff --git a/docs/source_code/custom_downloader.md b/docs/source_code/custom_downloader.md new file mode 100644 index 00000000..eb7c8c05 --- /dev/null +++ b/docs/source_code/custom_downloader.md @@ -0,0 +1,300 @@ +# 自定义下载器 + +下载器一共分为三种:**普通下载器**、**支持保持session的下载器**以及**浏览器渲染下载器**。默认已经在框架中内置,setting中的配置如下 + +``` +DOWNLOADER = "feapder.network.downloader.RequestsDownloader" # 请求下载器 +SESSION_DOWNLOADER = "feapder.network.downloader.RequestsSessionDownloader" +RENDER_DOWNLOADER = "feapder.network.downloader.SeleniumDownloader" # 渲染下载器 +``` + +- session下载器当配置中`USE_SESSION = True`时会启用 +- 渲染下载器当使用浏览器下载功能时会启用 + +这些下载器均为插件的形式,我们可以自定义 + +## 自定义普通下载器 + +1. 编写下载器。如在 `xxx-spider/downloader/my_downloader.py `下自定义了如下下载器 + + ``` + import requests + + from feapder.network.downloader.base import Downloader + from feapder.network.response import Response + + class RequestsDownloader(Downloader): + def download(self, request) -> Response: + response = requests.request( + request.method, request.url, **request.requests_kwargs + ) + # 将requests的response转化为feapder的Response 对象,方便后续解析时使用xpath、re等方法 + response = Response(response) + return response + ``` + + 注:这里返回的response对象不强制要求为是feapder的Response。返回值会传到解析函数的response参数里,若返回的是文本,则接收到的也是文本。 + + 但为了代码可读性,建议将返回值转为feapder的Response后再返回。 + + 转feapder的Response的方式有如下几种 + + ``` + # 方式1 + # response参数为reqeusts的response + Response(response) + + # 方式2 + Response.from_text(text="html内容") + ``` + +2. 在settings中指定下载器 + + ``` + DOWNLOADER = "downloader.my_downloader.RequestsDownloader" + ``` + +## 自定义session下载器 + +1. 
和普通下载器一样,都是继承`Downloader`,如何保持session,可自定义。代码示例 `xxx-spider/downloader/my_downloader.py ` + + ``` + class RequestsSessionDownloader(Downloader): + session = None + + @property + def _session(self): + if not self.__class__.session: + self.__class__.session = requests.Session() + # pool_connections – 缓存的 urllib3 连接池个数 pool_maxsize – 连接池中保存的最大连接数 + http_adapter = HTTPAdapter(pool_connections=1000, pool_maxsize=1000) + # 任何使用该session会话的 HTTP 请求,只要其 URL 是以给定的前缀开头,该传输适配器就会被使用到。 + self.__class__.session.mount("http", http_adapter) + + return self.__class__.session + + def download(self, request) -> Response: + response = self._session.request( + request.method, request.url, **request.requests_kwargs + ) + response = Response(response) + return response + ``` + +2. 在settings中指定下载器 + + ``` + SESSION_DOWNLOADER = "downloader.my_downloader.RequestsSessionDownloader" + ``` + +注意,这里要配置 `SESSION_DOWNLOADER` + +## 自定义浏览器渲染下载器 + +1. 编写下载器 `xxx-spider/downloader/my_downloader.py ` + +**若浏览器框架本身不支持多线程,但想在多线程中使用,如playwright使用,参考如下:** + +``` +import feapder.setting as setting +import feapder.utils.tools as tools +from feapder.network.downloader.base import RenderDownloader +from feapder.network.response import Response +from feapder.utils.webdriver import WebDriverPool, PlaywrightDriver + + +class MyDownloader(RenderDownloader): + webdriver_pool: WebDriverPool = None + + @property + def _webdriver_pool(self): + if not self.__class__.webdriver_pool: + self.__class__.webdriver_pool = WebDriverPool( + **setting.PLAYWRIGHT, driver_cls=PlaywrightDriver, thread_safe=True + ) + + return self.__class__.webdriver_pool + + def download(self, request) -> Response: + # 代理优先级 自定义 > 配置文件 > 随机 + if request.custom_proxies: + proxy = request.get_proxy() + elif setting.PLAYWRIGHT.get("proxy"): + proxy = setting.PLAYWRIGHT.get("proxy") + else: + proxy = request.get_proxy() + + # user_agent优先级 自定义 > 配置文件 > 随机 + if request.custom_ua: + user_agent = request.get_user_agent() + elif 
setting.PLAYWRIGHT.get("user_agent"): + user_agent = setting.PLAYWRIGHT.get("user_agent") + else: + user_agent = request.get_user_agent() + + cookies = request.get_cookies() + url = request.url + render_time = request.render_time or setting.PLAYWRIGHT.get("render_time") + wait_until = setting.PLAYWRIGHT.get("wait_until") or "domcontentloaded" + if request.get_params(): + url = tools.joint_url(url, request.get_params()) + + driver: PlaywrightDriver = self._webdriver_pool.get( + user_agent=user_agent, proxy=proxy + ) + try: + if cookies: + driver.url = url + driver.cookies = cookies + driver.page.goto(url, wait_until=wait_until) + + if render_time: + tools.delay_time(render_time) + + html = driver.page.content() + response = Response.from_dict( + { + "url": driver.page.url, + "cookies": driver.cookies, + "_content": html.encode(), + "status_code": 200, + "elapsed": 666, + "headers": { + "User-Agent": driver.user_agent, + "Cookie": tools.cookies2str(driver.cookies), + }, + } + ) + + response.driver = driver + response.browser = driver + return response + except Exception as e: + self._webdriver_pool.remove(driver) + raise e + + def close(self, driver): + if driver: + self._webdriver_pool.remove(driver) + + def put_back(self, driver): + """ + 释放浏览器对象 + """ + self._webdriver_pool.put(driver) + + def close_all(self): + """ + 关闭所有浏览器 + """ + # 不支持 + # self._webdriver_pool.close() + pass +``` + +这里使用了WebDriverPool,参数`thread_safe=True`,即要保证使用时的线程安全,确保同个浏览器对象只能被同一个线程调用 + +**若浏览器框架本身支持多线程,如selenium,则参考如下** + +``` +import feapder.setting as setting +import feapder.utils.tools as tools +from feapder.network.downloader.base import RenderDownloader +from feapder.network.response import Response +from feapder.utils.webdriver import WebDriverPool, SeleniumDriver + + +class MyDownloader(RenderDownloader): + webdriver_pool: WebDriverPool = None + + @property + def _webdriver_pool(self): + if not self.__class__.webdriver_pool: + self.__class__.webdriver_pool = WebDriverPool( + 
**setting.WEBDRIVER, driver=SeleniumDriver + ) + + return self.__class__.webdriver_pool + + def download(self, request) -> Response: + # 代理优先级 自定义 > 配置文件 > 随机 + if request.custom_proxies: + proxy = request.get_proxy() + elif setting.WEBDRIVER.get("proxy"): + proxy = setting.WEBDRIVER.get("proxy") + else: + proxy = request.get_proxy() + + # user_agent优先级 自定义 > 配置文件 > 随机 + if request.custom_ua: + user_agent = request.get_user_agent() + elif setting.WEBDRIVER.get("user_agent"): + user_agent = setting.WEBDRIVER.get("user_agent") + else: + user_agent = request.get_user_agent() + + cookies = request.get_cookies() + url = request.url + render_time = request.render_time or setting.WEBDRIVER.get("render_time") + if request.get_params(): + url = tools.joint_url(url, request.get_params()) + + browser: SeleniumDriver = self._webdriver_pool.get( + user_agent=user_agent, proxy=proxy + ) + try: + browser.get(url) + if cookies: + browser.cookies = cookies + # 刷新使cookie生效 + browser.get(url) + + if render_time: + tools.delay_time(render_time) + + html = browser.page_source + response = Response.from_dict( + { + "url": browser.current_url, + "cookies": browser.cookies, + "_content": html.encode(), + "status_code": 200, + "elapsed": 666, + "headers": { + "User-Agent": browser.user_agent, + "Cookie": tools.cookies2str(browser.cookies), + }, + } + ) + + response.driver = browser + response.browser = browser + return response + except Exception as e: + self._webdriver_pool.remove(browser) + raise e + + def close(self, driver): + if driver: + self._webdriver_pool.remove(driver) + + def put_back(self, driver): + """ + 释放浏览器对象 + """ + self._webdriver_pool.put(driver) + + def close_all(self): + """ + 关闭所有浏览器 + """ + self._webdriver_pool.close() +``` + +2. 
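Specify the downloader in settings + +``` +RENDER_DOWNLOADER = "downloader.my_downloader.MyDownloader" +``` + +Note: the setting to use here is `RENDER_DOWNLOADER` \ No newline at end of file diff --git a/docs/source_code/pipeline.md b/docs/source_code/pipeline.md index 14dd7455..6a04dbf1 100644 --- a/docs/source_code/pipeline.md +++ b/docs/source_code/pipeline.md @@ -2,11 +2,26 @@ A pipeline is the channel that data flows through on its way into storage; users can define their own to connect to other databases. -The framework has built-in mysql and mongo pipelines; other pipelines are provided as extensions and can be installed as needed from the [feapder_pipelines](https://github.com/Boris-code/feapder_pipelines) project +The framework has built-in mysql, mongo, and csv pipelines; other pipelines are provided as extensions and can be installed as needed from the [feapder_pipelines](https://github.com/Boris-code/feapder_pipelines) project Project URL: https://github.com/Boris-code/feapder_pipelines -## Usage +## Selecting a built-in pipeline + +Enable it in `ITEM_PIPELINES` in the configuration file `setting.py`: + +```python +ITEM_PIPELINES = [ + "feapder.pipelines.mysql_pipeline.MysqlPipeline", + # "feapder.pipelines.mongo_pipeline.MongoPipeline", + # "feapder.pipelines.csv_pipeline.CsvPipeline", + # "feapder.pipelines.console_pipeline.ConsolePipeline", +] +``` + +Items `yield`ed by the spider then flow through the selected pipelines and are stored automatically + +## Custom pipelines Note: items are aggregated into batches before flowing through the pipeline, which makes bulk inserts easier diff --git a/docs/source_code/proxy.md b/docs/source_code/proxy.md index b961ecf0..de87845a 100644 --- a/docs/source_code/proxy.md +++ b/docs/source_code/proxy.md @@ -1,12 +1,13 @@ # Proxy usage -There are two ways to use proxies -1. Use the framework's built-in proxy pool -2. Write your own +There are three ways to use proxies +1. Use the framework's built-in proxy pool +2. Use a custom proxy pool +3. Specify the proxy directly on the request -## 1. The framework's built-in proxy pool +## Method 1. Use the framework's built-in proxy pool -### Basic usage +### Configuring proxies Configure the proxy extraction API in the configuration file @@ -14,9 +15,10 @@ # Proxy settings PROXY_EXTRACT_API = None # Proxy extraction API; returned proxies are separated by \r\n PROXY_ENABLE = True +PROXY_MAX_FAILED_TIMES = 5 # Maximum number of failures per proxy; beyond this the proxy is no longer used and is removed automatically ``` -The API must return proxies in the following format: +The API must return proxies separated by \r\n, in the following format: ``` ip:port @@ -26,13 +28,11 @@ ip:port feapder will then automatically pick one of the proxies above at random for each request -### Advanced +## Managing proxies -> Note: the advanced usage is currently not very friendly; it will be reworked later +1. Remove a proxy (by default a proxy is removed after 5 consecutive failed requests) -1. 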
Mark a proxy as invalid or delay its use - - For example, handle the proxy when an exception occurs + For example, remove the proxy when an exception occurs ```python import feapder @@ -44,49 +44,48 @@ ip:port print(response) def exception_request(self, request, response): - - # request.proxies_pool.tag_proxy(request.requests_kwargs.get("proxies"), -1) # Discard this proxy - request.proxies_pool.tag_proxy(request.requests_kwargs.get("proxies"), 1, 30) # Delay reuse of this proxy by 30 seconds - ``` - -1. Specify the proxy fetch interval and similar options - - Reassign feapder.Request.proxies_pool at the top of your code - - ```python - import feapder - from feapder.network.proxy_pool import ProxyPool - - proxy_pool= ProxyPool(reset_interval_max=180, reset_interval=5) - feapder.Request.proxies_pool = proxy_pool + request.del_proxy() + ``` - This effectively changes the proxy pool's default parameter values; see the source code for more parameters +## Method 2. Use a custom proxy pool -1. Fetch proxies from redis +1. Write the proxy pool: for example, create my_proxypool.py in your project and implement the functions below ```python - import feapder - from feapder.network.proxy_pool import ProxyPool - - proxy_pool = ProxyPool( - proxy_source_url="redis://:passwd@host:ip/db", redis_proxies_key="proxies" - ) - feapder.Request.proxies_pool = proxy_pool + from feapder.network.proxy_pool import BaseProxyPool + + class MyProxyPool(BaseProxyPool): + def get_proxy(self): + """ + Get a proxy + Returns: + {"http": "xxx", "https": "xxx"} + """ + pass + + def del_proxy(self, proxy): + """ + @summary: Remove a proxy + --------- + @param proxy: xxx + """ + pass ``` - - Redis must store the proxies in a zset; example contents: + +2. Update the proxy configuration in setting + ``` - ip:port - ip:port - ip:port + PROXY_POOL = "my_proxypool.MyProxyPool" # Proxy pool ``` - redis_proxies_key is the key under which the proxies are stored; every fetch pulls the full set + Register the proxy pool you wrote here; the value is the class's module path and must include the class name + + -## 2. Write your own +## Method 3. 
Without a proxy pool, assign a proxy directly to the request -Writing your own is more flexible: pick a proxy at random yourself and assign it to the request, for example in the download middleware +Simply assign to request.proxies, for example in the download middleware ```python import feapder @@ -96,7 +95,7 @@ class TestProxy(feapder.AirSpider): yield feapder.Request("https://www.baidu.com") def download_midware(self, request): - # Just pick a proxy at random here + # Set the proxy to use here request.proxies = {"https": "https://ip:port", "http": "http://ip:port"} return request diff --git "a/docs/source_code/\346\212\245\350\255\246\345\217\212\347\233\221\346\216\247.md" "b/docs/source_code/\346\212\245\350\255\246\345\217\212\347\233\221\346\216\247.md" index 023bd06f..87dbc695 100644 --- "a/docs/source_code/\346\212\245\350\255\246\345\217\212\347\233\221\346\216\247.md" +++ "b/docs/source_code/\346\212\245\350\255\246\345\217\212\347\233\221\346\216\247.md" @@ -1,5 +1,7 @@ # Alerts and monitoring +Supports DingTalk, Feishu, WeCom (Enterprise WeChat), and email alerts + ## DingTalk alerts Requirement: a DingTalk group, and the Webhook URL of a DingTalk bot @@ -10,15 +12,19 @@ ![-w547](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/03/27/16167753030324.jpg) +Alternatively, use the signing method and then set the secret in setting + Related configuration: ```python # DingTalk alerts DINGDING_WARNING_URL = "" # DingTalk bot API DINGDING_WARNING_PHONE = "" # Alert recipients; supports a list, so several can be specified +DINGDING_WARNING_ALL = False # Whether to notify everyone; defaults to False +DINGDING_WARNING_SECRET = None # Signing secret ``` -## WeChat alerts +## WeCom (Enterprise WeChat) alerts Requirement: a WeCom group, and the Webhook URL of a WeCom bot @@ -39,6 +45,17 @@ WECHAT_WARNING_PHONE = "" # Alert recipient; will be @-mentioned in the group, supports a list, WECHAT_WARNING_ALL = False # Whether to notify everyone; defaults to False ``` +## Feishu alerts + +See the documentation for setting up a bot: https://open.feishu.cn/document/ukTMukTMukTM/ucTM5YjL3ETO24yNxkjN#e1cdee9f + +Then change the following settings in feapder's setting file + +``` +FEISHU_WARNING_URL = "" # Feishu bot API +FEISHU_WARNING_USER = None # Alert recipients: {"open_id":"ou_xxxxx", "name":"xxxx"} or [{"open_id":"ou_xxxxx", "name":"xxxx"}] +FEISHU_WARNING_ALL = False # Whether to notify everyone; defaults to False +``` ## Email alerts @@ -69,6 +86,20 @@ EMAIL_RECEIVER = "" # Recipients; supports a list, so several can be specified 4. 
将本邮箱账号添加到白名单中 +## Qmsg酱报警 + +Qmsg酱是一个QQ消息推送机器人,用来通知自己消息的免费服务。 + +可以参考文档:https://qmsg.zendee.cn/docs/api/ + +```python +# QMSG报警 +QMSG_WARNING_URL = "" # qmsg机器人api +QMSG_WARNING_QQ = "" # 指定要接收消息的QQ号或者QQ群。多个以英文逗号分割,例如:12345,12346,支持列表,可指定多人 +QMSG_WARNING_BOT = "" # 机器人的QQ号 +``` + + ## 报警间隔及报警级别 框架会对相同的报警进行过滤,防止刷屏,默认的报警时间间隔为1小时,可通过以下配置修改: diff --git "a/docs/source_code/\346\265\217\350\247\210\345\231\250\346\270\262\346\237\223-Selenium.md" "b/docs/source_code/\346\265\217\350\247\210\345\231\250\346\270\262\346\237\223-Selenium.md" index 665f5aed..089f9537 100644 --- "a/docs/source_code/\346\265\217\350\247\210\345\231\250\346\270\262\346\237\223-Selenium.md" +++ "b/docs/source_code/\346\265\217\350\247\210\345\231\250\346\270\262\346\237\223-Selenium.md" @@ -4,7 +4,7 @@ 框架内置一个浏览器渲染池,默认的池子大小为1,请求时重复利用浏览器实例,只有当代理失效请求异常时,才会销毁、创建一个新的浏览器实例 -内置浏览器渲染支持 **CHROME** 、**PHANTOMJS**、**FIREFOX** +内置浏览器渲染支持 **CHROME**、**EDGE**、**PHANTOMJS**、**FIREFOX** ## 使用方式: @@ -14,7 +14,7 @@ def start_requests(self): ``` 在返回的Request中传递`render=True`即可 -框架支持`CHROME`、`PHANTOMJS`、`FIREFOX` 三种浏览器渲染,可通过[配置文件](source_code/配置文件)进行配置。相关配置如下: +框架支持`CHROME`、`EDGE`、`PHANTOMJS`、`FIREFOX` 三种浏览器渲染,可通过[配置文件](source_code/配置文件)进行配置。相关配置如下: ```python # 浏览器渲染 @@ -24,7 +24,7 @@ WEBDRIVER = dict( user_agent=None, # 字符串 或 无参函数,返回值为user_agent proxy=None, # xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 headless=False, # 是否为无头浏览器 - driver_type="CHROME", # CHROME 、PHANTOMJS、FIREFOX + driver_type="CHROME", # CHROME、EDGE、PHANTOMJS、FIREFOX timeout=30, # 请求超时时间 window_size=(1024, 800), # 窗口大小 executable_path=None, # 浏览器路径,默认为默认路径 @@ -80,7 +80,7 @@ def download_midware(self, request): } return request ``` - + ## 设置Cookie 通过 `feapder.Request`携带,如: @@ -219,7 +219,7 @@ class TestRender(feapder.AirSpider): user_agent=None, # 字符串 或 无参函数,返回值为user_agent proxy=None, # xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 headless=False, # 是否为无头浏览器 - driver_type="CHROME", # CHROME、PHANTOMJS、FIREFOX + driver_type="CHROME", # CHROME、EDGE、PHANTOMJS、FIREFOX 
timeout=30, # 请求超时时间 window_size=(1024, 800), # 窗口大小 executable_path=None, # 浏览器路径,默认为默认路径 diff --git "a/docs/source_code/\351\205\215\347\275\256\346\226\207\344\273\266.md" "b/docs/source_code/\351\205\215\347\275\256\346\226\207\344\273\266.md" index 547a6d16..e22be333 100644 --- "a/docs/source_code/\351\205\215\347\275\256\346\226\207\344\273\266.md" +++ "b/docs/source_code/\351\205\215\347\275\256\346\226\207\344\273\266.md" @@ -69,7 +69,7 @@ # user_agent=None, # 字符串 或 无参函数,返回值为user_agent # proxy=None, # xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 # headless=False, # 是否为无头浏览器 -# driver_type="CHROME", # CHROME、PHANTOMJS、FIREFOX +# driver_type="CHROME", # CHROME、EDGE、PHANTOMJS、FIREFOX # timeout=30, # 请求超时时间 # window_size=(1024, 800), # 窗口大小 # executable_path=None, # 浏览器路径,默认为默认路径 @@ -202,10 +202,10 @@ ```python import feapder - - + + class SpiderTest(feapder.AirSpider): __custom_setting__ = dict( SPIDER_MAX_RETRY_TIMES=20, ) -``` \ No newline at end of file +``` diff --git a/docs/usage/AirSpider.md b/docs/usage/AirSpider.md index 08c14185..71ac053c 100644 --- a/docs/usage/AirSpider.md +++ b/docs/usage/AirSpider.md @@ -243,7 +243,7 @@ def start_requests(self): ``` 在返回的Request中传递`render=True`即可 -框架支持`CHROME`和`PHANTOMJS`两种浏览器渲染,可通过[配置文件](source_code/配置文件)进行配置。相关配置如下: +框架支持`CHROME`、`EDGE`和`PHANTOMJS`浏览器渲染,可通过[配置文件](source_code/配置文件)进行配置。相关配置如下: ```python # 浏览器渲染 @@ -253,7 +253,7 @@ WEBDRIVER = dict( user_agent=None, # 字符串 或 无参函数,返回值为user_agent proxy=None, # xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 headless=False, # 是否为无头浏览器 - driver_type="CHROME", # CHROME 或 PHANTOMJS, + driver_type="CHROME", # CHROME、EDGE或PHANTOMJS, timeout=30, # 请求超时时间 window_size=(1024, 800), # 窗口大小 executable_path=None, # 浏览器路径,默认为默认路径 @@ -282,7 +282,7 @@ class AirSpeedTest(feapder.AirSpider): return request, response def parse(self, request, response): - print(response) + print(response) if __name__ == "__main__": @@ -314,7 +314,25 @@ class AirSpeedTest(feapder.AirSpider): print(title) ``` -## 15. 
完整的代码示例 +## 15. 主动停止爬虫 + +``` +import feapder + + +class AirTest(feapder.AirSpider): + def start_requests(self): + yield feapder.Request("http://www.baidu.com") + + def parse(self, request, response): + self.stop_spider() # 停止爬虫,可以在任意地方调用该方法 + + +if __name__ == "__main__": + AirTest().start() +``` + +## 16. 完整的代码示例 AirSpider:https://github.com/Boris-code/feapder/blob/master/tests/air-spider/test_air_spider.py diff --git a/docs/usage/TaskSpider.md b/docs/usage/TaskSpider.md index 719f6481..5978dff9 100644 --- a/docs/usage/TaskSpider.md +++ b/docs/usage/TaskSpider.md @@ -31,6 +31,7 @@ from feapder import ArgumentParser class TaskSpiderTest(feapder.TaskSpider): # 自定义数据库,若项目中有setting.py文件,此自定义可删除 + # redis 必须,mysql可选 __custom_setting__ = dict( REDISDB_IP_PORTS="localhost:6379", REDISDB_USER_PASS="", @@ -43,7 +44,7 @@ class TaskSpiderTest(feapder.TaskSpider): ) def add_task(self): - # 加种子任务 + # 加种子任务 框架会调用这个函数,方便往redis里塞任务,但不能写成死循环。实际业务中可以自己写个脚本往redis里塞任务 self._redisdb.zadd(self._task_table, {"id": 1, "url": "https://www.baidu.com"}) def start_requests(self, task): @@ -69,7 +70,6 @@ def start(args): task_keys=["id", "url"], # 表里查询的字段 redis_key="test:task_spider", # redis里做任务队列的key keep_alive=True, # 是否常驻 - delete_keys=True, # 重启时是否删除redis里的key,若想断点续爬,设置False ) if args == 1: spider.start_monitor_task() @@ -86,7 +86,7 @@ def start2(args): task_table_type="redis", # 任务表类型为redis redis_key="test:task_spider", # redis里做任务队列的key keep_alive=True, # 是否常驻 - delete_keys=True, # 重启时是否删除redis里的key,若想断点续爬,设置False + use_mysql=False, # 若用不到mysql,可以不使用 ) if args == 1: spider.start_monitor_task() diff --git a/feapder/VERSION b/feapder/VERSION index ff2fd4fb..7b0231f5 100644 --- a/feapder/VERSION +++ b/feapder/VERSION @@ -1 +1 @@ -1.8.5 \ No newline at end of file +1.9.3 \ No newline at end of file diff --git a/feapder/buffer/item_buffer.py b/feapder/buffer/item_buffer.py index 874dcefa..35f9bb01 100644 --- a/feapder/buffer/item_buffer.py +++ b/feapder/buffer/item_buffer.py @@ -52,15 
+52,28 @@ def __init__(self, redis_key, task_table=None): # 'table_name': ['id', 'name'...] # 缓存table_name与__update_key__的关系 } + self._item_pipelines = { + # 'table_name': ['pipeline1', 'pipeline2'] # 缓存table_name与pipelines的关系 + } + self._pipelines = self.load_pipelines() self._have_mysql_pipeline = MYSQL_PIPELINE_PATH in setting.ITEM_PIPELINES self._mysql_pipeline = None if setting.ITEM_FILTER_ENABLE and not self.__class__.dedup: - self.__class__.dedup = Dedup( - to_md5=False, **setting.ITEM_FILTER_SETTING - ) + if setting.ITEM_FILTER_SETTING.get( + "filter_type" + ) == Dedup.BloomFilter or setting.ITEM_FILTER_SETTING.get("name"): + self.__class__.dedup = Dedup( + to_md5=False, **setting.ITEM_FILTER_SETTING + ) + else: + self.__class__.dedup = Dedup( + to_md5=False, + name=self._redis_key, + **setting.ITEM_FILTER_SETTING, + ) # 导出重试的次数 self.export_retry_times = 0 @@ -208,7 +221,7 @@ def __pick_items(self, items, is_update_item=False): 将每个表之间的数据分开 拆分后 原items为空 @param items: @param is_update_item: - @return: + @return: 表名与数据的字典 """ datas_dict = { # 'table_name': [{}, {}] @@ -223,22 +236,24 @@ def __pick_items(self, items, is_update_item=False): if not table_name: table_name = item.table_name self._item_tables[item_name] = table_name + self._item_pipelines[table_name] = item.pipelines + + if is_update_item and table_name not in self._item_update_keys: + self._item_update_keys[table_name] = item.update_key if table_name not in datas_dict: datas_dict[table_name] = [] datas_dict[table_name].append(item.to_dict) - if is_update_item and table_name not in self._item_update_keys: - self._item_update_keys[table_name] = item.update_key - return datas_dict - def __export_to_db(self, table, datas, is_update=False, update_keys=()): - for pipeline in self._pipelines: + def __export_to_db(self, table, datas, is_update=False, update_keys=(), used_pipelines=None): + pipelines = used_pipelines or self._pipelines # 优先采用指定的pipelines + for pipeline in pipelines: if is_update: if table 
== self._task_table and not isinstance( - pipeline, MysqlPipeline + pipeline, MysqlPipeline ): continue @@ -258,7 +273,7 @@ def __export_to_db(self, table, datas, is_update=False, update_keys=()): # 若是任务表, 且上面的pipeline里没mysql,则需调用mysql更新任务 if not self._have_mysql_pipeline and is_update and table == self._task_table: if not self.mysql_pipeline.update_items( - table, datas, update_keys=update_keys + table, datas, update_keys=update_keys ): log.error( f"{self.mysql_pipeline.__class__.__name__} 更新数据失败. table: {table} items: {datas}" @@ -269,7 +284,7 @@ def __export_to_db(self, table, datas, is_update=False, update_keys=()): return True def __add_item_to_db( - self, items, update_items, requests, callbacks, items_fingerprints + self, items, update_items, requests, callbacks, items_fingerprints ): export_success = True self._is_adding_to_db = True @@ -278,7 +293,7 @@ def __add_item_to_db( if setting.ITEM_FILTER_ENABLE: items, items_fingerprints = self.__dedup_items(items, items_fingerprints) - # 分捡 + # 分捡(返回值包含 pipelines_dict) items_dict = self.__pick_items(items) update_items_dict = self.__pick_items(update_items, is_update_item=True) @@ -286,6 +301,7 @@ def __add_item_to_db( failed_items = {"add": [], "update": [], "requests": []} while items_dict: table, datas = items_dict.popitem() + used_pipelines = self._item_pipelines.get(table) log.debug( """ @@ -296,13 +312,14 @@ def __add_item_to_db( % (table, tools.dumps_json(datas, indent=16)) ) - if not self.__export_to_db(table, datas): + if not self.__export_to_db(table, datas, used_pipelines=used_pipelines): export_success = False failed_items["add"].append({"table": table, "datas": datas}) # 执行批量update while update_items_dict: table, datas = update_items_dict.popitem() + used_pipelines = self._item_pipelines.get(table) log.debug( """ @@ -315,7 +332,7 @@ def __add_item_to_db( update_keys = self._item_update_keys.get(table) if not self.__export_to_db( - table, datas, is_update=True, update_keys=update_keys + table, datas, 
is_update=True, update_keys=update_keys, used_pipelines=used_pipelines ): export_success = False failed_items["update"].append( diff --git a/feapder/buffer/request_buffer.py b/feapder/buffer/request_buffer.py index 22366e24..70677a94 100644 --- a/feapder/buffer/request_buffer.py +++ b/feapder/buffer/request_buffer.py @@ -28,14 +28,16 @@ def __init__(self, db=None, dedup_name: str = None): self._db = db or MemoryDB() if not self.__class__.dedup and setting.REQUEST_FILTER_ENABLE: - if dedup_name: + if setting.REQUEST_FILTER_SETTING.get( + "filter_type" + ) == Dedup.BloomFilter or setting.REQUEST_FILTER_SETTING.get("name"): self.__class__.dedup = Dedup( - name=dedup_name, to_md5=False, **setting.REQUEST_FILTER_SETTING - ) # 默认使用内存去重 + to_md5=False, **setting.REQUEST_FILTER_SETTING + ) else: self.__class__.dedup = Dedup( - to_md5=False, **setting.REQUEST_FILTER_SETTING - ) # 默认使用内存去重 + to_md5=False, name=dedup_name, **setting.REQUEST_FILTER_SETTING + ) def is_exist_request(self, request): if ( diff --git a/feapder/commands/cmdline.py b/feapder/commands/cmdline.py index cb2a3187..91d0531e 100644 --- a/feapder/commands/cmdline.py +++ b/feapder/commands/cmdline.py @@ -11,6 +11,7 @@ import re import sys from os.path import dirname, join +import os import requests @@ -77,6 +78,9 @@ def check_new_version(): if new_version: version = f"feapder=={VERSION.replace('-beta', 'b')}" tip = NEW_VERSION_TIP.format(version=version, new_version=new_version) + # 修复window下print不能带颜色输出的问题 + if os.name == "nt": + os.system("") print(tip) except Exception as e: pass diff --git a/feapder/commands/create/create_table.py b/feapder/commands/create/create_table.py index 2358da7f..15162782 100644 --- a/feapder/commands/create/create_table.py +++ b/feapder/commands/create/create_table.py @@ -141,8 +141,9 @@ def create(self, table_name): unique=unique, ) print(sql) - - if self._db.execute(sql): + result=self._db.execute(sql) + # 建立表成功。受影响的行数为 0,因此返回0 + if result==0: print("\n%s 创建成功" % table_name) 
print("注意手动检查下字段类型,确保无误!!!") else: diff --git a/feapder/core/base_parser.py b/feapder/core/base_parser.py index 6264b5ae..a06f9c44 100644 --- a/feapder/core/base_parser.py +++ b/feapder/core/base_parser.py @@ -13,6 +13,9 @@ from feapder.db.mysqldb import MysqlDB from feapder.network.item import UpdateItem from feapder.utils.log import log +from feapder.network.request import Request +from feapder.network.response import Response +from feapder.utils.perfect_dict import PerfectDict class BaseParser(object): @@ -26,7 +29,7 @@ def start_requests(self): pass - def download_midware(self, request): + def download_midware(self, request: Request): """ @summary: 下载中间件 可修改请求的一些参数, 或可自定义下载,然后返回 request, response --------- @@ -37,7 +40,7 @@ def download_midware(self, request): pass - def validate(self, request, response): + def validate(self, request: Request, response: Response): """ @summary: 校验函数, 可用于校验response是否正确 若函数内抛出异常,则重试请求 @@ -53,7 +56,7 @@ def validate(self, request, response): pass - def parse(self, request, response): + def parse(self, request: Request, response: Response): """ @summary: 默认的解析函数 --------- @@ -65,7 +68,7 @@ def parse(self, request, response): pass - def exception_request(self, request, response, e): + def exception_request(self, request: Request, response: Response, e: Exception): """ @summary: 请求或者parser里解析出异常的request --------- @@ -78,7 +81,7 @@ def exception_request(self, request, response, e): pass - def failed_request(self, request, response, e): + def failed_request(self, request: Request, response: Response, e: Exception): """ @summary: 超过最大重试次数的request 可返回修改后的request 若不返回request,则将传进来的request直接人redis的failed表。否则将修改后的request入failed表 @@ -135,7 +138,7 @@ def add_task(self): @result: """ - def start_requests(self, task): + def start_requests(self, task: PerfectDict): """ @summary: --------- diff --git a/feapder/core/collector.py b/feapder/core/collector.py index 4e063a7b..5b8ff652 100644 --- a/feapder/core/collector.py +++ 
b/feapder/core/collector.py @@ -63,7 +63,7 @@ def __input_data(self): current_timestamp = tools.get_current_timestamp() - # 取任务,只取当前时间搓以内的任务,同时将任务分数修改为 current_timestamp + setting.REQUEST_LOST_TIMEOUT + # 取任务,只取当前时间戳以内的任务,同时将任务分数修改为 current_timestamp + setting.REQUEST_LOST_TIMEOUT requests_list = self._db.zrangebyscore_set_score( self._tab_requests, priority_min="-inf", diff --git a/feapder/core/handle_failed_items.py b/feapder/core/handle_failed_items.py index 09f1b95a..655330f5 100644 --- a/feapder/core/handle_failed_items.py +++ b/feapder/core/handle_failed_items.py @@ -58,7 +58,7 @@ def reput_failed_items_to_db(self): for _data in datas: item = UpdateItem(**_data) item.table_name = table - item.update_keys = update_keys + item.update_key = update_keys self._item_buffer.put_item(item) total_count += 1 diff --git a/feapder/core/parser_control.py b/feapder/core/parser_control.py index 381a6e8a..021d2956 100644 --- a/feapder/core/parser_control.py +++ b/feapder/core/parser_control.py @@ -38,6 +38,8 @@ class ParserControl(threading.Thread): _failed_task_count = 0 _total_task_count = 0 + _hook_parsers = set() + def __init__(self, collector, redis_key, request_buffer, item_buffer): super(ParserControl, self).__init__() self._parsers = [] @@ -133,7 +135,7 @@ def deal_request(self, request): ) ) used_download_midware_enable = True - if not response: + if response is None: response = ( request_temp.get_response() if not setting.RESPONSE_CACHED_USED @@ -236,6 +238,8 @@ def deal_request(self, request): self.record_download_status( ParserControl.DOWNLOAD_EXCEPTION, parser.name ) + if request.retry_times % setting.PROXY_MAX_FAILED_TIMES == 0: + request.del_proxy() else: # 记录解析程序异常 @@ -431,21 +435,19 @@ def stop(self): def add_parser(self, parser: BaseParser): # 动态增加parser.exception_request和parser.failed_request的参数, 兼容旧版本 - if len(inspect.getfullargspec(parser.exception_request).args) == 3: - _exception_request = parser.exception_request - - def exception_request(request, 
response, e): - return _exception_request(request, response) - - parser.exception_request = exception_request - - if len(inspect.getfullargspec(parser.failed_request).args) == 3: - _failed_request = parser.failed_request - - def failed_request(request, response, e): - return _failed_request(request, response) + if parser not in self.__class__._hook_parsers: + self.__class__._hook_parsers.add(parser) + if len(inspect.getfullargspec(parser.exception_request).args) == 3: + _exception_request = parser.exception_request + parser.exception_request = ( + lambda request, response, e: _exception_request(request, response) + ) - parser.failed_request = failed_request + if len(inspect.getfullargspec(parser.failed_request).args) == 3: + _failed_request = parser.failed_request + parser.failed_request = lambda request, response, e: _failed_request( + request, response + ) self._parsers.append(parser) @@ -543,7 +545,7 @@ def deal_request(self, request): ) request = request_temp - if not response: + if response is None: response = ( request.get_response() if not setting.RESPONSE_CACHED_USED @@ -611,6 +613,8 @@ def deal_request(self, request): self.record_download_status( ParserControl.DOWNLOAD_EXCEPTION, parser.name ) + if request.retry_times % setting.PROXY_MAX_FAILED_TIMES == 0: + request.del_proxy() else: # 记录解析程序异常 diff --git a/feapder/core/scheduler.py b/feapder/core/scheduler.py index 011c42d9..0177d185 100644 --- a/feapder/core/scheduler.py +++ b/feapder/core/scheduler.py @@ -17,8 +17,8 @@ from feapder.buffer.request_buffer import RequestBuffer from feapder.core.base_parser import BaseParser from feapder.core.collector import Collector -from feapder.core.handle_failed_requests import HandleFailedRequests from feapder.core.handle_failed_items import HandleFailedItems +from feapder.core.handle_failed_requests import HandleFailedRequests from feapder.core.parser_control import ParserControl from feapder.db.redisdb import RedisDB from feapder.network.item import Item @@ -26,6 
+26,7 @@ from feapder.utils import metrics from feapder.utils.log import log from feapder.utils.redis_lock import RedisLock +from feapder.utils.tail_thread import TailThread SPIDER_START_TIME_KEY = "spider_start_time" SPIDER_END_TIME_KEY = "spider_end_time" @@ -33,7 +34,7 @@ HEARTBEAT_TIME_KEY = "heartbeat_time" -class Scheduler(threading.Thread): +class Scheduler(TailThread): __custom_setting__ = {} def __init__( @@ -122,8 +123,7 @@ def __init__( setattr(setting, "SPIDER_THREAD_COUNT", thread_count) self._thread_count = setting.SPIDER_THREAD_COUNT - self._spider_name = redis_key - self._project_name = redis_key.split(":")[0] + self._spider_name = self.name self._task_table = task_table self._tab_spider_status = setting.TAB_SPIDER_STATUS.format(redis_key=redis_key) @@ -137,9 +137,6 @@ def __init__( self._stop_heartbeat = False # 是否停止心跳 self._redisdb = RedisDB() - self._project_total_state_table = "{}_total_state".format(self._project_name) - self._is_exist_project_total_state_table = False - # Request 缓存设置 Request.cached_redis_key = redis_key Request.cached_expire_time = setting.RESPONSE_CACHED_EXPIRE_TIME @@ -155,6 +152,8 @@ def __init__( # 重置丢失的任务 self.reset_task() + self._stop_spider = False + def init_metrics(self): """ 初始化打点系统 @@ -176,7 +175,7 @@ def run(self): while True: try: - if self.all_thread_is_done(): + if self._stop_spider or self.all_thread_is_done(): if not self._is_notify_end: self.spider_end() # 跑完一轮 self._is_notify_end = True @@ -196,15 +195,13 @@ def run(self): tools.delay_time(1) # 1秒钟检查一次爬虫状态 def __add_task(self): - # 启动parser 的 start_requests - self.spider_begin() # 不自动结束的爬虫此处只能执行一遍 - # 判断任务池中属否还有任务,若有接着抓取 todo_task_count = self._collector.get_requests_count() if todo_task_count: log.info("检查到有待做任务 %s 条,不重下发新任务,将接着上回异常终止处继续抓取" % todo_task_count) else: for parser in self._parsers: + # 启动parser 的 start_requests results = parser.start_requests() # 添加request到请求队列,由请求队列统一入库 if results and not isinstance(results, Iterable): @@ -237,6 +234,8 @@ def 
__add_task(self): self._item_buffer.flush() def _start(self): + self.spider_begin() + # 将失败的item入库 if setting.RETRY_FAILED_ITEMS: handle_failed_items = HandleFailedItems( @@ -487,8 +486,9 @@ def spider_end(self): spand_time = tools.get_current_timestamp() - begin_timestamp - msg = "《%s》爬虫结束,耗时 %s" % ( + msg = "《%s》爬虫%s,采集耗时 %s" % ( self._spider_name, + "被终止" if self._stop_spider else "结束", tools.format_seconds(spand_time), ) log.info(msg) @@ -586,3 +586,6 @@ def reset_task(self, heartbeat_interval=10): lose_count = len(datas) if lose_count: log.info("重置丢失任务完毕,共{}条".format(len(datas))) + + def stop_spider(self): + self._stop_spider = True diff --git a/feapder/core/spiders/air_spider.py b/feapder/core/spiders/air_spider.py index d2ef4868..70c30112 100644 --- a/feapder/core/spiders/air_spider.py +++ b/feapder/core/spiders/air_spider.py @@ -8,8 +8,6 @@ @email: boris_liu@foxmail.com """ -from threading import Thread - import feapder.setting as setting import feapder.utils.tools as tools from feapder.buffer.item_buffer import ItemBuffer @@ -20,9 +18,10 @@ from feapder.network.request import Request from feapder.utils import metrics from feapder.utils.log import log +from feapder.utils.tail_thread import TailThread -class AirSpider(BaseParser, Thread): +class AirSpider(BaseParser, TailThread): __custom_setting__ = {} def __init__(self, thread_count=None): @@ -41,11 +40,12 @@ def __init__(self, thread_count=None): self._memory_db = MemoryDB() self._parser_controls = [] - self._item_buffer = ItemBuffer(redis_key="air_spider") + self._item_buffer = ItemBuffer(redis_key=self.name) self._request_buffer = AirSpiderRequestBuffer( db=self._memory_db, dedup_name=self.name ) + self._stop_spider = False metrics.init(**setting.METRICS_OTHER_ARGS) def distribute_task(self): @@ -97,7 +97,7 @@ def run(self): while True: try: - if self.all_thread_is_done(): + if self._stop_spider or self.all_thread_is_done(): # 停止 parser_controls for parser_control in self._parser_controls: 
parser_control.stop() @@ -108,7 +108,10 @@ def run(self): # 关闭webdirver Request.render_downloader and Request.render_downloader.close_all() - log.info("无任务,爬虫结束") + if self._stop_spider: + log.info("爬虫被终止") + else: + log.info("无任务,爬虫结束") break except Exception as e: @@ -130,3 +133,6 @@ def join(self, timeout=None): return super().join() + + def stop_spider(self): + self._stop_spider = True diff --git a/feapder/core/spiders/batch_spider.py b/feapder/core/spiders/batch_spider.py index edbc2918..6b2ae092 100644 --- a/feapder/core/spiders/batch_spider.py +++ b/feapder/core/spiders/batch_spider.py @@ -1002,7 +1002,7 @@ def run(self): while True: try: - if ( + if self._stop_spider or ( self.task_is_done() and self.all_thread_is_done() ): # redis全部的任务已经做完 并且mysql中的任务已经做完(检查各个线程all_thread_is_done,防止任务没做完,就更新任务状态,导致程序结束的情况) if not self._is_notify_end: diff --git a/feapder/core/spiders/spider.py b/feapder/core/spiders/spider.py index a2a726e4..a1097559 100644 --- a/feapder/core/spiders/spider.py +++ b/feapder/core/spiders/spider.py @@ -184,7 +184,7 @@ def run(self): while True: try: - if self.all_thread_is_done(): + if self._stop_spider or self.all_thread_is_done(): if not self._is_notify_end: self.spider_end() # 跑完一轮 self._is_notify_end = True diff --git a/feapder/core/spiders/task_spider.py b/feapder/core/spiders/task_spider.py index 603988fd..41cb3596 100644 --- a/feapder/core/spiders/task_spider.py +++ b/feapder/core/spiders/task_spider.py @@ -50,6 +50,7 @@ def __init__( delete_keys=(), keep_alive=None, batch_interval=0, + use_mysql=True, **kwargs, ): """ @@ -91,6 +92,7 @@ def __init__( @param task_condition: 任务条件 用于从一个大任务表中挑选出数据自己爬虫的任务,即where后的条件语句 @param task_order_by: 取任务时的排序条件 如 id desc @param batch_interval: 抓取时间间隔 默认为0 天为单位 多次启动时,只有当前时间与第一次抓取结束的时间间隔大于指定的时间间隔时,爬虫才启动 + @param use_mysql: 是否使用mysql数据库 --------- @result: """ @@ -109,7 +111,7 @@ def __init__( ) self._redisdb = RedisDB() - self._mysqldb = MysqlDB() + self._mysqldb = MysqlDB() if use_mysql else None 
self._task_table = task_table # mysql中的任务表 self._task_keys = task_keys # 需要获取的任务字段 @@ -516,7 +518,7 @@ def run(self): while True: try: - if ( + if self._stop_spider or ( self.all_thread_is_done() and self.task_is_done() and self.related_spider_is_done() diff --git a/feapder/db/mongodb.py b/feapder/db/mongodb.py index e826b2bb..791fe0d9 100644 --- a/feapder/db/mongodb.py +++ b/feapder/db/mongodb.py @@ -12,7 +12,7 @@ from urllib import parse import pymongo -from pymongo import MongoClient +from pymongo import MongoClient, UpdateOne, UpdateMany from pymongo.collection import Collection from pymongo.database import Database from pymongo.errors import DuplicateKeyError, BulkWriteError @@ -23,30 +23,33 @@ class MongoDB: def __init__( - self, - ip=None, - port=None, - db=None, - user_name=None, - user_pass=None, - url=None, - **kwargs, + self, + ip=None, + port=None, + db=None, + user_name=None, + user_pass=None, + url=None, + **kwargs, ): + if not ip: + ip = setting.MONGO_IP + if not port: + port = setting.MONGO_PORT + if not db: + db = setting.MONGO_DB + if not user_name: + user_name = setting.MONGO_USER_NAME + if not user_pass: + user_pass = setting.MONGO_USER_PASS + if not url: + url = setting.MONGO_URL + if url: self.client = MongoClient(url, **kwargs) else: - if not ip: - ip = setting.MONGO_IP - if not port: - port = setting.MONGO_PORT - if not db: - db = setting.MONGO_DB - if not user_name: - user_name = setting.MONGO_USER_NAME - if not user_pass: - user_pass = setting.MONGO_USER_PASS self.client = MongoClient( - host=ip, port=port, username=user_name, password=user_pass + host=ip, port=port, username=user_name, password=user_pass, **kwargs ) self.db = self.get_database(db) @@ -94,7 +97,7 @@ def get_collection(self, coll_name, **kwargs) -> Collection: return self.db.get_collection(coll_name, **kwargs) def find( - self, coll_name: str, condition: Optional[Dict] = None, limit: int = 0, **kwargs + self, coll_name: str, condition: Optional[Dict] = None, limit: int = 0, 
**kwargs ) -> List[Dict]: """ @summary: @@ -133,13 +136,13 @@ def find( return dataset def add( - self, - coll_name, - data: Dict, - replace=False, - update_columns=(), - update_columns_value=(), - insert_ignore=False, + self, + coll_name, + data: Dict, + replace=False, + update_columns=(), + update_columns_value=(), + insert_ignore=False, ): """ 添加单条数据 @@ -195,13 +198,13 @@ def add( return affect_count def add_batch( - self, - coll_name: str, - datas: List[Dict], - replace=False, - update_columns=(), - update_columns_value=(), - condition_fields: dict = None, + self, + coll_name: str, + datas: List[Dict], + replace=False, + update_columns=(), + update_columns_value=(), + condition_fields: dict = None, ): """ 批量添加数据 @@ -331,6 +334,70 @@ def update(self, coll_name, data: Dict, condition: Dict, upsert: bool = False): else: return True + def update_many(self, coll_name, data: Dict, condition: Dict, upsert: bool = False): + """ + 批量更新 + Args: + coll_name: 集合名 + data: 单条数据 {"xxx":"xxx"} + condition: 更新条件 {"_id": "xxxx"} + upsert: 数据不存在则插入,默认为 False + + Returns: True / False + """ + try: + collection = self.get_collection(coll_name) + collection.update_many(condition, {"$set": data}, upsert=upsert) + except Exception as e: + log.error( + """ + error:{} + condition: {} + """.format( + e, condition + ) + ) + return False + else: + return True + + def update_batch( + self, + coll_name: str, + update_data_list: List[Dict], + condition_field: str, + upsert: bool = False, + ): + """ + 批量更新数据 + Args: + coll_name: 集合名 + update_data_list: 更新数据列表 + condition_field: 更新条件字段 + upsert: 数据不存在则插入,默认为 False + + Returns: 更新行数 + + """ + if not update_data_list: + return 0 + + collection = self.get_collection(coll_name) + bulk_operations = [] + + for update_data in update_data_list: + condition = {condition_field: update_data.get(condition_field)} + update_operation = UpdateMany( + condition, {"$set": update_data}, upsert=upsert + ) + bulk_operations.append(update_operation) + try: + result 
= collection.bulk_write(bulk_operations, ordered=False) + return result.modified_count + result.upserted_count + except BulkWriteError as e: + log.error(f"Bulk write error: {e.details}") + return 0 + def delete(self, coll_name, condition: Dict) -> bool: """ 删除 @@ -401,7 +468,7 @@ def get_index_key(self, coll_name, index_name): return index_keys def __get_update_condition( - self, coll_name: str, data: dict, duplicate_errmsg: str + self, coll_name: str, data: dict, duplicate_errmsg: str ) -> dict: """ 根据索引冲突的报错信息 获取更新条件 @@ -420,3 +487,15 @@ def __get_update_condition( def __getattr__(self, name): return getattr(self.db, name) + + +if __name__ == "__main__": + update_data_list = [{"_id": "1", "status": 1}, {"_id": "2", "status": 1}] + mongo = MongoDB() + updated_count = mongo.update_batch("your_table_name", update_data_list, "_id") + print(f"Updated {updated_count} documents.") + + id_list = ["1", "2"] + result = mongo.update_many( + "your_table_name", {"status": 1}, {"_id": {"$in": id_list}} + ) diff --git a/feapder/db/mysqldb.py b/feapder/db/mysqldb.py index 5677a8fa..9043bafe 100644 --- a/feapder/db/mysqldb.py +++ b/feapder/db/mysqldb.py @@ -41,7 +41,7 @@ def wapper(*args, **kwargs): class MysqlDB: def __init__( - self, ip=None, port=None, db=None, user_name=None, user_pass=None, **kwargs + self, ip=None, port=None, db=None, user_name=None, user_pass=None, charset="utf8mb4", set_session=None, **kwargs ): # 可能会改setting中的值,所以此处不能直接赋值为默认值,需要后加载赋值 if not ip: @@ -68,8 +68,10 @@ def __init__( user=user_name, passwd=user_pass, db=db, - charset="utf8mb4", + charset=charset, + setsession=set_session, cursorclass=cursors.SSCursor, + **kwargs ) # cursorclass 使用服务的游标,默认的在多线程下大批量插入数据会使内存递增 except Exception as e: @@ -83,7 +85,7 @@ def __init__( user_pass: {} exception: {} """.format( - ip, port, db, user_name, user_pass, e + ip, port, db, user_name, user_pass, charset, e ) ) else: @@ -117,7 +119,9 @@ def from_url(cls, url, **kwargs): "user_pass": url_parsed.password.strip(), 
"db": url_parsed.path.strip("/").strip(), } - + # 解析 query 字符串参数,比如 ?charset=utf8 + query_params = dict(parse.parse_qsl(url_parsed.query)) + connect_params.update(query_params) connect_params.update(kwargs) return cls(**connect_params) @@ -190,7 +194,7 @@ def find(self, sql, limit=0, to_json=False, conver_col=True): else: result = cursor.fetchall() - if to_json: + if to_json and result: columns = [i[0] for i in cursor.description] # 处理数据 @@ -198,7 +202,7 @@ def convert(col): if isinstance(col, (datetime.date, datetime.time)): return str(col) elif isinstance(col, str) and ( - col.startswith("{") or col.startswith("[") + col.startswith("{") or col.startswith("[") ): try: # col = self.unescape_string(col) @@ -269,12 +273,13 @@ def add_smart(self, table, data: Dict, **kwargs): sql = make_insert_sql(table, data, **kwargs) return self.add(sql) - def add_batch(self, sql, datas: List[Dict]): + def add_batch(self, sql, datas: List[List]): """ @summary: 批量添加数据 --------- - @ param sql: insert ignore into (xxx,xxx) values (%s, %s, %s) - # param datas: 列表 [{}, {}, {}] + @ param sql: insert ignore into (xxx,xxx,xxx) values (%s, %s, %s) + @ param datas: 列表 [[v1,v2,v3], [v1,v2,v3]] + 列表里的值要和插入的key的顺序对应上 --------- @result: 添加行数 """ @@ -299,7 +304,7 @@ def add_batch(self, sql, datas: List[Dict]): return affect_count - def add_batch_smart(self, table, datas: List[Dict], **kwargs): + def add_batch_smart(self, table, datas: List[Dict], **kwargs) -> int: """ 批量添加数据, 直接传递list格式的数据,不用拼sql Args: @@ -313,12 +318,13 @@ def add_batch_smart(self, table, datas: List[Dict], **kwargs): sql, datas = make_batch_sql(table, datas, **kwargs) return self.add_batch(sql, datas) - def update(self, sql): + def update(self, sql) -> int: + affect_count = None conn, cursor = None, None try: conn, cursor = self.get_connection() - cursor.execute(sql) + affect_count = cursor.execute(sql) conn.commit() except Exception as e: log.error( @@ -328,13 +334,12 @@ def update(self, sql): """ % (e, sql) ) - return False - 
else: - return True finally: self.close_connection(conn, cursor) - def update_smart(self, table, data: Dict, condition): + return affect_count + + def update_smart(self, table, data: Dict, condition) -> int: """ 更新, 不用拼sql Args: @@ -342,25 +347,26 @@ def update_smart(self, table, data: Dict, condition): data: 数据 {"xxx":"xxx"} condition: 更新条件 where后面的条件,如 condition='status=1' - Returns: True / False + Returns: 影响行数 """ sql = make_update_sql(table, data, condition) return self.update(sql) - def delete(self, sql): + def delete(self, sql) -> int: """ 删除 Args: sql: - Returns: True / False + Returns: 影响行数 """ + affect_count = None conn, cursor = None, None try: conn, cursor = self.get_connection() - cursor.execute(sql) + affect_count = cursor.execute(sql) conn.commit() except Exception as e: log.error( @@ -370,17 +376,24 @@ def delete(self, sql): """ % (e, sql) ) - return False - else: - return True finally: self.close_connection(conn, cursor) - def execute(self, sql): + return affect_count + + def execute(self, sql) -> int: + """ + + Args: + sql: + + Returns: 影响行数 + """ + affect_count = None conn, cursor = None, None try: conn, cursor = self.get_connection() - cursor.execute(sql) + affect_count = cursor.execute(sql) conn.commit() except Exception as e: log.error( @@ -390,8 +403,7 @@ def execute(self, sql): """ % (e, sql) ) - return False - else: - return True finally: self.close_connection(conn, cursor) + + return affect_count diff --git a/feapder/db/redisdb.py b/feapder/db/redisdb.py index a30e0576..d882e687 100644 --- a/feapder/db/redisdb.py +++ b/feapder/db/redisdb.py @@ -6,16 +6,15 @@ --------- @author: Boris """ - +import os import time +from typing import Union, List import redis -from redis._compat import unicode, long, basestring from redis.connection import Encoder as _Encoder from redis.exceptions import ConnectionError, TimeoutError from redis.exceptions import DataError from redis.sentinel import Sentinel -from rediscluster import RedisCluster import 
feapder.setting as setting from feapder.utils.log import log @@ -34,19 +33,19 @@ def encode(self, value): # ) elif isinstance(value, float): value = repr(value).encode() - elif isinstance(value, (int, long)): + elif isinstance(value, int): # python 2 repr() on longs is '123L', so use str() instead value = str(value).encode() elif isinstance(value, (list, dict, tuple)): - value = unicode(value) - elif not isinstance(value, basestring): + value = str(value) + elif not isinstance(value, str): # a value we don't know how to deal with. throw an error typename = type(value).__name__ raise DataError( "Invalid input of type: '%s'. Convert to a " "bytes, string, int or float first." % typename ) - if isinstance(value, unicode): + if isinstance(value, str): value = value.encode(self.encoding, self.encoding_errors) return value @@ -87,6 +86,8 @@ def __init__( user_pass = setting.REDISDB_USER_PASS if service_name is None: service_name = setting.REDISDB_SERVICE_NAME + if kwargs is None: + kwargs = setting.REDISDB_KWARGS self._is_redis_cluster = False @@ -156,6 +157,12 @@ def get_connect(self): ) else: + try: + from rediscluster import RedisCluster + except ModuleNotFoundError as e: + log.error('请安装 pip install "feapder[all]"') + os._exit(0) + # log.debug("使用redis集群模式") self._redis = RedisCluster( startup_nodes=startup_nodes, @@ -180,7 +187,7 @@ def get_connect(self): self._is_redis_cluster = False else: self._redis = redis.StrictRedis.from_url( - self._url, decode_responses=self._decode_responses + self._url, decode_responses=self._decode_responses, **self._kwargs ) self._is_redis_cluster = False @@ -583,18 +590,17 @@ def zexists(self, table, values): return is_exists def lpush(self, table, values): - if isinstance(values, list): pipe = self._redis.pipeline() if not self._is_redis_cluster: pipe.multi() for value in values: - pipe.rpush(table, value) + pipe.lpush(table, value) pipe.execute() else: - return self._redis.rpush(table, values) + return self._redis.lpush(table, 
values) def lpop(self, table, count=1): """ @@ -738,27 +744,41 @@ def hget_count(self, table): def hkeys(self, table): return self._redis.hkeys(table) - def setbit(self, table, offsets, values): + def hvals(self, key): + return self._redis.hvals(key) + + def setbit( + self, table, offsets: Union[int, List[int]], values: Union[int, List[int]] + ): """ - 设置字符串数组某一位的值, 返回之前的值 - @param table: + 设置字符串数组某一位的值,返回之前的值 + @param table: Redis key @param offsets: 支持列表或单个值 @param values: 支持列表或单个值 @return: list / 单个值 """ if isinstance(offsets, list): - if not isinstance(values, list): - values = [values] * len(offsets) + if isinstance(values, int): + # 使用lua脚本,数据是一起传给redis的,降低了网络开销,但redis会阻塞 + script = """ + local value = table.remove(ARGV, 1) + local offsets = ARGV + local results = {} + for i, offset in ipairs(offsets) do + results[i] = redis.call('SETBIT', KEYS[1], offset, value) + end + return results + """ + return self._redis.eval(script, 1, table, values, *offsets) else: assert len(offsets) == len(values), "offsets值要与values值一一对应" + pipe = self._redis.pipeline() + pipe.multi() - pipe = self._redis.pipeline() - pipe.multi() - - for offset, value in zip(offsets, values): - pipe.setbit(table, offset, value) + for offset, value in zip(offsets, values): + pipe.setbit(table, offset, value) - return pipe.execute() + return pipe.execute() else: return self._redis.setbit(table, offsets, values) @@ -785,6 +805,20 @@ def bitcount(self, table): return self._redis.bitcount(table) def strset(self, table, value, **kwargs): + """ + 设置键值 + Args: + table: + value: + **kwargs: + ex: Union[None, int, timedelta] = ..., 设置键的过期时间为 second 秒 + px: Union[None, int, timedelta] = ..., 设置键的过期时间为 millisecond 毫秒 + nx: bool = ..., 只有键不存在时,才对键进行设置操作 + xx: bool = ..., 只有键已经存在时,才对键进行设置操作 + keepttl: bool = ..., 保留键的过期时间 + Returns: + + """ return self._redis.set(table, value, **kwargs) def str_incrby(self, table, value): diff --git a/feapder/dedup/bitarray.py b/feapder/dedup/bitarray.py index 
6d77719a..348ceb46 100644 --- a/feapder/dedup/bitarray.py +++ b/feapder/dedup/bitarray.py @@ -48,7 +48,7 @@ def __init__(self, num_bits): import bitarray except Exception as e: raise Exception( - "需要安装feapder完整版\ncommand: pip install feapder[all]\n若安装出错,参考:https://feapder.com/#/question/%E5%AE%89%E8%A3%85%E9%97%AE%E9%A2%98" + '需要安装feapder完整版\ncommand: pip install "feapder[all]"\n若安装出错,参考:https://feapder.com/#/question/%E5%AE%89%E8%A3%85%E9%97%AE%E9%A2%98' ) self.num_bits = num_bits @@ -127,7 +127,18 @@ def set(self, offsets, values): @param values: 支持列表或单个值 @return: list / 单个值 """ - return self.redis_db.setbit(self.name, offsets, values) + # 对offsets进行分片,最大100000个 + results = [] + batch_size = 170000 + for i in range(0, len(offsets), batch_size): + results.extend( + self.redis_db.setbit( + self.name, + offsets[i : i + batch_size], + values[i : i + batch_size] if isinstance(values, list) else values, + ) + ) + return results def get(self, offsets): return self.redis_db.getbit(self.name, offsets) diff --git a/feapder/network/downloader/__init__.py b/feapder/network/downloader/__init__.py index 9c7cc20f..f036271e 100644 --- a/feapder/network/downloader/__init__.py +++ b/feapder/network/downloader/__init__.py @@ -1,4 +1,12 @@ from ._requests import RequestsDownloader from ._requests import RequestsSessionDownloader -from ._selenium import SeleniumDownloader -from ._playwright import PlaywrightDownloader + +# 下面是非必要依赖 +try: + from ._selenium import SeleniumDownloader +except ModuleNotFoundError: + pass +try: + from ._playwright import PlaywrightDownloader +except ModuleNotFoundError: + pass diff --git a/feapder/network/downloader/_playwright.py b/feapder/network/downloader/_playwright.py index 3b5a7838..facc75cd 100644 --- a/feapder/network/downloader/_playwright.py +++ b/feapder/network/downloader/_playwright.py @@ -58,7 +58,8 @@ def download(self, request) -> Response: if cookies: driver.url = url driver.cookies = cookies - driver.page.goto(url, wait_until=wait_until) 
+ http_response = driver.page.goto(url, wait_until=wait_until) + status_code = http_response.status if render_time: tools.delay_time(render_time) @@ -69,7 +70,7 @@ def download(self, request) -> Response: "url": driver.page.url, "cookies": driver.cookies, "_content": html.encode(), - "status_code": 200, + "status_code": status_code, "elapsed": 666, "headers": { "User-Agent": driver.user_agent, diff --git a/feapder/network/item.py b/feapder/network/item.py index dd961f10..33eae79c 100644 --- a/feapder/network/item.py +++ b/feapder/network/item.py @@ -9,6 +9,7 @@ """ import re +from typing import List import feapder.utils.tools as tools @@ -20,12 +21,14 @@ def __new__(cls, name, bases, attrs): attrs.setdefault("__name_underline__", None) attrs.setdefault("__update_key__", None) attrs.setdefault("__unique_key__", None) + attrs.setdefault("__pipelines__", None) return type.__new__(cls, name, bases, attrs) class Item(metaclass=ItemMetaclass): - __unique_key__ = [] + __unique_key__: List = [] + __pipelines__: List = None def __init__(self, **kwargs): self.__dict__ = kwargs @@ -64,11 +67,12 @@ def to_dict(self): propertys = {} for key, value in self.__dict__.items(): if key not in ( - "__name__", - "__table_name__", - "__name_underline__", - "__update_key__", - "__unique_key__", + "__name__", + "__table_name__", + "__name_underline__", + "__update_key__", + "__unique_key__", + "__pipelines__", ): if key.startswith(f"_{self.__class__.__name__}"): key = key.replace(f"_{self.__class__.__name__}", "") @@ -123,13 +127,24 @@ def unique_key(self, keys): else: self.__unique_key__ = (keys,) + @property + def pipelines(self): + return self.__pipelines__ or self.__class__.__pipelines__ + + @pipelines.setter + def pipelines(self, pipelines): + if isinstance(pipelines, (tuple, list)): + self.__pipelines__ = pipelines + else: + self.__pipelines__ = (pipelines,) + @property def fingerprint(self): args = [] for key, value in self.to_dict.items(): if value: if (self.unique_key and key in 
self.unique_key) or not self.unique_key: - args.append(str(value)) + args.append(key + str(value)) if args: args = sorted(args) diff --git a/feapder/network/proxy_pool/__init__.py b/feapder/network/proxy_pool/__init__.py new file mode 100644 index 00000000..0a6935b6 --- /dev/null +++ b/feapder/network/proxy_pool/__init__.py @@ -0,0 +1,11 @@ +# -*- coding: utf-8 -*- +""" +Created on 2023/7/25 10:16 +--------- +@summary: +--------- +@author: Boris +@email: boris_liu@foxmail.com +""" +from .base import BaseProxyPool +from .proxy_pool import ProxyPool diff --git a/feapder/network/proxy_pool/base.py b/feapder/network/proxy_pool/base.py new file mode 100644 index 00000000..0a2dc590 --- /dev/null +++ b/feapder/network/proxy_pool/base.py @@ -0,0 +1,43 @@ +# -*- coding: utf-8 -*- +""" +Created on 2023/7/25 10:03 +--------- +@summary: +--------- +@author: Boris +@email: boris_liu@foxmail.com +""" + +import abc + +from feapder.utils.log import log + + +class BaseProxyPool: + @abc.abstractmethod + def get_proxy(self): + """ + Get a proxy + Returns: + {"http": "xxx", "https": "xxx"} + """ + raise NotImplementedError + + @abc.abstractmethod + def del_proxy(self, proxy): + """ + @summary: Delete a proxy + --------- + @param proxy: ip:port + """ + raise NotImplementedError + + def tag_proxy(self, **kwargs): + """ + @summary: Tag a proxy + --------- + @param kwargs: + @return: + """ + log.warning("Tagging proxies is not supported yet") + pass diff --git a/feapder/network/proxy_pool/proxy_pool.py b/feapder/network/proxy_pool/proxy_pool.py new file mode 100644 index 00000000..ce492633 --- /dev/null +++ b/feapder/network/proxy_pool/proxy_pool.py @@ -0,0 +1,69 @@ +# -*- coding: utf-8 -*- +""" +Created on 2022/10/19 10:40 AM +--------- +@summary: +--------- +@author: Boris +@email: boris_liu@foxmail.com +""" +from queue import Queue + +import requests + +import feapder.setting as setting +from feapder.network.proxy_pool.base import BaseProxyPool +from feapder.utils import metrics +from feapder.utils import tools + + +class 
ProxyPool(BaseProxyPool): + """ + Fetches proxies through an API and keeps them in memory; new proxies are pulled automatically when none are left + Proxies returned by the API are separated by \r\n + """ + + def __init__(self, proxy_api=None, **kwargs): + self.proxy_api = proxy_api or setting.PROXY_EXTRACT_API + self.proxy_queue = Queue() + + def format_proxy(self, proxy): + return {"http": "http://" + proxy, "https": "http://" + proxy} + + @tools.retry(3, interval=5) + def pull_proxies(self): + resp = requests.get(self.proxy_api) + proxies = resp.text.strip() + resp.close() + if "{" in proxies or not proxies: + raise Exception("Failed to get proxies", proxies) + # Split on \r\n + return proxies.split("\r\n") + + def get_proxy(self): + try: + if self.proxy_queue.empty(): + proxies = self.pull_proxies() + for proxy in proxies: + self.proxy_queue.put_nowait(proxy) + metrics.emit_counter("total", 1, classify="proxy") + + proxy = self.proxy_queue.get_nowait() + self.proxy_queue.put_nowait(proxy) + + metrics.emit_counter("used_times", 1, classify="proxy") + + return self.format_proxy(proxy) + except Exception as e: + tools.send_msg("Failed to get proxies", level="error") + raise Exception("Failed to get proxies", e) + + def del_proxy(self, proxy): + """ + @summary: Delete a proxy + --------- + @param proxy: ip:port + """ + if proxy in self.proxy_queue.queue: + self.proxy_queue.queue.remove(proxy) + metrics.emit_counter("invalid", 1, classify="proxy") diff --git a/feapder/network/proxy_pool.py b/feapder/network/proxy_pool_old.py similarity index 100% rename from feapder/network/proxy_pool.py rename to feapder/network/proxy_pool_old.py diff --git a/feapder/network/request.py b/feapder/network/request.py index 152e6127..95e51604 100644 --- a/feapder/network/request.py +++ b/feapder/network/request.py @@ -9,6 +9,7 @@ """ import copy +import os import re import requests @@ -20,7 +21,7 @@ from feapder.db.redisdb import RedisDB from feapder.network import user_agent from feapder.network.downloader.base import Downloader, RenderDownloader -from feapder.network.proxy_pool import ProxyPool +from feapder.network.proxy_pool import BaseProxyPool from 
feapder.network.response import Response from feapder.utils.log import log @@ -30,7 +31,7 @@ class Request: user_agent_pool = user_agent - proxies_pool: ProxyPool = None + proxies_pool: BaseProxyPool = None cache_db = None # redis / pika cached_redis_key = None # folder for cached responses: response_cached:cached_redis_key:md5 @@ -195,13 +196,19 @@ def __setattr__(self, key, value): if key in self.__class__.__REQUEST_ATTRS__: self.requests_kwargs[key] = value + # def __getattr__(self, item): + # try: + # return self.__dict__[item] + # except: + # raise AttributeError("Request has no attribute %s" % item) + def __lt__(self, other): return self.priority < other.priority @property def _proxies_pool(self): if not self.__class__.proxies_pool: - self.__class__.proxies_pool = ProxyPool() + self.__class__.proxies_pool = tools.import_cls(setting.PROXY_POOL)() return self.__class__.proxies_pool @@ -224,9 +231,13 @@ def _session_downloader(self): @property def _render_downloader(self): if not self.__class__.render_downloader: - self.__class__.render_downloader = tools.import_cls( - setting.RENDER_DOWNLOADER - )() + try: + self.__class__.render_downloader = tools.import_cls( + setting.RENDER_DOWNLOADER + )() + except AttributeError: + log.error('Render mode is enabled; please install it with: pip install "feapder[render]"') + os._exit(0) return self.__class__.render_downloader @@ -244,6 +255,7 @@ def to_dict(self): self.download_midware = [ getattr(download_midware, "__name__") if callable(download_midware) + and download_midware.__class__.__name__ == "method" else download_midware for download_midware in self.download_midware ] @@ -251,6 +263,7 @@ def to_dict(self): self.download_midware = ( getattr(self.download_midware, "__name__") if callable(self.download_midware) + and self.download_midware.__class__.__name__ == "method" else self.download_midware ) @@ -265,11 +278,11 @@ def to_dict(self): if value is not None: if key in self.__class__.__REQUEST_ATTRS__: if not isinstance( - value, (bytes, bool, float, int, str, 
tuple, list, dict) + value, (bool, float, int, str, tuple, list, dict) ): value = tools.dumps_obj(value) else: - if not isinstance(value, (bytes, bool, float, int, str)): + if not isinstance(value, (bool, float, int, str)): value = tools.dumps_obj(value) request_dict[key] = value @@ -331,7 +344,7 @@ def make_requests_kwargs(self): proxies = self.requests_kwargs.get("proxies", -1) if proxies == -1 and setting.PROXY_ENABLE and setting.PROXY_EXTRACT_API: while True: - proxies = self._proxies_pool.get() + proxies = self._proxies_pool.get_proxy() if proxies: self.requests_kwargs.update(proxies=proxies) break @@ -422,6 +435,12 @@ def get_proxy(self) -> str: "http.*?//", "", proxies.get("http", "") or proxies.get("https", "") ) + def del_proxy(self): + proxy = self.get_proxy() + if proxy: + self._proxies_pool.del_proxy(proxy) + del self.requests_kwargs["proxies"] + def get_headers(self) -> dict: return self.requests_kwargs.get("headers", {}) diff --git a/feapder/network/response.py b/feapder/network/response.py index 7fd78878..7f97861b 100644 --- a/feapder/network/response.py +++ b/feapder/network/response.py @@ -211,13 +211,14 @@ def _make_absolute(self, link): def _absolute_links(self, text): regexs = [ - r'(<(?i)a.*?href\s*?=\s*?["\'])(.+?)(["\'])', # a - r'(<(?i)img.*?src\s*?=\s*?["\'])(.+?)(["\'])', # img - r'(<(?i)link.*?href\s*?=\s*?["\'])(.+?)(["\'])', # css - r'(<(?i)script.*?src\s*?=\s*?["\'])(.+?)(["\'])', # js + r'( 标签后插入一个标签 repl = fr'\1' - body = re.sub(rb"(|\s.*?>))", repl.encode('utf-8'), body) + body = re.sub(rb"(|\s.*?>))", repl.encode("utf-8"), body) fd, fname = tempfile.mkstemp(".html") os.write(fd, body) diff --git a/feapder/network/selector.py b/feapder/network/selector.py index ea8b2eff..901f4eb5 100644 --- a/feapder/network/selector.py +++ b/feapder/network/selector.py @@ -12,6 +12,7 @@ import parsel import six from lxml import etree +from packaging import version from parsel import Selector as ParselSelector from parsel import SelectorList as 
ParselSelectorList from parsel import selector @@ -65,7 +66,7 @@ def create_root_node(text, parser_cls, base_url=None): return root -if parsel.__version__ < "1.7.0": +if version.parse(parsel.__version__) < version.parse("1.7.0"): selector.create_root_node = create_root_node diff --git a/feapder/network/user_agent.py b/feapder/network/user_agent.py index 28df6325..7f9024d4 100644 --- a/feapder/network/user_agent.py +++ b/feapder/network/user_agent.py @@ -61,6 +61,683 @@ "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.15 (KHTML, like Gecko) Chrome/24.0.1295.0 Safari/537.15", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.14 (KHTML, like Gecko) Chrome/24.0.1292.0 Safari/537.14", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3215.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.62 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3790.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.75 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36", + "Mozilla/5.0 
(Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.92 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.89 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36", 
+ "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.63 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.116 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.90 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36", + "Mozilla/5.0 
(Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.24 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.136 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.62 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.0.3016 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36 Kinza/6.1.5", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.48 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 
Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.2.0.1713 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.47 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.2 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.819 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.41 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.785 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.9 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3235.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3409.85 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4371.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.9 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.43 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 CravingExplorer/2.4.1", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.75 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4121.813 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; 
WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.107 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.9 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.158 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.58 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.140 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36", + "Mozilla/5.0 (Microsoft Windows NT 10.0.16299.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36 (FTM)", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4500.0 Iron Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4427.5 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3835.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4085.4 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.116 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.116 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.91 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.4000.0 Iron Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.41 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.116 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.41 Safari/537.36", + "Mozilla/5.0 (Windows NT 5.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 ADG/11.0.2566 AOLBUILD/11.0.2566 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.108 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.152 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 ADG/11.0.2510 AOLBUILD/11.0.2510 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36 
AOLShield/83.0.4103.0", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 AOL/11.0 AOLBUILD/11.0.1839 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 ADG/11.0.2414 AOLBUILD/11.0.2414 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 ADG/11.0.2566 AOLBUILD/11.0.2566 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36 AOLShield/83.0.4103.2", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.87 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.105 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.183 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.152 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/90.0.4430.72 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 ADG/11.0.2510 AOLBUILD/11.0.2510 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 ADG/11.0.2566 AOLBUILD/11.0.2566 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.97 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.105 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave 
Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.182 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.108 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 ADG/11.0.2510 AOLBUILD/11.0.2510 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.101 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 AOL/11.0 AOLBUILD/11.0.1839 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 ADG/11.0.2470 AOLBUILD/11.0.2470 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 ADG/11.0.2566 AOLBUILD/11.0.2566 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36 AOLShield/79.0.3945.5", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/77.0.3865.90 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/79.0.3945.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.162 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.89 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.99 Safari/537.36", + "Mozilla/5.0 
(Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.141 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.72 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36", + "Mozilla/5.0 (Windows 
NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.123 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4558.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.102 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4564.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.72 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.81 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.81 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.77 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.74 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3409.13 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.26 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.81 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4591.54 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.101.4951.54 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.7113.93 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.49 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.1150.52 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; 
Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4950.0 Iron Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4450.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36", + "Mozilla/5.0 (Windows NT 11.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4868.173 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.1483.27 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.3478.83 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.115 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.5118.205 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36 Agency/97.8.8247.48", + "Mozilla/5.0 (Windows NT 6.1; 
Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36", + "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36", + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4137.1 SputnikBrowser/5.6.6280.0 (GOST) Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36", + "Mozilla/5.0 
(Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.79 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 
Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.43 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, 
like Gecko) Chrome/79.0.3945.130 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4078.2 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.3538.77 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.5 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.6 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_0_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.1 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3409.631 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.3 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_0_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.101 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.2 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.8 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.5 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS 
X 11_0_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3409.1 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.44 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.779 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.19 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.6 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 FS", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36\tChrome 79.0", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36\tChrome Generic", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_16_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 
Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_16_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.69 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.146 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.186 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4450.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like 
Gecko) Chrome/88.0.4324.182 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.67 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_3_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/524.34", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.146 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.105 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.146 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.51 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.152 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.152 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.3538.77 Safari/537.36", + "Mozilla/5.0 (Macintosh; 
Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/77.0.3865.99 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/81.0.4044.108 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/83.0.4103.118 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/84.0.4147.108 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/84.0.4147.140 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/85.0.4183.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/87.0.4280.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/88.0.4324.175 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/89.0.4389.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/89.0.4389.127 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.75 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS 
X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/79.0.3945.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.116 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/81.0.4044.113 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.135 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.75 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.141 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.72 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.70 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.116 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.162 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.75 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.67 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.152 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) 
AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/77.0.3865.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.108 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.87 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.162 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/83.0.4103.116 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/85.0.4183.83 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.99 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.198 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.141 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.182 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/90.0.4430.72 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/79.0.3945.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/79.0.3945.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/77.0.3865.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) 
AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.108 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/81.0.4044.113 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.89 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/85.0.4183.102 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.183 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.146 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.72 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.108 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.70 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.97 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/79.0.3945.130 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.108 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) 
AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.87 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.149 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.89 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.99 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.149 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/81.0.4044.122 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.89 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.101 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/83.0.4103.97 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.105 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.75 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/78.0.3904.87 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/83.0.4103.106 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/84.0.4147.125 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/85.0.4183.121 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) 
AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.183 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.152 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/83.0.4103.116 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/85.0.4183.102 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.111 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.60 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.141 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.182 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_16_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/80.0.3987.116 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_0_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/86.0.4240.183 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_0_1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.67 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_0_1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.96 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_0_1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.192 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.67 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) 
AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.96 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.72 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.101 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.152 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/87.0.4280.101 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.182 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_2) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.146 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_2) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.72 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/88.0.4324.96 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.72 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_3_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/537.36 (KHTML, like Gecko, Mediapartners-Google) Chrome/89.0.4389.130 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_3_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.69 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36", + "Mozilla/5.0 (Macintosh; 
Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4582.189 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4083.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4612.206 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", + 
"Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4702.147 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 
Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4691.94 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4889.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.79 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.79 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like 
Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.9999.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.40 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4880.146 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.147 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.109 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.109 Safari/537.36", + "Mozilla/5.0 (Macintosh; 
Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4886.93 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Brave Chrome/89.0.4389.105 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4886.148 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.80 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.102 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; 
Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36", + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.5163.147 Safari/537.36" ], "opera": [ "Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16", diff --git a/feapder/network/user_pool/guest_user_pool.py b/feapder/network/user_pool/guest_user_pool.py index 0e550dde..9d34aad3 100644 --- a/feapder/network/user_pool/guest_user_pool.py +++ b/feapder/network/user_pool/guest_user_pool.py @@ -45,7 +45,7 @@ def __init__( user_agent: a string, or a zero-argument function that returns the user_agent proxy: xxx.xxx.xxx.xxx:xxxx, or a zero-argument function that returns the proxy address headless: whether to enable headless mode - driver_type: CHROME, PHANTOMJS or FIREFOX + driver_type: CHROME, EDGE, PHANTOMJS or FIREFOX timeout: request timeout window_size: # window size executable_path: browser path; defaults to the default path diff --git a/feapder/network/user_pool/normal_user_pool.py b/feapder/network/user_pool/normal_user_pool.py index f14c7656..63c99726 100644 --- a/feapder/network/user_pool/normal_user_pool.py +++ b/feapder/network/user_pool/normal_user_pool.py @@ -209,9 +209,9 @@ def run(self): retry_times = 0 while retry_times <= self._login_retry_times: try: - user = self.login(user) - if user: - self.add_user(user) + login_user = self.login(user) + if login_user: + self.add_user(login_user) else: self.handle_login_failed_user(user) break
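Note on the `normal_user_pool.py` hunk above: the old code rebound `user` to the return value of `self.login(user)`, so when login returned a falsy value, `handle_login_failed_user(user)` no longer received the original account. The sketch below reproduces the corrected pattern in isolation. It is a minimal illustration under assumptions, not feapder's implementation: `login_with_retry`, its `login` callback, and `max_retries` are hypothetical names, and treating an exception as one failed attempt is an assumption, since the `except` branch is not visible in the hunk.

```python
from typing import Callable, Optional, Tuple


def login_with_retry(
    user: dict,
    login: Callable[[dict], Optional[dict]],
    max_retries: int = 1,
) -> Tuple[Optional[dict], Optional[dict]]:
    """Return (logged_in_user, failed_user); exactly one of the two is not None.

    Hypothetical helper for illustration only, not part of feapder.
    """
    retry_times = 0
    while retry_times <= max_retries:
        try:
            # Keep the login result under its own name so that `user` still
            # refers to the original account when login() returns None.
            login_user = login(user)
            if login_user:
                return login_user, None
            return None, user
        except Exception:
            # Assumption: an exception counts as one failed attempt.
            retry_times += 1
    # All attempts raised: report the original account as failed.
    return None, user
```

With the old naming, the failure branch would have been handed `None` instead of the account, so a failure handler could not tell which account to disable or re-queue.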
diff --git a/feapder/pipelines/csv_pipeline.py b/feapder/pipelines/csv_pipeline.py new file mode 100644 index 00000000..922a77d3 --- /dev/null +++ b/feapder/pipelines/csv_pipeline.py @@ -0,0 +1,254 @@ +# -*- coding: utf-8 -*- +""" +Created on 2025-10-16 +--------- +@summary: CSV data export pipeline +--------- +@author: 道长 +@email: ctrlf4@yeah.net +""" + +import csv +import os +import threading +from typing import Dict, List, Tuple + +from feapder.pipelines import BasePipeline +from feapder.utils.log import log + + +class CsvPipeline(BasePipeline): + """ + CSV data export pipeline + + Saves crawled data to CSV files. Supports batch saving, concurrent-write control, and resumable crawls. + + Features: + - One lock per table, avoiding the performance cost of a global lock + - Creates the export directory automatically + - Appends to existing files, which makes resuming a crawl easy + - Calls fsync so that data reaches the disk + - Per-table fieldname cache, keeping column order consistent across batches + """ + + # Protects file writes for each table (per-table lock) + _file_locks = {} + + # Caches the fieldname order for each table (per-table fieldnames cache) + # Keeps column order consistent across batches and threads + _table_fieldnames = {} + + def __init__(self, csv_dir=None): + """ + Initialize the CSV pipeline + + Args: + csv_dir: directory where CSV files are saved + - If omitted, read from setting.CSV_EXPORT_PATH + - Relative paths are supported (e.g. "data/csv") + - Absolute paths are supported (e.g. "/Users/xxx/exports/csv") + """ + super().__init__() + + # Fall back to the settings file when no argument is given + if csv_dir is None: + import feapder.setting as setting + csv_dir = setting.CSV_EXPORT_PATH + + # Accept absolute or relative paths and normalize to an absolute path + self.csv_dir = os.path.abspath(csv_dir) + self._ensure_csv_dir_exists() + + def _ensure_csv_dir_exists(self): + """Make sure the CSV output directory exists""" + if not os.path.exists(self.csv_dir): + try: + os.makedirs(self.csv_dir, exist_ok=True) + log.info(f"Created CSV output directory: {self.csv_dir}") + except Exception as e: + log.error(f"Failed to create CSV directory: {e}") + raise + + @staticmethod + def _get_lock(table): + """ + Get the file lock for a table + + Uses a per-table lock design: each table has its own lock, which avoids lock contention. + This keeps writes to a single table's file safe while still allowing different tables to be written in parallel. + + Args: + table: table name + + Returns: + threading.Lock: the lock object for this table + """ + if table not in CsvPipeline._file_locks: + CsvPipeline._file_locks[table] = threading.Lock() + return CsvPipeline._file_locks[table] + + @staticmethod + def _get_and_cache_fieldnames(table, items): + """ + Get and cache the fieldname order for a table + + 
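On the first call, extract the fieldnames from items[0] and cache them; later calls return the cached fieldnames directly. + This design ensures that: + 1. Column order stays consistent across batches (fixes misaligned data columns) + 2. Column order is not corrupted under multithreaded concurrency + 3. Fieldnames are not extracted repeatedly, which is faster + + Args: + table: table name + items: list of records [{},{},...] + + Returns: + list: list of fieldnames + """ + # If this table's fieldnames are already cached, return them + if table in CsvPipeline._table_fieldnames: + return CsvPipeline._table_fieldnames[table] + + # First call: extract the fieldnames from items and cache them + if not items: + return [] + + first_item = items[0] + fieldnames = list(first_item.keys()) if isinstance(first_item, dict) else [] + + if fieldnames: + # Cache the fieldnames (class-level variable, shared across instances) + CsvPipeline._table_fieldnames[table] = fieldnames + log.info(f"Cached fieldnames for table {table}: {fieldnames}") + + return fieldnames + + def _get_csv_file_path(self, table): + """ + Get the CSV file path for a table + + Args: + table: table name + + Returns: + str: full path of the CSV file + """ + return os.path.join(self.csv_dir, f"{table}.csv") + + + def _file_exists_and_has_content(self, csv_file): + """ + Check whether the CSV file exists and is non-empty + + Args: + csv_file: CSV file path + + Returns: + bool: True if the file exists and has content + """ + return os.path.exists(csv_file) and os.path.getsize(csv_file) > 0 + + def save_items(self, table, items: List[Dict]) -> bool: + """ + Save data to a CSV file + + Opens the file in append mode, which supports resuming a crawl. The header is written automatically on the first write. + A per-table lock keeps data consistent when multiple threads write. + Cached fieldnames keep column order consistent across batches and prevent misaligned columns. + + Args: + table: table name (used as the CSV file name) + items: list of records, [{}, {}, ...] 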
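+ + Returns: + bool: True on success, False on failure + On failure, ItemBuffer retries automatically (up to 10 times) + """ + if not items: + return True + + csv_file = self._get_csv_file_path(table) + + # Get the fieldnames through the cache (important: keeps column order consistent across batches) + fieldnames = self._get_and_cache_fieldnames(table, items) + + if not fieldnames: + log.warning(f"Could not extract fieldnames, items: {items}") + return False + + try: + # Acquire the table-level lock (important: keeps file writes safe) + lock = self._get_lock(table) + with lock: + # Check whether the file already exists and has content + file_exists = self._file_exists_and_has_content(csv_file) + + # Open the file in append mode + with open( + csv_file, + "a", + encoding="utf-8", + newline="" + ) as f: + writer = csv.DictWriter(f, fieldnames=fieldnames) + + # Write the header if the file is missing or empty + if not file_exists: + writer.writeheader() + + # Write the data rows in one batch + # The cached fieldnames keep column order consistent and prevent misalignment across batches + writer.writerows(items) + + # Flush the buffer to disk so that no data is lost + f.flush() + os.fsync(f.fileno()) + + # Log the export + log.info( + f"Exported {len(items)} records to {table}.csv (file path: {csv_file})" + ) + return True + + except Exception as e: + log.error( + f"CSV write failed. table: {table}, csv_file: {csv_file}, error: {e}" + ) + return False + + def update_items(self, table, items: List[Dict], update_keys: Tuple = ()) -> bool: + """ + Update data + + Note: a CSV file does not support a true "update" operation (that would require a lookup followed by a replacement). + The current implementation simply appends, which is equivalent to an INSERT. + + If a true UPDATE is needed, consider: + 1. Regenerating the CSV file periodically + 2. Using a database (MySQL/MongoDB) to handle UPDATE + 3. Deduplicating and updating at the application layer + + Args: + table: table name + items: list of records, [{}, {}, ...] 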
+ update_keys: 更新的字段(此实现中未使用) + + Returns: + bool: 操作成功返回True + """ + # 对于CSV,update操作实现为追加写入 + # 若需要真正的UPDATE操作,建议在应用层处理 + return self.save_items(table, items) + + def close(self): + """ + 关闭Pipeline,释放资源 + + 在爬虫结束时由ItemBuffer自动调用。 + """ + try: + # 清理文件锁字典(可选,用于释放内存) + # 在长期运行的场景下,可能需要定期清理 + pass + except Exception as e: + log.error(f"关闭CSV Pipeline时出错: {e}") diff --git a/feapder/pipelines/mysql_pipeline.py b/feapder/pipelines/mysql_pipeline.py index 8899761b..3ffb3fc1 100644 --- a/feapder/pipelines/mysql_pipeline.py +++ b/feapder/pipelines/mysql_pipeline.py @@ -45,6 +45,8 @@ def save_items(self, table, items: List[Dict]) -> bool: log.info( "共导出 %s 条数据 到 %s, 重复 %s 条" % (datas_size, table, datas_size - add_count) ) + else: + log.debug("没有插入数据,可能全部重复") return add_count != None diff --git a/feapder/requirements.txt b/feapder/requirements.txt index 49fc6fbb..21717674 100644 --- a/feapder/requirements.txt +++ b/feapder/requirements.txt @@ -16,6 +16,6 @@ urllib3>=1.25.8 loguru>=0.5.3 influxdb>=5.3.1 pyperclip>=1.8.2 -webdriver-manager>=3.5.3 +webdriver-manager>=4.0.0 terminal-layout>=2.1.3 playwright \ No newline at end of file diff --git a/feapder/setting.py b/feapder/setting.py index 5dd18246..c52b318c 100644 --- a/feapder/setting.py +++ b/feapder/setting.py @@ -27,12 +27,15 @@ MONGO_DB = os.getenv("MONGO_DB") MONGO_USER_NAME = os.getenv("MONGO_USER_NAME") MONGO_USER_PASS = os.getenv("MONGO_USER_PASS") +MONGO_URL = os.getenv("MONGO_URL") # REDIS # ip:port 多个可写为列表或者逗号隔开 如 ip1:port1,ip2:port2 或 ["ip1:port1", "ip2:port2"] REDISDB_IP_PORTS = os.getenv("REDISDB_IP_PORTS") REDISDB_USER_PASS = os.getenv("REDISDB_USER_PASS") REDISDB_DB = int(os.getenv("REDISDB_DB", 0)) +# 连接redis时携带的其他参数,如ssl=True +REDISDB_KWARGS = dict() # 适用于redis哨兵模式 REDISDB_SERVICE_NAME = os.getenv("REDISDB_SERVICE_NAME") @@ -40,8 +43,10 @@ ITEM_PIPELINES = [ "feapder.pipelines.mysql_pipeline.MysqlPipeline", # "feapder.pipelines.mongo_pipeline.MongoPipeline", + # 
"feapder.pipelines.csv_pipeline.CsvPipeline", # "feapder.pipelines.console_pipeline.ConsolePipeline", ] +CSV_EXPORT_PATH = "data/csv" # CSV文件保存路径,支持相对路径和绝对路径 EXPORT_DATA_MAX_FAILED_TIMES = 10 # 导出数据时最大的失败次数,包括保存和更新,超过这个次数报警 EXPORT_DATA_MAX_RETRY_TIMES = 10 # 导出数据时最大的重试次数,包括保存和更新,超过这个次数则放弃重试 @@ -65,7 +70,7 @@ user_agent=None, # 字符串 或 无参函数,返回值为user_agent proxy=None, # xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 headless=False, # 是否为无头浏览器 - driver_type="CHROME", # CHROME、PHANTOMJS、FIREFOX + driver_type="CHROME", # CHROME、EDGE、PHANTOMJS、FIREFOX timeout=30, # 请求超时时间 window_size=(1024, 800), # 窗口大小 executable_path=None, # 浏览器路径,默认为默认路径 @@ -130,6 +135,8 @@ # 设置代理 PROXY_EXTRACT_API = None # 代理提取API ,返回的代理分割符为\r\n PROXY_ENABLE = True +PROXY_MAX_FAILED_TIMES = 5 # 代理最大失败次数,超过则不使用,自动删除 +PROXY_POOL = "feapder.network.proxy_pool.ProxyPool" # 代理池 # 随机headers RANDOM_HEADERS = True @@ -141,9 +148,9 @@ USE_SESSION = False # 下载 -DOWNLOADER = "feapder.network.downloader.RequestsDownloader" +DOWNLOADER = "feapder.network.downloader.RequestsDownloader" # 请求下载器 SESSION_DOWNLOADER = "feapder.network.downloader.RequestsSessionDownloader" -RENDER_DOWNLOADER = "feapder.network.downloader.SeleniumDownloader" +RENDER_DOWNLOADER = "feapder.network.downloader.SeleniumDownloader" # 渲染下载器 # RENDER_DOWNLOADER="feapder.network.downloader.PlaywrightDownloader" MAKE_ABSOLUTE_LINKS = True # 自动转成绝对连接 @@ -161,8 +168,10 @@ # 报警 支持钉钉、飞书、企业微信、邮件 # 钉钉报警 DINGDING_WARNING_URL = "" # 钉钉机器人api -DINGDING_WARNING_PHONE = "" # 报警人 支持列表,可指定多个 +DINGDING_WARNING_PHONE = "" # 被@的群成员手机号,支持列表,可指定多个。 +DINGDING_WARNING_USER_ID = "" # 被@的群成员userId,支持列表,可指定多个 DINGDING_WARNING_ALL = False # 是否提示所有人, 默认为False +DINGDING_WARNING_SECRET = None # 加签密钥 # 飞书报警 # https://open.feishu.cn/document/ukTMukTMukTM/ucTM5YjL3ETO24yNxkjN#e1cdee9f FEISHU_WARNING_URL = "" # 飞书机器人api @@ -177,6 +186,10 @@ WECHAT_WARNING_URL = "" # 企业微信机器人api WECHAT_WARNING_PHONE = "" # 报警人 将会在群内@此人, 支持列表,可指定多人 WECHAT_WARNING_ALL = False # 是否提示所有人, 默认为False +# QMSG报警 
+QMSG_WARNING_URL = "" # qmsg机器人api +QMSG_WARNING_QQ = "" # 指定要接收消息的QQ号或者QQ群。多个以英文逗号分割,例如:12345,12346,支持列表,可指定多人 +QMSG_WARNING_BOT = "" # 机器人的QQ号 # 时间间隔 WARNING_INTERVAL = 3600 # 相同报警的报警时间间隔,防止刷屏; 0表示不去重 WARNING_LEVEL = "DEBUG" # 报警级别, DEBUG / INFO / ERROR diff --git a/feapder/templates/project_template/setting.py b/feapder/templates/project_template/setting.py index 59b7a04d..140aaa07 100644 --- a/feapder/templates/project_template/setting.py +++ b/feapder/templates/project_template/setting.py @@ -16,12 +16,15 @@ # MONGO_DB = "" # MONGO_USER_NAME = "" # MONGO_USER_PASS = "" +# MONGO_URL = "" # # # REDIS # # ip:port 多个可写为列表或者逗号隔开 如 ip1:port1,ip2:port2 或 ["ip1:port1", "ip2:port2"] # REDISDB_IP_PORTS = "localhost:6379" # REDISDB_USER_PASS = "" # REDISDB_DB = 0 +# # 连接redis时携带的其他参数,如ssl=True +# REDISDB_KWARGS = dict() # # 适用于redis哨兵模式 # REDISDB_SERVICE_NAME = "" # @@ -29,8 +32,10 @@ # ITEM_PIPELINES = [ # "feapder.pipelines.mysql_pipeline.MysqlPipeline", # # "feapder.pipelines.mongo_pipeline.MongoPipeline", +# # "feapder.pipelines.csv_pipeline.CsvPipeline", # # "feapder.pipelines.console_pipeline.ConsolePipeline", # ] +# CSV_EXPORT_PATH = "data/csv" # CSV文件保存路径,支持相对路径和绝对路径 # EXPORT_DATA_MAX_FAILED_TIMES = 10 # 导出数据时最大的失败次数,包括保存和更新,超过这个次数报警 # EXPORT_DATA_MAX_RETRY_TIMES = 10 # 导出数据时最大的重试次数,包括保存和更新,超过这个次数则放弃重试 # @@ -46,9 +51,9 @@ # KEEP_ALIVE = False # 爬虫是否常驻 # 下载 -# DOWNLOADER = "feapder.network.downloader.RequestsDownloader" +# DOWNLOADER = "feapder.network.downloader.RequestsDownloader" # 请求下载器 # SESSION_DOWNLOADER = "feapder.network.downloader.RequestsSessionDownloader" -# RENDER_DOWNLOADER = "feapder.network.downloader.SeleniumDownloader" +# RENDER_DOWNLOADER = "feapder.network.downloader.SeleniumDownloader" # 渲染下载器 # # RENDER_DOWNLOADER="feapder.network.downloader.PlaywrightDownloader" # MAKE_ABSOLUTE_LINKS = True # 自动转成绝对连接 @@ -59,7 +64,7 @@ # user_agent=None, # 字符串 或 无参函数,返回值为user_agent # proxy=None, # xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 # headless=False, # 
是否为无头浏览器 -# driver_type="CHROME", # CHROME、PHANTOMJS、FIREFOX +# driver_type="CHROME", # CHROME、EDGE、PHANTOMJS、FIREFOX # timeout=30, # 请求超时时间 # window_size=(1024, 800), # 窗口大小 # executable_path=None, # 浏览器路径,默认为默认路径 @@ -119,6 +124,8 @@ # # 设置代理 # PROXY_EXTRACT_API = None # 代理提取API ,返回的代理分割符为\r\n # PROXY_ENABLE = True +# PROXY_MAX_FAILED_TIMES = 5 # 代理最大失败次数,超过则不使用,自动删除 +# PROXY_POOL = "feapder.network.proxy_pool.ProxyPool" # 代理池 # # # 随机headers # RANDOM_HEADERS = True @@ -143,8 +150,10 @@ # # 报警 支持钉钉、飞书、企业微信、邮件 # # 钉钉报警 # DINGDING_WARNING_URL = "" # 钉钉机器人api -# DINGDING_WARNING_PHONE = "" # 报警人 支持列表,可指定多个 +# DINGDING_WARNING_PHONE = "" # 被@的群成员手机号,支持列表,可指定多个。 +# DINGDING_WARNING_USER_ID = "" # 被@的群成员userId,支持列表,可指定多个 # DINGDING_WARNING_ALL = False # 是否提示所有人, 默认为False +# DINGDING_WARNING_SECRET = None # 加签密钥 # # 飞书报警 # # https://open.feishu.cn/document/ukTMukTMukTM/ucTM5YjL3ETO24yNxkjN#e1cdee9f # FEISHU_WARNING_URL = "" # 飞书机器人api @@ -159,6 +168,10 @@ # WECHAT_WARNING_URL = "" # 企业微信机器人api # WECHAT_WARNING_PHONE = "" # 报警人 将会在群内@此人, 支持列表,可指定多人 # WECHAT_WARNING_ALL = False # 是否提示所有人, 默认为False +# # QMSG报警 +# QMSG_WARNING_URL = "" # qmsg机器人api +# QMSG_WARNING_QQ = "" # 指定要接收消息的QQ号或者QQ群。多个以英文逗号分割,例如:12345,12346,支持列表,可指定多人 +# QMSG_WARNING_BOT = "" # 机器人的QQ号 # # 时间间隔 # WARNING_INTERVAL = 3600 # 相同报警的报警时间间隔,防止刷屏; 0表示不去重 # WARNING_LEVEL = "DEBUG" # 报警级别, DEBUG / INFO / ERROR diff --git a/feapder/utils/log.py b/feapder/utils/log.py index 2d25ad20..e993f760 100644 --- a/feapder/utils/log.py +++ b/feapder/utils/log.py @@ -67,7 +67,6 @@ def doRollover(self): self.stream = self._open() def shouldRollover(self, record): - if self.stream is None: # delay was set... self.stream = self._open() if self.max_bytes > 0: # are we rolling over? 
@@ -225,6 +224,13 @@ def get_logger( class Log: log = None + def func(self, log_level): + def wrapper(msg, *args, **kwargs): + if self.isEnabledFor(log_level): + self._log(log_level, msg, args, **kwargs) + + return wrapper + def __getattr__(self, name): # 调用log时再初始化,为了加载最新的setting if self.__class__.log is None: @@ -239,6 +245,12 @@ def debug(self): def info(self): return self.__class__.log.info + @property + def success(self): + log_level = logging.INFO + 1 + logging.addLevelName(log_level, "success".upper()) + return self.func(log_level) + @property def warning(self): return self.__class__.log.warning diff --git a/feapder/utils/metrics.py b/feapder/utils/metrics.py index 0594769e..ab88ee1e 100644 --- a/feapder/utils/metrics.py +++ b/feapder/utils/metrics.py @@ -72,6 +72,19 @@ def define_tagkv(self, tagk, tagvs): def _point_tagset(self, p): return f"{p['measurement']}-{sorted(p['tags'].items())}-{p['time']}" + def _make_time_to_ns(self, _time): + """ + 将时间转换为 ns 级别的时间戳,补足长度 19 位 + Args: + _time: + + Returns: + + """ + time_len = len(str(_time)) + random_str = "".join(random.sample(string.digits, 19 - time_len)) + return int(str(_time) + random_str) + def _accumulate_points(self, points): """ 对于处于同一个 key 的点做聚合 @@ -102,18 +115,18 @@ def _accumulate_points(self, points): continue # 增加 _seq tag,以便区分不同的点 point["tags"]["_seq"] = timer_seqs[tagset] + point["time"] = self._make_time_to_ns(point["time"]) timer_seqs[tagset] += 1 new_points.append(point) else: if self.ratio < 1.0 and random.random() > self.ratio: continue + point["time"] = self._make_time_to_ns(point["time"]) new_points.append(point) for point in counters.values(): # 修改下counter类型的点的时间戳,补足19位, 伪装成纳秒级时间戳,防止influxdb对同一秒内的数据进行覆盖 - time_len = len(str(point["time"])) - random_str = "".join(random.sample(string.digits, 19 - time_len)) - point["time"] = int(str(point["time"]) + random_str) + point["time"] = self._make_time_to_ns(point["time"]) new_points.append(point) # 把拟合后的 counter 值添加进来 @@ -306,6 +319,8 @@ def 
init( use_udp=False, timeout=22, ssl=False, + retention_policy_replication: str = "1", + set_retention_policy_default=True, **kwargs, ): """ @@ -326,6 +341,8 @@ def init( use_udp: 是否使用udp协议打点 timeout: 与influxdb建立连接时的超时时间 ssl: 是否使用https协议 + retention_policy_replication: 保留策略的副本数, 确保数据的可靠性和高可用性。如果一个节点发生故障,其他节点可以继续提供服务,从而避免数据丢失和服务不可用的情况 + set_retention_policy_default: 是否设置为默认的保留策略,当retention_policy初次创建时有效 **kwargs: 可传递MetricsEmitter类的参数 Returns: @@ -376,8 +393,8 @@ def init( influxdb_client.create_retention_policy( retention_policy, retention_policy_duration, - replication="1", - default=True, + replication=retention_policy_replication, + default=set_retention_policy_default, ) except Exception as e: log.error("metrics init falied: {}".format(e)) @@ -410,7 +427,7 @@ def emit_any( fields: influxdb的field的字段和值 classify: 点的类别 measurement: 存储的表 - timestamp: 点的时间搓,默认为当前时间 + timestamp: 点的时间戳,默认为当前时间 Returns: @@ -441,7 +458,7 @@ def emit_counter( classify: 点的类别 tags: influxdb的tag的字段和值 measurement: 存储的表 - timestamp: 点的时间搓,默认为当前时间 + timestamp: 点的时间戳,默认为当前时间 Returns: @@ -472,7 +489,7 @@ def emit_timer( classify: 点的类别 tags: influxdb的tag的字段和值 measurement: 存储的表 - timestamp: 点的时间搓,默认为当前时间 + timestamp: 点的时间戳,默认为当前时间 Returns: @@ -503,7 +520,7 @@ def emit_store( classify: 点的类别 tags: influxdb的tag的字段和值 measurement: 存储的表 - timestamp: 点的时间搓,默认为当前时间 + timestamp: 点的时间戳,默认为当前时间 Returns: diff --git a/feapder/utils/redis_lock.py b/feapder/utils/redis_lock.py index 8c0aed47..9df0b85d 100644 --- a/feapder/utils/redis_lock.py +++ b/feapder/utils/redis_lock.py @@ -62,7 +62,7 @@ def __enter__(self): if self.locked: # 延长锁的时间 thread = threading.Thread(target=self.prolong_life) - thread.setDaemon(True) + thread.daemon = True thread.start() return self @@ -83,11 +83,12 @@ def acquire(self): if self.wait_timeout > 0: if time.time() - start > self.wait_timeout: - log.info("加锁失败") + log.debug("获取锁失败") break else: + log.debug("获取锁失败") break - log.debug("等待加锁: {} wait:{}".format(self, time.time() - start)) + 
log.debug("等待锁: {} wait:{}".format(self, time.time() - start)) if self.wait_timeout > 10: time.sleep(5) else: diff --git a/feapder/utils/tail_thread.py b/feapder/utils/tail_thread.py new file mode 100644 index 00000000..eda266d5 --- /dev/null +++ b/feapder/utils/tail_thread.py @@ -0,0 +1,33 @@ +# -*- coding: utf-8 -*- +""" +Created on 2024/3/19 20:00 +--------- +@summary: +--------- +@author: Boris +@email: boris_liu@foxmail.com +""" +import sys +import threading + + +class TailThread(threading.Thread): + """ + 所有子线程结束后,主线程才会退出 + """ + + def start(self) -> None: + """ + 解决python3.12 RuntimeError: cannot join thread before it is started的报错 + """ + super().start() + + if sys.version_info >= (3, 12): + for thread in threading.enumerate(): + if ( + thread.daemon + or thread is threading.current_thread() + or not thread.is_alive() + ): + continue + thread.join() diff --git a/feapder/utils/tools.py b/feapder/utils/tools.py index b55fcdea..31952876 100644 --- a/feapder/utils/tools.py +++ b/feapder/utils/tools.py @@ -15,6 +15,7 @@ import datetime import functools import hashlib +import hmac import html import importlib import json @@ -507,7 +508,8 @@ def fit_url(urls, identis): def get_param(url, key): - match = re.search(f"{key}=([^&]+)", url) + pattern = r"(?:[?&])" + re.escape(key) + r"=([^&]+)" + match = re.search(pattern, url) if match: return match.group(1) return None @@ -2466,12 +2468,43 @@ def reach_freq_limit(rate_limit, *key): def dingding_warning( - message, message_prefix=None, rate_limit=None, url=None, user_phone=None + message, + *, + message_prefix=None, + rate_limit=None, + url=None, + user_phone=None, + user_id=None, + secret=None, ): + """ + 钉钉报警,user_phone与user_id 二选一即可 + Args: + message: + message_prefix: 消息摘要,用于去重 + rate_limit: 报警频率,单位秒,相同的报警内容在rate_limit时间内只会报警一次 + url: 钉钉报警url + user_phone: 被@的群成员手机号,支持列表,可指定多个。 + user_id: 被@的群成员userId,支持列表,可指定多个 + secret: 钉钉报警加签密钥 + Returns: + + """ # 为了加载最新的配置 rate_limit = 
rate_limit if rate_limit is not None else setting.WARNING_INTERVAL url = url or setting.DINGDING_WARNING_URL user_phone = user_phone or setting.DINGDING_WARNING_PHONE + user_id = user_id or setting.DINGDING_WARNING_USER_ID + secret = secret or setting.DINGDING_WARNING_SECRET + if secret: + timestamp = str(round(time.time() * 1000)) + secret_enc = secret.encode("utf-8") + string_to_sign_enc = f"{timestamp}\n{secret}".encode("utf-8") + hmac_code = hmac.new( + secret_enc, string_to_sign_enc, digestmod=hashlib.sha256 + ).digest() + sign = urllib.parse.quote_plus(base64.b64encode(hmac_code)) + url = f"{url}&timestamp={timestamp}&sign={sign}" if not all([url, message]): return @@ -2483,10 +2516,17 @@ def dingding_warning( if isinstance(user_phone, str): user_phone = [user_phone] if user_phone else [] + if isinstance(user_id, str): + user_id = [user_id] if user_id else [] + data = { "msgtype": "text", "text": {"content": message}, - "at": {"atMobiles": user_phone, "isAtAll": setting.DINGDING_WARNING_ALL}, + "at": { + "atMobiles": user_phone, + "atUserIds": user_id, + "isAtAll": setting.DINGDING_WARNING_ALL, + }, } headers = {"Content-Type": "application/json"} @@ -2675,13 +2715,61 @@ def feishu_warning(message, message_prefix=None, rate_limit=None, url=None, user return False -def send_msg(msg, level="DEBUG", message_prefix=""): +def qmsg_warning( + message, + message_prefix=None, + rate_limit=None, + url=None, + user_qq=None, + bot_qq=None +): + """qmsg报警""" + + # 为了加载最新的配置 + rate_limit = rate_limit if rate_limit is not None else setting.WARNING_INTERVAL + url = url or setting.QMSG_WARNING_URL + user_qq = user_qq or setting.QMSG_WARNING_QQ + bot_qq = bot_qq or setting.QMSG_WARNING_BOT + + if isinstance(user_qq, list): + user_qq = ','.join(map(str, user_qq)) + + if not all([url, message]): + return + + if reach_freq_limit(rate_limit, url, user_qq, message_prefix or message): + log.info("报警时间间隔过短,此次报警忽略。 内容 {}".format(message)) + return + + data = { + "msg": message, + "qq": 
user_qq, + "bot": bot_qq, + } + + headers = {"Content-Type": "application/json"} + + try: + response = requests.post( + url, headers=headers, data=json.dumps(data).encode("utf8") + ) + result = response.json() + response.close() + if result.get("code") == 0: + return True + else: + raise Exception(result.get("reason")) + except Exception as e: + log.error("报警发送失败。 报警内容 {}, error: {}".format(message, e)) + return False + + +def send_msg(msg, level="DEBUG", message_prefix="", keyword="feapder报警系统\n"): if setting.WARNING_LEVEL == "ERROR": if level.upper() != "ERROR": return if setting.DINGDING_WARNING_URL: - keyword = "feapder报警系统\n" dingding_warning(keyword + msg, message_prefix=message_prefix) if setting.EMAIL_RECEIVER: @@ -2691,13 +2779,14 @@ def send_msg(msg, level="DEBUG", message_prefix=""): email_warning(msg, message_prefix=message_prefix, title=title) if setting.WECHAT_WARNING_URL: - keyword = "feapder报警系统\n" wechat_warning(keyword + msg, message_prefix=message_prefix) if setting.FEISHU_WARNING_URL: - keyword = "feapder报警系统\n" feishu_warning(keyword + msg, message_prefix=message_prefix) + if setting.QMSG_WARNING_URL: + qmsg_warning(keyword + msg, message_prefix=message_prefix) + ################### diff --git a/feapder/utils/webdriver/playwright_driver.py b/feapder/utils/webdriver/playwright_driver.py index 0d445c06..fe7e5062 100644 --- a/feapder/utils/webdriver/playwright_driver.py +++ b/feapder/utils/webdriver/playwright_driver.py @@ -59,7 +59,7 @@ def __init__( self.url = None self.storage_state_path = storage_state_path - self._driver_type = driver_type + self._driver_type = driver_type or "chromium" self._page_on_event_callback = page_on_event_callback self._url_regexes = url_regexes self._save_all = save_all diff --git a/feapder/utils/webdriver/selenium_driver.py b/feapder/utils/webdriver/selenium_driver.py index 594a029c..9f46d54b 100644 --- a/feapder/utils/webdriver/selenium_driver.py +++ b/feapder/utils/webdriver/selenium_driver.py @@ -29,6 +29,7 @@ 
class SeleniumDriver(WebDriver, RemoteWebDriver): CHROME = "CHROME" + EDGE = "EDGE" PHANTOMJS = "PHANTOMJS" FIREFOX = "FIREFOX" @@ -43,6 +44,8 @@ class SeleniumDriver(WebDriver, RemoteWebDriver): "keep_alive", } + __EDGE_ATTRS__ = __CHROME_ATTRS__ + __FIREFOX_ATTRS__ = { "firefox_profile", "firefox_binary", @@ -75,6 +78,7 @@ def __init__(self, xhr_url_regexes: list = None, **kwargs): """ super(SeleniumDriver, self).__init__(**kwargs) self._xhr_url_regexes = xhr_url_regexes + self._driver_type = self._driver_type or SeleniumDriver.CHROME if self._xhr_url_regexes and self._driver_type != SeleniumDriver.CHROME: raise Exception( @@ -84,6 +88,9 @@ def __init__(self, xhr_url_regexes: list = None, **kwargs): if self._driver_type == SeleniumDriver.CHROME: self.driver = self.chrome_driver() + elif self._driver_type == SeleniumDriver.EDGE: + self.driver = self.edge_driver() + elif self._driver_type == SeleniumDriver.PHANTOMJS: self.driver = self.phantomjs_driver() @@ -128,9 +135,18 @@ def get_driver(self): return self.driver def firefox_driver(self): + if webdriver.__version__ >= "4.0.0": + raise Exception( + f"暂未适配selenium=={webdriver.__version__}版本的firefox API,建议安装selenium==3.141.0版本或使用CHROME浏览器" + ) + firefox_profile = webdriver.FirefoxProfile() firefox_options = webdriver.FirefoxOptions() firefox_capabilities = webdriver.DesiredCapabilities.FIREFOX + try: + from selenium.webdriver.firefox.service import Service + except (ImportError, ModuleNotFoundError): + Service = None if self._proxy: proxy = self._proxy() if callable(self._proxy) else self._proxy @@ -162,10 +178,16 @@ def firefox_driver(self): kwargs = self.filter_kwargs(self._kwargs, self.__FIREFOX_ATTRS__) - if self._executable_path: - kwargs.update(executable_path=self._executable_path) - elif self._auto_install_driver: - kwargs.update(executable_path=GeckoDriverManager().install()) + if Service is None: + if self._executable_path: + kwargs.update(executable_path=self._executable_path) + elif 
self._auto_install_driver: + kwargs.update(executable_path=GeckoDriverManager().install()) + else: + if self._executable_path: + kwargs.update(service=Service(self._executable_path)) + elif self._auto_install_driver: + kwargs.update(service=Service(GeckoDriverManager().install())) driver = webdriver.Firefox( capabilities=firefox_capabilities, @@ -186,6 +208,10 @@ def chrome_driver(self): chrome_options.add_experimental_option("useAutomationExtension", False) # docker 里运行需要 chrome_options.add_argument("--no-sandbox") + try: + from selenium.webdriver.chrome.service import Service + except (ImportError, ModuleNotFoundError): + Service = None if self._proxy: chrome_options.add_argument( @@ -229,10 +255,16 @@ def chrome_driver(self): chrome_options.add_argument(arg) kwargs = self.filter_kwargs(self._kwargs, self.__CHROME_ATTRS__) - if self._executable_path: - kwargs.update(executable_path=self._executable_path) - elif self._auto_install_driver: - kwargs.update(executable_path=ChromeDriverManager().install()) + if Service is None: + if self._executable_path: + kwargs.update(executable_path=self._executable_path) + elif self._auto_install_driver: + kwargs.update(executable_path=ChromeDriverManager().install()) + else: + if self._executable_path: + kwargs.update(service=Service(self._executable_path)) + elif self._auto_install_driver: + kwargs.update(service=Service(ChromeDriverManager().install())) driver = webdriver.Chrome(options=chrome_options, **kwargs) @@ -273,6 +305,110 @@ def chrome_driver(self): return driver + def edge_driver(self): + edge_options = webdriver.EdgeOptions() + # 此步骤很重要,设置为开发者模式,防止被各大网站识别出来使用了Selenium + edge_options.add_experimental_option("excludeSwitches", ["enable-automation"]) + edge_options.add_experimental_option("useAutomationExtension", False) + # docker 里运行需要 + edge_options.add_argument("--no-sandbox") + try: + from selenium.webdriver.edge.service import Service + except (ImportError, ModuleNotFoundError): + Service = None + + if 
self._proxy: + edge_options.add_argument( + "--proxy-server={}".format( + self._proxy() if callable(self._proxy) else self._proxy + ) + ) + if self._user_agent: + edge_options.add_argument( + "user-agent={}".format( + self._user_agent() + if callable(self._user_agent) + else self._user_agent + ) + ) + if not self._load_images: + edge_options.add_experimental_option( + "prefs", {"profile.managed_default_content_settings.images": 2} + ) + + if self._headless: + edge_options.add_argument("--headless") + edge_options.add_argument("--disable-gpu") + + if self._window_size: + edge_options.add_argument( + "--window-size={},{}".format(self._window_size[0], self._window_size[1]) + ) + + if self._download_path: + os.makedirs(self._download_path, exist_ok=True) + prefs = { + "download.prompt_for_download": False, + "download.default_directory": self._download_path, + } + edge_options.add_experimental_option("prefs", prefs) + + # 添加自定义的配置参数 + if self._custom_argument: + for arg in self._custom_argument: + edge_options.add_argument(arg) + + kwargs = self.filter_kwargs(self._kwargs, self.__CHROME_ATTRS__) + if Service is None: + if self._executable_path: + kwargs.update(executable_path=self._executable_path) + elif self._auto_install_driver: + raise NotImplementedError("edge not support auto install driver") + else: + if self._executable_path: + kwargs.update(service=Service(self._executable_path)) + elif self._auto_install_driver: + raise NotImplementedError("edge not support auto install driver") + + driver = webdriver.Edge(options=edge_options, **kwargs) + + # 隐藏浏览器特征 + if self._use_stealth_js: + with open( + os.path.join(os.path.dirname(__file__), "../js/stealth.min.js") + ) as f: + js = f.read() + driver.execute_cdp_cmd( + "Page.addScriptToEvaluateOnNewDocument", {"source": js} + ) + + if self._xhr_url_regexes: + assert isinstance(self._xhr_url_regexes, list) + with open( + os.path.join(os.path.dirname(__file__), "../js/intercept.js") + ) as f: + js = f.read() + 
driver.execute_cdp_cmd( + "Page.addScriptToEvaluateOnNewDocument", {"source": js} + ) + js = f"window.__urlRegexes = {self._xhr_url_regexes}" + driver.execute_cdp_cmd( + "Page.addScriptToEvaluateOnNewDocument", {"source": js} + ) + + if self._download_path: + driver.command_executor._commands["send_command"] = ( + "POST", + "/session/$sessionId/chromium/send_command", + ) + params = { + "cmd": "Page.setDownloadBehavior", + "params": {"behavior": "allow", "downloadPath": self._download_path}, + } + driver.execute("send_command", params) + + return driver + def phantomjs_driver(self): import warnings diff --git a/feapder/utils/webdriver/webdirver.py b/feapder/utils/webdriver/webdirver.py index bfc38704..8fa2a34e 100644 --- a/feapder/utils/webdriver/webdirver.py +++ b/feapder/utils/webdriver/webdirver.py @@ -52,7 +52,7 @@ def __init__( user_agent: 字符串 或 无参函数,返回值为user_agent proxy: xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 headless: 是否启用无头模式 - driver_type: CHROME 或 PHANTOMJS,FIREFOX + driver_type: CHROME,EDGE 或 PHANTOMJS,FIREFOX timeout: 请求超时时间 window_size: # 窗口大小 executable_path: 浏览器路径,默认为默认路径 diff --git a/setup.py b/setup.py index a30cc072..cf4fe542 100644 --- a/setup.py +++ b/setup.py @@ -42,20 +42,26 @@ "requests>=2.22.0", "bs4>=0.0.1", "ipython>=7.14.0", - "redis-py-cluster>=2.1.0", "cryptography>=3.3.2", - "selenium>=3.141.0", - "pymongo>=3.10.1", "urllib3>=1.25.8", "loguru>=0.5.3", "influxdb>=5.3.1", "pyperclip>=1.8.2", - "webdriver-manager>=3.5.3", "terminal-layout>=2.1.3", +] + +render_requires = [ + "webdriver-manager>=4.0.0", "playwright", + "selenium>=3.141.0", ] -extras_requires = ["bitarray>=1.5.3", "PyExecJS>=1.5.1"] +all_requires = [ + "bitarray>=1.5.3", + "PyExecJS>=1.5.1", + "pymongo>=3.10.1", + "redis-py-cluster>=2.1.0", +] + render_requires setuptools.setup( name="feapder", @@ -64,11 +70,11 @@ license="MIT", author_email="feapder@qq.com", python_requires=">=3.6", - description="feapder是一款支持分布式、批次采集、任务防丢、报警丰富的python爬虫框架", + 
description="feapder是一款支持分布式、批次采集、数据防丢、报警丰富的python爬虫框架", long_description=long_description, long_description_content_type="text/markdown", install_requires=requires, - extras_require={"all": extras_requires}, + extras_require={"all": all_requires, "render": render_requires}, entry_points={"console_scripts": ["feapder = feapder.commands.cmdline:execute"]}, url="https://github.com/Boris-code/feapder.git", packages=packages, diff --git a/tests/air-spider/test_air_spider.py b/tests/air-spider/test_air_spider.py index 90301075..597bfe48 100644 --- a/tests/air-spider/test_air_spider.py +++ b/tests/air-spider/test_air_spider.py @@ -24,7 +24,7 @@ def end_callback(self): print("爬虫结束") def start_requests(self, *args, **kws): - for i in range(200): + for i in range(1): print(i) yield feapder.Request("https://www.baidu.com") diff --git a/tests/air-spider/test_render_spider.py b/tests/air-spider/test_render_spider.py new file mode 100644 index 00000000..3067a443 --- /dev/null +++ b/tests/air-spider/test_render_spider.py @@ -0,0 +1,29 @@ +# -*- coding: utf-8 -*- +""" +Created on 2020/4/22 10:41 PM +--------- +@summary: +--------- +@author: Boris +@email: boris_liu@foxmail.com +""" + +import feapder + + +class TestAirSpider(feapder.AirSpider): + def start_requests(self, *args, **kws): + yield feapder.Request("https://www.baidu.com", render=True) + + # def download_midware(self, request): + # request.proxies = { + # "http": "http://xxx.xxx.xxx.xxx:8888", + # "https": "http://xxx.xxx.xxx.xxx:8888", + # } + + def parse(self, request, response): + print(response.bs4().title) + + +if __name__ == "__main__": + TestAirSpider(thread_count=1).start() diff --git a/tests/batch-spider/spiders/test_spider.py b/tests/batch-spider/spiders/test_spider.py index bc213e78..684961bb 100644 --- a/tests/batch-spider/spiders/test_spider.py +++ b/tests/batch-spider/spiders/test_spider.py @@ -18,7 +18,7 @@ class TestSpider(feapder.BatchSpider): def start_requests(self, task): # task 为在任务表中取出的每一条任务 id, 
url = task # id, url为所取的字段,main函数中指定的 - yield feapder.Request(url, task_id=id) + yield feapder.Request(url, task_id=id, render=True) # task_id为任务id,用于更新任务状态 def parse(self, request, response): title = response.xpath('//title/text()').extract_first() # 取标题 diff --git a/tests/task-spider/test_task_spider.py b/tests/task-spider/test_task_spider.py index 8fba0931..3a361633 100644 --- a/tests/task-spider/test_task_spider.py +++ b/tests/task-spider/test_task_spider.py @@ -13,7 +13,7 @@ class TestTaskSpider(feapder.TaskSpider): def add_task(self): - # 加种子任务 + # 加种子任务 框架会调用这个函数,方便往redis里塞任务,但不能写成死循环。实际业务中可以自己写个脚本往redis里塞任务 self._redisdb.zadd(self._task_table, {"id": 1, "url": "https://www.baidu.com"}) def start_requests(self, task): @@ -40,7 +40,6 @@ def start(args): task_keys=["id", "url"], redis_key="test:task_spider", keep_alive=True, - delete_keys=True, ) if args == 1: spider.start_monitor_task() @@ -56,8 +55,8 @@ def start2(args): task_table="spider_task2", task_table_type="redis", redis_key="test:task_spider", - keep_alive=False, - delete_keys=True, + keep_alive=True, + use_mysql=False, ) if args == 1: spider.start_monitor_task() @@ -68,8 +67,12 @@ def start2(args): if __name__ == "__main__": parser = ArgumentParser(description="测试TaskSpider") - parser.add_argument("--start", type=int, nargs=1, help="用mysql做种子表 (1|2)", function=start) - parser.add_argument("--start2", type=int, nargs=1, help="用redis做种子表 (1|2)", function=start2) + parser.add_argument( + "--start", type=int, nargs=1, help="用mysql做种子表 (1|2)", function=start + ) + parser.add_argument( + "--start2", type=int, nargs=1, help="用redis做种子表 (1|2)", function=start2 + ) parser.start() diff --git a/tests/test-debugger/README.md b/tests/test-debugger/README.md new file mode 100644 index 00000000..c160ae2c --- /dev/null +++ b/tests/test-debugger/README.md @@ -0,0 +1,8 @@ +# xxx爬虫文档 +## 调研 + +## 数据库设计 + +## 爬虫逻辑 + +## 项目架构 \ No newline at end of file diff --git a/tests/test-debugger/items/__init__.py 
b/tests/test-debugger/items/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/tests/test-debugger/main.py b/tests/test-debugger/main.py new file mode 100644 index 00000000..929f347b --- /dev/null +++ b/tests/test-debugger/main.py @@ -0,0 +1,19 @@ +# -*- coding: utf-8 -*- +""" +Created on 2023-06-09 20:26:29 +--------- +@summary: 爬虫入口 +--------- +@author: Boris +""" + +import feapder + +from spiders import * + + +if __name__ == "__main__": + test_debugger.TestDebugger.to_DebugSpider( + request=feapder.Request("https://spidertools.cn", render=True), + redis_key="test:xxx", + ).start() diff --git a/tests/test-debugger/setting.py b/tests/test-debugger/setting.py new file mode 100644 index 00000000..2191f57c --- /dev/null +++ b/tests/test-debugger/setting.py @@ -0,0 +1,185 @@ +# -*- coding: utf-8 -*- +"""爬虫配置文件""" +# import os +# import sys +# +# # MYSQL +# MYSQL_IP = "localhost" +# MYSQL_PORT = 3306 +# MYSQL_DB = "" +# MYSQL_USER_NAME = "" +# MYSQL_USER_PASS = "" +# +# # MONGODB +# MONGO_IP = "localhost" +# MONGO_PORT = 27017 +# MONGO_DB = "" +# MONGO_USER_NAME = "" +# MONGO_USER_PASS = "" +# +# # REDIS +# # ip:port 多个可写为列表或者逗号隔开 如 ip1:port1,ip2:port2 或 ["ip1:port1", "ip2:port2"] +# REDISDB_IP_PORTS = "localhost:6379" +# REDISDB_USER_PASS = "" +# REDISDB_DB = 0 +# # 连接redis时携带的其他参数,如ssl=True +# REDISDB_KWARGS = dict() +# # 适用于redis哨兵模式 +# REDISDB_SERVICE_NAME = "" +# +# # 数据入库的pipeline,可自定义,默认MysqlPipeline +# ITEM_PIPELINES = [ +# "feapder.pipelines.mysql_pipeline.MysqlPipeline", +# # "feapder.pipelines.mongo_pipeline.MongoPipeline", +# # "feapder.pipelines.console_pipeline.ConsolePipeline", +# ] +# EXPORT_DATA_MAX_FAILED_TIMES = 10 # 导出数据时最大的失败次数,包括保存和更新,超过这个次数报警 +# EXPORT_DATA_MAX_RETRY_TIMES = 10 # 导出数据时最大的重试次数,包括保存和更新,超过这个次数则放弃重试 +# +# # 爬虫相关 +# # COLLECTOR +# COLLECTOR_TASK_COUNT = 32 # 每次获取任务数量,追求速度推荐32 +# +# # SPIDER +# SPIDER_THREAD_COUNT = 1 # 爬虫并发数,追求速度推荐32 +# # 下载时间间隔 单位秒。 支持随机 如 SPIDER_SLEEP_TIME = [2, 5] 则间隔为 2~5秒之间的随机数,包含2和5 +# 
SPIDER_SLEEP_TIME = 0 +# SPIDER_MAX_RETRY_TIMES = 10 # 每个请求最大重试次数 +# KEEP_ALIVE = False # 爬虫是否常驻 + +# 下载 +# DOWNLOADER = "feapder.network.downloader.RequestsDownloader" +# SESSION_DOWNLOADER = "feapder.network.downloader.RequestsSessionDownloader" +# RENDER_DOWNLOADER = "feapder.network.downloader.SeleniumDownloader" +# # RENDER_DOWNLOADER="feapder.network.downloader.PlaywrightDownloader" +# MAKE_ABSOLUTE_LINKS = True # 自动转成绝对连接 + +# # 浏览器渲染 +WEBDRIVER = dict( + pool_size=1, # 浏览器的数量 + load_images=True, # 是否加载图片 + user_agent=None, # 字符串 或 无参函数,返回值为user_agent + proxy=None, # xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 + headless=False, # 是否为无头浏览器 + driver_type="CHROME", # CHROME、EDGE、PHANTOMJS、FIREFOX + timeout=30, # 请求超时时间 + window_size=(1024, 800), # 窗口大小 + executable_path=None, # 浏览器路径,默认为默认路径 + render_time=0, # 渲染时长,即打开网页等待指定时间后再获取源码 + custom_argument=[ + "--ignore-certificate-errors", + "--disable-blink-features=AutomationControlled", + ], # 自定义浏览器渲染参数 + xhr_url_regexes=None, # 拦截xhr接口,支持正则,数组类型 + auto_install_driver=True, # 自动下载浏览器驱动 支持chrome 和 firefox + download_path=None, # 下载文件的路径 + use_stealth_js=False, # 使用stealth.min.js隐藏浏览器特征 +) + +# PLAYWRIGHT = dict( +# user_agent=None, # 字符串 或 无参函数,返回值为user_agent +# proxy=None, # xxx.xxx.xxx.xxx:xxxx 或 无参函数,返回值为代理地址 +# headless=False, # 是否为无头浏览器 +# driver_type="chromium", # chromium、firefox、webkit +# timeout=30, # 请求超时时间 +# window_size=(1024, 800), # 窗口大小 +# executable_path=None, # 浏览器路径,默认为默认路径 +# download_path=None, # 下载文件的路径 +# render_time=0, # 渲染时长,即打开网页等待指定时间后再获取源码 +# wait_until="networkidle", # 等待页面加载完成的事件,可选值:"commit", "domcontentloaded", "load", "networkidle" +# use_stealth_js=False, # 使用stealth.min.js隐藏浏览器特征 +# page_on_event_callback=None, # page.on() 事件的回调 如 page_on_event_callback={"dialog": lambda dialog: dialog.accept()} +# storage_state_path=None, # 保存浏览器状态的路径 +# url_regexes=None, # 拦截接口,支持正则,数组类型 +# save_all=False, # 是否保存所有拦截的接口, 配合url_regexes使用,为False时只保存最后一次拦截的接口 +# ) +# +# # 爬虫启动时,重新抓取失败的requests +# 
RETRY_FAILED_REQUESTS = False +# # 爬虫启动时,重新入库失败的item +# RETRY_FAILED_ITEMS = False +# # 保存失败的request +# SAVE_FAILED_REQUEST = True +# # request防丢机制。(指定的REQUEST_LOST_TIMEOUT时间内request还没做完,会重新下发 重做) +# REQUEST_LOST_TIMEOUT = 600 # 10分钟 +# # request网络请求超时时间 +# REQUEST_TIMEOUT = 22 # 等待服务器响应的超时时间,浮点数,或(connect timeout, read timeout)元组 +# # item在内存队列中最大缓存数量 +# ITEM_MAX_CACHED_COUNT = 5000 +# # item每批入库的最大数量 +# ITEM_UPLOAD_BATCH_MAX_SIZE = 1000 +# # item入库时间间隔 +# ITEM_UPLOAD_INTERVAL = 1 +# # 内存任务队列最大缓存的任务数,默认不限制;仅对AirSpider有效。 +# TASK_MAX_CACHED_SIZE = 0 +# +# # 下载缓存 利用redis缓存,但由于内存大小限制,所以建议仅供开发调试代码时使用,防止每次debug都需要网络请求 +# RESPONSE_CACHED_ENABLE = False # 是否启用下载缓存 成本高的数据或容易变需求的数据,建议设置为True +# RESPONSE_CACHED_EXPIRE_TIME = 3600 # 缓存时间 秒 +# RESPONSE_CACHED_USED = False # 是否使用缓存 补采数据时可设置为True +# +# # 设置代理 +# PROXY_EXTRACT_API = None # 代理提取API ,返回的代理分割符为\r\n +# PROXY_ENABLE = True +# +# # 随机headers +# RANDOM_HEADERS = True +# # UserAgent类型 支持 'chrome', 'opera', 'firefox', 'internetexplorer', 'safari','mobile' 若不指定则随机类型 +# USER_AGENT_TYPE = "chrome" +# # 默认使用的浏览器头 +# DEFAULT_USERAGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36" +# # requests 使用session +# USE_SESSION = False +# +# # 去重 +# ITEM_FILTER_ENABLE = False # item 去重 +# REQUEST_FILTER_ENABLE = False # request 去重 +# ITEM_FILTER_SETTING = dict( +# filter_type=1 # 永久去重(BloomFilter) = 1 、内存去重(MemoryFilter) = 2、 临时去重(ExpireFilter)= 3、轻量去重(LiteFilter)= 4 +# ) +# REQUEST_FILTER_SETTING = dict( +# filter_type=3, # 永久去重(BloomFilter) = 1 、内存去重(MemoryFilter) = 2、 临时去重(ExpireFilter)= 3、 轻量去重(LiteFilter)= 4 +# expire_time=2592000, # 过期时间1个月 +# ) +# +# # 报警 支持钉钉、飞书、企业微信、邮件 +# # 钉钉报警 +# DINGDING_WARNING_URL = "" # 钉钉机器人api +# DINGDING_WARNING_PHONE = "" # 报警人 支持列表,可指定多个 +# DINGDING_WARNING_ALL = False # 是否提示所有人, 默认为False +# # 飞书报警 +# # https://open.feishu.cn/document/ukTMukTMukTM/ucTM5YjL3ETO24yNxkjN#e1cdee9f +# FEISHU_WARNING_URL = "" # 飞书机器人api +# 
FEISHU_WARNING_USER = None # 报警人 {"open_id":"ou_xxxxx", "name":"xxxx"} 或 [{"open_id":"ou_xxxxx", "name":"xxxx"}] +# FEISHU_WARNING_ALL = False # 是否提示所有人, 默认为False +# # 邮件报警 +# EMAIL_SENDER = "" # 发件人 +# EMAIL_PASSWORD = "" # 授权码 +# EMAIL_RECEIVER = "" # 收件人 支持列表,可指定多个 +# EMAIL_SMTPSERVER = "smtp.163.com" # 邮件服务器 默认为163邮箱 +# # 企业微信报警 +# WECHAT_WARNING_URL = "" # 企业微信机器人api +# WECHAT_WARNING_PHONE = "" # 报警人 将会在群内@此人, 支持列表,可指定多人 +# WECHAT_WARNING_ALL = False # 是否提示所有人, 默认为False +# # 时间间隔 +# WARNING_INTERVAL = 3600 # 相同报警的报警时间间隔,防止刷屏; 0表示不去重 +# WARNING_LEVEL = "DEBUG" # 报警级别, DEBUG / INFO / ERROR +# WARNING_FAILED_COUNT = 1000 # 任务失败数 超过WARNING_FAILED_COUNT则报警 +# +# LOG_NAME = os.path.basename(os.getcwd()) +# LOG_PATH = "log/%s.log" % LOG_NAME # log存储路径 +# LOG_LEVEL = "DEBUG" +# LOG_COLOR = True # 是否带有颜色 +# LOG_IS_WRITE_TO_CONSOLE = True # 是否打印到控制台 +# LOG_IS_WRITE_TO_FILE = False # 是否写文件 +# LOG_MODE = "w" # 写文件的模式 +# LOG_MAX_BYTES = 10 * 1024 * 1024 # 每个日志文件的最大字节数 +# LOG_BACKUP_COUNT = 20 # 日志文件保留数量 +# LOG_ENCODING = "utf8" # 日志文件编码 +# OTHERS_LOG_LEVAL = "ERROR" # 第三方库的log等级 +# +# # 切换工作路径为当前项目路径 +# project_path = os.path.abspath(os.path.dirname(__file__)) +# os.chdir(project_path) # 切换工作路经 +# sys.path.insert(0, project_path) +# print("当前工作路径为 " + os.getcwd()) diff --git a/tests/test-debugger/spiders/__init__.py b/tests/test-debugger/spiders/__init__.py new file mode 100644 index 00000000..4243fbe2 --- /dev/null +++ b/tests/test-debugger/spiders/__init__.py @@ -0,0 +1,3 @@ +__all__ = [ + "test_debugger" +] \ No newline at end of file diff --git a/tests/test-debugger/spiders/test_debugger.py b/tests/test-debugger/spiders/test_debugger.py new file mode 100644 index 00000000..2ef73f56 --- /dev/null +++ b/tests/test-debugger/spiders/test_debugger.py @@ -0,0 +1,28 @@ +# -*- coding: utf-8 -*- +""" +Created on 2023-06-09 20:26:47 +--------- +@summary: +--------- +@author: Boris +""" + +import feapder + + +class TestDebugger(feapder.Spider): + def start_requests(self): + 
yield feapder.Request("https://spidertools.cn", render=True) + + def parse(self, request, response): + # 提取网站title + print(response.xpath("//title/text()").extract_first()) + # 提取网站描述 + print(response.xpath("//meta[@name='description']/@content").extract_first()) + print("网站地址: ", response.url) + + +if __name__ == "__main__": + TestDebugger.to_DebugSpider( + request=feapder.Request("https://spidertools.cn", render=True), redis_key="test:xxx" + ).start() diff --git a/tests/test-pipeline/items/spider_data_item.py b/tests/test-pipeline/items/spider_data_item.py index 3072d9a5..1960649a 100644 --- a/tests/test-pipeline/items/spider_data_item.py +++ b/tests/test-pipeline/items/spider_data_item.py @@ -8,6 +8,7 @@ """ from feapder import Item +from feapder.pipelines.csv_pipeline import CsvPipeline class SpiderDataItem(Item): @@ -15,6 +16,7 @@ class SpiderDataItem(Item): This class was generated by feapder. command: feapder create -i spider_data. """ + __pipelines__ = [CsvPipeline()] def __init__(self, *args, **kwargs): # self.id = None # type : int(10) unsigned | allow_null : NO | key : PRI | default_value : None | extra : auto_increment | column_comment : diff --git a/tests/test-pipeline/setting.py b/tests/test-pipeline/setting.py index ca852ad4..ba985f09 100644 --- a/tests/test-pipeline/setting.py +++ b/tests/test-pipeline/setting.py @@ -19,7 +19,8 @@ # 数据入库的pipeline,可自定义,默认MysqlPipeline ITEM_PIPELINES = [ - "pipeline.Pipeline" + "pipeline.Pipeline", + # "feapder.pipelines.csv_pipeline.CsvPipeline" ] # # 爬虫相关 diff --git a/tests/test-pipeline/spiders/test_csv_pipeline_spider.py b/tests/test-pipeline/spiders/test_csv_pipeline_spider.py new file mode 100644 index 00000000..83d4b842 --- /dev/null +++ b/tests/test-pipeline/spiders/test_csv_pipeline_spider.py @@ -0,0 +1,28 @@ +# -*- coding: utf-8 -*- +""" +Created on 2025-12-16 14:52:29 +--------- +@summary: +--------- +@author: Boris +""" + +import feapder +from items import * + + +class 
TestCsvPipelineSpider(feapder.AirSpider): + def start_requests(self): + for i in range(100): + yield feapder.Request("https://baidu.com", page=i) + + def parse(self, request, response): + # 提取网站title + title = response.xpath("//title/text()").extract_first() + item = spider_data_item.SpiderDataItem() # 声明一个item + item.title = title # 给item属性赋值 + yield item # 返回item, item会自动批量入库 + + +if __name__ == "__main__": + TestCsvPipelineSpider().start() diff --git a/tests/test_csv_pipeline/test_functionality.py b/tests/test_csv_pipeline/test_functionality.py new file mode 100644 index 00000000..190c9137 --- /dev/null +++ b/tests/test_csv_pipeline/test_functionality.py @@ -0,0 +1,454 @@ +# -*- coding: utf-8 -*- +""" +CSV Pipeline 功能测试 + +测试内容: +1. 基础功能测试 +2. 异常处理测试 +3. 边界条件测试 +4. 兼容性测试 + +Created on 2025-10-16 +@author: 道长 +@email: ctrlf4@yeah.net +""" + +import csv +import os +import sys +import shutil +from pathlib import Path + +# 添加项目路径 +sys.path.insert(0, str(Path(__file__).parent.parent.parent)) + +from feapder.pipelines.csv_pipeline import CsvPipeline + + +class FunctionalityTester: + """CSV Pipeline 功能测试器""" + + def __init__(self, test_dir="test_output"): + """初始化测试器""" + self.test_dir = test_dir + self.pipeline = None + self.passed = 0 + self.failed = 0 + + def setup(self): + """测试前准备""" + if os.path.exists(self.test_dir): + shutil.rmtree(self.test_dir) + + os.makedirs(self.test_dir, exist_ok=True) + + csv_dir = os.path.join(self.test_dir, "csv") + self.pipeline = CsvPipeline(csv_dir=csv_dir) + + print(f"✅ 测试环境准备完成") + + def teardown(self): + """测试后清理""" + if self.pipeline: + self.pipeline.close() + + def assert_true(self, condition, message): + """断言真""" + if condition: + print(f" ✅ {message}") + self.passed += 1 + else: + print(f" ❌ {message}") + self.failed += 1 + + def assert_false(self, condition, message): + """断言假""" + self.assert_true(not condition, message) + + def assert_equal(self, actual, expected, message): + """断言相等""" + if actual == expected: + print(f" 
✅ {message}") + self.passed += 1 + else: + print(f" ❌ {message} (期望: {expected}, 实际: {actual})") + self.failed += 1 + + def test_basic_save(self): + """测试基础保存功能""" + print("\n" + "=" * 80) + print("测试 1: 基础保存功能") + print("=" * 80) + + # 测试保存单条数据 + item = {"id": 1, "name": "Test Product", "price": 99.99} + result = self.pipeline.save_items("product", [item]) + self.assert_true(result, "保存单条数据") + + # 检查文件是否创建 + csv_file = os.path.join(self.pipeline.csv_dir, "product.csv") + self.assert_true(os.path.exists(csv_file), "CSV 文件已创建") + + # 检查数据是否正确 + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + rows = list(reader) + self.assert_equal(len(rows), 1, "文件中有 1 条数据") + if rows: + self.assert_equal(rows[0]["id"], "1", "数据 ID 正确") + self.assert_equal(rows[0]["name"], "Test Product", "数据名称正确") + + def test_batch_save(self): + """测试批量保存""" + print("\n" + "=" * 80) + print("测试 2: 批量保存功能") + print("=" * 80) + + # 生成测试数据 + items = [] + for i in range(10): + items.append({ + "id": i + 1, + "name": f"Product_{i + 1}", + "price": 100 + i, + }) + + result = self.pipeline.save_items("batch_test", items) + self.assert_true(result, "批量保存 10 条数据") + + # 检查数据行数 + csv_file = os.path.join(self.pipeline.csv_dir, "batch_test.csv") + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + rows = list(reader) + self.assert_equal(len(rows), 10, "批量保存数据行数正确") + + def test_empty_items(self): + """测试空数据处理""" + print("\n" + "=" * 80) + print("测试 3: 空数据处理") + print("=" * 80) + + result = self.pipeline.save_items("empty_test", []) + self.assert_true(result, "空数据列表返回 True") + + def test_special_characters(self): + """测试特殊字符处理""" + print("\n" + "=" * 80) + print("测试 4: 特殊字符处理") + print("=" * 80) + + items = [ + { + "id": 1, + "name": "产品名称", + "description": 'Contains "quotes" and, commas', + "emoji": "😀🎉🚀", + "newline": "Line1\nLine2", + } + ] + + result = self.pipeline.save_items("special_chars", items) + 
self.assert_true(result, "保存包含特殊字符的数据") + + # 读取并检查 + csv_file = os.path.join(self.pipeline.csv_dir, "special_chars.csv") + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + rows = list(reader) + if rows: + self.assert_equal(rows[0]["name"], "产品名称", "中文字符正确") + self.assert_equal( + rows[0].get("emoji", ""), + "😀🎉🚀", + "Emoji 正确" + ) + + def test_multiple_tables(self): + """测试多表存储""" + print("\n" + "=" * 80) + print("测试 5: 多表存储") + print("=" * 80) + + tables = ["product", "user", "order"] + for table in tables: + item = {"id": 1, "name": f"Test {table}"} + result = self.pipeline.save_items(table, [item]) + self.assert_true(result, f"保存到表 {table}") + + # 检查所有文件 + for table in tables: + csv_file = os.path.join(self.pipeline.csv_dir, f"{table}.csv") + self.assert_true(os.path.exists(csv_file), f"表 {table} 的 CSV 文件存在") + + def test_header_only_once(self): + """测试表头只写一次""" + print("\n" + "=" * 80) + print("测试 6: 表头只写一次") + print("=" * 80) + + table = "header_test" + + # 第一次写入 + items1 = [{"id": 1, "name": "Product 1"}] + self.pipeline.save_items(table, items1) + + # 第二次写入 + items2 = [{"id": 2, "name": "Product 2"}] + self.pipeline.save_items(table, items2) + + # 检查表头行数 + csv_file = os.path.join(self.pipeline.csv_dir, f"{table}.csv") + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + lines = f.readlines() + # 应该是:1 个表头 + 2 条数据 + self.assert_equal(len(lines), 3, "文件中只有 1 行表头和 2 行数据") + + def test_numeric_values(self): + """测试数值类型""" + print("\n" + "=" * 80) + print("测试 7: 数值类型处理") + print("=" * 80) + + items = [ + { + "id": 1, + "price": 99.99, + "stock": 100, + "rating": 4.5, + "active": True, + } + ] + + result = self.pipeline.save_items("numeric_test", items) + self.assert_true(result, "保存包含各类数值的数据") + + # 读取并检查 + csv_file = os.path.join(self.pipeline.csv_dir, "numeric_test.csv") + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + rows = list(reader) + if rows: + 
self.assert_equal(rows[0]["price"], "99.99", "浮点数正确") + self.assert_equal(rows[0]["stock"], "100", "整数正确") + self.assert_equal(rows[0]["rating"], "4.5", "小数正确") + + def test_large_values(self): + """测试大值处理""" + print("\n" + "=" * 80) + print("测试 8: 大值处理") + print("=" * 80) + + large_text = "x" * 10000 # 10KB 的文本 + items = [ + { + "id": 1, + "name": "Large Content", + "content": large_text, + } + ] + + result = self.pipeline.save_items("large_test", items) + self.assert_true(result, "保存大内容数据") + + # 检查数据完整性 + csv_file = os.path.join(self.pipeline.csv_dir, "large_test.csv") + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + rows = list(reader) + if rows: + self.assert_equal( + len(rows[0]["content"]), + len(large_text), + "大内容数据完整" + ) + + def test_update_items_fallback(self): + """测试 update_items 降级为 save""" + print("\n" + "=" * 80) + print("测试 9: update_items 降级为 save") + print("=" * 80) + + items = [{"id": 1, "name": "Product 1", "price": 100}] + result = self.pipeline.update_items("update_test", items, ("price",)) + self.assert_true(result, "update_items 返回 True") + + # 检查数据是否存在 + csv_file = os.path.join(self.pipeline.csv_dir, "update_test.csv") + self.assert_true(os.path.exists(csv_file), "update_items 创建了 CSV 文件") + + def test_file_operations(self): + """测试文件操作""" + print("\n" + "=" * 80) + print("测试 10: 文件操作") + print("=" * 80) + + items = [{"id": 1, "name": "Test"}] + table = "file_test" + + result = self.pipeline.save_items(table, items) + self.assert_true(result, "保存数据") + + csv_file = os.path.join(self.pipeline.csv_dir, f"{table}.csv") + + # 检查文件是否可读 + try: + with open(csv_file, 'r', encoding='utf-8') as f: + f.read() + self.assert_true(True, "CSV 文件可读") + except Exception as e: + self.assert_true(False, f"CSV 文件可读 ({e})") + + # 检查文件大小 + file_size = os.path.getsize(csv_file) + self.assert_true(file_size > 0, f"CSV 文件大小 > 0 ({file_size} 字节)") + + def test_concurrent_same_table(self): + """测试同表并发写入""" + print("\n" + 
"=" * 80) + print("测试 11: 同表并发写入(Per-Table Lock)") + print("=" * 80) + + import threading + + table = "concurrent_same_table" + errors = [] + + def write_data(thread_id): + try: + items = [{"id": thread_id, "name": f"Item_{thread_id}"}] + result = self.pipeline.save_items(table, items) + if not result: + errors.append(f"线程{thread_id}写入失败") + except Exception as e: + errors.append(f"线程{thread_id}异常: {e}") + + # 创建多个线程 + threads = [] + for i in range(5): + t = threading.Thread(target=write_data, args=(i,)) + t.start() + threads.append(t) + + # 等待所有线程完成 + for t in threads: + t.join() + + self.assert_equal(len(errors), 0, "并发写入无错误") + + # 检查数据完整性 + csv_file = os.path.join(self.pipeline.csv_dir, f"{table}.csv") + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + rows = list(reader) + self.assert_true(len(rows) > 0, "并发写入产生了数据") + + def test_directory_creation(self): + """测试目录自动创建""" + print("\n" + "=" * 80) + print("测试 12: 目录自动创建") + print("=" * 80) + + # 创建新的 pipeline 实例,指定不存在的目录 + new_csv_dir = os.path.join(self.test_dir, "new_csv_dir") + self.assert_false(os.path.exists(new_csv_dir), "新目录不存在") + + new_pipeline = CsvPipeline(csv_dir=new_csv_dir) + self.assert_true(os.path.exists(new_csv_dir), "目录自动创建") + + new_pipeline.close() + + def test_none_values(self): + """测试 None 值处理""" + print("\n" + "=" * 80) + print("测试 13: None 值处理") + print("=" * 80) + + items = [ + { + "id": 1, + "name": "Product", + "description": None, + "optional_field": "", + } + ] + + result = self.pipeline.save_items("none_test", items) + self.assert_true(result, "保存包含 None 值的数据") + + # 检查文件 + csv_file = os.path.join(self.pipeline.csv_dir, "none_test.csv") + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + rows = list(reader) + if rows: + # None 会被转换为字符串 "None" + self.assert_true("None" in rows[0]["description"], + "None 值被正确处理") + + def run_all_tests(self): + """运行所有测试""" + print("\n") + print("╔" + "═" * 78 + 
"╗") + print("║" + " CSV Pipeline 功能测试 ".center(78) + "║") + print("║" + " 作者: 道长 | 日期: 2025-10-16 ".center(78) + "║") + print("╚" + "═" * 78 + "╝") + + try: + self.setup() + + # 运行所有测试 + self.test_basic_save() + self.test_batch_save() + self.test_empty_items() + self.test_special_characters() + self.test_multiple_tables() + self.test_header_only_once() + self.test_numeric_values() + self.test_large_values() + self.test_update_items_fallback() + self.test_file_operations() + self.test_concurrent_same_table() + self.test_directory_creation() + self.test_none_values() + + # 打印总结 + self.print_summary() + + return self.failed == 0 + + except Exception as e: + print(f"\n❌ 测试过程中出错: {e}") + import traceback + traceback.print_exc() + return False + + finally: + self.teardown() + + def print_summary(self): + """打印测试总结""" + print("\n" + "=" * 80) + print("测试总结") + print("=" * 80) + print(f"✅ 通过: {self.passed}") + print(f"❌ 失败: {self.failed}") + print(f"总计: {self.passed + self.failed}") + + if self.failed == 0: + print("\n🎉 所有测试通过!") + else: + print(f"\n⚠️ 有 {self.failed} 个测试失败") + + print("=" * 80) + + +def main(): + """主函数""" + tester = FunctionalityTester(test_dir="tests/test_csv_pipeline/test_output_func") + success = tester.run_all_tests() + return 0 if success else 1 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/tests/test_csv_pipeline/test_performance.py b/tests/test_csv_pipeline/test_performance.py new file mode 100644 index 00000000..94eb64a7 --- /dev/null +++ b/tests/test_csv_pipeline/test_performance.py @@ -0,0 +1,537 @@ +# -*- coding: utf-8 -*- +""" +CSV Pipeline 性能测试 + +测试内容: +1. 批量写入性能 +2. 并发写入性能 +3. 内存占用情况 +4. 
文件大小和数据完整性 + +Created on 2025-10-16 +@author: 道长 +@email: ctrlf4@yeah.net +""" + +import csv +import os +import sys +import time +import shutil +import threading +import psutil +from pathlib import Path +from typing import List, Dict + +# 添加项目路径 +sys.path.insert(0, str(Path(__file__).parent.parent.parent)) + +from feapder.pipelines.csv_pipeline import CsvPipeline + + +class PerformanceTester: + """CSV Pipeline 性能测试器""" + + def __init__(self, test_dir="test_output"): + """初始化测试器""" + self.test_dir = test_dir + self.pipeline = None + self.process = psutil.Process() + self.test_results = {} + + def setup(self): + """测试前准备""" + # 清理历史测试目录 + if os.path.exists(self.test_dir): + shutil.rmtree(self.test_dir) + + # 创建测试输出目录 + os.makedirs(self.test_dir, exist_ok=True) + + # 初始化 Pipeline + csv_dir = os.path.join(self.test_dir, "csv") + self.pipeline = CsvPipeline(csv_dir=csv_dir) + + print(f"✅ 测试环境准备完成,输出目录: {self.test_dir}") + + def teardown(self): + """测试后清理""" + if self.pipeline: + self.pipeline.close() + + def generate_test_data(self, count: int) -> List[Dict]: + """生成测试数据""" + data = [] + for i in range(count): + data.append({ + "id": i + 1, + "name": f"Product_{i + 1}", + "price": 99.99 + i * 0.1, + "category": "Electronics", + "url": f"https://example.com/product/{i + 1}", + "stock": 100 - (i % 50), + "rating": 4.5 + (i % 5) * 0.1, + "description": f"Description for product {i + 1}" * 3, + }) + return data + + def test_single_batch_performance(self): + """测试单批写入性能""" + print("\n" + "=" * 80) + print("测试 1: 单批写入性能") + print("=" * 80) + + batch_sizes = [100, 500, 1000, 5000] + results = {} + + for batch_size in batch_sizes: + data = self.generate_test_data(batch_size) + + # 测试写入时间 + start_time = time.time() + success = self.pipeline.save_items("product", data) + elapsed = time.time() - start_time + + # 测试结果 + results[batch_size] = { + "success": success, + "elapsed_time": elapsed, + "throughput": batch_size / elapsed if elapsed > 0 else 0, + } + + print(f"批量大小: 
{batch_size:5d} | " + f"耗时: {elapsed:.4f}s | " + f"吞吐量: {results[batch_size]['throughput']:.0f} 条/秒 | " + f"状态: {'✅' if success else '❌'}") + + self.test_results["single_batch"] = results + return results + + def test_concurrent_write_performance(self): + """测试并发写入性能""" + print("\n" + "=" * 80) + print("测试 2: 并发写入性能(模拟多爬虫线程)") + print("=" * 80) + + thread_counts = [1, 2, 4, 8] + results = {} + + for thread_count in thread_counts: + # 每个线程写入的数据条数 + items_per_thread = 100 + total_items = thread_count * items_per_thread + + def write_thread(thread_id): + """线程工作函数""" + data = self.generate_test_data(items_per_thread) + # 为了模拟不同表,使用不同的表名 + table_name = f"product_thread_{thread_id}" + return self.pipeline.save_items(table_name, data) + + # 记录初始内存 + mem_before = self.process.memory_info().rss / 1024 / 1024 + + # 并发执行 + start_time = time.time() + threads = [] + for i in range(thread_count): + t = threading.Thread(target=write_thread, args=(i,)) + t.start() + threads.append(t) + + # 等待所有线程完成 + for t in threads: + t.join() + + elapsed = time.time() - start_time + mem_after = self.process.memory_info().rss / 1024 / 1024 + mem_delta = mem_after - mem_before + + results[thread_count] = { + "total_items": total_items, + "elapsed_time": elapsed, + "throughput": total_items / elapsed if elapsed > 0 else 0, + "memory_delta_mb": mem_delta, + } + + print(f"线程数: {thread_count} | " + f"总数据: {total_items:5d} | " + f"耗时: {elapsed:.4f}s | " + f"吞吐量: {results[thread_count]['throughput']:.0f} 条/秒 | " + f"内存增长: {mem_delta:.2f}MB") + + self.test_results["concurrent_write"] = results + return results + + def test_memory_usage(self): + """测试内存占用""" + print("\n" + "=" * 80) + print("测试 3: 内存占用情况") + print("=" * 80) + + # 测试不同数量的数据对内存的影响 + test_counts = [1000, 5000, 10000, 50000] + results = {} + + for count in test_counts: + data = self.generate_test_data(count) + + # 记录内存 + mem_before = self.process.memory_info().rss / 1024 / 1024 + + # 执行写入 + start_time = time.time() + 
self.pipeline.save_items("product_memory", data) + elapsed = time.time() - start_time + + mem_after = self.process.memory_info().rss / 1024 / 1024 + mem_used = mem_after - mem_before + mem_per_item = mem_used / count if count > 0 else 0 + + results[count] = { + "memory_before_mb": mem_before, + "memory_after_mb": mem_after, + "memory_used_mb": mem_used, + "memory_per_item_kb": mem_per_item * 1024, + "elapsed_time": elapsed, + } + + print(f"数据条数: {count:6d} | " + f"内存占用: {mem_used:6.2f}MB | " + f"每条数据: {mem_per_item * 1024:.2f}KB | " + f"耗时: {elapsed:.4f}s") + + self.test_results["memory_usage"] = results + return results + + def test_file_integrity(self): + """测试文件完整性""" + print("\n" + "=" * 80) + print("测试 4: 文件完整性检查") + print("=" * 80) + + # 写入测试数据 + test_data = self.generate_test_data(1000) + table_name = "product_integrity" + + success = self.pipeline.save_items(table_name, test_data) + + if not success: + print("❌ 写入失败") + return {"status": "failed"} + + # 检查文件是否存在 + csv_file = os.path.join(self.pipeline.csv_dir, f"{table_name}.csv") + if not os.path.exists(csv_file): + print("❌ CSV 文件不存在") + return {"status": "file_not_found"} + + # 读取 CSV 文件并检查数据完整性 + read_data = [] + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + for row in reader: + read_data.append(row) + + # 对比数据 + if len(read_data) != len(test_data): + print(f"❌ 数据条数不符: 写入{len(test_data)}条,读取{len(read_data)}条") + return { + "status": "count_mismatch", + "written": len(test_data), + "read": len(read_data), + } + + # 检查字段是否完整 + expected_fields = set(test_data[0].keys()) + actual_fields = set(read_data[0].keys()) + if expected_fields != actual_fields: + print(f"❌ 字段不符\n期望: {expected_fields}\n实际: {actual_fields}") + return { + "status": "field_mismatch", + "expected": list(expected_fields), + "actual": list(actual_fields), + } + + # 检查数据值是否正确(抽样检查) + sample_indices = [0, len(test_data) // 2, len(test_data) - 1] + for idx in sample_indices: + original = 
test_data[idx] + read = read_data[idx] + + for key in original.keys(): + if str(original[key]) != read.get(key, ""): + print(f"❌ 数据不符 (第{idx}行, 字段{key})\n" + f"期望: {original[key]}\n" + f"实际: {read.get(key)}") + return {"status": "data_mismatch", "index": idx, "field": key} + + print(f"✅ 文件完整性检查通过") + print(f" 总条数: {len(read_data)}") + print(f" 字段数: {len(actual_fields)}") + print(f" 文件大小: {os.path.getsize(csv_file) / 1024:.2f}KB") + + return { + "status": "passed", + "total_rows": len(read_data), + "total_fields": len(actual_fields), + "file_size_kb": os.path.getsize(csv_file) / 1024, + } + + def test_append_mode(self): + """测试追加模式(断点续爬)""" + print("\n" + "=" * 80) + print("测试 5: 追加模式(断点续爬)") + print("=" * 80) + + table_name = "product_append" + + # 第一次写入 + data1 = self.generate_test_data(100) + self.pipeline.save_items(table_name, data1) + + csv_file = os.path.join(self.pipeline.csv_dir, f"{table_name}.csv") + size_after_first = os.path.getsize(csv_file) if os.path.exists(csv_file) else 0 + + # 第二次写入(追加) + data2 = self.generate_test_data(100) + self.pipeline.save_items(table_name, data2) + + size_after_second = os.path.getsize(csv_file) if os.path.exists(csv_file) else 0 + + # 读取文件检查数据 + read_data = [] + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + for row in reader: + read_data.append(row) + + # 检查是否正确追加 + if len(read_data) == len(data1) + len(data2): + print(f"✅ 追加模式正常") + print(f" 第一次写入: {len(data1)} 条") + print(f" 第二次写入: {len(data2)} 条") + print(f" 最终总数: {len(read_data)} 条") + print(f" 第一次后大小: {size_after_first / 1024:.2f}KB") + print(f" 第二次后大小: {size_after_second / 1024:.2f}KB") + + return { + "status": "passed", + "first_write": len(data1), + "second_write": len(data2), + "total": len(read_data), + "size_growth_kb": (size_after_second - size_after_first) / 1024, + } + else: + print(f"❌ 追加模式异常: 期望{len(data1) + len(data2)}条,实际{len(read_data)}条") + return { + "status": "failed", + "expected": len(data1) + 
len(data2), + "actual": len(read_data), + } + + def test_concurrent_safety(self): + """测试并发安全性(Per-Table Lock)""" + print("\n" + "=" * 80) + print("测试 6: 并发安全性(Per-Table Lock)") + print("=" * 80) + + table_name = "product_concurrent_safety" + thread_count = 4 + items_per_thread = 250 + + errors = [] + lock = threading.Lock() + + def write_thread(thread_id): + """线程工作函数""" + try: + data = self.generate_test_data(items_per_thread) + success = self.pipeline.save_items(table_name, data) + if not success: + with lock: + errors.append(f"线程{thread_id}写入失败") + except Exception as e: + with lock: + errors.append(f"线程{thread_id}异常: {e}") + + # 并发执行 + threads = [] + start_time = time.time() + for i in range(thread_count): + t = threading.Thread(target=write_thread, args=(i,)) + t.start() + threads.append(t) + + for t in threads: + t.join() + + elapsed = time.time() - start_time + + # 检查文件 + csv_file = os.path.join(self.pipeline.csv_dir, f"{table_name}.csv") + read_data = [] + with open(csv_file, 'r', encoding='utf-8', newline='') as f: + reader = csv.DictReader(f) + for row in reader: + read_data.append(row) + + expected_total = thread_count * items_per_thread + + if len(errors) == 0 and len(read_data) == expected_total: + print(f"✅ 并发安全性测试通过") + print(f" 线程数: {thread_count}") + print(f" 每线程数据: {items_per_thread}") + print(f" 期望总数: {expected_total}") + print(f" 实际总数: {len(read_data)}") + print(f" 耗时: {elapsed:.4f}s") + print(f" 吞吐量: {expected_total / elapsed:.0f} 条/秒") + + return { + "status": "passed", + "thread_count": thread_count, + "items_per_thread": items_per_thread, + "expected_total": expected_total, + "actual_total": len(read_data), + "elapsed_time": elapsed, + "throughput": expected_total / elapsed, + } + else: + print(f"❌ 并发安全性测试失败") + if errors: + for error in errors: + print(f" {error}") + if len(read_data) != expected_total: + print(f" 数据条数不符: 期望{expected_total}条,实际{len(read_data)}条") + + return { + "status": "failed", + "errors": errors, + "expected_total": 
expected_total, + "actual_total": len(read_data), + } + + def test_multiple_tables(self): + """测试多表存储""" + print("\n" + "=" * 80) + print("测试 7: 多表存储") + print("=" * 80) + + tables = ["product", "user", "order"] + rows_per_table = 500 + results = {} + + start_time = time.time() + + for table in tables: + data = self.generate_test_data(rows_per_table) + success = self.pipeline.save_items(table, data) + + csv_file = os.path.join(self.pipeline.csv_dir, f"{table}.csv") + file_size = os.path.getsize(csv_file) / 1024 if os.path.exists(csv_file) else 0 + + results[table] = { + "success": success, + "file_size_kb": file_size, + } + + print(f"表: {table:10s} | 状态: {'✅' if success else '❌'} | " + f"文件大小: {file_size:.2f}KB") + + elapsed = time.time() - start_time + + # 检查所有文件 + csv_dir = self.pipeline.csv_dir + files = [f for f in os.listdir(csv_dir) if f.endswith('.csv')] + + print(f"\n✅ 多表存储测试完成") + print(f" 表数: {len(tables)}") + print(f" 每表行数: {rows_per_table}") + print(f" 生成的 CSV 文件: {len(files)}") + print(f" 耗时: {elapsed:.4f}s") + + return { + "status": "passed", + "tables": results, + "file_count": len(files), + "elapsed_time": elapsed, + } + + def run_all_tests(self): + """运行所有测试""" + print("\n") + print("╔" + "═" * 78 + "╗") + print("║" + " CSV Pipeline 性能和功能测试 ".center(78) + "║") + print("║" + " 作者: 道长 | 日期: 2025-10-16 ".center(78) + "║") + print("╚" + "═" * 78 + "╝") + + try: + self.setup() + + # 运行所有测试 + self.test_single_batch_performance() + self.test_concurrent_write_performance() + self.test_memory_usage() + self.test_file_integrity() + self.test_append_mode() + self.test_concurrent_safety() + self.test_multiple_tables() + + # 打印总结 + self.print_summary() + + return True + + except Exception as e: + print(f"\n❌ 测试过程中出错: {e}") + import traceback + traceback.print_exc() + return False + + finally: + self.teardown() + + def print_summary(self): + """打印测试总结""" + print("\n" + "=" * 80) + print("测试总结") + print("=" * 80) + + # 单批性能总结 + if "single_batch" in 
self.test_results: + print("\n1. Single-batch write performance:") + results = self.test_results["single_batch"] + for batch_size, data in results.items(): + print(f" {batch_size:5d} items: {data['throughput']:.0f} items/s, " + f"elapsed {data['elapsed_time']:.4f}s") + + # Concurrent performance summary + if "concurrent_write" in self.test_results: + print("\n2. Concurrent write performance:") + results = self.test_results["concurrent_write"] + for thread_count, data in results.items(): + print(f" {thread_count} threads: {data['throughput']:.0f} items/s, " + f"memory growth {data['memory_delta_mb']:.2f}MB") + + # Memory usage summary + if "memory_usage" in self.test_results: + print("\n3. Memory usage:") + results = self.test_results["memory_usage"] + for count, data in results.items(): + print(f" {count:6d} items: {data['memory_used_mb']:.2f}MB, " + f"per item {data['memory_per_item_kb']:.2f}KB") + + print("\n" + "=" * 80) + print("✅ All tests completed!") + print("=" * 80) + + +def main(): + """Main entry point.""" + tester = PerformanceTester(test_dir="tests/test_csv_pipeline/test_output") + success = tester.run_all_tests() + return 0 if success else 1 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/tests/test_download_midware.py b/tests/test_download_midware.py new file mode 100644 index 00000000..1accbaf7 --- /dev/null +++ b/tests/test_download_midware.py @@ -0,0 +1,45 @@ +# -*- coding: utf-8 -*- +""" +Created on 2023/9/21 13:59 +--------- +@summary: +--------- +@author: Boris +@email: boris_liu@foxmail.com +""" + +import feapder + + +def download_midware(request): + print("outer download_midware") + return request + + +class TestAirSpider(feapder.AirSpider): + def start_requests(self): + yield feapder.Request( + "https://www.baidu.com", download_midware=download_midware + ) + + def parse(self, request, response): + print(request, response) + + +class TestSpiderSpider(feapder.Spider): + def start_requests(self): + yield feapder.Request( + "https://www.baidu.com", download_midware=[download_midware, self.download_midware] + ) + + def download_midware(self, request): + print("class download_midware") 
+ return request + + def parse(self, request, response): + print(request, response) + + +if __name__ == "__main__": + # TestAirSpider().start() + TestSpiderSpider(redis_key="test").start() diff --git a/tests/test_log.py b/tests/test_log.py index 3ec0ac31..c044a238 100644 --- a/tests/test_log.py +++ b/tests/test_log.py @@ -10,4 +10,10 @@ from feapder.utils.log import log -log.debug(1) \ No newline at end of file +log.debug("debug") +log.info("info") +log.success("success") +log.warning("warning") +log.error("error") +log.critical("critical") +log.exception("exception") \ No newline at end of file diff --git a/tests/test_metrics.py b/tests/test_metrics.py index 6b8ae8e5..308c2711 100644 --- a/tests/test_metrics.py +++ b/tests/test_metrics.py @@ -1,3 +1,5 @@ +import asyncio + from feapder.utils import metrics # Initialize the metrics system @@ -13,9 +15,38 @@ ) -for i in range(1000): - metrics.emit_counter("total count", count=1000, classify="test5") - for j in range(1000): - metrics.emit_counter("key", count=1, classify="test5") +async def test_counter_async(): + for i in range(100): + await metrics.aemit_counter("total count", count=100, classify="test5") + for j in range(100): + await metrics.aemit_counter("key", count=1, classify="test5") + + +def test_counter(): + for i in range(100): + metrics.emit_counter("total count", count=100, classify="test5") + for j in range(100): + metrics.emit_counter("key", count=1, classify="test5") + + +def test_store(): + metrics.emit_store("total", 100, classify="cookie_count") + + +def test_time(): + metrics.emit_timer("total", 100, classify="time") + + +def test_any(): + metrics.emit_any( + tags={"_key": "total", "_type": "any"}, fields={"_value": 100}, classify="time" + ) + -metrics.close() +if __name__ == "__main__": + asyncio.run(test_counter_async()) + test_counter() + test_store() + test_time() + test_any() + metrics.close() diff --git a/tests/test_mysqldb.py b/tests/test_mysqldb.py index 7d59ce70..1fdd9c09 100644 --- 
a/tests/test_mysqldb.py +++ b/tests/test_mysqldb.py @@ -2,7 +2,10 @@ db = MysqlDB( - ip="localhost", port=3306, db="feapder", user_name="feapder", user_pass="feapder123" + ip="localhost", port=3306, db="feapder", user_name="feapder", user_pass="feapder123", set_session=["SET time_zone='+08:00'"] ) -MysqlDB.from_url("mysql://feapder:feapder123@localhost:3306/feapder?charset=utf8mb4") \ No newline at end of file +MysqlDB.from_url("mysql://feapder:feapder123@localhost:3306/feapder?charset=utf8mb4") + +result = db.find("SELECT @@global.time_zone, @@session.time_zone, date_format(NOW(), '%Y-%m-%d %H:%i:%s')") +print(f"Database timezone info: {result}") \ No newline at end of file diff --git a/tests/test_proxies_pool.py b/tests/test_proxies_pool.py deleted file mode 100644 index 5c63758e..00000000 --- a/tests/test_proxies_pool.py +++ /dev/null @@ -1,39 +0,0 @@ -# -*- coding: utf-8 -*- -""" -Created on 2021/4/3 4:25 下午 ---------- -@summary: ---------- -@author: Boris -@email: boris_liu@foxmail.com -""" -from feapder.network.proxy_pool import ProxyPool, check_proxy -import requests - -url = "http://tunnel-api.apeyun.com/h?id=2020120800184471713&secret=3U1fEJPuabi3y2QJ&limit=10&format=txt&auth_mode=auto" - -proxy_pool = ProxyPool(size=-1, proxy_source_url=url) - -print(proxy_pool.get()) -# -# headers = { -# "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36", -# "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9", -# "Accept-Encoding": "gzip, deflate, br", -# "Accept-Language": "zh-CN,zh;q=0.9", -# "Connection": "keep-alive", -# } -# -# -# resp = requests.get( -# "http://www.baidu.com", -# headers=headers, -# proxies={ -# "https": "https://182.106.136.67:13586", -# "http": "http://182.106.136.67:13586", -# }, -# ) -# print(resp.text) -# -# a = check_proxy("182.106.136.67", "13586", 
show_error_log=True, type=1) -# print(a) diff --git a/tests/test_rander_xhr.py b/tests/test_rander_xhr.py index 534e5c57..15fe2da8 100644 --- a/tests/test_rander_xhr.py +++ b/tests/test_rander_xhr.py @@ -12,7 +12,7 @@ class TestRender(feapder.AirSpider): user_agent=None, # a string, or a zero-argument function that returns the user_agent proxy=None, # xxx.xxx.xxx.xxx:xxxx, or a zero-argument function that returns the proxy address headless=False, # whether to run the browser headless - driver_type="CHROME", # CHROME, PHANTOMJS, FIREFOX + driver_type="CHROME", # CHROME, EDGE, PHANTOMJS, FIREFOX timeout=30, # request timeout window_size=(1024, 800), # window size executable_path=None, # browser path; defaults to the default path