[BUG] Server network is fine, but training upload requests frequently fail with network errors #1252

@AdamGradient

Description

🐛 Bug description [Please make everyone to understand it]

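This network error is intermittent. Repeating the same operation a few times usually gets past it, but it seriously hurts the user experience.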

The error is:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.swanlab.cn', port=443): Max retries exceeded with url: /api/login/api_key (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7fb389d70d60>: Failed to resolve 'api.swanlab.cn' ([Errno -3] Temporary failure in name resolution)"))

Even when I run swanlab login --relogin just to test the network, the same network error occasionally appears.
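The traceback points at DNS resolution ([Errno -3] Temporary failure in name resolution) rather than at the HTTPS request itself, so it may help to measure how often the machine fails to resolve the host. The following is a minimal sketch using only the Python standard library; `probe_dns` and its parameters are hypothetical names introduced here and are not part of SwanLab.

```python
import socket
import time


def probe_dns(host, attempts=5, delay=0.5, resolver=socket.getaddrinfo):
    """Resolve `host` several times and record each outcome.

    Returns a list of (ok, detail) tuples: on success `detail` is the sorted
    list of resolved addresses, on failure it is the resolver error text.
    `resolver` is injectable so the function can be exercised offline.
    """
    results = []
    for i in range(attempts):
        try:
            infos = resolver(host, 443, proto=socket.IPPROTO_TCP)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address.
            addrs = sorted({info[4][0] for info in infos})
            results.append((True, addrs))
        except socket.gaierror as exc:
            # errno -3 (EAI_AGAIN) corresponds to the failure in the report.
            results.append((False, f"{exc.errno}: {exc.strerror}"))
        if i < attempts - 1:
            time.sleep(delay)
    return results
```

Calling `probe_dns("api.swanlab.cn")` from the affected server and printing the results would show whether resolution fails intermittently on that machine. If it does, the local resolver configuration (for example /etc/resolv.conf) is a likely place to look, independent of the SwanLab service.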

🧑‍💻 Step to reproduce

Case 1: running a training script
Case 2: swanlab login --relogin

👾 Expected result

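Reduce how often this happens, or eliminate it entirely. After all, stability should be the advantage of a server hosted in China.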

🚑 Any additional [like screenshots]

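I suspect the default retry count may be too low, since repeating the operation by hand effectively amounts to a larger retry count. Still, I think the real fix is to improve the network itself.
Also, if this is down to server funding, you could add a paid tier in exchange for stability. Our lab has the budget and can reimburse it.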
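As a client-side stopgap, the manual "try again" described above can be automated. The sketch below wraps any callable in a retry loop with exponential backoff; `call_with_retries` and its parameters are hypothetical and say nothing about how SwanLab retries internally. It relies on the fact that requests.exceptions.ConnectionError derives from OSError, so the default `retry_on` covers the error in this report.

```python
import random
import time


def call_with_retries(func, *, attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(OSError,), sleep=time.sleep):
    """Call `func()` and retry on the exceptions listed in `retry_on`.

    The wait doubles after each failure (base_delay, 2x, 4x, ...), is capped
    at `max_delay`, and gets up to 10% random jitter added. The last failure
    is re-raised unchanged once `attempts` calls have been made.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

In a training script, the call that triggers the login or upload request could be passed in as `func` (for example via a lambda). This only hides transient failures; it does not address whatever makes name resolution fail in the first place.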

  • SwanLab Version:
    swanlab 0.6.8
  • Platform:
    ubuntu

Labels

🐛 bug: Something isn't working