Suggestion: add an option to pass in an XPath to narrow the extraction scope #8

Open
JerryChenn07 opened this issue Aug 24, 2020 · 3 comments
Labels
enhancement New feature or request

Comments

@JerryChenn07

When I extract this list page: aHR0cDovL3d3dy5zaGFuZ2hhaS5nb3YuY24vbncyL253MjMxNC9udzIzMTkvbncyNDA3L253NDg2NzgvaW5kZXguaHRtbA== (base64), some items are extracted incorrectly.

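Some pages have a fairly complex structure, and under the current extraction rules they are either extracted incorrectly or not extracted at all. Would it be possible to add a feature, for both list-page and detail-page extraction, that lets the user pass in a custom XPath to narrow the extraction scope before extraction runs? That would greatly improve extraction accuracy.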
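To make the request concrete, here is a minimal sketch of what "narrow the scope first, then extract" could look like. It uses lxml directly and is not this project's actual API: the names `narrow_scope`, `extract_list_items` and the `scope_xpath` parameter are hypothetical, and the link collection is only a stand-in for the real extraction rules.

```python
from lxml import html


def narrow_scope(page_source, scope_xpath=None):
    """Return the subtrees selected by scope_xpath, or the whole document.

    Hypothetical helper: scope_xpath is the user-supplied XPath.
    """
    root = html.fromstring(page_source)
    if not scope_xpath:
        return [root]
    matches = root.xpath(scope_xpath)
    # An XPath can also return strings or numbers; keep element nodes only.
    elements = [m for m in matches if isinstance(m, html.HtmlElement)]
    # Fall back to the whole page if the XPath matched no element.
    return elements or [root]


def extract_list_items(page_source, scope_xpath=None):
    """Stand-in extractor: collect links inside the narrowed scope only."""
    items = []
    for region in narrow_scope(page_source, scope_xpath):
        for link in region.xpath(".//a[@href]"):
            items.append({
                "title": link.text_content().strip(),
                "url": link.get("href"),
            })
    return items


PAGE = """<html><body>
<div id="nav"><a href="/home">Home</a></div>
<ul id="news">
  <li><a href="/a1">First</a></li>
  <li><a href="/a2">Second</a></li>
</ul>
</body></html>"""

all_items = extract_list_items(PAGE)
news_items = extract_list_items(PAGE, scope_xpath='//ul[@id="news"]')
```

Without `scope_xpath` the navigation link is picked up along with the news entries; with it, only the links under the selected `<ul>` are considered. Falling back to the whole page when the XPath matches nothing is one possible design choice; raising an error instead would make a mistyped XPath easier to notice.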

Keep it up, Brother Cui!

@JerryChenn07 JerryChenn07 added the enhancement New feature or request label Aug 24, 2020
@JerryChenn07
Author

JerryChenn07 commented Sep 3, 2020

Bumping this, hoping for an update.

While I'm here, a bug report: could you try this list page: aHR0cCUzQS8vd3d3Lmd4emYuZ292LmNuL3pmd2ovenpxcm16ZmJndHdqXzM0ODI4LzIwMTVuZ3pid2pfMzQ4MzEv? Extraction fails on this site.

@JerryChenn07
Author

Another bug, on this list page: aHR0cCUzQS8vd3d3Lm54Lmdvdi5jbi96d2drL3F6ZndqL2xpc3RfNTMuaHRtbA==. It is the last page and contains only 2 entries, but nothing can be extracted from it.

@JerryChenn07
Author

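One more: on some list pages the title is very long, so text() returns a truncated version with the omitted part replaced by "...". The same tag also has an @title attribute that holds the complete title. Could a small check be added to handle this case?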
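A rough sketch of such a check, assuming lxml elements: prefer `@title` only when the visible text looks truncated and the attribute starts with the same words. The function name `best_title` and the set of ellipsis markers are assumptions for illustration, not part of the project.

```python
from lxml import html

# Markers treated as "the text was cut off" (an assumption; sites vary).
ELLIPSIS_SUFFIXES = ("...", "…")


def best_title(link):
    """Pick the fuller of a link's visible text and its title attribute."""
    text = link.text_content().strip()
    full = (link.get("title") or "").strip()
    if not full:
        return text
    truncated = text.endswith(ELLIPSIS_SUFFIXES)
    # Drop the trailing ellipsis so the two strings can be compared.
    prefix = text.rstrip(".…").strip()
    # Only trust @title when it really extends the visible text;
    # otherwise it may be an unrelated tooltip.
    if truncated and full.startswith(prefix):
        return full
    return text


long_link = html.fromstring(
    '<a href="/x" title="Notice on the 2020 annual budget report">'
    "Notice on the 2020...</a>"
)
short_link = html.fromstring('<a href="/y" title="tooltip">Short title</a>')

long_title = best_title(long_link)
short_title = best_title(short_link)
```

The prefix comparison is deliberately conservative: a `title` attribute that does not begin with the visible text is ignored, so ordinary tooltips do not replace correct titles.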
