Spider logs the count of found URLs more than actual existing URIs. #7737
Comments
That's the expected behaviour; the status code is not relevant to the count of URLs found. Though ignoring the seeds would probably be more accurate, since those are technically not found while spidering.
I see. You're right. If a URL is found and returns a 404, that should be counted too. And I agree that ignoring the seeds would make the count more accurate.
Thanks for assigning. I will be able to work on it in April.
@jeremychoi do you still plan to tackle this?
@kingthorin Yes, sorry for the delay. I couldn't find time to work on it. I'll do it this quarter. However, if someone else wants to fix it, it's okay for this to be reassigned.
No problem and no rush. Life gets busy. I’m working on something else with the Spider but might tackle this in a few weeks if it’s still kicking around.
Describe the bug
The spider job always reports 3 or more URLs, even when there are no URLs to be found, for example:
"Job spider found 3 URLs"
The log message above comes from the Automation Framework, but the issue seems to exist in the spider add-on itself (e.g. https://github.com/zaproxy/zap-extensions/blob/420d2e9d24c44f6a54089c54ad432531cf336b9c/addOns/spider/src/main/java/org/zaproxy/addon/spider/SpiderController.java#L193)
I assume the count includes the user-supplied URL (e.g. a default target URL) plus robots.txt and sitemap.xml, which the Spider requests automatically.
The root cause might be that the count is incremented regardless of the response status (e.g. 404).
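To illustrate the assumption above, here is a minimal, self-contained sketch of how a count of 3 could arise for a target with no links at all. It does not use the real spider add-on code: `SeedCountSketch`, `seedsFor`, and the `http://example.com` target are all hypothetical names chosen for this example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not the ZAP spider add-on's actual code.
final class SeedCountSketch {

    // Builds the assumed seed set: the target itself plus robots.txt and sitemap.xml.
    static Set<String> seedsFor(String target) {
        String base = target.endsWith("/") ? target : target + "/";
        Set<String> seeds = new LinkedHashSet<>();
        seeds.add(base);
        seeds.add(base + "robots.txt");
        seeds.add(base + "sitemap.xml");
        return seeds;
    }

    public static void main(String[] args) {
        // No page was parsed and no link was discovered, yet three URLs are counted,
        // because every seed is counted whatever status its response had.
        Set<String> counted = seedsFor("http://example.com");
        System.out.println("Job spider found " + counted.size() + " URLs");
    }
}
```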
Steps to reproduce the behavior
It can be reproduced with any web app. A simple example:
Expected behavior
It should report the count of existing pages only.
Software versions
ZAP 2.12.0
Screenshots
No response
Errors from the zap.log file
No response
Additional context
No response
Would you like to help fix this issue?
Edit:
In the end the resolution of this issue will be an "enhancement" excluding the seeds from the "found" count.
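A rough sketch of what "excluding the seeds from the found count" could look like, under the assumption that the spider keeps a collection of visited URLs and a set of seed URLs. The class `FoundCountSketch`, the method `countFoundExcludingSeeds`, and the example URLs are hypothetical and are not taken from the spider add-on.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; the real add-on may track URLs differently.
final class FoundCountSketch {

    // Counts distinct visited URLs that are not seeds. Response status is ignored,
    // so a discovered URL that returns 404 still counts as found.
    static long countFoundExcludingSeeds(List<String> visited, Set<String> seeds) {
        return visited.stream()
                .distinct()
                .filter(url -> !seeds.contains(url))
                .count();
    }

    public static void main(String[] args) {
        Set<String> seeds = Set.of(
                "http://example.com/",
                "http://example.com/robots.txt",
                "http://example.com/sitemap.xml");
        List<String> visited = List.of(
                "http://example.com/",
                "http://example.com/robots.txt",
                "http://example.com/sitemap.xml",
                "http://example.com/about");
        // Only /about was discovered while spidering, so the count is 1.
        System.out.println("Job spider found "
                + countFoundExcludingSeeds(visited, seeds) + " URLs");
    }
}
```

With only the three seeds visited, the same method returns 0, which matches the case described in the report where no URLs can be found.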