Add website statistics to release report #1578

Open
rbbeeston opened this issue Jan 3, 2025 · 4 comments
Assignees
Labels
Feature New feature or request

Comments

@rbbeeston
Member

Please include the number of unique visitors, the top 10 most popular pages, and any other statistics that would make sense.

@rbbeeston rbbeeston converted this from a draft issue Jan 3, 2025
@rbbeeston rbbeeston added the Feature New feature or request label Jan 3, 2025
@GregKaleka
Collaborator

It looks like our Plausible account doesn't include API access.

Image

I suppose we could add some statistics manually, since releases are relatively infrequent. Rob or @sdarwin - thoughts?

@sdarwin
Collaborator

sdarwin commented Jan 7, 2025

The URL has changed multiple times: preview.boost.org -> www.boost.io -> www.boost.org.
The Plausible domain should be changed to www.boost.org, and the website should be updated to reflect that within a day of the change. However, this doesn't need to be done immediately; it may be better to wait until boost.org is the real domain. Something to plan for.

If we are publicizing stats anyway, there is a workaround for the lack of an API.

https://plausible.io/preview.boost.org/settings/visibility

"Make stats publicly available on https://plausible.io/preview.boost.org/"

Then scrape the public webpage and parse the stats from the HTML.

There is also an "Embed Dashboard" option.

You are welcome to modify those settings. I have now made the stats public, see https://plausible.io/preview.boost.org/

@GregKaleka
Collaborator

  • Traditional scraping doesn't work, because the actual stats are populated via JavaScript and don't show up in the raw HTML returned by a programmatic request.
  • Using something like Selenium or Playwright would work, but getting those to run headless in a container is a bit tricky and can be memory-intensive.

Spelunking through the page's network requests, though, I found what looks like an internal API that the page calls to populate the stats, and those requests don't appear to require any authentication.

For high-level stats like unique visitors and pageviews:
https://plausible.io/api/stats/preview.boost.org/top-stats/?period=custom&date=2025-01-09&from=2024-12-01&to=2025-01-08&filters=%5B%5D&with_imported=true&comparison=previous_period&compare_from=undefined&compare_to=undefined&match_day_of_week=true

For most viewed pages:
https://plausible.io/api/stats/preview.boost.org/pages/?period=custom&date=2025-01-09&from=2024-12-01&to=2025-01-08&filters=%5B%5D&with_imported=true&limit=9
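
For illustration, the two URLs above could be built from a date range rather than hard-coded. This is only a sketch: the query parameters are copied from the requests observed above, the function names (`top_stats_url`, `pages_url`, `fetch_json`) are hypothetical, and it assumes the endpoints return JSON, which this thread does not show. Since the API is internal and undocumented, any of this may change without notice.

```python
import json
from datetime import date
from urllib.parse import urlencode
from urllib.request import urlopen

# Base path taken from the URLs observed in the browser's network tab.
BASE = "https://plausible.io/api/stats"


def _common_params(start: date, end: date, today: date) -> dict:
    # Parameters shared by both observed requests, in the observed order.
    return {
        "period": "custom",
        "date": today.isoformat(),
        "from": start.isoformat(),
        "to": end.isoformat(),
        "filters": "[]",  # urlencode turns this into %5B%5D
        "with_imported": "true",
    }


def top_stats_url(site: str, start: date, end: date, today: date) -> str:
    """URL for high-level stats such as unique visitors and pageviews."""
    params = _common_params(start, end, today)
    params.update({
        "comparison": "previous_period",
        # The observed request sent the literal string "undefined" for these;
        # whether they can be dropped is unknown, so they are kept as-is.
        "compare_from": "undefined",
        "compare_to": "undefined",
        "match_day_of_week": "true",
    })
    return f"{BASE}/{site}/top-stats/?{urlencode(params)}"


def pages_url(site: str, start: date, end: date, today: date, limit: int = 9) -> str:
    """URL for the most viewed pages."""
    params = _common_params(start, end, today)
    params["limit"] = str(limit)
    return f"{BASE}/{site}/pages/?{urlencode(params)}"


def fetch_json(url: str, timeout: float = 10.0):
    # Assumption: the endpoint responds with a JSON body.
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `pages_url("preview.boost.org", date(2024, 12, 1), date(2025, 1, 8), date(2025, 1, 9))` reproduces the pages URL above. The observed request used `limit=9`; a release report that wants the top 10 pages would presumably pass `limit=10`, assuming the endpoint honors it.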

It's hard to know how reliable this approach will be long-term, but it's a very simple way to get the data. Once we have the data modeling set up, we can pivot to another approach if they clamp down on this.
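
One way to keep that pivot cheap is to have the report depend on a small data model rather than on where the numbers come from. The sketch below is an assumption about how that could look, not the project's actual design: `PageStat`, `WebsiteStats`, `StatsSource`, `ManualStatsSource`, and `report_lines` are all hypothetical names, and the manual source stands in for the "add statistics by hand" fallback mentioned earlier in the thread.

```python
from dataclasses import dataclass
from datetime import date
from typing import Protocol


@dataclass(frozen=True)
class PageStat:
    path: str
    visitors: int


@dataclass(frozen=True)
class WebsiteStats:
    start: date
    end: date
    unique_visitors: int
    top_pages: list[PageStat]


class StatsSource(Protocol):
    """Anything that can produce stats for a date range."""

    def fetch(self, start: date, end: date) -> WebsiteStats: ...


class ManualStatsSource:
    """Fallback source: numbers entered by hand for a release."""

    def __init__(self, unique_visitors: int, top_pages: list[PageStat]):
        self._unique_visitors = unique_visitors
        self._top_pages = top_pages

    def fetch(self, start: date, end: date) -> WebsiteStats:
        return WebsiteStats(start, end, self._unique_visitors, list(self._top_pages))


def report_lines(source: StatsSource, start: date, end: date) -> list[str]:
    # The report only sees WebsiteStats, so the source can be swapped freely.
    stats = source.fetch(start, end)
    lines = [f"Unique visitors ({stats.start} to {stats.end}): {stats.unique_visitors}"]
    for rank, page in enumerate(stats.top_pages, start=1):
        lines.append(f"{rank}. {page.path} ({page.visitors} visitors)")
    return lines
```

A source backed by the internal Plausible endpoints would implement the same `fetch` method, so swapping it for the official API or for manual entry later would not touch the report code.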

Sound reasonable?

@sdarwin
Collaborator

sdarwin commented Jan 9, 2025

@GregKaleka yes, let's try that.

Projects
Status: Accepted
Development

No branches or pull requests

3 participants