
How does CleanLinks work?

CleanLinks protects your privacy by automatically detecting and skipping redirect pages that track you on your way to the link you actually wanted. Tracking parameters (e.g. utm_* or fbclid) are also removed.
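For instance, cleaning a link that carries tracking parameters could look like this (a purely illustrative example, not taken from the actual rules):

  Before: https://example.com/article?utm_source=newsletter&utm_medium=email&fbclid=AbC123
  After:  https://example.com/article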

You can test the current (master) link cleaning code online.

Request types

CleanLinks analyses and cleans your browser’s requests before they leave the browser, except for JavaScript requests, which are cleaned at the moment they are clicked.

At this stage, it can distinguish between 3 types of requests:

  1. top-level requests, which are the websites that are opened and typically correspond to links clicked inside or outside of the browser.
  2. other requests, which are initiated by the website to load resources: scripts, images, iframes, etc.
  3. header redirects, which happen when a website issues a 30x response to send you from one location to the next. In this case we can clean the destination to which we are redirected (see the example below).
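As an illustration of the third case, a tracking site might answer with a 30x response whose Location header contains the actual destination as an embedded URL (made-up URLs):

  HTTP/1.1 302 Found
  Location: https://tracker.example/next?dest=https%3A%2F%2Fdestination.example%2Fpage

CleanLinks detects the embedded URL in the Location header and replaces the redirect destination with https://destination.example/page.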

Embedded URL detection

We automatically detect embedded URLs, which are used either:

  1. when websites report your current URL, or
  2. when websites bring you to an intermediate page to track you and then redirect you to their destination.

These requests are then respectively dropped (we could also consider removing the query parameter containing the current URL) and redirected to the embedded URL.
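To make these two cases concrete (made-up URLs):

  1. Reporting:    https://site.example/stats?page=https%3A%2F%2Fsite.example%2Fcurrent  →  request dropped
  2. Intermediate: https://out.example/away?to=https%3A%2F%2Fdestination.example%2F  →  redirected to https://destination.example/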

CleanLinks has rules, which let you specify which uses of embedded URLs are legitimate and whitelist those, i.e. not redirect them. A typical example is a login page with a ?redirectUrl= parameter that specifies where to go once the login is successful.
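As a rough sketch, a whitelisting rule for such a login page could look like the JSON below. The key names here are indicative only − export the default rules from the settings to check the exact schema:

  {
    ".com": {
      ".example": {
        "actions": {
          "whitelist": ["redirectUrl"]
        }
      }
    }
  }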

CleanLinks will break some websites and you will need to manually whitelist these URLs for them to work. This can be done easily via the menu from the CleanLinks icon.

Rules and tracking parameters removal

CleanLinks’ rules also list further tracking parameters (e.g. utm_*) to remove, or rewrites to perform in the URL’s path.

For maximum privacy, rules are maintained and editable locally (with decent defaults distributed in the add-on).

CleanLinks rules structure

The default rules are available as a JSON file and can be exported or imported from the CleanLinks settings.

Rules are stored per domain hierarchically.

The actions key specifies which actions to perform

There are currently 4 types of actions (see the sketch after this list):

  1. Remove parameters
  2. Whitelist parameters (overrides any parameter removal, inherited or not). This means embedded URLs are also allowed in the matched parameter, without causing the intermediate page to be skipped.
  3. Replace in URL path
  4. Whitelist URL path (overrides any URL path rewrites, inherited or not). This means embedded URLs are also allowed in the URL path, without causing the intermediate page to be skipped.
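A sketch of an actions object exercising these four types − again, the exact key names are indicative, so check an exported rules file for the real schema. Here "whitelist path" is set to false, so the path rewrite still applies:

  {
    "actions": {
      "remove": ["utm_.*", "fbclid"],
      "whitelist": ["redirectUrl"],
      "rewrite": [{"search": "^/track/", "replace": "/", "flags": ""}],
      "whitelist path": false
    }
  }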

Keys starting with . match domain parts

For example, with the rules {.org: {actions: {set1}, .mozilla: {actions: {set2}}}} (written out in full after this list):

  • The rules in {set1} are applied to all websites with the top-level domain .org
  • The rules in {set2} are applied to all websites of the domain mozilla.org, or subdomains thereof (e.g. www.mozilla.org and addons.mozilla.org).
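Laid out as it would appear in the rules file, this example reads:

  {
    ".org": {
      "actions": {set1},
      ".mozilla": {
        "actions": {set2}
      }
    }
  }

with {set1} and {set2} standing in for actual action objects.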

Every other key is a regular expression matching the URL’s path

The actions key then specifies which cleaning actions to take.
The import/export function is meant more for backing up your rule set than for editing it manually.

What to do when a website is broken

Occasionally, CleanLinks will break websites, because the embedded URL detection is automatic, and a legitimate use case might be missing from the rules.

⚠️ Please keep in mind that CleanLinks always has been (and will remain for the foreseeable future) an add-on for advanced users, who can tolerate sites breaking occasionally and fix them. Think uMatrix more than uBlock Origin. Making good default rules has to be a community effort: even if I wanted to, I couldn’t possibly keep up with all the websites. This is why CleanLinks allows rules to be edited directly, without needing to wait for a rules or add-on update.

Is there a website that is not working anymore? Maybe reloading infinitely? Here’s what you can do:

1. Whitelist the URL

  1. Open the CleanLinks menu by clicking on the toolbar button,
  2. Select the problematic link in the history,

    💡 To reduce the potential candidates, click the trash icon to empty the list, then the reload button to reload the page.

  3. To allow or remove this embedded URL every following time this link is loaded, respectively
    • click the “Whitelist Embedded URL” button, or
    • click the “Remove Embedded URL” button.

💡 Alternatively, you can also

  • manually edit the rule in the preferences. The edit rule button opens the preferences with the editor pre-populated with the selected link.
  • allow the link to be loaded once without any cleaning, by using the Whitelist once button
  • allow all the requests on the page to be loaded without cleaning, by toggling the add-on off, refreshing the page, and toggling the add-on back on after the page has loaded.

2. Consider contributing the cleaning/whitelisting rule

⚠️ This is important, as CleanLinks has no telemetry at all, not even anonymous statistics.

Do you think the website is used by many people, or could be useful to the wider community as a default rule? Please open an issue and I’ll try to integrate it. What I need to know is:

  • on which pages does the problem happen?
  • which parameters should be removed or whitelisted for the website to work?

💡 You can search for the rule by filtering by website in CleanLinks’ configuration page, or in the rules file that you can export from that page.

💡 You can copy a dirty/clean link pair from the CleanLinks popup by selecting it and using Ctrl + C (resp. Cmd + C on macOS).

CleanLinks interface

💡 Tooltips show useful help in all the interfaces. Display them by hovering the mouse over question marks and buttons.

CleanLinks popup

The popup is opened by clicking the toolbar icon and looks like this:

Annotated screenshot of the popup

The toolbar icon, popup title (1), and CleanLinks logo in the top left corner of the popup (2) indicate whether the add-on is active or inactive in this tab.

The popup’s center shows the history of cleaned links (3), with original links followed by cleaned links. These are prefixed with ► when redirected and ❌ when their loading was denied.

Filters (4) (5) are shown as boxes in the top part of the popup, and allow filtering the history of cleaned links that is displayed below (3). There are 2 types of filters:

  • Link type ((4), see Request types). When enabled, these filters show in the history the requests that are, respectively:

    • pages opened in the top-level frame − most likely clicked from within or outside of the browser;
    • document-initiated requests − typically resources such as scripts and images;
    • header redirects.
  • Action type ((5), see embedded URLs and parameter removal). When enabled, these filters show in the history the requests that have been cleaned, respectively:

    • by skipping intermediate pages straight to the embedded URL;
    • by removing parameters;
    • by rewriting (search + replace) the URL path;
    • by removing a JavaScript function.

Clicking a cleaned link in the history selects it (6). This is marked by a blue border on the left, and the link text is wrapped to be displayed in full. On each link, the cleaning actions are displayed as follows:

  • Parameters or parts of the path removed from the URL
  • Parts of the path inserted into the URL
  • Whitelisted parts of the URL, left untouched
  • Location of the detected embedded URL

💡 You can copy the selected dirty/clean link pair by using Ctrl + C (resp. Cmd + C on macOS).

Buttons on the bottom (7) perform various actions:

  • Non-link related actions:

    • Toggle (enable/disable) the add-on for this tab.
    • Empty the displayed history of cleaned links (3).
    • Refresh the page.
  • Actions based on the selected link:

    • Whitelist once: loads the URL without cleaning it at all.
    • Add to whitelist: adds the parameter (or path) with the embedded URL to the whitelist to allow it.
    • Remove parameter: adds the parameter (or path) with the embedded URL to the rules to remove it.
    • Edit rules: opens the settings page with the selected link pre-populated for modification.

CleanLinks settings

🚧 This section is still being written. Please refer to the help tooltips that appear when hovering question marks and buttons.

⚠️ Rules need to be saved after being modified

  • Greyed items are inherited from parent rules.
  • To counteract a greyed out item, either:
    • add the same item to the whitelist of the rule you’re editing − this overrules the removals, or
    • go to the parent rule (listed above the rules editor and above the add/save/erase buttons − in general the global rule *.*) and edit the item there.