get headers from a user-agent #286

Kikobeats · 2024-04-26T20:56:20Z

Hello,

I love the library, I have been playing with it. It's very complete with lots of data 👏.

I was wondering if it would be possible to get headers from an input user agent instead of relying on a browserslist query.

So this is supported today:

const { HeaderGenerator, PRESETS } = require('header-generator');
const headerGenerator = new HeaderGenerator(PRESETS.MODERN_WINDOWS_CHROME);
console.log(headerGenerator.getHeaders())

and that is what I'm suggesting:

const { HeaderGenerator } = require('header-generator');

const userAgentString = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15';
const headerGenerator = HeaderGenerator.fromUserAgent(userAgentString);

console.log(headerGenerator.getHeaders())

This would be extremely helpful for more granular control when debugging which cases get detected and which don't.

Kikobeats · 2024-04-30T14:40:01Z

updated with an example!

barjin · 2024-05-02T09:37:17Z

Hello @Kikobeats - and thank you for your interest in this project!

All our generated data is based on data collected from real web traffic. Without going into too much detail, we have a (constantly updated) dataset of user fingerprints. These contain the user-agent string as well as more intricate details (screen resolution, total amount of memory installed in the system, etc.).

During the training phase, we take all these attributes and train a Bayesian network on them. Every possible value of any attribute is then expressed as a conditional probability given its "parent" attributes.

Now, this is where the user-agent comes into play. In our Bayesian network, all the fingerprint fields are based on the user-agent field. For example, let's say our training dataset had 5 records in total, 2 with user-agent: 'desktop' and 3 with user-agent: 'mobile'. The other fields are based on those - e.g. for screenResolution, the probability distribution of screen sizes will be skewed towards smaller screens with user-agent: 'mobile'. Every fingerprint combination with non-zero conditional probability must have existed in the training data - this way, we ensure we're generating convincing fingerprints all the time.

Because of this, the user-agent strings need to be sampled from our collection of known user-agents. If you were to submit your own free-form user-agent string, it might not be in the conditional probability tables for the other fingerprint fields and the header-generator would not be able to generate the fingerprint.
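To make the five-record example above concrete, here is a minimal sketch of a single conditional probability table, P(screenResolution | userAgent), built from made-up records. Every name and value in it (`records`, `buildTable`, `sample`, the resolutions) is hypothetical and is not taken from the library's code or dataset. It only illustrates why a user-agent that never appeared in the training data cannot be sampled from.

```javascript
// Five made-up training records: 2 desktop, 3 mobile (hypothetical data).
const records = [
  { userAgent: 'desktop', screenResolution: '1920x1080' },
  { userAgent: 'desktop', screenResolution: '2560x1440' },
  { userAgent: 'mobile', screenResolution: '390x844' },
  { userAgent: 'mobile', screenResolution: '390x844' },
  { userAgent: 'mobile', screenResolution: '412x915' },
];

// Count how often each child value occurs for each parent value,
// then normalize the counts into P(child | parent).
function buildTable(rows, parent, child) {
  const counts = new Map();
  for (const row of rows) {
    const inner = counts.get(row[parent]) ?? new Map();
    inner.set(row[child], (inner.get(row[child]) ?? 0) + 1);
    counts.set(row[parent], inner);
  }
  const table = new Map();
  for (const [parentValue, inner] of counts) {
    const total = [...inner.values()].reduce((a, b) => a + b, 0);
    table.set(parentValue, new Map([...inner].map(([value, n]) => [value, n / total])));
  }
  return table;
}

// Draw a child value for a given parent value.
// Throws when the parent value was never seen during "training".
function sample(table, parentValue, random = Math.random) {
  const distribution = table.get(parentValue);
  if (!distribution) {
    throw new Error(`No conditional probabilities for "${parentValue}"`);
  }
  let r = random();
  let last;
  for (const [value, p] of distribution) {
    last = value;
    if (r < p) return value;
    r -= p;
  }
  return last; // fallback in case of floating-point rounding
}

const resolutionGivenUserAgent = buildTable(records, 'userAgent', 'screenResolution');
```

Calling `sample(resolutionGivenUserAgent, 'mobile')` only ever returns a resolution that co-occurred with 'mobile' in the records, while `sample(resolutionGivenUserAgent, 'my-custom-agent')` throws because there is no row for that parent value.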

Unfortunately, this makes this feature a wontfix for me... But we're still curious! Is there a use case you have for this? We'd love to hear it! Hopefully, we'll be able to find another way around the problem you're trying to solve.

Cheers!

Kikobeats · 2024-05-02T15:23:02Z

No worries and thanks for the explanation, it's really helpful to understand how the library works.

I asked because I already have a collection of the most used user agents that is updated periodically:
https://github.com/microlinkhq/top-user-agents/blob/master/src/mobile.json

This data is collected from more than 100M requests performed every month, so the sample is large enough.

To simulate real traffic, I want to generate realistic headers using the user agent as input. I already did some tuning of the TLS fingerprint with https-tls, but I thought that maybe I could use fingerprint-suite to get realistic browser headers (sec-*, etc.).

I noticed that the library outputs the headers at the end of the process, which is what I need, so I played a bit with the code to see if I could get similar headers as output while using a user agent as input.

I still think it's possible if I found a way to turn the user agent into a unique browserslist match, or any other way to connect it before going into the Bayesian network 😆 but I totally understand it's not the point of the project.
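One way the idea of mapping a user agent onto generator constraints could look is sketched below. The helper `constraintsFromUserAgent` is hypothetical, and the shape it returns (`browsers` with `name`/`minVersion`/`maxVersion`, `operatingSystems`, `devices`) is an assumption about the options header-generator accepts, so it should be checked against the library's documentation. The regular expressions are deliberately coarse and cover only a few common browsers.

```javascript
// Hypothetical helper: derive coarse constraints from a user-agent string.
// The returned shape is assumed to resemble header-generator options;
// it is not confirmed by this thread.
function constraintsFromUserAgent(userAgent) {
  // Order matters: Edge and Chrome user agents also contain "Safari".
  const browserRules = [
    { name: 'edge', pattern: /Edg\/(\d+)/ },
    { name: 'chrome', pattern: /Chrome\/(\d+)/ },
    { name: 'firefox', pattern: /Firefox\/(\d+)/ },
    { name: 'safari', pattern: /Version\/(\d+).*Safari\// },
  ];
  // Order matters: iOS user agents contain "like Mac OS X",
  // and Android user agents contain "Linux".
  const osRules = [
    { name: 'ios', pattern: /iPhone|iPad/ },
    { name: 'android', pattern: /Android/ },
    { name: 'windows', pattern: /Windows/ },
    { name: 'macos', pattern: /Mac OS X/ },
    { name: 'linux', pattern: /Linux/ },
  ];

  let browser = null;
  for (const rule of browserRules) {
    const match = userAgent.match(rule.pattern);
    if (match) {
      const major = Number(match[1]);
      browser = { name: rule.name, minVersion: major, maxVersion: major };
      break;
    }
  }
  if (!browser) return null; // unknown browser: nothing to constrain on

  const os = osRules.find((rule) => rule.pattern.test(userAgent));
  const isMobile = /Mobile|Android|iPhone|iPad/.test(userAgent);

  return {
    browsers: [browser],
    operatingSystems: os ? [os.name] : [],
    devices: [isMobile ? 'mobile' : 'desktop'],
  };
}
```

For the Safari 17.1 string from the original request, this yields Safari major version 17 on macOS desktop. Even if such constraints were passed to the generator, the resulting user-agent header would still be sampled from the library's dataset, so it would not necessarily equal the input string.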

B4nan assigned barjin Apr 29, 2024

B4nan added the t-tooling label (Issues with this label are in the ownership of the tooling team.) Apr 29, 2024
