Numeric limit on “associated” subset #93
Comments
A pre-defined numeric limit is always going to be too high for some use cases and too low for others. Some very small sets probably present the greatest risk of abuse, for example when two confusingly designed "competitor" firms collude on "independent, competitive" pricing. (Submit a home page design and privacy policy with clear association, then buy social media ads for "get a quote" pages that appear independent.) Another approach to reaching a user-acceptable limit on set size would be to increase the delay for re-submitting domains that were part of previously rejected sets, based on set size. (Attempting to get a large set approved and failing would make the domains in that set ineligible for a future set for longer than attempting to get a small set approved and failing would.) |
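As an illustration only, the size-based delay suggested above could be sketched as follows. Nothing in the proposal defines this formula: the function name `earliest_resubmission` and the `base_days` and `days_per_domain` parameters are hypothetical, chosen just to show a delay that grows with the size of the rejected set.

```python
from datetime import date, timedelta


def earliest_resubmission(rejected_on: date, set_size: int,
                          base_days: int = 30, days_per_domain: int = 7) -> date:
    """Return the first date a domain from a rejected set may be proposed again.

    base_days and days_per_domain are hypothetical policy knobs, not values
    taken from the First-Party Sets proposal.
    """
    if set_size < 1:
        raise ValueError("a set must contain at least one domain")
    # The delay grows linearly with the number of domains in the rejected set.
    return rejected_on + timedelta(days=base_days + days_per_domain * set_size)
```

With these assumed defaults, a rejected 20-domain set keeps its domains out for longer than a rejected 3-domain set, which is the incentive the comment describes.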
Generally, a numerically based limit is always going to come with problems of user association. Common and consistent brand association provides a much more meaningful indication of association to users. Brand association could be determined consistently via a DOM node defined in a way the browser can infer and validate reliably. |
Three domains is problematic. I work for a small publisher, but we have 5 domains that are related enough for us to consider First Party Sets. My take is that a number will always be problematic as there are legitimate publishers that have a lot more domains. For that reason I think a numeric limit will always be problematic and as @dmarti notes is probably open to being gamed. I think First Party Sets has legitimate value to users in the right hands of legitimate publishers, but it all comes down to trust. I think the answer lies in communication with the user about the relationship between sites. We should explore those options. |
@jdcauley Viewability of a shared element could be a useful technical check. (A proposal for an associated set could include the ID of the element and a URL or hash for the content, and a crawler could verify that the same element is present and viewable on all sites in the set.) Could you add the check to the list of possible automated checks at #95? |
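A minimal sketch of the hash part of such a check, assuming the crawler has already fetched each site's HTML. The names `element_hash` and `domains_failing_check` are hypothetical, and static parsing like this can only confirm that the element is present with the same text; it says nothing about whether the element is actually viewable, which would need a rendering crawler.

```python
import hashlib
from html.parser import HTMLParser
from typing import Dict, List, Optional

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _ElementText(HTMLParser):
    """Collects the text inside the first element with a given id."""

    def __init__(self, element_id: str) -> None:
        super().__init__()
        self.element_id = element_id
        self.depth = 0  # > 0 while the parser is inside the target element
        self.found = False
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            if tag not in VOID_TAGS:
                self.depth += 1
        elif not self.found and dict(attrs).get("id") == self.element_id:
            self.found = True
            if tag not in VOID_TAGS:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def element_hash(html: str, element_id: str) -> Optional[str]:
    """SHA-256 of the whitespace-normalized text of the element, or None if absent."""
    parser = _ElementText(element_id)
    parser.feed(html)
    parser.close()
    if not parser.found:
        return None
    text = " ".join("".join(parser.chunks).split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def domains_failing_check(pages: Dict[str, str], element_id: str,
                          expected_hash: str) -> List[str]:
    """Return the domains whose page lacks the element or has different content."""
    return [domain for domain, html in pages.items()
            if element_hash(html, element_id) != expected_hash]
```

Hashing normalized text rather than raw markup is a design choice made here so that sites can style the shared element differently while still carrying the same wording.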
Thank you for the feedback so far. I think that for this updated proposal of FPS it's really important to consider that we're focusing on user-facing breakage cases with regards to a single First-Party Set. This means that when discussing a numeric limit, the question is not "how many sites does your organization have, overall?", but rather "what are use cases/user flows you're trying to preserve throughout sites in your organization, and how many co-branded sites are needed in that flow?" (and maybe whether it is possible to solve these problems using other proposed APIs). |
Throwing our use case in here as well. Gannett is a large publisher (400+ local newspapers, plus a national newspaper), we were considering FPS for the following uses:
The hope would be to limit the user experience friction of having to log into multiple publication sites, setup preferences, and depending on their location having to set their regional privacy settings again (potentially dealing with yet another banner/popup). We did notice #111, but our use case does require the unique identifier to achieve some of the use cases, such as SSO. |
@jmanwaring thank you for the comment! Can you say what other things "user state" covers outside of the three items below it? |
@johannhof, the three items below it are pretty inclusive. The only other thing I can think of at the moment to add is feature-testing buckets. |
Throughout 2022 Google said during informal conversations with publishers that whilst FPS would evolve, there were no intentions of making large changes that would impact publishers, or indeed changes to what constitutes an “appropriate relationship” for parties within an FPS. So I was somewhat surprised to hear Google plans to do both. I have raised the concerns below to Google directly & am posting them here to ensure they are also in the public domain. The proposal to limit the number of domains in an associated set (widely discussed as being three) will harm large publishers that do not have a large logged-in user base but want to provide their customers with a personalised experience across domains. Put another way, businesses owned by the same entity that have a common privacy policy & shared parent branding will now be disadvantaged vs platforms that tend to have higher levels of logged-in users. This is disappointing given publishers have been told repeatedly by Google to invest in first-party data to mitigate browser changes. This message has been consistent in private and in public, with Google saying “as digital advertising is reshaped by a number of significant, privacy-driven changes, investment in first-party data is emerging as a key strategy that can help marketers and publishers adapt”. It has even been reiterated in quarterly business reviews, with Google launching playbooks & creating data maturity models with BCG that espouse the benefits of connecting data across a business. Now those that have invested time, effort and resources to do so are being told that Google will limit how they deploy their investment. Large publishers without logged-in users that have been diligently getting ready for these changes in Chrome would be harmed by this limit, but platforms that have logged-in users will not. 
This will surely create information asymmetry between the likes of Google and its competitors in the open web, handing the platforms a significant advantage & further weakening the publishing industry at a time where free, open & well funded news sources are more important in Europe and beyond than ever before. Google should not impose an arbitrary limit on what number of related domains can be in a first party set, rather put their considerable resources to build a solution where any sites that are owned by a publisher, that share a privacy policy & have visible parent branding can be included in a set. |
I really agree with Simon! As I said before, I don't understand why ownership works for other sets but not for the set that would really matter to publishers: one that makes different owned media domains/assets first party to each other. Associated sets between non-GDPR controllers do not make any sense; they make things even more complicated. |
There have been several issues related to ownership requirements in FPS.
Common data usage practices, not corporate owners, matter to users. Many web users do not follow M&A news, so may not be aware of common ownership or divestiture. If independently operated subsidiaries have different privacy policies and/or data usage practices, common ownership does not change how the user's data is handled. (Arbitrary/opaque nature of "organizations" as the arbiter of first-party sets · Issue #14 · privacycg/first-party-sets)
Ownership requirements are difficult and costly to enforce. Browsers and independent reviewers of first-party sets do not have experts in corporate ownership documents in all possible jurisdictions in which an FPS could operate. This is especially important with the latest version of the proposal, in which FPS validation is handled by public commenters on a GitHub repository, not by an independent enforcement entity (IEE). An IEE might have been able to hire finance journalists to research ownership, but participants in a GitHub repository won't. (Enforcement of organizational structure within FPS · Issue #18 · privacycg/first-party-sets)
Ownership structures that would make an ownership requirement meaningless are practical. A new entity could be purposefully designed to comply with an ownership requirement. This would not add privacy protections for users, but would increase costs and risks for sites. (MERS for web properties · Issue #49 · privacycg/first-party-sets)
Attackers could falsify ownership records to break an ownership-based FPS. People who are familiar with the FPS process could break a site’s FPS by creating and posting fake records contradicting the common ownership requirement that would be difficult to verify. A set based on user understanding would be more practical to check. (FPS as an attack surface · Issue #55 · privacycg/first-party-sets)
Minutes of a meeting at which FPS membership standards including ownership were discussed: https://github.com/privacycg/meetings/blob/main/2021/telcons/08-12-21-FPS-adhoc-minutes.md |
@dmarti Google previously communicated that a shared privacy policy, coupled with parent branding in a header or footer, was a way of ensuring consumers were aware of common ownership. That seems fairly reasonable to me and the other publishers that I've spoken to; indeed, some publishing networks already take this approach. As for the points on ownership, Google say on their Chrome Dev page for FPS: |
Three domains is problematic. I work for a big publisher; if we could get 5 domains, it would be great. |
Thanks, @dmarti, for outlining the issues flagged regarding the ownership requirement! @Simon-J-Harris, the .well-known file check is still something that Chrome considers would be helpful in preventing abuse, given the ability to modify the .well-known file demonstrates administrative access to the domain. However, while this is a mitigation to curb unauthorized set creation, it is not a perfect solution to enforce the ownership requirement since it can be circumvented by coordinating sites. Moreover, the ownership requirement was identified by stakeholders in the W3C Privacy Community Group as insufficiently aligned with user benefit (example of feedback). Chrome took this feedback and explored ways to improve the privacy properties of the First-Party Sets proposal. This is why the new proposal focuses on specific use cases for cross-domain cookie access with the goal of preserving existing site functionality that would otherwise be broken. Chrome has updated its proposal to no longer focus on shared branding, a common privacy policy, and common ownership. Instead, the focus is on a subsets approach that puts use cases front and center. You can find more detail on this here, and we are working on new documentation to reflect this elsewhere. While the associated subset currently relies on a numeric limit to mitigate cross-domain tracking, ideally, this subset would mature into discrete, use case specific solutions. For example, login may be another area where Chrome may need to design more purpose-built solutions (see issue #53 for community feedback on preferring browser-mediated, purpose-built APIs). Does a site use 3p cookies across sites to log a user in across sites? What would break for the user if that site lost 3p cookie access? Since FPS is a solution to address user-facing breakage, we want to approach creating solutions from a user-centric perspective. From the feedback on this thread, the team recognizes that the 3 domain limit may be concerning. 
To maintain alignment with Chrome’s privacy goals, identifying specific use cases, or reasons why you need 3p cookie access across your sites, would be helpful. This will help the ecosystem (Chrome, as well as other browser vendors) assess whether those use cases are in the user’s benefit. (Additionally, an understanding of use cases for 3p cookie access can help browsers surface explanations to users.) For example, the use cases @jmanwaring spelled out are incredibly helpful for our team as we explore privacy-preserving ways to enable use cases that have a tangible impact on user experience. Apologies for the outdated documentation - we realize that this can be confusing, and our team is working on doing a holistic update across the different forums in the coming weeks. We hope these updates will provide additional clarity, and look forward to your feedback. |
Three domains is already too many, if they're deceptive. It's fairly easy to think of some scams that could be pulled with two or three (#101). Meanwhile, three might not be enough for some legit uses. A big risk here is that enough invalid 3-domain sets get through that the entire FPS feature becomes a scam alert (and independent privacy tools and services start warning users about sites that try to use it). There is no way around the need for a competent external review of proposed sets, and a review process that gets flooded with many small, questionable sets will be more vulnerable and less able to produce trustworthy results than a review process with fewer, more honest sets. The hard problem for FPS is maximizing the return on reviewer time. Public reviewers would find it harder to review one large set than one small one (as @johannhof pointed out in #101) but the cost of proposing a large set is higher, because the negative attention resulting from a rejection could result in a scammer burning a set of already-aged domains, and because the review repository could impose a delay before re-submitting a rejected domain. Finding one out-of-compliance domain in a large set would invalidate the whole set, making the review cost per member domain lower. Reviewer time is a scarce resource on which FPS depends. (see also #105) We still don't know if there will be enough of it. Enabling valid sets to get bigger, in combination with negative consequences for proposing invalid sets, would help to maximize it. @helenyc Another possible specific use case is preferences sharing. Some users might expect a preference (such as "no gambling ads" or "high contrast mode") to take effect across sites that they perceive as the same "party": #111 |
There should be some domain ownership verification system
Like with meta tags or something |
El Nuevo Día is the largest newspaper in Puerto Rico, and we operate six domains in different verticals (publishing, e-commerce marketplace, classifieds, coupons and deals, etc.) which are closely related. We share information between them through cookies for content recirculation, maintaining sign-in, and offering services/products based on users' interests. We would like to see more information on how the First-Party Sets policy will be implemented and how it will impact our ability to collect and use first-party data. We support the need to improve user privacy on the web and we believe that we should be able to include all of our domains in an Associated Subset to maintain a great user experience, which is one of our competitive advantages in the market today. |
Listín Diario is a leading newspaper in the Dominican Republic. We operate 11 domains and do not share information between them through cookies today. On the other hand, it does not mean that we won't share it in the future. That said, we want to prepare for a 3P cookieless future and improve our user data to provide a better user experience (e.g. personalize content, recommend products, etc.) and targeted advertising across domains. We think it is reasonable to have at least 15 domains in the Associated Subset. Increasing the number of domains in the Associated Subset would allow publishers to collect more data, which would improve the quality of their user data. This can make the user experience more relevant and engaging. We understand that the First-Party Sets policy is designed to protect the privacy of users. However, publishers also have a vested interest in protecting the privacy of their users. We believe that the benefits of allowing publishers to collect and use first-party data outweigh the risks: publishers use data, collected with the user's consent, to generate revenue, which they then use to fund their journalism helping to ensure that high-quality journalism is available to the public. |
I am writing to provide feedback on the First-Party Sets policy. My company is La Prensa Gráfica, a daily newspaper published in El Salvador. We operate 5 domains today (laprensagrafica.com, elgrafico.com, eleconomista.net, ellas.sv and grupolpg.sv), and we share information between domains through cookies to maintain a persistent login. We have had a chance to review the First-Party Sets policy and we are concerned that it may not allow us to continue sharing information between our domains in the way that we do today. We need to be able to maintain a persistent login across domains to provide a seamless experience for our users. Otherwise, it can make it more difficult for users to access their accounts and data. We would like to ask that the First-Party Sets policy be amended to allow for the sharing of information between 5-10 domains for the purpose of maintaining a persistent login across domains. |
La Nación is a Costa Rican newspaper. The newspaper is owned by Grupo Nación, which also owns several other newspapers in Costa Rica, such as El Financiero and La Teja, the domain Teleguía, magazines like Revista Perfil and Sabores, and other related companies like Parque Viva, which is the first entertainment center in Central America. If we consider only the newspapers, we operate three domains and share information between them through cookies to manage login accounts. Today, we have a consolidated view of our domains in Google Ad Manager and Google Analytics 4. An important metric for us, dependent on 3P cookies, is user sessions across domains and advertising, which generates quite a bit of profit. We would like to ask for 5+ domains in the Associated Subset, because we want to continue to grow our business and brands. |
El Universal is a Mexican newspaper based in Mexico City. We currently have nine domains: eluniversal.com.mx, viveusa.mx, revistaclase.mx, eluniversalpuebla.com.mx, eluniversalqueretaro.mx, elgrafico.mx, nosotras.com.mx, generacionuniversitaria.com.mx and de10.com.mx. And we do not share cookies between the domains. First-Party Sets proposal review: No concerns. We believe that this proposal is a good way to protect user privacy while still allowing publishers to personalize/monetize content. Associated Subset: We would like to have 5-15 domains in the Associated Subset. We need a considerable number of domains to be able to personalize content for our users. For example, we can use the information collected on various domains to show users news articles that are relevant to their interests. Additional comments: We think that user transparency can be achieved by labeling the domains that are part of the Associated Subset (e.g. news, social media, ecommerce, etc.). This can help users to understand the purpose of each domain and where/how their information is being shared. |
El País is a national Uruguayan daily newspaper with the largest circulation in the country. We operate 6+ domains, including classifieds, and a site that offers coupons, cashback on purchases and group deals to consumers (e.g. elpais.com.uy, elpais.uy, clubelpais.com.uy, clubelpais.com.uy, paula.com.uy and gallito.com.uy). Currently, our primary use of cookies across domains is to allow users to have a single sign-on experience. After understanding the First-Party Sets proposal, we have the following feedback to give: We believe that new brands will be forced into subdomains much more than new domains separated by functionality or niche user base. This is a concern because creating new digital consumer products is part of our core business. |
We, Agea, from Grupo Clarín— the most prominent and diversified media group in Argentina and one of the most important in the Spanish-speaking world, have reviewed the First-Party Sets proposal to learn more about how it could impact our publishing business. Today, Agea has more than 3 associated domains, meaning that the current limit of 3 domains in the Associated Subset as proposed in the First-Party Sets policy does not meet our needs and those of our users. Agea uses third-party cookies to (1) maintain users logged in to their subscribed products, (2) personalize content and page layouts (including the removal of unwanted or unrequired content), (3) promote our own products and services, and (4) increase the quality of our first-party data, as it is key to understanding consumer journey across domains to plan and execute our strategy as a media group. We want to preserve the use cases mentioned above. This would not be possible without Chrome's First-Party Sets proposal, which is a significant step towards a more private and secure web. I would like to emphasize the importance of our first-party user data (collected from and across all of our domains). This data is legitimate and comes from all of our companies, where we have the same shareholding composition. No matter if it is first, second or third-party cookies that we are talking about, in our understanding, it is first-party data. About our business and mission At Agea we produce a series of sites dedicated to news, entertainment, culture, sports, trends, and daily life. Our goal is to present more and better content for different audiences, with innovative apps and presence in various platforms. We advocate for democracy and its freedoms and privacy is a top priority for our company. To adapt to the First-Party Sets proposal, a simple and convenient solution would be to move brands to a single domain. But that doesn’t necessarily make sense for our customers. 
As a publisher, we have always believed in the independence of our brands and the unique relationship each has established with our customers. Our company’s identity (Grupo Clarín) is not the same as our brands’ identities (Clarín, Ole, etc.). Each domain/brand is committed to honest and independent communication management, performed with professional responsibility. We are the first choice of millions of Argentineans and we continue to grow organically and increase our user base year after year. It often happens, though, that we buy new domains/brands due to their potential to complement our suite of products and services. Recommendation Agea, part of Grupo Clarín, wants to (1) build first-party relationships with users, (2) gain and maintain consumers' trust in the long term by adopting privacy-preserving APIs and (3) continue to improve users' web browsing experience. That said, we’d like to be able to include 20 domains in the Associated Subset. It is also worth mentioning that there is no way of predicting all use cases. Even the number of domains in the Associated Subset might need to change as the business grows. We’ll learn as we go and Agea is committed to continuing this discussion to ensure better privacy for users on the web. |
Posting on behalf of @AnalyticsElTiempo as shared in #146 -- Company: El Tiempo Casa Editorial, a media conglomerate and owner of El Tiempo, the most widely circulated daily newspaper in Colombia. 3P cookie usage across O&O domains: We share information between our 10+ domains through cookies for advertising campaigns through a DMP. Examples: www.ElTiempo.com, www.portafolio.co, www.futbolred.com, www.motor.com.co, www.elempleo.com, www.diarioadn.co, www.alo.co, www.caustica.co, www.CityTV.com.co, etc. Future affected processes: Besides advertising-related use cases, we anticipate users having a less personalized experience when browsing through our domains. Before long, we want to be able to recommend content that users are likely to be interested in based on their browsing history (e.g. career-related articles at eltiempo.com for users looking for jobs at elempleo.com). First-Party Sets: We have reviewed the First-Party Sets policy and our only concern is that the Associated Subset is limited to 3 domains, which is too restrictive. We think that 10-15 domains would be a good balance between privacy and utility for users:
We’d like the 3-domain limit for the Associated Subset to be reconsidered by taking into account the impact that blocking third-party cookies will have on user experience and on publishers' ability to monetize their content. |
Caracol Televisión is a media and entertainment company that owns several outlets, including the private television channel of the same name, which has the highest audience in Colombia. It produces and broadcasts a variety of programming, including news, sports, entertainment, and telenovelas. The company shares information between its domains (e.g. caracoltv.com, bluradio.com, shock.co, noticiascaracol.com, volkgames.com, bumbox.com, lakalle.com, etc.) through cookies for the following purposes: We have reviewed the First-Party Sets proposal and we have some questions about its impact on analytics. We would like to know if we will still be able to track monthly active users (MAUs), time spent, and number of sessions across all of our brands/domains. These KPIs are important to us because they help us measure overall growth and user engagement. On a more operational note, we offer video content that comes from a third-party platform (OVP), and the videos have their own separate domain. Will that domain need to be included in our First-Party Set, or does Chrome have an alternative to solve this? We want to track video-related metrics and guarantee a good user experience, such as resuming a video from where it was last interrupted. As for the number of domains in the Associated Subset, we would like to have all the domains that we manage at the corporate level as part of it, since all domains are closely related (i.e. media consumption). Additionally, understanding users’ interests, attitudes, and behaviors in their digital media consumption is key to our business. |
I fully support @Simon-J-Harris and @rblanck There are issues that I don't quite understand in the current limitation of 3 and I would be grateful if someone could please help me to understand. We (Vocento) are a media group, with more than 20 news websites, mainly with a regional scope and thematic verticals. We have realized that the only way to survive and have quality journalism with local scope is to organize economies of scale. We combine our local newsrooms with content made by central newsrooms with topics of common interest in our local areas. In other words, we have a centralized vision of our consumers in order to be able to generate synergies. The use cases we are asked to develop are focused on centralized content recommendation systems, transversal service offers such as offers for shows, restaurant reservations, etc. We have a centralized view of our consumers in order to be able to generate synergies. Additionally and not least, we must aggregate our contextual audiences (it would be inoperative to activate it in each of our media) and be able to have a system of frequency control and internal retargeting in our media in order to be efficient for advertisers. The economy of scale to manage our client's vision, given the low level of login we maintain, must be based on a vision of a navigation identifier in First Party that helps us to have efficient systems of content recommendation, and services and to be able to control frequency in cross-site contextual environments. It is incomprehensible that as owners of more than 20 local and vertical media, we cannot manage our consumers because we would have a competitive advantage over the "Open Web". Especially due to the obvious lack of competitiveness, we have against Gate Keepers (European terminology), platforms, etc. Nor do I understand that ownership is not a transparent element for the user. That is what legal and privacy links and consent management are for. 
If the user is interested in it, they can easily find it. As for the argument that ownership is not transparent to the browser, I would dare to propose (I am not an expert) that, since the domain registry already has an "Organisation" field, it could be used to declare ownership and left open for automatic verification. Only domain owners would be able to modify this field in the registry. I hope that we will not take a decision that makes survival more difficult for small and medium projects that decided to join together to survive against the really big players with global audiences and multiple services. |
El Comercio is a Peruvian newspaper, one of the oldest Spanish-language papers in the world and one of the most influential media outlets in Peru. We are committed to protecting user privacy and we are looking forward to seeing Chrome develop new features that rely on the First-Party Sets relationship. Today, we share cookies across O&O domains to create products aimed at specific audiences. A sample of our domains includes: elcomercio.pe, depor.com, gestion.pe, trome.com, diariocorreo.pe and ojo.pe. Our suggestion is to allow at least 30 domains in the Associated Subset. Gathering information across a large number of domains is key, as we are constantly doing market studies to generate new products that may interest our users. This is our way of keeping the web engaging and innovative. |
Small affiliated sets are probably more of a risk than big ones because of the issue of reviewer time. (see #101) A reviewer can deal with one large set faster than with many small ones. Any mismatch in any two domains in a large set would let the reviewer reject the whole thing. And after a set is rejected or revoked, the domains in that set should not be allowed as members of other proposed sets for a significant period of time, in order to minimize the burden on reviewers. Reviewers could easily have a lot of time taken up by the same party trying to sneak through slightly modified sets. This time out for domains means that a larger set is likely to be more trustworthy, because the party is putting more domains at risk by proposing a large set than a small one. |
Re-posting @nlozanoarguelles' message from #153, since it's relevant to this discussion. -- Hello, |
Thanks to everyone who has posted here. The specifics you all shared on the number of impacted domains for your organization, and the list of use-cases impacted by third-party cookie deprecation are all very useful. We are currently taking these into consideration. |
Hey @krgovind catching up in this space, learning a lot, trying to understand the current state of things. One question that I can't quite find an answer to is where that 5 domain limit came from? Feel free to just answer that question on its own, although I'll expound a bit below to try to understand more about how we got here and where you might see this going. From all the issues in this repo and great discussions they link to, it seems like there's a significant top line philosophical divide between folks/companies who think: a) the privacy boundary should never go beyond what is in the browser bar and b) those who think there should be some room for a grouping of sites. It seems like Google and MSFT* fall into (b), Mozilla and Apple fall into (a). Within (b) there's different opinions about how these lists should be formed and constrained, but I think that's the top line distinction? So where I'm stuck is if Google thinks that the privacy boundary should go beyond the browser bar, and there's a good number of examples of businesses that have more than 5 sites that are willing to invest (and even have invested) in co-branding, how we wound up here? I'll venture an exploratory guess to try to tease out thinking:
Is that roughly correct? Or has co-branding been abandoned completely? (* I should be clear that I'm not speaking for MSFT here). |
@thegreatfatzby We explained the limit of 5 on the associated subset here. There is no limit on the number of ccTLD or service sites; but we do have technical checks that we think can meaningfully prevent misuse.
We do still list out "Domains whose affiliation with the set primary is clearly presented to users" in our set formation requirements; and ask set submitters to "provide an explanation of how they clearly present the affiliation across domains to users and why users would expect their domains to be affiliated (e.g., an About page, header or footer, shared branding or logo, etc).". However, we trust set submitters to follow these guidelines, and currently do not have technical checks that can verify this.
Correct. (a) We found that it is difficult to verify co-branding at web-scale in a manner that offers equitable access, and accommodates a range of visual design aesthetics and choice; and therefore opted to develop a submission process that relies on objective checks/limits. (b) Early on, we received feedback from other browsers and privacy advocates that having a numeric limit is preferred because it is hard for users to comprehend when their data is passively shared across hundreds of sites. At the time, many ecosystem players recommended a limit of 3-5 domains per set.
We had dismissed this because of the verification, and user-understanding issues that I explained above. |
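The kind of objective limit described above (at most 5 associated sites, with no cap on ccTLD or service sites) could be checked with something like the sketch below. It is not Chrome's actual validator: `check_set` is a hypothetical helper, and the field names `primary`, `associatedSites`, `serviceSites` and `ccTLDs` are assumed here for the shape of a submitted set entry.

```python
from typing import Dict, List

ASSOCIATED_LIMIT = 5  # limit on the associated subset mentioned in this thread


def check_set(entry: Dict) -> List[str]:
    """Return a list of problems found in one set entry (empty if none).

    The entry layout (primary / associatedSites / serviceSites / ccTLDs)
    is an assumption made for this sketch.
    """
    errors: List[str] = []
    if not entry.get("primary"):
        errors.append("set has no primary domain")
    associated = entry.get("associatedSites", [])
    if len(associated) > ASSOCIATED_LIMIT:
        errors.append(
            f"{len(associated)} associated sites exceeds the limit of {ASSOCIATED_LIMIT}"
        )
    # serviceSites and ccTLDs are intentionally not counted against the limit.
    return errors
```

A set with 5 associated sites and any number of service sites passes this check, while a sixth associated site produces one error.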
[Note: This issue captures an open question related to the changes proposed in PR #91 and summarized on issue #92]
While we've proposed a limit of three domains for the "associated" subset, we seek feedback on whether this would be suitable for ecosystem use cases.