Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Concurrency To Download Vendor List #4005

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 17 additions & 3 deletions config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -240,10 +240,19 @@ type Privacy struct {
LMT LMT
}

type VendorListFetcher struct {
// Max concurrent request for downloading the latest version of the vendor list from a specific major version
MaxConcurrencyInitFetchLatestVersion int `mapstructure:"max_concurrency_init_fetch_latest_version"`

// Max concurrent request for downloading every specific version of the vendor list from its first version to its latest version
MaxConcurrencyInitFetchSpecificVersion int `mapstructure:"max_concurrency_init_fetch_specific_version"`
}

type GDPR struct {
Enabled bool `mapstructure:"enabled"`
HostVendorID int `mapstructure:"host_vendor_id"`
DefaultValue string `mapstructure:"default_value"`
Enabled bool `mapstructure:"enabled"`
HostVendorID int `mapstructure:"host_vendor_id"`
DefaultValue string `mapstructure:"default_value"`
Comment on lines -244 to +256
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the vertical alignment was intentional.

Why change the indentation style of those lines ?

Copy link
Contributor Author

@zhongshixi zhongshixi Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebhtml do you mind elaborating more about vertical alignment was intentional ? Changing the indentation is automatically done by the go formatter ( i am using MS Visual Studio ) , which i believe it is easier to read.

but i am curious about other technical aspects related to it

Copy link
Contributor

@sebhtml sebhtml Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The types are vertically aligned:

                                |
                                v
	Timeouts                GDPRTimeouts `mapstructure:"timeouts_ms"`
	NonStandardPublishers   []string     `mapstructure:"non_standard_publishers,flow"`
	NonStandardPublisherMap map[string]struct{}
	TCF2                    TCF2 `mapstructure:"tcf2"`
	AMPException            bool `mapstructure:"amp_exception"`

see

Timeouts GDPRTimeouts `mapstructure:"timeouts_ms"`

I don't know if my comment is relevant though.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be go fmt's decision; they actually look wrong before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bsardo let me know your opinion

// TODO: migrate the fetcher timeout to VendorListFetcher
Timeouts GDPRTimeouts `mapstructure:"timeouts_ms"`
NonStandardPublishers []string `mapstructure:"non_standard_publishers,flow"`
NonStandardPublisherMap map[string]struct{}
Expand All @@ -255,6 +264,9 @@ type GDPR struct {
// to DefaultValue
EEACountries []string `mapstructure:"eea_countries"`
EEACountriesMap map[string]struct{}

// VendorListFetcher - configuration for fetching the vendor list from a remote source.
VendorListFetcher VendorListFetcher `mapstructure:"vendorlist_fetcher"`
}

func (cfg *GDPR) validate(v *viper.Viper, errs []error) []error {
Expand Down Expand Up @@ -1140,6 +1152,8 @@ func SetupViper(v *viper.Viper, filename string, bidderInfos BidderInfos) {
"FIN", "FRA", "GUF", "DEU", "GIB", "GRC", "GLP", "GGY", "HUN", "ISL", "IRL", "IMN", "ITA", "JEY", "LVA",
"LIE", "LTU", "LUX", "MLT", "MTQ", "MYT", "NLD", "NOR", "POL", "PRT", "REU", "ROU", "BLM", "MAF", "SPM",
"SVK", "SVN", "ESP", "SWE", "GBR"})
v.SetDefault("gdpr.vendorlist_fetcher.max_concurrency_init_fetch_latest_version", 1) // by default one download at a time
v.SetDefault("gdpr.vendorlist_fetcher.max_concurrency_init_fetch_latest_version", 1) // by default one download at a time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this line repeated twice ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, will change it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks !

v.SetDefault("ccpa.enforce", false)
v.SetDefault("lmt.enforce", true)
v.SetDefault("currency_converter.fetch_url", "https://cdn.jsdelivr.net/gh/prebid/currency-file@1/latest.json")
Expand Down
45 changes: 34 additions & 11 deletions gdpr/vendorlist-fetching.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,11 +16,17 @@ import (
"github.com/prebid/go-gdpr/vendorlist2"
"github.com/prebid/prebid-server/v2/config"
"golang.org/x/net/context/ctxhttp"
"golang.org/x/sync/errgroup"
)

type saveVendors func(uint16, uint16, api.VendorList)
type VendorListFetcher func(ctx context.Context, specVersion uint16, listVersion uint16) (vendorlist.VendorList, error)

type vendorListVersionGroup struct {
specVersion uint16
firstListVersion uint16
}

// This file provides the vendorlist-fetching function for Prebid Server.
//
// For more info, see https://github.com/prebid/prebid-server/issues/504
Expand All @@ -32,7 +38,7 @@ func NewVendorListFetcher(initCtx context.Context, cfg config.GDPR, client *http

preloadContext, cancel := context.WithTimeout(initCtx, cfg.Timeouts.InitTimeout())
defer cancel()
preloadCache(preloadContext, client, urlMaker, cacheSave)
preloadCache(preloadContext, client, urlMaker, cacheSave, cfg.VendorListFetcher)

saveOneRateLimited := newOccasionalSaver(cfg.Timeouts.ActiveTimeout())
return func(ctx context.Context, specVersion, listVersion uint16) (vendorlist.VendorList, error) {
Expand Down Expand Up @@ -61,11 +67,8 @@ func makeVendorListNotFoundError(specVersion, listVersion uint16) error {
}

// preloadCache saves all the known versions of the vendor list for future use.
func preloadCache(ctx context.Context, client *http.Client, urlMaker func(uint16, uint16) string, saver saveVendors) {
versions := [2]struct {
specVersion uint16
firstListVersion uint16
}{
func preloadCache(ctx context.Context, client *http.Client, urlMaker func(uint16, uint16) string, saver saveVendors, conf config.VendorListFetcher) {
versions := [2]vendorListVersionGroup{
{
specVersion: 2,
firstListVersion: 2, // The GVL for TCF2 has no vendors defined in its first version. It's very unlikely to be used, so don't preload it.
Expand All @@ -75,13 +78,33 @@ func preloadCache(ctx context.Context, client *http.Client, urlMaker func(uint16
firstListVersion: 1,
},
}
for _, v := range versions {
latestVersion := saveOne(ctx, client, urlMaker(v.specVersion, 0), saver)
var wgLatestVersion errgroup.Group
var wgSpecificVersion errgroup.Group

for i := v.firstListVersion; i < latestVersion; i++ {
saveOne(ctx, client, urlMaker(v.specVersion, i), saver)
}
wgLatestVersion.SetLimit(conf.MaxConcurrencyInitFetchLatestVersion)
wgSpecificVersion.SetLimit(conf.MaxConcurrencyInitFetchSpecificVersion)

tsStart := time.Now() // For logging how long this takes
for _, v := range versions {
specVersion := v.specVersion
firstVersion := v.firstListVersion
wgLatestVersion.Go(func() error {
latestVersion := saveOne(ctx, client, urlMaker(specVersion, 0), saver)
for i := firstVersion; i < latestVersion; i++ {
currentVersion := i
wgSpecificVersion.Go(func() error {
saveOne(ctx, client, urlMaker(specVersion, currentVersion), saver)
Copy link
Contributor Author

@zhongshixi zhongshixi Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to help reviewer understand the reason behind reason that i do not use channel or wait group+mutex is because the implementation of saveOne is already mutex safe due to its using sync.Map underneath, there is no need to add unnecessary new lock mechanism

https://pkg.go.dev/sync#Map

return nil
})
}
return nil
})
}

wgLatestVersion.Wait()
wgSpecificVersion.Wait()

glog.Infof("Finished Preloading vendor lists within %v", time.Since(tsStart))
}

// Make a URL which can be used to fetch a given version of the Global Vendor List. If the version is 0,
Expand Down
16 changes: 9 additions & 7 deletions gdpr/vendorlist-fetching_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ import (
"net/http"
"net/http/httptest"
"strconv"
"sync"
"testing"

"github.com/stretchr/testify/assert"
Expand Down Expand Up @@ -192,12 +193,16 @@ type versionInfo struct {
}
type saver []versionInfo

var saverMutex sync.Mutex

func (s *saver) saveVendorLists(specVersion uint16, listVersion uint16, gvl api.VendorList) {
vi := versionInfo{
specVersion: specVersion,
listVersion: listVersion,
}
saverMutex.Lock()
*s = append(*s, vi)
saverMutex.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If *s = append(*s, vi) panics, I think that the saverMutex will never be unlocked.

I think that deferring can avoid that:

	saverMutex.Lock()
	defer saverMutex.Unlock()
	*s = append(*s, vi)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

}

func TestPreloadCache(t *testing.T) {
Expand Down Expand Up @@ -253,7 +258,10 @@ func TestPreloadCache(t *testing.T) {
defer server.Close()

s := make(saver, 0, 5)
preloadCache(context.Background(), server.Client(), testURLMaker(server), s.saveVendorLists)
preloadCache(context.Background(), server.Client(), testURLMaker(server), s.saveVendorLists, config.VendorListFetcher{
MaxConcurrencyInitFetchLatestVersion: 2,
MaxConcurrencyInitFetchSpecificVersion: 2,
})

expectedLoadedVersions := []versionInfo{
{specVersion: 2, listVersion: 2},
Expand Down Expand Up @@ -284,12 +292,6 @@ var vendorList2Expected = testExpected{
vendorPurposes: map[int]bool{1: false, 2: true, 3: true},
}

var vendorListFallbackExpected = testExpected{
Copy link
Contributor Author

@zhongshixi zhongshixi Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code block is not used at all so i deleted it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you need to delete it to accomplish your goal in this PR ?

Copy link
Contributor Author

@zhongshixi zhongshixi Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but why we need to keep it if the code is not used at all? there is no comment or description associated with the deleted test case.

i used a practice called grooming as you go - so if you happen to touch the related code piece as part of the solution, then you just do a little bit grooming to avoid large bulk of grooming/refactoring in the future

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I see no problem with that.

vendorListVersion: 215, // Values from hardcoded fallback file.
vendorID: 12,
vendorPurposes: map[int]bool{1: true, 2: false, 3: true},
}

type vendorList struct {
GVLSpecificationVersion uint16 `json:"gvlSpecificationVersion"`
VendorListVersion uint16 `json:"vendorListVersion"`
Expand Down
1 change: 1 addition & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@ require (
github.com/yudai/golcs v0.0.0-20170316035057-ecda9a501e82 // indirect
github.com/yudai/pp v2.0.1+incompatible // indirect
golang.org/x/crypto v0.21.0 // indirect
golang.org/x/sync v0.8.0 // indirect
golang.org/x/sys v0.18.0 // indirect
google.golang.org/genproto v0.0.0-20230410155749-daa745c078e1 // indirect
google.golang.org/protobuf v1.33.0 // indirect
Expand Down
2 changes: 2 additions & 0 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -662,6 +662,8 @@ golang.org/x/sync v0.0.0-20200625203802-6e8e738ad208/go.mod h1:RxMgew5VJxzue5/jJ
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201207232520-09787c993a3a/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.8.0 h1:3NFvSEYkUoMifnESzZl15y791HH1qU2xm6eCJU5ZPXQ=
golang.org/x/sync v0.8.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sys v0.0.0-20180823144017-11551d06cbcc/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
Expand Down
9 changes: 8 additions & 1 deletion sample/001_banner/app.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,13 @@ status_response: "ok" # default response string for /status endpoint

gdpr:
default_value: "0" # disable gdpr, explicitly specifying a default value is a requirement in prebid server config
vendorlist_fetcher:
max_concurrency_init_fetch_latest_version: 1
max_concurrency_init_fetch_specific_version: 1
timeouts_ms:
init_vendorlist_fetches: 30000
active_vendorlist_fetch: 30000


# set up stored request storage using local file system
stored_requests:
Expand All @@ -17,4 +24,4 @@ stored_requests:
stored_responses:
filesystem:
enabled: true
directorypath: ./stored_responses/data/by_id
directorypath: ./stored_responses/data/by_id
Loading