-
Notifications
You must be signed in to change notification settings - Fork 556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce CPU & network consumption of Facia JSON download #26338
Conversation
7bcd556
to
9431a92
Compare
4ada4c3
to
eb1fdae
Compare
I've talked this through with @abeddow91, @arelra, @ioannakok, @georgeblahblah and @ParisaTork at the 2pm meeting today, and response was positive, no blocking objections! Some evidence shared:
|
Thanks for the detailed write-up @rtyley! Not sure if this has come up already, but (iirc) the only reason @ioannakok and I were able to find for having a Might be worth looking into if it hasn't been discussed yet? |
Ah interesting - that would have been #18364 & #18365 ...
That is a super interesting point @bryophyta ! Looking at the uncompressed files sizes for
...the lite versions aren't even dramatically smaller - they are still 67% of the same size as the original versions of those files. So I think you're right - they're adding 67% to storage requirements (assuming that the Scala case-class memory representation is roughly proportional to the file size) and are actively harmful to the cache hit rate, now that we're using a cache. Subsequent to this PR, in a new PR, I definitely agree: the 'lite' versions should be completely removed. The only thing to watch out for would be to see if the output of the endpoints that use the 'lite' versions ( |
Thanks for this!
Agreed that we'd need to check implications for consumers 👍 Hopefully it should be easy enough to repurpose the existing code that creates the From what I remember though it might take a little bit of detective work to find out who the consumers are though (I happen to know that one of the consumers of But ultimately up to WebX to decide whether it's something to prioritise of course 🙂 |
} | ||
|
||
class FrontJsonFapiLive(val blockingOperations: BlockingOperations) extends FrontJsonFapi { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're deleting BlockingOperations
this config can also be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that bit of config is still used by the BlockingOperations
class, and that hasn't been deleted (unfortunately?!) - we're no longer using it for the Fronts-JSON download code, but it's still used elsewhere, eg dfp.OrderAgent
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic work @rtyley 👏 Thank you for doing this and for the great PR write-up!
Excellent work @rtyley 💎 |
eb1fdae
to
565d89c
Compare
Context: #26335 In this change: * Use AWS SDK v2 and non-blocking async code to avoid blocking a thread while downloading the Facia JSON data from S3 * Directly transform bytes into JsValue, without constructing a `String` first Note that the work done between the two metrics `FrontDownloadLatency` & `FrontDecodingLatency` has changed slightly - the conversion to the basic JSON model (JsValue) now occurs in `FrontDownloadLatency`, rather than the 'decoding' step. We could get rid of the futureSemaphore and stop using the dedicated blocking-operations pool here, but we'll leave that to another PR. Co-authored-by: Ravi <[email protected]> Co-authored-by: Ioanna Kokkini <[email protected]>
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_, thanks to ETag-caching - see https://github.com/guardian/etag-caching . This library has already been used in DotCom PROD with guardian/frontend#26338 * AWS SDK v2: the FAPI client itself now has a `fapi-s3-sdk-v2` artifact. An example PR consuming this updated version of the FAPI client is at: guardian/ophan#5506 Updated FAPI artifact layout ---------------------------- To use FAPI with the new AWS SDK v2 support, users must now have a dependency on *two* FAPI artifacts: * `fapi-s3-sdk-v2` * `fapi-client-playXX` Due to needing to support the matrix of: * AWS SDK v1 & v2 * Play-JSON 2.8, 2.9, 3.0 ...it's best not to try to produce an artifact that corresponds to every single combination of those! Consequently, we provide an artifacts that are specific to the different versions of AWS SDK (or at least, could do - if AWS SDK v1 was moved out of common code), and artifacts that are specific to the different versions of Play-JSON, and allow the user to combine them as needed. A similar approach was used with `guardian/play-secret-rotation`: guardian/play-secret-rotation#8 In order for the different artifacts to have interfaces they can use to join together and become a single useful Facia client, we have a `fapi-client-core` artifact. Any code that doesn't depend on the JSON classes, or the actual AWS SDK version (which isn't much!), can live in there. In particular, we have: * `com.gu.facia.client.ApiClient`, an existing type that is now a trait, with 2 implementations - one that uses the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * `com.gu.facia.client.etagcaching.fetching.S3FetchBehaviour`, a new trait that exposes just enough interface to allow the conditional fetching used for ETag-based caching, but doesn't tie you to any specific version of the AWS SDK.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_ (thanks to [ETag-caching](https://github.com/guardian/etag-caching)) * Independence from AWS SDK version (v1 vs v2) _(this PR can replace #286 The [ETag-caching](https://github.com/guardian/etag-caching) library itself is also being used in DotCom PROD, introduced with guardian/frontend#26338. ### Usage ```scala import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching import com.gu.facia.client.{ApiClient, Environment} import software.amazon.awssdk.services.s3.S3AsyncClient val s3AsyncClient = S3AsyncClient.builder().region(...).credentialsProvider(...).build() val apiClient = ApiClient.withCaching( "facia-tool-store", Environment.Prod, S3ObjectFetching.byteArraysWith(s3AsyncClient) ) ``` _PR using this updated version of the FAPI client: https://github.com/guardian/ophan/pull/6741_ ### Independence from AWS SDK version (v1 vs v2) Ideally, the whole of `facia-scala-client` would be independent of AWS SDK version - we'd _like_ consumers of this library to be able to use whatever AWS SDK version they want, without us pulling in dependency on either SDK version. For `facia-scala-client` this is an attainable goal, as the only AWS API action it performs is fetching from S3, and [guardian/etag-caching](https://github.com/guardian/etag-caching) provides the [`S3ByteArrayFetching`](https://github.com/guardian/etag-caching/blob/v6.0.0/aws-s3/base/src/main/scala/com/gu/etagcaching/aws/s3/package.scala#L6-L22) abstraction that encapsulates this action without tying to a specific AWS SDK version. Due to legacy code compatibility, we can't completely remove AWS SDK v1 from `fapi-client` for now, but we _have_ removed it from `fapi-client-core`, which is the new home of `com.gu.facia.client.ApiClient`, which is now a trait, with 2 constructor methods that provide different implementations: * **`ApiClient()`** - legacy, using the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * **`ApiClient.withCaching()`** - provides ETag-based caching and is independent of AWS SDK version - the consumer just needs to provide an appropriate instance of `com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching` (ie with `"com.gu.etag-caching" %% "aws-s3-sdk-v2"` and `S3ObjectFetching.byteArraysWith(s3AsyncClient)`, introduced with guardian/etag-caching#65) ### Solved problems * **Noisy logging associated with absent collection JSON** - the `etag-caching` library has been updated with guardian/etag-caching#56 to avoid excessive logging that would occur in the Facia client, due to it typically trying to access collections that aren't yet persisted: #32.
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_ (thanks to [ETag-caching](https://github.com/guardian/etag-caching)) * Independence from AWS SDK version (v1 vs v2) _(this PR can replace #286 The [ETag-caching](https://github.com/guardian/etag-caching) library itself is also being used in DotCom PROD, introduced with guardian/frontend#26338. ### Usage ```scala import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching import com.gu.facia.client.{ApiClient, Environment} import software.amazon.awssdk.services.s3.S3AsyncClient val s3AsyncClient = S3AsyncClient.builder().region(...).credentialsProvider(...).build() val apiClient = ApiClient.withCaching( "facia-tool-store", Environment.Prod, S3ObjectFetching.byteArraysWith(s3AsyncClient) ) ``` _PR using this updated version of the FAPI client: https://github.com/guardian/ophan/pull/6741_ ### Independence from AWS SDK version (v1 vs v2) Ideally, the whole of `facia-scala-client` would be independent of AWS SDK version - we'd _like_ consumers of this library to be able to use whatever AWS SDK version they want, without us pulling in dependency on either SDK version. For `facia-scala-client` this is an attainable goal, as the only AWS API action it performs is fetching from S3, and [guardian/etag-caching](https://github.com/guardian/etag-caching) provides the [`S3ByteArrayFetching`](https://github.com/guardian/etag-caching/blob/v6.0.0/aws-s3/base/src/main/scala/com/gu/etagcaching/aws/s3/package.scala#L6-L22) abstraction that encapsulates this action without tying to a specific AWS SDK version. Due to legacy code compatibility, we can't completely remove AWS SDK v1 from `fapi-client` for now, but we _have_ removed it from `fapi-client-core`, which is the new home of `com.gu.facia.client.ApiClient`, which is now a trait, with 2 constructor methods that provide different implementations: * **`ApiClient()`** - legacy, using the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * **`ApiClient.withCaching()`** - provides ETag-based caching and is independent of AWS SDK version - the consumer just needs to provide an appropriate instance of `com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching` (ie with `"com.gu.etag-caching" %% "aws-s3-sdk-v2"` and `S3ObjectFetching.byteArraysWith(s3AsyncClient)`, introduced with guardian/etag-caching#65) ### Solved problems * **Noisy logging associated with absent collection JSON** - the `etag-caching` library has been updated with guardian/etag-caching#56 to avoid excessive logging that would occur in the Facia client, due to it typically trying to access collections that aren't yet persisted: #32.
With guardian/facia-scala-client#287, the Facia client has been updated so that it can take advantage of ETag-caching, using the https://github.com/guardian/etag-caching library already used in Frontend since August 2023 with #26338. As some changes were necessary to the `etag-caching` library to facilitate this, we need to update the version of `etag-caching` used by Frontend to v7.0.0. We also need to use an updated version of CAPI, as the CAPI version used by Facia was recently updated with guardian/facia-scala-client#341. Although this change upgrades to the latest version of the Facia client, it doesn't take advantage of the new ETag-caching support there - that can come in a later PR, as it does involve switching to AWS SDK v2 (not everywhere in Frontend, just for where we use the Facia client, but it's still a significant change as the AWS STS credential creation has got more elaborate with AWS SDK v2).
With guardian/facia-scala-client#287, the Facia client has been updated so that it can take advantage of ETag-caching, using the https://github.com/guardian/etag-caching library already used in Frontend since August 2023 with #26338. As some changes were necessary to the `etag-caching` library to facilitate this, we need to update the version of `etag-caching` used by Frontend to v7.0.0. Although this change upgrades to the latest version of the Facia client, it doesn't take advantage of the new ETag-caching support there - that can come in a later PR, as it does involve switching to AWS SDK v2 (not everywhere in Frontend, just for where we use the Facia client, but it's still a significant change as AWS STS credential creation has got more elaborate with AWS SDK v2).
With guardian/facia-scala-client#287, the Facia client has been updated so that it can take advantage of ETag-caching, using the https://github.com/guardian/etag-caching library already used in Frontend since August 2023 with #26338. As some changes were necessary to the `etag-caching` library to facilitate this, we need to update the version of `etag-caching` used by Frontend to v7.0.0. This does involve switching to AWS SDK v2 - not everywhere in Frontend, just for where we use the Facia client!
With guardian/facia-scala-client#287, the Facia client has been updated so that it can take advantage of ETag-caching, using the https://github.com/guardian/etag-caching library already used in Frontend since August 2023 with #26338. As some changes were necessary to the `etag-caching` library to facilitate this, we need to update the version of `etag-caching` used by Frontend to v7.0.0. This does involve switching to AWS SDK v2 - not everywhere in Frontend, just for where we use the Facia client!
This PR looks to fix #26335 by improving the scalability of Facia's pressed-page-json loading code in
FrontJsonFapi
. Originally it focussed only on adopting AWS SDK v2 to allow non-blocking async code for the downloading, but it now additionally includes a more significant change: ETag-based caching.ETag caching is a pretty wonderful standard part of the HTTP protocol, relying on these 3 components:
ETag
HTTP response header - a hash of the content, returned by the serverIf-None-Match
HTTP request header - sent by the client, to indicate what content it already has, by sending its ETag304 Not Modified
HTTP response code - returned by a server when content is unchanged from the supplied ETag, with an otherwise blank response (no need to return a response!)I've extracted the code needed for an
ETagCache
into a new library at https://github.com/guardian/etag-caching .ETag Caching Benefits
PressedPage
s are only re-downloaded if the S3 content has changedPressedPage
s are only re-parsed if the S3 content has changedThe fetching and parsing code is also improved:
String
firstMemory requirements of caching
Currently
facia
runs onc6g.2xlarge
instances (16GB RAM), with a JVM heap of 12GB. I'm going to suggest that it's reasonable for theETagCache
to consume up to 4GB of the 12GB RAM.There are ~300 fronts, each of which can have 4 different variants, so currently there could be up to 1200 different
PressedPage
objects. From heapdump analysis, the largestPressedPage
retains ~22MB of memory (most instances are smaller, averaging at 4MB). With the 4GB budget, and assuming a worse case of 22MB perPressedPage
, we can afford to set a max size on the cache of ~180 entries. Although it's disappointing we can't hold all of thePressedPage
s, the priorities of the eviction policy used by the Caffeine caching library should ensure that we get a good hit rate on the most in-demand fronts.Incidentally, heapdump analysis also shows that some structures within
PressedPage
are memory inefficient when considered from the perspective of holding manyPressedPage
objects in memory at once - using object pooling onmodel.Tag
for instance, would probably lead to a 80% reduction of the total retained memory.Removal of
FutureSemaphore
FutureSemaphore
, introduced in November 2017 with #18331, is removed in this change - as described in #26335 (comment) & #26336 (comment), the late-stage 32-concurrent-decoding-processes limit was harmful: it led to work (JSON download &String
generation) being thrown away, without the result of that work being cached by the CDN.Testing
Testing total request-response time on CODE: