telemetry worker: flush data after stops, and two other fixes #515

cataphract · 2024-07-01T13:55:51Z

Telemetry workers are functionally dead after a Stop lifecycle action, provided there's no intervening Start. While AddPoint actions are still processed, their data is never flushed, since the Stop action handler unschedules FlushMetrics and FlushData actions.

PHP sends a Stop action at the end of every request via ddog_sidecar_telemetry_end(), but a Start action is only generated just after a telemetry worker is spawned. With no more Start actions generated, no metrics can effectively be sent after the first Stop.

It is not clear to me whether the intention is to have a Start/Stop pair on every PHP request (where Stop flushes the metrics) or if the intention is to to have only such a pair in the first request, with the Stop event generated by ddog_sidecar_telemetry_end() effectively a noop. It would appear, judging by this
comment:

Also allow the telemetry worker to have a mode where it's continuing
execution after a start-stop cycle, otherwise it won't send any more metrics afterwards.

that the intention is to keep sending metrics after a Start/Stop pair. It also makes more sense, insofar as data is flushed only on the interval, rather than after every request via Stop. In that case:

The Stop action handler should not unschedule FlushData and FlushMetrics events and
FlushData, if called outside a Start-Stop pair, should not be a noop.

Finally: swap the order in which FlushData and FlushMetrics are scheduled so that FlushMetrics runs first and therefore its generated data can be sent by the next FlushData.

pr-commenter · 2024-07-01T14:00:11Z

Benchmarks

Comparison

Benchmark execution time: 2024-12-06 11:15:26

Comparing candidate commit cbce086 in PR branch glopes/flush-data-after-stop with baseline commit fd011c1 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 51 metrics, 2 unstable metrics.

Candidate

Candidate benchmark details

Group 1

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	508.601µs	509.656µs ± 1.294µs	509.512µs ± 0.324µs	509.823µs	510.362µs	512.166µs	523.816µs	2.81%	8.200	80.075	0.25%	0.091µs	1	200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	1909066.449op/s	1962119.747op/s ± 4880.429op/s	1962661.293op/s ± 1247.720op/s	1963904.986op/s	1965364.520op/s	1965784.276op/s	1966176.445op/s	0.18%	-8.088	78.315	0.25%	345.098op/s	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	465.969µs	467.288µs ± 0.590µs	467.290µs ± 0.267µs	467.533µs	468.058µs	468.452µs	471.080µs	0.81%	2.195	13.344	0.13%	0.042µs	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	2122782.879op/s	2140013.050op/s ± 2696.197op/s	2139998.716op/s ± 1223.498op/s	2141385.363op/s	2143702.016op/s	2145868.807op/s	2146064.784op/s	0.28%	-2.156	13.070	0.13%	190.650op/s	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	175.838µs	176.839µs ± 0.367µs	176.844µs ± 0.209µs	177.053µs	177.490µs	177.706µs	177.768µs	0.52%	-0.082	0.280	0.21%	0.026µs	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	5625319.426op/s	5654881.330op/s ± 11749.493op/s	5654704.438op/s ± 6688.364op/s	5661396.240op/s	5673873.040op/s	5684175.092op/s	5687063.110op/s	0.57%	0.096	0.287	0.21%	830.815op/s	1	200
normalization/normalize_service/normalize_service/[empty string]	execution_time	45.245µs	45.508µs ± 0.114µs	45.499µs ± 0.083µs	45.598µs	45.680µs	45.747µs	45.796µs	0.65%	0.046	-0.577	0.25%	0.008µs	1	200
normalization/normalize_service/normalize_service/[empty string]	throughput	21835826.179op/s	21974196.975op/s ± 55014.463op/s	21978719.144op/s ± 40221.587op/s	22012877.848op/s	22063793.950op/s	22083842.708op/s	22101813.493op/s	0.56%	-0.036	-0.579	0.25%	3890.110op/s	1	200
normalization/normalize_service/normalize_service/test_ASCII	execution_time	41.802µs	42.103µs ± 0.215µs	42.057µs ± 0.177µs	42.295µs	42.460µs	42.559µs	42.590µs	1.27%	0.373	-1.114	0.51%	0.015µs	1	200
normalization/normalize_service/normalize_service/test_ASCII	throughput	23479576.214op/s	23752159.921op/s ± 120828.433op/s	23777411.398op/s ± 100312.863op/s	23855837.113op/s	23912385.125op/s	23920646.605op/s	23922244.385op/s	0.61%	-0.361	-1.128	0.51%	8543.860op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	[509.477µs; 509.835µs] or [-0.035%; +0.035%]	None	None	None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	[1961443.367op/s; 1962796.128op/s] or [-0.034%; +0.034%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	[467.206µs; 467.369µs] or [-0.018%; +0.018%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	[2139639.383op/s; 2140386.718op/s] or [-0.017%; +0.017%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	[176.788µs; 176.890µs] or [-0.029%; +0.029%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	[5653252.964op/s; 5656509.697op/s] or [-0.029%; +0.029%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	execution_time	[45.492µs; 45.524µs] or [-0.035%; +0.035%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	throughput	[21966572.500op/s; 21981821.451op/s] or [-0.035%; +0.035%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	execution_time	[42.073µs; 42.132µs] or [-0.071%; +0.071%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	throughput	[23735414.263op/s; 23768905.580op/s] or [-0.071%; +0.071%]	None	None	None

Group 2

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching string interning on wordpress profile	execution_time	142.175µs	143.124µs ± 0.377µs	143.090µs ± 0.180µs	143.268µs	143.633µs	144.183µs	146.136µs	2.13%	2.946	20.365	0.26%	0.027µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching string interning on wordpress profile	execution_time	[143.071µs; 143.176µs] or [-0.036%; +0.036%]	None	None	None

Group 3

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
credit_card/is_card_number/	execution_time	4.617µs	4.628µs ± 0.005µs	4.628µs ± 0.003µs	4.631µs	4.638µs	4.642µs	4.643µs	0.32%	0.313	0.042	0.11%	0.000µs	1	200
credit_card/is_card_number/	throughput	215373416.243op/s	216065890.140op/s ± 242707.789op/s	216069058.911op/s ± 157021.133op/s	216244234.905op/s	216462298.391op/s	216550453.320op/s	216579289.969op/s	0.24%	-0.306	0.036	0.11%	17162.032op/s	1	200
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	96.302µs	96.488µs ± 0.385µs	96.453µs ± 0.033µs	96.491µs	96.558µs	96.616µs	101.845µs	5.59%	13.572	186.324	0.40%	0.027µs	1	200
credit_card/is_card_number/ 3782-8224-6310-005	throughput	9818834.015op/s	10364127.431op/s ± 39237.712op/s	10367789.869op/s ± 3555.142op/s	10370950.708op/s	10374792.368op/s	10379240.586op/s	10384035.584op/s	0.16%	-13.522	185.384	0.38%	2774.525op/s	1	200
credit_card/is_card_number/ 378282246310005	execution_time	88.393µs	88.827µs ± 0.421µs	88.788µs ± 0.119µs	88.913µs	89.170µs	89.317µs	94.081µs	5.96%	9.763	119.310	0.47%	0.030µs	1	200
credit_card/is_card_number/ 378282246310005	throughput	10629191.711op/s	11258021.769op/s ± 51070.282op/s	11262827.617op/s ± 15087.989op/s	11277575.950op/s	11302223.206op/s	11311134.535op/s	11313054.133op/s	0.45%	-9.396	113.248	0.45%	3611.214op/s	1	200
credit_card/is_card_number/37828224631	execution_time	4.613µs	4.627µs ± 0.005µs	4.627µs ± 0.003µs	4.630µs	4.635µs	4.641µs	4.642µs	0.32%	-0.070	0.148	0.11%	0.000µs	1	200
credit_card/is_card_number/37828224631	throughput	215426726.851op/s	216114693.726op/s ± 245368.502op/s	216112210.341op/s ± 152496.877op/s	216274111.476op/s	216520341.781op/s	216735660.837op/s	216779538.820op/s	0.31%	0.077	0.147	0.11%	17350.173op/s	1	200
credit_card/is_card_number/378282246310005	execution_time	84.776µs	85.466µs ± 0.293µs	85.437µs ± 0.180µs	85.655µs	86.001µs	86.080µs	86.139µs	0.82%	0.147	-0.332	0.34%	0.021µs	1	200
credit_card/is_card_number/378282246310005	throughput	11609082.311op/s	11700703.058op/s ± 40135.826op/s	11704587.618op/s ± 24666.541op/s	11724084.494op/s	11768393.744op/s	11787931.877op/s	11795786.505op/s	0.78%	-0.130	-0.334	0.34%	2838.031op/s	1	200
credit_card/is_card_number/37828224631000521389798	execution_time	64.208µs	64.312µs ± 0.073µs	64.309µs ± 0.023µs	64.331µs	64.374µs	64.412µs	65.175µs	1.35%	8.390	97.778	0.11%	0.005µs	1	200
credit_card/is_card_number/37828224631000521389798	throughput	15343188.775op/s	15549203.647op/s ± 17413.360op/s	15549992.003op/s ± 5584.246op/s	15555714.824op/s	15566495.533op/s	15571570.397op/s	15574321.059op/s	0.16%	-8.291	96.241	0.11%	1231.310op/s	1	200
credit_card/is_card_number/x371413321323331	execution_time	6.834µs	6.844µs ± 0.004µs	6.844µs ± 0.002µs	6.846µs	6.850µs	6.853µs	6.857µs	0.20%	0.339	1.146	0.06%	0.000µs	1	200
credit_card/is_card_number/x371413321323331	throughput	145829534.650op/s	146114227.936op/s ± 80849.951op/s	146123555.462op/s ± 41166.393op/s	146158111.096op/s	146241197.636op/s	146314224.459op/s	146331742.468op/s	0.14%	-0.334	1.142	0.06%	5716.955op/s	1	200
credit_card/is_card_number_no_luhn/	execution_time	4.612µs	4.627µs ± 0.006µs	4.628µs ± 0.004µs	4.632µs	4.638µs	4.643µs	4.644µs	0.34%	0.122	-0.326	0.13%	0.000µs	1	200
credit_card/is_card_number_no_luhn/	throughput	215352618.737op/s	216106487.418op/s ± 286175.190op/s	216092339.532op/s ± 201224.965op/s	216324420.376op/s	216520642.261op/s	216699105.853op/s	216807703.706op/s	0.33%	-0.115	-0.330	0.13%	20235.642op/s	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	76.866µs	77.341µs ± 0.140µs	77.336µs ± 0.101µs	77.442µs	77.558µs	77.630µs	77.666µs	0.43%	-0.274	-0.058	0.18%	0.010µs	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	12875680.158op/s	12929773.675op/s ± 23368.387op/s	12930570.594op/s ± 16924.148op/s	12946691.996op/s	12969650.953op/s	12983313.766op/s	13009653.163op/s	0.61%	0.284	-0.044	0.18%	1652.394op/s	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	68.882µs	69.213µs ± 0.134µs	69.207µs ± 0.093µs	69.307µs	69.445µs	69.493µs	69.513µs	0.44%	0.058	-0.366	0.19%	0.009µs	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	14385792.631op/s	14448169.844op/s ± 27902.416op/s	14449496.841op/s ± 19418.864op/s	14467754.993op/s	14489136.048op/s	14514778.603op/s	14517676.723op/s	0.47%	-0.048	-0.363	0.19%	1972.999op/s	1	200
credit_card/is_card_number_no_luhn/37828224631	execution_time	4.614µs	4.628µs ± 0.005µs	4.628µs ± 0.003µs	4.632µs	4.638µs	4.640µs	4.643µs	0.33%	0.024	-0.064	0.12%	0.000µs	1	200
credit_card/is_card_number_no_luhn/37828224631	throughput	215356674.901op/s	216072465.202op/s ± 255304.330op/s	216073833.459op/s ± 162876.412op/s	216232588.430op/s	216495711.068op/s	216667019.655op/s	216722100.790op/s	0.30%	-0.017	-0.065	0.12%	18052.742op/s	1	200
credit_card/is_card_number_no_luhn/378282246310005	execution_time	65.092µs	65.502µs ± 0.153µs	65.500µs ± 0.113µs	65.622µs	65.754µs	65.860µs	65.925µs	0.65%	0.085	-0.315	0.23%	0.011µs	1	200
credit_card/is_card_number_no_luhn/378282246310005	throughput	15168788.463op/s	15266684.031op/s ± 35686.812op/s	15267290.601op/s ± 26383.661op/s	15292411.911op/s	15320918.571op/s	15345738.665op/s	15362977.453op/s	0.63%	-0.073	-0.318	0.23%	2523.439op/s	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	64.215µs	64.310µs ± 0.032µs	64.307µs ± 0.017µs	64.327µs	64.362µs	64.396µs	64.411µs	0.16%	0.113	1.365	0.05%	0.002µs	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	15525397.361op/s	15549686.751op/s ± 7767.456op/s	15550408.193op/s ± 4139.541op/s	15553764.879op/s	15564096.664op/s	15568658.238op/s	15572620.219op/s	0.14%	-0.108	1.363	0.05%	549.242op/s	1	200
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	6.833µs	6.843µs ± 0.003µs	6.843µs ± 0.002µs	6.845µs	6.849µs	6.852µs	6.854µs	0.16%	0.193	1.165	0.05%	0.000µs	1	200
credit_card/is_card_number_no_luhn/x371413321323331	throughput	145895326.789op/s	146126057.905op/s ± 66257.699op/s	146131705.348op/s ± 34805.927op/s	146163404.338op/s	146225972.657op/s	146290998.123op/s	146344822.837op/s	0.15%	-0.189	1.164	0.05%	4685.127op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
credit_card/is_card_number/	execution_time	[4.628µs; 4.629µs] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number/	throughput	[216032253.175op/s; 216099527.105op/s] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	[96.435µs; 96.541µs] or [-0.055%; +0.055%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	throughput	[10358689.461op/s; 10369565.400op/s] or [-0.052%; +0.052%]	None	None	None
credit_card/is_card_number/ 378282246310005	execution_time	[88.769µs; 88.886µs] or [-0.066%; +0.066%]	None	None	None
credit_card/is_card_number/ 378282246310005	throughput	[11250943.919op/s; 11265099.619op/s] or [-0.063%; +0.063%]	None	None	None
credit_card/is_card_number/37828224631	execution_time	[4.626µs; 4.628µs] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number/37828224631	throughput	[216080688.012op/s; 216148699.441op/s] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number/378282246310005	execution_time	[85.425µs; 85.507µs] or [-0.048%; +0.048%]	None	None	None
credit_card/is_card_number/378282246310005	throughput	[11695140.619op/s; 11706265.498op/s] or [-0.048%; +0.048%]	None	None	None
credit_card/is_card_number/37828224631000521389798	execution_time	[64.302µs; 64.322µs] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number/37828224631000521389798	throughput	[15546790.323op/s; 15551616.971op/s] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number/x371413321323331	execution_time	[6.843µs; 6.844µs] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number/x371413321323331	throughput	[146103022.910op/s; 146125432.961op/s] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number_no_luhn/	execution_time	[4.627µs; 4.628µs] or [-0.018%; +0.018%]	None	None	None
credit_card/is_card_number_no_luhn/	throughput	[216066826.289op/s; 216146148.547op/s] or [-0.018%; +0.018%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	[77.322µs; 77.360µs] or [-0.025%; +0.025%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	[12926535.041op/s; 12933012.309op/s] or [-0.025%; +0.025%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	[69.195µs; 69.232µs] or [-0.027%; +0.027%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	[14444302.837op/s; 14452036.850op/s] or [-0.027%; +0.027%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	execution_time	[4.627µs; 4.629µs] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	throughput	[216037082.477op/s; 216107847.926op/s] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	execution_time	[65.481µs; 65.524µs] or [-0.032%; +0.032%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	throughput	[15261738.182op/s; 15271629.880op/s] or [-0.032%; +0.032%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	[64.306µs; 64.314µs] or [-0.007%; +0.007%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	[15548610.256op/s; 15550763.245op/s] or [-0.007%; +0.007%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	[6.843µs; 6.844µs] or [-0.006%; +0.006%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	throughput	[146116875.225op/s; 146135240.584op/s] or [-0.006%; +0.006%]	None	None	None

Group 4

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching deserializing traces from msgpack to their internal representation	execution_time	58.555ms	58.830ms ± 0.193ms	58.755ms ± 0.099ms	58.940ms	59.231ms	59.358ms	59.566ms	1.38%	1.159	0.827	0.33%	0.014ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching deserializing traces from msgpack to their internal representation	execution_time	[58.804ms; 58.857ms] or [-0.045%; +0.045%]	None	None	None

Group 5

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
sql/obfuscate_sql_string	execution_time	72.995µs	73.282µs ± 0.541µs	73.219µs ± 0.067µs	73.305µs	73.433µs	73.862µs	80.679µs	10.19%	12.877	173.112	0.74%	0.038µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
sql/obfuscate_sql_string	execution_time	[73.207µs; 73.357µs] or [-0.102%; +0.102%]	None	None	None

Group 6

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	280.101µs	280.710µs ± 0.360µs	280.656µs ± 0.163µs	280.850µs	281.200µs	282.022µs	283.048µs	0.85%	2.444	11.137	0.13%	0.025µs	1	200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	3532967.091op/s	3562395.875op/s ± 4555.771op/s	3563081.220op/s ± 2073.645op/s	3564835.492op/s	3567812.032op/s	3569354.587op/s	3570135.496op/s	0.20%	-2.416	10.935	0.13%	322.142op/s	1	200
normalization/normalize_name/normalize_name/bad-name	execution_time	27.228µs	27.328µs ± 0.126µs	27.311µs ± 0.029µs	27.345µs	27.409µs	27.473µs	28.968µs	6.07%	11.034	140.159	0.46%	0.009µs	1	200
normalization/normalize_name/normalize_name/bad-name	throughput	34520305.830op/s	36592691.341op/s ± 161172.369op/s	36615704.580op/s ± 38410.089op/s	36649640.017op/s	36694722.520op/s	36702717.232op/s	36726667.548op/s	0.30%	-10.751	135.174	0.44%	11396.607op/s	1	200
normalization/normalize_name/normalize_name/good	execution_time	16.941µs	17.002µs ± 0.026µs	17.002µs ± 0.016µs	17.017µs	17.052µs	17.064µs	17.092µs	0.53%	0.323	0.218	0.16%	0.002µs	1	200
normalization/normalize_name/normalize_name/good	throughput	58507898.566op/s	58815866.559op/s ± 91590.301op/s	58817467.280op/s ± 54938.058op/s	58875596.871op/s	58958198.100op/s	59008665.019op/s	59028571.342op/s	0.36%	-0.314	0.207	0.16%	6476.412op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	[280.661µs; 280.760µs] or [-0.018%; +0.018%]	None	None	None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	[3561764.489op/s; 3563027.261op/s] or [-0.018%; +0.018%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	execution_time	[27.311µs; 27.346µs] or [-0.064%; +0.064%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	throughput	[36570354.401op/s; 36615028.281op/s] or [-0.061%; +0.061%]	None	None	None
normalization/normalize_name/normalize_name/good	execution_time	[16.999µs; 17.006µs] or [-0.022%; +0.022%]	None	None	None
normalization/normalize_name/normalize_name/good	throughput	[58803173.024op/s; 58828560.093op/s] or [-0.022%; +0.022%]	None	None	None

Group 7

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
write only interface	execution_time	1.380µs	3.274µs ± 1.470µs	3.114µs ± 0.023µs	3.135µs	3.174µs	14.435µs	15.514µs	398.24%	7.629	58.135	44.80%	0.104µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
write only interface	execution_time	[3.070µs; 3.477µs] or [-6.225%; +6.225%]	None	None	None

Group 8

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
tags/replace_trace_tags	execution_time	2.673µs	2.721µs ± 0.017µs	2.723µs ± 0.006µs	2.728µs	2.751µs	2.757µs	2.761µs	1.40%	-0.614	1.254	0.62%	0.001µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
tags/replace_trace_tags	execution_time	[2.719µs; 2.723µs] or [-0.086%; +0.086%]	None	None	None

Group 9

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
two way interface	execution_time	18.417µs	24.490µs ± 13.669µs	18.663µs ± 0.079µs	19.339µs	46.330µs	47.310µs	150.213µs	704.85%	4.934	37.191	55.67%	0.967µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
two way interface	execution_time	[22.596µs; 26.384µs] or [-7.735%; +7.735%]	None	None	None

Group 10

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
concentrator/add_spans_to_concentrator	execution_time	9.165ms	9.191ms ± 0.015ms	9.189ms ± 0.008ms	9.196ms	9.209ms	9.239ms	9.312ms	1.34%	3.501	23.476	0.16%	0.001ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
concentrator/add_spans_to_concentrator	execution_time	[9.188ms; 9.193ms] or [-0.023%; +0.023%]	None	None	None

Group 11

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
redis/obfuscate_redis_string	execution_time	35.723µs	36.441µs ± 1.291µs	35.852µs ± 0.028µs	35.900µs	39.219µs	39.278µs	40.330µs	12.49%	1.724	1.026	3.53%	0.091µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
redis/obfuscate_redis_string	execution_time	[36.262µs; 36.620µs] or [-0.491%; +0.491%]	None	None	None

Group 12

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`cbce086`	1733483074	glopes/flush-data-after-stop

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_trace/test_trace	execution_time	297.262ns	308.787ns ± 13.613ns	303.626ns ± 3.600ns	307.699ns	339.722ns	351.041ns	351.903ns	15.90%	1.798	2.147	4.40%	0.963ns	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_trace/test_trace	execution_time	[306.900ns; 310.673ns] or [-0.611%; +0.611%]	None	None	None

Baseline

Omitted due to size.

bantonsson

These booleans are woefully undocumented and I'm not completely sure about the expected life cycle, but from reading the code it looks like your interpretation is correct.

I think that it might be better to leave the worker as started if it's restartable and leave the checks in place.

bantonsson · 2024-07-04T11:46:29Z

ddtelemetry/src/worker/mod.rs

@@ -296,7 +293,9 @@ impl TelemetryWorker {
                    self.log_err(&e);
                }
                self.data.started = false;
-                self.deadlines.clear_pending();
+                if !self.config.restartable {


Wouldn't it be enough to only include self.data.started = false; inside the if statement as well, and leave the exit early checks?

I guess it depends. Do we want Start, Stop, Stop to generate two stops? Because that's what PHP ends up generating. The second stop is a noop, but if I moved the assignment self.data.started = false under the condition if !self.config.restartable, then the stops would be effective

If the exit early checks are still in the code, then how would the second Stop be effective?

From what I understand, what you're proposing is that I reset started = false only if !restartable. In that case started would stay true forever once there is a start. So the early check

if !self.data.started { return BREAK; }

would never be hit.

Yes, you're right. I was confusing the early checks.

There is something in the logic that feels a bit broken. The FlushData is also protected by this !self.data.started check. Should it work after a restartable stop?

The things that Stop does, should they happen when the first request ends, or when the worker stops?

Should [FlushData] work after a restartable stop?

My guess is yes, otherwise there is no way to send the data that's collected after the stop (build_observability_batch is called only from the handlers of Stop and FlushData), at least not without an intervening start+stop.

The things that Stop does, should they happen when the first request ends, or when the worker stops?

That is a good point. The final flush of the metrics should happen when the worker stops, not when handling Stop. I guess at some point they were the same, but then the restart thing was introduced. But regardless, once we have a way to send metrics after a Stop, for that happen periodic flushes should still happen. So FlushData shouldn't be skipped or unscheduled after a Stop.

ddtelemetry/src/worker/mod.rs

codecov-commenter · 2024-07-04T15:46:14Z

Codecov Report

Attention: Patch coverage is 72.50000% with 11 lines in your changes missing coverage. Please review.

Project coverage is 71.12%. Comparing base (fd011c1) to head (cbce086).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #515      +/-   ##
==========================================
+ Coverage   71.08%   71.12%   +0.04%     
==========================================
  Files         301      301              
  Lines       44528    44533       +5     
==========================================
+ Hits        31651    31673      +22     
+ Misses      12877    12860      -17

Components	Coverage Δ
crashtracker	`44.89% <ø> (ø)`
crashtracker-ffi	`8.41% <ø> (ø)`
datadog-alloc	`98.73% <ø> (ø)`
data-pipeline	`90.86% <ø> (ø)`
data-pipeline-ffi	`86.29% <ø> (ø)`
ddcommon	`83.28% <ø> (ø)`
ddcommon-ffi	`68.68% <ø> (ø)`
ddtelemetry	`59.47% <87.87%> (+0.40%)`	⬆️
ddtelemetry-ffi	`22.13% <ø> (ø)`
dogstatsd	`89.45% <ø> (ø)`
dogstatsd-client	`79.77% <ø> (ø)`
ipc	`82.86% <ø> (+0.10%)`	⬆️
profiling	`84.30% <ø> (ø)`
profiling-ffi	`77.46% <ø> (ø)`
serverless	`0.00% <ø> (ø)`
sidecar	`37.96% <0.00%> (-0.06%)`	⬇️
sidecar-ffi	`0.00% <ø> (ø)`
spawn-worker	`50.36% <ø> (ø)`
tinybytes	`94.47% <ø> (ø)`
trace-mini-agent	`72.38% <ø> (ø)`
trace-normalization	`98.23% <ø> (ø)`
trace-obfuscation	`95.77% <ø> (ø)`
trace-protobuf	`77.67% <ø> (ø)`
trace-utils	`93.03% <ø> (ø)`

cataphract · 2024-11-11T11:35:14Z

@bwoebi Can you look at this? I need it for DataDog/dd-trace-php#2735

Telemetry workers are functionally dead after a Stop lifecycle action, provided there's no intervening Start. While AddPoint actions are still processed, their data is never flushed, since the Stop action handler unschedules FlushMetrics and FlushData actions. PHP sends a Stop action at the end of every request via ddog_sidecar_telemetry_end(), but a Start action is only generated just after a telemetry worker is spawned. It is not clear to me whether the intention is to a Start/Stop pair on every PHP requests (where Stop flushes the metrics) or if the intention is to to have only such a pair in the first request, with the Stop event generated by ddog_sidecar_telemetry_end() effectively a noop. It would appear, judging by [this comment](#391): > Also allow the telemetry worker to have a mode where it's continuing execution after a start-stop cycle, otherwise it won't send any more metrics afterwards. that the intention is to keep sending metrics after a Start/Stop pair. In that case: * The Stop action handler should not unschedule FlushData and FlushMetrics events and * FlushData, if called outside a Start-Stop pair, should not be a noop. Finally: swap the order in which FlushData and FlushMetrics are scheduled so that FlushMetrics runs first and therefore its generated data can be sent by the next FlushData.

EnqueuedTelemetryData has actions stored directly in an "actions" field, but also dependencies, configurations and integrations in separate fields; extract_telemetry_actions() converts from the latter into proper TelemetryActions. However, this method was being called only from a branch of register_service_and_flush_queued_actions() that would be hit when RuntimeInfo::apps does not already contain the key corresponding to the (service, env) in question (so, essentially, deps/confs/integrations would be lost on all but the first call).

cataphract · 2024-12-05T12:30:44Z

Added two more commits to fix two other telemetry-related problems.

Fixes the tests in https://github.com/DataDog/dd-trace-php/blob/glopes/waf-telemetry/appsec/tests/integration/src/test/groovy/com/datadog/appsec/php/integration/TelemetryTests.groovy

ddtelemetry/src/worker/mod.rs

bantonsson · 2024-12-05T13:16:58Z

ddtelemetry/src/worker/mod.rs

        let mut payloads = Vec::new();

        if self.data.dependencies.flush_not_empty() {
            payloads.push(data::Payload::AppDependenciesLoaded(
                data::AppDependenciesLoaded {
-                    dependencies: self.data.dependencies.unflushed().cloned().collect(),
+                    dependencies: self.data.dependencies.drain_unflushed().cloned().collect(),


Shouldn't these already be removed on a successful send by the call to removed_flushed? Yeah the method name has a d too much.

I've added a new commit that uses the existing method.

bantonsson · 2024-12-06T10:00:40Z

ddtelemetry/src/worker/mod.rs

+                    async move {
+                        // This is different from the non-functional:
+                        // match self_arc.read().await.send_payload(&payload).await { ... }
+                        // presumably because the temp read guard would live till end of match


Interesting find

cataphract requested a review from a team as a code owner July 1, 2024 13:55

cataphract requested review from pawelchcki, bwoebi and bantonsson and removed request for a team and pawelchcki July 1, 2024 13:55

github-actions bot added the telemetry label Jul 1, 2024

bantonsson reviewed Jul 4, 2024

View reviewed changes

cataphract force-pushed the glopes/flush-data-after-stop branch from 4f63a34 to 251bafa Compare July 4, 2024 15:39

cataphract force-pushed the glopes/flush-data-after-stop branch 4 times, most recently from 447c409 to a7d11b0 Compare July 12, 2024 16:03

cataphract requested review from a team as code owners July 12, 2024 16:03

github-actions bot added the mini-agent label Jul 12, 2024

github-actions bot removed the mini-agent label Aug 1, 2024

cataphract force-pushed the glopes/flush-data-after-stop branch 2 times, most recently from 9c34541 to e458f75 Compare October 15, 2024 11:47

cataphract force-pushed the glopes/flush-data-after-stop branch from e458f75 to b018033 Compare November 11, 2024 11:35

cataphract added 3 commits December 5, 2024 09:29

Fix configurations, integrations, deps being relayed over and over

b80b607

cataphract force-pushed the glopes/flush-data-after-stop branch from b018033 to b439363 Compare December 5, 2024 12:30

cataphract requested a review from a team as a code owner December 5, 2024 12:30

github-actions bot added the sidecar label Dec 5, 2024

cataphract changed the title ~~telemetry worker: flush data after stops~~ telemetry worker: flush data after stops, and two other fixes Dec 5, 2024

bantonsson reviewed Dec 5, 2024

View reviewed changes

Anilm3 requested a review from paullegranddc December 5, 2024 13:45

iamluc approved these changes Dec 5, 2024

View reviewed changes

cataphract added 2 commits December 5, 2024 15:34

Alternative method to prevent repetition of confs, integrations, deps

6b45e4a

Add comment

67da02b

cataphract force-pushed the glopes/flush-data-after-stop branch from 25bdce5 to 67da02b Compare December 5, 2024 18:34

bantonsson approved these changes Dec 6, 2024

View reviewed changes

Merge branch 'main' into glopes/flush-data-after-stop

cbce086

cataphract enabled auto-merge (rebase) December 6, 2024 11:04

cataphract merged commit 24f49f5 into main Dec 6, 2024
43 checks passed

cataphract deleted the glopes/flush-data-after-stop branch December 6, 2024 11:32

cataphract mentioned this pull request Dec 6, 2024

Avoid using lock unnecessarily #784

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

telemetry worker: flush data after stops, and two other fixes #515

telemetry worker: flush data after stops, and two other fixes #515

cataphract commented Jul 1, 2024 •

edited

Loading

pr-commenter bot commented Jul 1, 2024 •

edited

Loading

Group 1

Group 2

Group 3

Group 4

Group 5

Group 6

Group 7

Group 8

Group 9

Group 10

Group 11

Group 12

bantonsson left a comment

bantonsson Jul 4, 2024

cataphract Jul 4, 2024

bantonsson Jul 4, 2024

cataphract Jul 4, 2024 •

edited

Loading

bantonsson Jul 4, 2024

cataphract Jul 4, 2024

codecov-commenter commented Jul 4, 2024 •

edited

Loading

cataphract commented Nov 11, 2024

cataphract commented Dec 5, 2024 •

edited

Loading

bantonsson Dec 5, 2024

cataphract Dec 5, 2024

bantonsson Dec 6, 2024

telemetry worker: flush data after stops, and two other fixes #515

telemetry worker: flush data after stops, and two other fixes #515

Conversation

cataphract commented Jul 1, 2024 • edited Loading

pr-commenter bot commented Jul 1, 2024 • edited Loading

Benchmarks

Comparison

Candidate

Group 1

Group 2

Group 3

Group 4

Group 5

Group 6

Group 7

Group 8

Group 9

Group 10

Group 11

Group 12

Baseline

bantonsson left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cataphract Jul 4, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-commenter commented Jul 4, 2024 • edited Loading

Codecov Report

cataphract commented Nov 11, 2024

cataphract commented Dec 5, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cataphract commented Jul 1, 2024 •

edited

Loading

pr-commenter bot commented Jul 1, 2024 •

edited

Loading

cataphract Jul 4, 2024 •

edited

Loading

codecov-commenter commented Jul 4, 2024 •

edited

Loading

cataphract commented Dec 5, 2024 •

edited

Loading