ci(llmobs): replace mock_tracer with tracer fixture and remove Pin usage #15845
base: munir/remove-dummy-tracer
Performance SLOs
Comparing candidate remove-mock-tracers (4a8270f) with baseline munir/remove-dummy-tracer (6bcd4f8)

📈 Performance Regressions (3 suites)

📈 iastaspects - 118/118
✅ add_aspect | Time: ✅ 17.955µs (SLO: <20.000µs 📉 -10.2%) vs baseline: 📈 +21.3% | Memory: ✅ 42.546MB (SLO: <43.250MB 🟡 -1.6%) vs baseline: +4.6%
✅ add_inplace_aspect | Time: ✅ 14.918µs (SLO: <20.000µs 📉 -25.4%) vs baseline: -0.1% | Memory: ✅ 42.546MB (SLO: <43.250MB 🟡 -1.6%) vs baseline: +4.3%
✅ add_inplace_noaspect | Time: ✅ 0.341µs (SLO: <10.000µs 📉 -96.6%) vs baseline: -0.2% | Memory: ✅ 42.644MB (SLO: <43.500MB 🟡 -2.0%) vs baseline: +4.9%
✅ add_noaspect | Time: ✅ 0.545µs (SLO: <10.000µs 📉 -94.5%) vs baseline: +0.2% | Memory: ✅ 42.546MB (SLO: <43.500MB -2.2%) vs baseline: +4.6%
✅ bytearray_aspect | Time: ✅ 17.968µs (SLO: <30.000µs 📉 -40.1%) vs baseline: -0.3% | Memory: ✅ 42.625MB (SLO: <43.500MB -2.0%) vs baseline: +4.9%
✅ bytearray_extend_aspect | Time: ✅ 23.924µs (SLO: <30.000µs 📉 -20.3%) vs baseline: ~same | Memory: ✅ 42.644MB (SLO: <43.500MB 🟡 -2.0%) vs baseline: +5.1%
✅ bytearray_extend_noaspect | Time: ✅ 2.757µs (SLO: <10.000µs 📉 -72.4%) vs baseline: +1.0% | Memory: ✅ 42.585MB (SLO: <43.500MB -2.1%) vs baseline: +4.9%
✅ bytearray_noaspect | Time: ✅ 1.476µs (SLO: <10.000µs 📉 -85.2%) vs baseline: +0.3% | Memory: ✅ 42.625MB (SLO: <43.500MB -2.0%) vs baseline: +4.7%
✅ bytes_aspect | Time: ✅ 16.621µs (SLO: <20.000µs 📉 -16.9%) vs baseline: +0.6% | Memory: ✅ 42.664MB (SLO: <43.500MB 🟡 -1.9%) vs baseline: +4.9%
✅ bytes_noaspect | Time: ✅ 1.426µs (SLO: <10.000µs 📉 -85.7%) vs baseline: +0.1% | Memory: ✅ 42.625MB (SLO: <43.500MB -2.0%) vs baseline: +5.1%
✅ bytesio_aspect | Time: ✅ 55.509µs (SLO: <70.000µs 📉 -20.7%) vs baseline: ~same | Memory: ✅ 42.664MB (SLO: <43.500MB 🟡 -1.9%) vs baseline: +5.1%
✅ bytesio_noaspect | Time: ✅ 3.281µs (SLO: <10.000µs 📉 -67.2%) vs baseline: -0.3% | Memory: ✅ 42.526MB (SLO: <43.500MB -2.2%) vs baseline: +4.6%
✅ capitalize_aspect | Time: ✅ 14.670µs (SLO: <20.000µs 📉 -26.7%) vs baseline: +0.9% | Memory: ✅ 42.703MB (SLO: <43.500MB 🟡 -1.8%) vs baseline: +5.0%
✅ capitalize_noaspect | Time: ✅ 2.595µs (SLO: <10.000µs 📉 -74.0%) vs baseline: -0.6% | Memory: ✅ 42.566MB (SLO: <43.500MB -2.1%) vs baseline: +4.7%
✅ casefold_aspect | Time: ✅ 14.631µs (SLO: <20.000µs 📉 -26.8%) vs baseline: -0.3% | Memory: ✅ 42.566MB (SLO: <43.500MB -2.1%) vs baseline: +4.8%
✅ casefold_noaspect | Time: ✅ 3.145µs (SLO: <10.000µs 📉 -68.6%) vs baseline: -0.2% | Memory: ✅ 42.566MB (SLO: <43.500MB -2.1%) vs baseline: +4.7%
✅ decode_aspect | Time: ✅ 15.631µs (SLO: <30.000µs 📉 -47.9%) vs baseline: -0.2% | Memory: ✅ 42.546MB (SLO: <43.500MB -2.2%) vs baseline: +4.7%
✅ decode_noaspect | Time: ✅ 1.603µs (SLO: <10.000µs 📉 -84.0%) vs baseline: -1.0% | Memory: ✅ 42.605MB (SLO: <43.500MB -2.1%) vs baseline: +4.8%
✅ encode_aspect | Time: ✅ 18.079µs (SLO: <30.000µs 📉 -39.7%) vs baseline: 📈 +22.3% | Memory: ✅ 42.566MB (SLO: <43.500MB -2.1%) vs baseline: +4.8%
✅ encode_noaspect | Time: ✅ 1.491µs (SLO: <10.000µs 📉 -85.1%) vs baseline: ~same | Memory: ✅ 42.644MB (SLO: <43.500MB 🟡 -2.0%) vs baseline: +4.9%
✅ format_aspect | Time: ✅ 171.289µs (SLO: <200.000µs 📉 -14.4%) vs baseline: +0.4% | Memory: ✅ 42.782MB (SLO: <43.250MB 🟡 -1.1%) vs baseline: +5.1%
✅ format_map_aspect | Time: ✅ 191.442µs (SLO: <200.000µs -4.3%) vs baseline: +0.1% | Memory: ✅ 42.684MB (SLO: <43.500MB 🟡 -1.9%) vs baseline: +4.7%
✅ format_map_noaspect | Time: ✅ 3.827µs (SLO: <10.000µs 📉 -61.7%) vs baseline: ~same | Memory: ✅ 42.566MB (SLO: <43.250MB 🟡 -1.6%) vs baseline: +5.0%
✅ format_noaspect | Time: ✅ 3.142µs (SLO: <10.000µs 📉 -68.6%) vs baseline: -0.6% | Memory: ✅ 42.487MB (SLO: <43.250MB 🟡 -1.8%) vs baseline: +4.5%
✅ index_aspect | Time: ✅ 15.323µs (SLO: <20.000µs 📉 -23.4%) vs baseline: -0.4% | Memory: ✅ 42.566MB (SLO: <43.250MB 🟡 -1.6%) vs baseline: +4.8%
✅ index_noaspect | Time: ✅ 0.464µs (SLO: <10.000µs 📉 -95.4%) vs baseline: ~same | Memory: ✅ 42.664MB (SLO: <43.500MB 🟡 -1.9%) vs baseline: +5.1%
✅ join_aspect | Time: ✅ 17.078µs (SLO: <20.000µs 📉 -14.6%) vs baseline: +0.6% | Memory: ✅ 42.585MB (SLO: <43.500MB -2.1%) vs baseline: +4.9%
✅ join_noaspect | Time: ✅ 1.544µs (SLO: <10.000µs 📉 -84.6%) vs baseline: ~same | Memory: ✅ 42.585MB (SLO: <43.250MB 🟡 -1.5%) vs baseline: +4.8%
✅ ljust_aspect | Time: ✅ 20.739µs (SLO: <30.000µs 📉 -30.9%) vs baseline: -0.4% | Memory: ✅ 42.605MB (SLO: <43.250MB 🟡 -1.5%) vs baseline: +4.8%
✅ ljust_noaspect | Time: ✅ 2.706µs (SLO: <10.000µs 📉 -72.9%) vs baseline: ~same | Memory: ✅ 42.703MB (SLO: <43.250MB 🟡 -1.3%) vs baseline: +5.2%
✅ lower_aspect | Time: ✅ 17.974µs (SLO: <30.000µs 📉 -40.1%) vs baseline: +0.6% | Memory: ✅ 42.605MB (SLO: <43.500MB -2.1%) vs baseline: +4.8%
✅ lower_noaspect | Time: ✅ 2.433µs (SLO: <10.000µs 📉 -75.7%) vs baseline: +0.4% | Memory: ✅ 42.605MB (SLO: <43.250MB 🟡 -1.5%) vs baseline: +4.8%
✅ lstrip_aspect | Time: ✅ 17.544µs (SLO: <30.000µs 📉 -41.5%) vs baseline: -0.7% | Memory: ✅ 42.664MB (SLO: <43.250MB 🟡 -1.4%) vs baseline: +5.2%
✅ lstrip_noaspect | Time: ✅ 1.859µs (SLO: <10.000µs 📉 -81.4%) vs baseline: -0.2% | Memory: ✅ 42.526MB (SLO: <43.500MB -2.2%) vs baseline: +4.6%
✅ modulo_aspect | Time: ✅ 166.521µs (SLO: <200.000µs 📉 -16.7%) vs baseline: +0.2% | Memory: ✅ 42.762MB (SLO: <43.500MB 🟡 -1.7%) vs baseline: +4.8%
✅ modulo_aspect_for_bytearray_bytearray | Time: ✅ 179.898µs (SLO: <200.000µs 📉 -10.1%) vs baseline: +3.0% | Memory: ✅ 42.703MB (SLO: <43.500MB 🟡 -1.8%) vs baseline: +4.3%
✅ modulo_aspect_for_bytes | Time: ✅ 168.424µs (SLO: <200.000µs 📉 -15.8%) vs baseline: ~same | Memory: ✅ 42.546MB (SLO: <43.500MB -2.2%) vs baseline: +4.4%
✅ modulo_aspect_for_bytes_bytearray | Time: ✅ 172.067µs (SLO: <200.000µs 📉 -14.0%) vs baseline: ~same | Memory: ✅ 42.664MB (SLO: <43.500MB 🟡 -1.9%) vs baseline: +4.6%
✅ modulo_noaspect | Time: ✅ 3.669µs (SLO: <10.000µs 📉 -63.3%) vs baseline: +0.1% | Memory: ✅ 42.566MB (SLO: <43.500MB -2.1%) vs baseline: +4.9%
✅ replace_aspect | Time: ✅ 211.860µs (SLO: <300.000µs 📉 -29.4%) vs baseline: -0.1% | Memory: ✅ 42.743MB (SLO: <44.000MB -2.9%) vs baseline: +5.1%
✅ replace_noaspect | Time: ✅ 2.900µs (SLO: <10.000µs 📉 -71.0%) vs baseline: -0.9% | Memory: ✅ 42.644MB (SLO: <43.500MB 🟡 -2.0%) vs baseline: +5.2%
✅ repr_aspect | Time: ✅ 1.417µs (SLO: <10.000µs 📉 -85.8%) vs baseline: +0.8% | Memory: ✅ 42.625MB (SLO: <43.500MB -2.0%) vs baseline: +5.0%
✅ repr_noaspect | Time: ✅ 0.529µs (SLO: <10.000µs 📉 -94.7%) vs baseline: +1.2% | Memory: ✅ 42.664MB (SLO: <43.500MB 🟡 -1.9%) vs baseline: +4.7%
✅ rstrip_aspect | Time: ✅ 19.142µs (SLO: <30.000µs 📉 -36.2%) vs baseline: +0.7% | Memory: ✅ 42.546MB (SLO: <43.500MB -2.2%) vs baseline: +4.4%
✅ rstrip_noaspect | Time: ✅ 2.027µs (SLO: <10.000µs 📉 -79.7%) vs baseline: +5.0% | Memory: ✅ 42.546MB (SLO: <43.500MB -2.2%) vs baseline: +4.7%
✅ slice_aspect | Time: ✅ 15.941µs (SLO: <20.000µs 📉 -20.3%) vs baseline: +0.6% | Memory: ✅ 42.526MB (SLO: <43.500MB -2.2%) vs baseline: +4.8%
✅ slice_noaspect | Time: ✅ 0.597µs (SLO: <10.000µs 📉 -94.0%) vs baseline: -0.1% | Memory: ✅ 42.585MB (SLO: <43.500MB -2.1%) vs baseline: +4.9%
✅ stringio_aspect | Time: ✅ 53.933µs (SLO: <80.000µs 📉 -32.6%) vs baseline: -0.3% | Memory: ✅ 42.566MB (SLO: <43.500MB -2.1%) vs baseline: +4.7%
✅ stringio_noaspect | Time: ✅ 3.651µs (SLO: <10.000µs 📉 -63.5%) vs baseline: -0.8% | Memory: ✅ 42.546MB (SLO: <43.500MB -2.2%) vs baseline: +5.0%
✅ strip_aspect | Time: ✅ 17.711µs (SLO: <20.000µs 📉 -11.4%) vs baseline: -0.2% | Memory: ✅ 42.625MB (SLO: <43.500MB -2.0%) vs baseline: +5.0%
✅ strip_noaspect | Time: ✅ 1.874µs (SLO: <10.000µs 📉 -81.3%) vs baseline: +0.6% | Memory: ✅ 42.605MB (SLO: <43.500MB -2.1%) vs baseline: +4.8%
✅ swapcase_aspect | Time: ✅ 18.578µs (SLO: <30.000µs 📉 -38.1%) vs baseline: +0.6% | Memory: ✅ 42.507MB (SLO: <43.500MB -2.3%) vs baseline: +4.5%
✅ swapcase_noaspect | Time: ✅ 2.803µs (SLO: <10.000µs 📉 -72.0%) vs baseline: +1.0% | Memory: ✅ 42.585MB (SLO: <43.500MB -2.1%) vs baseline: +5.1%
✅ title_aspect | Time: ✅ 18.253µs (SLO: <30.000µs 📉 -39.2%) vs baseline: +0.2% | Memory: ✅ 42.566MB (SLO: <43.000MB 🟡 -1.0%) vs baseline: +5.1%
✅ title_noaspect | Time: ✅ 2.707µs (SLO: <10.000µs 📉 -72.9%) vs baseline: +1.8% | Memory: ✅ 42.566MB (SLO: <43.500MB -2.1%) vs baseline: +4.7%
✅ translate_aspect | Time: ✅ 24.361µs (SLO: <30.000µs 📉 -18.8%) vs baseline: 📈 +18.5% | Memory: ✅ 42.664MB (SLO: <43.500MB 🟡 -1.9%) vs baseline: +5.1%
✅ translate_noaspect | Time: ✅ 4.326µs (SLO: <10.000µs 📉 -56.7%) vs baseline: -0.6% | Memory: ✅ 42.605MB (SLO: <43.500MB -2.1%) vs baseline: +5.0%
✅ upper_aspect | Time: ✅ 18.025µs (SLO: <30.000µs 📉 -39.9%) vs baseline: -0.1% | Memory: ✅ 42.605MB (SLO: <43.500MB -2.1%) vs baseline: +5.0%
✅ upper_noaspect | Time: ✅ 2.423µs (SLO: <10.000µs 📉 -75.8%) vs baseline: -0.4% | Memory: ✅ 42.585MB (SLO: <43.500MB -2.1%) vs baseline: +4.6%

📈 iastaspectsospath - 24/24
✅ ospathbasename_aspect | Time: ✅ 5.184µs (SLO: <10.000µs 📉 -48.2%) vs baseline: 📈 +21.5% | Memory: ✅ 42.448MB (SLO: <43.500MB -2.4%) vs baseline: +4.7%
✅ ospathbasename_noaspect | Time: ✅ 4.270µs (SLO: <10.000µs 📉 -57.3%) vs baseline: -0.3% | Memory: ✅ 42.566MB (SLO: <43.500MB -2.1%) vs baseline: +5.0%
✅ ospathjoin_aspect | Time: ✅ 6.228µs (SLO: <10.000µs 📉 -37.7%) vs baseline: +0.1% | Memory: ✅ 42.546MB (SLO: <43.500MB -2.2%) vs baseline: +5.0%
✅ ospathjoin_noaspect | Time: ✅ 6.289µs (SLO: <10.000µs 📉 -37.1%) vs baseline: ~same | Memory: ✅ 42.467MB (SLO: <43.500MB -2.4%) vs baseline: +4.9%
✅ ospathnormcase_aspect | Time: ✅ 3.545µs (SLO: <10.000µs 📉 -64.6%) vs baseline: -0.3% | Memory: ✅ 42.349MB (SLO: <43.500MB -2.6%) vs baseline: +4.7%
✅ ospathnormcase_noaspect | Time: ✅ 3.620µs (SLO: <10.000µs 📉 -63.8%) vs baseline: +0.4% | Memory: ✅ 42.546MB (SLO: <43.500MB -2.2%) vs baseline: +4.8%
✅ ospathsplit_aspect | Time: ✅ 4.918µs (SLO: <10.000µs 📉 -50.8%) vs baseline: +0.8% | Memory: ✅ 42.507MB (SLO: <43.500MB -2.3%) vs baseline: +4.7%
✅ ospathsplit_noaspect | Time: ✅ 4.995µs (SLO: <10.000µs 📉 -50.0%) vs baseline: -0.7% | Memory: ✅ 42.507MB (SLO: <43.500MB -2.3%) vs baseline: +4.7%
✅ ospathsplitdrive_aspect | Time: ✅ 3.712µs (SLO: <10.000µs 📉 -62.9%) vs baseline: -0.3% | Memory: ✅ 42.487MB (SLO: <43.500MB -2.3%) vs baseline: +4.9%
✅ ospathsplitdrive_noaspect | Time: ✅ 0.751µs (SLO: <10.000µs 📉 -92.5%) vs baseline: +0.8% | Memory: ✅ 42.467MB (SLO: <43.500MB -2.4%) vs baseline: +4.7%
✅ ospathsplitext_aspect | Time: ✅ 4.602µs (SLO: <10.000µs 📉 -54.0%) vs baseline: -0.2% | Memory: ✅ 42.644MB (SLO: <43.500MB 🟡 -2.0%) vs baseline: +5.2%
✅ ospathsplitext_noaspect | Time: ✅ 4.647µs (SLO: <10.000µs 📉 -53.5%) vs baseline: +0.4% | Memory: ✅ 42.526MB (SLO: <43.500MB -2.2%) vs baseline: +4.7%

📈 telemetryaddmetric - 30/30
✅ 1-count-metric-1-times | Time: ✅ 3.403µs (SLO: <20.000µs 📉 -83.0%) vs baseline: 📈 +13.6% | Memory: ✅ 34.898MB (SLO: <35.500MB 🟡 -1.7%) vs baseline: +4.9%
✅ 1-count-metrics-100-times | Time: ✅ 199.338µs (SLO: <220.000µs -9.4%) vs baseline: -0.6% | Memory: ✅ 35.036MB (SLO: <35.500MB 🟡 -1.3%) vs baseline: +5.4%
✅ 1-distribution-metric-1-times | Time: ✅ 3.316µs (SLO: <20.000µs 📉 -83.4%) vs baseline: -0.6% | Memory: ✅ 34.898MB (SLO: <35.500MB 🟡 -1.7%) vs baseline: +4.7%
✅ 1-distribution-metrics-100-times | Time: ✅ 215.809µs (SLO: <230.000µs -6.2%) vs baseline: +0.5% | Memory: ✅ 34.878MB (SLO: <35.500MB 🟡 -1.8%) vs baseline: +4.7%
✅ 1-gauge-metric-1-times | Time: ✅ 2.205µs (SLO: <20.000µs 📉 -89.0%) vs baseline: -0.6% | Memory: ✅ 34.819MB (SLO: <35.500MB 🟡 -1.9%) vs baseline: +4.9%
✅ 1-gauge-metrics-100-times | Time: ✅ 137.273µs (SLO: <150.000µs -8.5%) vs baseline: +0.2% | Memory: ✅ 34.780MB (SLO: <35.500MB -2.0%) vs baseline: +4.4%
✅ 1-rate-metric-1-times | Time: ✅ 3.126µs (SLO: <20.000µs 📉 -84.4%) vs baseline: -0.5% | Memory: ✅ 34.937MB (SLO: <35.500MB 🟡 -1.6%) vs baseline: +5.2%
✅ 1-rate-metrics-100-times | Time: ✅ 212.605µs (SLO: <250.000µs 📉 -15.0%) vs baseline: -0.4% | Memory: ✅ 34.937MB (SLO: <35.500MB 🟡 -1.6%) vs baseline: +5.1%
✅ 100-count-metrics-100-times | Time: ✅ 20.188ms (SLO: <22.000ms -8.2%) vs baseline: +0.5% | Memory: ✅ 34.859MB (SLO: <35.500MB 🟡 -1.8%) vs baseline: +5.1%
✅ 100-distribution-metrics-100-times | Time: ✅ 2.228ms (SLO: <2.550ms 📉 -12.6%) vs baseline: -0.2% | Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +3.8%
✅ 100-gauge-metrics-100-times | Time: ✅ 1.407ms (SLO: <1.550ms -9.2%) vs baseline: -0.5% | Memory: ✅ 34.957MB (SLO: <35.500MB 🟡 -1.5%) vs baseline: +5.2%
✅ 100-rate-metrics-100-times | Time: ✅ 2.197ms (SLO: <2.550ms 📉 -13.9%) vs baseline: +0.6% | Memory: ✅ 35.055MB (SLO: <35.500MB 🟡 -1.3%) vs baseline: +5.2%
✅ flush-1-metric | Time: ✅ 4.598µs (SLO: <20.000µs 📉 -77.0%) vs baseline: +0.9% | Memory: ✅ 35.212MB (SLO: <35.500MB 🟡 -0.8%) vs baseline: +4.8%
✅ flush-100-metrics | Time: ✅ 173.383µs (SLO: <250.000µs 📉 -30.6%) vs baseline: -1.2% | Memory: ✅ 35.291MB (SLO: <35.500MB 🟡 -0.6%) vs baseline: +5.0%
✅ flush-1000-metrics | Time: ✅ 2.187ms (SLO: <2.500ms 📉 -12.5%) vs baseline: +0.5% | Memory: ✅ 36.097MB (SLO: <36.500MB 🟡 -1.1%) vs baseline: +5.0%

🟡 Near SLO Breach (15 suites)

🟡 coreapiscenario - 10/10 (1 unstable)
Description
Refactored test fixtures to use the standard `tracer` fixture from `tests/conftest.py` instead of custom `mock_tracer` fixtures. This change:

- Replaces `mock_tracer` with the `tracer` fixture across all integrations
- `test_spans` fixture to encapsulate span container logic

Primarily impacts LLM observability tests across multiple integrations (anthropic, botocore, crewai, google_adk, google_genai, langgraph, litellm, mcp, openai, openai_agents, pydantic_ai, vertexai, vllm).
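For readers unfamiliar with the pattern, the fixture swap described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: `FakeTracer` and `SpanContainer` are hypothetical stand-ins invented here, and the real `tracer` and `test_spans` fixtures in `tests/conftest.py` may be implemented differently. Only the fixture names come from the description.

```python
try:
    import pytest
    fixture = pytest.fixture
except ImportError:
    # Fallback so the sketch also runs as a plain script without pytest.
    def fixture(fn):
        return fn


class FakeTracer:
    """Hypothetical stand-in for the real tracer; records every span it creates."""

    def __init__(self):
        self.finished = []

    def trace(self, name):
        span = {"name": name}
        self.finished.append(span)
        return span


class SpanContainer:
    """Hypothetical wrapper that hides how spans are collected from a tracer."""

    def __init__(self, tracer):
        self._tracer = tracer

    def pop(self):
        # Return the collected spans and clear them on the tracer.
        spans, self._tracer.finished = self._tracer.finished, []
        return spans


@fixture
def tracer():
    # One shared fixture replaces the per-integration mock_tracer fixtures.
    return FakeTracer()


@fixture
def test_spans(tracer):
    container = SpanContainer(tracer)
    yield container
    container.pop()  # drop leftovers so spans do not leak between tests


def test_llm_call_emits_one_span(tracer, test_spans):
    # Tests request both fixtures and never touch span storage directly.
    tracer.trace("llm.request")
    assert [s["name"] for s in test_spans.pop()] == ["llm.request"]
```

With this shape, a test depends only on the two fixture names, so each integration no longer needs its own tracer mock or its own way of collecting spans.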
Testing
All existing tests pass. Linting passes.
Risks
Low. The refactoring maintains the same test behavior while simplifying the codebase.