
New plan for trace storage work #10

Open · 41 of 60 tasks
StachuDotNet opened this issue Mar 1, 2023 · 2 comments

StachuDotNet commented Mar 1, 2023

DB Clone

  • prototype the DB clone (go through the steps, record downtimes)
    • events table (the old one, not events_v2)
      • verify that nothing is querying the old events table (see the query sketch after this list)
      • check whether the events table has any foreign keys
        • if so, remove them
      • delete the events table from dark-west
    • do the DB clone
    • update the DB clone
      • set zoning to single-zone (applies to both servers and storage)
      • update the Postgres version to 14
        (conclusion: probably brings unnecessary risk)
      • turn down the CPUs by ~1/3
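A rough sketch of the two verification queries above, assuming direct psql access and that the legacy table is literally named events. Note the pg_stat counters only reflect activity since the last statistics reset, so a zero count is suggestive rather than conclusive.

    -- Check whether anything has touched the events table: seq_scan and
    -- idx_scan staying at 0 (since the last stats reset) suggests it is unused.
    SELECT relname, seq_scan, idx_scan, n_live_tup
    FROM pg_stat_user_tables
    WHERE relname = 'events';

    -- List any foreign-key constraints defined on, or referencing, events.
    SELECT conname,
           conrelid::regclass  AS on_table,
           confrelid::regclass AS references_table
    FROM pg_constraint
    WHERE contype = 'f'
      AND (conrelid = 'events'::regclass OR confrelid = 'events'::regclass);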

Drop Events table

  • drop events table in the codebase
    • review usages of "events" in the codebase - see if we're missing anything
    • investigate the connection to worker_stats_v1
    • write a migration script (drop if still exists) to drop events
    • update tests if they were somehow referencing events
    • update clear-canvas script to not reference events table
      (note: apparently we weren't clearing events_v2!)
    • do we need to merge any changes before we drop the events table in prod?
      • Yes.
  • drop events table in production (consolidated as a SQL sketch after this list)
    • set lock_timeout = '1s'
    • set statement_timeout = '1s'
    • alter table events drop constraint events_canvas_id_fkey
    • alter table events drop constraint events_account_id_fkey
    • drop index concurrently if exists idx_events_for_dequeue
    • drop index concurrently if exists idx_events_for_dequeue2
    • truncate events table
    • drop events table
  • merge the migration
  • copy all of the above from stable-dark to dark
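For reference, the production steps above consolidated into one SQL sketch, assuming the statements run one-by-one in a psql session (the constraint and index names are the ones listed; DROP INDEX CONCURRENTLY cannot run inside a transaction block):

    -- Fail fast rather than queue behind other traffic: if a lock can't be
    -- acquired within 1s, the statement errors instead of blocking.
    SET lock_timeout = '1s';
    SET statement_timeout = '1s';

    ALTER TABLE events DROP CONSTRAINT events_canvas_id_fkey;
    ALTER TABLE events DROP CONSTRAINT events_account_id_fkey;

    -- Must run outside a transaction block.
    DROP INDEX CONCURRENTLY IF EXISTS idx_events_for_dequeue;
    DROP INDEX CONCURRENTLY IF EXISTS idx_events_for_dequeue2;

    -- TRUNCATE reclaims the table's disk space immediately; the final DROP
    -- then only has to remove catalog entries.
    TRUNCATE TABLE events;
    DROP TABLE events;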

Get Google to shrink a clone

Goal: determine the amount of downtime

  • make a clone of our DB
  • set lock_timeout = '1s'
  • set statement_timeout = '1s'
  • drop FK on account_id
    alter table events drop constraint events_account_id_fkey
  • drop FK on canvas_id
    alter table events drop constraint events_canvas_id_fkey
  • drop index idx_events_for_dequeue
    drop index concurrently if exists idx_events_for_dequeue
  • drop index idx_events_for_dequeue2
    drop index concurrently if exists idx_events_for_dequeue2
  • drop index index_events_for_stats
    drop index concurrently if exists index_events_for_stats
  • truncate events table
  • drop events table
  • ask Google to shrink it (see the size-check sketch after this list)
    (they'll do this in real time, synchronously, during a workday/call)
  • record the downtime for reference: [downtime]
  • lower availability to single-zone
  • lower CPU from 16 vCPUs to 12 vCPUs
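A sketch of a before/after size check for the clone, assuming psql access: run it before the truncate/drop to see how much space events accounts for, and again after the shrink to confirm what was reclaimed.

    -- Total on-disk size of the database.
    SELECT pg_size_pretty(pg_database_size(current_database())) AS db_size;

    -- Size of the events table plus its indexes (run before dropping it).
    SELECT pg_size_pretty(pg_total_relation_size('events')) AS events_size;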

Make a plan for doing this against the prod DB

  • plan how to alert customers
    • of expected downtime, etc
  • ...

Another day (pull into another issue):

Cloud storage

  • delete trace-related tests
  • check that 404s continue to work
  • ensure we overwrite cloud-storage traces for the execute_handler button
  • check that execute_function traces are appropriately merged with a cloud-storage-based trace
  • garbage collection: set an object lifecycle for the bucket or for traces
  • ensure Pusher is supported
  • do a walkthrough and check it all works

Monitoring

  • schedule a weekly call/meeting to review usage, for 4 weeks; at the end, decide what to do next
    • check table sizes (see the query sketch after this list)
    • check costs
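For the table-size check, a sketch of a catalog query listing the largest tables, assuming everything lives in the usual public schema (pg_total_relation_size includes indexes and TOAST):

    -- Largest tables first, with index size broken out separately.
    SELECT c.relname AS table_name,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size,
           pg_size_pretty(pg_indexes_size(c.oid)) AS index_size
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'r' AND n.nspname = 'public'
    ORDER BY pg_total_relation_size(c.oid) DESC;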

Migrate existing canvases

  • upload to both simultaneously
  • fetch and upload existing trace data for existing canvases/handlers
  • possibly switch the LD flag automatically once this is done
  • switch all users to only use uploaded storage data

Maybe later?

  • turn on private IPs (requires DB downtime)
StachuDotNet commented:

From an earlier call - notes on which tables have/don't have PKs (a query to reproduce this list follows):

  • toplevel_oplists has PK
  • stored_events_v2 no PK
  • function_results_v3 no PK
  • function_arguments no PK
  • traces_v0 has PK
  • user_data has PK
  • system_migrations has PK
  • static_asset_deploys has PK
  • secrets has PK
  • scheduling_rules has PK
  • packages_v0 has PK
  • op_ctrs no PK (though it does have a unique constraint)
  • events_v0 has PK
  • events has PK
  • custom_domains has PK
  • cron_records has PK
  • canvases has PK
  • accounts has PK
  • access no PK
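A sketch of a catalog query that reproduces the list above, assuming the tables live in the public schema:

    -- Ordinary tables in public that have no PRIMARY KEY constraint.
    SELECT c.relname AS table_without_pk
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'r'
      AND n.nspname = 'public'
      AND NOT EXISTS (
        SELECT 1 FROM pg_constraint p
        WHERE p.conrelid = c.oid AND p.contype = 'p'
      )
    ORDER BY c.relname;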


StachuDotNet commented Mar 3, 2023

Edit: the contents of this comment have been moved to #11, and have been executed
