Store Mongo configuration file in Git repository (Jetstream2) #233

Open
1 of 3 tasks
eecavanna opened this issue Jul 4, 2024 · 10 comments
Assignees
Labels
documentation Improvements or additions to documentation jetstream2 ✈️ Issue related to deploying NMDC EDGE to the Jetstream2 platform x small less than 1 day

Comments

@eecavanna
Collaborator

eecavanna commented Jul 4, 2024

Background

@mflynn-lanl created a Mongo configuration file named mongod.conf.

$ docker compose exec mongo bash
root@65279d9b27dc:~# pwd
/data/db
root@65279d9b27dc:~# ls -l | grep conf
-rw-r--r-- 1 root    root       462 Jul  3 18:09 mongod.conf
root@65279d9b27dc:~# cat mongod.conf

Here's what it contains (as of July 4, 2024):

# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /data/db
#  engine:
#  wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /data/db/log/mongod.log

# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0


# how the process runs
processManagement:
  timeZoneInfo: /usr/share/zoneinfo

Task

  • Clean up the file (formatting/comments)
  • Store the cleaned up version in this repo
  • Include documentation about how to use the file when deploying Mongo
@eecavanna eecavanna added documentation Improvements or additions to documentation x small less than 1 day labels Jul 4, 2024
@eecavanna eecavanna self-assigned this Jul 4, 2024
@eecavanna eecavanna added the jetstream2 ✈️ Issue related to deploying NMDC EDGE to the Jetstream2 platform label Jul 4, 2024
@eecavanna
Collaborator Author

eecavanna commented Jul 4, 2024

Another of my action items from this week's Jetstream2 squad meeting was to update the docker-compose.prod.yml file (currently in the docs folder in this repo) so that the mongo service has a custom entrypoint. That entrypoint would run mongod with two CLI options: --config (whose value is the path to this config file) and --setParameter (whose value is "logComponentVerbosity={query: {verbosity: 2}}").

In other words, define the entrypoint to be:

[
  "mongod",
  "--config",
  "/data/db/mongod.conf",
  "--setParameter",
  "logComponentVerbosity={query: {verbosity: 2}}"
]
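
As a rough sketch, that entrypoint might look like this inside docker-compose.prod.yml. The service name `mongo` comes from the commands earlier in this issue; the `image` value is an assumption, since the exact image tag isn't stated here.

```yaml
# Sketch only: the mongo service with the custom entrypoint described above.
services:
  mongo:
    image: mongo  # assumption: the off-the-shelf Mongo image; the tag is not specified in this issue
    entrypoint:
      - mongod
      - --config
      - /data/db/mongod.conf  # path of the config file as it exists in the container today
      - --setParameter
      - "logComponentVerbosity={query: {verbosity: 2}}"  # quoted so YAML treats it as a single string
```

One thing to keep in mind: in Compose, setting `entrypoint` replaces the image's own entrypoint script and clears its default command. With the official Mongo image, that script is what processes the `MONGO_INITDB_*` environment variables, so it is worth checking whether the deployment relies on them.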

For now, I am going to document that task here in this issue comment instead of creating a separate Issue. Rationale:

  • The custom entrypoint won't work until this config file exists
  • We may be able to put the --setParameter option in the config file also (I can test that locally)

    Note: We plan to omit this option after initial deployment, since it'll make the log grow even faster than normal.
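
For the second bullet, one option that might work is the config file's own per-component log verbosity setting, rather than `--setParameter`. This is an untested sketch; it assumes `systemLog.component.query.verbosity` has the same effect as the `logComponentVerbosity` value above.

```yaml
# Sketch only: raise the verbosity of query-related log messages via the config file.
# Assumption: equivalent to --setParameter "logComponentVerbosity={query: {verbosity: 2}}".
systemLog:
  component:
    query:
      verbosity: 2
```

Removing this section later would return query logging to its default verbosity, in line with the plan to omit the option after initial deployment.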

@ssarrafan

I don't see any status on the sprint or Jetstream squad board. What's the status of this issue? Is it still active? @eecavanna @mflynn-lanl

@eecavanna
Collaborator Author

I created the issue, but never assigned a status to it (or started working on it). I will move it to the next sprint (I do plan to either do it—or close it—next sprint).

@eecavanna
Collaborator Author

FYI: I accidentally deleted the config file from the web app VM (on Jetstream) today, while "emptying out" the Mongo data directory. Here's a screenshot showing the output of ls -l (aliased as ll) a few minutes before I deleted that file (and everything else in this directory).

[screenshot of the ls -l output]

@eecavanna
Collaborator Author

Here's how the config file above compares with Mongo's default values:

| Config variable | Value in file | Mongo default value | Reference | Plan |
|---|---|---|---|---|
| storage.dbPath | /data/db | /data/db (on Linux) | Mongo docs | Omit from config file |
| systemLog.destination | file | STDOUT | Mongo docs | Keep in config file |
| systemLog.logAppend | true | false | Mongo docs | Keep in config file |
| systemLog.path | /data/db/log/mongod.log | STDOUT and/or the host's syslog | Mongo docs | Keep in config file |
| net.port | 27017 | 27017 | Mongo docs | Omit from config file |
| net.bindIp | 0.0.0.0 | localhost | Mongo docs | Keep in config file |
| processManagement.timeZoneInfo | /usr/share/zoneinfo | /usr/share/zoneinfo | Mongo docs | Omit from config file |

@eecavanna
Collaborator Author

eecavanna commented Aug 6, 2024

After removing the redundant information, here's what the config file would contain:

# This is a MongoDB configuration file.
# You can pass it to the MongoDB daemon via: `$ mongod --config /path/to/this/file`
# Reference: http://docs.mongodb.org/manual/reference/configuration-options/

# Configure MongoDB to write its logs to a file at this path.
# Note: This depends upon the directory `/data/db/log/` already existing.
systemLog:
  destination: file
  logAppend: true
  path: /data/db/log/mongod.log

# Configure MongoDB to listen for traffic coming from any host, not just localhost.
net:
  bindIp: 0.0.0.0

I have created a file containing this, on the Jetstream2 VM:

[screenshot]

@eecavanna
Collaborator Author

After observing that the container logs (accessible via $ docker compose logs mongo) no longer contained as much information as before (that information was instead being written to a file), I realized I would rather have the logs written to the container logs (i.e. STDOUT, from the perspective of the container). I think we had temporarily set up that "logging to file" while trying to debug a recent issue where data was being deleted when we didn't expect it.

So, here's the latest config file. I added a section that explicitly enables authorization. In the container-based deployment (based on the off-the-shelf Mongo image), this section is redundant with the default behavior of Mongo; but that is not the case (a) when we use a config file at all or (b) in the non-container-based deployment on SDSC.

# This is a MongoDB configuration file.
# You can pass it to the MongoDB daemon via: `$ mongod --config /path/to/this/file`
# Reference: http://docs.mongodb.org/manual/reference/configuration-options/

# Configure MongoDB to require clients to authenticate themselves, instead of allowing access by all clients.
# Reference: https://www.mongodb.com/docs/manual/tutorial/configure-scram-client-authentication/#re-start-the-mongodb-instance-with-access-control
# Reference: https://www.mongodb.com/docs/manual/reference/configuration-options/#mongodb-setting-security.authorization
security:
    authorization: enabled

# Configure MongoDB to listen for traffic coming from any host, not just localhost.
# Reference: https://www.mongodb.com/docs/manual/reference/configuration-options/#mongodb-setting-net.bindIp
net:
    bindIp: 0.0.0.0
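
Once the file lives in this repo, one way to use it when deploying Mongo could be to bind-mount it read-only into the container, outside the data directory, so that emptying /data/db no longer removes it. This is a sketch of an alternative to the custom entrypoint discussed earlier, not the agreed approach; the host path `./mongod.conf`, the container path `/etc/mongo/mongod.conf`, and the `image` value are all hypothetical.

```yaml
# Sketch only: mount a repo-tracked config file into the mongo container.
services:
  mongo:
    image: mongo  # assumption: the off-the-shelf Mongo image; the tag is not specified in this issue
    volumes:
      # Hypothetical paths: config file next to the compose file, mounted read-only.
      - ./mongod.conf:/etc/mongo/mongod.conf:ro
    # Passing only options as the command keeps the image's default entrypoint script in place.
    command: ["--config", "/etc/mongo/mongod.conf"]
```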

@eecavanna
Collaborator Author

FYI: In the production instance of NMDC EDGE (on SDSC), the Mongo version being used is currently 3.6.8 (https://www.mongodb.com/docs/v5.3/release-notes/3.6/#3.6.8---sep-19--2018).

@ssarrafan

@eecavanna would you be OK with me moving anything that isn't Berkeley refactor work to the next sprint? This issue, for example, seems like it could wait.

@eecavanna
Collaborator Author

Hi @ssarrafan, yes on this particular issue. I'll move it now.

There are things I want to do that aren't Berkeley Schema Roll Out-related, next Wednesday-Friday.

Development

No branches or pull requests

2 participants