Stream Engine is the back-end query engine for the OOI CyberInfrastructure. It exposes a JSON-based HTTP interface for querying full-resolution or subsampled data from the data store and for executing numerous data product algorithms on that data. Results are returned to the user as a JSON data structure for synchronous requests, or asynchronously as NetCDF4, CSV, or TSV.
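As an illustration only, a synchronous query might be issued as in the sketch below; the host, port, route, and request-body field names are assumptions for the example rather than the documented API, so consult the route definitions in the source for the real interface.

```python
# Hypothetical synchronous query against Stream Engine; the route and the
# request-body fields are illustrative assumptions, not the documented API.
import requests

request_body = {
    'streams': [{'refdes': 'RS03AXBS-LJ03A-12-CTDPFB301',  # assumed field names
                 'stream': 'ctdpf_optode_sample'}],        # and example stream
    'start': '2020-01-01T00:00:00.000Z',
    'stop': '2020-01-02T00:00:00.000Z',
}

response = requests.post('http://localhost:5000/particles',  # assumed address/route
                         json=request_body)
response.raise_for_status()
particles = response.json()  # synchronous results come back as a JSON structure
```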
- Clone the repository
- Clone all submodules (git submodule update --init)
- Create a conda virtual environment with the necessary packages:
conda env create -f conda_env.yml
The default Stream Engine configuration can be found in config/default.py. Overrides can be entered into config/local.py. Gunicorn-specific configuration parameters are set in gunicorn_config.py.
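For example, a config/local.py override and a few gunicorn settings might look like the sketch below. The LOG_LEVEL name is an illustrative assumption (check config/default.py for the real option names and defaults); workers, bind, and timeout are standard gunicorn configuration variables.

```python
# config/local.py -- values here shadow those in config/default.py
LOG_LEVEL = 'DEBUG'   # assumed option name; see config/default.py for real ones

# gunicorn_config.py -- standard gunicorn settings
workers = 4              # number of worker processes
bind = '0.0.0.0:5000'    # listen address and port (illustrative)
timeout = 300            # seconds; generous for long-running data requests
```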
The manage-streamng script allows for starting, stopping, reloading, and checking the status of Stream Engine. The restart option combines the stop and start options. The reload option sends a HUP signal to gunicorn, which terminates all idle workers and restarts them; any busy worker continues until its current request is complete. The status option returns the process ID (PID) of the gunicorn parent process.
./manage-streamng start
./manage-streamng stop
./manage-streamng restart
./manage-streamng reload
./manage-streamng status
Note that the stop behavior is similar to the reload behavior: any active workers continue until their current task is complete, and any new requests are rejected, but the master gunicorn process keeps running until all workers are shut down. Stopping Stream Engine should generally be avoided unless necessary.
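For reference, reload is equivalent to sending SIGHUP to the gunicorn master yourself, as in this minimal sketch; the PID-file location is an assumption, and the status option above reports the same master PID.

```python
import os
import signal

# Send HUP to the gunicorn master: idle workers are restarted immediately,
# while busy workers finish their current request before restarting.
with open('gunicorn.pid') as f:   # assumed PID-file location
    master_pid = int(f.read().strip())
os.kill(master_pid, signal.SIGHUP)
```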
Our current test server is uframe-3-test, under user asadev. Source the Stream Engine conda environment (engine) and start the service. Run Stream Engine from the logs directory, as logs are written to the current working directory:
source activate engine
cd ~/miniconda/envs/engine/stream_engine/logs
../manage-streamng start
The following logs are generated in the logs folder:
- stream_engine.error.log - General data retrieval and product creation logs
- stream_engine.access.log - Gunicorn access logs
Updating to a new release of Stream Engine is simple: grab the update, update your conda environment and the preload database submodule, then reload Stream Engine.
git pull # or git fetch / git checkout <tag>
git submodule update
conda env update -f conda_env.yml
./manage-streamng reload
- Update preload database submodule (if needed)
- Update conda_env.yml with any desired library updates
- Update config/default.py with the new version
- Update RELEASE_NOTES with the new version
- Commit the above changes
- Tag the commit with the new version
git tag -a vX.X.X
You can then push the commit and the tag to the upstream repo(s):
git push gerrit master
git push gerrit master --tags
- Within the stream_engine root, change directory to preload_database
- Ensure any local changes you may have are cleared or saved off
- Run the following commands
git fetch origin # assuming "origin" points to the source URL
git rebase origin/master
cd .. # to stream_engine root
git add preload_database
git commit -m "Issue #nnnnn <message>"
git push origin HEAD:nnnnn
- Within the stream_engine root, change directory to util/metadata_service/metadata_service_api
- Ensure any local changes you may have are cleared or saved off
- Run the following commands
git fetch origin # assuming "origin" points to the source URL
git rebase origin/master
cd ../../.. # to stream_engine root
git add util/metadata_service/metadata_service_api
git commit -m "Issue #nnnnn <message>"
git push origin HEAD:nnnnn
NOTE: this procedure was used for issues 13182 and 14654 (and may be applicable elsewhere). Ensure up-to-date data has been ingested for the NC files you want to create, then temporarily modify the stream_engine code as follows to create the files:
- In util/netcdf_generator.py's _filter_parameters, change default_params to add: sci_water_pressure
- In util/netcdf_generator.py's _create_files: a) comment out the line ds = rename_glider_lat_lon(stream_key, ds); b) after the following code snippet (as of 9/3/2020):

```python
for external_stream_key in self.stream_request.external_includes:
    for parameter in self.stream_request.external_includes[external_stream_key]:
        long_parameter_name = external_stream_key.stream_name + "-" + parameter.name
```

add the following code snippet inside the inner loop, so these parameters are retained in the output (see the consolidated sketch after this list):

```python
        if parameter.name in ('m_gps_lat', 'm_gps_lon', 'm_lat', 'm_lon', 'interp_lat', 'interp_lon'):
            params_to_include.append(long_parameter_name)
            continue
```
- Once this is done, run a data request against the data to produce the NC files, then back out these changes.
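Taken together, the modified region of _create_files would look roughly like the following sketch. The commented-out rename and the if block are the edits described above; the enclosing loop comes from the quoted snippet, and any other context lines in your checkout may differ.

```python
# util/netcdf_generator.py, _create_files -- sketch of the temporary edits

# Edit (a): comment out the glider lat/lon rename
# ds = rename_glider_lat_lon(stream_key, ds)

# Edit (b): retain raw and interpolated glider position parameters
for external_stream_key in self.stream_request.external_includes:
    for parameter in self.stream_request.external_includes[external_stream_key]:
        long_parameter_name = external_stream_key.stream_name + "-" + parameter.name
        if parameter.name in ('m_gps_lat', 'm_gps_lon', 'm_lat', 'm_lon',
                              'interp_lat', 'interp_lon'):
            params_to_include.append(long_parameter_name)
            continue
```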