Storage Docs / Notifications #116

Open
wants to merge 10 commits into
base: main
Binary file added pilot/images/keep_on_termination.png
4 changes: 3 additions & 1 deletion pilot/index.rst
@@ -15,8 +15,10 @@ This section of the documentation details the known issues, limitations and cons
limitations-considerations
budgets
upgrading-ubuntu
notifications

* :ref:`known-issues`
* :ref:`limitations-considerations`
* :ref:`budgets`
* :ref:`upgrading-ubuntu`
* :ref:`notifications`
40 changes: 40 additions & 0 deletions pilot/notifications.rst
@@ -0,0 +1,40 @@
.. _notifications:

Notifications
=======================================

As the creator or owner of resources inside Ronin, you may receive email notifications depending on the state of those resources.
These are intended to alert you to possible under-utilization of resources and so help avoid unnecessary project spend.

.. note::
   All notifications will come from the address **[email protected]**. Always be cautious when following links from external addresses!

.. _instance_utilization:

Instance Utilization
---------------------------------------

If you have received an email titled "Ronin Under Utilization Alert!", this means that Ronin has noticed that an :term:`instance` in your project has gone for more than 24 hours with under 10% CPU utilization.
This is designed to ensure you have not accidentally left a machine running idle. Unlike on-premises VMs that run 24/7, we recommend you shut down your instances when not in use, much as you would your own PC.
cs1jmc marked this conversation as resolved.

We understand that this alert can be a false positive: some workloads are not CPU-demanding but still require the machine to be on for extended periods.
If this is the case, please get in touch and we can make the instance exempt from these alerts.
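
To make the rule above concrete, here is a minimal Python sketch of the check as it is described in this section (more than 24 hours under 10% CPU). It is an illustration only, not Ronin's actual implementation: the function name ``is_under_utilized`` and the ``(timestamp, cpu_percent)`` sample format are assumptions made for this example.

```python
from datetime import datetime, timedelta

# Values taken from the text above; everything else in this sketch is hypothetical.
CPU_THRESHOLD_PERCENT = 10.0       # "under 10% CPU utilization"
IDLE_WINDOW = timedelta(hours=24)  # "more than 24 hours"


def is_under_utilized(samples, now):
    """Return True if every CPU sample in the last 24 hours is below the threshold.

    `samples` is a list of (timestamp, cpu_percent) tuples. This format is an
    assumption for illustration; it is not how Ronin exposes its metrics.
    """
    window_start = now - IDLE_WINDOW
    # Only flag an instance once we have data reaching back to the start of the window.
    has_full_window = any(ts <= window_start for ts, _ in samples)
    recent = [cpu for ts, cpu in samples if ts >= window_start]
    return (
        has_full_window
        and bool(recent)
        and all(cpu < CPU_THRESHOLD_PERCENT for cpu in recent)
    )


now = datetime(2024, 3, 6, 12, 0)
idle = [(now - timedelta(hours=h), 3.0) for h in range(30)]
busy = [(ts, 50.0 if ts == now - timedelta(hours=5) else cpu) for ts, cpu in idle]
print(is_under_utilized(idle, now))  # idle for the whole window
print(is_under_utilized(busy, now))  # one busy hour breaks the idle streak
```

Note that a workload that is light on CPU but heavy on memory or disk would also be flagged by a check like this, which is the false-positive case described above.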

It's more likely that the user has provisioned a far-too-big machine or doesn't understand how to use parallel processing, so it might be worth suggesting that they spin up a smaller machine if appropriate for that workload.

Contributor Author

I'm trying to avoid the discussion of 'right sizing' in this specific doc. I think it needs its own page / paragraph to be referenced.

That said I'm not sure how to approach the topic, as sizing is going to be very specific to the user's needs and AWS' instance types are semi-regularly changing. It's hard not to just point to the AWS docs, which isn't helpful for a majority of people.

I fear the doc we make will end up being too simple. This is a topic we're eager to push over to Ronin to see if they can make the UX explain this better so that people are less likely to pick a silly instance for their workload.

Contributor Author

cs1jmc, Mar 6, 2024

https://blog.ronin.cloud/selecting-machine/

Turns out they do have a doc for this. It's probably a good place to signpost people to.
Q: What's the likelihood a user will actually read this?


.. _unused_drive_storage:

Unused Drive Storage
---------------------------------------

If you have received an email titled "Unused Drive Storage Detected", this means that Ronin has noticed that detached drives have remained in your project for an extended period.

This could be from a terminated instance that had the "Keep On Termination" flag set:

Describe what happens when this option is set. People won't necessarily understand what "keep on termination" means.

Contributor Author

I try to avoid adding bits like that as I feel it becomes a fine line of teaching people to suck eggs. This is where I'm disappointed that Ronin don't have their own end user documentation we can reference...

Do we really have to write our own docs on how another company's product works?
@willfurnass Maybe we add that to my list of questions for Ronin...

People won't understand these concepts already so it's better for it to be explained simply and clearly. (It will feel too simple for us but it will be useful for them to have it spelled out.)

But I agree, the RONIN docs need to be better, it's not our job to explain their product.


.. image:: images/keep_on_termination.png
   :align: center

|

Alternatively, it could be from volumes that were detached from an instance and then forgotten.

As provisioned volumes are charged for based on how much storage is requested, not how much is actually used, it's best practice to remove unused storage once it's no longer required.
For long term and/or bulk data storage we recommend you use :ref:`object-storage`.
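
The charging model described above can be sketched as simple arithmetic. The following Python snippet is only an illustration: the price ``HYPOTHETICAL_PRICE_PER_GIB_MONTH`` is a made-up placeholder rather than a real AWS or Ronin price, and the function names are invented for this example.

```python
# Placeholder price for illustration only; check current pricing for real figures.
HYPOTHETICAL_PRICE_PER_GIB_MONTH = 0.08


def monthly_volume_cost(provisioned_gib, price_per_gib_month=HYPOTHETICAL_PRICE_PER_GIB_MONTH):
    """Cost of a volume for one month.

    The charge depends on the size that was requested (provisioned),
    not on how much data is stored or whether the volume is attached.
    """
    return provisioned_gib * price_per_gib_month


def detached_volume_spend(provisioned_gib, months_detached,
                          price_per_gib_month=HYPOTHETICAL_PRICE_PER_GIB_MONTH):
    """Total spent on a volume while it sat detached and unused."""
    return monthly_volume_cost(provisioned_gib, price_per_gib_month) * months_detached


# A forgotten 500 GiB volume left detached for six months, at the placeholder price.
print(f"{detached_volume_spend(500, 6):.2f}")
```

The point of the sketch is that the cost of a forgotten volume grows with both its provisioned size and the time it is left in the project, even if it holds no useful data.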
24 changes: 24 additions & 0 deletions ronin/drive-storage.rst
@@ -0,0 +1,24 @@
.. _drive-storage:

Drive Storage
=======================================

Drive storage, often referred to as block storage, is the storage attached directly to your :term:`instances<instance>` within Ronin.
cs1jmc marked this conversation as resolved.
This is most commonly the "Root Drive"; however, Ronin also gives you the option to create additional drives that you can attach to, and move between, instances.
cs1jmc marked this conversation as resolved.

Link to the relevant RONIN documentation so people can learn how to do this

Contributor Author

cs1jmc, Mar 6, 2024

https://blog.ronin.cloud/storage-help/
This?

We already link to a few docs from here https://docs.rcc.shef.ac.uk/ronin/index.html but it looks like it's worth adding.

bcab9e1 Adds the link in.

Should you want to know about the nitty gritty, the underlying technology used here is `AWS EBS <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html>`_.

.. _drive_types:

Drive Types
---------------------------------------

When creating a new non-root drive, Ronin gives you multiple options for drive types.
As a general rule of thumb we recommend you select **SSD**; this is due to how AWS provisions drive speed.

The SSD drives will be allocated 125 MiB/s of throughput and 3,000 IOPS as per `gp3 defaults <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/general-purpose.html#gp3-performance>`_.
cs1jmc marked this conversation as resolved.

This is fine but a bit cryptic, agree with Will's proposal

Should your workload require more performance please do get in touch.
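
To help judge whether the default SSD allocation is likely to be enough, the sketch below estimates which of the two limits (throughput or IOPS) a workload would hit first. It is a deliberately simplified model that uses only the two gp3 default figures quoted above; it ignores instance-level limits, caching and any other effects, and the function names are hypothetical.

```python
# gp3 default figures as quoted above.
GP3_BASELINE_THROUGHPUT_MIB_S = 125.0
GP3_BASELINE_IOPS = 3000


def effective_throughput_mib_s(io_size_kib):
    """Estimate achievable throughput for a given I/O request size.

    Small requests run out of IOPS before they reach the throughput limit;
    large requests are capped by the throughput limit instead.
    """
    iops_limited = GP3_BASELINE_IOPS * io_size_kib / 1024  # MiB/s if IOPS-bound
    return min(GP3_BASELINE_THROUGHPUT_MIB_S, iops_limited)


def transfer_time_seconds(data_gib, io_size_kib):
    """Rough time to read or write `data_gib` GiB at the estimated throughput."""
    return data_gib * 1024 / effective_throughput_mib_s(io_size_kib)


for size_kib in (4, 16, 256):
    print(size_kib, "KiB requests:", effective_throughput_mib_s(size_kib), "MiB/s")
```

Under this model, a job issuing many small reads is limited by IOPS long before it reaches 125 MiB/s, so measuring your real I/O pattern first is a good idea before asking for more performance.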

I'd like a guide on using the storage-optimised machines properly please 😸
Most of the data processing tasks for CURED are bottlenecked heavily by disk I/O so users being able to easily optimise this would save a lot of time

Collaborator

We could signpost to tooling that helps people quantify IO perf (on Linux and Windows instances - iostat and iotop for Linux).

A (mermaid.js) flow chart for diagnosing IO perf issues could be useful.

Yes please
I've collected a few tips but I realise it's probably a tricky topic for newcomers to grasp. There's a very wide potential variety of experience in the users - some extremely basic concepts to explain or at least signpost to relevant training


For those wondering why we omit the other drive options, this is mainly due to the behavior of the magnetic (HDD) storage classes.
The HDD storage classes base their throughput on the provisioned storage size (`see here <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/hdd-vols.html>`_), so a small HDD volume is also a slow one.
Because of this we recommend that you use :ref:`object-storage` for storing large amounts of data.
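
As a rough illustration of why size-based throughput matters, the sketch below compares an HDD volume with the SSD default. The HDD figures used here (40 MiB/s of baseline throughput per TiB, capped at 500 MiB/s, for the ``st1`` volume type) are assumptions based on the AWS documentation linked above at the time of writing and should be checked against it before being relied on; the function names are hypothetical.

```python
# Assumed st1 (throughput-optimised HDD) figures; verify against the AWS docs.
ST1_BASELINE_MIB_S_PER_TIB = 40.0
ST1_MAX_BASELINE_MIB_S = 500.0

# SSD (gp3) default throughput, as quoted in this page.
GP3_BASELINE_THROUGHPUT_MIB_S = 125.0


def st1_baseline_throughput_mib_s(size_tib):
    """Baseline throughput of an HDD volume grows with its provisioned size."""
    return min(size_tib * ST1_BASELINE_MIB_S_PER_TIB, ST1_MAX_BASELINE_MIB_S)


def st1_size_to_match_tib(target_mib_s):
    """Provisioned size (TiB) an HDD volume needs to reach a target baseline."""
    return target_mib_s / ST1_BASELINE_MIB_S_PER_TIB


print(st1_baseline_throughput_mib_s(0.5))                    # a 512 GiB HDD volume
print(st1_size_to_match_tib(GP3_BASELINE_THROUGHPUT_MIB_S))  # TiB needed to match the SSD default
```

Under these assumed figures, an HDD volume would need to be provisioned at several TiB just to match the baseline throughput that an SSD volume of any size gets by default.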
2 changes: 2 additions & 0 deletions ronin/index.rst
@@ -22,11 +22,13 @@ Once you've gotten through those we've got some further details on how and what
:glob:

networking
drive-storage
object-storage
backup-restore
updates

* :ref:`networking`
* :ref:`drive-storage`
* :ref:`object-storage`
* :ref:`backup-restore`
* :ref:`updates`