Draft chaos engineering definition/whitepaper #3
Comments
Keen to help with that! |
Happy to support the effort too. |
Me too :) |
Ping |
The best bet is currently to contribute to the proposal here, which sketches out an outline of what can become a whitepaper/landscape: Here are my ideas for a draft outline. I would love feedback since I'm still new to this space:
|
ping |
@caniszczyk That document is likely getting hard to navigate and make sense of. I'm happy to move it to this repo so we can start using GH issues instead. GH is not a document-collaboration tool, but if we clearly mark each section in the proposal, we could simply refer to each section from GH issues for discussion. |
+1 to moving to GitHub
Mikolaj Pawlikowski
|
Regarding the outline, @caniszczyk, it's a good starting point. I might add a section regarding chaos engineering in relation to other disciplines/practices: security, CI/CD... Basically, where does CE fit in the toolchain? But maybe this is covered by the "CE in Cloud Native Systems" section? |
At everyone's suggestion, I moved what we had in the gdoc to here: https://github.com/chaoseng/wg-chaoseng/blob/master/WHITEPAPER.md It needs a lot of work, but now we can start iterating via pull requests. cc: @chaoseng/maintainers |
@caniszczyk +1 |
Hey all, here is a strawman structure for the whitepaper. Hopefully it will help the discussion :)

Chaos Engineering Whitepaper v0.1
- What is Chaos Engineering?
  - Short History
  - Principles
  - Objective: Harness and Improve System Resilience
  - Benefits for Cloud Native Systems
  - Relation to Existing Software and Operational Practices
  - Use Cases
- Practicing Chaos Engineering
  - Chaos Engineering Flow
    - Define a Baseline
    - State the Hypothesis to Confirm/Refute
    - Determine a Perturbation to Perform
  - Chaos Engineering Perturbations
    - Degrade Network Conditions
    - Vary Computing Resources
    - Stress to the Limits
    - Simulate Data Loss
    - Change ACLs Permissions
    - Provoke a Security Breach
  - Chaos Engineering Automation
    - Continuous Chaos Engineering
  - Chaos Engineering Reporting
    - Report Findings
|
Hi @Lawouach, thank you for taking the time to organize things a bit. |
Hey @veggiemonk. Thanks! It looks like nothing when I look at it now, but finding the right phrasing took me half a day the other day. Formalizing is hard :D It depends on how we organize the whitepaper. Either we list a few examples in each section (for instance, under "Degrade Network Conditions" we could mention Gremlin, Pumba, Muxy...) so that each topic sits next to its potential vendors, or we continue with a long list of vendors at the bottom of the paper. |
Hi @Lawouach, I totally understand that it's hard work! 🙏 For now, the landscape doesn't need to be too formal because the list isn't actually that long. As a suggestion, let's keep it at the end. What do you think? I don't know if the whitepaper is the right place for this, but what about renaming the section "Chaos Engineering Flow" to "How to start Chaos Engineering"? It seems pretty basic, but without it, doing CE can be hard or even dangerous. What are your views on that? |
Interesting, I like the guidelines approach indeed. There is certainly room for a section around the theory, as per the principles. But a "how to get started" one would be very welcome indeed! |
How to get started + Links to product landscape and getting started points there would be awesome |
Ok let's see what kind of resources we can gather in there. |
A section of case studies and papers around the field was also something we discussed in the last meeting. Maybe as a very final section on "Further Reading"? @Lawouach thank you so much for getting this started! What do people think about starting a branch with @Lawouach's structure as a README? We could then open PRs against it with sections filled in. A merged PR would count as approval, we could go deeper on the specific content of each section, and we could link each PR in this issue. |
I think I will refine it, taking into account the comments that were made. Give me a moment :) |
Chaos Engineering Whitepaper v0.1
- What is Chaos Engineering?
  - Short History
  - Principles
    Discuss the steady state, experiment, etc. Just to set the "theory"?
- Why Practice Chaos Engineering?
  - Harness and Improve System Resilience
    If Chaos Engineering isn't the goal per se, what is? Resiliency? Reliability?
  - Benefits for Cloud Native Systems
  - Software and Operational Practices In Production
    A clear indication that whereas testing and CI/CD are mostly upstream practices, Chaos Engineering is very much downstream and acts against a live system. Would that make sense?
  - Use Cases
    The current use cases are a good starting point, but should we detail them? Similar to the depth we can find in the serverless whitepaper?
- Practicing Chaos Engineering
  - Getting Started With Chaos Engineering
    - Is my system ready to endure Chaos Engineering?
      Should we hint at what minimal level you need to be at before getting started? I mean, what if your system is barely resilient as it is?
    - Do I need to get started in production?
      While we may want this, starting in prod may not fit "getting started" scenarios.
    - Communicate with the Organization
      This is where we need to continue the discussion and figure out how far we want/can go with the patterns. Should we talk about gamedays, for instance? Observability?
  The following phases may or may not be useful. I think it would be valuable if we could describe what it means to deal with chaos in those various cases, but is this the right place?
  - Chaos Engineering Perturbations
    - Degrade Network Conditions
    - Vary Computing Resources
    - Stress to the Limits
    - Simulate Data Loss
    - Change ACLs Permissions
    - Provoke a Security Breach
    - Assume application fails to restart
  - Chaos Engineering Automation
    - Continuous Chaos Engineering
  - Chaos Engineering Reporting
    - Report Findings
- Landscape
|
That looks good! Thanks @Lawouach for the hard work! I think a PR is in order for us to move forward. |
@chaoseng/maintainers (CC @caniszczyk) so just out of curiosity what is the plan on iterating on this document now? I had a few minutes this afternoon and wanted to add some of my thoughts here, but it's a bit difficult to know where to start. I'm happy to just take some time, make some edits and submit a PR for consideration, but didn't want to ruffle any feathers or step on any toes. Would it be beneficial to assign topics to individuals to comment on? Just thinking out loud here. |
Hey @mattforni, I'd say it's totally fine to offer PRs to the document. On my side, I used this issue because it felt quicker to get started, but I wonder if that would scale to a whole document indeed :D |
PRs are the way to move forward! ⏩ |
PRs please :)
Cheers,
Chris Aniszczyk
|
Started on my trail of thoughts #41 |