Provide source data for the certs and metadata #104

Open
reubano opened this issue Oct 13, 2023 · 11 comments

@reubano

reubano commented Oct 13, 2023

It'd be great to have source data (csv, json, etc.) for all the info in this roadmap. That would make converting it to a webapp much easier and would also allow filtering certs by cost, category, difficulty, etc. I thought about scraping the HTML, but would like to avoid that if a more structured data set is already available.

@reubano
Author

reubano commented Oct 13, 2023

Regarding the proposed DB design, you definitely want an RDBMS (SQL). NoSQL is not suited to this type of data. I can assist with refining the schema if you'd like.

@sinecurelife
Collaborator

Thank you reubano. I am still tinkering with this. I was hoping to store everything in JSONs so I don't need to pay for a DB, but we'll see what happens!

@reubano
Author

reubano commented Jul 5, 2024

@sinecurelife sqlite is a free file based DB so no need to pay. Trust me, this is a highly relational data set so JSON is not an ideal storage format in this case. You can of course have JSON exports of select tables, but the native storage layer should be a DB.
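
To illustrate the suggestion above, here is a minimal sketch using Python's built-in `sqlite3` module: a small normalized schema plus a JSON export of one table. The table names, column names, and the `build_db` / `export_table_as_json` helpers are assumptions made up for this example, not a schema proposed by the project. The sample values (OSEE, offensiveSecurity, 12500) come from the example record posted later in this thread.

```python
import json
import sqlite3

# Hypothetical schema: names are illustrative only.
SCHEMA = """
CREATE TABLE agency (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE cert (
    id              INTEGER PRIMARY KEY,
    short_name      TEXT NOT NULL UNIQUE,
    agency_id       INTEGER NOT NULL REFERENCES agency(id),
    acquisition_fee INTEGER
);
CREATE TABLE tag (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE cert_tag (
    cert_id INTEGER NOT NULL REFERENCES cert(id),
    tag_id  INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (cert_id, tag_id)
);
"""

ALLOWED_TABLES = {"agency", "cert", "tag", "cert_tag"}


def build_db(path=":memory:"):
    """Create the schema in a SQLite file (or in memory by default)."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows can be converted to dicts
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def export_table_as_json(conn, table):
    """Dump one table as a JSON array of objects."""
    # Identifiers cannot be bound as parameters, so check an allow-list.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table}")
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    return json.dumps([dict(row) for row in rows], indent=2)


conn = build_db()
agency_id = conn.execute(
    "INSERT INTO agency (name) VALUES (?)", ("offensiveSecurity",)
).lastrowid
cert_id = conn.execute(
    "INSERT INTO cert (short_name, agency_id, acquisition_fee) VALUES (?, ?, ?)",
    ("OSEE", agency_id, 12500),
).lastrowid
for tag in ("exploitation", "reverseEngineering"):
    tag_id = conn.execute("INSERT INTO tag (name) VALUES (?)", (tag,)).lastrowid
    conn.execute("INSERT INTO cert_tag VALUES (?, ?)", (cert_id, tag_id))
conn.commit()

print(export_table_as_json(conn, "cert"))
```

The many-to-many `cert_tag` table is the kind of relationship that is awkward to keep consistent across hand-edited JSON files, while the export function shows that JSON can still be produced on demand.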

@sinecurelife
Collaborator

I'm planning on deploying the web app to Azure or AWS (leaning towards Azure right now), so I thought I'd have to pay for their DBs. NoSQL is a lot cheaper on those.

However, I just found out I could embed SQLite inside my web app and not use the infrastructure-provided DBs. That is, if the web app is only moderately popular. I'll have to build it so I can swap to a big boy MSSQL, MySQL, or PostgreSQL later if the load gets too big.
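
One way to keep that swap cheap is to have the web app talk to a small storage interface rather than to SQLite directly. The following is only a sketch under that assumption: `CertStore`, `SqliteCertStore`, `affordable`, and the `EXAMPLE-CERT` record are hypothetical names invented for this example, and a real app would more likely rely on an ORM or query builder.

```python
import sqlite3
from typing import Protocol


class CertStore(Protocol):
    """Hypothetical storage interface the web app would depend on."""

    def add_cert(self, short_name: str, fee: int) -> None: ...

    def certs_under(self, max_fee: int) -> list[str]: ...


class SqliteCertStore:
    """Embedded SQLite backend; another class could wrap a server DB."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS cert ("
            "short_name TEXT PRIMARY KEY, acquisition_fee INTEGER)"
        )

    def add_cert(self, short_name: str, fee: int) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO cert VALUES (?, ?)", (short_name, fee)
            )

    def certs_under(self, max_fee: int) -> list[str]:
        rows = self._conn.execute(
            "SELECT short_name FROM cert "
            "WHERE acquisition_fee <= ? ORDER BY short_name",
            (max_fee,),
        )
        return [row[0] for row in rows]


def affordable(store: CertStore, budget: int) -> list[str]:
    """Application code sees only the interface, not the backend."""
    return store.certs_under(budget)


store = SqliteCertStore()
store.add_cert("OSEE", 12500)
store.add_cert("EXAMPLE-CERT", 300)  # made-up record for illustration
print(affordable(store, 1000))
```

Moving to a hosted database later would then mean writing a second class with the same two methods, leaving `affordable` and the rest of the app untouched.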

@samC3

samC3 commented Oct 25, 2024

Hi @sinecurelife, I've been working on a proof of concept for making the certificate grid interactive and was wondering how you were tracking with the database? So far I've scraped the certificate data from the web page but it would be great to merge in a real dataset if you have it available. Here's the progress I've made so far: https://github.com/samC3/SecCertRoadmapHTML

Really appreciate the work you've put into this site! Hopefully the proof of concept is useful for adding a few features.

@sinecurelife
Collaborator

> @sinecurelife sqlite is a free file based DB so no need to pay. Trust me, this is a highly relational data set so JSON is not an ideal storage format in this case. You can of course have JSON exports of select tables, but the native storage layer should be a DB.

I've migrated to Azure SQL (Serverless for now to save money).

@sinecurelife
Collaborator

Thanks @samC3,

I'll take a peek at your Git. I have a feeling you're going to be well ahead of my efforts.

I had started building JSON files to do NoSQL for these, but I've decided to move to SQL instead. Here are my incomplete JSONs if you want to partially populate: https://github.com/PaulJerimy/SecCertRoadmap/tree/alpha/CredData

The above JSONs also don't have all the fields I want to eventually use. But it's a start.

@sinecurelife
Collaborator

> Hi @sinecurelife, I've been working on a proof of concept for making the certificate grid interactive and was wondering how you were tracking with the database? So far I've scraped the certificate data from the web page but it would be great to merge in a real dataset if you have it available. Here's the progress I've made so far: https://github.com/samC3/SecCertRoadmapHTML
>
> Really appreciate the work you've put into this site! Hopefully the proof of concept is useful for adding a few features.

Wow @samC3, you are shockingly far along. That looks really good! Would you be open to a call about integrating this with my infrastructure? I would also like to compensate you for your efforts.

@samC3

samC3 commented Oct 27, 2024

Thanks @sinecurelife!

> Would you be open to a call about integrating this with my infrastructure?

Would be great to jump on a call and chat, I'll send you a message to figure out a time. There's a few bugs I need to iron out and I think my styling needs a bit more work, but would be more than happy to help with that.

> I had started building JSON files to do NoSQL for these, but I've decided to move to SQL instead. Here are my incomplete JSONs if you want to partially populate: https://github.com/PaulJerimy/SecCertRoadmap/tree/alpha/CredData

Oh that's great, I totally missed that branch on that repo. I'll have a look through the data and see about pulling it in. I agree that SQL is a good idea, but the JSON file is small enough to be totally fine for the current amount of information on the page.

@sinecurelife
Collaborator

> Would be great to jump on a call and chat, I'll send you a message to figure out a time. There's a few bugs I need to iron out and I think my styling needs a bit more work, but would be more than happy to help with that.

> Oh that's great, I totally missed that branch on that repo. I'll have a look through the data and see about pulling it in. I agree that SQL is a good idea, but the JSON file is small enough to be totally fine for the current amount of information on the page.

In the short term I'm going to focus on getting the data together for the certifications. In JSON form, this is what I am collecting with OSEE as an example:

[
  {
    "adjacentCategory": [],
    "content": "",
    "agency": "",
    "certName": "",
    "href": "",
    "freeResourcesHREF": [],
    "paidResourcesHREF": [],
    "credentialFinderHREF": null,
    "certDescription": null,
    "mainCategory": "",
    "skillLevel": 1,
    "difficultyLevel": null,
    "reputationLevel": null,
    "accessibilityLevel": null,
    "maturityLevel": null,
    "subCategory": "",
    "tooltiptext": "",
    "acquisitionFee": "",
    "renewalFee": null,
    "renewalPeriod": null,
    "examCode": [],
    "multipleChoiceExam": false,
    "practicalExam": true,
    "interviewRequired": false,
    "courseRequired": true,
    "dod8140Roles": [],
    "17024Accred": false,
    "17024Expiration": null,
    "niceTask": [],
    "niceKnowledge": [],
    "niceSkills": [],
    "jobRoles": {
      "jobRoleTitle": [],
      "niceRoleMatch": [],
      "roleMatchPercentage": []
    },
    "certTags": {
      "securityTags": [],
      "technologyTags": [],
      "softSkillTags": []
    }
  },
  {
    "adjacentCategory": [],
    "content": "OSEE",
    "agency": "offensiveSecurity",
    "certName": "Offensive Security Exploitation Expert",
    "href": "https://www.offensive-security.com/awe-osee/",
    "freeResourcesHREF": ["https://connormcgarr.github.io/", "https://github.com/dhn/OSEE"],
    "paidResourcesHREF": ["https://www.offsec.com/contact-us/", "https://www.blackhat.com/certificates/"],
    "credentialFinderHREF": null,
    "certDescription": "Modern exploits for Windows-based platforms require modern bypass methods to circumvent Microsoft’s defenses. In Advanced Windows Exploitation (EXP-401), OffSec challenges learners to develop creative solutions that work in today’s increasingly difficult exploitation environment.\nThe case studies in AWE are large, well-known applications that are widely deployed in enterprise networks. The course dives deep into topics ranging from security mitigation bypass techniques to complex heap manipulations and 64-bit kernel exploitation.\nAWE is a particularly demanding penetration testing course. It requires a significant amount of learner-instructor interaction. Therefore, we limit AWE courses to an in-person, hands-on environment.",
    "mainCategory": "redops",
    "skillLevel": 1,
    "difficultyLevel": null,
    "reputationLevel": null,
    "accessibilityLevel": null,
    "maturityLevel": null,
    "subCategory": "exploit",
    "tooltiptext": "Required course is hands-on in-person only; travel is at learner's expense",
    "acquisitionFee": 12500,
    "renewalFee": 0,
    "renewalPeriod": 0,
    "examCode": ["EXP-401"],
    "multipleChoiceExam": false,
    "practicalExam": true,
    "interviewRequired": false,
    "courseRequired": true,
    "dod8140Roles": [],
    "17024Accred": false,
    "17024Expiration": null,
    "niceTask": [],
    "niceKnowledge": [],
    "niceSkills": [],
    "jobRoles": {
      "jobRoleTitle": [],
      "niceRoleMatch": [],
      "roleMatchPercentage": []
    },
    "certTags": {
      "securityTags": ["exploitation", "reverseEngineering"],
      "technologyTags": [],
      "softSkillTags": []
    }
  }
]
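
As a rough sketch of how a record in this shape could feed a SQL table later, the snippet below flattens the nested `certTags` object into one row per tag. The `flatten_tags` helper and the row layout are assumptions for illustration, and the record is trimmed to the fields the example needs.

```python
import json

# Trimmed copy of the OSEE record above; only the fields used here.
record_json = """
{
  "content": "OSEE",
  "agency": "offensiveSecurity",
  "acquisitionFee": 12500,
  "certTags": {
    "securityTags": ["exploitation", "reverseEngineering"],
    "technologyTags": [],
    "softSkillTags": []
  }
}
"""


def flatten_tags(record):
    """Return (cert, tag_kind, tag) tuples, one per tag in certTags."""
    cert = record["content"]
    rows = []
    for kind, tags in record.get("certTags", {}).items():
        for tag in tags:
            rows.append((cert, kind, tag))
    return rows


record = json.loads(record_json)
for row in flatten_tags(record):
    print(row)
```

Each tuple maps directly onto a row of a join table, so the same JSON files could be used to seed a database without restructuring them by hand.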

@samC3

samC3 commented Oct 30, 2024

That looks great 👌 Especially the acquisitionFee, that's one other filter I think would be really useful to add. A lot of the other fields could be displayed in the modal - will need some help to figure out what is most important though 😄
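
A fee filter over records in that JSON shape could look roughly like the sketch below. `filter_by_max_fee` and the `DEMO` record are made up for this example. The type check is there because the blank template above stores `acquisitionFee` as an empty string, so not every record carries a number.

```python
def filter_by_max_fee(certs, max_fee):
    """Keep records whose acquisitionFee is a number <= max_fee."""
    kept = []
    for cert in certs:
        fee = cert.get("acquisitionFee")
        # Skip "", None, and booleans (bool is a subclass of int).
        is_number = isinstance(fee, (int, float)) and not isinstance(fee, bool)
        if is_number and fee <= max_fee:
            kept.append(cert)
    return kept


certs = [
    {"content": "OSEE", "acquisitionFee": 12500},
    {"content": "", "acquisitionFee": ""},       # blank template record
    {"content": "DEMO", "acquisitionFee": 400},  # made-up record
]

print([c["content"] for c in filter_by_max_fee(certs, 1000)])
```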

I sent you a message through Facebook messenger, is that an alright place to figure out a call?
