Restructuring the Software class hierarchy #596

Open
3 of 16 tasks
ajnelson-nist opened this issue Mar 6, 2024 · 12 comments · May be fixed by #598 or #597

Comments

@ajnelson-nist
Contributor

ajnelson-nist commented Mar 6, 2024

Disclaimer

Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

Background

UCO Issue 583 proposed several revisions around representations pertaining to software and its configuration.

This Issue focuses on one set of changes pertaining to a restructure of the class hierarchy pertaining to software, so some changes from 583 can be discussed and implemented for UCO 1.4.0.

Requirements

Requirement 583-2

This requirement is ported from Issue 583:

Ability to characterize different types of software objects
At a minimum this should include Software, Code, Application, Script, Library, Package, Process, Compiler, BuildUtility, SoftwareBuild, OperatingSystem, and ServicePack.

Risk / Benefit analysis

Benefits

This benefit is ported from Issue 583:

  • Clarity and consistency of different forms of software observable objects

Risks

These risks are in addition to those listed on Issue 583.

  • The rdfs:comment definition of ProcessThread does not seem entirely coherent with ProcessThread being a subclass of Process. This software overhaul provides an opportunity for clarification.
  • It is unclear whether any observable:Software subclasses should be considered disjoint with other observable:Software subclasses. The rich and adaptive behavioral nature of software might make it impractical to designate any of these classes disjoint.
  • The new class observable:Package has some usage modes where it is an observable:File and where it is not. Take for example the wheel distribution (URL ending .whl) of case-utils, as listed here. The .whl file that was prepared for upload could be considered both an observable:Package, because it is an installable artifact, and observable:File, because it's a file on the build system's file system. However, the object on PyPI might not be classifiable as an observable:File.
    • The file-or-not point is a point for debate somewhat out of scope of this proposal. UCO models File as a subclass of FileSystemObject. PyPI, and other package management ecosystems, might not store blobs like this as files. They're free to store the backing contents of these URLs as blobs in relational database tables or NoSQL stores if they wanted to. But this is generally invisible to the package consumer.

Competencies demonstrated

(For the sake of discussion, these examples avoid the UCO rule of ending IRIs with UUIDs.)

Competency 1

On a laptop, a directory contains a lone, regular file that contains Python code.

#!/usr/bin/env python3
print("Hello, world!")

The SHA3-256 hash of this file's contents is 496e34e7fe23cf69f078cd1fe860b98b2e91101194773b2f144656c0bab877c3.
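
A digest of this kind can be recomputed with Python's standard `hashlib` module. The sketch below is illustrative only: the exact bytes of the file (line endings, and whether it ends with a trailing newline) are an assumption, so the printed digest is not claimed to match the value quoted above. The name `HELLO_PY` and the helper `sha3_256_hex` are hypothetical.

```python
import hashlib

# Assumed file contents: the two lines shown above, each ending with "\n".
# The real file's bytes (line endings, trailing newline) may differ.
HELLO_PY = b'#!/usr/bin/env python3\nprint("Hello, world!")\n'


def sha3_256_hex(data: bytes) -> str:
    """Return the SHA3-256 digest of data as lowercase hexadecimal."""
    return hashlib.sha3_256(data).hexdigest()


if __name__ == "__main__":
    # Prints a 64-character hexadecimal string.
    print(sha3_256_hex(HELLO_PY))
```

A lowercase hexadecimal string like this is what would be placed in `types:hashValue` if the assumed bytes are the file's actual contents.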

The snippet below characterizes this Python file using concepts that predate this restructuring proposal: there is a File; separately, there is a ContentData; and finally there is a Relationship stating that the File contains that ContentData for as long as the Relationship holds. (Let's assume the Relationship still holds.)

Note: This demonstration purposefully avoids attaching a ContentDataFacet directly to the File.

kb:File-1
	a
		observable:File ,
		observable:Script
		;
	core:hasFacet kb:FileFacet-2 ;
	.
kb:FileFacet-2
	a observable:FileFacet ;
	observable:fileName "hello.py" ;
	.
kb:ContentData-3
	a observable:ContentData ;
	core:hasFacet kb:ContentDataFacet-4 ;
	.
kb:ContentDataFacet-4
	a observable:ContentDataFacet ;
	types:hash kb:Hash-5 ;
	.
kb:Hash-5
	a types:Hash ;
	types:hashMethod "SHA3-256"^^vocabulary:HashNameVocab ;
	types:hashValue "496e34e7fe23cf69f078cd1fe860b98b2e91101194773b2f144656c0bab877c3"^^xsd:hexBinary ;
	.
kb:Relationship-6
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Contained_Within" ;
	core:source kb:ContentData-3 ;
	core:target kb:File-1 ;
	.

Competency Question 1.1

Which objects, between the File, ContentData and ObservableRelationship, are classified as, or constitute, the following?

  • observable:Application
  • observable:Code
  • observable:Script

Result 1.1

TODO

Competency 2

An Ubuntu server runs a service called mywebapp. Running the command service mywebapp status reports three tasks associated with the service. The primary task has PID 10001, and two other worker tasks have PIDs 10002 and 10003. A graph containing these objects contains at least the following:

kb:Process-10001
	a
		observable:LinuxService ,
		observable:LinuxTask
		;
	core:hasFacet kb:ProcessFacet-1 ;
	.
kb:ProcessFacet-1
	a observable:ProcessFacet ;
	observable:pid 10001 ;
	.

kb:Process-10002
	a observable:LinuxTask ;
	core:hasFacet kb:ProcessFacet-2 ;
	.
kb:ProcessFacet-2
	a observable:ProcessFacet ;
	observable:parent kb:Process-10001 ;
	observable:pid 10002 ;
	.

kb:Process-10003
	a observable:LinuxTask ;
	core:hasFacet kb:ProcessFacet-3 ;
	.
kb:ProcessFacet-3
	a observable:ProcessFacet ;
	observable:parent kb:Process-10001 ;
	observable:pid 10003 ;
	.

(NOTE: observable:parent might require a revision to its modeling, due to the potential for processes to become daemons, orphans, zombies - each of which severs the original parent link. The community should consider this an invitation to propose updating practices pertaining to observable:parent, and whether deprecation is appropriate.)

Competency Question 2.1

Which objects are classified as observable:Tasks?

SELECT ?nTask
WHERE {
  ?nTask a/rdfs:subClassOf* observable:Task ;
}

Result 2.1

  • kb:Process-10001
  • kb:Process-10002
  • kb:Process-10003

Competency Question 2.2

Which objects are classified as observable:Services?

SELECT ?nService
WHERE {
  ?nService a/rdfs:subClassOf* observable:Service ;
}

Result 2.2

  • kb:Process-10001
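
The property path `a/rdfs:subClassOf*` used in Competency Questions 2.1 and 2.2 can be emulated without a SPARQL engine. The sketch below is a stand-in under stated assumptions: the dictionaries `SUBCLASS_OF` and `TYPES` are hypothetical in-memory copies of the graph, and the subclass links (LinuxTask under Task, LinuxService under Service) are inferred from the proposal text rather than read from the ontology files.

```python
# Assumed subclass links, inferred from the proposal text.
SUBCLASS_OF = {
    "observable:LinuxTask": {"observable:Task"},
    "observable:LinuxService": {"observable:Service"},
}

# rdf:type assertions copied from the Competency 2 graph.
TYPES = {
    "kb:Process-10001": {"observable:LinuxService", "observable:LinuxTask"},
    "kb:Process-10002": {"observable:LinuxTask"},
    "kb:Process-10003": {"observable:LinuxTask"},
}


def superclasses(cls):
    """Return cls plus all of its ancestors (the rdfs:subClassOf* closure)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def instances_of(target):
    """Emulate the pattern `?n a/rdfs:subClassOf* target`."""
    return sorted(
        node
        for node, asserted in TYPES.items()
        if any(target in superclasses(cls) for cls in asserted)
    )


if __name__ == "__main__":
    print(instances_of("observable:Task"))
    print(instances_of("observable:Service"))
```

Under those assumptions, the two calls return the nodes listed in Result 2.1 and Result 2.2 respectively.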

Competency Question 2.3

Which processes are, or were, non-primary tasks for the service kb:Process-10001? If a process was formerly a task, when is the relationship known to have ended?

Note this requires terminable parent-child relationship objects; and also, this example applies a custom string for core:kindOfRelationship. (Another proposal about strongly-typed ObservableRelationships linking child processes to parents would complement this example well.)

SELECT ?nTask ?lEndTime
WHERE {
  ?nRelationship
    core:kindOfRelationship "Child_Process_Of_Process" ;
    core:source ?nTask ;
    core:target kb:Process-10001 ;
    .
  OPTIONAL {
    ?nRelationship
      core:endTime ?lEndTime ;
      .
  }
}

Result 2.3

Assume that the example is modified to remove these statements (which removes reliance on the mutable observable:parent property) ...

kb:ProcessFacet-2 observable:parent kb:Process-10001 .
kb:ProcessFacet-3 observable:parent kb:Process-10001 .

... and to add these instead:

kb:Relationship-10002-10001
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Child_Process_Of_Process" ;
	core:source kb:Process-10002 ;
	core:target kb:Process-10001 ;
	.
kb:Relationship-10003-10001
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Child_Process_Of_Process" ;
	core:source kb:Process-10003 ;
	core:target kb:Process-10001 ;
	.

To motivate modeling terminable relationships, consider this extra example data, which includes a representation that some process was spawned and became detached from the website service:

kb:Process-1
	a observable:Process ;
	core:description "/sbin/init" ;
	.
kb:Process-10987
	a observable:Process ;
	.

kb:Relationship-10987-10001
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Child_Process_Of_Process" ;
	core:source kb:Process-10987 ;
	core:target kb:Process-10001 ;
	core:endTime "2023-12-25T08:14:15.9Z"^^xsd:dateTime ;
	.
kb:Relationship-10987-1
	a observable:ObservableRelationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Child_Process_Of_Process" ;
	core:source kb:Process-10987 ;
	core:target kb:Process-1 ;
	core:startTime "2023-12-25T08:14:15.9Z"^^xsd:dateTime ;
	.

Then, the query portion pertaining to detached processes would show a process that left home for the holiday:

?nTask ?lEndTime
kb:Process-10002
kb:Process-10003
kb:Process-10987 2023-12-25T08:14:15.9Z
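
The behavior of the `OPTIONAL` clause in the query for Competency Question 2.3 can be sketched in plain Python. This is an emulation rather than a SPARQL evaluation: each ObservableRelationship is flattened into a hypothetical tuple of (source, target, kind, end time), with `None` standing in for an absent `core:endTime`.

```python
# Each tuple: (core:source, core:target, core:kindOfRelationship, core:endTime or None)
RELATIONSHIPS = [
    ("kb:Process-10002", "kb:Process-10001", "Child_Process_Of_Process", None),
    ("kb:Process-10003", "kb:Process-10001", "Child_Process_Of_Process", None),
    ("kb:Process-10987", "kb:Process-10001", "Child_Process_Of_Process",
     "2023-12-25T08:14:15.9Z"),
    ("kb:Process-10987", "kb:Process-1", "Child_Process_Of_Process", None),
]


def child_tasks(service, kind="Child_Process_Of_Process"):
    """Return (task, end_time) rows for relationships targeting `service`.

    Rows without an end time are kept, mirroring OPTIONAL in the query.
    """
    rows = [
        (source, end_time)
        for source, target, rel_kind, end_time in RELATIONSHIPS
        if rel_kind == kind and target == service
    ]
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for task, end_time in child_tasks("kb:Process-10001"):
        print(task, end_time or "")
```

The relationship from kb:Process-10987 to kb:Process-1 is filtered out because its target is not the service, which is why only the ended relationship appears for that process.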

Solution suggestion

This diagram is ported from Issue 583's solution suggestion:

Semantically Structuring Software ObservableObjects

Since the initial implementation sketch of Issue 583, the following changes have been made:

  • observable:LinuxService, a subclass of observable:Service and sibling to observable:WindowsService, has been added.
  • Issue 583 included some subclass rearrangement that would not be considered a backwards-compatible change. For existing classes that will change their position in the subclass hierarchy, shapes are added for UCO 1.4.0, to warn users their current instances should be multi-typed to line up with what will be the parents in UCO 2.0.0.

Coordination

  • Tracking in Jira ticket OCUCO-312
  • Administrative review completed, proposal announced to Ontology Committees (OCs) on 2024-03-05
  • Requirements to be discussed in OC meeting, 2024-05-30 (rescheduled from Mar. 14)
  • Requirements to be discussed in OC meeting, TBD
  • Requirements Review vote has not occurred
  • Requirements development phase completed.
  • Solution announced to OCs on TODO-date
  • Solutions Approval to be discussed in OC meeting, date TBD
  • Solutions Approval vote has not occurred
  • Solutions development phase completed.
  • Backwards-compatible implementation merged into develop for the next release
  • develop state with backwards-compatible implementation merged into develop-2.0.0
  • Backwards-incompatible implementation merged into develop-2.0.0
  • Milestone linked
  • Documentation logged in pending release page
  • Prerelease publication: CASE develop branch updated to track UCO's updated develop branch
  • Prerelease publication: CASE develop-2.0.0 branch updated to track UCO's updated develop-2.0.0 branch
@ajnelson-nist
Contributor Author

@sbarnum , there are a few points needed to finish preparing this proposal well enough for a Requirements Review vote:

  • Can you please supply definitions for the new classes added in this Issue's Pull Request.
  • Can you please state whether any of the freshly-rearranged classes are intended to be disjoint with any others. I've inlined my guesses on disjointedness in the proposal.
  • Can you please give your response to Competency Question 1.1.

ajnelson-nist added a commit that referenced this issue Mar 6, 2024
This patch also updates a test result from one of the to-be-rearranged
classes.

A follow-on patch will regenerate Make-managed files.

References:
* RDFLib/pySHACL#222
* #596
* https://www.w3.org/TR/shacl/#NodeConstraintComponent

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Mar 6, 2024
References:
* #596

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

Update on the implementation: The initial PR tried inlining some anonymous sh:NodeShapes to gently warn that some additional types should be applied. Those were written on an incorrect understanding of how sh:node and sh:property work - it seems that if a supplemental shape attached by sh:node fails validation, even with sh:Info-level severity, the entire shape fails.

This patch, particularly between pre-line 8740 and post-line 13771, changes the implementation style to move all of those "gentle warning" shapes into anonymous shapes trailing at the end of observable.ttl. They are removed in the UCO 2.0.0 PR.

I moved them to anonymous nodes because devising shape IRIs for temporary shapes felt unhelpful: the shapes will no longer be relevant after UCO 2.0.0, but once IRIs are introduced we might need to retain them permanently for backwards compatibility.

At least with how UCO's current testing infrastructure works, the validation report differs slightly depending on whether the shape is identified with a blank node or with an IRI.

So, there is a question to address before Solutions Approval: Should these "gentle warning" shapes be given IRIs, or is it fine to have them be blank nodes? Absent requests otherwise, they will be left as blank nodes.

ajnelson-nist added a commit that referenced this issue May 8, 2024
… class

This applies a practice being tried in Issue 602.

A follow-on patch will regenerate Make-managed files.

References:
* #596
* #602

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue May 8, 2024
References:
* #596

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

So, there is a question to address before Solutions Approval: Should these "gentle warning" shapes be given IRIs, or is it fine to have them be blank nodes? Absent requests otherwise, they will be left as blank nodes.

From Issue 602, I found a middle ground: The temporary shapes are blank nodes, but are linked to their associated classes with rdfs:seeAlso. I did just confirm that this will have the blank node shape render on the generated documentation page. The next-minor and next-major PRs have been updated with this implementation.

ajnelson-nist added a commit to casework/CASE-Archive that referenced this issue May 10, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Archive that referenced this issue May 10, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue May 10, 2024
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue May 10, 2024
References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue May 10, 2024
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue May 10, 2024
References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue May 10, 2024
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue May 10, 2024
References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue May 10, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue May 10, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

This Issue is awaiting English definitions to be added to the new classes before we consider whether the requirements are sufficiently specified.

@ajnelson-nist
Contributor Author

There was some question in last week's call on the mutability of a process's parent-reference.

Here is a demonstration in macOS or Linux of a process changing its parent, using a Bash shell:

sleep 60

That command runs an idle process for 60 seconds, and the process remains in the foreground of the shell. Terminating the parent shell would recursively terminate the child sleep process.

sleep 60 &

That command runs an idle process and backgrounds the process. The shell is still the process's parent - terminating the parent shell would recursively terminate the child sleep process.

nohup sleep 60 &

That command runs like sleep 60 &, except now if the shell is terminated, the root process init inherits the child. This can be seen with ps -ef | grep sleep - see the "PPID" (parent process ID) column.
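
The same change of parent can be observed programmatically. The sketch below is an analogue of the shell demonstration, not a reproduction of `nohup`: a hypothetical intermediate process starts a grandchild and exits at once, and the grandchild then reports its parent PID. It assumes a POSIX system (macOS or Linux); the adopting process may be init or another "subreaper" process, so no specific PID is claimed.

```python
import subprocess
import sys
import textwrap

# Grandchild: wait long enough for its original parent to exit,
# then report whichever process is its parent now.
GRANDCHILD = "import os, time; time.sleep(1); print(os.getppid(), flush=True)"

# Intermediate parent: start the grandchild, report its own PID, and exit
# immediately without waiting, which orphans the grandchild.
INTERMEDIATE = textwrap.dedent(f"""
    import os, subprocess, sys
    subprocess.Popen([sys.executable, "-c", {GRANDCHILD!r}])
    print(os.getpid(), flush=True)
""")


def observe_reparenting():
    """Return (original parent PID, parent PID seen after orphaning)."""
    result = subprocess.run(
        [sys.executable, "-c", INTERMEDIATE],
        capture_output=True,
        text=True,
        check=True,
    )
    # First line: the intermediate's PID. Second line: the grandchild's
    # parent PID as observed about one second later.
    original_ppid, later_ppid = (int(line) for line in result.stdout.split())
    return original_ppid, later_ppid


if __name__ == "__main__":
    original, later = observe_reparenting()
    print(f"parent at spawn: {original}; parent after orphaning: {later}")
```

The two reported PIDs differing is the point relevant to modeling: a single static parent property cannot capture both values, whereas time-bounded relationship objects can.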

@ajnelson-nist
Contributor Author

@sbarnum : I was looking at the classes in this proposal and thinking of how to demonstrate them. "Service pack" is giving me some confusion versus the other classes.

How would you instantiate Windows XP Service Pack 2 (relevant for at least a lot of available forensic reference data), in these ways:

  • As the software sitting on a hologrammed DVD.
  • As a running operating system.

I expect in both of these cases several of the software types will apply to each node.

ajnelson-nist added a commit to ucoProject/UCO-Profile-gufo that referenced this issue Jun 7, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#596

Signed-off-by: Alex Nelson <[email protected]>
@sbarnum
Contributor

sbarnum commented Oct 24, 2024

This Issue is awaiting English definitions to be added to the new classes before we consider whether the requirements are sufficiently specified.

Here are proposed English definitions for each of the newly proposed classes, as well as tweaks to a few existing definitions to better align them overall and to address some issues that arose from creating clear, distinct definitions for Task, ProcessThread, and Process.
Fleshing these definitions out also led to a need to slightly alter the Software class taxonomy, as Task and ProcessThread are heavily related to Process but should not be subclasses.
Here is the updated diagram:

Software Deployment Overview-2 - ObservableObjects drawio

  • BuildUtility
    • A Build Utility is a software-based tool that automates portions or all of the process of creating executable software from source code
  • Compiler
    • A Compiler is a software program that translates source code written in a high-level language (e.g., C++, Python, Java) into machine code that can be understood and executed by a computer processor.
  • DeploymentScript
    • A Deployment Script is a software script used to deploy artifacts, packages, modules, patches, or other resources into an intended execution environment
  • LinuxService
    • A Linux Service (often referred to as a daemon) is a Service running within a Linux operating system, similar to the way a Windows Service runs on Windows.
  • LinuxTask
    • A Linux Task is a set of software computer instructions loaded into memory with the potential to be scheduled for execution within the Linux operating system.
  • Package
    • A Package is a body of software consisting of a collection of individual software (programs, libraries, files, etc.) packaged together to collectively serve a broader purpose
  • Process
    • rdfs:comment "A Process is an instance of a software program that is being executed within a scope having dedicated memory, address space, execution variables, code instructions, state, security info, file handles, etc. Process execution consists of one or more component threads sharing the process resources."@en ;
  • ProcessThread
    • rdfs:comment "A Process Thread is the smallest sequence of programmed instructions that can be managed independently by a scheduler on a computer, which is typically a part of the operating system. It is a scheduled running instantiation of one or more tasks (including CPU flags, counters, timers, stack, etc.) as a component of a process. Multiple threads can exist within one process, executing concurrently and sharing resources such as memory, while different processes do not share these resources. In particular, the threads of a process share its executable code and the values of its dynamically allocated variables and non-thread-local global variables at any given time. based on [https://en.wikipedia.org/wiki/Thread_(computing)]"@en ;
  • Script
    • A Script is software consisting of computer instructions that can be interpreted and executed in real time (typically by an interpreter rather than directly by a computer processor) without requiring advance compilation
  • Service
    • A Service is a process that runs in the background rather than under the control of an interactive user. Services are typically long-running and can be configured to start when the operating system starts and continue as long as the operating system is running.
  • ServicePack
    • A Service Pack is software consisting of a collection of software updates or fixes (patches) for a piece of software, delivered as a single aggregated package for ease of installation
  • SoftwareBuild
    • A Software Build is a particular executable version of software that has been created from source code and is ready for testing or deployment
  • Task
    • A Task is a set of software computer instructions loaded into memory with the potential to be scheduled for execution
  • WindowsService
  • WindowsTask
    • rdfs:comment "A Windows Task is a set of software computer instructions loaded into memory with the potential to be scheduled for execution within the Windows operating system."@en;
  • WindowsThread
    • rdfs:comment "A Windows thread is a Process Thread within a Windows process."@en ;

ajnelson-nist pushed a commit that referenced this issue Oct 25, 2024
No effects were observed on Make-managed files.

AJN: This is my transcription of Sean's Issue Comment (see references),
with a few minor grammatical and typographical fixes.

References:
* #596 (comment)

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist pushed a commit that referenced this issue Oct 25, 2024
No effects were observed on Make-managed files.

AJN: This is my transcription of Sean's Issue Comment (see references),
with a few minor grammatical and typographical fixes.

References:
* #596 (comment)

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

Thank you, @sbarnum, I've incorporated the definition updates.

ajnelson-nist added a commit to ucoProject/UCO-Profile-gufo that referenced this issue Nov 1, 2024
This patch leaves one incompletely-typed class, `ServicePack`, pending
discussion.

References:
* ucoProject/UCO#596
* ucoProject/UCO@faae89b

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

@sbarnum : I'm looking at some class-pairings.

  • I suspect Compiler is a subclass of Application. Do you have an example where some Compiler is not an Application?
  • Is there an example of a Library that is not a File? I suspect yes, when considering in-process-memory objects, but we haven't delved into modeling that to date.

@sbarnum
Contributor

sbarnum commented Nov 4, 2024

  • Can you please state whether any of the freshly-rearranged classes are intended to be disjoint with any others. I've inlined my guesses on disjointedness in the proposal.

I think defining disjointness between classes of software is very tricky and risky, as there is a large amount of inherent potential overlap.
I am not sure I see significant value in trying to tease apart this issue.
That being said, if I had to take a cut at disjointness assertions that would likely be safe, I would go with something like:

• Process disjoint from
	○ Code
	○ Application
	○ Script
	○ Library
	○ ProcessThread
	○ Task
	○ Compiler
	○ BuildUtility
	○ SoftwareBuild
	○ OperatingSystem
	○ ServicePack
• ProcessThread disjoint from
	○ Code
	○ Application
	○ Script
	○ Library
	○ Process
	○ Task
	○ Compiler
	○ BuildUtility
	○ SoftwareBuild
	○ OperatingSystem
	○ ServicePack
• Task disjoint from
	○ Code
	○ Application
	○ Script
	○ Library
	○ Process
	○ ProcessThread
	○ Compiler
	○ BuildUtility
	○ SoftwareBuild
	○ OperatingSystem
	○ ServicePack
• Application disjoint from 
	○ OperatingSystem
	○ Library
• BuildUtility disjoint from
	○ Library
	○ OperatingSystem
• Library disjoint from
	○ OperatingSystem
	○ Compiler
	○ BuildUtility
• ServicePack disjoint from
	○ Application
	○ Library
	○ Compiler
	○ BuildUtility
	○ OperatingSystem
• OperatingSystem disjoint from
	○ Compiler

ajnelson-nist added a commit to ucoProject/UCO-Profile-gufo that referenced this issue Nov 6, 2024
This patch is known to not pass CI due to an already-existing and
unresolved modeling question on ServicePack.

References:
* ucoProject/UCO#596 (comment)

Co-authored-by: Sean Barnum <[email protected]>
Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist
Contributor Author

@sbarnum , I will agree on tricky, but I do think we will find benefits from trying to specify these disjointedness statements. For instance, this helped catch a specification issue with one of the requirements over on Issue 626 (described here).

@ajnelson-nist
Contributor Author

@sbarnum : Also in light of Issue 626 (constraining observable:cpeid), it looks like your disjointedness statements work out so that these subclasses of Software (as proposed in the current Issue) can't have CPEs associated, because they are disjoint with both Application and OperatingSystem.

?nClass
drafting:ServicePack
drafting:Task
uco-observable:Library
uco-observable:Process
uco-observable:ProcessThread

Per this SPARQL query:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX uco-observable: <https://ontology.unifiedcyberontology.org/uco/observable/>
SELECT ?nClass
WHERE {
  ?nClass
    rdfs:subClassOf* uco-observable:Software ;
    owl:disjointWith
      uco-observable:Application ,
      uco-observable:OperatingSystem
      ;
    .
}
ORDER BY ?nClass
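
The listed result can be cross-checked against the disjointness lists in the earlier comment with a short script. This is a sketch under assumptions: class names are reduced to local names (no `drafting:` or `uco-observable:` prefixes), every named class is assumed to be a subclass of Software, and disjointness is treated as symmetric, which the SPARQL query itself only sees if both directions are asserted or inferred. The names `DISJOINT_WITH` and `disjoint_with_all` are hypothetical.

```python
# Classes that Process, ProcessThread, and Task are each listed as disjoint from.
STATIC = ["Code", "Application", "Script", "Library", "Compiler",
          "BuildUtility", "SoftwareBuild", "OperatingSystem", "ServicePack"]

# Transcription of the proposed "X disjoint from ..." lists.
DISJOINT_WITH = {
    "Process": STATIC + ["ProcessThread", "Task"],
    "ProcessThread": STATIC + ["Process", "Task"],
    "Task": STATIC + ["Process", "ProcessThread"],
    "Application": ["OperatingSystem", "Library"],
    "BuildUtility": ["Library", "OperatingSystem"],
    "Library": ["OperatingSystem", "Compiler", "BuildUtility"],
    "ServicePack": ["Application", "Library", "Compiler",
                    "BuildUtility", "OperatingSystem"],
    "OperatingSystem": ["Compiler"],
}


def disjoint_with_all(targets, assertions=DISJOINT_WITH):
    """Return classes disjoint with every class in `targets`."""
    # Unordered pairs make each assertion count in both directions.
    pairs = {frozenset((a, b)) for a, others in assertions.items() for b in others}
    classes = set(assertions) | {b for others in assertions.values() for b in others}
    return sorted(
        c for c in classes
        if c not in targets and all(frozenset((c, t)) in pairs for t in targets)
    )


if __name__ == "__main__":
    print(disjoint_with_all(("Application", "OperatingSystem")))
```

With the lists as transcribed, the script yields Library, Process, ProcessThread, ServicePack, and Task, the same five classes as the query result above (ordered by local name rather than by prefixed name).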

Task, Process, and ProcessThread, I see no controversy. I still need your help understanding ServicePack. But what about Library? A vulnerability in a library would be a significant point of interest in supply chain review.

Could you perhaps illustrate how a Library fits into the composition of an Application?
