Specific Syscalls #152

Merged
merged 7 commits into from
Jul 8, 2023

Conversation

ryantxu1
Collaborator

Adding specific Linux and Windows syscalls to the taxonomy.

Before merging @hack-sentinel I have a few questions:

  • I don’t know how to get the hierarchy tree updated
  • Do these entries for specific calls need neighbors?
  • I have a case where the Linux Pause syscall can suspend both a thread and a process. How can I duplicate a "Linux Pause" class in Protégé? If I can't do that, how should the Protégé IRI be handled?

@hack-sentinel
Collaborator

hack-sentinel commented Apr 11, 2023

* I don’t know how to get the hierarchy tree updated

If I understand the problem, you are making edits but not seeing those in the tree viewer for the classes. Sometimes the Protege app will not display what it should even though you've made a change. Try closing the tree and re-opening. You may need to save and reload to get it to render the tree as expected.

* Do these entries of specific calls need neighbors?

[It's possible I haven't interpreted this question right, but here goes...]

By 'neighbors' I will assume you mean sibling entries for specific types of calls for, say, different OSes (i.e., children of a common parent class). As an example of a singleton (no siblings, aka neighbors), I see 'Windows NtDuplicateToken' as the only subclass (child) of 'Copy Token'. Generally, we'd hope to find some additional children, just as one would in a document outline, so we would have something additional to contrast with, or even to create a partition across the child classes. But sometimes there is a specialization of a parent concept that needs to be expressed and is just that, a singleton class. Don't worry too much if you haven't found an equivalent Linux specialization of the notion of 'Copy Token' just yet. If you are highly confident there is no equivalent anywhere, then moving Windows NtDuplicateToken up a level might make sense in the general case of ontology class design [1].

[1] I'm not sure of all the use cases or prior discussion here, so even if you are sure there is no Linux analog to Windows for Copy Token, delay moving it up until we have a quick chat with the rest of the syscall taxonomy dev & review team. Having an 'unnecessary' abstraction won't hurt any of the logic and might make the hierarchy easier to navigate (having a first layer of only syscall abstractions at the top level of the System Call hierarchy could come in handy).

* I have a case where the Linux Pause syscall can suspend both a thread and a process. How can I duplicate a "Linux Pause" class in Protégé? If I can't do that, how should the Protégé IRI be handled?

Well, you can reuse a class in the sense that it can show up in different contexts or structures, or occasionally even have multiple parents. If there are different concepts (say, one for pausing a thread and one for pausing a process), then they should be different classes with different IRIs and different labels. I'd recommend having two different classes here: one under Suspend Process (rename the existing IRI and label from LinuxPause and 'Linux Pause' to LinuxPauseProcess and 'Linux Pause Process', respectively) and a new one under Suspend Thread (Linux Pause Thread). Both of these Linux child classes can reference seeAlso -> ..../pause(2). Then copy the definition over from Linux Pause Process to Linux Pause Thread and adjust it so the two are distinguished.
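To make the two-class recommendation concrete, here is a minimal Python sketch that renders both classes as Turtle text in the same shape as the stanzas in this PR. The helper `syscall_class`, the IRI `LinuxPauseThread`, the parent IRIs `SuspendProcess` and `SuspendThread`, and the definition strings are assumptions for illustration only; the seeAlso target is left elided exactly as in the comment above.

```python
def syscall_class(iri, label, parent, definition, see_also):
    """Render one owl:Class stanza in the Turtle shape used in this PR.

    Hypothetical helper for illustration; it is not part of the ontology tooling.
    """
    return (
        f":{iri} a owl:Class ;\n"
        f'    rdfs:label "{label}" ;\n'
        f"    rdfs:subClassOf :{parent} ;\n"
        f'    :definition "{definition}" ;\n'
        f'    rdfs:seeAlso "{see_also}" .\n'
    )


# The man page link is elided in the discussion, so it stays elided here.
PAUSE_SEE_ALSO = "..../pause(2)"

# Two distinct concepts: two IRIs, two labels, two parents, one shared reference.
process_variant = syscall_class(
    "LinuxPauseProcess",                # renamed from LinuxPause
    "Linux Pause Process",              # renamed from 'Linux Pause'
    "SuspendProcess",                   # assumed IRI of 'Suspend Process'
    "Placeholder definition for the process case.",
    PAUSE_SEE_ALSO,
)
thread_variant = syscall_class(
    "LinuxPauseThread",                 # assumed IRI for 'Linux Pause Thread'
    "Linux Pause Thread",
    "SuspendThread",                    # assumed IRI of 'Suspend Thread'
    "Placeholder definition for the thread case.",
    PAUSE_SEE_ALSO,
)

print(process_variant)
print(thread_variant)
```

The point of the sketch is only that nothing is duplicated: each concept gets its own IRI and label, and the shared man page reference is the only thing the two stanzas have in common.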

@ioggstream
Contributor

Shouldn't we have a POSIX taxonomy first?

@netfl0
Contributor

netfl0 commented Apr 27, 2023

Shouldn't we have a POSIX taxonomy first?

This is a great idea. Since we've already done the Windows analysis, perhaps we could do POSIX second :)

@ryantxu1 let me know your thoughts here.

Cc @hack-sentinel

@ryantxu1
Collaborator Author

ryantxu1 commented May 1, 2023

@netfl0 Open to exploring this eventually. Are we envisioning bucketing the POSIX system interfaces in the same way as the Linux/Windows syscalls?

@ioggstream
Contributor

Some further considerations that you may have already taken into account and that, imho, could shift the bar towards POSIX APIs. OTOH, POSIX APIs can be either system calls (e.g. execve) or library functions (e.g. malloc).

  1. Linux has ~400 syscalls with different security profiles. Moreover, some syscalls (e.g. fork) are implemented on top of others. Some security-related functions (e.g. malloc/free) are not syscalls (but I don't think people will reference man 2 brk when tagging objects with d3fend).
man 2 syscalls | grep '[a-z]\(2\)     ' -c

Forking can be done in different ways, e.g.,

  • fork / vfork
  • clone / clone2 / clone3
  2. POSIX has fewer APIs, but they are perhaps better known, as they include fork, open, malloc, free, printf. Going by the Perl POSIX interface man page, POSIX has ~300 calls, some of which are just library calls.
man posix| grep '    "[a-z0-9]+"' 

It is probably easier to use that kind of reference. Moreover, it can apply to other OSes.

Questions:

  • What mapping strategy did you use for the Windows syscalls?

@@ -16216,8 +16593,7 @@ order to constitute a complete standard. For a complete definition of all requir
rdfs:label "Linux ELF File 64bit" .

:LinuxExec a :CreateProcess,
Contributor

@ioggstream ioggstream May 3, 2023

LinuxExec does not create a process. Instead, it replaces the program that is currently being run by the calling process with a new program, with newly initialized stack, heap, and (initialized and uninitialized) data segments.

There are other parts of the process that are not replaced, e.g.:

The process's real UID and real GID, as well as its supplementary group IDs, are unchanged;
file descriptors remain open unless marked close-on-exec.

Syscalls are a slippery slope...
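The claim that exec replaces the program without creating a process can be checked with a small experiment. The sketch below assumes a POSIX system (on Windows, `os.execv` does not preserve the PID). It starts a helper interpreter that prints its PID and then calls `os.execv` on itself; the replacement program prints its PID again. The names `CHILD` and `pids_before_and_after_exec` are made up for this example.

```python
import subprocess
import sys

# Script run by the helper interpreter: report the PID, then replace the
# running program with a fresh interpreter that reports its PID as well.
CHILD = r'''
import os, sys
print(os.getpid(), flush=True)  # flush, because exec discards unflushed buffers
os.execv(sys.executable, [sys.executable, "-c", "import os; print(os.getpid())"])
'''


def pids_before_and_after_exec():
    """Return the PID seen before exec and the PID seen by the new program."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    before, after = (int(line) for line in result.stdout.split())
    return before, after


before, after = pids_before_and_after_exec()
print("same process across exec:", before == after)
```

If the two PIDs match, the exec call reused the existing process, which is the behaviour the comment describes and the reason a Create Process parent is questionable for the exec family.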

@@ -6264,6 +6296,216 @@ Most current Unix-like systems and Microsoft Windows support loadable kernel mod
rdfs:subClassOf :DigitalArtifact ;
rdfs:seeAlso "https://dbpedia.org/resource/Link" .

:Linux_Exit a owl:Class ;
rdfs:label "Linux _Exit" ;
Contributor

@ioggstream ioggstream May 3, 2023

I can't find _exit with man 2 syscalls | grep _exit;
instead, there's an _exit in POSIX.

Here and elsewhere: why not use the verbatim syscall name, e.g. "exit(2)", in the label?

Collaborator Author

@ryantxu1 ryantxu1 Jun 19, 2023

@ioggstream I used '_exit' because the man page gives that as the name used when actually calling the function.
https://man7.org/linux/man-pages/man2/exit.2.html

That is also why I left out the '(2)' in the labels, as I was focusing on the function names themselves. However, I'm happy to reconsider the naming convention if there's an advantage to doing so!

Contributor

@ioggstream ioggstream Jun 25, 2023

Using system call names or function signatures is a "design" choice. Since implementations can vary over time (see also the link you posted, https://man7.org/linux/man-pages/man2/exit.2.html), if we want to focus on Linux, I'd use a man-like URI (e.g. man-pages/man2/exit.2.html references both exit and 2).

This allows us to define a POSIX version (e.g., POSIX_exit).

Moreover, I'd avoid camelizing the syscall name and keep it verbatim (e.g., Linux-exit or Linux-_exit, or something like that).
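One way to picture the naming suggestion is a pair of small helpers that derive both the reference URL and the class name from the man page coordinates. This is only a sketch: `man_page_url` and `linux_iri` are hypothetical names, the URL pattern is inferred from the man7.org links quoted in this thread, and the `Linux-` prefix follows the examples given in the comment above.

```python
MAN7_BASE = "https://man7.org/linux/man-pages"


def man_page_url(page, section=2):
    """Build a man7.org URL from a page name and a manual section.

    The pattern is inferred from the links in this thread, e.g.
    .../man2/exit.2.html for page "exit" in section 2.
    """
    return f"{MAN7_BASE}/man{section}/{page}.{section}.html"


def linux_iri(syscall):
    """Keep the syscall name verbatim instead of camelizing it (assumed convention)."""
    return f"Linux-{syscall}"


# The function is called _exit, while the link posted above points at page "exit",
# so the page name is passed separately from the syscall name.
print(linux_iri("_exit"), man_page_url("exit"))
print(linux_iri("execveat"), man_page_url("execveat"))
```

Keeping the page name and section as explicit inputs means the label, the IRI, and the seeAlso link all come from the same source, which is the advantage the comment argues for.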

Comment on lines 6343 to 6353
:LinuxExecve a owl:Class ;
rdfs:label "Linux Execve" ;
rdfs:subClassOf :CreateProcess ;
:definition "Execute program." ;
rdfs:seeAlso "https://man7.org/linux/man-pages/man2/execve.2.html" .

:LinuxExecveat a owl:Class ;
rdfs:label "Linux Execveat" ;
rdfs:subClassOf :CreateProcess ;
:definition "Execute program relative to a directory file descriptor." ;
rdfs:seeAlso "https://man7.org/linux/man-pages/man2/execveat.2.html" .
Contributor

Are Execute and Create in the same class?

Contributor

They both create a process (thus they are both a subclass of that semantic class), if that makes sense. We've got some cleanup coming; @ryantxu1 will be pushing to this branch soon.

Contributor

Understood. Since I don't think this reflects the Linux taxonomy I will wait for the cleanup.

@netfl0
Contributor

netfl0 commented Jun 16, 2023

let me know when ready for review

@ryantxu1
Collaborator Author

@netfl0 Ready for review

[ a owl:Restriction ;
owl:onProperty :executes ;
owl:someValuesFrom :Process ] ;
:definition "Executes a process." ;
Contributor

@ioggstream ioggstream Jun 24, 2023

IIUC, with system calls you execute a program. I see that in the Windows world there's spawnwe(), which does fork/exec, but the name ExecuteProcess is confusing to me.

@@ -4525,7 +4535,7 @@ SafeSEH might be applied only to some executable files or modules, allowing an a
owl:someValuesFrom :ExecutableFile ],
[ a owl:Restriction ;
owl:onProperty :restricts ;
- owl:someValuesFrom :CreateProcess ] ;
+ owl:someValuesFrom :SpawnProcess ] ;
Contributor

Formally, when execve(2) is invoked, the process has already been spawned.

Moreover, I can create a new process just by forking an existing one without execve(2), e.g., to run another instance of the same program.
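The second point, that a process can be created without any exec call, can be sketched as follows. This assumes a POSIX system where `os.fork` is available; `fork_same_program` is a made-up name for this example. The child keeps running the same program image, reports its own PID through a pipe, and exits.

```python
import os


def fork_same_program():
    """Fork a child that never calls exec.

    Returns (parent_pid, child_pid_seen_by_parent, child_pid_seen_by_child).
    """
    read_end, write_end = os.pipe()
    child = os.fork()
    if child == 0:
        # Child branch: a new process, still running this same program.
        try:
            os.close(read_end)
            os.write(write_end, str(os.getpid()).encode())
        finally:
            os._exit(0)  # exit immediately, skipping the parent's cleanup handlers
    # Parent branch: read what the child reported, then reap it.
    os.close(write_end)
    reported = int(os.read(read_end, 32).decode())
    os.close(read_end)
    os.waitpid(child, 0)
    return os.getpid(), child, reported


parent_pid, child_pid, reported_pid = fork_same_program()
print("new process created:", parent_pid != child_pid)
print("child saw its own PID:", child_pid == reported_pid)
```

Together with the exec behaviour discussed earlier in the thread, this suggests that process creation and program execution are separate steps on Linux, which is why a single Spawn Process or Create Process parent for execve is being questioned.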

rdfs:label "Spawn Process" ;
skos:altLabel "Process Spawn" ;
rdfs:subClassOf :SystemCall ;
:definition "A process spawn refers to a function that loads and executes a new child process.The current process may wait for the child to terminate or may continue to execute asynchronously. Creating a new subprocess requires enough memory in which both the child process and the current program can execute. There is a family of spawn functions in DOS, inherited by Microsoft Windows. There is also a different family of spawn functions in an optional extension of the POSIX standards. Fork-exec is another technique combining two Unix system calls, which can effect a process spawn." ;
Contributor

Suggested change
- :definition "A process spawn refers to a function that loads and executes a new child process.The current process may wait for the child to terminate or may continue to execute asynchronously. Creating a new subprocess requires enough memory in which both the child process and the current program can execute. There is a family of spawn functions in DOS, inherited by Microsoft Windows. There is also a different family of spawn functions in an optional extension of the POSIX standards. Fork-exec is another technique combining two Unix system calls, which can effect a process spawn." ;
+ :definition "A process spawn refers to a function that loads an executable and executes it in a new child process. The current process may wait for the child to terminate or may continue to execute asynchronously. Creating a subprocess requires enough memory in which both the child process and the current program can execute. There is a family of spawn functions in DOS, inherited by Microsoft Windows. There is also a different family of spawn functions in an optional extension of the POSIX standards. Fork-exec is another technique combining two Unix system calls, which can effect a process spawn." ;

owl:onProperty :suspends ;
owl:someValuesFrom :Thread ] ;
:definition "Suspending a thread causes the thread to stop executing user-mode code." ;
rdfs:seeAlso "https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-suspendthread" .
Contributor

@ioggstream ioggstream Jun 25, 2023

See also https://man7.org/linux/man-pages/man2/signal.2.html

IIRC, on Linux it's the kernel scheduler that pauses the thread (e.g., https://docs.kernel.org/scheduler/sched-design-CFS.html), but I may be a bit rusty on this topic.

@netfl0 netfl0 merged commit e685e91 into d3fend:develop Jul 8, 2023
1 check passed