Add the new docs to replace the old ones

This commit is contained in:
Tate, Hongliang Tian
2024-01-20 12:17:42 +08:00
parent 7b729de3a6
commit 9c55d8f265
54 changed files with 105 additions and 2520 deletions

View File

@ -1,14 +0,0 @@
# Asterinas Documentation
The documentation is rendered as a book with [mdBook](https://rust-lang.github.io/mdBook/),
which can be installed with `cargo`.
```bash
cargo install mdbook
```
To build the book and read it in your default browser, run the following command.
```bash
mdbook serve --open
```

View File

@ -1,9 +1,21 @@
[book]
-authors = ["Tate, Hongliang Tian"]
+authors = ["The Asterinas authors"]
language = "en"
multilingual = false
src = "src"
-title = "Asterinas: A Secure, Fast, and Modern OS in Rust"
+title = "The Asterinas Book"
[build]
# create-missing = true
[output.html]
default-theme = "navy"
preferred-dark-theme = "navy"
[output.html.playground]
editable = false # allows editing the source code
copyable = true # include the copy button for copying code snippets
copy-js = true # includes the JavaScript for the code editor
line-numbers = false # displays line numbers for editable code
runnable = false # displays a run button for rust code
[rust]
edition = "2021"

View File

@ -1,103 +0,0 @@
<!--
# Table of Content
1. Introduction
2. Design
1. Privilege Separation
1. Case Study 1: Syscall Workflow
2. Case Study 2: Drivers for Virtio Devices on PCI
2. Everything as a Capability
1. Type-Level Programming in Rust
2. CapComp: Zero-Cost Capabilities and Component
3. (More content...)
-->
# Introduction
This document describes Asterinas, a secure, fast, and modern OS written in Rust.
As the project is a work in progress, this document is by no means complete.
Despite the incompleteness, this evolving document serves several important purposes:
1. To improve the quality of thinking by [putting ideas into words](http://www.paulgraham.com/words.html).
2. To convey the vision of this project to partners and stakeholders.
3. To serve as a blueprint for implementation.
## Opportunities
> The people who are crazy enough to think they can change the world
> are the ones who do. --Steve Jobs
We believe now is the perfect time to start a new Rust OS project. We argue that,
if we do things right, the project has a promising prospect of success and a real
shot at challenging the dominance of Linux in the long run.
Our confidence stems from three technological, industrial, and geopolitical trends.
First, [Rust](https://www.rust-lang.org/) is the future of system programming,
including OS development. Due to its advantages of safety, efficiency, and
productivity, Rust has been increasingly embraced by system developers,
including OS developers. [Linux is close to adopting Rust as an official
programming
language.](https://www.zdnet.com/article/linus-torvalds-is-cautiously-optimistic-about-bringing-rust-into-the-linux-kernels-next-release/)
Outside the Linux community, Rust enthusiasts are building from scratch new Rust
OSes, e.g., [Kerla](https://github.com/nuta/kerla),
[Occlum](https://github.com/occlum/occlum),
[Redox](https://github.com/redox-os/redox),
[rCore](https://github.com/rcore-os/rCore),
[RedLeaf](https://github.com/mars-research/redleaf),
[Theseus](https://github.com/theseus-os/Theseus),
and [zCore](https://github.com/rcore-os/zCore). Despite their varying degrees of
success, none of them are general-purpose, industrial-strength OSes that are or
will ever be competitive with Linux. Eventually, a winner will emerge out of this
market of Rust OSes, and Asterinas is our bet for this competition.
Second, Rust OSes are a perfect fit for
[Trusted Execution Environments (TEEs)](https://en.wikipedia.org/wiki/Trusted_execution_environment).
TEEs are an emerging hardware-based security technology that is expected to
become mainstream. All major CPU vendors have launched or announced their
implementations of VM-based TEEs, including
[ARM CCA](https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture),
[AMD SEV](https://developer.amd.com/sev/),
[Intel TDX](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html)
and [IBM PEF](https://research.ibm.com/publications/confidential-computing-for-openpower).
Typical applications that demand the protection of TEEs also desire a TEE OS that is
more secure and trustworthy than Linux, which is plagued by the
inevitable security vulnerabilities resulting from the unsafe nature of the C
language and the sheer complexity of its codebase. A new Rust OS built from
scratch is less likely to contain memory safety bugs and can enjoy a
significantly smaller Trusted Computing Base (TCB).
Third, the Chinese tech sector has a strong incentive to
invest in local alternatives to critical software like OSes.
Based in China,
we have been observing a greater aspiration among Chinese companies
as well as greater support from the Chinese government
to [achieve independence in key technologies like chips and software](https://www.nytimes.com/2021/03/10/business/china-us-tech-rivalry.html).
One success story of Chinese software independence is
relational databases:
[Oracle and IBM are losing ground as Chinese vendors catch up with their US counterparts](https://www.theregister.com/2022/07/06/international_database_vendors_are_losing/).
Can such success stories be repeated in the field of OSes? We think so.
There are some home-grown Chinese OSes, such as [openKylin](https://www.openkylin.top/index.php?lang=en), but all of them are based on Linux and lack a self-developed
OS _kernel_. The long-term goal of Asterinas is to fill in this key missing core of home-grown OSes.
## Architecture Overview
Here is an overview of the architecture of Asterinas.
![architecture overview](images/arch_overview.png)
## Features
**1. Security by design.** Security is our top priority in the design of Asterinas. As such, we adopt the widely acknowledged security best practice of [least privilege principle](https://en.wikipedia.org/wiki/Principle_of_least_privilege) and enforce it in a fashion that leverages the full strengths of Rust. To do so, we partition Asterinas into two halves: a _privileged_ OS core and _unprivileged_ OS components. All OS components are written entirely in _safe_ Rust and only the privileged OS core
is allowed to have _unsafe_ Rust code. Furthermore, we propose the idea of _everything-is-a-capability_, which elevates the status of [capabilities](https://en.wikipedia.org/wiki/Capability-based_security) to the level of a ubiquitous security primitive used throughout the OS. We make novel use of Rust's advanced features (e.g., [type-level programming](https://willcrichton.net/notes/type-level-programming/)) to make capabilities more accessible and efficient. The net result is improved security and uncompromised performance.
**2. Trustworthy OS-level virtualization.** OS-level virtualization mechanisms (like Linux's cgroups and namespaces) enable containers, a more lightweight and arguably more popular alternative to virtual machines (VMs). But there is one problem with containers: they are not as secure as VMs (see [StackExchange](https://security.stackexchange.com/questions/169642/what-makes-docker-more-secure-than-vms-or-bare-metal), [LWN](https://lwn.net/Articles/796700/), and [AWS](https://docs.aws.amazon.com/AmazonECS/latest/bestpracticesguide/security-tasks-containers.html)). There is a real risk that malicious containers may exploit privilege escalation bugs in the OS kernel to attack the host. [A study](https://dl.acm.org/doi/10.1145/3274694.3274720) found that 11 out of 88 kernel exploits are effective in breaking the container sandbox. The seemingly inherent insecurity of OS kernels leads to a new breed of container implementations (e.g., [Kata](https://katacontainers.io/) and [gVisor](https://gvisor.dev/)) that are based on VMs, instead of kernels, for isolation and sandboxing. We argue that this unfortunate retreat from OS-level virtualization to VM-based one is unwarranted---if the OS kernels are secure enough. And this is exactly what we plan to achieve with Asterinas. We aim to provide a trustworthy OS-level virtualization mechanism on Asterinas.
**3. Fast user-mode development.** Traditional OS kernels like Linux are hard to develop, test, and debug. Kernel development involves countless rounds of programming, failing, and rebooting on bare-metal or virtual machines. This way of life is unproductive and painful. Such a pain point is also recognized and partially addressed by [research work](https://www.usenix.org/conference/fast21/presentation/miller), but we think we can do more. In this spirit, we design the OS core to provide high-level APIs that are largely independent of the underlying hardware and implement it with two targets: one target is as part of a regular OS in kernel space and the other is as a library OS in user space. This way, all the OS components of Asterinas, which are stacked above the OS core, can be developed, tested, and debugged in user space, which is more friendly to developers than kernel space.
**4. High-fidelity Linux ABI.** An OS without usable applications is useless. So we believe it is important for Asterinas to fit into an established and thriving ecosystem of software, such as the one around Linux. This is why we conclude that Asterinas should aim at implementing a high-fidelity Linux ABI, including the system calls, the proc file system, etc.
**5. TEEs as top-tier targets.** (Todo)
**6. Reservation-based OOM prevention.** (Todo)

View File

@ -1,16 +1,53 @@
# Summary
-- [Introduction](README.md)
+[Introduction](introduction.md)
-# Design
+# Asterinas Kernel
-- [Privilege Separation](privilege_separation/README.md)
+* [Getting Started](kernel/README.md)
-- [Case Study 1: Syscall Workflow](privilege_separation/syscall_workflow.md)
+* [A Zero-Cost, Least-Privilege Approach](kernel/the-approach/README.md)
-- [Case Study 2: Drivers for Virtio Devices on PCI](privilege_separation/pci_virtio_drivers.md)
+* [Framekernel OS Architecture](kernel/the-approach/framekernel.md)
-- [Everything as a Capability](capabilities/README.md)
+* [Component-Level Access Control](kernel/the-approach/components.md)
-- [Type-Level Programming in Rust](capabilities/type_level_programming.md)
+* [Type-Level Capabilities](kernel/the-approach/capabilities.md)
-- [Zero-Cost Capabilities](capabilities/zero_cost_capabilities.md)
+* [Development Status and Roadmap](kernel/status-and-roadmap.md)
-- [CapComp: Zero-Cost Capabilities and Components](capabilities/capcomp.md)
+* [Linux Compatibility](kernel/linux/README.md)
-- [Trustworthy Containers]()
+* [File Systems](kernel/linux/file_systems.md)
-- [TEEs as Top-Tier Targets]()
+* [Networking](kernel/linux/network.md)
-- [Fast User-Mode Development]()
+* [Boot protocols](kernel/linux/boot.md)
# Asterinas Framework
* [An Overview of Framework APIs](framework/README.md)
* [Writing a Kernel in 100 Lines of Safe Rust](framework/an-100-line-example.md)
# Asterinas OSDK
* [OSDK User Guide](osdk/guide/README.md)
* [Why OSDK](osdk/guide/why.md)
* [Creating an OS Project](osdk/guide/create-project.md)
* [Testing or Running an OS Project](osdk/guide/run-project.md)
* [Working in a Workspace](osdk/guide/work-in-workspace.md)
* [OSDK User Reference](osdk/reference/README.md)
* [Commands](osdk/reference/commands/README.md)
* [cargo osdk new](osdk/reference/commands/new.md)
* [cargo osdk build](osdk/reference/commands/build.md)
* [cargo osdk run](osdk/reference/commands/run.md)
* [cargo osdk test](osdk/reference/commands/test.md)
* [Manifest](osdk/reference/manifest.md)
# How to Contribute
* [Before You Contribute](to-contribute/README.md)
* [Code Organization](to-contribute/code-organization.md)
* [Style Guidelines](to-contribute/style-guidelines/README.md)
* [General Guidelines](to-contribute/style-guidelines/general-guidelines.md)
* [Rust Guidelines](to-contribute/style-guidelines/rust-guidelines.md)
* [Git Guidelines](to-contribute/style-guidelines/git-guidelines.md)
* [Community](to-contribute/community.md)
* [Code of Conduct](to-contribute/code-of-conduct.md)
# Request for Comments (RFC)
* [RFC Overview](rfcs/README.md)
* [RFC-0001: RFC Process](rfcs/0001-rfc-process.md)
* [RFC-0002: Operating System Development Kit (OSDK)](rfcs/0002-osdk.md)

View File

@ -1,38 +0,0 @@
# Everything is a Capability
> A capability is a token, ticket, or key that gives the possessor permission to access an entity or object in a computer system. ---Dennis and Van Horn of MIT, 1966
Capabilities are a classic approach to security and access control in OSes,
especially microkernels. For example, capabilities are known as handles in [Zircon](https://fuchsia.dev/fuchsia-src/concepts/kernel). From the users' perspective, a handle is just an ID. But inside the kernel, a handle is a C++ object that contains three logical fields:
* A reference to a kernel object;
* The rights to the kernel object;
* The process it is bound to (or if it's bound to the kernel).
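To make this concrete, here is a minimal Rust sketch of such a handle. The names and field layout are hypothetical and only for illustration; this is not Zircon's actual code.
```rust
use std::sync::Arc;

/// The access rights associated with a handle (a bitmask in real kernels).
#[derive(Clone, Copy)]
pub struct Rights(pub u32);

/// The ID of the process that owns a handle.
pub type ProcessId = u64;

/// A handle: the kernel-side representation of a capability.
pub struct Handle<O> {
    /// A reference to the kernel object.
    object: Arc<O>,
    /// The rights to the kernel object.
    rights: Rights,
    /// The process the handle is bound to (`None` if bound to the kernel).
    owner: Option<ProcessId>,
}
```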
Capabilities have a few nice properties in terms of security.
* Non-forgeability. New capabilities can only be constructed or derived from existing, valid capabilities. Capabilities cannot be created out of thin air.
* Monotonicity. A new capability cannot have more permissions than the original capability from which the new one is derived.
* Transferability. A capability may be transferred to or borrowed by another user or security domain to grant access to the resource behind the capability.
Existing capability-based systems, e.g., [seL4](https://docs.sel4.systems/Tutorials/capabilities.html), [Zircon](https://fuchsia.dev/fuchsia-src/concepts/kernel/handles), and [WASI](https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-capabilities.md), use
capabilities in a limited fashion, mostly as a means to limit the access from
external users (e.g., via syscall), rather than a mechanism to enforce advanced
security policies internally (e.g., module-level isolation).
So we ask this question: is it possible to use capabilities as a _ubiquitous_ security primitive throughout Asterinas to enhance the security and robustness of the
OS? Specifically, we propose a new principle called "_everything is a capability_".
Here, "everything" refers to any type of OS resource, internal or external alike.
In traditional OSes, treating everything as a capability is unrewarding
because (1) capabilities themselves are unreliable due to memory safety problems,
and (2) capabilities are no free lunch as they incur memory and CPU overheads. But these arguments may no longer stand in a well-designed Rust OS like Asterinas,
where the odds of memory safety bugs are minimized and
advanced Rust features like type-level programming allow us to implement
capabilities as a zero-cost abstraction.
In the rest of this chapter, we first introduce the advanced Rust technique
of [type-level programming (TLP)](type_level_programming.md) and then describe how we leverage TLP as well as
other Rust features to [implement zero-cost capabilities](zero_cost_capabilities.md).
The ideas described above were originally explored in an internal project of ours
called [CapComp](capcomp.md).

View File

@ -1,403 +0,0 @@
# CapComp: Towards a Zero-Cost Capability-Based Component System for Rust
## Overview
**CapComp** is a zero-cost capability-based component system for Rust. With CapComp, Rust developers can now build _component-based systems_ that follow the _principle of least privilege_, which is effective in improving system security and stability. CapComp can also make formal verification of Rust systems more feasible since it reduces the Trusted Computing Base (TCB) for a specific set of properties. We believe CapComp is useful for building complex systems of high security or stability requirements, e.g., operating systems.
CapComp features a novel _zero-cost access control_ mechanism. Unlike the traditional approach to capability-based access control (e.g., seL4), CapComp enforces access control at compile time: no runtime check for access rights, no hardware isolation overheads, or extra runtime overheads of Inter-Procedural Calls (IPC). To achieve this, CapComp has realized the full potential of the Rust language by making a joint and clever use of new type primitives, type-level programming, procedural macros, and program analysis techniques.
CapComp is designed to be _pragmatic_. To enjoy the benefits of CapComp, developers need to design new systems with CapComp in mind or refactor existing systems accordingly. But these efforts are minimized thanks to the intuitive and user-friendly APIs provided by CapComp. Furthermore, CapComp supports incremental adoption: it is totally ok to protect only selected parts of a system with CapComp. Lastly, CapComp does not require changes to the Rust language or compiler.
## Motivating examples
To illustrate the motivation of CapComp, let's first examine two examples. While both examples originate in the context of OS, the motivations behind them make sense in other contexts.
### Example 1: Capabilities in a microkernel-based OS
A capability is a communicable, unforgeable token of authority. Conceptually, a capability is a reference to an object along with an associated set of access rights. Capabilities are chosen to be the core primitive that underpins the security of many systems, most notably, several microkernels like seL4, zircon, and Capsicum.
In Rust, the most straightforward way to represent a capability is like this (as is done in zCore, a Rust rewrite of zircon). In other languages like C and C++, the representation is similar.
```rust
/// A handle = a capability.
pub struct Handle<O> {
/// The kernel object referred to by the handle.
pub object: Arc<O>,
/// The access rights associated with the handle.
pub rights: Rights,
}
```
The problem with this approach is three-fold. First, a capability (i.e., `Handle`) consumes more memory than a raw pointer. Second, enforcing access control according to the associated rights requires runtime checking. Third, due to the overhead of runtime checking, users of `Handle` have the incentive to skip the runtime checking whenever possible and use the object directly. Thus, we cannot rule out the odds that some buggy code fails to enforce the access control, leading to security loopholes.
So, here is our question: **is it possible to implement capabilities in Rust without runtime overheads and (potential) security loopholes?**
### Example 2: Key retention in a monolithic OS kernel
Let's consider a key retention subsystem named `keyring` in a monolithic kernel (such a subsystem exists in the Linux kernel). The subsystem serves as a secure and centralized place to manage secrets used in the OS kernel. The users of the `keyring` subsystem include both user programs and kernel services.
Here is an (overly-)simplified version of `keyring` in Rust.
```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use lazy_static::lazy_static;

/// A key is some secret data.
pub struct Key(Vec<u8>);
impl Key {
    pub fn new(bytes: &[u8]) -> Self {
        Self(Vec::from(bytes))
    }
    pub fn get(&self) -> &[u8] {
        self.0.as_slice()
    }
    pub fn set(&mut self, bytes: &[u8]) {
        self.0.clear();
        self.0.extend_from_slice(bytes);
    }
}
/// Store a key with a name.
pub fn insert_key(name: &str, key: Key) {
    KEYS.lock().unwrap().insert(name.to_string(), Arc::new(key));
}
/// Get a key by a name.
pub fn get_key(name: &str) -> Option<Arc<Key>> {
    KEYS.lock()
        .unwrap()
        .get(name)
        .cloned()
}
lazy_static! {
    static ref KEYS: Mutex<HashMap<String, Arc<Key>>> = {
        Mutex::new(HashMap::new())
    };
}
```
While any component (or module/subsystem/driver) in a monolithic kernel is part of the TCB and thus trusted, it is still highly desirable to constrain access to keys to only the components that have a legitimate reason to use them. This minimizes the odds of misusing or leaking the highly sensitive keys maintained by `keyring`.
Our question is: **how can we enforce inter-component access control (as demanded here by `keyring`) in a complex system like a monolithic kernel, without runtime overheads?**
## Core concepts
Before demonstrating how we approach the above problems in CapComp, we need to first introduce some core concepts in CapComp.
In CapComp, **components** are Rust crates that are governed by our compile-time capability-based access control mechanism. Crates that maintain mutable states or have global impacts on a system are good candidates to be protected as components.
Unlike a regular crate, a component can decide how much functionality to expose on a per-component or per-object basis; that is, it grants its users only the bare minimum privileges necessary to do their jobs.
For our purpose of access control, we define the **entry points** of a component as the following types of _public Rust language items_ exposed by a component.
* Structs, enumerations, and unions;
* Functions and associated functions;
* Static items;
* Constant items.
We consider these items as entry points because a user who gets access to such items can directly _execute_ the code within the component. For the same reason, other types of Rust language items (e.g., modules, type aliases, use declarations, traits, etc.) are not considered entry points: they either do not involve direct code execution (like modules) or only involve _indirect_ code execution (like type aliases, use declarations, and traits) via other entry points.
But not all entry points are created equal. Some are actually safe to use arbitrarily without jeopardizing other parts of the system, while others are considered **privileged** in the sense that _uncontrolled_ access to such entry points may have undesirable impacts on the system. Entry points that are not privileged are **unprivileged**. CapComp has a set of built-in rules to identify some common patterns of naive or harmless code that can safely be considered unprivileged. Meanwhile, developers can decide which entry points they believe are unprivileged and point them out to CapComp. The rest of the entry points are thus the privileged ones. Put another way, all entry points are considered privileged by default; unprivileged ones are the exceptions. Privileged entry points are the targets of the access control mechanism of CapComp.
With the concepts explained, we can now articulate the three main properties of CapComp.
* **Object-level access control.** All public methods of an object behind a capability are access controlled.
* **Component-level access control.** All privileged entry points of a component are access controlled.
* **Mandatory access control.** No user of a component can bypass the access control without using the `unsafe` keyword.
These properties are mostly enforced using compile-time techniques, thus incurring zero runtime overheads.
## Usage
To give you a concrete idea of CapComp, we revisit the two motivating examples and show how CapComp can meet their demands.
### Example 1: Capabilities in a microkernel-based OS
One fundamental primitive provided by CapComp is capabilities. The type that represents a capability is `Cap<O, R>`, where `O` is the type of the capability's associated object and `R` is the type of the capability's associated rights.
```rust
// File: cap_comp/cap.rs
use core::marker::PhantomData;
/// A capability.
#[repr(transparent)]
pub struct Cap<O: ?Sized, R> {
rights: PhantomData<R>,
object: O,
}
impl<O, R> Cap<O, R> {
pub fn new(object: O) -> Self {
Self {
rights: PhantomData,
object,
}
}
}
```
Notice how the Rust representation of a capability in CapComp differs from the traditional approach. The key difference is that CapComp represents rights in _types_, not values. This enables CapComp to enforce access control at compile time.
So how do users use CapComp's capabilities? Imagine you are developing a kernel object named `Endpoint`, which can be used to send and receive bytes between threads or processes.
```rust
// File: user_project/endpoint.rs
pub struct Endpoint;
impl Endpoint {
pub fn recv(&self) -> u8 {
todo!("impl recv")
}
pub fn send(&self, _byte: u8) {
todo!("impl send")
}
}
```
You want to ensure that the sender side of an `Endpoint` (could be a process or thread) can only use it to send bytes, while the receiver side can only receive bytes. We can do so easily with CapComp.
First, add the `cap_comp` crate as a dependency in your project's `Cargo.toml`.
```toml
[dependencies]
cap_comp = { version = "0.1", features = ["read_right", "write_right"] }
```
The `features` field specifies which kinds of access rights your project finds interesting. There is a rich set of general-purpose access rights built into CapComp, and you can select those that make sense for your project. (We may find a way to support customized access rights in the future.)
Then, we can refactor the code of `Endpoint` to leverage CapComp to protect `Endpoint` with capabilities.
```rust
// File: user_project/endpoint.rs
use cap_comp::{
rights::{Read, Write},
impl_cap, require,
Cap, Rights,
};
pub struct Endpoint;
// Instruct CapComp to implement the right set of methods for
// `Cap<Endpoint, R>`, depending upon rights specified by `R`.
#[impl_cap]
impl Endpoint {
// Implement the recv method for `Cap<Endpoint, R: Read>`.
#[require(rights = [Read])]
pub fn recv(&self) -> u8 {
todo!("impl recv")
}
// Implement the send method for `Cap<Endpoint, R: Write>`.
#[require(rights = [Write])]
pub fn send(&self, _byte: u8) {
todo!("impl send")
}
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn recv() {
// Create an endpoint capability with the write right.
let endpoint_cap: Cap<Endpoint, Rights![Write]> = Cap::new(Endpoint);
endpoint_cap.send(0);
//endpoint_cap.recv(); // compiler error!
}
#[test]
fn send() {
// Create an endpoint capability with the read right.
let endpoint_cap: Cap<Endpoint, Rights![Read]> = Cap::new(Endpoint);
endpoint_cap.recv();
//endpoint_cap.send(0); // compiler error!
}
}
```
The above code shows the most basic usage of capabilities and rights in CapComp and CapComp's ability to enforce object-level access control at compile time.
### Example 2: Key retention in a monolithic OS kernel
#### What are tokens?
Just like the methods of an object can be access-controlled with capabilities, the entry points of a component can be access-controlled with tokens. In CapComp, a **token** is owned by a component or a module inside a component; such a component or module is called the **token owner**. A token represents the access rights of the token owner to other components. So by presenting its token, a token owner can prove to a recipient component which access rights it possesses; and based on the token, the recipient component can decide whether to allow a specific operation.
Anywhere inside a component, a user can get access to the token that belongs to the current textual scope via `Token!()`, a special macro provided by CapComp. The macro returns a _zero-sized_ Rust object of type `T: Token`, where `Token` is a marker trait for all token types. The type `T` encodes the access rights via type-level programming techniques. Thus, transferring and checking tokens can be done entirely at compile time, without incurring any runtime cost.
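To give a rough idea of the mechanism, here is a minimal sketch of what a zero-sized token might look like. The names (`ScopedToken`, the explicit constructor) are hypothetical; in the real design, only the `Token!()` macro would mint tokens.
```rust
use core::marker::PhantomData;

/// A marker trait implemented by all token types.
pub trait Token {}

/// A token whose access rights are encoded entirely in the type `R`,
/// so the token itself occupies no memory at runtime.
pub struct ScopedToken<R>(PhantomData<R>);

impl<R> Token for ScopedToken<R> {}

impl<R> ScopedToken<R> {
    /// For illustration only; the `Token!()` macro would be the sole way
    /// to obtain a token for the current scope.
    pub fn new() -> Self {
        ScopedToken(PhantomData)
    }
}

// The zero-size claim can be verified at compile time.
const _: () = assert!(core::mem::size_of::<ScopedToken<()>>() == 0);
```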
#### Refactor `keyring` with tokens
To make our `keyring` example more realistic, let's consider a system that consists of a `keyring` component and three other components.
* The `keyring` component is responsible for key management.
* The `boot` component parses arguments and configurations (including encryption keys) from users and starts up the entire system.
* The `encrypted_fs` component implements encrypted file I/O, which uses encryption keys.
* The `proc_fs` component implements a Linux-like procfs that facilitates inspecting the status of the system (e.g., all available keys).
To leverage the component-level access control mechanism of CapComp, one needs to provide a configuration file for CapComp, which specifies all components in the target system and their access rights.
```toml
# File: CapComp.toml
# There are four components in total
[components]
keyring = { path = "keyring/" }
boot = { path = "boot/" }
encrypted_fs = { path = "encrypted_fs/" }
proc_fs = { path = "proc_fs/" }
# A centralized place to claim access to the keyring component
[access_to.keyring]
# The boot component has the write right.
boot = { rights = ["Write"] }
# The init module inside the encrypted_fs component needs
# the read and inspect rights
'encrypted_fs.init' = { rights = ["Read", "Inspect"] }
# The proc_fs component has the inspect right
proc_fs = { rights = ["Inspect"] }
```
Now we refactor the `keyring` component with CapComp.
```rust
// File: keyring/lib.rs
use std::sync::Arc;
use cap_comp::{
    load_config, impl_cap, require, unprivileged,
    Cap, Rights, Token,
};
load_config!("../CapComp.toml");
/// A key is some secret data.
// Inform CapComp that this type is unprivileged, which means
// it does not have to be protected by capabilities.
#[unprivileged]
pub struct Key {
name: String,
payload: Vec<u8>,
}
#[impl_cap]
impl Key {
pub fn new(name: &str, bytes: &[u8]) -> Self {
Self {
name: name.to_string(),
payload: Vec::from(bytes),
}
}
#[require(rights = [Inspect])]
pub fn name(&self) -> &str {
&self.name
}
#[require(rights = [Read])]
pub fn payload(&self) -> &Vec<u8> {
&self.payload
}
#[require(rights = [Write])]
pub fn payload_mut(&mut self) -> &mut Vec<u8> {
&mut self.payload
}
}
// A component can access this operation only if it has a token
// with the write right.
#[require(bounds = { Tk: Write })]
pub fn insert_key<Tk>(key: Key, token: &Tk) {
todo!("omitted...")
}
// A component can access this operation only if it has a token
// with the read right.
#[require(bounds = { Tk: Read })]
pub fn find_key<Tk>(name: &str, token: &Tk) -> Option<Arc<Cap<Key, Rights![Read]>>> {
todo!("omitted...")
}
// A component can access this operation only if it has a token
// with the inspect right.
#[require(bounds = { Tk: Inspect })]
pub fn list_all_keys<Tk>(token: &Tk) -> Vec<Arc<Cap<Key, Rights![Inspect]>>> {
todo!("omitted...")
}
```
The `boot` component can insert keys into `keyring` since it has the write right.
```rust
// File: boot/lib.rs
use cap_comp::{load_config, Token};
use keyring::{insert_key, Key};
load_config!("../CapComp.toml");
fn parse_cli_arguments(args: &[&str]) {
    let key = parse_encryption_key(args);
    insert_key(key, &Token!());
}
fn parse_encryption_key(args: &[&str]) -> Key {
    todo!("omitted...")
}
```
The `encrypted_fs` component can fetch and use its encryption key from `keyring` since it has the read right.
```rust
// File: encrypted_fs/lib.rs
cap_comp::load_config!("../CapComp.toml");
```
```rust
// File: encrypted_fs/init.rs
use cap_comp::Token;
use keyring::find_key;
pub fn init_fs() {
    let key = find_key("encryption_key", &Token!());
    // use the key to init fs...
}
```
For debugging purposes, the `proc_fs` component is allowed to list the metadata of keys since it has the inspect right, but not to read or update keys.
```rust
// File: proc_fs/lib.rs
use cap_comp::{load_config, Token};
use keyring::list_all_keys;
load_config!("../CapComp.toml");
pub fn list_all_key_names() -> Vec<String> {
    let keys = list_all_keys(&Token!());
    keys.iter().map(|key| key.name().to_string()).collect()
}
```
As you can see, the CapComp-powered system has greatly narrowed the scope of code that has access to keys. In fact, the scope is clearly defined in `CapComp.toml`. This greatly reduces the odds of misusing or leaking keys and facilitates security auditing for the codebase.
## Prototype
We have implemented a prototype of CapComp that can support the two examples above. More specifically, the prototype demonstrates that the following key design points of CapComp are feasible.
* Using Rust's type-level programming to encode access rights with types and enforce access control at compile time (`Cap<O, R>`, `Rights!`, `Token!`).
* Using Rust's procedural macros to minimize the boilerplate code required (`#[impl_cap]`).
* Using program analysis techniques (a lint pass) to check whether all privileged entry points of a component are access-controlled (`#[unprivileged]`).
## Discussions
* Figure out the killer apps of CapComp
* How to demonstrate the security benefits brought by CapComp?
* Is it too complex for developers to write Rust code with CapComp?
* What is it like to build a component-based OS with CapComp?
* Can we formally verify the security guarantees of CapComp?

View File

@ -1,742 +0,0 @@
# Type-Level Programming (TLP) in Rust
## What is TLP?
TLP is, in short, _computation over types_, where
* **Types** are used as *values*
* **Generic parameters** are used as *variables*
* **Trait bounds** are used as *types*
* **Traits with associated types** are used as *functions*
Let's see some examples of Rust TLP crates.
Example 1: TLP integers.
```rust
use typenum::{Sum, Exp, Integer, N2, P3, P4};
type X = Sum<P3, P4>;
assert_eq!(<X as Integer>::to_i32(), 7);
type Y = Exp<N2, P3>;
assert_eq!(<Y as Integer>::to_i32(), -8);
```
Example 2: TLP lists.
```rust
use type_freak::{TListType, list::*};
type List1 = TListType![u8, u16, u32];
type List2 = LPrepend<List1, u64>;
// List2 ~= TListType![u64, u8, u16, u32]
```
In this document, we will first explain how TLP works in Rust and
then demonstrate the value of TLP with a compelling application---
implementing zero-cost capabilities!
## How Does TLP Work?
### Case Study: TLP bools
Let's define type-level bools.
```rust
/// A marker trait for type-level bools.
pub trait Bool {}
impl Bool for True {}
impl Bool for False {}
/// Type-level "true".
pub struct True;
/// Type-level "false".
pub struct False;
```
Let's compute over type-level bools.
```rust
/// A trait operator for logical negation on type-level bools.
pub trait Not {
type Output;
}
impl Not for True {
type Output = False;
}
impl Not for False {
type Output = True;
}
/// A type alias to make using the `Not` operator easier.
pub type NotOp<B> = <B as Not>::Output;
```
Let's test the type-level bools.
```rust
use crate::bool::{True, False, NotOp};
use crate::assert_type_same;
#[test]
fn test_not_op() {
assert_type_same!(True, NotOp<False>);
assert_type_same!(False, NotOp<True>);
}
```
Let's define more operations for type-level bools.
```rust
/// A trait operator for logical and on type-level bools.
pub trait And<B: Bool> {
type Output;
}
impl<B: Bool> And<B> for True {
type Output = B;
}
impl<B: Bool> And<B> for False {
type Output = False;
}
/// A type alias to make using the `And` operator easier.
pub type AndOp<B0, B1> = <B0 as And<B1>>::Output;
```
Let's test the and operation.
```rust
#[test]
fn test_and_op() {
assert_type_same!(AndOp<True, True>, True);
assert_type_same!(AndOp<True, False>, False);
assert_type_same!(AndOp<False, True>, False);
assert_type_same!(AndOp<False, False>, False);
}
```
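For completeness, the `Or` operator (used later by `SetContainOp`) can be defined analogously to `And`; a minimal sketch:
```rust
/// A trait operator for logical or on type-level bools.
pub trait Or<B: Bool> {
    type Output;
}
impl<B: Bool> Or<B> for True {
    type Output = True;
}
impl<B: Bool> Or<B> for False {
    type Output = B;
}
/// A type alias to make using the `Or` operator easier.
pub type OrOp<B0, B1> = <B0 as Or<B1>>::Output;
```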
### Mnemonic for TLP functions
#### Defining a function
```rust
impl SelfType {
pub fn method(&self, arg0: Type0, arg1: Type1, /* ... */) -> RetType {
/* function body */
}
}
```
v.s.
```rust
impl<Arg0, Arg1, /* ... */> Method<Arg0, Arg1, /* ... */> for SelfType
where
Arg0: ArgBound0,
Arg1: ArgBound1,
/* ... */
{
type Output/*: RetBound */ = /* function body */;
}
```
Traits like `Method` are called _trait operators_.
#### Calling a function
```rust
object.method(arg0, arg1, /* ... */);
```
v.s.
```rust
<Object as Method<Arg0, Arg1, /* ... */>>::Output
```
---
```rust
ObjectType::method(&object, arg0, arg1, /* ... */);
```
v.s.
```rust
MethodOp<Object, Arg0, Arg1, /* ... */>
```
### TLP equality assertions
This is what we want.
```rust
#[test]
fn test_assert() {
assert_type_same!(u16, u16);
assert_type_same!((), ());
// assert_type_same!(u16, bool); // Compiler error!
}
```
Here is how it is implemented.
```rust
/// A trait that is intended to check if two types are the same in trait bounds.
pub trait IsSameAs<T> {
type Output;
}
impl<T> IsSameAs<T> for T {
type Output = ();
}
```
```rust
pub type AssertTypeSame<Lhs, Rhs> = <Lhs as IsSameAs<Rhs>>::Output;
#[macro_export]
macro_rules! assert_type_same {
($lhs:ty, $rhs:ty) => {
const _: AssertTypeSame<$lhs, $rhs> = ();
};
}
```
### TLP equality checks
The `SameAsOp` (and `SameAs`) is a very useful primitive.
```rust
#[test]
fn test_same_as() {
assert_type_same!(SameAsOp<True, True>, True);
assert_type_same!(SameAsOp<False, True>, False);
assert_type_same!(SameAsOp<True, NotOp<False>>, True);
}
```
This is how it gets implemented.
```rust
/// A trait operator to check if two types are the same, returning a Bool.
pub trait SameAs<T> {
type Output: Bool;
}
pub type SameAsOp<T, U> = <T as SameAs<U>>::Output;
```
```rust
impl SameAs<True> for True {
type Output = True;
}
impl SameAs<False> for True {
type Output = False;
}
impl SameAs<True> for False {
type Output = False;
}
impl SameAs<False> for False {
type Output = True;
}
```
This can be simplified in the future (through the unstabilized `specialization` feature):
```rust
#![feature(specialization)]
impl<T> SameAs<T> for True {
default type Output = False;
}
impl SameAs<True> for True {
type Output = True;
}
impl<T> SameAs<T> for False {
default type Output = False;
}
impl SameAs<False> for False {
type Output = True;
}
```
### TLP control flow
#### Conditions
```rust
#[test]
fn test_if() {
assert_type_same!(IfOp<True, u32, ()>, u32);
assert_type_same!(IfOp<False, (), bool>, bool);
}
```
```rust
pub trait If<Cond: Bool, T1, T2> {
type Output;
}
impl<T1, T2> If<True, T1, T2> for () {
type Output = T1;
}
impl<T1, T2> If<False, T1, T2> for () {
type Output = T2;
}
pub type IfOp<Cond, T1, T2> = <() as If<Cond, T1, T2>>::Output;
```
#### Loops
You don't write loops in TLP; you write _recursive_ functions.
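Instead, recursion over an inductive type structure plays the role of a loop. As a minimal illustration (assuming the `assert_type_same!` macro defined earlier is in scope, and using hypothetical Peano-style number types), here is type-level addition, where the `Succ<N>` impl recursively reduces the problem:
```rust
use core::marker::PhantomData;

/// Type-level zero.
pub struct Zero;
/// Type-level successor: `Succ<N>` represents N + 1.
pub struct Succ<N>(PhantomData<N>);

/// A trait operator for addition over type-level numbers.
pub trait Add<M> {
    type Output;
}
// Base case: 0 + M = M.
impl<M> Add<M> for Zero {
    type Output = M;
}
// Recursive case: (N + 1) + M = (N + M) + 1.
impl<N: Add<M>, M> Add<M> for Succ<N> {
    type Output = Succ<AddOp<N, M>>;
}
pub type AddOp<N, M> = <N as Add<M>>::Output;

type One = Succ<Zero>;
type Two = Succ<One>;
type Three = Succ<Two>;

#[test]
fn test_add_op() {
    // 2 + 1 = 3, checked entirely at compile time.
    assert_type_same!(AddOp<Two, One>, Three);
}
```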
### TLP collections
Take TLP sets as an example.
```rust
use core::marker::PhantomData;
/// A marker trait for type-level sets.
pub trait Set {}
/// A non-empty type-level set.
pub struct Cons<T, S: Set>(PhantomData<(T, S)>);
/// An empty type-level set.
pub struct Nil;
impl<T, S: Set> Set for Cons<T, S> {}
impl Set for Nil {}
```
How to write a _contain_ operator?
```rust
#[test]
fn test_set_contain() {
struct A;
struct B;
struct C;
struct D;
type AbcSet = Cons<A, Cons<B, Cons<C, Nil>>>;
assert_type_same!(SetContainOp<AbcSet, A>, True);
assert_type_same!(SetContainOp<AbcSet, B>, True);
assert_type_same!(SetContainOp<AbcSet, C>, True);
assert_type_same!(SetContainOp<AbcSet, D>, False);
}
```
We can implement the operator with recursion!
```rust
/// A trait operator to check if `T` is a member of a type-level set.
pub trait SetContain<T> {
type Output;
}
impl<T> SetContain<T> for Nil {
type Output = False;
}
impl<T, U, S> SetContain<T> for Cons<U, S>
where
S: Set,
U: SameAs<T>,
S: SetContain<T>,
SameAsOp<U, T>: Or<SetContainOp<S, T>>,
{
type Output = OrOp<SameAsOp<U, T>, SetContainOp<S, T>>;
}
pub type SetContainOp<Set, Item> = <Set as SetContain<Item>>::Output;
```
Note: this requires implementing `SameAs` for all pairs of possible item types (e.g., among `A` through `D`); see the sketch below.
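For instance, assuming `A` through `D` are the item types, the required impls would look like the following sketch. The boilerplate is quadratic in the number of types; the `specialization` trick shown earlier could cut it down.
```rust
impl SameAs<A> for A { type Output = True; }
impl SameAs<B> for A { type Output = False; }
impl SameAs<C> for A { type Output = False; }
impl SameAs<D> for A { type Output = False; }
// ...and the analogous impls with B, C, and D as the `Self` type.
```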
## An Application of TLP
### Capabilities in Rust
A simplified example from zCore:
```rust
pub struct Handle<O> {
/// The object referred to by the handle.
pub object: Arc<O>,
/// The handle's associated rights.
pub rights: Rights,
}
```
The limitations:
* CPU cost: checking rights costs CPU cycles;
* Memory cost: storing access rights costs memory space;
* Security weakness: it is easy to bypass access right checks internally.
### Our idea: _zero-cost_ capabilities
Encode the access rights in the type parameter `R`:
```rust
pub struct Handle<O, R> {
/// The object referred to by the handle.
object: Arc<O>,
/// The handle's associated rights.
rights: PhantomData<R>,
}
```
*Compile-time* access control with trait bounds on `R`.
```rust
use crate::my_dummy_channel::{Channel, Msg};
use crate::rights::{Read, Write};
impl<R: Read> Handle<Channel, R> {
pub fn read(&self) -> Msg {
self.object.read()
}
}
impl<R: Write> Handle<Channel, R> {
pub fn write(&self, msg: Msg) {
self.object.write(msg)
}
}
```
**--> Requirement 1. Construct a set of types that _satisfies all possible combinations of the trait bounds_.**
```rust
use crate::rights::{Rights, RightsIsSuperSetOf};
impl<O, R: Rights> Handle<O, R> {
/// Create a new handle for the given object.
pub fn new(object: O) -> Self {
Self {
object: Arc::new(object),
rights: PhantomData,
}
}
/// Create a duplicate of the handle, referring to the same object.
///
/// The duplicate is guaranteed---by the compiler---to have the same or
/// lesser rights.
pub fn duplicate<R1>(&self) -> Handle<O, R1>
where
R1: Rights,
R: RightsIsSuperSetOf<R1>,
{
Handle {
object: self.object.clone(),
rights: PhantomData,
}
}
}
```
**--> Requirement 2. Implement the `RightsIsSuperSetOf<Sub>` trait for all `Super`, where _the rights represented by the `Super` type is a superset of the rights represented by the `Sub` type_.**
(Question: why won't a naive implementation work?)
### Our solution: the TLP-powered `trait_flags` crate
Our `trait_flags` crate is similar to `bitflags`, but it uses traits (or types) as flags, instead of bits.
```rust
/// Define a set of traits that represent different rights and
/// a macro that outputs types with the desired combination of rights.
trait_flags! {
trait Rights {
Read,
Write,
// many more...
}
}
```
```rust
#[test]
fn define_rights() {
type MyNoneRights = Rights![];
type MyReadRights = Rights![Read];
type MyWriteRights = Rights![Write];
type MyReadWriteRights = Rights![Read, Write];
}
```
```rust
#[test]
fn has_rights() {
fn has_read_right<T: Read>() {}
fn has_write_right<T: Write>() {}
has_read_right::<Rights![Read]>();
has_read_right::<Rights![Read, Write]>();
//has_read_right::<Rights![]>(); // Compiler error!
has_write_right::<Rights![Write]>();
has_write_right::<Rights![Read, Write]>();
//has_write_right::<Rights![]>(); // Compiler error!
}
```
```rust
#[test]
fn rights_is_superset_of() {
fn is_superset_of<Super: Rights + RightsIsSuperSetOf<Sub>, Sub: Rights>() {}
is_superset_of::<Rights![Read], Rights![]>();
is_superset_of::<Rights![Write], Rights![]>();
is_superset_of::<Rights![Read], Rights![Read]>();
is_superset_of::<Rights![Read, Write], Rights![Write]>();
is_superset_of::<Rights![Read, Write], Rights![Write]>();
//is_superset_of::<Rights![Read], Rights![Write]>(); // Compiler error!
//is_superset_of::<Rights![], Rights![Read]>(); // Compiler error!
}
```
### How Does `trait_flags` Work?
Let's walk through the Rust code generated by the `trait_flags` macro. Without loss of generality, we assume that the input given to the macro consists of only two rights: the read and write rights.
#### For requirement 1
**Step 1.** Generate a set of the types that can represent all possible combinations of rights.
```rust
// Marker traits
pub trait Rights {}
pub trait Read: Rights {}
pub trait Write: Rights {}
pub struct RightSet<B0, B1>(
PhantomData<B0>, // If B0 == True, then the set contains the read right
PhantomData<B1>, // If B1 == True, then the set contains the write right
);
```
**Step 2.** Implement the `Rights`, `Read`, and `Write` marker traits for the generated `RightSet` types.
```rust
impl<B0, B1> Rights for RightSet<B0, B1> {}
impl<B1> Read for RightSet<True, B1> {}
impl<B0> Write for RightSet<B0, True> {}
```
#### For requirement 2
**Step 1.** Reduce the problem of implementing the `RightsIsSuperSetOf` trait to implementing a trait operator named `RightsIncludeOp`.
```rust
/// A marker trait that marks any pairs of `Super: Rights` and `Sub: Rights` types,
/// where the rights of `Super` is a superset of that of `Sub`.
pub trait RightsIsSuperSetOf<R: Rights> {}
impl<Super, Sub> RightsIsSuperSetOf<Sub> for Super
where
Super: Rights + RightsInclude<Sub>,
Sub: Rights,
True: IsSameAs<RightsIncludeOp<Super, Sub>>,
{
}
/// A type alias for `RightsInclude`.
pub type RightsIncludeOp<Super, Sub> = <Super as RightsInclude<Sub>>::Output;
/// A trait operator that "calculates" if `Super: Rights` is a superset of `Sub: Rights`.
/// If yes, the result is `True`; otherwise, the result is `False`.
pub trait RightsInclude<Sub: Rights> {
type Output;
}
```
**Step 2.** Implement `RightsIncludeOp`.
Here is a simplified version.
```rust
impl<Super0, Super1, Sub0, Sub1> RightsInclude<RightSet<Sub0, Sub1>>
for RightSet<Super0, Super1>
where
Super0: Bool,
Super1: Bool,
Sub0: Bool,
Sub1: Bool,
BoolGreaterOrEqual<Super0, Sub0>: And<BoolGreaterOrEqual<Super1, Sub1>>,
// For brevity, we omit many less important trait bounds...
{
type Output = AndOp<BoolGreaterOrEqual<Super0, Sub0>, BoolGreaterOrEqual<Super1, Sub1>>;
}
type BoolGreaterOrEqual<B0, B1> = NotOp<AndOp<SameAsOp<B0, False>, SameAsOp<B1, True>>>;
```
### An alternative implementation of `trait_flags`
Let's walk through the Rust code generated by this alternative implementation of `trait_flags`, which is based on the variable-length TLP sets that we described in part 1 of this talk. Again, we assume the input to the macro is only the read and write rights.
#### For requirement 1
**Step 1.** Generate a set of the types that can represent all possible combinations of rights.
```rust
use crate::set::{Set, Cons, Nil};
pub trait Rights {}
pub trait Read: Rights {}
pub trait Write: Rights {}
impl<S: Set + Rights> Rights for Cons<ReadRight, S> {}
impl<S: Set + Rights> Rights for Cons<WriteRight, S> {}
impl Rights for Nil {}
pub struct ReadRight;
pub struct WriteRight;
```
**Step 2.** Define the `SameAs` relationship between right types.
```rust
impl SameAs<ReadRight> for ReadRight {
type Output = True;
}
impl SameAs<WriteRight> for ReadRight {
type Output = False;
}
impl SameAs<ReadRight> for WriteRight {
type Output = False;
}
impl SameAs<WriteRight> for WriteRight {
type Output = True;
}
```
The above code is the only part that results in a complexity greater than O(N), where N is the number of types. In the future, it can be reduced to O(N) complexity with the help of an unstable feature named _specialization_.
```rust
#![feature(specialization)]
impl<T> SameAs<T> for ReadRight {
default type Output = False;
}
impl SameAs<ReadRight> for ReadRight {
type Output = True;
}
impl<T> SameAs<T> for WriteRight {
default type Output = False;
}
impl SameAs<WriteRight> for WriteRight {
type Output = True;
}
```
**Step 3.** Implement the marker traits for _right_ types.
```rust
impl<R> Read for R
where
R: Rights + SetContain<ReadRight>,
True: IsSameAs<SetContainOp<R, ReadRight>>,
{
}
impl<R> Write for R
where
R: Rights + SetContain<WriteRight>,
True: IsSameAs<SetContainOp<R, WriteRight>>,
{
}
```
Note that we have implemented the `SetContain` and `SetContainOp` trait operators in part 1.
#### For requirement 2
**Step 1.** Reduce the problem of implementing the `RightsIsSuperSetOf` trait to implementing a trait operator named `RightsIncludeOp`, which in turn can be implemented with `SetIncludeOp`.
```rust
/// A marker trait that marks any pairs of `Super: Rights` and `Sub: Rights` types,
/// where the rights of `Super` is a superset of that of `Sub`.
pub trait RightsIsSuperSetOf<R: Rights> {}
impl<Super, Sub> RightsIsSuperSetOf<Sub> for Super
where
Super: Rights + RightsInclude<Sub>,
Sub: Rights,
True: IsSameAs<RightsIncludeOp<Super, Sub>>,
{
}
use crate::set::{SetInclude as RightsInclude, SetIncludeOp as RightsIncludeOp};
```
**Step 2.** Implement `SetIncludeOp`, which is a trait operator that checks if a set A includes a set B, i.e., the set A is a superset of the set B.
```rust
/// A trait operator to check if a set A includes a set B, i.e., A is a superset of B.
pub trait SetInclude<S> {
type Output;
}
impl SetInclude<Nil> for Nil {
type Output = True;
}
impl<T, S: Set> SetInclude<Cons<T, S>> for Nil {
type Output = False;
}
impl<T, S: Set> SetInclude<Nil> for Cons<T, S> {
type Output = True;
}
impl<SuperT, SuperS, SubT, SubS> SetInclude<Cons<SubT, SubS>> for Cons<SuperT, SuperS>
where
SuperS: Set + SetInclude<SubS> + SetContain<SubT> + SetInclude<SubS>,
SubS: Set,
SuperT: SameAs<SubT>,
// For brevity, we omit some trait bounds...
{
type Output = IfOp<
SameAsOp<SuperT, SubT>, // The if condition
SetIncludeOp<SuperS, SubS>, // The if branch
AndOp< // The else branch
SetContainOp<SuperS, SubT>,
SetIncludeOp<SuperS, SubS>,
>,
>;
}
pub type SetIncludeOp<SuperSet, SubSet> = <SuperSet as SetInclude<SubSet>>::Output;
```
## Wrapup
* Part 1: the magic of TLP
* TLP bools
* TLP equality assertions and checks
* TLP control flow (if and ~~loops~~)
* TLP collections (type sets)
* Part 2: the application of TLP
* Traditional capability-based access control
* Zero-cost capability-based access control
* How to implement `trait_flags` with TLP
* Fixed-length set version (`RightSet<R0, R1, ...>`)
* Variable-length set version (`Cons<R0, Cons<R1, ...>>`)

View File

@ -1 +0,0 @@
# What are Capabilities?

View File

@ -1,566 +0,0 @@
# Zero-Cost Capabilities
To strengthen the security of Asterinas, we aim to implement all kinds of OS resources
as capabilities. As the capabilities are going to be used throughout the OS,
it is highly desirable to minimize their costs. For this purpose,
we want to implement capabilities as a _zero-cost abstraction_.
Zero-cost abstractions, as initially proposed and defined by the creator of C++,
are required to satisfy two criteria:
* What you don't use, you don't pay for;
* What you do use, you couldn't hand-code any better.
## Traditional capabilities are not zero-cost abstractions
Capabilities, when implemented straightforwardly, are not zero-cost
abstractions. Take the following code snippet as an example,
which attempts to implement an RPC primitive named `Channel<T>` as a capability.
```rust
pub struct Channel<T> {
    buf: Arc<Mutex<VecDeque<T>>>,
    rights: Rights,
}
impl<T> Channel<T> {
    pub fn new() -> Self {
        Self {
            buf: Arc::new(Mutex::new(VecDeque::new())),
            rights: Rights::READ | Rights::WRITE | Rights::DUP,
        }
    }
    pub fn push(&self, item: T) -> Result<()> {
        if !self.rights.contains(Rights::WRITE) {
            return Err(EACCESS);
        }
        self.buf.lock().push_back(item);
        Ok(())
    }
    pub fn pop(&self) -> Result<T> {
        if !self.rights.contains(Rights::READ) {
            return Err(EACCESS);
        }
        self.buf.lock()
            .pop_front()
            .ok_or(EAGAIN)
    }
    pub fn dup(&self) -> Result<Self> {
        if !self.rights.contains(Rights::DUP) {
            return Err(EACCESS);
        }
        let dup = Self {
            buf: self.buf.clone(),
            rights: self.rights,
        };
        Ok(dup)
    }
    pub fn restrict(self, right_mask: Rights) -> Self {
        let Self { buf, rights } = self;
        let rights = rights & right_mask;
        Self { buf, rights }
    }
}
```
Such an implementation violates the two criteria for zero-cost abstractions.
To see why, let's consider how a user would use `Channel<T>` to implement `Pipe`
(like UNIX pipes).
```rust
pub fn pipe() -> (PipeWriter, PipeReader) {
    let channel = Channel::new();
    let writer = {
        let writer_channel = channel
            .dup()
            .unwrap()
            .restrict(Rights::WRITE);
        PipeWriter(writer_channel)
    };
    let reader = {
        let reader_channel = channel
            .dup()
            .unwrap()
            .restrict(Rights::READ);
        PipeReader(reader_channel)
    };
    (writer, reader)
}
pub struct PipeWriter(
    // Actually, we know for sure that the channel is write-only.
    // No need to keep permissions inside the channel!
    // But the abstraction prevents us from trimming the needless information.
    Channel<u8>
);
pub struct PipeReader(
    // Same problem as above!
    Channel<u8>
);
impl PipeWriter {
    pub fn write(&self, buf: &[u8]) -> Result<usize> {
        for &byte in buf {
            // Again, we know for sure that the channel is writable.
            // So there is no need to check it every time.
            // But the abstraction prevents us from avoiding the unnecessary check.
            self.0.push(byte)?;
        }
        Ok(buf.len())
    }
}
impl PipeReader {
    pub fn read(&self, buf: &mut [u8]) -> Result<usize> {
        let mut nbytes_read = 0;
        // Same problem as above!
        while let Ok(byte) = self.0.pop() {
            buf[nbytes_read] = byte;
            nbytes_read += 1;
        }
        if nbytes_read > 0 {
            Ok(nbytes_read)
        } else {
            Err(EAGAIN)
        }
    }
}
```
As you can see, the abstraction of `Channel<T>` introduces extra costs,
which would not exist if the same code were written by hand instead of
using the abstraction of `Channel<T>`. So a channel capability is not a
zero-cost abstraction.
## The three types of zero-cost capabilities
Our secret sauce for achieving zero-cost capabilities is based on two observations.
1. The access rights may be encoded in types so that they can be checked at
compile time with some type-level programming tricks. This way,
the memory footprint for representing access rights becomes zero and the
runtime check can also be avoided.
2. There could be different forms of capabilities, each covering a different
usage pattern with minimal overheads. Under such an arrangement,
the access rights are represented in types when the situation permits;
otherwise, they are encoded in values.
With the two observations, we introduce three types of zero-cost capabilities.
* **Dynamic capabilities.** Dynamic capabilities keep access rights in values
like the traditional capabilities shown in the example above. This is the
most flexible one among the three, but it incurs 4-8 bytes of memory footprint
for storing the access rights and must check the access rights at runtime.
* **Static capabilities.** Static capabilities encode access rights in types.
As the access rights can be determined at compile time, zero overhead is
incurred. Static capabilities are useful when the access rights
are known at development time.
* **Static capability references.** A static capability reference is a reference
to a dynamic or static capability plus the associated access rights encoded in types.
A static capability reference may be borrowed from a dynamic capability safely
after checking the access rights. Once the static capability reference is obtained,
it can be used freely without any runtime checks. This enables check-once-use-multiple-times.
Borrowing a static capability reference from a static capability incurs
zero runtime overhead.
The three types of capabilities are summarized in the figure below.
![Three types of zero cost capabilities](../images/three_types_of_zero_cost_capabilities.png)
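To make the third form more concrete, here is a minimal sketch of what a static capability reference boils down to. The `CapRef` name is hypothetical; the channel example at the end of this chapter calls its version `ChannelRef`.
```rust
use core::marker::PhantomData;

/// A borrowed view of a capability-protected object `O`, with the access
/// rights carried purely in the type parameter `R`. It is pointer-sized:
/// the rights take no space and require no runtime checks.
pub struct CapRef<'a, O, R> {
    object: &'a O,
    rights: PhantomData<R>,
}

impl<'a, O, R> CapRef<'a, O, R> {
    /// Which methods can be called through the reference would be gated
    /// by trait bounds on `R`, as shown in the sections below.
    pub fn object(&self) -> &O {
        self.object
    }
}
```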
## Encoding access rights in types
Static capabilities depend on the ability to encode access rights in types.
This section shows how this can be done with type-level programming tricks.
### Introducing the typeflags crate
```rust
//! Type-level flags.
//!
//! The `typeflags` crate can be seen as a type-level implementation
//! of the popular `bitflags` crate.
//!
//! ```rust
//! bitflags! {
//! pub struct Rights: u32 {
//! const READ = 1 << 0;
//! const WRITE = 1 << 1;
//! }
//! }
//! ```
//!
//! The `bitflags` macro generates a struct named `Rights` that
//! has two associated constant values, `Rights::READ` and `Rights::WRITE`,
//! and provides a bunch of methods to operate on the bit flags.
//!
//! The `typeflags` crate provides a macro that adopts a similar syntax.
//! The macro also generates code to represent flags but at the type level.
//!
//! ```rust
//! typeflags! {
//! pub trait RightSet: u32 {
//! struct READ = 1 << 0;
//! struct WRITE = 1 << 1;
//! }
//! }
//! ```
//!
//! The code above generates, among other things, the `RightSet` trait,
//! the `Read` and `Write` structs (which implement the trait), and
//! a macro to construct a type that represents any specified combination
//! of `Read` and `Write` structs.
//!
//! For more examples of the usage, see the unit tests.
/// Generate the code that implements a specified set of type-level flags.
macro_rules! typeflags {
// A toy implementation for the purpose of demonstration only.
//
// The implementation is toy because it hardcodes the input and output.
// What's more important is that it suffers two key limitations.
//
// 1. It has a complexity of O(4^N), where N is the number of bits. In
// this example, N equals to 2. Using type-level programming tricks can
// reduce the complexity to O(N^2), or even O(N).
//
// 2. A declarative macro is not allowed to output another declarative macro.
// I suppose that a procedural macro should be able to do that. If so,
// implementing typeflags as a procedural macro should do the job.
// Otherwise, we need to figure out a way to workaround the limitation.
(
// Hardcode the input
trait RightSet: u32 {
struct Read = 1 << 0;
struct Write = 1 << 1;
}
) => {
// Hardcode the output
pub trait RightSet {
const BITS: u32;
fn new() -> Self;
}
pub struct Empty {}
pub struct Read {}
pub struct Write {}
pub struct ReadWrite {}
impl RightSet for Empty {
const BITS: u32 = 0b00;
fn new() -> Self { Self {} }
}
impl RightSet for Read {
const BITS: u32 = 0b01;
fn new() -> Self { Self {} }
}
impl RightSet for Write {
const BITS: u32 = 0b10;
fn new() -> Self { Self {} }
}
impl RightSet for ReadWrite {
const BITS: u32 = 0b11;
fn new() -> Self { Self {} }
}
pub trait RightSetContains<S> {}
impl RightSetContains<Empty> for Empty {}
impl RightSetContains<Empty> for Read {}
impl RightSetContains<Read> for Read {}
impl RightSetContains<Empty> for Write {}
impl RightSetContains<Write> for Write {}
impl RightSetContains<Empty> for ReadWrite {}
impl RightSetContains<Read> for ReadWrite {}
impl RightSetContains<Write> for ReadWrite {}
impl RightSetContains<ReadWrite> for ReadWrite {}
        // This macro helps construct an arbitrary combination of type flags.
        macro_rules! RightSet {
            () => { Empty };
            (Read) => { Read };
            (Write) => { Write };
            (Read, Write) => { ReadWrite };
            (Write, Read) => { ReadWrite };
        }
}
}
#[cfg(test)]
mod test {
use super::*;
typeflags! {
trait RightSet: u32 {
struct Read = 1 << 0;
struct Write = 1 << 1;
}
}
// Test that the type flags can be constructed through a
// generated macro named RightSet.
type O = RightSet![];
type R = RightSet![Read];
type W = RightSet![Write];
type RW = RightSet![Read, Write];
#[test]
fn new() {
let _o = O::new();
let _r = R::new();
let _w = W::new();
let _rw = RW::new();
}
#[test]
fn to_u32() {
const R_BITS: u32 = 0b00000001;
const W_BITS: u32 = 0b00000010;
assert!(O::BITS == 0);
assert!(R::BITS == R_BITS);
assert!(W::BITS == W_BITS);
assert!(RW::BITS == R_BITS | W_BITS);
}
#[test]
fn contains() {
assert_trait_bound!(O: RightSetContains<O>);
assert_trait_bound!(R: RightSetContains<O>);
assert_trait_bound!(W: RightSetContains<O>);
assert_trait_bound!(RW: RightSetContains<O>);
assert_trait_bound!(R: RightSetContains<R>);
assert_trait_bound!(RW: RightSetContains<R>);
assert_trait_bound!(W: RightSetContains<W>);
assert_trait_bound!(RW: RightSetContains<W>);
assert_trait_bound!(RW: RightSetContains<RW>);
}
}
```
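To get a feel for how the generated `RightSetContains` trait enables compile-time checks, consider the toy snippet below, which assumes the type aliases `R`, `W`, and `RW` from the test module above are in scope.

```rust
// Accepts only right sets that statically contain the Read flag.
fn assert_readable<S: RightSet + RightSetContains<Read>>() {}

fn demo() {
    assert_readable::<R>();     // OK: Read is present.
    assert_readable::<RW>();    // OK: ReadWrite contains Read.
    // assert_readable::<W>();  // Compile-time error: Write lacks Read.
}
```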
### Implement access rights with typeflags
The `aster-rights/lib.rs` file implements access rights.
```rust
//! Access rights.
use typeflags::typeflags;
use bitflags::bitflags;
bitflags! {
pub struct Rights: u32 {
const READ = 1 << 0;
const WRITE = 1 << 1;
const DUP = 1 << 2;
}
}
typeflags! {
pub trait RightSet: u32 {
struct Read = 1 << 0;
struct Write = 1 << 1;
struct Dup = 1 << 2;
}
}
```
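As a quick sanity check that the dynamic and static encodings stay in sync, the hypothetical snippet below assumes that `aster-rights` re-exports the generated marker types and the `RightSet!` macro.

```rust
use aster_rights::{Dup, Read, RightSet, Rights, Write};

// Type aliases for commonly used right combinations.
type ReadOnly = RightSet![Read];
type Full = RightSet![Read, Write, Dup];

fn demo() {
    // The static (type-level) and dynamic (bitflags) encodings agree.
    assert_eq!(<ReadOnly as RightSet>::BITS, Rights::READ.bits());
    assert_eq!(<Full as RightSet>::BITS, Rights::all().bits());
}
```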
The `aster-rights-proc/lib.rs` file implements the `require` procedural macro.
See the channel capability example later for how `require` is used.
```rust
use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn require(_attr: TokenStream, _item: TokenStream) -> TokenStream {
    todo!()
}
```
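Although the body is left as a `todo!()` above, the essence of `require` is to parse the `R > R1` relation from the attribute arguments and append an `R: RightSetContains<R1>` bound to the annotated method. The sketch below, built on the `syn`, `quote`, and `proc-macro2` crates, is an illustrative assumption rather than the actual implementation.

```rust
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, ItemFn};

#[proc_macro_attribute]
pub fn require(attr: TokenStream, item: TokenStream) -> TokenStream {
    // Expect the attribute arguments to look like `R > Write` or `R > R1`.
    let attr_str = attr.to_string();
    let (set, subset) = attr_str
        .split_once('>')
        .expect("expected `SET > SUBSET` in #[require(...)]");
    let set = syn::Ident::new(set.trim(), proc_macro2::Span::call_site());
    let subset = syn::Ident::new(subset.trim(), proc_macro2::Span::call_site());

    // For simplicity, the annotated method is parsed as a free-standing `fn`
    // item, then the bound `SET: RightSetContains<SUBSET>` is appended to
    // its `where` clause.
    let mut func = parse_macro_input!(item as ItemFn);
    func.sig
        .generics
        .make_where_clause()
        .predicates
        .push(syn::parse_quote!(#set: RightSetContains<#subset>));

    quote!(#func).into()
}
```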
## Example: zero-cost channel capabilities
This example shows how the three types of capabilities can be implemented
for channels.
* Dynamic capabilities: `Channel<Rights>`
* Static capabilities: `Channel<R: RightSet>`
* Static capability references: `ChannelRef<'a, R: RightSet>`
```rust
// `Mutex`, `Result`, `EACCES`, and `EAGAIN` come from the kernel's prelude
// and are omitted here for brevity.
use alloc::{collections::VecDeque, sync::Arc};
use core::marker::PhantomData;

use aster_rights::{Dup, Read, RightSet, Rights, Write};
use aster_rights_proc::require;

// The type of the items transferred through a channel. A concrete alias is
// used here for brevity; the real implementation would be generic over it.
type T = u8;

pub struct Channel<R = Rights>(Arc<ChannelInner>, R);

impl<R> Channel<R> {
    pub fn new(rights: R) -> Self {
        Self(Arc::new(ChannelInner::new()), rights)
    }
}

struct ChannelInner {
    buf: Mutex<VecDeque<T>>,
}

impl ChannelInner {
    pub fn new() -> Self {
        Self {
            buf: Mutex::new(VecDeque::new()),
        }
    }

    pub fn push(&self, item: T) {
        self.buf.lock().push_back(item);
    }

    pub fn pop(&self) -> Option<T> {
        self.buf.lock().pop_front()
    }
}

impl Channel<Rights> {
    pub fn push(&self, item: T) -> Result<()> {
        if !self.rights().contains(Rights::WRITE) {
            return Err(EACCES);
        }
        self.0.push(item);
        Ok(())
    }

    pub fn pop(&self) -> Result<T> {
        if !self.rights().contains(Rights::READ) {
            return Err(EACCES);
        }
        self.0.pop().ok_or(EAGAIN)
    }

    pub fn dup(&self) -> Result<Self> {
        if !self.rights().contains(Rights::DUP) {
            return Err(EACCES);
        }
        let dup = Self(self.0.clone(), self.1);
        Ok(dup)
    }

    pub fn rights(&self) -> Rights {
        self.1
    }

    pub fn restrict(mut self, right_mask: Rights) -> Self {
        let new_rights = self.rights() & right_mask;
        self.1 = new_rights;
        self
    }

    pub fn to_static<R>(self) -> Result<Channel<R>>
    where
        R: RightSet,
    {
        let Self(inner, rights) = self;
        if !rights.contains(Rights::from_bits_truncate(R::BITS)) {
            return Err(EACCES);
        }
        let static_self = Channel(inner, R::new());
        Ok(static_self)
    }

    pub fn to_ref<R>(&self) -> Result<ChannelRef<'_, R>>
    where
        R: RightSet,
    {
        if !self.rights().contains(Rights::from_bits_truncate(R::BITS)) {
            return Err(EACCES);
        }
        Ok(ChannelRef(&self.0, PhantomData))
    }
}

impl<R: RightSet> Channel<R> {
    #[require(R > Write)]
    pub fn push(&self, item: T) {
        self.0.push(item);
    }

    #[require(R > Read)]
    pub fn pop(&self) -> Option<T> {
        self.0.pop()
    }

    #[require(R > Dup)]
    pub fn dup(&self) -> Self {
        Self(self.0.clone(), R::new())
    }

    pub fn rights(&self) -> Rights {
        Rights::from_bits_truncate(R::BITS)
    }

    #[require(R > R1)]
    pub fn restrict<R1: RightSet>(self) -> Channel<R1> {
        let Self(inner, _) = self;
        Channel(inner, R1::new())
    }

    pub fn to_dyn(self) -> Channel<Rights> {
        let Self(inner, _) = self;
        Channel(inner, Rights::from_bits_truncate(R::BITS))
    }

    #[require(R > R1)]
    pub fn to_ref<R1: RightSet>(&self) -> ChannelRef<'_, R1> {
        ChannelRef(&self.0, PhantomData)
    }
}

pub struct ChannelRef<'a, R: RightSet>(&'a Arc<ChannelInner>, PhantomData<R>);

impl<'a, R: RightSet> ChannelRef<'a, R> {
    #[require(R > Write)]
    pub fn push(&self, item: T) {
        self.0.push(item);
    }

    #[require(R > Read)]
    pub fn pop(&self) -> Option<T> {
        self.0.pop()
    }

    pub fn rights(&self) -> Rights {
        Rights::from_bits_truncate(R::BITS)
    }

    #[require(R > R1)]
    pub fn restrict<R1: RightSet>(self) -> ChannelRef<'a, R1> {
        let Self(inner, _) = self;
        ChannelRef(inner, PhantomData)
    }
}
```
So what does the code look like after the magical `require` macro expands?
Let's take `ChannelRef::restrict` as an example. After macro expansion,
the code looks like the following.
```rust
impl<'a, R: RightSet> ChannelRef<'a, R> {
    pub fn restrict<R1: RightSet>(self) -> ChannelRef<'a, R1>
    where
        R: RightSetContains<R1>,
    {
        let Self(inner, _) = self;
        ChannelRef(inner, PhantomData)
    }
}
```
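To see the three flavors working together, here is a hypothetical usage sketch based on the channel code above: the original dynamic channel is downgraded into a static, write-only channel, while a restricted dynamic duplicate serves as the read end.

```rust
// A usage sketch under the assumptions of the channel code above.
fn demo() -> Result<()> {
    // A dynamic capability carrying all three rights.
    let chan = Channel::new(Rights::READ | Rights::WRITE | Rights::DUP);

    // Duplicate it, then drop the WRITE and DUP rights of the duplicate
    // at runtime; `reader` remains a dynamic capability.
    let reader = chan.dup()?.restrict(Rights::READ);

    // Convert the original into a static, write-only capability:
    // all later checks happen at compile time.
    let writer: Channel<RightSet![Write]> = chan.to_static()?;
    writer.push(42);                    // OK: Write is in the type.
    // writer.pop();                    // Would not compile: Read is missing.

    // The dynamic capability still checks rights at runtime.
    assert_eq!(reader.pop()?, 42);
    assert!(reader.push(0).is_err());   // EACCES: no Write right.

    Ok(())
}
```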

@ -0,0 +1 @@
# An Overview of Framework APIs

@ -0,0 +1 @@
# Writing a Kernel in 100 Lines of Rust

[Three binary image files removed: 209 KiB, 552 KiB, and 96 KiB; previews not shown.]

docs/src/introduction.md Normal file
@ -0,0 +1,2 @@
# Introduction

@ -0,0 +1 @@
# Getting Started

@ -0,0 +1 @@
# Getting Started

@ -0,0 +1 @@
# Linux Compatibility

@ -0,0 +1 @@
# Boot protocols

@ -0,0 +1 @@
# File Systems

@ -0,0 +1 @@
# Networking

@ -0,0 +1 @@
# System Calls

@ -0,0 +1 @@
# Roadmap

@ -0,0 +1 @@
# Development Status and Roadmap

@ -0,0 +1 @@
# A Zero-Cost, Least-Privilege Approach

@ -0,0 +1 @@
# Type-Level Capabilities

@ -0,0 +1 @@
# Component-Level Access Control

@ -0,0 +1 @@
# The Framekernel OS Architecture

@ -0,0 +1 @@
# OSDK User Guide

@ -0,0 +1 @@
# Creating an OS Project

@ -0,0 +1 @@
# Testing or Running an OS Project

@ -0,0 +1 @@
# Why OSDK

@ -0,0 +1 @@
# Working in a Workspace

@ -0,0 +1 @@
# OSDK User Reference

@ -0,0 +1 @@
# Commands

@ -0,0 +1 @@
# cargo osdk build

@ -0,0 +1 @@
# cargo osdk new

@ -0,0 +1 @@
# cargo osdk run

@ -0,0 +1 @@
# cargo osdk test

@ -0,0 +1 @@
# Manifest
@ -1,29 +0,0 @@
# Privilege Separation
One fundamental design goal of Asterinas is to support _privilege separation_, i.e., the separation between the privileged OS core and the unprivileged OS components. The privileged portion is allowed to use the `unsafe` keyword to carry out dangerous tasks like accessing CPU registers, manipulating stack frames, and doing MMIO or PIO. In contrast, the unprivileged portion, which forms the majority of the OS, must be free from `unsafe` code. With privilege separation, the memory safety of Asterinas boils down to the correctness of the privileged OS core, regardless of the correctness of the unprivileged OS components, thus reducing the size of the TCB significantly.
To put privilege separation into perspective, let's compare the architectures
of the monolithic kernels, microkernels, and Asterinas.
![Arch comparison](../images/arch_comparison.png)
The diagram above highlights the characteristics of different OS architectures
in terms of communication overheads and the TCB for memory safety.
Thanks to privilege separation, Asterinas promises the benefit of being _as safe as a microkernel and as fast as a monolithic kernel_.
Privilege separation is an interesting research problem, prompting us to
answer a series of technical questions.
1. Is it possible to partition a Rust OS into the privileged and unprivileged halves? (If so, consider the following questions)
2. What are the safe APIs exposed by the privileged OS core?
3. Can OS drivers be implemented as unprivileged code with the help from the privileged OS?
4. How small can the privileged OS core be?
To answer these questions, we conduct two case studies in the rest of this
chapter: one on [the common syscall workflow](syscall_workflow.md) and the other
on [the drivers for Virtio devices on the PCI bus](pci_virtio_drivers.md). With the
two case studies, we can confidently give a big YES to Q1 and Q3, and we
propose some key APIs of the privileged OS core, thus providing a partial answer
to Q2. We cannot give a precise answer to Q4 until the privileged OS core is
fully implemented, but the two case studies provide strong evidence that
the final TCB will be much smaller than the size of the entire OS.

@ -1,189 +0,0 @@
# Case Study 2: Virtio devices on PCI bus
In our journey toward writing an OS without _unsafe_ Rust, a key obstacle is dealing with device drivers. Device drivers are the single largest contributor to OS complexity. In Linux, they constitute 70% of the code base. And due to their low-level nature, device driver code usually involves privileged tasks, like doing PIO or MMIO, accessing registers, registering interrupt handlers, etc. So the question is: can we figure out the right abstractions for the OS core to enable writing most driver code in unprivileged Rust?
Luckily, the answer is YES. And this document will explain why.
We will focus on Virtio devices on the PCI bus. The reason is two-fold. First, Virtio devices are the single most important class of devices for our target usage, VM-based TEEs. Second, the PCI bus is the most important bus for the x86 architecture. Given the versatility of Virtio and the complexity of the PCI bus, if a solution can work with Virtio devices on PCI, then it is most likely to work with other types of devices or buses.
## The problem
Here are some of the tasks in driving PCI-based Virtio devices that may involve `unsafe` Rust.
* Access PCI configuration space (doing PIO with `in`/`out` instructions)
* Access PCI capabilities (specified by raw pointers calculated from BAR + offset)
* Initialize Virtio devices (doing MMIO with raw pointers)
* Allocate and initialize Virtio queues (managing physical pages)
* Push/pop entries to/from Virtio queues (accessing physical memory with raw pointers)
## The solution
### PCI bus
#### Privileged part
```rust
// file: aster-core-libs/pci-io-port/lib.rs
use x86::IoPort;
/// The I/O port to write an address in the PCI
/// configuration space.
pub const PCI_ADDR_PORT: IoPort<u32> = {
    // SAFETY: Writes to this I/O port won't affect
    // any typed memory.
    unsafe {
        IoPort::new(0x0cf8, Rights![Wr])
    }
};
/// The I/O port to read/write a value from the
/// PCI configuration space.
pub const PCI_DATA_PORT: IoPort<u32> = {
    // SAFETY: Reads/writes to this I/O port won't affect
// any typed memory.
unsafe {
IoPort::new(0x0cf8 + 0x04, Rights![Rd, Wr])
}
};
```
#### Unprivileged part
```rust
// file: aster-comps/pci/lib.rs
use pci_io_port::{PCI_ADDR_PORT, PCI_DATA_PORT};
/// The PCI configuration space, which enables the discovery,
/// initialization, and configuration of PCI devices.
pub struct PciConfSpace;
impl PciConfSpace {
    pub fn read_u32(bus: u8, slot: u8, offset: u32) -> u32 {
        let addr = (1 << 31) |
            ((bus as u32) << 16) |
            ((slot as u32) << 11) |
            (offset & 0xFC);
        PCI_ADDR_PORT.write(addr);
        PCI_DATA_PORT.read()
    }

    pub fn write_u32(bus: u8, slot: u8, offset: u32, val: u32) {
        let addr = (1 << 31) |
            ((bus as u32) << 16) |
            ((slot as u32) << 11) |
            (offset & 0xFC);
        PCI_ADDR_PORT.write(addr);
        PCI_DATA_PORT.write(val)
    }

    pub fn probe_device(bus: u8, slot: u8) -> Option<PciDeviceConfig> {
        todo!("omitted...")
    }
}
/// A scanner of PCI bus to probe all PCI devices.
pub struct PciScanner {
bus_no: u8,
slot: u8,
}
impl Iterator for PciScanner {
type Item = PciDevice;
    fn next(&mut self) -> Option<Self::Item> {
        while !(self.bus_no == 255 && self.slot == 31) {
            if self.slot == 31 {
                self.bus_no += 1;
                self.slot = 0;
            }

            let config = PciConfSpace::probe_device(self.bus_no, self.slot);
            let slot = self.slot;
            self.slot += 1;
            if let Some(config) = config {
                todo!("convert the config to a device...")
            }
        }
        // All buses and slots have been scanned.
        None
    }
}
/// A general PCI device
pub struct PciDevice {
// ...
}
/// The configuration of a general PCI device.
pub struct PciDeviceConfig {
// ...
}
/// The capabilities of a PCI device.
pub struct PciCapabilities {
// ...
}
```
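For instance, an unprivileged component could use `PciConfSpace` to read the vendor and device IDs of a candidate device. The register layout below follows the PCI specification; the helper itself is hypothetical.

```rust
// A hypothetical helper built on the unprivileged PciConfSpace API above.
fn read_ids(bus: u8, slot: u8) -> Option<(u16, u16)> {
    // The first dword of the configuration space holds the vendor ID
    // (bits 0..16) and the device ID (bits 16..32).
    let dword = PciConfSpace::read_u32(bus, slot, 0);
    let vendor_id = (dword & 0xFFFF) as u16;
    if vendor_id == 0xFFFF {
        // All-ones means no device is present at this bus/slot.
        return None;
    }
    let device_id = (dword >> 16) as u16;
    Some((vendor_id, device_id))
}
```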
### Virtio
Most of the code of Virtio drivers can be written as unprivileged code, thanks to the `VmPager` and `VmCell` abstractions provided by the OS core.
```rust
// file: aster-comp-libs/virtio/transport.rs
/// The transport layer for configuring a Virtio device.
pub struct VirtioTransport {
isr_cell: VmCell<u8>,
// ...
}
impl VirtioTransport {
/// Create a new instance.
///
/// According to Virtio spec, the transport layer for
/// configuring a Virtio device consists of four parts:
///
/// * Common configuration structure
/// * Notification structure
/// * Interrupt Status Register (ISR)
/// * Device-specific configuration structure
///
/// This constructor requires four pointers to these parts.
pub fn new(
common_cfg_ptr: PAddr<CommonCfg>,
isr_ptr: PAddr<u8>,
notifier: PAddr<Notifier>,
device_cfg: PAddr<DeviceCfg>,
) -> Result<Self> {
let isr_cell = Self::new_part(isr_ptr)?;
todo!("do more initialization...")
}
/// Write ISR.
pub fn write_isr(&self, new_val: u8) {
self.isr_cell.write(new_val).unwrap()
}
/// Read ISR.
pub fn read_isr(&self) -> u8 {
self.isr_cell.read().unwrap()
}
fn new_part<T: Pod>(part: PAddr<T>) -> Result<VmCell<T>> {
let addr = part.as_ptr() as usize;
let page_addr = align_down(addr, PAGE_SIZE);
let page_offset = addr % PAGE_SIZE;
// Acquire the access to the physical page
// that contains the part. If the physical page
// is not safe to access, e.g., when the page
// has been used by the kernel, then the acquisition
// will fail.
let vm_pager = VmPagerOption::new(PAGE_SIZE)
.paddr(page_addr)
.exclusive(false)
.build()?;
let vm_cell = vm_pager.new_cell(page_offset)?;
        Ok(vm_cell)
}
}
```

@ -1,419 +0,0 @@
# Case study 1: Common Syscall Workflow
## Problem definition
In a nutshell, the job of an OS is to handle system calls. While system calls may differ greatly in what they do, they share a common syscall handling workflow, which includes at least the following steps.
* User-kernel switching (involving assembly code)
* System call parameter parsing (which has to access CPU registers)
* System call dispatching (needs to _interpret_ integer values to corresponding C types specified by Linux ABI)
* Per-system call handling logic, which often involves accessing user-space memory (pointer dereference)
It seems that each of the steps requires the use of `unsafe` more or less. So the question here is: **is it possible to design a syscall handling framework that has a clear cut between privileged and unprivileged code, allowing the user to handle system calls without `unsafe`?**
The answer is YES. This document describes such a solution.
## To `unsafe`, or not to `unsafe`, that is the question
> The `unsafe` keyword has two uses: to declare the existence of contracts the compiler can't check, and to declare that a programmer has checked that these contracts have been upheld. --- The Rust Unsafe Book
> To isolate unsafe code as much as possible, it's best to enclose unsafe code within a safe abstraction and provide a safe API. --- The Rust Book
Many Rust programmers, sometimes even "professional" ones, do not fully understand when a function should be marked `unsafe` or not. Check out [Kerla OS](https://github.com/nuta/kerla)'s `UserBufWriter` and `UserVAddr` APIs, which are a classic example of _seemingly safe_ APIs that are _unsafe_ in nature.
```rust
impl<'a> SyscallHandler<'a> {
pub fn sys_clock_gettime(&mut self, clock: c_clockid, buf: UserVAddr) -> Result<isize> {
let (tv_sec, tv_nsec) = match clock {
CLOCK_REALTIME => {
let now = read_wall_clock();
(now.secs_from_epoch(), now.nanosecs_from_epoch())
}
CLOCK_MONOTONIC => {
let now = read_monotonic_clock();
(now.secs(), now.nanosecs())
}
_ => {
debug_warn!("clock_gettime: unsupported clock id: {}", clock);
return Err(Errno::ENOSYS.into());
}
};
let mut writer = UserBufWriter::from_uaddr(buf, size_of::<c_time>() + size_of::<c_long>());
writer.write::<c_time>(tv_sec.try_into().unwrap())?;
writer.write::<c_long>(tv_nsec.try_into().unwrap())?;
Ok(0)
}
}
```
```rust
/// Represents a user virtual memory address.
///
/// It is guaranteed that `UserVaddr` contains a valid address, in other words,
/// it does not point to a kernel address.
///
/// Futhermore, like `NonNull<T>`, it is always non-null. Use `Option<UserVaddr>`
/// represent a nullable user pointer.
#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
#[repr(transparent)]
pub struct UserVAddr(usize);
impl UserVAddr {
pub const fn new(addr: usize) -> Option<UserVAddr> {
if addr == 0 {
None
} else {
Some(UserVAddr(addr))
}
}
pub fn read<T>(self) -> Result<T, AccessError> {
let mut buf: MaybeUninit<T> = MaybeUninit::uninit();
self.read_bytes(unsafe {
slice::from_raw_parts_mut(buf.as_mut_ptr() as *mut u8, size_of::<T>())
})?;
Ok(unsafe { buf.assume_init() })
}
pub fn write<T>(self, buf: &T) -> Result<usize, AccessError> {
let len = size_of::<T>();
self.write_bytes(unsafe { slice::from_raw_parts(buf as *const T as *const u8, len) })?;
Ok(len)
}
pub fn write_bytes(self, buf: &[u8]) -> Result<usize, AccessError> {
call_usercopy_hook();
self.access_ok(buf.len())?;
unsafe {
copy_to_user(self.value() as *mut u8, buf.as_ptr(), buf.len());
}
Ok(buf.len())
}
}
```
Interestingly, zCore makes almost exactly the same mistake.
```rust
impl Syscall<'_> {
/// finds the resolution (precision) of the specified clock clockid, and,
/// if buffer is non-NULL, stores it in the struct timespec pointed to by buffer
pub fn sys_clock_gettime(&self, clock: usize, mut buf: UserOutPtr<TimeSpec>) -> SysResult {
info!("clock_gettime: id={:?} buf={:?}", clock, buf);
let ts = TimeSpec::now();
buf.write(ts)?;
info!("TimeSpec: {:?}", ts);
Ok(0)
}
}
```
```rust
pub type UserOutPtr<T> = UserPtr<T, Out>;
/// Raw pointer from user land.
#[repr(transparent)]
#[derive(Copy, Clone)]
pub struct UserPtr<T, P: Policy>(*mut T, PhantomData<P>);
impl<T, P: Policy> From<usize> for UserPtr<T, P> {
fn from(ptr: usize) -> Self {
UserPtr(ptr as _, PhantomData)
}
}
impl<T, P: Write> UserPtr<T, P> {
/// Overwrites a memory location with the given `value`
/// **without** reading or dropping the old value.
pub fn write(&mut self, value: T) -> Result<()> {
self.check()?; // check non-nullness and alignment
unsafe { self.0.write(value) };
Ok(())
}
}
```
The examples reveal two important considerations in designing Asterinas:
1. Exposing _truly_ safe APIs. The privileged OS core must expose _truly safe_ APIs: however buggy or silly the unprivileged OS components may be written, they must _not_ cause undefined behaviors.
2. Handling _arbitrary_ pointers safely. The safe API of the OS core must provide a safe way to deal with arbitrary pointers.
With the two points in mind, let's get back to our main goal of privilege separation.
## Code organization with privilege separation
Our first step is to separate privileged and unprivileged code in the codebase of Asterinas. For our purpose of demonstrating a syscall handling framework, a minimal codebase may look like the following.
```text
.
├── asterinas
│   ├── src
│   │   └── main.rs
│   └── Cargo.toml
├── aster-core
│   ├── src
│   │   ├── lib.rs
│   │   ├── syscall_handler.rs
│   │   └── vm
│   │       ├── vmo.rs
│   │       └── vmar.rs
│   └── Cargo.toml
├── aster-core-libs
│   ├── linux-abi-types
│   │   ├── src
│   │   │   └── lib.rs
│   │   └── Cargo.toml
│   └── pod
│       ├── src
│       │   └── lib.rs
│       └── Cargo.toml
├── aster-comps
│   └── linux-syscall
│       ├── src
│       │   └── lib.rs
│       └── Cargo.toml
└── aster-comp-libs
    └── linux-abi
        ├── src
        │   └── lib.rs
        └── Cargo.toml
```
The ultimate build target of the codebase is the `asterinas` crate, which is an OS kernel that consists of a privileged OS core (crate `aster-core`) and multiple OS components (the crates under `aster-comps/`).
For the sake of privilege separation, only the `asterinas` and `aster-core` crates, along with the crates under `aster-core-libs/`, are allowed to use the `unsafe` keyword. In contrast, the crates under `aster-comps/`, along with their dependent crates under `aster-comp-libs/`, are not allowed to use `unsafe` directly; they may only borrow the superpower of `unsafe` through the safe APIs exposed by `aster-core` or the crates under `aster-core-libs/`. To summarize, the memory safety of the OS relies only on a small and well-defined TCB that consists of the `asterinas` and `aster-core` crates plus the crates under `aster-core-libs/`.
Under this setting, the implementation of all system calls goes to the `linux-syscall` crate. We are about to show that the _safe_ API provided by `aster-core` is powerful enough to enable the _safe_ implementation of `linux-syscall`.
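One way to enforce this split mechanically (an assumption on our part, not something the layout above mandates) is to forbid `unsafe` at the crate level in every unprivileged crate:

```rust
// file: aster-comps/linux-syscall/src/lib.rs (first lines; illustrative)
#![no_std]
#![forbid(unsafe_code)]

// The crate may only gain "superpowers" through the safe APIs
// exposed by the privileged OS core.
use aster_core::{SyscallContext, SyscallHandler, Vmar};
```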
## Crate `aster-core`
For our purposes here, the two most relevant APIs provided by `aster-core` are the abstractions for syscall handlers and virtual memory (VM).
### Syscall handlers
The `SyscallHandler` abstraction enables the OS core to hide the low-level, architecture-dependent aspects of the syscall handling workflow (e.g., user-kernel switching and CPU register manipulation) and allows the unprivileged OS components to implement system calls.
```rust
// file: aster-core/src/syscall_handler.rs
pub trait SyscallHandler {
fn handle_syscall(&self, ctx: &mut SyscallContext);
}
pub struct SyscallContext { /* cpu states */ }
pub fn set_syscall_handler(handler: &'static dyn SyscallHandler) {
todo!("set HANDLER")
}
pub(crate) fn syscall_handler() -> &'static dyn SyscallHandler {
    // SAFETY: the handler is only set once during kernel initialization,
    // before any system call may be dispatched.
    unsafe { HANDLER }
}

static mut HANDLER: &'static dyn SyscallHandler = &DummyHandler;

struct DummyHandler;

impl SyscallHandler for DummyHandler {
    fn handle_syscall(&self, ctx: &mut SyscallContext) {
        ctx.set_retval(-Errno::ENOSYS);
}
}
```
### VM capabilities
The OS core provides two abstractions related to virtual memory management.
* _Virtual Memory Address Region (VMAR)_. A VMAR represents a range of virtual address space. In essence, VMARs abstract away the architectural details regarding page tables.
* _Virtual Memory Pager (VMP)_. A VMP represents a range of memory pages (yes, the memory itself, not the address space). VMPs encapsulate the management of physical memory pages and enable on-demand paging.
Both VMARs and VMPs are _privileged_ as they need to have direct access to page tables and physical memory, which demands the use of `unsafe`.
These two abstractions are adopted from similar concepts in zircon ([Virtual Memory Address Regions (VMARs)](https://fuchsia.dev/fuchsia-src/reference/kernel_objects/vm_address_region) and [Virtual Memory Object (VMO)](https://fuchsia.dev/fuchsia-src/reference/kernel_objects/vm_object)), also implemented by zCore.
Interestingly, both VMARs and VMPs are [capabilities](../capabilities/README.md),
an important concept that we will elaborate on later. Basically, they are capabilities as they satisfy the two properties of *non-forgeability* and *monotonicity*: 1) a root VMAR or VMP can only be created via a few well-defined APIs exposed by the OS core, and 2) a child VMAR or VMP can only be derived from an existing VMAR or VMP with more limited access to resources (e.g., a subset of the parent's address space, memory pages, or access permissions).
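To make the discussion concrete, the sketch below shows the kind of safe `Vmar` methods that the OS core could expose and that the next section assumes. The exact signatures are illustrative, not a finalized design.

```rust
// file: aster-core/src/vm/vmar.rs (sketch; signatures are illustrative only)
use pod::Pod;

/// A capability to a region of user-space virtual memory.
pub struct Vmar { /* page table handle, address range, rights, ... */ }

impl Vmar {
    /// Reads a plain-old-data value at the given user-space address.
    ///
    /// The method validates that the address range lies within this VMAR
    /// and is mapped with read permission; otherwise, an error is returned.
    /// (`UserPtr<T>` is the typed user pointer used by the example below.)
    pub fn read_val<T: Pod>(&self, uaddr: UserPtr<T>) -> Result<T> {
        todo!("validate the range, then copy the bytes from user space")
    }

    /// Writes a plain-old-data value to the given user-space address,
    /// after the same validation as `read_val`.
    pub fn write_val<T: Pod>(&self, uaddr: UserPtr<T>, val: T) -> Result<()> {
        todo!("validate the range, then copy the bytes to user space")
    }
}
```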
## Crate `linux-syscall`
Here we demonstrate how to leverage the APIs of `aster-core` to implement system calls with safe Rust code in crate `linux-syscall`.
```rust
// file: aster-comps/linux-syscall/src/lib.rs
use aster_core::{SyscallContext, SyscallHandler, Vmar};
use linux_abi::{SyscallNum::*, UserPtr, RawFd, RawRlimit, RawTimeVal, RawTimeZone};
pub struct SampleHandler;
impl SyscallHandler for SampleHandler {
fn handle_syscall(&self, ctx: &mut SyscallContext) {
let syscall_num = ctx.num();
let (a0, a1, a2, a3, a4, a5) = ctx.args();
match syscall_num {
SYS_GETTIMEOFDAY => {
let tv_ptr = UserPtr::new(a0 as usize);
let tz_ptr = UserPtr::new(a1 as usize);
                let res = self.sys_gettimeofday(tv_ptr, tz_ptr);
todo!("set retval according to res");
}
SYS_SETRLIMIT => {
let resource = a0 as u32;
let rlimit_ptr = UserPtr::new(a1 as usize);
let res = self.sys_setrlimit(resource, rlimit_ptr);
todo!("set retval according to res");
}
_ => {
ctx.set_retval(-Errno::ENOSYS)
}
};
}
}
impl SampleHandler {
fn sys_gettimeofday(&self, tv_ptr: UserPtr<RawTimeVal>, _tz_ptr: UserPtr<RawTimeZone>) -> Result<()> {
if tv_ptr.is_null() {
return Err(Errno::EINVAL);
}
// Get the VMAR of this process
let vmar = self.thread().process().vmar();
let tv_val: RawTimeVal = todo!("get current time");
        // Writing a value through the arbitrary pointer is safe because
        // 1) the vmar refers to the memory in the user space;
        // 2) the write_val method checks memory validity (no page faults).
//
// Note that the vmar of the OS kernel cannot be
// manipulated directly by any OS components outside
// the OS core.
vmar.write_val(tv_ptr, tv_val)?;
Ok(())
}
fn sys_setrlimit(&self, resource: u32, rlimit_ptr: UserPtr<RawRlimit>) -> Result<u32> {
if rlimit_ptr.is_null() {
return Err(Errno::EINVAL);
}
let vmar = self.thread().process().vmar();
        // Reading a value through the arbitrary pointer is safe
        // for reasons similar to the above code, but with one
        // additional reason: the value is of a type `T: Pod`, i.e.,
        // Plain Old Data (POD).
let new_rlimit = vmar.read_val::<RawRlimit>(rlimit_ptr)?;
todo!("use the new rlimit value")
}
}
```
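Finally, the `asterinas` crate wires the two halves together. A hypothetical `main.rs` might register the handler like this (the surrounding initialization is elided):

```rust
// file: asterinas/src/main.rs (sketch)
use aster_core::set_syscall_handler;
use linux_syscall::SampleHandler;

static HANDLER: SampleHandler = SampleHandler;

fn main() {
    // ... initialize the OS core: memory, interrupts, devices, etc. ...

    // Install the unprivileged syscall handler into the privileged core.
    set_syscall_handler(&HANDLER);

    // ... load the first user program and enter the scheduling loop ...
}
```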
## Crate `pod`
This crate defines a marker trait `Pod`, which represents plain-old data.
```rust
// file: aster-core-libs/pod/src/lib.rs
/// A marker trait for plain old data (POD).
///
/// A POD type `T:Pod` supports converting to and from arbitrary
/// `mem::size_of::<T>()` bytes _safely_.
/// For example, simple primitive types like `u8` and `i16`
/// are POD types. But perhaps surprisingly, `bool` is not POD
/// because the Rust compiler makes the implicit assumption that
/// a byte of `bool` has a value of either `0` or `1`.
/// Interpreting a byte of value `3` as a `bool` value is
/// undefined behavior.
///
/// # Safety
///
/// Marking a non-POD type as POD may cause undefined behaviors.
pub unsafe trait Pod: Copy + Sized {
    fn new_from_bytes(bytes: &[u8]) -> Self {
        *Self::from_bytes(bytes)
    }

    fn from_bytes(bytes: &[u8]) -> &Self {
        // Ensure the size and alignment are ok
        assert!(bytes.len() == core::mem::size_of::<Self>());
        assert!((bytes.as_ptr() as usize) % core::mem::align_of::<Self>() == 0);
        unsafe {
            &*(bytes.as_ptr() as *const Self)
        }
    }

    fn from_bytes_mut(bytes: &mut [u8]) -> &mut Self {
        // Ensure the size and alignment are ok
        assert!(bytes.len() == core::mem::size_of::<Self>());
        assert!((bytes.as_ptr() as usize) % core::mem::align_of::<Self>() == 0);
        unsafe {
            &mut *(bytes.as_mut_ptr() as *mut Self)
        }
    }

    fn as_bytes(&self) -> &[u8] {
        let ptr = self as *const Self as *const u8;
        let len = core::mem::size_of::<Self>();
        unsafe {
            core::slice::from_raw_parts(ptr, len)
        }
    }

    fn as_bytes_mut(&mut self) -> &mut [u8] {
        let ptr = self as *mut Self as *mut u8;
        let len = core::mem::size_of::<Self>();
        unsafe {
            core::slice::from_raw_parts_mut(ptr, len)
        }
    }
}

macro_rules! impl_pod_for {
    (/* define the input */) => { /* define the expansion */ }
}

impl_pod_for!(
    u8, u16, u32, u64,
    i8, i16, i32, i64,
);

unsafe impl<T: Pod, const N: usize> Pod for [T; N] {}
```
## Crate `linux-abi-types`
```rust
// file: aster-core-libs/linux-abi-types/src/lib.rs
use pod::Pod;

pub type RawFd = i32;

#[repr(C)]
#[derive(Clone, Copy)]
pub struct RawTimeVal {
    pub sec: u64,
    pub usec: i64,
}

unsafe impl Pod for RawTimeVal {}
```
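As a quick illustration of what `Pod` buys us, the hypothetical snippet below converts a `RawTimeVal` to its raw bytes and back, which is exactly the kind of reinterpretation the OS core performs when copying syscall arguments.

```rust
fn demo() {
    let tv = RawTimeVal { sec: 1, usec: 500_000 };

    // Convert the POD value to its raw byte representation...
    let bytes: &[u8] = tv.as_bytes();
    assert_eq!(bytes.len(), core::mem::size_of::<RawTimeVal>());

    // ...and reinterpret those bytes as a value again.
    let tv2 = RawTimeVal::new_from_bytes(bytes);
    assert_eq!(tv2.sec, 1);
    assert_eq!(tv2.usec, 500_000);
}
```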
## Crate `linux-abi`
```rust
// file: aster-comp-libs/linux-abi/src/lib.rs
pub use linux_abi_types::*;

#[allow(non_camel_case_types)]
#[repr(u64)]
pub enum SyscallNum {
    SYS_READ = 0,
    SYS_WRITE = 1,
    /* ... */
    SYS_GETTIMEOFDAY = 96,
    SYS_SETRLIMIT = 160,
    /* ... */
}
```
## Wrap up
I hope that this document has convinced you that with the right abstractions (e.g., `SyscallHandler`, `Vmar`, `Vmp`, and `Pod`), it is possible to write system calls---at least, the main system call workflow---without _unsafe_ Rust.

@ -0,0 +1 @@
# RFC-0001: RFC Process

@ -0,0 +1 @@
# RFC-0002: Operating System Development Kit (OSDK)

docs/src/rfcs/README.md Normal file
@ -0,0 +1 @@
# RFC Process

@ -0,0 +1 @@
# Before You Contribute

@ -0,0 +1 @@
# Code of Conduct

@ -0,0 +1 @@
# Code Organization

@ -0,0 +1 @@
# Community

@ -0,0 +1 @@
# Style Guidelines

@ -0,0 +1 @@
# General Guidelines

@ -0,0 +1 @@
# Commit Guidelines

@ -0,0 +1 @@
# Rust Guidelines