Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 21 additions & 5 deletions ckb-vm-fuzzing-utils/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,8 @@ use ckb_std::{
syscalls::traits::{Error, IoResult, SyscallImpls},
};
use ckb_vm::{
Error as VMError, Memory, Register, SupportMachine, Syscalls,
memory::load_c_string_byte_by_byte,
Error as VMError, Memory, RISCV_PAGESIZE, Register, SupportMachine, Syscalls,
memory::{FLAG_EXECUTABLE, FLAG_FREEZED, load_c_string_byte_by_byte},
registers::{A0, A1, A2, A3, A4, A5, A7},
};
use core::ffi::CStr;
Expand Down Expand Up @@ -177,13 +177,29 @@ where
impls.load_input_by_field(buf, offset, index, source, field)
})?,
SyscallCode::LoadCellDataAsCode => {
let addr = machine.registers()[A0].to_u64() as *mut u8;
let memory_size = machine.registers()[A1].to_u64() as usize;
let addr = machine.registers()[A0].to_u64();
let memory_size = machine.registers()[A1].to_u64();
let content_offset = machine.registers()[A2].to_u64() as usize;
let content_size = machine.registers()[A3].to_u64() as usize;
let index = machine.registers()[A4].to_u64() as usize;
let source = machine.registers()[A5].to_u64().try_into().expect("parse source");
let result = self.impls.load_cell_code(addr, memory_size, content_offset, content_size, index, source);

let mut buf = vec![0u8; memory_size as usize];
let result = self.impls.load_cell_code(
buf.as_mut_ptr() as *mut u8,
memory_size as usize,
content_offset,
content_size,
index,
source,
);
machine.memory_mut().store_bytes(addr, &buf)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this, the whole idea of SyscallImplsSynchronousWrapper is that underlying impls should handle the full syscall. What's the storing and permission setting doing here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For storing, I wrote it based on load_data: https://github.com/nervosnetwork/ckb-vm-contrib/blob/main/ckb-vm-fuzzing-utils/src/lib.rs#L82

For permission setting, replied here: #22 (comment)

let mut current_addr = addr;
while current_addr < addr + memory_size {
let page = current_addr / RISCV_PAGESIZE as u64;
machine.memory_mut().set_flag(page, FLAG_EXECUTABLE | FLAG_FREEZED)?;
current_addr += RISCV_PAGESIZE as u64;
}
self.set_return(result, machine);
}
SyscallCode::LoadCellData => self.load_ois(machine, |buf, impls, offset, index, source| {
Expand Down
9 changes: 7 additions & 2 deletions ckb-vm-syscall-tracer/protos/traces.proto
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,10 @@ message IoData {
uint64 additional_length = 2;
}

message IoDataAsCode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use a new type, or should we reuse IoData(but restricting additional_length to 0?

Personally I would just reuse IoData

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, reusing the previous data structure is more concise, I will modify it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it and it's probably not possible, because you designed IoData to handle Partial Loading, but Load Cell Data As Code is not designed with partial loading in mind. If I choose to share IoData, I won't be able to distinguish them in apply_partial_content.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the following 2 are 2 separate, independent question:

  • Do we need a new type like IoDataAsCode in protobuf message format?
  • Do we need a new enum variant in PartialSyscallContent?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know what to do. For the two questions above, I think I can:

  1. No.
  2. Yes.

bytes available_data = 1;
}

message Fds {
repeated uint64 fds = 1;
}
Expand All @@ -30,8 +34,9 @@ message Syscall {
// 2) split them into 3 different fields.
uint64 success_output_data = 2;
IoData io_data = 3;
Terminated terminated = 4;
Fds fds = 5;
IoDataAsCode io_data_as_code = 4;
Terminated terminated = 5;
Fds fds = 6;
}
}

Expand Down
17 changes: 16 additions & 1 deletion ckb-vm-syscall-tracer/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -568,6 +568,7 @@ impl<M: SupportMachine> Syscalls<M> for SyscallBasedCollectorVMSyscalls<M> {
enum PartialSyscallContent {
ReturnWithCode,
IoData { data_addr: u64, input_length: u64 },
IoDataAsCode { data_addr: u64, input_length: u64 },
Exec { args: Vec<Vec<u8>> },
Spawn { args: Vec<Vec<u8>> },
Wait,
Expand Down Expand Up @@ -600,7 +601,9 @@ fn build_partial_content<M: SupportMachine>(machine: &mut M) -> Result<Option<Pa
PartialSyscallContent::IoData { data_addr, input_length }
}
SyscallCode::LoadCellDataAsCode => {
panic!("Load cell data as code syscall is not supported!");
let data_addr = machine.registers()[A0].to_u64();
let input_length = machine.registers()[A1].to_u64();
PartialSyscallContent::IoDataAsCode { data_addr, input_length }
}
SyscallCode::VmVersion => PartialSyscallContent::ReturnWithCode,
SyscallCode::CurrentCycles => PartialSyscallContent::ReturnWithCode,
Expand Down Expand Up @@ -663,6 +666,18 @@ fn apply_partial_content<M: SupportMachine>(
})),
}
}
PartialSyscallContent::IoDataAsCode { data_addr, input_length } => {
let return_code = machine.registers()[A0].to_i64();
if return_code != 0 {
return return_syscall(return_code);
}
let data = machine.memory_mut().load_bytes(*data_addr, *input_length)?;
traces::Syscall {
value: Some(traces::syscall::Value::IoDataAsCode(traces::IoDataAsCode {
available_data: data.as_ref().to_vec(),
})),
}
}
PartialSyscallContent::Exec { .. } => {
let return_code = machine.registers()[A0].to_i64();
if return_code != 0 {
Expand Down
20 changes: 17 additions & 3 deletions protobuf-ckb-syscalls/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -175,14 +175,28 @@ impl SyscallImpls for ProtobufBasedSyscallImpls {

fn load_cell_code(
&self,
_buf_ptr: *mut u8,
_len: usize,
buf_ptr: *mut u8,
len: usize,
_content_offset: usize,
_content_size: usize,
_index: usize,
_source: Source,
) -> Result<(), Error> {
panic!("Load cell data as code is not suported!");
match self.syscall() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend that a #[cfg(target_arch = "riscv64")] guard shall be added here. In native environment(such as fuzzing), this implementation does not really work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this. It seems to me that this code is useful when the debugger uses CkbFlavoredImplSyscalls, and it shouldn't be ignored by #[cfg(target_arch = "riscv64")].

Referring to your original code:

https://github.com/xxuejie/ckb-vm-contrib/blob/load-cell-code-changes/ckb-vm-fuzzing-utils/src/lib.rs#L404

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So ProtobufBasedSyscallImpls have multiple use cases: it can be used by CkbFlavoredImplSyscalls to implement a ckbvm Machine, it can also be used by fuzzing code. When doing fuzzing, load_cell_code must fail as not supporting load_cell_code function.

You are right that #[cfg(target_arch = "riscv64")] won't work, but we still need a way for fuzzers to leverage the same code where load_cell_code should just panic.

Maybe we can keep load_cell_code panicking as it is now, and move all the actual logics into ProtobufBasedSyscalls

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the code according to the above idea. The additional modification is to move CkbFlavoredImplSyscalls and ProtobufBasedSyscalls to protobuf-ckb-syscalls (otherwise there will be a circular dependency problem between protobuf-ckb-syscalls and ckb-vm-fuzzing-utils).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me this is not a proper change.

The point of having CkbFlavoredImplSyscalls, is that there might well be other syscall implementations that could take advantage of CkbFlavoredImplSyscalls. So only either of the following 2 is IMHO a good solution:

  • Keep the generic logic in CkbFlavoredImplSyscalls in fuzzing utils crate, and keep ProtobufBasedSyscalls in protobuf-ckb-syscalls crate
  • We remove CkbFlavoredImplSyscalls and only keep ProtobufBasedSyscalls

So the question here is: can we build a generic enough type CkbFlavoredImplSyscalls that has all the details of load_cell_code, but allow customization of how to fetch the actual code data? If so, we should split the 2 and keep them in separate packages, otherwise, we should only have ProtobufBasedSyscalls

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second solution seems more appealing to me.

For the first idea, I don't have any particularly good suggestions for improvement. Intuitively, my thought is to add a flag to ProtobufBasedSyscallImpls to indicate whether load_cell_code should directly panic or perform an actual memory copy. This way, we can keep the original versions of CkbFlavoredImplSyscalls and ProtobufBasedSyscall unchanged while supporting both fuzzing and debugger use cases. However, this modification feels quite crude.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the design of CkbFlavoredImplSyscalls, I think it's still an open topic, perhaps better discussed in a new issue rather than pull requests.

I've submitted a new commit following the solution 2: "We remove CkbFlavoredImplSyscalls and only keep ProtobufBasedSyscalls."This choice assumes that both solutions are "good solution".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've put together a different design here, if I were doing it, I would probably go with this alternative design.

I cannot speak for the whole team, but on a personal scale, I would just wait and carefuly craft a design than to roll out a sloppy one early. IMHO it's a real unfortunate thing that we have shippped a WHOLE LOT of code without careful thinking, resulting in poor APIs that we cannot change at a later stage. Fundamentally it's the team's decision, but I would vote against merging code before it is in a proper state now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize for my rash suggestion.

I tested your new design, and it works perfectly. I reviewed it carefully, and I think it's great.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I updated this PR, you can review it.

Some(traces::syscall::Value::ReturnWithCode(code)) => {
if code != 0 {
return Err(Error::try_from(code as u64).unwrap());
}
Ok(())
}
Some(traces::syscall::Value::IoDataAsCode(io_data_as_code)) => {
unsafe {
std::ptr::copy_nonoverlapping(io_data_as_code.available_data.as_ptr(), buf_ptr, len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the place IMHO we should do the full syscall implementation, including:

  • alignment checking
  • code copying
  • zero-fill padding regions before and after the code
  • setting proper memory permissions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are some design reasons that may cause issues with the points you mentioned in the implementation:

  1. Alignment checking: buf_ptr stores the address of the buffer in the native environment, not in the ckb-vm. Therefore, alignment checking cannot be performed on it. (I designed this API following io_syscall. Please let me know if I got this wrong.)
  2. Code copying: No issues here.
  3. Zero-fill padding regions before and after the code: For simplicity, I collected the zero-padding before and after the code via the syscall tracer, so we only need to copy here.
  4. Setting proper memory permissions: In the context of this function, I have no way to access the memory instance. I think you're right, but I'm not sure how to handle it. So, I wrote the permission-related code in ckb-vm-fuzzing-utils.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's some confusions. I will just explain my original design ideas:

  • SyscallImplsSynchronousWrapper in ckb-script-fuzzing-toolkit is just an adapter for converting between different styles of CKB syscalls. Given another chance I would definitely want to unify the interfaces used by different CKB syscalls. But we live in a world as it is, so this adapter is built. It should not contain any actual implementation, or part of syscall implementation. It always defer to the inner SyscallImpls trait to fully implement a syscall.
  • protobuf-ckb-syscalls is originally designed for fuzzing code running in a native environment. But this is not always the case, here one can see that the parameters for protobuf-based CKB syscalls could also come from a CKB-VM's memory region(well this assumes that the memory has certain restrictions of course, but it's just the details), we can just think of it as a pointer to a location, you can cast it to address and check for alignments.
  • It is definitely right that protobuf-based CKB syscalls neglect the handling of memory permissions(and likely, dirty page checkings), but if I were doing this, I would consider the trait method definition for load_cell_code in SyscallImpls is not good enough. I would lean towards modifying this trait method definition to be capable, and then fully implement the syscall right inside protobuf-based CKB syscalls. Relying on SyscallImplsSynchronousWrapper to do part of the syscalls, to me, is really leaky abstraction.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modify the load_cell_code definition in SyscallImpls to add two parameters: _memory_addr and _memory_flag. _memory_addr represents the location in ckb-vm memory, which may or may not match buf_ptr depending on the implementation. _memory_flag is used to record modified page flags. Depending on the implementation, it may directly map to the flags in the memory instance or serve as a temporary buffer.

fn load_cell_code(
    &self,
    buf_ptr: *mut u8,
    len: usize,
    content_offset: usize,
    content_size: usize,
    index: usize,
    source: Source,
    _memory_addr: usize,
    _memory_flag: &mut [u8],
) -> Result<(), Error> {
    build_result(self.syscall(
        buf_ptr as u64,
        len as u64,
        content_offset as u64,
        content_size as u64,
        index as u64,
        source as u64,
        consts::SYS_LOAD_CELL_DATA_AS_CODE,
    ))
}

In protobuf-ckb-syscalls, the implementation is as follows:

fn load_cell_code(
    &self,
    buf_ptr: *mut u8,
    len: usize,
    _content_offset: usize,
    _content_size: usize,
    _index: usize,
    _source: Source,
    memory_addr: usize,
    memory_flag: &mut [u8],
) -> Result<(), Error> {
    match self.syscall() {
        Some(traces::syscall::Value::ReturnWithCode(code)) => {
            if code != 0 {
                return Err(Error::try_from(code as u64).unwrap());
            }
            Ok(())
        }
        Some(traces::syscall::Value::IoData(io_data)) => {
            const RISCV_PAGESIZE: usize = 1 << 12;
            const FLAG_FREEZED: u8 = 0b01;
            const FLAG_EXECUTABLE: u8 = 0b10;
            if memory_addr % RISCV_PAGESIZE != 0 {
                return Err(UNEXPECTED_ERROR);
            }
            if len % RISCV_PAGESIZE != 0 {
                return Err(UNEXPECTED_ERROR);
            }
            if memory_flag.len() != len / RISCV_PAGESIZE {
                return Err(UNEXPECTED_ERROR);
            }
            for i in 0..memory_flag.len() {
                memory_flag[i] = FLAG_FREEZED | FLAG_EXECUTABLE;
            }
            unsafe {
                std::ptr::copy_nonoverlapping(io_data.available_data.as_ptr(), buf_ptr, len);
            }
            Ok(())
        }
        _ => unreachable!(),
    }
}

In ckb-vm-fuzzing-utils's SyscallImplsSynchronousWrapper, the handling is as follows:

SyscallCode::LoadCellDataAsCode => {
    let addr = machine.registers()[A0].to_u64();
    let memory_size = machine.registers()[A1].to_u64();
    let content_offset = machine.registers()[A2].to_u64() as usize;
    let content_size = machine.registers()[A3].to_u64() as usize;
    let index = machine.registers()[A4].to_u64() as usize;
    let source = machine.registers()[A5].to_u64().try_into().expect("parse source");

    let mut buf = vec![0u8; memory_size as usize];
    let mut memory_flag = vec![0u8; memory_size as usize / RISCV_PAGESIZE];
    let result = self.impls.load_cell_code(
        buf.as_mut_ptr() as *mut u8,
        memory_size as usize,
        content_offset,
        content_size,
        index,
        source,
        addr as usize,
        &mut memory_flag,
    );
    machine.memory_mut().store_bytes(addr, &buf)?;
    for (i, f) in memory_flag.iter().enumerate() {
        let page = addr / RISCV_PAGESIZE as u64 + i as u64;
        machine.memory_mut().set_flag(page, *f)?;
    }
    self.set_return(result, machine);
}

@xxuejie Do you think this modification is correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, SyscallImplsSynchronousWrapper still does work where it should belong elsewhere: why does SyscallImplsSynchronousWrapper know exactly what flag to set here? I see 2 variation on this problem:

For the first solution, load_cell_code can simply return a list of memory pages to set and their permissions:

fn load_cell_code(
    &self,
    buf_ptr: u64,
    len: usize,
    content_offset: usize,
    content_size: usize,
    index: usize,
    source: Source,
) -> Result<Vec<(u64, [u8; 4096], u8)>, Error>;

Notice the tuple is just for simplicity, we can debate if a struct makes more sense.

In this signature, load_cell_code does not really write the data to VM memory, it merely prepares multiple pages of data as return result. For each page, the method returns the page address, full 4096 bytes of data, page flag. This way SyscallImplsSynchronousWrapper can just fill each page in VM memory, without thinking about how to prepare the data and the flag. Personally I consider this a real separation of implementation, and simply adapter code.

When returning a bulk of data could be debatable in performance, another design leverages a callback based solution:

fn load_cell_code<F>(
    &self,
    buf_ptr: u64,
    len: usize,
    content_offset: usize,
    content_size: usize,
    index: usize,
    source: Source,
    writer: F,
) -> Result<(), Error>
where
    F: FnMut(page_start: u64, data: &[u8; 4096], flag: u8) -> Result<(), Error>;

This way a callback function handles actual data writing, which is mostly what SyscallImplsSynchronousWrapper needs to simply implement. No big bulk of data are passed around in this design.

So really the key point to me here, is that SyscallImplsSynchronousWrapper should not be in preparing any piece of the data(real memory data, memory flags, etc.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some more thought, the API proposed here does not really work. I've put my personal suggestions in this commit.

Basically, the idea is: current API works best for ckb-std, while I believe it has benefits to keep SyscallImplsSynchronousWrapper a pure adapter pattern, we can introduce a separate struct implementing proper behaviors for supporting load_cell_code for ckb's current setup.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solution is a good one that supports both ckb-std and ckb-debugger. I will try to implement and test it in ckb-debugger.

}
Ok(())
}
_ => unreachable!(),
}
}

fn load_cell_data(&self, buf: &mut [u8], offset: usize, _index: usize, _source: Source) -> IoResult {
Expand Down