Implement load data as code in syscall tracer #22

mohanson · 2025-08-04T02:41:57Z

No description provided.

xxuejie · 2025-08-05T00:06:40Z

ckb-vm-fuzzing-utils/src/lib.rs

+                    index,
+                    source,
+                );
+                machine.memory_mut().store_bytes(addr, &buf)?;


I'm not sure about this. The whole idea of SyscallImplsSynchronousWrapper is that the underlying impls should handle the full syscall. What are the storing and permission setting doing here?

For storing, I wrote it based on load_data: https://github.com/nervosnetwork/ckb-vm-contrib/blob/main/ckb-vm-fuzzing-utils/src/lib.rs#L82

For permission setting, replied here: #22 (comment)

xxuejie · 2025-08-05T00:07:24Z

ckb-vm-syscall-tracer/protos/traces.proto

  uint64 additional_length = 2;
 }

+message IoDataAsCode {


Should we use a new type, or should we reuse IoData (but restrict additional_length to 0)?

Personally I would just reuse IoData

Of course, reusing the previous data structure is more concise, I will modify it.

I thought about it and it's probably not possible, because you designed IoData to handle Partial Loading, but Load Cell Data As Code is not designed with partial loading in mind. If I choose to share IoData, I won't be able to distinguish them in apply_partial_content.

Note that the following are 2 separate, independent questions:

1. Do we need a new type like IoDataAsCode in the protobuf message format?

2. Do we need a new enum variant in PartialSyscallContent?

I know what to do. For the two questions above, I think I can:

1. No.

2. Yes.
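A minimal sketch of what those two answers could look like together, assuming hand-written stand-ins rather than the generated protobuf types: the wire format keeps a single IoData message, while the in-memory enum gets a separate variant so that the partial-loading logic never sees as-code loads. The variant names Io and IoAsCode and the helper as_code_content are hypothetical.

```rust
/// Hypothetical stand-in for the protobuf `IoData` message (not the generated type).
#[derive(Debug, Clone, PartialEq)]
pub struct IoData {
    pub available_data: Vec<u8>,
    pub additional_length: u64,
}

/// Hypothetical in-memory enum; the variant names are assumptions.
#[derive(Debug, PartialEq)]
pub enum PartialSyscallContent {
    /// Regular loads, which may be partial.
    Io(IoData),
    /// Load-cell-data-as-code, which is always a full load.
    IoAsCode(Vec<u8>),
}

/// Wraps a recorded `IoData` for the as-code syscall, rejecting partial loads.
pub fn as_code_content(data: IoData) -> Result<PartialSyscallContent, String> {
    if data.additional_length != 0 {
        return Err(format!(
            "as-code load must not be partial, {} bytes missing",
            data.additional_length
        ));
    }
    Ok(PartialSyscallContent::IoAsCode(data.available_data))
}
```

With this split, a function like apply_partial_content can match on the variant and skip IoAsCode entirely.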

xxuejie · 2025-08-05T00:09:12Z

protobuf-ckb-syscalls/src/lib.rs

+            }
+            Some(traces::syscall::Value::IoDataAsCode(io_data_as_code)) => {
+                unsafe {
+                    std::ptr::copy_nonoverlapping(io_data_as_code.available_data.as_ptr(), buf_ptr, len);


This is the place IMHO we should do the full syscall implementation, including:

alignment checking

code copying

zero-fill padding regions before and after the code

setting proper memory permissions.
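The four steps above can be sketched on a toy memory model. This is only an illustration under stated assumptions: ToyMemory is a hypothetical flat buffer with one flag byte per page, not the ckb-vm Memory trait, and offset_in_region (where the code starts inside the loaded region) is an assumed parameter used to show padding on both sides.

```rust
pub const RISCV_PAGESIZE: usize = 1 << 12;
pub const FLAG_FREEZED: u8 = 0b01;
pub const FLAG_EXECUTABLE: u8 = 0b10;

/// Toy flat memory with one flag byte per page (hypothetical stand-in).
pub struct ToyMemory {
    pub bytes: Vec<u8>,
    pub flags: Vec<u8>,
}

impl ToyMemory {
    /// Memory starts filled with 0xff so zero-filling is observable.
    pub fn new(pages: usize) -> Self {
        ToyMemory { bytes: vec![0xff; pages * RISCV_PAGESIZE], flags: vec![0; pages] }
    }
}

pub fn load_code(
    mem: &mut ToyMemory,
    addr: usize,
    memory_size: usize,
    offset_in_region: usize,
    code: &[u8],
) -> Result<(), String> {
    // 1. Alignment checking: both the address and the size must be page aligned.
    if addr % RISCV_PAGESIZE != 0 || memory_size % RISCV_PAGESIZE != 0 {
        return Err("address and size must be page aligned".to_string());
    }
    let end = addr.checked_add(memory_size).ok_or("address overflow")?;
    let code_end = offset_in_region.checked_add(code.len()).ok_or("size overflow")?;
    if end > mem.bytes.len() || code_end > memory_size {
        return Err("out of bound".to_string());
    }
    let region = &mut mem.bytes[addr..end];
    // 2. Zero-fill the padding before and after the code.
    region[..offset_in_region].fill(0);
    region[code_end..].fill(0);
    // 3. Copy the code itself.
    region[offset_in_region..code_end].copy_from_slice(code);
    // 4. Mark every touched page as frozen and executable.
    for page in (addr / RISCV_PAGESIZE)..(end / RISCV_PAGESIZE) {
        mem.flags[page] = FLAG_FREEZED | FLAG_EXECUTABLE;
    }
    Ok(())
}
```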

I believe there are some design reasons that may cause issues with the points you mentioned in the implementation:

Alignment checking: buf_ptr stores the address of the buffer in the native environment, not in the ckb-vm. Therefore, alignment checking cannot be performed on it. (I designed this API following io_syscall. Please let me know if I got this wrong.)

Code copying: No issues here.

Zero-fill padding regions before and after the code: For simplicity, I collected the zero-padding before and after the code via the syscall tracer, so we only need to copy here.

Setting proper memory permissions: In the context of this function, I have no way to access the memory instance. I think you're right, but I'm not sure how to handle it. So, I wrote the permission-related code in ckb-vm-fuzzing-utils.

I think there is some confusion. I will just explain my original design ideas:

SyscallImplsSynchronousWrapper in ckb-script-fuzzing-toolkit is just an adapter for converting between different styles of CKB syscalls. Given another chance I would definitely want to unify the interfaces used by different CKB syscalls. But we live in the world as it is, so this adapter was built. It should not contain any actual implementation, or any part of a syscall implementation. It always defers to the inner SyscallImpls trait to fully implement a syscall.

protobuf-ckb-syscalls was originally designed for fuzzing code running in a native environment. But this is not always the case: here one can see that the parameters for protobuf-based CKB syscalls could also come from a CKB-VM's memory region (this assumes that the memory has certain restrictions, of course, but that is just a detail). We can just think of the parameter as a pointer to a location; you can cast it to an address and check for alignment.

It is definitely right that protobuf-based CKB syscalls neglect the handling of memory permissions (and likely, dirty page checks), but if I were doing this, I would consider the trait method definition for load_cell_code in SyscallImpls not good enough. I would lean towards modifying this trait method definition to be capable enough, and then fully implement the syscall right inside protobuf-based CKB syscalls. Relying on SyscallImplsSynchronousWrapper to do part of the syscall is, to me, a really leaky abstraction.
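To make the "pure adapter" idea concrete, here is a minimal sketch under assumed names: InnerImpls is a hypothetical, trimmed-down stand-in for the inner syscall trait, PureAdapter only decodes register values and encodes the return code, and Recorder is a toy inner implementation that owns every check. The error code 1 is made up for illustration.

```rust
/// Hypothetical, trimmed-down stand-in for the inner syscall trait.
pub trait InnerImpls {
    fn load_cell_code(&mut self, addr: u64, len: u64, index: u64, source: u64) -> Result<(), u64>;
}

/// Adapter: converts calling conventions only, with no syscall logic of its own.
pub struct PureAdapter<I> {
    pub inner: I,
}

impl<I: InnerImpls> PureAdapter<I> {
    /// `regs` mimics registers A0..A5; the return value is what would go back into A0.
    pub fn ecall(&mut self, regs: &[u64; 6]) -> u64 {
        match self.inner.load_cell_code(regs[0], regs[1], regs[4], regs[5]) {
            Ok(()) => 0,
            Err(code) => code,
        }
    }
}

/// Toy inner implementation: all validation lives here, not in the adapter.
pub struct Recorder {
    pub calls: Vec<(u64, u64, u64, u64)>,
}

impl InnerImpls for Recorder {
    fn load_cell_code(&mut self, addr: u64, len: u64, index: u64, source: u64) -> Result<(), u64> {
        // Alignment is checked by the implementation; 1 is a hypothetical error code.
        if addr % 4096 != 0 || len % 4096 != 0 {
            return Err(1);
        }
        self.calls.push((addr, len, index, source));
        Ok(())
    }
}
```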

Modify the load_cell_code definition in SyscallImpls to add two parameters: _memory_addr and _memory_flag. _memory_addr represents the location in ckb-vm memory, which may or may not match buf_ptr depending on the implementation. _memory_flag is used to record modified page flags. Depending on the implementation, it may directly map to the flags in the memory instance or serve as a temporary buffer.

    fn load_cell_code(
        &self,
        buf_ptr: *mut u8,
        len: usize,
        content_offset: usize,
        content_size: usize,
        index: usize,
        source: Source,
        _memory_addr: usize,
        _memory_flag: &mut [u8],
    ) -> Result<(), Error> {
        build_result(self.syscall(
            buf_ptr as u64,
            len as u64,
            content_offset as u64,
            content_size as u64,
            index as u64,
            source as u64,
            consts::SYS_LOAD_CELL_DATA_AS_CODE,
        ))
    }

In protobuf-ckb-syscalls, the implementation is as follows:

    fn load_cell_code(
        &self,
        buf_ptr: *mut u8,
        len: usize,
        _content_offset: usize,
        _content_size: usize,
        _index: usize,
        _source: Source,
        memory_addr: usize,
        memory_flag: &mut [u8],
    ) -> Result<(), Error> {
        match self.syscall() {
            Some(traces::syscall::Value::ReturnWithCode(code)) => {
                if code != 0 {
                    return Err(Error::try_from(code as u64).unwrap());
                }
                Ok(())
            }
            Some(traces::syscall::Value::IoData(io_data)) => {
                const RISCV_PAGESIZE: usize = 1 << 12;
                const FLAG_FREEZED: u8 = 0b01;
                const FLAG_EXECUTABLE: u8 = 0b10;
                if memory_addr % RISCV_PAGESIZE != 0 {
                    return Err(UNEXPECTED_ERROR);
                }
                if len % RISCV_PAGESIZE != 0 {
                    return Err(UNEXPECTED_ERROR);
                }
                if memory_flag.len() != len / RISCV_PAGESIZE {
                    return Err(UNEXPECTED_ERROR);
                }
                for i in 0..memory_flag.len() {
                    memory_flag[i] = FLAG_FREEZED | FLAG_EXECUTABLE;
                }
                unsafe {
                    std::ptr::copy_nonoverlapping(io_data.available_data.as_ptr(), buf_ptr, len);
                }
                Ok(())
            }
            _ => unreachable!(),
        }
    }

In ckb-vm-fuzzing-utils's SyscallImplsSynchronousWrapper, the handling is as follows:

    SyscallCode::LoadCellDataAsCode => {
        let addr = machine.registers()[A0].to_u64();
        let memory_size = machine.registers()[A1].to_u64();
        let content_offset = machine.registers()[A2].to_u64() as usize;
        let content_size = machine.registers()[A3].to_u64() as usize;
        let index = machine.registers()[A4].to_u64() as usize;
        let source = machine.registers()[A5].to_u64().try_into().expect("parse source");
        let mut buf = vec![0u8; memory_size as usize];
        let mut memory_flag = vec![0u8; memory_size as usize / RISCV_PAGESIZE];
        let result = self.impls.load_cell_code(
            buf.as_mut_ptr() as *mut u8,
            memory_size as usize,
            content_offset,
            content_size,
            index,
            source,
            addr as usize,
            &mut memory_flag,
        );
        machine.memory_mut().store_bytes(addr, &buf)?;
        for (i, f) in memory_flag.iter().enumerate() {
            let page = addr / RISCV_PAGESIZE as u64 + i as u64;
            machine.memory_mut().set_flag(page, *f)?;
        }
        self.set_return(result, machine);
    }

@xxuejie Do you think this modification is correct?

To me, SyscallImplsSynchronousWrapper still does work that belongs elsewhere: why does SyscallImplsSynchronousWrapper know exactly what flag to set here? I see 2 variations of a solution to this problem:

For the first solution, load_cell_code can simply return a list of memory pages to set and their permissions:

    fn load_cell_code(
        &self,
        buf_ptr: u64,
        len: usize,
        content_offset: usize,
        content_size: usize,
        index: usize,
        source: Source,
    ) -> Result<Vec<(u64, [u8; 4096], u8)>, Error>;

Note that the tuple is just for simplicity; we can debate whether a struct makes more sense.

In this signature, load_cell_code does not really write the data to VM memory; it merely prepares multiple pages of data as the return value. For each page, the method returns the page address, the full 4096 bytes of data, and the page flag. This way SyscallImplsSynchronousWrapper can just fill each page in VM memory, without thinking about how to prepare the data and the flag. Personally I consider this a real separation between the implementation and simple adapter code.
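A minimal sketch of this page-list variant, under stated assumptions: prepare_pages and apply_pages are hypothetical free functions standing in for the implementation side and the adapter side, VM memory is modeled as a plain byte slice with one flag byte per page, and the code is assumed to start at the beginning of the region with zero padding after it.

```rust
pub const RISCV_PAGESIZE: usize = 4096;

/// (page address, full page of data, page flag), as in the signature above.
pub type Page = (u64, [u8; RISCV_PAGESIZE], u8);

/// Implementation side (hypothetical): splits the code into zero-padded, flagged pages.
pub fn prepare_pages(addr: u64, len: usize, code: &[u8], flag: u8) -> Result<Vec<Page>, String> {
    if addr % RISCV_PAGESIZE as u64 != 0 || len % RISCV_PAGESIZE != 0 || code.len() > len {
        return Err("unaligned or oversized request".to_string());
    }
    let mut pages = Vec::with_capacity(len / RISCV_PAGESIZE);
    for i in 0..len / RISCV_PAGESIZE {
        let mut data = [0u8; RISCV_PAGESIZE];
        let start = i * RISCV_PAGESIZE;
        if start < code.len() {
            let end = code.len().min(start + RISCV_PAGESIZE);
            data[..end - start].copy_from_slice(&code[start..end]);
        }
        pages.push((addr + start as u64, data, flag));
    }
    Ok(pages)
}

/// Adapter side (hypothetical): writes exactly what it is given and decides nothing.
pub fn apply_pages(memory: &mut [u8], flags: &mut [u8], pages: &[Page]) {
    for (page_addr, data, flag) in pages {
        let start = *page_addr as usize;
        memory[start..start + RISCV_PAGESIZE].copy_from_slice(data);
        flags[start / RISCV_PAGESIZE] = *flag;
    }
}
```

The adapter never learns which flag value means "executable"; it only copies the byte it receives.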

Since returning a bulk of data could be debatable in terms of performance, another design uses a callback-based solution:

    fn load_cell_code<F>(
        &self,
        buf_ptr: u64,
        len: usize,
        content_offset: usize,
        content_size: usize,
        index: usize,
        source: Source,
        writer: F,
    ) -> Result<(), Error>
    where
        // Callback arguments, in order: page_start, data, flag.
        F: FnMut(u64, &[u8; 4096], u8) -> Result<(), Error>;

This way a callback function handles the actual data writing, which is mostly all that SyscallImplsSynchronousWrapper needs to implement. No large bulk of data is passed around in this design.

So really the key point to me here is that SyscallImplsSynchronousWrapper should not be preparing any piece of the data (real memory data, memory flags, etc.).
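A minimal sketch of the callback variant, under stated assumptions: load_cell_code_with is a hypothetical free function (not a trait method), errors are plain strings, and the code is assumed to start at the beginning of the region. Only one page buffer is alive at a time, and the writer closure is the only piece that touches memory.

```rust
pub const RISCV_PAGESIZE: usize = 4096;

/// Hypothetical implementation side: prepares one page at a time and hands it to `writer`.
pub fn load_cell_code_with<F>(
    addr: u64,
    len: usize,
    code: &[u8],
    flag: u8,
    mut writer: F,
) -> Result<(), String>
where
    // Callback arguments, in order: page_start, data, flag.
    F: FnMut(u64, &[u8; RISCV_PAGESIZE], u8) -> Result<(), String>,
{
    if addr % RISCV_PAGESIZE as u64 != 0 || len % RISCV_PAGESIZE != 0 || code.len() > len {
        return Err("unaligned or oversized request".to_string());
    }
    for i in 0..len / RISCV_PAGESIZE {
        // A fresh zeroed page gives the zero padding for free.
        let mut page = [0u8; RISCV_PAGESIZE];
        let start = i * RISCV_PAGESIZE;
        if start < code.len() {
            let end = code.len().min(start + RISCV_PAGESIZE);
            page[..end - start].copy_from_slice(&code[start..end]);
        }
        writer(addr + start as u64, &page, flag)?;
    }
    Ok(())
}
```

On the adapter side, the closure would do nothing more than copy the page into VM memory and store the flag it was handed.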

After some more thought, the API proposed here does not really work. I've put my personal suggestions in this commit.

Basically, the idea is: the current API works best for ckb-std. While I believe there are benefits to keeping SyscallImplsSynchronousWrapper a pure adapter, we can introduce a separate struct implementing the proper behavior to support load_cell_code for ckb's current setup.

This solution is a good one that supports both ckb-std and ckb-debugger. I will try to implement and test it in ckb-debugger.

mohanson · 2025-08-14T06:24:20Z

@xxuejie I modified the PR according to your suggestion and tested it in ckb-debugger. Please review this PR again.

xxuejie · 2025-08-14T06:43:20Z

protobuf-ckb-syscalls/src/lib.rs

        _source: Source,
    ) -> Result<(), Error> {
-        panic!("Load cell data as code is not suported!");
+        match self.syscall() {


I recommend adding a #[cfg(target_arch = "riscv64")] guard here. In a native environment (such as fuzzing), this implementation does not really work.

I don't quite understand this. It seems to me that this code is useful when the debugger uses CkbFlavoredImplSyscalls, and it shouldn't be ignored by #[cfg(target_arch = "riscv64")].

Referring to your original code:

https://github.com/xxuejie/ckb-vm-contrib/blob/load-cell-code-changes/ckb-vm-fuzzing-utils/src/lib.rs#L404

So ProtobufBasedSyscallImpls has multiple use cases: it can be used by CkbFlavoredImplSyscalls to implement a ckbvm Machine, and it can also be used by fuzzing code. When fuzzing, load_cell_code must fail, since load_cell_code is not supported there.

You are right that #[cfg(target_arch = "riscv64")] won't work, but we still need a way for fuzzers to leverage the same code where load_cell_code should just panic.

Maybe we can keep load_cell_code panicking as it is now, and move all the actual logic into ProtobufBasedSyscalls.

I updated the code according to the above idea. The additional modification is to move CkbFlavoredImplSyscalls and ProtobufBasedSyscalls to protobuf-ckb-syscalls (otherwise there will be a circular dependency problem between protobuf-ckb-syscalls and ckb-vm-fuzzing-utils).

To me this is not a proper change.

The point of having CkbFlavoredImplSyscalls is that there might well be other syscall implementations that could take advantage of it. So IMHO only one of the following 2 is a good solution:

1. Keep the generic logic in CkbFlavoredImplSyscalls in the fuzzing utils crate, and keep ProtobufBasedSyscalls in the protobuf-ckb-syscalls crate.

2. We remove CkbFlavoredImplSyscalls and only keep ProtobufBasedSyscalls.

So the question here is: can we build a generic enough type CkbFlavoredImplSyscalls that has all the details of load_cell_code, but allows customization of how to fetch the actual code data? If so, we should split the 2 and keep them in separate packages; otherwise, we should only have ProtobufBasedSyscalls.
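One way to picture the split being asked about, as a sketch only: the generic half owns the load_cell_code details (alignment, size checks, zero padding), while a small trait is the single customization point for fetching code bytes. CodeFetcher, GenericLoader, and FixedFetcher are hypothetical names, and returning the prepared region as a Vec is a simplification; none of this is the design from the linked commits.

```rust
pub const RISCV_PAGESIZE: usize = 4096;

/// Hypothetical customization point: only knows how to fetch the raw code bytes.
pub trait CodeFetcher {
    fn fetch_code(&mut self, index: usize, source: u64) -> Option<Vec<u8>>;
}

/// Hypothetical generic half: owns every load_cell_code detail except fetching.
pub struct GenericLoader<F> {
    pub fetcher: F,
}

impl<F: CodeFetcher> GenericLoader<F> {
    /// Returns the zero-padded region that should end up at `addr`.
    pub fn load_cell_code(
        &mut self,
        addr: u64,
        len: usize,
        index: usize,
        source: u64,
    ) -> Result<Vec<u8>, String> {
        if addr % RISCV_PAGESIZE as u64 != 0 || len % RISCV_PAGESIZE != 0 {
            return Err("address and size must be page aligned".to_string());
        }
        let code = self
            .fetcher
            .fetch_code(index, source)
            .ok_or_else(|| "item missing".to_string())?;
        if code.len() > len {
            return Err("code does not fit in the requested region".to_string());
        }
        let mut region = vec![0u8; len];
        region[..code.len()].copy_from_slice(&code);
        Ok(region)
    }
}

/// Toy fetcher backed by a fixed list, standing in for a protobuf-trace-backed one.
pub struct FixedFetcher(pub Vec<Vec<u8>>);

impl CodeFetcher for FixedFetcher {
    fn fetch_code(&mut self, index: usize, _source: u64) -> Option<Vec<u8>> {
        self.0.get(index).cloned()
    }
}
```

If a trait this small turns out to be enough, the two halves can live in separate crates without a circular dependency; if the generic half needs to know more about the protobuf side, that is an argument for keeping a single type.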

The second solution seems more appealing to me.

For the first idea, I don't have any particularly good suggestions for improvement. Intuitively, my thought is to add a flag to ProtobufBasedSyscallImpls to indicate whether load_cell_code should directly panic or perform an actual memory copy. This way, we can keep the original versions of CkbFlavoredImplSyscalls and ProtobufBasedSyscall unchanged while supporting both fuzzing and debugger use cases. However, this modification feels quite crude.

As for the design of CkbFlavoredImplSyscalls, I think it's still an open topic, perhaps better discussed in a new issue rather than pull requests.

I've submitted a new commit following solution 2: "We remove CkbFlavoredImplSyscalls and only keep ProtobufBasedSyscalls." This choice assumes that both solutions are "good solutions".

I've put together a different design here. If I were doing it, I would probably go with this alternative design.

I cannot speak for the whole team, but on a personal scale, I would rather wait and carefully craft a design than roll out a sloppy one early. IMHO it's really unfortunate that we have shipped a WHOLE LOT of code without careful thinking, resulting in poor APIs that we cannot change at a later stage. Fundamentally it's the team's decision, but I would vote against merging code before it is in a proper state.

I apologize for my rash suggestion.

I tested your new design, and it works perfectly. I reviewed it carefully, and I think it's great.

Also, I updated this PR, you can review it.

xxuejie · 2025-08-19T01:03:14Z

protobuf-ckb-syscalls/src/lib.rs

    }
 }
+
+// TODO: it remains a question if this should be it's own type, or we should


This TODO is posed as a question for discussion. I'm not sure if we should split ProtobufVmRunnerImpls into a separate type, or have ProtobufImpls implement CkbvmRunnerImpls.

Using a single ProtobufImpls avoids repetition and saves extra coding.

Using 2 types clearly distinguishes between the 2 uses, but we have to implement a dummy SyscallImpls on ProtobufVmRunnerImpls again.

Likewise, I can't say which behavior is more correct. But I tend to let each code do its own thing.

In that sense feel free to just remove the TODOs.

xxuejie

🎆

Implement load data as code

7e251e4

mohanson requested a review from xxuejie August 4, 2025 02:41

mohanson changed the title ~~Implement load data as code~~ Implement load data as code in syscall tracer Aug 4, 2025

xxuejie reviewed Aug 5, 2025

Reuse IoData

e164d26

mohanson mentioned this pull request Aug 11, 2025

Update the signature of the load cell code nervosnetwork/ckb-std#140

Closed

mohanson force-pushed the update branch from 2f0ba08 to e164d26 Compare August 14, 2025 05:26

Follow xxuejie's advice

54637cc

Cargo fmt

9bb597d

xxuejie reviewed Aug 14, 2025

mohanson added 3 commits August 14, 2025 17:31

Move real logic of load_cell_code into ProtobufBasedSyscalls

2f60f88

Only keep ProtobufBasedSyscalls

a96306e

Use the new design

8483a81

xxuejie reviewed Aug 19, 2025

Remove a todo

6dc02e2

xxuejie approved these changes Aug 19, 2025

mohanson merged commit e955f28 into nervosnetwork:main Aug 19, 2025
7 checks passed

mohanson deleted the update branch August 19, 2025 05:35
