Commit d157927

unix: new implementation of unix/stream & unix/seqpacket
[this is an updated version of d80a97d, which had been reverted]

Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension of SOCK_STREAM. The change meets three goals: get rid of unix(4) specific stuff in the generic socket code, provide faster and more robust unix/stream sockets, and bring unix/seqpacket much closer to specification.

Highlights follow:

- The send buffer now is truly bypassed. Previously it was always empty, but send(2) still needed to acquire its lock and do a variety of tricks to be woken up at the right time while sleeping on it. Now the only two things we care about in the send buffer are the I/O sx(9) lock that serializes operations and the value of so_snd.sb_hiwat, which we can read without obtaining a lock. The sleep of a send(2) happens on the mutex of the receive buffer of the peer. A bulk send/recv of data with large socket buffers will make both syscalls just bounce between owning the receive buffer lock and copyin(9)/copyout(9); no other locks would be involved. Since event notification mechanisms, such as select(2), poll(2) and kevent(2), use the state of the send buffer to monitor writability, the new implementation provides protocol specific pr_sopoll and pr_kqfilter. The sendfile(2) over unix/stream is preserved, providing protocol specific pr_send and pr_sendfile_wait methods.

- The implementation uses the new mchain structure to manipulate mbuf chains. Note that this required converting to mchain two functions that are shared with unix/dgram, unp_internalize() and unp_addsockcred(), as well as adding a new shared one, uipc_process_kernel_mbuf(). This induces some non-functional changes in the unix/dgram code as well. There is room for improvement here, as right now it is a mix of mchain and manually managed mbuf chains.

- unix/seqpacket, previously marked as PR_ADDR & PR_ATOMIC and thus treated as a datagram socket by the generic socket code, now becomes a true stream socket with record markers.

- Note on aio(4). The first problem with socket aio(4) is that it uses socket buffer locks for queueing and, piggybacking on this locking, it calls soreadable() and sowriteable() directly. Ideally it should use the pr_sopoll() method. The second problem is that, unlike a syscall, aio(4) wants a consistent uio structure upon return. This is incompatible with our speculative read optimization, so in case of aio(4) write we need to restore consistency of uio. At this point we work around those problems on the side of unix(4), but ideally those workarounds should be the problem of socket aio(4) (not a first class citizen) rather than of unix(4), which is definitely a primary facility.
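
The record-marker behaviour described for unix/seqpacket can be observed from userland. The following is a minimal sketch and not code from this commit: the helper name seqpacket_record_sizes() is hypothetical, and it uses only the standard socketpair(2), send(2) and recv(2) calls. It assumes a platform that supports SOCK_SEQPACKET on PF_UNIX sockets.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the commit: queue two records on a
 * PF_UNIX SOCK_SEQPACKET pair and report what each recv(2) returns.
 * Returns 0 on success, -1 on any socket error.
 */
int
seqpacket_record_sizes(ssize_t sizes[2])
{
	int sv[2];
	char buf[64];
	int rv = -1;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) != 0)
		return (-1);

	/* Two separate records, queued back to back. */
	if (send(sv[0], "abc", 3, 0) != 3 ||
	    send(sv[0], "defgh", 5, 0) != 5)
		goto out;

	/*
	 * The buffer is large enough for both records, but a single
	 * recv(2) must not cross a record boundary.
	 */
	sizes[0] = recv(sv[1], buf, sizeof(buf), 0);
	sizes[1] = recv(sv[1], buf, sizeof(buf), 0);
	if (sizes[0] < 0 || sizes[1] < 0)
		goto out;
	rv = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return (rv);
}
```

On a conforming implementation the two sizes should come back as 3 and 5, one per record, whereas the same sequence on a SOCK_STREAM pair could return all 8 bytes from the first recv(2).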
1 parent fbd7087 commit d157927

2 files changed: +1343 -391 lines changed
