⚠️ [inter-processor communication] remove hardware spinlocks and inter-core communication links #1268
This PR removes the hardware spinlocks module and also the inter-core communication links.
The hardware spinlocks are definitely (another) example of over-engineering. They cost unnecessary additional hardware. The same functionality can be realized on the software side via the `A` ISA extension and its sub-extensions:

- `Zalrsc`: Use load-reserved and store-conditional instructions for implementing a spinlock. This approach adds only a minimal hardware overhead. However, it is also less efficient, as the processor implements only a single global reservation set.
- `Zaamo`: Use atomic read-modify-write operations for implementing a spinlock. This is very efficient in terms of performance but adds a significant hardware overhead.

Until now, the ICC mechanism was mainly used for booting the second core. The ICC queues were too small for efficient communication between the cores (only 4 entries). Larger queues would cost significantly more hardware and would also affect the critical path of the processor, as they are mapped directly into the CSR logic of the core. In addition, since the queues are ultimately only FIFOs, they provide only serial access to the data. The nice thing about the ICC, though, was its low latency.
However, communication via the shared memory is much more flexible. For example, by using C's `_Atomic` variables you can build various communication structures.
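
As a rough illustration of what such software replacements could look like, here is a minimal sketch using only C11 `<stdatomic.h>`: a spinlock built on an atomic exchange, and a single-producer/single-consumer queue placed in shared memory. All names (`sw_spinlock_t`, `sw_spin_lock`, `spsc_queue_t`, `queue_push`, `QUEUE_SIZE`, ...) are hypothetical and are not part of this PR or of any existing library. How the compiler lowers the atomic operations (AMO instructions or LR/SC loops) depends on the toolchain and on the enabled ISA extensions, which is not verified here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software spinlock: 0 = free, 1 = taken. */
typedef struct {
    _Atomic uint32_t locked;
} sw_spinlock_t;

#define SW_SPINLOCK_INIT { 0u }

/* Try to take the lock once; returns true on success. */
static inline bool sw_spin_trylock(sw_spinlock_t *l) {
    /* Atomic exchange returns the previous value: 0 means we got the lock. */
    return atomic_exchange_explicit(&l->locked, 1u, memory_order_acquire) == 0u;
}

/* Busy-wait until the lock has been acquired. */
static inline void sw_spin_lock(sw_spinlock_t *l) {
    while (!sw_spin_trylock(l)) {
        /* spin */
    }
}

/* Release the lock; earlier writes become visible to the next owner. */
static inline void sw_spin_unlock(sw_spinlock_t *l) {
    atomic_store_explicit(&l->locked, 0u, memory_order_release);
}

/* Hypothetical queue depth (assumption); must be a power of two so that
 * the free-running indices stay consistent when they wrap around. */
#define QUEUE_SIZE 16u
static_assert((QUEUE_SIZE & (QUEUE_SIZE - 1u)) == 0u,
              "QUEUE_SIZE must be a power of two");

/* Single-producer/single-consumer queue living in shared memory. */
typedef struct {
    _Atomic uint32_t head;      /* next slot to read, written by the consumer only */
    _Atomic uint32_t tail;      /* next slot to write, written by the producer only */
    uint32_t data[QUEUE_SIZE];
} spsc_queue_t;

/* Producer side: returns false if the queue is full. */
static inline bool queue_push(spsc_queue_t *q, uint32_t value) {
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QUEUE_SIZE) {
        return false; /* full */
    }
    q->data[tail % QUEUE_SIZE] = value;
    /* Publish the new entry only after the data has been written. */
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false if the queue is empty. */
static inline bool queue_pop(spsc_queue_t *q, uint32_t *value) {
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail) {
        return false; /* empty */
    }
    *value = q->data[head % QUEUE_SIZE];
    /* Hand the slot back to the producer only after the data has been read. */
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;
}
```

In this sketch one core would call `queue_push()` and the other `queue_pop()` on the same `spsc_queue_t` object; the queue itself needs no lock because each index has exactly one writer. The spinlock is only required for structures with several writers. Unlike the removed ICC FIFOs, the depth is a compile-time constant that can be changed without touching the hardware.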