1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -29,6 +29,7 @@ mimpid = 0x01040312 -> Version 01.04.03.12 -> v1.4.3.12

| Date | Version | Comment | Ticket |
|:----:|:-------:|:--------|:------:|
| 16.05.2025 | 1.11.4.8 | :warning: remove hardware spinlocks and CPU's inter-core communication links | [#1268](https://github.com/stnolting/neorv32/pull/1268) |
| 16.05.2025 | 1.11.4.7 | :warning: make `mcause` CSR read-only | [#1267](https://github.com/stnolting/neorv32/pull/1267) |
| 12.05.2025 | 1.11.4.6 | :bug: fix missing burst signal in bus register stage (introduced in previous version / v1.11.4.5) | [#1266](https://github.com/stnolting/neorv32/pull/1266) |
| 12.05.2025 | 1.11.4.5 | add explicit "burst" signal to processor-internal bus; clean-up CPU "fence" decoding | [#1265](https://github.com/stnolting/neorv32/pull/1265) |
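
The version column above follows the `mimpid` mapping quoted in the hunk header (`0x01040312` -> `v1.4.3.12`). The following is a minimal sketch of that decoding, assuming each byte of the word is a two-digit BCD field as the example suggests; `bcd_to_uint` and `mimpid_to_string` are hypothetical helper names, not part of the NEORV32 software library.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Convert one two-digit BCD byte (e.g. 0x12) to its decimal value (12). */
static unsigned bcd_to_uint(uint8_t b) {
    return (unsigned)(b >> 4) * 10u + (unsigned)(b & 0x0Fu);
}

/* Hypothetical helper: format a mimpid-style word as "vA.B.C.D".
 * Returns the snprintf result (characters that would have been written). */
int mimpid_to_string(uint32_t mimpid, char *buf, size_t len) {
    return snprintf(buf, len, "v%u.%u.%u.%u",
                    bcd_to_uint((uint8_t)(mimpid >> 24)),
                    bcd_to_uint((uint8_t)(mimpid >> 16)),
                    bcd_to_uint((uint8_t)(mimpid >> 8)),
                    bcd_to_uint((uint8_t)mimpid));
}
```

Under this assumption, the version added by this change (1.11.4.8) would correspond to the word `0x01110408`.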
1 change: 0 additions & 1 deletion README.md
@@ -190,7 +190,6 @@ for custom tightly-coupled co-processors, accelerators or interfaces
data transfers and conversions
* cyclic redundancy check unit ([CRC](https://stnolting.github.io/neorv32/#_cyclic_redundancy_check_crc)) to test
data integrity (CRC8/16/32)
* 32 dedicated hardware spinlocks ([HWSPINLOCK](https://stnolting.github.io/neorv32/#_hardware_spinlocks_hwspinlock))

**Debugging**

35 changes: 11 additions & 24 deletions docs/datasheet/cpu.adoc
@@ -68,23 +68,20 @@ direction as seen from the CPU.
|=======================
| Signal | Width/Type | Dir | Description
4+^| **Clock and reset**
| `clk_i` | 1 | in | Global clock line, all registers triggering on rising edge.
| `rstn_i` | 1 | in | Global reset, low-active.
| `clk_i` | 1 | in | Global clock line, all registers triggering on rising edge.
| `rstn_i` | 1 | in | Global reset, low-active.
4+^| **Interrupts (<<_traps_exceptions_and_interrupts>>)**
| `msi_i` | 1 | in | RISC-V machine software interrupt.
| `mei_i` | 1 | in | RISC-V machine external interrupt.
| `mti_i` | 1 | in | RISC-V machine timer interrupt.
| `firq_i` | 16 | in | Custom fast interrupt request signals.
| `dbi_i` | 1 | in | Request CPU to halt and enter debug mode (RISC-V <<_on_chip_debugger_ocd>>).
4+^| **<<_inter_core_communication_icc>> links**
| `icc_tx_o` | `icc_t` | out | TX link
| `icc_rx_i` | `icc_t` | in | RX link
| `msi_i` | 1 | in | RISC-V machine software interrupt.
| `mei_i` | 1 | in | RISC-V machine external interrupt.
| `mti_i` | 1 | in | RISC-V machine timer interrupt.
| `firq_i` | 16 | in | Custom fast interrupt request signals.
| `dbi_i` | 1 | in | Request CPU to halt and enter debug mode (RISC-V <<_on_chip_debugger_ocd>>).
4+^| **Instruction <<_bus_interface>>**
| `ibus_req_o` | `bus_req_t` | out | Instruction fetch bus request.
| `ibus_rsp_i` | `bus_rsp_t` | in | Instruction fetch bus response.
| `ibus_req_o` | `bus_req_t` | out | Instruction fetch bus request.
| `ibus_rsp_i` | `bus_rsp_t` | in | Instruction fetch bus response.
4+^| **Data <<_bus_interface>>**
| `dbus_req_o` | `bus_req_t` | out | Data access (load/store) bus request.
| `dbus_rsp_i` | `bus_rsp_t` | in | Data access (load/store) bus response.
| `dbus_req_o` | `bus_req_t` | out | Data access (load/store) bus request.
| `dbus_rsp_i` | `bus_rsp_t` | in | Data access (load/store) bus response.
|=======================

.Bus Interface Protocol
@@ -118,7 +115,6 @@ The generic type "suv(x:y)" represents a `std_ulogic_vector(x downto y)`.
| `BOOT_ADDR` | suv(31:0) | CPU reset address. See section <<_address_space>>.
| `DEBUG_PARK_ADDR` | suv(31:0) | "Park loop" entry address for the <<_on_chip_debugger_ocd>>, has to be 4-byte aligned.
| `DEBUG_EXC_ADDR` | suv(31:0) | "Exception" entry address for the <<_on_chip_debugger_ocd>>, has to be 4-byte aligned.
| `ICC_EN` | boolean | Implement <<_inter_core_communication_icc>> module. Automatically enabled for the SMP <<_dual_core_configuration>>.
| `RISCV_ISA_Sdext` | boolean | Implement RISC-V-compatible "debug" CPU operation mode required for the <<_on_chip_debugger_ocd>>.
| `RISCV_ISA_Sdtrig` | boolean | Implement RISC-V-compatible <<_trigger_module>>. See section <<_on_chip_debugger_ocd>>.
| `RISCV_ISA_Smpmp` | boolean | Implement RISC-V-compatible physical memory protection (PMP). See section <<_smpmp_isa_extension>>.
@@ -616,15 +612,6 @@ extension is shorthand for the following set of other extensions:
The C `_Atomic` specifier should be used for atomic variables.
See section <<_coherence_example>> for more information.
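
As a minimal sketch of what "use `_Atomic`" can look like in practice, the following builds a simple software lock from the C11 `<stdatomic.h>` primitives. The names `demo_spinlock_t`, `demo_spin_trylock`, `demo_spin_lock` and `demo_spin_unlock` are hypothetical and not part of the NEORV32 software framework. Whether the compiler maps these operations to `A`-extension instructions depends on the toolchain and the `-march` setting, so treat this as an illustration rather than a reference implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type built on the C11 atomic_flag primitive. */
typedef struct {
    atomic_flag flag;
} demo_spinlock_t;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try to take the lock once; returns true if this caller now owns it. */
static inline bool demo_spin_trylock(demo_spinlock_t *l) {
    /* test-and-set returns the previous state: false means the lock was free */
    return !atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire);
}

/* Busy-wait until the lock has been acquired. */
static inline void demo_spin_lock(demo_spinlock_t *l) {
    while (!demo_spin_trylock(l)) {
        /* spin */
    }
}

/* Release the lock; the release ordering publishes prior writes to the next owner. */
static inline void demo_spin_unlock(demo_spinlock_t *l) {
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}
```

A shared lock would be declared once, for example `static demo_spinlock_t lock = DEMO_SPINLOCK_INIT;`, and every access to the protected data would be wrapped in `demo_spin_lock(&lock)` / `demo_spin_unlock(&lock)`.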

.Hardware-Based Spinlocks
[TIP]
The `A` ISA extension and its sub-extensions require a lot of additional logic, which increases
the size of the processor. Furthermore, the `A` instructions require special attention to maintain
<<_memory_coherence>>. Alternatively (or in addition), hardware-based spinlocks can be enabled.
They can be used to build more complex synchronization structures without the need for the complex `A`
ISA extension or special attention to memory coherence. See the <<_hardware_spinlocks_hwspinlock>> module
description for more information.


==== `B` ISA Extension

47 changes: 0 additions & 47 deletions docs/datasheet/cpu_csr.adoc
@@ -81,8 +81,6 @@ In the following table these CSRs are highlighted with "⚠️".
| 0xf14 | <<_mhartid>> | `CSR_MHARTID` | MRO | Machine hardware thread ID
| 0xf15 | <<_mconfigptr>> | `CSR_MCONFIGPTR` | MRO | Machine configuration pointer register
5+^| **<<_neorv32_specific_csrs>>**
| 0xbc0 | <<_mxiccsreg>> | `CSR_MXICCSREG` | MRW | Inter-core communication status register
| 0xbc1 | <<_mxiccdata>> | `CSR_MXICCDATA` | MRW | Inter-core communication data register
| 0x800 .. 0x803 | <<_cfureg, `cfureg0`>> .. <<_cfureg, `cfureg3`>> | `CSR_CFUCREG0` .. `CSR_CFUCREG3` | URW | Custom CFU registers 0 to 3
| 0xfc0 | <<_mxisa>> | `CSR_MXISA` | MRO | Extended machine CPU ISA and extensions
|=======================
@@ -940,51 +938,6 @@ custom/implementation-specific use (assured by the RISC-V privileged specificati
| Description | User-defined CSRs to be used within the <<_custom_functions_unit_cfu>>.
|=======================

{empty} +
[discrete]
===== **`mxiccsreg`**

[cols="<1,<8"]
[grid="none"]
|=======================
| Name | <<_inter_core_communication_icc>> status register
| Address | `0xbc0`
| Reset value | `0x40000000`
| ISA | `Zicsr` & `X`
| Description | Shows the status of the core's inter-core communication link (message queue / FIFO status flags).
The entire CSR is read-only; write accesses are ignored.
This CSR is hardwired to all-zero if the <<_dual_core_configuration>> is disabled.
|=======================

.`mxiccsreg` CSR Bits
[cols="^1,^2,^1,<5"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Description
| 0 | `CSR_MXICCSREG_RX_AVAIL` | r/- | Set if RX data from the other core is available.
| 1 | `CSR_MXICCSREG_TX_FREE` | r/- | Set if there is free space for TX data for the other core.
| 31:2 | - | r/- | Reserved; hardwired to zero.
|=======================


{empty} +
[discrete]
===== **`mxiccdata`**

[cols="<1,<8"]
[grid="none"]
|=======================
| Name | <<_inter_core_communication_icc>> data register
| Address | `0xbc1`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `X`
| Description | This CSR provides access to the inter-core communication message queues that are implemented
as simple FIFOs. Writing to this register will put data into the message queue so it can be read by the other
core. Reading from this register will return data received from the other core (i.e. this CSR has side effects
when reading). A read access will return all-zero if no RX data is available from the other core.
This CSR is hardwired to all-zero if the <<_dual_core_configuration>> is disabled.
|=======================


{empty} +
[discrete]
53 changes: 8 additions & 45 deletions docs/datasheet/cpu_dual_core.adoc
@@ -47,51 +47,11 @@ sections at boot-up.
| **Cache coherency** | Be aware that there is no cache snooping available. If any CPU1 cache is enabled
care must be taken to prevent access to outdated data - either by using cache synchronization (`fence` / `fence.i`
instructions) or by using atomic memory accesses. See <<_memory_coherence>> for more information.
| **Inter-core communication** | See section <<_inter_core_communication_icc>>.
| **Bootloader** | Only core 0 will boot and execute the bootloader while core 1 is held in standby.
| **Booting** | See section <<_dual_core_boot>>.
|=======================


==== SMP Software Library

An SMP library provides basic functions for launching the secondary core and for performing direct
core-to-core communication:

[cols="<1,<8"]
[grid="none"]
|=======================
| neorv32_smp.c | link:https://stnolting.github.io/neorv32/sw/neorv32__smp_8c.html[Online software reference (Doxygen)]
| neorv32_smp.h | link:https://stnolting.github.io/neorv32/sw/neorv32__smp_8h.html[Online software reference (Doxygen)]
|=======================


==== Inter-Core Communication (ICC)

Both cores can communicate with each other via a direct point-to-point connection based on FIFO-like message
queues. These direct communication links are faster (in terms of latency) compared to a memory-mapped or
shared-memory communication. Additionally, communication using these links is guaranteed to be atomic.

The inter-core communication (ICC) module is implemented as a dedicated hardware module within each CPU core
(VHDL file `rtl/core/neorv32_cpu_icc.vhd`). This module is automatically included if the dual-core option
is enabled. Each core provides a **32-bit wide** and **4 entries deep** FIFO for sending data to the other core.
Hence, there are two FIFOs: one for sending data from core 0 to core 1 and another one for sending data the
opposite way.

The ICC communication links are accessed via two NEORV32-specific CSRs. Hence, those FIFOs are accessible only
by the CPU core itself and cannot be accessed by the DMA or any other CPU core.

The <<_mxiccsreg>> CSR provides read-only status information about the core's ICC links: bit 0 is set if
there is RX data available for _this_ core (sent by the other core). Bit 1 is set as long as there is
free space in _this_ core's TX data FIFO. The <<_mxiccdata>> CSR is used for the actual send/receive operations.
Writing this register puts the corresponding data word into the TX link FIFO of _this_ core. Reading this CSR
returns a data word from the RX FIFO of _this_ core.

The ICC FIFOs do not provide any interrupt capabilities. Software is expected to use the machine-software
interrupt of the receiving core (provided by the <<_core_local_interruptor_clint>>) to inform it about
available messages.
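
The link behavior described in this section can be sketched as a small host-side model. This is only a behavioral illustration under stated assumptions, not the RTL and not the library API: the type and function names (`icc_fifo_t`, `icc_port_t`, `icc_status`, `icc_write`, `icc_read`) are hypothetical, and dropping a write when the TX FIFO is full is a modeling choice that the text does not specify.

```c
#include <stdbool.h>
#include <stdint.h>

#define ICC_FIFO_DEPTH      4u          /* "4 entries deep" per the description */
#define ICC_STATUS_RX_AVAIL (1u << 0)   /* status bit 0: RX data available */
#define ICC_STATUS_TX_FREE  (1u << 1)   /* status bit 1: free space in TX FIFO */

/* One direction of the link: a small ring buffer of 32-bit words. */
typedef struct {
    uint32_t data[ICC_FIFO_DEPTH];
    unsigned head;   /* index of the oldest entry */
    unsigned count;  /* number of valid entries */
} icc_fifo_t;

/* A core's view of the link: the FIFO it writes and the FIFO it reads. */
typedef struct {
    icc_fifo_t *tx;
    icc_fifo_t *rx;
} icc_port_t;

/* Model of a status-register read. */
uint32_t icc_status(const icc_port_t *p) {
    uint32_t s = 0;
    if (p->rx->count > 0)              s |= ICC_STATUS_RX_AVAIL;
    if (p->tx->count < ICC_FIFO_DEPTH) s |= ICC_STATUS_TX_FREE;
    return s;
}

/* Model of a data-register write; returns false if the word was dropped (assumption). */
bool icc_write(icc_port_t *p, uint32_t word) {
    if (p->tx->count == ICC_FIFO_DEPTH) return false;
    p->tx->data[(p->tx->head + p->tx->count) % ICC_FIFO_DEPTH] = word;
    p->tx->count++;
    return true;
}

/* Model of a data-register read; returns all-zero if no RX data is available. */
uint32_t icc_read(icc_port_t *p) {
    if (p->rx->count == 0) return 0;
    uint32_t word = p->rx->data[p->rx->head];
    p->rx->head = (p->rx->head + 1u) % ICC_FIFO_DEPTH;
    p->rx->count--;
    return word;
}
```

Two ports that share the same pair of FIFOs in opposite roles model the two cores: what one port writes with `icc_write` is returned in order by `icc_read` on the other port.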


==== Dual-Core Boot

After reset, both cores start booting. However, core 1 will - regardless of the <<_boot_configuration>> - always
@@ -101,20 +61,23 @@ booting, executing either the <<_bootloader>> or the pre-installed image from th

To boot-up core 1, the primary core has to use a special library function provided by the NEORV32 software framework:

.CPU Core 1 launch function prototype (note that this function can only be executed on core 0)
.CPU Core 1 Launch Function Prototype (note that this function can only be executed on core 0)
[source,c]
----
int neorv32_smp_launch(int (*entry_point)(void), uint8_t* stack_memory, size_t stack_size_bytes);
----

When executed, core 0 uses the <<_inter_core_communication_icc>> to send launch data that includes the entry point
for core 1 (via `entry_point`) and the actual stack configuration (via `stack_memory` and `stack_size_bytes`).
Note that the main function for core 1 has to use a specific type (return `int`, no arguments):
When executed, core 0 uses the two 32-bit `MTIMECMP` registers of the <<_core_local_interruptor_clint>> to
store the _launch configuration_ at a defined address location. This launch configuration consists of the stack
configuration (via `stack_memory` and `stack_size_bytes`) and the actual entry point for core 1. After these
registers have been populated, core 0 triggers core 1's software interrupt (also via the CLINT) to wake it
from sleep mode. Core 1 then fetches the launch configuration and starts execution at the configured
entry point.

.CPU Core 1 Main Function
[source,c]
----
int core1_main(void) {
int core1_main(void) { // return `int`, no arguments
return 0; // return to crt0 and go to sleep mode
}
----
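
The handshake described above can be sketched as a small host-side model. Everything here is an assumption made for illustration: the `launch_model_t` structure merely stands in for the two 32-bit `MTIMECMP` words and core 1's software-interrupt bit, the order of the two words (stack top first, entry point second) is a guess, and the 16-byte stack alignment follows the usual RISC-V calling convention rather than anything stated in this change. It is not the implementation of `neorv32_smp_launch`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware state used during launch. */
typedef struct {
    uint32_t mailbox[2]; /* models the two 32-bit MTIMECMP words */
    uint32_t msi_core1;  /* models core 1's machine software interrupt bit */
} launch_model_t;

/* Launch configuration as seen by core 1. */
typedef struct {
    uint32_t stack_top;
    uint32_t entry;
} launch_cfg_t;

/* Core 0 side: publish the configuration, then raise core 1's software interrupt. */
void model_core0_launch(launch_model_t *hw, uint32_t entry,
                        uint32_t stack_base, uint32_t stack_size) {
    /* the stack grows downward, so pass its top; align down to 16 bytes (assumption) */
    uint32_t top = (stack_base + stack_size) & ~UINT32_C(0xF);
    hw->mailbox[0] = top;    /* assumed word order */
    hw->mailbox[1] = entry;
    hw->msi_core1 = 1;       /* wake core 1 only after the data is in place */
}

/* Core 1 side: returns false while no wake-up is pending, else fetches the configuration. */
bool model_core1_wake(launch_model_t *hw, launch_cfg_t *out) {
    if (hw->msi_core1 == 0) return false;
    out->stack_top = hw->mailbox[0];
    out->entry     = hw->mailbox[1];
    hw->msi_core1  = 0;      /* acknowledge the software interrupt */
    return true;
}
```

The ordering is the point of the sketch: the configuration is written before the interrupt is raised, so core 1 never reads a half-populated launch configuration.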
2 changes: 0 additions & 2 deletions docs/datasheet/overview.adoc
@@ -198,7 +198,6 @@ rtl/core
├-neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor (base ISA)
├-neorv32_cpu_decompressor.vhd - Compressed instructions decoder (C ext.)
├-neorv32_cpu_frontend.vhd - Instruction fetch and issue
├-neorv32_cpu_icc.vhd - Inter-core communication unit
├-neorv32_cpu_lsu.vhd - Load/store unit
├-neorv32_cpu_pmp.vhd - Physical memory protection unit (Smpmp ext.)
├-neorv32_cpu_regfile.vhd - Data register file
@@ -211,7 +210,6 @@ rtl/core
├-neorv32_fifo.vhd - Generic FIFO component
├-neorv32_gpio.vhd - General purpose input/output port unit
├-neorv32_gptmr.vhd - General purpose 32-bit timer
├-neorv32_hwspinlock.vhd - Hardware spinlocks
├-neorv32_imem.vhd - Generic processor-internal instruction memory
├-neorv32_neoled.vhd - NeoPixel (TM) compatible smart LED interface
├-neorv32_onewire.vhd - One-Wire serial interface controller
4 changes: 0 additions & 4 deletions docs/datasheet/soc.adoc
@@ -41,7 +41,6 @@ image::neorv32_processor.png[align=center]
* _optional_ autonomous direct memory access controller (<<_direct_memory_access_controller_dma,**DMA**>>)
* _optional_ stream link interface (<<_stream_link_interface_slink,**SLINK**>>), AXI4-Stream compatible
* _optional_ cyclic redundancy check unit (<<_cyclic_redundancy_check_crc,**CRC**>>)
* _optional_ hardware spinlocks (32x) (<<_hardware_spinlocks_hwspinlock,**HWSPINLOCK**>>)
* _optional_ on-chip debugger with JTAG TAP (<<_on_chip_debugger_ocd,**OCD**>>), optional authentication and hardware breakpoint
* _optional_ system configuration information memory to determine hardware configuration via software (<<_system_configuration_information_memory_sysinfo,**SYSINFO**>>)

@@ -303,7 +302,6 @@ The generic type "`suv(x:y)`" is an abbreviation for "`std_ulogic_vector(x downt
| `IO_SLINK_RX_FIFO` | natural | 1 | SLINK RX FIFO depth, has to be a power of two, minimum value is 1, max 32768.
| `IO_SLINK_TX_FIFO` | natural | 1 | SLINK TX FIFO depth, has to be a power of two, minimum value is 1, max 32768.
| `IO_CRC_EN` | boolean | false | Implement the <<_cyclic_redundancy_check_crc>> unit.
| `IO_HWSPINLOCK_EN` | boolean | false | Implement the <<_hardware_spinlocks_hwspinlock>> module.
|=======================


@@ -859,6 +857,4 @@ include::soc_neoled.adoc[]

include::soc_gptmr.adoc[]

include::soc_hwspinlock.adoc[]

include::soc_sysinfo.adoc[]
50 changes: 0 additions & 50 deletions docs/datasheet/soc_hwspinlock.adoc

This file was deleted.
