| | 1 | [wiki:Component SocLib Components General Index] |
| | 2 | |
| | 3 | = !VciMasterNic = |
| | 4 | |
| | 5 | == 1) Functional Description == |
| | 6 | |
| | 7 | The !VciMasterNic component is a GMII-compliant network controller |
| | 8 | for Gigabit Ethernet networks, with a built-in DMA capability. |
| | 9 | |
| | 10 | It can sustain a throughput of 1 Gigabit/s, as long as the system clock frequency |
| | 11 | is greater than or equal to the GMII clock frequency (i.e. 125 MHz). |
| | 12 | |
| | 13 | To improve the throughput, this component supports up to 4 channels, |
| | 14 | indexed by the source IP address for the received (RX) packets, |
| | 15 | and indexed by the destination IP address for the sent (TX) packets. |
| | 16 | The actual number of channels is a hardware parameter that cannot be larger than 4. |
| | 17 | |
| | 18 | Regarding the GMII physical interface, this simulation model supports three modes |
| | 19 | of operation, defined by a constructor parameter: |
| | 20 |  * '''NIC_MODE_FILE''': The RX packet stream is read from the file "nic_rx_file.txt" and the TX packet stream is written to the file "nic_tx_file.txt". Both files are stored in the same directory as the top.cpp file. |
| | 21 |  * '''NIC_MODE_SYNTHESIS''': The TX packet stream is still written to the "nic_tx_file.txt" file, but the RX packet stream is synthesised. The packet length (between 64 and 1538 bytes) and the source MAC address (8 possible values) are pseudo-random numbers. |
| | 22 |  * '''NIC_MODE_TAP''': The TX and RX packet streams are sent to and received from the physical network controller of the workstation running the simulation. |
| | 23 | |
| | 24 | The packet length can have any value, from 64 to 1542 bytes. |
| | 25 | |
| | 26 | The minimal data transfer unit between software and the NIC is a 4 Kbytes '''container''', |
| | 27 | containing an integer number of variable-size packets. |
| | 28 | A container holds at most 66 packets. |
| | 29 | |
| | 30 | The received packets (RX) and the sent packets (TX) are stored in |
| | 31 | two memory mapped software FIFOs, implemented as chained buffers. |
| | 32 | Each slot in these FIFOs is a 4 Kbytes container. The number of containers, |
| | 33 | defining the queue depth, is a software defined parameter. |
| | 34 | |
| | 35 | The container format is defined below: |
| | 36 | |
| | 37 | The first 34 words define the fixed-format container header : |
| | 38 | || word0 || NB_WORDS || NB_PACKETS || |
| | 39 | || word1 || PLEN[0] || PLEN[1] || |
| | 40 | || ... || ... || ... || |
| | 41 | || word33 || PLEN[64] || PLEN[65] || |
| | 42 | |
| | 43 | * NB_PACKETS is the actual number of packets in the container. |
| | 44 | * NB_WORDS is the number of useful words in the container. |
| | 45 | * PLEN[i] is the number of bytes for packet[i]. |
| | 46 | |
| | 47 | The packets are stored in the (1024 - 34) = 990 following words, and each packet is word-aligned. |
| | 48 | A container holds at most 66 packets, one per PLEN entry in the header. |
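The header layout above can be walked as in the following minimal sketch. It is not taken from the SoCLib sources. The table does not say which half of a 32-bit word holds which field, so the sketch assumes that the left column (NB_WORDS, even-indexed PLEN) sits in the upper 16 bits and the right column (NB_PACKETS, odd-indexed PLEN) in the lower 16 bits. The names `PacketSlot` and `list_packets` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sizes taken from the text: 4 Kbytes container, 34 header words, 66 packets max.
constexpr std::size_t CONTAINER_WORDS = 1024;
constexpr std::size_t HEADER_WORDS    = 34;
constexpr std::size_t MAX_PACKETS     = 66;

// Hypothetical helper type: location of one packet inside a container.
struct PacketSlot {
    std::size_t word_offset;  // index of the first word of the packet
    std::size_t length;       // packet length in bytes (PLEN[i])
};

// ASSUMPTION: left table column = upper 16 bits, right column = lower 16 bits.
inline std::uint32_t hi16(std::uint32_t w) { return w >> 16; }
inline std::uint32_t lo16(std::uint32_t w) { return w & 0xFFFFu; }

// Returns one slot per packet, or an empty vector if the header is inconsistent.
std::vector<PacketSlot> list_packets(const std::uint32_t* container) {
    std::vector<PacketSlot> slots;
    const std::size_t nb_words   = hi16(container[0]);
    const std::size_t nb_packets = lo16(container[0]);
    if (nb_packets > MAX_PACKETS || nb_words > CONTAINER_WORDS) return slots;

    std::size_t offset = HEADER_WORDS;  // packets start right after the header
    for (std::size_t i = 0; i < nb_packets; ++i) {
        const std::uint32_t w   = container[1 + i / 2];  // two PLEN values per word
        const std::size_t plen  = (i % 2 == 0) ? hi16(w) : lo16(w);
        const std::size_t words = (plen + 3) / 4;        // round up: word alignment
        if (offset + words > CONTAINER_WORDS) {          // packet would overflow
            slots.clear();
            return slots;
        }
        slots.push_back({offset, plen});
        offset += words;
    }
    return slots;
}
```

With this layout, a 64-byte packet followed by a 66-byte packet would occupy words 34 to 49 and words 50 to 66 respectively, since the second packet is padded up to 17 words.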
| | 49 | |
| | 50 | For the DMA engines, a container has only two states (full or empty), defined |
| | 51 | by a single bit, called the container "status". |
| | 52 | To access both the container status and the data stored in the container, the DMA |
| | 53 | engines use two physical addresses, which are packed in a 64-bit ''container descriptor'': |
| | 54 | * desc[25:0] contain bits[31:6] of the "full" status physical address. |
| | 55 | * desc[51:26] contain bits[31:6] of the "buffer" physical address. |
| | 56 | * desc[63:52] contain the common 12 physical address extension bits. |
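The bit fields above can be packed and unpacked as in the following minimal sketch. It assumes that both addresses are 64-byte aligned (bits[5:0] are not stored), and that the 12 extension bits are physical address bits [43:32], shared by the status and the buffer addresses. That interpretation of the extension bits, and the function names, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t MASK_26 = 0x3FFFFFFull;  // 26 bits: address bits[31:6]
constexpr std::uint64_t MASK_12 = 0xFFFull;      // 12 bits: address extension

// Build a descriptor from the two physical addresses.
// ASSUMPTION: both addresses share the same extension bits [43:32].
std::uint64_t make_desc(std::uint64_t sts_paddr, std::uint64_t buf_paddr) {
    const std::uint64_t sts = (sts_paddr >> 6) & MASK_26;   // desc[25:0]
    const std::uint64_t buf = (buf_paddr >> 6) & MASK_26;   // desc[51:26]
    const std::uint64_t ext = (buf_paddr >> 32) & MASK_12;  // desc[63:52]
    return sts | (buf << 26) | (ext << 52);
}

// Recover the physical address of the "full" status.
std::uint64_t desc_status_paddr(std::uint64_t desc) {
    return (((desc >> 52) & MASK_12) << 32) | ((desc & MASK_26) << 6);
}

// Recover the physical address of the container buffer.
std::uint64_t desc_buffer_paddr(std::uint64_t desc) {
    return (((desc >> 52) & MASK_12) << 32) | (((desc >> 26) & MASK_26) << 6);
}
```

Dropping the six low-order bits is what lets two addresses and the extension fit in 64 bits (26 + 26 + 12), at the price of the alignment constraint.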
| | 57 | |
| | 58 | Inside the NIC controller, each channel implements a 2-slot chained buffer (two |
| | 59 | containers) for RX, and another 2-slot chained buffer (two containers) for TX. |
| | 60 | For each channel, the built-in RX_DMA engine moves the RX containers from |
| | 61 | the internal 2-slot chained buffer to the external chained buffer implementing |
| | 62 | the RX queue in memory. |
| | 63 | Another built-in TX_DMA engine moves the TX containers from the external |
| | 64 | chained buffer implementing the TX queue in memory to the internal 2-slot TX |
| | 65 | chained buffer. |
| | 66 | |
| | 67 | To improve the throughput for one specific channel, the DMA engines use ''pipelined bursts'': the burst length cannot be larger than 64 bytes, but each channel sends 4 pipelined VCI transactions to mask the round-trip latency. Therefore, this NIC controller can control up to 32 parallel VCI transactions (4 channels * 4 bursts * 2 directions). |
| | 68 | The CMD/RSP matching uses both the VCI TRDID and PKTID fields: |
| | 69 | * the channel index is sent in TRDID[3:2] |
| | 70 | * the burst index is sent in TRDID[1:0] |
| | 71 | * the is_rx bit is sent in SRCID |
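The TRDID encoding listed above can be sketched as follows. Only the channel and burst fields are covered, because the text does not say where the is_rx bit sits inside SRCID. The names `BurstTag`, `encode_trdid` and `decode_trdid` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: identifies one in-flight burst of one channel.
struct BurstTag {
    unsigned channel;  // 0..3, carried in TRDID[3:2]
    unsigned burst;    // 0..3, carried in TRDID[1:0]
};

// Build the 4-bit TRDID value sent with a VCI command.
std::uint8_t encode_trdid(unsigned channel, unsigned burst) {
    return static_cast<std::uint8_t>(((channel & 0x3u) << 2) | (burst & 0x3u));
}

// Recover the channel and burst indexes from the TRDID of a VCI response.
BurstTag decode_trdid(std::uint8_t trdid) {
    return BurstTag{ (trdid >> 2) & 0x3u, trdid & 0x3u };
}

// Number of transactions that can be in flight: channels * bursts * directions.
constexpr unsigned MAX_PARALLEL_TRANSACTIONS = 4 * 4 * 2;
```

Two bits per field give 16 distinct TRDID values per direction, which is consistent with the 32 parallel transactions stated above once the is_rx bit is taken into account.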
| | 72 | |
| | 73 | == 2) Addressable registers and buffers == |
| | 74 | |
| | 75 | In a virtualized environment each channel segment will be mapped in the address space of a different virtual machine. |
| | 76 | To simplify the address decoding, each channel takes a segment of 32 Kbytes in the address space, but only 20 Kbytes are used. |
| | 77 | |
| | 78 | * The first 4 Kbytes contain the RX_0 container data |
| | 79 | * The next 4 Kbytes contain the RX_1 container data |
| | 80 | * The next 4 Kbytes contain the TX_0 container data |
| | 81 | * The next 4 Kbytes contain the TX_1 container data |
| | 82 | * The next 4 Kbytes contain the channel addressable registers: |
| | 83 | |
| | 84 | || NIC_RX_STS_0 || RX_0 status (full or empty) || read/write || |
| | 85 | || NIC_RX_STS_1 || RX_1 status (full or empty) || read/write || |
| | 86 | || NIC_TX_STS_0 || TX_0 status (full or empty) || read/write || |
| | 87 | || NIC_TX_STS_1 || TX_1 status (full or empty) || read/write || |
| | 88 | || NIC_RX_DESC_LO_0 || RX_0 descriptor low word || read/write || |
| | 89 | || NIC_RX_DESC_HI_0 || RX_0 descriptor high word || read/write || |
| | 90 | || NIC_RX_DESC_LO_1 || RX_1 descriptor low word || read/write || |
| | 91 | || NIC_RX_DESC_HI_1 || RX_1 descriptor high word || read/write || |
| | 92 | || NIC_TX_DESC_LO_0 || TX_0 descriptor low word || read/write || |
| | 93 | || NIC_TX_DESC_HI_0 || TX_0 descriptor high word || read/write || |
| | 94 | || NIC_TX_DESC_LO_1 || TX_1 descriptor low word || read/write || |
| | 95 | || NIC_TX_DESC_HI_1 || TX_1 descriptor high word || read/write || |
| | 96 | || NIC_MAC_4 || MAC address 32 LSB bits || read_only || |
| | 97 | || NIC_MAC_2 || MAC address 16 MSB bits || read_only || |
| | 98 | || NIC_RX_RUN || RX channel activated || write_only || |
| | 99 | || NIC_TX_RUN || TX channel activated || write_only || |
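The per-channel layout above can be turned into offset arithmetic, as in this minimal sketch. It assumes that the channel segments are contiguous and placed in increasing channel order from the base address of the component, which the text suggests but does not state. The offsets of the individual registers inside the last 4 Kbytes page are defined in the header file and are not reproduced here. All names in the sketch are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t CHANNEL_SPAN = 0x8000;  // 32 Kbytes reserved per channel
constexpr std::uint32_t AREA_SPAN    = 0x1000;  // 4 Kbytes per container or register page

// The five 4 Kbytes areas of a channel segment, in the order given in the text.
enum ChannelArea { AREA_RX_0 = 0, AREA_RX_1, AREA_TX_0, AREA_TX_1, AREA_REGS };

// Byte offset of an area, relative to the base address of the component.
// ASSUMPTION: channel k starts at k * 32 Kbytes.
constexpr std::uint32_t area_offset(unsigned channel, ChannelArea area) {
    return channel * CHANNEL_SPAN + static_cast<std::uint32_t>(area) * AREA_SPAN;
}

// Bytes actually used in each channel segment (5 areas of 4 Kbytes).
constexpr std::uint32_t CHANNEL_USED_BYTES = 5 * AREA_SPAN;
```

For instance, under these assumptions the register page of channel 0 starts at offset 0x4000, and the TX_0 container of channel 1 at offset 0xA000.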
| | 100 | |
| | 101 | |
| | 102 | Above the channel segments is the hypervisor segment, which takes 4 Kbytes |
| | 103 | and contains the global configuration registers (all read/write). |
| | 104 | In a virtualized environment, the corresponding page should not be mapped |
| | 105 | in the address spaces of the virtual machines, as they must not access it. |
| | 106 | || Register name || function || Reset value || |
| | 107 | || NIC_G_VIS || bitfield / bit N = 0 -> channel N is disabled || all inactive || |
| | 108 | || NIC_G_ON || NIC active if non zero (inactive at reset) || inactive || |
| | 109 | || NIC_G_BC_ENABLE || boolean / broadcast enabled if true || disabled || |
| | 110 | || NIC_G_TDM_ENABLE || boolean / enable TDM for TX if true || disabled || |
| | 111 | || NIC_G_TDM_PERIOD || value of TDM time slot || || |
| | 112 | || NIC_G_BYPASS_ENABLE || boolean / enable bypass for TX if true || enabled || |
| | 113 | || NIC_G_MAC_4[8] || default MAC address 32 LSB bits for channel[i] || || |
| | 114 | || NIC_G_MAC_2[8] || default MAC address 16 MSB bits for channel[i] || || |
| | 115 | |
| | 116 | The hypervisor segment also contains various event counters for statistics (all read/write): |
| | 117 | |
| | 118 | || NIC_G_NPKT_RX_G2S_RECEIVED || number of packets received on GMII RX port || |
| | 119 | || NIC_G_NPKT_RX_G2S_DISCARDED || number of RX packets discarded by RX_G2S FSM || |
| | 120 | || || || |
| | 121 | || NIC_G_NPKT_RX_DES_SUCCESS || number of RX packets transmitted by RX_DES FSM || |
| | 122 | || NIC_G_NPKT_RX_DES_TOO_SMALL || number of discarded too small RX packets || |
| | 123 | || NIC_G_NPKT_RX_DES_TOO_BIG || number of discarded too big RX packets || |
| | 124 | || NIC_G_NPKT_RX_DES_MFIFO_FULL || number of discarded RX packets for fifo full || |
| | 125 | || NIC_G_NPKT_RX_DES_CRC_FAIL || number of discarded RX packets for checksum || |
| | 126 | || || || |
| | 127 | || NIC_G_NPKT_RX_DISPATCH_RECEIVED || number of packets received by RX_DISPATCH FSM || |
| | 128 | || NIC_G_NPKT_RX_DISPATCH_BROADCAST || number of broadcast RX packets received || |
| | 129 | || NIC_G_NPKT_RX_DISPATCH_DST_FAIL || number of discarded RX packets for DST MAC || |
| | 130 | || NIC_G_NPKT_RX_DISPATCH_CH_FULL || number of discarded RX packets for channel full || |
| | 131 | || || || |
| | 132 | || NIC_G_NPKT_TX_DISPATCH_RECEIVED || number of packets received by TX_DISPATCH FSM || |
| | 133 | || NIC_G_NPKT_TX_DISPATCH_TOO_SMALL || number of discarded too small TX packets || |
| | 134 | || NIC_G_NPKT_TX_DISPATCH_TOO_BIG || number of discarded too big TX packets || |
| | 135 | || NIC_G_NPKT_TX_DISPATCH_SRC_FAIL || number of discarded TX packets for SRC MAC || |
| | 136 | || NIC_G_NPKT_TX_DISPATCH_BROADCAST || number of broadcast TX packets received || |
| | 137 | || NIC_G_NPKT_TX_DISPATCH_BYPASS || number of bypassed TX->RX packets || |
| | 138 | || NIC_G_NPKT_TX_DISPATCH_TRANSMIT || number of transmit TX packets || |
| | 139 | |
| | 140 | To remain compatible with future extensions, access all these registers through the globally defined offsets in the file |
| | 141 | source:trunk/soclib/soclib/module/connectivity_component/vci_multi_nic/include/soclib/multi_nic.h |
| | 142 | |
| | 143 | This hardware component checks for segmentation violation, and can be used as a default target. |
| | 144 | |
| | 145 | == 3) Component definition & usage == |
| | 146 | |
| | 147 | source:trunk/soclib/soclib/module/connectivity_component/vci_multi_nic/caba/metadata/vci_multi_nic.sd |
| | 148 | |
| | 149 | {{{ |
| | 150 | Uses( 'vci_multi_nic' ) |
| | 151 | }}} |
| | 152 | |
| | 153 | == 4) CABA Implementation == |
| | 154 | |
| | 155 | === CABA sources === |
| | 156 | |
| | 157 | * interface : source:trunk/soclib/soclib/module/connectivity_component/vci_multi_nic/caba/source/include/vci_multi_nic.h |
| | 158 | * implementation : source:trunk/soclib/soclib/module/connectivity_component/vci_multi_nic/caba/source/src/vci_multi_nic.cpp |
| | 159 | |
| | 160 | === CABA Constructor parameters === |
| | 161 | {{{ |
| | 162 | VciMultiNic( |
| | 163 | sc_module_name name, // Component Name |
| | 164 | const soclib::common::IntTab &tgtid, // Target index |
| | 165 | const soclib::common::MappingTable &mt, // MappingTable |
| | 166 | const size_t channels, // Number of channels |
| | 167 | const uint32_t mac4, // MAC address 32 LSB bits |
| | 168 | const uint32_t mac2, // MAC address 16 MSB bits |
| | 169 | const int mode, // GMII physical interface modeling |
| | 170 | const uint32_t inter_frame_gap); // delay between two packets |
| | 171 | |
| | 172 | }}} |
| | 173 | |
| | 174 | === CABA Ports === |
| | 175 | |
| | 176 | * '''p_resetn''' : Global system reset |
| | 177 | * '''p_clk''' : Global system clock |
| | 178 | * '''p_vci''' : The VCI target port |
| | 179 | * '''p_rx_irq[k]''' : As many RX IRQ ports as the number of channels |
| | 180 | * '''p_tx_irq[k]''' : As many TX IRQ ports as the number of channels |
| | 181 | |
| | 182 | == 5) TLM-DT implementation == |
| | 183 | |
| | 184 | The TLM-DT implementation is not available yet. |