Building a system on a programmable chip (SOPC) by embedding processor soft cores such as MicroBlaze or PicoBlaze in an FPGA design offers better modifiability and maintainability than an ASIC, and the approach is widely used. Large processor cores such as ARM and MicroBlaze have debug interfaces. Together with the corresponding debug modules, these let debug software work through the JTAG interface to provide the basic online debugging functions: running to a breakpoint, suspending, single-stepping, viewing the internal state of the processor, and viewing and modifying data in the memory space. These functions are essential for debugging embedded systems.
Compact processors such as PicoBlaze occupy few resources and are easy to design with, but they generally have no debug interface. They are nevertheless used often in SOPC designs, where the standard, efficient debug mechanisms of large processors are then unavailable. Hardware/software co-simulation is an important way to ensure that a design is correct. However, in designs such as receiver baseband signal processing, the test cases used in simulation often have insufficient coverage, or it is difficult to construct a test case that reproduces a fault once it has been found. There is therefore an urgent need to extend online (in-circuit) debugging to processors in general.
To meet this need, the debugging method proposed here introduces a Universal Debug Module (UDM) that lets processors without a debug interface establish a standard debugging mechanism. The UDM uses the processor's interrupt mechanism to make the processor respond to breakpoints, uses an address mapping mechanism based on dual-port RAM to allow breakpoints on many lines of code at the same time, and makes it easy to exchange debug information and commands between the system being debugged and the debug host. The UDM is also easy to extend, and a single UDM can be shared when the SOPC system contains multiple processors.
2. The general principle of online debugging
There are two main approaches to online debugging of embedded processors: background debug mode (BDM) and JTAG debugging based on the IEEE 1149.1 standard. BDM is widely used in Motorola microcontrollers. Processors such as ARM, MIPS and PowerPC have on-chip debugging functions based on JTAG. For example, ARM proposed the RDI debugging interface standard based on JTAG technology, which is mainly used for debugging ARM chips. By adding debug support logic to the processor core, simple control signals applied to the reserved debug interface can carry out the basic primitive debugging operations: halting the processor, outputting the PC and general-purpose register values, and outputting and modifying data in the memory space. The design of such a debug interface usually depends on the instruction set architecture. For example, MIPS32 provides the following debugging facilities: ① the breakpoint instruction BREAK; ② a set of trap instructions (TRAP); ③ a special control register, WATCH, which can be programmed so that specific load/store and instruction fetch operations raise a special exception; ④ a TLB-based MMU, which can be programmed so that an access to any memory page raises a specific exception.
For processors without a debug interface, online debugging is usually achieved by anticipating the likely debugging requirements in the software and hardware design and adding a communication mechanism between the debug host and the system being debugged. In this mode, debug code is inserted into the normal program; it outputs debug information to the debug host and can also receive commands from the host and respond to them. The main drawback is that the debug code in the normal program has to be modified repeatedly as debugging requirements change, so the approach is neither standardized nor general. The UDM presented here establishes a convenient debugging mechanism for such processors without modifying the processor core, which distinguishes it from the way mainstream large processors implement online debugging.
3. How the Universal Debug Module (UDM) works
3.1 System Description
Figure 1 shows the block diagram of a debugging system that uses the UDM. An ARM, DSP or other processor on the same PCB as the FPGA serves as an embedded processor for auxiliary debugging (hereinafter the auxiliary processor), which simplifies communication between the UDM and the debug host. Through the bus interface of the auxiliary processor, the control and data registers in the UDM are mapped directly into the auxiliary processor's memory space. The UDM can then be controlled by reading and writing data directly in the Memory window of the auxiliary processor's development tool, as shown in Figures 4 and 5. Because placing an FPGA and an embedded processor chip on the same PCB is a common design, this communication method is widely applicable.
The UDM acts directly as a peripheral of the auxiliary processor, which sits outside the FPGA. If several UDMs are attached to the bus of this external processor, several processors can be debugged at the same time.
The UDM generates a debug interrupt (DI) signal, which makes the processor respond to the interrupt and call the debug service routine (DR). The UDM generates DI by monitoring the processor's instruction fetch address (instruction address, IA). While running the DR, PicoBlaze can access the UDM through its bus interface to output debug information and respond to debug commands.
3.2 Breakpoint Setting Mechanism
When DI is asserted, the processor immediately executes the DR, which interrupts the normal execution flow and switches to the debugging service. Deciding when to generate DI is therefore the core of the breakpoint mechanism. DI is generated by monitoring the processor's instruction fetch address (IA). Comparing IA against a single value with a comparator allows only one breakpoint at a time. To remove this limitation, the breakpoint configuration is stored in a dual-port RAM inside the UDM, so that each bit in the RAM corresponds to one address in the program memory: 1 means a breakpoint is set at that address and 0 means none is.
The incoming IA is converted into an address into this RAM area, so that one port of the RAM outputs the breakpoint bit for exactly the address being fetched, and DI is generated from that bit. Breakpoint settings can be modified freely through the other port of the dual-port RAM. The number of breakpoints that can be set is thus limited mainly by the capacity of the dual-port RAM in the UDM.
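The one-bit-per-address idea can be illustrated with a small software model. This is only a sketch of the principle, not the UDM hardware: the class name BreakpointBitmap, its methods, and the program size of 1024 are assumptions chosen for illustration.

```python
class BreakpointBitmap:
    """One bit per program address: 1 = breakpoint set, 0 = none."""

    def __init__(self, program_size):
        self.program_size = program_size
        self.bits = bytearray((program_size + 7) // 8)

    def set(self, address, enabled=True):
        """Debug-host side: set or clear the breakpoint bit for one address."""
        if not 0 <= address < self.program_size:
            raise ValueError("address outside program memory")
        byte_index, bit_index = divmod(address, 8)
        if enabled:
            self.bits[byte_index] |= 1 << bit_index
        else:
            self.bits[byte_index] &= ~(1 << bit_index) & 0xFF

    def hit(self, instruction_address):
        """Processor side: look up the bit for the address being fetched."""
        byte_index, bit_index = divmod(instruction_address, 8)
        return bool((self.bits[byte_index] >> bit_index) & 1)


# Example size only (an assumption for this sketch).
bitmap = BreakpointBitmap(program_size=1024)
for line in (5, 300, 1023):
    bitmap.set(line)

# Walk every address once, as a straight-line fetch sequence would.
hits = [ia for ia in range(1024) if bitmap.hit(ia)]
print(hits)  # [5, 300, 1023]
```

Unlike a single comparator, the lookup cost is the same whether one breakpoint or all of them are set; the only limit is the size of the bitmap.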
3.3 Debug Service Routine
As long as the DR guarantees that the processor does not change the internal or external environment of the target program, running the DR is equivalent to suspending the processor. The execution environments of the DR and the target program must therefore be isolated, which can be achieved through compiler settings or by enforcing coding conventions. Once the processor is suspended, the DR communicates with the external debug host and responds to its debug commands by polling the command register. These commands include moving debug information into a buffer that the external debug host can observe, modifying data in the memory space, and exiting the DR so that the target program continues. Because the DR must use resources isolated from those of the target program, and because code capacity and external memory space are limited in small processors, the DR should use as few ports, general-purpose registers and lines of code as possible.
4. Design example
Xilinx's PicoBlaze is a commonly used compact processor. It consists of an ALU, a program counter stack (for nested subroutine calls), sixteen 8-bit general-purpose registers, a 64-byte scratchpad RAM, a program counter and its controller, and interrupt support circuitry; its code capacity is 1024 instructions. This section takes PicoBlaze as an example, designs a concrete UDM, and verifies it on a Spartan-3 XC3S5000 FPGA. The UDM uses one 18 Kbit block RAM and 62 Spartan-3 logic slices in hardware, and 61 lines of assembly code in software. It provides the following functions:
・A breakpoint can be set on every line of code at the same time. Even if no breakpoint is set, DI can be forced so that the DR runs and outputs debug information;
・The observable debug information comprises the value of the program counter (PC), registers s0 to sb, the 64-byte scratchpad, and the data in the memory space. This information can be refreshed while the DR is running.
4.1 Hardware implementation
Figure 2 shows the hardware structure of the UDM for the PicoBlaze processor. The UDM has a bus interface to both the debug side (the auxiliary processor) and PicoBlaze, so its internal registers fall into three categories: controlled only by PicoBlaze, controlled only by the auxiliary processor, and controlled by both.
PicoBlaze and the auxiliary processor write data through the A and B ports of the dual-port RAM respectively. To reduce the number of PicoBlaze I/O ports used, PicoBlaze first writes the target address to the RAM addressing register and then writes the data to the data output register, which stores it at the address set by the previous operation.
Port B of the dual-port RAM is connected to the bus of the auxiliary processor. Its data width is 16 bits and its address range is 0 to 255. Addresses 0 to 165 serve as the buffer for exchanging debug data, and addresses 192 to 255 store the breakpoint settings. Each 16-bit word holds the breakpoint settings for 16 lines of code, so the 1024-line code capacity of PicoBlaze requires only 64 words. For example, the value 0x4080 at address 193 means that breakpoints are set on the 24th and 31st lines. Port A has a data width of 8 bits; it receives debug information while the DR is running and outputs breakpoint settings while the target program is running. Port A therefore has an address selection circuit, so that its address comes from either the RAM addressing register or IA, depending on the situation. When the target program is running, the port A address is the upper 7 bits of IA plus the offset 0x180, and the lower 3 bits of IA then select 1 bit from the 8-bit output, so the resulting bit indicates exactly whether a breakpoint is set on the code line corresponding to IA. The interrupt signal generation circuit produces the DI signal sent to the processor from this bit and the timing requirements of the interrupt signal.
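The address mapping between the two ports can be checked with a short model. This is a sketch under one stated assumption: the low byte of each 16-bit port B word appears at the even port A byte address. The text does not state the byte order, but this choice reproduces the 0x4080 example. The class and function names are hypothetical.

```python
BREAKPOINT_BYTE_BASE = 0x180  # port A offset of the breakpoint area (word 192 on port B)


class DualPortRam:
    """256 x 16-bit words on port B, seen as 512 x 8-bit bytes on port A."""

    def __init__(self):
        self.words = [0] * 256

    def write_port_b(self, word_address, value):
        """Auxiliary processor side: write one 16-bit word."""
        self.words[word_address] = value & 0xFFFF

    def read_port_a(self, byte_address):
        """PicoBlaze side: read one byte of the same storage."""
        word = self.words[byte_address >> 1]
        # Assumption: even byte address = low byte, odd = high byte.
        return (word >> 8) & 0xFF if byte_address & 1 else word & 0xFF


def breakpoint_bit(ram, ia):
    """Port A lookup performed while the target program runs."""
    byte_address = BREAKPOINT_BYTE_BASE + (ia >> 3)  # upper 7 bits of the 10-bit IA
    return (ram.read_port_a(byte_address) >> (ia & 0x7)) & 1  # lower 3 bits pick the bit


ram = DualPortRam()
ram.write_port_b(193, 0x4080)
hits = [ia for ia in range(1024) if breakpoint_bit(ram, ia)]
print(hits)  # [23, 30], i.e. the 24th and 31st lines when counting from 1
```

Under this assumption the highest instruction address, 1023, maps to byte 0x1ff, the last byte of word 255, so the 64 breakpoint words exactly cover the 1024-line code space.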
The debug command register is controlled jointly by PicoBlaze and the auxiliary processor. The auxiliary processor writes different values to it to issue different debug commands, and the DR responds to them by polling the register. Before responding to a command, PicoBlaze clears the debug command register to 0 as a handshake with the auxiliary processor. Writing 3 to the debug command register generates DI immediately, regardless of whether a breakpoint is set.
4.2 Software implementation
In PicoBlaze-based applications the DR is kept simple to reduce code size. After initialization, it outputs registers s0 to sb, the 64-byte internal RAM and the data in the memory space to the dual-port RAM in turn, and then enters a loop that waits for and processes debug commands. The target program and the DR are isolated by allowing the target program to modify only registers s0 to sb and the 64-byte internal RAM, and allowing the DR to modify only registers se and sf. The DR ends, and PicoBlaze returns to the target program, only when the debug command is "exit debug". When the debug command is "refresh debug information", PicoBlaze repeats the initialization and the debug information output.
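The control flow of the DR can be sketched in software as follows. This is a behavioral model only, not the 61-line assembly routine: the UdmModel class, the function names, and the 256-byte memory size are assumptions, and the command codes 1 (exit) and 2 (refresh) are those listed for the debug command register in section 4.3.

```python
CMD_EXIT = 1     # leave the DR and resume the target program
CMD_REFRESH = 2  # output the debug information again


class UdmModel:
    """Hypothetical stand-in for the UDM registers seen by the DR."""

    def __init__(self):
        self.command = 0  # debug command register
        self.buffer = []  # debug data buffer in the dual-port RAM


def dump_state(udm, registers, scratchpad, memory):
    # Order from the text: s0..sb, the 64-byte internal RAM, then the memory space.
    udm.buffer = list(registers[:12]) + list(scratchpad) + list(memory)


def debug_routine(udm, registers, scratchpad, memory, host_writes):
    """Run the DR until an exit command arrives; return the commands handled."""
    handled = []
    dump_state(udm, registers, scratchpad, memory)
    for value in host_writes:    # each item models one polling iteration
        if value:
            udm.command = value  # the auxiliary processor writes a command
        command = udm.command
        if command == 0:
            continue             # nothing pending, keep polling
        udm.command = 0          # handshake: clear before responding
        handled.append(command)
        if command == CMD_EXIT:
            break
        if command == CMD_REFRESH:
            dump_state(udm, registers, scratchpad, memory)
    return handled


udm = UdmModel()
registers = list(range(16))
scratchpad = [0] * 64
memory = [0] * 256  # assumed size of the memory space for this sketch
handled = debug_routine(udm, registers, scratchpad, memory,
                        [0, CMD_REFRESH, 0, CMD_EXIT, CMD_REFRESH])
print(handled)          # [2, 1]
print(len(udm.buffer))  # 332
```

The final refresh in the list is never handled because the routine has already returned on the exit command, which mirrors the rule that only "exit debug" ends the DR.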
4.3 Actual verification and usage
Before the UDM was put to use, it was simulated with NC-Verilog; part of the simulation waveform is shown in Figure 3. The figure shows PicoBlaze switching to the DR when the DI signal pdm_int is asserted. For reasons of space, the simulation waveforms that verify the other functions are omitted.
To test the function and performance of the UDM further, the following simple PicoBlaze system was built in the FPGA, with only a 252 × 8 bit RAM and the UDM attached to PicoBlaze. The target program on PicoBlaze is an infinite loop: load 0 to 11 into s0 to sb in turn, then 11 to 0; write 0 to 63 to the 64-byte RAM in turn, then 63 to 0; write 0 to 251 to the external RAM in turn, then 255 to 4. With such a simple design, the line of code at which the processor responded to the breakpoint can be read directly from the debug information output.
Figure 4 shows the debug control interface on the debug host. The value 0x1040 at offset address 0x184 indicates that breakpoints are set on code lines 38 and 44; breakpoints can be set anywhere in the address range 0x180 to 0x1ff. Address 0x200 holds the current PC value, which is updated by writing 1 to address 0x208. Address 0x202 is the UDM enable bit; the UDM is enabled only when it is 1. Address 0x204 is the debug command register: writing 1 makes the processor exit from the breakpoint, writing 2 makes it refresh the debug information, and writing 3 forces it to enter the DR and output debug information. Address 0x206 indicates the debug state; the value 3 means that the processor is running the DR and the debug information has been output.
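For a debug host script, the register map above could be captured as constants together with a helper that computes where a breakpoint bit lives. This is a sketch: the constant and function names are hypothetical, and code lines are assumed to be numbered from 0 and packed 16 per word, which is what the 0x184 example implies.

```python
BREAKPOINT_BASE = 0x180  # breakpoint bitmap occupies 0x180..0x1ff
REG_PC = 0x200           # current PC value
REG_ENABLE = 0x202       # 1 enables the UDM
REG_COMMAND = 0x204      # debug command register
REG_STATUS = 0x206       # debug state
REG_PC_UPDATE = 0x208    # write 1 to update the PC value

CMD_EXIT, CMD_REFRESH, CMD_FORCE = 1, 2, 3
STATUS_IN_DR = 3         # DR running and debug information output


def breakpoint_location(line):
    """Return (byte offset of the 16-bit word, bit mask) for one code line."""
    if not 0 <= line < 1024:
        raise ValueError("line outside the 1024-line code space")
    return BREAKPOINT_BASE + 2 * (line // 16), 1 << (line % 16)


def breakpoint_words(lines):
    """Combine several breakpoints into the words to write, keyed by offset."""
    words = {}
    for line in lines:
        offset, mask = breakpoint_location(line)
        words[offset] = words.get(offset, 0) | mask
    return words


words = breakpoint_words([38, 44])
print({hex(offset): hex(value) for offset, value in words.items()})
# {'0x184': '0x1040'}
```

The result matches the value shown at offset 0x184 in Figure 4. How the words are actually written depends on the auxiliary processor's development tool, so that step is left out.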
Figure 5 shows the interface that displays the debug information. Addresses 0x00 to 0x0b show registers s0 to sb, addresses 0x0c to 0x4b show the internal 64-byte memory, and addresses 0x4c to 0x14b show PicoBlaze's external memory space. Because the breakpoint in Figure 5 is set just after 0 to 251 have been written in sequence to PicoBlaze's memory space, the displayed data is increasing. When the breakpoint is set just after 255 to 4 have been written, the displayed data is decreasing. The debug information displayed at many other breakpoints also matches what is expected for the breakpoint location, which shows that the UDM functions fully and efficiently.
In the FPGA prototype of a signal processing chip under development, tracking processing, message processing and overall flow control are each handled by a PicoBlaze, and limited FPGA resources make it almost impossible to replace these with large processors. Because the data and control signals fed to the PicoBlazes are complex, simulation cannot cover the wide variety of cases met in practice. With the UDM designed for the PicoBlaze processor, online debugging of these three PicoBlazes was achieved easily, which contributed significantly to development efficiency. The UDM has also been adopted in other engineering applications that use PicoBlaze.
A universal debug module has been designed to help processors without a debug interface establish a standard debugging mechanism. Built on this module, the proposed debugging method is general, standard and convenient, and meets the urgent need for online debugging of multiple compact processors without debug interfaces in an SOPC system. The method suspends the processor by generating a debug interrupt that makes it jump to the debug service routine, and supports breakpoints on many lines of code at once through an address mapping mechanism based on dual-port RAM. Debug commands such as data movement are implemented by the debug service routine. The method is also easy to extend and can debug several embedded processors at the same time. In engineering practice it has improved debugging efficiency and has clear application value.