

## TN-001 Technical Note

## The Benefits of Local Memory for Your Analog I/O Application

**Background** - Most of today's Analog I/O boards are based on the Peripheral Component Interconnect (PCI) bus architecture. PCI is also the underlying architecture of the Compact PCI (cPCI) and PXI buses. In most systems, PCI is implemented as a 32-bit-wide bus that runs at a 33 MHz clock rate. The PCI bus standard also defines 64-bit width and 66 MHz clock rate extensions, but these are typically implemented only in high-end server machines. In the common implementation, the PCI bus is capable of transferring, at most:

(32 bits) \* (1 byte / 8 bits) \* (33 MHz) = 132 MB/s
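The same arithmetic can be sketched in a few lines of Python, applied here to both the common bus and the 64-bit, 66 MHz extension. The function name is illustrative only and is not part of any vendor software; the results are theoretical peaks, before any per-transfer overhead.

```python
def pci_peak_mb_per_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak in MB/s: one bus-width word moves per clock."""
    return bus_width_bits / 8 * clock_mhz


common = pci_peak_mb_per_s(32, 33)    # common 32-bit, 33 MHz bus
extended = pci_peak_mb_per_s(64, 66)  # 64-bit, 66 MHz extension

print(f"32-bit @ 33 MHz: {common:.0f} MB/s")
print(f"64-bit @ 66 MHz: {extended:.0f} MB/s")
```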

In reality, each transfer requires some overhead, typically four clocks, to get started, thus reducing the available bus bandwidth. There can be many devices on the PCI bus capable of initiating a bus transfer. These devices are called bus masters. Without bus mastering, the host processor would be required to initiate and execute transfers. Arbitration mechanisms are provided to prevent a single bus master device from monopolizing the PCI bus. After a certain number of bus cycles, a device must stop its transfer, relinquish the bus, and then attempt to reclaim the bus if needed to complete the transfer. Obviously, if more than one device is transferring data, the 132 MB/s peak bandwidth is shared among the devices.

There are three principal reasons for using on-board memory:

- Analog I/O Throughput
- System Performance
- Guaranteed Availability

**Analog I/O Throughput** - In a common application, such as ultrasound, a waveform is generated at 40 MS/s to take full advantage of the Acquitek CH board analog reconstruction filter. This requires 80 MB/s of output data. Inputs are sampled from a quadrature receiver, again oversampled to ease analog filtering. Two inputs at 20 MS/s require another 80 MB/s, for a combined 160 MB/s that exceeds the 132 MB/s peak PCI bandwidth. With Acquitek's architecture, depicted in Figure 1, the output waveform data is loaded into the on-board memory before input data capture begins, so only the input data has to cross the PCI bus during acquisition. With 533 MB/s of memory bandwidth, the ultrasound application is practical, with memory and PCI bandwidth in reserve.
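The bandwidth budget for this ultrasound example can be checked with a short Python sketch. The 2 bytes per sample is an assumption inferred from the figures in the text (40 MS/s requiring 80 MB/s), and the variable names are illustrative only.

```python
PCI_PEAK_MB_S = 132      # 32-bit, 33 MHz PCI peak
MEMORY_BW_MB_S = 533     # on-board memory bandwidth quoted in the text
BYTES_PER_SAMPLE = 2     # assumption: implied by 40 MS/s needing 80 MB/s


def stream_mb_per_s(channels: int, msamples_per_s: float) -> float:
    """Data rate in MB/s for `channels` streams at the given sample rate."""
    return channels * msamples_per_s * BYTES_PER_SAMPLE


output_rate = stream_mb_per_s(1, 40)   # waveform generation
input_rate = stream_mb_per_s(2, 20)    # quadrature receiver inputs
total_rate = output_rate + input_rate

# Case 1: both streams cross the PCI bus during acquisition.
fits_on_pci_alone = total_rate <= PCI_PEAK_MB_S

# Case 2: output is preloaded on board, so only the input stream
# crosses PCI while the on-board memory carries both streams.
pci_reserve = PCI_PEAK_MB_S - input_rate
memory_reserve = MEMORY_BW_MB_S - total_rate

print(f"Total I/O rate: {total_rate} MB/s, fits on PCI alone: {fits_on_pci_alone}")
print(f"PCI reserve with preloaded output: {pci_reserve} MB/s")
print(f"On-board memory reserve: {memory_reserve} MB/s")
```

The reserve figures are peak-rate differences only; they do not account for transfer overhead or other devices sharing the bus.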





The integration embodied in Exacq's DSP/PCI/SDRAM controller chip allows the use of commodity synchronous DRAM (SDRAM) chips, the same kind commonly found in PC133 memory modules. This is a very low cost, low power solution compared to the static RAM (SRAM) or FIFOs found on other analog I/O boards, and the savings are passed on to customers. Whereas other vendors charge hundreds or thousands of dollars to upgrade to 16 MB of memory, Acquitek boards include that amount, standard!

That memory equates to 200 ms of output data at the full 40 MS/s rate, a duration of, for example, 12 unique NTSC video fields.
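As a rough check of these figures, the sketch below divides the 16 MB of memory by the 80 MB/s output rate and converts the result to NTSC fields. It assumes the same definition of a megabyte for both quantities and uses the standard NTSC field rate of about 59.94 Hz; the names are illustrative.

```python
NTSC_FIELD_RATE_HZ = 60 / 1.001   # about 59.94 fields per second


def buffer_duration_ms(memory_mb: float, rate_mb_per_s: float) -> float:
    """How long a buffer of memory_mb lasts at a constant data rate."""
    return memory_mb * 1000 / rate_mb_per_s


duration_ms = buffer_duration_ms(16, 80)
ntsc_fields = duration_ms / 1000 * NTSC_FIELD_RATE_HZ

print(f"Buffer duration: {duration_ms:.0f} ms")
print(f"NTSC fields in that time: {ntsc_fields:.2f}")
```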

**System Performance** - PCI transfers are much more efficient in burst mode than in single transfer mode. Since the overhead per transaction is fixed, the longer the burst, the greater the efficiency. As discussed above, it is easy to load a system close to or over the peak PCI throughput. In a heavily loaded system, it is crucial to utilize the PCI bus at high efficiency. Due to their extensive memory, Acquitek boards are able to transfer data in up to 128 KB bursts, enabling 99.99% bus efficiency.
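The relationship between burst length and efficiency can be illustrated with a simplified model: a fixed four-clock overhead per transaction (the typical figure cited in the Background section) and one 32-bit word per data clock. This is a sketch under those assumptions only; it ignores arbitration, wait states, and latency timers, and the function name is illustrative.

```python
BUS_WIDTH_BYTES = 4    # 32-bit PCI
OVERHEAD_CLOCKS = 4    # typical start-up cost per transfer, from the text


def burst_efficiency(burst_bytes: int) -> float:
    """Fraction of bus clocks that carry data for one burst."""
    data_clocks = -(-burst_bytes // BUS_WIDTH_BYTES)  # ceiling division
    return data_clocks / (data_clocks + OVERHEAD_CLOCKS)


# Single word, small burst, one 4 KB page, and the 128 KB maximum burst.
for size in (4, 64, 4096, 128 * 1024):
    print(f"{size:>7} bytes: {burst_efficiency(size):.2%}")
```

Under this model a single-word transfer spends most of its clocks on overhead, while a 128 KB burst spends 32768 of 32772 clocks moving data.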

Most PCI Analog I/O boards available today support bus mastering, because generating data transfer addresses is not a productive use of a Pentium processor! However, in many applications, streaming a large capture or playback to/from disk is desirable.

Without a large, accessible memory bank on board, this requires data to make two trips across the PCI bus: once from the analog I/O board to host memory, and again from host memory to the disk controller, as shown in Figure 2a for the capture case.



Figure 2a

By contrast, the Acquitek architecture makes use of an integrated PCI Bridge/SDRAM controller which not only performs bus mastering to and from host memory, but also maps the on-board memory into PCI space so that other bus master devices can access it. This facilitates transfers as shown in Figure 2b, with the disk controller bus mastering directly out of on-board memory to the disk drive. Since the data makes only one trip across the PCI bus, not two, *streaming directly to disk at the full 80 MB/s input rate is achievable*<sup>1</sup>. The same concept works in the output direction as well.



Figure 2b
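The difference between the two data paths comes down to how many times each byte crosses the PCI bus. The sketch below compares the resulting bus load against the 132 MB/s peak for the 80 MB/s capture stream; it is a peak-rate comparison only, and the names are illustrative.

```python
PCI_PEAK_MB_S = 132   # 32-bit, 33 MHz PCI peak


def pci_load_mb_per_s(stream_mb_per_s: float, trips: int) -> float:
    """Bus traffic generated when each byte crosses PCI `trips` times."""
    return stream_mb_per_s * trips


# Figure 2a path: board -> host memory -> disk controller.
two_trip_load = pci_load_mb_per_s(80, 2)
# Figure 2b path: on-board memory -> disk controller.
one_trip_load = pci_load_mb_per_s(80, 1)

for label, load in (("two trips", two_trip_load), ("one trip", one_trip_load)):
    verdict = "within peak" if load <= PCI_PEAK_MB_S else "exceeds peak"
    print(f"{label}: {load} MB/s ({verdict})")
```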

**Guaranteed Availability** - Most of today's Analog I/O boards run under multi-tasking operating systems with virtual memory managers. Windows and Linux are examples of such operating systems (OS).

While Windows may routinely switch out the data acquisition task for up to 10 ms, there are ways in which applications and drivers can hold on to the system for longer. We all know that there are misbehaving applications and poorly written drivers which may block attention from the processor for even longer. A large memory provides the analog I/O board a buffer to ride through these periods for hundreds of milliseconds.
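To get a feel for how much buffering a given stall requires, the sketch below multiplies the stall duration by the 80 MB/s input rate used in the ultrasound example and compares the result with the 16 MB of on-board memory. The 50, 100, and 200 ms stall values are hypothetical, chosen only for illustration, and the model assumes no data is drained from the board during the stall.

```python
ONBOARD_MEMORY_MB = 16   # standard on-board memory
INPUT_RATE_MB_S = 80     # two inputs at 20 MS/s, from the ultrasound example


def buffer_needed_mb(stall_ms: float, rate_mb_per_s: float) -> float:
    """Data that accumulates on the board while the host is not servicing it."""
    return stall_ms * rate_mb_per_s / 1000


# 10 ms is the routine task switch mentioned in the text; the rest are
# hypothetical longer stalls.
for stall_ms in (10, 50, 100, 200):
    needed = buffer_needed_mb(stall_ms, INPUT_RATE_MB_S)
    survives = needed <= ONBOARD_MEMORY_MB
    print(f"{stall_ms:>3} ms stall: {needed:.1f} MB needed, survives: {survives}")
```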

A PCI bus master supporting scatter-gather is able to map the physical memory addresses used on the PCI bus to the addresses of virtual memory used by the OS. Scatter-gather tables take 8 bytes to map each 4 KB block of virtual memory, an overhead of about 0.2%. This doesn't sound like much, but if 10 MB is to be captured, 20 KB is needed to store the scatter-gather table. Without on-board memory to store this table, the bus master must access the OS page table across the PCI bus before every block transfer.
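The table size follows directly from the figures above. The sketch below assumes binary units (1 KB = 1024 bytes) and a page-aligned buffer; a buffer that starts mid-page would need one extra entry. The function name is illustrative.

```python
PAGE_BYTES = 4 * 1024   # one scatter-gather entry maps a 4 KB block
ENTRY_BYTES = 8         # bytes per scatter-gather entry, from the text


def sg_table_bytes(transfer_bytes: int) -> int:
    """Scatter-gather table size for a page-aligned transfer."""
    pages = -(-transfer_bytes // PAGE_BYTES)  # ceiling division
    return pages * ENTRY_BYTES


capture_bytes = 10 * 1024 * 1024            # the 10 MB capture from the text
table_bytes = sg_table_bytes(capture_bytes)
overhead_percent = ENTRY_BYTES / PAGE_BYTES * 100

print(f"Entries: {table_bytes // ENTRY_BYTES}")
print(f"Table size: {table_bytes / 1024:.0f} KB")
print(f"Overhead: {overhead_percent:.3f}%")
```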

<sup>1</sup> This depends on the speed of the IDE controller and disk drive. An external IDE controller and a high-performance drive system are recommended for the full 80 MB/s throughput.