Traits of the True Non-Volatile FPGA

Bertrand Leigh

Lattice has recently announced the industry's first true 90nm non-volatile FPGA, the LatticeXP2 family. The product announcement, along with recent industry activity in non-volatile FPGAs, has prompted me to define the basic traits of a real non-volatile FPGA.

The key capabilities that FPGA designers should look for in a non-volatile FPGA device are:

  1. "Instant-on" logic.
  2. High security with enhanced security capabilities.
  3. Small footprint and density migration options.

When you look at these capabilities, they may seem trivial. However, achieving them in a 90nm non-volatile technology is far from trivial. Let's take them one at a time.

Instant-on logic: the FPGA logic is active within 2ms of the last power supply reaching its minimum voltage. Compare this to a traditional SRAM-based FPGA, or to the recently announced hybrid or die-stack "non-volatile" FPGAs, where configuration times are typically more than an order of magnitude higher at >100ms. This significant difference in configuration time allows FPGA designers to use the LatticeXP2 for power-up reset and control logic.

Security: a single-chip flash-based technology naturally gives the FPGA user enhanced security, simply because the configuration data is transferred on-chip rather than from a separate off-chip non-volatile device. If you choose to transfer from an off-chip device, the LatticeXP2 FPGA lets you encrypt the data transfer. In addition, the LatticeXP2 device comes with enhanced security capabilities such as a Flash Protect Key that locks the Flash programming, and an OTP fuse option that disables any unwanted re-configuration of the FPGA altogether.

Small footprint and density migration: because it is a one-chip solution, the LatticeXP2 devices will be offered in a variety of packages, including the small-footprint 132csBGA, flexible QFP packages like 144TQFP and 208PQFP, and high-pin-count packages such as 256fpBGA, 484fpBGA and 672fpBGA.

The LatticeXP2 is Lattice's third-generation non-volatile programmable device, with a feature set carried over from many years of field experience in programmability. You will find many other elegant programmability features, such as FlashBAK (user Flash) capability, Serial Tag memory, TransFR (Transparent Field Reprogrammability) and Dual-boot capability.

To find out more about these features, please join our series of webcasts on these exciting topics on LatticeXP2 over the next few weeks.

Glitch Free for the New Year

Bertrand Leigh

This is a continuation of my previous blog post, "WYSIWYG Does Not Apply for Input Noise".

Once you have determined that input noise is causing malfunction in your logic, what are some ways to fix it?  If you have a situation where PCB changes would be painful, try this as one of the techniques that could help you deglitch your clock input:

[Figure: deglitch diagram]

The register implementation can be used to eliminate multiple clock transitions on both the rising and falling edges of the clock. It simply holds the registers in a reset state whenever the clock has unwanted multiple transitions close to each other.

May your New Year be glitch free.  Wishing you a safe and happy FPGA new year.

WYSIWYG Does Not Apply for Input Noise

Bertrand Leigh

We recently ran into an FPGA hardware debugging situation where WYSIWYG was not true, so I decided to write this blog post to help you save some hardware debugging time if you face a similar situation when debugging your system with FPGAs.

Most of you who are familiar with hardware debugging can relate to situations where you have an unexpected output switching condition caused by a certain input, but when you put a scope probe to measure what might be causing the problem, the problem goes away. 

With slower process technologies, you may be able to detect a remnant of the input noise when you probe the input signal, which might give you a clue. In our particular case, the input noise was a pulse less than 1ns wide. Unless you have access to a 20GHz sampling scope, you will not be able to capture this type of noise condition properly. To further complicate things for us, this noise pulse was occurring somewhere in an approximately 20us timing window. So we were looking for a pulse that occupied about 0.005% of a timing window, without knowing where in that window it occurred.
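To put numbers on that search, a quick back-of-the-envelope calculation helps. The sketch below only restates the figures above (a 1ns pulse, treated here as the upper bound on its width, somewhere in a 20us window); the function and constant names are purely illustrative.

```python
def pulse_fraction(pulse_width_s, window_s):
    """Fraction of the timing window occupied by the noise pulse."""
    return pulse_width_s / window_s


def pulse_slots(pulse_width_s, window_s):
    """Number of pulse-wide positions the pulse could occupy in the window."""
    return round(window_s / pulse_width_s)


PULSE_WIDTH_S = 1e-9   # upper bound on the noise pulse width (1ns)
WINDOW_S = 20e-6       # approximate timing window (20us)

fraction = pulse_fraction(PULSE_WIDTH_S, WINDOW_S)
slots = pulse_slots(PULSE_WIDTH_S, WINDOW_S)
print(f"pulse occupies {fraction * 100:.3f}% of the window")
print(f"{slots} possible 1ns positions in the window")
```

In other words, the pulse could sit in any one of roughly 20,000 positions, which is why probing blindly was not going to find it.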

A few key points that you want to preserve in this type of debugging situation are:

  1. Preserve the input condition (i.e., you don't want to put any scope probe on the input signal) so that the output error condition is preserved.
  2. You need to be able to trigger the error condition and be able to observe what the input is doing. 

How do you monitor the input without putting a scope probe on the input pin itself?  In our case, we were dealing with an FPGA, so we could take the suspected input signal, duplicate the node and internally route the signal out to an observable output pin.  Once we did that, we could clearly see that the input signal was causing the input translator to trip by way of the observable output pin.  We really never saw the actual input noise pulse that was causing the problem but we were able to prove that there was a noise condition on the input that caused the input buffer to detect a transition.

Although we were dealing with 0.18 micron technology in our particular situation, the problem is not going to get easier with 90nm and 65nm devices, which are faster, so debugging them requires the ability to detect even narrower input noise pulses.

What are your FPGA prototyping needs?

Bertrand Leigh

With every new FPGA product introduction, Lattice typically provides several evaluation boards to make it a bit easier for you to prototype and test the capabilities of our FPGA products. Our Applications Engineering group has a significant influence on evaluation boards; some of them are defined and built by our Applications Engineers. With this blog posting, I would like to do an informal survey of what you would like to see in future evaluation boards.

Here is a bit of background on the typical types of evaluation boards we build:

1) To demonstrate specialized features of our FPGAs.  In the past we have provided capabilities such as PCI, PCI Express, DDR1/DDR2, SPI4.2 and others as part of our Advanced Evaluation Boards. 

2) To provide generic prototyping and evaluation capability with general prototyping area and very basic power supply and programming capability.  This is provided as our Standard Evaluation Board.

When I am working in the lab and want to take a quick look at something, there always seems to be a need to put an FPGA device in a socket with a very basic power supply and programming capability. For that I don't need a fancy connector to run multi-Gbps or fancy termination like DDR2. I just need basic functionality that can run at a reasonable speed (50MHz-100MHz). 

Hence, the reason for this blog post is to get your feedback. What are your FPGA prototyping needs?  For your answer please consider the following:

1) Does Lattice's current evaluation board offering satisfy your requirements?  If so, what features do you use? If not, what features would you like to see?

2) What is your price threshold for purchasing an FPGA evaluation board?

3) What is your operation speed requirement in MHz?

4) Would you like to see a very low cost FPGA prototype board? What capabilities do you need on this type of a board?

I look forward to your feedback on this.

Low-cost FPGA with SERDES

Bertrand Leigh

It's an exciting day at Lattice as we introduce the LatticeECP2M, which:

1. enhances the amount of Embedded Block RAM (EBR) and
2. adds SERDES capability to the popular LatticeECP2 family.

The LatticeECP2M addresses the growing requirement for high-speed serial I/O interfaces in low-cost FPGAs. Until now, if you needed SERDES in an FPGA, your selection would be limited to high-end (i.e., relatively expensive) FPGAs. In a continuing effort to bring premium features to affordable FPGAs, Lattice has added 4 to 16 SERDES channels (up to 3.125Gbps per channel) to the LatticeECP2M, a chip that already has a high EBR-to-LUT ratio, DSP blocks and DDR2 memory interface support.

As part of the product introduction, we will be conducting a series of webcasts over the next several weeks, highlighting the new device's capabilities such as digital video interfaces, detailed device architecture and SERDES features.

Speedy FPGA

Bertrand Leigh

When I read to my 4-year-old about speedy boats and speedy cars, it was very clear to him how fast they go. But how fast do FPGAs go?

I should give him credit for the inspiration to write this piece: an attempt to explain FPGA speed so that we all have a similar baseline understanding of how to estimate FPGA operating speed.

The following functional areas determine the overall operating speed of the FPGA:

  1. I/O interface
  2. Logic implementation
  3. Clock tree and PLL
  4. Other functional blocks

The I/O speed is generally measured by input setup/hold time and output clock-to-out time. This will give us the raw speed in terms of frequency (MHz) or data rate (Mbps) of the individual I/O interface.  If an application requires multiple I/O pins, the skew between the pins will affect the I/O operation speed as well.

Logic implementation speed is generally determined by the internal register-to-register operation speed. The register-to-register speed is determined by the logic block and routing delays between the registers. The FPGA static timing analysis tool will report the register-to-register speed in MHz or as a point-to-point delay.

Clock tree delay skew and PLL (sysCLOCK PLL) speed are also determining factors for the logic speed. Clock tree delay skew will directly impact the logic register-to-register speed. Again, the FPGA static timing analysis tool will report this as part of the register-to-register speed of operation in MHz. PLLs will generally be able to support the fastest speed at which the registers and logic can operate.

Other functional blocks like Embedded Block RAM (EBR) and Multiplier (sysDSP) will also have their associated operating frequencies. These operating frequencies are all factored into the overall speed of operation for the FPGA.

The question "how fast do FPGAs go?" can be answered in terms of frequency in MHz.  But this simple frequency of operation is made up of all the components discussed above.  The following are the LatticeECP2 FPGA specifications for each component of speed. These specifications tell us how fast LatticeECP2 FPGAs go.

  • I/O - 840Mbps generic LVDS
  • Logic - 250MHz to 500MHz
  • Clock tree/PLL - 420MHz
  • EBR - 350MHz
  • DSP - 325MHz
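One rough way to use these figures is to treat each one as a ceiling and ask which block sets the limit for a given clock domain. The sketch below does only that, using the numbers listed above; the dictionary keys and the function name are made up for illustration, the low end of the logic range is an assumption, and the real answer for any design still comes from the static timing analysis tool.

```python
# Per-block limits quoted above for LatticeECP2, in MHz.
# Logic is quoted as a 250MHz to 500MHz range; the low end is assumed here.
# The 840Mbps I/O figure is a data rate, so it is left out of this comparison.
BLOCK_FMAX_MHZ = {
    "logic": 250.0,
    "clock_tree_pll": 420.0,
    "ebr": 350.0,
    "dsp": 325.0,
}


def limiting_block(blocks_used):
    """Return (name, fmax_mhz) of the slowest block in a clock domain."""
    name = min(blocks_used, key=lambda block: BLOCK_FMAX_MHZ[block])
    return name, BLOCK_FMAX_MHZ[name]


name, fmax = limiting_block(["clock_tree_pll", "ebr", "dsp"])
print(f"limited by {name} at {fmax:.0f}MHz")
```

For a clock domain that touches the clock tree, EBR and DSP blocks, the DSP figure is the ceiling; add general logic at the low end of its range and the logic becomes the limit instead.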

LUT by any other name...

Bertrand Leigh

When a problem or an issue gets too complex, I find it helpful to always get back to the basics. This is the case with estimating FPGA resource utilization or gate count. It is natural for industry publications to discuss FPGA density in terms of gate count. After all, today's FPGAs are at gate densities that ASICs were at a few years ago.

But what does gate count mean in the FPGA world? Gate counts do, sort of, give a rough estimate of complexity of function(s) that you are trying to implement into an FPGA. And depending on which marketing friend you talk to, the gate count has a large sliding scale.

Now, back to the basics.  FPGAs really only have certain basic functional blocks, as listed below in order of importance. You should be able to convert information given in an FPGA data sheet to these basic units:

    [Figure: slice diagram]

  1. The most basic logic block is a LUT and Register pair. The associated diagram on this posting shows a Slice, which is a 2-LUT/2-Register unit.
  2. An I/O cell or pin.
  3. Embedded Block RAM.
  4. A DSP or Multiplier block.

The rest of the special functional blocks don't make it to my main list since they only consume a small portion of the device and/or only a limited number of designers will use them:

  1. PLL/DLL
  2. SERDES
  3. Microprocessor
  4. Analog

With the basic functional blocks, I can get to a fairly accurate FPGA device utilization, within 5% to 10% of the actual implementation.  The units of measure for each of the basic blocks are: LUT/Register count, I/O cell or pin count, Block RAM in raw bits (or number of blocks if you know the memory block size), and Multiplier Blocks (also pay attention to the multiplier width 9x9, 18x18, etc.).
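As a sketch of how that estimate might be written down, the snippet below compares a design's needs against a device's resources, one basic block type at a time. The `Resources` class, the `utilization` function and every number in it are hypothetical and chosen only for illustration; real values would come from your design estimate and the device data sheet.

```python
from dataclasses import dataclass, fields


@dataclass
class Resources:
    luts: int         # LUT/register pairs
    io_pins: int      # I/O cells or pins
    ebr_bits: int     # Block RAM in raw bits
    multipliers: int  # multiplier blocks of one width, e.g. 18x18


def utilization(design, device):
    """Percent of each basic block type that the design uses on the device."""
    percent = {}
    for field in fields(Resources):
        needed = getattr(design, field.name)
        available = getattr(device, field.name)
        if available == 0:
            # Nothing to offer: any demand at all means the design cannot fit.
            percent[field.name] = float("inf") if needed else 0.0
        else:
            percent[field.name] = 100.0 * needed / available
    return percent


# Hypothetical numbers for illustration only, not taken from any data sheet.
design = Resources(luts=20_000, io_pins=150, ebr_bits=400_000, multipliers=12)
device = Resources(luts=25_000, io_pins=300, ebr_bits=1_000_000, multipliers=24)

usage = utilization(design, device)
for block, pct in usage.items():
    print(f"{block}: {pct:.0f}%")
print("fits" if all(pct <= 100.0 for pct in usage.values()) else "does not fit")
```

The point of keeping the four block types separate is that a design can run out of any one of them first; a single combined "gate count" hides which one that is.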

Getting back to my thoughts on the title...

"LUT by any other name... is not an accurate measure of FPGA utilization"

FPGAs have used the 4-input look-up table (LUT4) structure as their basic logic block from the beginning. When I hear that a function takes 20K LUTs, it is clear in my mind what density FPGA device I am supposed to choose.

When I hear terms like 1.5 million gates, 16K logic cells, or 18K logic elements, those terms might as well be replaced by "widgets". My mind does not easily translate "widgets" into device utilization. 

Must We Always Have Models?

Bertrand Leigh

Our industry has come to a point where engineers expect simulation models for everything we do. Without a simulation model, we appear to be lost when making even basic engineering judgments.

Take, for example, SSO (Simultaneous Switching Output) noise. This is not a new concept. We dealt with this type of noise a while ago with 20-pin to 24-pin PDIP packages. In these PDIP packages, assigning the Vcc and GND pins to the corner pins gave us the largest lead inductance in the package, which in turn caused significant SSO noise if we were not careful.

The basic concept of SSO noise is governed by the equation V=L*(di/dt).

  • V is the SSO noise voltage generated by the simultaneously switching outputs
  • L is the combined inductance of the bond wire, lead frame and PCB connectivity components of the Vcc or GND path
  • di/dt is the rate of change of the instantaneous switching current of each of the switching pins
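To get a feel for the magnitudes, the equation can be evaluated directly. The sketch below is a first-order estimate only: the inductance (2nH) and edge time (1ns) are assumed values picked for illustration, not measurements, and splitting the current evenly across ground paths is a simplifying assumption.

```python
def sso_noise_volts(inductance_h, di_dt_per_pin, switching_pins, ground_paths=1):
    """First-order V = L * di/dt, with total di/dt shared evenly by the ground paths."""
    total_di_dt = di_dt_per_pin * switching_pins
    return inductance_h * total_di_dt / ground_paths


# Assumed values for illustration only.
INDUCTANCE_H = 2e-9            # combined Vcc/GND path inductance (2nH)
DI_DT_PER_PIN = 16e-3 / 1e-9   # 16mA switched in 1ns, in amps per second
SWITCHING_PINS = 16

one_ground = sso_noise_volts(INDUCTANCE_H, DI_DT_PER_PIN, SWITCHING_PINS)
four_grounds = sso_noise_volts(INDUCTANCE_H, DI_DT_PER_PIN, SWITCHING_PINS,
                               ground_paths=4)
print(f"one ground path:   {one_ground * 1000:.0f}mV")
print(f"four ground paths: {four_grounds * 1000:.0f}mV")
```

Even this crude model shows the levers available without a simulator: lower the drive current or slow the edge to shrink di/dt, share the current across more ground pins, or reduce L.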

No doubt, the industry has moved from simple PDIP packages to more complex packaging technologies with many combinations of I/O pins and drive settings that can switch simultaneously. Don't get me wrong, I am your typical engineer who would love to have a simulation model, if an accurate one is available, without having to take out a second mortgage on my house to get it.

So, are we to stop designing systems with SSO conditions because we don't have access to an SSO simulator? There are a few practical things you can do to minimize the effect of the SSO noise. They are based on understanding the principle and paying attention to pin assignments when you design your FPGA.

1) The instantaneous switching current di/dt is one of the two contributing factors. Avoiding excessive drive current (i.e., choosing the appropriate programmable drive setting from 4mA, 8mA, 12mA, 16mA and 20mA) and choosing the appropriate slew rate setting can minimize its contribution.

2) Distributing your simultaneously switching outputs across multiple GND pins on a given package will also reduce the SSO noise. This, again, effectively reduces the di/dt seen by each GND pin.

3) The other component is the L, the Vcc and GND inductance. You don't have much control of the device packaging component of the L. You have to trust that the FPGA chip and package designer has taken care of providing you the optimum L for a given package. However, the PCB component of the L should be minimized through proper Vcc and GND layout techniques.

To give you a sense of how much SSO noise you get in practice, our typical lab measurements of 16 contiguous outputs switching at 16mA, with fast slew rate into a lumped 5pF load per pin, show SSO noise on the order of 650mV for a >200 ball fpBGA package. This assumes you have also taken care of proper Vcc and GND PCB layout.

Rest assured that we are working towards creating better simulation models, including SSO. It would be useful to have a model when we have to reliably design a DDR2 memory interface running at a 400Mbps data rate with 64 I/O pins switching. But when you are implementing a 16- or 32-bit data bus interface to a microcontroller that is running at 33MHz, careful pin assignment and drive setting will go a long way to avoid SSO noise, without simulation.

Slow Inputs Acceptable For FPGA?

Bertrand Leigh

This is a common question I get from designers whose system requirements force them to drive relatively slow input signals into fast FPGA inputs: "How can I prevent the inputs from causing output oscillation?"

The theory behind this behavior of an FPGA input, or of any fast CMOS device input, is simple. As the input rise time gets slower, the signal stays in the input transition region for a longer period of time. This, combined with any signal noise present during the transition, causes the input translator to detect false transitions, which in turn get translated to the output. The end result is usually output oscillation.

You can effectively use an input series resistor with the FPGA's internal input bus-hold latches to improve the slow input ramp time tolerance. In our lab, this has been shown to extend the tolerable input ramp time from hundreds of nanoseconds to microseconds. More details can be found in the technote posted on the Lattice web site.
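The false-transition mechanism is easy to reproduce in a toy model. The sketch below is not a model of any Lattice input buffer: it feeds a slow ramp with a small ripple (standing in for noise) into a single-threshold buffer, and then into one with separate rising and falling thresholds. Treating the series resistor plus bus-hold latch as a form of hysteresis is my simplifying assumption, and all thresholds and amplitudes are invented, normalized values.

```python
import math


def input_samples(n=2000, ripple=0.05, ripple_cycles=100):
    """Slow 0-to-1 ramp with a small superimposed ripple standing in for noise."""
    return [i / (n - 1) + ripple * math.sin(2 * math.pi * ripple_cycles * i / (n - 1))
            for i in range(n)]


def count_output_transitions(samples, rising_threshold, falling_threshold):
    """Count output toggles for a buffer with the given switching thresholds."""
    state = False
    transitions = 0
    for v in samples:
        if not state and v > rising_threshold:
            state, transitions = True, transitions + 1
        elif state and v < falling_threshold:
            state, transitions = False, transitions + 1
    return transitions


samples = input_samples()
plain = count_output_transitions(samples, 0.5, 0.5)
with_hysteresis = count_output_transitions(samples, 0.6, 0.4)
print(f"single threshold: {plain} output transitions")
print(f"with hysteresis:  {with_hysteresis} output transition(s)")
```

With a single threshold, the ripple drags the signal back and forth across the switching point and the output toggles repeatedly; once the gap between the two thresholds exceeds the ripple, the same input produces one clean transition.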

Why I Blog

Bertrand Leigh

Since I have been asked to start blogging, I've given it some thought. In my busy schedule, do I really want to take on another task that I have to schedule? Well, after much deliberation, I came to these conclusions about why I should blog.

I have been in the PLD industry for close to 20 years. That's even old in dog years! All this time I have been in Applications Engineering and I have talked to countless engineers. So I should be able to pass on all that life experience to make things easier for someone who might be facing a situation similar to one I've been through. If I can save them a few hours to a few weeks of struggling to solve a technical problem, that alone would make starting my blog worthwhile.

Managing a group of Applications Engineers and interfacing with many more Field Applications Engineers, I am constantly answering questions, learning and figuring out technical problems. My colleagues at the office will tell you that my office and phone queue is usually 2 to 3 deep. If I am not talking to a person, I am probably answering my email. I hope to make my blog another communication avenue, not only for external engineers but also for my colleagues, my direct reports and the field engineers that I interface with on a day-to-day basis. This should, of course, not replace any critical communications, but rather enhance information transfer for common information that I find myself repeating several times. So I welcome your input and comments on my blog to make this communication work for everyone's benefit. Someone once said that being an Applications Engineer is like being nibbled to death by a bunch of ducks. I hope to reduce the nibbling with this blog.

It's a rare opportunity for engineers to express our own opinions. With my blog, I hope I can honestly express what I feel about some of the issues and implementations related to the PLD industry.

The general topics I will be following are:
1: industry-relevant discussion/opinion
2: PLD tricks of the trade
3: Lattice product specific discussion/answers
4: response to the comments/email.

Let the blogging begin, and I will tell you how my expectations match reality after a few months.

Bertrand Leigh, Lattice Applications Engineering