
LUT by any other name...

Bertrand Leigh

When a problem or an issue gets too complex, I find it helpful to always get back to the basics. This is the case with estimating FPGA resource utilization or gate count. It is natural for industry publications to discuss FPGA density in terms of gate count. After all, today's FPGAs are at gate densities that ASICs were at a few years ago.

But what does gate count mean in the FPGA world? Gate counts do, sort of, give a rough estimate of the complexity of the function(s) that you are trying to implement in an FPGA. But depending on which marketing friend you talk to, the gate count sits on a large sliding scale.

Now, back to the basics.  FPGAs really only have certain basic functional blocks, as listed below in order of importance. You should be able to convert information given in an FPGA data sheet to these basic units:

    [Figure: slice diagram]

  1. The most basic logic block is a LUT and Register pair. The associated diagram on this posting shows a Slice, which is a unit of 2 LUTs and 2 Registers.
  2. An I/O cell or pin.
  3. Embedded Block RAM.
  4. A DSP or Multiplier block.

The rest of the special functional blocks don't make it to my main list since they only consume a small portion of the device and/or only a limited number of designers will use them:

  1. PLL/DLL
  2. SERDES
  3. Microprocessor
  4. Analog

With the basic functional blocks, I can get to a fairly accurate FPGA device utilization, within 5% to 10% of the actual implementation.  The units of measure for each of the basic blocks are: LUT/Register count, I/O cell or pin count, Block RAM in raw bits (or number of blocks if you know the memory block size), and Multiplier Blocks (also pay attention to the multiplier width 9x9, 18x18, etc.).
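As a rough illustration of the bookkeeping described above, here is a minimal Python sketch that compares a design's needs against a device's capacity for each of the four basic blocks. The `Resources` class, the `utilization` function, and all of the numbers are hypothetical, chosen only for the example; they do not come from any data sheet.

```python
from dataclasses import dataclass, fields


@dataclass
class Resources:
    """Counts of the four basic FPGA blocks named in the article."""
    lut_register_pairs: int
    io_pins: int
    block_ram_bits: int   # Block RAM in raw bits
    multipliers: int      # blocks of a single width, e.g. 18x18


def utilization(design: Resources, device: Resources) -> dict:
    """Percent of each device resource that the design would consume."""
    return {
        f.name: 100.0 * getattr(design, f.name) / getattr(device, f.name)
        for f in fields(Resources)
    }


# Hypothetical numbers, for illustration only (not from any data sheet).
design = Resources(lut_register_pairs=20_000, io_pins=150,
                   block_ram_bits=400_000, multipliers=12)
device = Resources(lut_register_pairs=25_000, io_pins=300,
                   block_ram_bits=1_000_000, multipliers=24)

for name, pct in utilization(design, device).items():
    print(f"{name}: {pct:.0f}%")
```

The resource with the highest percentage is usually the one that decides which device density you need, so it is worth looking at all four numbers rather than the LUT count alone.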

Getting back to my thoughts on the title...

"LUT by any other name... is not an accurate measure of FPGA utilization"

FPGAs have used the 4-input look-up table (LUT4) structure as their basic logic block from the beginning. When I hear that a function takes 20K LUTs, it is clear in my mind what density of FPGA device I am supposed to choose.

When I hear terms like 1.5 million gates, 16K logic cells, or 18K logic elements, those terms might as well be replaced by "widgets". My mind does not easily translate "widgets" into device utilization. 

Comments

Bertrand,
Great article. We try to explain this to customer engineers and they are finally getting the message.

Bernie Montoya
Summit Sales

If I understand correctly, LUT4 is under a Xilinx patent, so some companies only provide LUT3 in their devices?

How does that kind of architecture perform in real-world applications? Does Lattice have any experience with those?

Lattice FPGAs use the mainstream architecture for their logic block (i.e., a LUT4 and Register combination), so our experience and estimation guidelines are based on this structure. Logic utilization estimates must also take into consideration the overall device space and routing resources. In practice, this means leaving headroom on the order of 20% or more of a given device's density.
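To make the headroom guideline concrete, here is a small Python sketch. It assumes the ~20% figure is read as "keep about 20% of the device's LUTs unused", which is one interpretation of the guideline rather than an official sizing rule. The function name `min_device_luts` is hypothetical.

```python
def min_device_luts(required_luts: int, headroom_percent: int = 20) -> int:
    """Smallest device LUT count that leaves the given headroom unused.

    Assumption: headroom is a percentage of the *device* capacity, so the
    design may occupy at most (100 - headroom_percent)% of the device.
    """
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be in the range [0, 100)")
    usable_percent = 100 - headroom_percent
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-required_luts * 100 // usable_percent)


# A 20K-LUT function with 20% headroom calls for at least a 25K-LUT device.
print(min_device_luts(20_000))
```

Under this reading, a design estimated at 20K LUTs would be targeted at a device with at least 25K LUTs.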

As for the LUT3 implementation you mention, there is only one FPGA vendor that promotes this structure. But we can go back to the basics in the same way when counting resources. One LUT3 roughly translates to 0.75 of a LUT4 (1*LUT3 = 0.75*LUT4). Further, a tile in that architecture can implement either a LUT3 or a register, but not both. So a 0.75*LUT4 and register pair translates to the equivalent of 2*LUT3 tile resources.
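The conversion above can be sketched in a few lines of Python. This only encodes the rough rule of thumb stated here (1 LUT3 is about 0.75 of a LUT4, and each register occupies its own tile); the function name `lut3_tiles` is hypothetical, and real mapping results will vary with the design.

```python
def lut3_tiles(lut4_count: int, register_count: int) -> int:
    """Estimate LUT3-style tiles needed for a LUT4/Register based estimate.

    Rule of thumb from the text: 1 LUT3 ~= 0.75 LUT4, so logic needs
    lut4_count / 0.75 = lut4_count * 4 / 3 tiles (rounded up), and every
    register takes one additional tile of its own.
    """
    logic_tiles = -(-lut4_count * 4 // 3)   # integer ceiling of 4n/3
    return logic_tiles + register_count


# Four "0.75*LUT4 and register" pairs = 3 LUT4s of logic + 4 registers,
# which the rule of thumb maps to 4 * 2 = 8 tiles.
print(lut3_tiles(3, 4))
```

Note how quickly the tile count grows for register-heavy designs: under this rule, 20K LUT4/Register pairs come out to roughly 46.7K tiles.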

Thank you for the explanation. I have a question about resource utilization in terms of Slices and LUTs. We can get both numbers after mapping. As we know, the used Slices may contain some unused LUTs and registers. Do the routing resources of these Slices make it impossible to use the remaining LUTs? Is routing resource utilization also important? And which is more important, the utilization of Slices or of LUTs?

Unused LUTs and Registers within a Slice can still be used by the software when it logically makes sense to place additional user logic in them. As you mentioned in your posting, part of the Slice's resources are already used by the existing logic, so the s/w will make use of the available LUTs and Registers when:

1) The additional LUT and Register can share the already utilized routing, input and output resources to the Slice.

2) There are enough resources to implement a completely independent LUT and Register.

Routing resource utilization is important because the Lattice Place & Route tool needs enough room to implement the user-specified logic. The only way you as the user can affect routing resource usage is to code your logic in the most connectivity-efficient way. You will have to trust that we have done our homework to provide enough routing resources in the h/w and that the software algorithm is also efficient in converting your design.

As for the utilization of Slices versus LUTs, I would count the LUTs and Registers to get a more accurate sense of my device utilization. The Slice count provides only a general sense of your device utilization, since a Slice counts as used even when some of its LUTs and Registers are still free. For that reason, I think the LUT count gives you better resolution.
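A toy Python example may help show why the two numbers differ. It assumes a Slice of 2 LUTs (as in the diagram on this posting) and that a Slice is reported as used as soon as any one of its LUTs is used. Registers are ignored to keep the sketch short, and the function name and data are hypothetical.

```python
def slice_vs_lut_utilization(luts_used_per_slice, luts_per_slice=2):
    """Return (slice_percent, lut_percent) for a list of per-Slice LUT usage.

    Assumption: a Slice counts as "used" if at least one of its LUTs is
    used, while LUT utilization counts every LUT individually.
    """
    total_slices = len(luts_used_per_slice)
    used_slices = sum(1 for n in luts_used_per_slice if n > 0)
    used_luts = sum(luts_used_per_slice)
    slice_percent = 100.0 * used_slices / total_slices
    lut_percent = 100.0 * used_luts / (total_slices * luts_per_slice)
    return slice_percent, lut_percent


# Four Slices: one full, two half-used, one empty.
slice_pct, lut_pct = slice_vs_lut_utilization([2, 1, 1, 0])
print(f"Slices: {slice_pct:.0f}%  LUTs: {lut_pct:.0f}%")
```

In this toy case the Slice report says 75% while only half of the LUTs are in use, which is the gap in resolution described above.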
