Trust, but verify. How to catch peanut butter engineering before it spreads into your system — Part 1: Validation.

I will cover this topic in two blog posts: Part 1 on validation (i.e. post-silicon) and Part 2 on verification (pre-silicon). This post focuses on validation.

One of the upsides of using catalog chips that have been on the market for a long time and have ramped in substantial volume is that other system companies have already found a lot of the bugs, and the chip supplier has had an opportunity to fix them, screen out or calibrate defective parts at automated test equipment (ATE), withdraw the chip from the market, or at least warn new users about them in an errata sheet. Your system may be different, and you may still get bitten by a bug that is exposed by your unique operating conditions, but in general catalog parts that have been ramped in volume for some time provide a certain herd immunity to the system companies that use them.

Unfortunately, in larger volumes catalog chips will also cost you significantly more than a custom silicon chip, the footprint will be significantly bigger once you count all the components needed, and system designers will have a less flexible set of options. But when you go the custom silicon route, YOU are the first and possibly the ONLY user of this chip, so how do you prevent silicon bugs from getting into your system?

First, let’s talk about the risks:

  • Peanut butter engineering at your chip supplier. This refers to the reality that your chip supplier is in the business of making as many chips as it can in as short a time as it can. A strong engineering culture at your supplier is a mitigation in general, but chip companies are under commercial pressure to deliver revenue, and the pressure from management is generally to produce more with the same engineering resources: to spread the peanut butter, so to speak, over all the chips they are working on. So how does peanut butter engineering manifest itself in real life during validation? Here are some ways:
  1. Very liberal (i.e. watered down) interpretations of JEDEC standards.
  • Automated test equipment (ATE) program changes over the lifetime of the product. The J-STD-046 standard defines some of the reasons why a chip supplier must inform customers that a major change has been made, provided that they have purchased components within the prior 2 years or that the supplier has contractual obligations to them. Annex A of J-STD-046, under datasheet changes, lists the “Elimination of final electrical measurement or burn-in (if specifically stated in the datasheet as being performed)” as an example of a major change that requires a PCN (product or process change notice). However, most datasheets that I’ve seen don’t explicitly state what is ATE tested and what isn’t, so absent some other agreement with the system company, the supplier will not issue a PCN for a test program change. In my experience, suppliers remove tests over the lifetime of the chip without issuing a PCN. They do this in response to market pressure to reduce the price of the product while maintaining margins, which means they lower costs by removing tests based on “historical data” for that chip.

All of these risks have mitigations, which are fairly straightforward to implement as long as the chip supplier is cooperative and the system company has the right specialists on its side. The following are the main mitigations for the risks above:

  • Contracts. Require in your contracts with custom silicon suppliers the following:
  1. Establish your PCN requirements in your supplier agreement. Make sure to cover changes to the ATE program. This can get messy, though, since suppliers change these programs often. They would have to issue a new datasheet showing which parameters are no longer covered by the datasheet, assuming you implement point (2) below.
  • Review all validation reports in detail. Check the following in the reports:
  1. Look at the plots to check that the number of data points truly adds up to the sample size the supplier says it used to calculate Cpk, and that the data was actually taken on your chip rather than re-used from a different chip.
  • ECO reviews. When the chip is evaluated, some bugs may be found. It is critical that the system company has chip specialists reviewing the ECOs (engineering change orders) proposed by the chip supplier to make sure the supplier completes a proper root cause analysis. Depending on the types of bugs found, the specialists needed may include DV and AMS verification engineers, analog chip designers, digital chip designers, RF chip designers, package engineers, foundry engineers, and others. Chip suppliers are sometimes under pressure to tape out quick fixes, or to simply hand-wave issues away and ask for spec limit changes. This is a very serious danger to your custom chip program: bad or incomplete root causing can trap you in a run-break-fix cycle with multiple tape-outs, which puts your system schedule at risk. Many tools are available to debug chip issues, such as FIB (focused ion beam) edits and other FA (failure analysis) techniques. You must check that proper methods are being used to root cause your chip’s bugs, and not accept incomplete root causes for your ECOs. This is why it is vital that your system company has chip technical experts on its side, so that your project is not the one where the chip supplier spreads resources thin and peanut butter engineering adds risk to your system launch.
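
The validation report check described above (does the raw data match the claimed sample size and Cpk, and is it really your chip’s data?) can be sketched in a few lines of Python. This is only an illustration under assumptions: it presumes the supplier gives you the raw measurements behind each plot, and the function names, the tolerance and the example numbers are hypothetical rather than taken from any standard or supplier report format.

```python
import statistics

def cpk(samples, lsl, usl):
    """Cpk = min(USL - mean, mean - LSL) / (3 * sigma)."""
    mean = statistics.fmean(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("zero spread: Cpk is undefined")
    return min(usl - mean, mean - lsl) / (3 * sigma)

def check_report(samples, lsl, usl, claimed_n, claimed_cpk,
                 other_chip_samples=None, tol=0.05):
    """Return a list of discrepancies between the raw data and the report.

    tol is an arbitrary (assumed) allowance for rounding in the report.
    """
    findings = []
    if len(samples) != claimed_n:
        findings.append(
            f"sample count {len(samples)} does not match claimed {claimed_n}")
    actual = cpk(samples, lsl, usl)
    if abs(actual - claimed_cpk) > tol:
        findings.append(
            f"recomputed Cpk {actual:.2f} does not match claimed {claimed_cpk:.2f}")
    # Identical values to an older chip's dataset suggest re-used data.
    if other_chip_samples is not None and sorted(samples) == sorted(other_chip_samples):
        findings.append("data is identical to another chip's dataset")
    return findings

# Hypothetical example: the report claims 30 units, but only 5 points are plotted.
measurements = [1.0, 1.1, 0.9, 1.0, 1.0]
for finding in check_report(measurements, lsl=0.7, usl=1.3,
                            claimed_n=30, claimed_cpk=1.41):
    print(finding)
```

A check like this does not replace reading the plots yourself, but it makes a mismatch between the claimed and the actual sample size hard to miss.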

Some suppliers will tell you that corner units will be filtered out at ATE, to try to avoid having to provide corner samples. That is not a valid argument, because many parameters in a datasheet are not ATE tested, and even the ones that are may not be tested in the future as the supplier removes tests over the lifetime of the product. Also, if you do all your builds with mostly nominal material you may not see an issue, but once you hit volume you will start seeing DPPM (defective parts per million) issues if your system is sensitive to some of the chip supplier’s corners. It’s better to catch this early and either spin the chip to fix the sensitivity, calibrate it out at ATE, or have the affected parts filtered out at the chip supplier’s ATE so you never receive them. Any parameter you find to be critical in this way should be flagged to the supplier as a “never remove ATE testing” parameter.
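
One way to keep track of “never remove ATE testing” parameters is to compare the test lists of two ATE program revisions whenever the supplier shares them. The sketch below assumes you can get such a list per revision, which depends on your supplier agreement. The test names and function names are made up for illustration.

```python
def removed_tests(old_program, new_program):
    """Tests present in the old ATE program revision but missing from the new one."""
    return sorted(set(old_program) - set(new_program))

def audit_test_program(old_program, new_program, never_remove):
    """Flag removed tests, and separately those on the never-remove list."""
    removed = removed_tests(old_program, new_program)
    critical = [name for name in removed if name in never_remove]
    return {"removed": removed, "critical_removed": critical}

# Hypothetical test names for two revisions of a test program.
rev_a = ["vref_trim", "iq_sleep", "osc_freq", "ldo_dropout"]
rev_b = ["vref_trim", "osc_freq"]
never_remove = {"iq_sleep"}  # parameters your system is known to be sensitive to

result = audit_test_program(rev_a, rev_b, never_remove)
print("Removed:", result["removed"])
print("Removed despite never-remove agreement:", result["critical_removed"])
```

Any entry in the second list is a change that, under the contract terms discussed earlier, should have come with a PCN.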

  • Trust, but verify. Custom system silicon, when done with the assistance of silicon experts, puts the system company in control of its own destiny. Hiring silicon experts full time may not be reasonable if you do not have enough work for them, which is why customsilicon.com offers a way to engage with chip suppliers on custom silicon programs and mitigate all the risks listed above without hiring that many experts full time. When purchasing catalog parts for your system, unless you perform due diligence similar to what is described above, you are trusting but not verifying that your components will be of good quality and unlikely to cause yield or other issues when you go to high-volume production.

Developing custom silicon can have huge benefits from an economic, engineering and market perspective for system companies, but it takes a structured and detailed approach to ensure proper take off and a successful landing. Don’t hesitate to contact us at info@customsilicon.com for any further questions, or help you may require.

Originally published at https://customsilicon.com.
