During a recent project designing a thermal solution for a customer’s PCB (printed circuit board) layout, Advanced Thermal Solutions, Inc. (ATS) Field Application Engineer Peter Konstatilakis also analyzed the thermal properties of a series of SFP (small form-factor pluggable) optical transceivers on the edge of the board.
From that project came the idea of examining the thermal challenges presented by SFP and QSFP (quad SFP) and designing a heat sink solution that future customers could use to solve potential issues that stem from the increased power requirements of the compact transceivers that are frequently used in the transmission of data.
After conducting an analytical study, running computer simulations, and testing the heat sinks in the state-of-the-art ATS labs, Peter demonstrated a new heat sink design and optimized layout sequence that showed a 30 percent improvement over QSFP heat sinks currently on the market.
In addition, he showed that having heat sinks with fewer fins upstream and heat sinks with more fins downstream provided a near isothermal relationship between the first and last QSFP, an important consideration for QSFP arrangements.
Peter recently sat down with ATS Vice-President of Marketing and eCommerce Rebecca O’Day and Marketing Communications Specialist Josh Perry to discuss the project, his research, and the successful design of the new QSFP cooling solution.
JP: What prompted the work on QSFP heat sinks? Why did we start looking into this technology?
PK: Optics are pretty big now with all the higher information rates, 400 gigabit cards, which is 400 gigabits per second of throughput and that’s a lot. They need these multiple high-powered SFP or QSFP to do that. So, higher power demands call for ATS expertise in thermal management.
RO: Optics are really expanding. It’s not just routers and things like that, but they’re also used in storage, array networks, video…so this kind of thing could really expand.
PK: Anywhere that you are transferring data, which is basically everywhere – the Cloud, big servers, the internet itself. They’re being used a ton.
JP: Was the impetus for designing QSFP heat sinks something that was prompted by a customer or did we think about the technology and recognize that it needed to be cooled?
PK: We had worked on SFP cooling for a customer first, so that helped us understand the area a bit more. Also, from what we were hearing from customers, QSFP that were being designed had higher throughput, which means higher power. And it is also good to have products that we can market, even if it isn’t for every customer, and show that we can handle the optical transceiver arena.
JP: What was the first step in designing the heat sinks? Did you know a lot about QSFP or did you have to do a lot of research?
PK: There is definitely a lot to think about. You can’t use a TIM (thermal interface material) because the QSFP isn’t fixed in the cage; it can be hot swapped. After a few insertions and removals, it will gunk up the TIM.
JP: Was that something you knew before?
PK: It was something I knew before, but there is also a specification document for this technology written by the SFF (Small Form Factor) Committee. It is a controlled standard that engineers design to for this form factor, and it states not to use a TIM. When we looked at it with the customer, it made sense, and when we asked the customer, they agreed.
RO: If there is no TIM, how does the interface work? Is it a direct interface? Is it flat enough?
PK: You have to specify a good enough flatness and surface roughness, within cost means, that will still have a low contact resistance. That was one of the challenges as well as understanding the airflow of typical QSFP arrangements because you have four in a row, so you’re going to have preheated air going into the fourth QSFP.
JP: When designing the heat sinks, what were the issues that you needed to consider?
PK: One consideration was getting as much surface area as we can, so that required extending the heat sink off the edge of the cage and we also had fins on the bottom of the heat sink. Usually, you only have fins above the cage but there was some room underneath, about 10 mm depending on what components are around, which provides additional surface area.
We also found that when you extend the surface the spreading resistance becomes an issue as well, so you need to increase the thickness of the base to help spread the heat to the outer extremities of the heat sink. You want the first QSFP and the last QSFP case temperatures to be isothermal due to laser performance (an electrical parameter), whereas each individual heat sink should be isothermal to get the most out of all the heat sink surface area (a heat transfer parameter).
‘Cold’ spots indicate a lack of heat transfer to that location and thus poor use of that surface area. Then it was about the airflow and having the front heat sinks be shorter with fewer fins and the back two be taller with denser fin arrays.
JP: Was the difference in fin arrays between upstream and downstream heat sinks how you optimized the design to account for the preheated air?
PK: What is really important is to keep each QSFP at the same temperature, within reason, because they all work together. So, if one is a higher temperature than another, the laser performance is going to be affected and it will affect the stack. You want to have them as isothermal as you can; the case temperature from the first QSFP to the last.
We figured when we were going through the design, you could have a shorter heat sink up front with fewer fins to help the airflow pass to the downstream QSFP. The upstream QSFP wouldn’t need as much cooling because they’re getting the fresher air and faster airflow. So, if you relax the front heat sinks and make the ones in the back more aggressive, then you’re going to get better cooling downstream.
What happens is the front heat sinks aren’t as effective. This is fine as long as the upstream QSFP case temperatures are lower than the downstream QSFP. The overall effect is that the upstream QSFP temperatures will be closer to the temperature of the downstream QSFP, keeping the stack as isothermal as possible.
This is where the limit lies: minimizing the upstream QSFP heat sinks, which in turn minimizes the amount of preheat to the downstream QSFP and allows as much airflow as possible to reach them, while at the same time ensuring the upstream QSFP temperatures are equal to or just lower than those of the downstream QSFP. This keeps the downstream QSFP temperatures at a minimum, while also keeping the transceiver stack close to isothermal.
JP: Were there any unexpected challenges that you had to account for?
PK: There was a challenge in testing and making sure that the thermocouples (which you can see in the picture below) contact the heat sink surface correctly and all of them at the same point. I had to glue it, so it may touch the case of the heat sink or it may not, depending on how the glue set, so I had to put a little thermal grease inside the pocket just to have the thermocouple make good contact with the heater block itself.
The metal piece (heater block) mimics the QSFP and we put a cartridge heater in the middle to heat it up and then we put a groove where the thermocouple is attached as I just explained.
Other than that…it was really just the flatness. It was hard to test and get reliable data between several heat sinks because there is going to be some flatness variation between them. Sometimes there isn’t enough to show a variation, but if I’m seeing different data with a different heat sink on the same heater block then the flatness and surface roughness are affecting it.
RO: On the flatness issue, in theory someone could spend a lot of money and make sure that it was completely flat but there’s a certain point where it has to be flat enough.
PK: Obviously there are diminishing returns after a certain point, so you have to find that line. There are no calculations that explain flatness and surface roughness, so at the end of the day it comes down to testing.
RO: I find it interesting that the testing was a challenge because it appears to us on the outside that this is a standard approach, but then you get into it and have to ask, how are we going to measure the temperature accurately?
PK: There is always something that comes up which you didn’t think about until you start doing the testing and you have to make a change and modify it to make it work. That is where experience comes in handy. The more testing you do, the more you’ve seen and you can take care of the problem before it arises.
RO: It’s a good example of what we can do at ATS. We don’t have to test with a full, expensive board or the full optical arrangement, instead we can come up with inexpensive (low startup cost) ways to test that will provide quick, accurate data to help the customer get to market.
JP: So, we tested three different arrangements for the heat sinks?
PK: Yeah. There were two different designs with changes in the density of the fins. Based on the CFD (computational fluid dynamics) and in the lab, the best outcome was having the less dense fins in front for the first two heat sinks and having the denser fin arrays downstream. As we expected, more airflow was able to make it to the back heat sinks and was able to cool them more effectively.
We were seeing less than a degree difference, especially at higher airflows, between the first heat sink and the last and that was pretty impressive. That configuration also provided the lowest temperature for the final two QSFP. Those are going to be the limiting factors; they’re going to be the highest temperature components no matter what since they’re receiving preheated air. That’s why it’s important to minimize the preheated air and maximize the airflow downstream by designing shorter, low fin-density heat sinks upstream.
If you put a dense heat sink up front, you’re going to restrict airflow downstream and you’re going to pull more heat out of the component because it is a better heat sink. With this you’re going to dump more heat into the air and send it to the downstream QSFP. So, it is worth keeping some heat in the upstream components, which has a double effect of keeping all of the QSFP temperatures as isothermal as possible. As long as the upstream components aren’t going over the case temperature of the last component, then you’re fine.
RO: It’s almost counter-intuitive. The general thermal design says to pull as much heat away from the component as quickly as possible and dissipate it, but you’re saying it was better to leave some of the heat in place.
PK: For the upstream QSFP, absolutely. There is margin because it is receiving so much fresh air.
That is really because we’re working in a system environment where choices upstream affect the airflow downstream. If it wasn’t a system and you’re looking at a single component, then sure you want to get rid of all the heat. And again, leaving heat in also allows the QSFP components to be as isothermal as possible.
JP: It sounds like it worked the way that you expected going in?
PK: Yeah it did. I’m not going to sit here and pretend it always happens that way but what we thought would happen did happen and we were able to design it analytically before we went into CFD and testing.
JP: Were there certain calculations that you use when working with a system?
PK: We can look at the fan curve. Each heat sink has its own pressure drop, so we analyze the four heat sinks and add the pressure drops together. Then we examine the fan curve, which shows how the amount of airflow varies with the pressure that the fan sees: the higher the pressure, the less airflow. So, we’re able to estimate the amount of airflow across the system based on the total pressure drop.
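The fan-curve estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the calculation ATS used: the fan curve is assumed to be linear, each heat sink's pressure drop is assumed to scale with the square of the flow, and every number (p_max, q_max, the loss coefficients) is hypothetical.

```python
def fan_pressure(flow, p_max=60.0, q_max=0.02):
    """Hypothetical linear fan curve: pressure (Pa) the fan can supply at a flow (m^3/s)."""
    return p_max * (1.0 - flow / q_max)


def system_pressure_drop(flow, loss_coeffs):
    """Total drop (Pa) of heat sinks in series, each assumed to follow k * flow**2."""
    return sum(k * flow ** 2 for k in loss_coeffs)


def operating_flow(loss_coeffs, p_max=60.0, q_max=0.02, tol=1e-9):
    """Bisect for the flow at which the fan curve meets the summed pressure drop."""
    lo, hi = 0.0, q_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fan_pressure(mid, p_max, q_max) > system_pressure_drop(mid, loss_coeffs):
            lo = mid  # fan still has pressure to spare, so flow can rise
        else:
            hi = mid  # heat sinks demand more pressure than the fan supplies
    return 0.5 * (lo + hi)


# Hypothetical loss coefficients in Pa/(m^3/s)^2, one per heat sink, front to back.
sparse_front = [4e4, 4e4, 9e4, 9e4]  # low fin density upstream, dense downstream
dense_all = [9e4, 9e4, 9e4, 9e4]     # dense fins on all four heat sinks

print("sparse-front flow (m^3/s):", operating_flow(sparse_front))
print("all-dense flow (m^3/s):", operating_flow(dense_all))
```

With these made-up coefficients, the arrangement with sparser fins upstream settles at a higher airflow than the all-dense arrangement, which is the trend described in the interview.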
We also use Q = ṁCpΔT, and that way we can determine, based on the amount of power coming from the component, the temperature of the air leaving the heat sink. It is a little conservative because we’re saying that all of the heat is going into the next heat sink, which isn’t true because a little is escaping to other locations, but being conservative doesn’t make a difference when comparing designs.
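As a rough illustration of that energy balance, the sketch below walks the air through a row of modules and applies ΔT = Q / (ṁ·Cp) at each one, using the same conservative assumption that all of each module's heat ends up in the air reaching the next heat sink. The air properties are approximate textbook values, and the ambient temperature, module powers, and airflow are hypothetical, not figures from the ATS tests.

```python
AIR_DENSITY = 1.16  # kg/m^3, approximate value for air near room temperature
AIR_CP = 1007.0     # J/(kg*K), approximate specific heat of air


def inlet_air_temperatures(ambient_c, powers_w, volume_flow_m3s):
    """Return the air temperature (deg C) entering each heat sink, front to back.

    Conservative assumption: every watt from a module heats the air that
    reaches the next heat sink, so each rise is power / (m_dot * cp).
    """
    m_dot = AIR_DENSITY * volume_flow_m3s  # mass flow rate, kg/s
    inlets = []
    air_temp = ambient_c
    for power in powers_w:
        inlets.append(air_temp)
        air_temp += power / (m_dot * AIR_CP)  # preheat passed downstream
    return inlets


# Hypothetical case: four 3.5 W modules, 40 deg C ambient, 0.002 m^3/s of air.
for position, temp in enumerate(inlet_air_temperatures(40.0, [3.5] * 4, 0.002), start=1):
    print(f"QSFP {position}: inlet air {temp:.1f} C")
```

Combining this with an airflow estimate from the fan curve gives the inlet temperature and airflow that each heat sink in the row has to be designed for.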
Analyzing the airflow into each heat sink and the temperature into each heat sink lets us know what we have to design for; just because you’re putting more surface area doesn’t mean you have a good solution.
RO: This is a good example of how thermal management is more than just removing the heat, but also analyzing how the heat travels and thinking about it as a system. It’s much more complicated.
JP: How important is it for ATS to be able to see potential thermal challenges in new technology, like this, and work through the problem even if it isn’t for a specific design or customer?
PK: It always helps to have more experience. It’s knowledge for the future. We’ve already seen it, we’ve already dealt with it, and we can save time and cost for the customer.
Whenever we run into this issue, we can say we tested that in the lab and explain the solution that we found. We don’t need to do more analysis; we can simply provide the customer with a solution.
For more information about Advanced Thermal Solutions, Inc. (ATS) thermal management consulting and design services, visit https://www.qats.com/consulting or contact ATS at 781.769.2800 or firstname.lastname@example.org.