Chinese semiconductor industry

Status
Not open for further replies.

latenlazy

Brigadier
Computational lithography and mask optimizations are needed regardless of the light source to get the best resolution for advanced process nodes, though they may be less of a necessity with FELs and SSMBs.
Yeah, but the more coherent your light source, the easier it is to apply optimization algorithms. Otherwise you're always bumping up against inherent stochastic constraints.
 

FairAndUnbiased

Brigadier
Registered Member
I completely disagree that product uptime does not take resources from R&D.

Most R&D concepts are proven on 100, 150 or at most 200 mm tools first. 300 mm proof of concept is ridiculously expensive. Prototype tools are made at those sizes too.

Product uptime problems on existing 300 mm wafer processing tools aren't handed to the R&D team; they are given to the product engineering team. It is a matter of optimizing.

The only time R&D would get involved is for a low-volume tool that was basically sold as an R&D tool.
 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
Product uptime improvement is not R&D side engineering. It’s production and service engineering. Very different part of any hardware company’s org chart.
Okay, so let's put it this way, since we are in a somewhat AI-related thread and I don't know how semiconductor tool companies develop things.

Let's use the example of Huawei's AI cluster, which Huawei promises will have 10x the uptime of the competition. Is it just product and service engineering that leads to the 10x uptime? Or did Huawei design the system from the ground up with higher requirements per component and enough redundancy and failsafes that it ends up being inherently far more reliable? No amount of tweaking and debugging on a less well-designed system can get to the same level of reliability.
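
A rough sketch of the arithmetic behind that argument, for illustration only. The availability figures are made-up assumptions (not Huawei's numbers), the function names are hypothetical, and failures are assumed to be independent, which real systems only approximate.

```python
from math import prod

def series_availability(avails):
    """System needs every component up: availabilities multiply."""
    return prod(avails)

def redundant_availability(a, n):
    """n independent identical units; the group is up if at least one is up."""
    return 1 - (1 - a) ** n

# Hypothetical system of 10 components in series.
baseline = series_availability([0.99] * 10)        # ordinary parts
better_parts = series_availability([0.999] * 10)   # tighter per-component requirement
with_spares = series_availability(
    [redundant_availability(0.99, 2)] * 10         # ordinary parts, each duplicated
)

for name, value in [("baseline", baseline),
                    ("better parts", better_parts),
                    ("with spares", with_spares)]:
    print(f"{name}: {value:.4f}")
```

Under these assumptions both the higher per-component requirement and the added redundancy raise system availability well above the baseline, which is the "designed in from the ground up" case described above.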

And going back to the original point: feel free to spend all your time focused on overlay. That's not something at the top of my radar.
 

latenlazy

Brigadier
Mechanical engineering doesn't work like that.


EDIT: For the sake of a more informative conversation, I should probably be a little more specific. The factors that shape uptime and downtime for a computing system are not the same factors that shape uptime and downtime for physical equipment. The nature of the failure modes, how to address them, and how system designs interact with them are not analogous.
 

FairAndUnbiased

Brigadier
Registered Member
Physical equipment reliability and software reliability have nothing to do with each other. For example, software doesn't deal with wear, creep, etc., but it has to deal with unintended use. Unintended use of mechanical equipment, by contrast, is simply mitigated by the high-tech concept of a locked enclosure.
 

latenlazy

Brigadier
I would extend that further: computer systems reliability has nothing to do with mechanical systems reliability. How you think about the specifics of reliability and robustness design for a server farm is *completely* different from how you would go about it for a robot arm, a vapor deposition tool, a lithographic scanner, etc.
 

latenlazy

Brigadier
Actually, a fairly good point of reference for thinking about this stuff is to compare server maintenance with car maintenance, except that for industrial production equipment, unlike with a car, you do have the option of swapping out the component equivalent of a whole engine block if you *really* need to.

In general, for any mechanical or electrical system (electrical system =/= computer system), the key points of failure you have to check against to maintain high reliability and uptime are the components most subject to wear and fatigue. In an electrical system that’s often some specific high-power component or board in the overall electrical design. In a mechanical system that’s often a handful of components subjected to the most strain or the most frequent use.

Insofar as you build reliability into mechanical equipment at the design phase, a big part of that is *serviceability* design, since the goal is to make sure that if you have to do maintenance, the maintenance is quick and easy. You take for granted in a mechanical system that you *will* have to service the equipment because mechanical wear is inevitable (yay entropy). Any component that you expect to be a primary point of strain or wear should already be designed for robustness in the preliminary design phase, through either specific material choices or the choice of operating mechanism. If you get that wrong or miss something, then unless you intend to rebuild an entirely new design from the ground up, you’re mostly locked into the overall system design: to preserve the product’s operational functions as designed, you have very little margin to change a component’s functional inputs or outputs relative to the other components it interacts with. Your options are a material swap, a parts refinement, or a redesign of component mechanisms that retains the same overall functional relationship with the rest of the system. For electrical systems you can actually cheat a little: push out a preliminary electrical design, and then just revise and swap out boards and board components, since electrical systems are inherently far more modular in their function and work via components that have very straightforward and simple interaction mechanisms with one another.
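
To illustrate why serviceability counts as reliability design, here is a minimal sketch using the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The hour figures are made-up assumptions, not data for any real tool, and the function name is hypothetical.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures
    and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical tool: fails every 500 h on average, takes 8 h to repair.
base = availability(500.0, 8.0)

# Option A: easier servicing halves the repair time.
faster_service = availability(500.0, 4.0)

# Option B: a tougher part doubles the time between failures.
tougher_part = availability(1000.0, 8.0)

print(f"base:           {base:.4f}")
print(f"faster service: {faster_service:.4f}")
print(f"tougher part:   {tougher_part:.4f}")
```

In this simple model, halving the repair time buys the same availability as doubling the time between failures, which is one way to see why quick and easy maintenance gets so much attention at the design phase.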

These are *all* part of product and service engineering, not R&D, unless you decide to do a do-over of the *whole* system, in which case you are essentially launching a whole new product. You had better pray that is not how you have to use your R&D resources, because that all but says your product has failed and you have to build an entirely different product to deliver the original value proposition.
 

tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
You do realize there are plenty of mechanical parts in an AI cluster that could fail, right?
Again, I'm not talking about subbing out parts. I'm talking about designing a system with higher reliability across the board.

More importantly, we are not getting anywhere with this. So, I'm going to stop this conversation here.
 

latenlazy

Brigadier
Btw @tphuang, I do not appreciate you deleting my last message. You can give me a strike or a ban for talking back on this one, but insofar as the point was raised about how we should think about R&D resources and how systems-level reliability engineering works and is managed in hardware outfits, I think it’s rather misleading to let stand the impression that all systems-level reliability design should be thought of and observed the same way. It is my professional opinion (given that the topic we’re discussing happens to be the actual engineering domain I work in) that you are actively encouraging a misunderstanding of how this stuff works in engineering disciplines other than the ones you’re familiar with, and this runs counter to the supposed purpose of this thread, which is to promote a more educated understanding of these topical domains.
 