Tag: microprocessors

Related posts
  • Popularity: 17
    2013-3-22 21:57
    2244 views
    0 comments
    In 1974, Robert Dennard proposed a scaling theory that drew on Moore's Law to promise ever-faster microprocessors. If from one generation to the next the transistor length shrinks by a factor of about 0.7, the transistor budget doubles, speed goes up by 40%, total chip power remains the same, and a legion of other good things continues to be bestowed on the semiconductor industry (a back-of-the-envelope check of those numbers appears at the end of this post). Unfortunately, Dennard scaling petered out at 90 nm. Clock rates stagnated, and power budgets have grown at each process node. Many traditional tricks just don't work any more. For instance, shrinking transistors meant thinner gate oxides, but once those hit 1.2 nm (about the size of five adjacent silicon atoms), tunnelling created unacceptable levels of leakage. Semiconductor engineers replaced the silicon-dioxide insulator (dielectric constant 3.9) with materials like hafnium dioxide (dielectric constant 25) to allow for somewhat thicker insulation. Voltages had to come down as well, but they are limited by subthreshold leakage, which grows as the transistors' threshold voltage inevitably declines. More leakage means greater power dissipation. A lot of innovative work is being done, like the use of 3D FinFETs, but the manna of Moore's Law has, to a large extent, dried up.
    Like the cavalry in a bad western, multi-core came riding to the rescue, and it's hard to go a day without seeing some new many-core CPU introduction. Most sport symmetric multi-processing (SMP) architectures, where two or more cores share some cache plus the main memory. Some problems can really profit from SMP, but many can't. Amdahl's Law tells us that even with an infinite number of cores, an application that is 50% parallelizable will get only a 2x speed-up over a single-core design (a worked example appears at the end of this post). But that law is optimistic: it doesn't account for the inevitable bus conflicts that occur when sharing L2 and main memory, and interprocessor communication, locks, and the like make things even worse. Data from Sandia National Labs shows that, even for some very parallel problems, multi-core just doesn't scale once more than a handful of processors are involved.
    In "Power Challenges May End the Multicore Era" (Communications of the ACM, February 2013, subscription required), the authors develop rather complex models that show multi-core may (and the operative word is "may") bang into a dead end due to power constraints. Soon. The key takeaways are that by the 8-nm node (expected around 2018) more than 50% of the transistors on a microprocessor die will have to be dark, or turned off, at any one time just to keep the parts from self-destructing from overheating. The most optimistic scenarios show only a 7.9x speed-up between the 45-nm and 8-nm nodes; a more conservative estimate pegs that at 3.7x. The latter is some 28 times less than the gains Moore's Law has led us to expect.
    I have some problems with the paper:
    - The authors assume an Intel/AMD-like CPU architecture; that is, huge, honking processors whose entire zeitgeist is performance. We in the embedded space are already power-constrained and generally use simpler CPUs. It's reasonable to assume a mid-level ARM part will run into the same issues, but perhaps not at 8 nm.
    - They don't discuss memory contention, locks, and interprocessor communication. That's probably logical, as their thesis is predicated on power constraints, but these issues will make the results even worse in real-world applications. The equations presented assume no bus contention for the shared L2 (and L2 is always shared on multi-core CPUs) and none for main memory accesses. Given that L1 is tiny (32-64 KB), one would expect plenty of L1 misses and thus lots of L2 activity... and therefore plenty of contention.
    - The models analyse applications in which 75% to 99% of the work can be done in parallel. Plenty of embedded systems won't come near 75%.
    - It appears the analysis assumes cache wait states are constant: three cycles for L1 and 20 for L2. Historically that has not been the case: the 486 had zero-wait-state cache. It's hard to predict how future caches will behave, but if past trends continue, the paper's conclusions will be even worse.
    - The paper figures on a linear relationship between frequency and performance, and the authors acknowledge that memory speeds don't support this assumption.
    The last point is insanely hard to analyse. Miss rates for L1 and L2 are extremely dependent on the application, and SDRAM is very slow for the first access to a block, though succeeding transfers happen very quickly indeed. Any given transaction could take anywhere from three cycles (an L1 hit) to hundreds (see the sketch at the end of this post). One wonders how much tolerance a typical hard real-time system would have for such uncertainty.
    Two conclusions are presented. The pessimistic one is the Chicken Little scenario, in which we hit a computational brick wall. Happily, the paper also addresses a number of more optimistic possibilities, ranging from microarchitecture improvements to unpredictable disruptive technologies. The latter have driven semiconductor technology for decades, and I for one am optimistic that some cool and unexpected inventions will continue to drive computer performance on its historical upward trajectory.
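    As promised above, here is a back-of-the-envelope check of the classic Dennard scaling numbers. This is only a sketch of the textbook idealization summarized at the top of the post (constant-field scaling with a 0.7x linear shrink); it is not data from the ACM paper.

```python
# Idealized (constant-field) Dennard scaling across one process generation,
# with every linear dimension shrunk by k = 0.7.
k = 0.7
transistors = 1 / k**2               # ~2.0x devices in the same die area
frequency   = 1 / k                  # ~1.4x clock, since gate delay scales with k
# Dynamic power per device: P = C * V^2 * f, with C ~ k, V ~ k, f ~ 1/k
power_per_device = k * k**2 * (1 / k)         # ~0.49x
total_power = power_per_device * transistors  # ~1.0x: total chip power stays flat
print(round(transistors, 2), round(frequency, 2), round(total_power, 2))
# -> 2.04 1.43 1.0
```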
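    The 2x ceiling for a 50%-parallelizable application follows directly from Amdahl's Law. Here is a minimal worked example; the parallel fraction and core counts are illustrative, not taken from the paper.

```python
def amdahl_speedup(p, n):
    """Amdahl's Law: speed-up of a workload whose parallel fraction is p, run on n cores."""
    return 1.0 / ((1.0 - p) + p / n)

# A 50%-parallel application creeps toward, but never exceeds, 1/(1-p) = 2x:
for cores in (2, 4, 16, 1024):
    print(cores, round(amdahl_speedup(0.5, cores), 2))
# -> 2 1.33, 4 1.6, 16 1.88, 1024 2.0
```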
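    And this is the sort of memory-latency spread the last paragraph worries about, expressed as an average memory access time (AMAT) for a two-level cache. The 3-cycle L1 and 20-cycle L2 latencies are the ones the paper assumes; the miss rates and the DRAM penalty below are purely illustrative assumptions of mine.

```python
def amat(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, dram_penalty):
    """Average memory access time (in cycles) for an L1/L2 hierarchy backed by DRAM."""
    return l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * dram_penalty)

# Cache-friendly workload: about 6 cycles on average...
print(amat(l1_hit=3, l1_miss_rate=0.05, l2_hit=20, l2_miss_rate=0.20, dram_penalty=200))
# ...cache-hostile workload: about 27 cycles on average,
print(amat(l1_hit=3, l1_miss_rate=0.20, l2_hit=20, l2_miss_rate=0.50, dram_penalty=200))
# while any single access can range from 3 cycles (L1 hit) to 3 + 20 + 200 = 223 cycles.
```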
  • Popularity: 17
    2011-11-3 17:56
    1981 views
    2 comments
    When I was a young lad in England in the early 1970s, I used to read a monthly electronics hobbyist magazine called Practical Electronics. When I say "read" I really mean "devour!" When the time was coming close for a new issue to hit the stands, I would visit the newsagent every day after school, pleading "Has it arrived yet?" As soon as the magazine did arrive I read it cover to cover, and then I jumped on a bus to visit my local electronics store to purchase the components required to build one of that month's projects (my favorite series was called "Take Twenty", in which each project had fewer than 20 components and cost less than 20 shillings). Many of the projects in those days employed 7400-series TTL integrated circuits. This family of components contained hundreds of devices that provided everything from basic logic gates and flip-flops to more sophisticated elements like counters, decoders, and even simple Arithmetic Logic Units (ALUs). This was, to a large extent, how I learned the fundamentals of digital electronics. Also, because this is what I was learning with, it made perfect sense to me that you could have a silicon chip containing four 2-input NAND gates (the 7400) or four 2-input NOR gates (the 7402).
    Time passed (as is its wont) and I met older engineers who had grown up creating digital logic circuits using discrete transistors, resistors, and capacitors. It amazed me that many of these folks simply couldn't wrap their brains around the use of digital integrated circuits. (And don't even get me started talking about the analog-digital divide.) Over the years I've come to see this many times. I've met guys who grew up working with vacuum tubes who couldn't make the transition to semiconductors – they found the high-tension supplies associated with the tubes easier to understand than the low-voltage power supplies used by transistors. I've also met folks who understood the use of basic digital chips but who couldn't get to grips with the concept of simple 8-bit microprocessors. And there are folks who were experts with 8-bit microprocessors and assembly language who find themselves overwhelmed by 32-bit and 64-bit processors and high-level programming languages. I must admit that I've started to wonder if this will one day happen to me – is there some new technology on the horizon that will leave me baffled and bewildered...
  • Popularity: 24
    2011-7-21 23:33
    1877 views
    0 comments
    About 14 years ago, I wrote a piece about a toaster that could be connected to the Internet. That article was entirely a joke, a riff on the 'net meme that was starting to pervade the embedded world. At the time, readers responded that such products were already on the market, though it's hard to see how such a connection would improve one's breakfast experience. NXP recently announced their "GreenChip" smart lighting product, which gives each light bulb an IP address. A wireless network then manages the lighting to garner more efficiencies. Now we can be even lazier and never turn off a light: sensors can determine if there's motion in the room and take care of this routine activity. Dad won't have to yell at the kids to turn off the damn light!
    It has been about 130 years since Edison and Swan built the first practical electric light bulbs. Indeed, the first commercial establishment illuminated by them was London's Savoy Theatre, home to Gilbert and Sullivan, in 1881. Since then uncounted billions have been produced, perhaps making bulbs of various types the most common electrical device in existence. But how common? Curious, I inventoried our house, which, at 2400 square feet, is small compared to the US average of 2700 (according to http://www.infoplease.com). I also toted up the number of PCs and products that certainly have one or more embedded microprocessors, such as remote controls, TVs, phones of various sorts, appliances, and the like.
    We have four PCs, one being a Mac and the others Windows machines. (Note that none of the kids live at home anymore, so their machines don't count.) About 41 other products have embedded systems built in, though I suspect that count is a bit low. Though my woodshop abounds with electrical equipment, only one drill and the radio are obviously smart. The tablesaw is deadly dim, with only a universal motor, but microprocessors have made some saws so smart they pretty much can't injure the operator. To see a SawStop in action is impressive: the demonstrator pushes a hotdog at full speed into the blade, which falls under the table and is stopped by a chunk of aluminum in 5 msec. The hotdog gets a minor nick. I didn't check the barn, though the John Deere certainly has one or more processors, and the sawmill's digital hour meter must, too. We have a lot of two-cycle-powered equipment out there, but I suspect none of it has micros.
    But we have 118 light bulbs of various types, not including spares. I would have guessed 50. Summing, that's 163 bulbs and computer-enabled devices. If each had an IP address, scaling to the 130 million houses in the U.S. (according to the U.S. Census Bureau) gives 21 billion devices that could sport an IP address in this country alone. IPv4 is dead. Long live IPv6.
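    To put that 21-billion figure in context, here is a quick back-of-the-envelope check. The household counts are the ones from this post; the comparison against the size of the IPv4 address space is a sketch I've added.

```python
# Rough check of the post's arithmetic: devices per house, scaled to all US
# houses, compared against the size of the IPv4 address space.
devices_per_house = 118 + 41 + 4         # light bulbs + embedded gadgets + PCs = 163
us_houses = 130_000_000                  # US Census figure cited in the post
total_devices = devices_per_house * us_houses
print(f"Addressable devices: {total_devices:,}")   # 21,190,000,000 -> ~21 billion
print(f"IPv4 address space:  {2**32:,}")           # 4,294,967,296 -> far too small
```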