Multicore and speed limits

2012-2-21 16:59 Category: Consumer Electronics

In Finksburg, Maryland, Route 140 is heavily patrolled. Breaking the speed limit would likely lead to a fine.


But some speed limits can't be exceeded, no matter how much one wishes to. The speed of light comes to mind. As does the speed at which a teenager's brain matures.


Then there are multiprocessor limits. Amdahl's Law tells us that the max speedup achievable is:

 

$$\text{speedup} = \frac{1}{f + \dfrac{1 - f}{n}}$$


where f is the fraction of a problem that cannot be parallelized, and n is the number of processors. In a system where, say, only 50% of the problem can be executed in parallel, even an infinite number of CPUs can do no better than halve the execution time, because the serial half still takes just as long.
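To put numbers on that, here is a minimal Python sketch of the formula, using f as the non-parallelizable fraction and n as the processor count, as defined above. The function names `amdahl_speedup` and `amdahl_limit` are made up for this illustration.

```python
import math


def amdahl_speedup(f, n):
    """Maximum speedup per Amdahl's Law.

    f: fraction of the problem that cannot be parallelized (0..1)
    n: number of processors (>= 1)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / (f + (1.0 - f) / n)


def amdahl_limit(f):
    """Speedup ceiling as n grows without bound: 1/f."""
    return math.inf if f == 0 else 1.0 / f


if __name__ == "__main__":
    f = 0.5  # the 50% case from the text
    for n in (1, 2, 4, 8, 64, 1024):
        print(f"{n:5d} cores: {amdahl_speedup(f, n):.3f}x")
    print(f"ceiling:     {amdahl_limit(f):.3f}x")
```

With f = 0.5 the speedup creeps toward 2 but never reaches it, which is the "can only halve the execution time" limit.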


Gustafson's Law suggests that Amdahl is too conservative, and notes that sometimes the parallel portion of a problem scales faster than the sequential portion. Google's PageRank algorithm is one example. I suspect that in most embedded systems, though, Gustafson won't apply.
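For comparison, here is a small sketch of Gustafson's scaled speedup in its usual textbook form, S = f + (1 - f)·n. Note the assumption: here f is the serial fraction of the run time measured on the parallel system, and the problem size is assumed to grow with the number of processors. The function name is invented for this example.

```python
def gustafson_speedup(f, n):
    """Scaled speedup per Gustafson's Law.

    f: serial fraction of run time on the n-processor system (0..1)
    n: number of processors (>= 1)
    Assumes the parallel part of the workload grows with n.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return f + (1.0 - f) * n


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 64):
        print(f"{n:3d} cores: {gustafson_speedup(0.5, n):.1f}x")
```

Unlike Amdahl's fixed-size case, this keeps growing with n, but only if the workload really does grow with the core count. That is the condition I doubt holds in most embedded systems.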


However, I believe Amdahl and Gustafson are optimistic in many cases, especially when working with symmetric multicore processors. These have two or more identical cores, each with its own L1 cache. They share L2 and a common memory bus. Executing out of L1 they will scream. But that cache is tiny – often only 32 KB. Go to L2 – or worse, main memory – and the brake lights come on. Dozens of wait states can slow processing, and bus contention will occur if more than one CPU needs memory at the same time. This effect is pretty hard to model since it is both non-deterministic and very problem-specific.


But Sandia National Labs researchers have come up with some interesting data showing that even on traditional parallel problems multicore's advantages diminish very quickly. Going from two to four cores nets some serious execution-time reduction. Double down, to eight cores, and there's no gain. Each additional doubling slows the system down, by a lot. A 64-core solution is slower by half an order of magnitude than one with just four.


Multicore, as pushed by the major semiconductor vendors, can in some cases offer significant advantages in both speed and power. But I think the benefits are being oversold. Memory bandwidth is a hugely limiting factor. Alternatives such as asymmetric multiprocessing are often a better solution, depending, of course, on the nature of the problem being addressed.


A new processor technology from Venray Technology is an interesting twist on the memory bandwidth problem. Instead of adding DRAM to a CPU, they add CPUs to DRAM. Small (20k transistors) processors are tightly integrated with memory.


A typical arrangement marries 4 of these cores with 64 MB of DRAM. That puts the CPU transistor count at roughly 0.01% of the memory's. Venray's web site is long on marketing-speak and short on tech details, but the idea is compelling.
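The ratio can be checked with back-of-the-envelope arithmetic. The sketch below assumes one transistor per DRAM bit (a conventional one-transistor cell) and ignores all peripheral circuitry. Those assumptions are mine, not Venray's published figures.

```python
CORES = 4
TRANSISTORS_PER_CORE = 20_000    # "20k transistors" per core, from the text
DRAM_BYTES = 64 * 1024 * 1024    # 64 MB
TRANSISTORS_PER_BIT = 1          # assumption: one-transistor DRAM cell

cpu_transistors = CORES * TRANSISTORS_PER_CORE
dram_transistors = DRAM_BYTES * 8 * TRANSISTORS_PER_BIT
ratio_percent = 100.0 * cpu_transistors / dram_transistors

print(f"CPU transistors:  {cpu_transistors:,}")
print(f"DRAM transistors: {dram_transistors:,}")
print(f"CPU share:        {ratio_percent:.3f}%")
```

Under those assumptions the CPUs come to about 0.015% of the memory's transistor count, the same order as the figure quoted above.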


 
