Tag: CoreMark

Related blog posts
  • Popularity 13
    2015-3-27 14:42
    1430 reads
    0 comments
    Many know about EEMBC's CoreMark, a suite of programs used to compare processor horsepower. Traditional benchmarks don't do well in evaluating processors running embedded workloads.

    EEMBC has recently added the ULPBench, a benchmark that exercises processors to determine how much energy they use. It consists of a number of mathematical and sorting operations, after which the processor goes to sleep. Energy use is monitored over a number of these iterations. The source code is available on the site. At this writing seven MCUs have been scored, with the results published there.

    What do the scores mean? A non-charitable person might say "nothing." Each MCU has so many low-power modes that it's awfully hard to make comparisons. Some might preserve RAM, others allow for quick response to interrupts, and all use different amounts of energy. Further, the benchmark is likely not representative of your application. However, it does provide a somewhat objective way to compare MCUs.

    EEMBC also sells a tool to take the energy measurements. Called the Energy Monitor, this $75 device is just a little PCB that connects (via USB) to a Windows application.

    This device is one of the very few true energy monitors I'm aware of. Others generally call themselves "power monitors," which is a misnomer, as they typically put a sense resistor in series with the power supply to measure current, not power. EEMBC's Energy Monitor is very different. The device under test is powered from a small capacitor that is charged as needed. A supervisory microprocessor controls the charging and monitors how much was required. The result: true energy usage expressed in microjoules. Where a current monitor might show one CPU uses 20 mA awake and another 1 mA, a measurement in joules may show that the high-amperage part is awake for much less time than the other contestant, so it may in fact use less energy from the battery.

    They were kind enough to send me one, and I gave it a run on a Renesas RL78/G14 evaluation board. Installation of the Windows app was hindered by the unsigned USB drivers. The Windows 8 drill sergeant is sometimes really annoying. However, this site shows how to get around the problem.

    The Energy Monitor is the board on the right; the target is on the left. Note the LED on the target. The benchmark code turns on an LED when the target is not asleep. My first run, with the LED installed, yielded this:

    The test repeated 12 times, giving the staircase shape of the graph. Energy is being measured, and energy is power integrated over time, so each run adds to the accumulated total.

    With the LED removed, the numbers are unsurprisingly smaller. Note that the graph nicely autoscales:

    The claimed accuracy is 1% for currents under 1 mA and 2% above that, to a max of 28 mA. On the low scale it will resolve to 50 nA, more than enough for monitoring the behavior of ultra-low-power systems that must operate for years off a small battery.

    Traditional benchmarks may not give useful performance metrics for any particular application, but at least they are repeatable. One wonders about energy measurements. I've complained in this space about so many vendors not giving worst-case current consumption numbers, and recently lauded Freescale for publishing detailed metrics. For their Kinetis parts they cite the max as the mean of many parts plus 3 sigma. With neither the mean nor the standard deviation listed, though, testing a single MCU may not tell us much about max energy consumption.
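To make the three-sigma arithmetic concrete, here's a minimal sketch of how such a max spec could be derived from a batch of measurements. Every value and name in it is invented for illustration; only the mean-plus-3-sigma formula comes from the text:

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical per-part energy scores (microjoules per benchmark
   iteration) measured across a small batch of MCUs. */
static const double uj[] = { 2.10, 2.14, 2.08, 2.21, 2.12, 2.17, 2.09, 2.15 };
#define N (sizeof uj / sizeof uj[0])

int main(void)
{
    double sum = 0.0, sumsq = 0.0;

    for (size_t i = 0; i < N; i++) {
        sum   += uj[i];
        sumsq += uj[i] * uj[i];
    }

    double mean  = sum / N;
    double sigma = sqrt(sumsq / N - mean * mean);  /* population std dev */

    /* A "max" spec in the Kinetis style: mean plus three sigma. */
    printf("mean = %.3f uJ, sigma = %.3f uJ, max (mean + 3 sigma) = %.3f uJ\n",
           mean, sigma, mean + 3.0 * sigma);
    return 0;
}
```

The spread across even a small batch hints at how far any single sample can sit from a published maximum.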
Full characterization may require testing many parts. And if your device runs over a wide temperature range, better figure on running those sorts of tests, too.

As I mentioned, the ULPBench code may or may not mirror your application's power profile. What's nice, though, is that this really inexpensive tool takes great data, and the benchmark code is available. Tune it to meet your needs and then you can evaluate MCUs from different vendors.
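A footnote on the Energy Monitor's measurement scheme described above: powering the target from a capacitor that's topped up between two voltage thresholds reduces the energy bookkeeping to simple arithmetic. This is a sketch of the principle only; the capacitance, thresholds, and charge count are invented, not EEMBC's actual design:

```c
#include <stdio.h>

/* Sketch of capacitor-based energy accounting: the target runs off a
   capacitor that is recharged between two voltage thresholds, and each
   top-up delivers a known quantum of energy. All values hypothetical. */
int main(void)
{
    const double C      = 1.0e-6;   /* supply capacitor, farads    */
    const double v_high = 3.30;     /* recharge stops here, volts  */
    const double v_low  = 3.20;     /* recharge starts here, volts */
    const unsigned long n_charges = 5000;  /* top-ups counted during the run */

    /* Energy drawn from the cap per top-up: E = (C/2) * (Vh^2 - Vl^2). */
    const double e_per_charge = 0.5 * C * (v_high * v_high - v_low * v_low);

    printf("%.3f uJ per charge, %.1f uJ total\n",
           e_per_charge * 1e6, e_per_charge * 1e6 * n_charges);
    return 0;
}
```

That's also why the 20 mA versus 1 mA comparison earlier can invert once time enters the picture: the tally counts delivered energy, not instantaneous current.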
  • Popularity 20
    2013-1-30 09:57
    2098 reads
    1 comment
    Here's a rather meaningless question: how fast is your CPU? The amount of work a processor can get done in a period of time depends on many factors, including the compiler (and its optimisation level), wait states, background activity such as direct memory access that can steal cycles, and much more. Yet plenty of folks have tried to establish benchmarks to make some level of comparison possible. Principal among these is Dhrystone.

    But Dhrystone has problems. Compiler writers target the benchmark with optimisations that may not help developers much, but give better scores. Much of the execution time is spent in libraries, which can vary wildly between compilers. And neither the source code nor the reporting methods are standardised.

    A few years ago the EEMBC people addressed these and other issues with their CoreMark benchmark, which is targeted at evaluating just the processor core. It's small—about 16k of code, with little I/O. All of the computations are made at run time so the compiler can't cleverly solve parts of the problem. CoreMark is focused primarily on integer operations—the control problems addressed by embedded systems. The four bits of workload tested are matrix manipulation, linked lists, state machines, and CRCs. The output of each stage is input to the next to thwart over-eager compiler writers.

    One rule is that each benchmark result must include the name and version of the compiler used, as well as the compiler flags. Full disclosure, no hiding behind games. The result has been good news for us. Some of the compiler vendors have taken on CoreMark as the new battleground, publishing their scores and improving their tools to ace the competition. IAR and Green Hills are examples.

    Jack's addiction

    Scores ( www.coremark.org/benchmark/index.php?pg=benchmark ) are expressed as raw CoreMark, CoreMark/MHz (more interesting to me), and CoreMark/core (for multi-core devices). There are two types of results—those submitted by vendors, and those certified by EEMBC's staff (for a charge).

    Results range from 0.03 CoreMark/MHz for a PIC18F97J60 to 168 for a Tilera TILEPro64 running 64 threads. The single-threaded max is 5.1 for a Fujitsu SPARC64 V(8). But away from speed demons like Pentium-class or SPARC machines, the highest score is for Atmel's SAM4S16CAU—a Cortex M4 device—which notches in at 3.38 CoreMark/MHz. That beats out a lot of high-end devices. Clock rates do matter, though: while the Intel Core i5 gets a score of 5.09 CoreMark/MHz, its raw result, at 2,500 MHz, is 12,715, or 6,458 CoreMark/core. That thrashes the Atmel device, which was tested at 21 MHz, where it netted 71 CoreMark.

    There are some caveats. Some processors can load the entire test into cache; for those, it makes sense to use some of EEMBC's more comprehensive benchmarks. Wait states are a problem, so tests report where the code runs: if it's from flash it'll generally be slower than from RAM. And the nearly shocking news that the Core i5 scores less than twice the per-MHz figure of a Cortex M4 neglects nifty features like floating point (the i5 has an insanely fast FPU, which the benchmark's integer tests ignore).

    Some companies couple CoreMark with EEMBC's EnergyBench to compute performance per mA, a number of increasing importance. Best of all, the code is freely available at www.coremark.org .

    I've turned into a crack-head, and my drug of choice is the CoreMark scores. It's fascinating to compare various processors and compilers. The results can be pretty surprising.
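The stage-chaining trick mentioned above (each workload's output feeding the next, with nothing knowable at compile time) is easy to picture in code. Here's a hypothetical sketch of the structure, not CoreMark's actual source; the workload functions are stand-ins I invented:

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch of CoreMark-style stage chaining: each workload's result is
   passed to the next stage and folded into a running CRC, so a compiler
   can't precompute or discard any stage. Not the actual CoreMark source;
   these functions are invented stand-ins for illustration. */

static uint16_t crc16_update(uint16_t crc, uint16_t data)
{
    crc ^= data;
    for (int i = 0; i < 16; i++)
        crc = (crc & 1) ? (crc >> 1) ^ 0xA001 : crc >> 1;
    return crc;
}

/* Stand-ins for the matrix, linked-list, and state-machine workloads. */
static uint16_t run_matrix(uint16_t seed) { return (uint16_t)(seed * 3 + 1); }
static uint16_t run_list(uint16_t seed)   { return seed ^ 0x5A5A; }
static uint16_t run_state(uint16_t seed)  { return (uint16_t)((seed << 1) | (seed >> 15)); }

int main(void)
{
    uint16_t crc  = 0;
    uint16_t seed = 0x1234;  /* the real benchmark derives this at run time
                                so nothing can be constant-folded */

    for (int iter = 0; iter < 1000; iter++) {
        seed = run_matrix(seed);   /* output of each stage...       */
        seed = run_list(seed);     /* ...is input to the next...    */
        seed = run_state(seed);    /* ...then folded into the CRC   */
        crc  = crc16_update(crc, seed);
    }
    printf("final crc: 0x%04X\n", crc);  /* the real benchmark validates
                                            this against a known value */
    return 0;
}
```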
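And the normalised scores quoted above are just the raw figure divided by the test clock. A quick sketch reproducing the numbers cited in this post (the scores and clock rates are from the text; the code is mine):

```c
#include <stdio.h>

/* CoreMark/MHz is simply the raw score divided by the clock used
   during the test; figures below are the ones quoted in the post. */
struct result {
    const char *part;
    double raw_coremark;   /* raw CoreMark score    */
    double clock_mhz;      /* clock during the test */
};

int main(void)
{
    const struct result r[] = {
        { "Intel Core i5",    12715.0, 2500.0 },
        { "Atmel SAM4S16CAU",    71.0,   21.0 },
    };

    for (int i = 0; i < 2; i++)
        printf("%-18s %8.0f raw   %.2f CoreMark/MHz\n",
               r[i].part, r[i].raw_coremark,
               r[i].raw_coremark / r[i].clock_mhz);
    return 0;
}
```

That's the appeal of the per-MHz figure: it strips clock rate out of the comparison, so a 21 MHz Cortex M4 can be ranked against a 2.5 GHz desktop part.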
Thanks to Markus Levy of EEMBC for answers to my questions.