Tag: dynamic allocation

Related blog posts
  • Popularity 12
    2011-3-18 12:29
    1950 views
    0 comments
    In my previous column, "Poor reasons for rejecting C++", I sought to dispel some misconceptions about C++ [1]. Among the many reader comments were some valid concerns that merit further discussion. I'll address one of those concerns in my upcoming columns.

    Some readers took exception to my statement that, "I know of no place where the C++ language performs dynamic allocation or recursion behind the scenes." As I explained a few months ago, I think of the C and C++ programming languages as distinct from their accompanying standard libraries [2]. Although the various components in the Standard C++ library perform dynamic allocation behind the scenes, the language itself does not.

    Nonetheless, you may be legitimately concerned that your C++ code will invoke a function that uses dynamic allocation against your wishes. In that case, you can choose among various techniques that will trigger a compile or link error to alert you that your code is using dynamic allocation.

    Probably the simplest such technique is to replace the global operator new with a version that causes a link error. As I explained in an earlier column, a C++ new-expression allocates memory by calling a function named operator new [3]. Each C++ environment provides a default implementation for a global operator new, declared as:

        void *operator new(std::size_t n) throw (std::bad_alloc);

    However, the C++ Standard lets you define a function with the same name and signature (parameter types) to replace the default implementation [4].

    To prevent using dynamic allocation, simply define a replacement version of operator new as follows:

        // declare this function, but don't define it
        void *operator_new_blocker();

        void *operator new(std::size_t) throw (std::bad_alloc)
        {
            return operator_new_blocker();
        }

    Hence, a new-expression as in:

        int *p = new int;

    will compile, as always, to call operator new.
    When the linker drags the definition for operator new into the executable, it will try to drag in operator_new_blocker as well. However, the link will fail because operator_new_blocker isn't defined. Any call to operator new will provoke a link error, whether the call occurs directly in your code or indirectly in library code that your code uses.

    Compile errors are generally preferable to link errors because compile errors tend to be more accurate in pinpointing the location and describing the nature of translation errors. However, you can't use compile-time techniques to intercept calls to operator new in previously-compiled components that you're linking into your code. You have to rely on the linker.

    You must place the replacement definition for operator new so that the linker will bind the definition into the executable program if and only if at least one call to operator new occurs somewhere in the program. That is, you must ensure that the linker won't incorporate the replacement definition unless the program actually calls operator new. If you inadvertently link operator new into your program, the linker will hunt for a definition for operator_new_blocker that it won't find, and you'll get spurious link errors.

    Where you need to place your definition for operator new depends on your development tools. Many modern compilers and linkers employ some form of smart linking that will link into the executable only those external functions and data that the program actually uses. For example, if a single source file contains external definitions for functions f, g, and h, yet the program calls only f and h, a smart linker will link f and h into the program, but omit g. If you're using a compiler with a smart linker, you can place the definition for operator new in any source module that's compiled and linked as part of the build. A program that never calls operator new should still link without complaint.

    With compilers and linkers that don't use smart linking, you may have to place operator new into its own separately-compiled source file, and possibly place the resulting object module into a separate library. You must also configure the build so that the linker looks in that separate library before it looks in the standard library.

    In truth, defining only one version of operator new won't cover all possible cases. For example, an array-new-expression as in:

        char *p = new char[n];

    allocates memory by calling a function named operator new[], declared as:

        void *operator new[](std::size_t n) throw (std::bad_alloc);

    The Standard C++ library also provides default implementations for "nothrow" versions of operator new and operator new[], declared as:

        void *operator new(std::size_t n, std::nothrow_t const &) throw ();
        void *operator new[](std::size_t n, std::nothrow_t const &) throw ();

    By default, these functions return a null pointer, rather than throw an exception, if the allocation fails. According to the C++ Standard, these four variants (operator new and operator new[], each in a throwing and a nothrow form) are the only replaceable memory allocation functions. If you really want to prevent your C++ programs from using dynamic memory allocation, you should replace all four variants of operator new with versions that cause link errors.

    Four variants of operator delete (one corresponding to each replaceable variant of operator new) are also replaceable. You don't need to replace these to block unintended use of dynamic memory management, but it's probably still a good idea to replace them.

    If you want to ensure that your program doesn't use any of the Standard C allocation functions either, you should write corresponding replacements for malloc, calloc, realloc, and free so that calling any one of them triggers a link error as well.
    Whereas the C++ Standard specifically allows programmers to replace the definitions for certain forms of operator new and operator delete, the C Standard says that replacing any standard library function leads to undefined behavior [5]. Nonetheless, I think you'll find that writing these replacements for malloc, et al., will work on nearly all platforms as I've described.

    Truth in advertising: Although I have used this technique to successfully prevent using dynamic memory allocation in my own code, I have yet to see it employed in any large-scale effort. If you run into difficulties with it, please let me know, and I'll pass it on to my readers.

    Endnotes:
    1. Saks, Dan, "Poor reasons for rejecting C++", Embeddeddesignindia.com, March 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_6865.HTM
    2. Saks, Dan, "Freestanding vs. hosted implementations", Embeddeddesignindia.com, March 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_6903.HTM
    3. Saks, Dan, "Allocating objects vs. allocating storage", Embedded.com, September 2, 2008.
    4. ISO/IEC Standard 14882:2003(E), Programming languages - C++.
    5. ISO/IEC Standard 9899:1999, Programming languages - C.
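The column's link-error trick cannot be shown as a running program, since the whole point is that the build fails. As a rough runnable companion, the sketch below replaces the same four allocation functions with versions that count calls instead of blocking them. This is a different, runtime variant and not the column's technique. The names g_new_calls and counted_alloc are hypothetical. It assumes a C++11 or later compiler, so it uses noexcept and omits the throw (std::bad_alloc) specifications shown in the column, which were deprecated in C++11 and removed in C++17. The two sized operator delete overloads are an addition for C++14 and later.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter (not from the column): calls to any replaced operator new.
static std::size_t g_new_calls = 0;

// Shared helper: count the request, then defer to malloc.
static void *counted_alloc(std::size_t n)
{
    ++g_new_calls;
    return std::malloc(n != 0 ? n : 1);
}

// Throwing forms: report failure with std::bad_alloc.
void *operator new(std::size_t n)
{
    void *p = counted_alloc(n);
    if (p == 0) throw std::bad_alloc();
    return p;
}

void *operator new[](std::size_t n)
{
    void *p = counted_alloc(n);
    if (p == 0) throw std::bad_alloc();
    return p;
}

// Nothrow forms: report failure with a null pointer.
void *operator new(std::size_t n, std::nothrow_t const &) noexcept
{
    return counted_alloc(n);
}

void *operator new[](std::size_t n, std::nothrow_t const &) noexcept
{
    return counted_alloc(n);
}

// Matching deallocation functions, so memory obtained from malloc goes to free.
void operator delete(void *p) noexcept { std::free(p); }
void operator delete[](void *p) noexcept { std::free(p); }
void operator delete(void *p, std::size_t) noexcept { std::free(p); }
void operator delete[](void *p, std::size_t) noexcept { std::free(p); }
```

A test build could compare g_new_calls before and after a code path that is supposed to be allocation-free. To get the link-time behavior the column describes, each operator new body would instead return the result of the declared-but-undefined operator_new_blocker().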
  • Popularity 10
    2011-3-15 12:12
    2208 views
    0 comments
    In 2009, Michael Barr wrote an article entitled "Real men program in C." [1] The article prompted numerous comments from readers, including a number who cited reasons to prefer C over C++. I chimed in by writing a column entitled "Poor reasons for rejecting C++" in which I explained why I disagreed with some of those comments [2]. That, in turn, provoked another flurry of comments. Many of those comments affirmed the wisdom of my analysis, thank you. Others raised interesting issues that I will try to address in this and future columns.

    Some readers took exception to my statement that, "I know of no place where the C++ language performs dynamic allocation or recursion behind the scenes." One reader wrote that, "The flexible nature of common data types such as the C++ vector and string classes would seem to completely rely on dynamic allocation, just as the whole STL library. I can't quite see how you could implement things like the string + operator, vector push_back member function, etc. ... without a heap ..."

    Arguably, I could have been more explicit that I was talking only about the C++ language itself and not its accompanying library. I did elaborate that, "Indeed, your code might call a function that uses dynamic allocation or recursion, but this is no more a problem in C++ than in C," but I guess I wasn't explicit enough.

    I think embedded programmers should learn to think of the C and C++ programming languages as distinct from their accompanying standard libraries. Although I'm not prepared to say embedded programmers should view all languages this way, it's probably an appropriate way to view statically-typed, compiled languages, which include C and C++, as well as others such as Ada. Most compilers come with a small collection of mysteriously-named functions that support operations such as program startup, program shutdown, and floating point arithmetic, but such functions are not really part of the library as defined by the language standards.
    Rather, such functions are just artifacts of the compiler's code generation strategy.

    The C Standard formalizes a separation between the language and the library by distinguishing between hosted and freestanding implementations. Informally, a hosted implementation is a C translation and execution environment running under an operating system with full support for the language and library. A freestanding implementation is a C translation and execution environment with nearly full language support but essentially no support for the standard library's runtime components, an environment not uncommon among low-end embedded systems.

    Here's what the C99 Standard actually says [3]:

    "The two forms of conforming implementation are hosted and freestanding. A conforming hosted implementation shall accept any strictly conforming program. A conforming freestanding implementation shall accept any strictly conforming program that does not use complex types and in which the use of the features specified in the library clause (clause 7) is confined to the contents of the standard headers <float.h>, <iso646.h>, <limits.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>."

    Those seven named headers define only constants, types, and a few function-like macros. They don't declare any functions. A freestanding implementation need not provide headers such as <stdio.h> or <stdlib.h>. Thus, it need not support input and output functions such as fgetc and printf or memory management functions such as malloc and free. A freestanding implementation doesn't even need to support the string handling functions in <string.h>, such as memcpy and strlen.

    Although many embedded systems do without I/O and memory management, I suspect very few get by without memcpy. Thus, the C Standard's notion of a freestanding implementation defines a narrower subset than what most embedded tool chains offer and what most embedded programmers actually use. Part of learning to be a good embedded developer is learning to be selective about which parts of the library are safe and appropriate to use.
    Documents such as the MISRA-C guidelines can provide assistance in this regard [4].

    The C++ Standard distinguishes hosted and freestanding implementations much as the C Standard does, but a freestanding environment in C++ demands more runtime support than it does in C. In particular, the 2003 C++ Standard says that a freestanding implementation must provide at least the headers <cstddef>, <limits>, <cstdlib>, <new>, <typeinfo>, <exception>, and <cstdarg>, and that "the supplied version of the header <cstdlib> shall declare at least the functions abort(), atexit(), and exit()." [5]

    Whereas a freestanding C implementation need not provide <stdlib.h>, and therefore need not support malloc and free, a freestanding C++ implementation must provide <new> and therefore must support operators new and delete. For programmers concerned that they might inadvertently use dynamic allocation, C++ offers compile- and link-time techniques you can use to prevent using these operators. I'll discuss those techniques in a future blog.

    Endnotes:
    1. Barr, Michael, "Real men program in C", Embedded.com, August 1, 2009.
    2. Saks, Dan, "Poor reasons for rejecting C++", Eetasia.com, March 2011. http://forum.eetasia.com/BLOG_ARTICLE_6861.HTM
    3. ISO/IEC Standard 9899:1999, Programming languages - C.
    4. MISRA-C 2004: Guidelines for the use of the C language in critical systems. Motor Industry Research Association, October 2004.
    5. ISO/IEC Standard 14882:2003(E), Programming languages - C++.
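Code that must build for both kinds of implementation can ask the compiler which one it is targeting. The sketch below relies on the predefined macro __STDC_HOSTED__, which C99 specifies and which C++ adopted in C++11 (so it is not guaranteed by the 2003 C++ Standard cited in the post). The macro APP_IS_HOSTED and the functions is_hosted and log_line are hypothetical names used only for illustration.

```cpp
// __STDC_HOSTED__ is predefined by C99 and by C++11 and later:
// 1 for a hosted implementation, 0 for a freestanding one.
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
#include <cstdio>            // <cstdio> is guaranteed only when hosted
#define APP_IS_HOSTED 1      // hypothetical project macro
#else
#define APP_IS_HOSTED 0
#endif

// Reports which kind of implementation this translation unit was built for.
inline bool is_hosted() { return APP_IS_HOSTED == 1; }

// Writes a line when standard I/O is available, and does nothing otherwise.
inline void log_line(const char *text)
{
#if APP_IS_HOSTED
    std::puts(text);
#else
    (void)text;              // no standard I/O to rely on
#endif
}
```

Keeping the check in one header-like place means the rest of the code calls log_line without caring whether <cstdio> exists on the target.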
  • Popularity 12
    2011-3-13 18:28
    1602 views
    0 comments
    My colleague, Michael Barr, wrote an interesting piece entitled "Real men program in C." [1] I won't try to summarize it; you can read it yourself. The article provoked numerous comments from readers citing reasons to prefer C over C++. I disagree with several of those comments, and I'd like to say a little in reply to each.

    One reader wrote, "Just for the fun of it you might as well want to have a look at what Linus Torvalds thinks about C and C++," followed by the URL to the remark (which I've omitted intentionally). I suppose reading what Torvalds wrote would be fun if you like reading rants. It wasn't fun for me. It's provocative but not well grounded, as explained by my friend and colleague Steve Dewhurst. Torvalds has made positive contributions to the computing industry, but this wasn't one of them.

    Another reader wrote, "I've studied C++ but never used it professionally and, the more I study it, the less I think it appropriate for embedded use. It's not just that it is expensive in terms of resources; it encourages you to do too many risky things. It's bad enough that you can do things like dynamic allocation and recursion in C but, in C++, the language will do it without your ever realizing it, unless you know what it's doing behind the scenes..."

    I have used C++ for prototyping embedded systems and consulted for others who have developed production systems in C++. The more I use it and see it used, the more convinced I am that it's preferable to C, when available. Rather than encourage risky behavior, many of C++'s core features, such as classes, access control, constructors, destructors, references and overloading, work hand-in-hand with stricter type checking to help rein in the riskiest parts of C. While I agree that typical C++ code tends to be a bit larger and slower than comparable C, the added expense is rarely prohibitive (on the order of 10-15%).
    I've had little problem tuning time-critical code, such as interrupt handlers, to be as or more efficient than it would be in C.

    I know of no place where the C++ language performs dynamic allocation or recursion behind the scenes. Indeed, your code might call a function that uses dynamic allocation or recursion, but this is no more a problem in C++ than in C. In fact, C++ supports simple compile- and link-time techniques you can use to explicitly prevent using dynamic allocation, which I'll cover in an upcoming column.

    Yet another reader wrote, "The only part of object orientation that is a must-have in embedded systems is private encapsulation. C supports this nicely enough with static variables declared at file scope."

    While I agree with the basic sentiment of the first sentence, I wouldn't word it so strongly. I would say that the single most useful feature of C++ that's absent from C is classes with private access control. I would not call it a "must-have" feature simply because programmers do write viable embedded systems in C without classes. However, you can write better embedded systems with classes. I wouldn't want to code without them.

    On the other hand, I disagree that C supports access control "nicely enough with static variables declared at file scope". I believe the reader is referring to the common technique whereby you:
    - place the data declarations for the encapsulated data structure in a separate source file,
    - add the keyword static to each data declaration so that the data has internal linkage [2], and
    - define non-inline functions with external linkage in that same source file to provide the publicly accessible operations on the encapsulated data.

    I refer to this technique as encapsulation by separate compilation.
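The three steps above can be sketched as the contents of one hypothetical source file. The ring buffer and the names rb_put, rb_get, and rb_size are illustrative assumptions, not taken from the column. The code is written so that it compiles as C++, although the technique itself is the C idiom the reader described.

```cpp
// ring_buffer.cpp (hypothetical file name): the only translation unit
// that can name the data below.
#include <cstddef>

// Step 1 and 2: the encapsulated data, declared static for internal linkage.
static const std::size_t rb_capacity = 8;
static char rb_data[rb_capacity];
static std::size_t rb_head = 0;    // index of the oldest element
static std::size_t rb_count = 0;   // number of stored elements

// Step 3: non-inline functions with external linkage are the public interface.
bool rb_put(char c)
{
    if (rb_count == rb_capacity) return false;          // buffer full
    rb_data[(rb_head + rb_count) % rb_capacity] = c;
    ++rb_count;
    return true;
}

bool rb_get(char &c)
{
    if (rb_count == 0) return false;                    // buffer empty
    c = rb_data[rb_head];
    rb_head = (rb_head + 1) % rb_capacity;
    --rb_count;
    return true;
}

std::size_t rb_size() { return rb_count; }
```

Other source files would see only declarations of the three functions. Note that there is exactly one buffer, and every access from outside this file is an out-of-line call.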

    It works "nicely enough" only if you're willing to accept the following:
    - Your code will probably be both bigger and slower than it would be if the data were not encapsulated, because code outside the encapsulation unit can access the data only through non-inline function calls.
    - Each program can have at most one instance of the encapsulated data structure, unless you're willing to accept even bigger and slower code that comes with using an ad hoc memory manager.

    C++ classes don't have these limitations.

    The same reader added that "Initialization through constructors is not something you desire; in fact you will want to avoid this for security reasons and for reduced bootup time."

    I strongly disagree with this statement. Initialization by constructors is highly desirable. I'd rank constructors and destructors as the second most useful C++ feature that's absent from C (after classes with access control). Failure to properly initialize objects can lead to security problems, and constructors go a long way toward ensuring proper initialization. (Order-dependencies in member initializers can be hazardous, but you can use static analysis tools that will alert you to such hazards.)

    I would be interested to see concrete examples of how using constructors can cause security problems. So would Robert Seacord, author of Secure Coding in C and C++ (2005, Addison-Wesley) and The CERT C Secure Coding Standard (2008, Addison-Wesley).

    If you're concerned about startup time, C++ offers features such as "placement new" to give you greater control over when constructors execute. Using placement new, you can defer initialization for selected objects until after system startup. I expect to cover this in the not-too-distant future.

    Despite my avowed preference for C++ over C, I still recognize that many of my readers have legitimate reasons to use C, such as the lack of an adequate C++ compiler for the intended target platform [3]. I will continue to write about topics of interest to both C and C++ programmers, in part because I think your programming language choice should be well-informed.

    Between the time I wrote this column and the time it was posted, further comments on Michael Barr's column pursued the question of whether constructors can lead to security problems. The reader who originally raised this issue explained that the problem "isn't because of the language, but because of the unreliable nature of RAM. there could be days, weeks or years from the point of initialization to the point where the variable is used. If you rely on the initialization values, you leave the safety in the hands of the RAM manufacturer."

    It sounds to me like the problem he's describing is, by his own admission, independent of the use of constructors. I still see no basis for the claim that one should avoid using constructors for security reasons.

    Endnotes:
    1. Barr, Michael, "Real men program in C", Embedded.com, August 1, 2009.
    2. Saks, Dan, "Linkage in C and C++", Embedded Systems Design, March 2008, page 9.
    3. Saks, Dan, "Moving to Higher Ground", Embedded Systems Programming, October 2003, page 45.
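Two claims in this post lend themselves to a small sketch: that a class with private members allows inline access and more than one instance, and that placement new lets you choose when a constructor runs. The code below is an illustration under assumptions, not the author's own example. The class ring_buffer, the storage rx_storage, the pointer rx, and the function init_rx are hypothetical names, and alignas requires C++11 or later.

```cpp
#include <cstddef>
#include <new>      // declares the placement form of operator new

class ring_buffer {
public:
    ring_buffer() : head(0), count(0) {}   // constructor guarantees a valid empty state

    // Member functions defined in the class are implicitly inline.
    bool put(char c)
    {
        if (count == capacity) return false;
        data[(head + count) % capacity] = c;
        ++count;
        return true;
    }

    bool get(char &c)
    {
        if (count == 0) return false;
        c = data[head];
        head = (head + 1) % capacity;
        --count;
        return true;
    }

    std::size_t size() const { return count; }

private:
    static const std::size_t capacity = 8;
    char data[capacity];
    std::size_t head;
    std::size_t count;
};

// Raw, suitably aligned static storage: no constructor runs at program startup.
alignas(ring_buffer) static unsigned char rx_storage[sizeof(ring_buffer)];
static ring_buffer *rx = 0;

// Call this once, after system startup, to construct the object in place.
ring_buffer &init_rx()
{
    rx = new (rx_storage) ring_buffer;   // placement new: runs the constructor, allocates nothing
    return *rx;
}
```

Placement new does not call the replaceable global operator new, so this kind of deferred construction stays compatible with a build that blocks dynamic allocation. A second ring_buffer declared anywhere else is simply another independent instance.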