constructors: how to use them and what Chinese-language resources exist - 面包板社区 (Mianbaoban community)

Related blog posts

Issues when constructing memory-mapped objects

Popularity 13

User 3635970

2011-8-3 00:55

2254 views |

0 comments

Last March, I posted a column explaining some common approaches to representing and manipulating memory-mapped devices in C. [1] I followed that with another column discussing some better alternatives using classes in C++. [2] The C++ alternatives are better in the sense that they yield interfaces that are typically easier to use correctly and harder to use incorrectly than the C alternatives, and yet have much the same performance.

In C, structures often provide the best way to model the device registers of memory-mapped devices. For example:

    typedef struct timer_type timer_type;
    struct timer_type {
        device_register TMOD;
        device_register TDATA;
        device_register TCNT;
    };

defines the layout for a timer that employs three device registers. The header file that defines this structure might also define useful constants and types for manipulating the registers, such as:

    #define TE 0x01
    #define TICKS_PER_SEC 50000000
    typedef uint32_t timer_count_type;

along with functions that provide basic operations for programming a timer, such as:

    void timer_disable(timer_type *t);
    void timer_enable(timer_type *t);
    void timer_set(timer_type *t, timer_count_type c);
    timer_count_type timer_get(timer_type const *t);

In C++, you can wrap all of the timer components into a single class that more effectively hides some of the timer's complexity. The class definition looks something like:

    class timer_type {
    public:
        enum { TICKS_PER_SEC = 50000000 };
        typedef uint32_t count_type;
        void disable();
        void enable();
        void set(count_type c);
        count_type get() const;
    private:
        enum { TE = 0x01 };
        device_register TMOD;
        device_register TDATA;
        device_register TCNT;
    };

As I explained last April, a constructor is a special class member function that provides guaranteed initialization for objects of its class type. [3] This timer_type class doesn't have any constructors, but it probably should. This month, I'll discuss adding constructors to classes that represent memory-mapped devices.
As you'll see, writing such constructors is no big thing. The challenging part is getting the constructors to execute.

Defining constructors

In many embedded systems, the appropriate way to initialize a device is to put it into an inactive state. In such systems, the constructor for a timer might simply make sure that the timer is disabled. To do this, add the constructor declaration to the class definition:

    class timer_type {
    public:
        // ...
        timer_type();
        // ...
    };

and define the function as:

    inline timer_type::timer_type() {
        disable();
    }

Alternatively, you can define the function within the class definition, as in:

    class timer_type {
    public:
        // ...
        timer_type() { disable(); }
        // ...
    };

in which case the function is also implicitly an inline function.

Declaring objects

Normally, you don't choose the memory locations where program objects reside. The compiler does, often with substantial help from the linker. For example, if the compiler encountered an object definition at global scope such as:

    timer_type the_timer;

the compiler would set aside so many bytes at some offset within some data segment. If that definition appeared at local scope, the compiler would set aside so many bytes at some offset within the stack frame of the function containing the definition. As I explained a few months ago, in either case, the compiler would automatically plant code to invoke the constructor in the "right" place. [4]

However, a timer (or any object representing a memory-mapped device) isn't an ordinary object. The compiler doesn't get to choose where the object resides; the hardware designer does. Thus, to access the object, the code needs a declaration for a name it can use to reference the memory-mapped location as if that location were an object of the proper device type. As I explained last year, that declaration can have different forms.
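As a minimal sketch of the ordinary case described above, the following defines a timer_type with a constructor and then defines one object at namespace scope, where the compiler both allocates the storage and plants the constructor call. Several details are assumptions for illustration and are not from the column: device_register as a volatile 32-bit word, the bodies of disable() and enable(), the enabled() helper, and the constructor_calls counter, which exists only to make the constructor call observable on a host machine.

```cpp
#include <cstdint>

// Assumption: each device register is a volatile 32-bit word.
typedef std::uint32_t volatile device_register;

class timer_type {
public:
    // The constructor leaves the timer in an inactive state.
    timer_type() { ++constructor_calls; disable(); }

    // Assumed implementations: clear or set the TE (timer enable) bit.
    void disable() { TMOD = TMOD & ~static_cast<std::uint32_t>(TE); }
    void enable() { TMOD = TMOD | TE; }

    // Hypothetical helper, not part of the column's class.
    bool enabled() const { return (TMOD & TE) != 0; }

    // Instrumentation for this sketch only.
    static int constructor_calls;

private:
    enum { TE = 0x01 };
    device_register TMOD;
    device_register TDATA;
    device_register TCNT;
};

int timer_type::constructor_calls = 0;

// An ordinary definition: the compiler chooses the address, sets aside
// the storage, and arranges for the constructor to run before main.
timer_type ordinary_timer;
```

By the time main begins, constructor_calls is 1 and ordinary_timer.enabled() is false, without any explicit initialization call in the program.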
Failure to launch

With most C and C++ compilers, you can name a memory-mapped object using a standard extern declaration such as:

    extern timer_type the_timer;

and then use linker command options or linker scripts to force the_timer into the desired address. However, this declaration is not a definition, so the compiler doesn't generate a constructor call for the object.

That last sentence is worth elaborating. An object declaration is a statement that effectively says to the compiler: "Here's a name and some attributes for an object that's somewhere in this program, possibly here." An object definition is a statement that says: "Here's a name and the complete set of attributes for an object that's right here." All definitions are declarations, but not all declarations are definitions.

An object definition prompts the compiler to generate code to allocate the object. A non-defining declaration does not. A definition for an object of class type with constructors also prompts the compiler to generate code to call a constructor. A non-defining declaration does not.

Some C and C++ compilers provide a non-standard language extension that lets you position an object at a specified memory address. For example, to declare a timer object residing at location 0xFFFF6000, you might write a memory-mapped object declaration of the form:

    timer_type the_timer @ 0xFFFF6000;

with one compiler, or:

    timer_type the_timer _at(0xFFFF6000);

with another. With most compilers that support such declarations, these aren't definitions either. When that's the case, the compiler won't generate a constructor call for these declarations.

The other common alternative is to define a pointer to a device register as a macro:

    #define the_timer ((timer_type *)0xFFFF6000)

or as a constant pointer:

    timer_type *const the_timer = (timer_type *)0xFFFF6000;

In C++, using a reinterpret_cast operator, as in:

    timer_type *const the_timer = reinterpret_cast<timer_type *>(0xFFFF6000);

reduces the hazard of the cast somewhat.
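The difference between defining an object and defining only a pointer to it can be observed on a host machine with a sketch like the one below. It rests on assumptions that are not from the column: fake_timer_registers is a hypothetical byte array standing in for the hardware at 0xFFFF6000, device_register is taken to be a volatile 32-bit word, and constructor_calls is instrumentation added for the sketch.

```cpp
#include <cstdint>

// Assumption: each device register is a volatile 32-bit word.
typedef std::uint32_t volatile device_register;

class timer_type {
public:
    timer_type() { ++constructor_calls; }

    // Instrumentation for this sketch only.
    static int constructor_calls;

private:
    device_register TMOD;
    device_register TDATA;
    device_register TCNT;
};

int timer_type::constructor_calls = 0;

// Hypothetical stand-in for the device. On a real target the address
// would be fixed by the hardware (0xFFFF6000 in the column's example).
alignas(timer_type) unsigned char fake_timer_registers[sizeof(timer_type)];

// This defines a pointer, not a timer_type object, so no timer_type
// constructor runs for the storage it points to.
timer_type *const the_timer =
    reinterpret_cast<timer_type *>(fake_timer_registers);

// This defines a timer_type object, so the constructor does run.
timer_type ordinary_timer;
```

After start-up, constructor_calls is 1: the single call comes from ordinary_timer, and none comes from the_timer.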
You can also use a reference instead of a constant pointer, as in:

    timer_type &the_timer = *reinterpret_cast<timer_type *>(0xFFFF6000);

These declarations for the_timer as a pointer (or reference) are object definitions, but they define only the pointer (or reference) to a memory-mapped object. They don't define the memory-mapped object itself. Thus, once again, the compiler won't generate a constructor call applied to the memory-mapped object.

Of course, C programmers don't face this issue. C doesn't provide constructors, so C programmers just write named initialization functions and call them explicitly. C++ programs could do this, too. For example, you could define an initialization function for timer_type, as in:

    class timer_type {
    public:
        // ...
        void construct() { disable(); }
        // ...
    };

Then, you could write:

    timer_type &the_timer = *reinterpret_cast<timer_type *>(0xFFFF6000);
    the_timer.construct();

to set up a timer and initialize it. And there'd be no problem if it weren't so darned easy to forget to write such calls now and then.

Stay tuned

Constructors serve a useful purpose, and it would be preferable to use them whenever possible to initialize memory-mapped objects. Fortunately, C++ provides alternative forms for operator new that make this feasible. That will be my subject next month.

Endnotes:
1. Saks, Dan, "Alternative models for memory-mapped devices," Embeddeddesignindia.com, March 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_6981.HTM
2. Saks, Dan, "Memory-mapped devices as C++ classes," Embeddeddesignindia.com, March 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_7022.HTM
3. Saks, Dan, "Demystifying constructors," Embeddeddesignindia.com, April 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_7113.HTM
4. Saks, Dan, "Constructors and object definitions," Embeddeddesignindia.com, April 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_7323.HTM
Utilising member initializers

Popularity 11

User 3635970

2011-6-16 13:51

1795 views |

0 comments

In C++, a constructor is a special class member function that allows guaranteed initialisation for objects of its class type. In my recent articles, I've been discussing what constructors are and what kind of code compilers generate on their behalf. [1, 2] Last month, I started to explain the behaviour of constructors for classes with members that have constructors of their own. [3] This month, I'll pick up where I left off.

Although C doesn't support constructors, C programs can provide functions that mimic constructors. Well-written C programs do. As in the past, I'll explain the behaviour of C++ constructors by presenting C code that exhibits much the same behaviour. This should help you see not only what C++ is doing behind the scenes, but also how you can emulate constructors in C.

Assignment vs. initialisation

For my example, I've been using a class for entries in a symbol table, where each entry stores a name, an associated ID, and a value. The name is a character string, the ID is an unsigned integer value, and the value is a sequence of one or more signed integer values. The entry class definition looks in part like:

    class entry {
    public:
        entry(string const &n, int v);
        // ...
    private:
        static unsigned counter;
        string name;
        unsigned id;
        sequence value;
    };

Here, string might be the Standard C++ string class, or something similar. The sequence class might actually be a typedef alias for a standard container class template specialization, such as:

    typedef vector<int> sequence;

or it might be a custom-built class.

A class can have more than one constructor. At the moment, the entry class has just one, declared as:

    entry(string const &n, int v);

This constructor initializes an entry so that its name is n and its value is v. The constructor also uses static member counter to generate a unique ID for each entry.
The constructor definition that I presented last time looked like:

    entry::entry(string const &n, int v) {
        name = n;
        value.push_back(v);
        id = ++counter;
    }

Strictly speaking, the first statement in the constructor body is not an initialisation. It's an assignment that replaces name's value, assuming the string has already been initialized. Calling push_back is not an initialisation, either. It assumes that value already has an initial value, and appends one more value to whatever's already there.

Remember, entry's members name and value have class types. Those classes have constructors, which provide guaranteed initialisation. C++ preserves the guarantee by inserting default constructor calls for entry's members into the entry constructor itself. (A default constructor is a constructor that can be called with an empty argument list.) The compiler generates a call that applies the default string constructor to entry's member name, and another call that applies the default sequence constructor to member value. A C function that performs the same work as the entry constructor might look like:

    void entry_construct_nv(entry *_this, string const *n, int v) {
        string_construct(&_this->name);
        sequence_construct(&_this->value);
        string_copy(&_this->name, n);
        sequence_push_back(&_this->value, v);
        _this->id = ++counter;
    }

As the C code indicates, the entry constructor initializes name to be an empty string, only to replace the empty string with a copy of n. Similarly, it initializes value to be an empty sequence, only to append the value of v. The constructor code would be shorter and faster if the constructor simply initialized name with a copy of n and value as a sequence containing just v. C++ provides member initializers to eliminate such unnecessary default initialisation.

Member initializers

Again, the entry class has two members of class type, name and value.
C++ upholds the initialisation guarantee by applying the default constructors to name and value as part of the entry constructor. If you'd like the entry constructor to apply different constructors to its members, it must use member initializers.

A constructor definition may include a list of member initializers. Each member initializer specifies the initial value for some class member. For example, in:

    entry::entry(string const &n, int v):
        name (n), value (1, v) {
        id = ++counter;
    }

the member initializer name (n) specifies that this entry constructor will initialise its member name using the string copy constructor with n as its argument. The member initializer value (1, v) specifies that the entry constructor will initialise value using a sequence constructor that accepts 1 and v as its arguments. This will initialise the sequence to contain one element whose value is v.

Using member initializers often streamlines the work of a constructor. In this case, it eliminates statements from the constructor body. A C function that performs the same work as this entry constructor has fewer function calls than it did before:

    void entry_construct_nv(entry *_this, string const *n, int v) {
        string_construct_copy(&_this->name, n);
        sequence_construct_cv(&_this->value, 1, v);
        _this->id = ++counter;
    }

Member initializers can appear only in constructors, not in any other functions. If present, the member initializer list must appear after the closing parenthesis of the constructor's parameter list and before the opening brace of the function body. If a constructor has any member initializers, a colon (":") must appear before the first one.

Different programmers have different styles for formatting member initializer lists. I like to place the colon immediately after the parameter list and place the member initializers on the line that follows.
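The saving from member initializers can be made visible with a small experiment. The sketch below is an illustration under assumptions, not the column's code: counting_string is a hypothetical stand-in for the string class that counts its own default constructions, copy constructions, and assignments, and the two entry variants (entry_assigning and entry_initializing) omit the id member to stay short.

```cpp
#include <string>
#include <vector>

// Hypothetical string stand-in that records how it gets initialised.
struct counting_string {
    static int default_constructions;
    static int copy_constructions;
    static int assignments;

    std::string text;

    counting_string() { ++default_constructions; }
    counting_string(char const *s) : text(s) {}
    counting_string(counting_string const &other) : text(other.text) {
        ++copy_constructions;
    }
    counting_string &operator=(counting_string const &other) {
        text = other.text;
        ++assignments;
        return *this;
    }
};

int counting_string::default_constructions = 0;
int counting_string::copy_constructions = 0;
int counting_string::assignments = 0;

typedef std::vector<int> sequence;

// Assigns in the constructor body: name is default-constructed first,
// then overwritten by assignment.
class entry_assigning {
public:
    entry_assigning(counting_string const &n, int v) {
        name = n;
        value.push_back(v);
    }
private:
    counting_string name;
    sequence value;
};

// Uses member initializers: name is copy-constructed directly and
// value starts out as a one-element sequence.
class entry_initializing {
public:
    entry_initializing(counting_string const &n, int v):
        name(n), value(1, v) {
    }
private:
    counting_string name;
    sequence value;
};
```

Constructing one entry_assigning costs one default construction plus one assignment of the name, while constructing one entry_initializing costs a single copy construction.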
Each member initializer has the general form:

    member-name ( expression-list )

where member-name is the name of a class member and expression-list is a list of zero or more expressions separated by commas. If the initializer names a member of class type, the expression list can be any sequence that's acceptable as the argument list to one of the constructors for that member.

Whenever possible, C++ strives to treat objects of class and non-class types according to uniform rules. Thus, you can use member initializers to initialise members of non-class type. Typically, when a member initializer has the form m (v) and m has a non-class type, the member initializer generates the same code as the assignment:

    m = v;

appearing in the constructor body. For example, you can write the previous constructor definition as:

    entry::entry(string const &n, int v):
        name (n), id (++counter), value (1, v) {
    }

and it generates the same code as before. It's not uncommon for a C++ constructor to have an empty function body and do all the work in the member initializers.

When a member has a const-qualified type, you must use a member initializer to initialise that member. For example, if you declare the id member in the entry class to be const, as in:

    class entry {
        // ...
        unsigned const id;
        sequence value;
    };

then any statement that tries to modify id won't compile, even in a constructor:

    entry::entry(string const &n, int v):
        name (n), value (1, v) {
        id = ++counter; // error if id is const
    }

The only way to initialise such a const member is with a member initializer. The same is true for members of a reference type.

Less to be surprised about

One of the easiest ways to misuse an object in C is to fail to initialise it properly. In C++, you can use constructors to guarantee initialisation, thus getting the compiler to do automatically what you might forget to do yourself.
Unfortunately, fear about guaranteed initialisation leads some programmers to shy away from C++, claiming that C++ does too much behind the scenes. Using member initializers offers more control over what constructors do, and helps eliminate unnecessary default initialisation.

Endnotes:
1. Saks, Dan, "Demystifying constructors," Embeddeddesignindia.co.in, April 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_7113.HTM
2. Saks, Dan, "Constructors and object definitions," Embeddeddesignindia.co.in, April 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_7323.HTM
3. Saks, Dan, "Understanding member initialisation," Embeddeddesignindia.co.in, May 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_7818.HTM
Constructors and object definitions

Popularity 12

User 3635970

2011-4-22 12:34

2215 views |

0 comments

One of the easiest ways to misuse a structure object in C is to fail to initialise it properly. In C++, you can reduce the incidence of uninitialized objects by using classes with constructors. A constructor is a special class member function that provides guaranteed initialisation for objects of its class type.

In my previous blog, I tried to take some of the mystery out of constructors by explaining what it is that constructors do and don't do. [1] In essence, a constructor's job is to place appropriate initial values into an object's shallow part and, if there is a deep part, acquire and initialise it, too. (The shallow part of an object is the storage that contains the object's data members, as well as its base class sub-objects, vptr and padding, if any. The deep part of an object is any storage used to represent the object's state beyond the shallow part. An object usually accesses its deep part via pointers or other resource handles residing in the shallow part.)

I concluded the article by listing the places that you're likely to see constructors called. I'll now focus on two of those places: object definitions at local scope and at global (namespace) scope.

For my examples, I'll use a ring_buffer class similar to the one I used previously. The class definition looks, in part, like:

    class ring_buffer {
    public:
        ring_buffer();
        ring_buffer(size_t n);
        // ...
    private:
        char *base;
        size_t size;
        size_t head, tail;
    };

The class has two constructors. The first constructor, the default constructor, has an empty parameter list. It initializes a ring_buffer whose capacity is some default number of characters. The second constructor has a parameter n of type size_t. It initializes a ring_buffer whose capacity is n characters.
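One possible way to fill in the two constructors is sketched below. The column does not show their bodies, so everything beyond the declarations is an assumption: the default capacity of 64 characters, the use of new[] for the deep part, the destructor, the disabled copy operations, and the capacity() accessor, which is a hypothetical helper added so the result can be inspected.

```cpp
#include <cstddef>

class ring_buffer {
public:
    ring_buffer();                 // default constructor
    ring_buffer(std::size_t n);    // capacity of n characters
    ~ring_buffer() { delete [] base; }

    // Hypothetical accessor, not part of the column's class.
    std::size_t capacity() const { return size; }

private:
    // Copying is declared but not defined, to keep the sketch short.
    ring_buffer(ring_buffer const &);
    ring_buffer &operator=(ring_buffer const &);

    enum { default_size = 64 };    // assumed default capacity

    char *base;                    // points to the deep part
    std::size_t size;
    std::size_t head, tail;
};

// Each constructor initialises the shallow part (the four members)
// and acquires the deep part (the separately allocated array).
ring_buffer::ring_buffer() {
    base = new char[default_size];
    size = default_size;
    head = tail = 0;
}

ring_buffer::ring_buffer(std::size_t n) {
    base = new char[n];
    size = n;
    head = tail = 0;
}
```

A definition such as ring_buffer rb; then yields a buffer with the assumed default capacity, and ring_buffer rb (128); yields one with room for 128 characters. The shallow part is the same size in both cases; only the deep part differs.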
In C, you can emulate the behaviour of the ring_buffer constructors using functions declared as:

    void rb_construct(ring_buffer *this);
    void rb_construct_size(ring_buffer *this, size_t n);

In C++, each constructor has an implicitly-declared parameter named this, which points to an object of the constructor's class. In C, you must declare that parameter explicitly. C doesn't support function overloading, so each C function must have a unique name.

Constructors and local objects

The most obvious place where constructor calls occur is in definitions for class objects at block scope (within function bodies). For example, given the following C++ code:

    void f() {
        int i = 0;
        size_t n = 64;
        ring_buffer rb;
        // ...

the compiler will generate a constructor call to initialise rb. With most C++ compilers, the entry code for function f will allocate storage for all the local objects (i, n, and rb) at once, and then perform the initialisations in the order that the declarations appear. The generated code should be essentially the same as what you'd get from the following C code:

    void f() {
        int i;
        size_t n;
        ring_buffer rb;
        i = 0;
        n = 64;
        rb_construct(&rb);
        // ...

In the C++ code, rb's definition doesn't specify any constructor arguments, so the definition invokes ring_buffer's default constructor, the one with the empty parameter list:

    ring_buffer();

If rb's definition had been written instead as:

    ring_buffer rb (n);

then the definition would invoke the constructor declared as:

    ring_buffer(size_t n);

Each element of an array is itself an object. If the array element type has a constructor, then each element must be constructed.
Thus, an array definition local to a C++ function, as in:

    void f() {
        ring_buffer rba[N]; // for some constant N
        // ...

typically generates a loop that applies a constructor to each element, much as the following C code does:

    void f() {
        ring_buffer rba[N]; // for some constant N
        ring_buffer *p;
        for (p = rba; p != rba + N; ++p)
            rb_construct(p);
        // ...

Constructors and local static objects

A local object normally has automatic storage duration. The program creates the object upon function entry. That is, the program allocates storage and applies a constructor to that object each time it enters the function containing the object's definition. However, a local object can have static storage duration, as in:

    void f() {
        size_t n = 64;
        static ring_buffer rb (n);
        // ...

In this case, the program allocates the object's storage at build time and constructs the object only once, the first time execution passes through the object's definition. C++ compilers typically introduce a "first time through switch": a statically allocated Boolean object that tracks whether the local object has been initialized. The equivalent C code looks something like:

    static bool rb_initialized = false;
    static ring_buffer rb;

    void f() {
        size_t n;
        n = 64;
        if (!rb_initialized) {
            rb_construct_size(&rb, n);
            rb_initialized = true;
        }
        // ...

This technique is not thread-safe. A C++ compiler that supports multiple threads would have to do something a little fancier to prevent two threads from accessing the switch concurrently.

Constructors and non-local static objects

Definitions for local objects appear as statements inside function bodies. As shown in the previous examples, the initialisation for such an object executes as part of the function containing the object definition. In contrast, definitions for non-local objects appear outside function bodies. The initialisation of such objects takes place at program start-up or shortly thereafter.
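The three local cases (an automatic object, an automatic array, and a local static object) can be checked with a small counting experiment. In this sketch, probe is a hypothetical class whose only job is to count constructor calls, and the value 4 for N is an arbitrary choice; neither comes from the column.

```cpp
#include <cstddef>

// Hypothetical class that counts how often its constructor runs.
struct probe {
    static int constructions;
    probe() { ++constructions; }
};

int probe::constructions = 0;

std::size_t const N = 4;    // "some constant N"; the value is arbitrary

void automatic_object() {
    probe p;                // constructed on every call
    (void)p;
}

void automatic_array() {
    probe pa[N];            // one constructor call per element, every call
    (void)pa;
}

void static_object() {
    static probe p;         // constructed only the first time through
    (void)p;
}
```

Calling automatic_object twice adds two constructions, calling automatic_array once adds N, and calling static_object any number of times adds just one.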
Some C++ compilers create a function for each translation unit that invokes the constructors for non-local static objects defined in that unit. For example, suppose a module contains the non-local definitions:

    // xyz.cpp
    // ...
    ring_buffer rb1;
    widget w;
    ring_buffer rb2 (128);

A typical implementation technique is to generate an initialisation function that calls constructors for rb1, w, and rb2. The C equivalent of that function looks something like:

    void __sti__xyz() {
        rb_construct(&rb1);
        widget_construct(&w);
        rb_construct_size(&rb2, 128);
    }

Using the prefix __sti__ (for "static initialisation") to name these functions was a convention that began with the earliest C++ compilers. I believe some compilers still use it.

The C++ Standard requires that the program initialise all the non-local static objects in a given translation unit before the program uses any function or object in that unit. [2] When the implementation uses static initialisation functions such as __sti__xyz, the program must call this function before the program uses rb1, w, or rb2. Many compilers satisfy this requirement by planting a call to each static initialisation function somewhere in the program's start-up code.

The C++ Standard mandates that the constructors for non-local static objects in a given translation unit execute in the order that the objects are declared. Unfortunately, the Standard doesn't specify the initialisation order for objects in different translation units. A program written with the expectation that the modules will be initialized in a certain order could easily yield disappointing results. Steve Dewhurst describes this problem in some detail, and offers some workarounds. [3] So does Scott Meyers. [4] Some compilers offer #pragma directives, compiler and linker options, or other extensions to give you better control over initialisation order.

Constructors in other places

C++ compilers inject constructor calls into a number of other places in programs.
I'll explain where those places are, and why they make sense, in upcoming columns.

Endnotes:
1. Saks, Dan, "Demystifying constructors," Embeddeddesignindia.com, April 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_7113.HTM
2. ISO/IEC Standard 14882:2003(E), Programming languages – C++.
3. Dewhurst, Stephen C., C++ Gotchas. Addison-Wesley, 2003.
4. Meyers, Scott, Effective C++, 3rd ed. Addison-Wesley, 2005.
Demystifying constructors

Popularity 14

User 3635970

2011-4-5 20:09

2531 views |

0 comments

Even the experienced C++ programmer can be confused about what exactly constructors do and when they get called.

One of the easiest ways to misuse a structure object in C is to fail to initialize it properly. In C++, a class can have special member functions, called constructors, that provide guaranteed initialization for objects of that class type. The guarantee isn't absolute: you can subvert it using a cast. Nonetheless, using constructors can reduce the incidence of uninitialized objects.

While most C++ programmers use constructors frequently, I keep running into C++ programmers, even experienced ones, who seem to misunderstand how constructors really work. They're surprised, and somewhat dismayed, when a seemingly simple statement generates a flurry of constructor calls that they didn't expect.

Initialization is rarely optional. When it doesn't get done, subsequent operations often fail. However, initialization can be a problem when it happens at unexpected times, especially when the affected code is time-critical.

In this column, I'll start to take some of the mystery out of when constructors execute and what it is that they actually do. As I often do, I'll explain the behavior of C++ by showing equivalent code in C. If you're a C programmer who doesn't use C++, I think you'll still find these insights helpful. C code that mimics the discipline imposed by C++ is often better code.

Shallow parts vs. deep parts

I'll begin by introducing a little terminology that should simplify the remaining discussion. Consider an abstract type that implements a ring buffer of characters. A ring buffer is a first-in-first-out data structure. Data can be inserted at the buffer's back end and removed from the front end. The C++ definition for a ring buffer class might look, in part, like:

    class ring_buffer {
        // ...
    private:
        char *base;
        size_t size;
        size_t head, tail;
    };

Member base represents an array that holds the buffered characters.
Member size represents the number of elements in that array. Members head and tail are the indices of the elements at the buffer's front and back ends, respectively.

As I explained in a prior column, a class without base classes and virtual functions, and with all data members having the same accessibility (all public or all private), has essentially the same storage layout as a structure containing the same data members in the same order. [1] Thus, for example, the ring_buffer class above has the same storage layout as a C structure defined as:

    typedef struct ring_buffer ring_buffer;
    struct ring_buffer {
        char *base;
        size_t size;
        size_t head, tail;
    };

In truth, member base stores a pointer to the initial element of the array, not the array itself. The array is part of ring_buffer's implementation, but it's not one of the data members. The array occupies storage allocated separately from the ring_buffer object.

The ring_buffer is an example of a class with deep structure. A class has deep structure if it has at least one data member that refers to separately-allocated resources managed by the class. Classes with members that are pointers to dynamically-allocated memory are the most common classes with deep structure. However, a class with a member of any type that designates a separately-allocated managed resource, such as an integer designating a file, also has deep structure.

Obviously, not all classes have deep structure. For example, a class representing complex numbers typically has just two data members of some floating-point type, as in:

    class complex {
        // ...
    private:
        double real, imaginary;
    };

Nothing in this class refers to resources beyond the data members. Such classes have shallow structure.

The shallow part of an object is the storage that contains the object's data members, as well as its base class sub-objects, vptr and padding, if any. (I briefly described base class sub-objects and vptrs in an earlier column. [1]) The sizeof operator applied to a class object (or the class itself) yields the number of bytes in the shallow part of the object (or class).

The deep part of an object is any storage used to represent the object's state beyond the shallow part. Objects with shallow structure have no deep part.

Where the shallow parts come from

When you define an object in either C or C++, as in:

    ring_buffer rb;

the compiler generates code to allocate the object's shallow part. The initialization of the ring_buffer, including the allocation of its deep part, won't happen unless you write additional code. In C, you have to invoke the initialization code explicitly every time you define a ring_buffer. In C++, you can provide constructors for ring_buffer, which the compiler will use to initialize each ring_buffer automatically.

Before I explain where the shallow parts come from, I want to dispel a common misconception: With modern C++ compilers, a constructor doesn't allocate the shallow part of the object it constructs. Rather, the program allocates the shallow part using one of the usual run-time mechanisms for storage allocation. The constructor then initializes the shallow part, and in so doing, may allocate and initialize a deep part as well. (In some early C++ implementations, the constructor did memory allocation for the shallow part, but only for new-expressions. C++ has evolved so that such implementations are now extinct and can be found only in museums.)

Now, back to the allocation of the shallow parts. By the "usual run-time mechanisms" for storage allocation I mean whatever the compiler normally does depending on whether the object to be allocated has automatic, static, or dynamic storage duration. [2] These mechanisms are essentially the same in C++ as they are in C.

For an object with automatic storage duration (an "automatic object"), the shallow part will be allocated on the run-time stack.
During optimization, the compiler may decide to place some automatic objects into CPU registers. It might even do this for a class object whose shallow part is small enough to fit into the available registers. However, it's easier to discuss automatic allocation if we don't belabor this detail and instead speak as if automatic objects are always placed on the stack.

If an automatic object is a function parameter, its storage will be allocated as the program evaluates function arguments prior to the call. If an automatic object is declared within a function body, its storage will be allocated upon entering the function.

For an object with static storage duration, the compiler, linker, and loader collaborate to place the object's shallow part in memory before the program starts running. From the running program's perspective, an object with static storage duration is always there. (In reality, the constructor doesn't run until run time. The new draft standard for C++ provides a new keyword, constexpr, which will allow some constructors to "run" at compile time.)

In C++, objects with dynamic storage duration are those created by new-expressions. A new-expression allocates memory by calling a function named operator new or operator new[].

[...]

    ring_buffer::ring_buffer(size_t n) {
        base = new char[n];
        size = n;
        head = tail = 0;
    }

The new-expression acquires the ring_buffer's deep part. By default, it throws an exception if it fails. The rest of the constructor initializes the shallow part.
A C function that does essentially the same job looks like:

    void rb_construct(ring_buffer *this, size_t n) {
        if ((this->base = (char *)malloc(n)) == NULL) {
            /* return or announce failure more overtly */
            return;
        }
        this->size = n;
        this->head = this->tail = 0;
    }

In C, you should probably call this function as soon as possible after the definition or statement that allocates the shallow part, as in:

    ring_buffer rb;
    rb_construct(&rb, 32);

Where constructors get called

Again, whenever your program defines an object with a class type, the compiler automatically plants a call to the object's constructor at the appropriate place in the generated code. If you learn to anticipate where those places are, you're less likely to be surprised by the code the compiler generates. Among the places you're likely to see constructors called are:

* Definitions for objects of class type, or for arrays with elements of class type.
* New-expressions that create objects of class type, or arrays with elements of class type.
* Return statements that return class objects by value.
* Function calls with parameters of class type passed by value.
* Explicit type conversions (cast expressions).
* Any other expressions that create temporary objects of class type.
* Throwing an exception of class type.
* Catching an object of class type by value.

I'll look at some of these in detail in my next column.

Endnotes:

1. Saks, Dan. "Classes are structures, and then some," Embeddeddesignindia.com, March 2011. http://forum.embeddeddesignindia.co.in/BLOG_ARTICLE_6834.HTM
2. Saks, Dan. "Storage class specifiers and storage duration," Embedded Systems Design, January 2008, p. 9.
3. Saks, Dan. "Allocating objects vs. allocating storage," Embedded Systems Design, September 2008, p. 11.
4. Saks, Dan. "Allocating arrays," Embedded Systems Design, January 2009, p. 9.


Tags: constructors