热度 12
2011-3-14 17:00
1730 次阅读|
0 个评论
I've written a couple of articles on dynamic allocation in C and C++ emphasizing the distinction between allocating objects and allocating raw (uninitialized) storage. When a program allocates an object, it not only allocates storage for the object, but also initializes that storage with a value appropriate for the type of object that will occupy that storage. When a program just allocates storage, it leaves that storage uninitialized. I have also explained the distinction between deallocating objects and deallocating storage. When a program deallocates an object, it releases not only the storage occupied by the object, but also any other resources the object was using. I also sketched out how C++ compilers translate new-expressions into more primitive operations. I also showed how to write C code that emulates the behavior of new-expressions. However, I stalled out when I got to array delete-expressions. I also left some details out of the code for both the C++ implementation and the C emulation of array new-expressions. I'll fill in most of the missing pieces in my future blogs. A recap New-expressions in C++ allocate objects. Each new-expression is conceptually, if not actually, a two-step process: (1) allocate storage for an object, and (2) initialize it. For objects of class types, initializing an object usually involves calling a constructor. For example, suppose class widget is defined as: class widget { public: widget(); // a constructor ~widget(); // a destructor // ... }; Then a new-expression such as: pw = new widget (); translates more-or-less into something like: pw = static_cast(operator new(sizeof(widget))); pw-widget(); The first statement acquires storage for a widget object by calling operator new , and converts the address of that storage from type void * to type widget * . The second statement initializes the storage by applying widget 's default constructor. (That second statement—an explicit constructor call—is not something you can actually write in C++.) Delete-expressions in C++ deallocate objects. Each delete-expression is also a two-step process: (1) release resources that the object was using, and (2) deallocate the storage for the object. For objects of class types, releasing resources involves calling a destructor. If pw is a pointer to an object of class type widget , a delete-expression such as delete pw; translates more-or-less into something like: if (pw != NULL) { pw-~widget(); operator delete(pw); } A delete-expression applied to a null pointer does nothing. If the pointer is non-null, the delete-expression applies the destructor to the soon-to-be-deleted object, and then deallocates the object's storage by passing the object's address to operator delete . In contrast to a new-expression, a call to the Standard C malloc function, as in: pw = (widget *)malloc(sizeof(widget)); merely allocates storage for a widget , leaving the storage uninitialized. In contrast to a delete-expression, a call to the Standard C free function, as in: free(pw); merely deallocates the storage for a widget , without any regard for any additional resources that the widget may have been using. Although C doesn't have classes with constructors and destructors, you can emulate them by using structs and functions. For example, you can implement a C++ widget class as a C struct: typedef struct widget widget; struct widget { // widget data members go here }; (The typedef immediately before the struct definition elevates the name widget from a mere tag to a full-fledged type name.) You can also implement each widget class member function in C++ as a non-member function in C whose first parameter is a pointer to the widget to be manipulated, possibly along with other parameters. For example, you might declare the C implementation of the widget default constructor and destructor as: void widget_construct(widget *pw); void widget_destroy(widget *pw); You can closely approximate the behavior of a C++ new-expression as an inline C function: inline widget *new_widget() { widget *pw = (widget *)malloc(sizeof(widget)); if (pw != NULL) widget_construct(pw); return pw; } Thereafter, you can construct a dynamically-allocated widget with a default initial value using just: pw = new_widget(); which is a pretty good approximation for the C++ new-expression: pw = new widget; Similarly, you can mimic the behavior of a C++ delete-expression by using another inline function: inline void delete_widget(widget *pw) { if (pw != NULL) { widget_destroy(pw); free(pw); } } Then, if pw points to a dynamically-allocated widget , you can delete it by calling: delete_widget(pw); which is a pretty fair approximation for the C++ delete-expression: delete pw; Allocating and deallocating arrays A C++ array new-expression as in: pw = new widget ; allocates an array of 10 properly initialized widget s. As with other new-expressions, an array new-expression is still a two-step process: (1) allocate storage, and (2) initialize it. However, with an array new-expression the second step is a loop, which applies the default widget constructor to each array element in ascending order by element address. An array delete-expression such as: delete ; You can mimic the behavior of an array delete-expression as a function declared as: void delete_widget_array(widget *pw); but the implementation isn't as straightforward as it is for new_widget_array . Whereas the array dimension appears explicitly in an array new-expression, it never appears in an array delete-expression nor in the declaration of delete_widget_array . Each array delete-expression specifies the address of (the initial element of) the array, but not the array's dimension. The delete-expression and delete_widget_array have to obtain the array dimension another way. A common technique for passing the array dimension to the delete-expression is for the array new-expression to stash the array dimension in a location just before the array itself, as illustrated in the Figure 1 . Inasmuch as that additional location stores an array dimension, it should be declared as type size_t , the same as the parameter to new_widget_array . Previously, new_widget_array allocated space for the array using: widget *pw = (widget *)malloc(n * sizeof(widget)); In the new version, it allocates space for the array plus storage for the array dimension using: size_t size = sizeof(size_t) + n * sizeof(widget); size_t *ps = (size_t *)malloc(size); If ps is non-null ( malloc returns a pointer to the requested storage), then new_widget_array can proceed to place the array dimension at the beginning of that storage: *ps = n; and then compute the address of the first element in the array itself: ++ps; The allocated array is an array of widgets , but ps is a pointer to a size_t , so new_widget_array needs a cast to obtain a pointer it can use to access the array elements: pw = (widget *)ps; Altogether, the new version of new_widget_array looks like: widget *new_widget_array(size_t n) { widget *pw = NULL; size_t size = sizeof(size_t) + n * sizeof(widget); size_t *ps = (size_t *)malloc(size); if (ps != NULL) { widget *p; *ps = n; pw = (widget *)++ps; for (p = pw; p != pw + n; ++p) widget_construct(p); } return pw; } The delete_widget_array function assumes that its parameter, pw , is a pointer returned from some prior call to new_widget_array . Therefore, delete_widget_array must obtain the array dimension by effectively reversing the pointer computation done in new_widget_array . A complete implementation of the function looks like: void delete_widget_array(widget *pw) { if (pw != NULL) { size_t *ps = (size_t *)pw; size_t n = *—ps; widget *p = pw + n; while (p != pw) widget_destroy(—p); free(ps); } } Padding and alignment, again These implementations of new_widget_array and delete_widget_ array will work just fine on any processor in which no type has an alignment stricter than that of size_t . However, it could produce undefined behavior on any machine with more strictly aligned types. I'll explain why, and what you can do about it, in a future article.