 

用户3635970

文章：76 阅读：189182 评论：3 赞：1187

 好友  私信个人主页

文章 76
原创 0
阅读 189182
评论 3
赞 1187

原创 Allocating and deallocating arrays, in detail

 2011-3-14 17:04  2636 10 10 分类: 消费电子

I've written a couple of articles on dynamic allocation in C and C++ emphasizing the distinction between allocating objects and allocating raw (uninitialized) storage. When a program allocates an object, it not only allocates storage for the object, but also initializes that storage with a value appropriate for the type of object that will occupy that storage. When a program just allocates storage, it leaves that storage uninitialized.

I have also explained the distinction between deallocating objects and deallocating storage. When a program deallocates an object, it releases not only the storage occupied by the object, but also any other resources the object was using.

I also sketched out how C++ compilers translate new-expressions into more primitive operations. I also showed how to write C code that emulates the behavior of new-expressions. However, I stalled out when I got to array delete-expressions. I also left some details out of the code for both the C++ implementation and the C emulation of array new-expressions. I'll fill in most of the missing pieces in my future blogs.

A recap

New-expressions in C++ allocate objects. Each new-expression is conceptually, if not actually, a two-step process: (1) allocate storage for an object, and (2) initialize it. For objects of class types, initializing an object usually involves calling a constructor.

For example, suppose class widget is defined as:

class widget

{

public:

widget(); // a constructor

~widget(); // a destructor

// ...

};

Then a new-expression such as:

pw = new widget ();

translates more-or-less into something like:

pw = static_cast(operator new(sizeof(widget)));

pw->widget();

The first statement acquires storage for a widget object by calling operator new, and converts the address of that storage from type void * to type widget *. The second statement initializes the storage by applying widget's default constructor. (That second statement—an explicit constructor call—is not something you can actually write in C++.)

Delete-expressions in C++ deallocate objects. Each delete-expression is also a two-step process: (1) release resources that the object was using, and (2) deallocate the storage for the object. For objects of class types, releasing resources involves calling a destructor.

If pw is a pointer to an object of class type widget, a delete-expression such as delete pw; translates more-or-less into something like:

if (pw != NULL)

{

pw->~widget();

operator delete(pw);

}

A delete-expression applied to a null pointer does nothing. If the pointer is non-null, the delete-expression applies the destructor to the soon-to-be-deleted object, and then deallocates the object's storage by passing the object's address to operator delete.

In contrast to a new-expression, a call to the Standard C malloc function, as in:

pw = (widget *)malloc(sizeof(widget));

merely allocates storage for a widget, leaving the storage uninitialized. In contrast to a delete-expression, a call to the Standard C free function, as in:

free(pw);

merely deallocates the storage for a widget, without any regard for any additional resources that the widget may have been using.

Although C doesn't have classes with constructors and destructors, you can emulate them by using structs and functions. For example, you can implement a C++ widget class as a C struct:

typedef struct widget widget;

struct widget

{

// widget data members go here

};

(The typedef immediately before the struct definition elevates the name widget from a mere tag to a full-fledged type name.)

You can also implement each widget class member function in C++ as a non-member function in C whose first parameter is a pointer to the widget to be manipulated, possibly along with other parameters. For example, you might declare the C implementation of the widget default constructor and destructor as:

void widget_construct(widget *pw);

void widget_destroy(widget *pw);

You can closely approximate the behavior of a C++ new-expression as an inline C function:

inline

widget *new_widget()

{

widget *pw = (widget *)malloc(sizeof(widget));

if (pw != NULL)

widget_construct(pw);

return pw;

}

Thereafter, you can construct a dynamically-allocated widget with a default initial value using just:

pw = new_widget();

which is a pretty good approximation for the C++ new-expression:

pw = new widget;

Similarly, you can mimic the behavior of a C++ delete-expression by using another inline function:

inline

void delete_widget(widget *pw)

{

if (pw != NULL)

{

widget_destroy(pw);

free(pw);

}

Then, if pw points to a dynamically-allocated widget, you can delete it by calling:

delete_widget(pw);

which is a pretty fair approximation for the C++ delete-expression:

delete pw;

Allocating and deallocating arrays

A C++ array new-expression as in:

pw = new widget [10];

allocates an array of 10 properly initialized widgets. As with other new-expressions, an array new-expression is still a two-step process: (1) allocate storage, and (2) initialize it. However, with an array new-expression the second step is a loop, which applies the default widget constructor to each array element in ascending order by element address.

An array delete-expression such as:

delete [] pw;

is also a two-step process: (1) apply the destructor to each array element, and (2) deallocate the storage for the array. In this case, it's the first step that's the loop—a loop that applies the destructor to each array element in the reverse order.

You can emulate the behavior of a C++ array new-expression as a C function:

widget *new_widget_array(size_t n)

{

widget *pw = (widget *)malloc(n * sizeof(widget));

if (pw != NULL)

{

widget *p;

for (p = pw; p != pw + n; ++p)

widget_construct(p);

}

return pw;

}

Thereafter, you can dynamically allocate an array of properly initialized widgets using just:

pw = new_widget_array(n);

which is a pretty good approximation for the C++ new-expression:

pw = new widget [n];

You can mimic the behavior of an array delete-expression as a function declared as:

void delete_widget_array(widget *pw);

but the implementation isn't as straightforward as it is for new_widget_array. Whereas the array dimension appears explicitly in an array new-expression, it never appears in an array delete-expression nor in the declaration of delete_widget_array. Each array delete-expression specifies the address of (the initial element of) the array, but not the array's dimension. The delete-expression and delete_widget_array have to obtain the array dimension another way.

A common technique for passing the array dimension to the delete-expression is for the array new-expression to stash the array dimension in a location just before the array itself, as illustrated in the Figure 1. Inasmuch as that additional location stores an array dimension, it should be declared as type size_t, the same as the parameter to new_widget_array.

Previously, new_widget_array allocated space for the array using:

widget *pw = (widget *)malloc(n * sizeof(widget));

In the new version, it allocates space for the array plus storage for the array dimension using:

size_t size = sizeof(size_t) + n * sizeof(widget);

size_t *ps = (size_t *)malloc(size);

If ps is non-null (malloc returns a pointer to the requested storage), then new_widget_array can proceed to place the array dimension at the beginning of that storage:

*ps = n;

and then compute the address of the first element in the array itself:

++ps;

The allocated array is an array of widgets, but ps is a pointer to a size_t, so new_widget_array needs a cast to obtain a pointer it can use to access the array elements:

pw = (widget *)ps;

Altogether, the new version of new_widget_array looks like:

widget *new_widget_array(size_t n)

{

widget *pw = NULL;

size_t size = sizeof(size_t) + n * sizeof(widget);

size_t *ps = (size_t *)malloc(size);

if (ps != NULL)

{

widget *p;

*ps = n;

pw = (widget *)++ps;

for (p = pw; p != pw + n; ++p)

widget_construct(p);

}

return pw;

}

The delete_widget_array function assumes that its parameter, pw, is a pointer returned from some prior call to new_widget_array. Therefore, delete_widget_array must obtain the array dimension by effectively reversing the pointer computation done in new_widget_array. A complete implementation of the function looks like:

void delete_widget_array(widget *pw)

{

if (pw != NULL)

{

size_t *ps = (size_t *)pw;

size_t n = *—ps;

widget *p = pw + n;

while (p != pw)

widget_destroy(—p);

free(ps);

}

Padding and alignment, again

These implementations of new_widget_array and delete_widget_ array will work just fine on any processor in which no type has an alignment stricter than that of size_t. However, it could produce undefined behavior on any machine with more strictly aligned types. I'll explain why, and what you can do about it, in a future article.