[Reading Notes] Universal References in C++11 part 3: Memory Management and the Rule of Three (Five)


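The previous post introduced rvalue references and why they were added: without them there was no way to tell an lvalue apart from a temporary object, so temporaries were copied needlessly. At bottom, all of these operations are about managing the resources we use more efficiently and more simply. Memory management has always been one of the central topics in C++ programming.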

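One of the most frequently cited guidelines is RAII. What is RAII? It stands for Resource Acquisition Is Initialization: every resource the object is going to use is acquired during construction, for example opening a file, initializing variables, or allocating memory. That is only half of the story, though. The matching idea is Resource Release Is Destruction (RRID): every resource that was acquired must be given back during destruction.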

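Whatever you new yourself you must remember to delete; a file you opened must be closed. RAII is a concept of resource ownership: each object manages the resources it owns for the duration of its own lifetime. It is also a broad design principle that reminds programmers to release the resources they no longer need. Otherwise the consequence is an unstable system or an outright crash.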
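The acquire-in-constructor, release-in-destructor pairing can be sketched as follows. This is only a minimal illustration: `ScopedHandle`, `liveHandles`, and `handlesInsideScope` are hypothetical names, and the counter stands in for a real resource such as a file or a lock.

```cpp
#include <cassert>

// Hypothetical counter standing in for a real resource pool (open files, locks, ...).
inline int& liveHandles() {
    static int count = 0;
    return count;
}

// Hypothetical RAII wrapper: acquires in the constructor, releases in the destructor.
class ScopedHandle {
public:
    ScopedHandle() { ++liveHandles(); }      // acquisition is initialization
    ~ScopedHandle() {
        assert(liveHandles() > 0);           // something must have been acquired
        --liveHandles();                     // release is destruction
    }

    // Copying is disabled so that each acquisition is released exactly once.
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;
};

// Returns the number of live handles observed while one handle is in scope.
inline int handlesInsideScope() {
    ScopedHandle h;            // resource acquired here
    return liveHandles();      // h is released automatically when the function returns
}
```

The caller never writes a release call; leaving the scope is enough, which is the point of tying the resource to the object's lifetime.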

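So how do we implement RAII and RRID correctly in C++? Is it enough to do the initialization in the constructor and the matching release in the destructor? The answer is yes, and also no. The right answer depends on what we design the class to do and what we expect of it. If an object needs to copy another object into itself, we have to decide whether that means a deep copy, a shallow copy, or moving the resources the other object manages over to ourselves. Hm, wasn't this covered in the previous post? It was, but the code in that post did not live up to what we expect from a class's resource management. A complete implementation has to follow the Rule of Three: if the designer implements any one of the following functions, the others need a suitable implementation as well: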

class Widget{
public:
    /** Copy constructor */
    Widget(const Widget& other){
        memberwiseCopy(other);
    }
    /** Destructor */
    ~Widget(){}
    /** Copy assignment operator */
    Widget& operator= (const Widget& other)
    {
        memberwiseCopy(other);
        return *this;
    }
};

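If we do not implement them, the compiler generates these functions automatically (see here), and what they do is roughly what the snippet above does: a member-by-member copy. The default implementations often do not fit the use case, however. The most common example is a class that manages a raw pointer: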

#define SAFE_DELETE(a) do { delete (a); (a) = NULL; } while (0)
class Widget{
public:
    /** Copy constructor */
    Widget(const Widget& other){
        memberwiseCopy(other);
    }
    /** Destructor */
    ~Widget(){SAFE_DELETE(m_pHeapStorageResource);}
    /** Copy assignment operator */
    Widget& operator= (const Widget& other)
    {
        memberwiseCopy(other);
        return *this;
    }
public:
    Resource* m_pHeapStorageResource;
};
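The class above does not show what memberwiseCopy does with the pointer. As a rough sketch of one option, a deep copy, the copy operations could look like the following. `DeepCopyWidget`, `copiesAreIndependent`, and the `Resource` struct are hypothetical stand-ins for illustration, not the code from this series.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the Resource type used in this series.
struct Resource {
    std::string m_data;
};

// Hypothetical class name, to avoid confusion with the Widget above.
// Invariant assumed here: m_pHeapStorageResource is never null.
class DeepCopyWidget {
public:
    DeepCopyWidget() : m_pHeapStorageResource(new Resource()) {}

    // Copy constructor: allocate a separate Resource holding a copy of the data.
    DeepCopyWidget(const DeepCopyWidget& other)
        : m_pHeapStorageResource(new Resource(*other.m_pHeapStorageResource)) {}

    // Copy assignment: build the new copy first, then release the old one.
    DeepCopyWidget& operator=(const DeepCopyWidget& other) {
        assert(other.m_pHeapStorageResource != nullptr);
        if (this != &other) {  // guard against self-assignment
            Resource* fresh = new Resource(*other.m_pHeapStorageResource);
            delete m_pHeapStorageResource;
            m_pHeapStorageResource = fresh;
        }
        return *this;
    }

    // Destructor: each object deletes only the Resource it owns.
    ~DeepCopyWidget() { delete m_pHeapStorageResource; }

    Resource* m_pHeapStorageResource;
};

// Returns true when copies and their source own different Resource objects.
inline bool copiesAreIndependent() {
    DeepCopyWidget a;
    a.m_pHeapStorageResource->m_data = "a";
    DeepCopyWidget b(a);                       // copy constructor
    b.m_pHeapStorageResource->m_data = "b";    // must not affect a
    DeepCopyWidget c;
    c = a;                                     // copy assignment
    return a.m_pHeapStorageResource->m_data == "a"
        && b.m_pHeapStorageResource->m_data == "b"
        && c.m_pHeapStorageResource->m_data == "a"
        && a.m_pHeapStorageResource != c.m_pHeapStorageResource;
}
```

Because every object owns its own Resource, destroying one object cannot leave another holding a pointer to freed memory.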

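If we create new Widget objects with the default copy constructor and copy assignment operator, the copy is shallow. Once the first object has been destroyed, m_pHeapStorageResource in the second object is a dangling pointer, and the second object deletes the same memory again in its own destructor. That double delete is undefined behavior and usually crashes the program. To avoid it, the copy constructor and copy assignment operator must deep-copy m_pHeapStorageResource. The previous post mentioned this as well. We also discussed how rvalue references avoid redundant copies. That feature, introduced in C++11, turns the Rule of Three into the Rule of Five, because two more functions now have to be implemented: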

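/** Move constructor */
Widget(Widget&& other)
    : m_pHeapStorageResource(nullptr) //member list initialization: memberwiseMove deletes the old pointer, so it must not be uninitialized
{
    //memberwiseMove(other);//oops! will call memberwiseMove(Widget& other), not our intent here
    memberwiseMove(std::forward<Widget>(other));
}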

/** Move assignment operator */
Widget& operator= (Widget&& other)
{
    //memberwiseMove(other);//oops! will call memberwiseMove(Widget& other), other is lvalue because it has a name
    memberwiseMove(std::forward<Widget>(other));//forward rvalue to memberwiseMove
    return *this;
}

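Notice this call in the code above: memberwiseMove(std::forward<Widget>(other));

Why use std::forward? Because without this wrapper the call would resolve to the lvalue reference version of memberwiseMove. (Here std::forward<Widget>(other) simply casts other to Widget&&, so std::move(other) would have the same effect and is the more common spelling outside of templates.)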

The two versions of memberwiseMove are implemented as follows:

void memberwiseMove(Widget&& other){
    SAFE_DELETE(m_pHeapStorageResource);//safe release resource
    m_pHeapStorageResource = other.m_pHeapStorageResource; //move
    other.m_pHeapStorageResource = nullptr;

    //Does not benefit from move semantics:
    //class Resource has no move assignment operator, so this calls its default copy assignment operator (memberwiseCopy).
    //other.m_stackStorageResource is also an lvalue here, so an explicit std::move would be needed as well.
    m_stackStorageResource = other.m_stackStorageResource;
    m_i = other.m_i;                                        //does not benefit from move semantics either, because moving an int is the same as copying it
}
void memberwiseMove(Widget& other){
    SAFE_DELETE(m_pHeapStorageResource);//safe release resource
    m_pHeapStorageResource = other.m_pHeapStorageResource; //move
    other.m_pHeapStorageResource = nullptr;
    m_stackStorageResource = other.m_stackStorageResource;
    m_i = other.m_i;
}

So why does this happen? The argument clearly came in as an rvalue reference, didn't it? Let's review how to tell whether something is an lvalue:

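First check whether it has a name. If it has a name, it is an lvalue.
Try taking the address of the expression (with &). If that works, it is an lvalue; otherwise it is an rvalue.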

The code below confirms that we can take the address of other. So although other is declared as an rvalue reference, the name other is itself an lvalue.

Widget& operator= (Widget&& other)
{
    Widget* test = &other;  // get address
    //memberwiseMove(other);//oops! will call memberwiseMove(Widget& other), other is lvalue because it has a name
    memberwiseMove(std::forward<Widget>(other));//forward rvalue to memberwiseMove
    return *this;
}

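That is why we wrap the call in std::forward. The function does no work at run time; it is only a cast whose job is to pass the object on to the overload it should reach. If what came in was an lvalue reference, the lvalue reference version is called; if what came in was an rvalue reference, the rvalue reference version is called.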
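The overload selection described above can be checked with a small experiment. This is a sketch only: the empty `Widget` struct and the functions `which`, `passByName`, `passWithMove`, `passWithForward`, `relay`, and `checkOverloads` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Empty stand-in type; only its value category matters here.
struct Widget {};

// Two overloads that report which one overload resolution picked.
inline std::string which(Widget&)  { return "lvalue"; }
inline std::string which(Widget&&) { return "rvalue"; }

// `other` is declared as Widget&&, but the expression `other` has a name,
// so it is an lvalue and binds to which(Widget&).
inline std::string passByName(Widget&& other) { return which(other); }

// Casting back to an rvalue selects which(Widget&&).
inline std::string passWithMove(Widget&& other) { return which(std::move(other)); }

// std::forward<Widget> performs the same cast, because Widget is not a reference type.
inline std::string passWithForward(Widget&& other) { return which(std::forward<Widget>(other)); }

// In a template, T&& is a universal reference and std::forward<T> preserves
// the value category of the caller's argument.
template <typename T>
std::string relay(T&& arg) { return which(std::forward<T>(arg)); }

// Small self-check listing the overload each call is expected to reach.
inline void checkOverloads() {
    Widget w;
    assert(passByName(Widget()) == "lvalue");
    assert(passWithMove(Widget()) == "rvalue");
    assert(passWithForward(Widget()) == "rvalue");
    assert(relay(w) == "lvalue");          // T deduced as Widget&
    assert(relay(Widget()) == "rvalue");   // T deduced as Widget
}
```

Only `relay` uses std::forward in the role it was designed for; in the non-template functions the cast is unconditional, which is why std::move reads more naturally there.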

Now let's go through what happens when this Widget is used:

int main(){

    Widget obj1;

    Widget obj2(obj1);// call Copy constructor
    Widget obj3;
    obj3 = obj1; // call Copy assignment operator

    Widget obj4;
    obj4 = Widget(); //call Move assignment operator

    // call Move constructor; obj1 should not be used again, because we marked it as moved into obj5 here
    Widget obj5 = std::move(obj1);

    // call Move assignment operator; obj2 should not be used again, because we marked it as moved into obj5 here
    obj5 = std::move(obj2);
    
    return 0;
}

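The code above uses std::move. What is this function for? It is similar in nature to std::forward, except that its job is to cast an lvalue to an rvalue unconditionally, in effect hiding the lvalue's name, so that the call can find the rvalue reference overload. It is called std::move not because it actually moves anything; by itself it does nothing at all. It only marks, as a reminder to the programmer, that an lvalue is about to be transferred to another object, and expresses the "hope" that the lvalue will not be used afterwards. So calling std::move on its own has no effect. In the example above, obj1 and obj2 were both moved into obj5, so those two objects should not be used from that point on.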

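Of course, none of this is enforced, and the programmer can still use those two objects. C++ has added the extra step of std::move to warn us, but if we insist on using them anyway, the compiler cannot stop us. And the consequence? With this Widget the moved-from pointer has been set to null, so dereferencing it will usually crash the program.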
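That std::move is only a cast can be seen in a short sketch using std::string. The function names `moveWithoutConsumer` and `moveIntoNewObject` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move(x) is only a cast: applied to a std::string lvalue,
// its result type is std::string&&.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move yields an rvalue reference");

// Calling std::move and discarding the result leaves the object untouched,
// because no move constructor or move assignment operator ever runs.
inline std::string moveWithoutConsumer() {
    std::string s = "still here";
    static_cast<void>(std::move(s));   // cast only; nothing receives the rvalue
    assert(s == "still here");
    return s;
}

// The transfer happens only when a move constructor or move assignment
// operator receives the rvalue.
inline std::string moveIntoNewObject() {
    std::string s = "payload";
    std::string t = std::move(s);      // std::string's move constructor runs here
    // s is now in a valid but unspecified state; do not rely on its contents.
    return t;
}
```

For standard library types a moved-from object stays valid but unspecified; for a hand-written class such as Widget, its state is whatever the class's own move operations leave behind.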

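At this point some readers may be thinking: why does designing one class take so many functions? Isn't that too much trouble, and is there a way to simplify it? There is. In fact there are two: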

1. Declare every function other than the constructor and the destructor as private or as deleted, which disables these operations outright.

/** Copy constructor */
Widget(const Widget& other) = delete;
/** Copy assignment operator */
Widget& operator= (const Widget& other) = delete;
/** Move constructor */
Widget(Widget&& other) = delete;
/** Move assignment operator */
Widget& operator= (Widget&& other) = delete;

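2. Follow the Rule of Zero, so that there is no need to implement that tedious set of compiler-generated functions at all. The implementation looks like this (any smart pointer type can be used):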

class Widget{
public:
    /** Default constructor */
    Widget()
        : m_resourcePtr(new Resource())
        , m_i(0){

    }

public:
    std::shared_ptr<Resource> m_resourcePtr;
    Resource m_stackStorageResource;
    int m_i;
};



int main(){

    Widget obj1;
    obj1.m_resourcePtr.get()->m_data = "this is obj1";
    Widget obj2(obj1);// call Copy constructor
    obj2.m_resourcePtr.get()->m_data = "this is obj2";
    Widget obj3;
    obj3 = obj1; // call Copy assignment operator

    Widget obj4;
    obj4 = Widget(); //call Move assignment operator


    Widget obj5 = std::move(obj1);

    obj5 = std::move(obj2);


    return 0;
}

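As the code above shows, this Widget no longer holds any raw pointer, so there is nothing to worry about regarding transfer of resource ownership. That is the philosophy behind the Rule of Zero: do not manage any raw pointer by hand, and leave resource management entirely to smart pointers. All the questions discussed above, copy or not, move or not, are handled by the smart pointer. Note that the choice of smart pointer decides the semantics: with std::shared_ptr, as used here, a copied Widget shares the same Resource with the original instead of receiving a deep copy. This is also the recommended practice in modern C++: any mechanism that manages raw pointers by hand should be considered outdated. Where the production environment allows it, the Rule of Zero should be the first choice when designing a class.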
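Both simplifications can be checked with type traits and a few assertions. The sketch below is illustrative only: `PinnedWidget`, `SharedWidget`, `UniqueWidget`, `checkSimplifiedWidgets`, and the `Resource` struct are hypothetical names, and the std::unique_ptr variant is an addition to show how a different smart pointer changes the generated operations.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the Resource type used in this series.
struct Resource {
    std::string m_data;
};

// Option 1: all copy and move operations are deleted.
class PinnedWidget {
public:
    PinnedWidget() {}
    PinnedWidget(const PinnedWidget&) = delete;
    PinnedWidget& operator=(const PinnedWidget&) = delete;
    PinnedWidget(PinnedWidget&&) = delete;
    PinnedWidget& operator=(PinnedWidget&&) = delete;
};
static_assert(!std::is_copy_constructible<PinnedWidget>::value, "copying is disabled");
static_assert(!std::is_move_constructible<PinnedWidget>::value, "moving is disabled");

// Option 2 with std::shared_ptr: the compiler-generated copy shares the Resource.
class SharedWidget {
public:
    SharedWidget() : m_resourcePtr(std::make_shared<Resource>()) {}
    std::shared_ptr<Resource> m_resourcePtr;
};

// Option 2 with std::unique_ptr: copying is implicitly disabled, moving is generated.
class UniqueWidget {
public:
    UniqueWidget() : m_resourcePtr(new Resource()) {}
    std::unique_ptr<Resource> m_resourcePtr;
};
static_assert(!std::is_copy_constructible<UniqueWidget>::value, "unique ownership");
static_assert(std::is_move_constructible<UniqueWidget>::value, "ownership can be moved");

// Exercises the compiler-generated operations; no special member is written by hand.
inline void checkSimplifiedWidgets() {
    SharedWidget a;
    SharedWidget b(a);                              // generated copy constructor
    b.m_resourcePtr->m_data = "written through b";
    assert(a.m_resourcePtr->m_data == "written through b");  // same Resource
    assert(a.m_resourcePtr.use_count() == 2);

    UniqueWidget u;
    UniqueWidget v = std::move(u);                  // generated move constructor
    assert(u.m_resourcePtr == nullptr);             // ownership left u
    assert(v.m_resourcePtr != nullptr);
}
```

If independent copies are required, neither smart pointer provides them automatically; that case still calls for a hand-written deep copy or a value member.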

Finally, the code for this series can be downloaded from GitHub: the link is here
