[Reading Notes] Universal References in C++11 part 2: Move semantics, lvalue reference and rvalue reference


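Before C++11, a major performance bottleneck was that a running program creates many unnecessary temporary objects and then performs many expensive copies of them. As the previous post explained, C++11 introduced the "T&&" syntax to address this. You might ask: didn't we already have a mechanism for that? Doesn't declaring a function parameter as a reference or a pointer avoid the needless copy? Consider the three copy functions below (myCopyFunc1 to myCopyFunc3), whose job is to copy the argument into the global object g_myBackupWidget: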

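#include <cstdlib>   // system
#include <iostream>  // std::cout, std::endl
#include <string>    // std::string

using std::cout;
using std::endl;

class Resource{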
public:
    std::string m_data;
};
class Widget{
public:
    Widget()
        : m_pHeapStorageResource(nullptr), m_i(0){ // also initialize m_i so that copying it never reads an indeterminate value
    }
public:
    Resource* m_pHeapStorageResource;
    Resource m_stackStorageResource;
    int m_i;
};
static Widget g_myBackupWidget;
void myCopyFunc1(Widget param){
    g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource;
    g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;
    g_myBackupWidget.m_i = param.m_i;

    if(param.m_pHeapStorageResource){
        param.m_pHeapStorageResource->m_data = "myCopyFunc1 param heap Storage: change data here will change origin widget";
    }
    param.m_stackStorageResource.m_data = "myCopyFunc1 param stack storage: change data here will not change origin widget";
}
void myCopyFunc2(Widget& param){
    g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource;
    g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;
    g_myBackupWidget.m_i = param.m_i;
    if(param.m_pHeapStorageResource){
        param.m_pHeapStorageResource->m_data = "myCopyFunc2 param heap Storage: change data here will change origin widget";
    }
    param.m_stackStorageResource.m_data = "myCopyFunc2 param stack storage: change data here will change origin widget";
}
void myCopyFunc3(Widget* param){
    g_myBackupWidget.m_pHeapStorageResource = param->m_pHeapStorageResource;
    g_myBackupWidget.m_stackStorageResource = param->m_stackStorageResource;
    g_myBackupWidget.m_i = param->m_i;

    if(param->m_pHeapStorageResource){
        param->m_pHeapStorageResource->m_data = "myCopyFunc3 param heap Storage: change data here will change origin widget";
    }
    param->m_stackStorageResource.m_data = "myCopyFunc3 param stack storage: change data here will change origin widget";
}

int main(){
    Widget origin;
    origin.m_pHeapStorageResource = new Resource();
    origin.m_pHeapStorageResource->m_data = "data are allocated on heap Storage";
    origin.m_stackStorageResource.m_data = "data are allocated on stack storage";

    cout << origin.m_pHeapStorageResource->m_data.c_str() << endl;
    cout << origin.m_stackStorageResource.m_data.c_str() << endl;
    myCopyFunc1(origin); // call by value: a new param object is created and origin's members are copied into it member-wise
    cout << origin.m_pHeapStorageResource->m_data.c_str() << endl;
    cout << origin.m_stackStorageResource.m_data.c_str() << endl;
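    myCopyFunc2(origin); // call by reference
    // myCopyFunc2(Widget()); // call by reference with a temporary object. Standard C++ does not let a temporary bind to Widget& (MSVC accepts it as an extension); the call is portable only if the parameter is const Widget&.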

    cout << origin.m_pHeapStorageResource->m_data.c_str() << endl;
    cout << origin.m_stackStorageResource.m_data.c_str() << endl;
    myCopyFunc3(&origin); // call by pointer

    cout << origin.m_pHeapStorageResource->m_data.c_str() << endl;
    cout << origin.m_stackStorageResource.m_data.c_str() << endl;
    delete origin.m_pHeapStorageResource;
    origin.m_pHeapStorageResource = nullptr;
    system("pause");
    return 0;
}



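Although myCopyFunc2 and myCopyFunc3 avoid the cost of copying the object, they do not solve the "temporary object" problem: inside myCopyFunc2 there is no way to tell whether the caller passed an lvalue (Widget origin) or an automatically created temporary. (Standard C++ only lets a temporary bind to a const Widget& parameter, but either way the function body sees the same kind of reference.)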

In technical terms, a temporary object is essentially an rvalue.

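Note: the previous post already covered this, but if telling lvalues from rvalues still feels confusing, here is a simple rule of thumb. First, check whether the thing has a name; if it does, it is an lvalue. To be a little more precise, try taking the address of the expression with &: if that works, it is an lvalue, otherwise it is an rvalue. A temporary object is therefore an rvalue.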
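The rule of thumb above can also be checked at compile time. The following is a minimal sketch, not part of the original post: IS_LVALUE, makeName and the three helper functions are hypothetical names introduced only for illustration. It relies on the fact that decltype((expr)) yields a T& type exactly when expr is an lvalue.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper macro (not from the post): true when expr is an lvalue.
// decltype((expr)) is T& for an lvalue, T&& for an xvalue and T for a prvalue.
#define IS_LVALUE(expr) (std::is_lvalue_reference<decltype((expr))>::value)

// Hypothetical factory: the call expression makeName() is a temporary (rvalue).
std::string makeName() { return "temporary"; }

bool namedVariableIsLvalue() {
    std::string name = "named";
    std::string* address = &name;  // a named object: taking its address compiles
    (void)address;
    return IS_LVALUE(name);
}

bool temporaryIsLvalue() {
    // std::string* p = &makeName();  // would not compile: cannot take the address of a temporary
    return IS_LVALUE(makeName());
}

bool integerLiteralIsLvalue() {
    return IS_LVALUE(42);  // an integer literal has no name and no address
}
```

Here namedVariableIsLvalue() reports an lvalue, while the temporary and the integer literal do not, which matches the "has a name / can take its address" test.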

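If we could tell for certain that the argument is an rvalue, then myCopyFunc2 could safely "transfer" param's m_pHeapStorageResource straight to g_myBackupWidget, without worrying about whether a Widget such as origin will delete m_pHeapStorageResource in its destructor. This "transfer" of pointer ownership is the core idea behind the whole Universal References topic, and it is what removes the copying cost that temporary objects bring.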

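Some readers may have noticed that the copy functions above do roughly what the default copy constructor or the default assignment operator does: a member-wise copy. What is a member-wise copy? The listing above already hints at it, so let us spell it out: each member is copied with its own copy constructor or copy assignment operator. If the class does not implement operator= itself, a pointer member is copied as a pointer value only (a shallow copy). Handled carelessly, this kind of resource management often leaves dangling pointers behind and crashes the program. The usual fix is to provide a deepCopy function and implement operator= on top of it. The example here can be rewritten as follows: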

class Resource{
public:
    Resource* deepCopy() const{ // const: copying does not modify the source; the caller owns the returned object
        Resource* pResource = new Resource();
        pResource->m_data = m_data;
        return pResource;
    }
public:
    std::string m_data;
};
void myCopyFunc2(const Widget& param){
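    // Deep-copy the heap resource so the backup never shares a pointer with param (a null source stays null)
    g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource ? param.m_pHeapStorageResource->deepCopy() : nullptr;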
    g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;
    g_myBackupWidget.m_i = param.m_i;
    if(param.m_pHeapStorageResource){
        param.m_pHeapStorageResource->m_data = "myCopyFunc2 param heap Storage: change data here will change origin widget";
    }
    //param.m_stackStorageResource.m_data = "myCopyFunc2 param stack storage: change data here will change origin widget";
}
int main(){ // main must return int in standard C++
    Widget origin;
    origin.m_pHeapStorageResource = new Resource();
    myCopyFunc2(origin); // call by reference
}
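To make the shallow-versus-deep distinction described above concrete, here is a small self-contained sketch. Payload, ShallowHolder and DeepHolder are hypothetical types invented for this illustration; they are not the post's Resource and Widget.

```cpp
#include <string>

struct Payload { std::string text; };

// Relies on the compiler-generated (member-wise) copy: only the pointer value is copied.
struct ShallowHolder { Payload* payload; };

// Owns its Payload and copies the pointed-to object, not just the pointer.
struct DeepHolder {
    Payload* payload;
    explicit DeepHolder(const std::string& text) : payload(new Payload{text}) {}
    DeepHolder(const DeepHolder& other) : payload(new Payload(*other.payload)) {}
    DeepHolder& operator=(const DeepHolder& other) {
        if (this != &other) {
            Payload* fresh = new Payload(*other.payload);  // copy first, then release
            delete payload;
            payload = fresh;
        }
        return *this;
    }
    ~DeepHolder() { delete payload; }
};

bool shallowCopySharesPointee() {
    Payload shared{"shared"};
    ShallowHolder a{&shared};
    ShallowHolder b = a;                  // member-wise copy
    b.payload->text = "changed through b";
    return a.payload == b.payload && a.payload->text == "changed through b";
}

bool deepCopyIsIndependent() {
    DeepHolder a("original");
    DeepHolder b = a;                     // deep copy
    b.payload->text = "changed through b";
    return a.payload != b.payload && a.payload->text == "original";
}
```

With the shallow copy, a change made through b is visible through a because both point at the same Payload. With the deep copy, the two objects are independent, at the price of an extra allocation and copy.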

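Notice that the Resource class now has a deepCopy function. This safeguard against dangling pointers is also the root cause of the copying cost. Conceptually, deep-copying an lvalue is perfectly reasonable, but deep-copying an rvalue is pure waste. If we know the argument is an rvalue, we can simply "transfer" ownership of the pointer instead. This is why C++11 provides the && syntax: it lets us single out the functions that accept rvalue arguments. To avoid confusion, we will call the form void myCopyFunc2(Widget& param) an lvalue reference, and the form void myCopyFunc2(Widget&& param) an rvalue reference.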

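Wait, didn't the previous post call this kind of function a universal reference? Why the sudden change of name? The term is not a typo. "Universal reference" usually refers to the template case, a T&& parameter whose type T has to be deduced at compile time. Let us not complicate things yet: for now, think of a function that accepts rvalues as taking an rvalue reference, which is exactly what it is here.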
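The difference between the two terms can be sketched in a few lines. This is an illustration under my own assumptions, not code from the post; deducedAsLvalue and onlyRvalues are hypothetical names.

```cpp
#include <string>
#include <type_traits>

// T&& with a deduced T is a universal reference: it binds to lvalues and rvalues.
// For an lvalue argument T is deduced as std::string&, for an rvalue as std::string.
template <typename T>
bool deducedAsLvalue(T&&) {
    return std::is_lvalue_reference<T>::value;
}

// No deduction here: std::string&& is a plain rvalue reference and accepts rvalues only.
bool onlyRvalues(std::string&&) {
    return true;
}

// std::string named = "named";
// onlyRvalues(named);  // would not compile: an rvalue reference cannot bind to an lvalue
```

Calling deducedAsLvalue with a named std::string reports an lvalue, while calling it with std::string("temp") does not. onlyRvalues only compiles for the second kind of call.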

Now let us introduce an rvalue reference and see what happens:

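#include <cstdlib>   // system
#include <iostream>  // std::cout, std::endl
#include <string>    // std::string

using std::cout;
using std::endl;

class Resource{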
public:
    Resource* deepCopy() const{ // const: copying does not modify the source; the caller owns the returned object
        Resource* pResource = new Resource();
        pResource->m_data = m_data;
        return pResource;
    }
public:
    std::string m_data;
};
class Widget{
public:
    /** Default constructor */
    Widget() : m_i(0){ // initialize m_i so that copying it never reads an indeterminate value
        m_pHeapStorageResource = new Resource();
        m_pHeapStorageResource->m_data = "data are allocated on heap Storage";
    }
    /** Copy constructor */
    Widget(const Widget& other){
        memberwiseCopy(other);
        m_pHeapStorageResource->m_data = "data are allocated on heap Storage";
    }
    /** Destructor */
    ~Widget(){
        if(m_pHeapStorageResource){
            delete m_pHeapStorageResource;
            m_pHeapStorageResource = nullptr;
        }
    }
    /** Copy assignment operator */
    Widget& operator= (const Widget& other)
    {
        memberwiseCopy(other);
        return *this;
    }

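    // Simulates the compiler-generated copy: a shallow, member-by-member copy.
    // Only the pointer value is copied, so both Widgets end up pointing at the same Resource.
    void memberwiseCopy(const Widget& other){
        m_pHeapStorageResource = other.m_pHeapStorageResource;
        m_stackStorageResource = other.m_stackStorageResource;
        m_i = other.m_i;
    }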
public:
    Resource* m_pHeapStorageResource;
    Resource m_stackStorageResource;
    int m_i;
};
static Widget g_myBackupWidget;
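void myCopyFunc2(const Widget& param){
    // Deep-copy first, then release what the backup currently owns, so nothing leaks
    Resource* pCopy = param.m_pHeapStorageResource ? param.m_pHeapStorageResource->deepCopy() : nullptr;
    delete g_myBackupWidget.m_pHeapStorageResource;
    g_myBackupWidget.m_pHeapStorageResource = pCopy;
    g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;
    g_myBackupWidget.m_i = param.m_i;
    if(param.m_pHeapStorageResource){
        param.m_pHeapStorageResource->m_data = "myCopyFunc2 param heap Storage: change data here will change origin widget";
    }
}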

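void myCopyFunc2(Widget&& param){
    if(&param != &g_myBackupWidget){
        delete g_myBackupWidget.m_pHeapStorageResource; // release the resource the backup currently owns
        g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource; //move: take over the pointer
        param.m_pHeapStorageResource = nullptr; // param no longer owns it, so its destructor will not delete it
    }
    g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;//no move, will copy
    g_myBackupWidget.m_i = param.m_i; //copy
    if(g_myBackupWidget.m_pHeapStorageResource){
        g_myBackupWidget.m_pHeapStorageResource->m_data = "myCopyFunc2 rvalue param heap Storage: move param.m_pHeapStorageResource to g_myBackupWidget ";
    }
}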

int main(){
    Widget origin;

    myCopyFunc2(origin); // calls void myCopyFunc2(const Widget& param)
    cout << origin.m_pHeapStorageResource->m_data.c_str() << endl;
    cout << origin.m_stackStorageResource.m_data.c_str() << endl;
    myCopyFunc2(Widget()); // calls void myCopyFunc2(Widget&& param)
    cout << g_myBackupWidget.m_pHeapStorageResource->m_data.c_str() << endl;
    cout << g_myBackupWidget.m_stackStorageResource.m_data.c_str() << endl;

    system("pause");
    return 0;
}



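As the program above shows, the rvalue reference overload does not conflict with the lvalue reference overload at all; it is only there to cover what the lvalue reference cannot do. If no rvalue reference overload is declared, the compiler falls back to the lvalue reference version, provided that version takes a const Widget&.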
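The overload selection and the fallback described above can be summarized in a short sketch. The function names both and onlyConstRef are hypothetical, and std::string stands in for Widget to keep the example small.

```cpp
#include <string>
#include <type_traits>

// Two overloads: the compiler picks by the value category of the argument.
std::string both(const std::string&) { return "const lvalue reference overload"; }
std::string both(std::string&&)      { return "rvalue reference overload"; }

// Only a const lvalue reference version: temporaries fall back to it.
std::string onlyConstRef(const std::string&) { return "const lvalue reference overload"; }

// The binding rules behind this, checked at compile time:
static_assert(!std::is_convertible<std::string, std::string&>::value,
              "a temporary cannot bind to a non-const lvalue reference");
static_assert(std::is_convertible<std::string, const std::string&>::value,
              "a temporary can bind to a const lvalue reference");
static_assert(std::is_convertible<std::string, std::string&&>::value,
              "a temporary can bind to an rvalue reference");
static_assert(!std::is_convertible<std::string&, std::string&&>::value,
              "an lvalue cannot bind to an rvalue reference");

std::string callBothWithNamed()     { std::string s = "x"; return both(s); }
std::string callBothWithTemporary() { return both(std::string("x")); }
std::string callFallback()          { return onlyConstRef(std::string("x")); }
```

A named string selects the const lvalue reference overload and a temporary selects the rvalue reference overload. When only onlyConstRef exists, the temporary still compiles and is simply treated like any other const reference.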

Phew! We have finally covered move semantics, lvalue references and rvalue references. What, move semantics has not been covered? Take a closer look at these two lines:

g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource; //move
param.m_pHeapStorageResource = nullptr;

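Move is exactly this idea of transferring pointer ownership. The key is the second line, which sets the pointer to nullptr: it releases param's ownership of m_pHeapStorageResource, so we no longer have to worry about the Widget deleting the resource in its destructor.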
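The same two-step pattern (take the pointer, then null the source) can be isolated in a small sketch. Buffer, Owner and takeFrom are hypothetical names used only for illustration, and the static_cast to Owner&& is written out by hand to show that the argument is being treated as an rvalue.

```cpp
#include <string>

struct Buffer { std::string data; };

// Hypothetical owning type: its destructor deletes the buffer it holds.
struct Owner {
    Buffer* buffer;
    Owner() : buffer(nullptr) {}
    explicit Owner(const std::string& text) : buffer(new Buffer{text}) {}
    ~Owner() { delete buffer; }          // deleting a null pointer is a no-op
    Owner(const Owner&) = delete;        // no accidental shallow copies
    Owner& operator=(const Owner&) = delete;

    // Takes over the source's buffer instead of copying it.
    void takeFrom(Owner&& source) {
        if (this == &source) return;
        delete buffer;                   // release what we currently own
        buffer = source.buffer;          // step 1: take the pointer
        source.buffer = nullptr;         // step 2: the source gives up ownership
    }
};

bool ownershipMoves() {
    Owner destination;
    {
        Owner source("payload");
        Buffer* raw = source.buffer;
        destination.takeFrom(static_cast<Owner&&>(source));  // treat source as an rvalue
        if (source.buffer != nullptr || destination.buffer != raw) return false;
    }   // source is destroyed here without touching the transferred buffer
    return destination.buffer->data == "payload";
}
```

No Buffer is copied at any point: the destination ends up with the very same object the source allocated, and the source's destructor has nothing left to delete.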

 

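At this point we have covered several of the core ideas behind Universal References. I do not know how many more posts it will take to cover everything, but once we have gone through the rule of three (five), std::move, std::forward, reference collapsing and RVO (return value optimization), it should be clear what Universal References really are and how important a role they play in C++11.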

To finish, here is a quick preview of what std::move and std::forward are for:

std::move: casts an lvalue to an rvalue. Usage: myCopyFunc2(std::move(origin))

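std::forward: passes an argument on to another function with its value category preserved. If the argument came in as an rvalue, it is forwarded as an rvalue so that an rvalue reference overload can accept it; if it came in as an lvalue, it stays an lvalue.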
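As a preview of both utilities, here is a minimal sketch under my own assumptions. sink, relayWithForward, relayWithoutForward and the two helper functions are hypothetical names; std::move, std::forward and std::unique_ptr are the standard library facilities.

```cpp
#include <memory>
#include <string>
#include <utility>

// Reports which overload was selected.
std::string sink(const std::string&) { return "lvalue"; }
std::string sink(std::string&&)      { return "rvalue"; }

// std::forward keeps the value category the caller used.
template <typename T>
std::string relayWithForward(T&& arg) {
    return sink(std::forward<T>(arg));
}

// Without std::forward the parameter has a name, so it is always passed on as an lvalue.
template <typename T>
std::string relayWithoutForward(T&& arg) {
    return sink(arg);
}

// std::move only casts to an rvalue; here it merely changes which overload is chosen.
std::string moveSelectsOverload() {
    std::string named = "named";
    return sink(std::move(named));
}

// With a type that implements move, the cast lets ownership be transferred.
bool moveTransfersUniquePtr() {
    std::unique_ptr<int> first(new int(7));
    std::unique_ptr<int> second = std::move(first);  // first gives up the int
    return first == nullptr && second != nullptr && *second == 7;
}
```

relayWithForward reports "lvalue" for a named string and "rvalue" for a temporary, while relayWithoutForward reports "lvalue" even for a temporary. That difference is the reason std::forward exists.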
