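Before C++11, one of the biggest performance bottlenecks was that a running program created many unnecessary temporary objects and performed many expensive copy operations. From the previous post we know that C++11 introduced the "T&&" syntax to address this. Some may ask: wasn't there already a mechanism that solves this problem? Doesn't declaring a function parameter as a reference or a pointer avoid the needless copy? Consider the three myCopyFunc variants implemented below, whose main job is to copy the argument into the global object g_myBackupWidget: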
    #include <iostream>
    #include <string>

    using namespace std;

    class Resource {
    public:
        std::string m_data;
    };

    class Widget {
    public:
        Widget() : m_pHeapStorageResource(nullptr), m_i(0) {}

    public:
        Resource* m_pHeapStorageResource;
        Resource m_stackStorageResource;
        int m_i;
    };

    static Widget g_myBackupWidget;

    // Pass by value: param is a member-wise copy of the caller's Widget.
    void myCopyFunc1(Widget param) {
        g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource;
        g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;
        g_myBackupWidget.m_i = param.m_i;
        if (param.m_pHeapStorageResource) {
            param.m_pHeapStorageResource->m_data = "myCopyFunc1 param heap Storage: change data here will change origin widget";
        }
        param.m_stackStorageResource.m_data = "myCopyFunc1 param stack storage: change data here will not change origin widget";
    }

    // Pass by reference: param is the caller's Widget itself.
    void myCopyFunc2(Widget& param) {
        g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource;
        g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;
        g_myBackupWidget.m_i = param.m_i;
        if (param.m_pHeapStorageResource) {
            param.m_pHeapStorageResource->m_data = "myCopyFunc2 param heap Storage: change data here will change origin widget";
        }
        param.m_stackStorageResource.m_data = "myCopyFunc2 param stack storage: change data here will change origin widget";
    }

    // Pass by pointer: param points at the caller's Widget.
    void myCopyFunc3(Widget* param) {
        g_myBackupWidget.m_pHeapStorageResource = param->m_pHeapStorageResource;
        g_myBackupWidget.m_stackStorageResource = param->m_stackStorageResource;
        g_myBackupWidget.m_i = param->m_i;
        if (param->m_pHeapStorageResource) {
            param->m_pHeapStorageResource->m_data = "myCopyFunc3 param heap Storage: change data here will change origin widget";
        }
        param->m_stackStorageResource.m_data = "myCopyFunc3 param stack storage: change data here will change origin widget";
    }

    int main() {
        Widget origin;
        origin.m_pHeapStorageResource = new Resource();
        origin.m_pHeapStorageResource->m_data = "data are allocated on heap Storage";
        origin.m_stackStorageResource.m_data = "data are allocated on stack storage";
        cout << origin.m_pHeapStorageResource->m_data << endl;
        cout << origin.m_stackStorageResource.m_data << endl;

        myCopyFunc1(origin); // call by value: creates a new param object and member-wise copies origin's members into it
        cout << origin.m_pHeapStorageResource->m_data << endl;
        cout << origin.m_stackStorageResource.m_data << endl;

        myCopyFunc2(origin); // call by reference
        // myCopyFunc2(Widget()); // call by reference with a temporary object.
        // Standard C++ rejects this line because a temporary cannot bind to Widget&.
        // It compiles only with a compiler extension such as MSVC's, or if the
        // parameter is declared as const Widget&.
        cout << origin.m_pHeapStorageResource->m_data << endl;
        cout << origin.m_stackStorageResource.m_data << endl;

        myCopyFunc3(&origin); // call by pointer
        cout << origin.m_pHeapStorageResource->m_data << endl;
        cout << origin.m_stackStorageResource.m_data << endl;

        delete origin.m_pHeapStorageResource;
        origin.m_pHeapStorageResource = nullptr;
        g_myBackupWidget.m_pHeapStorageResource = nullptr; // the backup only aliased origin's Resource
        return 0;
    }
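Although myCopyFunc2 and myCopyFunc3 avoid the cost of copying the object, they do not solve the "temporary object" problem. Standard C++ lets a temporary bind to an lvalue reference only when that reference is const (a call like myCopyFunc2(Widget()) against a Widget& parameter is a compiler extension), and either way, inside myCopyFunc2 we cannot tell whether the caller passed an lvalue (Widget origin) or an automatically generated temporary object.
In proper terminology, a temporary object is an rvalue.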
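Note: this was covered last time, but are you still a little unsure how to tell an lvalue from an rvalue? No problem, here is a simple rule of thumb. First, check whether the expression has a name; if it does, it is an lvalue. To be more precise, try taking the address of the expression with &: if that works, it is an lvalue, and otherwise it is an rvalue. A temporary object is therefore an rvalue.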
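If we could tell for certain that the argument is an rvalue, then when myCopyFunc2 is called we could safely "transfer" the m_pHeapStorageResource owned by param directly to g_myBackupWidget, without worrying whether a Widget such as origin will later delete m_pHeapStorageResource in its destructor. Yes: "transferring" ownership of a pointer is the core idea of move semantics, on which this whole Universal References topic rests, and it is what cuts the copying cost that temporary objects bring.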
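Some readers may already have noticed that the copy functions above do much the same thing as the default copy constructor or default assignment operator: a member-wise copy. What is a member-wise copy? The listing above gives a feel for it, so let's state it more clearly: a member-wise copy copies every member in turn, using that member's own copy constructor or copy assignment operator. If our class does not implement operator= itself, a pointer member is copied as a pointer and nothing more (a shallow copy). In memory management this easily produces a dangling pointer and then a crash if it is not handled carefully. The usual fix is to provide a deepCopy function and to write your own operator=. The example here can be rewritten as follows: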
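    #include <string>

    class Resource {
    public:
        Resource* deepCopy() const {
            Resource* pResource = new Resource();
            pResource->m_data = m_data;
            return pResource;
        }
        std::string m_data;
    };

    class Widget {
    public:
        Widget() : m_pHeapStorageResource(nullptr), m_i(0) {}
        Resource* m_pHeapStorageResource;
        Resource m_stackStorageResource;
        int m_i;
    };

    static Widget g_myBackupWidget;

    void myCopyFunc2(const Widget& param) {
        // Deep copy, guarded against a null pointer.
        Resource* pCopy = param.m_pHeapStorageResource ? param.m_pHeapStorageResource->deepCopy() : nullptr;
        delete g_myBackupWidget.m_pHeapStorageResource; // free the previous backup
        g_myBackupWidget.m_pHeapStorageResource = pCopy;
        g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;
        g_myBackupWidget.m_i = param.m_i;
        if (param.m_pHeapStorageResource) {
            // Legal despite const: the pointer is const, the Resource it points to is not.
            param.m_pHeapStorageResource->m_data = "myCopyFunc2 param heap Storage: change data here will change origin widget";
        }
        // Would not compile, because param is const:
        //param.m_stackStorageResource.m_data = "myCopyFunc2 param stack storage: change data here will change origin widget";
    }

    int main() {
        Widget origin;
        origin.m_pHeapStorageResource = new Resource();
        myCopyFunc2(origin); // call by reference
        delete origin.m_pHeapStorageResource;
        delete g_myBackupWidget.m_pHeapStorageResource;
        return 0;
    }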
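We can see that the Resource class has gained a deepCopy. This safeguard against dangling pointers is also the root cause of the copying cost. Conceptually, deep-copying an lvalue is entirely reasonable, but deep-copying an rvalue is pure waste. If we know that the argument is an rvalue, we can handle it by "transferring" ownership of the pointer instead. That is why C++11 provides the && syntax: it lets a function that accepts rvalue arguments be told apart. To avoid confusion, we call the parameter form in void myCopyFunc2(Widget& param) an lvalue reference, and the form in void myCopyFunc2(Widget&& param) an rvalue reference.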
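Wait, didn't the previous post call this form a universal reference? Why the sudden change of name? The term is not a typo. "Universal reference" normally refers to the templated case, where the parameter is written T&& and the type T has to be deduced at compile time. Let's not make things that complicated yet: for now, simply read a function parameter that accepts rvalues as an rvalue reference, which is indeed what it is.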
Now let's bring in an rvalue reference and see what happens:
    #include <iostream>
    #include <string>

    using namespace std;

    class Resource {
    public:
        Resource* deepCopy() const {
            Resource* pResource = new Resource();
            pResource->m_data = m_data;
            return pResource;
        }

    public:
        std::string m_data;
    };

    class Widget {
    public:
        /** Default constructor */
        Widget() : m_i(0) {
            m_pHeapStorageResource = new Resource();
            m_pHeapStorageResource->m_data = "data are allocated on heap Storage";
        }

        /** Copy constructor */
        Widget(const Widget& other) : m_pHeapStorageResource(nullptr), m_i(0) {
            copyFrom(other);
        }

        /** Destructor */
        ~Widget() {
            delete m_pHeapStorageResource; // deleting nullptr is a no-op
            m_pHeapStorageResource = nullptr;
        }

        /** Copy assignment operator */
        Widget& operator=(const Widget& other) {
            if (this != &other) {
                copyFrom(other);
            }
            return *this;
        }

        // Unlike the compiler-generated member-wise copy, this deep-copies the heap
        // Resource. A shallow copy here would let two Widgets delete the same pointer.
        void copyFrom(const Widget& other) {
            Resource* pCopy = other.m_pHeapStorageResource ? other.m_pHeapStorageResource->deepCopy() : nullptr;
            delete m_pHeapStorageResource;
            m_pHeapStorageResource = pCopy;
            m_stackStorageResource = other.m_stackStorageResource;
            m_i = other.m_i;
        }

    public:
        Resource* m_pHeapStorageResource;
        Resource m_stackStorageResource;
        int m_i;
    };

    static Widget g_myBackupWidget;

    void myCopyFunc2(const Widget& param) {
        Resource* pCopy = param.m_pHeapStorageResource ? param.m_pHeapStorageResource->deepCopy() : nullptr;
        delete g_myBackupWidget.m_pHeapStorageResource; // free the previous backup
        g_myBackupWidget.m_pHeapStorageResource = pCopy;
        g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource;
        g_myBackupWidget.m_i = param.m_i;
        if (param.m_pHeapStorageResource) {
            param.m_pHeapStorageResource->m_data = "myCopyFunc2 param heap Storage: change data here will change origin widget";
        }
    }

    void myCopyFunc2(Widget&& param) {
        Resource* pOld = g_myBackupWidget.m_pHeapStorageResource;
        g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource; // move
        param.m_pHeapStorageResource = nullptr;
        delete pOld; // free the Resource the backup held before
        g_myBackupWidget.m_stackStorageResource = param.m_stackStorageResource; // no move, will copy
        g_myBackupWidget.m_i = param.m_i; // copy
        if (g_myBackupWidget.m_pHeapStorageResource) {
            g_myBackupWidget.m_pHeapStorageResource->m_data = "myCopyFunc2 rvalue param heap Storage: move param.m_pHeapStorageResource to g_myBackupWidget ";
        }
    }

    int main() {
        Widget origin;
        myCopyFunc2(origin); // calls void myCopyFunc2(const Widget& param)
        cout << origin.m_pHeapStorageResource->m_data << endl;
        cout << origin.m_stackStorageResource.m_data << endl;
        myCopyFunc2(Widget()); // calls void myCopyFunc2(Widget&& param)
        cout << g_myBackupWidget.m_pHeapStorageResource->m_data << endl;
        cout << g_myBackupWidget.m_stackStorageResource.m_data << endl;
        return 0;
    }
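The program above shows that an rvalue reference does not conflict with an lvalue reference at all; it is designed purely to cover what the lvalue reference cannot do. If we do not declare an rvalue reference overload, C++ automatically falls back to the const lvalue reference version.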
Phew! We have finally covered move semantics, lvalue references and rvalue references. What? Move semantics hasn't been explained yet? Look closely at these two lines:
    g_myBackupWidget.m_pHeapStorageResource = param.m_pHeapStorageResource; // move
    param.m_pHeapStorageResource = nullptr;
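"Move" simply means transferring ownership of a pointer. The key is the second line, which sets the pointer to nullptr: it releases param's ownership of m_pHeapStorageResource, so there is no need to worry about the temporary Widget freeing the resource in its destructor.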
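At this point we have covered several core ideas behind Universal References. I don't know how much more space it will take to discuss everything, but once the rule of three (five), std::move, std::forward, reference collapsing and RVO (return value optimization) have been covered, it will be clear what Universal References really are and how important a role they play in C++11.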
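Finally, a quick preview of what the two functions std::move and std::forward are for:
std::move: casts an lvalue to an rvalue. It moves nothing by itself; it only makes the argument eligible for the rvalue overload. Usage: myCopyFunc2(std::move(origin))
std::forward: used inside a function template to pass a parameter on to another function. An argument that came in as an rvalue is passed on as an rvalue to a function taking an rvalue reference; an argument that came in as an lvalue stays an lvalue.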