矩陣鏈乘積
维基百科,自由的百科全书
矩阵链乘积是可用動態規劃解决的最佳化问题。給定一序列矩陣,期望求出相乘這些矩陣的最有效方法。此問題並不是真的去執行其乘法,而只是決定執行乘法的順序而已。
因為矩陣乘法具有結合律,所有其運算順序有很多種選擇。換句話說,不論如何括號其乘積,最後結果都會是一樣的。例如,若有四個矩陣A、B、C和D,將可以有:
- (ABC)D = (AB)(CD) = A(BCD) = A(BC)D = ...
但括號其乘積的順序是會影響到需計算乘積所需簡單算術運算的數目,即其效率。例如,設A為一10×30矩陣,B為30×5矩陣與C為5×60矩陣,則
- (AB)C 有 (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 個運算
- A(BC) 有 (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 個運算
明顯地,第一種方式要有效多了。既然以確認過此問題了,那要如何決定n個矩陣相乘的最佳順序呢?可以比較每一順序的運算量(使用蠻力),但這將需要時間O(2n),是一種非常慢且對大n不實在的方法。那解決方法,如我們將看到的,是將問題分成一套相關的子問題。以解答子問題一次而再使用其解答數次,即可以徹底地得出其所需時間。此一方法稱為動態規劃。
目录 |
[编辑] 算法
一開始,假定真的想知道的是乘完矩陣所需的最小成本,或算術運算的最小量。若只有兩個矩陣相乘,則只會有一種方法去乘它們,所有其最小成本為乘積的成本。一般地,可以用下列的遞迴演算法求出最小成本:
- 取得矩陣的序列且將其分成兩個子序列。
- 找出乘完每一子序列的最小成本。
- 將成本加起來,並加上兩個結果矩陣相乘的成本。
- 在每一矩陣序列可分開的位置運作,並取其最小值。
例如,若有四矩陣ABCD,計算每一分法(A)(BCD)、(AB)(CD)和(ABC)(D)所需的成本,遞迴計算ABC、AB、CD和BCD的最小成本。然後選擇最好的一個。這方法不只找出其最小成本,也決定做這乘積的最好辦法:儘只是以最小總成本分開,並對每一因子做相同的事情。
不幸地,若真實作此演算法,將會發現它和比較每一順序的運算量一樣慢!發生什麼事了嗎?答案是因為多做了許多多餘的工作。例如,上述遞迴計算了ABC與AB以找到最小成本,但在找出ABC的最小成本時,亦需要去找出AB的最小成本。當遞迴較深時,會有越來越多這種不需要的重複產生。
一簡單的解決方法為備忘錄法:每次計算乘完一特定子序列所需的最小成本時,儲存其數值。若再遇到相同子序列時,便只需讀取已存的答案,而不需要再重計算一次。當n個矩陣會有約n2/2個不同的子序列,其所需空間是合理的。可證明此一簡單的技術可使得運算時間由O(2n)降至O(n3),使得其對實際應用有足夠的效率。此為由上而下動態規劃。
偽代碼:
Matrix-Chain-Order(int p[]) { n = p.length - 1; for (i = 1; i <= n; i++) m[i,i] = 0; for (l=2; l<=n; l++) { // l is chain length for (i=1; i<=n-l+1; i++) { j = i+l-1; m[i,j] = MAXINT; for (k=i; k<=j-1; k++) { q = m[i,k] + m[k+1,j] + p[i-1]*p[k]*p[j];//Matrix Ai has the dimension p[i-1] x p[i]. if (q < m[i,j]) { m[i,j] = q; s[i,j] = k; } } } } }
另一種解決方法是預先知道需要計算的成本並事先計算它們。其運作如下:
- 對每一由2至矩陣數目n的k:
- 計算長度k的子序列的最小成本,使用已計算過的成本。
如此做過之後,將可以得到整個序列的最小成本。即使其亦需要O(n3)的時間,此一方法有其實作的優點,它不需要使用遞迴,不需要測試是否為已計算的值,而且可以丟掉一些已不需要的結果來節省空間。此為由下而上動態規劃:可解答此問題的第二種方法。
[编辑] 此算法在其他领域的用途
Although this algorithm applies well to the problem of matrix chain multiplication, researchers such as Gerald Baumgartner have noted that it generalizes well to solving a more abstract problem: given a linear sequence of objects, an associative binary operation on those objects, and a way to compute the cost of performing that operation on any two given objects (as well as all partial results), compute the minimum cost way to group the objects to apply the operation over the sequence.
One common special case of this is string concatenation. Say we have a list of strings. In C, for example, the cost of concatenating two strings of length m and n using strcat is O(m + n), since we need O(m) time to find the end of the first string and O(n) time to copy the second string onto the end of it. Using this cost function, we can write a dynamic programming algorithm to find the fastest way to concatenate a sequence of strings (although this is rather useless, since we can concatenate them all in time proportional to the sum of their lengths). A similar problem exists for singly-linked lists.
Another generalization is to solve the problem when many parallel processors are available. In this case, instead of adding the costs of computing each subsequence, we just take the maximum, because we can do them both simultaneously. This can drastically affect both the minimum cost and the final optimal grouping; more "balanced" groupings that keep all the processors busy are favored. Heejo Lee et al. describe even more sophisticated approaches.
[编辑] 實作
[编辑] 引用
- Viv. Dynamic Programming. A 1995 introductory article on dynamic programming.
- Cormen, T.H., C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms. McGraw-Hill, New York, NY. 1990. ISBN 0-262-03293-7. Section 15.2. The most popular reference where people encounter this algorithm.
- G. Baumgartner, D. Bernholdt, D. Cociorva, R. Harrison, M. Nooijen, J. Ramanujam and P. Sadayappan. A Performance Optimization Framework for Compilation of Tensor Contraction Expressions into Parallel Programs. 7th International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS '02). Fort Lauderdale, Florida. 2002.
- Heejo Lee, Jong Kim, Sungje Hong, and Sunggu Lee. Parallelizing matrix chain products. Tech. Rep. CS-HPC-97003, Pohang University of Science and Technology. 1997.
- Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 15.2: Matrix-chain multiplication, pp.331–339.