慕标5832272
矩阵求幂确实是正确的方法,但还有一些工作要做。由于g(n)不是常数值,因此无法有效地(O(log n)而不是O(n))将矩阵求幂应用于当前形式的递推关系。需要找到类似的递归关系,g(n)只有一个常数项尾随。由于g(n)是三次方,因此需要 3 个递归项:g(n) = x*g(n-1) + y*g(n-2) + z*g(n-3) + w展开它们每个的三次表达式:n³ + n² = x(n³-2n²+n) + y(n³-5n²+8n-4) + z*(n³-8n²+21n-18) + w = n³(x+y+z) + n²(-2x-5y-8z) + n(x+8y+21z) + (w-4y-18z)匹配系数以获得三个联立方程 加上x, y, z另一个来计算w: x + y + z = 1-2x - 5y - 8z = 1 x + 8y + 21z = 0 w - 4y - 18z = 0解决它们得到:x = 3 y = -3 z = 1 w = 6方便的是,这些系数也是整数*,这意味着可以直接对递归进行模运算。* 我怀疑这是巧合——这很可能是招聘考官的意图。因此,矩阵递推方程为:| a b c 1 0 0 0 | | f(n-1) | | f(n) || 1 0 0 0 0 0 0 | | f(n-2) | | f(n-1) || 0 1 0 0 0 0 0 | | f(n-3) | | f(n-2) || 0 0 0 3 -3 1 6 | x | g(n) | = | g(n+1) || 0 0 0 1 0 0 0 | | g(n-1) | | g(n) || 0 0 0 0 1 0 0 | | g(n-2) | | g(n-1) || 0 0 0 0 0 0 1 | | 1 | | 1 |最终的矩阵求幂方程为: [n-2]| a b c 1 0 0 0 | | f(2) | | f(n) | | f(2) | | r || 1 0 0 0 0 0 0 | | f(1) | | f(n-1) | | f(1) | | q || 0 1 0 0 0 0 0 | | f(0) | | f(n-2) | | f(0) | | p || 0 0 0 3 -3 1 6 | x | g(3) | = | g(n+1) | , | g(3) | = | 36 || 0 0 0 1 0 0 0 | | g(2) | | g(n) | | g(2) | | 12 || 0 0 0 0 1 0 0 | | g(1) | | g(n-1) | | g(1) | | 2 || 0 0 0 0 0 0 1 | | 1 | | 1 | | 1 | | 1 |(每个操作都是隐式模数10^9 + 7或提供的任何此类数字。)请注意,Java 的%运算符是remainder ,它与负数的模数不同。例子:-1 % 5 == -1 // Java-1 = 4 (mod 5) // mathematical modulus解决方法很简单:long mod(long b, long a){ // computes a mod b // assumes that b is positive return (b + (a % b)) % b;}原始迭代算法:long recurrence_original( long a, long b, long c, long p, long q, long r, long n, long m // 10^9 + 7 or whatever) { // base cases if (n == 0) return p; if (n == 1) return q; if (n == 2) return r; long f0, f1, f2; f0 = p; f1 = q; f2 = r; for (long i = 3; i <= n; i++) { long f3 = mod(m, mod(m, a*f2) + mod(m, b*f1) + mod(m, c*f0) + mod(m, mod(m, i) * mod(m, i)) * mod(m, i+1) ); f0 = f1; f1 = f2; f2 = f3; } return f2;}模矩阵函数:long[][] matrix_create(int n){ return new long[n][n];}void matrix_multiply(int n, long m, long[][] c, long[][] a, long[][] b){ for (int i = 0; i < n; i++) { for (int j = 0; j < n; j++) { long s = 0; for (int k = 0; k < n; k++) s = mod(m, s + mod(m, a[i][k]*b[k][j])); c[i][j] = s; } }}void matrix_pow(int n, long m, long p, long[][] y, long[][] x){ // swap matrices long[][] a = matrix_create(n); long[][] b = matrix_create(n); long[][] c = matrix_create(n); // initialize accumulator to identity for (int i = 0; i < n; i++) a[i][i] = 1; // initialize base to original matrix for (int i = 0; i < n; i++) for (int j = 0; j < n; j++) b[i][j] = x[i][j]; // exponentiation by squaring // there are better algorithms, but this is the easiest to implement // and is still O(log n) long[][] t = null; for (long s = p; s > 0; s /= 2) { if (s % 2 == 1) { matrix_multiply(n, m, c, a, b); t = c; c = a; a = t; } matrix_multiply(n, m, c, b, b); t = c; c = b; b = t; } // write to output for (int i = 0; i < n; i++) for (int j = 0; j < n; j++) y[i][j] = a[i][j];}最后,新算法本身:long recurrence_matrix( long a, long b, long c, long p, long q, long r, long n, long m) { if (n == 0) return p; if (n == 1) return q; if (n == 2) return r; // original recurrence matrix long[][] mat = matrix_create(7); mat[0][0] = a; mat[0][1] = b; mat[0][2] = c; mat[0][3] = 1; mat[1][0] = 1; mat[2][1] = 1; mat[3][3] = 3; mat[3][4] = -3; mat[3][5] = 1; mat[3][6] = 6; mat[4][3] = 1; mat[5][4] = 1; mat[6][6] = 1; // exponentiate long[][] res = matrix_create(7); matrix_pow(7, m, n - 2, res, mat); // multiply the first row with the initial vector return mod(m, mod(m, res[0][6]) + mod(m, res[0][0]*r) + mod(m, res[0][1]*q) + mod(m, res[0][2]*p) + mod(m, res[0][3]*36) + mod(m, res[0][4]*12) + mod(m, res[0][5]*2) );}以下是上述两种算法的一些示例基准。原始迭代算法:n time (μs)-------------------10^1 9.310^2 44.910^3 401.50110^4 3882.09910^5 27940.910^6 88873.59910^7 877100.510^8 9057329.09910^9 91749994.4新矩阵算法:n time (μs)------------------10^1 69.16810^2 128.77110^3 212.69710^4 258.38510^5 318.19510^6 380.910^7 453.48710^8 560.42810^9 619.83510^10 652.34410^11 750.51810^12 769.90110^13 851.84510^14 934.91510^15 1016.73210^16 1079.61310^17 1123.41310^18 1225.323旧算法用了 90 多秒来计算n = 10^9,而新算法只用了 0.6多毫秒(150,000 倍的加速)!原始算法的时间复杂度显然是线性的(正如预期的那样);n = 10^10花了太长时间才完成,所以我没有继续。新算法的时间复杂度显然是对数的——将 的数量级加倍n导致执行时间加倍(同样,正如预期的那样,由于平方求幂)。n对于( )的“小”值,< 100矩阵分配和运算的开销掩盖了算法本身,但随着n增加而迅速变得微不足道。