Q: Finding the consecutive rows of a 2D array with the maximum number of 1s

I have a 2D array of size m*m whose elements are all either 0 or 1. Furthermore, each column of the array has a contiguous block of 1s (with 0s outside that block). The array itself is too large to be held in memory (as many as 10^6 rows), but for each column I can determine the lower bound, a, and the upper bound, b, of the 1s in that column. For a given n, I need to find the n consecutive rows which contain the maximum number of 1s. For small inputs I can easily do this by calculating the sum of each row one by one and then choosing the n consecutive rows whose sum is the maximum, but for large inputs this takes too much time. Is there an efficient way of calculating this? Perhaps using Dynamic Programming?

Here is an example code fragment showing my current approach, where successive calls to read_int() (not given here) provide the lower and upper bounds for successive columns:

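``````
long int harr[MAX_ROWS] = {0};   // one counter per row, initialized to zero
                                 // (MAX_ROWS is a placeholder for a size >= m)
long int answer = 0, current;
for (int i = 0; i < m; i++)
{
    int a = read_int();          // lower bound of the 1s in column i
    int b = read_int();          // upper bound of the 1s in column i
    for (int j = a; j <= b; j++) // for finding sum of each row
        harr[j]++;
}
for (int i = 0; i < n; i++)      // sum of the first n rows
{
    answer = answer + harr[i];
}
current = answer;
for (int i = n; i < m; i++)      // slide the window down one row at a time
{
    current = current + harr[i] - harr[i-n];
    if (current > answer)
    {
        answer = current;
    }
}
``````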

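For example (with m = 6 and n = 3), let the six columns have (a, b) bounds (0,3), (1,4), (2,3), (3,5), (1,2) and (2,4), so that rows 0 to 5 contain 1, 3, 5, 5, 3 and 1 ones respectively.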

Here the answer would be row 1 to row 3 with a total 1-count of 13 in those rows. (Row 2 to row 4 also maximizes the sum as there is a tie.)

Here is a different approach. Think of each pair a, b as defining an interval of the form [a,b+1). The parenthesis depth of a row is the number of intervals that contain it, which is exactly the number of 1s in that row, so the task is to find the n consecutive indices that maximize the sum of the parenthesis depths in that window. Every new a bumps the parenthesis depth at a up by 1. Every new b causes the parenthesis depth after b to go down by 1. In the first pass, just load these parenthesis depth deltas. Then one more pass (a running sum) gets the parenthesis depths from these deltas. The following code illustrates this approach. I reduced m to 6 for testing purposes and replaced calls to the unknown read_int() by accesses to hard-wired arrays (which correspond to the example in the question):

``````
#include <stdio.h>

int main(void){
    int n = 3;
    int lower_bound[] = {0,1,2,3,1,2};
    int upper_bound[] = {3,4,3,5,2,4};
    int m = 6;
    int harr[6] = {0};
    int a, b, current, answer = 0, lower, upper;

    //load parenthesis depth-deltas (all initially 0)
    for(int i=0;i<m;i++)
    {
        a = lower_bound[i];
        b = upper_bound[i];
        harr[a]++;
        if(b < m-1)harr[b+1]--;
    }

    //determine p-depth at each point
    for(int i = 1; i < m; i++){
        harr[i] += harr[i-1];
    }

    //find optimal n-rows by sliding-window
    for(int i=0;i<n;i++)
    {
        answer += harr[i];      //sum of the first n rows
    }
    current = answer;
    lower = 0;
    upper = n-1;

    for(int i=n;i<m;i++)
    {
        current = current+harr[i]-harr[i-n];
        if(current > answer)    //strict, so ties keep the earliest window
        {
            answer = current;
            lower = i-n+1;
            upper = i;
        }
    }
    printf("Max %d rows are %d to %d with a total sum of %d ones\n", n,lower,upper,answer);
    return 0;
}
``````

(Obviously, the loop which turns the deltas in harr into depths can be combined with the loop which computes answer. I kept them as two passes to better illustrate the logic of how the final harr values can be obtained from the parenthesis deltas.)

When this code is compiled and run its output is:

``````Max 3 rows are 1 to 3 with a total sum of 13 ones
``````
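
The merged loop mentioned in the note above could be sketched as a reusable function. This is only a sketch under some assumptions: `best_window` and its parameter names are made up for illustration, the bounds are passed in as arrays rather than read with read_int(), the delta array is allocated on the heap so that 10^6 rows need not fit on the stack, and the counters are `long long` because with 10^6 rows and columns a window total can exceed the range of a 32-bit `int`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the answer's code).
   lo[i]..hi[i] are the inclusive bounds of the block of ones in column i,
   cols is the number of columns and rows the number of rows.
   Returns the largest number of ones found in any n consecutive rows and
   stores the first row of the earliest such window in *start.
   Returns -1 if the working array cannot be allocated. */
long long best_window(const size_t *lo, const size_t *hi, size_t cols,
                      size_t rows, size_t n, size_t *start)
{
    assert(n >= 1 && n <= rows);

    /* one extra slot so that delta[hi + 1] is always in range */
    long long *delta = calloc(rows + 1, sizeof *delta);
    if (delta == NULL)
        return -1;

    /* first pass: load the depth deltas */
    for (size_t i = 0; i < cols; i++) {
        assert(lo[i] <= hi[i] && hi[i] < rows);
        delta[lo[i]]++;
        delta[hi[i] + 1]--;
    }

    /* second pass: running sum and sliding window together */
    long long depth = 0, current = 0, best = -1;
    for (size_t r = 0; r < rows; r++) {
        depth += delta[r];            /* number of ones in row r */
        delta[r] = depth;             /* remember it for when it leaves */
        current += depth;
        if (r >= n)
            current -= delta[r - n];  /* row r - n leaves the window */
        if (r + 1 >= n && current > best) {
            best = current;
            *start = r + 1 - n;
        }
    }

    free(delta);
    return best;
}
```

Called with the bounds from the example, 6 columns, 6 rows and n = 3, this returns 13 and sets the start row to 1, in agreement with the output above.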

I'm not sure how the following will scale for your 10^6 rows, but it manages the trailing sum of x consecutive rows in a single pass without function call overhead. It may be worth a try. Also ensure you are compiling with full optimizations so the compiler can add its 2 cents as well.

My original thought was to find some way to read x * n integers (from your m x n matrix) and in some fashion look at the population of set bits over that number of bytes (checking the endianness and taking either the first or last byte of each integer to check whether a bit was set). However, that logic seemed as costly as simply carrying the sum of the trailing x rows and stepping through the array while attempting to optimize the logic.

I don't have any benchmarks from your data to compare against, but perhaps this will give you another idea or two:

``````
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>             /* INT_MIN */

int main (int argc, char **argv) {

    /* static array to test summing and row id logic, not
       intended to simulate the 0's or 1's */
    int a[][5] = {{1,2,3,4,5},
                  {2,3,4,5,6},
                  {3,4,5,6,7},
                  {4,5,6,7,8},
                  {3,4,5,6,7},
                  {0,1,2,3,4},
                  {1,2,3,4,5}};
    size_t m = sizeof  a / sizeof *a;   /* matrix m x n dimensions */
    size_t n = sizeof *a / sizeof **a;

    /* number of consecutive rows to sum (default 3) */
    int arg = argc > 1 ? atoi (argv[1]) : 3;
    if (arg < 1 || (size_t)arg > m) {
        fprintf (stderr, "error: rows to sum must be between 1 and %zu\n", m);
        return 1;
    }
    size_t ncr = (size_t)arg;

    int sum[ncr];               /* array holding sum on ncr rows */
    int sumn;                   /* sum of array values */
    int max = INT_MIN;          /* variable holding maximum sum  */
    size_t i, j, k, row = 0, sidx;

    for (k = 0; k < ncr; k++)   /* initialize vla values */
        sum[k] = 0;

    for (i = 0; i < m; i++)     /* for each row */
    {
        sidx = i % ncr;         /* index for sum array */

        sum[sidx] = 0;          /* zero index to be replaced */
        for (j = 0; j < n; j++) /* compute sum for current row */
            sum[sidx] += a[i][j];

        if (i + 1 >= ncr) {     /* a full window of ncr rows ends at row i */
            sumn = 0;
            for (k = 0; k < ncr; k++)
                sumn += sum[k];
            /* note 'row' index assignment below is 1 greater
               than actual but simplifies output loop indexes */
            if (sumn > max) {
                max = sumn;
                row = i + 1;
            }
        }
    }

    /* output results */
    printf ("\n The maximum sum for %zu consecutive rows: %d\n\n", ncr, max);

    for (i = row - ncr; i < row; i++) {
        printf (" row[%zu] : ", i);
        for (j = 0; j < n; j++)
            printf (" %d", a[i][j]);
        printf ("\n");
    }

    return 0;
}
``````

Example Output

``````
$ ./bin/arraymaxn

 The maximum sum for 3 consecutive rows: 80

 row[2] :  3 4 5 6 7
 row[3] :  4 5 6 7 8
 row[4] :  3 4 5 6 7

$ ./bin/arraymaxn 4

 The maximum sum for 4 consecutive rows: 100

 row[1] :  2 3 4 5 6
 row[2] :  3 4 5 6 7
 row[3] :  4 5 6 7 8
 row[4] :  3 4 5 6 7

$ ./bin/arraymaxn 2

 The maximum sum for 2 consecutive rows: 55

 row[2] :  3 4 5 6 7
 row[3] :  4 5 6 7 8
``````

Note: if there are multiple equivalent maximum consecutive rows (i.e. two sets of rows where the 1's add up to the same number), the first occurrence of the maximum is selected.
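
The code above re-adds all of the stored row sums for every row, so the work per row grows with the number of consecutive rows. A possible variation is to keep a running total and adjust it by the row that enters and the row that leaves the window. The sketch below assumes the matrix is stored row by row in one flat array; `max_window` and its parameters are illustrative names, not something taken from the code above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: a points to an m x n matrix stored row by row.
   On success returns 0, stores the largest sum over ncr consecutive rows
   in *best and the index of the first row of the earliest such window in
   *first. Returns -1 if ncr is out of range or allocation fails. */
int max_window(const int *a, size_t m, size_t n, size_t ncr,
               long long *best, size_t *first)
{
    if (ncr == 0 || ncr > m)
        return -1;

    /* ring buffer holding the sums of the last ncr rows */
    long long *ring = calloc(ncr, sizeof *ring);
    if (ring == NULL)
        return -1;

    long long running = 0;      /* sum of the rows currently in the window */
    int found = 0;

    for (size_t i = 0; i < m; i++) {
        size_t slot = i % ncr;
        long long s = 0;

        for (size_t j = 0; j < n; j++)   /* sum of the current row */
            s += a[i * n + j];

        /* row i enters; the row previously stored in this slot
           (row i - ncr, or nothing yet) leaves */
        running += s - ring[slot];
        ring[slot] = s;

        /* strict comparison keeps the first occurrence on ties */
        if (i + 1 >= ncr && (!found || running > *best)) {
            *best = running;
            *first = i + 1 - ncr;
            found = 1;
        }
    }

    free(ring);
    return 0;
}
```

For the 7 x 5 test matrix above and ncr = 3, this yields a best sum of 80 starting at row 2, the same window the example output reports.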

I'm not sure which optimizations you are compiling with, but regardless of which code you use, you can always try simple hints to the compiler to inline all functions (if you have functions in your code) and fully optimize the code. Two helpful options are:

``````gcc -finline-functions -Ofast
``````

c  arrays  dynamic-programming