Majority Algorithm Solution

Brian Rothstein

Oct 22, 1999

1 Problem

Given an array A of N (N > 0) positive integers, write a function that finds the integer occurring a majority of the time in the array (that is, more than N/2 times), if such an integer exists, and returns it. If it doesn't exist, return 0. The function must run in O(1) space and O(N) time.

2 Setup

It will be helpful to define the following terminology:

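$$
\delta(a,b) =
\begin{cases}
1 & \text{if } a = b \\
0 & \text{if } a \neq b
\end{cases}
$$

$$
\#A_{i,j}(b) = \sum_{k=i}^{j} \delta(A[k], b)
$$

$$
M(A,i,j) =
\begin{cases}
0 & \text{if } \nexists\, b : \#A_{i,j}(b) > \frac{j-i+1}{2} \\
b & \text{if } \exists\, b : \#A_{i,j}(b) > \frac{j-i+1}{2}
\end{cases}
$$

$$
P(A,i,j) = b : \{\forall c : \#A_{i,j}(b) \geq \#A_{i,j}(c)\}
$$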

$\#A_{i,j}(b)$ is just the number of times that b occurs in A[i..j].
M(A,i,j) finds the majority element of A[i..j] if it exists. Otherwise it returns zero.
P(A,i,j) finds an element that occurs at least as many times as any other element in A[i..j].
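
The three definitions can be written as brute-force reference functions. The following is a minimal Python sketch, not part of the original solution: the names `count`, `M`, and `P` simply mirror the notation above, indices are 1-based and inclusive to match A[i..j], and the code deliberately ignores the O(1) space and O(N) time requirements, so it is only meant for checking small cases.

```python
from collections import Counter


def count(A, i, j, b):
    """#A_{i,j}(b): how many times b occurs in A[i..j] (1-based, inclusive)."""
    return sum(1 for k in range(i, j + 1) if A[k - 1] == b)


def M(A, i, j):
    """Majority element of A[i..j], or 0 if no element fills more than half."""
    length = j - i + 1
    for b in set(A[i - 1:j]):
        if 2 * count(A, i, j, b) > length:  # same test as count > length / 2
            return b
    return 0  # also covers the empty range, e.g. M(A, 1, 0)


def P(A, i, j):
    """An element occurring at least as often as any other in A[i..j].

    Assumes the range is non-empty; ties are broken arbitrarily.
    """
    return Counter(A[i - 1:j]).most_common(1)[0][0]


A = [3, 1, 3, 2, 3]
print(count(A, 1, 5, 3), M(A, 1, 5), M(A, 1, 4), P(A, 1, 5))
```

For this sample array, 3 is the majority of A[1..5] but not of A[1..4], where it fills exactly half of the range.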

3 Solution

3.1 Simplification

For simplicity, we'll start off by considering

$$
M'(A,i,j) =
\begin{cases}
\text{undefined} & \text{if } M(A,i,j) = 0 \\
b & \text{if } M(A,i,j) = b,\ b \neq 0
\end{cases}
$$

Notice that M'(A,i,j) only needs to return a meaningful result if a majority exists in A[i..j]. For this reason, we can assume that there is a majority element, since if there isn't, we can return anything.

So for now, assume $M(A,1,N) = m$, $m \neq 0$. (Call this assumption [M*].)

3.2 Two important theorems.

Theorem 1. $[M^*] \wedge \{M(A,1,i) = 0\} \Rightarrow M(A,i+1,N) = m$.

What this says is that if there is a majority element in A[1..N] (as we just assumed) and there is no majority element in A[1..i] then there must be a majority element in A[i+1..N]. Furthermore, the majority element in A[i+1..N] must be the same as the majority element in A[1..N].

Proof.

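Let $c = P(A,1,i)$. (1)

$$\Rightarrow \#A_{1,i}(m) \leq \#A_{1,i}(c). \tag{2}$$

$$M(A,1,i) = 0 \Rightarrow \#A_{1,i}(c) \leq \frac{i}{2} \tag{3}$$

Now assume that $M(A,i+1,N) \neq m$. (4)

$$\Rightarrow \#A_{i+1,N}(m) \leq \frac{N-i}{2} \tag{5}$$

$$(2) \Rightarrow \#A_{1,i}(m) + \#A_{i+1,N}(m) \leq \#A_{1,i}(c) + \#A_{i+1,N}(m) \tag{6}$$

$$(3) \wedge (5) \Rightarrow \#A_{1,i}(c) + \#A_{i+1,N}(m) \leq \frac{i}{2} + \frac{N-i}{2} \tag{7}$$

$$(6) \wedge (7) \Rightarrow \#A_{1,i}(m) + \#A_{i+1,N}(m) \leq \frac{N}{2} \tag{8}$$

$$\Rightarrow \#A_{1,N}(m) \leq \frac{N}{2}$$

which contradicts assumption [M*].

So it must be that M(A,i+1,N) = m.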
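
Theorem 1 can also be sanity-checked by brute force on small inputs. The sketch below is an illustration rather than part of the proof: the helper names `majority` and `theorem1_holds` are assumptions introduced here, and the search space (arrays of length 1 to 6 over the values 1, 2, 3) is an arbitrary choice.

```python
from itertools import product


def majority(seg):
    """Brute-force M: the element filling more than half of seg, else 0."""
    for b in set(seg):
        if 2 * seg.count(b) > len(seg):
            return b
    return 0


def theorem1_holds(A):
    """Check Theorem 1 on A for every split point i with 1 <= i < N."""
    m = majority(A)
    if m == 0:  # [M*] does not hold, so the theorem claims nothing
        return True
    for i in range(1, len(A)):
        # A[:i] is A[1..i] and A[i:] is A[i+1..N] in the text's 1-based terms.
        if majority(A[:i]) == 0 and majority(A[i:]) != m:
            return False
    return True


# Exhaustive check over all arrays of length 1..6 with values in {1, 2, 3}.
checked = all(
    theorem1_holds(list(A))
    for n in range(1, 7)
    for A in product((1, 2, 3), repeat=n)
)
print(checked)
```

A passing check is evidence on the sampled inputs only; the proof above is what establishes the general statement.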

Theorem 2. $\{M(A,1,i) = 0\} \wedge \{M(A,i+1,j) = 0\} \Rightarrow M(A,1,j) = 0$.

This says that if there is no majority element in A[1..i] and there is no majority element in A[i+1..j] then there is no majority element in A[1..j].

Proof.

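Let $c = P(A,1,i)$, $d = P(A,i+1,j)$. (9)

$$M(A,1,i) = 0 \Rightarrow \#A_{1,i}(c) \leq \frac{i}{2} \tag{10}$$

$$M(A,i+1,j) = 0 \Rightarrow \#A_{i+1,j}(d) \leq \frac{j-i}{2} \tag{11}$$

Now suppose $M(A,1,j) = e$, $e \neq 0$. (12)

$$\Rightarrow \#A_{1,j}(e) > \frac{j}{2} \tag{13}$$

$$(9) \Rightarrow \#A_{1,i}(e) + \#A_{i+1,j}(e) \leq \#A_{1,i}(c) + \#A_{i+1,j}(d) \tag{14}$$

$$(10) \wedge (11) \Rightarrow \#A_{1,i}(c) + \#A_{i+1,j}(d) \leq \frac{i}{2} + \frac{j-i}{2} \tag{15}$$

$$\Rightarrow \#A_{1,j}(e) \leq \frac{j}{2}$$

which contradicts (12), (13).

So it must be that M(A,1,j) = 0.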
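
As a concrete instance of Theorem 2, take A[1..6] = [1, 2, 3, 3, 4, 5] with i = 2 and j = 6. The array is an illustrative choice, not taken from the original text. Here c is either 1 or 2, d = 3, and e ranges over every value in A.

```latex
\begin{aligned}
A[1..2] = [1,2]:&\quad \#A_{1,2}(c) = 1 \leq \tfrac{2}{2} &&\Rightarrow M(A,1,2) = 0 \\
A[3..6] = [3,3,4,5]:&\quad \#A_{3,6}(d) = 2 \leq \tfrac{4}{2} &&\Rightarrow M(A,3,6) = 0 \\
A[1..6]:&\quad \#A_{1,6}(e) \leq 1 + 2 = 3 = \tfrac{6}{2} &&\Rightarrow M(A,1,6) = 0
\end{aligned}
```

The most frequent value, 3, occurs only twice in six positions, so the merged range indeed has no majority.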

3.3 First attempt at a loop invariant

Theorems 1 and 2 suggest the following loop invariant for M'(A,1,N):

$$L(A,i,j,b) := \{M(A,1,i-1) = 0\} \wedge \{M(A,i,j) = b\}$$

With this invariant, we could have j iterate through A and set i to j every time that we find that there is no majority in A[i..j]. Theorem 2 would then allow us to merge $\{M(A,1,i-1) = 0\} \wedge \{M(A,i,j) = 0\}$ into $M(A,1,j) = 0$. Theorem 1 would then allow us to conclude that b is the majority winner (under [M*]) when j = N.

This loop invariant is close to the right one but it's not strong enough to allow us to find or prove an algorithm. This is because it doesn't allow us to figure out when M(A,i,j) = 0.

3.4 Second attempt at a loop invariant

A better loop invariant is:

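$$L(A,i,j,k,b) := \{M(A,1,i-1) = 0\} \wedge \{k > j\} \wedge \left\{\#A_{i,j}(b) = \frac{k-i+1}{2}\right\}$$

Notice that $L \Rightarrow M(A,i,j) = b$ since

$$\left\{\#A_{i,j}(b) = \frac{k-i+1}{2}\right\} \wedge \{k > j\} \Rightarrow \#A_{i,j}(b) > \frac{j-i+1}{2}$$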

More importantly, we can determine from this invariant when M(A,i,j) = 0.

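Theorem 3. $L \wedge \{j+1 = k\} \wedge \{A[j+1] \neq b\} \Rightarrow M(A,i,j+1) = 0$.

Proof.

$$L \wedge \{A[j+1] \neq b\} \Rightarrow \#A_{i,j}(b) = \#A_{i,j+1}(b) = \frac{k-i+1}{2} \tag{16}$$

$$(16) \wedge \{j+1 = k\} \Rightarrow \#A_{i,j+1}(b) = \frac{(j+1)-i+1}{2} \tag{17}$$

$$\Rightarrow M(A,i,j+1) \neq b. \tag{18}$$

Now suppose $M(A,i,j+1) = c$, $c \neq 0$, $c \neq b$. (19)

$$\Rightarrow \#A_{i,j+1}(c) > \frac{(j+1)-i+1}{2} \tag{20}$$

$$\text{But } M(A,i,j) = b \Rightarrow \#A_{i,j}(c) < \frac{j-i+1}{2} \tag{21}$$

$$\Rightarrow \#A_{i,j+1}(c) < \frac{j-i+1}{2} + 1 = \frac{(j+1)-i+1+1}{2} \tag{22}$$

$$\Rightarrow 2 \cdot \#A_{i,j+1}(c) < (j+1)-i+1+1 \tag{23}$$

$$\Rightarrow 2 \cdot \#A_{i,j+1}(c) \leq (j+1)-i+1 \tag{24}$$

$$\Rightarrow \#A_{i,j+1}(c) \leq \frac{(j+1)-i+1}{2}$$

which contradicts (20).

So M(A,i,j+1) = 0.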
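
The proof of Theorem 3 uses only the two counting clauses of L (k > j, and the count of b in A[i..j] being (k-i+1)/2), so those are the only hypotheses the brute-force check below assumes. It is a sketch for small inputs: the helper names `majority` and `theorem3_holds` are introduced here for illustration, and the clause M(A,1,i-1) = 0 is intentionally not part of the check.

```python
from itertools import product


def majority(seg):
    """Brute-force M: the element filling more than half of seg, else 0."""
    for b in set(seg):
        if 2 * seg.count(b) > len(seg):
            return b
    return 0


def theorem3_holds(A):
    """Check Theorem 3 on A for every window A[i..j] (1-based) and candidate b."""
    n = len(A)
    for i in range(1, n + 1):
        for j in range(i, n):                      # keeps j + 1 inside A
            window = A[i - 1:j]                    # A[i..j]
            for b in set(window):
                k = 2 * window.count(b) + i - 1    # so that #A_{i,j}(b) = (k-i+1)/2
                if k > j and j + 1 == k and A[j] != b:   # A[j] is A[j+1] in the text
                    if majority(A[i - 1:j + 1]) != 0:    # M(A,i,j+1) must be 0
                        return False
    return True


checked = all(
    theorem3_holds(list(A))
    for n in range(1, 7)
    for A in product((1, 2, 3), repeat=n)
)
print(checked)
```

For example, with A = [5, 5, 7, 7], i = 1, j = 3, b = 5 we get k = 4 = j + 1 and A[4] = 7, and A[1..4] has no majority because 5 and 7 each fill exactly half of it.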

3.5 Algorithm for M'(A,1,N)

M'(A,1,N)
{
    i=j=1;
    k=2;
    b=A[1];
                                                    // (p1)
    while(j<N)
    {                                               // (p2)
        j++;
        if(A[j]==b) k+=2;
        else if(j==k) { i=j; b=A[j]; k=k+1; }
    }                                               // (p3)

    return b;                                       // (p4)
}
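
For readers who want to run the loop, here is a direct Python transcription. It is a sketch under two assumptions: the function name `majority_candidate` is made up for this example, and the proof-only variable i is omitted because the loop never reads it. The loop variable j stays 1-based to match the text, so A[j] becomes `A[j - 1]`.

```python
def majority_candidate(A):
    """Return the candidate b computed by the M' loop; A must be non-empty.

    The result is only meaningful when A actually has a majority element.
    """
    b = A[0]          # b = A[1] in the text
    k = 2
    for j in range(2, len(A) + 1):
        if A[j - 1] == b:
            k += 2                # one more occurrence of the candidate
        elif j == k:
            b = A[j - 1]          # candidate no longer leads; restart at j
            k += 1
    return b


print(majority_candidate([1, 1, 2, 2, 2]))
```

When the input has no majority, as in [1, 2, 3], the function still returns some element, which is why a separate verification pass is needed.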

To prove that the algorithm is correct, we need to show 4 things.

Initialization: The loop invariant, L, is true at (p1)
Maintenance: If L is true at (p2), then it will be true at (p3).
Boundedness: The loop will terminate in a finite number of steps.
Correctness: After the loop (p4), $L \wedge \{j \geq N\} \Rightarrow M'(A,1,N) = b$.

3.6 Proof of Algorithm M'(A,1,N)

Initialization.

At (p1) we have

$$L(A,i,j,k,b) = L(A,1,1,2,A[1]) = \{M(A,1,0) = 0\} \wedge \{2 > 1\} \wedge \left\{\#A_{1,1}(A[1]) = \frac{2-1+1}{2}\right\}$$

which is clearly true.

Maintenance. Assume that L(A,i,j,k,b) holds at (p2). We need to show 3 things:

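$$\{L(A,i,j,k,b)\} \wedge \{A[j+1] = b\} \Rightarrow L(A,i,j+1,k+2,b) \tag{25}$$

$$\{L(A,i,j,k,b)\} \wedge \{A[j+1] \neq b\} \wedge \{j+1 = k\} \Rightarrow L(A,j+1,j+1,k+1,A[j+1]) \tag{26}$$

$$\{L(A,i,j,k,b)\} \wedge \{A[j+1] \neq b\} \wedge \{j+1 \neq k\} \Rightarrow L(A,i,j+1,k,b) \tag{27}$$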

These correspond to the three cases inside the loop.

Proof of (25).

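$$
\begin{aligned}
L(A,i,j,k,b) \wedge \{A[j+1] = b\} \Rightarrow\ & \{M(A,1,i-1) = 0\} \wedge \{k+2 > j+1\} \wedge \left\{\#A_{i,j+1}(b) = \frac{k-i+1}{2} + 1\right\} \\
=\ & L(A,i,j+1,k+2,b).
\end{aligned}
$$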

Proof of (26). By theorem 3,

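$$L(A,i,j,k,b) \wedge \{j+1 = k\} \wedge \{A[j+1] \neq b\} \Rightarrow M(A,i,j+1) = 0.$$

By theorem 2,

$$\{M(A,1,i-1) = 0\} \wedge \{M(A,i,j+1) = 0\} \Rightarrow M(A,1,j+1) = 0$$

Therefore,

$$
\begin{aligned}
L(A,i,j,k,b) \wedge \{j+1 = k\} \wedge \{A[j+1] \neq b\} \Rightarrow\ & \{M(A,1,j+1) = 0\} \wedge \{k+1 > j+1\} \\
& \wedge \left\{\#A_{j+1,j+1}(A[j+1]) = \frac{k+1-(j+1)+1}{2} = \frac{k+1-(k)+1}{2} = 1\right\} \\
=\ & L(A,j+1,j+1,k+1,A[j+1]).
\end{aligned}
$$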

Proof of (27).

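$$
\begin{aligned}
L(A,i,j,k,b) \wedge \{A[j+1] \neq b\} \wedge \{j+1 \neq k\} \Rightarrow\ & \{M(A,1,i-1) = 0\} \wedge \{k > j+1\} \wedge \left\{\#A_{i,j+1}(b) = \frac{k-i+1}{2}\right\} \\
=\ & L(A,i,j+1,k,b).
\end{aligned}
$$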

Thus the invariant is maintained through the loop.
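
The counting part of the invariant is easy to watch at run time. The sketch below instruments the loop with assertions for two clauses of L only, namely k > j and #A_{i,j}(b) = (k-i+1)/2; the clause M(A,1,i-1) = 0 is not checked here. The function name `trace_candidate` and the returned list of states are assumptions made for this illustration.

```python
def trace_candidate(A):
    """Run the M' loop on a non-empty list A, asserting the counting clauses of L.

    Returns the (i, j, k, b) states at (p1) and at (p3) after each iteration.
    """
    def counting_clauses_hold(i, j, k, b):
        occurrences = A[i - 1:j].count(b)           # #A_{i,j}(b)
        return k > j and 2 * occurrences == k - i + 1

    n = len(A)
    i = j = 1
    k = 2
    b = A[0]
    assert counting_clauses_hold(i, j, k, b)        # Initialization, at (p1)
    states = [(i, j, k, b)]
    while j < n:
        j += 1
        if A[j - 1] == b:                           # case (25)
            k += 2
        elif j == k:                                # case (26)
            i, b, k = j, A[j - 1], k + 1
        # otherwise case (27): i, k and b are unchanged
        assert counting_clauses_hold(i, j, k, b)    # at (p3)
        states.append((i, j, k, b))
    return states


for state in trace_candidate([1, 1, 2, 2, 2]):
    print(state)
```

On [1, 1, 2, 2, 2] the candidate stays 1 through j = 3, the window restarts at j = 4 with b = 2, and the final state is (4, 5, 7, 2).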

Boundedness. The loop obviously terminates in N-1 steps. Since N is finite, the loop will terminate in a finite number of steps. More importantly, the number of steps is linear with respect to N.

Correctness. At (p4) we have,

$$
\begin{aligned}
j = N \Rightarrow\ & L(A,i,N,k,b) \\
=\ & \{M(A,1,i-1) = 0\} \wedge \{k > N\} \wedge \left\{\#A_{i,N}(b) = \frac{k-i+1}{2}\right\} \\
\Rightarrow\ & \#A_{i,N}(b) > \frac{N-i+1}{2} \\
\Rightarrow\ & M(A,i,N) = b
\end{aligned}
$$

Therefore, since $\{M(A,1,i-1) = 0\} \wedge \{M(A,i,N) = b\}$, we know by theorem 1 that M(A,1,N) = b, which is what we wanted to show. So b is the majority element under [M*] (i.e. assuming that there exists a majority element).

But now the implementation of M(A,1,N) is obvious!

3.7 Algorithm for M(A,1,N)

M(A,1,N)
{
    b = M'(A,1,N);
    count = 0;
    i = 1;
    while(i<=N)
    {
        if(A[i]==b) count++;
        i++;
    }

    if(count > N/2) return b;
    else return 0;
}

The proof is left as an exercise to the reader.
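
A runnable version of the two-pass function might look like the following Python sketch. The name `majority` is an assumption of this example, the input is assumed to be a non-empty list of positive integers as in the problem statement, and the test `2 * count > n` is used in place of `count > N/2` so that the result does not depend on how division rounds.

```python
def majority(A):
    """Return the majority element of A, or 0 if there is none.

    Uses O(1) extra space and two linear passes over A.
    """
    n = len(A)

    # Pass 1: the M' loop, which yields a candidate b.
    b = A[0]
    k = 2
    for j in range(2, n + 1):         # j is 1-based, as in the text
        if A[j - 1] == b:
            k += 2
        elif j == k:
            b = A[j - 1]
            k += 1

    # Pass 2: count the candidate to confirm it really is a majority.
    count = 0
    for x in A:
        if x == b:
            count += 1

    return b if 2 * count > n else 0


print(majority([1, 1, 2, 2, 2]), majority([1, 2, 3]))
```

The second pass is what turns the candidate into an answer: for [1, 2, 3] the first pass ends with some candidate, and the count rejects it.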

4 Notes

In M'(A,1,N), the variable i is never read by the algorithm itself. Its only real use is to prove the loop invariant. Since b achieves the same value with or without i, you can get rid of it (or put it in comments so that the proof is still clear).

M'(A,1,N) can be optimized to return b if k > N, since at that point we have

$$\#A_{i,j}(b) = \frac{k-i+1}{2} > \frac{N-i+1}{2} \Rightarrow M(A,i,N) = b.$$

M(A,1,N) can be optimized by merging M'(A,1,N) with it and returning b in the first loop (the M' loop) if $\frac{k-i+1}{2} > \frac{N}{2}$, since then we have

$$\#A_{i,j}(b) = \frac{k-i+1}{2} > \frac{N}{2} \Rightarrow \#A_{1,N}(b) \geq \#A_{i,j}(b) > \frac{N}{2} \Rightarrow M(A,1,N) = b.$$
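
The merged variant described in the last note can be sketched in Python as follows. The function name `majority_merged` is hypothetical, and the early-exit test is written as `k - i + 1 > n`, which is the condition (k-i+1)/2 > N/2 with both sides doubled. If the early exit never fires, the sketch falls back to the usual counting pass.

```python
def majority_merged(A):
    """Majority element of a non-empty list A, or 0, with an early exit."""
    n = len(A)
    i = j = 1
    k = 2
    b = A[0]
    while j < n:
        if k - i + 1 > n:
            # b already occurs more than N/2 times within A[i..j].
            return b
        j += 1
        if A[j - 1] == b:
            k += 2
        elif j == k:
            i, b, k = j, A[j - 1], k + 1
    # No early exit: verify the candidate with a counting pass.
    count = sum(1 for x in A if x == b)
    return b if 2 * count > n else 0


print(majority_merged([2, 2, 2, 1, 1]), majority_merged([1, 2, 3]))
```

On [2, 2, 2, 1, 1] the function returns after reading three elements, because 2 has then already been seen more than N/2 times.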
