Majority Algorithm Solution
Brian Rothstein
Oct 22, 1999
1 Problem
Given an array, A, consisting of N (N > 0) positive integers,
write a function which finds the integer that occurs a majority of the
time in the array [if it exists] and return it. If it doesn't exits,
return 0. The function must run in O(1) space and O(N) time.
2 Setup
It will be helpful to define the following terminology:
#Ai,j(b) = |
j å
k = i
|
d(A[k],b) |
|
M(A,i,j) = |
ì ï ï ï ï í
ï ï ï ï î
|
|
\nexists b: #Ai,j(b) > |
j-i+1 2
|
|
| |
|
| |
|
P(A,i,j) = b:{"c:#Ai,j(b) ³ #Ai,j(c)} |
|
- #Ai,j(b) is just the number of times that b occurs in A[i..j].
- M(A,i,j) finds the majority element of A[i..j] if it exists. Otherwise it returns zero.
- P(A,i,j) finds the element that occurs at least as many times as any other element in A[i..j].
3 Solution
3.1 Simplification
For simplicity, we'll start off by considering
Notice that M¢(A,i,j) only needs to return a
meaningful result if a majority exists in A[i..j]. For this
reason, we can assume that there is a majority element, since if
there's not, we can return anything.
So for now, assume
M(A,1,N) = m, m ¹ 0. (Call this assumption [M*])
3.2 Two important theorems.
Theorem 1.
[M*] Ù{M(A,1,i) = 0} Þ M(A,i+1,N) = m.
What this says is that if there is a majority element in A[1..N]
(as we just assumed) and there is no majority element in A[1..i]
then there must be a majority element in A[i+1..N]. Furthermore,
the majority element in A[i+1..N] must be the same as the majority
element in A[1..N].
Proof.
M(A,1,i) = 0 Þ #Ai,i(c) £ |
i 2
|
. |
| (3) |
Now assume that M(A,i+1,N) ¹ m. |
| (4) |
(2) Þ #A1,i(m) + #Ai+1,N(m) £ #A1,i(c) + #Ai+1,N(m) |
| (6) |
(3) Ù(5) Þ #A1,i(c) + #Ai+1,N(m) £ |
i 2
|
+ |
N-i 2
|
= |
N 2
|
|
| (7) |
(6) Ù(7) Þ #A1,i(m) + #Ai+1,N(m) £ |
N 2
|
|
| (8) |
Þ #A1,N(m) £ |
N 2
|
which contradicts assumption [M*]. |
|
So it must be that M(A,i+1,N) = m.
Theorem 2.
{M(A,1,i) = 0} Ù{M(A,i+1,j) = 0} Þ M(A,1,j) = 0.
This says that if there is no majority element in A[1..i] and there is no majority element
in A[i+1..j] then there is no majority element in A[1..j].
Proof.
Let c = P(A,1,i), d = P(A,i+1,j) |
| (9) |
M(A,1,i) = 0 Þ #A1,i(c) £ |
i 2
|
|
| (10) |
M(A,i+1,j) = 0 Þ #Ai+1,j(d) £ |
j-i 2
|
|
| (11) |
Now suppose M(A,1,j) = e, e ¹ 0 |
| (12) |
(9) Þ #A1,i(e) + #Ai+1,j(e) £ #A1,i(c) + #Ai+1,j(d) |
| (14) |
(10) Ù(11) Þ #A1,i(c) + #Ai+1,j(d) £ |
i 2
|
+ |
j-i 2
|
= |
j 2
|
|
| (15) |
Þ #A1,j(e) £ |
j 2
|
which contradicts (12), (13). |
|
So it must be that M(A,1,j) = 0.
3.3 First attempt at a loop invariant
Theorems 1 and 2 suggest the following loop invariant for M¢(A,1,N):
L(a,i,j,b) : = {M(A,1,i-1) = 0} Ù{M(A,i,j) = b} |
|
With this invariant, we could have j iterate through A and set i to j
every time that we find that there is no majority in A[i..j].
Theorem 2 would then allow us to merge {M(A,1,i-1) = 0} Ù{M(A,i,j) = 0} into M(A,1,j) = 0.
Theorem 1 would then allow us to conclude that b is the majority winner (under [M*]) when j = N.
This loop invariant is close to the right one but it's not strong enough to
allow us to find or prove an algorithm. This is because it doesn't allow us to
figure out when M(A,i,j) = 0.
3.4 Second attempt at a loop invariant
A better loop invariant is:
L(A,i,j,k,b): = {M(A,1,i-1) = 0} Ù {k > j} Ù {#Ai,j(b) = |
k-i+1 2
|
} |
|
Notice that L Þ M(A,i,j) = b since
{#Ai,j(b) = |
k-i+1 2
|
} Ù{k > j} Þ #Ai,j(b) > |
j-i+1 2
|
|
|
More importantly, we can determine from this invariant when M(A,i,j) = 0.
Theorem 3.
L Ù{j+1 = k} Ù{A[j+1] ¹ b } Þ M(A,i,j+1) = 0.
Proof.
L Ù{A[j+1] ¹ b } Þ #Ai,j(b) = #Ai,j+1(b) = |
k-i+1 2
|
|
| (16) |
(16) Ù{j+1 = k} Þ #Ai,j+1(b) = |
(j+1)-i+1 2
|
|
| (17) |
Now suppose M(A,i,j+1) = c, c ¹ 0,c ¹ b. |
| (19) |
Þ #Ai,j+1(c) > |
(j+1)-i+1 2
|
|
| (20) |
But M(A,i,j) = b Þ #Ai,j(c) < |
j-i+1 2
|
|
| (21) |
Þ #Ai,j+1(c) < |
j-i+1 2
|
+ 1 = |
(j+1)-i+1+1 2
|
|
| (22) |
Þ 2 ·#Ai,j+1(c) < (j+1)-i+1+1 |
| (23) |
Þ 2 ·#Ai,j+1(c) £ (j+1)-i+1 |
| (24) |
Þ #Ai,j+1(c) £ |
(j+1)-i+1 2
|
which contradicts (20). |
|
So M(A,i,j+1) = 0.
3.5 Algorithm for M¢(A,1,N)
M'(A,1,N)
{
i=j=1;
k=2;
b=A[1];
// (p1)
while(j<N)
{ // (p2)
j++;
if(A[j]==b) k+=2;
else if(j==k) { i=j; b=A[j]; k=k+1; }
} // (p3)
return b; // (p4)
}
To prove that the algorithm is correct, we need to show 4 things.
- Initialization: The loop invariant, L, is true at (p1)
- Maintenance: If L is true at (p2), then it will be true at (p3).
- Boundedness: The loop will terminate in a finite number of steps.
- Correctness: After the loop (p4), L Ù{j > = N} Þ M¢(A,1,N) = b.
3.6 Proof of Algorithm M¢(A,1,N)
Initialization.
At (p1) we have L(A,i,j,k,b) |
|
= {M(A,1,0) = 0} Ù{2 > 1} Ù{#A1,1(A[1]) = |
2-1+1 2
|
} |
|
which is clearly true.
Maintenance. Assume that L(A,i,j,k,b) holds at (p2). We need to show 3 things:
{L(A,i,j,k,b)} Ù{A[j+1] = b} Þ L(A,i,j+1,k+2,b) |
| (25) |
|
{L(A,i,j,k,b)} Ù{A[j+1] ¹ b} Ù{j+1 = k} |
| Þ L(A,j+1,j+1,k+1,A[j+1]) |
| (26) |
| |
|
|
{L(A,i,j,k,b)} Ù{A[j+1] ¹ b} Ù{j+1 ¹ k} |
| | (27) |
| |
|
These correspond to the three cases inside the loop.
Proof of (25).
L(A,i,j,k,b) Ù{A[j+1] = b} Þ |
|
{M(A,1,i-1) = 0} Ù{k+2 > j+1} Ù{#Ai,j+1(b) = |
k-i+1 2
|
+ 1} |
|
Proof of (26).
By theorem 3,
L(A,i,j,k,b) Ù{j+1 = k} Ù{A[j+1] ¹ b } Þ M(A,i,j+1) = 0. |
|
By theorem 2,
{M(A,1,i-1) = 0} Ù{M(A,i,j+1) = 0} Þ M(A,1,j+1) = 0 |
|
Therefore,
L(A,i,j,k,b) Ù{j+1 = k} Ù{A[j+1] ¹ b } Þ |
|
{M(A,1,j+1) = 0} Ù{k+1 > j+1} Ù |
|
#Aj+1,j+1(A[j+1]) = |
k+1-(j+1)+1 2
|
= |
k+1-(k)+1 2
|
= 1 |
|
= L(A,j+1,j+1,k+1,A[j+1]). |
|
Proof of (27).
L(A,i,j,k,b) Ù{A[j+1] ¹ b} Ù{j+1 ¹ k} Þ |
|
{M(A,1,i-1) = 0} Ù{k > j+1} Ù{#Ai,j+1(b) = |
j-i+1 2
|
} |
|
Thus the invariant is maintained through the loop.
Boundedness. The loop obviously terminates in N-1 steps. Since N is finite,
the loop will terminate in a finite number of steps. More importantly, the number of steps
is linear with respect to N.
Correctness. At (p4) we have,
= {M(A,1,i-1) = 0} Ù{k > N} Ù{#Ai,N(b) = |
k-i+1 2
|
} |
|
Therefore since {M(A,1,i-1) = 0} Ù{M(A,i,N) = b}, we know by theorem 1
that M(A,1,N) = b, which is what we wanted to show. So b is the majority element
under [M*] (i.e. assuming that there exists a majority element.)
But now the implementation of M(A,1,N) is obvious!
3.7 Algorithm for M(A,1,N)
M(A,1,N)
{
b = M'(A,1,N);
count = 0;
i = 1;
while(i<=N)
{
if(A[i]==b) count++;
i++;
}
if(count > N/2) return b;
else return 0;
}
The proof is left as an exercise to the reader.
4 Notes
- In M¢(A,1,N), the variable, i, is not used. Its only real use is to prove
the loop invariant. Since b achieves the same value with or without i, you can
get rid of it (or put it in comments so that the proof is still clear.)
- M¢(A,1,N) can be optimized to return b if k > N since at that point we have
#Ai,j(b) = |
k-i+1 2
|
> |
N-i+1 2
|
Þ M(i,N) = b. |
|
- M(A,1,N) can be optimized by merging M¢(A,1,N) with it and returning b in
the first loop (the M¢ loop) if [(k-i+1)/ 2] > N/2 since then we have
Þ #A1,N(N) ³ #Ai,j(N) > |
N 2
|
|
|
File translated from TEX by TTH, version 2.25.
On 22 Oct 1999, 06:42.