Finding Lines

From programming_contest
Jump to navigation Jump to search

Introduction

This problem asks you analyze a set of n points (10^5) and determine whether there exists some line that covers p*n points (where p > .2)

Solutions

Brute Force

Idea

Iterate over all pairs of points, check whether the line between them covers n*p points.

Runtime

  • n^3 = 10^15

Brute Force with Hashing

Idea

Maintain a hashmap of lines to the points that it covers. Iterate over all pairs of points. Add those two points to the set for the line between them. Iterate over all found lines to see if any of them cover n*p points.

Requires a canonical, hashable line representation. use triple {a,b,c} where y*a = b*x + c. Eliminates needs for doubles, deals with vertical lines.

Runtime

  • n^2 = 10^10

Divide and Conquer

Idea

If we are looking for worst-case 20% of points, we can divide the set up into sub-sets. If there is some line that covers 20%, it will cover 20% in at least one of the sets.

Overview: divide all points up into K sets. Run the n^2 algorithm above to generate "candidate" lines. Check the candidates against all points in the entire set to see if it covers the full 20%. The important point is that each subset can only generate 5 lines, since each line must cover 20% of the points.

Runtime

n=points, p=probability of lying on the line, k=points per group

  • groups * group_analysis_time + points * candidate lines
  • n/k * k^2 + n * (1/p * n/k)
  • nk + n^2/kp

Minimized somewhere around k=700 when p=.2, but realistically runs faster when k is much smaller (around 16). This is due to the unlikelihood of getting a full n/kp lines per subset. One can guard against a worst case scenario by having multiple staves of division (divide into groups of 1024, then 128, then 16, for instance)

Code

Solution - Java

import java.util.*;
public class f {
	Scanner in=new Scanner(System.in);
	public static void main(String[] args) {
		new f().go();
	}
	private void go() {
		int n=in.nextInt();
		int p=in.nextInt();
		int[][] ps=new int[n][2];
		for(int i=0;i<n;i++)for(int j=0;j<2;j++)ps[i][j]=in.nextInt();
		if(n==1){
			System.out.println("possible");
			System.exit(0);
		}
		HashSet<ArrayList<Integer>> c=new HashSet<ArrayList<Integer>>();
		int ppg=16;
		for(int i=0;i<n;i+=ppg){//split into groups of ppg
			HashMap<ArrayList<Integer>,HashSet<Integer>> cs=new HashMap<ArrayList<Integer>,HashSet<Integer>>();
			int j;
			for(j=0;j<ppg&&j+i<n;j++)for(int k=j+1;k<ppg&k+i<n;k++){
				ArrayList<Integer> on=new ArrayList<Integer>();//y*on[1]=on[0]x+on[2]...should be canonical line definition
				on.add(ps[k+i][1]-ps[j+i][1]);//n
				on.add(ps[k+i][0]-ps[j+i][0]);//d
				int gcd=(int) greatestCommonDenominator(on.get(0),on.get(1));
				on.set(0, on.get(0)/gcd);
				on.set(1, on.get(1)/gcd);
				on.add(-1*on.get(0)*ps[j+i][0]+on.get(1)*ps[j+i][1]);
				if(!cs.containsKey(on))cs.put(on, new HashSet<Integer>());
				cs.get(on).add(j+i);
				cs.get(on).add(k+i);
			}
			for(ArrayList<Integer>jj:cs.keySet())if(cs.get(jj).size()*100>=j*p)c.add(jj);
		}
		for(ArrayList<Integer>i:c){
			int count=0;
			for(int j=0;j<n;j++)if(ps[j][1]*i.get(1)==i.get(0)*ps[j][0]+i.get(2))count++;
			if(count*100>=p*n){
				System.out.println("possible");
				System.exit(0);
			}
		}
		System.out.println("impossible");
	}
	long greatestCommonDenominator( long a, long b ){
		while( b != 0 ){
			long t = a%b;
			a = b;
			b = t;
		}
		return a;
	}
}

Randomized Algorithm

Idea

20% of points is a HUGE amount. Can choose random points, and check the line between them against the rest of the points. Given a big number of trials, probability of false negative is absurdly small.

Probability of 2 points lying on certain line that covers p*n points: p^2 Probability of 2 points NOT lying on that line: 1 - p^2 Probability of NOT finding a line after n trials: (1 - p^2)^n Probability of finding line after n trials: 1 - (1 - p^2)^n

With 10^5 points, can run many trials....100 is a good number. at n = 100, p = .2, p(success) = .98.....ehhhh...not good enough with 200 trials, p(success) = .999....ehhh...decent...but to be safe. with 500 trials, p(success) = .999999998.....probably good enough....if it fails, choose a new random seed and resubmit

Runtime

trials * points