Catch the Plane: Difference between revisions

From programming_contest
Jump to navigation Jump to search
No edit summary
No edit summary
Line 9: Line 9:
dp[node][probability] = minimum time of getting to a node with at least a given probability
dp[node][probability] = minimum time of getting to a node with at least a given probability


The issue with the first one is time is 10^18. Even with [[Category:Interesting Points|range compression]], where wee only consider times that a bus leaves or arrives at, we are still at 10^6^2 = 10^12. The second method is not viable since probability doesn't give a discrete measure, or perhaps more validly, there are too many discrete probabilities to memoize.
The issue with the first one is time is 10^18. Even with [[Category:Interesting Points|range compression]], where wee only consider times that a bus leaves or arrives at, we are still at 10^6^2 = 10^12. The second method is not viable since probability doesn't give a discrete measure, or perhaps more validly, there are too many discrete probabilities to memoize. Plus, it's not inherently clear what the relation would look like.


The key thing to note here is that we're not REALLY optimizing in two dimensions. We are optimizing in one dimension, probability, and the other dimension is a constraint, namely that we must be at node X before time t. It doesn't matter how much earlier than time t we get there. So what challenges do we face if we just dijkstra over maximum probability? The issue is the maximum probability way of reaching a node may not give us a path that allows us to reach the end node in time. So ultimately, we want to find the highest probability path, but have a way of doing it ONLY if we know we have a way of reaching the goal along that path. Unfortunately we don't know which paths allow us to reach the airport in time!
So lets go with some fundamentals and see where it gets us:


There is a second issue. Suppose we are just optimizing over probability: We then run into an issue where arriving late at a given node with a high probability, but the only way to get to the end before time T is a very low probability path. It would, thus, have been better to take an earlier bus with somewhat lower probability to get to this node, but which allows us to take a higher probability out!. This is an intrinsic problem when we have different busses we can take out of a node depending on arrival time. Of course, trying to consider arrival time blows up the complexity again, anyway.
The probability of reaching the the airport by taking a bus is the probability we get there by taking our ideal first bus plus the probability that bus doesn't run times the probability of getting there on some later bus.


So what to do?
Based on this, we see that the probability of getting there depends on
# Our choice of bus to take
# ultimately the probability of getting there on the LAST possible bus


Idea: What if we run the problem backwards (airport->start point), and only consider buses that are arrive at the airport in time? This eliminates time as a factor we need to consider! Since we will only be evaluating paths which we KNOW get us to the aiport in time. This involves initializing every node to 0, and processing nodes in order of highest probability of reaching.
So when we arrive at a node, we would have to recurse on the LAST bus to leave the stop, then recurse on the second to last one, calculate whether it makes sense to try to take that second to last bus, and then get a probability of getting there on EITHER of the last two buses (max(p(last), p(second_to_last) * p(second_to_last_runs) + p(last) * 1-p(second_to_last_runs))). We can then repeat the process for each earlier bus that leaves the stop. This will get us an answer, but note that we have done O(n) work for this bus. What happens when we get a slightly later bus that arrives at this station? Do we have to calculate this work all over again? No. The answers to p(last) and p(second_to_last) should be the same. Even performing the calculation again with cached values for each p, however, is enough to add that factor of O(n). To avoid this, at each node, we have to cache a list for each departing bus, namely, what is the probability of reaching if I arrive before THIS bus? This is calculated the first time we arrive at a node, and then simply looked up (be sure to sort it for fast lookup! or just use a treeset based on time) at each arrival. When a bus arrives, look up the probability we get there based on the next time a bus leaves (note that this may not involve actually TAKING this bus, but that doesn't matter). The code looks as follows:


...
<syntaxhighlight>
double goes[]; // probability a bus actually goes
double memo[]; // probability of arriving on time if we go on this bus
treeset<Pair>[] memo2; // array, one for each node, of a treeset of pairs. pair.first = time. pair.second = probability. sort based on time, and probability represents best probability after that time
recurse(bus) {
    if (memo[bus] != inf) return memo[bus];
    temp = memo2[bus.arrival_station];
    for (i : buses leaving bus.arrival_station in reverse order) { //
        recurse(i);
        p = max(temp.get((i+1).leave_time), memo[i]*goes[i] + temp.get((i+1).leave_time)*(1-goes[i])) // don't try to take i, OR try to take i, and it either goes or it doesn't
        temp.add(i.leave_time, p)
    }
    memo[bus] = temp.ciel(bus.arrive_time); //returns the first element with a time greater than our input
    return memo[bus];
}
 
</syntaxhighlight>
While this solution is correct and fast, the issue is the recursion depth. With 10^6 buses and possible stack frames, we're going to smash that. So naturally, we simply convert this to a DP problem

Revision as of 23:47, 2 November 2018

This problem asks gives us a bus schedule, where each bus has a probability of ACTUALLY running, and then asks us to figure out the max probability we can arrive at a given node before a given time.

Standard problems might ask us to optimize for the probability of ever reaching a given location, or the minimum time it takes to get there, but attempting to do both at the same time makes this slightly more difficult. A way to fix this might be to use a dp:

dp[node][time] = maximum probability of getting to node at time

or swapping the dimensions

dp[node][probability] = minimum time of getting to a node with at least a given probability

The issue with the first one is time is 10^18. Even with , where wee only consider times that a bus leaves or arrives at, we are still at 10^6^2 = 10^12. The second method is not viable since probability doesn't give a discrete measure, or perhaps more validly, there are too many discrete probabilities to memoize. Plus, it's not inherently clear what the relation would look like.

So lets go with some fundamentals and see where it gets us:

The probability of reaching the the airport by taking a bus is the probability we get there by taking our ideal first bus plus the probability that bus doesn't run times the probability of getting there on some later bus.

Based on this, we see that the probability of getting there depends on

  1. Our choice of bus to take
  2. ultimately the probability of getting there on the LAST possible bus

So when we arrive at a node, we would have to recurse on the LAST bus to leave the stop, then recurse on the second to last one, calculate whether it makes sense to try to take that second to last bus, and then get a probability of getting there on EITHER of the last two buses (max(p(last), p(second_to_last) * p(second_to_last_runs) + p(last) * 1-p(second_to_last_runs))). We can then repeat the process for each earlier bus that leaves the stop. This will get us an answer, but note that we have done O(n) work for this bus. What happens when we get a slightly later bus that arrives at this station? Do we have to calculate this work all over again? No. The answers to p(last) and p(second_to_last) should be the same. Even performing the calculation again with cached values for each p, however, is enough to add that factor of O(n). To avoid this, at each node, we have to cache a list for each departing bus, namely, what is the probability of reaching if I arrive before THIS bus? This is calculated the first time we arrive at a node, and then simply looked up (be sure to sort it for fast lookup! or just use a treeset based on time) at each arrival. When a bus arrives, look up the probability we get there based on the next time a bus leaves (note that this may not involve actually TAKING this bus, but that doesn't matter). The code looks as follows:

double goes[]; // probability a bus actually goes
double memo[]; // probability of arriving on time if we go on this bus
treeset<Pair>[] memo2; // array, one for each node, of a treeset of pairs. pair.first = time. pair.second = probability. sort based on time, and probability represents best probability after that time
recurse(bus) {
    if (memo[bus] != inf) return memo[bus];
    temp = memo2[bus.arrival_station];
    for (i : buses leaving bus.arrival_station in reverse order) { //
        recurse(i);
        p = max(temp.get((i+1).leave_time), memo[i]*goes[i] + temp.get((i+1).leave_time)*(1-goes[i])) // don't try to take i, OR try to take i, and it either goes or it doesn't
        temp.add(i.leave_time, p)
    }
    memo[bus] = temp.ciel(bus.arrive_time); //returns the first element with a time greater than our input
    return memo[bus];
}

While this solution is correct and fast, the issue is the recursion depth. With 10^6 buses and possible stack frames, we're going to smash that. So naturally, we simply convert this to a DP problem