Open-Pit Mining
This problem gives us a DAG of nodes, each with an associated value and asks us for the maximum value subset with the condition that if any node is chosen, any of its ancestor nodes must be as well.
First simplification is to just subtract cost from value for each node to only deal with a single number. There is no reason to deal with them separately.
If one thinks through a couple of example graphs, including the one in the problem statement, the intuitive understanding is that in order to choose a positive node lower in the DAG, one must be able to "justify" the negative value of the nodes above it in the DAG, or more generally, the selection of positive value nodes must collectively be able to justify the negative weight nodes which are required to dig them up. Any set of nodes in our solution must be constrained by the negative weights, that is, if we are matching positive weight to negative 1-to-1, then we will run out of negative first, and the positive left over is our "profit". (This is a fancy way of saying the profit is positive). Conversely, any nodes that are NOT in our solution must be constrained by the amount of positive weight available, that is, there is not enough positive to match all the negative.
Taking this, we can start to formulate a solution. Consider the following:
- select some positive weight node
- choose some ancestor node with a negative weight and "allocate" our positive weight against it.
- if there is positive weight left over, choose some other ancestor
- repeat with some other positive weight node, keeping in mind we have already accounted for the cost of some of the negative weight nodes with the first positive node
One might see a shortcoming of this solution. We could, when choosing our first positive node, match it with some negative node. Some later node might then find it has no-body to match with, but it would if we had chosen a different negative node to match with the first node. We need some way to "push" the value back to the first node so it can use a different negative node. Sounds a lot like an augmenting path algorithm.
The construction we use is
- draw an edge from a source to each positive weight node with the weight of that node
- draw an edge of infinite weight from each positive weight node to each negative one that is an ancestor
- draw an edge from each negative weight node to a sink with abs(weight)
When max flow is run, the set of nodes which are "negative weight" constrained will be the ones which ultimately determine the max flow of the graph (aka the min-cut). Nodes on the source side of the cut will be in the final set (meaning the cut is beyond the negative weight nodes, and thus the set is negative weight constrained), and nodes on the sink side will cut will not be included. The problem only asks for the maximum flow, however.
Note how this construction also guarantees that all our ancestors are included. Since there is an edge from each node to each of its negative weight ancestors, it means that both will always be on the same side of the cut (or we'd have the cut spanning an infinite edge, which obviously is not minimal!). Further, we can prove trivially that our positive weight ancestors will be included as well (if we're included, it means we accounted for the value of all the negative weight ancestors successfully, which means a digging up a positive weight ancestor would be free, so must be included in an optimal set).
There are 200 nodes, which means we run fast enough. Further, the count is also a good hint that the solution required a flow.
One last note: the transformation of the graph is suspiciously similar to the construction of weighted könig's theorem. It is possible a minor modification to the theorem can justify the construction here.