Skip to content

Instantly share code, notes, and snippets.

@xiaohan2012
Last active March 4, 2016 12:50
Show Gist options
  • Save xiaohan2012/f6856024d0e4da1b4d52 to your computer and use it in GitHub Desktop.
Save xiaohan2012/f6856024d0e4da1b4d52 to your computer and use it in GitHub Desktop.

PCST:

  • why I work on PCST
    • the road map: PCST->MST->Quota->Budget
    • PCST->MST
    • MST = Quota
    • Quota -> Budget: how the 5-approximation comes?
    • that's for undirected case
  • incorrectness of the proof on the greedy primal-dual method
    • the chapter
      • p11: minimum violation set
      • p19: general algorithm
      • p40: pcst algorithm
    • dtgs
    • Def of minimal violation set in Figure 4.3 will be the roots of active components. Then dual constraints might be broken.
    • However, if we let the violation set be all the active components and their subset, solution feasibility is not guaranteed. Even the reverse deletion step will not help.
    • No idea on how to formulate integer program for PCST-DAG specifically
    • No idea on the algorithm if we use the original integer program: specifically, what is the VIOLATION-SET oracle?
      • generate feasible solution
      • does not violate any dual constraints so that approximation guarantee can be proved
  • Is PCST-DAG NP-hard?
    • how is PCST NP-hard derived? derive steiner tree(ST) to PCST as ST is special case of PCST
  • PCST-dag is NP-hard because Steiner-DAG is NP-hard and Steiner-DAG is special case of PCST-DAG
    • (root, set node) with weight 1
    • (set node, element node) with weight 0
  • for directed case: I only found resources for steiner problem
  • branch-and-cut algorithm by Ivana seems to be more general:
    • the paper: how to prove the approximation ratio
      • why transform to directed version?
      • a different IP forumulation
      • some preprocessing:
        1. least-cost test: path from i to j has less cost than edge from i to j, remove the edge
      • branch-and-cut
        • insert the cut constraints(exponential) at the begining of the branching/separation step
        • why use the maxflow to find the violating constraints
      • no approximation guarantee

Sampling:

  • checked the query paper:
    • it focues on modifying the result on the same tree
    • however, we want to find trees on different parts of the graph
  • the root sampling stragegy
    • simple upperbound and pruning using threshold
      • determining the threshold: U / average edge cost
    • further sample by heuristic by choosing either of the following:
      • exploitation: subtree that has high log|V| * |V| / |C|(or simply density thresholded by subtree size) is more promising, we maintain a fixed-size queue for those roots
      • exploration: select those with high upper bound
      • probability to explore depends on how much we have covered the all nodes. For example, we covered 90% of the nodes, then the proba to explore is 0.1
    • a pipline to add sampling preference to roots(maybe overkilling):
      • sample roots that are at the peak of the frequency against time
      • dense subgraph detection to identify candidates
      • preference on people/interaction on the dense interaction subgraph
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment