P and NP
Decision Problems
- Most computing problems belong to the category of decision problems,
i.e. the computational problems for which the intended output
is either yes (accept) or no (reject).
For example, is the pattern string p a substring
of text string s? Is an object in a collection? Is a given array
of integers sorted?
- We can often turn an optimization problem into a decision problem.
For example, minimum spanning tree (minimize/optimize the sum of the
weights of all edges in the result tree) can be viewed of
whether an edge in the graph G should be included in G's minimum
spanning tree.
- A decision problem can be represented as a set of inputs
whose output is yes/accept. These inputs can further be encoded
as strings over some alphabet.
For example, the decision problem "is an array consists exactly
3 unique numbers, each number is between 1 and 5, sorted in
ascending order?" can be represented by the following set of inputs:
{[1, 2, 3], [1, 2, 4], [1, 2, 5], [2, 3, 4], [2, 3, 5], [3, 4, 5]}
The complexity class P includes those problems,
such that for each problem, there exists at least one algorithm
that can find its solution in polynomial time.
The complexity class NP includes those problems,
such that for each problem, its solution can be verified
in polynomial time, but there is no guarantee that there exists
an algorithm that can find its solution in polynomial time.
An example of NP problem is travelling salesman problem:
"Given a list of cities (vertices) and the distances between each
pair of cities (weight of edges), what is the shortest possible
route that visits each city exactly once and returns to the
original city?" (Quote from Wikipedia with slight modification).
A slightly simplified example of the travelling salesman problem
is Hamiltonian Cycle problem: Given a graph G, is there a simple
cycle in G that visits each vertex in G exactly once?
P is a subset of NP. Any problem in class P is also a problem
in class NP. There is no known answer whether P is a proper
subset of NP.
NP-completeness: If a problem is in class NP and can
be reduced to a know NP-complete problem in polynomial time,
then this problem is NP-complete. Any problem in P is NOT
in class NP-complete. All NP-complete problems are equally
hard to solve.
NP-hard: A problem X is NP-hard if there is an NP-complete
problem Y such that Y can be reduced to X in polynomial time.
NP-hard problems can be equally hard as or harder than NP-complete
problems. NP-hard problems may not even in NP class.
Commonly know NP-complete problems:
- SATisfiability (the first problem known to be NP-complete):
Given a Boolean formula (usually in conjunctive normal form),
is there a consistent assignment (TRUE or FALSE) to the variables
in the formula so that the formula evaluates to TRUE?
3-SAT is a special case of SAT problems, where
each conjuctive's disjunctive clause is limited to at
most 3 literals. 3-SAT is also NP-complete.
- Vertex Cover: is there a vertex cover containing at most
K vertices for a graph G? A vertex cover is defined as
a set of vertices that includes at least one end point
of every edge of the graph.
- Clique: is there an integer k, such that a subset graph of G with
at least k of its vertices form s a complete graph.
A complete graph is defined as a graph that every
pair of its vertices are adjacent.
Reduction (with a polynomial function) of 3-SAT to Clique:
A conjuctive expression with k three-literal clauses can be reduced
to a k-Clique problem. The implication is that if Clique
has a polynomial solution, then 3-SAT also has a polynomial
solution. Therefore, Clique is also NP-complete.
How are NP-complete problems usually solved?
- Using approximation algorithms: try to find not the best/optimal
solution but a good enough one.
- Backtracking algorithms, which are usually exponential, but
can be improved if we can apply heuristics.
- Branch-and-Bound algorithms: branching would divide search space
into multiple sub spaces and bounding functions would eliminate
any sub spaces that doesn't look promising to contain solutions.