Data Structures and Algorithms

P and NP

Decision Problems

Many computational problems can be phrased as decision problems, i.e. problems for which the intended output is either yes (accept) or no (reject). For example, is the pattern string p a substring of text string s? Is an object in a collection? Is a given array of integers sorted?
We can often turn an optimization problem into a decision problem. For example, the minimum spanning tree problem (minimize the sum of the weights of all edges in the resulting tree) can be viewed as the question of whether a given edge in the graph G should be included in G's minimum spanning tree, or, more commonly, whether G has a spanning tree of total weight at most k.
A decision problem can be represented as a set of inputs whose output is yes/accept. These inputs can further be encoded as strings over some alphabet.
For example, the decision problem "does an array consist of exactly 3 unique numbers, each between 1 and 5, sorted in ascending order?" can be represented by the following set of inputs: {[1, 2, 3], [1, 2, 4], [1, 2, 5], [1, 3, 4], [1, 3, 5], [1, 4, 5], [2, 3, 4], [2, 3, 5], [2, 4, 5], [3, 4, 5]}
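The "set of accepted inputs" view can be made concrete with a short sketch. The function names and default parameters below (accepted_inputs, accepts, low, high, size) are illustrative choices, not part of the original notes.

```python
from itertools import combinations

def accepted_inputs(low=1, high=5, size=3):
    """Enumerate every array the example decision problem accepts."""
    # combinations() yields tuples in ascending order with no repeated element,
    # which is exactly "unique numbers, sorted in ascending order".
    return [list(c) for c in combinations(range(low, high + 1), size)]

def accepts(arr, low=1, high=5, size=3):
    """Decide membership directly: yes (True) or no (False)."""
    return (len(arr) == size
            and all(low <= x <= high for x in arr)
            # strictly increasing implies both sorted and unique
            and all(a < b for a, b in zip(arr, arr[1:])))

if __name__ == "__main__":
    for arr in accepted_inputs():
        print(arr)
```

Under these assumptions the two functions describe the same language: accepts(arr) is True exactly for the arrays that accepted_inputs() lists.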

The complexity class P consists of the decision problems for which there exists at least one algorithm that produces the correct answer for every input in time polynomial in the size of the input.

The complexity class NP consists of the decision problems for which a proposed solution (a certificate for a yes answer) can be verified in polynomial time. There is no guarantee that an algorithm exists that can find such a solution in polynomial time.

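A well-known example is the travelling salesman problem: "Given a list of cities (vertices) and the distances between each pair of cities (weight of edges), what is the shortest possible route that visits each city exactly once and returns to the original city?" (Quote from Wikipedia with slight modification). Stated this way it is an optimization problem, which is NP-hard. Its decision version, "is there such a route of total length at most k?", is in NP, because a proposed route can be checked in polynomial time.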
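As a rough illustration, the decision form of the question ("is there a tour of total length at most some budget?") can be answered by brute force on very small inputs. The sketch below assumes the distances are given as a square matrix dist, and the names tour_length, shortest_tour and tsp_decision are hypothetical. It tries every ordering of the cities, so its running time grows factorially with the number of cities.

```python
from itertools import permutations

def tour_length(dist, tour):
    """Total length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

def shortest_tour(dist):
    """Optimization version: try every ordering that starts at city 0."""
    n = len(dist)
    candidates = ((0,) + rest for rest in permutations(range(1, n)))
    best = min(candidates, key=lambda t: tour_length(dist, t))
    return list(best), tour_length(dist, best)

def tsp_decision(dist, budget):
    """Decision version: is there a tour of total length at most budget?"""
    return shortest_tour(dist)[1] <= budget

if __name__ == "__main__":
    dist = [[0, 2, 9, 10],
            [2, 0, 6, 4],
            [9, 6, 0, 3],
            [10, 4, 3, 0]]
    print(shortest_tour(dist))
    print(tsp_decision(dist, 18), tsp_decision(dist, 17))
```

Checking one given tour with tour_length takes linear time, which is what places the decision version in NP. The expensive part is the search over all orderings.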

A closely related and slightly simpler problem is the Hamiltonian cycle problem: Given a graph G, is there a simple cycle in G that visits each vertex in G exactly once?
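To illustrate what "verifiable in polynomial time" means here, the following sketch checks a proposed Hamiltonian cycle (the certificate) against a graph. It assumes an undirected simple graph stored as a dict mapping each vertex to the set of its neighbours. The function name and this representation are choices made for the example only.

```python
def is_hamiltonian_cycle(graph, cycle):
    """Verify a certificate for the Hamiltonian cycle problem.

    graph: dict mapping each vertex to the set of its neighbours (undirected).
    cycle: proposed order of vertices, without repeating the start at the end.
    """
    n = len(graph)
    # A simple cycle in a simple graph needs at least 3 vertices, and the
    # certificate must list every vertex of the graph exactly once.
    if n < 3 or len(cycle) != n or set(cycle) != set(graph):
        return False
    # Every consecutive pair, including last -> first, must be an edge.
    return all(cycle[(i + 1) % n] in graph[cycle[i]] for i in range(n))

if __name__ == "__main__":
    square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    print(is_hamiltonian_cycle(square, [0, 1, 2, 3]))
    print(is_hamiltonian_cycle(square, [0, 2, 1, 3]))
```

The check does a constant amount of work per vertex, so verification is cheap even though no polynomial-time method for finding such a cycle is known.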

P is a subset of NP: any problem in class P is also a problem in class NP. Whether P is a proper subset of NP is an open question.

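NP-completeness: If a problem is in class NP and a known NP-complete problem can be reduced to it in polynomial time, then this problem is NP-complete. Unless P = NP, no problem in P is NP-complete. All NP-complete problems are equally hard to solve in the sense that each can be reduced to any other in polynomial time, so a polynomial-time algorithm for one of them would give a polynomial-time algorithm for all of them.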

NP-hard: A problem X is NP-hard if there is an NP-complete problem Y such that Y can be reduced to X in polynomial time. NP-hard problems are at least as hard as NP-complete problems, and may be harder. NP-hard problems may not even be in class NP.

Commonly known NP-complete problems:

SATisfiability (the first problem known to be NP-complete): Given a Boolean formula (usually in conjunctive normal form), is there a consistent assignment (TRUE or FALSE) to the variables in the formula so that the formula evaluates to TRUE?
3-SAT is a special case of SAT in which the formula is in conjunctive normal form and each of its disjunctive clauses is limited to at most 3 literals. 3-SAT is also NP-complete.
Vertex Cover: is there a vertex cover containing at most K vertices for a graph G? A vertex cover is defined as a set of vertices that includes at least one end point of every edge of the graph.
Clique: given a graph G and an integer k, is there a subset of at least k vertices of G that forms a complete subgraph? A complete graph is defined as a graph in which every pair of vertices is adjacent.
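Each of the problems above is in NP because a proposed solution can be checked quickly. The sketch below shows one possible checker per problem. The representations are assumptions made for illustration: a CNF formula is a list of clauses, each clause a list of nonzero integers (i for variable i, -i for its negation), and a graph is a list of undirected edges given as pairs.

```python
from itertools import combinations

def satisfies(cnf, assignment):
    """SAT / 3-SAT: does the assignment (dict variable -> bool) satisfy every clause?"""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def is_vertex_cover(edges, cover):
    """Vertex Cover: does the set contain at least one endpoint of every edge?"""
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)

def is_clique(edges, vertices):
    """Clique: is every pair of the given vertices adjacent?"""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set
               for pair in combinations(set(vertices), 2))

if __name__ == "__main__":
    cnf = [[1, -2, 3], [-1, 2], [-3]]
    print(satisfies(cnf, {1: True, 2: True, 3: False}))
    edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
    print(is_vertex_cover(edges, {1, 3}), is_clique(edges, {1, 2, 3}))
```

To answer the decision questions, one would also compare the size of the certificate with K (at most K vertices for Vertex Cover, at least k for Clique), which is a single length check.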

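Reduction (with a polynomial function) of 3-SAT to Clique: A conjunctive normal form formula with k three-literal clauses can be transformed in polynomial time into a graph that contains a clique of size k if and only if the formula is satisfiable. The implication is that if Clique has a polynomial-time solution, then 3-SAT also has a polynomial-time solution. Since 3-SAT is NP-complete and Clique is in NP, Clique is also NP-complete.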
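One common textbook construction for this reduction creates a vertex for each literal occurrence and joins two vertices when they come from different clauses and are not negations of each other. The sketch below follows that construction. It assumes clauses are lists of nonzero integers (i for variable i, -i for its negation), and the function names are illustrative. The two brute-force solvers are exponential and are included only to compare answers on tiny inputs.

```python
from itertools import combinations, product

def sat_to_clique(cnf):
    """Build (vertices, edges, k) so that cnf is satisfiable iff a k-clique exists."""
    # One vertex per (clause index, literal) occurrence.
    vertices = sorted({(i, lit) for i, clause in enumerate(cnf) for lit in clause})
    # Join literals from different clauses unless they contradict each other.
    edges = {frozenset((a, b)) for a, b in combinations(vertices, 2)
             if a[0] != b[0] and a[1] != -b[1]}
    return vertices, edges, len(cnf)

def has_clique(vertices, edges, k):
    """Brute force: test every group of k vertices (exponential in general)."""
    return any(all(frozenset(pair) in edges for pair in combinations(group, 2))
               for group in combinations(vertices, k))

def brute_force_sat(cnf):
    """Brute force: try every truth assignment (exponential in general)."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in cnf):
            return True
    return False

if __name__ == "__main__":
    cnf = [[1, 2, 3], [-1, -2, 3], [1, -2, -3]]
    print(brute_force_sat(cnf), has_clique(*sat_to_clique(cnf)))
```

The construction itself only loops over pairs of literal occurrences, so it runs in polynomial time. A clique of size k picks one literal per clause with no contradictions, which is exactly a satisfying assignment.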

How are NP-complete problems usually solved?

Using approximation algorithms: try to find not the best/optimal solution but a good enough one.
Backtracking algorithms: these usually take exponential time in the worst case, but can be sped up when heuristics can be applied.
Branch-and-bound algorithms: branching divides the search space into multiple subspaces, and bounding functions eliminate any subspaces that do not look promising to contain solutions.
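Two of these strategies can be sketched on the minimum vertex cover problem. The first function is the classic 2-approximation that takes both endpoints of any edge not yet covered. The second is a simple branch-and-bound search that uses the approximate answer as its initial bound. Function names and the edge-list representation are assumptions made for this example, and the search is still exponential in the worst case.

```python
def approx_vertex_cover(edges):
    """Approximation: the result is at most twice the size of an optimal cover."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))   # take both endpoints of an uncovered edge
    return cover

def exact_vertex_cover(edges):
    """Branch and bound: exact minimum vertex cover for small graphs."""
    best = approx_vertex_cover(edges)      # initial upper bound

    def search(remaining, chosen):
        nonlocal best
        if len(chosen) >= len(best):       # bound: this branch cannot improve
            return
        if not remaining:                  # every edge is covered
            best = set(chosen)
            return
        u, v = remaining[0]
        for pick in (u, v):                # branch: one endpoint must be chosen
            search([e for e in remaining if pick not in e], chosen | {pick})

    search(list(edges), set())
    return best

if __name__ == "__main__":
    cycle5 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
    print(len(approx_vertex_cover(cycle5)), len(exact_vertex_cover(cycle5)))
```

The approximation runs in linear time and gives a guaranteed quality bound. The exact search pays exponential time in the worst case, but prunes any branch whose partial cover is already as large as the best cover found so far.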