In computational complexity theory, NP ("Non-deterministic Polynomial time") is the set of decision problems solvable in polynomial time on a non-deterministic Turing machine. Or, equivalently, the set of decision problems that can be reformulated as a binary function A(x, y) over strings such that
- for a certain constant number c it holds that a string x is an element of the original decision problem iff there is a string y with length smaller than |x|c such that A(x, y),
- the function A is decidable by a Turing machine in polynomial time.
An y value for a certain x such that A(x, y) holds is usually referred to as a certificate for x since it shows the membership of x to the original decision problem.
The importance of this class of decision problems is that it contains many interesting searching and optimization problems where we want to know if there exists a certain solution for a certain problem. Examples are the traveling salesman problem where we want to know if there is a route of some length that goes through all the nodes in a certain network and the satisfiability problem where we want to know if a certain formula in propositional logic with propositional variables is satisfiable or not. In both cases there is a certificate (a path that of that length, a truth assignment that makes the formula true) that is limited in size and for which we can decide in polynomial time whether it solves the problem.
Because of its importance there have been many efforts to come up with algorithms that decide the problems in NP in polynomial time. However, it seems that for some problems in NP we cannot come up with an essentially better algorithm than the one that simply tries all possible certificates until we find the right one. Whether this is really true or not is still one of the big open questions in computer science (see Complexity classes P and NP for an in-depth discussion). An important notion in this context is the set of NP-Complete decision problems, which is a subset of NP and might be informally described as the "hardest" problems in NP. These are problems that if there is a polynomial-time algorithm for even one of them, then there is a polynomial-time algorithm for all the problems in NP. Because of this, once a problem has been proven to be NP-complete this is widely regarded as a sign that a polynomial algorithm for this problem probably does not exist.
A commonly used equivalent characterization of the problems in NP is this: if, given a solution, that solution can be checked for validity in polynomial time, the problem is in NP. The nondeterministic Turing machine essentially has a "guessing unit" that allows it to guess all the bits needed to construct the right answer, if one exists. If the correctness of the answer can then be verified by a polynomial-time algorithm, then the problem is solvable in nondeterministic polynomial time. There are many common problems that can be solved in this way; if one can calculate the "quality" of a solution in polynomial time, then a decision version of the problem, such as "is there a solution with quality better than k", can be solved by a nondeterministic polynomial-time Turing machine.
There is also a simple logical characterization of NP: it contains precisely those languages expressible in second order logic restricted to exclude universal quantification over relations, functions, and subsets.
Example
The decision version of the traveling salesperson problem is in NP. Given an input matrix of distances between N cities, the problem is to determine if there is a route visiting all cities with total distance less than k. A nondeterministic Turing machine can find such a route as follows:
- At each city it visits it "guesses" the next city to visit, until it has visited every vertex. If it gets stuck, it stops immediately.
- At the end it verifies that the route it has taken has cost less than k in O(n) time.
One can think of each guess as "forking" a new copy of the Turing machine to follow each of the possible paths forward, and if at least one machine finds a route of distance less than k, that machine accepts the input.
What makes this a natural decision version of the problem is that, if we could solve this problem quickly, we could use binary search to solve the original computation problem (finding the route and its exact length) quickly as well.