Author: Philip Wu
If I gave you a set of numbers, how many times would you have to access a number in order to give me any number from that set? Of course, only once!
If I gave you a set of n numbers, how many times would you have to access a number to return the smallest number in the set? Well, you simply have to store the current minimum somewhere and then compare that to each number in the set, then replacing that current minimum with that number if it is larger. Then, we have to have n amount of accesses. In more formal notation, if we were given a problem T, in which we receive a set of n items and have to return the smallest one, thenT(n)=O(n).
Next, if I gave you a set of n numbers and wanted you to sort that set, how many accesses would you have to make in order to carry out that process? A slow way to do that would be to check for the minimum in the entire set, then look for the second smallest number, then for the third, etc. That would take around n^2 accesses. Our problem T(n)=O(n^2).
Do you notice anything about the number of steps it took in the last two problems? Yes! They are all polynomials in comparison to our input size, n. These problems are therefore noted as Polynomial-time problems-- the number of steps we have to carry out is a polynomial, in proportion to our input size.
Now, what if I gave you a set of n numbers and asked you to return all its subsets that sum up to a number m? How many steps would that take? Well, maybe we could find all the subsets, and then perform addition in each one to see which ones add up to m. Our computational time would then be the number of subsets times n. If we were to find the number of subsets, we would remark that in making a subset, one item has two choices, either be in that subset or don’t be in that subset. The number of subsets is therefore 2n, making our problems computational time (2^n)n. In the worst case scenario, checking a solution to this problem would also be (2^n)n.
Solving a sudoku puzzle would also be exponential, as the subset sum problem, but checking if a solution to a sudoku puzzle is correct or not would be rather simple. We could solve it in polynomial time.
Let's say we do the three problems above with a set of 10 numbers, then 100 numbers, then 1000 numbers, and so on. What is each problem’s computational time, based on their input? You will see that the time it takes to complete the third problem skyrockets.
There are many problems that have computational times that have the same nature as our third problem, that there they take an exponential amount of time to solve, in comparison to its input, much more than there are problems with polynomial computation times, in fact. More specifically, there are many problems that are similar to our fourth example, sudoku. They can be solved in exponential time, but can be checked in polynomial time. In more formal terms, these problems can be solved by a non-deterministic Turing Machine in polynomial time, where non-deterministic means that giving an algorithm the same input twice does not necessitate it give you back the same output, or take the same route to get to the same output. They can, however, be checked by deterministic turing machines in polynomial time, where deterministic means giving an algorithm the same input multiple times will make it return the same output every single time and take the same path to that output every single time.Many of these non-deterministic polynomial time problems have huge real world applications, where, if simpler, they could make huge advances in every field, including helping cure cancer,
This is the big deal around the P vs NP problem. It asks whether the set of all polynomial time problems is equal to or contained in the set of all non-deterministic polynomial time problems. The process of proving it not equal would generate tons of new techniques and information in computer science. The process of proving it is equal has the promise of giving the benefits listed above, but it is likely not equal But let’s suppose we ignore that and ask ourselves how would we even prove P=NP? We could prove it has to, or we could just find polynomial time algorithms for all NP problems, however the second one would take way too much time.
Or would it? Within the set of NP problems, there is a set of problems that are NP-complete. That is, they are in NP and as hard as NP problems can be (NP-hard), and every other problem that is not as hard as that problem has a computational time a polynomial term less than that problem. Finding a polynomial time algorithm for a NP-complete algorithm would therefore mean ALL NP problems are computable in polynomial time. Previously, there were problems that were believed to be NP problems, but further research showed you could create a P algorithm for it. These problems weren’t NP-Complete, though.
So that’s a brief overview of P vs NP, but P vs NP and the P and NP sets are just a tiny introduction to the computer science topic concerned with the computational time and space of a problem--Complexity. In Complexity, P and NP are just two classes of many more classes. Many, many, many more classes. The rest of this topic will attempt to provide a brief overview of some of the more common complexity classes.
This image will act as a guide for the rest of this topic. Refer back to this image for a nice and neat visualization of the content. However, take it with a grain of salt. It shows that NP is larger than P, but as we discussed earlier, that may not be the case, as is similar to other relations between other classes. A more accurate representation regarding that problem would be like this--
--where each “?” indicates that we don’t know if these two sets are equal or not.
Unlike before in the topic, where our problems returned sets, or numbers, etc, further ahead our problems will only return two states-- yes or no. This doesn’t make these problems inherently different, but it does help us in defining particular complexity classes.
We defined the NP class as the class of problems that can be solved in non-deterministic polynomial time but can be checked in deterministic polynomial time checks. Here, we will redefine it so that it can check if a “yes” solution works in polynomial time. It therefore follows that there is a class such that its members can be solved in non-deterministic polynomial time and can check if a “no” solution doesn’t work. That is coNP. The intersection of these two are the problems that can be checked in polynomial time for both no and yes solutions. Here are some NP problems:
1. The traveling salesperson
2. Factoring
And here are some NP-Complete problems:
1. Boolean satisfiability
2. Hamiltonian Paths Problem
Now within the NP and coNP classes there is the RP (randomized polynomial time) and coRP class, respectively. Here we have to introduce another turing machine, a probabilistic one, where the machine is non-deterministic and has a chance of returning the wrong answer (for example it might return yes if the actual answer is no). The RP class is then defined as always returning no when the answer is no, and having a 50% chance of returning no when the answer is actually yes, and a 50% chance of correctly returning yes. The coRP class is the opposite of that. It has a 50% chance of returning yes when the answer is actually no and always returns yes for answers that are actually yes. The intersection of these two classes is defined as another class ZPP (zero-error probabilistic polynomial time). Whenever ZPP problems return yes or no, they are always returning the correct answer. Otherwise, they return I don’t know with probability at most 50%.
The BPP (bounded-error probabilistic polynomial time) class is the class of problems that return the wrong answer at most ⅓ of the time, no matter whether the correct answer is yes or no. The P vs BPP problem is also extremely important, as it would prove that all probabilistic problems have polynomial solutions, and a part of the NP and coNP class are probabilistic. EXPTIME is simply the class of all problems that can be solved in exponential time, including all PTIME (polynomial) problems.
Now, departing from time and onto space, where space represents memory. PSPACE is the class of all problems that can be solved by a deterministic turing machine in a polynomial amount of space and NPSPACE is the class of all problems that can be solved by a non-deterministic turing machine in a polynomial amount of space. Similarly, EXPSPACE and NEXPSPACE are the same as PSPACE and NPSPACE respectively but in an exponential amount of space.
Comments