There's already a known algorithm M you can implement right now that can find satisfying assignments to satisfiable SAT instances in polynomial time conditioned only on P=NP.
Note: I'll be glossing over a few technicalities in this explanation, feel free to ask about them.
Roughly, the algorithm M is as follows. Suppose you have input x of length n. For each integer c between 1 and n, encode c in binary and try to decode it in UTF-8. If it decodes, try to interpret it as a Python program, and run it on input x for 100n^c steps. If it executes successfully and outputs a valid assignment, return the assignment. Otherwise, continue onto the next c. If c=n and you haven't found an assignment yet, do a brute force search over assignments and return what you find.
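For concreteness, here's a rough Python sketch of M (illustration only, and nowhere near practical to run). run_limited is a hypothetical helper that executes untrusted source code on x with a step budget (e.g. via sys.settrace or a sandboxed subprocess), and the CNF representation is just one arbitrary choice; I pass the parsed formula alongside its encoding x purely for convenience.

    from itertools import product

    def is_satisfying(cnf, assignment):
        # cnf: list of clauses, each a list of signed ints (DIMACS-style literals);
        # assignment: dict mapping variable number -> bool
        return all(any(assignment.get(abs(lit), False) == (lit > 0) for lit in clause)
                   for clause in cnf)

    def brute_force(cnf, num_vars):
        for bits in product([False, True], repeat=num_vars):
            assignment = dict(enumerate(bits, start=1))
            if is_satisfying(cnf, assignment):
                return assignment
        return None

    def M(cnf, num_vars, x):
        n = len(x)                  # x is the encoded instance, n its length
        for c in range(1, n + 1):
            try:                    # interpret the binary encoding of c as UTF-8
                source = c.to_bytes((c.bit_length() + 7) // 8, "big").decode("utf-8")
            except UnicodeDecodeError:
                continue
            try:                    # run it as Python for 100 * n**c steps (hypothetical helper)
                candidate = run_limited(source, x, steps=100 * n ** c)
            except Exception:
                continue
            if candidate is not None and is_satisfying(cnf, candidate):
                return candidate
        return brute_force(cnf, num_vars)   # no luck by c = n: fall back to brute force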
Now suppose P=NP, so some Python program finds SAT solutions in polynomial time. In fact, there are infinitely many such programs, since each valid program can be prefixed with as many "pass" statements as we want. Interpreting the UTF-8 encoding of any such program p in binary gives us some integer c.
There are only finitely many inputs of length ≤ c, so the slowest of them still runs in finite (and thus polynomial) time. For any input of length > c, the above algorithm M will spend at most 100(c-1)n^c steps trying to execute the (c-1) earlier strings in Python for at most 100n^c steps each. If M happens upon a solution before then, great! If it doesn't, then we run p -- which by assumption does find solutions in polynomial time -- for 100n^c steps.
If 100n^c is an upper bound on the true running time of p, then this necessarily returns a valid assignment, meaning that we can find assignments in O(c n^c) time for some integer c, i.e. polynomial time. Otherwise, if 100n^c is too small, we just wait for the next copy p' of p (with a "pass" prefixed to it), which will have an even looser time bound of 100n^c'. This also may not be enough, but eventually we will find some p''''' for which the time bound does suffice (depending on the true running time of p). And so we find valid solutions in O(c''''' n^c''''') time.
None of this required us to know what p is; effectively M makes the non-constructive constructive by trying every program out in a systematic way, such that the running time never exceeds some (also unknown) polynomial bound.
Is it practical? Suppose that the shortest Python program finding SAT assignments is at least 100 bytes long. Then c is on the order of 256^100, and thus the running time is something like O(n^(256^100)). So be sure to get a good CPU.
This seems like you're trolling. You're basically doing a brute force search for polynomial time programs... so you do need to know the polynomial time program before you actually solve anything?! You're just bounding the search space by only allowing programs to run for a certain number of steps, under the assumption that any problem you care to solve would be solvable within a finite number of steps... and then claiming that both the search for the program and its execution turn out to run in polynomial time - which seems obviously wrong since, as you yourself point out, the exponent you're talking about is itself growing exponentially with the size of the input?!?!
>so you do need to know the polynomial time program before you actually solve anything?!
No, you try all the programs one by one. Assuming P = NP one of them will work eventually. The number of programs you need to search before finding the correct one is a constant, even if we don't know what it is. Then by scaling the number of steps you do at each iteration you guarantee that the total runtime will be polynomial in the size of the input.
It's exponential in the size of the P-time algorithm for SAT but that's a constant and therefore irrelevant to the complexity class.
> No, you try all the programs one by one. Assuming P = NP one of them will work eventually.
You're repeating what I said: you do need to find the program first, and you do that by brute force. What exactly is the disagreement here? Do you not know that trying one-by-one is exactly what "brute force" means? Perhaps this theorem is mathematically sound and I am missing something, but as a programmer I can guarantee this solution is completely useless.
> as a programmer I can guarantee this solution is completely useless
Oh absolutely, you'd never want to run such a program. However, its existence is a useful mathematical trick. Similar to a brute-force chess program: it's not practical to run, but its existence tells us a lot about chess (e.g. that it doesn't require consciousness, or other such woo).
> you do need to find the program first
Note that there are two programs being considered: a program P which directly solves NP problems in P-time; and a "universal search" program which indirectly solves NP problems in P-time.
The universal search definitely solves the problem (assuming P exists), since it brute-forces the outputs of all programs, until it finds a valid solution. Eventually it will either reach the output of P (which is a valid solution); or it will stop earlier by finding some other solution first. Let's assume the worst case, that it doesn't find anything before P's solution: how long will this search take?
Let's ignore Python and stick to some simple binary encoding (e.g. for some Turing machine, or whatever). We'll make our encoding prefix-free by using '11' to indicate the 'end of program'; hence every binary string which only contains '11' at the end is a valid program. Here are the first few, arranged by length:

length 2: 11
length 3: 011
length 4: 0011, 1011
length 5: 00011, 01011, 10011
...and so on.
We're using a binary alphabet, so we'll divide up our runtime between these programs using powers of 2, such that each gets a fraction 1/(2^length): the program of length 2 gets 1/2^2 = 1/4 of our runtime; the program of length 3 gets 1/2^3 = 1/8 of our runtime; the programs of length 4 each get 1/2^4 = 1/16 of our runtime; and so on. This exponential allocation ensures that we never allocate more than 100% of our runtime (similar to the sum 1/2 + 1/4 + 1/8 + ... never exceeding 1). The prefix-free encoding ensures that longer programs only exist in the "gaps" left by shorter programs (e.g. imagine every program arranged as leaves of a binary tree; tracing the left/right paths down through the tree gives us a prefix-free encoding). Fun fact: the parent used UTF-8 because that's a prefix-free encoding.
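As a quick sanity check of that allocation (just my own illustration, nothing from the original construction), you can enumerate the codewords up to some length in Python and confirm the 1/2^length shares never add up to more than 100% (this is the Kraft inequality):

    from itertools import product

    def is_program(s):
        # a codeword ends in '11' and contains no other occurrence of '11'
        return s.endswith("11") and "11" not in s[:-1]

    total = 0.0
    for length in range(2, 13):
        count = sum(is_program("".join(bits)) for bits in product("01", repeat=length))
        total += count / 2 ** length
        print(f"length {length}: {count} programs, runtime allocated so far: {total:.4f}")

The running total appears to converge to 1, so (in the limit) essentially no runtime is wasted.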
The program P will have some binary encoding in this scheme; let's say its length is L bits. L is just a constant: it has nothing to do with the particular NP problem instance we want to solve; e.g. if we're solving a traveling salesman problem, the length of the program P has nothing to do with the number of cities we want to visit. According to the above scheme, universal search will spend 1/2^L of its time running P: that is exponential in L but independent of our problem size (e.g. the number of cities our salesman is visiting).
Hence universal search will run one step of P for each 2^L of its own steps; that's probably quite a large, impractical slow-down, but as far as complexity theory's concerned it's only a constant factor. If running P directly has some polynomial time complexity like O(N^A) for some power A, then universal search will have complexity O(2^L × N^A) = O(N^A) since the constant factor is dominated by the polynomial factor.
You may be left wondering how universal search can keep track of all these running programs. There are actually several ways to do it, depending on preference. One particularly simple approach is the "FAST" algorithm: this only runs one program at a time, and just restarts everything from scratch over and over. For example, we can run 11 for one step, and see if it's made a valid solution; if not, start again by running 11 for two steps, then 011 for one step; if neither found a solution, start again by running 11 for four steps, then 011 for two steps, then 0011 for one step, then 1011 for one step; and so on. This exponential-restarting ends up taking twice as long, but that's just another constant factor we can ignore: https://people.idsia.ch/~juergen/toesv2/node28.html
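Here's a sketch of that restart schedule in Python (my own paraphrase of the idea, not code from the linked page; run_limited and is_valid_solution are hypothetical helpers for step-limited execution and solution checking):

    from itertools import product

    def programs_of_length(length):
        # codewords of the prefix-free encoding: end in '11', no other '11'
        return ["".join(bits) + "11" for bits in product("01", repeat=length - 2)
                if "11" not in "".join(bits) + "1"]

    def fast_search(x):
        phase = 1
        while True:
            # in phase k, a program of length L gets 2^(k + 1 - L) steps, so every
            # program's budget doubles from one phase to the next
            for length in range(2, phase + 2):
                for prog in programs_of_length(length):
                    steps = 2 ** (phase + 1 - length)
                    result = run_limited(prog, x, steps)                     # hypothetical
                    if result is not None and is_valid_solution(x, result):  # hypothetical
                        return result
            phase += 1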
Yet another approach is to choose programs and divide up our runtime using probabilities (e.g. coin tosses). The GUESS algorithm performs the same program-length-based allocation as FAST, but is embarrassingly parallel rather than sequential: https://people.idsia.ch/~juergen/toesv2/node31.html
If you don't like linear factors, you can also use Hutter Search: that starts off by running universal search, but in parallel it performs a brute-force search for proofs that some other program is both equivalent and faster (such as P, in the above example). It uses exponential restarts (like FAST), which lets it switch to a faster program once found. Hutter Search only has a small linear slowdown (depending on how often we restart, and how we divide our resources between running our current solution and searching for provably-faster ones). Instead, it has a colossal (but constant!) additive factor (the time spent finding a provably faster program, then waiting for the next restart so it can switch to running that).
Ok, I get it now. You're basically playing around with the complexity factor by bounding the universe of possible solutions (programs) such that it will only grow polynomially. And assuming that, because P=NP, the size-bounded programs you try are guaranteed to find an answer, all without depending on the size of the input.
But I think the only relevance of the P vs NP problem is in whether or not we can find answers to NP problems in a "timely" fashion (as the article defines it: before the end of the universe), and I think one could show that this strategy (the name is universal search) doesn't guarantee that for any practical hardware we can envisage, now or in the future - the number of steps is just far too great for any interesting problem, even if it's not nearly as bad as exponentially growing (being less bad is not enough)?
> Oh absolutely, you'd never want to run such a program. However, it's existence is a useful mathematical trick.
This was the beginning of the thread: If you have a non-constructive proof that P=NP, it still means you cannot solve all NP problems quickly until someone actually finds a program to do so.
I do agree the reply about brute forcing the solution space is a bit trollish, in the sense that while it does purport to provide a "constructive" way to find the polynomial time NP solver, it's not actually feasible.
The sleight of hand is where it asserts that the length of the program is a "constant" -- which is technically true because it doesn't depend on the length of the NP problem they're trying to solve... but we're not trying to solve the NP problem, we're trying to find the solver for the NP problem.
I'm no expert in all things NP, but the assumption that a polynomial time SAT solver (if it exists) would have a "normal" length seems very suspect to me.
Very nice explanation! I think it's worth mentioning that this is called universal search (you mention this in passing) or Levin Search or Levin's universal search algorithm. It was invented by Leonid Levin in 1972.
>The following does not quite show it can't happen but it does give one pause: an old result of Levin's shows that there is a program you could write NOW such that if P=NP then this program decides SAT except for a finite number of formulas (all of which are NOT in SAT) and can be proven to work in poly time (I will later give three pointers to proofs). The finite number of formulas is why the people above may still be right. But only a finite number- seems like a weak kind of nonconstructive.
So, based on this post, it is actually not known that P=NP implies the existence of a polynomial time algorithm for deciding SAT. The universal search idea only yields a polynomial time algorithm that decides SAT for all but a finite number of formulas.
I didn't say it solves every SAT problem in polynomial time; I said it can
> find satisfying assignments to satisfiable SAT instances in polynomial time
The distinction between this and solving SAT, and the slight strengthenings one can make to universal search, are part of those few technicalities I promised to gloss over.
Also, cut the snark. I'm not a complexity theorist, but my PhD thesis was in the intersection of graph algorithms and complexity (the first word of the title literally being "Hardness"), and I'm more than qualified enough to understand and present well-known complexity results from half a century ago, even when a misguided appeal to authority is at play.
My comment was not meant to be snarky. I actually got the (incorrect) impression from reading your comment that P = NP would imply a polynomial time algorithm that decides SAT. I decided to look up a more authoritative source and shared the best resource that I could find.
I am guessing you took offense at the "an actual complexity theorist" part of my comment, but it wasn't a personal remark against you. The general audience of hackernews is programmers and startup enthusiasts.
>I'm more than qualified enough to understand and present well-known complexity results from half a century ago, even when a misguided appeal to authority is at play.
All the more reason you should not be so insecure and automatically assume the worst about other people's intentions.
That hypothetical algorithm you’re looking for has no known upper bound beyond being a finite number. A polynomial algorithm can include an arbitrarily large constant factor so you need to run for an arbitrarily long time searching for it.
All algorithms run in O(1) time if you can arbitrarily increase the constant to however long it’s going to take, but that’s not what constant time means.
Polynomial time is O(X^(constant)) and you aren't providing a constant. Going "if O(X^(constant)) may not work, we try a larger constant until it works" stops being polynomial time. Similarly, running 2^X different polynomial algorithms takes exponential time, not a polynomial running time.
The running time is polynomial, and there is some constant c independent of n (or X in your notation) such that the running time is O(n^c). The only "sleight of hand" is that we don't know what this c is.
Yet, provably, there is some constant c such that for absolutely no satisfiable input does the program take more than c*n^c steps to halt, where n is the length of the input.
Note that not knowing c doesn't mean we can't put an upper bound on c. For example, I believe (but don't quote me on this) that c can be upper bounded by Rayo's number. As, otherwise, I think we can derive a compact first-order expression for an upper bound on Rayo's number, which would be a contradiction. And, although Rayo's number is big, it's ultimately still a constant.
But ultimately none of this is necessary to prove that the problem is in P. The mere existence of the constant c independent of n suffices.
Your algorithm is dependent on both n and the existence of a solution, so as written right now it isn't guaranteed to actually have some constant c that works for all n.
More pedantically, it's possible for a polynomial solution to P=NP to exist without it running in polynomial time in python. So no, there doesn't necessarily exist a c that works for all n, even if there exists a polynomial solution to P=NP.
> More pedantically, it’s possible for a polynomial solution to P=NP to exist without it running in polynomial time in python.
That's not true. Any Turing machine can be compiled into a Python program with at most cubic slowdown. More precisely, the number of Python bytecode operations executed will be at most quadratic in the running time p(n) of the TM, with each operation acting on a bounded number of objects of size O(log p(n)) (which for polynomial p() is just O(log n)).
One can check Chapter 1 of Arora & Barak's Computational Complexity book for a description of the C/Java analogue of this claim, but it follows from a straightforward simulation argument: literally just hardcode the TM states and transition function, then execute it. The only bit to be careful about is addressing and address sizes growing unbounded, but the max size is only logarithmic in the running time (simple proof; I believe this is an exercise in Sipser's book), and that's why we give ourselves a cubic buffer.
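For what it's worth, here's a minimal sketch of that simulation in Python (my own toy example: the hardcoded machine below just flips bits until it hits a blank, it is not a SAT solver; it only illustrates that each TM step costs a constant number of Python operations plus tape bookkeeping):

    def simulate(delta, tape_str, state="start", blank="_"):
        # delta: (state, symbol) -> (new_state, new_symbol, move), with move in {-1, 0, +1}
        tape = dict(enumerate(tape_str))    # sparse tape; grows as the head moves
        head = 0
        while state != "halt":
            symbol = tape.get(head, blank)
            state, tape[head], move = delta[(state, symbol)]
            head += move
        return "".join(tape[i] for i in sorted(tape))

    flip = {                                # toy transition table
        ("start", "0"): ("start", "1", +1),
        ("start", "1"): ("start", "0", +1),
        ("start", "_"): ("halt", "_", 0),
    }
    print(simulate(flip, "0110"))           # prints 1001_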
Actually my concern was running into finite recursion limits, sys.maxsize for arrays, etc.
The language does specify that the highest possible limit is platform-dependent, and there are functions that must return a number, so the limit must exist, be enforced, and be a finite number. But that's a really pedantic class of issues, and the spec has few specific limitations, so you could probably avoid all of them - still, it's a more complicated problem than what you posted.