Complexity theory’s 50-year journey to the limits of knowledge (quantamagazine.org)
206 points by nsoonhui on Aug 18, 2023 | 114 comments


> Suppose that P = NP. To prove it, researchers would need to find a fast algorithm for an NP-complete problem, which might be hiding in some obscure corner of that vast landscape.

They wouldn't need to find such an algorithm. They would just need to prove that such an algorithm exists.


Technically it is impossible for this to happen for P=NP, since if P=NP then there is a straightforward algorithm for solving SAT by universal search.


Can you explain your comment? NP-completeness to my understanding simply means that there exists a polynomial time reduction to SAT. How can SAT constructively get a P algorithm implementation merely from proving that an NP-complete problem has a solution in P without the algorithm itself?

I don’t want to accuse you of being incorrect if you’re alluding to something I’m ignorant of, but since it’s certainly possible in my mind that an NP-complete problem could be non-constructively proven to be in P, what you wrote doesn’t sound right.


There's already a known algorithm M you can implement right now that can find satisfying assignments to satisfiable SAT instances in polynomial time conditioned only on P=NP.

Note: I'll be glossing over a few technicalities in this explanation, feel free to ask about them.

Roughly, the algorithm M is as follows. Suppose you have input x of length n. For each integer c between 1 and n, encode c in binary and try to decode it in UTF-8. If it decodes, try to interpret it as a Python program, and run it on input x for 100n^c steps. If it executes successfully and outputs a valid assignment, return the assignment. Otherwise, continue onto the next c. If c=n and you haven't found an assignment yet, do a brute force search over assignments and return what you find.

Now suppose P=NP, so some Python program finds SAT solutions in polynomial time. In fact, there are infinitely many such programs, since each valid program can be prefixed with as many "pass" expressions as we want. Interpreting the UTF-8 encoding of any such program p in binary gives us some integer c.

There are only finitely many inputs of length ≤ c, so the slowest of them still runs in finite (and thus polynomial) time. For any input of length > c, the above algorithm M will spend at most 100(c-1)n^c time trying to execute the (c-1) different strings in Python for at most 100n^c steps each. If M happens on a solution before then, great! If it doesn't, then now we run p -- which by assumption does find solutions in polynomial time -- for 100n^c steps.

If 100n^c is an upper bound on the true running time of p, then this necessarily returns a valid assignment, meaning that we can find assignments in time O(c n^c) time for some integer c, i.e. polynomial time. Otherwise, if 100n^c is too small, we just wait for the next copy p' of p (with a pass prefixed to it) that will have an even looser time bound of 100n^c' . This also may not be enough, but eventually we will find some p''''' for which the time bound does suffice (depending on the true running time of p). And so we find valid solutions in O(c'''''n^c''''') time.

None of this required us to know what p is; effectively M makes the non-constructive constructive by trying every program out in a systematic way, so that the running time never exceeds some (also unknown) polynomial bound.

Is it practical? Suppose that the shortest Python program finding SAT assignments is at least 100 bytes long. Then c > 256^100, and thus the running time is something like O(n^(256^100)). So be sure to get a good CPU.
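
For anyone who wants to see the shape of M in code, here is a minimal Python sketch of the idea described above. It's illustrative only: it conflates the input length with the number of variables for brevity, and the `run_with_step_limit` helper (interpret a source string as a program and run it on the formula for at most a given number of steps) is hypothetical, standing in for the step-bounded simulation the comment describes.

    from itertools import product

    def check_assignment(formula, assignment):
        # `formula` is CNF: a list of clauses, each a list of signed 1-based
        # variable indices; `assignment` is a list of booleans.
        return all(
            any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
            for clause in formula
        )

    def universal_search(formula, n, run_with_step_limit):
        # Try every integer c = 1..n as a candidate UTF-8-encoded Python program.
        for c in range(1, n + 1):
            try:
                src = c.to_bytes((c.bit_length() + 7) // 8, "big").decode("utf-8")
            except UnicodeDecodeError:
                continue  # not valid UTF-8, move on to the next c
            # Hypothetical helper: run `src` on `formula` for at most 100*n^c
            # steps, returning its output, or None if it crashes or times out.
            candidate = run_with_step_limit(src, formula, 100 * n**c)
            if candidate is not None and check_assignment(formula, candidate):
                return candidate
        # Fallback: brute-force search over all 2^n assignments.
        for bits in product([False, True], repeat=n):
            if check_assignment(formula, list(bits)):
                return list(bits)
        return None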


This seems like you're trolling. You're basically doing a brute force search for polynomial time programs... so you do need to know the polynomial time program before you actually solve anything?! You're just bounding the search space by only allowing programs to run for a certain number of steps, under the assumption that any problem you care to solve would be solvable within a finite number of steps... and then claiming that both the search for the program and its execution turn out to run in polynomial time - which seems obviously wrong: as you yourself point out, the runtime you're talking about is itself growing exponentially with the size of the input?!?!


He's not trolling, it's a well known result.

>so you do need to know the polynomial time program before you actually solve anything?!

No, you try all the programs one by one. Assuming P = NP one of them will work eventually. The number of programs you need to search before finding the correct one is a constant, even if we don't know what it is. Then by scaling the number of steps you do at each iteration you guarantee that the total runtime will be polynomial in the size of the input.

It's exponential in the size of the P-time algorithm for SAT but that's a constant and therefore irrelevant to the complexity class.


> No, you try all the programs one by one. Assuming P = NP one of them will work eventually.

You're repeating what I said: you do need to find the program first, and you do that by brute force. What exactly is the disagreement here, do you not know that trying one-by-one is exactly what "brute force" means? Perhaps this theorem is mathematically sound and I am missing something, but as a programmer I can guarantee this solution is completely useless.


I'm not disputing that it's brute force, I'm just explaining how it results in polynomial time algorithm.

> but as a programmer I can guarantee this solution is completely useless

yeah, no shit


> as a programmer I can guarantee this solution is completely useless

Oh absolutely, you'd never want to run such a program. However, its existence is a useful mathematical trick. Similar to a brute-force chess program: it's not practical to run, but its existence tells us a lot about chess (e.g. that it doesn't require consciousness, or other such woo).

> you do need to find the program first

Note that there are two programs being considered: a program P which directly solves NP problems in P-time; and a "universal search" program which indirectly solves NP problems in P-time.

The universal search definitely solves the problem (assuming P exists), since it brute-forces the outputs of all programs, until it finds a valid solution. Eventually it will either reach the output of P (which is a valid solution); or it will stop earlier by finding some other solution first. Let's assume the worst case, that it doesn't find anything before P's solution: how long will this search take?

Let's ignore Python and stick to some simple binary encoding (e.g. for some Turing machine, or whatever). We'll make our encoding prefix-free by using '11' to indicate the 'end of program'; hence every binary string which only contains '11' at the end is a valid program. Here are the first few, arranged by length:

  Length 0 = []
  Length 1 = []
  Length 2 = [11]
  Length 3 = [011]
  Length 4 = [0011, 1011]
  Length 5 = [00011, 01011, 10011]
  Length 6 = [000011, 001011, 010011, 100011, 101011]
  ...
We're using a binary alphabet, so we'll divide up our runtime between these programs using powers of 2, such that each gets a fraction 1/(2^length): the program of length 2 gets 1/2^2 = 1/4 of our runtime; the program of length 3 gets 1/2^3 = 1/8 of our runtime; the programs of length 4 each get 1/2^4 = 1/16 of our runtime; and so on. This exponential allocation ensures that we never allocate more than 100% of our runtime (similar to the sum 1/2 + 1/4 + 1/8 + ... never exceeding 1). The prefix-free encoding ensures that longer programs only exist in the "gaps" left by shorter programs (e.g. imagine every program arranged as leaves of a binary tree; tracing the left/right paths down through the tree gives us a prefix-free encoding). Fun fact: the parent used UTF-8 because that's a prefix-free encoding.

The program P will have some binary encoding in this scheme; let's say its length is L bits. L is just a constant: it has nothing to do with the particular NP problem instance we want to solve; e.g. if we're solving a traveling salesman problem, the length of the program P has nothing to do with the number of cities we want to visit. According to the above scheme, universal search will spend 1/2^L of its time running P: that is exponential in L but independent of our problem size (e.g. the number of cities our salesman is visiting).

Hence universal search will run one step of P for each 2^L of its own steps; that's probably quite a large, impractical slow-down, but as far as complexity theory's concerned it's only a constant factor. If running P directly has some polynomial time complexity like O(N^A) for some power A, then universal search will have complexity O(2^L × N^A) = O(N^A), since 2^L is just a constant factor that the big-O notation absorbs.

You may be left wondering how universal search can keep track of all these running programs. There are actually several ways to do it, depending on preference. One particularly simple approach is the "FAST" algorithm: this only runs one program at a time, and just restarts everything from scratch over and over. For example, we can run 11 for one step, and see if it's made a valid solution; if not, start again by running 11 for two steps, then 011 for one step; if neither found a solution, start again by running 11 for four steps, then 011 for two steps, then 0011 for one step, then 1011 for one step; and so on. This exponential-restarting ends up taking twice as long, but that's just another constant factor we can ignore https://people.idsia.ch/~juergen/toesv2/node28.html
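
As a rough illustration (a sketch in the spirit of the linked page, not the exact scheme), here is what the '11'-terminated enumeration and a FAST-style restarting loop might look like in Python. The `simulate` and `is_solution` callables are hypothetical helpers standing in for "run this encoded program for k steps" and "check a candidate output".

    from itertools import product

    def prefix_free_programs(max_len):
        # Every binary string whose only occurrence of "11" is the final two bits.
        for length in range(2, max_len + 1):
            for bits in product("01", repeat=length - 2):
                body = "".join(bits)
                if "11" not in body and not body.endswith("1"):
                    yield body + "11"

    def fast_search(is_solution, simulate):
        # Phase k: rerun every program of length L <= k for 2^(k - L) steps,
        # i.e. give each program a 1/2^L share of the phase's budget.
        # Loops forever if no program ever produces a valid solution.
        phase = 1
        while True:
            for program in prefix_free_programs(phase):
                output = simulate(program, 2 ** (phase - len(program)))
                if output is not None and is_solution(output):
                    return output
            phase += 1  # restart everything with doubled budgets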

Yet another approach is to choose programs and divide up our runtime using probabilities (e.g. coin tosses). The GUESS algorithm performs the same program-length-based allocation as FAST, but is embarrassingly parallel rather than sequential https://people.idsia.ch/~juergen/toesv2/node31.html

If you don't like linear factors, you can also use Hutter Search: that starts off by running universal search, but in parallel it performs a brute-force search for proofs that some other program is both equivalent and faster (such as P, in the above example). It uses exponential restarts (like FAST), which lets it switch to a faster program once found. Hutter Search only has a small linear slowdown (depending on how often we restart, and how we divide our resources between running our current solution and searching for provably-faster ones). Instead, it has a colossal (but constant!) additive factor (the time spent finding a provably faster program, then waiting for the next restart so it can switch to running that).


Ok, I get it now. You're basically playing around with the complexity factor by bounding the universe of possible solutions (programs) such that it will only grow polynomially, and assuming that, because P=NP, the size-bounded programs you try are guaranteed to find an answer, all without depending on the size of the input. But I think the only relevance of the P vs NP problem is in whether or not we can find answers to NP problems in a "timely" fashion (as the article defines it: before the end of the universe), and I think one could show that this strategy (the name is universal search) doesn't guarantee that for any practical hardware we can envisage, now or in the future - the number of steps is just far too great for any interesting problem, even if it's not nearly as bad as exponential growth (being less bad is not enough)?


> Oh absolutely, you'd never want to run such a program. However, it's existence is a useful mathematical trick.

This was the beginning of the thread: If you have a non-constructive proof that P=NP, it still means you cannot solve all NP problems quickly until someone actually finds a program to do so.


Are you quoting someone? I did ctrl-f and found nothing.


It's literally the first comment: https://news.ycombinator.com/item?id=37182868

I do agree the reply about brute forcing the solution space is a bit trollish, in the sense that while it does purport to provide a "constructive" way to find the polynomial time NP solver, it's not actually feasible.

The sleight of hand is where it asserts that the length of the program is a "constant" -- which is technically true because it doesn't depend on the length of the NP problem they're trying to solve... but we're not trying to solve the NP problem, we're trying to find the solver for the NP problem.

I'm no expert in all things NP, but the assumption that a polynomial time SAT solver (if it exists) would have a "normal" length seems to be a very suspect assumption.


Very nice explanation! I think it's worth mentioning that this is called universal search (you mention this in passing) or Levin Search or Levin's universal search algorithm. It was invented by Leonid Levin in 1972.

https://steemit.com/steemstem/@markgritter/leonid-levin-s-un...


Relevant blog post by an actual complexity theorist: https://blog.computationalcomplexity.org/2018/10/if-pnp-then....

>The following does not quite show it can't happen but it does give one pause: an old result of Levin's shows that there is a program you could write NOW such that if P=NP then this program decides SAT except for a finite number of formulas (all of which are NOT in SAT) and can be proven to work in poly time (I will later give three pointers to proofs). The finite number of formulas is why the people above may still be right. But only a finite number- seems like a weak kind of nonconstructive.

So, based on this post, it is actually not known that P=NP implies existence of polynomial time algorithm for deciding SAT. The universal search idea only yields a polynomial time algorithm that decides SAT for all but a finite number of formulas.


I didn't say it solves every SAT problem in polynomial time, I said it can

> find satisfying assignments to satisfiable SAT instances in polynomial time

The distinction between this and solving SAT, and the slight strengthenings one can make to universal search, are part of those few technicalities I promised to gloss over.

Also, cut the snark. I'm not a complexity theorist, but my PhD thesis was in the intersection of graph algorithms and complexity (the first word of the title literally being "Hardness"), and I'm more than qualified enough to understand and present well-known complexity results from half a century ago, even when a misguided appeal to authority is at play.


My comment was not meant to be snarky. I actually got the (incorrect) impression from reading your comment that P = NP would imply a polynomial time algorithm that decides SAT. I decided to look up a more authentic source and shared the best resource that I could find.

I am guessing you took offense at the "an actual complexity theorist" part of my comment, but it wasn't a personal remark against you. The general audience of hackernews is programmers and startup enthusiasts.

>I'm more than qualified enough to understand and present well-known complexity results from half a century ago, even when a misguided appeal to authority is at play.

All the more reason you should not be so insecure and automatically assume the worst about other people's intentions. *

* This time the snark is intentional.


> still runs in finite (and thus polynomial) time

Not all finite runtimes are polynomial…

That hypothetical algorithm you’re looking for has no known upper bound beyond being a finite number. A polynomial algorithm can include an arbitrarily large constant factor so you need to run for an arbitrarily long time searching for it.

All algorithms run in O(1) time if you can arbitrarily increase the constant to however long it’s going to take, but that’s not what constant time means.

Polynomial time is O(X^constant) and you aren't providing a constant. Going "if O(X^constant) may not work, we try a larger constant until it works" stops being polynomial time. Similarly, running 2^X different polynomial algorithms takes exponential time, not polynomial time.


The running time is polynomial, and there is some constant c independent of n (or X in your notation) such that the running time is O(n^c). The only "sleight of hand" is that we don't know what this c is.

Yet, provably, there is some constant c such that for absolutely no satisfiable input does the program take more than c*n^c steps to halt, where n is the length of the input.

Note that not knowing c doesn't mean we can't put an upper bound on c. For example, I believe (but don't quote me on this) that c can be upper bounded by Rayo's number. As, otherwise, I think we can derive a compact first-order expression for an upper bound on Rayo's number, which would be a contradiction. And, although Rayo's number is big, it's ultimately still a constant.

But ultimately none of this is necessary to prove that the problem is in P. The mere existence of the constant c independent of n suffices.


Your algorithm is dependent on both n and the existence of a solution, so as described right now it isn't guaranteed to actually have some constant c that works for all n.

More pedantically, it’s possible for a polynomial solution to P=NP to exist without it running in polynomial time in Python. So no, there doesn’t necessarily exist a c that works for all n, even if there exists a polynomial solution to P=NP.


> More pedantically, it’s possible for a polynomial solution to P=NP to exist without it running in polynomial time in python.

That's not true. Any Turing machine can be compiled into a Python program with at most cubic slowdown. More precisely, the number of Python bytecode operations executed will be at most quadratic in the running time p(n) of the TM, with all operations acting on a bounded number of objects of size O(log p(n)) (which for polynomial p() is just O(log n)).

One can check Chapter 1 of Arora & Barak's Computational Complexity book for a description of the C/Java analogue of this claim, but it follows from a straightforward simulation argument: literally just hardcode the TM states and transition model, then execute it. The only bit to be careful about is addressing and address sizes growing unbounded, but the max size is only logarithmic in running time (simple proof, I believe this is an exercise in Sipser's book), and that's why we give ourselves a cubic buffer.
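
For flavor, a bare-bones simulation along those lines might look like the sketch below (not the Arora & Barak construction itself). It assumes `transitions` maps (state, symbol) to (new_state, new_symbol, head_move) and uses "q0" as the start state name. Each simulated step is a constant number of dictionary operations on indices whose size is only logarithmic in the number of steps, which is where the modest polynomial overhead comes from.

    def simulate_tm(transitions, accept_states, input_tape, max_steps):
        # Sparse tape as a dict; missing cells hold the blank symbol "_".
        tape = dict(enumerate(input_tape))
        state, head = "q0", 0          # assumed start-state name
        for _ in range(max_steps):
            if state in accept_states:
                break
            key = (state, tape.get(head, "_"))
            if key not in transitions:
                break                   # no applicable transition: halt
            state, new_symbol, move = transitions[key]
            tape[head] = new_symbol
            head += move                # move is -1 or +1
        return state, tape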


Actually my concern was running into finite recursion limits, sys.maxsize for arrays, etc.

The language does specify that the highest possible limit is platform-dependent, and there are functions that must return a number, so the limit must exist, be enforced, and be a finite number. But that's a really pedantic class of issues, and the spec has few specific limitations, so you could probably avoid any of them - still, it's a more complicated problem than what you posted.


Can you really separate the two? For something like NP-completeness, I'm having trouble conceptualizing how a proof for existence would not require demonstration.


Yes, there are non-constructive proofs for the existence of polynomial time algorithms [1]. The Robertson–Seymour theorem [2] is a common example: it shows that certain classes of graphs can be characterized by finite sets of forbidden subgraphs [3], which can be checked for in polynomial time, but it does not say what those sets of forbidden subgraphs are or how they can be found.

[1] https://en.wikipedia.org/wiki/Non-constructive_algorithm_exi...

[2] https://en.wikipedia.org/wiki/Robertson%E2%80%93Seymour_theo...

[3] More precisely forbidden minors.


An example for such a proof would be using the probabilistic method. However, even if we are ignoring some artificial problems, there are some "natural" problems that are known to be in P that we do not have algorithms for.

https://en.wikipedia.org/wiki/Non-constructive_algorithm_exi...

Also see:

https://cs.stackexchange.com/questions/92087/are-there-any-p...


Thank you- your link to non-constructive proofs led me to this HN comment, which seems to flesh out why such proofs are not applicable to P = NP

https://news.ycombinator.com/item?id=29022963

I'm just reading about constructive vs. non-constructive proofs, but my intuition is that a proof of P = NP would have to be constructive.

https://news.ycombinator.com/item?id=19720511


That "proof" is applicable to any algorithm that "exists". "An algorithm exists, so we can enumerate all of the Turing machines 'in parallel' and find it" would work for anything in P, and for other algorithms as well. However, good luck actually running that algorithm... Enumerating Turing machines in parallel and executing them could take many times longer than the age of the universe. You want something that you can execute, not something that in theory exists.


The kind of ultra-shoddy "construction" you get from "Here is an impossible to build machine that would give us the proof" would still not get you a demonstration. It would still be something we don't have the algorithm for.


Probably very silly, but for a long time, when I've seen "p=np" thrown around, I've always considered it some kind of matrix math problem where n could be substituted by an identity matrix which itself could be substituted by some kind of sampling (which would represent observations of events bounded by np-space) unitary matrix multiplied by its conjugate transpose[0](thus, p = np -> p = ip -> p = uu*p, or p= u*up), which seems like that would be in line with a probabilistic method, is that the case? Or is this a wrong way to think about this?

[0] https://en.wikipedia.org/wiki/Unitary_matrix


I don't understand what you mean. NP is not N multiplied by P, it is the name of a set, the set of all problems which can be solved by a Nondeterministic algorithm with Polynomial time complexity. P is the name of the set of all problems which can be solved by a deterministic algorithm with Polynomial time complexity. The P=NP problem is a set identity problem, it's not an algebraic equation.

Note that here nondeterministic algorithm can be thought of as an algorithm which can try every value in parallel when faced with any choice, even an infinite amount of values. For example, a nondeterministic algorithm can try every number in R in parallel. There is no known way to replace this nondeterminism with probabilistic methods.


Substitute the N? No, you can't substitute "try every solution simultaneously" by "some kind of sampling".

Let's use a very simple example. How could sampling help you find the password that hashes to 21e400789a8ad12adb89d72ca8d92cc72400fea4?


You couldn't sample from anything if the password that hashes to 21e400789a8ad12adb89d72ca8d92cc72400fea4 is never used to encrypt any text. Though if the password that hashes to 21e400789a8ad12adb89d72ca8d92cc72400fea4 goes on to be used to encrypt any text, and one has access to the cipher text, then you can sample from the cipher text.


Okay, so you can't design anything for my example problem? It's not being used to encrypt anything, all you get is the hash. Which is solvable very very fast with an NP machine. If you can't apply your idea to that, you haven't found a general solution.


Yeah, a general solution wouldn't be useful in the wild against keys that are never used. In the probabilistic approach I am thinking of, observations of cipher text are needed (which isn't an unbounded requirement).


The point is this. Here is a nondeterministic algorithm that proves the problem the poster above was formulating is in NP:

  matchingPasswords = choice(AllStrings, str -> hash(str) == desiredHash)
Assuming hash(str) has polynomial time, the nondeterministic algorithm above is clearly polynomial as well. Here `choice` is not a function, it is an elementary nondeterministic operation, one which evaluates the given function on each value in the input variable in O(1). AllStrings is the set of all strings, maybe limited to some max length if desired (otherwise, matchingPasswords will be an infinite set itself).

As far as it is known, there is no way to replace this with any sampling method that would take polynomial time. Of course, this is not proven; it is related to the question of whether BPP = NP. BPP is the class of problems for which Bounded-error Probabilistic algorithms with Polynomial time complexity exist.
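
For contrast, the best an ordinary deterministic (or blind-sampling) approach can do here is test candidates one at a time. A sketch, assuming the 40-hex-digit hash in the example above is SHA-1 (just a guess) and restricting to short lowercase passwords so it terminates:

    import hashlib
    import string
    from itertools import product

    def find_preimage(desired_hash_hex, max_len=6, alphabet=string.ascii_lowercase):
        # Deterministic stand-in for `choice`: test candidates one by one,
        # which costs time exponential in max_len instead of the O(1)
        # "evaluate on every string at once" the nondeterministic step gets.
        for length in range(1, max_len + 1):
            for chars in product(alphabet, repeat=length):
                candidate = "".join(chars)
                if hashlib.sha1(candidate.encode()).hexdigest() == desired_hash_hex:
                    return candidate
        return None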


Yes, proofs don't have to be constructive.

Consider proofs by contradiction, you could potentially show that if such an algorithm does not exist some important true statement would be rendered false.


Sure. There is definitely a googolth digit of pi. Computing what the digit is, however, is not necessary to prove that such a digit exists.


It's a quite bad analogy because the googolth digit of pi is completely constructive. (You don't need to calculate it, but it is constructive)

P = NP proof could be not constructive.


Note: finding arbitrary digits of pi doesn't require finding the preceding digits. It's kind of freaky.

https://en.wikipedia.org/wiki/Bailey%E2%80%93Borwein%E2%80%9...


Proving the Nth digit of pi exists is not (necessarily) constructive.

Though to make it an actual proof and not a truism you might say "when writing pi in the shortest decimal representation, there is a millionth digit".

Proving pi is irrational would suffice, without actually calculating the first million digits.


My intuition was that if a proof for P = NP exists, it would be incomparable to the kind of Pi example you provide - Pi is a well-defined irrational number, so the x-th digit of Pi obviously exists. It would instead be like saying, 'the x-th digit of Pi is 7, and here is a proof that is not a straight calculation'. The idea of a proof which can demonstrate knowledge of the x-th digit of Pi, without verification, doesn't click for me.


Suppose NP-complete problem X is in NP but not P. From this premise I arrive at a contradiction and conclude X is in P. Thus P=NP, with no polynomial algorithm for solving any NP-complete problem provided.


This is an aside. But I twigged on a caption for one of the figures: “Every computational problem takes longer to solve as its input grows larger, but not all problems grow harder at the same rate.”

It’s obvious from CS201 that this phrase is generally true and I have no pedantic objection to its inclusion in the article. However, I’m curious if this is strictly true or if we can find a (nontrivial) counterexample. Are there any algorithms that run faster as input grows? My first intuition is some kind of probabilistic question solved to a given correctness bound.

Edit: it is trivial to construct an algorithm with this property. My precise question is whether any meaningful problems have optimal algorithms with this property.


> Edit: it is trivial to construct an algorithm with this property.

It's actually impossible to construct an algorithm that has its runtime decrease as the input size grows. You can do this for finitely many examples, but we're talking about asymptotics here. With discrete run-times and a lower-bound of 1 operation, the run-time will have to stop decreasing at some point and just stay at the same constant value for all but finitely-many exceptions. This makes it a constant-time algorithm.

A constant run-time is a counter-example to that caption though, as the problem doesn't take longer as the input grows. An example would be checking if a number is divisible by 2. You only need to check 1 bit, so it doesn't take any longer as you add more bits that the algorithm doesn't touch.
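
A toy version of that divisibility example, assuming the input is a list of bits with the least significant bit first:

    def is_even(bits):
        # Only the lowest-order bit matters, so the running time stays
        # constant no matter how many bits the input has.
        return bits[0] == 0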


I guess even if you allow probabilistic algorithms and expected runtime, you still can't have an algorithm that runs faster as the input size grows. Because... it takes O(log n) time to calculate any random variable whose expected value has a denominator of size n? I'm not entirely sure but that seems right to me.


With probabilistic algorithms, you still have the fundamental limitation that there's a lower-bound to how fast an algorithm can run. You can't keep getting faster indefinitely.


You could get faster in expected value. What if the time on input of length n was an expected value of 100 + 1/n?

That isn't possible, but it might not quite be obvious why that isn't possible.


Having a constant runtime means there's a constant bound on the runtime, not that the runtime is an exact constant value. 100 + 1/n would still be constant, as it's bounded above and below by constants.

To have sub-constant time it would have to go to 0 as n increases, as otherwise we could bound the time below by a constant. This isn't possible, as a computer will take at least 1 time unit on every input. So the expected time will always be at least 1.


That's for randomly sampling a large input?

If each piece of input is already independent, then you don't have to calculate that, you can just use the input in order.


Maybe not! With O(log N) such as binary search you are assuming some sort of structure to the data. It is not necessary for an algorithm to scan (otherwise O(N) is as good as it would get)

What if we devise a data structure that becomes more efficient as it grows to do a certain (perhaps useless - but that is OK!) operation.

For example we have a data structure at N=0 with a million disconnected nodes and one of them being the target node.

As N increases add a node with a pointer to the target node. The algorithm to find the target node is a random search through nodes until it finds one with a pointer then halts. As N increases, time on average decreases with complexity O(1-N/(N-K)) where K=1000000


You can't have that because when N=K that's not a valid expression and when N>K you're putting a negative number in the big-O notation. You can't have negative runtime.

All algorithms have a lower-bound on runtime of 1. If a sequence is decreasing and bounded below, it converges to a value [1]. If the sequence is discrete, that means it reaches exactly that limit and never deviates from it.

[1] https://en.wikipedia.org/wiki/Monotone_convergence_theorem


It should have been N+K not N-K


I'm not sure what your big-O expression is supposed to mean, but if a sequence is decreasing and bounded below (which runtime is) then it has a limit. Since the sequence is discrete, it can't get arbitrarily close to the limit without reaching it. So at some point the runtime will stop decreasing.

You can approximate the runtime with a function that decreases forever, but the actual runtime will be constant for large enough n.


It is for a typical case not a specific case. The mean time, if you like.


The bit I glossed over is picking a random number as per sister comment. More thinking needed!


A cheeky way to do it is this: reuse the randomness that we are assuming about the input data.

1. Data structure in pseudo code:

    max_seen = 0
    items = []
    def add(x: int):
        global max_seen
        max_seen = max(max_seen, x)
        items.append(x)

2. Algorithm:

    import time
    def doit():
        if max_seen == 2**31 - 1:  # int32.max
            return 0
        time.sleep(1)  # 1000 ms
        return 0
As you add items to this data structure, on average over all possible data inputs the time taken for doit() will tend from K+1000ms to a constant K.


That's what I'm saying. The runtime will go down to a constant. It can't keep decreasing forever.


“Tend to” is a mathematical term meaning “get closer to”. Formally, for any e as small as you like, there exists an N beyond which the sequence is within e of the value being approached.

This is possible because we are sampling a probability distribution. A die is discrete, but the probability of rolling a 1 tends to zero as the number of sides increases, even though there is a whole “1” face on any given die.


An algorithm can't have that property for its expected runtime, as it can only look at a sub-constant amount of the input if it runs in sub-constant time.

It's not possible for it to behave differently as n increases, because it doesn't have enough time to distinguish between different values of n once n is large enough.


You could use just the asymptote and call it "constant time".

But that's an extremely limited and misleading analysis, so you should not do that. If the time taken goes from a quadrillion to 7 as the problem size grows, don't call it constant.


"constant time" in complexity theory just means there's a constant bound on runtime. It doesn't have to actually have the exact same runtime down to the instruction for every input. Here, the bound would be a quadrillion.

Of course, showing a constant upper-bound doesn't tell us that it isn't even faster than constant as in the proposition I was responding to. That's why I focused on the constant lower-bound.


I know what it means, and I stand by it being limited to the point of misleading in a case like this.

Running any (halting) algorithm on a human computer is constant time, because you're bound by the number of states you can fit into some terabytes, but nobody should actually try to use that as a final analysis.


> Running any (halting) algorithm on a human computer is constant time, because you're bound by the number of states you can fit into some terabytes, but nobody should actually try to use that as a final analysis.

That's the sort of thing I would usually say to explain why we do things like use asymptotic analysis and ignore polynomial factors. If you have a different solution, that's great, but you're no longer talking about the kind of runtime the article is about.


It depends on how abstract you want to be. You shouldn't always shove the lever all the way to the most abstract single value.

My suggested solution is "don't delete all nuance". If the runtime doesn't behave nicely, give that a mention. Most things do behave nicely, so that's not much of a burden.

> the kind of runtime the article is about

The article spends plenty of time talking about computational feasibility, which is not about the limits as you approach infinity.

It even equates P with "trivial", which is... not good.


You do if you are a complexity theorist :)


Finding a set of mutually orthogonal vectors. For a given sparsity, after a large enough dimension, two randomly chosen vectors will almost always be orthogonal.


That would be a case of higher probability of finding a solution in one step.

But the solution would still need to be checked and other candidates generated until a solution is found.

Average time would be minimized by generating random vectors each time.

But that would increase the worst case to unbounded (effectively infinite) time, since a vanishingly small but nonzero chance that a solution has not yet been found will remain after any number of steps.

Some kind of simple search would be vastly more bounded, but in practice require more computation.


Of course. The obvious examples are probabilistic, e.g., Monte Carlo methods, as an increase in the problem space decreases the number of samples needed.


> it is trivial to construct an algorithm with this property

Actually, it is impossible to construct an algorithm with this property, at least under any usual computational model.

Every computational model has discrete time units; therefore the amount of time taken cannot be vanishingly small.

(This is assuming the usual notion of complexity, where only the asymptotic behavior matters. It doesn't matter if it gets increasingly faster over the first million input sizes...it only matters what the behavior is as n -> infinity.)

A program cannot be increasingly faster forever.


Consider a hashmap. Hashmaps give expected constant access time, and amortised constant access time on the condition that their size grows at a sufficient rate and that the hash function is a universal hashing function. This is parametric.

Assume that you use a linked list for collisions; then for some growth rates your cost of access is not constant, though it is certainly at most O(n).

The point being that the universal hashing function has a 1/m probability of collision. If your growth rate is bad, then m << n, which turns the expected time from constant to at most O(n), because m grows more slowly than n.
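
A toy separate-chaining table makes the m-versus-n trade-off concrete. This is just a sketch, with a fixed bucket count standing in for a "bad growth rate":

    class ChainedHashMap:
        def __init__(self, num_buckets=8):
            # m = num_buckets stays fixed here, so with n keys a lookup
            # scans an expected n/m entries: effectively O(n) once n >> m.
            self.buckets = [[] for _ in range(num_buckets)]

        def put(self, key, value):
            bucket = self.buckets[hash(key) % len(self.buckets)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)
                    return
            bucket.append((key, value))

        def get(self, key):
            bucket = self.buckets[hash(key) % len(self.buckets)]
            for k, v in bucket:
                if k == key:
                    return v
            raise KeyError(key)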


"Faster as the input grows" is perfectly compatible with a minimum number of time units. If your definition insists the minimum of time units must be 0, with infinitesimal time getting involved, then your definition sucks.


For a function to strictly decrease forever, at least one of two statements is true:

1. It decreases by vanishingly smaller amount.

2. It decreases to negative infinity.

So...which is it? Infinitesimal time units, or negative time?


Strictly speaking it either tends to negative infinity, or tends to some real number X, which could be any number, positive or negative.


In that latter case, statement 1 is true.


3. Nobody said "strictly".


They said decreases, not stays constant


Plenty of O(n), O(nlogn), O(n^2) algorithms don't strictly increase.

Rounding the decrease to the nearest time unit some of the time is hardly a disqualifier for "decreasing". (Though specifically they said "faster".)


Yes, but...

The mathematically rigorous form of the posed question is whether there is an algorithm with run time f(N) for which there exists strictly decreasing g(N) such that f(N) = O(g(N)). That is, the run time is bounded by a forever decreasing function.

f(N) itself doesn't need to be strictly decreasing, but there needs to be a g(N) that is.

Bottom line, no algorithm can have a forever decreasing run time as N increases. An algorithm can exhibit that behavior over a limited range of N, but that's not how complexity works.


> The mathematically rigorous form of the posed question is whether there is an algorithm with run time f(N) for which there exists strictly decreasing g(N) such that f(N) = O(g(N)). That is, the run time is bounded by a forever decreasing function.

Let's say my minimum runtime is 7, and the strictly decreasing bound on my runtime is g(N) = 1e15/N + 7

Doesn't that fit your description fine?


I could imagine some behaviour along those lines in search engines, specifically if you're searching for similar documents. Which none of them seem to — would be useful! Let me know if I'm wrong! — but imagine a search engine that lets you look for documents 'similar to this document here'. Also imagine that's threshold-based; you want equal levels of similarity regardless of the rarity of the input document.

In that case, as the size of the corpus grows it should get easier to find ones in the right range of similarity.


If your computational model requires that the algorithm reads the input, then I don't see how it's possible. Even if above some size the algorithm does nothing but exit, the time will still grow linearly with the input just from reading it.

You could imagine an algorithm that takes some astronomical time for size 1, less for 2, etc., but for any finite time at size 1 there will be some size N where the reading time finally dominates the computation.


Depends what you mean by larger. The example that occurs to me is the problem of determining whether or not an integer is prime. This can be done relatively quickly for numbers of the form 2^p-1 where p is prime, but would take much longer for a much smaller prime not of this form.


Those would be effectively different problems.

Adding the constraint of only needing to classify a particular form of prime will always result in an algorithm of equal or lesser complexity order.


If you are willing to settle for expected instead of worst case time, there are many string-related algorithms that are O(m/n). Intuition is that as the string size grows it's "more likely" to have some property on expectation.


Same for hash related algorithms and data structures.


caches? the more input the fewer cache misses


This is easily one of the most well written articles published by quanta. Truly a pleasure to read.


Yeah, a lot of love went into this. It could be made into a book; I would buy it.


nitpick: I wonder how TSP became NP-complete. It would require a polynomial time algorithm to validate the solution. I doubt such an algorithm exists.


Yes, TSP is in NP.

Strictly speaking, the only problems in NP are decision problems, so when we say "TSP is in NP", we're talking about the "decision version of TSP": given a graph G, does there exist a Hamiltonian circuit with total length <= K?

And it's straightforward to show that this problem is in NP, because if the answer is "yes", then simply exhibiting such a tour is enough to provide a polynomial-time-verifiable proof of the answer's correctness.

(Note that it is not known whether TSP is in co-NP, so it is not known whether one can produce a polynomial-time-verifiable proof of a "no" answer.)

But note that if it were to turn out that the decision version of TSP was solvable in polynomial time, then the optimization version would also be in P, because you can just binary-search on all of the possible values of K. This results in at most a polynomial slowdown.
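
A quick sketch of that binary search, assuming integer edge weights, a graph given as a list of (u, v, weight) edges, and a hypothetical decide(graph, K) oracle for the decision version:

    def optimal_tour_length(graph, decide):
        # `decide(graph, K)` answers "is there a tour of total length <= K?".
        lo, hi = 0, sum(w for _, _, w in graph)
        if not decide(graph, hi):
            return None  # no Hamiltonian circuit at all
        while lo < hi:
            mid = (lo + hi) // 2
            if decide(graph, mid):
                hi = mid
            else:
                lo = mid + 1
        # O(log(total weight)) oracle calls: polynomial in the input size.
        return lo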


One thing I've always found fascinating is the relationship between quantum computing and complexity classes. With the promise of quantum computers solving certain problems exponentially faster, it'll be interesting to see how they fit into our established P vs. NP paradigm. On a side note, the term "limits of knowledge" in the title feels a bit hyperbolic.

I always appreciate how Quanta conveys the intricacies of deeply technical topics to a wider audience.


> With the promise of quantum computers solving certain problems exponentially faster, it'll be interesting to see how they fit into our established P vs. NP paradigm.

It's been studied for a while now. One of the most interesting things is that there may be problems in BQP that are not in NP.


Yes indeed. It's intriguing to consider problems in BQP that may not fall into NP. This opens up a world of possibilities and challenges our current understanding of computational complexity.

It suggests there are problems solvable by quantum computers that classical ones can't even verify the results of. This could lead to fascinating developments in fields like cryptography, where solutions could be found but not easily checked.

On the other hand, it raises questions about the practicality of quantum computing. If we can't verify a solution, how do we trust it?

Maybe we need to rethink our definitions of 'solving' and 'verifying'. After all, these concepts were developed with classical computing in mind.

But maybe this is just a limitation or topological property of our current theoretical framework? Like we're trying to fit quantum computing into a model that just isn't equipped to handle it.

Either way, it feels like we are on the cusp of a paradigm shift. Pretty exciting.

Anyway, interested to hear more thoughts on this.


You can verify solutions given by quantum computers


I don't think that is known to be true (in polynomial time) for all problems solvable by a quantum computer (e.g. Fourier Sampling)


It’s interesting to think about, but the complexity classes are actually equal[1]. This sort of makes sense if you think of quantum computers as simulatable by a classical computer.

[1] https://arxiv.org/abs/2001.04383


Huh? Assuming I understand your comment correctly, that is, not what that paper is showing.

That paper is showing that if the verifier has polynomial time computational power, and there are two provers who each have unlimited computational power, who share some quantum entanglement, and who communicate individually with the verifier (which also has access to randomness and such), then there is a protocol that allows the provers to demonstrate to the verifier, for any recursively enumerable language and any string, whether that string is in that language.

RE is a WAY bigger class than what the verifier can compute alone. And, is also way bigger than the without-entanglement version, MIP.

That paper is showing that adding quantum entanglement to the setup for MIP, allows for way more problems to be in the class.


>RE is a WAY bigger class than what the verifier can compute alone

I always think of the RE (c.e.) sets as those that are possible to compute. I’m aware that RE sets are not computable, but since RE sets are those accepted by a Turing machine, I still consider that set to be the “main” one when speaking informally. Maybe this is misguided.


The verifier in the MIP* setup/definition is assumed to be limited to a polynomial amount of computation.


That's not really what that paper is saying.

The class of problems computable by a quantum computer is equal to that of a classical computer. That does not necessarily imply anything about the relationship between BQP and P or BQP and NP.


You’re completely right. I somehow got my wires crossed as to what was possible to compute and the running time.


Is there a missed connection between entropy and circuit complexity? It looks like one might be able to draw some parallels between the two.

Other thoughts: it's interesting that P vs NP is essentially about the lengths of proofs, and thus it is the task of deciding in poly time whether you can decide all decidable problems whose proofs are polynomial in the size of their encoding.


This is addressed, rather extensively, in the article.


Entropy or it beying a metaproof?


being*


It's exciting that cryptographic constructions (like m/n secret sharing where m is hidden) can solve problems in complexity; I would have expected things to flow mostly the other way.


related : https://www.bbc.co.uk/programmes/b06mtms8

P v NP In Our Time

Melvyn Bragg and guests discuss the problem of P versus NP, which has a bearing on online security. There is a $1,000,000 prize on offer from the Clay Mathematical Institute for the first person to come up with a complete solution. At its heart is the question "are there problems for which the answers can be checked by computers, but not found in a reasonable time?" If the answer to that is yes, then P does not equal NP. However, if all answers can be found easily as well as checked, if only we knew how, then P equals NP. The area has intrigued mathematicians and computer scientists since Alan Turing, in 1936, found that it's impossible to decide in general whether an algorithm will run forever on some problems. Resting on P versus NP is the security of all online transactions which are currently encrypted: if it transpires that P=NP, if answers could be found as easily as checked, computers could crack passwords in moments.

With

Colva Roney-Dougal Reader in Pure Mathematics at the University of St Andrews

Timothy Gowers Royal Society Research Professor in Mathematics at the University of Cambridge

And

Leslie Ann Goldberg Professor of Computer Science and Fellow of St Edmund Hall, University of Oxford


Who can read an article this long these days?


Some people still read entire books (they literally just sit there for hours and read every word on every page)


[flagged]


We've banned this account for posting unsubstantive and/or flamebait comments and otherwise breaking the site guidelines. You've been doing a bunch of that lately, unfortunately. Not cool.

If you don't want to be banned, you're welcome to email hn@ycombinator.com and give us reason to believe that you'll follow the rules in the future. They're here: https://news.ycombinator.com/newsguidelines.html.


It interested me, so I did read the whole thing... do you never read anything longer than this at all?? No books? No multi-part blog posts, tutorials?


What is different "these days" compared to the "old days" that would make reading long articles harder?


Ikr? I too would rather have the summary of it beamed right into my consciousness in a split second.




