The extra expense is not the generation of the overflow bit, but the infrastruct...

ncmncm · on March 21, 2022

> it severely hampers a superscalar or out of order processor, as it can't work out very easily which instructions can be run in parallel or out of order

This is an old myth, endlessly parroted. On current x86, the status register is renamed just like other registers, as could easily have been done in a better RISC-V design. The lack of status flags will be RISC-V's equivalent of delay slots: something that once felt like an optimization but has already aged badly.

The unreliable presence of POPCNT and ROT instructions was a worse failing, apparently mitigated lately.

adrian_b · on March 20, 2022

One must not forget that on any non-toy CPU, any instruction may generate exceptions, e.g. invalid opcode exceptions or breakpoint exceptions.

Roughly one in every 4-5 instructions is a load or store, which may generate a multitude of exceptions.

Allowing exceptions does not slow down a CPU. However, they create the problem that a CPU must be able to restore the state prior to the exception, so instruction results must not be committed to permanent storage until it is certain that they could not have generated an exception.

Allowing overflow exceptions on all integer arithmetic instructions would increase the number of instructions that cannot be committed yet at any given time.

This would increase the size of various internal queues, so it would indeed increase the cost of a CPU.

That is why I have explained that overflow exceptions can be avoided while still having zero-overhead overflow checking, by using sticky overflow flags.
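To make the sticky-flag idea concrete, here is a minimal software sketch in C. It is only a model: `int_status`, `add_sticky` and `sum_checked` are hypothetical names invented for illustration, and the `if` inside `add_sticky` stands in for what hardware would do for free as a side effect of the add. The point it shows is that no add can trap, and the flag is tested once after the whole computation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a sticky overflow flag: once set, it stays set
   until software clears it explicitly. */
typedef struct {
    bool sticky_overflow;
} int_status;

/* Wrapping 32-bit add that records signed overflow in the sticky flag
   instead of raising an exception. */
int32_t add_sticky(int_status *st, int32_t a, int32_t b)
{
    assert(st != NULL);
    int64_t wide = (int64_t)a + (int64_t)b;  /* exact sum, cannot overflow */
    if (wide > INT32_MAX || wide < INT32_MIN)
        st->sticky_overflow = true;
    /* Truncate to 32 bits. The final conversion is implementation-defined
       before C23; mainstream compilers wrap in two's complement. */
    return (int32_t)(uint32_t)wide;
}

/* Sums an array and checks the flag once at the end.
   Returns true if no addition overflowed. */
bool sum_checked(const int32_t *xs, size_t n, int32_t *out)
{
    assert(out != NULL);
    int_status st = { false };
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = add_sticky(&st, acc, xs[i]);
    *out = acc;
    return !st.sticky_overflow;
}
```

A caller would run the whole loop and only then look at the return value of `sum_checked`, which is the software analogue of a single test-and-clear of the flag after a block of arithmetic.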

On a microcontroller with a target price under 50 cents, which may lack a floating-point unit, the infrastructure to support a flags register may be missing, so it may be argued that it is an additional cost, even if the truth is that the cost is negligible. Such an infrastructure existed in 8-bit CPUs with much less than 10 thousand transistors, so arguing that it is too expensive in 32-bit or 64-bit CPUs is BS.

On the other hand, any CPU that includes the floating-point unit must have a status register for the FPU and means of testing and setting its flags, so that infrastructure already exists.

It is enough to allocate some of the unused bits of the FPU status register to the integer overflow flags.

So, no, there are absolutely no valid arguments that may justify the failure to provide means for overflow checking.

I have no idea why they happened to make this choice, but the reasons are not those stated publicly. All this talk about "costs" is BS made up to justify an already taken decision.

For a didactic CPU, which is what RISC-V was actually designed to be, lacking support for overflow checking or for indexed addressing is completely irrelevant. RISC-V is a perfect target for student implementation projects.

The problem appears only when an ISA like RISC-V is taken outside its proper domain of application and forced into industrial or general-purpose applications by managers who have no idea about its real advantages and disadvantages. After that, the design engineers must spend extra effort on workarounds for the ISA shortcomings.

Moreover, the claim that overflow checking may have any influence upon the parallel execution of instructions is incorrect.

For a sticky overflow bit, the order in which it is updated by instructions does not matter. For an overflow bit that shows the last operation, the bit updates must be reordered, but that is also true for absolutely all the registers in a CPU. Even if 4 previous instructions that were executed in parallel had the same destination register, you must ensure that the result stored in the register is the result corresponding to the last instruction in program order. One more bit among hundreds of other bits does not matter.
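The ordering argument can be sketched as a toy model in C. Everything here is an assumption made for illustration: `ovf[i]` is the overflow result of the instruction at program position `i`, and `completion_order` lists the positions in the order the instructions happen to finish. It is not a description of any real pipeline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sticky flag: the OR of all per-instruction overflow results.
   OR is commutative, so completion order cannot change the outcome. */
bool sticky_flag(const bool *ovf, const size_t *completion_order, size_t n)
{
    bool flag = false;
    for (size_t i = 0; i < n; i++)
        flag = flag || ovf[completion_order[i]];
    return flag;
}

/* Naive last-operation flag: whoever finishes last overwrites the flag.
   This is wrong when instructions complete out of order. */
bool naive_last_flag(const bool *ovf, const size_t *completion_order, size_t n)
{
    assert(n > 0);
    return ovf[completion_order[n - 1]];
}

/* Correct last-operation flag: the value from the youngest instruction in
   program order wins, the same rule every renamed register must obey. */
bool last_op_flag(const bool *ovf, const size_t *completion_order, size_t n)
{
    assert(n > 0);
    size_t youngest = completion_order[0];
    for (size_t i = 1; i < n; i++)
        if (completion_order[i] > youngest)
            youngest = completion_order[i];
    return ovf[youngest];
}
```

With four instructions where only the second one overflows, `sticky_flag` returns the same answer for any permutation, while `naive_last_flag` and `last_op_flag` disagree whenever the overflowing instruction happens to finish last. The fix is the ordinary "youngest in program order wins" rule, nothing specific to flags.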

throwaway81523 · on March 20, 2022

> After that, the design engineers must spend extra effort on workarounds for the ISA shortcomings.

That is too optimistic. Programs will keep running unchecked and we'll keep getting CVEs from overflow bugs.

ansible · on March 20, 2022

> The clean solution from a micro architectural point of view would be to have an overflow bit (or whatever flags you wanted) in every integer register.

That's what the Mill CPU does. Each "register" also has the other usual flags, and even some new ones like Not a Result, which helps with vector operations and access protection.
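As a rough illustration of the flag-per-register idea, here is a small C sketch. It is not the Mill's actual metadata format; `tagged_i32`, `tagged_from`, `tagged_add` and `tagged_store` are hypothetical names, and the model assumes the flag travels with the value and is only checked when the value is used for something observable, such as a store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "register" that carries its own overflow bit. */
typedef struct {
    int32_t value;
    bool overflowed;
} tagged_i32;

tagged_i32 tagged_from(int32_t v)
{
    tagged_i32 r = { v, false };
    return r;
}

/* The result is tainted if either input was tainted or this add overflowed,
   so the flag propagates through a dependency chain without any branch. */
tagged_i32 tagged_add(tagged_i32 a, tagged_i32 b)
{
    int64_t wide = (int64_t)a.value + (int64_t)b.value;  /* exact sum */
    tagged_i32 r;
    r.overflowed = a.overflowed || b.overflowed ||
                   wide > INT32_MAX || wide < INT32_MIN;
    /* Implementation-defined narrowing before C23; wraps on mainstream
       compilers. */
    r.value = (int32_t)(uint32_t)wide;
    return r;
}

/* The check happens only at the point of use: a tainted value is refused
   and the destination is left untouched. */
bool tagged_store(tagged_i32 r, int32_t *dst)
{
    assert(dst != NULL);
    if (r.overflowed)
        return false;
    *dst = r.value;
    return true;
}
```

Because each flag lives with its own value, there is no shared status register for independent instructions to contend over, which is the microarchitectural appeal of the approach in the quote.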