
You're right that the use cases are very similar to regular autodiff, with the added benefit that the returned gradient also accounts for the effects of taking alternative branches.

Just to clarify: we do a kind of source-to-source transformation by transparently injecting some API calls in the right places (e.g., before branching statements) prior to compilation. The compiled program then returns the program output alongside the gradient.
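
To illustrate, here's a rough before/after sketch (the _dg_branch and sdouble names below are made up for illustration, they're not our actual API):

    // what the user writes:
    double f(double x) {
      if (x < 0)       // branch introduces a discontinuity in x
        return 0.0;
      return x * x;
    }

    // what the transformed source looks like, conceptually:
    sdouble f(sdouble x) {
      if (_dg_branch(x < 0))  // injected call: lets the runtime
        return 0.0;           // account for both branch outcomes
      return x * x;
    }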

For the continuous parts, the AD library that comes with DiscoGrad uses operator overloading.
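
For reference, the operator-overloading idea in its simplest (forward-mode) form looks roughly like this; this is a generic textbook sketch, not our actual implementation:

    #include <cstdio>

    // Each value carries its derivative; overloaded operators
    // propagate derivatives via the chain rule (forward mode).
    struct Dual {
      double val, der;
    };
    Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
    Dual operator*(Dual a, Dual b) {
      return {a.val * b.val, a.der * b.val + a.val * b.der};
    }

    int main() {
      Dual x{3.0, 1.0};    // seed dx/dx = 1
      Dual y = x * x + x;  // y = x^2 + x
      std::printf("value=%g derivative=%g\n", y.val, y.der);  // 12 and 7
      return 0;
    }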



> with the added benefit that the returned gradient also accounts for the effects of taking alternative branches.

Does this mean that you can take the partial derivative with respect to some boolean variable that will be used in an if (for example), but with regular autodiff you can't?

I'm struggling to understand why regular autodiff works even in the presence of this limitation. Is it just a crude approximation of the "true" derivative?
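
For concreteness, here's my own toy example (not from DiscoGrad):

    double step(double x) {
      if (x < 0.5) return 0.0;  // derivative along this path: 0
      return 1.0;               // derivative along this path: 0
    }

As I understand it, regular autodiff only differentiates the path actually taken, so it would report dstep/dx == 0 for every x, even though the function jumps at x = 0.5. Is that the effect the branch-aware gradient accounts for?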



