
Well...I think that take is a little overly cynical, and I disagree particularly with this:

>the mapping from inputs to outputs performed by deep nets quickly stops making sense if new inputs differ even slightly from what they saw at training time

In my experience that isn't really true, provided you have an appropriately designed net, training data that adequately samples the problem space, and a net that is not overtrained (overfit).

You can think of training data as representing points in a high-dimensional space. Like any interpolation problem, if you sample the space with the right density, you can get accurate interpolation results - and neural nets have another huge advantage, in that they learn highly nonlinear interpolation in these high-dimensional spaces. So the net may be unlikely to generalize to points outside of the sampled space - although now that I think of it, I'm not sure how nets handle extrapolation - but when you're dealing with a space with thousands of dimensions (like each pixel in an image), you can still derive a ton of utility from the interpolation, which effectively replaces hardcoded rules about the problem you're solving.
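To make the interpolation-vs-extrapolation distinction concrete, here is a minimal sketch (my own toy example, not from the comment above; it assumes numpy and scikit-learn's MLPRegressor, and the sine target is arbitrary). The net fits well inside the densely sampled interval, but its prediction drifts once you query a point well outside it:

    # A small MLP interpolates well inside the region it was trained on,
    # but its prediction drifts once you leave that region (extrapolation).
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    x_train = rng.uniform(-3.0, 3.0, size=(2000, 1))  # densely sampled interval
    y_train = np.sin(x_train).ravel()

    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000, random_state=0)
    net.fit(x_train, y_train)

    for x in (1.5, 6.0):  # 1.5 is inside the sampled interval, 6.0 is outside
        pred = net.predict(np.array([[x]]))[0]
        print(f"x = {x}: predicted {pred:.3f}, true {np.sin(x):.3f}")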



I may be jumping the gun a little because I was thinking about this in the context of another thread, but a practical problem with machine learning in general is that, for a learned model to generalise well to unseen data, the training dataset (all the data you have available, regardless of how you partition it into training, testing and validation sets) must be drawn from the same distribution as the "real world" data.

The actual problem is that this is very difficult, if not impossible, to know before training begins. Most of the time, the best that can be achieved is to train a model on whatever data you have and then painstakingly test it, at length and at some cost, on the real-world inputs the trained model has to operate on.

Basically, it's very hard to know your sampling error.
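As a rough illustration of that failure mode (my own sketch, assuming scikit-learn; the Gaussian blobs and the shift amount are made up), a classifier that scores well on a held-out split of the training distribution can fall apart as soon as the inputs it actually sees come from a shifted distribution:

    # Toy distribution shift: a held-out split drawn from the training
    # distribution looks fine, a shifted "real world" sample does not.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)

    def sample(n, shift=0.0):
        """Two Gaussian classes; `shift` moves the inputs seen at test time."""
        x0 = rng.normal(loc=[0.0 + shift, 0.0], scale=1.0, size=(n, 2))
        x1 = rng.normal(loc=[2.0 + shift, 2.0], scale=1.0, size=(n, 2))
        return np.vstack([x0, x1]), np.array([0] * n + [1] * n)

    X_train, y_train = sample(2000)
    X_iid, y_iid = sample(2000)                 # same distribution as training
    X_shift, y_shift = sample(2000, shift=3.0)  # shifted "real world" inputs

    clf = LogisticRegression().fit(X_train, y_train)
    print(f"held-out (same distribution) accuracy: {clf.score(X_iid, y_iid):.2f}")
    print(f"shifted distribution accuracy:         {clf.score(X_shift, y_shift):.2f}")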

Regarding interpolation and dense sampling etc., the larger the dimensionality of the problem, the harder it gets to ensure your data is "dense", let alone that it covers an adequate region of the instance space. For example, the pixel configurations in one image are a tiny, tiny subset of the pixel configurations in all possible images - which is what you really want to represent. Come to that, the pixel configurations in many hundreds of thousands of images are still a tiny, tiny subset of those in all possible images. I find Chollet's criticism not cynical, but pragmatic and very useful. It's important to understand the limitations of whatever tool you're using.
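To put a rough number on how quickly "dense" coverage becomes hopeless (a back-of-the-envelope sketch of my own, not from the thread): covering the unit cube [0,1]^d with a grid of spacing 0.1 takes 10^d points, so the sample count needed for anything like grid density explodes with dimension:

    # Grid points needed to cover [0,1]^d at spacing 0.1 is 10**d; "dense"
    # sampling is hopeless long before image-sized dimensionality.
    for d in (2, 3, 10, 20):
        print(f"d = {d:2d} -> {10**d:,} grid points")
    print("d = 784 (a 28x28 grayscale image) -> 10**784 grid points")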

>> although now that I think of it I'm not sure of how nets handle extrapolation

They don't. It's the gradient optimisation. It gets stuck in local minima - always has, always will. Maybe a new training method will come along at some point. Until then, don't expect extrapolation.



