Visualizations for machine learning datasets

bonniemuffin · on Oct 8, 2017

I'm not sure why this is described as being for "machine learning datasets" when it actually sounds useful for any exploratory data analysis, regardless of what the user intends to do with the data. Anyway, this looks super slick, and I definitely want to try it out in a Jupyter notebook.

divbzero · on Oct 8, 2017

Probably for the same reason that predictive models are described as powered by “AI” instead of by “statistical learning”: Some terms are simply more in vogue even if they’re more vague.

somesnm · on Oct 9, 2017

For nice-looking basic statistics of your dataset, the pandas-profiling library will do the trick: https://github.com/JosPolfliet/pandas-profiling
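
To give a rough idea of what "basic statistics of your dataset" means here, below is a minimal standard-library sketch of a per-column summary (count, missing, distinct, and mean/min/max for numeric columns). It is not pandas-profiling's API; the function names `summarize_column` and `summarize` and the list-of-dicts input format are assumptions made for illustration only.

```python
import statistics


def summarize_column(values):
    """Summarize one column; None is treated as a missing value (an assumption)."""
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Only report numeric statistics when every present value is a number.
    if present and all(isinstance(v, (int, float)) for v in present):
        summary["mean"] = statistics.mean(present)
        summary["min"] = min(present)
        summary["max"] = max(present)
    return summary


def summarize(rows):
    """Summarize a dataset given as a list of dicts (hypothetical input format)."""
    keys = []
    for row in rows:
        for key in row:
            if key not in keys:
                keys.append(key)
    # A key absent from a row counts as missing for that row.
    return {key: summarize_column([row.get(key) for row in rows]) for key in keys}


rows = [
    {"age": 30, "city": "Oslo"},
    {"age": 40, "city": None},
    {"age": 50, "city": "Oslo"},
]
report = summarize(rows)
print(report["age"])
print(report["city"])
```

A real profiling tool adds histograms, correlations, and an HTML report on top of numbers like these.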

nl · on Oct 9, 2017

The front page https://pair-code.github.io/facets/ is more useful, because you can load up your own data and try it out.

elsherbini · on Oct 8, 2017

They have an example of using the tool on the Quick, Draw! dataset: https://pair-code.github.io/facets/quickdraw.html

theschreon · on Oct 8, 2017

Does this also work for larger numbers of features? (Say, 2,000, as opposed to the 6-9 in the demo.)

lacksconfidence · on Oct 8, 2017

Only if your dataset is really small. It only supports up to the low millions of points.
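
One way to read that limit, as a rough sketch: if "points" means individual values (rows times features), which is an assumption and not something confirmed in this thread, then a wide dataset can be randomly downsampled by row until it fits. The budget of 1,000,000 and the names `max_rows_for_budget` and `downsample_rows` are hypothetical, chosen only for illustration.

```python
import random


def max_rows_for_budget(n_features, max_points=1_000_000):
    """Rows that fit if each row contributes n_features points (assumed meaning)."""
    return max(1, max_points // n_features)


def downsample_rows(rows, n_features, max_points=1_000_000, seed=0):
    """Return a uniform random sample of rows that stays within the point budget."""
    limit = max_rows_for_budget(n_features, max_points)
    if len(rows) <= limit:
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(rows, limit)


# With 2,000 features, a 1,000,000-point budget leaves room for 500 rows.
rows = list(range(10_000))  # stand-in for 10,000 data rows
sample = downsample_rows(rows, n_features=2000)
print(len(sample))
```

Under this reading, 2,000 features is workable only for a few hundred to a few thousand rows, which matches the "really small" caveat.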