Some really good advice in this article, which cautions against complex algorithms, never mind machine learning and AI, until you validate the data:
- Concurrency
- Accuracy
- Relevance
- Completeness
Do not force (even by mild assumption) the use of sophisticated algorithms and complex models if the data does not support them. Sometimes much simpler is much better. The problem of overfitting (building unnecessarily complex models which serve only to reproduce idiosyncrasies of the training data) is well documented, but the extent of this problem is still capable of causing surprise! Let the algorithms - in collaboration with the data - speak for themselves.
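To make the overfitting point concrete, here is a minimal sketch of my own (not taken from the article). It assumes a made-up linear relationship, y = 2x + 1, with Gaussian noise, and compares a straight-line fit against a polynomial that passes through every training point. All function names, the noise level and the sample sizes are illustrative assumptions.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a predict function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial of degree len(xs) - 1 through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mse(predict, xs, ys):
    """Mean squared error of a predict function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def compare_models(seed=0, n_train=10, noise_sd=0.3):
    rng = random.Random(seed)

    def truth(x):
        # Assumed "true" relationship, chosen only for this illustration.
        return 2.0 * x + 1.0

    # Small, noisy training set on evenly spaced points in [0, 1].
    x_train = [i / (n_train - 1) for i in range(n_train)]
    y_train = [truth(x) + rng.gauss(0.0, noise_sd) for x in x_train]

    # Dense, noise-free grid standing in for unseen data.
    x_test = [i / 199 for i in range(200)]
    y_test = [truth(x) for x in x_test]

    simple = fit_line(x_train, y_train)
    complex_model = fit_interpolant(x_train, y_train)
    return {
        "simple_train_mse": mse(simple, x_train, y_train),
        "complex_train_mse": mse(complex_model, x_train, y_train),
        "simple_test_mse": mse(simple, x_test, y_test),
        "complex_test_mse": mse(complex_model, x_test, y_test),
    }


if __name__ == "__main__":
    for name, value in compare_models().items():
        print(f"{name}: {value:.4f}")
```

The interpolating polynomial reproduces the training points exactly, noise included, so its training error is zero. On the fresh grid it should typically do worse than the plain line, because it has fitted the noise rather than the trend. Trying a few different seeds gives a feel for how large the gap can get.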
http://www.analyticbridge.com/profiles/blogs/when-to-trust-the-algorithms-and-not-the-data