This is a nice compilation of issues that you should be concerned about. The examples mostly come from problems that interest Google, but you will find the advice useful no matter what type of data you work with. The advice is split into three broad categories:
- technical (e.g., look at your distributions),
- process (e.g., separate validation, description, and evaluation), and
- communication (e.g., data analysis starts with questions, not data or a technique).
Patrick Riley. Practical advice for analysis of large, complex data sets. The Unofficial Google Data Science Blog. Published October 31, 2016. Available at http://www.unofficialgoogledatascience.com/2016/10/practical-advice-for-analysis-of-large.html.