5 Free Data Science Books for the New Year
Now that Christmas and the New Year are behind us, the nights are becoming a little shorter with each passing day. Nevertheless, there are still loads of cold winter nights left to endure (unless you’re in the Southern Hemisphere, in which case – throw me a shrimp on the barbie!).
It’s time to dust off your New Year resolutions from last year (remember those?) and get ready for a new start, a new you and some new data skills.
I’ve thrown together a collection of five excellent (and free!) Data Science eBooks for your Kindle to sharpen up your ninja skills while you’re on the long commute to work. Just try not to read them while driving!
I hope that you find something in here that will get your mental juices flowing with ideas about how to tackle your data.
All these books are free, so dive in and enjoy!
by Hadley Wickham and Garrett Grolemund
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible.
Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way.
You’ll learn how to:
- Wrangle – transform your datasets into a form convenient for analysis
- Program – learn powerful R tools for solving data problems with greater clarity and ease
- Explore – examine your data, generate hypotheses, and quickly test them
- Model – provide a low-dimensional summary that captures true “signals” in your dataset
- Communicate – learn R Markdown for integrating prose, code, and results
by Malcolm MacLean
D3.js can help you make data beautiful.
Data is the new medium of choice for telling a story or presenting compelling information on the Internet and d3.js is an extraordinary framework for presentation of data on a web page.
This book is not for experts. It’s put together as a guide to get you started if you’re unsure what d3.js can do. It reads more like a story as it leads the reader through the basics of line graphs and on to discover animation, tooltips, tables, interfacing with MySQL databases via PHP, sankey diagrams, force diagrams, maps and more…
by Mohammed J. Zaki and Wagner Meira, Jr.
The fundamental algorithms in data mining and analysis are the basis for business intelligence and analytics, as well as automated methods to analyze patterns and models for all kinds of data. This textbook for senior undergraduate and graduate data mining courses provides a comprehensive overview from an algorithmic perspective, integrating concepts from machine learning and statistics, with plenty of examples and exercises.
“This book by Mohammed Zaki and Wagner Meira Jr is a great option for teaching a course in data mining or data science. It covers both fundamental and advanced data mining topics, explains the mathematical foundations and the algorithms of data science, includes exercises for each chapter, and provides data, slides and other supplementary material on the companion website.”
Gregory Piatetsky-Shapiro, Founder, ACM SIGKDD, the leading professional organization for Knowledge Discovery and Data Mining.
by Willi Richert and Luis Pedro Coelho
As the Big Data explosion continues at an almost incomprehensible rate, being able to understand and process it becomes even more challenging. With Building Machine Learning Systems with Python, you’ll learn everything you need to tackle the modern data deluge – by harnessing the unique capabilities of Python and its extensive range of numerical and scientific libraries, you will be able to create complex algorithms that can ‘learn’ from data, allowing you to uncover patterns, make predictions, and gain a more in-depth understanding of your data.
Featuring a wealth of real-world examples, this book gives you an accessible route into Python machine learning. Learn about the Iris dataset, find out how to build complex classifiers, and get to grips with clustering through practical examples that deliver complex ideas with clarity. Dig deeper into machine learning, and discover guidance on classification and regression, with practical machine learning projects outlining effective strategies for sentiment analysis and basket analysis. The book also takes you through the latest in computer vision, demonstrating how image processing can be used for pattern recognition, as well as showing you how to get a clearer picture of your data and trends by using dimensionality reduction.
Keep up to speed with one of the most exciting trends to emerge from the world of data science and dig deeper into your data with Python with this unique data science tutorial.
by Lee Baker
Did you know that between them, Sarah Palin, Mike Huckabee and Mitt Romney enjoyed a total of 193% support from Republican voters in the 2012 US primaries? It must be true – it was on a pie chart broadcast on Fox News. Did you also know that the number 34 is smaller than 14, and zero is much bigger than 22? Honest, it’s true, it was published in a respectable national newspaper after the 2017 UK General Election. There can’t have been any kind of misdirection here because they were all shown on a pie chart.
In this astonishing book, award-winning statistician and author Lee Baker uncovers how politicians, the press, corporations and other statistical conmen use graphs and charts to deceive their unwitting audience. Like how such a shocking, and yet seemingly innocuous, statement as “Every year since 1950, the number of children gunned down has doubled” meant that there should have been at least 35 trillion gun deaths in 1995 alone, the year the quote was printed in a reputable journal. Or how an anti-abortion group made their point by trying to convince us all that 327,000 is actually a larger number than 935,573. Nice try, but no cigar – we weren’t born yesterday.
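The 35 trillion figure can be checked with a few lines of Python. This is only a sketch: it assumes a starting count of just one death in 1950, which the quote does not state, and the function name is purely illustrative.

```python
# Assumption (not in the quote): a single death in 1950 as the starting point.
START_YEAR = 1950
START_COUNT = 1


def doubled_count(year, start_year=START_YEAR, start_count=START_COUNT):
    """Return the count for `year` if the figure doubles every year from start_year."""
    return start_count * 2 ** (year - start_year)


# 45 doublings between 1950 and 1995
count_1995 = doubled_count(1995)
print(f"{count_1995:,}")  # 35,184,372,088,832
```

Any larger starting count only pushes the 1995 figure higher, so 35 trillion really is the minimum the statement implies.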
In his trademark sardonic style, the author reveals the secrets of how the statistical hustlers use graphs and charts to manipulate and misrepresent for political or commercial gain – and often get away with it.
Written as a layman’s guide to lying, cheating and deceiving with graphs, there’s not a dull page in sight!
And it’s got elephants in it too…
So there you have it – 5 free Data Science eBooks to get your back-to-work-after-the-holidays head back on and into the swing of things.
I hope you enjoy them, and it would be great if you would leave brief reviews of these books in the comments below – I’m sure all the authors would appreciate your comments and shares.
About the Author
With decades of experience in science, statistics and artificial intelligence, he has a passion for telling stories with data. Despite explaining it a dozen times, his mother still doesn’t understand what he does for a living.
Insisting that data analysis is much simpler than we think it is, he authors friendly, easy-to-understand blogs and books that teach the fundamentals of data analysis and statistics.
His mission is to unleash your inner data ninja!
As the CEO of Chi-Squared Innovations, one day he’d like to retire to do something simpler, like crocodile wrestling.
PS – Don’t forget to connect with me on Twitter: @eelrekab
Other DSC Articles by the same Author
- 5 Free Statistics eBooks You Need to Read This Autumn
- 5 Free Data Science eBooks For Your Summer Reading List