Python: Real-World Data Science

Unleash the power of Python and its robust data science capabilities

About This Book

Unleash the power of Python 3 objects
Learn to use powerful Python libraries for effective data processing and analysis
Harness the power of Python to analyze data and create insightful predictive models
Unlock deeper insights into machine learning with this vital guide to cutting-edge predictive analytics

Who This Book Is For

Entry-level analysts who want to enter the data science world will find this course useful for getting acquainted with Python's data science capabilities for real-world data analysis.

What You Will Learn

Install and set up Python
Implement objects in Python by creating classes and defining methods
Get acquainted with NumPy to use it with arrays and array-oriented computing in data analysis
Create effective visualizations for presenting your data using Matplotlib
Process and analyze data using the time series capabilities of pandas
Interact with different kinds of database systems, such as file, disk format, Mongo, and Redis
Apply data mining concepts to real-world problems
Compute on big data, including real-time data from the Internet
Explore how to use different machine learning models to ask different questions of your data

In Detail

The Python: Real-World Data Science course will take you on a journey to become an efficient data science practitioner by thoroughly understanding the key concepts of Python. This learning path is divided into four modules, each a mini course in its own right. As you complete each one, you'll have gained key skills and be ready for the material in the next module.

The course begins with getting your Python fundamentals nailed down. Once you are familiar with Python's core concepts, it's time to dive into the field of data science. In the second module, you'll learn how to perform data analysis using Python in a practical and example-driven way. The third module will teach you how to design and develop data mining applications using a variety of datasets, moving from basic classification and affinity analysis to more complex data types including text, images, and graphs. Machine learning and predictive analytics have become the most important approaches to uncover data gold mines. In the final module, we'll discuss the necessary details regarding machine learning concepts, offering intuitive yet informative explanations on how machine learning algorithms work, how to use them, and most importantly, how to avoid the common pitfalls.

Style and approach

This course includes all the resources that will help you jump into the data science field with Python and learn how to make sense of data. The aim is to create a smooth learning path that will teach you how to get started with powerful Python libraries and perform various data science techniques in depth.

 

Contents

Introduction and First Steps - Take a Deep Breath
Object-oriented Design
Objects in Python
When Objects Are Alike
Expecting the Unexpected
When to Use Object-oriented Programming
Python Data Structures
Python Object-oriented Shortcuts
Strings and Serialization
The Iterator Pattern
Python Design Patterns I
Python Design Patterns II
Testing Object-oriented Programs
Concurrency
Introducing Data Analysis and Libraries
NumPy Arrays and Vectorized Computation
Data Analysis with pandas
Data Visualization
Time Series
Interacting with Databases
Data Analysis Application Examples
Getting Started with Data Mining
Classifying with scikit-learn Estimators
Predicting Sports Winners with Decision Trees
Recommending Movies Using Affinity Analysis
Extracting Features with Transformers
Social Media Insight Using Naive Bayes
Discovering Accounts to Follow Using Graph Mining
Beating CAPTCHAs with Neural Networks
Authorship Attribution
Clustering News Articles
Classifying Objects in Images Using Deep Learning
Working with Big Data
Next Steps
Giving Computers the Ability to Learn from Data
Training Machine Learning Algorithms for Classification
A Tour of Machine Learning Classifiers Using scikit-learn
Building Good Training Sets - Data Preprocessing
Compressing Data via Dimensionality Reduction
Learning Best Practices for Model Evaluation and Hyperparameter Tuning
Combining Different Models for Ensemble Learning
Predicting Continuous Target Variables with Regression Analysis
Reflect and Test Yourself Answers

About the author (2016)

Fabrizio Romano was born in Italy in 1975. He holds a master's degree in computer science engineering from the University of Padova. He is also a certified Scrum master. Before Python, he worked with several other languages, such as C/C++, Java, PHP, and C#. In 2011, he moved to London and started working as a Python developer for Glasses Direct, one of Europe's leading online prescription glasses retailers. He then worked as a senior Python developer for TBG (now Sprinklr), one of the world's leading companies in social media advertising. At TBG, he and his team collaborated with Facebook and Twitter. They were the first in the world to get access to the Twitter advertising API. He wrote the code that published the first geo-narrowcasted promoted tweet in the world using the API. He currently works as a senior platform developer at Student.com, a company that is revolutionizing the way international students find their perfect home all around the world. He has delivered talks on Teaching Python and TDD with Python at the last two editions of EuroPython and at Skillsmatter in London.

Phuong Vo.T.H has an MSc degree in computer science with a focus related to machine learning. After graduation, she went on to work at several companies as a data scientist. She has experience in analyzing users' behavior and building recommendation systems based on users' web histories. She loves to read books on machine learning and mathematical algorithms, as well as data analysis articles.

Martin Czygan studied German literature and computer science in Leipzig, Germany. He has been working as a software engineer for more than 10 years. For the past eight years, he has been diving into Python, and is still enjoying it. In recent years, he has been helping clients to build data processing pipelines and search and analytics systems. His consultancy can be found at http://www.xvfz.net.
