Last updated: 2021-06-30 19:06:29
Cover Page
Title Page
Packt Upsell
Why subscribe?
PacktPub.com
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Analyzing Insurance Severity Claims
Machine learning and learning workflow
Typical machine learning workflow
Hyperparameter tuning and cross-validation
Analyzing and predicting insurance severity claims
Motivation
Description of the dataset
Exploratory analysis of the dataset
Data preprocessing
LR for predicting insurance severity claims
Developing insurance severity claims predictive model using LR
GBT regressor for predicting insurance severity claims
Boosting the performance using random forest regressor
Random Forest for classification and regression
Comparative analysis and model deployment
Spark-based model deployment for large-scale dataset
Summary
Analyzing and Predicting Telecommunication Churn
Why do we perform churn analysis and how do we do it?
Developing a churn analytics pipeline
Exploratory analysis and feature engineering
LR for churn prediction
SVM for churn prediction
DTs for churn prediction
Random Forest for churn prediction
Selecting the best model for deployment
High Frequency Bitcoin Price Prediction from Historical and Live Data
Bitcoin cryptocurrency and online trading
State-of-the-art automated trading of Bitcoin
Training
Prediction
High-level data pipeline of the prototype
Historical and live-price data collection
Historical data collection
Transformation of historical data into a time series
Assumptions and design choices
Real-time data through the Cryptocompare API
Model training for prediction
Scala Play web service
Concurrency through Akka actors
Web service workflow
JobModule
Scheduler
SchedulerActor
PredictionActor and the prediction step
TraderActor
Predicting prices and evaluating the model
Demo prediction using Scala Play framework
Why RESTful architecture?
Project structure
Running the Scala Play web app
Population-Scale Clustering and Ethnicity Prediction
Population-scale clustering and geographic ethnicity
Machine learning for genetic variants
1000 Genomes Project dataset description
Algorithms, tools, and techniques
H2O and Sparkling Water
ADAM for large-scale genomics data processing
Unsupervised machine learning
Population genomics and clustering
How does K-means work?
DNNs for geographic ethnicity prediction
Configuring programming environment
Data pre-processing and feature engineering
Model training and hyperparameter tuning
Spark-based K-means for population-scale clustering
Determining the number of optimal clusters
Using H2O for ethnicity prediction
Using random forest for ethnicity prediction
Topic Modeling - A Better Insight into Large-Scale Texts
Topic modeling and text clustering
How does the LDA algorithm work?
Topic modeling with Spark MLlib and Stanford NLP
Implementation
Step 1 - Creating a Spark session
Step 2 - Creating the vocabulary and token counts to train the LDA model after text pre-processing
Step 3 - Instantiate the LDA model before training