Natalia: Google Machine Learning

Shared by: Jian.1990 | Shared on: 2017-1-22 21:49

Nail Your Next ML Gig
Natalia Ponomareva
Research, Machine Learning, Google

What is Machine Learning?

What is Machine Learning (ML)

Conceptually: given (training) data, discover some underlying pattern and use this discovered pattern (on new data).
Types
- Supervised learning
- Unsupervised learning
- Semi-supervised learning
- Etc...

What is ML: continued

Supervised Learning - training input data is labeled.
- Classification (learn a decision boundary). Examples: text, image, or video classification; spam detection; etc.
- Regression (learn to predict a continuous value). Examples: predicting a house price, or how much a user is willing to spend.
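
To make the two task types concrete, here is a minimal sketch in plain Python. The data, the function names `fit_line` and `fit_threshold`, and the choice of a one-feature least-squares line and a midpoint threshold are all illustrative assumptions, not methods taken from the slides.

```python
from statistics import mean

def fit_line(xs, ys):
    """Regression: least-squares fit of y = a * x + b for one input feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    a = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return a, y_bar - a * x_bar

def fit_threshold(xs, labels):
    """Classification: a decision boundary at the midpoint of the two class means."""
    pos = mean(x for x, y in zip(xs, labels) if y == 1)
    neg = mean(x for x, y in zip(xs, labels) if y == 0)
    return (pos + neg) / 2

# Regression on made-up data: house area -> price (continuous output).
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]
a, b = fit_line(areas, prices)
predicted_price = a * 90 + b

# Classification on made-up data: suspicious-word count -> spam (1) or not (0).
counts = [0, 1, 2, 8, 9, 10]
is_spam = [0, 0, 0, 1, 1, 1]
boundary = fit_threshold(counts, is_spam)

def classify(count):
    """Label an email as spam when it falls above the learned boundary."""
    return int(count > boundary)
```

The regression model returns a number on a continuous scale, while the classifier only reports which side of the boundary an input falls on.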

What is ML: continued

How ML can help your business

- Personalize services (give each user a unique, tailored experience), which can increase engagement and revenue
- Automate tasks that are error-prone or time-consuming (transcription, character recognition, etc.)
- Analyze data to make better decisions
- Etc.

Where do I start?

- Start small
  - Sample your data
  - Come up with initial features
  - Build a simple model, see if it looks promising
- Scale
  - Train on the full data available
  - Work on improving features (feature engineering)
  - Try different algorithms (model selection)
- Think about productionalizing
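
The "start small" steps above can be sketched as follows. This is a hedged illustration only: the synthetic `rows`, the helper names, and the use of a majority-class baseline as the "simple model" are assumptions made for the example, not something the slides prescribe.

```python
import random
from collections import Counter

def sample_rows(rows, fraction, seed=0):
    """Draw a reproducible random subset of the data to iterate on quickly."""
    rng = random.Random(seed)
    k = max(1, round(len(rows) * fraction))
    return rng.sample(rows, k)

def majority_baseline(labels):
    """Simplest possible model: always predict the most common label."""
    return Counter(labels).most_common(1)[0][0]

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical dataset of (feature, label) pairs; one row in four is positive.
rows = [(i, int(i % 4 == 0)) for i in range(1000)]

subset = sample_rows(rows, 0.1)
labels = [label for _, label in subset]
guess = majority_baseline(labels)
baseline_accuracy = accuracy([guess] * len(labels), labels)
```

Any real model should beat `baseline_accuracy` on the sample before it is worth scaling up to the full data.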

Real world supervised ML conceptually

Feature engineering: what is it?

Conceptually, feature engineering is the process of transforming your raw data (logs, history of products bought or behaviour on the website etc.) into a vector that can be used by the learning algorithm for training and prediction.
- It is highly domain specific
- It depends on what you are trying to learn from the data
- It is labour-intensive

Feature engineering: how to go about it

The high-level steps are:
- Decide on the insight you are trying to obtain (for example, we want to train a model that suggests another song for a user to listen to).
- Decide how you will model the insight (there are numerous ways!)
  - For example, use a classification model that, given a user and a song, returns whether the user will be interested in the song
  - Run the list of songs through the model and show the user the songs the model predicts they might find interesting
- Consider what data you have (for example, the history of songs the user listened to and user profile information)
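
The modelling choice described above (a per-song classifier applied to a list of candidate songs) might look like the sketch below. The `recommend` helper, the dictionary layout, and `toy_model` are hypothetical stand-ins; a trained classifier would take the place of `toy_model`.

```python
def recommend(user, songs, is_interesting, limit=10):
    """Run every candidate song through the classifier and keep the positives."""
    picks = [song for song in songs if is_interesting(user, song)]
    return picks[:limit]

def toy_model(user, song):
    """Hypothetical stand-in for a trained classifier: True if the genre matches."""
    return song["genre"] in user["liked_genres"]

user = {"liked_genres": {"rock", "pop"}}
songs = [
    {"title": "A", "genre": "rock"},
    {"title": "B", "genre": "country"},
    {"title": "C", "genre": "pop"},
]
suggestions = recommend(user, songs, toy_model)
```

Passing the classifier in as a function keeps the recommendation step independent of whichever model is eventually chosen.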

Feature engineering: continued

- Consider what might be relevant
  - User age (probably)? User name and email address (not at all). Location (possibly)? History of songs the user listened to (yes).
  - Which genres of songs the user listened to before (country, rock, pop, etc.) (very relevant)
- Come up with a digital representation for the relevant information
- Come up with features that
  - Describe the user
  - Describe the songs
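
As one possible digital representation, the sketch below turns a user's listening history into a vector of genre fractions and a song's genre into a one-hot vector. The fixed `GENRES` list and both encodings are assumptions chosen for illustration; the slides do not specify a particular encoding.

```python
# Assumed, fixed genre vocabulary; its order defines the vector positions.
GENRES = ["country", "rock", "pop"]

def genre_profile(history):
    """User feature: fraction of listened songs that fall in each genre."""
    counts = [sum(1 for g in history if g == genre) for genre in GENRES]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]

def one_hot(genre):
    """Song feature: 1.0 in the position of the song's genre, 0.0 elsewhere."""
    return [1.0 if genre == g else 0.0 for g in GENRES]

user_vector = genre_profile(["rock", "rock", "pop", "country"])
song_vector = one_hot("pop")
```

Genres outside the assumed vocabulary are simply ignored here; a production system would need an explicit policy for them.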

Feature engineering: continued

- Prepare your final training data
  - Given the features for the user, u = (u1, u2, u3, ..., un)
  - And the features for the songs, s1 = (s11, s12, s13, ..., s1k), s2 = (s21, s22, ..., s2k), ...
  - Create training instances for user u: (u, si) with label 1 (listened to) and (u, sj) with label 0 (didn't listen to)
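
A minimal sketch of this step is shown below. Concatenating the user and song vectors into one input is an assumption made for the example (the slides only say the instance is the pair), and `make_instances` and the sample vectors are hypothetical.

```python
def make_instances(user_vec, song_vecs, listened):
    """Pair the user with every song; label 1 if listened to, else 0."""
    instances = []
    for song_id, song_vec in song_vecs.items():
        label = 1 if song_id in listened else 0
        # Assumed joint representation: user features followed by song features.
        instances.append((user_vec + song_vec, label))
    return instances

u = [0.3, 1.0]  # hypothetical user features (n = 2)
songs = {       # hypothetical song features (k = 3)
    "s1": [1.0, 0.0, 0.0],
    "s2": [0.0, 1.0, 0.0],
    "s3": [0.0, 0.0, 1.0],
}
training = make_instances(u, songs, listened={"s1", "s3"})
```

Each instance has n + k features, so every row has the same length regardless of which song it describes.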

Feature normalization


Your features are most likely on different scales:
- User age: a numeric value between 0 and 100
- User income: from 0 to millions!
Some machine learning models may not work well with features on such different scales:
- Regularization will penalize features differently
- Distance will be governed by the feature with the largest range
- Some optimization algorithms (e.g. gradient descent) converge faster on normalized features
- etc.
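
The distance problem can be seen in a small sketch. Min-max scaling and z-score standardization are two common normalization schemes; the slides do not name a specific one, so both functions and the sample users below are illustrative assumptions.

```python
import math
from statistics import mean, pstdev

def min_max(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Three hypothetical users: user 1 differs from user 0 mostly in age,
# user 2 differs from user 0 mostly in income.
ages = [20, 60, 21]
incomes = [50_000, 51_000, 90_000]

raw = list(zip(ages, incomes))
scaled = list(zip(min_max(ages), min_max(incomes)))

raw_near = math.dist(raw[0], raw[1])
raw_far = math.dist(raw[0], raw[2])
scaled_near = math.dist(scaled[0], scaled[1])
scaled_far = math.dist(scaled[0], scaled[2])
```

On the raw values, user 2 appears roughly 40 times farther from user 0 than user 1 does, purely because income has the larger range. After min-max scaling, the full age gap and the full income gap contribute equally to the distance.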


Model evaluation

Model selection: rough guidelines

Model selection: bringing it to the next level

Consider deep learning
- If you have a lot of labelled data (think millions of instances)
- If you have a hard time coming up with features, or the connection between features is very complicated (example: object detection)
- If you can tolerate longer training/refinement time
- If you know what you are doing
  - What architecture to choose? (How many layers? Fully connected or not? etc.)
  - How to prevent overfitting? (DNNs can model rare dependencies in the data, but should they be allowed to?)

Hyperparameter tuning

Supervised ML Pipelines

Tools/Frameworks

Things to consider before choosing a framework


Next steps

ML tools for production: Hands-on approach

ML tools for production: Hands-on approach

ML tools: ML as a service
