Merge pull request #250 from apachecn/dev

Periodic merge - Dev
片刻 2018-07-25 12:05:59 +08:00 committed by GitHub
commit e480b0849e
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 3094 additions and 13 deletions


@@ -1,6 +1,8 @@
  # Kaggle
  ![](static/images/logos/kaggle-logo-gray-bigger.jpeg)
+ * [ApacheCN open source organization](https://github.com/apachecn/organization): https://github.com/apachecn/organization
  > **Everyone is welcome to participate and improve this project: one person can go fast, but a group of people can go further**
  * <strong>ApacheCN - Kaggle team-up group【686932392】<a target="_blank" href="//shang.qq.com/wpa/qunwpa?idkey=716b584bbd7cdf64e961b499c7fb5891faf1f6c92dad026e3c596a57c834f1ec"><img title="ApacheCN - Kaggle组队群【686932392】" src="http://www.apachecn.org/wp-content/uploads/2017/10/ApacheCN-group.png" alt="ApacheCN - Kaggle组队群【686932392】" /></a></strong></li>
  * [Kaggle](https://www.kaggle.com) is a popular data science competition platform.
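@@ -15,6 +17,16 @@
  ## [Competitions](https://www.kaggle.com/competitions)
+ * [Recommended] The complete feature engineering process: https://www.cnblogs.com/jasonfreak/p/5448385.html
+ > Interpreting train loss and test loss
+ * train loss keeps falling and test loss keeps falling: the network is still learning;
+ * train loss keeps falling while test loss levels off: the network is overfitting;
+ * train loss levels off while test loss keeps falling: the dataset almost certainly has a problem;
+ * train loss levels off and test loss levels off: learning has hit a bottleneck, and the learning rate or batch size should be reduced;
+ * train loss keeps rising and test loss keeps rising: the network architecture is poorly designed, the training hyperparameters are poorly chosen, or there are problems with how the dataset was cleaned.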
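The five rules above amount to a lookup on the pair of trends (train, test). The following is only a minimal sketch of that lookup, not code from this repository: the helper names `trend` and `diagnose`, the tolerance `tol`, and the choice to compare the first and last thirds of each curve are all assumptions made for illustration.

```python
from statistics import mean

def trend(losses, tol=1e-3):
    """Classify a loss curve as 'down', 'up', or 'flat'.

    Assumption: the trend is the mean of the last third of the curve
    minus the mean of the first third, compared against `tol`.
    """
    if len(losses) < 3:
        raise ValueError("need at least 3 loss values")
    k = max(1, len(losses) // 3)
    delta = mean(losses[-k:]) - mean(losses[:k])
    if delta < -tol:
        return "down"
    if delta > tol:
        return "up"
    return "flat"

# One entry per rule in the list above, keyed by (train trend, test trend).
DIAGNOSIS = {
    ("down", "down"): "still learning",
    ("down", "flat"): "likely overfitting",
    ("flat", "down"): "suspect the dataset",
    ("flat", "flat"): "plateau: try a smaller learning rate or batch size",
    ("up", "up"): "check architecture, hyperparameters, and data cleaning",
}

def diagnose(train_losses, test_losses, tol=1e-3):
    """Map a pair of loss curves to the matching rule, if any."""
    key = (trend(train_losses, tol), trend(test_losses, tol))
    return DIAGNOSIS.get(key, "not covered by the rules above")
```

Combinations the list does not mention, such as train loss falling while test loss rises, fall through to the default message rather than being guessed at.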
``` ```
  Machine learning competitions with large prizes and scores that are recognized in industry.
  Now that we are ready to try Kaggle competitions, note that they fall into the following categories.
@@ -31,6 +43,7 @@
  * [**Digit Recognizer**](/competitions/getting-started/digit-recognizer)
  * [**Titanic**](/competitions/getting-started/titanic)
  * [**House Prices**](/competitions/getting-started/house-price)
+ * [**NLP - Sentiment Analysis**](/competitions/getting-started/word2vec-nlp-tutorial)
  > [Part 3: Playground (training ground)](https://www.kaggle.com/competitions?sortBy=deadline&group=all&page=1&pageSize=20&segment=playground)
@@ -134,15 +147,3 @@
  * QQ: 529815144 (片刻) 1042658081 (那伊抹微笑) 190442212 (瑶妹)
  * **ApacheCN - Machine Learning study group【629470233】<a target="_blank" href="//shang.qq.com/wpa/qunwpa?idkey=30e5f1123a79867570f665aa3a483ca404b1c3f77737bc01ec520ed5f078ddef"><img border="0" src="static/images/logos/ApacheCN-group.png" alt="ApacheCN - 学习机器学习群【629470233】" title="ApacheCN - 学习机器学习群【629470233】"></a>**
  * **Kaggle (data science competition platform) | [ApacheCN (Apache Chinese community)](http://www.apachecn.org/)**
- ## [ApacheCN organization resources](http://www.apachecn.org/)
- > [kaggle: machine learning competitions](https://github.com/apachecn/kaggle)
- | Deep learning | Machine learning | Big data | Ops tools |
- | --- | --- | --- | --- |
- | [TensorFlow R1.2 Chinese documentation](http://cwiki.apachecn.org/pages/viewpage.action?pageId=10030122) | [Machine Learning in Action - tutorial](https://github.com/apachecn/MachineLearning) | [Spark 2.2.0 and 2.0.2 Chinese documentation](http://spark.apachecn.org/) | [Zeppelin 0.7.2 Chinese documentation](http://cwiki.apachecn.org/pages/viewpage.action?pageId=10030467) |
- | [Pytorch 0.3 Chinese documentation](http://pytorch.apachecn.org/cn/0.3.0/) | [Sklearn 0.19 Chinese documentation](http://sklearn.apachecn.org/) | [Storm 1.1.0 and 1.0.1 Chinese documentation](http://storm.apachecn.org/) | [Kibana 5.2 Chinese documentation](http://cwiki.apachecn.org/pages/viewpage.action?pageId=8159377) |
- | | [LightGBM Chinese documentation](http://lightgbm.apachecn.org/cn/latest) | [Kudu 1.4.0 Chinese documentation](http://cwiki.apachecn.org/pages/viewpage.action?pageId=10813594) | |
- | | [XGBoost Chinese documentation](http://xgboost.apachecn.org/cn/latest) | [Elasticsearch 5.4 Chinese documentation](http://cwiki.apachecn.org/pages/viewpage.action?pageId=4260364) |
- | | | [Beam Chinese documentation](http://beam.apachecn.org/) |

File diff suppressed because one or more lines are too long


@@ -20,7 +20,7 @@ from torch.utils.data import Dataset, DataLoader
  import os.path
  # Data path
- data_dir = '/media/wsw/B634091A3408DF6D/data/kaggle/datasets/getting-started/digit-recognizer/'
+ data_dir = '/opt/data/kaggle/getting-started/digit-recognizer/'
  class CustomedDataSet(Dataset):
      def __init__(self, train=True):
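
The hunk above replaces one machine-specific data path with another hard-coded one. One possible alternative, shown here only as a sketch, is to let each contributor override the path through the environment. The variable name `KAGGLE_DATA_DIR` and the helper `resolve_data_dir` are hypothetical and do not appear in the original script; the default is the path introduced by this commit.

```python
import os
import os.path

# Default taken from the new side of the diff.
DEFAULT_DATA_DIR = '/opt/data/kaggle/getting-started/digit-recognizer/'

def resolve_data_dir(env=None):
    """Return the data directory, preferring an environment override.

    KAGGLE_DATA_DIR is a hypothetical variable name chosen for this sketch.
    Joining with '' ensures the result ends with a path separator,
    matching the style of the hard-coded value.
    """
    env = os.environ if env is None else env
    path = env.get('KAGGLE_DATA_DIR', DEFAULT_DATA_DIR)
    return os.path.join(path, '')

data_dir = resolve_data_dir()
```

With this approach the script would run unchanged on machines where the data lives elsewhere, for example by launching it with `KAGGLE_DATA_DIR` set to the local path.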