Spark全栈数据分析 (Full-Stack Data Analysis with Spark) PDF Download
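Download compiled by this site:
Copyright belongs to the publisher and the original author. The link has been removed; please buy the official edition.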
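Download notes for users:
The electronic version is for preview only and must be deleted within 24 hours of downloading. Support the official edition; if you like the book, please buy a legitimate copy: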
http://product.dangdang.com/26183154.html
About the book: This book presents the agile data science methodology proposed by the author. Drawing on the author's years of hands-on industry experience, it offers data science teams a set of practices for conducting data science research in a manner similar to agile development. The whole book performs full-stack data analysis on Spark and demonstrates common industry tools covering every stage from front-end display to back-end processing. It walks data scientists step by step through quickly turning theory into real user-facing applications, so that readers can create real value from data while continually refining their own research. The book is suitable for beginners; data scientists, engineers, and analysts can all benefit from it.

Table of contents:
Preface
Part I: Preparation
Chapter 1: Theory
  Introduction
  Definition
    Methodology
    The Agile Data Science Manifesto
  The Problem with the Waterfall Model
    Research versus Application Development
  The Problem with Agile Software Development
    Eventual Quality: Paying Down Technical Debt
    The Pull of the Waterfall Model
  The Data Science Process
    Setting Expectations
    Data Science Team Roles
    Recognizing Opportunities and Challenges
    Adapting to Change
  Notes on Process
    Code Review and Pair Programming
    Agile Environments: Improving Productivity
    Realizing Ideas with Large-Format Printing
Chapter 2: Agile Tools
  Scalability = Ease of Use
  Data Processing in Agile Data Science
  Setting Up a Local Environment
    System Requirements
    Configuring Vagrant
    Downloading the Data
  Setting Up an EC2 Environment
    Downloading the Data
  Downloading and Running the Code
    Downloading the Code
    Running the Code
    Jupyter Notebooks
  Toolset Overview
    Agile Stack Requirements
    Python 3
    Serializing Events with JSON Lines and Parquet
    Collecting Data
    Processing Data with Spark
    Publishing Data with MongoDB
    Searching Data with Elasticsearch
    Distributing Streams with Apache Kafka
    Processing Streams with PySpark Streaming
    Machine Learning with scikit-learn and Spark MLlib
    Scheduling with Apache Airflow (Incubating)
    Reflecting on Our Workflow
    Lightweight Web Applications
    Presenting Data
  Chapter Summary
Chapter 3: Data
  Flight Data
    Flight On-Time Performance Data
    OpenFlights Database
  Weather Data
  Data Processing in Agile Data Science
    Structured vs. Semi-Structured Data
  SQL vs. NoSQL
    SQL
    NoSQL and Dataflow Programming
    Spark: SQL + NoSQL
    Schemas in NoSQL
    Data Serialization
    Extracting and Presenting Features in Evolving Schemas
  Chapter Summary
Part II: Climbing the Pyramid
Chapter 4: Collecting and Displaying Records
  Putting It All Together
  Collecting and Serializing Flight Data
  Processing and Publishing Flight Records
    Publishing Flight Records to MongoDB