Organize, analyze, and interpret any large, complex dataset, and apply your insights to real-world problems and questions.
In this Specialization, you will develop a robust set of skills that will allow you to process, analyze, and extract meaningful information from large amounts of complex data. You will install and configure Hadoop with MapReduce, use Spark, Pig and Hive, perform predictive modelling with open source tools, and leverage graph analytics to model problems and perform scalable analytical tasks. In the final Capstone Project, developed in partnership with data software company Splunk, you’ll apply the skills you learned by building your own tools and models to analyze big data in the context of retail, sports, current events, or another area of your choice.
Course 1
Next session: September 28 to October 26
Commitment: 3 weeks of study, 5-6 hours/week
What’s the “hype” surrounding the Big Data phenomenon? Who are these mysterious data scientists everyone is talking about? What kinds of problem-solving skills and knowledge should they have? What kinds of problems can be solved by Big Data technology?
After this short introductory course you will have answers to all these questions. Additionally, you will start to become proficient with the key technical terms and big data tools and applications to prepare you for a deep dive into the rest of the courses in the Big Data specialization.
Each day, our society creates 2.5 quintillion bytes of data (that’s 25 followed by 17 zeros). As this flood of data grows, the need to unlock actionable value becomes more acute, rapidly increasing demand for Big Data skills and qualified data scientists.
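To put the daily figure quoted above into more familiar units, here is a minimal arithmetic sketch. The only input taken from the text is the 2.5 quintillion bytes per day; the function names and the use of decimal exabytes (1 EB = 10^18 bytes) are illustrative choices, not part of the course material.

```python
# Illustrative arithmetic only: the daily volume comes from the course text,
# everything else (names, unit choices) is an assumption for this sketch.

SECONDS_PER_DAY = 24 * 60 * 60

# 2.5 quintillion bytes, stored as an exact integer: 25 followed by 17 zeros.
BYTES_PER_DAY = 25 * 10**17


def to_exabytes(n_bytes: int) -> float:
    """Convert a byte count to decimal exabytes (1 EB = 10**18 bytes)."""
    return n_bytes / 10**18


def bytes_per_second(n_bytes_per_day: int) -> float:
    """Average rate implied by a daily byte count."""
    return n_bytes_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{BYTES_PER_DAY:,} bytes per day")
    print(f"{to_exabytes(BYTES_PER_DAY)} exabytes per day")
    print(f"{bytes_per_second(BYTES_PER_DAY):.3e} bytes per second on average")
```

In other words, the quoted figure works out to 2.5 exabytes per day.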
Course 2
Starts October 2015
Are you looking for hands-on experience analyzing big data? After completing this course, you will be able to install, configure and implement an Apache Hadoop stack ranging from basic “Big Data” components to MapReduce and Spark execution frameworks. Because the hands-on exercises in this course require more computing power than a personal computer provides, you will use distributed/parallel processing systems to better manipulate and understand the significance of your own data.
Course 3
Starts November 2015
Do you have specific business questions you want answered? Need to learn how to interpret results through analytics? This course will help you answer these questions by introducing you to HBase, Pig and Hive.
In this course, you will take a real Twitter data set, clean it, bring it into an analytics engine, and create summary charts and drill-down dashboards. After completing this course, you will be able to work with BigTable, distributed data stores, columnar data, NoSQL, and more!
Course 4
Starts December 2015
Want to learn the basics of large-scale data processing? Need to make predictive models but don’t know the right tools? This course will introduce you to open source tools you can use for parallel, distributed and scalable machine learning.
After completing this course’s hands-on projects with MapReduce, KNIME and Spark, you will be able to train, evaluate, and validate basic predictive models. By the end of this course, you will be building a Big Data platform and utilizing several different tools and techniques.
Course 5
Starts January 2016
Want to understand your data network structure and how it changes under different conditions? Curious to know how to identify closely interacting clusters within a graph? Have you heard of the fast-growing area of graph analytics and want to learn more? This course gives you a broad overview of the field of graph analytics so you can learn new ways to model, store, retrieve and analyze graph-structured data.
After completing this course, you will be able to model a problem into a graph database and perform analytical tasks over the graph in a scalable manner. Better yet, you will be able to apply these techniques to understand the significance of your data sets for your own projects.
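As a small taste of what "identifying clusters within a graph" can mean, here is a minimal sketch that groups nodes into connected components using a breadth-first search. This is an assumption made for illustration: connected components are the simplest stand-in for "closely interacting clusters", and the course description does not say which algorithms or tools it uses. The edge list and names are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group the nodes of an undirected graph into connected components.

    `edges` is an iterable of (u, v) pairs. Returns a list of sets, one per
    component. This is a simple stand-in for cluster detection, not a full
    community-detection algorithm.
    """
    # Build an adjacency list; each edge is stored in both directions.
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Hypothetical interaction data: two separate groups of users.
edges = [("ann", "bob"), ("bob", "cat"), ("dan", "eve")]
clusters = connected_components(edges)
```

On a single machine this approach is fine for small graphs; the scalable, distributed versions of such tasks are what the course itself covers.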
Big Data – Capstone Project
Starts February 2016
Welcome to the Capstone Project for Big Data! In collaboration with Splunk, a software company focused on analyzing machine-generated big data, you will build a Big Data ecosystem using tools and methods from the earlier courses in this specialization.
First, you can choose an application: retail, sports, current events, etc. Then, you will enrich the datasets / data models we’ve already used in this specialization with external data sets of your choice. After bringing in data from at least three distinct sources, you will build searches and/or dashboards that address the Capstone Project questions.
In this project, you will use a post-process search to aggregate or otherwise transform the data and extract meaningful insights. By utilizing visualization and communication techniques for Big Data, you will be able to conduct basic storytelling and model interpretation.
Excitingly, if you have a top project (whether Hadoop or Splunk), you will be eligible to present to Splunk and meet Splunk recruiters and engineering leadership!
Course 6: Big Data – Capstone Project ($59)