Net crawler

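A web crawler built with express that scrapes every course taught by a given teacher ID on imooc (慕课网). The async version was written last year; this year I rewrote the code with Promise.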

Features

details picture

NodeJS
express - Fast, unopinionated, minimalist web framework
jade - Jade is a terse language for writing HTML templates.
request - Simplified HTTP request client.
cheerio - Fast, flexible & lean implementation of core jQuery designed specifically for the server.
Promise
AsyncJS - Async utilities for node and the browser

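After a value is typed into the input and the confirm button is clicked, the front end sends a request to a specific back-end route.
The back end handles the request and calls imoocCrawler.startCrawler().
I implemented startCrawler() in two ways, one with Promise and one with Async. Both do the same thing and both work; they are kept side by side for comparison and learning. To switch between them, change which call is commented out at the call site.
The given imooc teacher ID is concatenated into a URL. The Promise chain first calls getCoursesList(), which uses request to fetch the HTML body and cheerio.load(htmlBody) to parse the DOM; then getCoursesLinks() collects every course name and the teacher name into arr; then getCourseChapters() fetches the video list under each course and returns arr.
In the callback of imoocCrawler.startCrawler(), the length of arr is checked: if it is non-empty, a JSON string data is returned; otherwise a msg saying there is no data yet is returned.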
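
The flow described in these steps can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the URL pattern in buildTeacherUrl, the deps object (fetchHtml, parseCourses, parseChapters), the buildResponse helper, and the fake pages are all assumptions made up for the example. Fetching and parsing are stubbed so the sketch runs without network access; in the real project those roles are played by request and cheerio.

```javascript
// Hypothetical URL pattern; the real imooc path may differ.
const buildTeacherUrl = (teacherId) =>
  `https://www.imooc.com/u/${encodeURIComponent(teacherId)}/courses`;

// Promise-chain version of the crawl. `deps` is an assumed object that
// supplies the fetching and parsing steps, so they can be stubbed here.
function startCrawler(teacherId, deps) {
  const { fetchHtml, parseCourses, parseChapters } = deps;
  return fetchHtml(buildTeacherUrl(teacherId)) // role of getCoursesList()
    .then((html) => parseCourses(html)) // role of getCoursesLinks()
    .then((courses) =>
      // role of getCourseChapters(): one request per course, run in parallel
      Promise.all(
        courses.map((course) =>
          fetchHtml(course.url).then((courseHtml) => ({
            ...course,
            chapters: parseChapters(courseHtml),
          }))
        )
      )
    );
}

// Shape of the reply built from the crawl result:
// a JSON string when there is data, a message otherwise.
function buildResponse(arr) {
  return Array.isArray(arr) && arr.length > 0
    ? { data: JSON.stringify(arr) }
    : { msg: 'No data yet' };
}

// Fake pages and parsers standing in for request + cheerio.
const fakePages = {
  [buildTeacherUrl('123')]: 'list',
  '/learn/1': 'course-1',
};
const deps = {
  fetchHtml: (url) => Promise.resolve(fakePages[url] || ''),
  parseCourses: (html) =>
    html === 'list'
      ? [{ name: 'Demo course', teacher: 'Demo teacher', url: '/learn/1' }]
      : [],
  parseChapters: (html) => (html === 'course-1' ? ['1-1 Intro'] : []),
};

startCrawler('123', deps).then((arr) => console.log(buildResponse(arr)));
```

Passing the fetch and parse functions in as arguments is only a convenience for this sketch; it keeps the chain itself readable and lets it be tried out without hitting the real site.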

  npm install

  npm run start

If you run into any problems, please open an issue.

Thanks for watching.