Our goal is to build a recipe crawler and search system that discovers interesting facts from recipes and provides optimized search results. The system consists of two major components: a web crawler and a search engine.
Drawing on our experience in web development and data science, and on the description above, we treat February 2016 as the first stage, with the primary goal of prototyping the application according to the development guidelines below. Here is the tentative timeline.
- [2016/02/08 - 2016/02/12] Project Selection, Plan Discussion, Proposal Draft Writing, Resource Discovery
- [2016/02/13 - 2016/03/07] System Design, Project Implementation
  - Web Crawler
  - Search
  - Exploratory Analyzer / Recommender
- [2016/03/08 - 2016/03/15] Document Writing, User Manual Writing and Video Presentation Making
- Modularity. Following the principle of "loose coupling and high cohesion", each module should be standalone.
- Minimalism. Each module should be kept short, simple, and concise. Every piece of code should be transparent upon first reading.
- Easy extensibility. New modules (as new classes and functions) should be simple to add, and existing modules should be easy to extend.
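
The three guidelines above can be illustrated with a small sketch. It is not part of the project code: `createPipeline`, the stage names, and the recipe fields (`title`, `ingredients`) are all hypothetical. Each stage is a standalone function, and adding a new one does not require touching the existing ones.

```javascript
// Hypothetical module registry: each stage is a standalone function
// that takes a recipe object and returns a new one.
function createPipeline() {
  const stages = [];
  return {
    // Register a new stage without modifying existing ones.
    use(name, fn) {
      stages.push({ name, fn });
      return this;
    },
    // Run every stage in registration order.
    run(input) {
      return stages.reduce((value, stage) => stage.fn(value), input);
    },
    // List registered stage names, useful for debugging.
    names() {
      return stages.map((stage) => stage.name);
    },
  };
}

// Two illustrative stages; the names and fields are assumptions.
const normalizeTitle = (recipe) => ({
  ...recipe,
  title: recipe.title.trim().toLowerCase(),
});
const countIngredients = (recipe) => ({
  ...recipe,
  ingredientCount: recipe.ingredients.length,
});

const pipeline = createPipeline()
  .use("normalizeTitle", normalizeTitle)
  .use("countIngredients", countIngredients);

const result = pipeline.run({
  title: "  Mapo Tofu ",
  ingredients: ["tofu", "pork", "chili"],
});
console.log(result);
```

Extending the system under this sketch means writing one more function and one more `use` call, which is the kind of extensibility the guidelines aim for.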
- JavaScript: Node.js, Express.js, AngularJS
- Database: MongoDB
- Cloud Platform: Cloud Foundry
BitTiger Project: AppStore - Website
Web Crawler
MEAN Stack
MEAN is an acronym for MongoDB, Express.js, Angular.js, and Node.js.
A very good online course about MEAN stack on edX:
MongoDB: MongoDB is an open-source, document database (NoSQL) designed for ease of development and scaling.
Express.js: Fast, unopinionated, minimalist web framework for Node.js.
Angular.js: Angular is a development platform for building mobile and desktop web applications.
Node.js: Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine.
Hybrid Mobile App
Ionic: Ionic is an advanced HTML5 hybrid mobile app framework that makes it easy to build beautiful and interactive mobile apps using HTML5 and AngularJS.
Search
ElasticSearch: Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.
MongoDB & ElasticSearch For Full Text Search In Chinese
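
Because Elasticsearch accepts JSON over HTTP, a recipe search request can be sketched as a plain JavaScript object. This is only an illustration under assumptions: the helper `buildRecipeQuery` and the field names `title` and `ingredients` are hypothetical, and the project's real mapping may differ. The `multi_match` query and the `^2` field boost are part of Elasticsearch's Query DSL.

```javascript
// Build a request body for Elasticsearch's _search endpoint.
// The field names (title, ingredients) are hypothetical.
function buildRecipeQuery(text, { size = 10 } = {}) {
  return {
    size, // maximum number of hits to return
    query: {
      multi_match: {
        query: text,
        // "^2" boosts matches in the title over matches in ingredients.
        fields: ["title^2", "ingredients"],
      },
    },
  };
}

const body = buildRecipeQuery("spicy tofu", { size: 5 });
console.log(JSON.stringify(body, null, 2));
```

The resulting object would be sent as the JSON body of a search request against the recipe index (the index name is left open here); keeping the builder as a pure function makes it easy to test without a running cluster.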
@Pikachu