Insight Data Engineering - Coding Challenge

Challenge Summary

This challenge is to implement two features:

Clean and extract the text from the raw JSON tweets that come from the Twitter Streaming API, and track the number of tweets that contain Unicode characters.
Calculate the average degree of a vertex in a Twitter hashtag graph for the last 60 seconds, and update this each time a new tweet appears.

Here, we have to define a few concepts (though there will be examples below to clarify):

A tweet's text is considered "clean" once all of the escape characters (e.g. \n, \", \/) have been replaced and all Unicode characters have been removed.
A Twitter hashtag graph is a graph connecting all the hashtags that have been mentioned together in a single tweet.
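
The two definitions above can be made concrete with a minimal sketch. It is not the code in src; every name in it (clean_text, average_degree, HashtagWindow) is hypothetical. It rests on several assumptions: "unicode" means any character outside the ASCII range, whitespace escapes are replaced by a single space, hashtags are compared case-insensitively, tweets arrive in timestamp order with timestamps already converted to seconds, and a tweet leaves the window once it is more than 60 seconds older than the newest tweet.

```python
import json
from collections import deque
from itertools import combinations


def clean_text(text):
    """Return (cleaned_text, had_unicode) for a tweet text decoded by json.loads.

    json.loads already turns JSON escapes such as \\" and \\/ into plain
    characters, so only whitespace escapes and non-ASCII characters remain.
    """
    # Assumption: "unicode" means any code point above the ASCII range.
    had_unicode = any(ord(ch) > 127 for ch in text)
    ascii_only = "".join(ch for ch in text if ord(ch) <= 127)
    # Assumption: whitespace escapes are replaced by a single space.
    for ch in ("\n", "\t", "\r"):
        ascii_only = ascii_only.replace(ch, " ")
    return ascii_only, had_unicode


def average_degree(hashtag_sets):
    """Average vertex degree of the graph linking hashtags that share a tweet."""
    neighbors = {}
    for tags in hashtag_sets:
        # Every pair of distinct hashtags in one tweet becomes an edge.
        for a, b in combinations(sorted(tags), 2):
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    if not neighbors:
        return 0.0
    return sum(len(n) for n in neighbors.values()) / float(len(neighbors))


class HashtagWindow(object):
    """Keeps the tweets of the last window_seconds and rebuilds the graph."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.tweets = deque()  # (timestamp, frozenset of hashtags), oldest first

    def add(self, timestamp, hashtags):
        # Assumption: hashtags are case-insensitive.
        tags = frozenset(tag.lower() for tag in hashtags)
        self.tweets.append((timestamp, tags))
        # Evict tweets that fell out of the window relative to the newest tweet.
        while timestamp - self.tweets[0][0] > self.window_seconds:
            self.tweets.popleft()
        return average_degree(t for _, t in self.tweets)


if __name__ == "__main__":
    tweet = json.loads('{"text": "Caf\\u00e9 \\/ tea\\n#a"}')
    print(clean_text(tweet["text"]))

    window = HashtagWindow()
    print(window.add(0, ["Spark", "Apache"]))
    print(window.add(10, ["Apache", "Hadoop", "Storm"]))
    print(window.add(71, ["Flink", "Spark"]))
```

Rebuilding the graph on every tweet keeps the sketch short. An incremental version that adds and removes edges as tweets enter and leave the window would avoid the repeated work on a busy stream.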

I used Python 2.7.10 to complete the challenge.
