Case Study: Visualization of Taxi rides in Chicago using RStudio and ShinyApp

Project Name

Project Description

How to use the dashboard

Data visualizations:

Versions

Downloads

Install libraries

install.packages("shinydashboard")
install.packages("shinyWidgets")
install.packages("data.table")
install.packages("lubridate")
install.packages("ggplot2")
install.packages("leaflet")
install.packages("rgdal")
install.packages("shiny")
install.packages("DT")

How to process data

Main overview of how we processed the data:

Node.js program, main algorithm code:

const fs = require("fs");
const nReadlines = require("n-readlines");
const broadbandLines = new nReadlines("data.csv");

// line by line
while ((line = broadbandLines.next())) {
  // parse to ASCII
  const lineStr = line.toString("ascii");
  // if first line -> get columns array
  lineNumber === 1 &&
    columns.push(...lineStr.split(",")) &&
    setNeededColumns(lineStr.split(",")) &&
    wstream.write(arrayToCsvLine(neededColumns));
  // get values arr
  const values = lineStr.split(",");
  // get data object
  const data = arrsToObj(columns, values);
  // check if data is correct
  if (correctData(data)) {
    // get needed part from curr data
    const dataArr = getNeededDataArr(data);
    // process each row that we read
    Manipulator.process(getNeededDataObj(data));
    // increment the number of elements
    trueElsNum++;
    // save to csv
    wstream.write(arrayToCsvLine(dataArr));
    // show mem
    if (trueElsNum % 250000 === 0) showMemory();
  }
  lineNumber++;
}

Manipulator.exportAllTables();

In short, it consists of the following main steps:

  • Reading the file line by line,
  • Parsing each line into a data object,
  • Filtering & processing that object,
  • Saving the filtered object into a new CSV data file,
  • Exporting all generated table data into CSV files.
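
As a rough illustration of the parsing and saving steps, here is a minimal sketch of how helpers like arrsToObj and arrayToCsvLine might look. Their real bodies are not shown in this article, so the versions below are assumptions. They also assume plain comma-separated values with no quoted fields, embedded commas, or line breaks inside a value.

```javascript
// Hypothetical implementations; the project's real helpers may differ.

// Pair each column name with the value at the same position.
function arrsToObj(columns, values) {
  const obj = {};
  columns.forEach((name, i) => {
    obj[name] = values[i];
  });
  return obj;
}

// Join values into one CSV line terminated by a newline.
// Assumes no value contains a comma, a quote, or a line break.
function arrayToCsvLine(values) {
  return values.join(",") + "\n";
}

// Usage: the header line gives the column names, later lines give values.
const exampleColumns = "Trip Seconds,Trip Miles".split(",");
const exampleRow = arrsToObj(exampleColumns, "600,2.5".split(","));
const exampleLine = arrayToCsvLine(["600", "2.5"]); // "600,2.5\n"
```

Splitting on every comma is the simplest approach and matches the main loop above, but it would mis-parse a quoted field that itself contains a comma, so a real CSV parser is the safer choice for messier data.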

Let’s delve a bit deeper into these steps:

  1. Reading a file line by line:
    We read the whole 7 GB data file line by line using the small “n-readlines” package, which you can install by running “npm i n-readlines”. In the loop
    while ((line = broadbandLines.next())), each call to next() returns the next line as a buffer, or false once the end of the file is reached, which ends the loop. We then convert that buffer into a regular string with line.toString("ascii").
  2. Parsing that line into a data object:
    Once we have the string representation of the buffer, we parse that line into a data object so that we can manipulate it inside Node.js. Two functions take care of that: getNeededDataArr and getNeededDataObj.
    The first function returns the data values as an array, whereas the second returns an object with the following structure: {date, duration, miles, from_area, to_area, company, year, month, day, hour, minute}.
  3. Filtering & processing that object:
    Filtering is done by checking the data object against several conditions in an if statement, like so:
    if (
      parseFloat(row["Trip Miles"]) >= 0.5 &&
      parseFloat(row["Trip Miles"]) <= 100 &&
      parseInt(row["Trip Seconds"]) <= 18000 &&
      parseInt(row["Trip Seconds"]) >= 60 &&
      row["Pickup Community Area"] != false &&
      row["Dropoff Community Area"] != false
    ) {
      return true;
    } else {
      return false;
    }
    Rows that pass the filter are then processed:
    Manipulator.process(getNeededDataObj(data));
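
To make the filter easier to try out on its own, here is a minimal standalone sketch of the same thresholds (0.5 to 100 miles, 60 to 18000 seconds, both community areas present). The function name isValidTrip and the sample rows are hypothetical. The check for missing community areas uses Boolean(...), which is an assumption and slightly stricter than the original != false comparison, since it also rejects undefined.

```javascript
// Hypothetical standalone version of the row filter described above.
function isValidTrip(row) {
  const miles = parseFloat(row["Trip Miles"]);
  const seconds = parseInt(row["Trip Seconds"], 10);
  return (
    miles >= 0.5 &&
    miles <= 100 &&
    seconds >= 60 &&
    seconds <= 18000 &&
    // Assumption: an empty or absent community area means "missing".
    Boolean(row["Pickup Community Area"]) &&
    Boolean(row["Dropoff Community Area"])
  );
}

// Made-up sample rows, for illustration only.
const goodRow = {
  "Trip Miles": "2.5",
  "Trip Seconds": "600",
  "Pickup Community Area": "8",
  "Dropoff Community Area": "32",
};
const tooShort = { ...goodRow, "Trip Miles": "0.1" };
const noPickup = { ...goodRow, "Pickup Community Area": "" };

console.log(isValidTrip(goodRow));  // true
console.log(isValidTrip(tooShort)); // false
console.log(isValidTrip(noPickup)); // false
```

A useful side effect is that a non-numeric value, such as the header row, parses to NaN, and every comparison with NaN is false, so such rows are rejected without an extra check.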

How to install & use this Node.js parser program?

How to convert miles to kilometers:

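  • Open the CSV file ‘milageBinsTotalRides_table.csv’ in Excel.
  • In a new column, use the function =CONVERT(A2,"mi","km"). The second argument is the unit to convert from and the third is the unit to convert to.
  • Replace A2 with the reference of the first cell that holds miles data.
  • Drag the converted cell down and all the miles values are converted to kilometers.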
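
The same conversion could also be done in code instead of Excel. The sketch below is only an illustration: the function name milesToKm and the sample values are hypothetical, and it relies on the fixed definition of 1 mile as 1.609344 kilometers.

```javascript
// 1 international mile is defined as exactly 1.609344 km.
const KM_PER_MILE = 1.609344;

// Hypothetical helper: convert a distance in miles to kilometers.
function milesToKm(miles) {
  return miles * KM_PER_MILE;
}

// Made-up sample bin edges in miles, for illustration only.
const sampleMiles = [0.5, 1, 10];
const sampleKm = sampleMiles.map((m) => milesToKm(m).toFixed(2));
console.log(sampleKm); // [ '0.80', '1.61', '16.09' ]
```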

How to convert 24-hour time to 12-hour:

  • Open the CSV file ‘eachHourTotalRides_table.csv’ in Excel.
  • In a new column, on the same row as “00”, write 12.
  • In the same column, under that 12, write 1 and fill down until the next 12, then continue with 1 through 11 to the end.
  • Check that the rows line up, so that 00 and 12, 13 and 1, and 23 and 11 are in the same rows.
  • In another new column to the right, write “AM” on the row for 00 and drag down through the row for 11. On the row for 12 (noon), write “PM” and drag the cell down to the end.
  • Check that the data is mapped correctly.
  • Name the new columns ‘hour_tw’ and ‘ampm’.
  • Save the modified file as a new file or replace the old one.
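
For readers who prefer to skip the manual Excel steps, the mapping can be sketched in a few lines of code. This is an illustration only: the function name to12Hour is hypothetical, while the output field names hour_tw and ampm follow the column names used above.

```javascript
// Hypothetical helper: map a 24-hour value ("00".."23" or 0..23)
// to the 12-hour clock, using the column names from the steps above.
function to12Hour(hour24) {
  const h = parseInt(hour24, 10);
  const remainder = h % 12;
  return {
    hour_tw: remainder === 0 ? 12 : remainder, // 0 and 12 both show as 12
    ampm: h < 12 ? "AM" : "PM",
  };
}

console.log(to12Hour("00")); // { hour_tw: 12, ampm: 'AM' }
console.log(to12Hour(13));   // { hour_tw: 1, ampm: 'PM' }
console.log(to12Hour(23));   // { hour_tw: 11, ampm: 'PM' }
```

The three calls reproduce the pairs used for checking in the list: 00 with 12, 13 with 1, and 23 with 11.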

Interesting Facts:

Malika

Computer scientist with a passion for solving problems and creating user-friendly experiences. 👩🏻‍💻