Practice with SQL window functions and parameterized queries from R using Olympics data.
An introduction to distributed SQL query engines like Hive and Impala.
Determine functional dependencies to structure relations in third normal form.
Use regular expressions and text analysis functions on India's Independence Day speeches.
Create a normalized SQL database from a CSV file of India's Independence Day speeches.
An R package including a dataset of full-text English renderings of Indian Independence Day speeches, delivered annually on 15 August since 1947.
An R package to query the Article Search API of The New York Times for articles with an “India” location keyword. It also includes functions to prepare the data for analysis, as well as a shiny app to visualize the output dataset.
Use R to structure and query database tables of Indian census data
Explore how access to electricity in India varies with latrine access at state and district levels through a scatterplot, a dumbbell plot, and a bivariate bubble map.
Compare mapping styles like a choropleth, dot density map, proportional symbols map, and 3D choropleth using Indian electricity and latrine access data
Tutorial for creating animated maps using packages like {tmap} and {gganimate} and interactive maps using packages like {ggiraph}, {mapview}, {leaflet} and {plotly}
Tutorial for creating static choropleths and cartograms using packages like {tmap}, {ggplot2}, {cartogram}, {geogrid} and {geofacet}
Learn different topological relations to spatially subset data via the {sf} package
Explore electricity, latrine and water access data from the Indian Census
Explore median household income data in the Delaware Valley at various levels of scope
Visualize India's states through a range of geospatial representations
An introduction to Python’s Pygal plotting library
Recreating a D3 visualization of Walmart’s US growth in R with {gganimate}
Tidy text analysis of the Indian PM’s radio addresses in a shiny app
Scraping Saravana Bhavan’s website to map restaurant locations in time and space
Investigation and visualization of 2016 Presidential election campaign contributions in PA
Next word prediction app in support of JHU Data Science Capstone on Coursera
Exploring the effectiveness of different ML models to classify motion data into exercise categories
Using linear regression models to quantify the difference in fuel efficiency between automatic and manual transmission cars
My attempt at the classic Kaggle competition to predict survival on the Titanic
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".