(Almost) everything is possible with R: tiempoR: A Shiny Dashboard for the Dark Sky API

With nearly everything closed due to the pandemic, outdoor activities are a great way to get out of the house, but they often depend on good weather. So how do we know whether it’s going to be sunny or stormy when we play European football or ultimate frisbee? The best option was DarkSky until March 31st, when it was acquired by Apple. Apple has already shut down API key registrations and has announced that it will shut down the website and the Android apps on July 1st. That means that unless you have an iPhone and pay $3.99 for the app, you won’t be able to access DarkSky’s really good rain forecasts. The good news is that with the help of R, Shiny, shinydashboard, the tidyverse, and gt, you can create a dashboard that presents the same information as darksky.net and that will not be shut down by Apple anytime soon. This blog post will walk you through the process of coding it in R.

The data

The data is obtained through the darksky package for R, which is a wrapper for the DarkSky API. The darksky package provides two functions, get_current_forecast and get_forecast_for. Both take latitude and longitude arguments and return the current forecast or the forecast for a given date, respectively. Our dashboard displays the current forecast, so get_current_forecast is all we need.

get_current_forecast(latitude, longitude) translates into the simplest DarkSky API call: https://api.darksky.net/forecast/key/lat,long. Both lat and long are parameters of the get_current_forecast function, but what about the key? This is one of the first problems we encounter in creating the Shiny app, because the darksky package handles keys in a way that is inconvenient for Shiny apps: it reads the key from the environment. We’ll see a solution in the Dashboard section.
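To make the shape of that call concrete, here is a minimal sketch that assembles the URL by hand. The helper build_forecast_url is hypothetical (it is not part of the darksky package), the key is a placeholder, and the coordinates are the approximate White House ones.

```r
# Hypothetical helper, not part of the darksky package: it only assembles the
# request URL in the form https://api.darksky.net/forecast/key/lat,long
build_forecast_url <- function(key, latitude, longitude) {
  sprintf(
    "https://api.darksky.net/forecast/%s/%s,%s",
    key, format(latitude), format(longitude)
  )
}

# "YOUR_KEY" is a placeholder; a real call needs a valid DarkSky API key
forecast_url <- build_forecast_url("YOUR_KEY", 38.8977, -77.0365)
print(forecast_url)
```

The wrapper builds and sends this request for us, so in the app we never construct the URL ourselves; the sketch is only meant to show where the key, latitude, and longitude end up.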

get_current_forecast returns a list of four dataframes that contain current, minutely, hourly, and daily weather information, each with a set of variables suited to its time resolution. For example, we can find max and min temperature in the daily dataframe but not in the current dataframe, and the minutely dataframe contains only rain probability and intensity, with no other weather variables. For that reason, we split the display of these four dataframes into two tabs: one shows the more immediate forecast, using the current, minutely, and hourly dataframes, and the other shows the longer-term forecast, using only the daily dataframe. It is also worth noting that each dataframe has a different range and resolution: current contains only the current weather, to the second; minutely contains 60 minutes of rain prediction; hourly contains 50 hours of weather data; and daily contains 8 days of weather prediction.

Graphs

One component of the DarkSky website that we want to keep accessible even after the website is shut down is the weather graphs. For our app, we wanted plots similar to the ones displayed on darksky.net, which are scatter plots with a line connecting the points. For this, the ggplot2 package in R comes in handy, because it generates many different types of graphs and offers options to customize them so that they look similar to DarkSky’s graphs. Under the Current weather tab of our app, we’ve included eleven graphs plotting precipitation conditions, humidity, dew point, wind speed, pressure, UV index, and visibility. To create a plot displaying the time in hours on the x-axis and the probability of rain on the y-axis, we call the ggplot function and pass in aes(x = time, y = precipProbability). Since we want a scatter plot with a line connecting the points, we add two geometries (geoms): geom_point followed by geom_line.
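As a rough sketch of that plot, the snippet below uses a mock hourly dataframe in place of the one returned by get_current_forecast. The column names time and precipProbability follow the text; the values are made up for illustration.

```r
library(ggplot2)

# Mock stand-in for the hourly dataframe; the values are invented and only
# the column names (time, precipProbability) follow the text above
set.seed(1)
hourly <- data.frame(
  time = as.POSIXct("2020-05-01 00:00:00", tz = "UTC") + 3600 * 0:23,
  precipProbability = round(runif(24), 2)
)

# Scatter plot with a line connecting the points
rain_plot <- ggplot(hourly, aes(x = time, y = precipProbability)) +
  geom_point() +
  geom_line() +
  labs(x = "Time", y = "Probability of rain")

# print(rain_plot) draws the plot in an interactive session
```

Swapping precipProbability for another column of the dataframe gives the other graphs of the tab, which is why a single pattern covers all eleven.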

Tables

Tables provide an additional way of displaying data in a readable format. For our tables, we used the gt (great looking tables) package, which was released last April by the RStudio team and serves as the “ggplot” for tables. Our app contains three tables made using gt: a summary of the current weather, an hourly forecast, and a seven-day forecast. The dataframes passed into the gt function to create these three tables were the current, hourly, and daily dataframes. The package is particularly helpful for formatting the columns of a table. In our case, we used cols_label to rename our columns, cols_move_to_start to rearrange them, and tab_header to add the titles and subtitles. Since the time columns from the wrapper are in a date-time format with year, month, day, hour, minute, and second, we first have to wrangle them into the ISO date format (YYYY-MM-DD) before passing the dataframe to the gt function. We did this with str_split and map_chr. Another feature of gt is the tab_source_note function, which gave us a convenient way to display the data credits for DarkSky and Apple.
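The table steps described above can be sketched as follows. The daily dataframe here is a mock with invented values, the column names temperatureHigh and temperatureLow are assumptions about what the API returns, and the header and source note wording is illustrative rather than the text used in the app.

```r
library(gt)
library(stringr)
library(purrr)

# Mock stand-in for the daily dataframe; values and some column names are
# assumptions made for this example
daily <- data.frame(
  time = c("2020-05-01 00:00:00", "2020-05-02 00:00:00"),
  summary = c("Clear throughout the day.", "Light rain in the morning."),
  temperatureHigh = c(71.3, 64.8),
  temperatureLow = c(52.1, 49.5),
  stringsAsFactors = FALSE
)

# Split each date-time on the space and keep the first piece (YYYY-MM-DD)
daily$time <- map_chr(str_split(as.character(daily$time), " "), 1)

daily_table <- gt(daily) %>%
  cols_label(
    time = "Date",
    summary = "Summary",
    temperatureHigh = "High",
    temperatureLow = "Low"
  ) %>%
  cols_move_to_start(columns = "summary") %>%
  tab_header(title = "Daily forecast", subtitle = "Mock data") %>%
  tab_source_note(source_note = "Powered by Dark Sky")
```

Doing the date wrangling before calling gt keeps the table code purely about presentation.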

Since our ultimate goal is to compile all of these components from DarkSky into a Shiny app, it works out nicely that gt is being developed by the same people as Shiny and comes with two functions for rendering its tables in Shiny, render_gt and gt_output.
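A minimal sketch of how those two functions pair up is shown below. The output id forecast_table and the use of the built-in mtcars data as a placeholder table are assumptions made for the example, not part of the app.

```r
library(shiny)
library(gt)

# gt_output reserves a slot in the ui; the id "forecast_table" is made up
table_ui <- fluidPage(
  gt_output(outputId = "forecast_table")
)

# render_gt fills that slot; mtcars is only a placeholder dataframe
table_server <- function(input, output, session) {
  output$forecast_table <- render_gt({
    gt(head(mtcars))
  })
}

# Build the app object; run it with shiny::runApp(table_app)
table_app <- shinyApp(table_ui, table_server)
```

The id passed to gt_output has to match the name assigned on output in the server, which is the same convention Shiny uses for plotOutput and renderPlot.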

Dashboard

For our dashboard we use Shiny and shinydashboard, two packages built by the RStudio team to help make interactive analysis tools for R. Shiny provides the interactive functionality, allowing the user to set their own coordinates and API key, and to customize the plot included in the daily tab. Shinydashboard provides the structure into which the different Shiny widgets, ggplot graphs, and gt tables are laid out, and lets us separate them into the two tabs.

Structure

As mentioned in the data section, we divide the four dataframes into a current tab and a daily tab. The current weather tab contains the current, minutely, and hourly weather. It displays a snapshot of the current weather conditions, including a brief weather description in English, temperature, wind speed, humidity, pressure, UV index, dew point, and visibility. After the current weather snapshot (rendered as a table using gt), it plots the minutely dataframe in two separate ggplot plots, one showing the precipitation probability for the next 60 minutes and one showing the precipitation intensity for the same 60 minutes. Then another gt table shows the weather conditions (the same as in the snapshot, plus precipitation probability and intensity) for the next 24 hours, followed by a plot of each weather condition for the next 50 hours. The daily weather tab includes a table that summarizes the weather conditions in English for each day (as in the snapshot of the current weather tab) and shows sunrise and sunset times, precipitation information, max and min temperature, UV index, humidity, and wind speed. Below the table there are a variable selector and a plot. The chosen variable becomes the y-axis of the plot, with the following 8 days on the x-axis.
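The two-tab structure described above might be laid out along these lines with shinydashboard. This is only a sketch: every id (current_table, rain_probability_plot, daily_var, and so on) and the list of selectable variables are placeholders, and the real app contains many more outputs per tab.

```r
library(shiny)
library(shinydashboard)

dashboard_ui <- dashboardPage(
  dashboardHeader(title = "tiempoR"),
  dashboardSidebar(
    sidebarMenu(
      menuItem("Current", tabName = "current"),
      menuItem("Daily", tabName = "daily")
    )
  ),
  dashboardBody(
    tabItems(
      # Current tab: snapshot table, then the minutely plots
      tabItem(
        tabName = "current",
        gt::gt_output("current_table"),
        plotOutput("rain_probability_plot"),
        plotOutput("rain_intensity_plot")
      ),
      # Daily tab: summary table, variable selector, and one plot
      tabItem(
        tabName = "daily",
        gt::gt_output("daily_table"),
        selectInput(
          "daily_var", "Variable to plot",
          choices = c("temperatureHigh", "temperatureLow", "humidity")
        ),
        plotOutput("daily_plot")
      )
    )
  )
)
```

Each tabName in the sidebar has to match a tabItem in the body; that match is how shinydashboard knows which panel to show when a menu item is clicked.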

Input

The Shiny app lets the user specify the latitude and longitude of the desired location (with the White House coordinates as the default), the DarkSky API key, and, in the daily weather tab, the y-axis variable for the plot.

In case you are unfamiliar with Shiny, the way the code is structured is simple. There are two parts, the ui and the server functions. The ui defines how the different graphs, tables, and input widgets are laid out, what their identifiers are, and, for the input widgets, their parameters. For example, for a range slider you would specify the max and the min of the slider. The server part includes the rest of the R code: the dplyr data wrangling, the ggplot graph code, the gt table creation, etc.

Anything displayed by the ui has to be created in the server, and anything the server reads from the user has to be declared as an input in the ui. Also, anything that changes depending on a value from the input object has to be in the server function inside a reactive object (either a reactive or one of the render functions, such as renderPlot or render_gt). Likewise, any reactive object used in the server has to be called from inside another reactive object. Those are the Shiny rules. A good piece of advice: if you have anything that won’t change with any input, leave it outside the ui and server functions. It can still be referenced inside both, and it helps anyone who reads the code see that it doesn’t change.

The latitude and longitude input fields are simple text fields that are referenced in a reactive code section that calls get_current_forecast any time the latitude or the longitude changes. The DarkSky API key input feeds another reactive snippet that uses Sys.setenv to set the key in a place where the darksky package can find it. The variable selector directly modifies the y aesthetic inside the ggplot code, and because that code is inside a render function (i.e. it is reactive), the plot is regenerated every time a different variable is selected.
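Put together, the input handling could look roughly like the sketch below. Several things here are assumptions rather than the app’s actual code: the input ids, the variable names in daily_vars, the DARKSKY_API_KEY environment variable name, and the daily element name of the returned list. For simplicity, the sketch sets the key inside the same reactive that fetches the forecast instead of in a separate snippet, so that the key is always in place before the call.

```r
library(shiny)
library(ggplot2)

# Static values live outside ui and server because no input changes them
daily_vars <- c("temperatureHigh", "temperatureLow", "humidity", "windSpeed")

input_ui <- fluidPage(
  textInput("key", "DarkSky API key"),
  textInput("lat", "Latitude", value = "38.8977"),
  textInput("lon", "Longitude", value = "-77.0365"),
  selectInput("daily_var", "Variable to plot", choices = daily_vars),
  plotOutput("daily_plot")
)

input_server <- function(input, output, session) {
  # Re-runs whenever the key, latitude, or longitude changes
  forecast <- reactive({
    req(input$key, input$lat, input$lon)
    # Assumption: the darksky package reads the key from this variable
    Sys.setenv(DARKSKY_API_KEY = input$key)
    # Needs the darksky package and a valid key; only evaluated at run time
    darksky::get_current_forecast(
      as.numeric(input$lat), as.numeric(input$lon)
    )
  })

  # The selected variable name becomes the y aesthetic
  output$daily_plot <- renderPlot({
    ggplot(forecast()$daily, aes(x = time, y = .data[[input$daily_var]])) +
      geom_point() +
      geom_line()
  })
}

# Build the app object; run it with shiny::runApp(input_app)
input_app <- shinyApp(input_ui, input_server)
```

The .data[[...]] pronoun is one way to turn the selected string into a column reference inside aes; req simply pauses the reactive until all three text fields have a value.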

Data wrangling

There is minimal data wrangling in this dashboard, as get_current_forecast returns the data in tidy format. We only extract the four dataframes from the list and “clean” them by removing the variables we don’t use, so that the tables created with gt show only what we want. gt has a function dedicated to hiding columns, but it throws an error if you try to hide a column that is not in the dataset, and the DarkSky API does not always return the same set of variables. For example, when there is a storm nearby, it returns a variable indicating how far the storm is from the coordinates, but it omits that variable when there is no storm. That’s why it is better to use dplyr to select the variables we want rather than hiding columns: gt still doesn’t offer a select function.
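One way to make that selection robust to columns that come and go is the any_of helper, available in dplyr 1.0.0 and later. This is a suggested variation, not necessarily what the app does. The dataframe below is a mock, and nearestStormDistance is used as the example of a variable that is missing from this particular response.

```r
library(dplyr)

# Columns we would like to keep when they exist
wanted <- c("time", "summary", "temperature", "nearestStormDistance")

# Mock current-weather dataframe in which the storm distance is absent
current <- data.frame(
  time = "2020-05-01 12:00:00",
  summary = "Clear",
  icon = "clear-day",
  temperature = 68.2,
  stringsAsFactors = FALSE
)

# any_of() keeps the wanted columns that are present and silently skips
# the ones that are not, so no error is raised for the missing column
current_clean <- select(current, any_of(wanted))
print(names(current_clean))
```

Because the result keeps only columns named in wanted, any unexpected extra variable from the API is dropped as well.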

Demo

A practical example of using the dashboard is determining the optimal time of day to go and play frisbee. Once you go to https://shiny.reed.edu/s/users/samaebius/tiempoR/ you simply need to enter the coordinates of your backyard and scroll down under the current tab to view the forecast by the hour. Ideal frisbee conditions are lower wind speeds, lower precipitation probability, and a safe UV index, although the adventurous player might try their luck against stronger winds. To determine the best day of the week to play European football, the user only needs to select the daily tab and read the graph of the maximum temperatures for the week, or look at the summary column for a short description of the expected conditions.

Try it yourself! If you’ve made it this far, you are probably interested in using tiempoR (weatherR in Spanish). Head to this link and let us know if you have any suggestions on how to improve it! Our emails are daherrero and samaebius at reed.edu.

Sources

Dark Sky, Dark Sky API
Copyright © 2020 Apple Inc. All rights reserved.

tidyverse
Wickham et al., (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686

lubridate
Garrett Grolemund, Hadley Wickham (2011). Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1-25. URL http://www.jstatsoft.org/v40/i03/.

gt
Richard Iannone, Joe Cheng and Barret Schloerke (2020). gt: Easily Create Presentation-Ready Display Tables. R package version 0.2.0.5. https://CRAN.R-project.org/package=gt

shiny
Winston Chang, Joe Cheng, JJ Allaire, Yihui Xie and Jonathan McPherson (2020). shiny: Web Application Framework for R. R package version 1.4.0.2. https://CRAN.R-project.org/package=shiny

shinydashboard
Winston Chang and Barbara Borges Ribeiro (2018). shinydashboard: Create Dashboards with ‘Shiny’. R package version 0.7.1. https://CRAN.R-project.org/package=shinydashboard

darksky
Bob Rudis (2017). darksky: Tools to Work with the ‘Dark Sky’ ‘API’. R package version 1.3.0. https://CRAN.R-project.org/package=darksky

R
R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.