In the article How to Import Weather Data in MySQL, we demonstrated how to load weather forecast data into MySQL using a Python script that was run at regular intervals. Since historical weather data is often just as critical for data science and analytics applications, in this article we demonstrate how to load weather history data using Python. We will use the Visual Crossing Weather API because it offers free access that includes past weather data as well as the forecast.
We are going to use the Timeline Weather API so that the code within this article can be used to seamlessly retrieve forecast, history data, or both within Python. If you query a date range in the future, you will retrieve the weather forecast. If you query a date range in the past, you will retrieve historical data. The switch between history and forecast is seamless – there is no change in the query or data format.
Why shouldn’t we scrape weather data from our favorite web site?
Scraping weather data means we simply visit a web site and either manually or programmatically copy the data from that web page. Programmatic scraping of weather data can be difficult to implement and then even more difficult to maintain. When the web site changes, even in small ways, the scraping code will almost certainly need changing. Most importantly, scraping data is against the usage terms of almost all web sites.
If we want to retrieve weather data reliably at regular intervals, we need a more robust solution. Using a Weather API means we do not have to scrape data at all.
Prerequisites
Before starting, you need access to the Weather API we are going to use. If you don’t have an account already, you can sign up for a free account at Visual Crossing Weather Services. Visual Crossing offers a perpetual free tier allowing up to 1000 results per day without any cost. If you need to query more weather data, you can pay per result or sign up for a monthly plan.
Our Python script has been written in Python 3.8.2 but should work in most recent versions. In order to make things simple, we have kept the library requirements to a minimum. In this sample we are going to use the CSV format to download the data so we include a library to help process the data that is returned by the API.
Here’s the full list of import statements in our script:
import csv
import codecs
import urllib.request
import urllib.error
import sys
Setting up the weather data input parameters
The first part of our script sets up the parameters for the script to download the weather data, and the second part retrieves the weather data. The weather data is retrieved using a RESTful weather API so we simply have to create a web query URL within the Python script and run it as a web query to retrieve the result data.
The first part of the code sets up some parameters to customize the weather data that is retrieved.
BaseURL = 'https://weather.visualcrossing.com/VisualCrossingWebServices/rest/services/timeline/'
ApiKey='YOUR_API_KEY'
#UnitGroup sets the units of the output - us or metric
UnitGroup='us'
#Location for the weather data
Location='Washington,DC'
#Optional start and end dates
#If nothing is specified, the forecast is retrieved.
#If start date only is specified, a single historical or forecast day will be retrieved
#If both start and end date are specified, a date range will be retrieved
StartDate = ''
EndDate=''
#JSON or CSV
#JSON format supports daily, hourly, current conditions, weather alerts and events in a single JSON package
#CSV format requires an 'include' parameter below to indicate which table section is required
ContentType="csv"
#include sections
#values include days,hours,current,alerts
Include="days"
In the above code, we set up parameters for the script. To keep this script simple and readable, we’ve hardcoded the values of the various parameters. In production code, you may want to make these more dynamic.
The first parameter is the location for which the weather data should be queried. This is in the form of an address, partial address or latitude/longitude pair (for example 35.46,-75.12).
We then use the API Key which is provided when signing up for the API. You can access it by going to your Weather Data Account Page and copying it from your account details.
Next, we specify the date range of the weather data that we are seeking. The Weather API will automatically return historical or forecast data based on the date range requested. The code expects a start and end date in the form YYYY-MM-DD; for example, 2020-03-26 is 26 March 2020. The format of the date is important for the weather API query.
If start date and end date are not specified, the query will request the weather forecast for the next 15 days.
You can also use a dynamic date period as the start date such as yesterday, tomorrow, last7days etc. See Weather Data Periods for more details about the options available.
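If you would rather compute explicit dates than use a dynamic period, the YYYY-MM-DD strings can be produced with Python's datetime module. The following is a minimal sketch; the helper name date_range_strings and its parameters are our own illustration and are not part of the API.

```python
from datetime import date, timedelta

def date_range_strings(days_back, today=None):
    """Return (start, end) as YYYY-MM-DD strings covering the `days_back` days before today."""
    today = today or date.today()
    start = today - timedelta(days=days_back)
    end = today - timedelta(days=1)
    # date.isoformat() always produces the YYYY-MM-DD form the query expects
    return start.isoformat(), end.isoformat()

# Example: the seven full days before today
StartDate, EndDate = date_range_strings(7)
print(StartDate, EndDate)
```

The resulting strings can be assigned directly to the StartDate and EndDate variables used in the script.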
Many more API parameters are available than are shown in this simple example. For more information on the full set of Weather API parameters, see the Weather API documentation.
Downloading the weather data
The next section of code creates the Weather API request from the parameters, submits the request to the server, and then parses the result.
#basic query including location
ApiQuery=BaseURL + Location
#append the start and end date if present
if (len(StartDate)):
    ApiQuery+="/"+StartDate
    if (len(EndDate)):
        ApiQuery+="/"+EndDate
#Url is completed. Now add query parameters (could be passed as GET or POST)
ApiQuery+="?"
#append each parameter as necessary
if (len(UnitGroup)):
    ApiQuery+="&unitGroup="+UnitGroup
if (len(ContentType)):
    ApiQuery+="&contentType="+ContentType
if (len(Include)):
    ApiQuery+="&include="+Include
ApiQuery+="&key="+ApiKey
The first part of this code section combines the parameters into a single request URL. In this example, we are sending the request as a GET request. Other request techniques are available, such as POST, which is useful if your query string is long, or even OData for specialized data import and data science applications. See the Weather API documentation section for more information.
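Plain string concatenation works for simple values such as Washington,DC, but a location containing spaces or other special characters needs to be percent-encoded. As a hedged alternative, the same URL could be assembled with the standard urllib.parse helpers. The function name build_query_url and the constant BASE_URL below are hypothetical names chosen for this sketch.

```python
from urllib.parse import quote, urlencode

BASE_URL = 'https://weather.visualcrossing.com/VisualCrossingWebServices/rest/services/timeline/'

def build_query_url(location, api_key, start_date='', end_date='', **params):
    """Build a Timeline API GET URL from a location, optional dates and query parameters."""
    # Percent-encode each path segment; commas are left as-is so lat,long pairs stay readable
    segments = [quote(location, safe=',')]
    if start_date:
        segments.append(quote(start_date, safe=','))
        # An end date only makes sense when a start date is present
        if end_date:
            segments.append(quote(end_date, safe=','))
    # Skip empty parameters, then add the API key last
    query = {name: value for name, value in params.items() if value}
    query['key'] = api_key
    return BASE_URL + '/'.join(segments) + '?' + urlencode(query)

print(build_query_url('Washington, DC', 'YOUR_API_KEY', '2020-03-26', '2020-03-27',
                      unitGroup='us', contentType='csv', include='days'))
```

Using urlencode also avoids the stray "?&" at the start of the query string, although the server accepts either form.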
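print(' - Running query URL: ', ApiQuery)
print()
try:
    CSVBytes = urllib.request.urlopen(ApiQuery)
except urllib.error.HTTPError as e:
    # The server answered with an error status; the response body explains why
    ErrorInfo= e.read().decode()
    print('Error code: ', e.code, ErrorInfo)
    sys.exit()
except urllib.error.URLError as e:
    # No HTTP response was received (for example a DNS or connection failure),
    # so there is no status code or body to read, only a reason
    print('Error reason: ', e.reason)
    sys.exit()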
This code section downloads the requested weather data and provides some simple error handling. In this example, we use the urllib.request library to retrieve the data.
Error handling with urllib.request
It is important to provide error handling so that problems may be resolved quickly. When providing the error handling, ensure that you check the HTTP response code and also read the response body.
The response body will contain full details of the Weather API error, if any, and looking at it via logging or other output mechanism is the best way to resolve most query problems. Another useful way to troubleshoot API requests that are returning an error is to copy the URL from the code into a web browser such as Chrome or Edge. This will provide a quick and easy test to see any errors that are being returned. This technique can also be used to see the structure of the resulting weather data. Don’t forget you can also use the Weather Data Services query builder page to construct requests and see results.
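One way to keep the status code and the response body together for logging is a small helper that turns either kind of urllib error into a single message. This is a sketch only; describe_error and fetch are hypothetical names, and the message wording is our own.

```python
import urllib.error
import urllib.request

def describe_error(err):
    """Return a one-line description of a urllib error, including the response body if there is one."""
    if isinstance(err, urllib.error.HTTPError):
        # The server replied: report the status code and the body, which carries the API's explanation
        body = err.read().decode('utf-8', errors='replace')
        return 'HTTP {}: {}'.format(err.code, body)
    # Plain URLError: no response was received, so only a reason is available
    return 'Connection problem: {}'.format(err.reason)

def fetch(url):
    """Open the URL, exiting with a descriptive message on failure."""
    try:
        return urllib.request.urlopen(url)
    except urllib.error.URLError as err:
        # HTTPError is a subclass of URLError, so this clause handles both cases
        raise SystemExit(describe_error(err))
```

Because the helper returns a string rather than printing, the same message can be sent to a log file or shown to the user.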
Using the weather data
As mentioned above, we are using Comma Separated Values (CSV) as the output format in this example, but JSON format is available, too. The CSV data is UTF-8 encoded, so we need to indicate that in the code to ensure accurate decoding.
# Parse the results as CSV
CSVText = csv.reader(codecs.iterdecode(CSVBytes, 'utf-8'))
CSVText is now a csv.reader object that yields each row of weather data as a list of strings. From here we can use the data in many ways. For example, we can analyze the weather data in R, load it into a database or simply display it to the user. In this example, we simply display the data as direct output.
The raw data is simply a table of weather data rows, in this case the historical weather data for each day requested. Obviously, your production code will do something far more useful with the data.
name,datetime,tempmax,tempmin,temp,feelslikemax,feelslikemin,feelslike,dew,humidity,precip,precipprob,precipcover,preciptype,snow,snowdepth,windgust,windspeed,winddir,sealevelpressure,cloudcover,visibility,solarradiation,solarenergy,uvindex,severerisk,sunrise,sunset,moonphase,conditions,description,icon,stations
"Washington, DC, United States",2021-12-14,58.9,34.8,44.3,58.9,34.8,44,28.4,58.5,0,0,,,0,0,9.2,4.7,66.6,1032.2,34.4,12.5,118,7.1,4,10,2021-12-14T07:19:08,2021-12-14T16:46:50,0.41,Partially cloudy,Partly cloudy throughout the day.,partly-cloudy-day,"KDCA,F0198,KADW,KDAA,PWDM2"
"Washington, DC, United States",2021-12-15,54.1,36,45.3,54.1,36,45.1,39,78.9,0,4,,,0,0,8.5,4.3,109.6,1031.5,63.5,15,81.4,7.1,4,10,2021-12-15T07:19:49,2021-12-15T16:47:07,0.44,Partially cloudy,Partly cloudy throughout the day.,partly-cloudy-day,
"Washington, DC, United States",2021-12-16,62,45,53.7,62,41.9,52.6,36.1,67.1,0,16,,,0,0,20.8,10.3,208.8,1019.9,55.7,15,73.3,6.4,3,10,2021-12-16T07:20:29,2021-12-16T16:47:26,0.47,Partially cloudy,Partly cloudy throughout the day.,partly-cloudy-day,
Address,Date time,Minimum Temperature,Maximum Temperature,Temperature,Dew Point,Relative Humidity,Heat Index,Wind Speed,Wind Gust,Wind Direction,Wind Chill,Precipitation,Precipitation Cover,Snow Depth,Visibility,Cloud Cover,Sea Level Pressure,Weather Type,Latitude,Longitude,Resolved Address,Name,Info,Conditions
"Herndon,VA",01/01/2019,44.6,60.8,53.1,45.1,75.58,,23.2,36.3,269.29,41.4,0,8.33,,9.1,83.5,1014.8,"Mist, Fog, Light Rain",38.96972,-77.38519,"Herndon,VA","","","Overcast"
"Herndon,VA",01/02/2019,37.3,44.9,42.4,35.9,78.17,,8.2,,143.95,38.7,0,0,,10,98.3,1023.6,"",38.96972,-77.38519,"Herndon,VA","","","Overcast"
The code simply steps through the rows of CSV data and prints them out.
RowIndex = 0
# The first row contains the headers, and each additional row contains the weather metrics for a single day
# To simplify our code, we use the knowledge that column 0 contains the location and column 1 contains the date. The data starts at column 4
for Row in CSVText:
    if RowIndex == 0:
        FirstRow = Row
    else:
        print('Weather in ', Row[0], ' on ', Row[1])
        ColIndex = 0
        for Col in Row:
            if ColIndex >= 4:
                print(' ', FirstRow[ColIndex], ' = ', Row[ColIndex])
            ColIndex += 1
    RowIndex += 1
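As an alternative to tracking column positions by hand, the standard csv.DictReader class reads the header row itself and returns each data row as a dictionary keyed by column name. The sketch below runs on a short, trimmed sample in the same layout as the output shown earlier; the function name daily_highs is our own.

```python
import csv
import io

def daily_highs(lines):
    """Return a list of (date, maximum temperature) pairs from CSV lines with a header row."""
    reader = csv.DictReader(lines)
    # CSV values arrive as strings, so convert the temperature to a number
    return [(row['datetime'], float(row['tempmax'])) for row in reader]

# A trimmed sample with only four of the columns
sample = io.StringIO(
    'name,datetime,tempmax,tempmin\n'
    '"Washington, DC, United States",2021-12-14,58.9,34.8\n'
    '"Washington, DC, United States",2021-12-15,54.1,36\n'
)
for day, high in daily_highs(sample):
    print(day, high)
```

With the live response, the same function could be called as daily_highs(codecs.iterdecode(CSVBytes, 'utf-8')), provided the response uses the column names shown in the first sample above.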
Parsing JSON data
If you prefer to use JSON-formatted data (which can often be significantly faster to process within Python), you can parse the response as follows:
import json
....
weatherData = json.loads(data.decode('utf-8'))
The variable weatherData now contains the parsed weather data set and can be easily processed.
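To illustrate what processing the parsed data might look like, here is a minimal sketch that runs on a small hand-written payload rather than a live response. It assumes the JSON response contains a days list whose entries use the same field names as the CSV columns (datetime, tempmin, tempmax); check the Weather API documentation for the exact structure. The function name summarize_days is hypothetical.

```python
import json

# Hand-written sample payload, NOT real API output; the field names are assumed
# to mirror the CSV column names shown earlier in this article
sample_body = (
    b'{"resolvedAddress": "Washington, DC, United States", '
    b'"days": [{"datetime": "2021-12-14", "tempmax": 58.9, "tempmin": 34.8}]}'
)

def summarize_days(raw_bytes):
    """Decode a UTF-8 JSON body and return (date, min, max) tuples for each day."""
    weather = json.loads(raw_bytes.decode('utf-8'))
    # .get() keeps the function from failing if the response has no 'days' section
    return [(day['datetime'], day['tempmin'], day['tempmax'])
            for day in weather.get('days', [])]

for date_text, low, high in summarize_days(sample_body):
    print(date_text, low, high)
```

With a live request, the bytes would come from calling read() on the object returned by urllib.request.urlopen, with ContentType set to json.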
Full source code for downloading historical weather data in Python
For the full code listing, head over to our GitHub, where you will find this and other Weather API examples in Python, Java and other languages.
Next steps
This simple code demonstrates how easy it is to download historical weather data using Python without incurring the headaches of scraping data from a website that might change its format or even block your IP at any time. Once you have the weather data loaded into Python, you can start analyzing and using the data in any project you choose.
Questions or need help?
If you have a question or need help, please post on our actively monitored forum for the fastest replies. You can also contact us via our support site or drop us an email at support@visualcrossing.com.