USGS Streamflow Data Extraction
Introduction
Streamflow data represent the hydrologic response of a watershed to changes in watershed conditions. Within the United States, streamflow measurements are mainly provided by the United States Geological Survey (USGS) or by water management districts. Streamflow information has a wide range of applications, such as estimating water yield, designing flood control structures, and understanding how past and future watershed characteristics affect watershed response. In addition, streamflow data are among the primary inputs required to calibrate and validate hydrologic watershed models, so rapid acquisition saves watershed modelers some of the time otherwise spent on data collection. Below is a description of steps that can be adapted to download streamflow data from the USGS website using available Python packages.
Step 1: Import the required packages - Install the packages if needed.
Three Python libraries need to be imported, as shown in the code below. pandas and NumPy are used to manipulate the streamflow data. The hydrofunctions package uses requests to extract streamflow data from the USGS website and import it into Python. Check out the documentation and capabilities of the hydrofunctions package here.
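Before importing, it can help to confirm that the three packages are actually installed. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is an illustrative assumption, not part of any of the three libraries.

```python
import importlib.util

# Packages used in this tutorial (import names).
REQUIRED = ["hydrofunctions", "pandas", "numpy"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    # Suggest an install command rather than installing automatically.
    print("Install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

If anything is reported as missing, install it from the command line and rerun the check.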
# Import required packages
import hydrofunctions as hf
import pandas as pd
import numpy as np
Step 2: Write a function to request streamflow data and manipulate the data for further processing.
Once the streamflow data have been downloaded from the USGS database, they need some manipulation to make sure they are consistent and can be used for further analysis. The code below extracts the streamflow data for a single station and returns them as a data frame.
def streamflow_raw(station_id, start_date, end_date):
    '''
    Extract USGS daily streamflow data for a single station.

    Inputs:
        station_id - USGS station ID
        start_date - Start date
        end_date - End date
    Outputs:
        Daily streamflow records
    '''
    # Extract streamflow from a gauge
    flow_extract = hf.NWIS(str(station_id), 'dv', start_date, end_date, parameterCd='00060')
    # Convert the data object to a data frame
    raw_flow = flow_extract.get_data().df().reset_index()
    # Add names to the data frame
    raw_flow.columns = ['Date', 'Flow (cfs)', 'Code']
    # Set date as index
    raw_flow.index = pd.to_datetime(raw_flow.Date)
    # Replace any negative flow values with NaN
    raw_flow.loc[raw_flow['Flow (cfs)'] < 0, 'Flow (cfs)'] = np.nan
    # Make sure that the dates are consistent (one row per calendar day)
    raw_flow_daily = raw_flow[['Flow (cfs)', 'Code']].resample('D').asfreq()
    return raw_flow_daily.round(2)
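To see what the cleaning steps in the function above do without calling the USGS service, here is a small sketch on synthetic records. The dates and flow values are made up for illustration; only the column names follow the function above.

```python
import numpy as np
import pandas as pd

# Synthetic records: 2020-01-03 is absent and one flow value is negative.
raw = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-04", "2020-01-05"],
    "Flow (cfs)": [120.456, -999999.0, 98.2, 101.0],
    "Code": ["A", "A", "P", "P"],
})
raw.index = pd.to_datetime(raw["Date"])

# Negative discharge is treated as invalid and replaced with NaN.
raw.loc[raw["Flow (cfs)"] < 0, "Flow (cfs)"] = np.nan

# Reindex to a continuous daily calendar; days with no record become NaN rows.
daily = raw[["Flow (cfs)", "Code"]].resample("D").asfreq().round(2)
print(daily)
```

The result has one row per calendar day, so gaps in the record show up as explicit NaN rows instead of silently missing dates.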
Step 3: Test the function with the USGS streamflow gauge ID USGS 03341500
Below is how the streamflow data look within the USGS database. The commands used to extract streamflow data are shown in the code below.
"""
# ---------------------------------- WARNING ----------------------------------------
# Some of the data that you have obtained from this U.S. Geological Survey database
# may not have received Director's approval. Any such data values are qualified
# as provisional and are subject to revision. Provisional data are released on the
# condition that neither the USGS nor the United States Government may be held liable
# for any damages resulting from its use.
#
# Additional info: https://help.waterdata.usgs.gov/policies/provisional-data-statement
#
# File-format description: https://help.waterdata.usgs.gov/faq/about-tab-delimited-output
# Automated-retrieval info: https://help.waterdata.usgs.gov/faq/automated-retrievals
#
# Contact: gs-w_support_nwisweb@usgs.gov
# retrieved: 2020-10-21 21:47:08 EDT (nadww02)
#
# Data for the following 1 site(s) are contained in this file
# USGS 03341500 WABASH RIVER AT TERRE HAUTE, IN
# -----------------------------------------------------------------------------------
#
# Data provided for site 03341500
# TS parameter Description
# 225418 00060 Discharge, cubic feet per second
# 225826 00065 Gage height, feet
#
# Data-value qualification codes included in this output:
#
# A Approved for publication -- Processing and review completed.
# P Provisional data subject to revision.
# e Value has been estimated.
#
agency_cd site_no datetime tz_cd 225418_00060 225418_00060_cd 225826_00065 225826_00065_cd
5s 15s 20d 6s 14n 10s 14n 10s
USGS 03341500 2018-10-01 00:00 EST 10200 A 8.62 A
USGS 03341500 2018-10-01 00:15 EST 10200 A 8.62 A
USGS 03341500 2018-10-01 00:30 EST 10200 A 8.61 A
USGS 03341500 2018-10-01 00:45 EST 10200 A 8.59 A
USGS 03341500 2018-10-01 01:00 EST 10200 A 8.59 A
USGS 03341500 2018-10-01 01:15 EST 10200 A 8.59 A
USGS 03341500 2018-10-01 01:30 EST 10200 A 8.58 A
USGS 03341500 2018-10-01 01:45 EST 10100 A 8.56 A
USGS 03341500 2018-10-01 02:00 EST 10100 A 8.56 A
USGS 03341500 2018-10-01 02:15 EST 10100 A 8.55 A
USGS 03341500 2018-10-01 02:30 EST 10100 A 8.54 A
USGS 03341500 2018-10-01 02:45 EST 10100 A 8.53 A
USGS 03341500 2018-10-01 03:00 EST 10100 A 8.53 A
"""
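For readers who have saved a file in the layout shown above and want to load it directly, the sketch below parses a shortened, tab-delimited sample with pandas. The helper name `read_rdb` and the abbreviated sample text are assumptions for illustration; the sketch relies on comment lines starting with `#` and on the column-format row (`5s 15s ...`) following the header.

```python
import io

import pandas as pd

# Abbreviated sample in the layout shown above (tab-delimited).
rdb_text = (
    "# Data provided for site 03341500\n"
    "agency_cd\tsite_no\tdatetime\ttz_cd\t225418_00060\t225418_00060_cd\n"
    "5s\t15s\t20d\t6s\t14n\t10s\n"
    "USGS\t03341500\t2018-10-01 00:00\tEST\t10200\tA\n"
    "USGS\t03341500\t2018-10-01 00:15\tEST\t10200\tA\n"
    "USGS\t03341500\t2018-10-01 01:45\tEST\t10100\tA\n"
)


def read_rdb(text, value_columns):
    """Parse tab-delimited text, skipping '#' comments and the format row."""
    # Read everything as strings so the leading zero in site_no is preserved.
    frame = pd.read_csv(io.StringIO(text), sep="\t", comment="#", dtype=str)
    # The first row after the header describes column widths, not data.
    frame = frame.iloc[1:].reset_index(drop=True)
    frame["datetime"] = pd.to_datetime(frame["datetime"])
    for column in value_columns:
        frame[column] = pd.to_numeric(frame[column], errors="coerce")
    return frame


sample = read_rdb(rdb_text, value_columns=["225418_00060"])
print(sample)
```

Reading every column as a string first avoids losing the leading zero in the site number; only the named value columns are converted to numbers.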
# Test the Streamflow function
station_id = '03341500'
start_date = '2018-10-01'
end_date = '2020-10-19'
flow = streamflow_raw(station_id, start_date, end_date)
A data frame is generated after running the above code. At this point, the data frame can be exported as a CSV file or to a database. The acquired streamflow data can also be visualized to check for inconsistencies, as shown below.
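As a rough sketch of the export step, the code below writes a frame to a CSV file and to a SQLite database, then counts days without a flow value. The small stand-in frame, the file names, and the table name are illustrative assumptions; in practice the frame returned by streamflow_raw would take its place.

```python
import sqlite3
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Stand-in for the frame returned by streamflow_raw (values are made up).
dates = pd.date_range("2018-10-01", periods=5, freq="D", name="Date")
flow_demo = pd.DataFrame(
    {
        "Flow (cfs)": [10200.0, 10100.0, np.nan, 9980.0, 9950.0],
        "Code": ["A", "A", None, "A", "P"],
    },
    index=dates,
)

# Write to a temporary folder; replace with a real output directory as needed.
out_dir = Path(tempfile.mkdtemp())

# Export to CSV; the Date index is written as the first column.
csv_path = out_dir / "flow_03341500.csv"
flow_demo.to_csv(csv_path)

# Export to a SQLite database using the standard-library driver.
db_path = out_dir / "streamflow.db"
conn = sqlite3.connect(db_path)
try:
    flow_demo.to_sql("flow_03341500", conn, if_exists="replace")
finally:
    conn.close()

# A quick consistency check: which days have no flow value?
missing_days = flow_demo.index[flow_demo["Flow (cfs)"].isna()]
print(f"{len(missing_days)} day(s) without a flow value")
```

SQLite is used here only because it needs no server; any database supported by pandas could be substituted.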
Conclusion
I have uploaded the entire code on my Google CodeLab. You can access or download the entire code by following this link. In summary, by using this Python script, you can rapidly extract streamflow records at daily resolution for numerous USGS streamflow gauges. Further analysis, such as data examination, visualization, or trend testing, can then be accomplished in the least time possible. As an example, I have included a comparison of the streamflow variability at different river sections, as shown in the interactive graph below.
Note: You can interactively select any of the streamflow gauge(s) by clicking on the legend item.
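One way to prepare such a multi-gauge comparison is to loop over station IDs and place each gauge's flow in its own column. The sketch below is a minimal illustration under stated assumptions: `collect_gauges` and `fake_fetch` are hypothetical helpers, the second station ID is a placeholder, and `fake_fetch` generates random values so the example runs offline. In practice, the streamflow_raw function from Step 2 would be passed as `fetch`.

```python
import numpy as np
import pandas as pd


def collect_gauges(station_ids, start_date, end_date, fetch):
    """Return one wide frame with a 'Flow (cfs)' column per station."""
    series = {}
    for station_id in station_ids:
        frame = fetch(station_id, start_date, end_date)
        series[station_id] = frame["Flow (cfs)"]
    # Align all gauges on the shared daily date index.
    return pd.concat(series, axis=1)


def fake_fetch(station_id, start_date, end_date):
    """Offline stand-in for streamflow_raw; the values are random, not real."""
    index = pd.date_range(start_date, end_date, freq="D", name="Date")
    rng = np.random.default_rng(int(station_id))
    flows = rng.uniform(100, 1000, len(index)).round(2)
    return pd.DataFrame({"Flow (cfs)": flows, "Code": "A"}, index=index)


# '99999999' is a placeholder ID used only for this illustration.
stations = ["03341500", "99999999"]
combined = collect_gauges(stations, "2018-10-01", "2018-10-10", fake_fetch)

# Coefficient of variation as a simple measure of flow variability per gauge.
variability = combined.std() / combined.mean()
print(variability)
```

Passing the fetch function as an argument keeps the loop independent of the data source, which makes it easy to try out without network access.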