Methods for air data analysis
Introduction
1 Best practices
1.1 Welcome to the Tidyverse
1.2 Keep R updated
1.3 Script organization
1.4 Divide and conquer
1.5 Codebooks, metadata, and data dictionaries
1.6 Data formatting
1.7 Pollutant names
2 Get data
2.1 Air monitoring
2.1.1 Retrieving data from AQS
2.1.2 Retrieving data from AQS DataMart
2.1.3 Current AQI observations
2.1.4 WAIR Site information table
2.1.5 AirNow: Active AQI monitors
2.1.6 EPA Google Earth Site Maps
2.1.7 Retrieving data from LIMS via Tableau
2.1.8 Retrieving continuous data from AirVision
2.1.9 Retrieving data from MPCA WAIR database
2.1.10 Air toxics Data Explorer
2.1.11 Criteria Pollutant Data Explorer
2.1.12 Air Quality Index Summary Reports
2.1.13 Air Monitoring for PAHs
2.1.14 Internal Data Explorers
2.2 Health and standards
2.2.1 Inhalation health benchmarks (IHBs)
2.2.2 Air Quality Standards (NAAQS and MAAQS)
2.3 Air modeling
2.3.1 NATA modeling
2.3.2 MNRISKS statewide risk modeling
2.3.3 Downscaler modeling results for Ozone and PM2.5
2.3.4 CMAQ Ozone modeling
2.4 Context
2.4.1 MN Emissions Inventory
2.4.2 EPA’s NEI
2.4.3 Facility locations
2.4.4 Weather observations
2.4.5 HYSPLIT wind trajectories
2.4.6 Land use maps
2.4.7 United States Census boundaries
2.4.8 American Community Survey (ACS)
Quality assurance methods
3 Data cleaning
3.1 Blank, NULL, and missing values
3.2 Qualified data
3.3 Duplicate observations
4 Data validation
4.1 Instrument drift or leaks in a system
4.2 Extreme values and outliers
4.3 Sequential repeats and “sticky” numbers
4.4 Unique detected values
4.5 References
5 Collocated monitors
5.1 Evaluate collocated air monitors
5.2 Prioritize multiple air monitors (POCs)
6 Detection limits
6.1 Method Detection Limit
6.2 Calculating MDLs
6.3 Finding MPCA detection limits
6.4 Qualifier Codes for Detection Limits
6.5 Estimating below detection values
6.6 Recommended Steps
6.7 Multiple detection limits
6.8 Reporting limits
6.9 References
7 Completeness checks
7.1 Completeness checks
Summary methods
8 Summary statistics
8.1 Test for normality
8.2 Bootstrapping
8.3 Below the detection limit
8.4 Upper confidence limits (UCLs)
8.5 Annual summaries for incomplete data
8.6 Comparison of data to inhalation health benchmarks
9 Site comparisons
9.1 Confidence intervals
9.2 Tools to visualize differences
Beeswarm Plots
Correlation matrices
References
10 From start to finish
10.1 Using functions
10.2 A simpler analysis
Visuals
11 Charts
11.1 Calendar plots
11.2 Colors and themes
11.3 Pollution roses
12 Time series
Hour of the day
Day of the week
Seasonality
13 Maps
Interactive map of air monitors
Creating shapefiles in R
MPCA data tools
14 Air toxics explorer
14.1 Data Cleaning
Null Results
Duplicate Observations
14.2 Data Validation
Parameter Occurrence Codes (POCs)
Data Completeness
Below the detection limit
14.3 Producing Annual Summaries
Calculating annual means
Calculating upper confidence limits
14.4 Comparing sites
15 Criteria pollutant explorer
16 AQI explorer
References
Created by MPCA-air
MPCA data tools
The following sections describe the methods used to produce MPCA’s public data products.