Knox County Ozone Appendix P Calculation and Justification

Knox County Ozone Appendix P Calculation and Justification

Air Quality Data Analysis Using Open Source Tools Daniel Garver and Ryan Brown US EPA Region 4 The Problem Do you ever want to do more than is possible through a standard AQS report? And have it automated? Do you wish you had additional software but dont have the resources? Software like SAS, SigmaPlot, Matlab, etc. is useful for air quality data analysis. For many agencies, however, these tools are cost prohibitive. Do you have routine reports or analyses that you currently generate manually and are resource/time intensive? Outline About Python

SQL databases and Python SQLAlchemy Region 4 Examples: Ozone site correlations Air monitoring geodatabase Automated wind roses and pollution roses Other potential applications What is Python? Python is a programming language that lets you work more quickly and integrate your systems more effectively. You can learn to use Python and see almost immediate Python.or gains in productivity and lower Extremely portable and compatible the same code runsgon Windows, maintenance costs. Linux/Unix, Mac OS X and works with existing code in C, C++, Fortran,

Java, R, and HTML Open Source License free to use (unlimited number of users and licenses) and immediately available for many diverse applications Strong user base that continually develops and supports python free add-ons available for database integration, graphing, math/statistics, web development (22,846 add-ons currently available and growing) Same functionality as widely available, costly, commercial software Free, open source add-ons similar to: Matlab SAS PDF writing software Graphing software HTML Web Interface software Database management/migration software Using Python With Air Quality Data Air quality data is almost always stored in an SQL relational database

Examples: AQS, AQS Datamart (Oracle) AIRNow (MySQL) Proprietary air monitoring data management systems such as E-DAS or AirVision (Usually MS SQL Server or Oracle) State or Local Agency Databases (various platforms) The first step is to access the data using Python SQLAlchemy: a Python add-on What is SQLAlchemy? Python and SQLAlchemy SQLAlchemy is a Python add-on that interacts with relational databases The same Python code can communicate with any type of SQL database (e.g. Oracle, MySQL, SQL Server, Access) Python code with SQLAlchemy is easier to read and write than raw SQL Your data can then be used for other purposes in Python

SQLAlchemy An Example This Python code query query == session.query(dm.dim_monitor).join(dm.dim_facility) session.query(dm.dim_monitor).join(dm.dim_facility) Generates this SQL to the AQS Data Mart: SELECT SELECT aqsmart.dim_monitor.dim_monitor_key aqsmart.dim_monitor.dim_monitor_key AS AS aqsmart_dim_monitor_dim__1, aqsmart_dim_monitor_dim__1, aqsmart.dim_monitor.dim_facility_key aqsmart.dim_monitor.dim_facility_key AS AS aqsmart_dim_monitor_dim__2, aqsmart_dim_monitor_dim__2, aqsmart.dim_monitor.dim_substance_key aqsmart.dim_monitor.dim_substance_key AS AS aqsmart_dim_monitor_dim__3, aqsmart_dim_monitor_dim__3, aqsmart.dim_monitor.mo_id aqsmart.dim_monitor.mo_id AS AS aqsmart_dim_monitor_mo_id,

aqsmart_dim_monitor_mo_id, aqsmart.dim_monitor.poc aqsmart.dim_monitor.poc AS AS aqsmart_dim_monitor_poc, aqsmart_dim_monitor_poc, aqsmart.dim_monitor.measurement_scale aqsmart.dim_monitor.measurement_scale AS AS aqsmart_dim_monitor_meas_4, aqsmart_dim_monitor_meas_4, aqsmart.dim_monitor.measurement_scale_definition aqsmart.dim_monitor.measurement_scale_definition AS AS aqsmart_dim_monitor_meas_5, aqsmart_dim_monitor_meas_5, aqsmart.dim_monitor.project_type_code AS aqsmart_dim_monitor_proj_6, aqsmart.dim_monitor.project_type_code AS aqsmart_dim_monitor_proj_6, aqsmart.dim_monitor.project_type aqsmart.dim_monitor.project_type AS AS aqsmart_dim_monitor_proj_7, aqsmart_dim_monitor_proj_7, aqsmart.dim_monitor.dominant_source AS aqsmart.dim_monitor.dominant_source AS aqsmart_dim_monitor_domi_8, aqsmart_dim_monitor_domi_8,

aqsmart.dim_monitor.probe_location aqsmart.dim_monitor.probe_location AS AS aqsmart_dim_monitor_prob_9, aqsmart_dim_monitor_prob_9, aqsmart.dim_monitor.probe_height aqsmart.dim_monitor.probe_height AS AS aqsmart_dim_monitor_prob_a, aqsmart_dim_monitor_prob_a, aqsmart.dim_monitor.probe_horiz_distance aqsmart.dim_monitor.probe_horiz_distance AS AS aqsmart_dim_monitor_prob_b, aqsmart_dim_monitor_prob_b, aqsmart.dim_monitor.probe_vert_distance AS aqsmart_dim_monitor_prob_c, aqsmart.dim_monitor.probe_vert_distance AS aqsmart_dim_monitor_prob_c, aqsmart.dim_monitor.sample_residence_time aqsmart.dim_monitor.sample_residence_time AS AS aqsmart_dim_monitor_samp_d, aqsmart_dim_monitor_samp_d, aqsmart.dim_monitor.unrestr_air_flow_ind aqsmart.dim_monitor.unrestr_air_flow_ind AS AS aqsmart_dim_monitor_unre_e, aqsmart_dim_monitor_unre_e,

aqsmart.dim_monitor.surrogate_ind aqsmart.dim_monitor.surrogate_ind AS AS aqsmart_dim_monitor_surr_f, aqsmart_dim_monitor_surr_f, aqsmart.dim_monitor.collaborating_programs aqsmart.dim_monitor.collaborating_programs AS AS aqsmart_dim_monitor_coll_10, aqsmart_dim_monitor_coll_10, aqsmart.dim_monitor.last_sampling_date aqsmart.dim_monitor.last_sampling_date AS AS aqsmart_dim_monitor_last_11, aqsmart_dim_monitor_last_11, aqsmart.dim_monitor.etl_last_load_process aqsmart.dim_monitor.etl_last_load_process AS AS aqsmart_dim_monitor_etl__12, aqsmart_dim_monitor_etl__12, aqsmart.dim_monitor.etl_last_load_date AS aqsmart_dim_monitor_etl__13 aqsmart.dim_monitor.etl_last_load_date AS aqsmart_dim_monitor_etl__13 FROM FROM aqsmart.dim_monitor aqsmart.dim_monitor JOIN JOIN aqsmart.dim_facility

aqsmart.dim_facility ON ON aqsmart.dim_facility.dim_facility_key aqsmart.dim_facility.dim_facility_key == aqsmart.dim_monitor.dim_facility_key aqsmart.dim_monitor.dim_facility_key Once you have access to your air quality data in Python, you can: Perform statistical analysis Create graphs Export data for other applications (e.g. Excel) Display your data in a web interface Create PDF reports Load data into another database (e.g. for GIS) Perform geoprocessing

Perform automated analysis for data validation or QA/QC An AQS Example: Ozone Site Comparison Regressions (with Graphs!) Can be run in minutes! Continually useful! Graphs! Ozone Site Comparison Regressions Compares one site with all other sites in the CBSA based on daily maximum 8-hr average Generates nonparametric Theil-Sen Regression plots of each site pair Calculates correlation statistics Spearmans r, Pearsons r, etc. Potential uses: Evaluating network design (identifying redundant sites) Modeling concentrations in the event of missing

data Ozone Site Comparison Matrix Site A Site B pn Spearman r Spearman val Pearson r Pearson p-val Theil slope Theil int 47-093-1020-44201-1 47-093-0021-44201-1 1444 0.951914952 0 0.943372755 0

0.94118 0.00185 47-063-0003-44201-1 47-093-0021-44201-1 292 0.941248585 9.46E-139 0.933572848 2.90E-131 0.91892 7.00E-05 47-105-0108-44201-1

47-093-0021-44201-1 93 0.912599594 4.10E-37 0.900784971 1.00E-34 0.75 0.01025 47-089-0002-44201-1 47-093-0021-44201-1 1417 0.877047948 0

0.733866562 6.64E-240 1 -0.001 47-105-0109-44201-1 47-093-0021-44201-1 1327 0.832461933 0 0.704831416 7.99E-200 0.88889 0.00411

47-001-0101-44201-1 47-093-0021-44201-1 1420 0.825591772 0 0.693255698 5.94E-204 0.92453 0.00421 47-009-0101-44201-1 47-093-0021-44201-1 1482 0.747143127 7.83E-265

0.641101197 2.45E-172 0.91667 -0.00271 47-155-0101-44201-1 47-093-0021-44201-1 1484 0.726548204 8.57E-244 0.619745174 3.62E-158 1.03571 -0.00966

47-155-0102-44201-1 47-093-0021-44201-1 1102 0.655271896 3.59E-136 0.474699528 5.18E-63 0.94118 -0.00524 47-009-0102-44201-1 47-093-0021-44201-1 1425 0.637935642 1.17E-163

0.535401114 1.65E-106 0.85714 0.00771 Regression Plots: Selected Southeast Sites Example: An Air Monitoring Geodatabase Custom AQS Data Mart Queries New database design is created in geodatabase Data loaded into a geodatabase Python scripting in GIS can automatically create custom layers and feature classes

Example: Air Pollution Wind Roses Customized AQS Data Mart query pulls: Concentration data over a certain value (e.g. SO2 > 75 ppb) Corresponding wind speed and direction for those hours with high concentration Another Python add-on is used to generate a wind rose or pollution rose of the hours with high concentration Other possibilities for air quality data Object wrappers/framework for retrieving data from specific databases (e.g. AQS Datamart)

Could design complex queries with less coding Speed up development time for specific analysis, graphing, web development functionality Python powered web interface Data retrieval, analysis/graphics, and web connectivity can be coded all together Summary Agencies rely on air quality data analysis for planning and decision making State, Local, and Federal Budgets are being cut Data analysis automation can provide real value - Time savings - Better, more reliable analysis - Python is a free tool that can help achieve this goal: - Easy to learn and read - Ideal for scientists and engineers (part time programmers) Additional Resources

Daniel Garver [email protected] 404-562-9839 Ryan Brown [email protected] 404-562-9147 Python: SQLAlchemy: Codecademy Python course (free): Questions?

Recently Viewed Presentations

  • Colibacillosis F5+ group of E.coli Pili serogroups K99,

    Colibacillosis F5+ group of E.coli Pili serogroups K99,

    Colibacillosis F5+ group of E.coli Pili serogroups K99, K88, F41, 987P, F18. ST and LT enterotoxins Inadequate colostrum Sow agalactia Viruses, coccidia ... human control essential Blitz treat (metaphylaxis) herd 30 days Shampoo pigs and disinfect with 0.5% Clorox Move...
  • Les Pronoms

    Les Pronoms

    Les deux pronoms Il y a deux objets dans la phrase. Je te donne mon parapluie. Les deux pronoms Je te donne mon parapluie. le Merci! Les deux pronoms me te le lui y en nous la leur vous les...
  • Contrasts in tolerance? Makling sense of differences in ...

    Contrasts in tolerance? Makling sense of differences in ...

    Making sense of differences in European Prison Rates Introduction The punitive turn Making sense of differences in Prison rates External factor explanations Table 1: Selective prison rates 2005/6 Three issues needing more study 1.What can prison rates tell us 2....
  • War Tests the Greeks - White Plains Middle School

    War Tests the Greeks - White Plains Middle School

    White Plains City School District Other titles: Arial Default Design War Tests the Greeks Slide 2 The Persians and the Greeks Slide 4 The Persian Wars Slide 6 The Alliance Slide 8 The Peloponnesian War Slide 10 Summation of Wars...
  • Statistics Refresher for Fisheries Professionals

    Statistics Refresher for Fisheries Professionals

    Catch Curve Decisions. Descending limb. Include age with peak catch. Use all ages where C. t >0though some argue for various right truncation rules. Weighted regression. Reduced effect of ages with low CPE. Demonstrated with R script. Chapman-Robson method. Generally...
  • The PLC Journey Continues…

    The PLC Journey Continues…

    PLC teams develop norms to guide their collaboration. By what standards of behaviors will a team agree to operate? Ground Rules or habits that govern the group. When individuals work through a process to create explicitly stated norms, and then...
  • Genealogy Gallery Walk: Making Research Personal Presented by:

    Genealogy Gallery Walk: Making Research Personal Presented by:

    Where do I come from? How does my family background impact who I am? Genealogy Project. Essential Questions. One of the teachers on the English team experienced a death in the family this past Fall and, during the process of...
  • Chapter 10: Gases - South Georgia State College

    Chapter 10: Gases - South Georgia State College

    Solve The gas mass is the difference between the mass of the flask filled with gas and the mass of the evacuated flask: The gas volume equals the volume of water the flask can hold, calculated from the mass and...