PEcAn has the (optional) ability to initiate large ensembles of runs to perform different types of analyses:
- Sensitivity Analysis (Global Sensitivity Analysis)
- Uncertainty Analysis
- State Data Assimilation
- Parameter Data Assimilation
For now we will not focus on data assimilation, but we should keep in mind ways to generalize and expand the app to eventually support visualizations for these analyses as well.
- PEcAn runs ensembles as part of the PEcAn Workflow and then saves all the run outputs to `.Rdata` files at the `get.results` step.
- The following steps in the workflow (`run.ensemble.analysis` and `run.sensitivity.analysis`) load these `.Rdata` files, perform analyses, and save outputs as `.Rdata` files.
- There are diagnostic plots associated with the different analyses that are currently saved as pdfs. The pdfs can be viewed in the basic PEcAn app.
  - The goal is to port these figures to shiny where they can be interactive.
  - The plotting functions that we want to use already exist in the uncertainty package and are used throughout the sensitivity and uncertainty analysis workflows.
  - We can also explore additional visualizations once the basic ones are working.
- Global sensitivity analysis already has a very basic app that Alexey and I made years ago for a project. We may ultimately want to deprecate the old app and include global sensitivity as a tab in the new Ensemble Plots app.
Basic instructions for setting up ensemble runs through PEcAn can be found in the tutorial.
The tutorial also goes through the output files in more detail than I have explained here.
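
When exploring those output files, it can help to load each `.Rdata` file into its own environment rather than the global workspace, so you can see exactly which objects it contains. The following is a minimal base R sketch; `load_rdata` is a hypothetical helper (not part of PEcAn), and the demo file and the object name `example.output` are made up for illustration.

```r
# Hypothetical helper (not part of PEcAn): load an .Rdata file into a
# fresh environment and return its contents as a named list, so nothing
# in the global workspace gets overwritten.
load_rdata <- function(path) {
  env <- new.env()
  load(path, envir = env)
  as.list(env)
}

# Self-contained demonstration using a temporary file. The object name
# `example.output` is invented and is not a real PEcAn output name.
tmp <- tempfile(fileext = ".Rdata")
example.output <- list(values = c(1, 2, 3))
save(example.output, file = tmp)

contents <- load_rdata(tmp)
print(names(contents))  # names of the objects stored in the file
```

For a real workflow, `path` would point at one of the `.Rdata` files in the workflow folder, and `names(contents)` would list whatever objects PEcAn saved there.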
I put together a mock up of what I think the app could look like.
http://test-pecan.bu.edu/shiny/ecowdery/DemoEnsemblePlot/?workflow_id=1000009294
- Right now it only loads the pre-existing pdf files in essentially the same way that the original PEcAn app does.
- You may not want to use any of the code that I used for the example app. I made it just to have a visual example.
The code for the demo app is on my `shiny_DemoEnsemblePlot` branch of pecan: https://github.com/bcow/pecan/tree/shiny_DemoEnsemblePlot/shiny/DemoEnsemblePlot
- Work through the tutorial and see what the pdfs look like in the old PEcAn app.
- Look at the demo app to get an idea of what the new app could look like.
- Read through and try running
  - `/pecan/modules/uncertainty/R/run.ensemble.analysis.R`
  - `/pecan/modules/uncertainty/R/run.sensitivity.analysis.R`

  Both of these functions use plotting functions in `/pecan/modules/uncertainty/R/plots.R`.
- The main argument to these functions is `settings`, an R list object that is generated by running `read.settings(pecan.xml)`. If you have a workflow id and want to find its respective `pecan.xml`, run:

      library(dplyr)
      library(PEcAn.DB)  # provides betyConnect()
      bety <- betyConnect()
      # Look up the workflow's folder, which holds its pecan.xml
      folder <- tbl(bety, "workflows") %>% filter(id == 1000009290) %>% pull(folder)
      file.path(folder, "pecan.xml")

One workflow can include multiple types of ensembles. In the database, the `ensembles` table keeps track of which types of runs have been performed.
For example, I recently did a series of ensembles as part of workflow 1000009294.
By looking at the `ensembles` table and filtering by workflow id `1000009294` in the database, I can see the run types:

    > tbl(bety, "ensembles") %>% filter(workflow_id == 1000009294) %>% collect
    # A tibble: 2 x 6
              id notes created_at          updated_at          runtype              workflow_id
    *      <dbl> <chr> <dttm>              <dttm>              <chr>                      <dbl>
    1 1000018566 ""    2018-06-11 21:00:42 2018-06-11 21:00:42 ensemble              1000009290
    2 1000018565 ""    2018-06-11 20:59:52 2018-06-11 20:59:52 sensitivity analysis  1000009290

I ran an ensemble and a sensitivity analysis.
By looking at the `runs` table and filtering by ensemble id `1000018566` in the database, I can see that the ensemble contained 25 runs:

    > tbl(bety, "runs") %>% filter(ensemble_id == 1000018566) %>% collect()
    # A tibble: 25 x 14
          id model_id site_id start_time          finish_time         outdir outprefix setting parameter_list created_at
    *  <dbl>    <dbl>   <dbl> <dttm>              <dttm>              <chr>  <chr>     <chr>   <chr>          <dttm>
    1 1.00e9   1.00e9     756 2004-01-01 00:00:00 2004-12-31 00:00:00 ""     ""        ""      ensemble=1     2018-06-11 21:00:42
    2 1.00e9   1.00e9     756 2004-01-01 00:00:00 2004-12-31 00:00:00 ""     ""        ""      ensemble=19    2018-06-11 21:00:43
    # ... with more rows, and 4 more variables: updated_at <dttm>, started_at <dttm>, finished_at <dttm>, ensemble_id <dbl>

By looking at the `runs` table and filtering by ensemble id `1000018565` in the database, I can see that the sensitivity analysis contained 77 runs:

    > tbl(bety, "runs") %>% filter(ensemble_id == 1000018565) %>% collect
    # A tibble: 77 x 14
          id model_id site_id start_time          finish_time         outdir outprefix setting parameter_list created_at
    *  <dbl>    <dbl>   <dbl> <dttm>              <dttm>              <chr>  <chr>     <chr>   <chr>          <dttm>
    1 1.00e9   1.00e9     756 2004-01-01 00:00:00 2004-12-31 00:00:00 ""     ""        ""      quantile=15.8… 2018-06-11 17:00:14
    2 1.00e9   1.00e9     756 2004-01-01 00:00:00 2004-12-31 00:00:00 ""     ""        ""      quantile=2.27… 2018-06-11 16:59:52
    # ... with more rows, and 4 more variables: updated_at <dttm>, started_at <dttm>, finished_at <dttm>, ensemble_id <dbl>
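
The lookups above can be combined into a single summary of run counts per ensemble, which is roughly what the app would need in order to decide which tabs to show. This is a minimal base R sketch, assuming the `ensembles` and `runs` tables have already been pulled into data frames (for example with `collect()`); `count_runs_by_type` is a hypothetical helper, and the stand-in data frames only reproduce the ensemble ids and run counts shown above.

```r
# Hypothetical helper (not part of PEcAn): for each ensemble, report its
# run type and how many rows in `runs` belong to it.
count_runs_by_type <- function(ensembles, runs) {
  n_runs <- vapply(
    ensembles$id,
    function(id) sum(runs$ensemble_id == id),
    integer(1)
  )
  data.frame(
    ensemble_id = ensembles$id,
    runtype = ensembles$runtype,
    n_runs = n_runs,
    stringsAsFactors = FALSE
  )
}

# Stand-in data frames that mimic only the columns used here. In practice
# these would come from the database queries shown above.
ensembles <- data.frame(
  id = c(1000018566, 1000018565),
  runtype = c("ensemble", "sensitivity analysis"),
  stringsAsFactors = FALSE
)
runs <- data.frame(
  ensemble_id = c(rep(1000018566, 25), rep(1000018565, 77))
)

run_summary <- count_runs_by_type(ensembles, runs)
print(run_summary)
```

Working on collected data frames keeps the helper independent of the database connection, at the cost of pulling the full `runs` table into memory first.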