There are two types of configuration:
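ProjectTemplate configuration, which determines how load.project() behaves when executed. For example, whether to have logging enabled.
Project specific configuration, which your own scripts can use, including munge scripts. For example, you may define
plot_footnote = "My Proj" to control a consistent look and feel for plots.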
Both types are stored in the
config object accessible from the global environment. The function
project.config() will display the current configuration, including project specific configuration.
ProjectTemplate configuration settings exist in the
data_loading: This can be set to ‘on’ or ‘off’. If
data_loading is on, the system will load data from both the
cache and data directories, with cache taking precedence in the case of name conflict. By default,
data_loading_header: This can be set to ‘on’ or ‘off’. If
data_loading_header is on, the system will load text data files, such as CSV, TSV, or XLSX, treating the first row as the header.
data_ignore: A comma separated list of files to be ignored when importing from the
data/ directory. Regular expressions can be used but should be delimited (on both sides) by
/. The default is to ignore no files. Note that filenames and filepaths should never begin with a
/. Entire directories under
data/ can be ignored by adding a trailing
/. See Mastering ProjectTemplate for more details.
cache_loading: This can be set to ‘on’ or ‘off’. If
cache_loading is on, the system will load data from the
cache directory before any attempt to load from the
data directory. By default,
recursive_loading: This can be set to ‘on’ or ‘off’. If
recursive_loading is on, the system will load data from the
data directory and all its subdirectories recursively. By default,
munging: This can be set to ‘on’ or ‘off’. If
munging is on, the system will execute the files in the
munge directory sequentially using the order implied by the
munging is off, none of the files in the
munge directory will be executed. By default,
logging: This can be set to ‘on’ or ‘off’. If
logging is on, a logger object using the
log4r package is automatically created when you run
load.project(). This logger will write to the
logs directory. By default,
logging_level: The value of
logging_level is passed to a logger object using the
log4r package during logging when you run
load.project(). By default,
load_libraries: This can be set to ‘on’ or ‘off’. If
load_libraries is on, the system will load all of the R packages listed in the
libraries field described below. By default,
libraries: This is a comma separated list of all the R packages that the user wants to automatically load when
load.project() is called. These packages must already be installed before calling
load.project(). By default, the reshape, plyr, ggplot2, stringr and lubridate packages are included in this list.
as_factors: This can be set to ‘on’ or ‘off’. If
as_factors is on, the system will convert every character vector into a factor when creating data frames; most importantly, this conversion occurs when data is read in automatically. If ‘off’, character vectors will remain character vectors. By default,
data_tables: This can be set to ‘on’ or ‘off’. If
data_tables is on, the system will convert every data set loaded from the
data directory into a
data.table. By default,
attach_internal_libraries: This can be set to ‘on’ or ‘off’. If
attach_internal_libraries is on, then every time a new package is loaded into memory during
load.project() a warning will be displayed to report that this has happened. By default,
cache_loaded_data: This can be set to ‘on’ or ‘off’. If
cache_loaded_data is on, then data loaded from the
load.project() will be automatically cached (so it won’t need to be reloaded next time
load.project() is called). By default,
cache_loaded_data is on for newly created projects. Existing projects created without this configuration setting will default to off. Similarly, when
migrate.project() is called in those cases, the default will be off.
sticky_variables: This is a comma separated list of any project-specific variables that should remain in the global environment after a
clear() command. This can be used to clear the global environment, but keep any large datasets in place so they are not unnecessarily re-generated during
load.project(). Note that this will be overridden if the
force=TRUE parameter is passed to
clear(). By default,
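The settings above take three shapes: ‘on’/‘off’ switches, comma separated lists, and data_ignore entries that may be file names, directories or regular expressions. The base R sketch below shows one way such values could be read and interpreted. It assumes the settings are written as key: value pairs in DCF format. The sample values and the helper names is_on, split_list and classify_ignore are hypothetical, and this is not ProjectTemplate's own parsing code.

```r
# Hypothetical settings in "key: value" (DCF) form, for illustration only.
sample_settings <- c(
  "data_loading: on",
  "munging: on",
  "logging: off",
  "libraries: reshape, plyr, ggplot2, stringr, lubridate",
  "data_ignore: notes.csv, scratch/, /\\.tmp$/"
)
settings_file <- tempfile(fileext = ".dcf")
writeLines(sample_settings, settings_file)

# read.dcf() is base R and returns a character matrix, one column per key.
# Taking the first row gives a named vector, which becomes a named list.
settings <- as.list(read.dcf(settings_file)[1, ])

# Interpret an 'on'/'off' switch as a logical value.
is_on <- function(value) identical(tolower(value), "on")

# Split a comma separated value into trimmed entries.
split_list <- function(value) trimws(strsplit(value, ",", fixed = TRUE)[[1]])

# Classify a data_ignore entry using the rules described above:
# /.../ is a regular expression, a trailing / marks a directory,
# and anything else is treated as a file name.
classify_ignore <- function(entry) {
  if (grepl("^/.*/$", entry)) {
    "regex"
  } else if (endsWith(entry, "/")) {
    "directory"
  } else {
    "file"
  }
}

packages <- split_list(settings$libraries)
ignore_kinds <- vapply(split_list(settings$data_ignore),
                       classify_ignore, character(1))

print(is_on(settings$logging))
print(packages)
print(ignore_kinds)
```

One limitation of this simple approach is that a regular expression containing a comma would be split into two entries.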
The project specific configuration is specified in the
lib/globals.R file using the
add.config function. This will contain whatever is relevant for your project, and will look something like this:
    add.config(
        keep_data = FALSE,                   # should temporary data be kept?
        header = "Private & Confidential"    # header in reports
    )
Note that commas need to be present after each config item except the last. Comments can also be inserted to document what each config variable does.
To use project specific configuration in any
src script, simply use the form
ProjectTemplate will automatically load project specific content in
lib/globals.R before any other file in
lib, so the filename should not be changed.
The add.config() function can also be used anywhere in the project. So if a particular analysis in
src needs to override the value in
globals.R, you can simply add the relevant
add.config() command to the top of that script.
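The override pattern described above can be pictured with a small base R sketch. The config list here is a stand-in for the real global object, and add_config_sketch is a hypothetical helper that only imitates the described behaviour of add.config(): named values are added, and a later value replaces an earlier one. It is not ProjectTemplate's implementation, and reading a value as config$keep_data assumes that config behaves like an ordinary named list.

```r
# Stand-in for the global config object (illustrative values only).
config <- list(logging = FALSE, munging = TRUE)

# Hypothetical helper imitating the described behaviour of add.config():
# named values are merged into config, later values replacing earlier ones.
add_config_sketch <- function(config, ...) {
  modifyList(config, list(...))
}

# What lib/globals.R would contribute when the project is loaded.
config <- add_config_sketch(
  config,
  keep_data = FALSE,                   # should temporary data be kept?
  header = "Private & Confidential"    # header in reports
)

# What a single src script would do at its top to override one value.
config <- add_config_sketch(config, keep_data = TRUE)

print(config$keep_data)
print(config$header)
```

In a real project the two calls would be add.config() calls in lib/globals.R and at the top of the src script respectively. Values that are not overridden, such as header here, keep the setting from globals.R.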