The docstring is defined as the lines that start with #'. To avoid problems
when converting to .pdf and .html, the documentation should comply with
Pandoc's markdown syntax.
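As a rough illustration of the "lines that start with #'" rule, the base R sketch below collects such lines from a character vector and strips the marker. The helper name extract_docstring() and the sample input are hypothetical and used only for illustration; this is not necessarily how export_docs() is implemented.

```r
# Hypothetical helper, NOT part of stom: keep the lines that start with #'
# and strip the marker, leaving plain Pandoc markdown.
extract_docstring <- function(lines) {
  doc <- grep("^#'", lines, value = TRUE)  # keep only docstring lines
  sub("^#' ?", "", doc)                    # drop the marker and one optional space
}

# Made-up script content, for illustration only
src <- c(
  "#' ---",
  "#' title: Demo",
  "#' ---",
  "#'",
  "#' Some *markdown* text.",
  "x <- 1  # ordinary code and comments are ignored"
)

doc <- extract_docstring(src)
cat(doc, sep = "\n")
```

Because the extracted text is handed to pandoc as-is, anything written after the #' marker should already be valid Pandoc markdown, including the YAML metadata block at the top.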
To run export_docs_html() and export_docs_pdf(), pandoc must be available on
the command line. In addition, a LaTeX distribution (e.g. TeX Live) is
required for PDF compilation.
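One way to check these requirements before exporting is to look for the executables on the PATH with base R's Sys.which(). This is only a sketch: it assumes the PDF is compiled with pdflatex (pandoc's default PDF engine), which may not match how export_docs_pdf() actually invokes pandoc.

```r
# Sys.which() returns "" for programs that are not found on the PATH.
has_pandoc <- unname(nzchar(Sys.which("pandoc")))

# Assumption: pdflatex is the LaTeX engine in use (pandoc's default).
has_latex <- unname(nzchar(Sys.which("pdflatex")))

if (!has_pandoc) {
  message("pandoc not found on PATH: HTML and PDF export will likely fail.")
}
if (!has_latex) {
  message("pdflatex not found on PATH: PDF export will likely fail.")
}
```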
Usage
export_docs(path, outfp = NULL)
export_docs_pdf(path, outfp, style = "amsart")
export_docs_html(path, outfp)
Examples
(fin = system.file("cases", "wine_network", "wine.R", package = "stom"))
#> [1] "C:/Users/tomliao/AppData/Local/Programs/R/R-4.2.3/library/stom/cases/wine_network/wine.R"
docstring = export_docs(fin) # Return as character vector if `outfp=NULL`
cat(docstring, sep="\n")
#> ---
#> title: Joint Estimation of Interaction and Piped Effects in an Item Response Model
#> author: "Yongfu Liao"
#> date: "April 27, 2023"
#> ---
#>
#> Description
#> -----------
#>
#> The Directed Acyclic Graph (DAG) shown below represents the data generating
#> process of the IRT model of interest. It can be conceptualized as a wine
#> rating context, where the rating of wine quality ($R$) is influenced by
#> three factors:
#>
#> 1. $J$: Judge leniency
#> 2. $W$: Wine quality
#> 3. $O_w$: Wine origin
#>
#> Furthermore, it is assumed that $O_w$ differentially influences $R$
#> based on the levels of another variable $O_j$, the origin of the judge.
#> In stats jargon, there's an interaction between $O_w$ and $O_j$. In simpler
#> conceptual terms, consider the scenario that, for instance, French wines are
#> rated higher by French judges, in addition to the scores they should have
#> received based on their quality alone. The simulation code of this data
#> generating process is found in the `sim_dat()` function in `wine.R`.
#>
#> ![Underlying data-generating process of the IRT Model](dag)
#>
#> The Original Model and its Problem
#> ----------------------------------
#>
#> The specification of the original model is shown in the equations below.
#> A problem found in this model is that it cannot stably recover the wine
#> quality (`W`) and the interaction (`Int`) parameters.
#>
#> $$
#> \begin{aligned}
#> R & \sim Bernoulli( p ) \\
#> logit(p) &= W_{[Wid]} + J_{[Jid]} + Int_{[O_j, O_w]} \\
#> J & \sim Normal( 0, \sigma_J ) \\
#> \sigma_J & \sim Exponential(1) \\
#> \\
#> W_{[Wid]} & \sim Normal( a_{[O_w]}, \sigma_W ) \\
#> a & \sim Normal( 0, 1.5 ) \\
#> \sigma_W & \sim Exponential(1)
#> \end{aligned}
#> $$
#>
#>
#> Potential Causes
#> ----------------
#>
#> With some exploration on a simpler model (the response was modeled as normal
#> distributions generated from latent scores), it was found that the problem
#> seemed to arise from an identifiability issue: the model cannot correctly
#> attribute the right amount of effect to wine quality (which is influenced by
#> wine origin) and the direct (interactive) influence of wine origin on rating
#> scores. The parameter estimates float around case by case when different
#> configurations of the interaction are set in the simulation.
#>
#>
#> Fixes
#> -----
#>
#> As illustrated in `wine2_normal.stan` (Case 4 & 5), the problem can be fixed
#> by imposing additional constraints on the model. Two of them are imposed
#> here to correctly identify the parameters:
#>
#> 1. A sum-to-zero constraint on the effects of wine origin on wine quality.
#> That is, the effect through the path $O_w \rightarrow W$.
#> 2. One of the terms in the interaction matrix (`(2, 2)` in the case here) is
#> constrained to zero as the reference.
#>
#>
#> ToDo
#> ----
#>
#> Test whether the conclusion also holds with logit models (binary/ordinal
#> response models).
if (FALSE) { # \dontrun{
export_docs(fin, "docs.md")
export_docs_pdf(fin, "docs.pdf")
export_docs_html(fin, "docs.html")
} # }