# literature review for using arXiv as a corpus for analysis

Published 2020-05-31T02:44:00.001Z by Physics Derivation Graph

"Towards Machine-assisted Meta-Studies: The Hubble Constant"
https://arxiv.org/pdf/1902.00027.pdf
"an approach for automatic extraction of measured values from the astrophysical literature, using the Hubble constant for our pilot study. Our rules-based model – a classical technique in natural language processing – has successfully extracted 298 measurements of the Hubble constant, with uncertainties, from the 208,541 available arXiv astrophysics papers."

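To make the idea of a rules-based extractor concrete, here is a minimal Python sketch. It is not the model from the paper: the regular expression, the name `H0_PATTERN`, and the function `extract_h0` are assumptions made up for illustration, and the single rule only recognizes a value with a symmetric uncertainty written in a few common notations.

```python
import re

# Hypothetical rule (not taken from the paper): match text such as
#   "H0 = 73.2 +/- 1.7 km/s/Mpc"  or  "H_0 = 67.4 \pm 0.5 km s^{-1} Mpc^{-1}"
H0_PATTERN = re.compile(
    r"H_?0\s*=\s*(?P<value>\d+(?:\.\d+)?)"           # central value
    r"\s*(?:\+/-|\\pm|±)\s*(?P<error>\d+(?:\.\d+)?)"  # symmetric uncertainty
    r"\s*\$?\s*km\s*(?:/\s*s|s\^\{?-1\}?)"            # km/s or km s^-1
    r"\s*(?:/\s*Mpc|Mpc\^\{?-1\}?)"                   # /Mpc or Mpc^-1
)


def extract_h0(text):
    """Return a list of (value, uncertainty) pairs found in `text`."""
    return [
        (float(m.group("value")), float(m.group("error")))
        for m in H0_PATTERN.finditer(text)
    ]
```

Calling `extract_h0` on the text of an abstract returns one tuple per matched measurement and an empty list when nothing matches. A real extractor would need many more rules, for example for asymmetric uncertainties or values reported in tables.
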
"Scienceography: the study of how science is written" (2013)
https://arxiv.org/abs/1202.2638
https://arxiv.org/pdf/1202.2638.pdf
Focused on characterization: separates out the packages, comments, authors, and figures in the .tex source.
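
As a rough illustration of that kind of separation, the sketch below pulls the same four categories out of a `.tex` string with regular expressions. It is an assumption about how one might do this, not the method used in the paper; the function name `characterize_tex` is hypothetical, and the patterns ignore nested braces, `verbatim` environments, and other edge cases.

```python
import re


def characterize_tex(source):
    """Split a LaTeX source string into packages, comments, authors, figures."""
    # A comment starts at a '%' that is not escaped as '\%'.
    comments = [c.strip() for c in re.findall(r"(?<!\\)%(.*)", source)]
    # Remove comments so commented-out commands are not counted below.
    body = re.sub(r"(?<!\\)%.*", "", source)

    packages = []
    for match in re.finditer(r"\\usepackage(?:\[[^\]]*\])?\{([^}]*)\}", body):
        # One \usepackage may load several comma-separated packages.
        packages.extend(name.strip() for name in match.group(1).split(","))

    authors = re.findall(r"\\author(?:\[[^\]]*\])?\{([^}]*)\}", body)
    figures = re.findall(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]*)\}", body)

    return {
        "packages": packages,
        "comments": comments,
        "authors": authors,
        "figures": figures,
    }
```

Running `characterize_tex` over every source file in a corpus gives per-paper lists that can then be counted or compared across fields and years.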

"Transforming the arχiv to XML" (2008)
Kohlhase

"An Architecture for Recovering Meaning in a LaTeX to OMDoc Conversion" (2009)