                        Changes in version 0.1.0                        

  - Actual beta release with proper number and Zenodo citation!
  - PDF conversion with convert() or convert_grobid() now defaults to
    the new GDPR-compliant server at TUE
  - Updates to the effect_size module
  - New report_app() to make a report with all default modules in a GUI
    by just uploading a PDF
  - Improvements to unit tests
  - Removed {fs} dependency and added custom path_sanitize()
  - Added internal functions to the website for developer reference
  - Updated the vignette on creating modules to explain validation text.

                     Changes in version 0.0.1.9001                      

  - extract_eq now catches "Hedges's g" (formerly just "g") and returns
    values ordered by paper_id, text_id and group_id
  - Updated xml_read_grobid() (an internal helper function for reading
    Grobid XMLs) to handle some stats better (e.g., "... g z =" is now
    read as "... gz = ")
  - Updated grobid XML read-in to better handle URLs with ? in the
    middle (less likely to cause an incorrect sentence split), and to
    remove no-content headers from the text table
  - Fixed some bibliography parsing problems with non-articles.
  - Updated psychsci for the read-in improvements.
  - retractionwatch database updated

                       Changes in version 0.0.1.0                       

Our beta release! We've made so many changes, and we're sure there are
still many bugs to catch and things to improve, but we need other people
to start using metacheck to help us.

                     Changes in version 0.0.0.9107                      

  - code_check now checks if code is parseable (thanks @Raphael-Merz!)
  - many new code_*() functions abstracted out from the code_check
    module. These may eventually move to a new package specifically for
    codecheck

                     Changes in version 0.0.0.9106                      

  - Added functions from svutils back in.
  - Reorganised some ML read-in functions (internal).
  - Further Ollama support in llm() and the vignette.
  - The code_check module now handles local files via the local_path
    argument
  - New local_files() function (thanks @lakens!)
  - Updated vignettes

                     Changes in version 0.0.0.9105                      

  - Much less buggy .grobid_to_bibr() conversion, handling URLs in text,
    xrefs, url, and eq tables better.
  - extract_equations() renamed to extract_eq() and now extracts degrees
    of freedom (df column)
  - Improvements to .tei_text() to fix common problems with grobid
    handling of equations (e.g., "")
  - Corresponding paper schema changes
  - Updated psychsci, demopaper() and demofile() for the new schema and
    read-in

                     Changes in version 0.0.0.9104                      

  - Updated file_types to fix a bug that prepended X to all extensions
    starting with a number.
  - paper_id() now returns a vector, not a table, fixing modules that
    used it that way
  - read() no longer errors when reading an empty directory, just
    messages and returns an empty paperlist
  - read() only reads in the .json version if a .json and .xml file with
    the same name exist
  - read() has a new argument recursive (default FALSE) to recursively
    read a directory. This does not handle it well if individual files
    have the same paper_id, so don't do that.

                     Changes in version 0.0.0.9103                      

  - converting grobid xml to bibr json now saves the file after each
    conversion, instead of at the end, making it better for large
    batches (although slightly less efficient by potentially duplicating
    crossref lookups shared between papers)
  - convert() has new arguments crossref_lookup (default FALSE) and
    keep_xml (default TRUE). It also saves XML and/or JSON files as they
    are converted, rather than at the end, in case of breaking failure.
  - Updated the "open_practices" module, which is much faster than the
    ODDPub version of this module (about 40x faster), also returns open
    materials and registrations, and has a lower false negative rate,
    but also a higher false positive rate. This removes the oddpub
    dependency.
  - Restructured file names (not function names) for functions so all
    archive helpers (e.g., osf, github, zenodo) start with "archive-"
    and database helpers (e.g., pubpeer, retractionwatch) start with
    "db-".
  - Restructured text functions to start with text_, so search_text() is
    now text_search() and expand_text() is now text_expand(). The old
    names will exist as aliases.
  - Internal functions are now prefixed with . to make it clearer for
    developers.
  - All {archive}_retrieve() functions now renamed to {archive}_info()
    and the old {archive}_info() internal functions are now
    .{archive}_info()

                     Changes in version 0.0.0.9102                      

  - Shiny app improvements: you can now view HTML reports in the browser
  - Fixes the "prereg_check" module to address an error when there are
    more than 10 OSF registrations in a batch that caused unmergeable
    data frames.
  - Fixes the "code_check" module to address an error when checking
    multiple files that have no repositories with code.
  - The module "code_check" now has an argument "file_limit" to control
    how many code files per repo are downloaded and processed. The
    default is 20.
  - Fixed a problem where invisible figures in grobid would mess up the
    text section ids

                     Changes in version 0.0.0.9101                      

  - metacheck_app(), the shiny app, is back!
  - grobid_convert() now reads in the url table more accurately
  - extract_urls() uses a simplified regex that seems better at catching
    full URLs
  - updated FLoRA and rw databases
  - osf_links(), rb_links(), github_links() and aspredicted_links()
    simplified to use the more accurate url table instead of a full text
    search.

                     Changes in version 0.0.0.9100                      

  - So many updates to fix things that broke with the new structure
  - Using httptest2 to mock tests that access external APIs

                     Changes in version 0.0.0.9070                      

  - Major updates to replace grobid functions with bibr
  - Remove author_table(), as this is just concat_tables() now

                     Changes in version 0.0.0.9069                      

  - Updated osf_* and rb_* functions to use progress bars instead of
    messages
  - New logging functions: logger() and lastlog() inspired by
    @levibaruch
  - New test_paper() for creating paper objects with specific test text
  - summarize_contents() changed to file_category() and now works with a
    vector of file names, as well as a data frame
  - compare_tables(), text_features() and distinctive_words() now
    deprecated
  - validate() function simplified

                     Changes in version 0.0.0.9068                      

  - FReD replication database and associated functions now renamed to
    FLoRA()
  - Various bug fixes discovered when running modules on large numbers
    of papers (e.g., handling when zero references have DOIs)
  - Modules "function_check" and "coi_check" reverted to the
    rtransparent versions (the re-written versions were overinclusive
    and need more development).

                     Changes in version 0.0.0.9067                      

  - reports() now takes a paperlist and makes a report from each
  - New report_module_run() and report_qmd() break down the report()
    function to allow separation of module output lists and creation of
    QMD report from them (might be changed to internal functions).
  - Ability to select returned columns in crossref_query()
  - Module "ref_accuracy" now returns info for references with missing
    DOIs that were found by ref_doi_check
  - Module "code_check" split into "repo_check" and "code_check"

                     Changes in version 0.0.0.9066                      

  - llm() allows you to set the model to any provider or provider/model
    supported by ellmer (must have appropriate *****_API_KEY set in your
    Renviron)
  - llm() arguments have changed to align with ellmer::chat() arguments
  - llm_models() now returns models from all platforms for which you
    have a valid API key set
  - The power module uses a new prompt that utilises a JSON schema for
    power
  - Updated report styles

                     Changes in version 0.0.0.9065                      

  - New github_links() function to find github references in a paper.
  - code_check module very much improved - checks SAS and STATA code in
    OSF, researchbox, and github repos.
  - power module much improved
  - New modules: coi_check, funding_check
  - New functions extract_p_values() and extract_urls(), so now no need
    to use all_p_values and all_urls modules to get their tables. These
    modules remain because they are used in demos, but may be deprecated
    soon.

                     Changes in version 0.0.0.9064                      

  - Enhanced module help
  - "ref_replication" module no longer warns about replications if you
    have cited them.
  - Extensive changes to clean up tests.

                     Changes in version 0.0.0.9063                      

  - get_doi() has been removed in favour of crossref_query(), to look up
    crossref info by bibliographic query, and crossref_doi(), to look up
    crossref info by DOI.
  - scroll_table() changed arguments. height is removed and scroll_above
    changed to maxrows. It now paginates above maxrows (default = 2),
    rather than scrolling within a fixed height. This is a more
    accessible solution, since scrolling is hard with touchscreens and
    it's often hard to copy text in a scroll window. We will continually
    improve this with further user feedback.
  - Fixed a bunch of small problems with modules and let the report
    render even with errors
  - Updated the report template with light and dark themes (set to user
    preference)
  - The module reference_check is split into ref_doi_check and
    ref_accuracy.
  - Lots of modules got renamed so they have a consistent format.

                     Changes in version 0.0.0.9062                      

  - json_expand() updated to handle LLM JSON errors more gracefully.
  - You can pass arguments to modules via report() now with the new args
    argument.
  - New get_prev_outputs() module helper function
  - Updated the vignettes.
  - Modules aspredicted and retractionwatch are removed, as they are
    superseded by prereg_check and reference_check.
  - The module nonsignificant_pvalue has changed to nonsig_p
  - The default modules in a report have changed.
  - A new module report helper, format_ref() for displaying references
    in bibentry or bibtex formats
  - The ref column of the bib table in paper objects is now the bibentry
    for a reference, not just the formatted text. This will allow for
    more formatting options.

                     Changes in version 0.0.0.9061                      

  - Efficiency improvements to the OSF functions
  - Fixed some confusing parts of the articles that changed when the
    module output report structure changed.
  - Modules are now categorised by section: general, intro, method,
    results, discussion, reference
  - Reports are organised by section
  - Display improvement in reports
  - Module report improvement (e.g., fixing broken links)
  - New example report on the pkgdown website

                     Changes in version 0.0.0.9060                      

  - Lots of changes for how reports are formatted
  - In module output, summary is now summary_table
  - Fixed a bug where some .docx files wouldn't read in (support for
    Word files is still patchy -- ideally render to PDF)
  - New pubpeer_comments() function (now vectorised)
  - Module helpers: scroll_table(), collapse_section(), link(),
    plural(), pb()

                     Changes in version 0.0.0.9059                      

  - Package name changed to metacheck!
  - Fixed a bug in osf_file_download() when multiple files have the same
    name and ignore_folder_structure = TRUE.
  - osf_file_download() should handle errors more gracefully (warning
    rather than failing)

                     Changes in version 0.0.0.9058                      

  - openalex() results now include abstract, which parses the
    abstract_inverted_index for you
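
For readers unfamiliar with the inverted-index format, here is a minimal
sketch of the idea in base R. It assumes the index is a named list that
maps each word to the zero-based positions where it occurs.
rebuild_abstract() is a hypothetical helper for illustration, not the
function openalex() uses internally.

```r
# Hypothetical helper (not the package's implementation): rebuild abstract
# text from an inverted index, i.e. a named list of word -> positions.
rebuild_abstract <- function(inverted_index) {
  if (length(inverted_index) == 0) return(NA_character_)
  # repeat each word once per position it occupies
  words <- rep(names(inverted_index), lengths(inverted_index))
  positions <- unlist(inverted_index, use.names = FALSE)
  # put the words back in reading order and join them
  paste(words[order(positions)], collapse = " ")
}

# Made-up example index, for illustration only
idx <- list(Papers = 0L, are = 1L, checked = c(2L, 5L),
            and = 3L, then = 4L, again = 6L)
abstract <- rebuild_abstract(idx)
```

A word that occurs more than once (here "checked") simply lists several
positions, which is why the words are repeated before sorting.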

                     Changes in version 0.0.0.9057                      

New functions/modules

  - New module: miscitation to detect commonly mis-cited papers (a
    proof-of-concept)
  - New module: power to detect and classify power analyses (currently
    being validated)
  - New module: aspredicted to get structured data from AsPredicted
    preregistrations (mainly for info)
  - module_template() creates a module file from a template
  - orcid_person() gets details from an ORCiD, such as name, emails,
    country
  - osf_preprint_list() returns a table of preprints from the OSF
    optionally filtered by archive and dates created or modified
  - Added an API wrapper - it is now possible to run papercheck
    functions and modules via a REST API. See inst/plumber/README.md for
    details.
  - Added documentation and plumber/Docker quickstart for the API

Changes

  - Changes to module_find() to find potential modules in the working
    directory and ./modules/
  - Changes to effectsize module so text of the potential effect size is
    given in mod_output$table$es (mod_output$summary$ttests_n and
    mod_output$summary$Ftests_n columns removed, as they are just the
    sum of *tests_with_es and *tests_without_es)
  - pdf2grobid() now gives more useful information in the warning if
    some files do not convert when converting more than one PDF
  - Changed parameter names in pdf2grobid to be consistently snake_case
    (consolidate_headers etc.) whilst keeping backward compatibility for
    the old camelCase (consolidateHeaders etc.)

Bug Fixes

  - Fixed warning messages in osf_check module when there are no OSF
    links
  - Fixed a problem in module_report() that happens when the table
    returned from module_run() has no rows
  - Fixed a bug that crashed stat_table() function by generating a
    summary table in case of empty stat table

                     Changes in version 0.0.0.9056                      

  - If expand_text() doesn't find a text match because sentence location
    info is missing, it now returns the original text instead of NA
  - Fixed a bug that prevented matching xrefs sentences under some
    circumstances (when there was an initial with a full stop in the
    citation) -- re-run read() on XMLs to update any saved paper objects
  - psychsci updated for these fixes
  - Changed retractionwatch internal data to retractionwatch() function
    (alias rw()) to support user updating.
  - Added new function rw_date() so you can find out when
    retractionwatch was last updated
  - New function rw_update() lets you update retractionwatch yourself

                     Changes in version 0.0.0.9055                      

  - pdf2grobid() handles save_path better if any path components don't
    exist yet. The argument save_path also now can take a vector of the
    same length as the number of PDFs to convert, so you can specify the
    name of each output XML.
  - read() now skips any imports with errors and warns you about them
    after importing all files
  - Fixed a bug that errored on read() when bibentry files don't format
    correctly
  - Function osf_get_all_pages() now has a new argument page_end to
    limit the number of pages retrieved (mainly for testing purposes),
    and is external (previously internal)
  - Fixed a bug in osf_files() that failed on paths with spaces
  - Fixed a bug in read() that duplicated entries in xrefs

                     Changes in version 0.0.0.9054                      

  - osf_file_download() now also retrieves files from linked storage
  - Removed the last dependency to {osfr} and updated osf_check_id() to
    return expected IDs from various URLs
  - OSF functions added to getting started vignette
  - Functions that require an API are now tested using httptest
  - module_list() doesn't fail if there are any errors in the modules

                     Changes in version 0.0.0.9053                      

  - Updated read() to parse more stupid date formats that turn up in the
    submission string (and added the unparsed submission string back
    just in case)
  - Completely overhauled how paper objects handle references.
      - the paper$reference table is now paper$bib
      - the paper$citations table is now paper$xrefs and also contains
        information for internal cross-references to figures, tables,
        footnotes, and formulae
      - the ref_id and bib_id in both tables is now xref_id
      - the xrefs table also contains location information (section,
        div, p, s) for the sentence containing the cross-ref, so you can
        use expand_text()
      - The read() function now returns paper objects with these new
        tables, so you will need to re-read any XML files (if you have
        stored the papercheck list as Rdata)
      - The psychsci object has been updated for this new format
      - Modules and vignettes have been updated as well

                     Changes in version 0.0.0.9052                      

  - Fixed a bug in expand_text() where expanded sentences were
    duplicated if there are multiple matches from the same sentence in
    the data frame.
  - Updated the retractionwatch table
  - Fixed a bug in read() that omitted paper DOIs from paper$info
  - Updated read() to add correctly parsed "accepted" and "received"
    dates to paper$info (replaces paper$submission string) (ISO 8601 is
    the only correct date format!)
  - Updated psychsci for new info structure
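
As an illustration of normalising dates to ISO 8601, here is a minimal
base R sketch. The candidate input formats and the parse_to_iso() name are
assumptions for the example; they are not the formats or code that read()
uses.

```r
# Hypothetical helper: try a few candidate formats in turn and return the
# first successful parse as an ISO 8601 (YYYY-MM-DD) string.
parse_to_iso <- function(x, formats = c("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")) {
  for (fmt in formats) {
    d <- as.Date(x, format = fmt)  # NA when the string does not fit fmt
    if (!is.na(d)) return(format(d, "%Y-%m-%d"))
  }
  NA_character_  # nothing matched: leave the value missing
}

iso_slash <- parse_to_iso("03/02/2021")
iso_dot   <- parse_to_iso("3.2.2021")
iso_bad   <- parse_to_iso("not a date")
```

The order of the formats matters for ambiguous strings, so a real
implementation would need to decide how to treat day-first versus
month-first input.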

                     Changes in version 0.0.0.9051                      

  - Small bug fixes to osf_file_download()
  - osf_file_download() now returns a table of file info, including info
    for files not downloaded because of file size limits

                     Changes in version 0.0.0.9050                      

  - Added read() function, which supersedes read_grobid(),
    read_cermine() and read_text() (they are still available, but are
    now just aliases to read()). This should work with XML files in TEI
    (grobid), JATS APA-DTD, NLM-DTD and cermine formats, plus full
    text-only parsing of .docx and plain text files.
  - Added osf_file_download() function, which downloads all files under
    a project or node and structures them the same as the project.

                     Changes in version 0.0.0.9049                      

  - Updated read_grobid() to classify headers as intro, method, results,
    discussion with better accuracy (to handle garbled headers)
  - Updated pdf2grobid() to allow some grobid parameters
  - Updated the module "all_p_values" to handle more scientific notation
    formats

                     Changes in version 0.0.0.9048                      

  - Functions to check ResearchBox.org (rbox_links() and
    rbox_retrieve()) -- very preliminary
  - The module "all_p_values" now returns the p-value as a numeric
    column p_value and the comparator as p_comp, like "exact_p"

                     Changes in version 0.0.0.9047                      

  - fixed some bugs in osf and aspredicted functions (mainly around
    dealing with private or empty projects)
  - added rvest dependency for better webpage parsing
  - changed name of resulting column from summarize_contents() from
    best_guess to file_category

                     Changes in version 0.0.0.9046                      

  - New aspredicted_links() and aspredicted_retrieve() functions
  - New related blog post
  - General bug fixes in newer stuff
  - Updated license to AGPL (GNU Affero General Public License)

                     Changes in version 0.0.0.9045                      

  - When reading a paper with read_grobid(), the paper$references table
    now contains new columns for bibtype, title, journal, year, and
    authors to facilitate reference checks, and more reliably pulls
    DOIs.
  - The psychsci set has been updated for the new reference tables
  - fixed bug in info_table() where adding "id" to the items argument
    borked the id column
  - Added json_expand() function to expand JSON-formatted LLM responses
  - Updated the LLM examples in the vignettes
  - Added find_project argument to osf_retrieve() to make searching for
    the parent project optional (it takes 1+ API calls)
  - Added emojis for convenience

                     Changes in version 0.0.0.9044                      

  - Revised the OSF functions again!
  - Organised the Reference section of the website
  - Added some blog posts to the website
  - Upgraded the "osf_check" module to give more info

                     Changes in version 0.0.0.9043                      

  - Totally re-wrote the OSF functions

                     Changes in version 0.0.0.9042                      

  - New OSF functions and vignette
  - Build pkgdown manually

                     Changes in version 0.0.0.9041                      

  - Fixed a bug in validate() that returned incorrect summary stats if
    the data type of an expected column didn't match the data type of an
    observed column (e.g., double vs integer)
  - Combined the two effect size modules into "effect_size"
  - Renamed the module "imprecise_p" to "exact_p" (I keep typo-ing
    "imprecise")
  - Added a loading message
  - Added code coverage at
    https://app.codecov.io/gh/scienceverse/papercheck
  - updated "all_p_values" to handle unicode operators like ≤ or ≫

                     Changes in version 0.0.0.9040                      

  - Updated default llm model to llama-3.3-70b-versatile (old one is
    being deprecated in August)
  - Updated reporting function for modules to show the summary table
  - Fixes a bug in validate() that returned FALSE for matches if the
    expected and observed results were both NA
  - Added two preliminary modules: "effect_size_ttest" and
    "effect_size_ftest"
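
The NA issue above is a general R pitfall: NA == NA evaluates to NA, not
TRUE. A minimal sketch of an NA-aware comparison follows; na_safe_match()
is a hypothetical name and is not taken from the validate() source.

```r
# Hypothetical helper: element-wise comparison that counts two NAs as a match.
na_safe_match <- function(expected, observed) {
  both_na <- is.na(expected) & is.na(observed)
  # FALSE & NA is FALSE, so pairs with one NA never count as a match
  same <- !is.na(expected) & !is.na(observed) & expected == observed
  both_na | same
}

matches <- na_safe_match(c(1, NA, 2, NA), c(1, NA, 3, 4))
```

Here the first two pairs match (equal values, then both NA) and the last
two do not.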

                     Changes in version 0.0.0.9039                      

  - removed the llm_summarise module
  - updated papercheck_app() to show all modules
  - removed the LLM tab from the shiny app
  - fixed a bug in pdf2grobid() where a custom grobid_url was not used
    in batch processing
  - psychsci object updated to use XMLs from grobid 0.8.2, which fixes
    some grobid-related errors in PDF import

                     Changes in version 0.0.0.9038                      

  - validate() function is updated for the new module structure
  - the validation, metascience, and text_model vignettes are updated
  - modules can now use relative paths (to their own location) to access
    helper files

                     Changes in version 0.0.0.9037                      

  - The way modules are created has been majorly changed -- it is now
    very similar to R package functions, using roxygen for
    documentation, instead of JSON format. There is no longer a need to
    distinguish text search, code, and LLM types of modules, they all
    use code. The vignettes have been updated to reflect this.
  - Modules now return a summary table that is appended to a master
    summary table if you chain modules like psychsci |>
    module_run("all_p_values") |> module_run("marginal")
  - The validate() function is temporarily removed to adapt the workflow
    to the new summary tables.
  - new module_help() function and some help/examples in modules
  - new module_info() helper function
  - new paperlist() function to create paper list objects
  - paper lists now print as a table of IDs, titles, and DOIs
  - updated read_grobid() to have fewer false positives for citations
  - updated retractionwatch

                     Changes in version 0.0.0.9036                      

  - Now reads in grobid XMLs that have badly parsed figures

                     Changes in version 0.0.0.9035                      

  - updated the shiny app for recent changes

                     Changes in version 0.0.0.9034                      

  - openalex() takes paper objects, paper lists, and vectors of DOIs as
    input, not just a single DOI
  - fixed paper object naming problem when nested files are not all at
    the same depth

                     Changes in version 0.0.0.9033                      

  - added read_cermine() and associated internal functions for reading
    cermine-formatted XMLs

                     Changes in version 0.0.0.9032                      

  - New functions for exploring github repositories: github_repo(),
    github_readme(), github_languages(), github_files(), github_info()
  - A new vignette about github functions

                     Changes in version 0.0.0.9031                      

  - read_grobid() now includes figure and table captions, plus
    footnotes, in the text table
  - the psychsci paper list object is updated to include the above
  - The functions that module_run() delegates to now check and only pass
    valid arguments

               Changes in version 0.0.0.9030 (2025-03-01)               

  - modules are now updated for clearer output, and a new module
    vignette has been added
  - llm() no longer returns NA when the rate limit is hit, but slows
    down queries accordingly
  - read_grobid() now includes back matter (e.g., acknowledgements, COI
    statements) in the text, so is searchable with search_text()
  - references are now converted to bibtex format, so are more complete
    and consistent
  - Machine-learning module types are removed (the python/reticulate
    setup was too complex for many users), and instructions for how to
    create simple text feature models are included in the metascience
    vignette

               Changes in version 0.0.0.9029 (2025-02-26)               

  - added author_table() to get a dataframe of author info from a list
    of paper objects
  - fixed a bunch of tests now that multiple matches in a sentence are
    possible
  - added back text (acknowledgements, annex, funding notes) to the text
    of a paper
  - Fixed a bug in search_text() that omitted duplicate matches in the
    same sentence when using results = "match"
  - Upgraded the search string for the "all-p-values" module to not
    error when a numeric value is followed by "-"
  - Error catching for stats() related to the above problem (and filed
    an issue on statcheck)
  - URLs in grobid XML are now converted to "" using the source url, not
    the text url, which is often mangled

               Changes in version 0.0.0.9028 (2025-02-18)               

  - added psychsci dataset of 250 open access papers from Psychological
    Science
  - added "all" option to the return argument of search_text()
  - added info_table() to get a dataframe of info from a list of paper
    objects
  - experimental functions for text prediction: distinctive_words() and
    text_features()

               Changes in version 0.0.0.9027 (2025-02-07)               

  - Removed ChatGPT and added groq support
  - Updated llm() and associated functions like llm_models()
  - Working on div vs section aggregation for search_text()

               Changes in version 0.0.0.9026 (2025-02-06)               

  - metascience and batch vignettes
  - removed scienceverse as a dependency
  - revised validation functions
  - added tl_accuracy()

               Changes in version 0.0.0.9025 (2025-02-04)               

  - Added expand_text()

               Changes in version 0.0.0.9024 (2025-01-31)               

  - Added validate() function and vignette