OpenRLens
Methodology

Every signal on OpenRLens is computed from open scholarly data using reproducible logic. No black boxes — this page explains what each signal measures and why.

Data Sources

All data is sourced from publicly available, open scholarly databases. Nothing proprietary.

Publication Database: Papers, authors, affiliations, concepts, citations
Crossref: DOI metadata and funder acknowledgements
Europe PMC: Biomedical funder and grant metadata
NIH RePORTER: US NIH grant records
NSF Awards: US NSF grant records
UKRI: UK Research & Innovation grant records
OpenAIRE: EU Horizon and national funding metadata
CORE: Supplementary open-access full-text papers

Reading the signals

Higher is better

A high score is a positive indicator

Lower is better

A high score is a risk indicator

Informational

Context only — not scored in verdict

Signals are scored 0–1 and classified as High / Moderate / Low using fixed thresholds derived from the distribution across all institutions. Every signal is computed over the date window you select.

Structural Signals

ILS

Institutional Leadership Signal

Higher is better

How often the institution holds a leading authorship role (first, corresponding, or PI-equivalent last author) on its own papers.
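As a rough illustration, ILS can be sketched as the share of an institution's papers on which one of its authors is first, last, or corresponding author. The `Authorship` record and its field names are hypothetical, and treating every last author as PI-equivalent is a simplifying assumption, not the production rule.

```python
from dataclasses import dataclass

@dataclass
class Authorship:
    # Hypothetical record: one author slot on one paper.
    institution: str
    position: str               # "first", "middle", or "last"
    is_corresponding: bool = False

def leads_paper(authorships, institution):
    """True if the institution holds a first, last, or corresponding slot."""
    return any(
        a.institution == institution
        and (a.position in ("first", "last") or a.is_corresponding)
        for a in authorships
    )

def ils(papers, institution):
    """Share of the institution's own papers on which it holds a lead role."""
    own = [p for p in papers if any(a.institution == institution for a in p)]
    if not own:
        return 0.0
    return sum(leads_paper(p, institution) for p in own) / len(own)

papers = [
    [Authorship("Uni A", "first"), Authorship("Uni B", "last")],
    [Authorship("Uni B", "first"), Authorship("Uni A", "middle")],
    [Authorship("Uni B", "first"), Authorship("Uni A", "middle", True),
     Authorship("Uni C", "last")],
    [Authorship("Uni B", "first"), Authorship("Uni C", "last")],
]
ils_value = ils(papers, "Uni A")  # Uni A leads 2 of its 3 papers
```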

RCR

Research Concentration

Informational

Share of total citations held by the top 10% of recurring authors. High = a few stars dominate output.
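A minimal sketch of this concentration measure follows. It assumes "recurring" means an author with at least two papers in the window and that the denominator is the citations of recurring authors only; neither detail is stated above, and `min_papers` and `top_percent` are illustrative parameters.

```python
def rcr(author_stats, min_papers=2, top_percent=10):
    """Citation share held by the top `top_percent`% of recurring authors.

    author_stats maps author -> (paper_count, citation_count).
    """
    recurring = [cites for n_papers, cites in author_stats.values()
                 if n_papers >= min_papers]
    total = sum(recurring)
    if total == 0:
        return 0.0
    # Integer ceiling of len * percent / 100, with at least one author.
    top_n = max(1, (len(recurring) * top_percent + 99) // 100)
    top_cites = sorted(recurring, reverse=True)[:top_n]
    return sum(top_cites) / total

# One highly cited author, nine moderate ones, one non-recurring author.
author_stats = {"star": (12, 500)}
author_stats.update({f"author_{i}": (3, 50) for i in range(9)})
author_stats["visitor"] = (1, 40)   # ignored: only one paper

rcr_value = rcr(author_stats)       # 500 / 950
```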

CDS

Collaboration Dependency

Lower is better

How much output would drop if the top-3 external partner institutions were removed. High = research is not self-sustaining.
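One way to read this signal, sketched below, is the fraction of the institution's papers that involve any of its three most frequent external partners. Ranking partners by co-authored paper count and treating a paper as "lost" whenever a top partner appears on it are assumptions made for illustration.

```python
from collections import Counter

def cds(papers, institution, top_k=3):
    """Fraction of the institution's papers involving a top-k partner.

    papers is a list of sets of institution names, one set per paper.
    """
    own = [p for p in papers if institution in p]
    if not own:
        return 0.0
    partner_counts = Counter(
        inst for p in own for inst in p if inst != institution
    )
    top_partners = {inst for inst, _ in partner_counts.most_common(top_k)}
    lost = sum(1 for p in own if p & top_partners)
    return lost / len(own)

papers = [
    {"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"A", "D"},
    {"A", "E"}, {"A"}, {"A"}, {"A"},
]
cds_value = cds(papers, "A")  # 4 of 8 papers depend on a top-3 partner
```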

ASI

Affiliation Stability

Higher is better

Ratio of researchers active in ≥3 distinct years to those active in ≥2 distinct years. A proxy for faculty retention.
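The ratio can be sketched directly from each researcher's publication years. The input shape (a mapping from researcher to a list of years) is a hypothetical choice for the example.

```python
def asi(active_years):
    """Researchers active in >=3 distinct years divided by those in >=2."""
    distinct = [len(set(years)) for years in active_years.values()]
    two_plus = sum(1 for n in distinct if n >= 2)
    three_plus = sum(1 for n in distinct if n >= 3)
    return three_plus / two_plus if two_plus else 0.0

active_years = {
    "a": [2019, 2020, 2021],
    "b": [2019, 2019],            # one distinct year only
    "c": [2020, 2021],
    "d": [2018, 2019, 2020, 2021],
    "e": [2021],
}
asi_value = asi(active_years)  # 2 researchers with >=3 years, 3 with >=2
```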

STS

Output Stability

Higher is better

Consistency of annual publication volume. Computed as 1 − (std ÷ mean). Volatile output scores low.
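A short sketch of the formula above. Two details are assumptions: the population standard deviation is used, and the result is floored at 0 so that the score stays in the 0–1 range.

```python
from statistics import mean, pstdev

def output_stability(annual_counts):
    """1 - (std / mean) of yearly paper counts, floored at 0."""
    if not annual_counts:
        return 0.0
    avg = mean(annual_counts)
    if avg == 0:
        return 0.0
    return max(0.0, 1.0 - pstdev(annual_counts) / avg)

steady = output_stability([100, 110, 90, 100])
volatile = output_stability([10, 200, 5, 150])
```

With these inputs the steady series scores close to 1, while the volatile series scores close to 0.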

TC

Thematic Continuity

Higher is better

Year-on-year Jaccard similarity of the top-5 research concept labels. High = focused, stable research identity.
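For illustration, Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below averages it over adjacent years; averaging (rather than, say, taking the minimum) and the concept labels themselves are assumptions.

```python
def jaccard(a, b):
    """|a & b| / |a | b|, defined as 0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def thematic_continuity(top_concepts_by_year):
    """Mean Jaccard similarity between adjacent years' top concepts."""
    years = sorted(top_concepts_by_year)
    pairs = list(zip(years, years[1:]))
    if not pairs:
        return 0.0
    return sum(
        jaccard(top_concepts_by_year[y1], top_concepts_by_year[y2])
        for y1, y2 in pairs
    ) / len(pairs)

top5 = {
    2020: ["genomics", "oncology", "immunology", "proteomics", "virology"],
    2021: ["genomics", "oncology", "immunology", "proteomics", "ecology"],
    2022: ["genomics", "oncology", "immunology", "proteomics", "ecology"],
}
tc_value = thematic_continuity(top5)  # mean of 4/6 and 1.0
```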

Funding Signals

FPR

Funding Presence

Higher is better

Share of papers that acknowledge at least one external funder. A lower bound — open databases underreport funding.

FLS

Funding Leadership

Higher is better

Same as ILS but restricted to funded papers only. Does the institution lead the research it is funded for?

FDD

Funding Dependency Differential

Higher is better

FLS rate minus ILS rate. Positive = the institution leads a larger share of its funded work than of its overall output.

EFAS

Lead Role on Funded Papers

Lower is better

Funded papers with no institutional lead author ÷ total funded papers.
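The funding signals described so far can be derived from two flags per paper. The sketch below assumes each paper is reduced to a hypothetical `(is_funded, is_led)` pair, where `is_led` follows the ILS definition of a lead role.

```python
def funding_signals(papers):
    """Compute ILS, FPR, FLS, FDD and EFAS from (is_funded, is_led) pairs."""
    total = len(papers)
    funded = [led for is_funded, led in papers if is_funded]
    if total == 0 or not funded:
        return None  # not enough data to score
    ils = sum(led for _, led in papers) / total
    fpr = len(funded) / total
    fls = sum(funded) / len(funded)
    return {
        "ILS": ils,
        "FPR": fpr,
        "FLS": fls,
        "FDD": fls - ils,
        "EFAS": 1.0 - fls,  # funded papers without an institutional lead
    }

# 6 funded papers (4 led) and 4 unfunded papers (1 led).
papers = ([(True, True)] * 4 + [(True, False)] * 2
          + [(False, True)] * 1 + [(False, False)] * 3)
signals = funding_signals(papers)
```

Under these definitions EFAS is the complement of FLS, so the two always sum to 1.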

FTA

Funding-Theme Alignment

Higher is better

Jaccard similarity between concept labels in funded vs. unfunded papers. High = funded work stays on-topic.
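A minimal sketch of the comparison, assuming every concept label on a paper counts equally and that each paper is a hypothetical `(concepts, is_funded)` pair. Whether only the most frequent labels are compared is not specified above.

```python
def fta(papers):
    """Jaccard similarity of concept labels on funded vs. unfunded papers."""
    funded, unfunded = set(), set()
    for concepts, is_funded in papers:
        (funded if is_funded else unfunded).update(concepts)
    union = funded | unfunded
    return len(funded & unfunded) / len(union) if union else 0.0

papers = [
    ({"genomics", "oncology"}, True),
    ({"oncology", "immunology"}, True),
    ({"genomics", "bioinformatics"}, False),
    ({"oncology"}, False),
]
fta_value = fta(papers)  # 2 shared labels out of 4 distinct labels
```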

Mentorship & Diversity

MDI

Mentorship Diversity Index

Informational

Share of first authors in the recent half of the window who did not appear in the earlier half. High = new researchers are being elevated.
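As a sketch, the window can be split at its midpoint and the two sets of first authors compared. Two assumptions are made here: "appear" means appearing as a first author, and in a window with an odd number of years the middle year falls into the earlier half.

```python
def mdi(first_authors, start_year, end_year):
    """Share of recent-half first authors not seen in the earlier half.

    first_authors is a list of (year, author) pairs.
    """
    midpoint = (start_year + end_year) / 2
    earlier = {a for year, a in first_authors if start_year <= year <= midpoint}
    recent = {a for year, a in first_authors if midpoint < year <= end_year}
    if not recent:
        return 0.0
    return len(recent - earlier) / len(recent)

first_authors = [
    (2018, "ana"), (2019, "bo"),
    (2020, "ana"), (2020, "chen"), (2021, "dee"), (2021, "bo"),
]
mdi_value = mdi(first_authors, 2018, 2021)  # chen and dee are new: 2 of 4
```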

GBI

Gender Balance Index

Informational

Estimated gender balance among institutional first authors. Based on name inference only (~80–85% accuracy for Western names), so treat it as indicative.

The overall verdict

The verdict shown on each results page is a weighted composite of the structural and funding signals. MDI and GBI are informational and do not contribute to the score. Positive-polarity signals contribute positively; negative-polarity signals contribute inversely. No single signal dominates — a Strong verdict requires broad coverage across both structural and funding dimensions.
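The polarity handling can be illustrated with a small sketch. The equal weights used here are placeholders and not the actual coefficients, which are unpublished; the inversion of lower-is-better signals as `1 - score` is likewise an assumption about how "inversely" is applied.

```python
# +1 = higher is better, -1 = lower is better. Informational signals
# (RCR, MDI, GBI) are left out because they do not affect the verdict.
POLARITY = {
    "ILS": +1, "CDS": -1, "ASI": +1, "STS": +1, "TC": +1,
    "FPR": +1, "FLS": +1, "FDD": +1, "EFAS": -1, "FTA": +1,
}

def composite(scores, weights=None):
    """Weighted mean of 0-1 signal scores with polarity applied.

    `weights` defaults to equal weighting, a placeholder assumption.
    """
    if weights is None:
        weights = {name: 1.0 for name in POLARITY}
    weighted_sum = 0.0
    total_weight = 0.0
    for name, polarity in POLARITY.items():
        if name not in scores:
            continue
        value = scores[name] if polarity > 0 else 1.0 - scores[name]
        weight = weights.get(name, 0.0)
        weighted_sum += weight * value
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0

scores = {"ILS": 0.6, "CDS": 0.2, "RCR": 0.9, "MDI": 0.5}
composite_value = composite(scores)  # mean of 0.6 and (1 - 0.2)
```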

Exact weighting coefficients are not published — not to obscure the method, but because publishing fixed numbers creates incentives to optimise the score rather than genuine research quality.

Known limitations

Funding underreporting: FPR and related signals are lower bounds. Institutions with less-indexed national funders (common in Asia, Africa) will appear less funded than they are.
Author disambiguation: Common names may be merged across different people, or one person's record may be split. Affects RCR, ASI, and MDI.
Concept tagging: Research topics are inferred by ML classifiers from open scholarly databases. Niche or emerging fields may be tagged inconsistently, affecting TC and FTA.
Time window sensitivity: All signals are computed over the selected date range. Short windows produce high-variance results, so a window of 3 or more years is recommended.
Non-English literature: English-language papers are better indexed. Institutions publishing significantly in other languages may appear less productive than they are.

Question about a specific signal?

info@openrlens.com