Methodology

Network construction (full write-up coming in Phase 9):

  1. Resolve the institution name to an OpenAlex ID via the autocomplete endpoint.
  2. Pull every work at that institution within the chosen year range, capped at 30 authors per work.
  3. Match each authorship's raw affiliation string against the department query after normalizing both strings. A record counts as a match when rapidfuzz token_set_ratio >= 90, or when the substring fallback matches (needed for the "Molecular and Cellular Biology" case).
  4. Keep only candidates with at least three department-tagged authorship records, and cross-check that they appear in the OpenAlex Author affiliations entry within the chosen year range.
  5. Apply a PI tier filter: top 20% by weighted publication count OR at least one first/corresponding-author paper.
  6. Build the graph: nodes = surviving PIs, edges = co-authored works between them.
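
Steps 5 and 6 can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the Candidate record, the function names, and the sample data are all hypothetical. It assumes the weighted publication count has already been computed (the weighting scheme is not described above), that "top 20%" means the top ceil(0.2 * n) candidates by that count, and that an edge's weight is the number of co-authored works.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations
from math import ceil


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for an author who survived steps 1-4."""
    author_id: str
    weighted_count: float   # precomputed; the weighting scheme is an open assumption
    has_lead_paper: bool    # at least one first/corresponding-author paper


def pi_tier_filter(candidates, top_fraction=0.20):
    """Step 5: keep the top fraction by weighted count OR anyone with a lead paper."""
    if not candidates:
        return set()
    ranked = sorted(candidates, key=lambda c: c.weighted_count, reverse=True)
    # Assumption: round the cutoff up; ties at the cutoff follow sort order.
    n_top = ceil(top_fraction * len(ranked))
    top_ids = {c.author_id for c in ranked[:n_top]}
    return {c.author_id for c in candidates
            if c.author_id in top_ids or c.has_lead_paper}


def build_edges(works, pis):
    """Step 6: count co-authored works for every pair of surviving PIs.

    `works` maps a work ID to the list of author IDs on that work.
    Returns a Counter keyed by sorted (author_a, author_b) pairs.
    """
    edges = Counter()
    for author_ids in works.values():
        present = sorted(set(author_ids) & pis)  # drop non-PIs and duplicates
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Made-up sample data, for illustration only.
candidates = [
    Candidate("A1", 12.0, False),
    Candidate("A2", 3.0, True),
    Candidate("A3", 2.5, False),
    Candidate("A4", 1.0, False),
    Candidate("A5", 0.5, True),
]
works = {
    "W1": ["A1", "A2", "A3"],
    "W2": ["A1", "A2"],
    "W3": ["A2", "A5", "A4"],
}

pis = pi_tier_filter(candidates)
edges = build_edges(works, pis)
```

With this sample, A1 passes on rank alone, A2 and A5 pass through the first/corresponding-author clause, and A3 and A4 are dropped, so W3 contributes only the A2-A5 edge. Keying edges by a sorted pair keeps the graph undirected without storing each edge twice.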