About

ScholarConnect transforms faculty CVs into data

⚡️

Save Time

Automatically publish an up-to-date faculty profile using AI and your CV as a data source.

🤝

Connect & Collaborate

Enable industry, researchers, and community partners to find faculty, learn about their research, and connect through an AI faculty expertise search (coming soon).

🏆

Showcase Excellence

Highlight recent and high-impact publications and distinctions with the help of AI.

How to participate

ScholarConnect is a pilot within the School of Physical Sciences. All faculty within the school have the option to participate by opting in during the ScholarSteps merit & promotion process, or by emailing their CV to scholarconnect@uci.edu.

Step 1

Choose to participate

Academic appointees within the Physical Sciences pilot group have the option of participating in ScholarConnect. During their review within ScholarSteps, the faculty member or department analyst will upload an updated CV. Once the file is ready for department review, the faculty member is asked to certify that everything is complete and correct. Pilot users will see an additional option on this certification screen where they can express interest in participating in ScholarConnect.

Certification screen in ScholarSteps
User settings dropdown

If faculty change their mind later, they can always come back and change their preference from the new User Settings screen.

Not up for review? Email us your CV at scholarconnect@uci.edu to participate.

Step 2

Review your AI-generated profile

Pilot faculty who choose to participate will receive an email from the ScholarConnect team once their AI-generated faculty profile is ready for review. We invite you to email any suggestions for improvement or issue reports to scholarconnect@uci.edu. Our goal is to improve ScholarConnect so that generated profiles work well for most people. In an upcoming release we'll offer edit capabilities so faculty can adjust their profile as they see fit. And, if you're not happy with your generated profile, just let us know and we'll remove it.

How it works

After a faculty member chooses to participate as described above, their CV undergoes a multi-step process that transforms its PDF content into data within ScholarConnect:

  1. Extract PDF content: text, formatting, page number, and position information are extracted from the PDF.
  2. Extract sample headings: the first few section headings are identified using GPT-4o vision. Only the first 1-2 pages of the PDF are analyzed.
  3. Visual analysis: heading formatting, position, and other styling are analyzed to identify visual characteristics common across the sample headings.
  4. Identify all headings: all headings in the document are identified using the visual characteristics found in the previous step.
  5. Identify publications section: GPT-4o reviews the list of identified headings and helps pick out the main headings in the CV. Importantly, it identifies the publications section, including any books, conference proceedings, journal articles, etc.
  6. Extract bibliographic data: text content of the publications section is provided to GPT-4o, which transforms the unstructured text into structured data (JSON). Text is "chunked" and processed iteratively to stay within GPT-4o's output token limit.
  7. Extract other data: GPT-4o is given the entire CV document and extracts data such as the individual's name, current positions, degrees, distinctions, contact info, and professional society affiliations.
  8. Identify areas of expertise: GPT-4o is used once more to identify 10 areas of expertise.
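
To make steps 3 and 4 more concrete, here is a minimal Python sketch of finding all headings from a few samples. It is an illustration only, not the ScholarConnect implementation: the Span record, its fields, and the rule of matching on the single most common (font size, bold, left position) combination are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One run of text extracted from the PDF (hypothetical shape)."""
    text: str
    font_size: float
    bold: bool
    x: float   # left edge of the text on the page
    page: int


def style_of(span):
    """Reduce a span to the visual characteristics we compare on."""
    return (round(span.font_size, 1), span.bold, round(span.x))


def common_style(samples):
    """Return the style shared by the most sample headings."""
    return Counter(style_of(s) for s in samples).most_common(1)[0][0]


def find_headings(spans, samples):
    """Select every span whose styling matches the samples' common style."""
    target = common_style(samples)
    return [s for s in spans if style_of(s) == target]


# Usage with made-up data: two sample headings from page 1
# are enough to locate a third heading on page 3.
spans = [
    Span("EDUCATION", 14.0, True, 72.0, 1),
    Span("Ph.D., Chemistry", 11.0, False, 72.0, 1),
    Span("APPOINTMENTS", 14.0, True, 72.0, 1),
    Span("PUBLICATIONS", 14.0, True, 72.0, 3),
]
samples = [spans[0], spans[2]]
headings = find_headings(spans, samples)
print([h.text for h in headings])
# → ['EDUCATION', 'APPOINTMENTS', 'PUBLICATIONS']
```

An exact style match is the simplest possible rule. A real CV would likely need some tolerance, for example for headings whose font size or indentation varies slightly.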

The result is a large structured data file (JSON) representing most of the CV content. This data is then published to scholarconnect.uci.edu. We hope to further refine these steps with faculty input so that little editing is required.
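
The chunked extraction in step 6 might look roughly like the Python sketch below. This is a sketch under stated assumptions rather than the actual pipeline: chunk_lines, extract_publications, the character-based chunk size, and the "publications" and "citation" keys are hypothetical, and fake_extract stands in for the GPT-4o call so the example runs without any network access.

```python
import json


def chunk_lines(text, max_chars):
    """Split text into chunks of whole lines, starting a new chunk
    whenever adding the next line would exceed max_chars."""
    chunks, current, size = [], [], 0
    for line in text.splitlines():
        if current and size + len(line) + 1 > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        chunks.append("\n".join(current))
    return chunks


def extract_publications(text, extract_chunk, max_chars=4000):
    """Process the section one chunk at a time and merge the records."""
    records = []
    for chunk in chunk_lines(text, max_chars):
        records.extend(extract_chunk(chunk))
    return records


def fake_extract(chunk):
    """Stand-in for the model call (assumption): one record per line."""
    return [{"citation": line.strip()}
            for line in chunk.splitlines() if line.strip()]


section = (
    "Smith J. Paper A. 2021.\n"
    "Smith J. Paper B. 2022.\n"
    "Smith J. Paper C. 2023."
)
publications = extract_publications(section, fake_extract, max_chars=50)
print(json.dumps({"publications": publications}, indent=2))
```

Splitting on line boundaries keeps each citation intact within a chunk. Because each chunk yields a short list of records, no single response has to hold the whole publications section.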