Report
If you're using Pub Analyzer as an external library, you'll find pretty much everything you need here. This is where all the magic happens.
Suppose you already have an author model. Let's see how to generate a scientific production report for that author.
import asyncio
from pub_analyzer.internal.report import make_author_report
from pub_analyzer.models.author import Author
author = Author(**kwargs) # (1)!
report = asyncio.run(make_author_report(author=author)) # (2)!
1. Use real information instead of the `**kwargs` placeholder.
2. Functions are defined as asynchronous since their primary use occurs within a TUI context. We apologize for any inconvenience this may cause.
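If your own code already runs inside an event loop, you would normally `await` the coroutine instead of calling `asyncio.run`. The sketch below illustrates that pattern with a hypothetical stand-in coroutine, `fake_make_author_report`, used in place of `make_author_report` so the example runs without network access. The stand-in, its argument, and its return shape are assumptions for illustration only.

```python
import asyncio


async def fake_make_author_report(author: str) -> dict[str, str]:
    """Hypothetical stand-in for make_author_report (no network access)."""
    await asyncio.sleep(0)  # Yield control once, as an awaited request would.
    return {"author": author}


async def build_reports(authors: list[str]) -> list[dict[str, str]]:
    """Await several report coroutines concurrently from async code."""
    return await asyncio.gather(*(fake_make_author_report(a) for a in authors))


# From synchronous code, asyncio.run is still the single entry point.
reports = asyncio.run(build_reports(["A1", "A2"]))
```

`asyncio.gather` returns results in the order the coroutines were passed, so `reports` lines up with the input list.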
And that's it! Well, maybe you also want to export the report to a format like JSON. That's where Pydantic does its magic.
with open("report.json", mode="w", encoding="utf-8") as file:
    file.write(report.model_dump_json(indent=2, by_alias=True)) # (1)!
1. It is important to pass the `by_alias=True` parameter; otherwise you will not be able to import the report correctly using the Pub Analyzer models.
ta-da!
Early stages
In the early phases of the project, before Pub Analyzer existed as a TUI, the main goal was to emulate an Excel file. This file, based on input tables containing the works of an author and the works that reference them, categorized the types of citations. Later, the idea was expanded to encompass automating works retrieval. It was during this period that I stumbled across OpenAlex, and as they say, one thing led to another.
Functions to make reports.
FromDate module-attribute
DateTime marker for works published from this date.
ToDate module-attribute
DateTime marker for works published up to this date.
_add_work_abstract
Get the work abstract from `abstract_inverted_index` and insert it under the new key `abstract`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`work` | `dict[str, Any]` | Raw work. | required |
Returns:
Type | Description |
---|---|
`dict[str, Any]` | Work with new key `abstract`. |
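OpenAlex ships abstracts as an inverted index: a mapping from each word to the positions where it occurs. A minimal sketch of turning that mapping back into text is shown below. The helper names `rebuild_abstract` and `add_work_abstract` are hypothetical and are not the library's own code; the actual implementation may differ.

```python
from typing import Any, Optional


def rebuild_abstract(inverted_index: Optional[dict[str, list[int]]]) -> Optional[str]:
    """Rebuild plain text from a word -> positions mapping."""
    if not inverted_index:
        return None
    # Flatten to (position, word) pairs, then order them by position.
    positions = [(pos, word) for word, idxs in inverted_index.items() for pos in idxs]
    return " ".join(word for _, word in sorted(positions))


def add_work_abstract(work: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical equivalent of _add_work_abstract: insert the `abstract` key."""
    work["abstract"] = rebuild_abstract(work.get("abstract_inverted_index"))
    return work


raw_work = {"abstract_inverted_index": {"the": [0, 2], "cat": [1], "mat": [3]}}
work = add_work_abstract(raw_work)
empty_work = add_work_abstract({"abstract_inverted_index": None})
```

In this sketch, works without an inverted index get `abstract` set to `None` rather than raising.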
_get_author_profiles_keys
def _get_author_profiles_keys(
    author: Author,
    extra_profiles: list[Author | AuthorResult | DehydratedAuthor] | None,
) -> list[AuthorOpenAlexKey]
Create a list of profile IDs by joining the main author profile with the extra author profiles.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`author` | `Author` | Main OpenAlex author object. | required |
`extra_profiles` | `list[Author \| AuthorResult \| DehydratedAuthor] \| None` | Extra OpenAlex author objects related to the main author. | required |
Returns:
Type | Description |
---|---|
`list[AuthorOpenAlexKey]` | List of author OpenAlex keys. |
_get_authors_list
def _get_authors_list(
    authorships: list[Authorship],
) -> list[str]
Collect the OpenAlex IDs of the authors in a list of authorships.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`authorships` | `list[Authorship]` | List of authorships. | required |
Returns:
Type | Description |
---|---|
`list[str]` | Author key IDs. |
_get_citation_type
def _get_citation_type(
    original_work_authors: list[str],
    cited_work_authors: list[str],
) -> CitationType
Compare two lists of authors and return the citation type.
Based on the authors of a given work and the authors of another work that cites it, calculate the citation type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`original_work_authors` | `list[str]` | List of the authors of the evaluated work. | required |
`cited_work_authors` | `list[str]` | List of the authors of the citing document. | required |
Returns:
Type | Description |
---|---|
`CitationType` | Calculated citation type (Type A or Type B). |
Info
Type A: Citations from documents in which neither the evaluated author nor any of their co-authors appears among the authors of the citing document.
Type B: Citations made by the author or by one of the co-authors of the work being analyzed.
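The Type A / Type B rule reduces to checking whether the two author lists overlap. Here is a minimal sketch of that check, assuming authors are compared by their OpenAlex ID strings. The `CitationType` enum below is a stand-in defined for this example, and its member names are assumptions rather than the library's definitions.

```python
from enum import Enum


class CitationType(Enum):
    """Stand-in for the library's CitationType (member names are assumptions)."""

    TypeA = 0
    TypeB = 1


def get_citation_type(
    original_work_authors: list[str],
    cited_work_authors: list[str],
) -> CitationType:
    """Return Type B when the two author lists share at least one ID."""
    shared = set(original_work_authors) & set(cited_work_authors)
    return CitationType.TypeB if shared else CitationType.TypeA


independent = get_citation_type(["A1", "A2"], ["A3"])
self_cite = get_citation_type(["A1", "A2"], ["A2", "A4"])
```

Using sets keeps the comparison independent of author order and of duplicated IDs.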
_get_institution_keys
def _get_institution_keys(
    institution: Institution,
    extra_profiles: list[Institution | InstitutionResult | DehydratedInstitution] | None,
) -> list[InstitutionOpenAlexKey]
Create a list of profile IDs by joining the main institution profile with the extra institution profiles.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`institution` | `Institution` | Main OpenAlex institution object. | required |
`extra_profiles` | `list[Institution \| InstitutionResult \| DehydratedInstitution] \| None` | Extra OpenAlex institution objects related to the main institution. | required |
Returns:
Type | Description |
---|---|
`list[InstitutionOpenAlexKey]` | List of institution OpenAlex keys. |
_get_source async
def _get_source(client: AsyncClient, url: str) -> Source
Get source given a URL.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`client` | `AsyncClient` | HTTPX asynchronous client to be used to make the requests. | required |
`url` | `str` | URL of works with all filters. | required |
Returns:
Type | Description |
---|---|
`Source` | Source model. |
Raises:
Type | Description |
---|---|
`HTTPStatusError` | A response from the OpenAlex API had an HTTP error status (4xx or 5xx). |
_get_valid_works
Skip works that do not contain enough data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`works` | `list[dict[str, Any]]` | List of raw works. | required |
Returns:
Type | Description |
---|---|
`list[dict[str, Any]]` | List of raw works with enough data to pass the Works validation. |
Danger
Sometimes OpenAlex provides works with insufficient information to be considered. In response, we have chosen to exclude such works at this stage, thus avoiding the need to handle exceptions within the Model validators.
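The filtering step can be pictured as a simple pre-validation pass over the raw dictionaries. The sketch below is only an illustration: which fields count as "enough data" is not stated here, so `REQUIRED_KEYS` and the helper name `get_valid_works` are assumptions made for the example.

```python
from typing import Any

# Assumption: a work needs at least these keys to be usable.
REQUIRED_KEYS = ("id", "title")


def get_valid_works(works: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only works where every required key is present and not None."""
    return [w for w in works if all(w.get(k) is not None for k in REQUIRED_KEYS)]


raw_works = [
    {"id": "W1", "title": "A study"},
    {"id": "W2", "title": None},  # Dropped: title is null.
    {"id": "W3"},  # Dropped: title is missing.
]
valid = get_valid_works(raw_works)
```

Filtering before model construction means the models never see incomplete records, which matches the motivation given above.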
_get_works async
def _get_works(client: AsyncClient, url: str) -> list[Work]
Get all works given a URL.
Iterates over all pages of the URL.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`client` | `AsyncClient` | HTTPX asynchronous client to be used to make the requests. | required |
`url` | `str` | URL of works with all filters and sorting applied. | required |
Returns:
Type | Description |
---|---|
`list[Work]` | List of Work models. |
Raises:
Type | Description |
---|---|
`HTTPStatusError` | A response from the OpenAlex API had an HTTP error status (4xx or 5xx). |
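Iterating over all pages follows a common loop shape: request a page, collect its results, and continue until the API reports that nothing is left. The sketch below shows one way to do this with OpenAlex-style cursor paging (`meta.next_cursor`). It is synchronous and takes an injected `fetch_page` callable instead of an HTTPX client so that it runs offline. Whether `_get_works` uses cursor paging or page numbers is not stated here, so treat this as an illustration, not the library's implementation.

```python
from typing import Any, Callable, Optional

Page = dict[str, Any]


def collect_all_results(fetch_page: Callable[[str], Page]) -> list[dict[str, Any]]:
    """Follow `meta.next_cursor` until the API stops returning one."""
    results: list[dict[str, Any]] = []
    cursor: Optional[str] = "*"  # OpenAlex uses "*" to start cursor paging.
    while cursor is not None:
        page = fetch_page(cursor)
        results.extend(page["results"])
        cursor = page["meta"].get("next_cursor")
    return results


# Fake two-page API keyed by cursor, so the sketch needs no network access.
FAKE_PAGES: dict[str, Page] = {
    "*": {"meta": {"next_cursor": "c2"}, "results": [{"id": "W1"}, {"id": "W2"}]},
    "c2": {"meta": {"next_cursor": None}, "results": [{"id": "W3"}]},
}

works = collect_all_results(FAKE_PAGES.__getitem__)
```

In an async version the loop would stay the same, with `await` placed in front of the page request.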
make_author_report async
def make_author_report(
    author: Author,
    extra_profiles: list[Author | AuthorResult | DehydratedAuthor] | None = None,
    pub_from_date: FromDate | None = None,
    pub_to_date: ToDate | None = None,
    cited_from_date: FromDate | None = None,
    cited_to_date: ToDate | None = None,
) -> AuthorReport
Make a scientific production report by Author.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`author` | `Author` | Author for whom the report is generated. | required |
`extra_profiles` | `list[Author \| AuthorResult \| DehydratedAuthor] \| None` | List of author profiles whose works will be attached. | `None` |
`pub_from_date` | `FromDate \| None` | Filter works published from this date. | `None` |
`pub_to_date` | `ToDate \| None` | Filter works published up to this date. | `None` |
`cited_from_date` | `FromDate \| None` | Filter works that cite the author, published from this date. | `None` |
`cited_to_date` | `ToDate \| None` | Filter works that cite the author, published up to this date. | `None` |
Returns:
Type | Description |
---|---|
`AuthorReport` | Author's scientific production report model. |
Raises:
Type | Description |
---|---|
`HTTPStatusError` | A response from the OpenAlex API had an HTTP error status (4xx or 5xx). |
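To give a feel for how the author keys and the publication date bounds could translate into an OpenAlex query, here is a small sketch that builds a works `filter` value. The filter names `author.id`, `from_publication_date`, and `to_publication_date` come from the OpenAlex API. The helper `build_works_filter` is hypothetical, the dates are treated as plain `datetime` objects, and whether the library assembles its URLs this way is an assumption.

```python
from datetime import datetime
from typing import Optional


def build_works_filter(
    author_keys: list[str],
    pub_from_date: Optional[datetime] = None,
    pub_to_date: Optional[datetime] = None,
) -> str:
    """Build an OpenAlex `filter` value for works by any of the given authors."""
    parts = ["author.id:" + "|".join(author_keys)]  # "|" means OR in OpenAlex filters.
    if pub_from_date is not None:
        parts.append(f"from_publication_date:{pub_from_date:%Y-%m-%d}")
    if pub_to_date is not None:
        parts.append(f"to_publication_date:{pub_to_date:%Y-%m-%d}")
    return ",".join(parts)  # "," means AND.


works_filter = build_works_filter(["A1", "A2"], pub_from_date=datetime(2020, 1, 1))
unbounded_filter = build_works_filter(["A1"])
```

Leaving a bound as `None` simply omits that filter, which mirrors the `None` defaults in the signature above.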
make_institution_report async
def make_institution_report(
    institution: Institution,
    extra_profiles: list[Institution | InstitutionResult | DehydratedInstitution] | None = None,
    pub_from_date: FromDate | None = None,
    pub_to_date: ToDate | None = None,
    cited_from_date: FromDate | None = None,
    cited_to_date: ToDate | None = None,
) -> InstitutionReport
Make a scientific production report by Institution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`institution` | `Institution` | Institution for which the report is generated. | required |
`extra_profiles` | `list[Institution \| InstitutionResult \| DehydratedInstitution] \| None` | List of institution profiles whose works will be attached. | `None` |
`pub_from_date` | `FromDate \| None` | Filter works published from this date. | `None` |
`pub_to_date` | `ToDate \| None` | Filter works published up to this date. | `None` |
`cited_from_date` | `FromDate \| None` | Filter works that cite the institution, published from this date. | `None` |
`cited_to_date` | `ToDate \| None` | Filter works that cite the institution, published up to this date. | `None` |
Returns:
Type | Description |
---|---|
`InstitutionReport` | Institution's scientific production report model. |
Raises:
Type | Description |
---|---|
`HTTPStatusError` | A response from the OpenAlex API had an HTTP error status (4xx or 5xx). |