StackExchange
| Property | Value |
|---|---|
| Agent Source token | a8affc7d-9395-4f1f-a1fd-d00cfbdfa718 |
| Consumes Artifacts | stackexchange-sites, domain-list |
| Subject coverage | Questions and answers on all StackExchange sites |
| Object coverage | All DOIs, all Article Landing Pages |
| Data contributor | StackExchange |
| Data origin | Questions and answers on StackExchange sites |
| Freshness | Every few days |
| Identifies | Linked DOIs, unlinked DOIs, landing page URLs |
| License | Creative Commons Attribution-ShareAlike 4.0 International |
| Looks in | Text of questions and answers |
| Name | StackExchange |
| Operated by | Crossref |
| Produces Evidence Records | Yes |
| Produces relation types | discusses |
| Source ID | stackexchange |
| Updates or deletions | None expected |
What it is
StackExchange is a network of 'question and answer' sites. It originated with StackOverflow.com and has grown into a large network of sites, each covering a specific subject area. Users post questions and other users post answers. When a question or an answer includes a link to registered content, we detect it.
What it does
The StackExchange Agent aims to cover all StackExchange sites. To do this, it runs two processes. The first covers sites where we expect to find links, and scans every few days. The second covers all other StackExchange sites, and scans every month or two.
Regular scan:
- Retrieves the list of StackExchange sites we're interested in from the `stackexchange-sites` Artifact.
- Scans every article landing page domain in the `domain-list`, including `doi.org`.
- For each site and each domain, makes a search query for questions and answers on that site that mention that domain.
- For each link found, attempts to match it to a DOI.
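The site-by-domain loop described above can be sketched as follows. This is a minimal illustration, not the Agent's actual code: `SearchQuery`, `plan_queries`, and the sample site and domain names are hypothetical, and the real search call against StackExchange is deliberately left out.

```python
from itertools import product
from typing import Iterable, Iterator, NamedTuple


class SearchQuery(NamedTuple):
    """One search to run: a StackExchange site plus a domain to look for."""
    site: str
    domain: str


def plan_queries(sites: Iterable[str], domains: Iterable[str]) -> Iterator[SearchQuery]:
    """Yield one query per (site, domain) pair, always covering doi.org."""
    # Deduplicate while preserving order, and make sure doi.org is included.
    all_domains = list(dict.fromkeys([*domains, "doi.org"]))
    for site, domain in product(sites, all_domains):
        yield SearchQuery(site=site, domain=domain)


# Hypothetical inputs standing in for the two Artifacts.
sites = ["chemistry", "physics"]
domains = ["example.com", "doi.org"]

queries = list(plan_queries(sites, domains))
for q in queries:
    print(f"search {q.site} for posts mentioning {q.domain}")
```

The number of queries grows as sites multiplied by domains, which is one plausible reason for scanning the less promising sites far less often.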
Full scan:
- Retrieves the full list of all StackExchange sites from the StackExchange API.
- Scans every article landing page domain in the `domain-list`, including `doi.org`.
- For each site and each domain, makes a search query for questions and answers on that site that mention that domain.
- For each link found, attempts to match it to a DOI.
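The final step of both scans, matching a link to a DOI, is simplest when the link already points at a DOI resolver. The sketch below handles only that case. It is an illustration under assumptions: `doi_from_link` and the `DOI_RESOLVERS` set are hypothetical names, and landing page URLs (which need a lookup rather than string handling) simply return `None` here.

```python
from typing import Optional
from urllib.parse import unquote, urlparse

# Assumed set of resolver hosts; the Agent may recognise others.
DOI_RESOLVERS = {"doi.org", "dx.doi.org"}


def doi_from_link(url: str) -> Optional[str]:
    """Return the DOI embedded in a resolver URL, or None if there is none."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in DOI_RESOLVERS:
        # A landing page URL: matching it needs a lookup, not shown here.
        return None
    # The DOI is the URL path without its leading slash, percent-decoded.
    doi = unquote(parsed.path.lstrip("/"))
    # Every DOI starts with "10." and has a prefix/suffix separator.
    if not doi.startswith("10.") or "/" not in doi:
        return None
    return doi


print(doi_from_link("https://doi.org/10.1063/1.1792071"))
print(doi_from_link("https://example.com/article/1"))
```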
Example Event
{
  "license": "https://creativecommons.org/licenses/by-sa/4.0/",
  "obj_id": "https://doi.org/10.1063/1.1792071",
  "source_token": "a8affc7d-9395-4f1f-a1fd-d00cfbdfa718",
  "occurred_at": "2015-12-01T10:15:35Z",
  "subj_id": "https://chemistry.stackexchange.com/a/41547",
  "id": "00185922-5dee-40ad-9726-8fb7c2cf1746",
  "evidence_record": "https://evidence.eventdata.crossref.org/evidence/20170413-stackexchange-b22762bf-7d83-47b6-99b8-a7e173973ada",
  "terms": "https://doi.org/10.13003/CED-terms-of-use",
  "action": "add",
  "subj": {
    "pid": "https://chemistry.stackexchange.com/a/41547",
    "title": "How important is it that geometry be optimized at a high level of theory?",
    "issued": "2015-12-01T10:15:35Z",
    "type": "comment",
    "author": {
      "url": "https://chemistry.stackexchange.com/users/186/wildcat",
      "name": "Wildcat",
      "id": 186
    }
  },
  "source_id": "stackexchange",
  "obj": {
    "pid": "https://doi.org/10.1063/1.1792071",
    "url": "http://doi.org/10.1063/1.1792071"
  },
  "timestamp": "2017-04-13T20:03:24Z",
  "relation_type_id": "discusses"
}
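As a rough guide to reading an Event like this one, the sketch below parses a trimmed copy and compares `occurred_at` (when the answer was posted) with `timestamp` (when the Event was created). The helper `parse_utc` and the variable names are illustrative only, and the trimmed JSON uses the plain doi.org form of the DOI URL.

```python
import json
from datetime import datetime, timezone

# A trimmed copy of the Event above, keeping only the fields used here.
event_json = """
{
  "obj_id": "https://doi.org/10.1063/1.1792071",
  "occurred_at": "2015-12-01T10:15:35Z",
  "subj_id": "https://chemistry.stackexchange.com/a/41547",
  "source_id": "stackexchange",
  "relation_type_id": "discusses",
  "timestamp": "2017-04-13T20:03:24Z"
}
"""


def parse_utc(value: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' timestamp as timezone-aware UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


event = json.loads(event_json)
occurred = parse_utc(event["occurred_at"])
collected = parse_utc(event["timestamp"])

# Whole days between the post being written and the Event being created.
lag_days = (collected - occurred).days

# Read the Event as a subject / relation / object statement.
summary = f'{event["subj_id"]} {event["relation_type_id"]} {event["obj_id"]}'
print(summary)
print(f"Event created {lag_days} days after the post")
```

A large gap between the two dates is expected for older posts: the Agent can only report a link once a scan finds it.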
Evidence Record
- Contains observations of type `plaintext`. This is the text of the question or answer. The `sensitive` hash is set to true, but you can find the link to the question in the Event.
Edits / deletion
We don't expect to have to edit or delete any Events.
Quirks
The Event is captured at the point the question or answer is returned from the StackExchange search. The post may be edited before or after it is captured. We don't detect edits, so the Event reflects only the text as it stood when it was captured.
The Agent is only aware of questions or answers that are matched via a Landing Page or `doi.org` domain. If an answer mentions only an unlinked DOI (i.e. a plain text DOI such as `10.5555/12345678`) we will not retrieve it. However, if the Agent does become aware of a question or answer because it contains a linked DOI or landing page URL, and it also contains an unlinked DOI, it will match both.
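To illustrate the difference between linked and unlinked DOIs in a post that has already been retrieved, here is a small sketch. The regular expression is a common heuristic for DOI-shaped strings, not the pattern the Agent uses, and `find_dois`, `DOI_PATTERN`, and `LINKED_PREFIX` are hypothetical names.

```python
import re

# Heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix that
# runs until whitespace or a quote/angle bracket. An assumption, not a spec.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"'<>]+")

# A DOI counts as "linked" here if a resolver URL immediately precedes it.
LINKED_PREFIX = re.compile(r"https?://(?:dx\.)?doi\.org/$", re.IGNORECASE)


def find_dois(text):
    """Return (doi, kind) pairs, where kind is 'linked' or 'unlinked'."""
    results = []
    for match in DOI_PATTERN.finditer(text):
        # Drop trailing sentence punctuation that is unlikely to be part
        # of the DOI. This can misfire on DOIs that really end in ')'.
        doi = match.group().rstrip(".,;:)]")
        # Look only at the few characters just before the match.
        before = text[max(0, match.start() - 30):match.start()]
        kind = "linked" if LINKED_PREFIX.search(before) else "unlinked"
        results.append((doi, kind))
    return results


post = "See https://doi.org/10.1063/1.1792071 and also 10.5555/12345678."
for doi, kind in find_dois(post):
    print(kind, doi)
```

In this example the post would be found by the search because of the `doi.org` link, and the plain text DOI is then picked up from the same text.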
Failure modes
- Publisher sites may block the Event Data Bot from collecting landing pages.
Further information
- [https://stackexchange.com](https://stackexchange.com)