function tripal_pub_PMID_parse_pubxml
2.x tripal_pub.PMID.inc | tripal_pub_PMID_parse_pubxml($pub_xml) |
3.x tripal_chado.pub_importer_PMID.inc | tripal_pub_PMID_parse_pubxml($pub_xml) |
1.x PMID.inc | tripal_pub_PMID_parse_pubxml($pub_xml) |
This function parses the XML containing the details of a publication and converts it into an associative array whose keys are Tripal Pub ontology terms and whose values are extracted from the XML. The XML should contain only a single publication record.
Information about the valid elements in the PubMed XML can be found here: http://www.nlm.nih.gov/bsd/licensee/elements_descriptions.html
Information about PubMed's citation format can be found here: http://www.nlm.nih.gov/bsd/policy/cit_format.html
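Because the function expects exactly one publication record, a response holding several records has to be split before parsing. The following is a minimal sketch of one way to do that, assuming the input is a PubmedArticleSet document whose children are PubmedArticle elements. The helper example_split_pubmed_records() is hypothetical and is not part of Tripal; the importer itself may split records differently.

```php
<?php
/**
 * Hypothetical helper (not part of Tripal): splits a PubmedArticleSet
 * document into one XML string per PubmedArticle element.
 */
function example_split_pubmed_records($set_xml) {
  $records = array();
  $doc = simplexml_load_string(trim($set_xml));
  if ($doc === FALSE) {
    // The input was not well-formed XML.
    return $records;
  }
  foreach ($doc->PubmedArticle as $article) {
    // asXML() on a child element returns only that element's markup.
    $records[] = $article->asXML();
  }
  return $records;
}

// Illustrative, heavily trimmed input; real records carry many more elements.
$example_set_xml = '<PubmedArticleSet>'
  . '<PubmedArticle><MedlineCitation><PMID>111</PMID></MedlineCitation></PubmedArticle>'
  . '<PubmedArticle><MedlineCitation><PMID>222</PMID></MedlineCitation></PubmedArticle>'
  . '</PubmedArticleSet>';

$example_records = example_split_pubmed_records($example_set_xml);
```

Each string in $example_records could then be passed to tripal_pub_PMID_parse_pubxml() on a site where Tripal is installed.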
Parameters
$pub_xml: An XML string describing a single publication
Return value
An array describing the publication
1 call to tripal_pub_PMID_parse_pubxml()
- tripal_pub_remote_search_PMID in tripal_pub/includes/importers/tripal_pub.PMID.inc - A hook for performing the search on the PubMed database.
File
- tripal_pub/includes/importers/tripal_pub.PMID.inc, line 354 - This file provides support for importing and parsing results from the NCBI PubMed database. The functions here are used by both the publication importer setup form and the publication importer.
Code
function tripal_pub_PMID_parse_pubxml($pub_xml) {
  $pub = array();

  if (!$pub_xml) {
    return $pub;
  }

  // Read the XML and iterate through it.
  $xml = new XMLReader();
  $xml->xml(trim($pub_xml));
  while ($xml->read()) {
    $element = $xml->name;
    if ($xml->nodeType == XMLReader::ELEMENT) {
      switch ($element) {
        case 'ERROR':
          $xml->read(); // get the value for this element
          tripal_report_error('tripal_pubmed', TRIPAL_ERROR, "Error: %err", array('%err' => $xml->value));
          break;
        case 'PMID':
          // There are multiple places where a PMID is present in the XML and
          // since this code does not descend into every branch of the XML tree
          // we will encounter many of them here. Therefore, we only want the
          // PMID that we first encounter. If we already have the PMID we will
          // just skip it. Examples of other PMIDs are in the articles that
          // cite this one.
          $xml->read(); // get the value for this element
          if (!array_key_exists('Publication Dbxref', $pub)) {
            $pub['Publication Dbxref'] = 'PMID:' . $xml->value;
          }
          break;
        case 'Article':
          $pub_model = $xml->getAttribute('PubModel');
          $pub['Publication Model'] = $pub_model;
          tripal_pub_PMID_parse_article($xml, $pub);
          break;
        case 'MedlineJournalInfo':
          tripal_pub_PMID_parse_medline_journal_info($xml, $pub);
          break;
        case 'ChemicalList':
          // TODO: handle this
          break;
        case 'SupplMeshList':
          // TODO: meant for protocol list
          break;
        case 'CitationSubset':
          // TODO: not sure this is needed.
          break;
        case 'CommentsCorrections':
          // TODO: handle this
          break;
        case 'GeneSymbolList':
          // TODO: handle this
          break;
        case 'MeshHeadingList':
          // TODO: Medical subject headings
          break;
        case 'NumberOfReferences':
          // TODO: not sure we should keep this as it changes frequently.
          break;
        case 'PersonalNameSubjectList':
          // TODO: for works about an individual or with biographical note/obituary.
          break;
        case 'OtherID':
          // TODO: IDs from another NLM partner.
          break;
        case 'OtherAbstract':
          // TODO: when the journal does not contain an abstract for the publication.
          break;
        case 'KeywordList':
          // TODO: handle this
          break;
        case 'InvestigatorList':
          // TODO: personal names of individuals who are not authors (can be used with collection)
          break;
        case 'GeneralNote':
          // TODO: handle this
          break;
        case 'DeleteCitation':
          // TODO: need to know how to handle this
          break;
        default:
          break;
      }
    }
  }
  $pub['Citation'] = tripal_pub_create_citation($pub);
  $pub['raw'] = $pub_xml;

  return $pub;
}
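The "first PMID wins" behaviour described in the code comments can be tried in isolation. The sketch below is a standalone illustration that needs only PHP's XMLReader extension. The helper example_first_pmid_and_model() and the sample record are hypothetical and are not part of Tripal. The sketch mirrors only the PMID and Article cases and does not call the Tripal helpers for articles, journal info, or citations.

```php
<?php
/**
 * Hypothetical helper (not part of Tripal): returns the first PMID and the
 * PubModel attribute of the Article element from a single PubMed record.
 */
function example_first_pmid_and_model($pub_xml) {
  $result = array();
  $pub_xml = trim((string) $pub_xml);
  if ($pub_xml === '') {
    return $result;
  }

  $xml = new XMLReader();
  $xml->xml($pub_xml);
  while ($xml->read()) {
    if ($xml->nodeType != XMLReader::ELEMENT) {
      continue;
    }
    switch ($xml->name) {
      case 'PMID':
        // Advance to the text node that holds the element's value.
        $xml->read();
        // Keep only the first PMID; later ones belong to related articles.
        if (!array_key_exists('Publication Dbxref', $result)) {
          $result['Publication Dbxref'] = 'PMID:' . $xml->value;
        }
        break;
      case 'Article':
        $result['Publication Model'] = $xml->getAttribute('PubModel');
        break;
    }
  }
  $xml->close();
  return $result;
}

// Illustrative record: the second PMID sits in a comments/corrections entry.
$example_pub_xml = <<<XML
<PubmedArticle>
  <MedlineCitation>
    <PMID Version="1">12345678</PMID>
    <Article PubModel="Print">
      <ArticleTitle>An example title</ArticleTitle>
    </Article>
    <CommentsCorrectionsList>
      <CommentsCorrections RefType="Cites">
        <PMID Version="1">99999999</PMID>
      </CommentsCorrections>
    </CommentsCorrectionsList>
  </MedlineCitation>
</PubmedArticle>
XML;

$example_result = example_first_pmid_and_model($example_pub_xml);
```

Here $example_result holds the identifier of the record itself under 'Publication Dbxref' rather than the PMID nested under CommentsCorrections, because the key is only written once.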