function tripal_pub_PMID_parse_pubxml

2.x tripal_pub.PMID.inc tripal_pub_PMID_parse_pubxml($pub_xml)
3.x tripal_chado.pub_importer_PMID.inc tripal_pub_PMID_parse_pubxml($pub_xml)
1.x PMID.inc tripal_pub_PMID_parse_pubxml($pub_xml)

This function parses the XML describing a publication and converts it into an associative array in which the keys are Tripal Pub ontology terms and the values are extracted from the XML. The XML should contain only a single publication record.

Information about the valid elements in the PubMed XML can be found here: http://www.nlm.nih.gov/bsd/licensee/elements_descriptions.html

Information about PubMed's citation format can be found here: http://www.nlm.nih.gov/bsd/policy/cit_format.html

Parameters

$pub_xml: An XML string describing a single publication

Return value

An associative array describing the publication, keyed by Tripal Pub ontology terms. An empty array is returned if $pub_xml is empty.
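
The read loop behind this return value can be illustrated with a minimal, standalone sketch. It is not the Tripal implementation: the function name example_parse_pmid_and_model() and the sample XML are hypothetical, and only the 'PMID' and 'Article' cases are reproduced. The Tripal helpers (tripal_pub_PMID_parse_article(), tripal_pub_create_citation(), and so on) are left out, so the 'Citation' and 'raw' keys are not populated here.

```php
<?php
// Hypothetical, heavily trimmed PubMed record used only for illustration.
// The second PMID stands in for a PMID that belongs to a related article.
$pub_xml = <<<XML
<PubmedArticle>
  <MedlineCitation>
    <PMID>12345</PMID>
    <Article PubModel="Print">
      <ArticleTitle>Example title</ArticleTitle>
    </Article>
    <CommentsCorrectionsList>
      <CommentsCorrections>
        <PMID>67890</PMID>
      </CommentsCorrections>
    </CommentsCorrectionsList>
  </MedlineCitation>
</PubmedArticle>
XML;

// Hypothetical helper: keeps only the first PMID and the Article PubModel.
function example_parse_pmid_and_model($pub_xml) {
  $pub = array();
  if (!$pub_xml) {
    return $pub;
  }

  $xml = new XMLReader();
  $xml->xml(trim($pub_xml));
  while ($xml->read()) {
    // Only opening element nodes are of interest.
    if ($xml->nodeType != XMLReader::ELEMENT) {
      continue;
    }
    switch ($xml->name) {
      case 'PMID':
        // Advance to the text node that holds the element's value.
        $xml->read();
        // Keep the first PMID only; later ones belong to other articles.
        if (!array_key_exists('Publication Dbxref', $pub)) {
          $pub['Publication Dbxref'] = 'PMID:' . $xml->value;
        }
        break;
      case 'Article':
        $pub['Publication Model'] = $xml->getAttribute('PubModel');
        break;
    }
  }
  $xml->close();
  return $pub;
}

$pub = example_parse_pmid_and_model($pub_xml);
print_r($pub);
```

With this sample input, the array holds 'Publication Dbxref' set to 'PMID:12345' and 'Publication Model' set to 'Print'; the second PMID is skipped because the key already exists.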

1 call to tripal_pub_PMID_parse_pubxml()
tripal_pub_remote_search_PMID in tripal_pub/includes/importers/tripal_pub.PMID.inc
A hook for performing the search on the PubMed database.

File

tripal_pub/includes/importers/tripal_pub.PMID.inc, line 354
This file provides support for importing and parsing of results from the NCBI PubMed database. The functions here are used by both the publication importer setup form and the publication importer.

Code

function tripal_pub_PMID_parse_pubxml($pub_xml) {
  $pub = array();

  if (!$pub_xml) {
    return $pub;
  }

  // Read the XML and iterate through its nodes.
  $xml = new XMLReader();
  $xml->xml(trim($pub_xml));
  while ($xml->read()) {
    $element = $xml->name;
    if ($xml->nodeType == XMLReader::ELEMENT) {

      switch ($element) {
        case 'ERROR':
          $xml->read(); // get the value for this element
          tripal_report_error('tripal_pubmed', TRIPAL_ERROR, "Error: %err", array('%err' => $xml->value));
          break;
        case 'PMID':
          // There are multiple places where a PMID is present in the XML and,
          // since this code does not descend into every branch of the XML
          // tree, we will encounter many of them here. Therefore, we only
          // want the PMID that we first encounter. If we already have the
          // PMID we just skip it. Examples of other PMIDs are those of the
          // articles that cite this one.
          $xml->read(); // get the value for this element
          if (!array_key_exists('Publication Dbxref', $pub)) {
            $pub['Publication Dbxref'] = 'PMID:' . $xml->value;
          }
          break;
        case 'Article':
          $pub_model = $xml->getAttribute('PubModel');
          $pub['Publication Model'] = $pub_model;
          tripal_pub_PMID_parse_article($xml, $pub);
          break;
        case 'MedlineJournalInfo':
          tripal_pub_PMID_parse_medline_journal_info($xml, $pub);
          break;
        case 'ChemicalList':
          // TODO: handle this
          break;
        case 'SupplMeshList':
          // TODO: meant for protocol list
          break;
        case 'CitationSubset':
          // TODO: not sure this is needed.
          break;
        case 'CommentsCorrections':
          // TODO: handle this
          break;
        case 'GeneSymbolList':
          // TODO: handle this
          break;
        case 'MeshHeadingList':
          // TODO: Medical subject headings
          break;
        case 'NumberOfReferences':
          // TODO: not sure we should keep this as it changes frequently.
          break;
        case 'PersonalNameSubjectList':
          // TODO: for works about an individual or with biographical note/obituary.
          break;
        case 'OtherID':
          // TODO: IDs from another NLM partner.
          break;
        case 'OtherAbstract':
          // TODO: when the journal does not contain an abstract for the publication.
          break;
        case 'KeywordList':
          // TODO: handle this
          break;
        case 'InvestigatorList':
          // TODO: personal names of individuals who are not authors (can be used with collection)
          break;
        case 'GeneralNote':
          // TODO: handle this
          break;
        case 'DeleteCitation':
          // TODO: need to know how to handle this
          break;
        default:
          break;
      }
    }
  }
  $pub['Citation'] = tripal_pub_create_citation($pub);

  $pub['raw'] = $pub_xml;
  return $pub;
}