Georgetown University Medical Center Liu lab - Literature intelligence in biomedicine


LF Detection and Search via Virtual Integration



  Because the biomedical domain is complex, concise representations such as acronyms, abbreviations, and symbols (which we call short forms, or SFs) are widely used in it. However, they pose great challenges to natural language processing applications in the domain, e.g., gene/protein name recognition. To mitigate this problem, several tools have been developed to automatically detect the full forms of SFs in text (which we call long forms, or LFs).
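To make the detection task concrete, here is a minimal Python sketch of one common heuristic: look for a parenthesized SF and check whether the initials of the words just before it spell out the SF. This is an illustration only, not the algorithm of any tool listed on this page; the function name find_sf_lf_pairs and the matching rule are assumptions made for the example.

```python
import re

def find_sf_lf_pairs(text):
    """Return (short form, long form) pairs found via the 'long form (SF)' pattern.

    Hypothetical helper for illustration; real detectors use richer rules.
    """
    pairs = []
    # A candidate SF is 2-10 letters/digits in parentheses, starting with a letter.
    for match in re.finditer(r"\(([A-Za-z][A-Za-z0-9]{1,9})\)", text):
        sf = match.group(1)
        # Words that precede the opening parenthesis.
        words = re.findall(r"[A-Za-z0-9]+", text[:match.start()])
        candidate = words[-len(sf):]
        # Accept only if the word initials spell out the SF (case-insensitive).
        if len(candidate) == len(sf) and all(
            w[0].lower() == c.lower() for w, c in zip(candidate, sf)
        ):
            pairs.append((sf, " ".join(candidate)))
    return pairs

sentence = "We measured the conditioned avoidance response (CAR) in rats."
print(find_sf_lf_pairs(sentence))
```

A rule this strict misses LFs whose letters come from inside words or that skip function words, which is one reason different detection methods yield different knowledge bases.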

  SF knowledge bases constructed using different detection methods are similar but not identical, because the methods apply different criteria. This website lets you query several online SF knowledge bases, either to detect LFs in text or to search for the LFs that those knowledge bases propose for a given SF. Currently we include seven online resources:

ADAM, ARGH, Acromine, and BAS are SF knowledge bases compiled from MEDLINE abstracts. The other resources are from the public domain. Note that we also incorporate BioThesaurus into the result table to provide cross-references to BioThesaurus.
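The idea of presenting results from several knowledge bases in one table can be sketched as follows. This is a simplified illustration, not the site's implementation: the resources are stood in for by in-memory dictionaries, and the names merge_lookups, resource_a, and resource_b, as well as the toy entries, are hypothetical.

```python
from collections import defaultdict

def merge_lookups(sf, resources):
    """Combine LF candidates for `sf` from several lookup tables.

    `resources` maps a resource name to a dict of SF -> list of LFs.
    Returns a dict of normalized LF -> sorted list of resource names.
    """
    merged = defaultdict(set)
    for name, table in resources.items():
        for lf in table.get(sf, []):
            # Normalize case and whitespace so trivially different spellings merge.
            key = " ".join(lf.lower().split())
            merged[key].add(name)
    return {lf: sorted(names) for lf, names in merged.items()}

# Toy data invented for the example; not taken from any real knowledge base.
resources = {
    "resource_a": {"CAR": ["Conditioned Avoidance Response", "Central African Republic"]},
    "resource_b": {"CAR": ["conditioned  avoidance response"]},
}
for lf, sources in merge_lookups("CAR", resources).items():
    print(lf, sources)
```

Recording which resources propose each LF makes it easy to see where the knowledge bases agree and where only one of them lists a candidate.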

  Some SFs have many corresponding LFs, which often differ greatly in meaning; e.g., CAR stands for Central African Republic, Conditioned Avoidance Response, Central Activation Ratio, etc. To organize LFs in an SF thesaurus and to narrow down the candidate LFs for an SF in a given context, we attempt to automatically annotate LFs with semantic information. Please read this page for details on automatic semantic annotation. (Note: The web version below does not currently use all the modules discussed in our paper for semantic annotation, which causes annotation errors for certain kinds of LFs.)
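As a rough illustration of how semantic annotation can organize and narrow candidate LFs, the Python sketch below groups annotated LFs by category and keeps only those matching a context. The category labels and function names are hypothetical and chosen for the example; they are not the categories or modules used in our annotation system.

```python
from collections import defaultdict

def group_by_category(annotated_lfs):
    """Group (long form, category) pairs into category -> list of long forms."""
    groups = defaultdict(list)
    for lf, category in annotated_lfs:
        groups[category].append(lf)
    return dict(groups)

def candidates_for_context(annotated_lfs, context_category):
    """Keep only the long forms whose category matches the context."""
    return group_by_category(annotated_lfs).get(context_category, [])

# Category labels here are assumptions made for illustration.
car_lfs = [
    ("Central African Republic", "geographic location"),
    ("Conditioned Avoidance Response", "behavior"),
    ("Central Activation Ratio", "measurement"),
]
print(candidates_for_context(car_lfs, "behavior"))
```

In this toy setting, knowing that the surrounding text concerns behavior reduces three candidates for CAR to one.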


LF Detection and Search:

  • Detect LFs in a MEDLINE abstract by PMID (e.g., 17311308)
  • Search LFs for a given SF (e.g., CAR)




Publications

  • Torii M, Hu ZZ, Wu CH, Liu H (2006) A Comparison Study of Biomedical Short Form Definition Detection Algorithms. In Proc. of ACM First International Workshop on Text Mining in Bioinformatics (TMBio).
  • Torii M, Hu ZZ, Song M, Wu CH, Liu H (2007) A Comparison Study on Algorithms for Detecting Biomedical Short Form Definition. BMC Bioinformatics (in press).
  • Torii M, Liu H (2007) Semantic Category Assignment for Long Forms Extracted from Text. In Proc. of American Medical Informatics Association (AMIA) Annual Symposium (in press).

Liu Lab, Building D, Room 175, 4000 Reservoir Rd NW, Washington, DC 20007 | Phone: 202.687.7933 | Last updated: October 17, 2007