Objective
The National Aeronautics and Space Administration seeks companies to
commercialize Perilog. Perilog is a suite of data mining tools that
retrieves and organizes contextually relevant data from any sequence of
terms (text, musical notes, genetic data, etc.). It is an integrated set of
methods that can be used to intelligently mine information from databases.
Product Profile
Perilog was originally designed to support the FAA’s Aviation Safety
Reporting System (ASRS). The ASRS test bed demonstrated Perilog’s power
on a topical database of thousands of documents. The algorithm was
powerful enough to produce the first quantitative evidence of situational
relationships between reported commercial aviation incidents and a
specific type of aviation accident (McGreevy & Statler, 1998).
Perilog unearths data that is contextually relevant to the subject
being investigated. The software measures the degree of contextual
association for large numbers of term pairs in text (or any other
sequence) to produce models that capture the structure of the text.
Perilog statistically compares these models to measure their degree of
similarity to a query model, develops a ranking, and presents the search
results to the user. Furthermore, the user has access to powerful query
tools that, for example, generate search options automatically.
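The pipeline described above (pairwise contextual associations, a query model, and similarity ranking) can be illustrated with a minimal Python sketch. This is not Perilog's algorithm: the distance-based weighting, the cosine similarity measure, and the names context_model, similarity, and rank are all assumptions chosen for illustration.

```python
from collections import Counter
import math

def context_model(tokens, window=3):
    """Weight each ordered term pair (a, b) where b follows a within `window` tokens.

    Hypothetical weighting: adjacent pairs score `window`, pairs at the edge score 1.
    """
    model = Counter()
    for i, a in enumerate(tokens):
        for dist in range(1, window + 1):
            j = i + dist
            if j >= len(tokens):
                break
            model[(a, tokens[j])] += window - dist + 1
    return model

def similarity(m1, m2):
    """Cosine similarity between two sparse pair-weight models (an assumed measure)."""
    dot = sum(w * m2[p] for p, w in m1.items() if p in m2)
    n1 = math.sqrt(sum(w * w for w in m1.values()))
    n2 = math.sqrt(sum(w * w for w in m2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def rank(query, docs, window=3):
    """Return (score, document) pairs sorted from most to least similar to the query."""
    q = context_model(query.lower().split(), window)
    scored = [(similarity(q, context_model(d.lower().split(), window)), d) for d in docs]
    return sorted(scored, key=lambda t: t[0], reverse=True)

docs = [
    "engine fire on takeoff",
    "fire engine parked nearby",
    "smooth landing reported",
]
results = rank("engine fire", docs)
```

Because the pairs are ordered, "engine fire" and "fire engine" produce different models, so a narrative about an engine fire ranks above one that merely mentions a fire engine. A plain keyword match would not make that distinction.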
Technical Basics
Perilog relies on four methods for data mining:
- Keyword-in-Context Search retrieves narratives that contain
one or more user-specified keywords in typical or selected contexts,
and ranks the narratives on their relevance to the keywords in
context.
- Flexible, Model-Based Phrase Search retrieves narratives that
contain one or more user-specified phrases, and ranks the narratives
on their relevance to the phrases.
- Model-Based Phrase Generation produces a list of phrases from
documents that contain a user-specified word or phrase.
- Narrative-Based Phrase Discovery finds phrases that are
related to topics of interest by generating a list of narratives
similar in meaning to the keyword or phrase query.
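As a rough illustration of phrase generation, the Python sketch below lists contiguous phrases that contain a user-specified term and orders them by frequency. It uses simple n-gram counting as a stand-in, since this text does not describe how Perilog's model-based generation works; the function name phrases_containing and the max_len parameter are hypothetical.

```python
from collections import Counter

def phrases_containing(docs, term, max_len=3):
    """Count contiguous phrases of 2..max_len words that include `term`.

    Plain n-gram counting, used here only as an assumed stand-in for
    model-based phrase generation.
    """
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if term in gram:
                    counts[" ".join(gram)] += 1
    # Most frequent phrases first.
    return counts.most_common()

narratives = [
    "runway incursion at night",
    "runway incursion reported",
]
phrases = phrases_containing(narratives, "incursion")
```

The resulting list could then serve as a set of search options, in the spirit of the automatically generated query tools mentioned in the Product Profile.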
Perilog methods derive from seven inventive concepts:
- Modeling contextual associations (directional and non-directional)
in sequences
- Deriving a query model from key terms and a database of contextual
models
- Key term-in-context search
- Model-based phrase search
- Generating phrases from contextual models of sequences
- Extracting phrases from sequences, selecting relevant phrases
- Discovering associated phrases by iterative extraction and selection
Perilog offers the following features:
Analysis:
Measures contextual associations within sequences.
Relevance-ranking:
Ranks collections of contexts on similarity to other collections of
contexts.
Search:
Provides more effective key term and phrase search.
Phrase mining:
Helps users to know what phrases occur in a database, and discovers
phrases that are contextually related to key terms or phrases.
Modeling:
Represents contexts within sequences as pairwise inter-term contextual
associations having quantified degrees of association.
- Generates network representations of collections of
contexts.
- Models represent contexts of both real-world and symbolic
terms.
- The contextual scope of models is easy to manipulate.
Potential Commercial Uses
Perilog has applications in a multitude of fields. Any knowledge
management application can benefit from the technology. Perilog’s
manipulation of patterned or sequential symbols, data, items, objects,
events, causes, time spans, actions, attributes, entities, relations and
representations allows powerful search and mining of any type of
information repository.
BIOTECH: Perilog may be employed in biotechnology to unearth
contextually relevant information, data, representations, sequences, and
patterns. It may also be used to mine and retrieve genetic and protein
sequences, representations, and analogs.
MULTIMEDIA: Perilog may also be used to manipulate multimedia
repositories for “smart” retrieval of sound, music, voice, audio data,
audio encoding, and vocal encoding.
CORPORATE NETWORKS: In a corporate text-based repository environment,
Perilog algorithms may be applied to any form of text; for example,
narratives, reports, literature, patents, punctuation, messages,
electronic mail, internet text, and web site information and URLs.
Additionally, it processes linguistic patterns and grammatical tags.
Perilog is able to streamline any large topical database, manage a
corporate-wide knowledge repository (extendable to nodes outside the
corporation via a network), and push information based on specific research,
marketing, or organizational projects to querying users.
Benefits
Perilog has tangible benefits such as context-sensitivity and “smart”
context ranking. It can distill topically and situationally relevant phrases
and deliver relevant results to a user who is not a subject expert.
Furthermore, Perilog does not require manipulation of the subject database
or expert intervention to achieve results.
| Other Search Technology | Perilog Innovations |
| Based on isolated term occurrence or linguistic patterns | Based on structure of domains, situations, concerns, and narratives |
| Term indexing | Contextual indexing |
| Treats phrases as terms | Models intra-phrase relations |
| Lists frequent noun phrases | Represents all phrases |
| Handmade tables | Self-generated tables |
Technology Commercialization Status
Perilog consists of modules embodied in four patents (pending). NASA is
seeking non-exclusive license applicants.
Contact Information
If your company is interested in this technology or would like licensing instructions,
please contact:
Marty Zeller
NASA Far West Technology Transfer Center
3716 S Hope St. #200
Los Angeles, CA 9007
Phone: (213) 743-2353
Fax: (213) 746-9043
E-mail: zeller@usc.edu
More information is available on the Perilog Informational Event website:
http://ettc.usc.edu/ames/perilog