White Papers

PhraseRate: An HTML Keyphrase Extractor

Overview

A standard feature in cataloging documents is the list of keywords. When the source documents are web pages, automated tools can aid the cataloger by analyzing the page and presenting relevant supporting material. This paper describes PhraseRate, an interactive aid for keyword extraction designed to assist human classifiers in the Infomine Project. In particular, it introduces a novel keyphrase extraction heuristic for web pages that requires no training; instead, it rests on the assumption that most well-written web pages "suggest" keyphrases through their internal structure. PhraseRate is very fast and flexible, and its results compare favorably with the state of the art in keyphrase extraction.

Publisher: University of California
File Format: PDF
Date Published: Dec 4, 2008
Format: White Papers
Topics: HTML

Similar White Papers

Oracle Forms 10g - Forms Look and Feel

Oracle Forms applications have traditionally behaved as, and looked like, desktop applications. Even when Oracle Forms ap

Publisher: Oracle  |  Tags: applications, css

Defining Colors Using Hex Values: With the Gray-Scale and RGB Color Naming Schemes

In Version 6, SAS/Graph includes predefined names for 290 different colors. Additional predefined color names became ava

Publisher: SAS Institute

Using the SAS Output Delivery System and PROC TEMPLATE to Create XHTML Files

SAS 8.2 introduced the ODS MARKUP statement, allowing users to export to a variety of markup languages, including HTML,

Publisher: SAS Institute

Categorizing Drug Data With SAS PROC FORMAT, INPUT and PUT Functions

Drug data recoding is commonly used in analyses of drug utilization rates, drug risks and side effects, and drug abuse a

Publisher: SAS Institute  |  Tags: data

Transforming Word Documents Into the XSL-FO Format

Microsoft made customizing Microsoft Office Word documents much easier and simpler when it introduced a new XML file for

Publisher: Microsoft  |  Tags: data, microsoft office, office

University of California White Papers

Stateless Load Balancing Over Multiple MPLS Paths

The paper proposes a flow-independent approach to balance the load coming from several multimedia applications (i.e., IP

Publisher: University of California  |  Tags: applications, ip, mpls, network

Escape From the Computer Lab: Education in Mobile Wireless Networks

As mobile wireless network technology becomes widespread, the importance of education about this new form of communicati

Publisher: University of California  |  Tags: computing, mobile wireless, mobility, network, portable devices, university of california

Parallel Spectral Clustering Algorithm for Large-Scale Community Data Mining

The spectral clustering algorithm has been shown to be very effective in finding clusters of non-linear boundaries. Unfo

Publisher: University of California

Directed Diffusion for Wireless Sensor Networking

Advances in processor, memory and radio technology will enable small and cheap nodes capable of sensing, communication a

Publisher: University of California  |  Tags: data, network

Mesh Topology Construction for Interconnected Wireless LANs

The 802.11s working group has been formed recently to recommend an Extended Service Set (ESS) that enables wider area co

Publisher: University of California  |  Tags: network