Optimizing Frequency Queries for Data Mining Applications

Category: Data Management

Overview

Data mining algorithms use various trie-based and bitmap-based representations to speed up support (i.e., frequency) counting. This paper compares the memory requirements and support counting performance of the FP Tree and the Compressed Patricia Trie against several novel variants of vertical bit vectors. First, borrowing ideas from the VLDB domain, the authors compress vertical bit vectors using Word-Aligned Hybrid (WAH) encoding. Second, they evaluate the Gray code rank-based transaction reordering scheme and show that, in practice, simple lexicographic ordering obtained by applying LSB radix sort outperforms it.
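As a rough illustration of the ideas named in the overview, the sketch below builds uncompressed vertical bit vectors, counts support with a bitwise AND followed by a population count, and reorders transactions lexicographically with an LSB radix sort. It is not the paper's implementation: the function names, the use of Python integers as bit vectors, and the toy transactions are all assumptions made for this example, and WAH compression is left out entirely. Reordering does not change any support count; the usual motivation is that it groups similar transactions so the bit vectors contain longer runs, which run-length schemes such as WAH can exploit.

```python
from typing import Dict, FrozenSet, Iterable, List


def build_vertical_bitvectors(transactions: List[FrozenSet[str]]) -> Dict[str, int]:
    """Map each item to an int whose bit i is set when transaction i contains it."""
    vectors: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset: Iterable[str], vectors: Dict[str, int], n_transactions: int) -> int:
    """Count transactions containing every item: AND the vectors, then popcount."""
    acc = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        acc &= vectors.get(item, 0)
    return bin(acc).count("1")


def lsb_radix_sort(transactions: List[FrozenSet[str]],
                   item_order: List[str]) -> List[FrozenSet[str]]:
    """Sort transactions lexicographically by their item-presence rows.

    Each pass is a stable two-bucket partition on one item column,
    processed from the least significant column (last item) to the
    most significant (first item).
    """
    rows = list(transactions)
    for item in reversed(item_order):
        absent = [row for row in rows if item not in row]
        present = [row for row in rows if item in row]
        rows = absent + present
    return rows


# Hypothetical toy data, chosen only for this example.
transactions = [
    frozenset({"a", "b"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c"}),
    frozenset({"c"}),
]
vectors = build_vertical_bitvectors(transactions)
reordered = lsb_radix_sort(transactions, ["a", "b", "c"])
reordered_vectors = build_vertical_bitvectors(reordered)
```

Because each pass is stable, one pass per item column is enough to produce the full lexicographic order, and querying `support` on `reordered_vectors` gives the same counts as on `vectors`.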

Publisher: Columbia University
File Format: PDF
Date Published: Oct 1, 2008
Format: White Papers
Topics: Data Mining - Analysis

Similar White Papers

SQL Server 2000 Enterprise Edition (64-bit): Advantages of a 64-Bit Environment

HP has partnered with Microsoft to provide information about the advantages of a 64-Bit Environment. Microsoft SQL Serv

Publisher: Hewlett-Packard  |  Tags: applications, software

Understanding Your Customer: Segmentation Techniques for Gaining Customer Insight and Predicting Risk in the Telecom Industry

The explosion of customer data in the last twenty years has increased the need for data mining aimed at Customer Relatio

Publisher: SAS Institute  |  Tags: crm, customer service, data, data mining

Information Architecture Essentials, Part 5: Business Intelligence in Your Information Architecture

This series explores a variety of elements that create a successful information architecture design. As one manages and

Publisher: IBM  |  Tags: business intelligence, data, data mining

An Extensive Examination of Data Structures Using C# 2.0 - Part 1: An Introduction to Data Structures

Probably the most common and well-known data structure is the array, which contains a contiguous collection of data item

Publisher: Microsoft  |  Tags: data

Cool New Features in SAS Enterprise Miner 5.3

SAS released Enterprise Miner 5.3 in late 2007 with a veritable plethora of cool new features for data miners everywhere

Publisher: SAS Institute  |  Tags: data, software

Columbia University White Papers

An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol

Skype is a peer-to-peer VoIP client developed by KaZaa. Skype claims that it can work almost seamlessly across NATs and

Publisher: Columbia University  |  Tags: applications, firewall, instant messaging, ip, network, peer-to-peer, voip, yahoo im

A Budget-Balanced and Price-Adaptive Credit Protocol for MANETs

A virtual credit exchange protocol for Mobile Ad-hoc NETworks (MANETs) is proposed to enforce the cooperation of packet

Publisher: Columbia University  |  Tags: data, updates

Buy-at-Bulk Network Design With Protection

This paper considers approximation algorithms for buy-at-bulk network design, with the additional constraint that demand

Publisher: Columbia University  |  Tags: network

Data Mining Methods for Detection of New Malicious Executables

A serious security threat is malicious executables, especially new, unseen malicious executables often arriving as email

Publisher: Columbia University  |  Tags: data, email

On the Detection of Signaling DoS Attacks on 3G Wireless Networks

Third Generation (3G) wireless networks based on the CDMA2000 and UMTS standards are now increasingly being deployed thr

Publisher: Columbia University  |  Tags: cdma2000, umts, wireless networks