White Papers

Performance Evaluation and Characterization of Scalable Data Mining Algorithms

Overview

Data mining has become an essential tool in diverse fields. Growing data sizes and algorithmic complexity require the computational power of chips to increase even further. This paper presents detailed hardware and software characteristics for a set of representative data mining programs. It first introduces MineBench, a benchmarking suite of representative data mining applications from multiple categories: two classification, two association rule mining, and four clustering applications. It then evaluates the MineBench applications on an 8-way Shared Memory Parallel (SMP) machine and analyzes their important performance characteristics. During the evaluation, the input datasets and the number of processors are varied to measure the scalability of the applications in the suite.

Publisher: Northwestern University
File Format: PDF
Date Published: Oct 1, 2008
Format: White Papers
Topics: Data Mining - Analysis, Scalability

Similar White Papers

Scalable Robust Covariance and Correlation Estimates for Data Mining

Covariance and correlation estimates have important applications in data mining. In the presence of outliers, classical

Publisher: Association for Computing Machinery  |  Tags: applications, data, data mining

NetCube: A Scalable Tool for Fast Data Mining and Compression

This paper proposes a novel method of computing and storing DataCubes. The idea is to use Bayesian Networks, which can

Publisher: Carnegie Mellon University  |  Tags: computing, data, database, network

SAS Helps Improve Manufacturing Processes at Legendary Semiconductor Plant

Just 20 years ago, semiconductor chip components were measured in microns, or thousands of nanometers. Today, IBM builds

Publisher: SAS Institute  |  Tags: data, semiconductor

Northwestern University White Papers

An Application of Central Limit Theorem to Wide Area Network Service Level Agreement Analyses

Managed Network Service Providers (NSP) supply the bandwidth, transport, equipment, and management services to connect d

Publisher: Northwestern University  |  Tags: management, wan

Towards a High-Speed Router-Based Anomaly/Intrusion Detection System

Traffic anomalies and attacks are commonplace in today's networks, and identifying them rapidly and accurately is critic

Publisher: Northwestern University  |  Tags: network, routers, the link

IDGraphs: Intrusion Detection and Analysis Using Histographs

Traffic anomalies and attacks are commonplace in today's networks and identifying them rapidly and accurately is critica

Publisher: Northwestern University  |  Tags: network, routers

A DoS Resilient Flow-Level Intrusion Detection Approach for High-Speed Networks

Global-scale attacks like viruses and worms are increasing in frequency, severity and sophistication, making it critical

Publisher: Northwestern University  |  Tags: data, false positives, routers

Reverse Hashing for High-Speed Network Monitoring: Algorithms, Evaluation, and Applications

A key function for network traffic monitoring and analysis is the ability to perform aggregate queries over multiple dat

Publisher: Northwestern University  |  Tags: data, ip, network