White Papers

Introducing the Webb Spam Corpus: Using Email Spam to Identify Web Spam Automatically

Overview: Just as email spam has degraded the user messaging experience, the rise of Web spam threatens to severely degrade the quality of information on the World Wide Web. Fundamentally, Web spam is designed to pollute search engines and corrupt the user experience by driving traffic to particular spammed Web pages, regardless of the merits of those pages. This paper identifies a link between email spam and Web spam, and the authors use this link to propose a novel technique for extracting large Web spam samples from the Web. They then present the Webb Spam Corpus, a first-of-its-kind, large-scale, publicly available Web spam data set created using their automated Web spam collection method.

Publisher: Georgia Institute of Technology
File Format: PDF
Date Published: Jul 1, 2009
Format: White Papers
Topics: Spam - E-mail Fraud - Phishing, Email, Network Security

Similar White Papers

Singer Recaptures $40,000 in Lost Productivity Costs With Hosted E-Mail Filtering

Singer Sewing Company manufactures and distributes its sewing products worldwide. As a global manufacturer, Singer depe

Publisher: Microsoft  |  Tags: network, productivity, spam

Sophos Email Security and Control - Free 30 Day Trial

Proactively block inbound and outbound threats with unrivaled effectiveness and simplicity, delivering high-capacity, hi

Publisher: Sophos

Security Best Practices: What to Look for in an Email Hosting Provider

Email is an essential business tool for organizations of all sizes. Yet it is also the easiest way for hackers, spammers

Publisher: USA.NET  |  Tags: email, email security, hackers, outsourcing

Eliminate Spam, Gain Productivity

Learn all about the dangers and the costs of spam in all its forms - from stock-touting to PDF to spreadsheet and more.

Publisher: MessageLabs, now part of Symantec  |  Tags: direct marketing, email, infrastructure, marketing, network, pdf, spam, spreadsheet, storm worm

PDF Spam: Spam Evolves, PDF becomes the Latest Threat

PDF spam, which accounts for 20% of all spam, is a relatively new phenomenon. Yet, it undergoes the same constant evolut

Publisher: MessageLabs  |  Tags: applications, pdf, spam

Georgia Institute of Technology White Papers

Bandwidth Estimation: Metrics, Measurement Techniques, and Tools

In a packet network, the terms "Bandwidth" or "Throughput" often characterize the amount of data that the network can tr

Publisher: Georgia Institute of Technology  |  Tags: data, ip, network, open source, peer-to-peer

Scalability of Network-Failure Resilience

This work quantifies the scalability of network resilience upon failures. It characterizes resilience as the percentage of lo

Publisher: Georgia Institute of Technology  |  Tags: network

Improving the Performance of TCP Wireless Video Streaming With a Novel Playback Adaptation Algorithm

This paper proposes a playback adaptation algorithm for video streaming with TCP in wireless networks where both handoff

Publisher: Georgia Institute of Technology  |  Tags: ip, wireless networks

Bandwidth Estimation and Robust Video Streaming Over 802.11e Wireless LANs

Streaming high quality Audio/Video (AV) from home media sources to TV sets over a Wireless Local Area Network (WLAN) is

Publisher: Georgia Institute of Technology  |  Tags: qos, tv

Energy-Aware Traffic Shaping for Wireless Real-Time Applications

Sleep modes of wireless network cards are used to switch these cards into a low-power state when idle, but large time-out

Publisher: Georgia Institute of Technology  |  Tags: cpu, management, network, network card, qos, real-time