White Papers

The Web Changes Everything: Understanding the Dynamics of Web Content

Overview

The Web is a dynamic, ever-changing collection of information. This paper explores changes in Web content by analyzing a crawl of 55,000 Web pages, selected to represent different user visitation patterns. Although change over long intervals has been explored on random (and potentially unvisited) samples of Web pages, little is known about the nature of finer-grained changes to pages that are actively consumed by users, such as those in the sample. The paper describes algorithms, analyses, and models for characterizing changes in Web content, focusing on both time (by using hourly and sub-hourly crawls) and structure (by looking at page-, DOM-, and term-level changes).

Publisher: Association for Computing Machinery
File Format: PDF
Date Published: Sep 16, 2009
Format: White Papers
Topics: Web Content Management

Similar White Papers

GUIDEBOOK: ORACLE'S SIEBEL CRM ON DEMAND

Oracle's Siebel CRM On Demand has leveraged its history and experience in CRM to provide customers with deeper functiona

Publisher: Oracle  |  Tags: crm

Oracle Blends Managed Services With OnDemand Pricing

Growing businesses are increasingly realising that Managed Services offer a scalable, flexible route forward. This white

Publisher: Oracle

Best Practices for Building WEB Applications Using IBM Content Manager OnDemand Web Enablement Kit Java API's

The Content Manager OnDemand Web Enablement Kit (ODWEK) Java API's provide a rich development environment for Java devel

Publisher: IBM  |  Tags: developers, java, server

Integration Guide: Implementing FileNet P8 Content Manager With Network Appliance Storage Systems

Content and business processes have caused significant growth of unstructured data in an enterprise environment. Today's

Publisher: Network Appliance (NetApp)  |  Tags: business process, data, software, unified

Content Management System Based on BEA WebLogic Manages End-to-End Content Generation, Uploads, Quality Assurance, Discovery, and Delivery

Bharti Telesoft provides a range of solutions and business support systems to wireless and wire-line telecom service pro

Publisher: BEA Systems  |  Tags: bharti, mobile operators

Association for Computing Machinery White Papers

Managing ETL Processes

ETL tools allow the definition of sometimes complex processes to extract, transform, and load heterogeneous data into a

Publisher: Association for Computing Machinery  |  Tags: data, data integration, data warehouse, management

GPS-Free Node Localization in Mobile Wireless Sensor Networks

An important problem in mobile ad-hoc wireless sensor networks is the localization of individual nodes, i.e., each node'

Publisher: Association for Computing Machinery  |  Tags: gps, infrastructure, network

A Black-Box Approach for Web Application SLA

Web servers nowadays have to cope with unprecedented amounts of workload, due to increasing popularity and complexity; i

Publisher: Association for Computing Machinery  |  Tags: applications, server

Load Balancing for Multimedia Streaming in Heterogeneous Peer-to-Peer Systems

Multimedia streaming of mostly user generated content is an ongoing trend, not only since the upcoming of Last.fm and Yo

Publisher: Association for Computing Machinery  |  Tags: user generated, user generated content, youtube

Multiobjective Network Design for Realistic Traffic Models

Network topology design problems find application in several real life scenarios. However, most designs in the past eith

Publisher: Association for Computing Machinery  |  Tags: network, realistic