Proceedings of the Sixteenth International World Wide Web Conference
(WWW2007)
May 8-12, 2007
Banff, Alberta, CANADA



CHAIRS' MESSAGES

ORGANIZATION

SPONSORS

PAPERS

Track: Browsers and User Interfaces

Session: Personalization

Homepage Live: Automatic Block Tracing for Web Personalization         1
J. Han, D. Han (Shanghai Jiao-Tong University),
C. Lin, H.-J. Zeng, Z. Chen (Microsoft Research Asia),
Y. Yu (Shanghai Jiao-Tong University)

Open User Profiles for Adaptive News Systems: Help or Harm?         11
J.-w. Ahn, P. Brusilovsky, J. Grady, D. He, S. Y. Syn (University of Pittsburgh)

Investigating Behavioral Variability in Web Search         21
R. W. White (Microsoft Research), S. M. Drucker (Microsoft Live Laboratories)

Session: Smarter Browsing

CSurf: A Context-Driven Non-Visual Web-Browser         31
J. Mahmud, Y. Borodin, I. V. Ramakrishnan (Stony Brook University)

GeoTracker: Geospatial and Temporal RSS Navigation         41
Y.-F. Chen, G. Di Fabbrizio, D. Gibbon, R. Jana, S. Jora, B. Renger, B. Wei (AT&T Laboratories – Research)

Learning Information Intent via Observation         51
A. Tomasic, I. Simmons, J. Zimmerman (Carnegie Mellon University)

Track: Data Mining

Session: Identifying Structure in Web Pages

Page-level Template Detection via Isotonic Smoothing         61
D. Chakrabarti, R. Kumar (Yahoo! Research),
K. Punera (University of Texas at Austin)

Towards Domain-Independent Information Extraction from Web Tables         71
W. Gatterbauer, P. Bohunsky, M. Herzog, B. Krüpl, B. Pollak (Vienna University of Technology)

Web Object Retrieval         81
Z. Nie, Y. Ma, S. Shi, J.-R. Wen, W.-Y. Ma (Microsoft Research Asia)

Session: Mining Textual Data

Summarizing Email Conversations with Clue Words         91
G. Carenini, R. T. Ng, X. Zhou (University of British Columbia)

Organizing and Searching the World Wide Web of Facts — Step Two: Harnessing the Wisdom of the Crowds         101
M. Paşca (Google Inc.)

Do Not Crawl in the DUST: Different URLs with Similar Text         111
Z. Bar-Yossef (Technion and Google Haifa Engineering Center),
I. Keidar (Technion),
U. Schonfeld (University of California at Los Angeles)

Session: Similarity Search

A New Suffix Tree Similarity Measure for Document Clustering         121
H. Chim, X. Deng (City University of Hong Kong)

Scaling Up All Pairs Similarity Search         131
R. J. Bayardo (Google, Inc.), Y. Ma (University of California at Irvine), R. Srikant (Google, Inc.)

Detecting Near-Duplicates for Web Crawling         141
G. S. Manku, A. Jain (Google Inc.), A. D. Sarma (Stanford University)

Session: Predictive Modeling of Web Users 

Demographic Prediction Based on User's Browsing Behavior         151
J. Hu, H.-J. Zeng, H. Li, C. Niu, Z. Chen (Microsoft Research Asia)

Why We Search: Visualizing and Predicting User Behavior         161
E. Adar, D. S. Weld, B. N. Bershad, S. D. Gribble (University of Washington)

Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs         171
Q. Mei, X. Ling, M. Wondra (University of Illinois at Urbana-Champaign),
H. Su (Vanderbilt University),
CX. Zhai (University of Illinois at Urbana-Champaign)

Session: Mining in Social Networks 

Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography         181
L. Backstrom (Cornell University),
C. Dwork (Microsoft Research),
J. Kleinberg (Cornell University)

Information Flow Modeling based on Diffusion Rate for Prediction and Ranking         191
X. Song, Y. Chi, K. Hino, B. L. Tseng (NEC Laboratories America)

NetProbe: A Fast & Scalable System for Fraud Detection in Online Auction Networks         201
S. Pandit, D. H. Chau, S. Wang, C. Faloutsos (Carnegie Mellon University)

Track: E* Applications

Session: E-Communities

The Complex Dynamics of Collaborative Tagging         211
H. Halpin (University of Edinburgh),
V. Robu (CWI, Center for Mathematics and Computer Science),
H. Shepherd (Princeton University)

Expertise Networks in Online Communities: Structure and Algorithms         221
J. Zhang, M. S. Ackerman, L. Adamic (University of Michigan)

Internet-Scale Collection of Human-Reviewed Data         231
Q. Su, D. Pavlov, J.-H. Chow, W. C. Baker (Yahoo! Inc.)

Session: E-Commerce and E-Content 

DETECTIVES: DETEcting Coalition hiT Inflation attacks in adVertising nEtworks Streams         241
A. Metwally, D. Agrawal, A. El Abbadi (University of California at Santa Barbara)

Extraction and Search of Chemical Formulae in Text Documents on the Web         251
B. Sun, Q. Tan, P. Mitra, C. L. Giles (The Pennsylvania State University)

A Content-Driven Reputation System for the Wikipedia         261
B. T. Adler, L. de Alfaro (University of California at Santa Cruz)

Track: Industrial Practice & Experience

Google News Personalization: Scalable Online Collaborative Filtering         271
A. Das, M. Datar, A. Garg (Google Inc.),
S. Rajaram (University of Illinois at Urbana-Champaign)

Exploring in the Weblog Space by Detecting Informative and Affective Articles         281
X. Ni, G.-R. Xue, X. Ling, Y. Yu (Shanghai Jiao-Tong University),
Q. Yang (Hong Kong University of Science & Technology)

Spam Double-Funnel: Connecting Web Spammers with Advertisers         291
Y.-M. Wang, M. Ma (Microsoft Research),
Y. Niu, H. Chen (University of California at Davis)

Track: Performance and Scalability 

Session: Scalable Systems for Dynamic Content

GlobeTP: Template-Based Database Replication for Scalable Web Applications         301
T. Groothuyse, S. Sivasubramanian, G. Pierre (Vrije Universiteit)

Consistency-preserving Caching of Dynamic Database Content         311
N. Tolia, M. Satyanarayanan (Carnegie Mellon University)

Optimized Query Planning of Continuous Aggregation Queries in Dynamic Data Dissemination Networks         321
R. Gupta (IBM India Research Laboratory),
K. Ramamritham (Indian Institute of Technology)

Session: Performance Engineering of Web Applications

A Scalable Application Placement Controller for Enterprise Data Centers         331
C. Tang, M. Steinder, M. Spreitzer, G. Pacifici (IBM T.J. Watson Research Center)

A Unified Platform for Data Driven Web Applications with Automatic Client-Server Partitioning         341
F. Yang, N. Gupta, N. Gerner, X. Qi, A. Demers, J. Gehrke (Cornell University),
J. Shanmugasundaram (Yahoo!)

MyXDNS: A Request Routing DNS Server with Decoupled Server Selection         351
H. A. Alzoubi, M. Rabinovich (Case Western Reserve University),
O. Spatscheck (AT&T Research Laboratories)

Track: Pervasive Web and Mobility 

Robust Web Page Segmentation for Mobile Terminal Using Content-Distances and Page Layout Information         361
G. Hattori, K. Hoashi, K. Matsumoto, F. Sugaya (KDDI R&D Laboratories)

PRIVÉ: Anonymous Location-Based Queries in Distributed Mobile Systems         371
G. Ghinita, P. Kalnis (National University of Singapore),
S. Skiadopoulos (University of Peloponnese)

A Mobile Application Framework for the Geospatial Web         381
R. Simon, P. Fröhlich (Telecommunications Research Center Vienna)

Track: Search 

Session: Search Potpourri

Navigation-Aided Retrieval         391
S. Pandit (Carnegie Mellon University),
C. Olston (Yahoo! Research)

Efficient Search Engine Measurements         401
Z. Bar-Yossef, M. Gurevich (Technion - Israel Institute of Technology)

Efficient Search in Large Textual Collections with Redundancy         411
J. Zhang, T. Suel (Polytechnic University)

Session: Crawlers 

The Discoverability of the Web         421
A. Dasgupta, A. Ghosh, R. Kumar, C. Olston, S. Pandey, A. Tomkins (Yahoo! Research)

Combining Classifiers to Identify Online Databases         431
L. Barbosa, J. Freire (University of Utah)

An Adaptive Crawler for Locating Hidden-Web Entry Points         441
L. Barbosa, J. Freire (University of Utah)

Session: Web Graphs 

Random Web Crawls         451
T. Bennouas (Criteo R&D),
F. de Montgolfier (LIAFA - Université Paris 7)

Extraction and Classification of Dense Communities in the Web         461
Y. Dourisboure, F. Geraci, M. Pellegrini (Istituto di Informatica e Telematica)

Web Projections: Learning from Contextual Subgraphs of the Web         471
J. Leskovec (Carnegie Mellon University),
S. Dumais, E. Horvitz (Microsoft Research)

Session: Search Quality and Precision 

Supervised Rank Aggregation         481
Y.-T. Liu (Microsoft Research Asia & Beijing Jiaotong University),
T.-Y. Liu (Microsoft Research Asia),
T. Qin (Microsoft Research Asia & Tsinghua University),
Z.-M. Ma (Chinese Academy of Science),
H. Li (Microsoft Research Asia)

Navigating the Intranet with High Precision         491
H. Zhu (IBM Almaden Research Center),
A. Löser (SAP Research CEC Dresden),
S. Raghavan, S. Vaithyanathan (IBM Almaden Research Center)

Optimizing Web Search Using Social Annotations         501
S. Bao, X. Wu (Shanghai JiaoTong University),
B. Fei (IBM China Research Laboratory),
G. Xue (Shanghai JiaoTong University),
Z. Su (IBM China Research Laboratory),
Y. Yu (Shanghai JiaoTong University)

Session: Advertisements & Click Estimates

Robust Methodologies for Modeling Web Click Distributions         511
K. Ali, M. Scarr (Yahoo!)

Predicting Clicks: Estimating the Click-Through Rate for New Ads         521
M. Richardson (Microsoft Research),
E. Dominowska (Microsoft),
R. Ragno (Microsoft Research)

Dynamics of Bid Optimization in Online Advertisement Auctions         531
C. Borgs, J. Chayes (Microsoft Research),
O. Etesami (University of California at Berkeley),
N. Immorlica, K. A. Jain (Microsoft Research),
M. Mahdian (Yahoo! Research)

Session: Knowledge Discovery 

Compare&Contrast: Using the Web to Discover Comparable Cases for News Stories         541
J. Liu, E. Wagner, L. Birnbaum (Northwestern University)

Answering Bounded Continuous Search Queries in the World Wide Web         551
D. Kukulenz (Institute of Information Systems),
A. Ntoulas (Microsoft Search Laboratories)

Answering Relationship Queries on the Web         561
G. Luo, C. Tan, Y.-l. Tian (IBM T.J. Watson Research Center)

Session: Personalization 

Dynamic Personalized Pagerank in Entity-Relation Graphs         571
S. Chakrabarti (IIT Bombay)

A Large-scale Evaluation and Analysis of Personalized Search Strategies         581
Z. Dou (Nankai University),
R. Song, J.-R. Wen (Microsoft Research Asia)

Privacy-Enhancing Personalized Web Search         591
Y. Xu (Simon Fraser University),
B. Zhang, Z. Chen (Microsoft Research Asia),
K. Wang (Simon Fraser University)

Track: Security, Privacy, Reliability, & Ethics

Session: Defending Against Emerging Threats 

Defeating Script Injection Attacks with Browser-Enforced Embedded Policies         601
T. Jim (AT&T Laboratories – Research),
N. Swamy, M. Hicks (University of Maryland)

Subspace: Secure Cross-Domain Communication for Web Mashups         611
C. Jackson (Stanford University),
H. J. Wang (Microsoft Research)

Exposing Private Information by Timing Web Applications         621
A. Bortz, D. Boneh (Stanford University), P. Nandy

On Anonymizing Query Logs via Token-based Hashing         629
R. Kumar, J. Novak, B. Pang, A. Tomkins (Yahoo! Research)

Session: Passwords and Phishing 

CANTINA: A Content-Based Approach to Detecting Phishing Web Sites         639
Y. Zhang (University of Pittsburgh),
J. Hong, L. Cranor (Carnegie Mellon University)

Learning to Detect Phishing Emails         649
I. Fette, N. Sadeh, A. Tomasic (Carnegie Mellon Univ.)

A Large-Scale Study of Web Password Habits         657
D. Florêncio, C. Herley (Microsoft Research)

Session: Access Control and Trust on the Web

A Fault Model and Mutation Testing of Access Control Policies         667
E. Martin, T. Xie (North Carolina State University)

Analyzing Web Access Control Policies         677
V. Kolovski, J. Hendler (University of Maryland),
B. Parsia (University of Manchester)

Compiling Cryptographic Protocols for Deployment on the Web         687
J. McCarthy (Brown University),
J. D. Guttman, J. D. Ramsdell (MITRE Corporation),
S. Krishnamurthi (Brown University)

Track: Semantic Web 

Session: Ontologies 

YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia         697
F. M. Suchanek, G. Kasneci, G. Weikum (Max-Planck-Institut)

Ontology Summarization Based on RDF Sentence Graph         707
X. Zhang, G. Cheng, Y. Qu (Southeast University)

Just the Right Amount: Extracting Modules from Ontologies         717
B. C. Grau, I. Horrocks, Y. Kazakov, U. Sattler (The University of Manchester)

Session: Applications

Toward Expressive Syndication on the Web         727
C. Halaschek-Wiener, J. Hendler (University of Maryland)

Exhibit: Lightweight Structured Data Publishing         737
D. F. Huynh, D. R. Karger, R. C. Miller (Massachusetts Institute of Technology)

Explorations in the Use of Semantic Web Technologies for Product Information Management         747
J.-S. Brunner, L. Ma, C. Wang, L. Zhang (IBM China Research Laboratory),
D. C. Wolfson (IBM Software Group),
Y. Pan (IBM China Research Laboratory),
K. Srinivas (IBM T.J. Watson Research Center)

Session: Similarity and Extraction

Measuring Semantic Similarity between Words Using Web Search Engines         757
D. Bollegala (The University of Tokyo),
Y. Matsuo (National Institute of Advanced Industrial Science & Technology),
M. Ishizuka (The University of Tokyo)

Using Google Distance to Weight Approximate Ontology Matches         767
R. Gligorov, Z. Aleksovski, W. ten Kate (Philips Research),
F. van Harmelen (Vrije Universiteit)

Hierarchical, Perceptron-like Learning for Ontology-Based Information Extraction         777
Y. Li, K. Bontcheva (University of Sheffield)

Session: Query Languages and DBs 

From SPARQL to Rules (and back)         787
A. Polleres (Universidad Rey Juan Carlos)

SPARQ2L: Towards Support for Subgraph Extraction Queries in RDF Databases         797
K. Anyanwu, A. Maduko (University of Georgia),
A. Sheth (Wright State University)

Bridging the Gap Between OWL and Relational Databases         807
B. Motik, I. Horrocks, U. Sattler (University of Manchester)

ActiveRDF: Object-Oriented Semantic Web Programming         817
E. Oren, R. Delbru, S. Gerke, A. Haller, S. Decker (National University of Ireland)

Session: Semantic Web and Web 2.0 

The Two Cultures: Mashing up Web 2.0 and the Semantic Web         825
A. Ankolekar, M. Krötzsch, T. Tran, D. Vrandecic (Universität Karlsruhe)

Analysis of Topological Characteristics of Huge Online Social Networking Services         835
Y.-Y. Ahn, S. Han, H. Kwak, S. Moon, H. Jeong (KAIST)

P-TAG: Large Scale Automatic Generation of Personalized Annotation TAGs for the Web         845
P.-A. Chirita, S. Costache (University of Hannover),
S. Handschuh (National University of Ireland),
W. Nejdl (University of Hannover)

Track: Technology for Developing Regions 

Session: Communication in Developing Regions

Connecting the 'Bottom of the Pyramid' – An Exploratory Case Study of India's Rural Communication Environment         855
S. Seshagiri, A. Sagar, D. Joshi (Motorola India Research Laboratories)

Communication as Information-Seeking: The Case for Mobile Social Software for Developing Regions         863
B. E. Kolko, E. J. Rose, E. Johnson (University of Washington)

Optimal Audio-Visual Representations for Illiterate Users of Computers         873
I. Medhi, A. Prasad, K. Toyama (Microsoft Research Laboratories India)

Session: Networking Issues in the Web

Identifying and Discriminating Between Web and Peer-to-Peer Traffic in the Network Core         883
J. Erman, A. Mahanti, M. Arlitt, C. Williamson (University of Calgary)

Long Distance Wireless Mesh Network Planning: Problem Formulation and Solution         893
S. Sen, B. Raman (IIT Kanpur)

Is High-Quality VoD Feasible using P2P Swarming?         903
S. Annapureddy (New York University),
S. Guha (Cornell University),
C. Gkantsidis, D. Gunawardena (Microsoft Research),
P. Rodriguez (Telefonica Research)

Track: Web Engineering

Session: Web Modeling

Turning Portlets into Services: The Consumer Profile         913
O. Díaz, S. Trujillo, S. Pérez (University of the Basque Country)

A Framework for Rapid Integration of Presentation Components         923
J. Yu, B. Benatallah, R. Saint-Paul (University of New South Wales),
F. Casati (University of Trento),
F. Daniel, M. Matera (Politecnico di Milano)

Integrating Value-based Requirement Engineering Models to WebML using VIP Business Modeling Framework         933
F. Azam, Z. Li, R. Ahmad (Beijing University of Aeronautics & Astronautics)

Session: End-User Perspectives and Measurement in Web Engineering

Towards Effective Browsing of Large Scale Social Annotations         943
R. Li, S. Bao (Shanghai JiaoTong University),
B. Fei, Z. Su (IBM China Research Laboratory),
Y. Yu (Shanghai JiaoTong University)

Supporting End-Users in the Creation of Dependable Web Clips         953
S. Lingam, S. Elbaum (University of Nebraska-Lincoln)

Effort Estimation: How Valuable is it for a Web Company to Use a Cross-company Data Set, Compared to Using Its Own Single-company Data Set?         963
E. Mendes (The University of Auckland),
S. Di Martino, F. Ferrucci, C. Gravino (Univ. di Salerno)

Track: Web Services

Session: Orchestration and Choreography 

Towards the Theoretical Foundation of Choreography         973
Z. Qiu, X. Zhao, C. Cai, H. Yang (Peking University)

Introduction and Evaluation of Martlet, a Scientific Workflow Language for Abstracted Parallelisation         983
D. Goodman (Oxford University Computing Laboratory)

Semi-Automated Adaptation of Service Interactions         993
H. R. Motahari Nezhad, B. Benatallah (University of New South Wales),
A. Martens, F. Curbera (IBM T.J. Watson Research Center),
F. Casati (University of Trento)

Session: SLAs and QoS

Reliable QoS Monitoring Based on Client Feedback         1003
R. Jurca (Ecole Polytechnique Fédérale de Lausanne),
W. Binder (University of Lugano),
B. Faltings (Ecole Polytechnique Fédérale de Lausanne)

Preference-based Selection of Highly Configurable Web Services         1013
S. Lamparter, A. Ankolekar, R. Studer (University of Karlsruhe),
S. Grimm (FZI Research Center for Information Technologies)

Speeding up Adaptation of Web Service Compositions Using Expiration Times         1023
J. Harney, P. Doshi (University of Georgia)

DIANE - An Integrated Approach to Automated Service Discovery, Matchmaking and Composition         1033
U. Küster, B. König-Ries (Friedrich-Schiller University Jena),
M. Stern, M. Klein (University of Karlsruhe)

Track: XML and Web Data

Session: Querying & Transforming XML

Multiway SLCA-based Keyword Search in XML Data         1043
C. Sun, C.-Y. Chan, A. K. Goenka (National University of Singapore)

Visibly Pushdown Automata for Streaming XML         1053
V. Kumar, P. Madhusudan, M. Viswanathan (University of Illinois at Urbana-Champaign)

Mapping-Driven XML Transformation         1063
H. Jiang, H. Ho, L. Popa (IBM Almaden Research Center),
W.-S. Han (Kyungpook National University)

Session: Parsing, Normalizing, & Storing XML

Querying and Maintaining a Compact XML Storage         1073
R. K. Wong, F. Lam, W. M. Shui (University of New South Wales & Green Pea Software)

XML Design for Relational Storage         1083
S. Kolahi (University of Toronto),
L. Libkin (University of Edinburgh)

A High-Performance Interpretive Approach to Schema-Directed Parsing         1093
M. Matsa, E. Perkins, A. Heifets, M. Gaitatzes Kostoulas, D. Silva, N. Mendelsohn, M. Leger (IBM Corporation)

POSTERS

Topic: Developing Regions 

Collaborative ICT for Indian Business Clusters         1115
S. Roy, S. Biswas (Motorola India Research Laboratories)

Delay Tolerant Applications for Low Bandwidth and Intermittently Connected Users: the aAQUA Experience         1117
S. Sahni, K. Ramamritham (Indian Institute of Technology Bombay)

Topic: Search

A Cautious Surfer for PageRank         1119
L. Nie, B. Wu, B. D. Davison (Lehigh University)

A Clustering Method For Web Data with Multi-Type Interrelated Components         1121
L. Bolelli, S. Ertekin, D. Zhou, C. L. Giles (The Pennsylvania State University)

A Large-Scale Study of Robots.txt         1123
Y. Sun, Z. Zhuang, C. L. Giles (The Pennsylvania State University)

A Link-Based Ranking Scheme for Focused Search         1125
T. Abou-Assaleh, Y. Miao, T. Das, P. O'Brien, W. Gao, Z. Zhen (GenieKnows.com)

A Link Classification Based Approach to Website Topic Hierarchy Generation         1127
N. Liu, C. C. Yang (The Chinese University of Hong Kong)

A Search-based Chinese Word Segmentation Method         1129
X.-J. Wang (IBM China Research Center),
W. Liu (Huazhong University of Science & Technology),
Y. Qin (IBM China Research Center)

Anchor-based Proximity Measures         1131
A. Joshi, R. Kumar, B. Reed, A. Tomkins (Yahoo! Research)

Automatic Search Engine Performance Evaluation with Click-through Data Analysis         1133
Y. Liu, Y. Fu, M. Zhang, S. Ma (Tsinghua University),
L. Ru (Sohu Incorporation)

Automatic Searching of Tables in Digital Libraries         1135
Y. Liu, K. Bai, P. Mitra, C. L. Giles (The Pennsylvania State University)

Bayesian Network based Sentence Retrieval Model         1137
K. Cai, J. Bu, C. Chen, K. Liu, W. Chen (Zhejiang University)

Brand Awareness and the Evaluation of Search Results         1139
B. J. Jansen, M. Zhang, Y. Zhang (The Pennsylvania State University)

Causal Relation of Queries from Temporal Logs         1141
Y. Sun (Peking University),
N. Liu (Microsoft Research Asia),
K. Xie (Peking University),
S. Yan (University of Illinois at Urbana-Champaign),
B. Zhang, Z. Chen (Microsoft Research Asia)

Classifying Web Sites         1143
C. Lindemann, L. Littig (University of Leipzig)

Comparing Apples and Oranges: Normalized PageRank for Evolving Graphs         1145
K. Berberich, S. Bedathur, G. Weikum (Max-Planck Institute for Informatics),
M. Vazirgiannis (INRIA/FUTURS)

Designing Efficient Sampling Techniques to Detect Webpage Updates         1147
Q. Tan, Z. Zhuang, P. Mitra, C. L. Giles (The Pennsylvania State University)

Determining the User Intent of Web Search Engine Queries         1149
B. J. Jansen, D. L. Booth (The Pennsylvania State University),
A. Spink (Queensland University of Technology)

EPCI: Extracting Potentially Copyright Infringement Texts from the Web         1151
T. Tashiro, T. Ueda, T. Hori, Y. Hirate, H. Yamana (Waseda University & National Institute of Informatics)

Efficient Training on Biased Minimax Probability Machine for Imbalanced Text Classification         1153
X. Peng, I. King (The Chinese University of Hong Kong)

Electoral Search Using the VerkiezingsKijker: An Experience Report         1155
V. Jijkoun, M. Marx, M. de Rijke, F. van Waveren (University of Amsterdam)

Exploration of Query Context for Information Retrieval         1157
K. Cai, C. Chen, J. Bu, P. Huang, Z. Kang (Zhejiang University)

First-order Focused Crawling         1159
Q. Xu, W. Zuo (Jilin University)

Academic Web Search Engine — Generating a Survey Automatically         1161
Y. Wang, Z. Geng, S. Huang, X. Wang, A. Zhou (Fudan University)

Generative Models for Name Disambiguation         1163
Y. Song, J. Huang, I. G. Councill, J. Li, C. L. Giles (The Pennsylvania State University)

GigaHash: Scalable Minimal Perfect Hashing for Billions of URLs         1165
K. Chellapilla, A. Mityagin, D. Charles (Microsoft Live Laboratories)

How NAGA Uncoils: Searching with Entities and Relations         1167
G. Kasneci, F. M. Suchanek, M. Ramanath, G. Weikum (Max-Planck-Institut)

Identifying Ambiguous Queries in Web Search         1169
R. Song (Shanghai Jiao Tong University & Microsoft Research Asia),
Z. Luo (Fudan University),
J.-R. Wen (Microsoft Research Asia),
Y. Yu (Shanghai Jiao Tong University),
H.-W. Hong (Microsoft Research Asia)

Web Page Classification with Heterogeneous Data Fusion         1171
Z. Xu, I. King, M. R. Lyu (The Chinese University of Hong Kong)

Learning Information Diffusion Process on the Web         1173
X. Wan, J. Yang (Peking University)

MedSearch: A Specialized Search Engine for Medical Information         1175
G. Luo, C. Tang, H. Yang (IBM T.J. Watson Research Center),
X. Wei (University of Massachusetts at Amherst)

Mining Contiguous Sequential Patterns from Web Logs         1177
J. Chen (Queens College, CUNY),
T. Cook (City University of New York)

Monitoring the Evolution of Cached Content in Google and MSN         1179
I. Anagnostopoulos (University of the Aegean)

Multi-factor Clustering for a Marketplace Search Interface         1181
N. Sundaresan, K. Ganesan, R. Grandhi (eBay Research Laboratories)

On Ranking Techniques for Desktop Search         1183
S. Cohen, C. Domshlak, N. Zwerdling (Technion—Israel Institute of Technology)

Query-Driven Indexing for Peer-to-Peer Text Retrieval         1185
G. Skobeltsyn, T. Luu (Ecole Polytechnique Fédérale de Lausanne),
I. P. Žarko (University of Zagreb),
M. Rajman, K. Aberer (Ecole Polytechnique Fédérale de Lausanne)

Query Topic Detection for Reformulation         1187
X. He (Peking University),
J. Yan (Microsoft Research Asia),
J. Ma (Peking University),
N. Liu, Z. Chen (Microsoft Research Asia)

Review Spam Detection         1189
N. Jindal, B. Liu (University of Illinois at Chicago)

SCAN: A Small-World Structured P2P Overlay for Multi-Dimensional Queries         1191
X. Sun (Graduate School of Chinese Academy of Sciences)

SRing: A Structured Non DHT P2P Overlay Supporting String Range Queries         1193
X. Sun, X. Chen (Graduate School of Chinese Academy of Sciences)

Search Engine Retrieval of Changing Information         1195
Y. S. Kim, B. H. Kang (University of Tasmania),
P. Compton (The University of New South Wales),
H. Motoda (Osaka University)

Search Engines and Their Public Interfaces: Which APIs are the Most Synchronized?         1197
F. McCown, M. L. Nelson (Old Dominion University)