Proceedings of the Sixteenth International World Wide Web Conference
(WWW2007)
May 8-12, 2007
Banff, Alberta, CANADA
PAPERS
Track: Browsers and User Interfaces
Session: Personalization
Homepage Live: Automatic Block Tracing for Web Personalization 1
J. Han, D. Han (Shanghai Jiao-Tong University),
C. Lin, H.-J. Zeng, Z. Chen (Microsoft Research Asia),
Y. Yu (Shanghai Jiao-Tong University)
Open User Profiles for Adaptive News Systems: Help or Harm? 11
J.-w. Ahn, P. Brusilovsky, J. Grady, D. He, S. Y. Syn (University of Pittsburgh)
Investigating Behavioral Variability in Web Search 21
R. W. White (Microsoft Research), S. M. Drucker (Microsoft Live Laboratories)
Session: Smarter Browsing
CSurf: A Context-Driven Non-Visual Web-Browser 31
J. Mahmud, Y. Borodin, I. V. Ramakrishnan (Stony Brook University)
GeoTracker: Geospatial and Temporal RSS Navigation 41
Y.-F. Chen, G. Di Fabbrizio, D. Gibbon, R. Jana, S. Jora, B. Renger, B. Wei (AT&T Laboratories – Research)
Learning Information Intent via Observation 51
A. Tomasic, I. Simmons, J. Zimmerman (Carnegie Mellon University)
Track: Data Mining
Session: Identifying Structure in Web Pages
Page-level Template Detection via Isotonic Smoothing 61
D. Chakrabarti, R. Kumar (Yahoo! Research),
K. Punera (University of Texas at Austin)
Towards Domain-Independent Information Extraction from Web Tables 71
W. Gatterbauer, P. Bohunsky, M. Herzog, B. Krüpl, B. Pollak (Vienna University of Technology)
Web Object Retrieval 81
Z. Nie, Y. Ma, S. Shi, J.-R. Wen, W.-Y. Ma (Microsoft Research Asia)
Session: Mining Textual Data
Summarizing Email Conversations with Clue Words 91
G. Carenini, R. T. Ng, X. Zhou (University of British Columbia)
Organizing and Searching the World Wide Web of Facts — Step Two: Harnessing the Wisdom of the Crowds 101
M. Paşca (Google Inc.)
Do Not Crawl in the DUST: Different URLs with Similar Text 111
Z. Bar-Yossef (Technion and Google Haifa Engineering Center),
I. Keidar (Technion),
U. Schonfeld (University of California at Los Angeles)
Session: Similarity Search
A New Suffix Tree Similarity Measure for Document Clustering 121
H. Chim, X. Deng (City University of Hong Kong)
Scaling Up All Pairs Similarity Search 131
R. J. Bayardo (Google, Inc.), Y. Ma (University of California at Irvine), R. Srikant (Google, Inc.)
Detecting Near-Duplicates for Web Crawling 141
G. S. Manku, A. Jain (Google Inc.), A. D. Sarma (Stanford University)
Session: Predictive Modeling of Web Users
Demographic Prediction Based on User's Browsing Behavior 151
J. Hu, H.-J. Zeng, H. Li, C. Niu, Z. Chen (Microsoft Research Asia)
Why We Search: Visualizing and Predicting User Behavior 161
E. Adar, D. S. Weld, B. N. Bershad, S. D. Gribble (University of Washington)
Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs 171
Q. Mei, X. Ling, M. Wondra (University of Illinois at Urbana-Champaign),
H. Su (Vanderbilt University),
CX. Zhai (University of Illinois at Urbana-Champaign)
Session: Mining in Social Networks
Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography 181
L. Backstrom (Cornell University),
C. Dwork (Microsoft Research),
J. Kleinberg (Cornell University)
Information Flow Modeling based on Diffusion Rate for Prediction and Ranking 191
X. Song, Y. Chi, K. Hino, B. L. Tseng (NEC Laboratories America)
NetProbe: A Fast & Scalable System for Fraud Detection in Online Auction Networks 201
S. Pandit, D. H. Chau, S. Wang, C. Faloutsos (Carnegie Mellon University)
Track: E* Applications
Session: E-Communities
The Complex Dynamics of Collaborative Tagging 211
H. Halpin (University of Edinburgh),
V. Robu (CWI, Center for Mathematics and Computer Science),
H. Shepherd (Princeton University)
Expertise Networks in Online Communities: Structure and Algorithms 221
J. Zhang, M. S. Ackerman, L. Adamic (University of Michigan)
Internet-Scale Collection of Human-Reviewed Data 231
Q. Su, D. Pavlov, J.-H. Chow, W. C. Baker (Yahoo! Inc.)
Session: E-Commerce and E-Content
DETECTIVES: DETEcting Coalition hiT Inflation attacks in adVertising nEtworks Streams 241
A. Metwally, D. Agrawal, A. El Abbadi (University of California at Santa Barbara)
Extraction and Search of Chemical Formulae in Text Documents on the Web 251
B. Sun, Q. Tan, P. Mitra, C. L. Giles (The Pennsylvania State University)
A Content-Driven Reputation System for the Wikipedia 261
B. T. Adler, L. de Alfaro (University of California at Santa Cruz)
Track: Industrial Practice & Experience
Google News Personalization: Scalable Online Collaborative Filtering 271
A. Das, M. Datar, A. Garg (Google Inc.),
S. Rajaram (University of Illinois at Urbana-Champaign)
Exploring in the Weblog Space by Detecting Informative and Affective Articles 281
X. Ni, G.-R. Xue, X. Ling, Y. Yu (Shanghai Jiao-Tong University),
Q. Yang (Hong Kong University of Science & Technology)
Spam Double-Funnel: Connecting Web Spammers with Advertisers 291
Y.-M. Wang, M. Ma (Microsoft Research),
Y. Niu, H. Chen (University of California at Davis)
Track: Performance and Scalability
Session: Scalable Systems for Dynamic Content
GlobeTP: Template-Based Database Replication for Scalable Web Applications 301
T. Groothuyse, S. Sivasubramanian, G. Pierre (Vrije Universiteit)
Consistency-preserving Caching of Dynamic Database Content 311
N. Tolia, M. Satyanarayanan (Carnegie Mellon University)
Optimized Query Planning of Continuous Aggregation Queries in Dynamic Data Dissemination Networks 321
R. Gupta (IBM India Research Laboratory),
K. Ramamritham (Indian Institute of Technology)
Session: Performance Engineering of Web Applications
A Scalable Application Placement Controller for Enterprise Data Centers 331
C. Tang, M. Steinder, M. Spreitzer, G. Pacifici (IBM T.J. Watson Research Center)
A Unified Platform for Data Driven Web Applications with Automatic Client-Server Partitioning 341
F. Yang, N. Gupta, N. Gerner, X. Qi, A. Demers, J. Gehrke (Cornell University),
J. Shanmugasundaram (Yahoo!)
MyXDNS: A Request Routing DNS Server with Decoupled Server Selection 351
H. A. Alzoubi, M. Rabinovich (Case Western Reserve University),
O. Spatscheck (AT&T Research Laboratories)
Track: Pervasive Web and Mobility
Robust Web Page Segmentation for Mobile Terminal Using Content-Distances and Page Layout Information 361
G. Hattori, K. Hoashi, K. Matsumoto, F. Sugaya (KDDI R&D Laboratories)
PRIVÉ: Anonymous Location-Based Queries in Distributed Mobile Systems 371
G. Ghinita, P. Kalnis (National University of Singapore),
S. Skiadopoulos (University of Peloponnese)
A Mobile Application Framework for the Geospatial Web 381
R. Simon, P. Fröhlich (Telecommunications Research Center Vienna)
Track: Search
Session: Search Potpourri
Navigation-Aided Retrieval 391
S. Pandit (Carnegie Mellon University),
C. Olston (Yahoo! Research)
Efficient Search Engine Measurements 401
Z. Bar-Yossef, M. Gurevich (Technion - Israel Institute of Technology)
Efficient Search in Large Textual Collections with Redundancy 411
J. Zhang, T. Suel (Polytechnic University)
Session: Crawlers
The Discoverability of the Web 421
A. Dasgupta, A. Ghosh, R. Kumar, C. Olston, S. Pandey, A. Tomkins (Yahoo! Research)
Combining Classifiers to Identify Online Databases 431
L. Barbosa, J. Freire (University of Utah)
An Adaptive Crawler for Locating Hidden-Web Entry Points 441
L. Barbosa, J. Freire (University of Utah)
Session: Web Graphs
Random Web Crawls 451
T. Bennouas (Criteo R&D),
F. de Montgolfier (LIAFA - Université Paris 7)
Extraction and Classification of Dense Communities in the Web 461
Y. Dourisboure, F. Geraci, M. Pellegrini (Istituto di Informatica e Telematica)
Web Projections: Learning from Contextual Subgraphs of the Web 471
J. Leskovec (Carnegie Mellon University),
S. Dumais, E. Horvitz (Microsoft Research)
Session: Search Quality and Precision
Supervised Rank Aggregation 481
Y.-T. Liu (Microsoft Research Asia & Beijing Jiaotong University),
T.-Y. Liu (Microsoft Research Asia),
T. Qin (Microsoft Research Asia & Tsinghua University),
Z.-M. Ma (Chinese Academy of Science),
H. Li (Microsoft Research Asia)
Navigating the Intranet with High Precision 491
H. Zhu (IBM Almaden Research Center),
A. Löser (SAP Research CEC Dresden),
S. Raghavan, S. Vaithyanathan (IBM Almaden Research Center)
Optimizing Web Search Using Social Annotations 501
S. Bao, X. Wu (Shanghai JiaoTong University),
B. Fei (IBM China Research Laboratory),
G. Xue (Shanghai JiaoTong University),
Z. Su (IBM China Research Laboratory),
Y. Yu (Shanghai JiaoTong University)
Session: Advertisements & Click Estimates
Robust Methodologies for Modeling Web Click Distributions 511
K. Ali, M. Scarr (Yahoo!)
Predicting Clicks: Estimating the Click-Through Rate for New Ads 521
M. Richardson (Microsoft Research),
E. Dominowska (Microsoft),
R. Ragno (Microsoft Research)
Dynamics of Bid Optimization in Online Advertisement Auctions 531
C. Borgs, J. Chayes (Microsoft Research),
O. Etesami (University of California at Berkeley),
N. Immorlica, K. A. Jain (Microsoft Research),
M. Mahdian (Yahoo! Research)
Session: Knowledge Discovery
Compare&Contrast: Using the Web to Discover Comparable Cases for News Stories 541
J. Liu, E. Wagner, L. Birnbaum (Northwestern University)
Answering Bounded Continuous Search Queries in the World Wide Web 551
D. Kukulenz (Institute of Information Systems),
A. Ntoulas (Microsoft Search Laboratories)
Answering Relationship Queries on the Web 561
G. Luo, C. Tan, Y.-l. Tian (IBM T.J. Watson Research Center)
Session: Personalization
Dynamic Personalized Pagerank in Entity-Relation Graphs 571
S. Chakrabarti (IIT Bombay)
A Large-scale Evaluation and Analysis of Personalized Search Strategies 581
Z. Dou (Nankai University),
R. Song, J.-R. Wen (Microsoft Research Asia)
Privacy-Enhancing Personalized Web Search 591
Y. Xu (Simon Fraser University),
B. Zhang, Z. Chen (Microsoft Research Asia),
K. Wang (Simon Fraser University)
Track: Security, Privacy, Reliability, & Ethics
Session: Defending Against Emerging Threats
Defeating Script Injection Attacks with Browser-Enforced Embedded Policies 601
T. Jim (AT&T Laboratories – Research),
N. Swamy, M. Hicks (University of Maryland)
Subspace: Secure Cross-Domain Communication for Web Mashups 611
C. Jackson (Stanford University),
H. J. Wang (Microsoft Research)
Exposing Private Information by Timing Web Applications 621
A. Bortz, D. Boneh (Stanford University), P. Nandy
On Anonymizing Query Logs via Token-based Hashing 629
R. Kumar, J. Novak, B. Pang, A. Tomkins (Yahoo! Research)
Session: Passwords and Phishing
CANTINA: A Content-Based Approach to Detecting Phishing Web Sites 639
Y. Zhang (University of Pittsburgh),
J. Hong, L. Cranor (Carnegie Mellon University)
Learning to Detect Phishing Emails 649
I. Fette, N. Sadeh, A. Tomasic (Carnegie Mellon Univ.)
A Large-Scale Study of Web Password Habits 657
D. Florêncio, C. Herley (Microsoft Research)
Session: Access Control and Trust on the Web
A Fault Model and Mutation Testing of Access Control Policies 667
E. Martin, T. Xie (North Carolina State University)
Analyzing Web Access Control Policies 677
V. Kolovski, J. Hendler (University of Maryland),
B. Parsia (University of Manchester)
Compiling Cryptographic Protocols for Deployment on the Web 687
J. McCarthy (Brown University),
J. D. Guttman, J. D. Ramsdell (MITRE Corporation),
S. Krishnamurthi (Brown University)
Track: Semantic Web
Session: Ontologies
YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia 697
F. M. Suchanek, G. Kasneci, G. Weikum (Max-Planck-Institut)
Ontology Summarization Based on RDF Sentence Graph 707
X. Zhang, G. Cheng, Y. Qu (Southeast University)
Just the Right Amount: Extracting Modules from Ontologies 717
B. C. Grau, I. Horrocks, Y. Kazakov, U. Sattler (The University of Manchester)
Session: Applications
Toward Expressive Syndication on the Web 727
C. Halaschek-Wiener, J. Hendler (University of Maryland)
Exhibit: Lightweight Structured Data Publishing 737
D. F. Huynh, D. R. Karger, R. C. Miller (Massachusetts Institute of Technology)
Explorations in the Use of Semantic Web Technologies for Product Information Management 747
J.-S. Brunner, L. Ma, C. Wang, L. Zhang (IBM China Research Laboratory),
D. C. Wolfson (IBM Software Group),
Y. Pan (IBM China Research Laboratory),
K. Srinivas (IBM T.J. Watson Research Center)
Session: Similarity and Extraction
Measuring Semantic Similarity between Words Using Web Search Engines 757
D. Bollegala (The University of Tokyo),
Y. Matsuo (National Institute of Advanced Industrial Science & Technology),
M. Ishizuka (The University of Tokyo)
Using Google Distance to Weight Approximate Ontology Matches 767
R. Gligorov, Z. Aleksovski, W. ten Kate (Philips Research),
F. van Harmelen (Vrije Universiteit)
Hierarchical, Perceptron-like Learning for Ontology-Based Information Extraction 777
Y. Li, K. Bontcheva (University of Sheffield)
Session: Query Languages and DBs
From SPARQL to Rules (and back) 787
A. Polleres (Universidad Rey Juan Carlos)
SPARQ2L: Towards Support for Subgraph Extraction Queries in RDF Databases 797
K. Anyanwu, A. Maduko (University of Georgia),
A. Sheth (Wright State University)
Bridging the Gap Between OWL and Relational Databases 807
B. Motik, I. Horrocks, U. Sattler (University of Manchester)
ActiveRDF: Object-Oriented Semantic Web Programming 817
E. Oren, R. Delbru, S. Gerke, A. Haller, S. Decker (National University of Ireland)
Session: Semantic Web and Web 2.0
The Two Cultures: Mashing up Web 2.0 and the Semantic Web 825
A. Ankolekar, M. Krötzsch, T. Tran, D. Vrandecic (Universität Karlsruhe)
Analysis of Topological Characteristics of Huge Online Social Networking Services 835
Y.-Y. Ahn, S. Han, H. Kwak, S. Moon, H. Jeong (KAIST)
P-TAG: Large Scale Automatic Generation of Personalized Annotation TAGs for the Web 845
P.-A. Chirita, S. Costache (University of Hannover),
S. Handschuh (National University of Ireland),
W. Nejdl (University of Hannover)
Track: Technology for Developing Regions
Session: Communication in Developing Regions
Connecting the 'Bottom of the Pyramid' – An Exploratory Case Study of India's Rural Communication Environment 855
S. Seshagiri, A. Sagar, D. Joshi (Motorola India Research Laboratories)
Communication as Information-Seeking: The Case for Mobile Social Software for Developing Regions 863
B. E. Kolko, E. J. Rose, E. Johnson (University of Washington)
Optimal Audio-Visual Representations for Illiterate Users of Computers 873
I. Medhi, A. Prasad, K. Toyama (Microsoft Research Laboratories India)
Session: Networking Issues in the Web
Identifying and Discriminating Between Web and Peer-to-Peer Traffic in the Network Core 883
J. Erman, A. Mahanti, M. Arlitt, C. Williamson (University of Calgary)
Long Distance Wireless Mesh Network Planning: Problem Formulation and Solution 893
S. Sen, B. Raman (IIT Kanpur)
Is High-Quality VoD Feasible using P2P Swarming? 903
S. Annapureddy (New York University),
S. Guha (Cornell University),
C. Gkantsidis, D. Gunawardena (Microsoft Research),
P. Rodriguez (Telefonica Research)
Track: Web Engineering
Session: Web Modeling
Turning Portlets into Services: The Consumer Profile 913
O. Díaz, S. Trujillo, S. Pérez (University of the Basque Country)
A Framework for Rapid Integration of Presentation Components 923
J. Yu, B. Benatallah, R. Saint-Paul (University of New South Wales),
F. Casati (University of Trento),
F. Daniel, M. Matera (Politecnico di Milano)
Integrating Value-based Requirement Engineering Models to WebML using VIP Business Modeling Framework 933
F. Azam, Z. Li, R. Ahmad (Beijing University of Aeronautics & Astronautics)
Session: End-User Perspectives and Measurement in Web Engineering
Towards Effective Browsing of Large Scale Social Annotations 943
R. Li, S. Bao (Shanghai JiaoTong University),
B. Fei, Z. Su (IBM China Research Laboratory),
Y. Yu (Shanghai JiaoTong University)
Supporting End-Users in the Creation of Dependable Web Clips 953
S. Lingam, S. Elbaum (University of Nebraska-Lincoln)
Effort Estimation: How Valuable is it for a Web Company to Use a Cross-company Data Set, Compared to Using Its Own Single-company Data Set? 963
E. Mendes (The University of Auckland),
S. Di Martino, F. Ferrucci, C. Gravino (Univ. di Salerno)
Track: Web Services
Session: Orchestration and Choreography
Towards the Theoretical Foundation of Choreography 973
Z. Qiu, X. Zhao, C. Cai, H. Yang (Peking University)
Introduction and Evaluation of Martlet, a Scientific Workflow Language for Abstracted Parallelisation 983
D. Goodman (Oxford University Computing Laboratory)
Semi-Automated Adaptation of Service Interactions 993
H. R. Motahari Nezhad, B. Benatallah (University of New South Wales),
A. Martens, F. Curbera (IBM T.J. Watson Research Center),
F. Casati (University of Trento)
Session: SLAs and QoS
Reliable QoS Monitoring Based on Client Feedback 1003
R. Jurca (Ecole Polytechnique Fédérale de Lausanne),
W. Binder (University of Lugano),
B. Faltings (Ecole Polytechnique Fédérale de Lausanne)
Preference-based Selection of Highly Configurable Web Services 1013
S. Lamparter, A. Ankolekar, R. Studer (University of Karlsruhe),
S. Grimm (FZI Research Center for Information Technologies)
Speeding up Adaptation of Web Service Compositions Using Expiration Times 1023
J. Harney, P. Doshi (University of Georgia)
DIANE - An Integrated Approach to Automated Service Discovery, Matchmaking and Composition 1033
U. Küster, B. König-Ries (Friedrich-Schiller University Jena),
M. Stern, M. Klein (University of Karlsruhe)
Track: XML and Web Data
Session: Querying & Transforming XML
Multiway SLCA-based Keyword Search in XML Data 1043
C. Sun, C.-Y. Chan, A. K. Goenka (National University of Singapore)
Visibly Pushdown Automata for Streaming XML 1053
V. Kumar, P. Madhusudan, M. Viswanathan (University of Illinois at Urbana-Champaign)
Mapping-Driven XML Transformation 1063
H. Jiang, H. Ho, L. Popa (IBM Almaden Research Center),
W.-S. Han (Kyungpook National University)
Session: Parsing, Normalizing, & Storing XML
Querying and Maintaining a Compact XML Storage 1073
R. K. Wong, F. Lam, W. M. Shui (University of New South Wales & Green Pea Software)
XML Design for Relational Storage 1083
S. Kolahi (University of Toronto),
L. Libkin (University of Edinburgh)
A High-Performance Interpretive Approach to Schema-Directed Parsing 1093
M. Matsa, E. Perkins, A. Heifets, M. Gaitatzes Kostoulas, D. Silva, N. Mendelsohn, M. Leger (IBM Corporation)
POSTERS
Topic: Developing Regions
Collaborative ICT for Indian Business Clusters 1115
S. Roy, S. Biswas (Motorola India Research Laboratories)
Delay Tolerant Applications for Low Bandwidth and Intermittently Connected Users: the aAQUA Experience 1117
S. Sahni, K. Ramamritham (Indian Institute of Technology Bombay)
Topic: Search
A Cautious Surfer for PageRank 1119
L. Nie, B. Wu, B. D. Davison (Lehigh University)
A Clustering Method For Web Data with Multi-Type Interrelated Components 1121
L. Bolelli, S. Ertekin, D. Zhou, C. L. Giles (The Pennsylvania State University)
A Large-Scale Study of Robots.txt 1123
Y. Sun, Z. Zhuang, C. L. Giles (The Pennsylvania State University)
A Link-Based Ranking Scheme for Focused Search 1125
T. Abou-Assaleh, Y. Miao, T. Das, P. O'Brien, W. Gao, Z. Zhen (GenieKnows.com)
A Link Classification Based Approach to Website Topic Hierarchy Generation 1127
N. Liu, C. C. Yang (The Chinese University of Hong Kong)
A Search-based Chinese Word Segmentation Method 1129
X.-J. Wang (IBM China Research Center),
W. Liu (Huazhong University of Science & Technology),
Y. Qin (IBM China Research Center)
Anchor-based Proximity Measures 1131
A. Joshi, R. Kumar, B. Reed, A. Tomkins (Yahoo! Research)
Automatic Search Engine Performance Evaluation with Click-through Data Analysis 1133
Y. Liu, Y. Fu, M. Zhang, S. Ma (Tsinghua University),
L. Ru (Sohu Incorporation)
Automatic Searching of Tables in Digital Libraries 1135
Y. Liu, K. Bai, P. Mitra, C. L. Giles (The Pennsylvania State University)
Bayesian Network based Sentence Retrieval Model 1137
K. Cai, J. Bu, C. Chen, K. Liu, W. Chen (Zhejiang University)
Brand Awareness and the Evaluation of Search Results 1139
B. J. Jansen, M. Zhang, Y. Zhang (The Pennsylvania State University)
Causal Relation of Queries from Temporal Logs 1141
Y. Sun (Peking University),
N. Liu (Microsoft Research Asia),
K. Xie (Peking University),
S. Yan (University of Illinois at Urbana-Champaign),
B. Zhang, Z. Chen (Microsoft Research Asia)
Classifying Web Sites 1143
C. Lindemann, L. Littig (University of Leipzig)
Comparing Apples and Oranges: Normalized PageRank for Evolving Graphs 1145
K. Berberich, S. Bedathur, G. Weikum (Max-Planck Institute for Informatics),
M. Vazirgiannis (INRIA/FUTURS)
Designing Efficient Sampling Techniques to Detect Webpage Updates 1147
Q. Tan, Z. Zhuang, P. Mitra, C. L. Giles (The Pennsylvania State University)
Determining the User Intent of Web Search Engine Queries 1149
B. J. Jansen, D. L. Booth (The Pennsylvania State University),
A. Spink (Queensland University of Technology)
EPCI: Extracting Potentially Copyright Infringement Texts from the Web 1151
T. Tashiro, T. Ueda, T. Hori, Y. Hirate, H. Yamana (Waseda University & National Institute of Informatics)
Efficient Training on Biased Minimax Probability Machine for Imbalanced Text Classification 1153
X. Peng, I. King (The Chinese University of Hong Kong)
Electoral Search Using the VerkiezingsKijker: An Experience Report 1155
V. Jijkoun, M. Marx, M. de Rijke, F. van Waveren (University of Amsterdam)
Exploration of Query Context for Information Retrieval 1157
K. Cai, C. Chen, J. Bu, P. Huang, Z. Kang (Zhejiang University)
First-order Focused Crawling 1159
Q. Xu, W. Zuo (Jilin University)
Academic Web Search Engine — Generating a Survey Automatically 1161
Y. Wang, Z. Geng, S. Huang, X. Wang, A. Zhou (Fudan University)
Generative Models for Name Disambiguation 1163
Y. Song, J. Huang, I. G. Councill, J. Li, C. L. Giles (The Pennsylvania State University)
GigaHash: Scalable Minimal Perfect Hashing for Billions of URLs 1165
K. Chellapilla, A. Mityagin, D. Charles (Microsoft Live Laboratories)
How NAGA Uncoils: Searching with Entities and Relations 1167
G. Kasneci, F. M. Suchanek, M. Ramanath, G. Weikum (Max-Planck-Institut)
Identifying Ambiguous Queries in Web Search 1169
R. Song (Shanghai Jiao Tong University & Microsoft Research Asia),
Z. Luo (Fudan University),
J.-R. Wen (Microsoft Research Asia),
Y. Yu (Shanghai Jiao Tong University),
H.-W. Hong (Microsoft Research Asia)
Web Page Classification with Heterogeneous Data Fusion 1171
Z. Xu, I. King, M. R. Lyu (The Chinese University of Hong Kong)
Learning Information Diffusion Process on the Web 1173
X. Wan, J. Yang (Peking University)
MedSearch: A Specialized Search Engine for Medical Information 1175
G. Luo, C. Tang, H. Yang (IBM T.J. Watson Research Center),
X. Wei (University of Massachusetts at Amherst)
Mining Contiguous Sequential Patterns from Web Logs 1177
J. Chen (Queens College, CUNY),
T. Cook (City University of New York)
Monitoring the Evolution of Cached Content in Google and MSN 1179
I. Anagnostopoulos (University of the Aegean)
Multi-factor Clustering for a Marketplace Search Interface 1181
N. Sundaresan, K. Ganesan, R. Grandhi (eBay Research Laboratories)
On Ranking Techniques for Desktop Search 1183
S. Cohen, C. Domshlak, N. Zwerdling (Technion—Israel Institute of Technology)
Query-Driven Indexing for Peer-to-Peer Text Retrieval 1185
G. Skobeltsyn, T. Luu (Ecole Polytechnique Fédérale de Lausanne),
I. P. Žarko (University of Zagreb),
M. Rajman, K. Aberer (Ecole Polytechnique Fédérale de Lausanne)
Query Topic Detection for Reformulation 1187
X. He (Peking University),
J. Yan (Microsoft Research Asia),
J. Ma (Peking University),
N. Liu, Z. Chen (Microsoft Research Asia)
Review Spam Detection 1189
N. Jindal, B. Liu (University of Illinois at Chicago)
SCAN: A Small-World Structured P2P Overlay for Multi-Dimensional Queries 1191
X. Sun (Graduate School of Chinese Academy of Sciences)
SRing: A Structured Non DHT P2P Overlay Supporting String Range Queries 1193
X. Sun, X. Chen (Graduate School of Chinese Academy of Sciences)
Search Engine Retrieval of Changing Information 1195
Y. S. Kim, B. H. Kang (University of Tasmania),
P. Compton (The University of New South Wales),
H. Motoda (Osaka University)
Search Engines and Their Public Interfaces: Which APIs are the Most Synchronized? 1197
F. McCown, M. L. Nelson (Old Dominion University)