This paper appears in: Information Technology in Biomedicine, IEEE Transactions on
Publication Date: April 2006
Volume: 10, Issue: 2
On page(s): 254-263
ISSN: 1089-7771
INSPEC Accession Number: 9029999
Digital Object Identifier: 10.1109/TITB.2005.859885
Posted online: 2006-04-03 15:36:10.0

Otey ME, Parthasarathy S, Trost DC. Dissimilarity measures for detecting hepatotoxicity in clinical trial data. Proceedings of the Sixth SIAM International Conference on Data Mining, April 2006; 507-511.

Proceedings of the Sixth SIAM International Conference on Data Mining
Edited by Joydeep Ghosh, Diane Lambert, David Skillicorn, and Jaideep Srivastava
Proceedings in Applied Mathematics 124
Conference held in Bethesda, Maryland on April 20-22, 2006.

The Sixth SIAM International Conference on Data Mining continues the tradition of presenting approaches, tools, and systems for data mining in fields such as science, engineering, industrial processes, healthcare, and medicine. The conference was sponsored by the Center for Applied Scientific Computing at the Lawrence Livermore National Laboratory and the American Statistical Association, continuing a trend towards greater collaboration between the two communities.
Contents: Message from the Conference Co-Chairs; Preface; Area Under ROC Optimisation Using a Ramp Approximation; On the Necessary and Sufficient Conditions of a Meaningful Distance Function for High Dimensional Data Space; CPM: A Covariance-Preserving Projection Method; Transform Regression and the Kolmogorov Superposition Theorem; A Latent Dirichlet Model for Unsupervised Entity Resolution; Deriving Private Information from Randomly Perturbed Ratings; Name Reference Resolution in Organizational Email Archives; Automated Knowledge Discovery from Simulators; Mining for Outliers in Sequential Databases; Mining Control Flow Abnormality for Logic Error Isolation; Scan Detection: A Data Mining Approach; Learning Bayesian Networks from Incomplete Data: An Efficient Method for Generating Approximate Predictive Distributions; Efficient Markov Network Structure Discovery Using Independence Tests; K-Means Clustering over a Large, Dynamic Network; Adapting K-Medians to Generate Normalized Cluster Centers; Advanced Prototype Machines: Exploring Prototypes for Classification; Toward Semantic XML Clustering; A Semantic Approach for Mining Hidden Links from Complementary and Non-interactive Biomedical Literature; Representation Is Everything: Towards Efficient and Adaptable Similarity Measures for Biological Data; Mining Frequent Agreement Subtrees in Phylogenetic Databases; Trend Relational Analysis and Grey-Fuzzy Clustering Method; Joint Cluster Analysis of Attribute Data and Relationship Data: The Connected k-Center Problem; Weighted Clustering Ensembles; Clustering in the Presence of Bridge-Nodes; Mining Frequent Patterns from Very High Dimensional Data: A Top-Down Row Enumeration Approach; Mining Frequent Patterns by Differential Refinement of Clustered Bitmaps; Discovery of Co-evolving Spatial Event Sets; Efficient Algorithms for Sequence Segmentation; Density-Based Clustering over an Evolving Data Stream with Noise; A Random Walks Method for Text Classification; Efficient Mining of Temporally Annotated Sequences; A Framework for Local Supervised Dimensionality Reduction of High Dimensional Data; Segmentation and Dimensionality Reduction; Probabilistic Multi-state Split-Merge Algorithm for Coupling Parameter Estimates; Item Sets That Compress; Mining Approximate Frequent Itemsets in the Presence of Noise: Algorithm and Analysis; Mining Frequent Closed Itemsets Out of Core; Local L2-Thresholding Based Data Mining in Peer-to-Peer Systems; Collaborative Information Extraction and Mining from Multiple Web Documents; Collaborative Document Clustering; Cluster Description Formats, Problems and Algorithms; Positive Borders or Negative Borders: How to Make Lossless Generator Based Representations Concise; Bayesian K-Means as a "Maximization-Expectation" Algorithm; A Framework for Clustering Massive Text and Categorical Data Streams; Cone Cluster Labeling for Support Vector Clustering; Semi-supervised Clustering with Partial Background Information; A New Privacy-Preserving Distributed k-Clustering Algorithm; ODAC: Hierarchical Clustering of Time Series Data Streams; Detecting the Change of Clustering Structure in Categorical Data Streams; Dissimilarity Measures for Detecting Hepatotoxicity in Clinical Trial Data; Transductive De-noising and Dimensionality Reduction Using Total Bregman Regression; Robust Estimation for Mixture of Probability Tables Based on β-likelihood; Fast Optimal Bandwidth Selection for Kernel Density Estimation; Risk-Sensitive Learning via Expected Shortfall Minimization; On Approximate Solutions to Support Vector Machines; Confidence Estimation Methods for Partially Supervised Relation Extraction; Inference of Node Replacement Recursive Graph Grammars; Learning from Incomplete Ratings Using Non-negative Matrix Factorization; Health Monitoring of a Shaft Transmission System via Hybrid Models of PCR and PLS; Modeling Evolutionary Behaviors for Community-Based Dynamic Recommendation; A Systematic Cross-Comparison of Sequence Classifiers; Data-Enhanced Predictive Modeling for Sales Targeting; Graph-Based Methods for Orbit Classification; Mining and Validating Localized Frequent Itemsets with Dynamic Tolerance; Profiling Protein Families from Partially Aligned Sequences; Personalized Knowledge Discovery: Mining Novel Association Rules from Text; A Novel Framework for Incorporating Labeled Examples into Anomaly Detection; Towards the Prediction of Protein Abundance from Tandem Mass Spectrometry Data; Using Compression to Identify Classes of Inauthentic Texts; Fast Mining of Distance-Based Outliers in High-Dimensional Datasets; Spatial Weighted Outlier Detection; Robust Clustering for Tracking Noisy Evolving Data Streams; WIP: Mining Weighted Interesting Patterns with a Strong Weight and/or Support Affinity; Discovering Frequent Tree Patterns over Data Streams; Finding Sequential Patterns from Massive Number of Spatio-temporal Events; Mining Minimal Contrast Subgraph Patterns; Author Index.

2006 / xii + 646 pages / Softcover / ISBN-13: 978-0-898716-11-5 / ISBN-10: 0-89871-611-X / List Price $160.00 / Member Price $112.00 / Order Code PR124