Publications in 2012

Youngchul Cha, Junghoo Cho, Social-network analysis using topic models. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), August 2012.

Chu-Cheng Hsieh, Junghoo Cho, Finding similar items by leveraging social tag clouds. In Proceedings of the ACM Symposium on Applied Computing (SAC), March 2012.

Yingyi Bu, Vinayak Borkar, Michael J. Carey, Joshua Rosen, Neoklis Polyzotis, Tyson Condie, Markus Weimer and Raghu Ramakrishnan. “Scaling Datalog for Machine Learning on Big Data.” Tech Report (arXiv:1203.0160), 2012.

Xiang Zhang, Shunping Huang, Zhaojun Zhang, Wei Wang, Mining Genome-Wide Genetic Markers, PLoS Computational Biology, vol. 8, no. 12, e1002828, 2012.

Eric Yi Liu, Zhishan Guo, Xiang Zhang, Vladimir Jojic, and Wei Wang, Metric learning from relative comparisons by minimizing squared residual, Proceedings of the 12th IEEE International Conference on Data Mining (ICDM), pp. 978-983, 2012.

Wei Cheng, Xiang Zhang, Feng Pan, and Wei Wang, Hierarchical co-clustering based on entropy splitting, Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM), pp. 1472-1476, 2012.

Wei Cheng, Xiang Zhang, Wei Wang, Yubao Wu, Xiaolin Yin, Jing Li and David Heckerman, Inferring novel associations between SNP sets and gene sets in eQTL study using sparse graphical model, Proceedings of the ACM International Conference on Bioinformatics and Computational Biology (ACMBCB), pp. 466-472, 2012.

Xiang Zhang, Wei Cheng, Jennifer Listgarten, Carl Kadie, Shunping Huang, Wei Wang, and David Heckerman, Learning transcriptional regulatory relationships using sparse graphical models, PLoS ONE, vol. 7, no. 5, e35762, 2012.

Eric Yi Liu, Steven Buyske, Aaron K. Aragaki, Ulrike Peters, Eric Boerwinkle, Chris Carlson, Cara Carty, Dana C. Crawford, Jeff Haessler, Lucia A. Hindorff, Loic Le Marchand, Teri A. Manolio, Tara Matise, Wei Wang, Charles Kooperberg, Kari E. North, and Yun Li, Genotype imputation of Metabochip SNPs using a study-specific reference panel of ~4,000 haplotypes in African Americans from the Women's Health Initiative, Genetic Epidemiology, vol. 36, no. 2, pp. 107-117, 2012.

Xiang Zhang, Shunping Huang, Wei Sun, and Wei Wang, Rapid and robust resampling-based multiple testing correction with application in genome-wide eQTL study, Genetics, vol. 190, no. 4, pp. 1511-1520, 2012.

James J Crowley, Yunjung Kim, Jin Peng Szatkiewicz, Amanda L Pratt, Corey R Quackenbush, Daniel E Adkins, Edwin van den Oord, Molly A Bogue, Hyuna Yang, Wei Wang, David W Threadgill, Fernando Pardo-Manuel de Villena, Howard L McLeod, and Patrick F Sullivan, Genome-wide association mapping of loci for antipsychotic-induced extrapyramidal symptoms in mice, Mammalian Genome, vol. 23, no. 5-6, pp. 322-335, 2012.

Mingsheng Long, Jianmin Wang, Guiguang Ding, Wei Cheng, Xiang Zhang, and Wei Wang, Dual transfer learning, Proceedings of the 12th SIAM International Conference on Data Mining (SDM), pp. 540-551, 2012.

Kai Xia, Andrey A Shabalin, Shunping Huang, Vered Madar, Yi-Hui Zhou, Wei Wang, Fei Zou, Wei Sun, Patrick F Sullivan, Fred A Wright, seeQTL: a searchable database for human eQTLs, Bioinformatics, vol. 28, no. 3, pp. 451-452, 2012.

Collaborative Cross Consortium, The genome architecture of the Collaborative Cross mouse genetic reference population, Genetics, vol. 190, no. 2, pp. 389-401, 2012.

Zhaojun Zhang, Xiang Zhang, and Wei Wang, HTreeQA: using semi-perfect phylogeny trees in quantitative trait loci study on genotype data, G3: Genes, Genomes, Genetics, vol. 2, no. 2, pp. 175-189, 2012.

Mirjana Mazuran, Edoardo Serra, Carlo Zaniolo, Extending the Power of Datalog Recursion. VLDB Journal, accepted November 2012.

Nikolay Laptev, Carlo Zaniolo and Tsai-Ching Lu, BOOT-TS: A Scalable Bootstrap for Massive Time-Series Data. Big Learning: NIPS 2012 Workshop. December 8, Lake Tahoe, Nevada, USA.

Nikolay Laptev, Kai Zeng and Carlo Zaniolo, Early Accurate Results for Advanced Analytics on MapReduce. PVLDB 5(10): 1028-1039 (2012).

Shi Gao, Carlo Zaniolo, Supporting Database Provenance under Schema Evolution. ER Workshops 2012: 67-77.

Carlo Zaniolo, Logical Foundations of Continuous Query Languages for Data Streams. Datalog 2012: 177-189.

Shi Gao and Carlo Zaniolo, Provenance Management in Databases Under Schema Evolution. 4th USENIX Workshop on the theory and practice of provenance. June 14-15, Boston MA.

Braverman, Rafail Ostrovsky, Carlo Zaniolo, Optimal Sampling From Sliding Windows. J. Comput. Syst. Sci., 78(1): 260-272 (2012).

Maurizio Atzori, Carlo Zaniolo, SWiPE: searching wikipedia by example. WWW (Companion Volume) 2012: 309-312.

Barzan Mozafari, Kai Zeng, Carlo Zaniolo, High-performance complex event processing over XML streams. SIGMOD Conference 2012: 253-264.

Nikolay Laptev, Carlo Zaniolo, Optimization of Massive Pattern Queries by Dynamic Configuration Morphing. ICDE 2012: 917-928.

Publications in 2013

Jun-Seok Heo, Junghoo Cho, Kyu-Young Whang, Subspace top-k query processing using the hybrid-layer index with a tight bound. Data and Knowledge Engineering, 83: 1-19 (2013).

Youngchul Cha, Bin Bi, Chu-Cheng Hsieh, Junghoo Cho, Incorporating Popularity in Topic Models for Social Network Analysis, In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), July 2013.

Wei Cheng, Wei Wang, and Sandra Batista, Grid-based Clustering, Data Clustering: Algorithms and Applications Chapter 6, by Charu C. Aggarwal and Chandan K. Reddy, CRC Press, 2013.

Zhaojun Zhang, Shunping Huang, Jack Wang, Xiang Zhang, Fernando Pardo Manuel de Villena, Leonard McMillan, and Wei Wang, GeneScissors: a comprehensive approach to detecting and correcting spurious transcriptome inference due to RNAseq reads misalignment, Proceedings of the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB), Special Issue of Bioinformatics, 2013.

Wei Cheng, Xiaoming Jin, Jian-Tao Sun, Xuemin Lin, Xiang Zhang, and Wei Wang, Searching Dimension Incomplete Databases, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2013.

Jin Szatkiewicz, Weibo Wang, Patrick Sullivan, Wei Wang, and Wei Sun, Improving detection of copy number variation by simultaneous bias correction and read-depth segmentation, Nucleic Acids Research, vol. 41, no. 3, pp. 1519-1532, 2013.

Eric Yi Liu, Mingyao Li, Wei Wang, and Yun Li, MaCH-Admix: genotype imputation for admixed populations, Genetic Epidemiology, vol. 37, no. 1, pp. 25-37, 2013.

Alexander Shkapsky, Kai Zeng, Carlo Zaniolo, Graph Queries in a Next-Generation Datalog System. PVLDB 6(12): 1258-1261 (2013).

Hamid Mousavi, Shi Gao, Carlo Zaniolo, IBminer: A Text Mining Tool for Constructing and Populating InfoBox Databases and Knowledge Bases. PVLDB 6(12): 1330-1333 (2013).

Mirjana Mazuran, Edoardo Serra, Carlo Zaniolo, A declarative extension of horn clauses, and its significance for datalog and its applications. TPLP 13(4-5): 609-623 (2013).

Carlo Curino, Hyun Jin Moon, Alin Deutsch, Carlo Zaniolo, Automating the database schema evolution process. VLDB J. 22(1): 73-98 (2013).

Mirjana Mazuran, Edoardo Serra, Carlo Zaniolo, Extending the power of Datalog recursion. VLDB J. 22(4): 471-493 (2013).

Nikolay Laptev, Kai Zeng, Carlo Zaniolo, Very fast estimation for result and accuracy of big data analytics: The EARL system. ICDE 2013: 1296-1299.

Kai Zeng, Mohan Yang, Barzan Mozafari, Carlo Zaniolo, Complex pattern matching in complex structures: The XSeq approach. ICDE 2013: 1328-1331.

Elio Masciari, Shi Gao, Carlo Zaniolo, Sequential pattern mining from trajectory data. IDEAS 2013: 162-167.

Elio Masciari, Giuseppe Mazzeo and Carlo Zaniolo, A New, Fast and Accurate Algorithm for Hierarchical Clustering on Euclidean Distances. PAKDD (2) 2013: 111-122.

Hamid Mousavi, Carlo Zaniolo, Fast computation of approximate biased histograms on sliding windows over Data Streams. SSDBM 2013: 13 (Best Paper Award).

 

Welcome to the Scalable Analytics Institute!

The vast volume of data produced every day is creating transformative opportunities in science and industry. The bottleneck limiting progress in science and industrial productivity has shifted from generating massive datasets to interpreting and exploiting them, a task that falls to sophisticated analytical applications able to extract actionable knowledge from the data. High-performance scalable analytics are needed to tame the ever-growing volume of data and application complexity. To address the research challenges and opportunities in the new technology area of Big Data, the UCLA Henry Samueli School of Engineering and Applied Science (SEAS) launched the Scalable Analytics Institute (ScAI) in 2013. ScAI currently has more than 20 Ph.D. students and research scientists.