Publications in 2019

-updating in progress-

Nathan LaPierre, Chelsea J.-T Ju, Guangyu Zhou and Wei Wang. MetaPheno: A Critical Evaluation of Deep Learning and Machine Learning in Metagenome-Based Disease Prediction. Methods, 2019.

Ruirui Li, Jyun-Yu Jiang, Chelsea J.-T Ju and Wei Wang. Corals: Who are My Potential New Customers? Tapping into the Wisdom of Customers’ Decisions. In proceedings of The 12th ACM International Conference on Web Search and Data Mining (WSDM’19), ACM, 2019.

Muhao Chen, Chelsea J.-T Ju, Guangyu Zhou, Tianran Zhang, Xuelu Chen, Kai-Wei Chang, Carlo Zaniolo and Wei Wang. Multifaceted Protein-Protein Interaction Prediction Based on Siamese Residual RCNN. Bioinformatics (ISMB/ECCB 2019).

Zeyu Li, Jyun-Yu Jiang, Yizhou Sun and Wei Wang. Personalized Question Routing via Heterogeneous Network Embedding. In proceedings of The 33rd AAAI Conference on Artificial Intelligence (AAAI’19), AAAI, 2019.

Ruirui Li, Liangda Li, Xian Wu, Yunhong Zhou and Wei Wang. Click Feedback-Aware Query Recommendation Using Adversarial Examples. In proceedings of The 28th World Wide Web Conference, WWW 2019.

Publications in 2018

Wenchao Yu, Cheng Zheng, Wei Cheng, Charu C. Aggarwal, Dongjin Song, Bo Zong, Haifeng Chen, Wei Wang. “Learning Deep Network Representations with Adversarially Regularized Autoencoders.” In proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, August 2018.

Seunghyun Yoo, Seungbae Kim, Joshua Joy, Mario Gerla. “Promoting Cooperative Strategies on Proof-of-Work Blockchain,” International Joint Conference on Neural Networks (IJCNN), July 2018.

Jieyu Zhao, Yichao Zhou, Zeyu Li, Wei Wang and Kai-Wei Chang. “Learning Gender-Neutral Word Embeddings.” In proceedings of The 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018.

Sara Melvin, Jonathan Lin, Seungbae Kim, Mario Gerla. “Hitchhiker: A Wireless Routing Protocol in a Delay Tolerant Network Using Density-Based Clustering,” IEEE 88th Vehicular Technology Conference (VTC-Fall), Chicago, August 2018.

Nicholas J. Matiasz, Justin Wood, Pranay Doshi, William Speier, Barry Beckemeyer, Wei Wang, William Hsu, Alcino J. Silva. “ResearchMaps.org for integrating and planning research.” PLOS One 13:e0195271.

Ruirui Li, Jyun-Yu Jiang, Chelsea J.-T Ju, Cheryl Flynn, Wen-Ling Hsu, Jia Wang, Wei Wang and Tan Xu. “Enhancing Response Generation Using Chat Flow Identification.” In Conventional AI and Its Applications Workshop at KDD 2018.

Jyun-Yu Jiang and Wei Wang. “RIN: Reformulation Inference Network for Context-Aware Query Suggestions.” In proceedings of The 27th ACM International Conference on Information and Knowledge Management, ACM 2018.

Jyun-Yu Jiang, Cheng-Te Li, Yian Chen and Wei Wang. “Identifying Users behind Shared Accounts in Online Streaming Services.” In proceedings of The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’18), ACM 2018.

Guangyu Zhou, Jyun-Yu Jiang, Chelsea J.-T Ju, and Wei Wang. “Inferring Microbial Communities for City Scale Metagenomics Using Neural Networks.” In proceedings of 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM’18), IEEE 2018.

Jyun-Yu Jiang, Francine Chen, Yan-Ying Chen and Wei Wang. “Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking.” In proceedings of The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL’18), ACL 2018.

Nathan LaPierre, Serghei Mangul, Mohammad Alser, Igor Mandric, Nicholas C. Wu, David Koslicki, and Eleazar Eskin. “MiCoP: Microbial Community Profiling method capable of detecting low abundance viral and fungal organisms in metagenomic samples.” bioRxiv, pp. 243188, Jan 2018.

Nathan LaPierre, Rob Egan, Wei Wang and Zhong Wang. “MiniScrub: De Novo long read scrubbing using approximate alignment and deep learning.” bioRxiv, pp. 433573, Oct 2018.

Chao Zhang, Fangbo Tao, Xiusi Chen, Jiaming Shen, Meng Jiang, Brian Sadler, Michelle Vanni and Jiawei Han. “TaxoGen: Constructing Topical Concept Taxonomy by Adaptive Term Embedding and Clustering.” In proceedings of The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), oral 2018.

Fangbo Tao*, Chao Zhang*, Xiusi Chen, Meng Jiang, Tim Hanratty, Lance Kaplan and Jiawei Han. “Doc2Cube: Automated Document Allocation to Text Cube via Dimension-Aware Joint Embedding.” In proceedings of The 17th IEEE International Conference on Data Mining (ICDM), oral 2018.

Publications in 2017

Chelsea J.-T. Ju, Ruirui Li, Zhengliang Wu, Jyun-Yu Jiang, Zhao Yang and Wei Wang. Fleximer: Accurate Quantification of RNA-Seq via Variable-Length k-mers. In proceedings of The 8th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, 2017.

Wei Wang, Brian Kleakley, Chelsea J.-T Ju, Vincent Kyi, Patrick Tan, Howard Choi, Xinxin Huang, Yichao Zhou, Justin Wood, Din Wang, Alex Bui and Peipei Ping. Aztec: A Platform to Render Biomedical Software Findable, Accessible, Interoperable, and Reusable. CoRR abs/1706.06087, 2017.

Chelsea J.-T Ju*, Jyun-Yu Jiang*, Ruirui Li, Zeyu Li and Wei Wang. TahcoRoll: An Efficient Approach for Signature Profiling in Genomic Data through Variable-Length K-mers. Technical report, 2017.

Chelsea J.-T Ju, Zhuangtian Zhao and Wei Wang. Efficient Approach to correct read alignment for pseudogene abundance estimates. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 14(3): 522-533, 2017.

Seungbae Kim, Jinyoung Han, Seunghyun Yoo, Mario Gerla. “How Are Social Influencers Connected in Instagram?” Social Informatics (Socinfo) 2017, Oxford UK, September 2017.

Nicholas J. Matiasz, Justin Wood, Wei Wang, Alcino J. Silva and William Hsu. Computer-Aided Experiment Planning toward Causal Discovery in Neuroscience. Front. Neuroinform. 2017.

Seungbae Kim, Mario Gerla. “Socio-Geo: Social Network Routing Protocol in Delay Tolerant Networks,” ICNC 2017.

Jyun-Yu Jiang, Pu-Jen Cheng and Wei Wang. Open Source Repository Recommendation in Social Coding. In proceedings of The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’17), ACM 2017.

Justin Wood, Patrick Tan, Wei Wang, Corey W. Arnold. Source-LDA: Enhancing Probabilistic Topic Models Using Prior Knowledge Sources. ICDE 2017: 411-422.

Mohammad Arifur Rahman, Nathan LaPierre and Huzefa Rangwala. “Phenotype Prediction from Metagenomic Data Using Clustering Assembly with Multiple Instance Learning (CAMIL).” IEEE/ACM Transactions on Computational Biology and Bioinformatics, Oct. 2017.

Mohammad Arifur Rahman, Nathan LaPierre, Huzefa Rangwala and Daniel Barbara. “Metagenome sequence clustering with hash-based canopies.” Journal of Bioinformatics and Computational Biology, vol. 15, no. 6, pp. 1740006, Oct. 2017.

Nicholas J. Matiasz, Justin Wood, Wei Wang, Alcino J. Silva and William Hsu. Translating literature into causal graphs: Toward automated experiment selection. BIBM 2017: 573-576.

Publications in 2016

Nathan LaPierre, Mohammad Arifur Rahman and Huzefa Rangwala. “CAMIL: Clustering and Assembly with Multiple Instance Learning for Phenotype Prediction.” In proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, Shenzhen, China, 2016.

Ruirui Li, Xinxin Huang, Shuo Song, Jia Wang and Wei Wang. Towards Customer Trouble Tickets Resolution Automation in Large Cellular Services. In Proceedings of 22nd International Conference on Mobile Computing and Networking (Mobicom 2016).

Publications in 2015

Wenchao Yu, Ariyam Das, Justin Wood, Wei Wang, Carlo Zaniolo and Ping Luo. Max-Intensity: Detecting Competitive Advertiser Communities in Sponsored Search Market. ICDM 2015: 569-578.

Ruirui Li and Wei Wang. REAFUM: Representative Approximate Frequent Subgraph Mining. In Proceedings of The SIAM International Conference on Data Mining (SDM 2015).

Liuli Chen, Chelsea J.-T. Ju, Ruirui Li, Wenchao Yu, and Wei Wang. Skimdiff: Transcript-level Differential Analysis of RNA-Seq Data. In Proceedings of The 6th International Conference on Bioinformatics Models, Methods and Algorithms (Bioinformatics 2015)

Computer Vision meets Big Data: Complexity and Compositionality

Speaker: Prof. Alan L. Yuille

Abstract:

Big data arises naturally in computer vision because of the enormous number and variety of images and the large range of visual tasks that we want to perform on them. Computer vision researchers must pay increasing attention to complexity issues as they develop algorithms that work on large image datasets. This talk has two parts. The first part describes practical issues that arise when working with large datasets such as Pascal and ImageNet. These include efficient algorithms, parallel implementations (e.g., GPUs), and special purpose hardware. The second part describes theoretical work that addresses what is arguably the fundamental problem of vision: how can a visual system store (represent), rapidly access (do inference), and learn the enormous number and variety of objects, and configurations of objects, that occur in the world? We propose and analyze a simplified hierarchical compositional model that can address many of these issues, and which may relate to the structure of the human visual system.

Hypothesis Exploration across Disciplines

Speaker: Prof. Stott Parker

Abstract:

A consequence of the abundance of data of all forms is that scientific research efforts are increasingly cutting across disciplines. Interdisciplinary research is difficult for many reasons, but among these are the difficulties of analyzing heterogeneous data and the lack of methods for collaborative construction of hypotheses. This is particularly true in fields like neuroscience, where the data is complex and ranges over many orders of magnitude in scale — and no single individual can hope to master it all.

In this talk I describe a system for exploration of hypotheses in phenotype data, implemented with a database obtained from several studies at UCLA. ViVA is a web-based system for analyzing hypotheses about variance structure, permitting exploratory analysis of GLMs. It permits visual identification of phenotype profiles (patterns of values across phenotypes) that characterize groups (subpopulations), and includes a variety of methods for visualization of variance. Visualization supports interdisciplinary collaboration, and enables screening and refinement of hypotheses about sets of phenotypes. With several examples we illustrate how this approach supports “natural selection” on a pool of hypotheses, and permits deeper understanding of the statistical architecture of the data.

ViVA was designed for investigation of data concerning the biological bases of traits such as memory and response inhibition phenotypes — to explore whether they can aid in moving from traditional categorical approaches for psychiatric syndromes towards more quantitative approaches based on large-scale analysis of the space of human variation. The hypotheses and data are increasingly trans-disciplinary and sophisticated, and the impact of better methods can be enormous.

Publications in 2014

Chelsea J.-T. Ju, Zhuangtian Zhao and Wei Wang. PseudoLasso: leveraging read alignment in homologous regions to correct pseudogene expression estimates via RNASeq. In Proceedings of The 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACMBCB 2014).

Bin Bi, Yuanyuan Tian, Yannis Sismanis, Andrey Balmin, Junghoo Cho, Scalable Topic-Specific Influence Analysis on Microblogs, In Proceedings of the International ACM WSDM Conference (WSDM), February 2014.

Eric Yi Liu, Andrew P Morgan, Elissa J Chesler, Wei Wang, Gary A Churchill, Fernando Pardo-Manuel de Villena, Starting at the ends: high-resolution sex-specific linkage maps of the mouse reveal polarized distribution of recombination in male germline, Genetics, 2014.

Wei Wang and Zhenyuan Wang, Total Orderings Defined on the Set of All Fuzzy Numbers, Fuzzy Sets and Systems, 2014.

Jason Phillippi, Yuying Xie, Darla R Miller, Timothy A Bell, Zhaojun Zhang, Alan B Lenarcic, David L Aylor, S Harsha Krovi, David W Threadgill, Fernando Pardo-Manuel de Villena, Wei Wang, William Valdar, and Jeffrey A Frelinger, Using the Emerging Collaborative Cross to Probe the Immune System, Genes and Immunology, 2014.

Wei Cheng, Xiaoming Jin, Jian-Tao Sun, Xuemin Lin, Xiang Zhang, and Wei Wang, Searching Dimension Incomplete Databases, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2014.

Publications in 2010

Michael J. Welch, Junghoo Cho, Walter Chang, Generating Advertising Keywords from Video Content, In Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM), October 2010.

Jun-Seok Heo, Junghoo Cho, Kyu-Young Whang, The Hybrid-Layer Index: A Synergic Approach to Answering Top-k Queries in Arbitrary Subspaces, In Proceedings of the 26th IEEE International Conference on Data Engineering (ICDE), March 2010.

Tyson Condie, Neil Conway, Peter Alvaro, Joseph M. Hellerstein, Khaled Elmeleegy, and Russell Sears. “MapReduce Online.” In Proceedings of the 7th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2010.

Wang, Jeremy, Fernando Pardo-Manuel de Villena, Wei Wang, and Leonard McMillan, Genome-wide compatible SNP intervals and their properties, Proceedings of the ACM International Conference on Bioinformatics and Computational Biology (ACMBCB), pp. 43-52, 2010.

Pakatci, Isa, Wei Wang, and Leonard McMillan, Gene set analysis using principal components, Proceedings of the ACM International Conference on Bioinformatics and Computational Biology (ACMBCB), pp. 330-333, 2010.

Eric Yi Liu, Qi Zhang, Leonard McMillan, Fernando Pardo-Manuel de Villena, and Wei Wang, Efficient genome ancestry inference in complex pedigrees with inbreeding, Proceedings of the 18th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB), Special Issue of Bioinformatics, vol. 26, no. 12, pp. 199-207, 2010.

Xiang Zhang, Shunping Huang, Fei Zou, and Wei Wang, TEAM: Efficient two-locus epistasis tests in human genome-wide association study, Proceedings of the 18th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB), Special Issue of Bioinformatics, vol. 26, no. 12, pp. 217-227, 2010.

Ning Jin, Calvin Young, and Wei Wang, GAIA: Graph classification using evolutionary computation, Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 879-890, 2010.

Xiang Zhang, Feng Pan, Yuying Xie, Fei Zou, and Wei Wang, COE: a general approach for efficient genome-wide two-locus epistasis test in disease association study, Journal of Computational Biology (JCB), vol. 17, no. 3, pp. 401-415, 2010.

Carlo A. Curino, Hyun J. Moon, Alin Deutsch and Carlo Zaniolo, Update Rewriting and Integrity Constraint Maintenance in a Schema Evolution Support System: PRISM++, PVLDB 4(2): 117-128 (2010).

Barzan Mozafari, Kai Zeng, Carlo Zaniolo, From Regular Expressions to Nested Words: Unifying Languages and Query Execution for Relational and XML Sequences. PVLDB 3(1): 150-161 (2010).

Hyun Jin Moon, Carlo Curino and Carlo Zaniolo, Scalable Architecture and Query Optimization for Transaction-time DBs with Evolving Schemas. SIGMOD Conference Indianapolis, Indiana, June 6-11, 2010: 207-218.

Arnold C.W., El-Saden S.M., Bui A.A., Taira R., “Clinical Case-based Retrieval Using Latent Topic Analysis,” AMIA Annu Symp Proc. 2010 Nov 13;2010:26-30.

Publications in 2011

Michael Welch, Junghoo Cho, Christopher Olston, Search Result Diversity for Informational Queries, In Proceedings of the 20th International World Wide Web Conference (WWW), March 2011.

Michael J. Welch, Uri Schonfeld, Dan He, Junghoo Cho, Topical Semantics of Twitter Links, In Proceedings of the 4th ACM International Conference on Web Search and Data Mining (WSDM), February 2011.

Tuan M. V. Le, Tru H. Cao, Son M. Hoang, Junghoo Cho, Ontology-based proximity search, In Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services (iiWAS), December 2011.

Markus Weimer, Tyson Condie and Raghu Ramakrishnan. Machine learning in a higher order cloud computing language. In BigLearn workshop on parallel and large-scale machine learning (NIPS), 2011.

Niketan Pansare, Vinayak R. Borkar, Chris Jermaine, Tyson Condie. Online Aggregation for Large MapReduce Jobs. In International Conference on Very Large Data Bases (VLDB), 2011.

Wei Cheng, Xiaochuan Ni, Jian-Tao Sun, Xiaoming Jin, Hye-Chung Kum, Xiang Zhang, and Wei Wang, Measuring opinion relevance in latent topic space, Proceedings of the IEEE International Conference on Social Computing (SocialCom), pp. 323-330, 2011.

Summer G Goodson, Zhaojun Zhang, James K Tsuruta, Wei Wang, and Deborah A O’Brien, Classification of mouse sperm motility patterns using an automated multiclass support vector machines model, Biology of Reproduction, vol. 84, no. 6, pp. 1207-1215, 2011.

Eric Yi Liu, Zhaojun Zhang, and Wei Wang, Clustering with relative constraints, Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 947-955, 2011.

Aylor DL, Valdar W, Foulds-Mathes W, Buus RJ, Verdugo RA, Baric RS, Ferris MT, Frelinger JA, Heise M, Frieman MB, Gralinski LE, Bell TA, Didion JD, Hua K, Nehrenberg DL, Powell CL, Steigerwalt J, Xie Y, Kelada SN, Collins FS, Yang IV, Schwartz DA, Branstetter LA, Chesler EJ, Miller DR, Spence J, Liu EY, McMillan L, Sarkar A, Wang J, Wang W, Zhang Q, Broman KW, Korstanje R, Durrant C, Mott R, Iraqi FA, Pomp D, Threadgill D, Pardo-Manuel de Villena F, Churchill GA. Genetic analysis of complex traits in the emerging collaborative cross, Genome Research, vol. 21, pp. 1213-1222, 2011.

Ning Jin and Wei Wang, LTS: Discriminative subgraph mining by learning from search history, Proceedings of the 27th IEEE International Conference on Data Engineering (ICDE), pp. 207-218, 2011.

Xiang Zhang, Shunping Huang, Fei Zou, and Wei Wang, Tools for efficient epistasis detection in genome-wide association study, Source Code for Biology and Medicine, vol. 6, no. 1, pp. 1-3, 2011.

Yan-Nei Law, Haixun Wang, and Carlo Zaniolo, Relational Languages and Data Models for Continuous Queries on Sequences and Data Streams. ACM Trans. Datab. Syst. 36, 2, Article 8 (May 2011).

Hamid Mousavi, Carlo Zaniolo, Fast and Accurate Computation of Equi-Depth Histograms over Data Streams. EDBT 2011: 69-80.

Hetal Thakkar, Nikolay Laptev, Hamid Mousavi, Barzan Mozafari, Vincenzo Russo, Carlo Zaniolo, SMM: A data stream management system for Knowledge Discovery. ICDE 2011: 757-768.

Singleton K.W., Lan M., Arnold C., Vahidi M., Arangua L., Gelberg L., Bui A.A., “Wireless Data Collection of Self-administered Surveys using Tablet Computers,” AMIA Annu Symp Proc. 2011;2011:1261-9. Epub 2011 Oct 22