December 11 - 14, 2018
College of Engineering Guindy (CEG)
Anna University, Chennai, INDIA
Dr. Ricardo Baeza-Yates
NTENT & Northeastern University at SV, USA
Big Data or Right Data? Opportunities and Challenges
Big data is a fashionable topic nowadays, regardless of what people mean when they use the term. But being big is just a matter of volume, although there is no clear agreement on the size threshold. On the other hand, it is easy to capture large amounts of data using a brute-force approach. So the real goal should not be big data but to ask ourselves, for a given problem, what the right data is and how much of it is needed. For some problems this implies big data, but for most problems much less data is needed. Hence, in this presentation, we cover the opportunities and the challenges behind big data. Regarding the challenges, we explore the trade-offs involved in the main problems that arise with big data: scalability, redundancy, bias, the filter bubble and privacy.
Ricardo Baeza-Yates's areas of expertise are information retrieval, web search and data mining, data science and algorithms. He has been a Professor at Northeastern University, Silicon Valley campus, since August 2017. He has also been CTO of NTENT, a semantic search technology company based in California, since June 2016. Before that, he was VP of Research at Yahoo Labs, based in Sunnyvale, California, from August 2014 to February 2016. Earlier, he founded and led, from 2006 to 2015, the Yahoo Labs in Barcelona and Santiago de Chile. Between 2008 and 2012 he also oversaw Yahoo Labs in Haifa, Israel, and he started the London lab in 2012. He is a part-time Professor at the Dept. of Information and Communication Technologies (DTIC) of the Universitat Pompeu Fabra (UPF) in Barcelona, Spain, as well as at the Dept. of Computing Science (DCC) of Universidad de Chile in Santiago. During 2005, he was an ICREA research professor at UPF. Until 2004 he was Professor and founding director of the Center for Web Research at Universidad de Chile. He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989. Before that, he obtained two master's degrees (M.Sc. CS & M.Eng. EE) and the electronics engineer degree from the University of Chile in Santiago. He is co-author of the best-selling textbook Modern Information Retrieval, published in 1999 by Addison-Wesley with a second enlarged edition in 2011, which won the ASIST 2012 Book of the Year award. He is also co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991; and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 600 other publications. Within ACM he was the Chilean site director of the regional ACM Programming Contest and a member of the South American steering committee from 1998 to 2005. Later he was a member of the ACM Publications Board from 2007 to 2009 and of the ACM European Council from 2010 to 2014.
Finally, he was elected to the ACM Council from July 2012 to June 2016. From 2002 to 2004 he was elected to the board of governors of the IEEE Computer Society. He has received the Organization of American States award for young researchers in exact sciences (1993), the Graham Medal for innovation in computing given by the University of Waterloo to distinguished alumni (2007), the CLEI Latin American distinction for contributions to CS in the region (2009), and the National Award of the Chilean Association of Engineers (2010), among other distinctions. In 2003, he was the first computer scientist to be elected to the Chilean Academy of Sciences, and since 2010 he has been a founding member of the Chilean Academy of Engineering. In 2009, he was named an ACM Fellow and in 2011 an IEEE Fellow.
Dr. C. Chandra Sekhar
Indian Institute of Technology Madras, India
Deep Learning Models for Image Processing Tasks
Shallow learning models based on conventional machine learning techniques for pattern classification, such as Gaussian mixture models, multilayer feedforward neural networks and support vector machines, take hand-picked features as input. Recently, several deep learning models have been explored for learning a suitable representation from image data and then using the learnt representation to perform image pattern analysis tasks such as image classification, annotation and captioning. In this talk, we present deep learning models such as the stacked autoencoder, the deep convolutional neural network and the stacked restricted Boltzmann machine for learning a suitable representation from image data. Then, we present deep learning based approaches to image classification, image annotation and image captioning.
Prof. C. Chandra Sekhar received his B.Tech. degree in Electronics and Communication Engineering from Sri Venkateswara University, Tirupati, India, in 1984. He received his M.Tech. degree in Electrical Engineering and Ph.D. degree in Computer Science and Engineering from the Indian Institute of Technology (IIT) Madras in 1986 and 1997, respectively. He has been a Professor in the Department of Computer Science and Engineering at IIT Madras since 2010. He was a Japan Society for the Promotion of Science (JSPS) post-doctoral fellow at the Center for Integrated Acoustic Information Research, Nagoya University, Nagoya, Japan, from May 2000 to May 2002. Prof. Chandra Sekhar received the “Srimathi Marti Annapurna Gurunath Award for Excellence in Teaching at IIT Madras” for the year 2016. His current research interests are in speech processing, kernel methods, deep learning, distance metric learning and content-based information retrieval of multimedia data.
Dr. Jaya Sreevalsan Nair
International Institute of Information Technology Bangalore, India
Visual Analytics: “Bringing data to life”
John Tukey, the mathematician, once said the following about analytics: “This is my favorite part about analytics: Taking boring flat data and bringing it to life through visualization.” It remains true to a great extent even today, in the time of big data. The objective of this tutorial is to impress upon the audience the need for visualization as an essential part of larger data science workflows. Visualization itself has evolved from providing summaries to facilitating complex exploratory analysis of data. This tutorial will demonstrate how data can be formatted to make the best use of some time-tested visualization techniques, and how visualizations aid the overall data analysis.
Dr. Jaya Sreevalsan Nair is currently with the International Institute of Information Technology Bangalore. Her research interests include exploiting spatial locality and other analytical processes in data visualization. She applies these approaches to LiDAR point cloud analysis, multiplex networks in biology and society, and multivariate data in health informatics. She graduated with a B.Tech. in aerospace engineering from IIT Madras, an M.S. in computational engineering from Mississippi State University, and a Ph.D. in computer science from the University of California at Davis. She received the Early Career Research Award from SERB in 2017.