NTENT & Northeastern University at SV, USA
Big Data or Right Data? Opportunities and Challenges
Monday December 10, 09:00 – 10:00 | Ada Lovelace Auditorium, IST Dept.
Big data is a fashionable topic nowadays, regardless of what people mean when they use the term. But being big is just a matter of volume, and there is no clear agreement on the size threshold. Moreover, it is easy to capture large amounts of data using a brute-force approach. So the real goal should not be big data; instead, we should ask ourselves, for a given problem, what the right data is and how much of it is needed. For some problems this implies big data, but for most problems much less data is needed. In this presentation, we cover the opportunities and the challenges behind big data. Regarding the challenges, we explore the trade-offs involved in the main problems that arise with big data: scalability, redundancy, bias, the filter bubble and privacy.
Ricardo Baeza-Yates's areas of expertise are information retrieval, web search and data mining, data science and algorithms. He has been a Professor at Northeastern University, Silicon Valley campus, since August 2017. Since June 2016 he has also been CTO of NTENT, a semantic search technology company based in California. Before that, he was VP of Research at Yahoo Labs, based in Sunnyvale, California, from August 2014 to February 2016. Earlier, he founded and led, from 2006 to 2015, the Yahoo labs in Barcelona and Santiago de Chile. Between 2008 and 2012 he also oversaw Yahoo Labs in Haifa, Israel, and he started the London lab in 2012. He is a part-time Professor at the Dept. of Information and Communication Technologies (DTIC) of the Universitat Pompeu Fabra (UPF) in Barcelona, Spain, as well as at the Dept. of Computing Science (DCC) of Universidad de Chile in Santiago. During 2005, he was an ICREA research professor at UPF. Until 2004 he was Professor and founding director of the Center for Web Research at Universidad de Chile. He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989. Before that, he obtained two master's degrees (M.Sc. CS & M.Eng. EE) and the electronics engineer degree from the University of Chile in Santiago. He is co-author of the best-selling textbook Modern Information Retrieval, published in 1999 by Addison-Wesley with a second, enlarged edition in 2011, which won the ASIST 2012 Book of the Year award. He is also co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991, and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 600 other publications. Within ACM, he was the Chilean site director of the regional ACM Programming Contest and a member of the South American steering committee from 1998 to 2005. Later he was a member of the ACM Publications Board from 2007 to 2009 and of the ACM European Council from 2010 to 2014.
Finally, he served as an elected member of the ACM Council from July 2012 to June 2016. From 2002 to 2004 he was an elected member of the Board of Governors of the IEEE Computer Society. He has received the Organization of American States award for young researchers in exact sciences (1993), the Graham Medal for innovation in computing, given by the University of Waterloo to distinguished alumni (2007), the CLEI Latin American distinction for contributions to CS in the region (2009), and the National Award of the Chilean Association of Engineers (2010), among other distinctions. In 2003, he was the first computer scientist to be elected to the Chilean Academy of Sciences, and since 2010 he has been a founding member of the Chilean Academy of Engineering. He was named ACM Fellow in 2009 and IEEE Fellow in 2011.