
Jan 25 ACM Data Mining Event - Analytics at Petabyte Scale: Cloudera and Facebook on Hadoop and Hive. Mountain View, CA

Greg Makowski
Posted Jan 19, 2010 11:23 PM
Los Altos, CA

Page on the SF Bay Area ACM web site for this Data Mining Special Interest Group event:
http://www.sfbayacm.o...

Add to your LinkedIn profile
http://events.linkedi...

Location: LinkedIn, 2027 Stierlin Ct., Mountain View, CA 94043
Time: 6:30 - 8:30 pm


TITLE 1: “Hadoop: Distributed Data Processing”

Hadoop is an open-source distributed platform designed to economically store and process data using clustered commodity hardware. Hadoop is Apache’s implementation of the MapReduce/GFS frameworks popularized by Google. In this talk we will demystify this powerful platform, and describe how it enables you to consolidate many different data storage and processing needs in an economically scalable cloud resource.

SPEAKER BIOGRAPHY
Dr. Amr Awadallah is Chief Technical Officer and Founder of Cloudera, Inc. Before Cloudera, he was vice president of product intelligence engineering at Yahoo! Inc., where he had worked since June 2000, after Yahoo acquired his first startup (VivaSmart). Dr. Awadallah received his PhD from Stanford University in 2007 and his BS and MS degrees from Cairo University in 1992 and 1995, respectively.

TITLE 2: “Facebook’s Petabyte Scale Data Warehouse Using Hive and Hadoop”

Hive is an open-source, petabyte-scale data warehousing framework built on top of Hadoop that enables scalable analytics on large data sets using SQL and some language extensions. Scalable analysis on large data sets has been core to the functions of a number of teams at Facebook, both engineering and non-engineering. This talk will highlight how Hive and Hadoop allow us at Facebook to offer a cheap, scalable, and flexible infrastructure for different kinds of analysis. We will talk about the architecture, applications, and capabilities of this infrastructure, which handles close to 8,000 jobs a day and stores nearly 2.5 PB of compressed data.

SPEAKER BIOGRAPHY
Ashish Thusoo has been with Facebook for the last couple of years and, in his most recent role, manages the Facebook data infrastructure team. He started the Hive project at Facebook along with Joydeep and serves as the project lead for Hive at Apache. He is also part of the Hadoop PMC at Apache and has presented Hive at a number of conferences, forums, and panels. Ashish has deep expertise in data processing and parallel processing technologies and infrastructure, and in the applications built on them. In the past he worked at Oracle in the areas of Parallel Query Execution and XML Databases. At Oracle he built many core data warehousing and query processing features and was recognized as one of the leaders in the Parallel Execution team. These features are regularly used in most Oracle-based data warehouses.

