National Science Foundation Award #0551639

Collaborative Research: CRI - Scalable Benchmarks, Software and Data for Data Mining, Analytics and Scientific Discoveries

 
Investigator(s): Alok Choudhary (PI) ; Gokhan Memik (Co-PI)
Sponsor: Northwestern University, IL 60208 8474913003
Start Date/Expiration Date: 2006-03-15 to 2007-02-28 (amended 2006-03-13)
Awarded Amount to Date: $84,764
Abstract: This collaborative project develops a broad suite of data mining benchmarks. It defines benchmark data sets and efficient algorithms for important data mining kernels, establishing a comprehensive benchmark suite for data mining applications. Applications using data mining algorithms are now common enough to warrant research into a data mining benchmark that can be used to evaluate new processor architectures and to compare new data mining algorithms. As an initial and significant step toward benchmarks, test suites, and data sets that can drive the design, implementation, and growth of systems from the processor level to the application level, the project pursues two goals: (1) develop a benchmarking suite that will be used to understand the bottlenecks in high-performance data mining and guide the development of next-generation processors, and (2) devise data mining kernels that can be executed efficiently on existing and future processors. Benchmarks play a major role in advancing architectures, software scalability, networks, and other IT disciplines. They not only measure the relative performance of different systems but also aid research and development, from architectures to applications, in terms of quality, scalability, cost, execution time, and other measures. By establishing a benchmark with accompanying tools for data access and usage, analyzing the applications in the suite in detail, and developing a testbed for these analyses, the work contributes a community resource that can help with design evaluation, comparison, and improvement of processor architectures, algorithms, and scalable systems.
Broader Impact: By providing a standardized way of evaluating and comparing algorithms, applications, designs, and products, the results of this project have the potential to directly advance several fields, including data mining algorithms and applications, newer architectures, and system design for data-intensive computing. The project opens the way to a new industry segment addressing data-intensive computing, similar to the segments that grew out of media, networking, and signal processing applications. Moreover, the resource contributes to education by providing the community with software, tools, and data that can be used in the classroom.
NSF Org: CNS - Division of Computer and Network Systems
Award Number: 0551639
Award Instrument: Continuing grant
Program Manager: Rita V. Rodriguez
CNS Division of Computer and Network Systems
CSE Directorate for Computer & Information Science & Engineering
NSF Program(s): COMPUTING RES INFRASTRUCTURE
Field Application(s): Computer Science
Program Reference Code(s): BASIC RESEARCH & HUMAN RESORCS, 9218
Program Element Code(s): 7359