December 2024
Welcome to the December edition of the Data Science Initiative newsletter!
In this issue, we bring you updates on groundbreaking research, upcoming events, and achievements from our vibrant data science community. Whether you’re catching up on past developments or seeking ways to engage, we’re excited to share the latest from the DSI!
Explore our website for more, and as always, thank you for supporting the advancement of data science at UMN.
Notice: The DSI Seed Grant Showcase has been rescheduled
The showcase will now take place in February 2025. See more information in our events section and RSVP today!
Join us to celebrate the impactful work of our DSI seed grant recipients, foster collaboration, and spark ideas for future research. Light refreshments will be provided—we look forward to seeing you there!
This Friday: AI Makerspace Hours with a Pop-In Seminar
Don’t miss this Friday’s AI Makerspace hours, where the AI Society club will host a pop-in seminar featuring a beginner-friendly overview of AI agentic models, swarm intelligence, and retrieval-augmented generation (RAG). The session will include a ~30-minute presentation, followed by a ~30-minute interactive activity where you can create and explore a chatbot based on your interests. It’s the perfect opportunity to learn, connect, and engage with the exciting world of AI!
RSVP Now!
Featured Article/Celebrating Success
Building Bridges: Advancing Healthcare Innovation Through AI and Data Science
On November 15, the inaugural workshop of the AID-H working group brought together attendees from across the University of Minnesota. Participants represented a variety of colleges, showcasing the interdisciplinary nature of this initiative. The event was a launchpad for meaningful collaborations at the intersection of healthcare, data science, and artificial intelligence. The presentations underscored the diverse ways AI and data science are advancing healthcare research and practice.
Key Themes and Next Steps
Based on participant feedback and discussions during the first workshop, four initial themes have been identified:
- Cancer
- Alzheimer’s Disease
- Mental Health
- Pharmacy and Nutrition
Additional themes are being anonymously contributed by workshop members, with new suggestions already emerging.
To further these efforts, the working group encourages continued engagement:
- Shared Spreadsheet
Indicate your interest in one or more themes, propose new themes, or share your expertise using the shared spreadsheet (Opens Google Sheets). Feel free to leave comments or elaborate on your interest or expertise in the themes, which will help us organize upcoming events.
- Slack Discussions
Join the AID-H Slack workspace to engage in discussions, create theme-specific channels, and share updates.
- Monthly Thematic Series
Starting in 2025, the group will organize monthly sessions focusing on one theme at a time. These will include presentations and group discussions addressing specific challenges and opportunities.
Get Involved
- Add interested colleagues to the shared spreadsheet (Opens Google Sheets).
- Explore opportunities for funding through the DSI Seed Grants to support pilot projects and new collaborations.
This workshop was just the beginning. We look forward to building a vibrant community advancing healthcare innovation through AI and data science.
The Data Science Initiative (DSI) regularly features interviews with members of our vibrant data science community, delving into their perspectives on the field and its implications for their careers. This month, we're excited to introduce Ayisha Tabbassum, whose expertise and insights shed light on the evolving landscape of data science. Follow the link to discover her answers to our thought-provoking questions on the intersection of data science and career advancement.
In this edition of our newsletter, we are privileged to hear from Ayisha Tabbassum.
This month’s spotlight features Ayisha Tabbassum, a dynamic researcher and one of the speakers at our WiADS 2024 event. Ayisha’s work bridges the cutting edge of AI, multi-cloud architectures, and data systems, focusing on vulnerabilities in machine learning models and optimizing data operations. Her insights range from surprising discoveries in cross-cloud efficiencies to the importance of holistic system security.
Ayisha shared her definition of Data Science as the engine driving digital transformation and highlighted exciting tools like LangChain, Hugging Face Transformers, and MLflow that she and her students are using to push the boundaries of AI applications. Looking ahead, she envisions a future shaped by privacy-preserving AI, explainable systems, and the evolution of unified data ecosystems.
Read Ayisha’s full responses to gain deeper insights into her impactful work and vision for the future of data science!
Research Spotlight - Seed Grant Awardee
Title: Stochastic Optimization for Constrained Deep Learning
PI(s): Ju Sun, Zhaosong Lu
DSI Track: Foundational
MnDRIVE Area(s): Robotics and Sensors, Global Food, Environment, Brain Conditions
Summary Paragraph:
Deep learning powers many of today's artificial intelligence applications, but it has limitations. This research aims to improve "constrained deep learning," which allows models to incorporate requirements such as fairness and to handle imbalanced data. Methods exist to solve these problems, but they don't scale well to massive datasets with millions of images or data points. The researchers will develop new mathematical optimization techniques to solve constrained deep learning problems efficiently at scale. These stochastic methods will randomly sample small batches of data rather than using the entire dataset each time. The methods will be tested on real-world problems related to algorithmic fairness and learning from imbalanced medical data. They will also be integrated into NCVX, an existing software package for constrained deep learning. The project combines expertise in deep learning and optimization and could enable new advances in trustworthy and efficient AI systems. The software will be freely available to other researchers. Follow-on work will focus on collaborations and applications in science, engineering, and medicine.
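For readers curious what "randomly sampling small batches" under a constraint can look like, here is a toy sketch. It is not the researchers' method and does not use NCVX. It assumes a simple quadratic-penalty approach, and the function name `minibatch_penalty_sgd`, the one-parameter model, the constraint `w <= bound`, and all default values are hypothetical choices made for illustration only.

```python
import random


def minibatch_penalty_sgd(xs, ys, bound=1.0, penalty=10.0, lr=0.01,
                          batch_size=8, steps=2000, seed=0):
    """Fit y ~ w * x with mini-batch SGD and a quadratic penalty for w <= bound.

    Illustrative assumption: the constraint is handled by adding
    penalty * max(0, w - bound)**2 to the loss (a soft constraint).
    """
    rng = random.Random(seed)
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Sample a small random batch instead of using the whole dataset.
        batch = rng.sample(range(n), batch_size)
        # Gradient of the mean squared error on this batch only.
        grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / batch_size
        # Gradient of the penalty term; zero while the constraint holds.
        violation = max(0.0, w - bound)
        grad += 2.0 * penalty * violation
        w -= lr * grad
    return w


# Toy data whose best unconstrained slope is 2, which violates w <= 1.
data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x for x in xs]

w_constrained = minibatch_penalty_sgd(xs, ys)
w_unconstrained = minibatch_penalty_sgd(xs, ys, penalty=0.0)
print(w_constrained, w_unconstrained)
```

With the penalty switched on, the fitted slope settles close to the bound rather than at the unconstrained optimum. A penalty only enforces the constraint approximately, which is one reason dedicated constrained-optimization methods are an active research topic.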
Events
AI Makerspace Hours
When: Every other Friday from September 13, 2024 to April 18, 2025
Where: Walter 575
*Notice: there are no makerspace hours between December 7 and Jan 23. They will resume Jan 24.
The DSI and MSI invite all students, staff, and faculty to our AI Makerspace Hours, a unique event where you can dive into AI with hands-on experience on our state-of-the-art HPC. With the support of our expert MSI staff, you'll learn everything from basic coding to training advanced generative AI models. Enjoy access to dedicated HPC nodes for practical learning and a set of comprehensive tutorials.
RSVPs are not required but highly recommended; otherwise, attendees will need to spend a few minutes creating an account on the HPC. Please bring your own laptop (it doesn't need to be a high-performance one). There will be one or two laptops available to loan out if needed.
Don’t miss out on this opportunity to learn, explore, and innovate with us! RSVP Now!
Rescheduled: DSI Seed Grant Showcase
When: Tuesday, February 11th, 2025 from 3:00 to 5:00 p.m.
Where: Coffman Union’s Mississippi Room
Join us to celebrate the impactful work of the DSI seed grant recipients. This event fosters community, encourages collaboration, and sparks ideas for future research development. Light refreshments will be provided—don’t miss this chance to connect with fellow data science enthusiasts and explore cutting-edge research.
Data Discovery Across Departments
Events in other departments/initiatives/institutions (External, Non-DSI Events)
CSE - DSI Hackathon - Machine Learning Challenge
When: December 6th, 2024 from 1:30 - 4:30 p.m.
Where: PAN 110
Join the CSE Data Science Initiative for an exciting hackathon tackling a machine learning challenge created by the NSF's HDR Institutes! This event focuses on anomaly detection using three curated datasets from astronomy, biology, and climate science.
Participants will explore the datasets, learn how to submit models, and have the chance to form groups and kickstart their projects. Whether you're a seasoned machine learning enthusiast or a curious beginner, everyone is welcome!
Bonus: Pizza provided!
This event will replace the usual DSI Journal Club meeting. To help us plan, please sign up here.
Integrating AI into your assignments and exploring the pedagogical implications
When: January 15th, 2025
Where: Virtual
Register for this short course where you will be introduced to the basics of generative AI (GenAI) and create/reimagine an assignment for your spring 2025 course. You will determine under what circumstances GenAI may be used and create a policy that explains your GenAI use. Finally, attend a synchronous small group discussion among peers to discuss decisions made about GenAI use. The format includes approximately 3-4 hours of asynchronous work (available January 15th) and a 90-minute synchronous workshop on February 20, 2025.
Student focus groups: When and why students use AI
When: January 15th, 2025 1:00 - 2:00 p.m.
Where: Virtual
This hour-long webinar shares the results of student focus groups held in November 2024 on the topic of generative AI in course work. Undergraduate and graduate students from multiple disciplines and campuses were asked to talk about:
- how generative AI is impacting their engagement in course assignments,
- when they are using AI tools and for what purposes, and
- what they see as the implications of generative AI for their own education and development.
The results offer actionable insights to instructors as they adjust teaching practices in order to leverage the opportunities these tools provide, while mitigating the potential negative impacts on learning and education.
University of Minnesota Day of Data 2025
When: Wednesday, January 15th & Thursday, January 16th, 2025
Where: Virtual
Join us for the University of Minnesota Day of Data 2025! This year features a series of virtual events to foster data enthusiasm and critical thinking around applications of data. Topics include the use and ethics of data and AI, a special webinar from the Financial Systems User Network (FSUN) on visualizing and analyzing financial data, and an opportunity to organize your files in the Day of Data Clean Up. These events are free of charge and open to all students, faculty, staff, and alumni from all University of Minnesota campuses. Whether you are new to data or a data expert, you are welcome! Virtual events are spread out over two exciting "days of data". Attend one or both days.
UMN Interdisciplinary Health Data Competition
The Business Advancement Center for Health (BACH) at the Carlson School will host the 6th Annual UMN Interdisciplinary Health Data Competition, where graduate students from diverse disciplines collaborate to tackle real-world healthcare challenges using data science. Participants will explore datasets, develop innovative solutions, and present their insights to sponsors and judges.
Key Details:
Cash Awards:
- 1st Place: $4,000 per team
- 2nd Place: $2,000 per team
- 3rd Place: $1,000 per team
Important Dates:
- Registration: Nov. 18 - Dec. 15, 2024
- Kickoff Event: Feb. 4, 2025
- Final Presentations: Feb. 26, 2025
Registration Link (Opens Google forms)
NIH Data Sharing Index (S-index) Challenge
Are you passionate about Open Science and FAIR data principles?
Calling all researchers, healthcare professionals, data scientists, informaticians, and anyone interested in expanding data sharing in research to participate in the Data Sharing Index (S-index) Challenge.
Total prize pool of $1 million, with a first prize of $500,000!
Led by the National Eye Institute (NEI) with contributions from multiple National Institutes of Health (NIH) Institutes, Centers, and Offices (ICOs), the challenge aims to incentivize the creation, development, and validation of a quantitative data sharing index.
Call for Proposals: 2025 Data-Intensive Research Conference
When: August 6-7, 2025
Where: Minneapolis, MN & Virtual
The IPUMS Big Microdata Network and NDIRA, a collaboration between IPUMS and the University of Minnesota Life Course Center, are currently accepting submissions for the 2025 Data-Intensive Research Conference, to be held in person in Minneapolis, Minnesota; key components of the program will also be available to virtual participants. The conference theme is Understanding Health and Population Dynamics through Big Microdata. The deadline to apply is January 31, 2025. Submit an abstract.
Learning Resources
EE 8950: High Dimensional Statistics and Machine Learning Spring 2025
This course provides a rigorous introduction to high-dimensional probability, statistics, and machine learning from a non-asymptotic perspective. We focus on high-dimensional machine learning problems with hidden low-dimensional structures, using non-asymptotic analysis to understand the interplay between sample complexity, dimension, and structural parameters. Key topics include tail bounds, concentration inequalities, and random matrices, with applications in sparse models and smooth function estimation. Emphasis will be placed on empirical process theory and regularization, enabling students to understand complexity management in statistical machine learning.
Course: Why Data Science?
Data Informs Decision-Making and Drives Innovation
Data science is the study of data to extract meaningful insights.
Data Science is a multidisciplinary approach that combines principles and practices from the fields of mathematics, statistics, artificial intelligence, and computer engineering to analyze large amounts of information. GEMS Learning courses are modular data science education tailored to food, agriculture, and natural resource applications for working professionals and students. Across the curriculum, instructors have built their course content from their own work executing large-scale data science projects to solve pressing agricultural problems.
Fall 2024 Courses
Computing Basics for the Agri-food Sector (self paced)
Are you a field or bench scientist who has always wanted to feel more comfortable with your computing skills? These self-paced online courses are designed for those who have never used the command line but realize that the responsibilities they have or will soon take on require them to automate tasks. Learn basic UNIX command-line skills, work remotely on more powerful machines, create and run scripts to automate complex workflows, and synchronize your scripts with the larger community using GitHub.
- Introducing the GEMS platform + Jupyter Lab
- Demystifying the UNIX command line
- Working Remotely and Scheduling Jobs on MSI’s systems
- Synching your work with the community
Explicitly Accounting for Location in Agriculture in Python (self paced)
Learn how to work with spatial data in Python, starting from importing different spatial datasets and creating simple maps, to conducting basic geocomputation on vector and raster data.
Funding Opportunities and Deadlines
If you're interested in exploring these or other data science opportunities, whether it's finding the right team, preparing your submission, or partnering with industry on federal and state funding initiatives, please reach out—we’re here to help!
- Quad AI-ENGAGE - AI-ENGAGE, NSF, JST, ICAR and CSIRO invite joint multilateral proposals from researchers in at least three of the Quad countries. Proposals involving researchers from all four countries are encouraged and will be prioritized for funding. Proposers from the Quad countries collaborate to write a single proposal that will undergo a single review process coordinated by NSF, the Coordinating Agency. Questions regarding details of AI-ENGAGE submission to partner agencies should be addressed to: Yuta Kobayashi, JST, Department of Moonshot R&D Program, [email protected]
- NEH Digital Humanities Advancement Grants - The Digital Humanities Advancement Grants program (DHAG) supports work that is innovative, experimental, and contributes to the critical infrastructure that underpins scholarly research, teaching, and public programming in the humanities. Optional draft due: Nov. 13, 2024; deadline: Jan. 9, 2025
- NIH AHRQ - Examining the Impact of AI on Healthcare Safety - The purpose of this NOFO is to invite grant applications that support healthcare safety by determining (1) whether and how certain breakthrough uses of Artificial Intelligence (AI) systems can affect patient safety; and (2) how AI systems can be safely implemented and used. AI has the potential to improve the safety, effectiveness, efficiency, accessibility, and affordability of healthcare. However, as with most technologies, this potential must be balanced by identifying and mitigating potential risks for patient harm and user burden. Deadline: Jan. 25, 2024
- NSF Emerging Mathematics in Biology - The Emerging Mathematics in Biology (eMB) program seeks to stimulate the development of innovative mathematical theories, techniques, and approaches to investigate challenging questions of great interest to biologists and public health policymakers. It supports the development of the mathematical foundation of Artificial Intelligence/Deep Learning/Machine Learning (AI/DL/ML) enabling explainable AI or mechanistic insight. The program emphasizes the uses of mathematical methodologies to advance our understanding of complex, dynamic, and heterogeneous biological systems at all scales (molecular, cellular, organismal, population, ecosystems, evolutionary, etc.). Deadline - March 3, 2025
- NSF Ethical and Responsible Research (ER2) - The ER2 program supports projects that focus on what constitutes or promotes responsible and ethical research in science, technology, engineering, and mathematics (STEM) fields. The ER2 program promotes the development, improvement, and dissemination of responsible and ethical research practices and aims to build on organizational cultures that value and reward such practices. Deadline - January 23, 2025
- DOD FY25 Minerva Research Initiative University Research Program - Minerva’s University Research program aims to support innovative basic research projects that contribute to the advancement of social science and provide new methods and understandings on social and behavioral questions of security and defense-related interest. Minerva aims to improve DoD's basic understanding of the social, cultural, behavioral, and political forces that shape regions of the world of strategic importance to the U.S. The research program seeks to: leverage and focus the resources of the Nation's top universities; define and develop foundational knowledge about sources of present and future conflict with an eye toward better understanding of the political trajectories of key regions of the world; and improve the ability of DoD to develop cutting-edge social science research and foreign area and interdisciplinary studies that are developed and vetted by the best scholars in these fields. Deadline - Feb. 28, 2025
For students:
- The NSF PACK fellowship: The PACK fellowship is a graduate student opportunity to conduct research at the University of Kiel, Germany for 3 weeks. Applicants from any science or engineering discipline are encouraged to apply now!
Open Positions: Assistantships and Internships
- Ecology Assistant - Americorps, Remote positions
- Honeywell Internships - Honeywell is looking for 8 Artificial Intelligence interns in Honeywell Aerospace (US Person Required) and 22 interns for other Honeywell businesses (Non US Persons Possible)
- Quantinuum Internships - Quantinuum is looking for summer interns in Health Safety and Environment, Manufacturing Engineering, Optics Engineering, Metasurface Design (PhD), Trapped-ion Quantum Computing Theory (PhD), PMO Project Engineering, and Technical Solutions Consulting.
Social Media/Website Links