RIT ASA DataFest 2025

Undergraduate students from various colleges and universities in the Greater Rochester area and neighboring cities participate in a national festival of data sanctioned by the American Statistical Association.

What is ASA DataFest?

The American Statistical Association (ASA) DataFest is a 3-day data analysis competition where undergraduate teams dive into a large, complex dataset provided by a real-world organization. Hosted by the American Statistical Association, the event brings together students from the Rochester area to tackle data challenges, collaborate, and grow professionally.

Why participate?

  • Analyze one of the most complex datasets you’ve seen!
  • Collaborate with students from diverse schools and backgrounds.
  • Network with data science professionals.
  • Gain experience to highlight in job interviews.
  • Build skills in teamwork, problem-solving, and working under pressure.

Congratulations to our 2025 DataFest Winners!

Best Insight

1st Place
Team: P(enguin) Values
Captain: Nishka Desai (RIT)

2nd Place
Team: drop tables not dreams
Captain: Madeline Mariano (RIT)

3rd Place
Team: The Horrors, LLC
Captain: Quinn Dominick (RIT)


Best Use of External Data

1st Place
Team: Fisher's Niu Team
Captain: Franklin Jones (St. John Fisher University)

2nd Place
Team: mathmorelikeguessanscheck
Captain: Joshua Bourgeot (RIT)

3rd Place
Team: Babbitt's Bioinformaticians
Captain: Lilly Rowland (RIT)


Best Visualization

1st Place
Team: Chi-Squad
Captain: Shubhranshu (Dew) Dutta (University of Rochester)

2nd Place
Team: COM's Protégés
Captain: Maxwell Naccaratto (RIT)

3rd Place
Team: The Data Drivers
Captain: Jerry Chen (RIT)

Registration is closed.
 

Team Registration

Each team captain is responsible for registering the team.


Team Member Registration

Once the team is registered, the captain should encourage all members to register INDIVIDUALLY; otherwise, the team will not be able to participate.


Mentor Registration

Graduate students who wish to help mentor contestants should register as mentors.


Faculty Registration

Local faculty members and data scientists in our community who are willing to help should register as faculty.


Student Info and Guidelines

DataFest Location

Main room: Gosnell Hall - Auditorium 1250


Supplies

  • We recommend that every team member have a desktop or laptop available for use during the competition. You might find it helpful to have a mix of PCs and Macs since they have different strengths. We recommend that you make sure the software you will be using throughout the weekend is installed correctly and running on your computer before the competition. You will be working with a large dataset so make sure that you have the space for it on your drive.
  • You might want to have some of your favorite statistical or computational reference books ready to be used if you have them, and bookmark the pages that you regularly use.

Large Data Advice

  • The dataset you will be working with is quite large. If you type a variable name to view it, it will take a while to display. Use these R commands instead: head(), tail(), str().
  • We strongly recommend that you create a small dataset to test things on. Once a procedure works there, you can apply it to the large dataset. Some procedures take a long time to run on large datasets, so it is good to know that your procedure works (because you tested it on the smaller dataset) while you wait. We recommend taking a random sample of rows from the original dataset, but you might find other approaches useful.
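
The advice above can be sketched in base R roughly as follows. This is only an illustration: the data frame `big` and its columns (`id`, `group`, `value`) are made-up stand-ins, since the competition dataset is not known in advance, and `lm()` stands in for whatever procedure your team wants to try.

```r
# Hypothetical stand-in for the competition data; the real dataset and its
# column names will be different.
set.seed(2025)  # fix the seed so the same sample is drawn on every run
big <- data.frame(
  id    = seq_len(100000),
  group = sample(c("a", "b", "c"), 100000, replace = TRUE),
  value = rnorm(100000)
)

# Inspect the structure and a few rows instead of printing the whole object.
str(big)
head(big, 5)
tail(big, 5)

# Draw a simple random sample of rows (without replacement) to test on.
n_small <- 1000
small <- big[sample(nrow(big), n_small), ]

# Develop and debug the procedure on the small sample first...
fit_small <- lm(value ~ group, data = small)

# ...then run the same call on the full data once it works.
fit_big <- lm(value ~ group, data = big)
```

Setting a seed keeps the sample reproducible, so every teammate who runs the script tests against the same rows.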

DataFest Rules

  • Before downloading the dataset you must sign the non-disclosure agreement by agreeing to the terms of use and entering your name and email address. At the end of DataFest, delete all data from thumb drives, hard drives, etc. The data is sensitive. 
  • Should members of your team drop out at the last minute, you might be asked to join another team that is also missing members. 
  • A consultant will be present at all times between 9 AM and 12 midnight. Consultants are faculty, grad students, or other professionals with field-specific knowledge of the dataset. They all have different areas of expertise. Feel free to ask anything. This is not an exam, but a competition. Do not expect the consultants to write code for you, do data management, etc. They are there to help point you in the right direction, but you're responsible for getting there on your own. The schedule of consultants will be made available at the beginning of the event.

DataFest Judging

  • Each team will have five minutes to present their findings to the judges. 
  • At some point on Friday, you might want to set aside time to think about what you want the judges to know. The five-minute time limit will be strictly enforced. At least one member must be present for the presentation.
  • Your report must be submitted to the designated Google Drive by 11 AM Sunday. The only allowed format is PDF. If using a web-based tool like Google Docs, please export to PDF and send the PDF as your submission.
  • Your slides should be ready on your own computer by 1 PM Sunday, when the parallel presentations start (Zoom links will be sent to you then). You don't need to submit your slides.
  • Awards will be given in three categories:
    • Best Insight
    • Best Visualization 
    • Best Use of External Data

Formatting Requirements

1. Document Structure:

  • Use the standard Springer journal structure:
    • Title (concise, descriptive)
    • Author/Team Name(s) and Affiliation (if applicable)
    • Abstract (150–250 words, summarizing objectives, methods, key findings)
    • Keywords (4–6 terms)
    • Introduction (context, problem statement, objectives)
    • Methodology (data sources, tools, analytical techniques)
    • Results (visualizations, statistical findings)
    • Discussion (interpretation, significance, limitations)
    • Conclusion (key takeaways, future work)
    • References (Springer-compliant citations)
    • Appendices (optional, e.g., code snippets, raw data samples).

2. Formatting Details:

  • Font:
    • Main text: Times New Roman or Arial, 10–12 pt.
    • Headings: Bold, numbered hierarchically (e.g., 1, 1.1, 1.1.1).
  • Margins: 1 inch (2.54 cm) on all sides.
  • Line Spacing: 1.15 or single-spaced.
  • Paragraphs: No indentation; add 6 pt spacing between paragraphs.
  • Figures/Tables:
    • Centered, with captions below (figures) or above (tables).
    • Caption format: Figure 1 / Table 1: [Descriptive title].
    • Ensure high resolution (300+ DPI) for images.
  • Equations: Use an equation editor (e.g., LaTeX, MathType) and number sequentially.
  • Citations:
    • Use square brackets (e.g., [1]) for in-text citations.
    • Follow Springer’s reference style (e.g., APA, IEEE, or journal-specific).
  • Page Numbers: Bottom center, starting from the first page.

3. File Specifications:

  • Page Size: A4 or US Letter.
  • File Format: Submit as a single PDF (no Word/LaTeX files unless specified).
  • Naming Convention: pdf.

Submission Process

  1. Deadline: Submit by 11 AM Sunday via the designated Google Drive.
  2. Anonymization: Do not include team names in the text (only in the filename).
  3. Page Limit: Maximum 8–12 pages (including references and appendices).
  4. Supplementary Materials:
    • Attach datasets, code, or interactive visualizations as a ZIP file (if allowed).
    • Label files clearly (e.g., zip).

Additional Tips

  • Use Springer’s LaTeX or Word template (download from Springer Author Guidelines).
  • R users can use the rticles package, which provides R Markdown templates for journal articles.
  • Proofread for grammar, typos, and formatting consistency.
  • Avoid footnotes unless critical.
  • Ensure tables/figures are self-explanatory and referenced in the text.

Checklist for Participants

  • Title page with team name (filename only).
  • Abstract and keywords included.
  • Figures/tables labeled and cited.
  • Citations formatted correctly.
  • PDF previewed for layout errors.


How are you scored at DataFest@RIT?
Each category has its own rubric with five criteria for evaluating your entry.

Best Visualization

Each criterion is rated on a Likert scale (1-5)

Clarity of Message: 
1: Unclear or misleading | 3: Understandable but vague | 5: Extremely clear and intuitive

Aesthetic Design: 
1: Visually cluttered/unappealing | 3: Functional but plain | 5: Professionally designed, visually striking

Appropriateness of Visualization Type:
1: Incorrect chart/format for the data | 3: Adequate but suboptimal | 5: Perfectly matched to the data story

Creativity/Innovation: 
1: Generic or copied | 3: Some originality | 5: Highly inventive, unique approach

Technical Execution: 
1: Flawed or buggy | 3: Functional with minor issues | 5: Flawless, polished, and interactive (if applicable)

Total Score (out of 25):


Best Insight

Each criterion is rated on a Likert scale (1-5)

Significance of Insights: 
1: Superficial or obvious | 3: Moderately valuable | 5: Deep, impactful, or unexpected

Data-Driven Rigor: 
1: Opinions unsupported by data | 3: Partial evidence | 5: Fully backed by robust analysis

Novelty:
1: Conventional or recycled | 3: Some fresh angles | 5: Groundbreaking perspective

Actionability:
1: No practical application | 3: Partially actionable | 5: Clear, implementable recommendations

Methodology: 
1: Weak or flawed analysis | 3: Standard techniques | 5: Sophisticated, statistically sound approach

Total Score (out of 25):
 


Best Use of External Data

Each criterion is rated on a Likert scale (1-5)

Relevance of External Data:
1: Irrelevant or distracting | 3: Somewhat related | 5: Directly enhances the analysis

Integration with Primary Data:
1: Poorly merged or disjointed | 3: Basic linkage | 5: Seamlessly blended for richer insights

Enhancement of Analysis:
1: Adds no value | 3: Moderately improves results | 5: Transformative impact on conclusions

Data Quality/Credibility:
1: Unreliable sources | 3: Mixed credibility | 5: Authoritative, well-documented sources

Creativity in Sourcing:
1: Common/public datasets | 3: Minor unique additions | 5: Unconventional or hard-to-access data

Total Score (out of 25):

Steering Committee

Kate Koch
Student and Administrative Support Specialist
School of Mathematics and Statistics
College of Science
585-475-2498

RIT School of Mathematics and Statistics Students

Yang Liu
Graduate Student

Sheeraja Rajakrishnan
Graduate Student
 

DataFest Sponsors

SPONSORS

Inspiring and empowering the next generation of world-class data scientists! The Organizing Committee of DataFest@RIT wishes to express its thanks to all past, present, and future sponsors. Without these champions of data science, we cannot continue this tradition that strengthens our undergraduate students! You can read a brief history of the ASA DataFest and explore past participating institutions here.

Sponsorship Levels

Cauchy Sponsor - $5,000
1. Access to “Meet the Sponsors” Career Fair
2. Access to the Resume Book
3. Invitation to join the Email Listserv
4. Full-page ad in the DATAFEST@RIT 2025 conference main booklet/conference program
5. Large logo prominently placed on all DATAFEST@RIT 2025 banners
6. Logo prominently placed on the DATAFEST@RIT 2025 poster
7. Large logo and company link on the DATAFEST@RIT 2025 website
8. Short company profile and link on DATAFEST@RIT 2025 social media (FB, X, etc.)
9. Company name displayed at the DATAFEST@RIT 2025 main conference desk during event

Pareto Sponsor - $2,500
1. Access to “Meet the Sponsors” Career Fair
2. Access to the Resume Book
3. Invitation to join the Email Listserv
4. Medium logo placed on all DATAFEST@RIT 2025 banners
5. Logo placed on the DATAFEST@RIT 2025 poster
6. Medium logo and company link on the DATAFEST@RIT 2025 website
7. Short company profile and link on DATAFEST@RIT 2025 social media (FB, Twitter, etc.)
8. Company name displayed at the DATAFEST@RIT 2025 main conference desk during event

Lognormal Sponsor - $1,000
1. Access to the Resume Book
2. Invitation to join the Email Listserv
3. Small logo placed on all DATAFEST@RIT 2025 banners
4. Logo placed on the DATAFEST@RIT 2025 poster
5. Small logo and company link on the DATAFEST@RIT 2025 website
6. Short company profile and link on DATAFEST@RIT 2025 social media (FB, Twitter, etc.)
7. Company name displayed at the DATAFEST@RIT 2025 main conference desk during event

Weibull Sponsor - $500
1. Short company profile and link on DATAFEST@RIT 2025 social media (FB, Twitter, etc.)
2. Acknowledgment of company on the DATAFEST@RIT 2025 website
3. Company name displayed at the DATAFEST@RIT 2025 main conference desk during event

Gauss Sponsor - $100
1. Acknowledgment of company on the DATAFEST@RIT 2025 website
2. Company name displayed at the DATAFEST@RIT 2025 main conference desk during event

Uniform (Individual) Sponsor - $50
1. DATAFEST@RIT 2025 Memorabilia
 

Thank you to our past DataFest@RIT Sponsors!
 

We would like to thank our generous past sponsors for UPSTAT 2011-2019. Please review the sponsorship levels above and consider a donation to this event.

• Rochester Institute of Technology
• University of Rochester
• Praxair
• Xerox
• iCitizen
• Rochester Data Science Consortium
• Harris Corporation
• Wegmans
• Conduent
• Center for Quality and Applied Statistics (CQAS)
• JMP Statistical Discovery
• American Statistical Association (ASA)
• WITR 89.7
• M&T Bank
• Corning

Lognormal Sponsors:

  • Rochester Data Science Consortium
  • Wegmans
  • The American Statistical Association
  • Harris Corporation
  • M&T Bank

Weibull Sponsors:

  • Corning
  • CQAS @ RIT
  • UR Dept of Biostatistics and Computational Biology

Data Science Jobs in High Demand


Sean Lahman of the Rochester Democrat & Chronicle talks with RIT Professor Ernest Fokoué about data science and the school's DataFest (March 23, 2017).
