Projects – Fall 2023


Sponsor Background

Arts NC State is the collective organization for six of NC State University's performing and visual arts programs, including The Crafts Center, the Department of Performing Arts & Technology (Dance & Music), the Gregg Museum of Art & Design, the NC State LIVE performing artist series, and University Theatre. Arts NC State provides development, marketing, outreach & engagement, and ticketing support to these units and serves the entire NC State campus.

Background and Problem Statement

The Curricular Connections Guide is one of the key programs Arts NC State (ANCS) offers through its outreach and engagement office; it links NC State faculty to arts programs that thematically connect with their courses. It has evolved from an undesigned PDF/paper guide to a designed digital and print guide, and became online-only during the pandemic.

Each semester, the manager of outreach and engagement, Amy Sawyers-Williams, and her team of student interns analyze the thematic content of the upcoming arts programs offered by the six units ANCS serves. They then go through the course catalog by department and copy/paste relevant class info into a spreadsheet (for example, here is the spring 2023 spreadsheet). For example, if University Theatre were going to produce Hamilton, we would make a list of thematic connections such as history/the American Revolution, war, political theory, psychology, and dance. We would then search the catalog for courses connected to these themes.

Once the spreadsheet is complete and all class connections have been made, we reach out to the faculty teaching these courses to let them know and to encourage them to engage with the art, for example by offering extra credit for students who see the show or by having a guest artist visit their class.

The problem is that manually reviewing the course catalog for thematic connections to our programs is time-consuming, and on top of that, the results are subjective, depending on the researcher's experience and their knowledge of both the program and the class. We are almost certainly missing class connections that a computer program could detect.

Project Description

We want a software system to facilitate making the connections between written artistic themes and courses in the course catalog. For example, what if faculty could go to a website and type in the class they teach or a concept like “feminism” and then receive a list of the programs that may connect to it?

This app should provide a way for administrators to manage shows and their connections to courses in the NC State course catalog.

There are a few websites that do this or something similar for other organizations.

Registration and Records is unable to provide programmatic access to the course catalog. Instead, the system should provide an administrative interface that will allow an admin to upload and maintain a catalog file requested from Registration and Records each semester with up-to-date course information. With this file, the app will be able to identify courses that have been added, deleted, or modified. Students working on this project will have access to this file for the Fall 2023 course catalog. There should also be an interface to curate these records and establish/modify the associations between courses and shows that will ultimately be displayed on the public-facing side of the website.
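
Since the catalog arrives as a periodic file rather than through an API, detecting added, deleted, and modified courses reduces to diffing two keyed snapshots. A minimal sketch, assuming a CSV export keyed by a course-code column (the column and file names here are illustrative, not the actual Registration and Records format):

    import csv

    def load_catalog(path):
        """Map course code -> full row, assuming a 'code' column (e.g., 'CSC 316')."""
        with open(path, newline="") as f:
            return {row["code"]: row for row in csv.DictReader(f)}

    def diff_catalogs(old_path, new_path):
        old, new = load_catalog(old_path), load_catalog(new_path)
        added = [c for c in new if c not in old]
        deleted = [c for c in old if c not in new]
        modified = [c for c in new if c in old and new[c] != old[c]]
        return added, deleted, modified

    # Usage: compare the previous upload against the new semester's file.
    added, deleted, modified = diff_catalogs("spring2023.csv", "fall2023.csv")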

Technologies and Other Constraints

We would like this to ultimately be a tool on a webpage that could be accessible via desktop or mobile device. 

Because this tool will be hosted and supported by the University, students should confirm with us that any technology choices are compatible with what the University can support. This typically means either a project built on the LAMP stack (PHP, MySQL) or the creation of custom WordPress plugins that augment the public display of information, as well as the administrative side of WordPress, with the features needed to manage courses, shows, and the relationships between them.

We are also hoping that the students can design something that Arts NC State staff can be trained on, so that it is sustainable for the future. For example, we would like to know how to load in the shows, the course catalog data, and the keywords that would be pulled.

This is flexible and we are open to all ideas!

Sponsor Background  

Deutsche Bank is a truly global universal bank that has operated for 150 years on multiple continents, across all major business lines, and is a cornerstone of the global and local economies. We are leaders and innovators who are driving the financial industry forward, increasingly with the help of technology, from cloud to AI.

Background and Problem Statement  

Recent developments in the field of generative AI, specifically Large Language Models (LLMs), promise productivity and efficiency gains at scale. These models, when trained on enterprise-specific content, could be very useful for capturing tacit organizational knowledge, helping employees be more effective at everything from newcomer onboarding to accelerating task execution by experienced operators.

Project Description  

This project will explore ways to implement enterprise-contextualized knowledge discovery mechanisms through LLMs trained on domain-specific data. The project scope includes, but is not limited to, the following elements:

  • Researching emerging techniques and methods for LLM customization and fine-tuning.
  • Researching open-source and proprietary LLM-based solutions to identify suitable candidates for customization.
  • Fine-tuning and customizing the selected LLM with a domain-specific training corpus.
  • Creating a human-LLM interface (an interactive chatbot) to facilitate enterprise-specific knowledge discovery (a minimal sketch follows this list).
  • Evaluating the model's performance and efficiency gains through well-thought-out metrics and measurements.
  • Recommending an approach for the enterprise implementation of the customized LLM to support enterprise knowledge discovery and propagation.
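
As a concrete illustration of the chatbot element, here is a minimal sketch of a human-LLM loop built with the open-source Hugging Face transformers library. The model name is a stand-in; in practice it would be whichever open-source LLM is selected and fine-tuned on the domain corpus:

    from transformers import pipeline

    # "gpt2" is a placeholder; substitute the LLM selected and fine-tuned
    # on the enterprise-specific corpus.
    generator = pipeline("text-generation", model="gpt2")

    def ask(question: str) -> str:
        prompt = f"Question: {question}\nAnswer:"
        out = generator(prompt, max_new_tokens=100, do_sample=False)
        return out[0]["generated_text"][len(prompt):].strip()

    while True:
        question = input("You: ")
        if question.lower() in {"quit", "exit"}:
            break
        print("Bot:", ask(question))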

Technologies and Other Constraints  

We can be technology-agnostic, at least to an extent, though our preference would be to use Google Cloud Platform and the AI functionality it offers, such as Generative AI Studio.

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

The Human-Centric Software Engineering Lab in the CS department at NC State is directed by Dr. Sandeep Kuttal. The lab focuses on the human aspects of software engineering by studying and modeling programmer behavior, and then designing and developing mixed-initiative programmer-computer systems. The lab takes a multidisciplinary approach, combining Human-Computer Interaction, Software Engineering, and Artificial Intelligence to develop novel strategies, theories, visualizations, and prototypes tailored to the needs of programmers.

Background and Problem Statement

Pair programming, a well-established practice in software development, involves two programmers collaborating at a single workstation. This technique has gained traction due to its ability to enhance productivity, code quality, and self-efficacy among programmers. The shared responsibility and continuous feedback lead to more robust solutions and improved learning. However, recent research has illuminated significant differences in the dynamics and outcomes of pair programming when considering gender as a crucial factor.

In pair programming, the impact of gender dynamics has emerged as a notable concern. Research has revealed variations in collaboration approaches, communication styles, leadership roles, interruptions, and partner preferences based on gender. Because these distinctions can influence the effectiveness of pair programming and potentially lead to unequal participation, it is imperative to understand and accommodate these gender-related nuances.

To lay the groundwork, Dr. Kuttal and her team conducted a comprehensive study, including literature reviews, lab experiments, surveys, and interviews, to understand differences in same- and mixed-gender pairs. Their research illuminated potential challenges in remote pair programming and identified key aspects: (1) different communication cues for men and women, (2) diverse collaboration purposes, (3) distinct leadership styles in different gender pairings, (4) varying interruption patterns, and (5) gender preferences for partners. This effort resulted in a proof-of-concept implementation with some basic capabilities for tracking interactions in pair programming sessions, but it is not very user-friendly for programmers.

Project Description

The "Fostering Inclusive Pair Programming with Awareness Tool" project aims to develop powerful software to address gender-based dynamics in pair programming. This tool will transform pair programming by analyzing communication styles, leadership roles, interruptions, and partner preferences. The goal is to build understanding and empathy between pairs, enhancing collaboration, code quality, and productivity, regardless of gender composition. 

This project's main goal is to recreate this system from scratch, using the current implementation at https://github.com/Farissoliman/PairProgrammingTool as a reference. The main focus is on improving the system's usability and functionality. Currently, the system monitors how individuals in same-gender and mixed-gender pairs collaborate during pair programming, tracking individual roles, communication patterns, leadership dynamics, and interruptions. The system also allows real-time data capture without disrupting the programming process. Additionally, an analytics engine processes the gathered data to create visualizations and insights, which are presented in a user-friendly format to promote better self-awareness and understanding of collaboration patterns.

Hence, the system must:

  1. Be robust and capture real-time data efficiently.
  2. Provide a well-articulated user interface (UI), based on research, that allows
    1. empathetic communication between same- and mixed-gender pairs, and
    2. the education of individuals regarding differences in a partner's behavior and, based on research, how to foster better collaboration with that partner.
    Note: Rough sketches of the UI will be provided as a starting point.
  3. Be a well-tested system that can be shared with the world.

Technologies and Other Constraints

The current system consists of two main components: a VS Code extension and a Python (Flask) backend. Students are invited to explore alternative implementations that better address this problem. Some familiarity with web technologies (JavaScript/TypeScript, CSS, HTML) and with the VS Code extension API will be beneficial.
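
For orientation, the data-capture path in the current architecture boils down to the extension posting interaction events to the Flask backend. A minimal sketch of such an endpoint follows; the route and event fields are assumptions for illustration, not the actual API of the reference implementation:

    from flask import Flask, request, jsonify

    app = Flask(__name__)
    events = []  # in a real system, events would be persisted to a database

    @app.route("/events", methods=["POST"])
    def record_event():
        """Receive one interaction event posted by the VS Code extension."""
        event = request.get_json()
        # Hypothetical fields: session_id, speaker, kind (utterance,
        # interruption, role_change), timestamp.
        events.append(event)
        return jsonify(status="ok"), 201

    if __name__ == "__main__":
        app.run(port=5000)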

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

Department Mission

The North Carolina Department of Natural and Cultural Resources (DNCR) oversees the State’s resources for the arts, history, libraries, and nature. Our mission is to improve quality of life by creating opportunities to experience excellence in these areas throughout North Carolina. 

Division Mission

The North Carolina Division of Parks and Recreation (“DPR” or the “Division”) administers a diverse system of State parks, natural areas, trails, lakes, natural and scenic rivers, and recreation areas. The Division also supports and assists other recreation providers by administering grant programs for park and trail projects, and by offering technical advice for park and trail planning and development.  

DPR exists to inspire all our citizens and visitors through conservation, recreation, and education. 

  • Conservation: To conserve and protect representative examples of North Carolina's natural beauty, ecological features, and recreational and cultural resources within the state parks system.
  • Recreation: To provide and promote safe, healthy, and enjoyable outdoor recreational opportunities throughout the state.
  • Education: To provide educational opportunities that promote stewardship of the state's natural and cultural heritage.

Data & Application Management Program 

The Data and Application Management Program works to support the Division, sister agencies, and nonprofits in web-based applications for various needs: personnel activity, Divisional financial transactions, field staff operations, facilities/equipment/land assets, planning/development/construction project management, incidents, natural resources, etc. Using data from these web applications, we assist program managers with reporting and analytic needs.

We have sponsored previous SDC projects, so we understand the process and know how to help you complete this project efficiently while learning about real-world software application development. Our team includes three NCSU CSC alumni, all of whom completed projects with the SDC. These three will oversee the project and work directly with you to fulfill your needs and facilitate the development process.

Background and Problem Statement

Our existing LAMP stack system (the "legacy" system) was developed over the course of 25+ years through ad-hoc application development in a "production only" environment (mainly PHP with a MariaDB database) to meet the immediate business operational needs of field staff. Many of the legacy applications, including the Fuel/Vehicle/Equipment application, were written as single-file, undocumented, procedural applications. This makes them difficult to read, maintain, and upgrade. These applications need to be updated with modern design patterns and documentation.

DPR manages 43 state parks and many other natural areas across the state. For the state parks to function, we need division-owned vehicles, as well as fuel, oil, and equipment to operate those vehicles. These assets must be accounted for to manage inventory, budget, and park needs. This is where the vehicle application comes in: it stores information for vehicles, their fuel use, and related equipment across the entire Division. Currently, this legacy application is unstructured, outdated, and complicated, and it cannot link to other applications.

We have recently begun migrating many of these legacy applications to new versions following modern design principles and technologies, such as single-page application clients written in React and backed by a REST API. The legacy system and upgraded web-applications have been containerized using Docker to run in parallel in the AWS cloud. 

Last semester, a Senior Design student team began working on a new Inventory application, which aims to maintain the functionality of the legacy system’s Fuel/Vehicle/Equipment application while centralizing and simplifying the inventory-related workflow processes of both our park staff and the budget office. 

The application has been partially completed, allowing for full management of park-owned "On-Road" vehicles, one of many equipment categories that fall within the scope of this Inventory application. The team from last semester also provided us with a dynamic backend and database structure that is ready to be used when creating the remaining front-end pages. We are happy with what has been completed so far and are excited to continue working with Senior Design Center students to fulfill the remaining requirements.

Project Description

This semester, in addition to the completion of the remaining front-end pages and any supporting API endpoints, Inventory will also need data reporting tools as well as preparation for its connections to our future Budget application. Park staff must be able to request equipment and motor fleet vehicles from the budget office, as well as record their vehicles’ mileage, monthly park fuel consumption, and motor fleet vehicle telemetrics. 

Last semester, the new Inventory application was redesigned to fit a more modern, object-oriented framework that will allow for standardized control of user permissions, a more organized database structure, and more sophisticated connectivity between applications through our shared REST API backend container.

Parks and Recreation is in the process of implementing a new system that allows for continued use of the legacy applications as well as the establishment of a next generation system. The legacy system has been modified to work with the next generation system for continued use, until all applications can be reworked and migrated appropriately into the new system. Your completed Inventory application shall be seamlessly integrated into this multi-container system using Docker Compose.

Technologies and Other Constraints

Tools and assets are limited to what has been approved by the NC Division of Information Technology (NC-DIT); in practice, the usable technologies are those approved for NC DPR's use through NC-DIT.

Our new modernized apps currently run on Docker. Each modernized application will be packaged into individual frontend containers that use a NodeJS base image and are written in React with Material UI as the UI framework. The backend consists of a MariaDB database container and a unified REST API backend container which will be used by all modernized applications. The unified REST API container uses PHP 8 and is built using the Slim Framework. All applications in the legacy system will continue to function as they are, all of them within a single PHP 7.4 container that runs its own Apache server.
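
To make that topology concrete, here is a minimal docker-compose sketch of the arrangement described above. Every service name, image tag, and port here is an illustrative assumption, not the Division's actual configuration:

    services:
      inventory-frontend:      # React + Material UI client (NodeJS base image)
        image: node:18
        ports: ["3000:3000"]
      api:                     # unified REST API: PHP 8 + Slim Framework
        image: php:8-apache
        ports: ["8080:80"]
      legacy:                  # legacy applications: PHP 7.4 + Apache
        image: php:7.4-apache
        ports: ["8081:80"]
      db:                      # MariaDB shared by the REST API
        image: mariadb:10
        environment:
          MARIADB_ROOT_PASSWORD: example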

For this project, students will improve the Inventory application client, which will run on its own container and uses React with Material UI. To support the functionality of this new application, students will also extend the existing REST API and database to add all required functionality.

Students will be required to sign an NDA related to personal and private information stored in the database and to sign over IP to sponsors when the team is formed.

Sponsor Background

The Senior Design Center (SDC) of the Computer Science (CSC) Department at NC State oversees CSC492—the Senior Design capstone course of the CSC undergraduate program at NC State. Senior Design is offered every semester with current enrollment approaching 200 students across several sections. Each section hosts a series of industry-sponsored projects, which are supervised by a faculty Technical Advisor. All sections, and their teams, are also overseen by the Director of the Center.

Background and Problem Statement

Senior Design is a large course offered over multiple sections with different teams working on unique projects. Some of the teaching team in the SDC work across all sections while others are dedicated to individual sections. To optimize fair grading despite these differences across teams and sections, the teaching team follows grading rubrics for the various graded assignments in the class.

The rubrics are currently maintained in Google Sheets templates that are manually adapted for each section every semester (adding appropriate student teams, dedicated tabs for each grader, etc.) and shared with that semester's staff. When grading, these spreadsheets are filled in by hand. There are two main problems with this process: 1) it is tedious and error-prone to customize each grading sheet every semester, since the number of teams and the number of faculty members in a section affect several of the calculations in the spreadsheet; and 2) when grading, it is easy to edit cells that contain formulas, or to make changes that affect how automatic calculations are performed.

We also communicate these rubrics to students by posting them on a dedicated page on our website. When we update our rubrics, we have to update the website and separately the Google Sheets templates, creating unnecessary additional work.

Project Description

For this project, your team will build a Web application that will facilitate the creation and use of rubrics. There will be 3 types of users: system administrators, instructors, and students. Administrators will be able to create and maintain rubrics. Administrators will also be able to manage semesters, semester sections, student rosters per section, instructors per section, and optionally, teams of students in a section. Administrators will then be able to create individual or team assignments, and assign a grading rubric to these assignments.

When instructors log into the system, they will be able to see all students and teams in their sections. Instructors can also see all assignments for their students and teams. When an assignment is ready to be graded, instructors will be able to open its rubric and enter grades for rubric items. Instructors sometimes also like to add notes to the rubric items they fill out. Note that some assignments need to be graded by multiple instructors. Instructors can only view the scores they gave, but administrators will be able to see and manage the aggregation of scores from multiple instructors on the same assignment.

Students will also be able to log into the system to see their assignments (individual and team) and their final scores once these have been released by administrators.

Flexible Rubrics

Rubrics are the most interesting challenge on this project since they should be as flexible as possible. The user should be able to specify, for each rubric item, a name for the item, a weight, valid values (like a ranked list, letter grades, numeric scores in a range, etc.), and an optional brief description of what is expected for that item.

Rubrics can also have nested categories of items, where each category will have a designated weight in the overall rubric. For example, the rubric in our Interim Progress Report has a section for design with several different items in it. 

Some elements on a rubric can become optional depending on other values in the rubric. For example, in our written documentation we have a rubric category for requirements with different subcategories for different ways requirements can be expressed. Only one of these subcategories is expected to be used for requirements, while some of the other elements are common across all types of requirements.

We also want to have the ability to add extra-credit items to a rubric, both as part of a category and as part of the main rubric.
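
To make the flexibility requirements above concrete, here is a minimal sketch of one possible rubric data model; the class and field names are assumptions, not a prescribed schema:

    from dataclasses import dataclass, field

    @dataclass
    class RubricItem:
        name: str
        weight: float
        values: list              # e.g., ["A", "B", "C"] or list(range(11))
        description: str = ""     # optional guidance for graders
        extra_credit: bool = False
        optional: bool = False    # may be skipped depending on other items

    @dataclass
    class RubricCategory:
        name: str
        weight: float             # weight of this category in the overall rubric
        children: list = field(default_factory=list)  # items or nested categories

    # A category like the design section of the Interim Progress Report rubric.
    design = RubricCategory("Design", weight=0.3, children=[
        RubricItem("Architecture diagram", weight=0.5, values=list(range(11))),
        RubricItem("Design rationale", weight=0.5, values=["poor", "fair", "good"]),
    ])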

Technologies and Other Constraints

This will be a Web application running on Docker containers. The backend will expose a REST API and will be written in PHP 8 with the Slim Framework. The frontend will be written in React. The database will be MySQL/MariaDB.

Sponsor Background

The Senior Design Center (SDC) of the Computer Science (CSC) Department at NC State oversees CSC492—the Senior Design capstone course of the CSC undergraduate program at NC State. Senior Design is offered every semester with current enrollment approaching 200 students across several sections. Each section hosts a series of industry-sponsored projects, which are supervised by a faculty Technical Advisor. All sections, and their teams, are also overseen by the Director of the Center.

Background and Problem Statement

Senior Design is a large course offered over multiple sections with different teams working on unique projects. Because this is a large operation, we have many policies and procedures that students should follow. Also, different sections and teams often have different deadlines for the same deliverables. 

Information about policies, procedures, deadlines, and other class administrative details is available to students in one or more places: the syllabus, the course calendar, the course website, email, or our submission system. Given the number of possible places where information can be found, at this scale it is no surprise that the teaching team receives multiple similar questions from students throughout the semester.

This problem is not unique to Senior Design, and in fact is common across classes even when enrollment is moderate. To simplify and manage answering student questions, many courses, Senior Design included, often use communication tools such as Piazza, Slack, Discord, Ed, and others. However, especially with larger classes, it is common for students to ask the same question multiple times. It is just easier for students to ask than to comb through multiple resources for a precise answer.

Project Description

This project involves the creation of a bot—SyllaBot—that can be installed into a course’s Slack Workspace or Discord Server to provide a “slash command” that students can use to ask questions about the administrative details of the class. The bot will leverage the OpenAI API and clever prompt engineering to provide an answer to a student’s question that takes into account who the student is and their role in the class. For example, the system should be able to determine who the student is, the section the student is enrolled in, the team the student belongs to, etc. in order to produce the most accurate answer.
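
A minimal sketch of that prompt-engineering idea: resolve the asker's identity, fold it into the system message, and forward the question to the OpenAI API. The model choice and the student-context fields here are placeholders:

    from openai import OpenAI

    client = OpenAI()  # reads the per-course API key from the environment

    def answer(question: str, student: dict) -> str:
        # student is a hypothetical record resolved from the Slack/Discord
        # user, e.g., {"name": ..., "section": ..., "team": ...}.
        system = (
            "You answer questions about course administration. "
            f"The asker is {student['name']}, enrolled in section "
            f"{student['section']} on team {student['team']}. "
            "Answer from the provided course materials; if unsure, say so."
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model choice
            messages=[{"role": "system", "content": system},
                      {"role": "user", "content": question}],
        )
        return resp.choices[0].message.content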

On the administrative side, the system needs a way for an instructor to provide access to information needed to answer the questions. These sources should be configurable and flexible. Examples of data sources include:

  • The text from the course’s syllabus
  • API access to one or more Google Calendars
  • A repository of emails sent to the class (e.g., a series of text entries, or posts on a Google Group) that updates each time an email is sent by the instructor
  • API access to custom systems (e.g., the Senior Design Submission System) that can provide due dates and other information
  • A web page

The instructor should also be able to provide their own OpenAI API access key on a per-course basis. Rather than spinning up multiple instances of the app every semester or for every course, the system should support creating courses and installing the bot to different Slack and/or Discord spaces for different courses while ensuring that the course from which a question originates can be identified.

If time permits, the system should keep records of the questions asked, by whom, the responses provided, and other metrics. These can be displayed to the instructor on a queryable dashboard.

Technologies and Other Constraints

This will be a Web application running on Docker containers. You will have access to an OpenAI API key you can use for development. The preferred backend language is Node.js, and any frontend should be implemented in React.

Sponsor Background

Ankit Agarwal is the Founder & CEO of K2S and an NC State Computer Science alumnus. He envisions a platform that gives other alumni an easy way to give back to the student community through mentorship of active Computer Science students.

Sara Seltzer is the Director of Philanthropy for the Department of Computer Science at NC State and works with alumni (individuals, companies, and foundations) who want to impact students and faculty through financial support. 

Together, from an industry and a university perspective, we are trying to create a virtual engagement program for the NC State community.

Background and Problem Statement

In the Department of Computer Science at NC State, we understand that navigating the program and curriculum can be challenging. Wouldn't it be great to have someone who has been there before to support and guide you around the pitfalls, helping you reach your full potential? Much of the learning you will experience takes place in classrooms and labs, but it also happens through the individual connections made with peers and alumni. CSC's alumni network includes over 10,000 members, many of whom have been very successful in their careers and are eager to give back and support current students.

Successful alumni often revisit the path that got them there, and it invariably leads them back to the roots of their alma mater. In recognition of their supporters and heroes along that path, they have the urge to become one themselves. A portal that allows alumni to easily provide mentorship and share their lessons learned is not only fulfilling for the alumni as a way of giving back; it also provides real help and guidance to students stepping out from the shadow of campus.

Project Description

We propose creating an online mentorship web portal that connects current CSC students with CSC alumni to share a goal of promoting academic success and professional advancement for all. 

Primary Portal end-users include: CSC alumni looking to give back to their alma mater by mentoring students and current CSC undergraduate/graduate students looking for help on a specific topic or project. Secondary users could include alumni who are looking for speaking opportunities and current students searching for contacts for specific internships and co-ops. 

Required Features (Minimum Viable Product)

  1. Ability for NCSU Students and Alumni to Sign-up
  2. Ability for all members to be able to list their work experience and education history (LinkedIn style)
  3. Ability for each member to identify themselves as a prospective Mentor or a Mentee
  4. Ability for each member to list their interests and areas in which they could provide expertise (Mentors) or for which they need mentorship (Mentees)
  5. Ability for each member to define their availability on a weekly/monthly basis
  6. Ability for members to reach out to each other via private messages within the platform
  7. Ability to define and categorize all users into certain roles
  8. Ability to control permissions for features and functions via roles. For example, an Admin role would have administrative privileges that regular users would not (a minimal sketch of role-based permissions follows this list)
  9. Ability for users to include a Profile Picture with their account
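
For requirements 7 and 8, here is a minimal sketch of role-based permission checking; the role and permission names are invented for illustration:

    # Map each role to the set of permissions it grants (names are examples).
    ROLE_PERMISSIONS = {
        "admin":  {"manage_users", "edit_any_profile", "send_message"},
        "mentor": {"edit_own_profile", "send_message"},
        "mentee": {"edit_own_profile", "send_message"},
    }

    def can(role: str, permission: str) -> bool:
        """True if the given role grants the given permission."""
        return permission in ROLE_PERMISSIONS.get(role, set())

    assert can("admin", "manage_users")
    assert not can("mentee", "manage_users")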

Nice to Haves

  1. Integration with LinkedIn so that a member's work history, education, and profile picture can be directly and easily imported into the platform/profile
  2. Ability for NC State students and faculty to be able to sign up using NCSU SSO (Shibboleth)
  3. Calendar/Calendly integration for easy scheduling of meetings

Examples of similar solutions include George Mason University's "Mason Mentors" program and UC Berkeley's Computer Science Mentor Program.

Technologies and Other Constraints

Similar solutions exist in the market and are offered by companies like PeopleGrove. The idea would be to draw inspiration from this platform and build something in-house for the NC State Computer Science students and alumni.

The backend must be implemented in PHP using a modern framework like Slim or Laravel. Students are free to choose between a REST architecture (API + frontend framework such as React) or using Twig on the backend for a server-side rendered app.

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

The Ergonomics Center is housed in the Edward P. Fitts Department of Industrial and Systems Engineering at North Carolina State University and provides consulting and training services to clients throughout the U.S. and globally. The Center was founded in 1994 as a partnership between NC State University and the NC Department of Labor. It was created to make workplaces safer, more productive, and more competitive by providing practical, cost-effective ways to reduce or eliminate risk factors associated with musculoskeletal disorders.

Background and Problem Statement

How much is too much?

When engineers design a task, how do they know if people will be physically able to complete it without getting injured? 

Ergonomics and safety professionals have been asking those questions for decades. Several tools have been developed to help provide answers when it comes to defining acceptable lifting, lowering, pushing, pulling, gripping, pinching, and carrying weights and forces. The Ergonomics Center has created downloadable Excel-based calculators using the information in these tools and made them freely accessible to professionals on its website (ErgoDATA). At present, only one or two industrial ergonomics apps are available to the public free of charge: the CDC's NIOSH Lifting Equation app (NLE Calc) and Intergo's MMH Calculator Free. Both apps address only two-handed lifting, which leaves out many other types of tasks (e.g., pushing, pulling, carrying, one-handed manual material handling) that are addressed by the Center's calculators. There is a definite need to add more apps to the ergonomics practitioner's toolbox.
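
For context on what these calculators compute, here is a sketch of the revised NIOSH lifting equation, which discounts a 23 kg load constant by task-specific multipliers. The frequency (FM) and coupling (CM) multipliers come from lookup tables and are simplified to 1.0 here; the example numbers are arbitrary:

    def niosh_rwl(h_cm, v_cm, d_cm, a_deg, fm=1.0, cm=1.0):
        """Recommended Weight Limit (kg), revised NIOSH lifting equation.
        h_cm: horizontal reach, v_cm: starting hand height, d_cm: vertical
        travel distance, a_deg: asymmetry (twist) angle in degrees.
        fm and cm are frequency/coupling table lookups, defaulted to 1.0."""
        lc = 23.0                          # load constant (kg)
        hm = 25.0 / h_cm                   # horizontal multiplier
        vm = 1 - 0.003 * abs(v_cm - 75)    # vertical multiplier
        dm = 0.82 + 4.5 / d_cm             # distance multiplier
        am = 1 - 0.0032 * a_deg            # asymmetry multiplier
        return lc * hm * vm * dm * am * fm * cm

    # Example: a lift 30 cm out, hands starting 70 cm high, raised 40 cm,
    # with no twisting. Roughly 17.6 kg.
    print(round(niosh_rwl(h_cm=30, v_cm=70, d_cm=40, a_deg=0), 1))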

Ergonomics professionals and teams often collect analysis data while observing a task on the production floor in real time. The ergonomics calculators are straightforward and can be used quickly during real-time observation when using a laptop. Unfortunately, laptops can be unwieldy in crowded production-floor work spaces, if they are allowed at all. Smartphones are much less intrusive and more manageable in these settings, but the Ergonomics Center's spreadsheets are cumbersome to use on a phone. For a few years now, knowing that the Center is part of a university, many clients have asked whether app versions of these ergonomics calculators were available. Unfortunately, the Center staff does not currently have the capability to "translate" the existing Excel spreadsheets into mobile-friendly tools. Developing smartphone apps for these ergonomics analysis tools is appealing because they would be designed specifically for a mobile phone and would not require internet access at the time of use.

Project Description

The Center envisions these ergonomics analysis apps mimicking the look and feel of its existing Excel-based spreadsheets. If possible the apps should be usable without an internet connection since cellular and wifi signals on a production floor can be limited or non-existent in some facilities. The apps should also have the capability to have input variables and output results exported to a report-ready printable format such as PDF, Word or Excel. Because analyses will be presented to clients, management, and other high-level decision-makers, the appearance of the exported information should be professional and require little manipulation or adjustment by the user. Because many clients work with classified and/or proprietary information, the app information should not be stored in the Cloud; information could be stored locally or be exported in one of the forms mentioned above.

 

Technologies and Other Constraints

The Center is flexible on the technology used and is willing to proceed with student recommendations. The development of a standard “native” app or apps is not mandated if another technology (such as a Progressive Web App) is deemed more suitable for use.

Mobile friendly tool(s) will be provided free of charge on The Ergonomics Center’s website.

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

In Computer Science at NC State during Fall 2023, more than 40 instructors are teaching over 40 different undergraduate courses with more than 1600 undergraduate students enrolled across those courses. In addition to undergraduate courses, hundreds of graduate students are also enrolled in over 50 courses. Dr. King coordinates the CSC316 Data Structures & Algorithms course, which has over 225 students enrolled during Fall 2023. Dr. Schmidt coordinates the CSC226 Discrete Math course, which has over 490 students enrolled during Fall 2023. Many instructors offer in-person office hours, virtual office hours, or a combination of both each week. Our goal is to be as efficient and effective as possible when providing assistance to students outside of the classroom. 

Background and Problem Statement

One of the challenges instructors face when dealing with large courses and large teaching teams is the need for an organized and efficient approach to managing office hours. Instructors, students, and teaching assistants often face difficulties in coordinating and facilitating office hours effectively. Traditional methods can lead to confusion, inefficiency, and missed opportunities for valuable student-teacher interaction. Although there are existing solutions to this problem, each lacks at least one of the qualities (robustness, ease of use, stability, or features) that would make it a solution instructors can trust and rely on. There is an opportunity to develop a comprehensive digital solution that addresses these challenges, resulting in a more productive use of time and resources for both students and the teaching staff.

Project Description

Our envisioned solution is a web application designed to simplify the way office hours queues are managed in university courses. The application will offer instructors, teaching assistants, and students a platform to coordinate, join, and facilitate office hour interactions.

The application will provide the following features:

  • User Authentication and Roles: Instructors, teaching assistants, and students will have distinct user roles and functionalities tailored to their needs. Since this application can potentially be used in multiple institutions, login options should include using local accounts (username and password) and SSO solutions like NC State’s Shibboleth. Ideally, this means that authentication should be modular so that the tool can be configured to authenticate via an institution’s SSO, LDAP, local accounts, etc. or any combination of these.
  • Course and Roster Management: Instructors can create courses, manage access via CSV roster uploads, invite links, or an access code, and assign teaching assistants to courses.
  • Office Hours Coordination: Teaching staff can create, manage, and mark their office hour time blocks as online or offline. They can specify the location and mode (in-person, online, or both) for each office hour. If instructors teach multiple courses, they should have the ability to hold office hours for multiple courses at the same time.
  • Student Queue: Students can join office hours queues when an office hour session is active by providing a summary of their assistance needs. Students can submit help requests as a group such that all members of the group can participate during the session, whether in-person or remote. Students can monitor their status in the queue without refreshing the page. At the end of each office hour session, help requests that are still in the queue are removed from the queue and saved as drafts in each student's account so that the help request can be resubmitted when the student attends a future office hour session. (A minimal sketch of this queue behavior follows this list.)
  • Queue Interaction: Teaching staff can view the queue, invite students to join their office hours, place students back in the queue, and broadcast messages to students individually or collectively. The app should implement push notifications not just for these messages, but also to let instructors know when students join the queue and to let students know that they have been invited to join.
  • Course Overview: Students can log in to view the classes they've enrolled in, see available office hour blocks, and access information about teaching staff availability.
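
A minimal, in-memory sketch of the Student Queue behavior described above (the class and method names are invented); a real implementation would persist state and push updates to clients, for example over WebSockets:

    from collections import deque

    class OfficeHoursQueue:
        def __init__(self):
            self.queue = deque()

        def join(self, student_ids, summary):
            # Group requests: every listed member may attend the session.
            self.queue.append({"students": student_ids, "summary": summary})

        def invite_next(self):
            # Teaching staff invite the next request in line, if any.
            return self.queue.popleft() if self.queue else None

        def requeue(self, request):
            # Place a student (or group) back at the front of the queue.
            self.queue.appendleft(request)

        def close_session(self):
            # Unserved requests become drafts for resubmission later.
            drafts, self.queue = list(self.queue), deque()
            return drafts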

Stretch Goals

  • Calendar. Instructors can create an office hour calendar either directly or by synchronizing events with an existing Google calendar.
  • Analytics & Reporting. Instructors can generate reports about help request volume, session times, and other metrics available in the system.
  • Session Facilitation. During an office hour session, a countdown timer of an instructor-defined length appears for both the student and teaching staff member providing assistance.

Technologies and Other Constraints

We would like this application to be web-based running on Docker containers. It should be implemented as a Progressive Web Application (PWA) with React and MaterialUI on the frontend and Node.js on the backend. We prefer MariaDB/MySQL as the database engine, but PostgreSQL is also acceptable.

Sponsor Background

Katabasis is a non-profit organization that specializes in developing educational software for children ages 8-15. Our mission is to facilitate learning, inspire curiosity, and catalyze growth in every member of our community by building a digital learning ecosystem that adapts to the individual, fosters collaboration, and cultivates a mindset of growth and reflection.

Background and Problem Statement

Computer Science (CS) education is of increasing importance to educators, parents, and school administrators as a greater number of CS jobs become available in the workforce. However, many children, particularly in rural high-need areas, have very little access to quality educational content on this subject matter.  Further compounding the issue, many children in these areas have such little exposure to CS that they become too intimidated to engage with CS education even when given the opportunity.  Katabasis is seeking to expose students to CS in a nontraditional way to combat these barriers and spark engagement through the medium of art. This will also open the door for future innovative teaching opportunities.

Project Description

We are seeking a team of students to develop a block-based coding system for creating algorithmic art. The system will help teach students about artistic concepts such as fractals, evolutionary art, and other art generated from fixed patterns, while also introducing basic computer science concepts such as conditionals, looping, and variables. The major touchstones for this project are the visual block-based programming languages Snap! and Scratch. There should be one portion of the UI for assembling block-based code and another portion for displaying the art generated when the code is run. The primary focus of the project is this block-based programming/art portion of the platform.

In addition to this portion of the project, we are asking students to also develop a web platform to contain the block-based programming component and to facilitate students sharing their art with others. 

The core features the system must include are:

  • Block-Based Programming for Algorithmic Art: There must be a core system that supports the creation of art pieces through block-based programming. 
    • Block-Based Programming
      • Blocks should be able to be dragged and dropped into position and built upon to make a list of commands.
      • There must be block options for changing colors, adjusting line thickness, positioning, etc. to facilitate the art creation process as well as several preset options with a more limited set of inputs to allow for a more guided creation process.
      • There should be blocks that introduce computer science principles like looping, variables, etc.
    • Visualization Portion
      • After pressing a “Run” button, the currently-placed blocks should be executed to show the resulting art.
      • This component should show the process of art creation; that is, users of the system should be able to see the results of each code block executing, viewing the art being created step by step in real time (a minimal interpreter sketch follows this list).
  • Structured Lessons with Examples: The system should also have a component where students can access lessons explaining various art & CS topics, complete with examples on how to experiment with these topics within the block-based programming interface.
  • Web Platform to Support Creation and Sharing of Art: The web platform, in addition to providing access to the core art creation tool and lessons component, should also provide an interface where students can view their previously created artworks and export them if desired.  Additionally, this interface must be intuitive, given that this system will not only be targeted toward younger children, but also children that may have difficulty with technology/traditional CS learning. Prioritizing good UI/UX is a must.
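
To make the run-and-visualize loop concrete, here is a minimal sketch of a block interpreter: each placed block becomes one command, and stepping through the list yields the intermediate states needed for step-by-step display. The command names are invented, and Python's turtle canvas stands in for the Unity renderer:

    import turtle  # stand-in canvas; the real project would render in Unity

    COMMANDS = {   # block name -> drawing action (names are examples)
        "forward": lambda t, n: t.forward(n),
        "turn":    lambda t, n: t.right(n),
        "color":   lambda t, c: t.pencolor(c),
        "width":   lambda t, w: t.pensize(w),
    }

    def run_blocks(blocks):
        t = turtle.Turtle()
        for name, arg in blocks:     # execute one block per step
            COMMANDS[name](t, arg)
            yield name               # pause point for step-by-step display

    # A square: a "repeat 4" loop block unrolled into individual commands.
    program = [("color", "blue")] + [("forward", 100), ("turn", 90)] * 4
    for step in run_blocks(program):
        pass  # the UI would redraw the canvas after each step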

Technologies and Other Constraints

This project will have two core tech platforms: the block-based coding system, which will be made in Unity, and the web platform, which is intended to be integrated within an existing Docker application (including a web portal with login functionality and access to a different art platform) and should align with the following tech stack:

  • Frontend: React + JS
  • Backend: Django + Python
  • Database: MariaDB (Python migrations)

The team will be provided with a sample that demonstrates the interplay between all four of these components (frontend, backend, database, and Unity).

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

LexisNexis® InterAction® is a flexible and uniquely designed CRM platform that drives business development, marketing, and increased client satisfaction for legal and professional services firms. InterAction provides features and functionality that dramatically improve the tracking and mapping of the firm’s key relationships – who knows whom, areas of expertise, up-to-date case work and litigation – and makes this information actionable through marketing automation, opportunity management, client meeting and activity management, matter and engagement tracking, referral management, and relationship-based business development.

Background and Problem Statement

The key to successful business development is the strength of your engagements with prospective clients and how that strength changes over time. LexisNexis InterAction® has an algorithm that calculates the strength of an engagement between two individuals based on activities such as meetings, phone calls, and email exchanges.

The behavior of the algorithm is under review, and a tool is needed to investigate the impact of parameter changes by visualizing how the engagement score would change over time.

Project Description

The objective of this project is to produce a tool to allow the review and comparison of algorithm parameter sets by visualizing the resulting engagement scores and how they change over time.

An engagement score is the result of assigning a value to each activity between two contacts, and of the changing impact of those activities over time.

  • Each contact represents a person.
  • Each activity would represent an exchange; email, telephone, meeting, etc.
  • A score is allocated to each activity; this score then decays over time (a minimal decay sketch follows this list).
  • A set of parameters determine the score depending upon activity type and time since that activity took place. 
  • Other logged activity characteristics are used to influence the score.
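
A minimal sketch of one plausible scoring model using exponential decay; the per-activity base values and the half-life are exactly the kind of placeholder parameters the tool should let reviewers adjust and compare:

    import math

    BASE_SCORE = {"email": 1.0, "telephone": 2.0, "meeting": 5.0}  # placeholders
    HALF_LIFE_DAYS = 30.0                                          # placeholder

    def engagement_score(activities, as_of_day):
        """activities: list of (activity_type, day_logged) pairs."""
        decay = math.log(2) / HALF_LIFE_DAYS
        return sum(
            BASE_SCORE[kind] * math.exp(-decay * (as_of_day - day))
            for kind, day in activities
            if day <= as_of_day
        )

    history = [("email", 0), ("meeting", 10), ("telephone", 25)]
    print(round(engagement_score(history, as_of_day=30), 2))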

Some examples of the tool's potential capabilities include:

  • Read a list of activities from a log
  • Identify pairs of contacts that have a high volume of activities.
  • Map activities against time for a contact pair.
  • Visualization of the engagement score over time.
  • Adjust the algorithm parameters, and compare the resulting score timelines.
  • Add activities to the log and compare the resulting score timelines.

(This list is considered neither exhaustive nor a statement of the agreed scope of the project.)

An agile development process will be used: the team will agree on a sequence for functional implementation, incrementally deliver capabilities, and adjust future deliveries based on feedback.

Technologies and Other Constraints

The team may choose their technology stack with any mix of JavaScript, Python, and C#.

Angular 14 and D3 should be used for any front end and visualizations.

(As a stretch goal, the team could consider an Event Sourcing pattern for the implementation.)

A log of anonymized activity data will be provided.

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

Our company is an innovative, global healthcare leader committed to saving and improving lives around the world. We aspire to be the best healthcare company in the world and are dedicated to providing leading innovations and solutions for tomorrow.

Merck’s Security Analytics Team is a small team of Designers, Engineers and Data Scientists who develop innovative products and solutions for the IT Risk Management & Security organization and the broader business as a whole. Our team’s mission is to be at the forefront of cybersecurity analytics and engineering to deliver cutting-edge solutions that advance the detection and prevention of evolving cyber threats and reduce overall risk to the business.

Background and Problem Statement

The Security Analytics Team is the curator of Merck's cyber data lake, an Amazon Web Services-based, Merck-proprietary data lake aimed at providing a single point of presence for cybersecurity-relevant telemetry data from a variety of systems at Merck. The data lake ingests terabytes of data daily from over 20 different systems; this data is then normalized and presented to consumers through a secure access layer for uniform consumption.

Due to the scale and variety of data being ingested, maintaining visibility into the three V's generally associated with big data operations (volume, velocity, and variety) is paramount to the Security Analytics Team. The team is constantly working on new tools to better understand the baselines of the data being ingested, both to understand trends and to identify failures or exceptions in the ingestion process early. The team needs to be able to monitor that volume growth is in line with expectations (to ensure that asset counts are accurate and line up with configuration management efforts), that data points are received in a timely fashion, and that the ingested data does not deviate from what was expected.

Project Description

We would like the students to develop a solution for tracking the volume, velocity, and variety of data being ingested, and to display baselines and trends in a web-based application. As part of the solution, the user should be able to specify which data sets can be ingested, along with their configuration parameters (frequency, subsets versus entire data catalogs, etc.). For example, the student team will ingest some publicly available cybersecurity data (such as data from NIST or MITRE) into a data store of their choosing (either a relational database or an open-source big data platform such as Hadoop). The ingestion process will run on a recurring basis with a defined frequency of at least daily, as appropriate for the sample data selected. The students will then create an application that monitors the ongoing ingestion, tracking baselines for the volume, velocity, and variety of the data. The students will create a dashboard to display the results in graphical form, highlighting daily, weekly, and monthly trends. Additional development should be considered to identify potential anomalies in ETL (extract, transform, load) processes (such as data type changes, where data fields may change from strings to integers), which would provide early notification to the engineering team in the event of an outage or exception.
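
As a sketch of the baseline-and-trend idea for the volume dimension: compute a rolling baseline per source and flag days that deviate from it. The window size and z-score threshold are placeholder parameters the team would tune:

    import pandas as pd

    def flag_volume_anomalies(daily_counts: pd.Series, window=30, z=3.0):
        """daily_counts: record counts for one source, indexed by date."""
        baseline = daily_counts.rolling(window, min_periods=7).mean()
        spread = daily_counts.rolling(window, min_periods=7).std()
        zscores = (daily_counts - baseline) / spread
        return daily_counts[zscores.abs() > z]  # days deviating from baseline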

Technologies and Other Constraints

The students can select their choice of platforms and programming/scripting languages for data ingestion, as well as frameworks for UI development/dashboarding. The students will be responsible for reviewing the problem statement and proposing a target architecture and tool set to the Security Analytics Team, along with justification for their selections and any caveats/assumptions that drove the decisions.

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

The CSC Undergraduate Curriculum Committee (UGCC) reviews courses (both new and modified), curriculum, and curricular policy for the Department of Computer Science.

Background and Problem Statement

North Carolina State University policies require specific content for course syllabi to help ensure consistent, clear communication of course information to students. However, creating a course syllabus or revising a course syllabus to meet updated university policies can be tedious, and instructors may miss small updates of mandatory text that the university may require in a course syllabus. In addition, the UGCC must review and approve course syllabi as part of the process for course actions and reviewing newly proposed special topics courses. Providing feedback or resources for instructors to guide syllabus updates can be time consuming and repetitive, especially if multiple syllabi require the same feedback and updates to meet university policies.

Project Description

The UGCC would like a web application to facilitate the creation, revision, and feedback process for course syllabi for computer science courses at NCSU. Users will include UGCC members and course instructors (where UGCC members can also be instructors of courses).

  • UGCC members should be able to add/update/reorder/remove required sections for a course syllabus, based on the university checklist for undergraduate course syllabi.
  • UGCC members should be able to provide references to university policies for each syllabus section, as well as specific required text (that instructors cannot change) as outlined by university policy.
  • UGCC members should be able to update/revise the specific required template text, as appropriate, so that these updates are pushed to all new syllabi created using the tool.
  • Instructors should be able to use the application to create a new course syllabus, or to revise/create a new version of an existing course syllabus each semester.
  • UGCC members can then review an instructor's syllabus in the application and provide comments/feedback on each section, including flagging specific sections of the syllabus for required revision by the instructor.
  • A history of revisions should be maintained.
  • Instructors and UGCC members should be able to download a properly formatted course syllabus in DOCX, PDF, HTML, or Markdown format (since several instructors use GH Pages to host their syllabi); a minimal export sketch follows this list.
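
To illustrate the export requirement, here is a minimal sketch that assembles the required sections into Markdown and converts that to HTML with the open-source markdown package; DOCX and PDF exports could be produced analogously with other libraries. The section titles are examples only:

    import markdown  # open-source Markdown-to-HTML converter

    def render_syllabus(course, sections):
        """sections: the ordered (title, body) pairs maintained by the tool."""
        md = f"# {course} Syllabus\n\n" + "\n\n".join(
            f"## {title}\n\n{body}" for title, body in sections
        )
        return md, markdown.markdown(md)  # (Markdown export, HTML export)

    md_text, html_text = render_syllabus("CSC 216", [
        ("Academic Integrity", "Required university text goes here."),
        ("Grading", "Letter grades per university policy."),
    ])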

Technologies and Other Constraints

  • Java
  • JavaScript
  • MySQL
  • The software must be accessible and usable from a web browser

Other languages and technologies may be used, but they must be approved by the sponsors.

Sponsor Background

Cisco is the worldwide leader in technology that powers the Internet.

Cisco’s Security and Trust Organization’s (STO) InfoSec team is sponsoring the “Managing Firewall Access Control List for Cloud Services” project.

The InfoSec team is responsible for securing Cisco via controls and policies. From time to time the InfoSec team runs into limitations of controls.  This project is in response to a current limitation the InfoSec team is faced with.

Background and Problem Statement

Managing firewall access control lists for cloud services is becoming harder each day. The issue is that services in the cloud are very dynamic: they may scale up or down at any given point, and so may the underlying cloud infrastructure that runs the service. In fact, the dynamic way that these cloud services can be scaled up and down or moved between geographical regions is one of the main drivers for leveraging services in the cloud. However, due to their static nature, firewall access control lists become difficult to manage.

Example: Company A runs their widget cloud service in the Amazon Web Services (AWS) West datacenter. Their customers demand geo-redundancy, so they deploy their widget service in the AWS East datacenter. AWS runs into a network issue and the West traffic is redirected to the East datacenter. The issue is that Company A never told their customers that they added the East datacenter, and the customers' firewalls block access to the IP addresses in the East datacenter. Another example is when an infrastructure provider such as AWS changes the IP address assignment of Company A's public widget service. When the IP address of the widget service changes, there is usually no notification to the widget service owner, so consumers of the widget service are not notified of the change and the service becomes inaccessible due to firewall rules.

Project Description

This project could go one of two ways. The first step would be researching the industry to see if there is an existing framework that could be used to solve the problem statement. If so, then the project would consist of integrating that framework in a secure way to automate dynamic firewall Access Control Lists (ACLs).

If there were no existing solutions, then the project would involve validating the problem statement and taking those findings to design and build a prototype for an industry-wide solution.  

Possible starting points:

  • How to manage dynamic cloud service’s IP addresses – Example:  Office 365 URLs and IP address ranges - Microsoft 365 Enterprise | Microsoft Learn.  Here Microsoft does a great job at providing IP address information for their services. Can an industry standard solution be built to provide a common framework for all cloud services? 
  • How to manage ACLs for dynamic Cloud Services – App-ID standard seems to provide a way to build ACLs based on Cloud services -  App-ID - Palo Alto Networks.  The focus of this project is to provide a common framework for Cloud services to share their dynamic IP address details in a consumable way for things like App-ID.   

Technologies and Other Constraints

The project would need to use open standards to solve the problem at an industry-wide level. Due to time constraints, the prototype could focus on a few vendors.

Programming language: any modern language

Networking skills: Understanding Network ACLs 

Operating Systems: Windows, Linux – need understanding of how DNS works. 

Since time is a constraint, the deliverable would be:

Required: design document

If time: a working example of a “DNS type” service for looking up cloud IP Addresses

If time: a working example of modifying the ACLs of a router based on the data in the “DNS type” service. 
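
As a concrete illustration of the “DNS type” lookup idea, note that the Office 365 IP ranges referenced above are already published by Microsoft as a JSON web service. A minimal sketch of a client (the URL and field names follow Microsoft’s published documentation, but should be verified against the page linked above):

import json
import uuid
from urllib.request import urlopen

# Microsoft publishes Office 365 endpoints as JSON; a client-supplied
# GUID identifies the requester, per the documentation.
url = ("https://endpoints.office.com/endpoints/worldwide"
       f"?clientrequestid={uuid.uuid4()}")

with urlopen(url) as resp:
    endpoints = json.load(resp)

# Print the published IP ranges for each service area.
for entry in endpoints:
    for ip in entry.get("ips", []):
        print(entry["serviceArea"], ip)

A common framework could standardize exactly this kind of feed across cloud vendors.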

Sponsor Background

The Christmas Tree Genetics Program (CTG) at NC State’s Whitehill Lab is working on genomic tools to develop elite Fraser fir trees. Graduate students are working on elucidating the mechanisms involved in the trees’ ability to handle disease pressure, pest problems, and challenges brought about by climate change. Understanding these mechanisms allows the researchers to develop Christmas trees that are more resilient to biotic and abiotic stressors.

Background and Problem Statement

Scientists in the CTG program handle a large amount of plant material, such as unique individual trees, cones, seeds, embryos, cultures, and clones. Currently, all the data is managed in Microsoft Excel, which will quickly become inadequate as the amount of plant material information that needs to be stored grows. Plant material tracking is key for data integrity: we need to know what is what, when the material was last transferred, and its current location. A database will help manage our inventory and prevent data loss and mismanagement. Such a database is referred to as a Laboratory Inventory Management System, or LIMS.

Project Description

This is the second round of development for the ROOTS database, which started as a CSC Senior Design in Spring 2023.

ROOTS is a repository of data related to CTG’s research activities both in the fields and in the laboratory.

The various steps of the protocols used by the research group are represented in the database. Individual plant materials of various stages are saved in the database (trees, cones, seeds, embryos…) along with metadata (origin, transfer date, quantity, location…)

The first round of development, by the Senior Design team in Spring 2023,  resulted in a strong emphasis on lineage tracking and nomenclature. The ROOTS DB ensures that the seeds from a tree are connected to the parents and the progeny (“children”). The naming nomenclature contains specific information related to the tree breeding work done by the CTG. The system has three types of users: user, superuser and admin.  The user has viewing privileges only. The superuser can add, modify, and discard data in the system, and generate reports of material data based on species, genealogy, and other criteria. The admin has additional permission to add new users, superusers, and admins to the system.    

The second round of development for ROOTS 2.0 will focus on addressing feedback from the users after testing of ROOTS 1.0. The Christmas Tree Genetics program has two main outstanding requirements:

  • Installing ROOTS database on the CTG server via Docker
  • Successfully running through the entire process of Initiation of embryos from seeds, maintenance of embryos, maturation of embryos, germination of embryos and acclimatization of young trees.

 It will also focus on other features needed in ROOTS such as:

  • Storage of pictures and notes of plant material
  • Reporting (generate reports containing requested data)
  • Off-line mode (work offline, update DB when back online)
  • Scheduling (calculate date intervals and alert when the next transfer to fresh medium is needed).

 

Technologies and Other Constraints

ROOTS is a web application using the following stack:

  • Frontend: React with the Material UI and NPM QR Reader packages
  • Backend: NodeJS with an Express.JS framework and Sequelize for the Object-Relational Mapper
  • Database: MySQL
  • Authentication: Shibboleth
  • Containerized using Docker

Sponsor Background

Dr. Tiffany Barnes and Dr. Veronica Cateté lead computer science education research in the department of computer science at NC State University. Dr. Barnes uses data-driven insights to develop tools that assist learners’ skill and knowledge acquisition. Dr. Cateté works closely with K-12 teachers and students conducting field studies of technology use and computing in the classroom.

Together they have advocated for the use of the block-based Snap! programming environment and have worked closely to develop solutions for live classrooms, engaging over 800 students and teachers each school year in computing-infused lessons. 

Background and Problem Statement

To help address the nation’s critical need for a more computer science literate populace, researchers and educators have been developing interventions for increased exposure and equitable learning experiences in computer science for K-12 students. In addition to the creation of standalone courses like AP CS Principles, Exploring Computer Science, and CS Discoveries, we have also been working on developing integrated computer science experiences for common K-12 classes such as English, Math, Science, and Social Studies. 

To support the new influx of teachers and educators from various backgrounds teaching block-based programming lessons, we developed a wrapper for the Snap! language called SnapClass. This system supports assignment creation and student submission, as well as project grading. The tool was developed using various programming paradigms, and after initial deployment and usability testing, we have new feedback to address user needs and tool functionality.  

SnapClass v4.0 will build off of the work done by three prior Senior Design teams (Spring 2022, Fall 2022 and Spring 2023). The prior teams have added useful functionality to SnapClass such as the integration of multiple block-based programming languages into the environment, a FAQ for students working in Snap, mechanisms for auto-saving code, differentiated assignment to students based on skill level, non-coding assignments, etc.

Project Description

SnapClass, like other software projects, is expected to have a long lifespan, with new features and updates added over time. Regular bug fixes, updates, and optimizations are necessary to keep the software running smoothly. For the SnapClass system to reach a new level of users, the codebase needs to scale accordingly: new features, modules, and components should be easy to add without compromising the stability or performance of the system. With a well-structured and maintainable codebase, we can more easily adapt to changing user requirements and integrate more third-party libraries or frameworks, such as LMS support. 

Prior developers and researchers working on SnapClass have put together a list of defects and functionality that falls short of what K-12 educators desire. This semester, for SnapClass v4.0, we would like to work with the team of students on the final 15% of the project: polishing usability and functionality, and improving overall system effectiveness. As the team catalogs the inventory of improvements, we encourage them to research software architecture and database best practices so that they may have the opportunity to refactor different modules of SnapClass. 

Some of the changes that need to be made follow, and a further description of each can be found here:

  • Make the following changes to the section roster: 1. Convert adding a student to a pop up (it is currently a line in a table), 2. Remove the helper column, 3. Add functionality to the “Edit Section” button.
  • Make the following changes to the “Student Help” feature: 1. Allow students to lower their hand, 2. Remove the section dropdown, 3. Show students requesting help for all sections in one view, and 4. Add a button for teachers to clear the queue.
  • Add functionality to the gradebook: Add links to the submission grades so that teachers can go directly to the submission from the gradebook.
  • Make the following changes to the auto saving feature in the student portal: 1. Make sure it works correctly, 2. Autosave should occur every five minutes, and 3. Format the timestamp to be shorter.
  • Add peer and self review assignments.
  • Create a new HomePage that explains the system and its functionality.
  • Create a new set of resources (e.g. walkthroughs, FAQs, with images and/or videos) for teacher and student users to help them learn how to use the system.

Technologies and Other Constraints

  • Technologies required for Web-first Snapclass development 
    • Snap!
    • Cellular
    • JavaScript
    • HTML
    • SQL/phpMyAdmin
  • Current Technologies in use for Snapclass wrapper (Flexible for maintenance/growth) 
    • Node.JS
    • Mocha (Unit Testing)
    • Angular
    • MySQL

Sponsor Background

Adam Gaweda is an Assistant Teaching Professor for the Computer Science department at NC State University. As a member of the NCSU CSC Faculty he, like many faculty, needs to provide quality instruction while addressing academic misconduct enabled by assignment materials and solutions that are easily searchable online.

Background and Problem Statement

With the rising prominence of platforms used for cheating, like Chegg and ChatGPT, faculty must periodically and proactively monitor these external platforms for potential academic integrity and copyright violations. Furthermore, instructors must regularly conduct searches on current and prior assignments, since students may post assignment materials while seeking assistance during an assignment’s initial release, as well as in portfolios after the course is over. However, the time and effort required to do so continue to grow as student enrollments increase, and the process is becoming increasingly difficult to manage as new course assignments are created and released.

Project Description

The purpose of this project is to assist instructors with monitoring known websites, such as Chegg, CourseHero, GitHub, ChatGPT, etc., for unauthorized postings of course materials or solutions for course assignments. The project would allow the installation and configuration of plugins that enable monitoring frequently used platforms and tools for course materials. Likewise, since the likelihood of a potential violation may decrease over time after an assignment is released, the tool should gradually reduce the frequency with which the material is scanned for. For example, if CSC 116 releases the assignment requirements for Homework 1, the tool should scan for potential violations daily, but shift to a weekly or monthly scan after its due date. In addition, scans should be scheduled to avoid server or API rate limitations set by the platforms. For this project, the team will create the core software, as well as 1-2 plugins that will interact with external services.
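
A hedged sketch of that decaying-frequency idea (the thresholds here are illustrative, not specified by the sponsor):

from datetime import date, timedelta

def scan_interval(due_date: date, today: date) -> timedelta:
    """Scan daily while an assignment is active, then back off to
    weekly and monthly as a new posting becomes less likely."""
    days_past_due = (today - due_date).days
    if days_past_due <= 0:
        return timedelta(days=1)   # before the due date: daily scans
    if days_past_due <= 30:
        return timedelta(weeks=1)  # recently due: weekly scans
    return timedelta(days=30)      # old assignment: monthly scans

# e.g., next_scan = last_scan + scan_interval(hw1_due_date, date.today())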

One potential solution involves a web-based platform that would enable several possible features, including:

  • Allowing instructors to post the text from assignment instructions or specific keywords that should be monitored for specific courses
  • Allowing the installation of plugins for communicating with external services (such as Chegg, CourseHero, GitHub, ChatGPT, etc.); a minimal plugin interface is sketched after this list
  • Each plugin may need to support searching or automatic scraping of webpages to extract content (while some tools may provide API access for analysis, other platforms may require the team to develop web-scraping scripts)
  • Scheduling tasks, such as scans of external platforms (since these searches need to be regularly monitored, the platform would use server configurations such as cron jobs to schedule periodic scans for the installed plugins)
  • Notifying instructors of possible concerns (if a potential violation is found, the platform should send an alert to the instructor with description and link)
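
A minimal sketch of what such a plugin interface might look like (the class and method names are hypothetical, not an existing API):

from abc import ABC, abstractmethod

class PlatformPlugin(ABC):
    """Hypothetical base class that each external-service plugin implements."""

    name: str  # e.g., "chegg", "github"

    @abstractmethod
    def search(self, keywords: list[str]) -> list[dict]:
        """Search the platform (via API or scraping) and return candidate
        matches, e.g., [{"url": ..., "snippet": ...}, ...]."""

class GitHubPlugin(PlatformPlugin):
    name = "github"

    def search(self, keywords):
        # Query GitHub's code-search API here and normalize the results.
        ...

The scheduler would then call each installed plugin’s search() at whatever interval the decay policy dictates.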

Students involved in this project would need to use HTML/CSS/JavaScript, as well as backend web development frameworks, to build the web application. Students will also need to develop HTML parsing scripts that can extract information from these platforms, and server configurations to schedule regular evaluation of the course materials. In addition, students would need to interact with any available LLM APIs while working on this project. Finally, a form of communication to inform instructors about potential matches should be developed.

Technologies and Other Constraints

API and Web-scraping (such as BeautifulSoup, scrapy, lxml, and requests)

Web-based (Flask or Django preferred, but other web frameworks will be considered)

Database (Postgres or MySQL, SQLAlchemy)

Linux-based Service Scheduling and Mailing tools (cron, Celery, but flexible to specific tools, though should be installable to servers running CentOS or RHEL)

NLP (text similarity, awareness of basic concepts preferred)

Sponsor Background

LexisNexis® InterAction® is a flexible and uniquely designed CRM platform that drives business development, marketing, and increased client satisfaction for legal and professional services firms. InterAction provides features and functionality that dramatically improve the tracking and mapping of the firm’s key relationships – who knows whom, areas of expertise, up-to-date case work and litigation – and makes this information actionable through marketing automation, opportunity management, client meeting and activity management, matter and engagement tracking, referral management, and relationship-based business development.

Background and Problem Statement

Effective tools fit with the way you work.  With that in mind, LexisNexis InterAction® has a series of Microsoft Office Add-ins and integrations that allow users to access their customer data from Outlook, Excel & Word.

Rather than using Microsoft Office, however, many smaller legal firms are turning to Google Workspaces to manage their emails, contacts, and calendars. Currently, InterAction doesn’t have any support for Google applications. 

Project Description

The objective of this project is to produce LexisNexis InterAction tools that integrate with Google Workspaces.  We would like to 1) create a process that allows users to synchronize contact data between InterAction and Google Workspaces, and then 2) provide a background service to automate this process.

The first of these should be an Add-On to the Google Contacts application,  similar to the InterAction Microsoft Office Add-in, that would allow the user to:

  • Indicate whether the contact is known in InterAction
  • Create a new InterAction contact from the Google Contacts data
  • Show differences, and allow for the exchange/update of data

The second would be a Background Service using the Google Workspaces API to synchronize changes to contact data with InterAction.
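
For reference, Google exposes contact data through its People API. A hedged sketch of how the background service might pull contacts for comparison with InterAction (assumes the google-api-python-client package and OAuth credentials already obtained via the standard google-auth flow):

from googleapiclient.discovery import build

def fetch_google_contacts(creds):
    # `creds` come from the usual OAuth consent flow.
    service = build("people", "v1", credentials=creds)
    results = service.people().connections().list(
        resourceName="people/me",
        personFields="names,emailAddresses,organizations",
        pageSize=100,
    ).execute()
    return results.get("connections", [])

# Each contact can then be matched against InterAction records to decide
# whether to create a new contact, update one, or surface differences.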

As a stretch goal, the team should investigate how a similar approach could also be applied to Gmail or Calendar and InterAction.

Technologies and Other Constraints

The team may choose their technology stack with any mix of JavaScript, Python, and C#. 

Angular 14 and D3 should be used for any front end and visualizations.

An overview of the InterAction MS Office Add-ins will be given, together with a resource pack for styling.

Credentials and access to a test instance of InterAction will also be provided.

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

McAfee is a worldwide leader in online protection. We’re focused on protecting people, not devices. Our solutions adapt to our customers’ needs and empower them to confidently experience life online through integrated, easy-to-use solutions.

Background and Problem Statement

Scam or phishing websites are increasingly cheap and easy to stand up; they run for a short period of time and are then torn down, potentially before traditional cybersecurity tools can be run against them. Real-time analysis, pulling apart a suspicious website’s individual components on the client machine and delegating to cloud AI, will provide improved protection for consumers.

Project Description

We would like students to build a browser extension that will dissect a web page for relevant items of interest for further analysis such as video, audio, images or blocks of text. 

Proposed Solution:

Build a Cloud API that will combine custom video/audio/image/text classification (such as “AI Generated”, “Phishing”, “Scam”, etc.) using existing open source models or new models created from open source datasets (for example, data from this paper). This will be used in conjunction with McAfee Threat Intelligence, which will be provided as a REST API.

Build a web browser extension that extracts relevant items of interest and makes an API call to a cloud service to get an evaluation of each item, which can include a proprietary trust score, site categorization, and additional properties, and overlays the results on top of the item in question in the browser in real time. 

A complete solution will include a browser client to navigate to an arbitrary web page, identify the objects of interest, query the cloud service for the classification details and display the result to the user inline on the web page.

Examples:

  • For a photograph in a news article, query the cloud API and overlay the results of the API call in the bottom right of the photograph.
  • For a URL link on a web page, query the cloud API and when the user hovers over the URL link, display a popup with the relevant results of the API call.
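
On the cloud side, a hedged sketch of the evaluation endpoint the extension might call (Flask is one plausible choice; the route and field names such as trust_score are illustrative, not a McAfee API):

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/evaluate", methods=["POST"])
def evaluate():
    item = request.get_json()  # e.g., {"type": "image", "url": "..."}
    # 1. Run the appropriate open-source classifier for the item type.
    # 2. Combine with McAfee Threat Intelligence via the provided REST API.
    return jsonify({
        "trust_score": 0.42,         # illustrative placeholder
        "categories": ["phishing"],  # illustrative placeholder
        "classifier": "open-source-model-v1",
    })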

Technologies and Other Constraints

The solution should be developed for the Google Chrome browser (all platforms), Microsoft Edge, Mozilla Firefox and the Apple Safari browser (macOS and iOS).

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

Sterling McLeod is an Assistant Teaching Professor for the Computer Science Department at NC State University.  He teaches the CSC 116 course, where multiple different instructors each semester teach hundreds of students in different sections.  Dr. McLeod would like to create a way for all CSC 116 instructors to have isomorphic tests so that the assessment of students is consistent across all sections.

Background and Problem Statement

The goal of this project is to develop a password-protected website that will act as a repository for isomorphic test questions. This will allow easier test creation that is consistent among many sections of a course, while still allowing some flexibility for each instructor to choose which questions they want to ask.

Consistency in assessment among many instructors of a course is critical in ensuring that each student is assessed fairly. It’s also important to ensure all course Student Learning Outcomes (SLOs) are being assessed properly. An example set of SLOs is below.

Upon successful completion of this course, a student will be able to...

  1. Apply classic problem-solving techniques to simple computational and information-management problems.
  2. Evaluate an arithmetic expression using order of operations, promotion from integer to floating-point types, and integer division.
  3. Use a programming language to write code that selects one of several alternatives based on more than one predicate.
  4. Use a programming language to write a loop whose exit depends on more than one predicate.
  5. Correct syntax errors and distinguish between them and runtime errors or errors in logic.
  6. Find and correct logical programming errors using debugging printout, pencil-and-paper tracing, and systematic search (to locate where an incorrect decision or value first appears).
  7. Verify and validate programs using unit and system testing.
  8. Implement an object-oriented design that has at least two interacting, encapsulated classes.
  9. Write and document programs that adhere to specific coding and documentation standards (e.g., javadoc for documentation; conventions regarding the naming of classes and methods, definition of constants, indention, etc.).
  10. Use the Java system classes to do text-based input and output.
  11. Construct and use arrays with one and two dimensions.
  12. Use programming language constructs learned in the course to implement a fully-specified and fully-tested encapsulated system.

Problems in assessment often arise when instructors have varying degrees of difficulty in their test questions, when certain SLOs are left out of assessments, and/or when questions are graded with different degrees of rigor. This can lead to inconsistencies in students’ preparedness for subsequent courses, bottlenecks in student degree progression (which cause space issues for the university), and inconsistencies in students’ Grade Point Averages (GPAs).

Test consistency is difficult to obtain for several reasons. When there are many instructors for a course (4+), agreeing on test rigor can be an endless discussion. Instructors can be partial to questions they create and unwilling to leave them out. Sometimes instructors simply may not want to go through the effort of creating new test questions, solutions, and rubrics due to the enormous amount of time that task can take.

This project aims to address these issues by providing a mechanism for instructors to create tests that are consistent with other sections. The proposed platform for the tool will be a website allowing faculty to do the following:

  • Store questions with their corresponding answers and grading rubric. This will include storing them with special formatting, such as computer programming code, formulas, etc.
  • Tag each question with the topic, Bloom’s taxonomy level, and other relevant information, such as tags indicating which tests the question has appeared on and/or student performance on the question.
  • Search for questions based on question title, tags, or other information.
  • Export questions to various formats based on test creation tool.
    • LaTeX source, Google Doc, Markdown, HTML, etc.
  • Link questions to course SLOs.

All questions stored on the site will be approved by a team of faculty relevant to the course. This will enable faculty to continue using questions they are partial to but ensure that each question meets a baseline level of rigor.

The benefits of this work will be:

  1. Consistency when assessing the roughly 250+ students that are enrolled in CSC 116 every semester across many sections and instructors.
  2. Creating consistent tests will be far easier for faculty.
  3. Faculty can create consistent tests while still having flexibility in the questions that appear on the exams.
  4. Each question that appears on a test will be from a curated repository of questions.

Project Description

I think a website will be the best platform for this project. Each question should have various data associated with it, such as:

  • The question’s text in plain text
  • The LaTeX source code, which can be input by faculty when creating questions. I don’t expect students to create a tool to automatically convert between plain text and LaTeX source.
  • A solution to the question
  • A grading rubric
  • A way to alter parts of the questions, such as any magic numbers, if applicable to the question
  • Tags to make the question searchable. These can be keywords like “arrays” or Bloom’s taxonomy levels like “Comprehension” or both

Storing this data can be done with a simple JSON and/or in a more formal database.
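
A hedged sketch of one question stored as simple JSON (all field names are illustrative):

import json

question = {
    "title": "Loop with two exit predicates",
    "text": "Write a loop whose exit depends on ...",
    "latex_source": r"\begin{question} ... \end{question}",
    "solution": "...",
    "rubric": [{"criterion": "correct loop exit", "points": 4}],
    "parameters": {"array_size": 10},    # "magic numbers" to vary
    "tags": ["loops", "Comprehension"],  # topic and Bloom's level
    "slos": [4],                         # linked course SLOs
    "appeared_on": ["Fall 2023 Test 1"],
}

print(json.dumps(question, indent=2))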

Making a website as opposed to some other platform will be nice because it can be easily accessed by many faculty. The website will need some kind of password protection. If a static page site (like Jekyll, Hugo, etc.) can be used then that would be great, but I’m not familiar enough with web development to envision a solution using those.

Technologies and Other Constraints

There are none.

Sponsor Background

IBM is a leading cloud platform and cognitive solutions company. Restlessly reinventing since 1911, we are the largest technology and consulting employer in the world, with more than 350,000 employees serving clients in 170 countries. With Watsonx, the AI platform for business, powered by data, we are building industry-based solutions to real-world problems. For more than seven decades, IBM Research has defined the future of information technology with more than 3,000 researchers in 12 labs located across six continents.

The Department of Forest Biomaterials at NC State’s College of Natural Resources is home to one of the oldest and most respected paper science and engineering programs in the world focused on cutting edge sustainable materials innovation and smart manufacturing.

Background and Problem Statement

As of 2018, 300 MM tons of municipal solid waste (MSW) was available in the US. Of that
material, about 50% ends up in landfills, which is a growing concern for all communities due to its significant impact on the global environment. Therefore, the rational use or valorization of MSW is essential in a future economy based on sustainability.

We are working on AI-driven MSW characterization with the use of visual, multi-spectral, and hyperspectral sensors. The idea is to build and train models to identify the types of materials (paper, plastics, food, textiles, etc.) in real time. Specifically, we plan to build AR-assisted sorting for the workforce in a materials recycling facility (MRF). One step toward this is to augment reality by tracking multiple objects moving on a belt and putting labels (color and text) on each object.

Project Description

Our objective is to augment reality by tracking multiple objects on a moving belt and putting labels
(color and text) on each object. One example scenario is described below:

Every 10 seconds, a set of objects is placed on a belt. The AR engine puts labels (color codes and text) on each object and tracks its position until the object comes off the conveyor belt. The labels and the initial positions of the objects will already be given to the team, to scope the project.

Last semester, one of the CSC Senior Design teams worked on the initial proof of concept with good success; however, the accuracy of the labeling/tracking needs to be further improved and implemented on a real conveyor system. Eventually, this work will be integrated into an AI system to identify and label the objects.
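
A hedged sketch of the per-object tracking loop using OpenCV (assumes the opencv-contrib-python package; depending on the OpenCV version, the tracker constructor may live under cv2.legacy instead):

import cv2

def track_objects(video_path, initial_boxes, labels):
    """initial_boxes: list of (x, y, w, h) given at placement time;
    labels: parallel list of label text, as described in the scenario."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    trackers = []
    for box in initial_boxes:
        t = cv2.TrackerCSRT_create()  # CSRT: accurate at moderate speed
        t.init(frame, box)
        trackers.append(t)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for t, label in zip(trackers, labels):
            found, (x, y, w, h) = t.update(frame)
            if found:
                p1, p2 = (int(x), int(y)), (int(x + w), int(y + h))
                cv2.rectangle(frame, p1, p2, (0, 255, 0), 2)
                cv2.putText(frame, label, (p1[0], p1[1] - 5),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        cv2.imshow("belt", frame)
        if cv2.waitKey(1) == 27:  # Esc to quit
            break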

Skills: Computer Vision, AR toolkits, 3D Visualization

Technologies and Other Constraints

GitHub (preferred)
Computer Vision, AR

Sponsor Background 

The Laboratory for Analytic Sciences (LAS) is a research organization in support of the U.S. Government, working to develop new analytic tradecraft, techniques, and technology that help intelligence analysts better perform complex tasks. Processing large volumes of data is a foundational capability in support of many analysis tools and workflows. Any improvements to existing processes and procedures, whether they are measured in time, efficiency, or stability, can have significant and broad reaching impact on the intelligence community’s ability to supply decision-makers and operational stakeholders with accurate and timely information. 

Background and Problem Statement 

Modern media sources produce immense quantities of speech audio recordings every day across the globe. Information producers and consumers both benefit from cross-lingual transcriptions. However, language analysts are of course overwhelmed in this environment, and in most cases employing their services is cost-prohibitive. Thankfully, machine-learning methods have generated moderately capable speech-to-text (STT) and machine translation (MT) algorithms which are far faster, and more economical, to deploy. Of course, while for some applications these solutions are sufficient, regional dialects and accents are complicating factors and it is simply the case that the accuracy of the models is often lacking, even for common languages. These shortcomings limit, and even prohibit, the utility of STT and MT for many applications. We desire to improve the efficacy of STT and MT capabilities, with a present emphasis on the former. 

To create STT algorithms, a machine learning model is provided with ground truth samples of both speech recordings and associated, human-transcribed, text. Through complicated processes beyond the scope of the present document (and this project), the model learns to correlate elements of speech utterances with phonemes and words. Loosely speaking, larger machine learning models (in terms of the number of trainable parameters they contain) outperform smaller models, however the larger a model is the more training data is typically required to train it. Thus an ever-present issue in the machine learning world is the difficulty and cost associated with creating or acquiring a large corpus of ground truth data to use for training models. What follows is an approach to acquire additional ground truth training data for STT algorithms, which will, in turn, enable data scientists to train more, and larger, STT models. It will also afford opportunities to “fine-tune” models for specific dialects and accents of particular interest. 

One common workflow for language analysts is to transcribe a foreign language audio recording directly into the desired language of interest. For example, an audio recording of a Spanish speaker may be transcribed/translated directly into English by an analyst. The same recording may have STT and MT applied to achieve a similar, albeit typically far less accurate, result. However, since the language analyst has already decided to transcribe/translate the audio, there is opportunity to record ground truth training data that could later be used to improve the accuracy of STT and MT algorithms at a smaller cost than would otherwise be required. “All” that is needed is for the analyst to take the extra time to “correct” the STT output. In practice, this “correcting” process means that the analyst is presented the inaccurate STT output in a text editor, and is asked to edit the presented text into the accurate transcription. Analysts are already encouraged to perform this task. However, it is already time-consuming for analysts to perform transcription/translation, and in cases where the STT output is quite poor and requires many edits, the act of correcting may take a significant amount of time to perform. This time requirement is often simply too tall an order to add onto the analyst’s workflow, where moving on to transcribe/translate the next audio recording may reasonably take priority. 

If we can develop an automated procedure to use the, e.g., SpanishAudio-to-EnglishText transcription/translation that the analyst performed to improve the SpanishAudio-to-SpanishText transcription that the STT algorithm generates, this would reduce the editing burden on the analyst that would be required to correct the output. The following project description describes one such automated procedure which the LAS has performed minimal testing on, and which appears promising enough that we wish to generate a prototype implementation enabling testing on a larger, and broader, scale. 

Project Description 

The student team is to create a pipeline to enable testing of various ways in which an analyst’s gold-standard translation might be leveraged to facilitate correction and truth-marking of STT, thus ultimately improving the output of an STT algorithm. As a developmental use case, the team is to implement the method described below as a first attempt at leveraging the gold-standard translation. 

First, off-the-shelf MT algorithms will be selected and used to convert the foreign-language STT to the native language. Next, we attempt to automatically improve the STT output using a large language model (LLM, e.g., ChatGPT). We ask an LLM to compare the machine-translated STT and the analyst-derived translation, and to extract whatever differences may be present. We then ask the LLM to use these extracted differences to correct the original foreign-language STT. Finally, we ask the LLM which sections, or chunks of text, it is most confident are improved from the original STT output. These LLM-related tasks are to be scripted by the student team, using an LAS-provided API key for an LLM. Finally, the student team is to develop a very basic interface by which those sections, or chunks of text, from the original STT output that the LLM is most confident are improved can be presented to the analyst for correction. The “correct” answer is known only to the analyst, but for testing purposes the LAS will provide a set of “correct” answers (ground truth data, akin to an analyst-derived translation into the foreign language) and ask the students to implement a calculation of the Levenshtein distance between the LLM-improved STT and the ground truth, as well as the distance between the original STT and the ground truth. This will enable evaluation of the above method, and of others that the LAS may later incorporate into the pipeline the student team creates. 
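
A hedged sketch of the three LLM steps just described (call_llm is a placeholder for whatever client wraps the LAS-provided API key; the prompts are illustrative):

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to the LAS-provided LLM, return its reply."""
    raise NotImplementedError

def improve_stt(stt_foreign: str, analyst_native: str, mt_of_stt: str):
    # 1. Extract differences between the two native-language texts.
    diffs = call_llm(
        "List the differences between these two texts.\n"
        f"Machine translation of STT: {mt_of_stt}\n"
        f"Analyst translation: {analyst_native}"
    )
    # 2. Use those differences to correct the original foreign-language STT.
    corrected = call_llm(
        "Correct this transcription using the listed differences.\n"
        f"Differences: {diffs}\nOriginal STT: {stt_foreign}"
    )
    # 3. Ask which chunks the LLM is most confident it improved.
    confident_chunks = call_llm(
        "Which sections of the corrected text are you most confident "
        f"improve on the original?\nOriginal: {stt_foreign}\n"
        f"Corrected: {corrected}"
    )
    return corrected, confident_chunks  # chunks feed the analyst review interface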

Note that if time permits, a simple variation of the above method could also be supported without much difficulty. In this variation, rather than using the MT to convert the foreign language STT to the native language, the MT would be used to convert the native language translation to the foreign language. The remaining steps would be very similar, i.e. we would then again use a LLM to compare, identify differences, and use those differences to correct the original STT, the only difference being that the comparison would happen in the foreign language rather than the native language. A flow diagram is presented below depicting these two approaches. 

Below is an example, where the foreign language of the audio was Spanish, and the native language was English (examples from tatoeba.org). In this case, the STT underwent machine translation into the native language, and the STT “corrected” by the LLM is accurate to the ground truth data.

STT: la trufa es un hongo que vive en simbiosis con las raíces de algunas plantas tales como Robles avellanos as a House

MT of STT: The truffle is a fungus that lives in symbiosis with the roots of some plants such as Oaks, hazelnuts, as a House.

Analyst translation: A truffle is a fungus which lives in symbiosis with the roots of certain plants, such as oak, hazel, beech, poplar, and willow trees

ChatGPT corrected STT: La trufa es un hongo que vive en simbiosis con las raíces de algunas plantas, tales como robles, avellanos, hayas, álamos y sauces.

Ground truth: La trufa es un hongo que vive en simbiosis con las raíces de algunas plantas, tales como robles, avellanos, hayas, álamos y sauces.

Levenshtein distance: STT to ground truth is 21; ChatGPT-corrected STT to ground truth is 0.

Below is another example where the foreign language of the audio was Spanish and the native language was English (examples from tatoeba.org). In this case, the analyst translation underwent machine translation into the foreign language, and the STT “corrected” by the LLM does not appear to be improved over the original (at least in terms of Levenshtein distance). This example illustrates why we want the student team to develop the pipeline to support multiple methods: development of a consistently successful method is under current investigation at the LAS.

STT: Un reportero se aprovecha de lo que obtiene de cualquier fuente y uso de las del tipo a dicho, un pajarito.

Analyst translation: A good newspaper reporter takes advantage of what he learns from any source, even the "little bird told him so" type of source.

MT of Analyst translation: Un buen reportero de un periódico aprovecha lo que aprende de cualquier fuente, incluso del tipo de fuente "el pajarito se lo dijo".

ChatGPT corrected STT: Un buen reportero se aprovecha de lo que obtiene de cualquier fuente y hace uso de las fuentes de tipo "dicho por un pajarito".

Ground truth: Un buen reportero se aprovecha de lo que obtiene de cualquier fuente, incluso de las del tipo "me lo ha dicho un pajarito".

Levenshtein distance: STT to ground truth is 21; ChatGPT-corrected STT to ground truth is 25.

The student team is asked to design, engineer, and develop a prototype incorporating as many of the below key features as possible: 

  1. The prototype is to have the capability to take audio samples in any language and produce STT in that language. Using a ground truth, human-derived translation in the native language, the prototype will use MT to translate either the STT or the human-derived translation so they are in a common language, and implement the above-described LLM queries. The Levenshtein distance between the resulting “improved” STT and the ground truth, and between the original STT and the ground truth, will then be calculated (a reference implementation is sketched after this list). 
  2. The prototype is to employ LLM prompts resulting from a basic experiment in prompt-engineering that the student team performs. 
  3. The prototype should be engineered in a manner that it supports any foreign and native languages desired, up to the availability of STT and MT algorithms and the languages supported by the LLM utilized. For a first prototype, it is sufficient if the prototype supports several of the most common foreign and desired languages (TBD during the semester, but perhaps English, Mandarin, Hindi, Spanish, Arabic). 
  4. The prototype should be engineered in a manner such that additional components can be added on at a later date; for example, in the future we may want to improve the process by biasing the LLM towards corrected STT that maintains phonetic characteristics of the original STT. 
  5. The prototype should allow collections of audio samples (in any language) and ground truth data (in any language) to be uploaded, and to return statistics comparing the original STT, improved STT, and ground truth data, to the researcher. 
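
A reference implementation of the Levenshtein distance mentioned in feature 1 (the standard dynamic-programming formulation; an off-the-shelf library would work equally well):

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution
            ))
        prev = curr
    return prev[len(b)]

# Evaluation compares levenshtein(original_stt, truth)
# against levenshtein(corrected_stt, truth).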

The LAS will provide the student team with a dataset for testing (most likely from Tatoeba, an open source collection of crowdsourced sentences and translations, or from the Linguistic Data Consortium’s “Call Home” data sets), an extensible software development environment utilizing AWS resources, access to an LLM (e.g. an API key for chatgpt or similar), and expert consulting/mentoring. Open-source STT and MT algorithms are available for use.

Technologies and Other Constraints 

The prototype should be stand-alone and should not have any restrictions (e.g. no enterprise licenses required), with the possible exception of the LLM utilized. In general, we will need this application to operate on commodity hardware and be accessible via a standard modern browser (e.g. Chrome, Microsoft Edge, etc). Beyond those constraints, technology choices will generally be considered design decisions left to the student team. That said, the LAS sponsors for this team have experience with the following technologies and will be better able to assist if they are utilized: 

  1. Python 
  2. JavaScript frontend frameworks like Vue.js, Angular.js, or React.js 
  3. Python or PHP backend frameworks like Flask or Django 
  4. SQL and NO-SQL databases 
  5. Docker 

ALSO NOTE: Public distributions of research performed in conjunction with USG persons or groups are subject to pre-publication review by the USG. In the case of the LAS, typically this review process is performed with great expediency, is transparent to research partners, and is of little to no consequence to the students. 

Sponsor Background 

The Laboratory for Analytic Sciences (LAS) is a research organization in support of the U.S. Government, working to develop new analytic tradecraft, techniques, and technology that help intelligence analysts better perform complex tasks. Processing large volumes of data is a foundational capability in support of many analysis tools and workflows. Any improvements to existing processes and procedures, whether they are measured in time, efficiency, or stability, can have significant and broad reaching impact on the intelligence community’s ability to supply decision-makers and operational stakeholders with accurate and timely information. 

Background and Problem Statement 

Our main goal this semester is to demonstrate the need for a machine learning based recommender system for data prioritization in an enterprise setting. 

In a typical large-scale enterprise, data acquisition, processing, storage, and searching may be split across many different systems. With the introduction of machine learning and business intelligence, the process now also involves combining all of this data from multiple sources into a large, central repository (a data warehouse) for searching and indexing. The process whereby this occurs is called the Extract, Transform, and Load (ETL) process. As each system evolves and grows independently, some data may not be indexed for searching.

As data volumes continue to increase, there may be a time when the amount of data (and corresponding derivatives like machine learning feature embeddings) may exceed the total storage size of a data warehouse. We would like to continue creating a web application to demonstrate two different methods of managing this data in a user-prioritized manner. That is: 

  1. Manual prioritization - Allow the end-user to specify generalized rules for what data they would like to keep and how long they would need to keep it.
  2. Learned user priority - Infer what a set of users may care about and for how long based on specific activity metrics. 

We seek the help of NCSU Senior Design to further design a web application that can enable demonstration of both of these methods. 

Project Description 

Last semester, the NCSU Senior Design team created a basic application for specifying rules in the cybersecurity domain. A user was granted either edit or read-only permissions. Once within the application, they were able to add specific rules tailored to the use case (e.g., IP addresses with or without ports) into a group (called a bucket). Within each bucket, a user could reorder the rules. In the overall system, this ordering would allow a specific rule to store more data than another. 
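
A hedged sketch of how that bucket/rule ordering might look in the Django backend (model and field names are illustrative, not the actual schema from last semester):

from django.db import models

class Bucket(models.Model):
    name = models.CharField(max_length=100)

class Rule(models.Model):
    """e.g., an IP address with an optional port, ordered within its bucket."""
    bucket = models.ForeignKey(Bucket, related_name="rules",
                               on_delete=models.CASCADE)
    pattern = models.CharField(max_length=255)  # e.g., "10.0.0.5:443"
    position = models.PositiveIntegerField()    # reorderable priority

    class Meta:
        ordering = ["position"]  # higher-priority rules retain more data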

For the current semester, we seek the help of a team to add functionality to enhance manual rule-based prioritization and demonstrate visualizing results of a machine-learning based prioritization method. Generally, the team will be asked to do the following: 

  • Understand the current architecture and expand the capabilities 
  • Develop REST endpoints that ingest historical data 
  • Develop frontend UI components for a historical view of data priority 
  • Machine learning ETL with a new dataset and use-case 
  • (Optional) Machine learning recommender model building for demonstration use 

Specifically, most of the UI work will be in augmenting the bucket page with additional features to showcase the differences between the defined rules and the following days’ data. 

In addition to the new core capabilities, we may also want to explore a freely available open dataset that would pivot our demonstration closer to a news recommendation system (vs. cybersecurity). With a new domain, some of the code may need to be generalized for easier understanding. 

Technologies and Other Constraints 

We anticipate sharing last semester's code and guides with the team. This application has the following technology stack: 

  • Frontend: React.js with Material UI dashboard template
  • Backend: Django (Rest-Framework) 
  • Database: Postgres 

For testing, we would like the team to look at Jest or Cypress for end-to-end frontend testing.

ALSO NOTE: Public distributions of research performed in conjunction with USG persons or groups are subject to pre-publication review by the USG. In the case of the LAS, typically this review process is performed with great expediency, is transparent to research partners, and is of little to no consequence to the students.

Sponsor Background 

The Laboratory for Analytic Sciences (LAS) is a research organization in support of the U.S. Government, working to develop new analytic tradecraft, techniques, and technology that help intelligence analysts better perform complex tasks. Processing large volumes of data is a foundational capability in support of many analysis tools and workflows. Any improvements to existing processes and procedures, whether they are measured in time, efficiency, or stability, can have significant and broad reaching impact on the intelligence community’s ability to supply decision-makers and operational stakeholders with accurate and timely information. 

Background and Problem Statement 

LAS would like to implement, demonstrate and evaluate the applicability of using the orchestration technology Argo Workflows to automate MLOps (Machine Learning operations) tasks in a Kubernetes cluster. The core intent of this project is to enable integration of open source projects in a proper, systemic fashion. Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. There are many background tasks in complex systems that need automation and orchestration tools facilitate that need. Students will gain experience executing this integration in a way that benefits both the data scientists who will use the system, and themselves as engineers. 

As the nature of MLOps matures, so does the need to automate the process of deploying Machine Learning (ML) models and allowing them to scale and accommodate business demands. Data scientists are training more operational ML models than ever before, and the challenge of effectively utilizing these immensely powerful algorithms is quickly becoming one of management and orchestration engineering, rather than model development. Orchestration tools are often part of the glue that serves to automate a multitude of tasks in complex systems, both integrating different technologies and reducing maintenance burdens on engineers. Many entry-level software engineers don’t have experience with system integration and design. It’s beneficial to understand not only how to build an application but also how to make that application work as part of a larger system.

LAS is prototyping a ML Model Deployment Service (MDS) to facilitate the deployment and scaling of ML models in Kubernetes. A critical component of this system will be an orchestration tool to abstract and automate regular manual tasks, which can be very complex. Alleviating this burden from the data scientists is the ultimate payoff. 

Project Description 

LAS is seeking a team to demonstrate and evaluate the applicability of using the orchestration technology Argo Workflows to automate MLOps (Machine Learning operations) tasks in a Kubernetes cluster. After establishing initial workflows, the team will create a simple web interface and API to customize workflows and trigger execution. Minimal project deliverables may be demonstrated and tested using a small mini cluster like Kube or MiniKube. LAS will optionally provide ML model containers to test deployment. 
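
A hedged sketch of the trigger API using FastAPI, shelling out to the argo CLI (one simple option; Argo’s REST API or a Python SDK would be alternatives, and the route names are illustrative):

import subprocess
from fastapi import FastAPI

app = FastAPI()

@app.post("/workflows/{name}/submit")
def submit_workflow(name: str, namespace: str = "argo"):
    # Submit a workflow from a registered WorkflowTemplate.
    result = subprocess.run(
        ["argo", "submit", "--from", f"workflowtemplate/{name}",
         "-n", namespace, "-o", "name"],
        capture_output=True, text=True, check=True,
    )
    return {"workflow": result.stdout.strip()}

@app.get("/workflows/{name}/status")
def workflow_status(name: str, namespace: str = "argo"):
    # Return the workflow's current state for display in the UI.
    result = subprocess.run(
        ["argo", "get", name, "-n", namespace, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return {"raw": result.stdout}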

Project Deliverables (Minimal): 

  • A workflow to pull containerized ML model(s) from a docker registry and deploy them into a Kubernetes cluster with customizable deployment templates. 
  • A workflow to create a batch processing pipeline using the deployed ML Model(s).
  • Create a simple API and web interface allowing users to: 
    • Select ML Models and resources to include in a deployment, then trigger a workflow.
    • Return status of deployments to the UI. 

If the project team has successfully demonstrated progress in meeting the project's minimal deliverables, they may pursue some of the project’s stretch goals. 

Project Stretch Goals: 

  • Deploy the project work into an existing AWS Elastic Kubernetes cluster. (Access to an existing cluster will be provided.) 
  • An Ansible playbook to standardize deployment of Argo Workflow components into Kubernetes.
  • Experiment with creating and deploying a more complex ML pipeline using a combination of Argo Workflows and Seldon Core. (Access to an environment running Seldon Core can be provided.) 

Technologies and Other Constraints 

The core technology used in the project must be Argo Workflows, but additional accompanying open source technology choices will generally be considered. Students may develop and test solutions locally on a Kubernetes mini cluster like Kube or MiniKube with the goal of deploying into an AWS EKS Cluster.

Frontend: UI design decisions for the web interface will be left to the student team.
Backend APIs: fastAPI 
Orchestration Framework: Argo Workflows 

As noted above, the student team will have flexibility in the selection of specific accompanying open source technologies for some portions of the project. That said, the LAS sponsors for this team have experience with the following technologies and will be better able to assist if they are utilized: 

  1. Kubernetes 
  2. AWS Elastic Kubernetes Service 
  3. Seldon Core 
  4. Terraform and Ansible 
  5. Python 
  6. JavaScript frontend frameworks like Vue.js or React.js 
  7. Python or PHP backend frameworks like FastAPI or Django 
  8. SQL and NO-SQL databases 
  9. Docker 
  10. Mini Kubernetes clusters like MiniKube or Kube 

ALSO NOTE: Public distributions of research performed in conjunction with USG persons or groups are subject to pre-publication review by the USG. In the case of the LAS, typically this review process is performed with great expediency, is transparent to research partners, and is of little to no consequence to the students. 

REFERENCES: Argo Workflows: https://argoproj.github.io/argo-workflows/

Sponsor Background

NetApp is a cloud-led, data-centric software company dedicated to helping businesses run smoother, smarter and faster. To help our customers and partners achieve their business objectives, we ensure they get the most out of their cloud experiences – whether private, public, or hybrid. We provide the ability to discover, integrate, automate, optimize, protect and secure data and applications. At a technology level, NetApp provides cloud, hybrid and all-flash physical solutions; unique apps and technology platforms for cloud-native apps; and an application-driven infrastructure which allows customers to optimize workloads in the cloud for performance and cost.

Background and Problem Statement

NetApp products use cutting-edge hardware to provide its customers with the latest features and performance.  Using the most up to date hardware also ensures the availability of all the components for the lifespan of the NetApp product.  

NetApp’s storage focused operating system is built on a customized FreeBSD kernel. Since most hardware vendors develop drivers for the Linux kernel initially and then port the drivers to the FreeBSD kernel in a follow-on effort, NetApp often faces challenges getting FreeBSD drivers for the latest and greatest hardware until later in their product development cycle.  This creates risks for products due to the limited time available for feature integration and testing.

FreeBSD has a LinuxKPI (https://wiki.freebsd.org/LinuxKPI), which is a small compatibility layer to allow Linux based drivers to port more easily to FreeBSD.  While the LinuxKPI exists, the porting effort to implement Linux drivers in FreeBSD is still significant and time consuming.  Improving the LinuxKPI to support Linux drivers in FreeBSD with minimal porting effort would allow FreeBSD based products to release sooner on the latest hardware and reduce the development cycles for vendors to port drivers to FreeBSD.

Project Description

Nvidia ConnectX SmartNICs (https://www.nvidia.com/en-us/networking/ethernet-adapters/) are a great example of a network driver developed initially for Linux and then ported to FreeBSD.  The driver for this family of Ethernet adapters leverages the LinuxKPI, but still requires a significant effort to port all the offloads and features to the FreeBSD kernel.

Project Goals

  1. Compare the ConnectX Linux and FreeBSD drivers.
  2. Identify how the LinuxKPI is currently leveraged in the FreeBSD driver implementation.   
  3. Identify where the Linux driver was not compatible with FreeBSD and a custom porting effort was required.
  4. Design LinuxKPI updates that would eliminate some of the FreeBSD porting effort needed for ConnectX drivers.
  5. Implement the LinuxKPI updates and validate the design.
  6. Quantify the reduced porting effort with the updated LinuxKPI implementation.

Bonus (Stretch) Goals

  • Contribute the changes to the upstream FreeBSD kernel.
  • Evaluate using the LinuxKPI for the Intel I225/I226 Series Ethernet adapter (igc driver).
    • Note: Intel drivers currently use the FreeBSD iflib framework instead of the LinuxKPI.
    • Identify the work needed to convert the Intel driver to the LinuxKPI approach.
    • Design LinuxKPI updates that would eliminate some of the FreeBSD porting effort needed for Intel Linux drivers to work in the FreeBSD kernel.
    • Quantify the pros and cons for Intel drivers to port to FreeBSD using the LinuxKPI compatibility layer versus the iflib framework.  
    • Implement the Intel igc driver using the LinuxKPI instead of iflib.

Technologies and Other Constraints

Student hardware:

  • Determine whether a server or a desktop machine is needed
  • NetApp will provide a Connect-X6 PCIe card
  • NetApp will provide an Intel I226 PCIe card if needed

Students will be required to sign over IP to sponsors when the team is formed.

Sponsor Background

SAS provides technology that is used around the world to transform data into intelligence. A key component of SAS technology is providing access to good, clean, curated data.  The SAS Data Management business unit is responsible for helping users create standard, repeatable methods for integrating, improving, and enriching data.  This project is being sponsored by the SAS Data Management business unit to help users better leverage their data assets.  

Background and Problem Statement

Finding relevant data is important for many use cases. When a user sits down to build a report or a data model, they need to find useful and relevant data for their task. Or they may choose a dataset and want to find a few similar ones to select from. The goal of this project is to use an open-source recommendation engine technology (https://github.com/microsoft/recommenders) to recommend similar datasets. This should take the form of an application (which can be built in React) where the user has a collection of metadata about some tables in a system, each with a defined topic. The user should be able to request a table by topic and get recommendations of similar tables, or start from a chosen table or dataset and get a list of other similar ones.

Project Description

The goal of this project is to create an application that users can use to get recommendations on datasets. The application can be built in React or as another web-based application. The project should explore using the open-source recommendation engine technology above (https://github.com/microsoft/recommenders), applied to a collection of metadata about the tables available to the application. The metadata is organized by table topic. SAS can provide an initial set of this metadata.

To start, the user could be presented with a set of popular tables, tables organized by topics (part of the metadata), or other ways to navigate the available datasets.  The primary way to initially find tables could be to select a topic.  When the user selects a dataset, the user can get recommendations for other, similar data, using the recommendation engine.
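
As a hedged baseline for experimentation (separate from the Microsoft recommenders library the project should explore), content-based similarity over the table metadata can be prototyped with TF-IDF; the metadata strings below are invented for illustration:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative metadata: one text blob per table (topic plus description).
tables = {
    "sales_2022": "quarterly retail sales revenue topic:finance",
    "store_traffic": "retail store customer visits topic:marketing",
    "claims_q3": "insurance claims and losses topic:finance",
}

names = list(tables)
tfidf = TfidfVectorizer().fit_transform(tables.values())
sims = cosine_similarity(tfidf)

def similar_tables(name, top_n=2):
    i = names.index(name)
    ranked = sorted(zip(names, sims[i]), key=lambda pair: -pair[1])
    return [n for n, _ in ranked if n != name][:top_n]

print(similar_tables("sales_2022"))  # tables most similar to the selection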

Technologies and Other Constraints

SAS can provide an initial set of metadata about the tables. Students can collect more metadata using the Python profiler (https://greatexpectations.io/packages/great_expectations_semantic_types_expectations). For the recommendation engine, we would like students to leverage the open-source recommendation engine technology linked above (https://github.com/microsoft/recommenders), which includes many examples in its examples folder. If the students choose to write a web application, there are several starter applications from previous semesters that they could use to get started, including the Data Gist project. There are no legal or IP issues here.

Students will be required to sign over IP to sponsors when the team is formed

Sponsor Background

Dr. Stolee is a faculty member with a strong research program in software engineering. This project contributes to their research on program comprehension, and specifically, on a new area of comparative code comprehension.

Background and Problem Statement

Comparative code comprehension is a cognitive process that describes how a person understands the differences between two (generally) similar pieces of code. In preliminary studies on comparative comprehension, professional programmers and students alike had trouble spotting the behavioral differences between similar algorithms. This suggests that support is needed to communicate similarities and differences between arbitrary algorithms.

In this project, we will create tools and techniques that help developers understand the differences between similar pieces of code. These techniques will use state-of-the-art software engineering tools for static analysis and/or dynamic analysis, such as AST generation, fuzz testing, and more. 

Project Description

Dr. Stolee envisions a web application that allows users to interact with the system. 

There are frontend and backend components to this project. The backend needs to figure out how the two input algorithms are similar and how they are different. For example, fuzz testing might reveal that they behave the same on inputs X, Y, and Z, but behave differently on input W. On the frontend, this information needs to be communicated back to the user in an intuitive and understandable way. As a stretch goal, the interface could be dynamic, letting the user make tweaks to the algorithms and then re-assess their similarity.

For example, let’s say the system is provided with two pieces of code:

public static int[] bubbleSort(int[] a) {
    boolean swapped;
    int n = a.length - 2;
    do {
        swapped = false;
        for (int i = 0; i <= n; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                swapped = true;
            }
        }
        n = n - 1;
    } while (swapped);

    return a;
}

And

public static int[] bubbleSort(int[] array)
{
    boolean swapped = true;
    for(int i = array.length - 1; i > 0 && swapped; i--)
    {
        swapped = false;
        for(int j = 0; j < i; j++)
        {
            if(array[j+1] > array[j]) {
                int tempInt = array[j];
                array[j] = array[j + 1];
                array[j + 1] = tempInt;
                swapped = true;
            }
        }
    }
    return array;
}

One would expect the system, for example, to automatically detect that these behave the same on input arrays with size 0 or size 1, or input arrays with all the same values such as [1,1,1,1]. And then they behave differently on inputs such as [1,2,3,4] (i.e., one sorts arrays in ascending order, and the other in descending). Then, this information would need to be communicated to the user. 
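
To make the dynamic-analysis half of this concrete, here is a minimal differential-testing harness, assuming the two implementations above have been renamed bubbleSortV1 and bubbleSortV2 to avoid the name clash. This is an illustrative sketch, not a prescribed backend design:

import java.util.Arrays;
import java.util.Random;

/** A minimal sketch of the dynamic-analysis idea: feed both
 *  implementations random inputs and report where outputs diverge.
 *  A real backend would add coverage-guided fuzzing, static checks, etc. */
public class DiffFuzzer {
    public static void main(String[] args) {
        Random rng = new Random(42);           // fixed seed for reproducibility
        for (int trial = 0; trial < 1000; trial++) {
            int[] input = rng.ints(rng.nextInt(6), 0, 10).toArray();
            // Run each version on its own copy, since both mutate their input.
            int[] out1 = bubbleSortV1(input.clone());
            int[] out2 = bubbleSortV2(input.clone());
            if (!Arrays.equals(out1, out2)) {
                System.out.printf("Diverge on %s: %s vs %s%n",
                        Arrays.toString(input),
                        Arrays.toString(out1), Arrays.toString(out2));
            }
        }
    }
    // bubbleSortV1 / bubbleSortV2 are the two implementations above.
}

Running this prints the diverging inputs (for example, [1, 2] is sorted ascending by one version and descending by the other), which is exactly the kind of evidence the frontend would then need to visualize.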

Some differences will be detected using static analysis, while others will be detected using dynamic analysis. It should be possible to toggle each analysis on or off, so the user sees just the differencing information they desire.

At the end, the infrastructure developed by the team will be used in empirical studies to see how assistance helps people understand behavioral differences between code and may lead to publication in Software Engineering or Computer Science Education conferences or journals. Due to the creative nature of this project, students involved with the research will earn authorship on any subsequent publications. 

Technologies and Other Constraints

The students will be participating in an original research project. Dr. Stolee’s prior work in this area has compared algorithms written in Python and Java as the software engineering tool support for these languages tends to be more sophisticated.

The students may borrow or build off existing research projects that measure code similarity using fuzzing. 

Required reading: 

George Mathew, Chris Parnin, Kathryn T. Stolee: SLACC: simion-based language agnostic code clones. ICSE 2020: 210-221

George Mathew, Kathryn T. Stolee: Cross-language code search using static and dynamic analyses. ESEC/SIGSOFT FSE 2021: 205-217 

Justin Middleton, Kathryn T. Stolee: Understanding Similar Code through Comparative Comprehension. VL/HCC 2022.

 

Sponsor Background

Gilbarco Veeder-Root is backed by a powerhouse of brands that span the spectrum of fueling solutions, from fleets to cloud technology. This project will allow students to learn more about our family of brands and how they help make Gilbarco Veeder-Root the leader in the retail and commercial fueling industry.

Our companies include Gilbarco (fuel dispensers), Veeder-Root (automated tank gauge solutions), GasBoy (commercial fleet and fuel management), Catlow (hanging hardware solutions), Insite 360 (on-demand fuel management solutions), ANGI (Compressed Natural Gas, or CNG, fueling solutions), and many others.  The project will focus on our Gilbarco fuel dispensers.

Background and Problem Statement

With global manufacturing of several different products, the organization needs to stay on top of inventory levels for every component that goes into each product.  This helps ensure a clear understanding of which products are potentially at risk when manufacturing customer orders.  At regular intervals, manufacturing facilities perform a quality check and validate inventory levels of the parts currently in stock, including products currently moving through the manufacturing process.

To keep the inventory as accurate as possible, the team would like to properly capture work in progress (WIP) activity and maintain the appropriate counts.

Project Description

The primary focus of this project will be to develop an Augmented Reality (A/R) solution that allows a user to evaluate the parts being consumed by a product in real time on the assembly line.  This will make it possible to capture the parts already consumed in the Work In Progress (WIP) product.  The team will have access to a physical unit that will be used for modeling the A/R solution and for validating the subcomponents of the final product.  The team will assist in creating product-centric CAD models with the help of sponsor resources, then incorporate those models for use in the A/R platform.

As part of their application, the students will build a catalog that can be used to pull the appropriate model from a serial number on the product.  The serial number will identify a Bill of Materials (BOM) for each product, allowing for visual selection and representation of all parts consumed.  Once the team is formed, we will provide the students with access accounts to the systems containing the CAD models and associated bills of materials.

Once the product has been evaluated with all selected options, the team will be responsible for generating a tabulated output of component parts that can be used for capturing uncounted product sub-components not included in physical inventory counts today. As a future step for this project, it would be possible for the existing inventory systems to offer an interface for receiving updates from the A/R system, providing a mechanism for inventory updates in real time.
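
As a rough sketch of the serial number to BOM to tabulated report flow described above (all class, field, and format choices here are hypothetical; real product data would come from Windchill):

import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the catalog lookup: a serial number resolves to
 *  a Bill of Materials, and checked-off parts become a tabulated report.
 *  All names are illustrative; real data would come from Windchill. */
public class WipCatalog {

    record BomLine(String partNumber, String description, int quantity) {}

    /** Serial-number prefix -> product BOM (stand-in for the real catalog). */
    private final Map<String, List<BomLine>> bomByModel;

    WipCatalog(Map<String, List<BomLine>> bomByModel) {
        this.bomByModel = bomByModel;
    }

    /** Look up the BOM for a scanned serial number. */
    List<BomLine> bomFor(String serialNumber) {
        String model = serialNumber.substring(0, 4);  // assume a model-code prefix
        return bomByModel.getOrDefault(model, List.of());
    }

    /** Tabulate the parts the user marked as consumed in the WIP unit. */
    String tabulate(List<BomLine> consumed) {
        StringBuilder sb = new StringBuilder("PART\tDESCRIPTION\tQTY\n");
        for (BomLine line : consumed) {
            sb.append(line.partNumber()).append('\t')
              .append(line.description()).append('\t')
              .append(line.quantity()).append('\n');
        }
        return sb.toString();
    }
}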

Solution platform: the team will need to build an interface that can capture and push inventory items to Microsoft AX, the platform used to manage quantities.  The application will also need to consume updated revision details from the PTC Windchill environment (a Product Lifecycle Management platform), since drawings will be updated on a continual basis.

More information on the PTC Windchill environment can be found here:

https://www.ptc.com/en/products/windchill

Technologies and Other Constraints

This project will utilize applications and tools from the PTC ecosystem.  A link to the starting material is provided below.  We currently have several pieces of the data available for consumption, and the broader platform supports newer features such as Vuforia that can be used for A/R development.  This platform provides a development library that can be used to build out the application that will run on the target hardware platform.

SW Application Details: PTC Vuforia – https://developer.vuforia.com

Target Hardware Platform:  Apple iPad (Three will be provided for the student team)

Software Application Development – primarily an iOS application, but it should be developed in a cross-platform development environment.

We anticipate the use of the following tools in that environment:

  • Vuforia or Equivalent Software Program – Augmented Reality Platform that will serve as the basis for the ultimate deliverable
  • Windchill – Existing product bill of materials source that can be used to fabricate the associated inventory items
  • Creo – CAD System that can provide real time models of the existing products
  • Application development for a cross-platform environment.  The target solution will need to be deployed on an iPad.
Students will be required to sign over IP to sponsors when the team is formed

Sponsor Background

Atomic Arcade is a AAA game development studio led by industry veterans with experience on some of gaming’s biggest IP. We are currently working on a AAA G.I. Joe game centered around the legendary ninja/commando character, Snake Eyes. We’re proud to be part of Wizards of the Coast & Hasbro.

Background and Problem Statement

Our game design team is working on an idea for Snake Eyes that involves dropping dynamic and randomized perks for our player to pick up during combat gameplay. These perks would augment the gameplay in interesting ways to keep combat fresh and exciting over the course of the game. However, because of the complexity of the system, it is difficult to find design answers (i.e. find the fun) without a working version of the system.

Project Description

In order to determine if this mechanic would be fun and engaging for players, the team would like to have a focused prototype of the system working and playable. This way, they can see the system in action, experiment and try things, and find answers to some of the design questions that they need.

This means that we would need a minimally functional melee combat system that allows the player to engage in close-quarters combat with multiple AI opponents. Once that base system is functional, our designers would like to be able to introduce various perks into the gameplay to experiment with. The implementation should allow these perks to augment the system in interesting and unique ways, and will require iteration to support the new mechanics that design comes up with. A sketch of one possible perk structure follows the lists below.

System requirements:

  • The player character (Snake Eyes) can move around.
  • The player character can perform melee attacks.
  • AI characters can navigate towards the player character.
  • AI characters can perform melee attacks.
  • Every character has a pool of health that gets reduced when they receive an attack.
  • When a character’s health pool is reduced to 0, they are defeated.
  • Once defeated, characters (including the player character) respawn after some amount of time.
  • When AI characters are defeated, they have a random chance of dropping a perk.
  • When the player character touches a perk, it gets collected, and gameplay is augmented.

Initial perks:

  • Damage Buff.
  • Defense Buff.
  • Health Regeneration.
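
The prototype itself must be UE5 C++, but purely to illustrate one way the perk hooks could compose, here is a small engine-agnostic sketch (in Java, matching the code samples elsewhere in this document; all names are hypothetical):

import java.util.ArrayList;
import java.util.List;

/** Engine-agnostic sketch of stacking combat perks (hypothetical names).
 *  In the real prototype this logic would live in UE5 C++ components. */
class Combatant {
    double health = 100, maxHealth = 100;
    final List<Perk> perks = new ArrayList<>();

    /** Base damage is scaled by every damage perk the character holds. */
    double outgoingDamage(double base) {
        double dmg = base;
        for (Perk p : perks) dmg = p.modifyOutgoingDamage(dmg);
        return dmg;
    }

    /** Incoming damage is reduced by defense perks before applying. */
    void receiveAttack(double rawDamage) {
        double dmg = rawDamage;
        for (Perk p : perks) dmg = p.modifyIncomingDamage(dmg);
        health = Math.max(0, health - dmg);
        if (health == 0) { /* defeated: schedule respawn, maybe drop a perk */ }
    }

    /** Called every tick; lets perks like Health Regeneration act over time. */
    void tick(double dt) {
        for (Perk p : perks) p.onTick(this, dt);
    }
}

/** Default methods mean each perk overrides only the hook it cares about. */
interface Perk {
    default double modifyOutgoingDamage(double dmg) { return dmg; }
    default double modifyIncomingDamage(double dmg) { return dmg; }
    default void onTick(Combatant c, double dt) {}
}

class DamageBuff implements Perk {
    public double modifyOutgoingDamage(double dmg) { return dmg * 1.25; }
}
class DefenseBuff implements Perk {
    public double modifyIncomingDamage(double dmg) { return dmg * 0.8; }
}
class HealthRegen implements Perk {
    public void onTick(Combatant c, double dt) {
        c.health = Math.min(c.maxHealth, c.health + 2.0 * dt); // 2 HP per second
    }
}

Structuring perks as hooks with no-op defaults is one way to let designers add new perks without touching the core combat loop; the engine-specific version would make the same trade-off with its own idioms.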

Technologies and Other Constraints

  • This project will need to be built using Unreal Engine 5.2.
  • All functionality and features should be implemented using C++. Blueprint scripting should be kept to a minimum, and only used for UI or asset references.
  • Feel free to use any art/audio assets you can find on the Unreal Marketplace (or any other free asset library), but all functionality should be built by you. Note that assets are not a priority on our end; we care more about design answers than how it looks.
  • At the end of the semester, you should deliver a Windows packaged build of your project (.exe).

Sponsor Background 

Bandwidth is a software company focused on communications. Bandwidth’s platform is behind many of the communications you interact with every day. Calling mom on the way into work? Hopping on a conference call with your team from the beach? Booking a hair appointment via text? Our APIs, built on top of our nationwide network, make it easy for our innovative customers to serve up the technology that powers your life. 

Background and Problem Statement 

Bandwidth just moved into a beautiful new campus, not far from PNC Arena and Carter-Finley Stadium. On our new campus, we have an expansive parking deck, equipped with four EV charging stations on each of its four floors. 

Bandmates are very excited about EV charging, but it comes with complications – we have a four-hour maximum, and there are other charging “manners” that come into play to be a good charging citizen. It’s very easy to forget and leave your car at a charging station even after it’s charged, keeping someone else from being able to use that station. 

We would like you, dear students, to come up with a way for us to better manage, track, and account for our charging stations.

Project Description 

We envision that a mix of a web application and AI/image recognition would solve this problem. A web app would help us sign up for charging times/stations and let everyone know who’s using which charger, or whether a charger is not in use. Including AI/image recognition would allow us to place cameras (on Raspberry Pis) near each charging spot and develop software to detect whether a car is in place, identifying the car for display on the web app. 
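
As one illustration of the booking rules the web app might enforce (a hedged sketch only, in Java to match the code samples elsewhere in this document; the names, and any policy beyond the four-hour maximum, are assumptions):

import java.time.Duration;
import java.time.LocalDateTime;

/** Hypothetical sketch of the reservation rules the web app could enforce:
 *  a booked window per bandmate, capped at the four-hour maximum. */
class ChargerReservation {
    static final Duration MAX_SESSION = Duration.ofHours(4);

    final String station;        // e.g. "Deck 3, Station B"
    final String bandmate;
    final LocalDateTime start;
    final Duration requested;

    ChargerReservation(String station, String bandmate,
                       LocalDateTime start, Duration requested) {
        if (requested.compareTo(MAX_SESSION) > 0)
            throw new IllegalArgumentException("Sessions are capped at 4 hours");
        this.station = station;
        this.bandmate = bandmate;
        this.start = start;
        this.requested = requested;
    }

    /** True once the booked window has elapsed; combined with the camera's
     *  "car still present" signal, this flags an overstay to display/notify. */
    boolean isOverstayed(LocalDateTime now, boolean carStillPresent) {
        return carStillPresent && now.isAfter(start.plus(requested));
    }
}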

Technologies and Other Constraints 

The web application portion can be in any language you prefer. We will host this in Amazon Web Services, so you’d be able to take advantage of AWS technologies (RDS, Lambdas, etc.) if you wish. 

The AI/image recognition should probably be done in Python, as there are many good OSS (open source software) libraries available to help. We will provide a Raspberry Pi and camera that can be used to develop + test with. 

Overall, we want this to be a fun exercise in developing a combination of a web application and back-end AI/image recognition! 

You are free to think up more features and functionality that would make this work well. 

And you should also come up with a cool name! 

Sponsor Background

Katabasis is a non-profit organization that specializes in developing educational software for children ages 8-15. Our mission is to facilitate learning, inspire curiosity, and catalyze growth in every member of our community by building a digital learning ecosystem that adapts to the individual, fosters collaboration, and cultivates a mindset of growth and reflection.

Background and Problem Statement

There are many communities of students that have limited access to educational materials on computer science, particularly at young ages. Even more problematic, this subject area is often seen as completely unapproachable and indecipherable, and this perception is often ingrained in children at a young age. Katabasis wants to design an intervention that breaks down complex, high-level computer science topics (in this case, machine learning) in a way that makes the subject more accessible and promotes students’ self-confidence in their capability for computer science.

In addition to teaching computer science, there is a secondary focus of this project to teach students basic taxonomy by having them identify different animal species. This will introduce students to basic biological concepts to supplement their education.

Project Description

Katabasis is seeking to develop a casual, single-player video game with the intent of teaching young children (10-15) about basic machine learning principles such as classification, decision trees, and supervised/unsupervised learning. 

NOTE: There will be NO actual machine learning implemented in this project; the focus is on teaching the basic concepts behind machine learning to a younger audience through a game/minigames representative of machine learning principles.

The story of the game will be analogous to a typical classification problem and will have the player assume the role of a zoologist fighting the evil scientist Dr. Zorb. Dr. Zorb wants to destroy the world’s ecosystem by introducing genetically modified creatures (“Zorblings”) onto the planet. To combat Dr. Zorb and prevent ecological damage, the player has an army of drones that can grab the Zorblings and fly them out into space, but the player will need a quick way to identify Zorblings around the world. To accomplish this, the player needs to develop a machine learning model that classifies creatures as either real animals or Zorblings. The player will train their ML model by moving quickly around the world and earning as many points as possible through playing various minigames. These minigames will represent the process of classification and the ML concepts of decision trees and supervised learning. Failure to train the model efficiently will result in a poorly-trained machine learning model that cannot tell a Zorbling from a zebra!

The basic structure of the game will consist of a large game “board” in which players can visit different ecosystems on the globe and encounter minigames of varying difficulty to help “train” the player’s machine learning model. These minigames will be designed around reinforcing and teaching basic machine learning principles through classification tasks. The goal will be for students to understand the logic and reasoning that goes into a machine learning system, without needing to know the technical minutia or comprehensive inner workings. To accomplish better thematic connection with a younger audience, the game will be themed around animals; taxonomical classification has many tasks and activities that can be framed around machine learning concepts. Touchstones for the game include the board game “Pandemic”, Mario Party, Wii Party, and WarioWare. The general structure of a board supported by intervening minigames is a strong one, and appeals to our target audience, so we want to leverage it into educational content.

Because this game overlaps thematically with elementary-level science topics like taxonomy or food webs, the game will primarily be used in the classroom as a supplemental learning tool. As such, there will be time constraints on total gameplay time, as described in the constraints section below.

The team will be asked to implement the “board” portion of the game rather than focus on the individual minigames. The board will consist of five ecosystems, each with its own progress bar to fill. When players choose to launch minigames in one of the ecosystems, they will be directed to a minigame fitting of the current difficulty level. Then, the player will be awarded a number of points, from 1 to 10, that will go towards filling the ecosystem’s progress bar and adjusting the current difficulty level. More detail on these features can be found below, but these are the general systems the team will be responsible for implementing.

Here is a brief summary of the core features we are looking for:

  • Game Board: Gameplay will center on a large board that players will progress through, with the goal of filling progress bars by completing minigames in 5 different ecosystems.
    • Players should be able to move between “spaces” across the board to navigate to spaces where Dr. Zorb’s creatures are causing a commotion (a minigame space).
    • If players do not move to a minigame space within a certain number of turns, a failure condition is triggered that can lead to players losing the game. This failure condition is akin to applying unsupervised machine learning methodology, which is the wrong approach for the prediction/classification problem the game represents. It would make for a poor ML model; we want to ensure that our model can reliably distinguish Zorblings from real animals!
    • Players should have the goal of filling up a progress bar for each of the 5 ecosystems on the board. Players will fill up these progress bars by playing minigames. Player scores will be used to determine how much of the bar to fill up at once.
    • Once all progress bars have been filled, players should enter a final, most challenging minigame that will act as the end of the game.
  • Integrating Existing Minigames: Periodically throughout the game, players will be taken away from the board view to play a standalone minigame, each lasting between 30 sec-2 min. A sample selection of minigames will be provided to the team to integrate into their board game implementation.
    • Minigames will vary in difficulty, so the team will be responsible for implementing a system that will give players more difficult minigames as their progress bar fills for an ecosystem.
    • When transitioning between the game board and minigames, there are specific inputs and outputs to consider. The current difficulty level (measured from 1-10) is an input the system uses to select which minigame to play from the pool of available minigames: the minigame whose difficulty range covers the player’s current difficulty level is selected and loaded. Following the completion of a minigame, the player’s score (measured from 1-10) is an output used to calculate how much of an ecosystem’s progress bar to fill. Upon completing a minigame with a “good” score, the difficulty of the ecosystem should increase; if the player earns a low score, the difficulty level should decrease. A sketch of this selection-and-scoring loop appears after this list.
  • Machine Learning Concepts: We want the game to teach machine learning concepts, mostly at the micro level (minigames), but potentially also at the macro level (board view). These concepts might include:
    • Classification: represented by tasks in minigames that have the player identify animals by their species.
    • Decision trees: goes hand-in-hand with some of the classification tasks found in minigames that ask the player to identify animals. Much like 20 Questions, this model aims to arrive at a decision by asking true/false questions (i.e. Does this animal have wings? Does it eat meat?)
    • Supervised vs. Unsupervised learning: represented by the idea of the player playing minigames vs. the failure condition of a player not making it to a minigame in time. 
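
Here is a minimal sketch of the selection-and-scoring loop described above (illustrative only, in Java to match the code samples elsewhere in this document; a Unity build would express the same logic in C#, and the bar capacity and score thresholds are assumptions):

import java.util.List;

/** Sketch of the board/minigame handshake described above
 *  (names hypothetical). */
class EcosystemProgress {
    static final int BAR_CAPACITY = 50;   // assumed points needed to fill a bar
    int points = 0;                       // filled portion of the progress bar
    int difficulty = 1;                   // current difficulty, 1-10

    /** Pick a minigame whose difficulty range covers the current level. */
    Minigame select(List<Minigame> pool) {
        return pool.stream()
                .filter(m -> m.minDifficulty() <= difficulty
                          && difficulty <= m.maxDifficulty())
                .findAny()
                .orElseThrow();
    }

    /** Apply a finished minigame's score (1-10): fill the bar and
     *  nudge difficulty up on a good score, down on a poor one. */
    void applyScore(int score) {
        points = Math.min(BAR_CAPACITY, points + score);
        if (score >= 7)      difficulty = Math.min(10, difficulty + 1);
        else if (score <= 3) difficulty = Math.max(1, difficulty - 1);
    }

    boolean barFull() { return points >= BAR_CAPACITY; }
}

record Minigame(String sceneName, int minDifficulty, int maxDifficulty) {}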

Technologies and Other Constraints

For this project, students will develop their game in Unity for consistency with minigames supplied by the Katabasis team. This will make it easier for students to integrate the existing minigames into their own project.

We are aiming for a low computational load with this project and ask that students keep this in mind when making design decisions and during implementation.

The game should be ~45 minutes in length to fit within the average classroom period. Most of this time will be spent within the minigames themselves, but this time constraint is still important for the team to consider when making design decisions for the board implementation.

The team will receive all of the art assets that they will need for the completion of the project.

Students will be required to sign over IP to sponsors when the team is formed

Sponsor Background

Katabasis is a non-profit organization that specializes in developing educational software for children ages 8-15. Our mission is to facilitate learning, inspire curiosity, and catalyze growth in every member of our community by building a digital learning ecosystem that adapts to the individual, fosters collaboration, and cultivates a mindset of growth and reflection.

Background and Problem Statement

Educators are constantly looking for new ways to engage their students. Increasingly, this takes the form of educational games.

Some of the more pressing subjects nowadays, especially environmentally speaking, are hydrology and river flows – in particular, how they relate to and interact with an ecosystem and natural landscape. Modern events show us that a poor understanding of the way water flows can lead to catastrophic results, as in the many dam failures and canal blockages of recent years.  Therefore, it is important to foster awareness of these concepts.  We are proposing a novel educational tool for students by connecting the engagement of games with the subject of water dynamics in an educational tower defense-style game.  Tower defense games have been shown to be very engaging for children, especially in the late elementary school to early middle school demographic, where water dynamics are often taught.

Project Description

We are seeking a team of students to develop a tower defense game based around controlling and altering watersheds to combat pollutants and dangerous flow.  The water will act as the path that the pollutants and debris (the enemies) flow along, and the player will place plants that act as the towers to combat these foes.  The defining mechanic of the game will be the ability to alter the direction that the water, and therefore the enemies, will travel, allowing the player to change the dynamics of the level on the fly.  A critical touchstone for this game is the mobile game Fieldrunners, where a similar mechanic is implemented.

The path the water takes will be dictated by pre-set factors in any given level, in addition to the plants the player places down, which will cause the flow to be dynamically recalculated and adjusted.  A critical part of development will be nailing both the algorithm for water flow and the player’s interaction with this core game mechanic.  The flow of the water will also affect the plants themselves; too fast a flow may cause erosion, damaging the plants and putting the system at risk of collapse and a sudden altering of flow.  Not enough flow, however, may leave plants malnourished and reduce their effectiveness.
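
One simple way to realize the dynamic recalculation is to treat the level as a tile grid and re-run a shortest-path search from the water source to the outlet whenever a plant is placed. The sketch below (in Java to match the code samples elsewhere in this document; illustrative only, and a real version would weight tiles by terrain, slope, and flow rate rather than treating them as merely open or blocked) shows the core idea:

import java.util.*;

/** Minimal sketch: recompute the watercourse with a breadth-first
 *  search over a tile grid when the player places a plant. */
class FlowGrid {
    final boolean[][] blocked;   // true where a plant/terrain diverts water
    final int rows, cols;

    FlowGrid(int rows, int cols) {
        this.rows = rows; this.cols = cols;
        this.blocked = new boolean[rows][cols];
    }

    /** Placing a plant blocks a tile, so the path must be recomputed. */
    void placePlant(int r, int c) { blocked[r][c] = true; }

    /** Shortest open path from source to outlet, or empty if flow is cut off. */
    List<int[]> recomputePath(int[] source, int[] outlet) {
        int[][] prevR = new int[rows][cols], prevC = new int[rows][cols];
        for (int[] row : prevR) Arrays.fill(row, -2);   // -2 = unvisited
        Deque<int[]> queue = new ArrayDeque<>();
        queue.add(source);
        prevR[source[0]][source[1]] = -1;               // -1 marks the source
        int[][] dirs = {{1,0},{-1,0},{0,1},{0,-1}};
        while (!queue.isEmpty()) {
            int[] cur = queue.poll();
            if (Arrays.equals(cur, outlet)) break;
            for (int[] d : dirs) {
                int nr = cur[0] + d[0], nc = cur[1] + d[1];
                if (nr >= 0 && nr < rows && nc >= 0 && nc < cols
                        && !blocked[nr][nc] && prevR[nr][nc] == -2) {
                    prevR[nr][nc] = cur[0]; prevC[nr][nc] = cur[1];
                    queue.add(new int[]{nr, nc});
                }
            }
        }
        // Walk predecessors back from the outlet to reconstruct the path.
        List<int[]> path = new ArrayList<>();
        int r = outlet[0], c = outlet[1];
        if (prevR[r][c] == -2) return path;             // outlet unreachable
        while (r != -1) {
            path.add(0, new int[]{r, c});
            int pr = prevR[r][c], pc = prevC[r][c];
            r = pr; c = (pr == -1) ? c : pc;
        }
        return path;
    }
}

An empty result from recomputePath is the "flow cut off" case the design must handle gracefully, since the enemies' route disappears along with the water.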

In summary, the core feature set we are seeking this semester will be:

  • Basic Tower Defense Mechanics: As you might expect from typical presentations of the genre: certain towers are more effective against certain types of enemies, varying costs/cooldowns, and a regular flow of enemies. The team will be responsible for creating a set of plants to enact these effects.
  • Dynamic Water Flow: The path enemies take in the game will be a core mechanic itself, and should be able to be altered during play by the player’s intervention or per-level effects.  The game will have to handle this transition gracefully and effectively convey the information to the player.
  • Watershed Education:  While we are not looking for a simulator, we are looking for a decently accurate picture of certain key interactions in watershed mechanics.  To this end, we will want our team to highlight these themes through things like enemy/tower matchups, tower theming, power scales, etc.

Technologies and Other Constraints

Students will be required to make a choice of game engine within the first week of the project. They will present their rationale, and we encourage them to consider a variety of factors such as portability, previous experience, and support for game mechanics.

We are aiming for a low computational load with this project and ask that students keep this in mind when making design decisions and during implementation.

Students will be required to sign over IP to sponsors when the team is formed

Sponsor Background

Dr. Stallmann is a professor (NCSU-CSC) whose primary research interests include graph algorithms, graph drawing, and algorithm animation. His main contribution to graph algorithm animation has been to make the development of compelling animations accessible to students and researchers.

Background and Problem Statement

Background.

Galant (Graph algorithm animation tool) is a general-purpose tool for writing animations of graph algorithms. More than 50 algorithms have been implemented using Galant, both for classroom use and for research.

The primary advantage of Galant is the ease of developing new animations using a language that resembles algorithm pseudocode and includes simple function calls to create animation effects.

The most common workflow is

  • Create a graph either externally or using Galant’s graph editor
  • Upload an algorithm animation program created externally using a program editor
  • Compile and execute the program, using the arrow keys to step forward and backward through a list of animation events (a sketch of this stepping model follows the list)
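
Conceptually, an animation run can be captured as an ordered event list with a cursor, as in this illustrative Java sketch (not Galant’s actual API):

import java.util.ArrayList;
import java.util.List;

/** Sketch of the stepping model behind the arrow-key navigation:
 *  an algorithm run is recorded as an ordered list of reversible
 *  animation events, and a cursor moves forward/backward through it. */
class AnimationLog {
    /** One reversible display change, e.g. "highlight edge (3,7)". */
    record AnimationEvent(Runnable apply, Runnable undo) {}

    private final List<AnimationEvent> events = new ArrayList<>();
    private int cursor = 0;   // number of events currently applied

    /** Called by the animation program's effect functions during execution. */
    void record(Runnable apply, Runnable undo) {
        events.add(new AnimationEvent(apply, undo));
    }

    /** Right arrow: apply the next event, if any. */
    void stepForward() {
        if (cursor < events.size()) events.get(cursor++).apply().run();
    }

    /** Left arrow: undo the most recent event, if any. */
    void stepBackward() {
        if (cursor > 0) events.get(--cursor).undo().run();
    }
}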

Problem statement.

Deployment of the current implementation of Galant requires that a user have git, Apache Ant, and runtime access to a Java compiler; it is also complex and includes many unnecessary features. While it is technically platform independent, behavior differs across platforms, so any modifications must be tested on Mac, Windows, and Linux.

A prototype web-based Galant, galant-js (https://github.com/mfms-ncsu/galant-js), was developed by a Spring 2023 Senior Design team. The goal of this project is to incorporate the most important features of the Java-based Galant.

Project Description

Many enhancements are required to put the usability of galant-js on par with the original Java version, which has been used heavily in the classroom and in Dr. Stallmann’s research. The JavaScript version already has clear advantages.

Key enhancements are

  • More animation features.
  • More extensive documentation for users and future developers.
  • A more flexible user interface.
  • Ability to edit and save graphs and algorithms online.

A detailed list is in feature-requests.md at the root of the repository.

Technologies and Other Constraints

Students would be required to learn and use JavaScript effectively to reimplement Galant functionality. The current JavaScript implementation uses React and Cytoscape for user interaction and graph drawing, respectively. An understanding of these will be required for some of the added features.