Design Challenge & Research Questions
Outlining the overarching goals of my internship project.
To structure the project and provide a clear goal from the start, I have chosen to use a design challenge with supporting research questions. These questions will then be answered using the ICT Research Methods.
Design challenge
The design challenge is as follows:
"Implement a cloud-based pipeline that can provide concrete feedback on the realism of (a) provided image(s) or video(s) of an AI-generated avatar so that the developers of the avatars can verify the applicability thereof in social situations and ensure continued immersion in the interaction."
This rather elaborate design challenge can be split into multiple parts, each of which needs its own verification in addition to the general verification carried out throughout the project.
A pipeline should be created that is cloud-based, with the techniques and properties one would expect of such an application.
It should aid the verification of avatar realism.
Developers need a concrete way to check and verify the results of the analysis.
Shifting Emphasis
As the project progressed, it became clear that while the overall project should stay more or less the same, the emphasis should shift from the analysis itself to the supporting architecture and pipeline. The analysis requires more data science expertise and is best suited to a follow-up project.
The main deliverable of the project is now the pipeline itself. This means the architecture and modularity of the project take precedence.
Research questions
To meet the aforementioned design challenge, various research questions need to be answered along the way. Answering them will provide me with the technical and theoretical know-how needed to implement a solution. As it stands, the research questions are as follows:
What makes certain facial expressions realistic?
Research methods: Literature study, Expert interview
The core of the project is ensuring continued immersion in a conversation with an AI-generated avatar. Many factors, such as congruency between the facial expression and the semantic context of the conversation, impact this. A key part of this is the facial expression itself: before it can be evaluated on contextual applicability, it is necessary to know whether it is a "realistic" face at all. Disproportionate facial features or strange Action Unit (AU) combinations could take someone out of the conversation before it has even started. Evaluating this is no simple matter, however, and raises the question: what makes a face realistic? What should an (objective) evaluation look at?
How will I create a service that can use the results from the analysis to determine whether a face is realistic?
Research methods: Prototyping, Expert interview, Benchmark test, Data analytics
Building on the previous question, methods need to be outlined to turn its answers into an actual service within the pipeline. Several approaches have already been considered, such as machine learning, the FACS table, or comparing extremes and ratios. Which of these best suits the project is yet to be determined.
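To illustrate, the "comparing extremes and ratios" option could take the shape of a simple rule-based check on AU intensities. Everything in this sketch is a placeholder: the thresholds and the conflicting AU pair are hypothetical examples, not validated FACS knowledge.

```python
# Illustrative rule-based check for the "extremes and ratios" approach.
# All thresholds and AU pairings below are hypothetical placeholders
# chosen for this sketch; they are not validated FACS facts.

MAX_INTENSITY = 4.0  # placeholder ceiling (OpenFace reports AUs on a 0-5 scale)

# Hypothetical pair of AUs treated as mutually exclusive for this sketch,
# e.g. AU12 (lip corner puller) together with AU15 (lip corner depressor).
CONFLICTING_PAIRS = [("AU12_r", "AU15_r")]
CONFLICT_THRESHOLD = 2.5  # placeholder intensity above which a pair conflicts


def frame_is_plausible(aus: dict) -> bool:
    """Return False when any AU exceeds the ceiling, or when a
    conflicting pair of AUs is simultaneously strongly active."""
    if any(value > MAX_INTENSITY for value in aus.values()):
        return False
    for first, second in CONFLICTING_PAIRS:
        if (aus.get(first, 0.0) > CONFLICT_THRESHOLD
                and aus.get(second, 0.0) > CONFLICT_THRESHOLD):
            return False
    return True
```

A machine-learning classifier could later replace `frame_is_plausible` without changing its callers, which keeps the choice between the listed methods open.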
What technological choices (frameworks, languages, facial analysis library etc.) best suit the project?
Research methods: "Choose fitting technology" methodology
Once I know what to implement, the next question is how it should be implemented. The choice of technology greatly impacts development, determining not only how fast and smooth development is, but also what can be implemented in the first place. More information on this can be found here.
How can I interface with the desired facial analysis library?
Research methods: Prototyping, Community research
Before the degree of realism can be assessed, objective measurements need to be made on which to base the analysis. This is done using one of the facial analysis frameworks (compared here). Such a framework can be interfaced with in various ways: integrating it into another application, running it as a separate service, or even running it concurrently on a server. These options should be explored and compared to see which best fits the needs of the project (both in capabilities and in development time). Tests related to this can be seen in Interfacing with OpenFace: C++ & Terminal.
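As a rough sketch of the terminal-interfacing option, the snippet below assembles an OpenFace `FeatureExtraction` invocation and parses the resulting CSV. The binary name, flags, and `_r` column-naming convention reflect OpenFace's command-line tools as I understand them, but should be treated as assumptions to verify against the tests mentioned above.

```python
import csv
import subprocess
from pathlib import Path

OPENFACE_BIN = "FeatureExtraction"  # assumed to be on PATH


def build_command(video: Path, out_dir: Path) -> list:
    """Assemble the CLI call: extract AU estimates for one input file."""
    return [OPENFACE_BIN, "-f", str(video), "-out_dir", str(out_dir), "-aus"]


def run_openface(video: Path, out_dir: Path) -> None:
    """Run the extraction as a child process (the 'terminal' option)."""
    subprocess.run(build_command(video, out_dir), check=True)


def read_au_intensities(csv_path: Path) -> list:
    """Collect the per-frame AU intensity columns (suffixed '_r' in
    OpenFace's output; header cells may carry leading spaces)."""
    with open(csv_path, newline="") as handle:
        return [
            {key.strip(): float(value)
             for key, value in row.items()
             if key.strip().endswith("_r")}
            for row in csv.DictReader(handle)
        ]
```

Wrapping the child process this way keeps the framework swappable: only `build_command` and the CSV parsing would change if another facial analysis library were chosen.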
What kind of results are important to the users of this pipeline?
Research methods: Expert interview, Interview, Product review
This question differs from many of the others in that it is more UX related. The avatar validation pipeline was thought up as a response to certain issues faced within the larger scope of the project. My question here is: what was the initial problem? What metrics are needed to solve it? In doing this, I can tailor the pipeline to that and make sure it can be plugged directly (or at least easily) into the existing pipeline for avatar creation.
How will results and/or analysis differ between videos and still images, or between posed and in-the-wild shots?
Research methods: Literature study, Prototyping
Most of the researched facial analysis libraries have different analysis settings for different situations (e.g. posed vs. in-the-wild). Whether this impacts the assessment of realism is yet to be determined. The same goes for videos vs. images. More on this here.