FinVolution to Hold 9th Global Data Science Competition, Focus on Deepfake Speech Detection in LLM Era
Author: FinVolution
Published: 05-10-2024
The 9th FinVolution Global Data Science Competition targets deepfake speech detection, tackling the challenge of distinguishing between cloned and authentic voices in the LLM era.
The contest is part of the IJCAI 2024 Competitions and Challenges track, encouraging global collaboration and innovation among AI enthusiasts.
By integrating LLM-generated fake voices in the test dataset, the competition increases complexity and spurs innovation.
FinVolution, a leading fintech service provider, launches the 9th FinVolution Global Data Science Competition on May 10, 2024, with a focus on "Deepfake Speech Detection." The contest is part of the Competitions and Challenges track of IJCAI (International Joint Conference on Artificial Intelligence), a top international AI conference.
As voice synthesis technology continues to evolve, the line between cloned and genuine voices has become increasingly blurred in the era of large language models (LLMs), posing significant challenges to data security and asset protection.
The competition aims to inspire global AI enthusiasts and experts to innovate in combating voice cloning and deepfake scams. Contestants will utilize deep learning technologies to develop models and algorithms based on FinVolution's test dataset. The competition will include LLM-generated fake voices to elevate complexity and spur innovation.
With a total prize pool of RMB 310,000, the contest will consist of preliminaries, semifinals, and a final, in which contestants aim to distinguish authentic voices from fake ones. The highest-ranked contestants will attend IJCAI 2024 in Jeju, South Korea, to receive their awards and engage with academic and industry experts. FinVolution proudly sponsors IJCAI 2024.
Tiezheng Li, CEO of FinVolution, stated, "Since its inception nine years ago, the FinVolution Global Data Science Competition has evolved into a widely recognized event in the field of data technology, facilitating technical exchange worldwide. Partnering with IJCAI this year, a top-tier international AI conference, demonstrates our commitment to advancing deep speech recognition technology."
The Deepfake Challenge
During the preliminaries (May 10 to June 12), participants will design algorithms based on the white-box dataset supplied by FinVolution and submit scoring results to qualify for the semifinals. The dataset primarily comprises voice recordings totaling 20-40 hours.
At the semifinal stage (June 13 to June 28), contenders are expected to refine their algorithms based on the black-box dataset provided by the competition organizer, vying for a spot in the final. The dataset, composed mainly of private data, contains five to 10 hours of recordings.
Participants can register on the official website from May 9 to June 3 to view and download the datasets.
Upholding AI Ethics
Voice cloning has emerged as a major form of telecom fraud, as scammers exploit AI technology to make it increasingly difficult to distinguish between genuine and fake voices.
The competition focuses on safeguarding user privacy and combating fraudulent activities by identifying cloned voices accurately.
Lei Chen, Vice President of FinVolution and Head of its Big Data and AI Division, said, "The applications of Large Language Models far exceed the corresponding detection technology, posing great challenges to information security. We hope to see AI deepfake voice detection technology keep pace with the developments of LLMs, thus safeguarding the data security of the public. With this concept in mind, the FinVolution Global Data Science Competition is not only a platform for technical competition but also an opportunity to explore how AI can better adhere to ethical principles and serve the public."
To date, the FinVolution Global Data Science Competition has drawn nearly 10,000 participants worldwide, becoming a widely recognized event in the field of digital financial technology.
The contest has been organized annually since 2016, and its themes have spanned diverse domains, all rooted in real-world fintech business scenarios. These themes range from risk control algorithms, financial data applications, and product development to semantic similarity recognition, asset portfolio cash flow prediction, and credit schemes for small- and micro-sized enterprises.