Google announces clustering algorithm designed to reveal group characteristics while maintaining individual privacy

27 October 2021 (Edited 27 August 2022)

'Differentially Private Clustering' could enable organizations to learn insights from group data while protecting individual data privacy

Google research scientists have provided an update on several years of work on privacy-safe approaches for handling sensitive user data.

The challenge, as stated by the researchers in a post on the Google AI Blog, has been:

"Given a database containing several attributes about users, how can one create meaningful user groups and understand their characteristics? Importantly, if the database at hand contains sensitive user attributes, how can one reveal these group characteristics without compromising the privacy of individual users?"

In developing a solution, the researchers have created a new "differentially private clustering algorithm" which can privately generate representative data points from a dataset, so as to reveal group characteristics without revealing the private data of the individuals in the dataset.
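To give a feel for what "privately generating" a representative value can mean, here is a minimal Python sketch of a much simpler, textbook technique: releasing the mean of one numeric attribute with Laplace noise. It is not the clustering algorithm from the Google post. The function names (`laplace_noise`, `private_mean`), the clipping bounds, and the sample ages are all hypothetical, and the sketch assumes the number of records is public and each person contributes exactly one value.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with that scale.
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_mean(values, lower, upper, epsilon, rng):
    """Return a noisy mean of `values` (illustrative sketch only).

    Assumptions: the record count is public, each individual contributes
    one value, and values are clipped to [lower, upper] before averaging.
    """
    if not values:
        raise ValueError("values must not be empty")
    if epsilon <= 0 or upper <= lower:
        raise ValueError("need epsilon > 0 and upper > lower")

    # Clipping bounds how much any one person can move the mean.
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)

    # Replacing one record changes the mean by at most this amount.
    sensitivity = (upper - lower) / len(clipped)

    # Smaller epsilon means more noise and stronger privacy.
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical data: ages of six users in one group.
ages = [23, 35, 41, 29, 52, 38]
rng = random.Random(0)  # seeded only to make the demo repeatable
noisy_average_age = private_mean(ages, lower=18, upper=90, epsilon=1.0, rng=rng)
print(noisy_average_age)
```

The published value describes the group, while the added noise masks the contribution of any single person. A sketch like this is not hardened for production use; real deployments rely on vetted differential privacy libraries.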

To test the new algorithm, the researchers ran it on four large, publicly available benchmark datasets and compared its performance with that of several publicly available algorithms.

In the researchers' words:

"We analyze the normalized k-means loss (mean squared distance from data points to the nearest center) while varying the number of target centers (k) for these benchmark datasets. The described algorithm achieves a lower loss than the other private algorithms in three out of the four datasets we consider."

In plain English: on three of the four datasets, the cluster centers produced by the Google algorithm sat closer, on average, to the actual data points than the centers produced by the other private algorithms used for comparison.
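The loss the researchers quote can be sketched in a few lines of Python. This follows only the parenthetical definition in the quote (mean squared distance from each data point to its nearest center); any further normalization used in the original post is not reproduced here. The function names and the sample points are hypothetical.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_loss(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Mean squared distance from each point to its nearest center."""
    if not points or not centers:
        raise ValueError("points and centers must not be empty")
    total = sum(
        min(squared_distance(p, c) for c in centers)
        for p in points
    )
    return total / len(points)


# Hypothetical data: two tight groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

good_centers = [(0.0, 1.0), (10.0, 1.0)]  # one center per group
poor_centers = [(5.0, 1.0), (5.0, 5.0)]   # centers far from both groups

print(kmeans_loss(points, good_centers))  # 1.0
print(kmeans_loss(points, poor_centers))  # 26.0
```

A lower loss means the centers summarize the data more faithfully, which is the sense in which one algorithm "beats" another in the comparison.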

The conclusion reached from the research results, in the words of the researchers, is:

"This work proposes a new algorithm for computing representative points (cluster centers) within the framework of differential privacy. With the rise in the amount of datasets collected around the world, we hope that our open source tool will help organizations obtain and share meaningful insights about their datasets, with the mathematical assurance of differential privacy."

This work looks promising, and I'm glad to see that phrase "open source" in there!

Stay tuned for further updates.

David Boggs
David@DavidHBoggs.com
External Article: https://ai.googleblog.com/2021/10/practical-differentially-private.html