UNITAR

27 April 2023, Hangzhou, China - UNITAR’s 2021 and 2023 surveys of National Statistical Offices and Systems (NSOs and NSSs) and international organizations, conducted as part of the EC-funded Crowd4SDG project, showed that 31% of respondents were aware of citizen science data (CSD) projects run by their NSOs in 2023, compared to only 17% in 2021. Awareness of citizen science data (CSD) or citizen-generated data (CGD) is growing, and so are the questions on how such data could be collected and used to monitor progress on sustainable development. To shine a spotlight on the use of CSD, UNITAR organized a learning session, “The Five ‘Ws’ of citizen science or citizen-generated data”, on 27 April 2023 during the 2023 World Data Forum (2023 WDF) in Hangzhou, China. The session focused on the rigorous studies and findings of the Crowd4SDG project, which over its three-year run explored the potential of such data for monitoring the Sustainable Development Goals (SDGs), with a particular focus on the impacts of climate change (SDG 13) and their relation to SDGs 11, 5, and 16.

Opening the session as facilitator, Ms. Yongyi Min, Chief of the SDG Monitoring Section at the UN Statistics Division (UNSD), reminded participants that in the opening speech the UN Secretary-General had highlighted citizen-generated data as one of the inspiring data initiatives that “allow people and communities to gather and take control of the data affecting their daily lives”.

To share an illustrative example of how such data could be used, Ms. Min invited Ms. Barbara Pernici, Professor at the Politecnico di Milano (PoliMi), one of the partners in the Crowd4SDG Consortium. Crowd4SDG, a research project of the European H2020 programme, began in the spring of 2020, just as the world found itself in the middle of the pandemic, so public health became one of the areas that PoliMi studied, explained Ms. Pernici. The Crowd4SDG tools pursued a combined goal: exploring how automation techniques, i.e. machine learning, together with the involvement of citizens, could be used to gather and analyze data that is fit for use. In the examples she provided, she explained how social media data, a form of passive engagement of citizen scientists, was used for this purpose. The various tools of the project were developed in the form of applications. For COVID-19, the VisualCit tool, which applies image recognition to Twitter data, was used to monitor the use of face masks and adherence to social distancing in different countries. Ms. Pernici acknowledged that the use of such data did not come without challenges. In the beginning, only 5 images out of a sample of 1,000 were found useful, because Twitter content includes many different types of data, such as images, memes, and charts. The algorithms were therefore fine-tuned in terms of image recognition, annotation, and use of keywords with the help of citizen scientists. For validation purposes, the results were compared with data from other services available at that time. The project on disaster response, which monitored floods in Nepal and Thailand, followed the same structure but faced other challenges related to geolocation and to the linguistic aspects of using and recognizing keywords. All the work has been documented and made publicly available, concluded Ms. Pernici.

The next speaker, Ms. Adriana Santiago, Deputy Director of Geographical and Environmental Information Assessment at the National Institute of Statistics and Geography (INEGI) of Mexico, gave insights into how the National Statistical Office of Mexico is using citizen science data to produce a sentiment analysis. She explained that INEGI has been automatically analyzing the mood of tweets in Mexico since 2014 using a supervised learning method. The work started with a large group of people tagging tweets as expressing either positive or negative emotions; this tagging was later automated through machine learning, so that data could be analyzed in real time with less time and fewer financial resources. The initial tagging involved more than 9 thousand people from Techmilenio University and staff from INEGI, and each tweet was tagged by several people to ensure quality. Ms. Santiago also pointed out cleaning and validating the Twitter data as one of the challenges. She further illustrated the results of the analysis, which showed that the mood on Twitter went down during the US elections in 2016 and the earthquake in Mexico in 2017.
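
As a rough illustration of the quality step described above, where one tweet is tagged by several people, the following minimal Python sketch combines several annotators' tags into a single label by majority vote before the labels are used to train a classifier. It is not INEGI's actual procedure: the function name `majority_label`, the 60% agreement threshold, and the sample tweet identifiers are all assumptions made for the example.

```python
from collections import Counter


def majority_label(votes, min_agreement=0.6):
    """Return the most common tag if enough annotators agree, else None.

    `min_agreement` is a hypothetical threshold chosen for this sketch.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None


# Hypothetical annotations: tweet id -> tags given by several annotators.
annotations = {
    "tweet_1": ["positive", "positive", "negative"],
    "tweet_2": ["negative", "negative", "negative"],
    "tweet_3": ["positive", "negative"],
}

# Combine the tags per tweet, then keep only tweets with a clear consensus.
consensus = {tid: majority_label(tags) for tid, tags in annotations.items()}
training_set = {tid: tag for tid, tag in consensus.items() if tag is not None}
```

Tweets on which annotators disagree too much (here `tweet_3`) are left out of the training set rather than given an arbitrary label, which is one simple way to keep noisy tags from reaching the supervised model.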

Touching upon global initiatives, Ms. Hayoi Chen, Coordinator of the Inter-Secretariat Working Group on Household Surveys at UNSD, shared that a new initiative, the Collaborative on Citizen Data, was launched earlier that week during one of the 2023 WDF sessions on citizen-generated data. The Collaborative aims to bridge gaps in official statistics and make the whole data process more inclusive. Following the discussions of the expert group meeting on CSD in November 2022, the aim is to define the framework and scope of the work, bring together all the involved stakeholders from the NSOs and broader NSSs, identify joint needs, and collaborate more closely on the matter. Ms. Chen, noting the full house, invited the 40 participants attending the session in person to share their case studies and ideas on how this new initiative should operate.

UNITAR, as part of the Crowd4SDG project, has already sought answers to some of the questions that the Collaborative aims to work on, specifically the definition, framework, challenges, and opportunities of using CSD, among other areas. This was done through studies including two surveys among NSOs/NSSs, 11 interviews and case studies, a pilot assessment of 11 datasets using the proposed Quality Assurance approach, and other support provided to more than 5 innovative citizen data projects and 1 pilot country, the Maldives, shared Ms. Elena Proden, Senior Specialist, Strategic Implementation of the 2030 Agenda Unit, UNITAR, who led the work on citizen science data in the Crowd4SDG project. Among the key impediments to the use of CSD shown in the 2023 survey is the lack of awareness, alongside the non-sustainability of access to data sources and the non-application of statistical standards. She also presented the quality criteria considered for CSD, grouped into commonly used criteria and additionally proposed ones. Commonly used criteria include timeliness, frequency, and sustainability, as well as relevance, metadata, coherence, comparability, and integrability, among others. The additional ones include impartiality, confidentiality/privacy, self-identification, and a documented data collection/production/dissemination process.

The rest of the session took the form of an engaging quiz on the five Ws of citizen science data, run by Ms. Proden, in which participants could also share their own thoughts, suggestions, and, in some cases, reservations about the use of citizen data for sustainable development.

To watch the recording of the session, please proceed to the Whova platform of WDF or watch below.