
1. Introduction
The Italian campaign for the 2024 European Parliament election marked a pivotal moment in the evolution of political campaign strategies. As digital platforms increasingly dominate communication, political campaigns have adapted to use these tools for outreach, engagement, and influence. Social media platforms, particularly Instagram, emerged as powerful channels for connecting with voters, enabling personalized communication and real-time feedback. This shift from traditional methods, such as rallies and television advertisements, to digital strategies reflects the growing importance of technology in shaping political discourse.
However, the digitalization of political campaigns is not without its challenges. While social media offers unprecedented opportunities for engagement, it also raises ethical concerns about transparency, misinformation, and the amplification of echo chambers. The ability to micro-target voters with tailored messages can sway opinions, but it also risks creating divisions and undermining democratic principles. This blog post explores how machine learning was employed to analyze political content during the Italian campaign, focusing on post popularity as a key indicator of digital influence. By examining the interplay between innovation and ethical considerations, it sheds light on the transformative potential of data-driven strategies in modern politics.
2. Background and Related Work
2.1 Evolution of Political Campaigning
Political campaigning has undergone a seismic shift over the past two decades, transitioning from traditional methods to digital platforms. In the past, campaigns relied heavily on physical rallies, door-to-door canvassing, and television advertisements to reach voters. These methods, while effective in their time, were limited in their ability to provide real-time feedback and adapt to changing voter sentiments. The advent of social media has revolutionized this landscape, offering politicians a direct line of communication with their constituents.
The Italian campaign serves as a case study for this evolution, showcasing how digital tools can be harnessed to create targeted and impactful messages. By analyzing voter behavior and preferences through machine learning, political parties were able to craft strategies that resonated with their audience. This approach not only increased engagement but also allowed for a more nuanced understanding of voter needs and concerns.
2.2 Machine Learning in Political Analysis
Machine learning has emerged as a critical tool in political analysis, providing insights into voter behavior, sentiment trends, and engagement metrics. Supervised learning models, such as logistic regression and support vector machines, are commonly used to analyze social media data, predicting the popularity of posts and identifying patterns in voter interactions. These models leverage features such as likes, comments, and shares to assess the effectiveness of digital content.
In the context of the Italian campaign, machine learning played a pivotal role in understanding the dynamics of political communication. By analyzing Instagram posts from political leaders and parties, researchers were able to identify key factors that influenced voter engagement. This data-driven approach not only enhanced the effectiveness of campaign strategies but also highlighted the potential of technology to reshape political discourse.
2.3 Challenges in Digital Political Engagement
While digital campaigns offer numerous benefits, they also come with significant challenges. One of the primary concerns is the risk of misinformation, which can spread rapidly on social media platforms. The ability to micro-target voters with tailored messages raises ethical questions about transparency and fairness. Additionally, algorithmic biases can influence the visibility of posts, potentially skewing public perception.
The Italian campaign highlighted these challenges, emphasizing the need for regulatory frameworks to ensure democratic transparency. By addressing these issues, political campaigns can leverage the benefits of digital tools while mitigating their risks.
3. Dataset and Data Acquisition
3.1 Political Popularity Campaign Dataset
The dataset used in this study, referred to as the “Political Popularity Campaign Dataset,” includes 364 Instagram posts from political leaders and parties active during the Italian campaign. These posts were analyzed to measure engagement metrics, such as likes, comments, and shares, providing insights into voter interactions. The dataset also includes textual features, such as captions and hashtags, which were used to assess the impact of content on engagement levels.
3.2 Data Collection Methodology
To ensure accuracy and reliability, the data was collected using the Apify API, a tool designed for extracting social media data. Posts were selected based on their publication date, focusing on content shared between May and June 2024. Engagement metrics were evaluated at least 48 hours after publication, allowing for a comprehensive understanding of audience interactions.
Field | Description |
---|---|
Image | Visual content of the post |
Caption | Text description accompanying the post |
LikesCount | Number of likes |
CommentsCount | Number of comments |
EngagementRateSum | Combined likes, shares, and comments |
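The selection rules described in this section (publication between May and June 2024, metrics read at least 48 hours after publication) can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Post` record, the `published_at` and `scraped_at` fields, and the exact window boundaries are hypothetical and are not taken from the dataset or from the Apify API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    caption: str
    likes_count: int
    comments_count: int
    published_at: datetime  # hypothetical field: when the post went live
    scraped_at: datetime    # hypothetical field: when the metrics were read


# Assumed window: 1 May 2024 (inclusive) to 1 July 2024 (exclusive).
WINDOW_START = datetime(2024, 5, 1)
WINDOW_END = datetime(2024, 7, 1)
MIN_AGE = timedelta(hours=48)


def is_eligible(post: Post) -> bool:
    """Keep posts inside the window whose metrics are at least 48 hours old."""
    in_window = WINDOW_START <= post.published_at < WINDOW_END
    settled = post.scraped_at - post.published_at >= MIN_AGE
    return in_window and settled


def select_posts(posts: list[Post]) -> list[Post]:
    """Return the eligible posts in their original order."""
    return [post for post in posts if is_eligible(post)]
```

Waiting before reading the counters matters because likes and comments keep accumulating after publication; comparing a two-hour-old post with a two-week-old one would otherwise bias the labels.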
3.3 Balancing Representation
The dataset was curated to ensure balanced representation across major Italian political parties, minimizing bias and providing equitable insights into digital engagement practices. This approach allowed researchers to analyze the effectiveness of campaign strategies across different political affiliations.
4. Methodology
4.1 Collaborative Approach
The methodology reflects a collaborative effort between computer scientists and legal scholars, ensuring a balance between technical rigor and ethical considerations. By integrating interdisciplinary perspectives, the study aimed to address both the technical challenges of data analysis and the ethical implications of digital campaigning.
4.2 Data Processing Techniques
The collected data was categorized into three levels of engagement: low, medium, and high. Metrics such as likes and comments were converted into engagement scores, which were then used to classify posts. Irrelevant content, such as advertisements and non-political posts, was filtered out to optimize the dataset for analysis.
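One plausible way to turn engagement scores into three classes is to cut the score distribution at its terciles. The sketch below assumes exactly that, and also assumes that the score is the plain sum of likes and comments; the source does not state the thresholds or the scoring formula, so both choices are illustrative.

```python
import statistics


def engagement_score(likes: int, comments: int) -> int:
    """Assumed score: likes plus comments (not confirmed by the source)."""
    return likes + comments


def tercile_thresholds(scores: list[float]) -> tuple[float, float]:
    """Return the two cut points that split the scores into three equal groups."""
    low_cut, high_cut = statistics.quantiles(scores, n=3)
    return low_cut, high_cut


def engagement_level(score: float, low_cut: float, high_cut: float) -> str:
    """Map a score to 'low', 'medium', or 'high' using the cut points."""
    if score <= low_cut:
        return "low"
    if score <= high_cut:
        return "medium"
    return "high"


def label_posts(scores: list[float]) -> list[str]:
    """Label every score relative to the distribution of all scores."""
    low_cut, high_cut = tercile_thresholds(scores)
    return [engagement_level(s, low_cut, high_cut) for s in scores]
```

Tercile cuts keep the three classes roughly balanced, which makes accuracy easier to interpret than with fixed thresholds that could leave one class nearly empty.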
4.3 Machine Learning Framework
A pipeline combining feature extraction and classification was developed to analyze the dataset. Vision models, such as convolutional neural networks (CNNs) and Vision Transformers (ViTs), were used for feature extraction, creating embeddings that represent post characteristics. These embeddings were then fed into classifiers, such as logistic regression, gradient boosting, and support vector machines (SVMs), to predict engagement levels.
Classifier | Accuracy | Precision | Recall |
---|---|---|---|
Logistic Regression | 59.5% | 56.3% | 59.5% |
Gradient Boosting | 61.1% | 58.2% | 61.1% |
Vision Transformer | 61.1% | 70.4% | 61.1% |
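The two-stage structure of the pipeline in this section (a feature extractor producing embeddings, followed by a classifier) can be illustrated with a deliberately small stand-in. Everything here is a simplification chosen so the example runs with the standard library alone: `extract_features` uses mean RGB values in place of a CNN or ViT embedding, and `NearestCentroidClassifier` replaces logistic regression, gradient boosting, and SVMs. Neither is the method used in the study.

```python
import math
from collections import defaultdict

Image = list[tuple[int, int, int]]  # toy image: a flat list of RGB pixels


def extract_features(image: Image) -> tuple[float, ...]:
    """Stage 1 stand-in: mean of each RGB channel instead of a CNN/ViT embedding."""
    n = len(image)
    return tuple(sum(pixel[c] for pixel in image) / n for c in range(3))


class NearestCentroidClassifier:
    """Stage 2 stand-in: assign each embedding to the closest class centroid."""

    def fit(self, embeddings, labels):
        grouped = defaultdict(list)
        for vec, label in zip(embeddings, labels):
            grouped[label].append(vec)
        # Centroid = component-wise mean of all embeddings in the class.
        self.centroids_ = {
            label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
            for label, vecs in grouped.items()
        }
        return self

    def predict(self, embeddings):
        return [
            min(self.centroids_, key=lambda lab: math.dist(vec, self.centroids_[lab]))
            for vec in embeddings
        ]


def run_pipeline(train_images, train_labels, test_images):
    """Extract embeddings for both splits, fit on train, predict on test."""
    train_emb = [extract_features(img) for img in train_images]
    test_emb = [extract_features(img) for img in test_images]
    model = NearestCentroidClassifier().fit(train_emb, train_labels)
    return model.predict(test_emb)
```

Keeping the extractor and the classifier behind separate functions is what allows different backbones and classifiers to be swapped and compared independently.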
6. Results and Analysis
6.1 Predictive Accuracy and Model Comparison
The analysis of model performance revealed critical insights into the predictive capabilities of different machine learning classifiers applied to the Italian campaign dataset. Vision Transformers (ViTs), particularly the ViT-B16 model, demonstrated competitive precision levels compared to CNN-based feature extractors like ResNet-50 and VGG-16. Notably, ViT-B16 paired with Gradient Boosting achieved the highest precision (70.4%), emphasizing its ability to minimize false positives. However, classification accuracy was relatively consistent across models, peaking at 61.1%, with no single framework achieving a significant lead.
These results underscore the importance of dataset characteristics in determining model efficacy. While Vision Transformers excel in representing high-dimensional features, their performance does not universally surpass that of CNN architectures, especially when applied to datasets with limited diversity or scale. Additionally, metrics like Matthews Correlation Coefficient (MCC) highlighted the moderate reliability of all models, suggesting room for enhancement in future studies.
Classifier | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
Logistic Regression | 59.5% | 56.3% | 59.5% | 54.1% |
Gradient Boosting | 61.1% | 70.4% | 61.1% | 55.1% |
Vision Transformer | 61.1% | 60.4% | 61.1% | 57.7% |
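In the table above, the Accuracy and Recall columns coincide, which is what support-weighted averaging of per-class recall produces in a multi-class setting. The sketch below assumes weighted averaging (the source does not say which averaging was used) and shows how accuracy, precision, recall, F1, and the multi-class Matthews Correlation Coefficient can be derived from a confusion matrix. The function name and the matrix layout are illustrative choices.

```python
import math


def weighted_metrics(cm: list[list[int]]) -> dict[str, float]:
    """cm[i][j] = number of posts with true class i that were predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    true_counts = [sum(cm[i]) for i in range(k)]
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    correct = sum(cm[i][i] for i in range(k))

    precision = recall = f1 = 0.0
    for i in range(k):
        p_i = cm[i][i] / pred_counts[i] if pred_counts[i] else 0.0
        r_i = cm[i][i] / true_counts[i] if true_counts[i] else 0.0
        f_i = 2 * p_i * r_i / (p_i + r_i) if p_i + r_i else 0.0
        weight = true_counts[i] / total  # weight each class by its support
        precision += weight * p_i
        recall += weight * r_i
        f1 += weight * f_i

    # Multi-class MCC computed from the confusion matrix totals.
    numerator = correct * total - sum(p * t for p, t in zip(pred_counts, true_counts))
    denominator = math.sqrt(
        (total**2 - sum(p * p for p in pred_counts))
        * (total**2 - sum(t * t for t in true_counts))
    )
    mcc = numerator / denominator if denominator else 0.0

    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }
```

Because each class's recall is multiplied by its share of the samples, the weighted sum reduces to correct predictions divided by total predictions, so weighted recall always equals accuracy. MCC is useful alongside these figures because it stays near zero for a model that mostly predicts one class.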
6.2 Insights from Confusion Matrix
The confusion matrix analysis revealed distinct patterns in classification errors. Medium-engagement posts were classified most accurately, suggesting that their features were the most consistently recognized. In contrast, posts categorized as low or high popularity often overlapped with medium-engagement features, leading to higher misclassification rates.
For example, high-engagement posts were frequently misclassified as medium-engagement due to shared visual elements, such as polarized political statements combined with neutral visuals. Similarly, low-engagement posts lacking emotional appeal or visual richness were often clustered with medium-engagement content. These trends suggest that central clustering of medium-engagement features may contribute to the bias observed in classification results.
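The kind of inspection described above can be reproduced by counting (true, predicted) pairs and reading off per-class recall and the most frequent confusions. The helper names below are illustrative, and any labels passed in are the caller's own; no figures from the study are embedded in the code.

```python
from collections import Counter

LEVELS = ("low", "medium", "high")


def confusion_counts(y_true: list[str], y_pred: list[str]) -> Counter:
    """Count how often each (true level, predicted level) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_recall(counts: Counter) -> dict[str, float]:
    """Share of each true level that was predicted correctly."""
    recall = {}
    for level in LEVELS:
        support = sum(counts[(level, predicted)] for predicted in LEVELS)
        recall[level] = counts[(level, level)] / support if support else 0.0
    return recall


def top_confusions(counts: Counter, n: int = 2) -> list[tuple[tuple[str, str], int]]:
    """Most frequent misclassifications as ((true, predicted), count) pairs."""
    errors = Counter({pair: c for pair, c in counts.items() if pair[0] != pair[1]})
    return errors.most_common(n)
```

If most entries returned by `top_confusions` have "medium" as the predicted level, that is consistent with the pull toward the middle class described in this section.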
7. Implications of Findings
7.1 Ethical Concerns in Digital Campaigning
Digital political campaigns, exemplified by the Italian campaign, present unique ethical challenges. Micro-targeting practices enable campaigns to deliver tailored messages to specific voter segments, enhancing engagement rates. However, this raises concerns about misinformation, echo chambers, and algorithmic biases. Misinformation can be amplified, reinforcing pre-existing biases and leading to polarized political discourse.
For instance, posts relying on emotional triggers, such as exploitation narratives or economic comparisons, were observed to drive higher engagement. While effective, such strategies risk oversimplifying complex issues, potentially distorting voter perception. Addressing these ethical concerns requires robust regulatory frameworks that promote transparency in political advertising and ensure accountability in targeting practices.
7.2 Leveraging Machine Learning Insights
Machine learning analysis provides critical insights into the mechanics of digital campaigns, offering policymakers tools for monitoring electoral compliance. By identifying patterns in voter engagement, regulatory bodies can develop frameworks to mitigate algorithmic manipulation and ensure fair democratic processes.
Furthermore, campaigns can leverage these insights to refine content strategies, emphasizing emotionally compelling narratives and visually engaging designs to optimize voter outreach.
8. Applications and Future Directions
8.1 Expanding Multimodal Analysis
The integration of multimodal data, including textual, visual, and network-based features, holds promise for enhancing the predictive accuracy of machine learning models. Combining these dimensions can provide a holistic view of digital political engagement, capturing the complexities of voter interactions across diverse platforms.
8.2 Tackling Misinformation
Future research should prioritize strategies to counter misinformation and bot activity. For example, the development of deepfake detection algorithms and the implementation of transparency protocols for generative AI content can safeguard electoral integrity.
8.3 Enhancing Scalability
Advancements in scalable machine learning methods, such as anchor-based techniques, can reduce computational demands while maintaining accuracy, making these approaches viable for large-scale election analysis.
9. Conclusion
The Italian campaign offers a compelling case study in the application of machine learning to political analysis. By leveraging social media data, researchers can uncover critical insights into voter engagement dynamics, enhancing the effectiveness of digital campaigns. However, as these technologies evolve, so too must the ethical and regulatory frameworks that govern their use.
Future research should continue exploring interdisciplinary approaches, integrating computational, legal, and ethical perspectives to ensure transparency and accountability in digital political processes. Companies like IBM, with their expertise in AI ethics and machine learning innovation, can play a pivotal role in advancing these efforts. As machine learning continues to reshape political communication, its role in safeguarding democratic integrity remains paramount.
Reference
Sernani, P., Cossiri, A., Di Cosimo, G., & Frontoni, E. Analyzing Digital Political Campaigning Through Machine Learning: An Exploratory Study for the Italian Campaign for European Union Parliament Election in 2024. Computers 2025, 14, 126. Academic Editor: Yan Zhang. Published: 30 March 2025. DOI: https://doi.org/10.3390/computers14040126. This article is distributed under the terms of the CC BY 4.0 license.
License Information
This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. You are free to:
- Share: Copy and redistribute the material in any medium or format.
- Adapt: Remix, transform, and build upon the material for any purpose, even commercially.
Conditions:
- Attribution: You must give appropriate credit, including the citation above, provide a link to the license, and indicate if changes were made.
For more details, visit https://creativecommons.org/licenses/by/4.0/.
