The Systemic Impact of Deplatforming on Social Media: Results

15 Aug 2024

Authors:

(1) Amin Mekacher, City University of London, Department of Mathematics, London EC1V 0HB, UK (equal contribution);

(2) Max Falkenberg, City University of London, Department of Mathematics, London EC1V 0HB, UK (equal contribution; corresponding author: max.falkenberg@city.ac.uk);

(3) Andrea Baronchelli, City University of London, Department of Mathematics, London EC1V 0HB, UK, and The Alan Turing Institute, British Library, London NW1 2DB, UK (corresponding author: abaronchelli@turing.ac.uk).

RESULTS

User acquisition and activity

We start by analysing how the three cohorts of “matched Gettr”, “banned” and “non-verified” users joined Gettr. Figure 1A shows that user registrations were largely steady over time with two exceptions where registrations peaked: (i) July 2021 when the platform was founded, and (ii) January 2022 following the suspension of Marjorie Taylor-Greene and Robert Malone on Twitter [25], and the announcement by Joe Rogan that he would be opening a Gettr account [26].

FIG. 1. User registrations and daily activity for each cohort. (A) 3-day moving average of the daily number of users who registered on Gettr. The curve is displayed separately for the banned cohort (blue), the matched cohort (green) and other non-verified users on Gettr (orange). (B) 7-day moving average of the proportion of users from each cohort who were active on Gettr on a given day. The percentage of the matched cohort active on Twitter is also shown (dashed brown).

In Figure 1B, we show the fraction of accounts from each cohort who are active on any given day. For the matched cohort, we present their activity on both Gettr and Twitter. Focusing on the non-verified cohort, we see that a growing user base does not correlate with the growth of an engaged community, with, on average, 4% of the non-verified cohort active on any given day. On Gettr, 10% of the matched cohort are active on average, likely exceeding the value for the non-verified cohort because verified social media users are typically more active than other users [27]. However, on Twitter, the matched cohort are significantly more active, with 69% of accounts active on any given day. The activity of the matched cohort on Twitter is stable, with no evidence of a reduction in activity following the January 2022 suspensions. For the banned cohort on Gettr, activity approaches the baseline of the matched cohort on Twitter, with 53% active daily, 5 times larger than for the matched cohort on Gettr, and 13 times larger than for the non-verified cohort. These results are qualitatively robust if we consider exclusively English-language accounts or Portuguese-language accounts, the first and second largest Gettr demographics, respectively (see SI).
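The daily activity curves described above can be illustrated with a short sketch. It assumes activity is available as (user, day) pairs and uses a trailing 7-day window; the function names, the input layout, and the choice of a trailing rather than centred window are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_active_fraction(activity, cohort_size, start, end):
    """Fraction of a cohort with at least one post on each day in [start, end].

    activity: iterable of (user_id, day) pairs (hypothetical input layout).
    """
    active = defaultdict(set)
    for user, day in activity:
        active[day].add(user)  # a user counts once per day, however often they post
    n_days = (end - start).days + 1
    return [
        len(active[start + timedelta(days=i)]) / cohort_size
        for i in range(n_days)
    ]


def moving_average(values, window=7):
    """Trailing moving average; early days average over the points available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    posts = [("a", date(2022, 1, 1)), ("b", date(2022, 1, 1)), ("a", date(2022, 1, 3))]
    raw = daily_active_fraction(posts, cohort_size=4,
                                start=date(2022, 1, 1), end=date(2022, 1, 3))
    print(moving_average(raw, window=7))
```

Counting each user once per day keeps a handful of very prolific accounts from inflating the activity fraction.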

User retention on Gettr

We now focus on the retention of users on Gettr. In Fig. 2, panels A and B show the survival curves for the proportion of users who remain active a certain number of days after registration (see Methods) for key registration months (July 2021 and January 2022 where registrations peaked, see Fig. 1), while panel C shows the average retention of users in each cohort over time. Survival curves for other registration months are shown in the SI and follow the same pattern with higher banned retention than matched retention. The matched cohort are consistently active on Twitter with no evidence that users stop using the platform over time: 90% of the matched cohort are active in the first month covered by our dataset (07/21), and 98% of these users remain active in our dataset’s final month (04/22). This highlights that the matched cohort are established Twitter users who are committed to the platform.

Figure 2 shows that the banned cohort have the highest retention on Gettr, independent of the month in which they joined the platform, whereas the non-verified cohort and the matched cohort become inactive at a faster rate. For the highlighted registration months, we note that the January curves fall off at a sharper rate than the July curves: For the July cohort, half of the newly registered users from the non-verified cohort become idle after 216 days, compared to only 68 days for the January cohort.
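For readers unfamiliar with survival curves, the following is a minimal sketch of a Kaplan-Meier estimate with a Greenwood standard error, together with the median time at which half of a cohort has become idle. How a user is classed as idle or censored is defined in the Methods and is not reproduced here; the `durations` and `observed` inputs are hypothetical.

```python
import math


def kaplan_meier(durations, observed):
    """Return [(t, S(t), se(t))] at each distinct event time.

    durations: days from registration to the last observation of each user.
    observed:  True if the user went idle (event), False if still active
               when data collection ended (censored).
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    surv, greenwood_sum, curve = 1.0, 0.0, []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        surv *= 1.0 - events / at_risk
        if at_risk > events:  # the Greenwood term is undefined once S(t) reaches 0
            greenwood_sum += events / (at_risk * (at_risk - events))
        curve.append((t, surv, surv * math.sqrt(greenwood_sum)))
    return curve


def median_survival(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for t, s, _ in curve:
        if s <= 0.5:
            return t
    return None  # more than half of the cohort is still active


if __name__ == "__main__":
    curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
    for t, s, se in curve:
        print(t, round(s, 3), round(se, 3))
    print("median:", median_survival(curve))
```

Under this reading, the 216-day and 68-day figures quoted above correspond to the median survival time of the respective registration cohorts.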

FIG. 2. User retention for key registration months and average retention by registration date over time. (A) Kaplan-Meier survival curves for each user cohort showing the fraction of accounts who registered in July 2021 who remain active on Gettr a given number of days after registration for the banned cohort (blue), matched cohort (green) and the non-verified cohort (orange). The standard error of each curve is computed using Greenwood’s formula [28] (see Methods). The dashed line corresponds to January 1, 2022, shortly before Joe Rogan joined Gettr. (B) Survival curves for January 2022. (C) Decay curves for user activity, showing the duration of their activity with respect to their registration date, normalised by the number of weeks to the end of our data collection period. Data for each cohort is fitted using linear regression (y = ax + b, a = −0.007, [−0.014, 0], b = 0.8, [0.65, 0.95] for banned users, a = −0.011, [−0.015, −0.008], b = 0.6, [0.52, 0.67] for matched users, and a = −0.003, [−0.004, −0.002], b = 0.36, [0.34, 0.37] for non-verified users; square brackets indicate 95% confidence interval, highlighted by shaded area.)

The event which clarifies these differences is the Marjorie Taylor-Greene deplatforming on Twitter. This deplatforming was denounced by Joe Rogan who opened a Gettr account on January 2, 2022, resulting in a large migration of his supporters, and supporters of Marjorie Taylor-Greene, to Gettr. However, after criticising the platform’s policies [29], Rogan quit the platform on January 12. This ten-day period highlights how a single celebrity’s endorsement resulted in a large migration to Gettr. However, the subsequent denouncement by Rogan not only resulted in many new users quitting the platform (those from the January 2022 cohort in panel B), but also resulted in many existing users quitting, see dashed line in panel A. Importantly, members of the banned cohort who registered in July 2021 did not leave Gettr at an enhanced rate after January 2022. This highlights that users who had the option to return to Twitter did so, but those who could not (due to suspension) continued to use Gettr.

Compared to previous Gettr studies which showed that users quickly become idle after registration [30], possibly due to the lack of engaging content [31], our results reveal the discrepancy between users banned from Twitter and users who remain active on Twitter, indicating that Gettr was most successful at retaining users who had lost their Twitter audience. Our results also show that deplatforming events of exceptional prominence can induce a significant influx of accounts into a fringe platform, but not necessarily a corresponding outflux from the dominant mainstream platform.

Gettr structure and content

To further clarify differences between banned and matched users, we now focus on the structure and content of the Gettr social network. We start by generating a topic model using Gettr posts [32] (see Methods). A table of topics and their description is provided in the SI. This shows that content on Gettr is dominated by issues of broad relevance to the US political right including (1) Covid-19 – one sixth of all Gettr content, approaching one third in some months – (2) deplatforming from Twitter and other social media platforms, (3) accusations of election fraud and the January 6th insurrection, and (4) broader issues regarding gender, abortion, gun-control, the US Supreme Court, and race.

Most topics discussed on Gettr are also prominent in tweets authored by the matched cohort; however, three themes are disproportionately prominent on Gettr: (i) accusations of election fraud surrounding the 2020 US election, (ii) resistance to Covid-19 vaccine mandates, particularly in relation to the “Freedom Convoy” protests in Canada, and (iii) the Russian invasion of Ukraine. These topics are known to have been targets of the Twitter content moderation team [33–35].

We now measure whether the banned and matched cohorts are structurally segregated (or polarized) to assess whether the cohorts share the same, or different, audiences on Gettr. We measure segregation using the latent ideology, a well established method which constructs a synthetic ideological spectrum from user interactions on the platform [36–38] (see Methods). This measure orders the network of interactions between a set of influencer accounts (the banned and matched cohorts combined) and a set of accounts who interact with them (the nonverified cohort). By merging the banned and matched cohort into a single group, we can measure differences in how the non-verified cohort interact with banned and matched users in an unbiased manner based on purely structural factors. Note that we exclude a small number of accounts from the influencer set to avoid geographical conflation (see Methods); these are users not based in the USA.

The distribution of the latent ideology for the banned and matched cohort, and for the non-verified cohort, is shown in Fig. 3A. Both distributions are unimodal according to Hartigan’s diptest [39]. We observe that the banned and matched distribution falls within the bounds of the broader non-verified distribution. The banned and matched distribution is, however, significantly narrower, a feature indicative of the network centrality of these users who play a central role in the general Gettr discussion. Non-verified Gettr users are found both at the core of the Gettr discussion and at the peripheries. The central role of banned and matched users is expected since verified social media accounts typically attain higher engagement than non-verified accounts [40, 41].

The unimodal ideology, and the central position of the matched and banned cohorts, indicates that these users share a common audience on Gettr; segregated audiences would appear as a multi-modal ideology distribution (see examples in [37, 38]).

Content toxicity and Twitter mentions

FIG. 3. The latent ideology of Gettr users, and the toxicity of Gettr posts and Twitter tweets. (A) The latent ideology is calculated using the 500 most active banned and matched users on Gettr, merged into a single influencer cohort. Unit values on the x-axis correspond to the standard deviation of the ideology distribution for all users. Both distributions are unimodal when tested using Hartigan’s diptest (multimodality not statistically significant for the non-verified cohort, p = 0.99 > 0.01, banned and matched cohort, p = 0.61 > 0.01). Structural data required for calculating the latent ideology for Twitter is not available. (B) The fraction of posts (tweets) from each user cohort on Gettr (and matched Twitter) with a toxicity value larger than the value shown on the x-axis. Toxicity is calculated using the Google Perspective API [42] (see Methods). Median toxicity [lower and upper quartile] for the non-verified cohort, 0.17 [0.06, 0.37], banned cohort, 0.05 [0.02, 0.15], matched cohort on Gettr, 0.04 [0.02, 0.11], and matched cohort on Twitter, 0.09 [0.04, 0.22].

Together, the results for the latent ideology, topic modelling, and toxicity show that, although there are significant differences in activity and retention between the banned and matched cohorts on Gettr (see Figs. 1 and 2), there is little that distinguishes their audiences and content, and the two cohorts merge into a single structural group. This result confirms previous research which shows that fringe platforms are politically homogeneous; platforms with this property may be referred to as “echo-platforms” [43, 44]. In contrast, mainstream platforms are often politically diverse, but with opposed political groups confined to echo-chambers [38, 41, 44–48].

Considering the toxicity of posts for each topic, we find that topics with disproportionately high toxicity are related to race (e.g., Black Lives Matter; median post toxicity [lower and upper quartile] = 0.40 [0.31, 0.52]), focus on female US Democratic politicians (0.38 [0.18, 0.58]), and discuss gender issues (0.38 [0.24, 0.51]). All three topics are known to attract abusive content on social media [49–51].

We now explore possible reasons why the matched cohort are more toxic on Twitter than they are on Gettr. To do this, we analyse the Twitter accounts mentioned in tweets authored by the matched cohort. For each mentioned account, we compute the ratio between the number of users from the matched cohort who quote-tweet that account and the number of users from the matched cohort who quote-tweet or retweet that account. This ratio (referred to as the “quote-ratio” throughout) is instructive since there is evidence that retweets are often (but not exclusively, journalists being a known exception) used to endorse the message of the original author [38, 52], whereas quote tweets allow a user to comment on a message in either a positive, negative, or neutral manner. Negative “quoting” behaviour is a known method of communication with ideological opponents across polarized environments [53, 54]. Hence, a low quote-ratio (i.e., the account is disproportionately retweeted) indicates general endorsement by the matched cohort of users, whereas a high quote-ratio (i.e., the account is disproportionately quote-tweeted) indicates that the matched cohort are more likely to disagree with and hold a negative view of this account.

Figure 4 shows the toxicity of tweets authored by the matched cohort, binned according to their quote-ratio. We count each mentioner-mentionee pair only once for quote-tweets and once for retweets to avoid bias from highly active accounts, and only include accounts mentioned by at least five matched users. This reveals (i) that tweets authored by the matched cohort mentioning any Twitter account are more toxic than tweets which do not mention another account, and (ii) that tweets authored by the matched cohort are more toxic if they mention an account with a high quote-ratio than if they mention accounts with lower quote-ratios.
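The quote-ratio and its binning can be sketched as follows. This is one reading of the description above, not the authors' code: each author is counted at most once per mentioned account for quote-tweets and at most once for retweets, the ratio is taken as quoters / (quoters + retweeters), and a ratio of exactly 1 is placed in the last bin. The input layout and function names are hypothetical.

```python
from collections import defaultdict


def quote_ratios(interactions, min_users=5):
    """Quote-ratio per mentioned account.

    interactions: iterable of (author, mentioned_account, kind) with kind in
    {"quote", "retweet"} (hypothetical layout). Sets de-duplicate repeated
    interactions, so highly active authors count only once per account and kind.
    """
    quoters = defaultdict(set)
    retweeters = defaultdict(set)
    for author, target, kind in interactions:
        if kind == "quote":
            quoters[target].add(author)
        elif kind == "retweet":
            retweeters[target].add(author)

    ratios = {}
    for target in set(quoters) | set(retweeters):
        q, r = len(quoters[target]), len(retweeters[target])
        # Keep only accounts mentioned by at least `min_users` distinct authors.
        if len(quoters[target] | retweeters[target]) >= min_users:
            ratios[target] = q / (q + r)
    return ratios


def bin_index(ratio, width=0.1):
    """Index of the bin [x, x + width) containing `ratio` (1.0 joins the last bin)."""
    n_bins = round(1 / width)
    return min(int(ratio / width), n_bins - 1)


if __name__ == "__main__":
    data = [(f"u{i}", "A", "retweet") for i in range(5)] + [("u0", "A", "quote")]
    for account, ratio in quote_ratios(data).items():
        print(account, round(ratio, 3), "bin", bin_index(ratio))
```

A low value means the account is mostly retweeted (endorsed), while a value near 1 means it is mostly quote-tweeted.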

FIG. 4. Toxicity of tweets authored by the matched cohort mentioning other Twitter accounts, binned according to their quote-ratio. The distribution of the quote-ratio is shown in Fig. 5. Each point indicates the median toxicity of tweets with a quote-ratio within the binned range [x, x+0.1). Error bars indicate the inter-quartile range. The dashed line indicates the median toxicity for all tweets (including those which do not mention another account) from the matched cohort, with the shaded region indicating the inter-quartile range; all data points lie above this line.

To better understand this result, we plot the distribution of the quote-ratio broken down into four groups. Figure 5A shows the distribution of all users mentioned by the matched cohort, and the distribution for Twitter accounts who are also part of the matched cohort (i.e., a matched account mentioning another matched account). Three individuals are marked on the figure: (1) Republican 2022 Senate nominee Herschel Walker, the user with the lowest quote-ratio of prominent mentioned accounts (> 100 unique mentions), (2) Democratic Speaker of the House Nancy Pelosi [55], the user with the largest quote-ratio (> 100 unique mentions), and (3) Elon Musk, the account with the most unique mentions.

Figure 5B shows elected US political accounts mentioned by the matched cohort, labelled using the dataset in [56], broken down by party affiliation. This shows that Republican politicians are disproportionately retweeted (i.e., endorsed) by the matched cohort, whereas Democrats are disproportionately quote-tweeted. The individuals marked on this panel are political outliers; Liz Cheney and Adam Kinzinger are the Republican politicians with the highest quote-ratios (> 10 unique mentions), whereas Tulsi Gabbard [57] and Kyrsten Sinema are the Democratic politicians with the lowest quote-ratios (> 10 unique mentions). This shows that these politicians do not align with the dominant position of their parties. Consequently, the matched cohort are more likely to endorse the Democratic outliers, and more likely to negatively quote-tweet the Republican outliers; Liz Cheney and Adam Kinzinger have been referred to as RINOs (“Republicans in name only”) by their far-right opponents [58, 59].

Figure 5C shows the news media organisations mentioned by the matched cohort, grouping them according to their political leaning as classified by Media Bias / Fact Check (MBFC; see Methods). Previous research confirmed that MBFC classifications are similar to classifications from other reputable media rating organisations [60]. Finally, Fig. 5D repeats the analysis in panel C, but groups media outlets according to whether MBFC labels them as reliable or questionable.

FIG. 5. The distribution of the quote-ratio of accounts mentioned on Twitter by the matched cohort. (A) The quote-ratio distribution for all mentioned accounts (blue dashed), and for mentioned accounts who are part of the matched cohort of users (i.e., a matched user mentioning another matched user; orange dotted). (B) The quote-ratio distribution for Twitter accounts belonging to known elected US Republican (pink solid) and known elected US Democrat (brown dashed) politicians. (C) The quote-ratio distribution for Twitter accounts belonging to news media organisations who have been labelled with a political leaning by MBFC. Organisations are classified as left (purple dotted), least-biased (grey solid), right (red dot-dashed), or far-right (yellow dashed). (D) The same news media organisations, but broken down according to whether they are classified as reliable or questionable by MBFC. Vertical lines mark the median of each distribution. Annotations indicate mentioned accounts of particular interest (see text).

Using the distribution of all mentions (the “any user” curve in Fig. 5A) as the baseline behaviour of the matched cohort, we find that, when tested using a two-sample Kolmogorov-Smirnov test, only the distributions of far-right media organisations in panel C (KS-test p-value = 0.24 > 0.01; Cohen’s d = 0.20) and questionable media organisations in panel D (KS-test p-value = 0.29 > 0.01; Cohen’s d = 0.05) are not significantly different from the baseline (see SI). The distribution for Democratic politicians differs most from the all-mention baseline (KS-test p-value = 3 × 10^−16 < 0.01; Cohen’s d = 2.34). With the exception of Tulsi Gabbard, no Democratic politician has a known Gettr account; 132 are mentioned on Twitter by the matched Gettr cohort. In contrast, 32 Republican politicians have been active on Gettr; 151 are mentioned on Twitter by the matched Gettr cohort.
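The comparison against the baseline rests on two quantities: the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical distribution functions) and Cohen's d (the difference in means in units of the pooled standard deviation). The sketch below computes both in plain Python on hypothetical quote-ratio samples; it does not compute the p-value, for which a statistics library such as SciPy (`scipy.stats.ks_2samp`) would normally be used.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # empirical CDF of a at x
        fb = bisect_right(b, x) / len(b)  # empirical CDF of b at x
        d = max(d, abs(fa - fb))
    return d


def cohens_d(a, b):
    """Cohen's d with the pooled (sample) standard deviation; sign follows mean(a) - mean(b)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled


if __name__ == "__main__":
    group = [0.6, 0.7, 0.8]      # hypothetical quote-ratios of one group of accounts
    baseline = [0.1, 0.2, 0.3]   # hypothetical quote-ratios of the baseline
    print(ks_statistic(group, baseline), cohens_d(group, baseline))
```

Reporting an effect size alongside the test is useful here because, with many mentioned accounts, even small distributional differences can reach statistical significance.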

Combining the evidence from the topic modelling and from the quote-ratio in Fig. 5 indicates that the matched cohort are aligned with the US far-right, often quote-tweeting, but not retweeting, their Democratic political opponents and moderate Republicans. In conjunction with the latent ideology in Fig. 3, this suggests that Gettr as a whole is generally representative of the US far-right. These results suggest that the ability to mention one’s political opponents on Twitter is part of the reason that the matched cohort are more toxic on Twitter than they are on Gettr where direct interactions with political opponents are not possible [61, 62].

Gettr’s wider impact on right-wing politics - the case of Brazil

Journalistic reports have suggested that Gettr played a key role in facilitating the Brasília insurrection on January 8, 2023, following Jair Bolsonaro’s defeat in the Brazilian Presidential elections [63, 64]. Here we investigate whether there is evidence for this role in the Gettr interaction network.

First, we study the power imbalance in the Portuguese language network by measuring the Gini coefficient of the degree distribution, shown in Fig. 6A. The figure shows that the Gini coefficient peaked in the run-up to the Brasília riots, which is evidence that a handful of users were responsible for shaping the collective narrative of the Portuguese language Gettr community [65, 66].
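As an illustration of this measure, the sketch below computes the Gini coefficient of a network's degree distribution from an edge list. It treats interactions as undirected edges and does not restrict to the giant connected component, both of which are simplifying assumptions; the function names are illustrative.

```python
from collections import Counter


def degrees(edges):
    """Degree of each node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def gini(values):
    """Gini coefficient: 0 for a perfectly even distribution, near 1 if one node dominates."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))  # rank-weighted sum
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
    ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
    print(gini(degrees(star).values()))  # interactions concentrated on one hub
    print(gini(degrees(ring).values()))  # every node equally connected
```

A rising value therefore indicates that interactions are concentrating on a small number of accounts.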

FIG. 6. Evolution of the interaction network in the Brazilian community. Analysis of the daily interaction network, generated by considering any interaction within a 1-day window. (A) Gini coefficient of nodes in the giant connected component. (B) Transitivity of the giant component. Dashed lines correspond to key events related to Brazilian politics and Gettr’s involvement: (1) 2021 CPAC Brazil Conference, (2) the Brazilian presidential election, and (3) the Brazilian Congress attack in Brasília.

We now study grassroots engagement, measured by computing the transitivity of the Gettr interaction network, see Fig. 6B. This measure increases when a community of users densely interact with one another [67, 68]. The figure shows that network transitivity peaked during CPAC 2021, where the Bolsonaro regime and Gettr shaped their close alliance [69], and in the days leading up to the Brasília riots. Applying a Portuguese-language topic model to the network reveals that users were discussing accusations of rigged elections and claims of a corrupt media (see SI) in the lead up to the riots.
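Transitivity is commonly defined as three times the number of triangles divided by the number of connected triples of nodes. The sketch below implements that standard definition for an undirected edge list; whether the authors treat interactions as directed or weighted is not stated in this section, so the undirected, unweighted form is an assumption.

```python
from itertools import combinations


def transitivity(edges):
    """Global transitivity: closed triples / all connected triples (undirected graph)."""
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-interactions
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    closed = 0   # triples whose end nodes are also linked (each triangle counted 3 times)
    triples = 0  # all paths of length two, counted at their centre node
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


if __name__ == "__main__":
    triangle = [("a", "b"), ("b", "c"), ("c", "a")]
    path = [("a", "b"), ("b", "c")]
    print(transitivity(triangle), transitivity(path))
```

A broadcast-style network, where followers interact only with a central account, has transitivity close to zero; the value rises when those followers also interact with one another.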

The peak in both the Gini coefficient and the transitivity shows that leading Bolsonaro allies successfully capitalised on accusations of election fraud to generate a grassroots movement on Gettr in the wake of Bolsonaro’s defeat in the Brazilian elections. These results offer new quantitative insights which build on journalistic reports of Gettr’s role in the riots. Critically, our results show that even when a platform appears largely inactive, a community of idle users can be mobilised within a short time period, leading to real-world harms.

This paper is available on arXiv under a CC BY 4.0 DEED license.