Otherness and suspiciousness: a comparative study of public opinions between the Confucius Institute and Goethe-Institut in developing countries


ABSTRACT

This study aims to examine the differences in lexical priming features between Confucius Institutes and Goethe-Instituts in developing countries using lexical priming theory and


natural language processing techniques. By collecting news media coverage from 2014 to 2023, this study analyses the corpus through collocation, colligation, semantic association, and


semantic prosody. The results show that the Goethe-Institut has a more stable institutional identity, while the Confucius Institute is still largely recognized for language


teaching and cultural dissemination activities. The association of the Confucius Institute with China and its government creates a stronger sense of "otherness" and leads to


negative perceptions. Overall, this study contributes to a better understanding of public perception and institutional image in developing countries.

As a result of the spiral of silence, the blockage of semiotic hegemony and linguistic alliance, and the absence of Chinese


capacity for worldwide promotion, the Confucius Institute faces challenges in the current era. According to a 2019 Pew Research Center survey, developing countries view China more favourably


than developed countries do. To overcome the public opinion dilemma that currently exists under such a communication pattern, it is necessary to examine how the Confucius Institute is


depicted through language in social media in developing countries. Discourse analysis has been widely used to study public opinion and institutional image. However, previous studies on the


public perception of Confucius Institutes have exhibited gaps in research methodologies and a lack of comparative analyses. Therefore, this study aims to address these gaps by incorporating


a control group, the Goethe-Institut, and utilizing Hoey’s lexical priming theory with the help of artificial intelligence. The research questions of this study include the differences in


lexical priming features between news reports on Confucius Institutes and Goethe-Institut, their impact on the audience’s attitudes and stances, and the way in which these features shape


public opinion images of the institutions.

LITERATURE REVIEW

REVIEW OF STUDIES ON CONFUCIUS INSTITUTE

Confucius Institutes (CI) are public educational and cultural promotion programs that are funded and arranged by the Chinese International Education Foundation. The program was started in 2004 and was supported by Hanban, an agency affiliated with the Chinese Ministry of Education (renamed the Center for Language Education and Cooperation in 2020), which cooperated with universities. The objectives of this program include fostering cross-cultural interactions, supporting local


Chinese teaching abroad, and promoting the Chinese language and culture.

REVIEW OF STUDIES ON THE PROBLEMS AND INFLUENCE OF CONFUCIUS INSTITUTES

The majority of studies on Confucius Institutes (CI) published in English-language outlets concentrate on communication across cultures, national strategic analysis, and teaching.

* 1. Language teaching: In these studies, the advantages of Confucius Institutes as educational institutions are highlighted, leading to a more favourable perception of them (e.g., Starr 2009; Selezneva 2021).
* 2. Strategic analysis: Some academics adopt a macro viewpoint when performing strategic analysis (e.g., Lien et al. 2012; Huang et al. 2019). Confucius Institutes are viewed more neutrally and critically since they are seen as diplomatic weapons, and their political implications are emphasized. The idea of Confucius as a "gentle teacher" contradicts Chinese intentions, according to Harting (2014), who asserts that Confucius Institutes have political aims rather than cultural aims and do not represent the "real" China. Additionally, several studies examine the commercial activities of the Confucius Institute and its effects on Sino-foreign commerce, treating it as an economic instrument (e.g., Li et al. 2009; Lien and Co 2013).
* 3. Cultural communication: The research findings from this viewpoint are split into two opposing groups. One is that the Confucius Institute significantly strengthens China’s soft power, and China has


been using the Confucius Institute to improve its reputation among the international public (e.g., Gill and Huang 2006; Kluver 2014). Brazys and Dukalskis (2019) suggested that Confucius


Institutes systematically enhanced media attitudes towards China on a worldwide scale by comparing regions with or without Confucius Institutes. However, other findings suggested that it is


challenging to determine whether the construction of Confucius Institutes is helping China make friends around the world (e.g., Paradise 2009; Selezneva 2021). Xie and Page (2013) examined


data from the 2007 Pew Global Attitudes Survey and found no evidence to substantiate the hypothesis that the construction of Confucius Institutes in 35 nations had a positive effect on


Chinese diplomacy. According to Zhou and Luk’s (2016) study, receivers are not particularly attracted to China’s soft power or CI. Some scholars even assert that the Chinese culture taught


at Confucius Institutes is a unique incarnation of Chinese policy and that Confucius Institutes interfere with academic freedom (e.g., Acquaye 2020; Yeh et al. 2021). Overall, the


investigations mostly have an ideological bias towards Confucius Institutes and frequently preemptively attack their role in disseminating political propaganda.

REVIEW OF STUDIES ON PUBLIC OPINIONS TOWARDS CONFUCIUS INSTITUTES

In 2011, a study on the perceptions of Confucius Institutes abroad was conducted. Li and Dai (2011) discovered that the topic and news content


determined whether the U.S. media made positive or negative comments regarding Confucius Institutes in its coverage of the institutions from 2005 to 2010. Since that time, there has been an


increase in the number of studies. The study topics have ranged from media coverage in a particular nation to regional or global statistical analysis to public opinion surveys. Among them,


the study of public opinion in a single nation started with that of the United States (An and Xu 2015; Liu and Zeng 2017; Zhang 2021), and it progressively expanded to include the analysis


of British (Peng and Yu 2016), Spanish (Min 2012), Australian, and Canadian (Zhang and He 2016) media. Additionally, some of them even adopted a global outlook. They gathered copious news stories


without considering the news source to synthesize public opinion and create a worldwide perception of the Confucius Institute. Yan (2018), for instance, summarized the changes in the public


opinion environment of Confucius Institutes over the period of a decade by conducting a comparative analysis of press reporting on the institutions from 2005 to 2014. Additionally, the


research methodologies have been improved, moving from qualitative to quantitative analysis and from content-based to discourse analysis. Content-based analysis is a popular analytical technique in communication science and is well suited to examining the public opinion environment. As a result, numerous researchers have conducted quantitative analyses of


foreign media and their reports using the content-based analysis approach, assessing the stances of the reports’ content, and drawing conclusions about the trend of data on the stances of


the media reports (e.g., Zhang and He 2016). As the number of studies employing discourse analysis techniques such as critical discourse analysis has steadily grown (e.g., Ye 2015; Liu and Zeng 2017), researchers have increasingly recognized how strongly news discourse affects audiences. For instance, Zhang (2021) examined the traits and tendencies of the coverage of


Confucius Institutes by the New York Times in a recent study, focusing on the connection between reporters’ attitudes and social ideology. Furthermore, in the first half of 2021, Zhang


brought lexical priming theory, a different type of discourse analysis methodology, into the research and evaluated a number of news articles about Confucius Institutes in Western media. The


conclusions of research on public perceptions of Confucius Institutes are divided. The bulk of research that concluded that media shaped public opinion to create a


passive image of society used American media as the subjects. The results of these papers may be summarized as follows: foreign media politicized the interpretation of Confucius Institutes


(Liu 2014). According to Zhou et al. (2018), Confucius Institutes are portrayed as institutions with political characteristics that disseminate ideology and interfere with academic freedom


(e.g., Yuan et al. 2016; Xing and Zhao 2021). In contrast, a considerable number of studies have revealed that the public’s perception of Confucius Institutes is either neutral or positive


(e.g., Ye 2015). However, most of the studies found that there were distinct segments of public opinion. For instance, Liu and Zeng (2017) separated the sample reports into two segments:


"Chinese language education and cultural promotion" and "public diplomacy and soft power." Although these public opinion findings indicated that Confucius Institutes had


a generally positive reputation, there were also doubts regarding political and academic meddling (Zhang and He 2016). As soon as such themes or issues were associated, the perspectives


abruptly switched. In summary, scholars have paid attention to how Confucius Institutes are perceived by the global public and have compiled their findings in a number of studies. However,


objective, controlled quantitative comparison remains inadequate, and content and discourse analyses rely heavily on the researcher’s own interpretation. In addition, in earlier discourse analyses, researchers paid the most attention to the high-frequency terms (also known as content words) in the whole report since they might indicate the news value and


overall theme. Zhang (2021) notes the drawbacks of considering only high-frequency textual words when the news reports’ substance might not always be pertinent to the institutes being


targeted. Focusing on collocations (content terms with semantic associations) helps address this deficiency by revealing which words and phrases are frequently directly associated with the


targeted institutes.

REVIEW OF THE LEXICAL PRIMING THEORY

In 2005, British linguist Michael Hoey systematically set out the theory of lexical priming. It arose against the background of an increasing awareness that traditional views of the vocabulary of English were out of kilter with the facts about lexical items that are routinely being brought up by corpus investigation


(Hoey 2005). According to Hoey, the core idea of lexical priming is described as follows:

> Every time we use a word, and every time we encounter it anew, the experience either reinforces the priming by confirming an existing association between the word and its co-texts and contexts, or it weakens the priming, if the encounter introduces the word in an unfamiliar context or co-text. (Hoey 2005: 9)

In other words, every time we encounter a word or phrase, we store it along with all the words that accompanied it and with a note of the


kind of context in which it was found (Kaszubski 2007). Thus, when context is analysed through lexical priming theory, it can be divided into three parts: collocations, colligations and


semantic associations. Since the publication of his book, Michael Hoey’s theory of lexical priming has been applied in various ways. He himself explored the relationship between lexical


priming and creativity (e.g., Hoey and O’Donnell 2008), as well as its impact on second-language learning and in languages other than English. Other researchers, such as Jantunen and Brunni


(2013) and Jantunen (2017), have extended the theory to include morphology, and corpus-based lexical priming theory has been applied to different kinds of text research (e.g., Leedham and


Cai, 2013). Pace-Sigge (2018) and Patterson (2016; 2018) have looked at evidence of lexical priming in spoken English and its connection to metaphor use. The book _Lexical Priming:


Applications and Advances_, which was edited by Patterson and Pace-Sigge in 2017, covers a wide range of topics of further applications and advances of lexical priming theory. Several


studies have introduced lexical priming into news report analysis. To understand how lexical priming is used in practice, Drake (2009) examined a large amount of naturally occurring


phraseplay from a newspaper corpus. He then used corpus-based lexical priming theory on news reports. Zhang (2021) has applied the theory by examining the frequency of priming prepositions


and nouns in news stories to elucidate the attitude and posture of American media towards Confucius Institutes. Specifically, in the book _Lexical Priming: Applications and Advances_


(Patterson and Pace-Sigge 2017), the article _Forced lexical primings in transdiscoursive political messaging_ written by Alison Duguid & Alan Partington discusses how forced lexical


priming is generated and received in political messaging across disciplines, thus affecting listeners. In 2018, Michael Pace-Sigge delved deeper into the relationship between linguistics and


artificial intelligence in the book _Spreading Activation, Lexical Priming and the Semantic Web_. This book illustrated that linguistic knowledge supports the ability of computing devices


to process human language. In turn, these electronic devices are increasingly approaching the creation of a mirror image of language processing, thus supporting the foundational theory of


language structure. These books inspired the methodology of this paper. It can be inferred from the previous discussion that discourse analysis methods used to examine news articles about


Confucius Institutes still suffer from inadequacies. Additionally, there has been a dearth of substantial research comparing the public opinion of Confucius Institutes to that of other


institutions, highlighting the need for further investigation in this area. Furthermore, comparative analysis is an approach commonly used to increase the accuracy and validity of data


analysis. As such, this study incorporates a control group, the Goethe-Institut, into the analysis to mitigate potential gaps and increase the study’s validity. Moreover, inspired by Michael


Pace-Sigge’s work in 2018, this study aims to incorporate artificial intelligence into Hoey’s lexical priming theory. This will enable the expansion of the corpus of text types and volumes


that can be analysed using lexical priming theory. By integrating insights from corpus linguistics and natural language processing, this study seeks to apply lexical priming theory to


explore the relationship between public opinion and institutional image. This paper aims to provide answers to the following questions:

* (1) What are the differences in lexical priming features in the collocation, colligation, semantic association and semantic prosody of news reports on the Confucius Institute and Goethe-Institut separately?
* (2) How can these lexical features influence audiences’ attitudes and stances differently?
* (3) How do these features shape the public opinion images towards CIDC and GIDC?
* (4) What are the differences in public opinion images towards the CIDC and GIDC?

THEORETICAL FRAMEWORK

Lexical priming theory proposes an association-based network model that represents words as nodes in a large memory network,


with similar words connected to each other via edges (Kumar et al. 2020). This theory involves a spread of activation, where activation spreads from one concept to related concepts along


associative and semantic pathways. Although evidence suggests that association-based network models capture complementary semantic information compared to text-based distributional models


(Gruenenfelder et al. 2016), their validity has been questioned on the grounds of being constructed from retrieval-based processes involved in word association tasks (Jones et al. 2011).
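The spread of activation described above can be illustrated with a minimal sketch. It is not the procedure used in this study: the function name `spread_activation`, the `decay` factor, the number of `steps`, and the toy association weights are all hypothetical choices made only for illustration. The sketch represents words as nodes, associations as weighted edges, and lets activation flow outward from one word while weakening at each step.

```python
from collections import defaultdict


def spread_activation(graph, source, decay=0.5, steps=2):
    """Spread activation from `source` through a weighted association graph.

    graph: dict mapping word -> {neighbour: association weight in [0, 1]}
    decay: fraction of energy retained per hop (illustrative assumption)
    steps: number of hops over which activation spreads
    Returns a dict mapping word -> accumulated activation.
    """
    activation = defaultdict(float)
    activation[source] = 1.0
    frontier = {source: 1.0}
    for _ in range(steps):
        next_frontier = defaultdict(float)
        # Pass a decayed share of each active node's energy to its neighbours.
        for node, energy in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                next_frontier[neighbour] += energy * weight * decay
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Toy association network; the words and weights are invented for illustration.
toy_graph = {
    "institute": {"language": 0.8, "culture": 0.6},
    "language": {"teaching": 0.9, "institute": 0.8},
    "culture": {"festival": 0.5},
}

result = spread_activation(toy_graph, "institute")
for word, value in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {value:.3f}")
```

In this toy network, directly associated words ("language", "culture") receive more activation than words two steps away ("teaching", "festival"), which mirrors the idea that closely linked concepts are primed more strongly than distant ones.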


Therefore, a complete account of semantic memory should include an explanation of how such associations are formed and how the complex network structure that successfully explains


behavioural performance in semantic tasks is acquired. To address this issue, this study seeks to combine lexical priming theory and the word2vec model, which serves as a priming performance


for distant concepts. By doing so, this study aims to provide a comprehensive account of semantic memory and to gain insights into the underlying attitude behind the primings.

THE PROCEDURE OF LEXICAL PRIMING THEORY

Hoey (2005: 7) states that “collocation is pervasive” and that “any explanation for the pervasiveness of collocation has to be psychological, as … [it is] a


psychological concept”. The term “collocation” refers to a psychological association between words that are up to five words apart, evidenced by their occurring together in corpora more often than is explicable in terms of random distribution (Hoey 2005: 8). Because collocation is strongly tied to the psychological phenomenon of priming that results from a language user’s frequent encounters with a word, a concordance program can be used to determine the position of lexis. Therefore, a concordance is used in this paper to examine the primings that underlie collocation. In addition, statistical probabilities can be taken into consideration (a mutual information score is just one of many possible options): a machine can give predictions as to the degree of likelihood that WORD appears with any other word (Pace-Sigge 2018). Thus, pointwise mutual information (PMI) is used in this study to assess whether the connection between a candidate collocate and the matching string is strong enough. PMI quantifies the degree to which words appear together more frequently than would be expected if they occurred independently in a corpus. In other words, PMI is a useful technique for identifying words that are semantically related to a particular phrase. As lexical priming focuses on the


psychological element of word choice, semantic association was defined by Hoey as follows:

> [semantic association] exists when a word or word sequence is associated in the mind of a language user with a semantic set or class, some members of which are also collocates for that user. (Hoey 2005: 24)

This definition highlights the status of collocates related to


specific keywords. The degree to which collocates are associated with the matching string serves as the foundation for determining the semantic associations. Therefore, pointwise mutual


information (PMI) can also be used to locate and summarize the matching string and to conclude the associations. It should be noted that this paper adopts the MI3 (Mutual Information Cubed) calculation formula implemented in WordSmith 8.0:

$$MI3 = \log _2\left( {J^3E/B} \right)$$

where _J_ is the joint frequency; \(F_1\) is the frequency of word 1; \(F_2\) is the frequency of word 2; \(F_{total}\) is the total number of tokens; and

$$E = J + \left( {F_{total} - F_1} \right) + \left( {F_{total} - F_2} \right) + \left( {F_{total} - F_1 - F_2} \right)$$

$$B = \left( {J + \left( {F_{total} - F_1} \right)} \right)\left( {J + \left( {F_{total} - F_2} \right)} \right)$$

Compared to Mutual Information, MI3 offers several distinct advantages. MI3 emphasizes joint


frequency by raising it to the power of three, prioritizing associations with robust co-occurrence. It considers total token counts through term E, providing a comprehensive understanding of


associations within the corpus. MI3 ensures enhanced precision by incorporating term B, capturing distinctiveness and minimizing chance associations. Adopting the MI3 formula deepens our


understanding of word associations, enhancing linguistic analysis accuracy. However, Hoey (2005) notes that semantic association can be particularly affected by local collocations that might not appear in an average corpus or act as a complimentary (or, conversely, discourteous) complement to words or word sequences. The similarity of the word vectors calculated by the word2vec model can compensate for this limitation of PMI and be used to measure semantic association in a specific semantic environment. Notably, semantic prosody, the meaning conveyed by collocational links, typically encodes attitudes and evaluations. Semantic prosody goes further, as it highlights that a large number


of words in use have an underlying, subconscious prosody that, according to Louw, only became visible once computers made large-scale concordancing possible (Louw 1993). In this research, we


assumed that semantic prosody is defined as the larger textual context associated with the words in the corpus. The TOPICs extracted by LDA that are closely related to WORD and its stances


and attitudes are regarded as the semantic prosody (this will be explained later in “Word2vec model and LDA model”). The definition of colligation is the grammatical functions preferred or


avoided by the group in which the word or word sequence participates (Hoey 2005). Hoey orients his definition of colligation towards Halliday’s use—i.e., “the relation [held] between a word


and a grammatical pattern” (Hoey 2005). Based on this, he defined colligation as follows:

* 1. the grammatical company a word or word sequence keeps (or avoids keeping) either within its own group or at a higher rank;
* 2. the grammatical functions preferred or avoided by the group in which the word or word sequence participates; and
* 3. the place in a sequence that a word or word sequence prefers (or avoids) (Hoey 2005).

To determine colligation, StanfordNLP was used in this study to carry out dependency parsing, which analyses the grammatical structure of a sentence and addresses the three aspects of colligation: grammatical company, grammatical functions, and the preferred position of a word or word sequence. Dependency parsing identifies the syntactic relationships between words and represents them as nodes and edges, which reflect the grammatical roles of words in the sentence.

WORD2VEC MODEL AND LDA MODEL

WORD2VEC MODEL

The


word2vec model is an unsupervised shallow neural network model that is used in natural language processing to represent words as vectors. The model consists of only an input layer, a hidden


layer, and an output layer. In this model, a neural network is used to learn the relationships between words, and they are mapped to vectors in a high-dimensional space, such that the


proximity of words in the space reflects their semantic similarity. The word2vec model includes two main models: continuous bag of words (CBOW) and skip-gram (Jatnika et al. 2019). The CBOW


model predicts the target word \(\omega _t\) given its surrounding context words \(\omega _{t - 2}\), \(\omega _{t - 1}\), \(\omega _{t + 1}\), \(\omega _{t + 2}\). On the other hand, the


skip-gram model predicts the context words \(\omega _{t - 2}\), \(\omega _{t - 1}\), \(\omega _{t + 1}\), and \(\omega _{t + 2}\) given the target word \(\omega _t\). The structures of the two neural network models are shown in Fig. 1. In the word2vec model, word vectors are designed to capture the semantic meaning of words, so that the representations of similar words are located closer together within the vector space. This proximity reflects the underlying semantic similarity between words. Word vector similarity is commonly calculated with cosine similarity, which measures the cosine of the angle between two vectors. When the angle between two vectors is small (approaching 0), their cosine


similarity approaches 1, indicating a high degree of similarity. Conversely, as the angle increases (approaching π/2), the cosine similarity approaches 0, indicating a low degree of


similarity. By evaluating the cosine similarity between word vectors, one can quantitatively determine their semantic similarity. A cosine similarity close to 1 suggests that the words


possess similar semantic meanings, while a cosine similarity close to 0 suggests significant differences in their semantic interpretations. Hence, the computation of word vector similarity


facilitates the identification of highly co-occurring words within similar linguistic contexts. These words exhibit both collocational patterns, indicative of their frequent co-occurrence,


as well as semantic associations, reflecting their interconnected meanings.

LATENT DIRICHLET ALLOCATION

Latent Dirichlet allocation (LDA) is an unsupervised learning algorithm that is based


on probabilistic graphical models and used for large-scale text corpus topic modelling (Blei et al. 2003). The LDA model assumes that each document consists of multiple topics, and each


topic consists of multiple words. For documents, the topics follow a Dirichlet distribution, while for topics, the words follow a Dirichlet distribution. Given a text matrix that has already


been vectorized, the probability distribution of each word \(\omega _i\) belonging to a topic \(z_i\) can be calculated as follows:

$$p\left( z_i = k \mid z_{-i}, \mathbf{x}, \alpha, \beta \right) \propto \left( n_{d_i,k} + \alpha_k \right) \frac{e^{\phi_{\mathbf{k}} \cdot \mathbf{x}_i}}{\sum\nolimits_{j = 1}^{V} e^{\phi_{\mathbf{k}} \cdot \mathbf{x}_j}}$$

where \(n_{d_i,k}\) is the number of words in document \(d_i\) that belong to topic _k_; \(\alpha _k\) is the weight of the _k_-th topic in the document-topic distribution; \(\phi _{\mathbf{k}}\) is the vector of the _k_-th topic; \(\mathbf{x}_i\) is the vector of the _i_-th word; and _β_ is the hyperparameter of the topic-word distribution. Similarly, the probability distribution of each word _ω_ belonging to a topic _k_ can be calculated as follows:

$$p\left( \omega \mid z = k, \mathbf{x}, \beta \right) \propto e^{\phi_{\mathbf{k}} \cdot \mathbf{x}_\omega}$$

where \(\mathbf{x}_\omega\) is the vector of word _ω_. In this study, a joint use of word2vec and


LDA for text modelling is proposed, which has several advantages. First, the semantic relationships between words are considered. Traditional LDA models only take into account the frequency


information of words in the text without considering their semantic relationships. However, by using word2vec to vectorize words, the semantic relationships between them can be better


captured, thus improving information extraction from the text. Second, the combination of word2vec and LDA enhances the effectiveness of text modelling. Word2vec represents words as vectors,


thus providing a better representation of the relationships between them, while LDA represents the text as a distribution of topics, providing a better representation of the content. The


combination of these two approaches leads to a more effective representation of the text. Third, LDA models may suffer from the inclusion of noise words, which can negatively impact the


model’s performance. However, by vectorizing words using word2vec, noise words can be more easily distinguished from relevant words, leading to a more robust model. Finally, the joint use of


word2vec and LDA allows for more granular text modelling. Word2vec captures detailed relationships between words, while LDA represents the text as a distribution of topics, providing a


better representation of the content. By combining these two approaches, more detailed information can be extracted from the text.

METHODOLOGY

The attitudes carried by language in social media in developing nations can be detected by comparing the collocational words and phrases of the names "Confucius Institute" and "Goethe-Institut." Evaluating the positive and negative semantic associations makes it possible to determine the contexts in which international audiences regularly read about the institutions. Colligation analysis is also helpful for determining whether the institute in the context is in a dominant or subsidiary role. As a result, audience and social media stances towards the Confucius Institute and Goethe-Institut in developing countries can be compared.

RESEARCH DESIGN

In the new era, the restrictions of the West’s international communication pattern, the blockade of


semiotic hegemony and language alliances, targeted rumour propaganda against China with the influence of the spiral of silence, and the lack of China’s international promotion capacity (Xing


and Zhao 2021) all pose challenges to the development of the Confucius Institute. Several polls have shown that developing countries have a more favourable view of China than developed


countries. According to the Global Survey on China’s National Image 2019 by the Institute of Contemporary China and the World, 79% of developing countries consider their country’s


relationship with China important and increasingly recognize China’s performance in foreign affairs. According to the Pew Research Center, more affluent countries, such as Japan (85%),


Sweden (70%), Canada (67%), and the United States (60%), have a more negative attitude towards China. More than half of people in African and West Asian countries, such as Nigeria (70%) and


Kenya (58%), have a favourable opinion of China. Thus, although powerful Western media still restrict the international communication pattern, the demographic advantage and trust of


developing countries provide a breakthrough to resolve this dilemma. It is helpful to analyse the contexts of language and attitudes presented behind the language of public opinions from


developing countries. Therefore, studying the image of Confucius Institutes in the media and on developing countries’ websites can help us resolve the difficulties more effectively.


Therefore, in this study, the sample sources are limited to developing countries. In addition, the Goethe-Institut belongs to Germany, which did not colonize, for any long period, the developing countries where the corpus was collected. That is, compared to the British Council and Alliance Française, it has less political and cultural influence in developing countries. Moreover,


according to the literature review above, researchers have concluded that the Goethe-Institut is a well-developed institute for language teaching and culture promotion. It has overcome the


initial difficulties and has formed a favourable international public opinion environment. In this paper, a quantitative corpus-based analysis is adopted. Moreover, we take lexical priming


theory as the theoretical framework, analysing the collected discourse from three aspects: collocation, semantic associations, and colligation. The quantitative analysis is applied to the


analysis of lexical priming to find the collocation, semantic association words and colligational types.

RESEARCH SAMPLES

This study is based on the data in the NOW corpus. The NOW Corpus is


a subcorpus of the English Language Corpus created by Brigham Young University in the U.S. It is the most up-to-date corpus of English, containing a wide range of online newspapers and


magazines (technology, entertainment, sports, politics, etc.). This study takes “Confucius Institute” and “Goethe-Institut” as matching strings to collect data from the NOW corpus from


01/01/2014 to 04/01/2023. On June 1, 2014, the American Association of University Professors (AAUP) issued the statement “Confucius Institutes Threaten Academic Freedom”. Since then, Confucius Institutes in the U.S. and other Western countries have been affected to varying degrees, and some have even been forced to close down. As the texts collected in the NOW corpus come mainly from countries whose mother tongue or official language is English, the developing countries represented in the corpus are the following: India, Sri Lanka, Pakistan, Bangladesh, Malaysia, Philippines, South


Africa, Nigeria, Ghana, Kenya, Tanzania, and Jamaica. In the NOW corpus, 1086 texts are found through the matching string “Confucius Institute”, while 1065 pieces are related to


“Goethe-Institut”. Due to limited permissions, it is difficult to analyse the data within the NOW corpus directly. Therefore, the original texts are copied and collected to create small new


corpora: CI Developing Country Corpus (CIDC) and Goethe-Institut Developing Country Corpus (GIDC). The number of texts from various nations in the CIDC and GIDC is compared in Fig. 2. In


developing nations such as Ghana and Bangladesh, opinions on the Confucius Institute and the Goethe-Institut are divisive. The polarization might be connected to the influence of


institutions in various nations. Additionally, Sri Lanka and Jamaica yield few reports, suggesting that both institutes have little influence in those nations. However, a glance at Chart 1 shows that the overall influence of the Goethe-Institut and the Confucius Institute on social media in developing nations is comparable, making the two data sets suitable for comparison.

DATA APPROACHES

The study takes a four-part approach to data processing (see Fig. 3): data collection and preprocessing, collocation calculation, colligation


analysis, and semantic association construction. First, we retrieved news articles and established small-scale corpora CIDC and GIDC by searching for the keywords “Confucius Institute” and


“Goethe-Institut” in the NOW Corpus from 01/01/2014 to 03/31/2023. Complete news articles were obtained by retrieving the URLs of all relevant news through legal means. Prior to data


cleaning, the collected news articles were structurally processed, such as by removing URLs and non-English characters, followed by segmentation and tokenization based on English stop words.
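The cleaning step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the stop word list is a tiny placeholder, "non-English characters" is approximated as non-ASCII characters, and the function name `preprocess` is hypothetical.

```python
import re

# Tiny illustrative stop word list (assumption); a real run would use a fuller list.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "at", "on", "for", "with"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")      # stand-in for "non-English characters"
TOKEN_RE = re.compile(r"[a-z]+(?:-[a-z]+)*")     # keeps hyphenated words such as goethe-institut

def preprocess(text):
    """Strip URLs and non-ASCII characters, then tokenize and drop stop words."""
    text = URL_RE.sub(" ", text)
    text = NON_ASCII_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

sample = "The Goethe-Institut held a film festival in Dhaka, see https://example.org/news 文化"
cleaned = preprocess(sample)
print(cleaned)
```

Treating everything outside ASCII as noise also discards typographic quotes and accented letters, so a production pipeline would likely need a more careful filter.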


Second, collocation calculation combined pointwise mutual information (PMI) with word vector similarity. Using the Skip-gram and CBOW architectures of word2vec, each word in the corpus was transformed into a fixed-length word vector. The similarity between each word vector and the target words "Confucius Institute" and "Goethe-Institut" was then calculated, and the results were ranked by similarity. Combining word vector similarity with PMI allowed the two measures to complement each other in the collocation analysis. Third, we used the


StanfordNLP model to perform syntactic dependency parsing on sentences containing the target words "Confucius Institute" and "Goethe-Institut" to construct syntax trees.
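How counts of grammatical role and depth can be read off such dependency trees is sketched below in Python. The sketch makes several assumptions: the parses are hand-written rather than produced by a parser, they use Universal Dependencies style labels (for example `nsubj` and `nsubj:pass`), the multiword target is already merged into the single token `Institute`, and "depth" is taken to mean the number of arcs between the target and the root. The paper does not spell out its exact definition, and the function names are hypothetical.

```python
from collections import Counter

def depth(heads, idx):
    """Number of dependency arcs between token `idx` (1-based) and the root (head 0)."""
    d = 0
    while heads[idx - 1] != 0:
        idx = heads[idx - 1]
        d += 1
    return d

def colligation_profile(parsed_sentences, target):
    """Count dependency relations and tree depths of every occurrence of `target`."""
    relations, depths = Counter(), Counter()
    for forms, heads, deprels in parsed_sentences:
        for i, form in enumerate(forms, start=1):
            if form == target:
                relations[deprels[i - 1]] += 1
                depths[depth(heads, i)] += 1
    return relations, depths

# Hand-written toy parses (assumption): (word forms, 1-based head indices, relation labels).
parsed = [
    (["Institute", "organised", "event"], [2, 0, 2], ["nsubj", "root", "obj"]),
    (["event", "was", "organised", "by", "Institute"], [3, 3, 0, 5, 3],
     ["nsubj:pass", "aux:pass", "root", "case", "obl"]),
    (["students", "of", "Institute", "visited", "China"], [4, 3, 1, 0, 4],
     ["nsubj", "case", "nmod", "root", "obj"]),
]

relations, depths = colligation_profile(parsed, "Institute")
print(relations)
print(depths)
```

With real parser output, the same two counters would give the distribution of subject, object and modifier roles and the share of occurrences at each depth.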


The part of speech and grammatical depth of the target words in the sentence were identified and counted for colligation analysis. Finally, the LDA model was employed to extract ten topics


and the top ten words related to each topic from the text matrix of CIDC and GIDC. Euclidean distance was used to calculate the distance between the target words and the centres of each


topic for constructing the semantic association of the corpus. Together, these steps yield the collocation, colligation, and semantic association profiles of the two corpora.

STATISTICS ANALYSIS

The CIDC has 16,483 lemma types and 420,605 lemma tokens, while the GIDC has 25,834 lemma types and 436,140 lemma tokens (type-token ratios of roughly 3.9% and 5.9%, respectively). These figures suggest that the GIDC may have a more diverse vocabulary and potentially cover a wider range of topics or genres. Further analysis follows from three perspectives, collocation, colligation and semantic association, to determine whether there are significant differences in linguistic features or usage patterns between the two corpora.

FEATURES OF COLLOCATIONAL PRIMING

FEATURES OF CONCORD PATTERN

As shown in Table 1, the word _with_ has a high frequency across the lexical positions. _Collaboration_ and _Partnership_ are also used in different lexical positions with high frequency. The meaning “in the company or presence of sb/sth,” “doing sth together or of working


together towards a shared goal” has developed collocational priming with the Goethe-Institut. Such collocational priming shapes the image of an institution that communicates and cooperates with other national institutions. In addition, several country names, such as _Namibia_ and _Bangladesh_, appear in Table 1, helping to associate the Goethe-Institut with other countries. It is also noticeable that _German_, which stands for the national characteristics of the Goethe-Institut, appears only at L5, L4, R3 and R4, farther from the matching string than the other national names. Through repetition, all of the above helps the audience form a stable memory that the Goethe-Institut has excellent friendships and cooperation with various countries, while its Germany-specific attributes are forgotten. Moreover, _Mueller_ and _bhavan_ occur with high frequency and are closely tied to the Goethe-Institut; they refer to Max Mueller Bhavan, a language institute run by the Goethe-Institut in India. Additionally, _Kristen_, _Hackenbroch_, _director_ and _Bangladesh_, which appear in positions closely tied to the Goethe-Institut, refer to a person, Kristen Hackenbroch, who was employed by the Goethe-Institut in Dhaka, Bangladesh, and who performed research and taught urban studies. All these company and personal names imply for-profit


and privatization: “private acts are responsible for the institution’s works”.

The most notable aspect of Table 2 is the placement of the terms _China_ and _Chinese_, which explicitly refer to the country-specific characteristics of the Confucius Institute, in six lexical positions, including first- or second-ranked frequency at L4, R2, R3, and R4. The coverage thus repeatedly emphasizes that the “Confucius Institute belongs to China”. _Chinese_ develops a stronger lexical priming and semantic connection with the Confucius Institute than _German_ does with the Goethe-Institut in Table 1, where _German_ or _Germany_ appears only in positions with lower frequency and at a greater distance from the matching string. The word _university_ is also noteworthy: it ranks among the top three frequencies in almost all lexical positions, meaning that _university_ also collocates strongly with the Confucius Institute. Together with words such as _director_, _headquarters_, and _language_, these lexical items all relate to the function and operation of the Confucius Institute, implying that the Confucius Institute is strongly connected to what it does, which may lead readers to regard Confucius Institutes as "active" and "dominant" institutions.

PMI AND WORD VECTOR

As explained in Section 3, PMI and word vector similarity capture complementary aspects of collocation. Here, we further classify collocations into three categories: specific collocations (high similarity score only), general collocations (high PMI score only), and typical collocations (high similarity and PMI scores). Table 3 reveals


that most of the words that form general collocations with the Confucius Institute are closely associated with its operations, language teaching, and cultural exchange activities. For


instance, words such as _language_, _teachers_, _teaching_, and _students_ form a strong priming effect with the Confucius Institute, resulting in high PMI values. The underlying semantic


associations behind these collocations are elaborated in Section 4.2, where they are presented in tabular form. Apart from the words discussed above, it is worth noting the high PMI value of the word _Hanban_.

[Example 1]

1. …Confucius Institutes themselves are not merely agreements between foreign universities and the _Hanban_. Each institute is led by a Chinese partner university.
2. …Talal Abu-Ghazaleh Confucius Institute, in cooperation with Confucius Institute Headquarters (_Hanban_) and Shenyang Normal University, organized the Educators Delegation to China program for the fourth year.

Hanban served as the headquarters that was closely tied to the Confucius Institute. However,


following the restructuring of the Confucius Institute in 2020, Hanban, which was the organization responsible for the administration of the institute, was abolished. Nevertheless, the


negative connotations associated with the word _Hanban_ may have been inadvertently transferred to the Confucius Institute through collocational priming effects. In addition, _said_ appears in the cluster table and also ranks fairly high as an MI3 collocation.

[Example 2]

1. He _said_ that despite the differences in the political and social systems of the two countries, we had an excellent relationship.
2. Ambassador Zhang _said_ that China-Tunisia relations are traditional and friendly.
3. Professor Dr Khalid Iraqi _said_ the Chinese language has now become an international language and over time, the number of Chinese language learners is increasing.

It could be observed that _said_ predominantly appeared as quotations


from spokespersons within the context of news articles, rather than in the narrative itself. This finding suggests that news discourse often incorporates reported speech, which to some


extent reflects the neutral and fact-oriented stance of the news writers. However, it would be valuable to conduct further research on the specific topics and underlying attitudes conveyed


through the quotations behind _said_. Focusing solely on specific collocations, we found that words with vectors that are similar to those of the Confucius Institute can be classified into


two categories. The first category consists of place names linked to the primary sponsors of the Confucius Institute, such as _Hebei_, _Stellenbosch_, _Sargodha_, _Sichuan_, and _Chongqing_; the Chinese provincial names among them are likely related to the Chinese partner schools of Confucius Institutes. As Chinese province names are a typical example of “foreign place names”, the audience tends to


directly associate them with _Chinese_. In large-scale contexts, the subjective image of the Confucius Institute is still dominated by the Chinese side. The second category is related to


"institutional cooperation," such as _Nnamdi_, _Azikiwe_, _Lagos_, and "institutional construction," such as _China-built_, _department_, and _faculty_. It is important


to note that these specific collocations are distinct from the general collocations discussed earlier. They downplay the functions of Confucius Institutes, such as language and cultural dissemination, and foreground the establishment and expansion of Confucius Institutes, making them appear more aggressive. Moreover, it is worth noting the word _nonprofit_, which stands in stark contrast to the for-profit associations of the Goethe-Institut noted above. Confucius Institutes are dedicated to creating the image of a nonprofit academic institution. Finally, we found a marked difference between the specific and general collocations of the Confucius Institute: the overlap between the two tables is only approximately 15.5%, indicating that the general context primes the Confucius Institute quite differently from the microcontext. Focusing on


typical collocations, we found that the words _University_, _headquarters_, _held_, _Karachi_, _organized_, and _Hanban_ have a strong collocational relationship with the Confucius Institute


based on their PMI and word vector complementarity.

[Example 3]

1. …The event was co-_organised_ by the Pakistan Institute of China Studies (PICS) and the Confucius institute.
2. …The visit was _organised_ to renew the agreement for the Confucius institute and expand the future scope of collaboration with BFSU to potentially include undergraduate and graduate studies…

Through these


words, we can see that the related news coverage still mostly focuses on surface-level activities, and the associated events align with the image that Confucius Institutes wish to portray


as language and cultural promotion institutions. However, the collocations formed by the audience still centre around "what Confucius Institutes have done". The events are complex,


and a consistent and fixed direction for collocational priming has not yet formed. As shown in Table 3, the words forming the general collocation with the Goethe-Institut consist of country


and organization names, such as _Bangladesh_, _Namibia_, and _Chennai_. Some words are related to the core of "collaboration," such as _cooperation_, _collaboration_, and _with_.


These two groups of words have already been described under the concord pattern, and the image of the Goethe-Institut as a friendly collaborative organization is already established. It is worth noting that there are also words associated with the events hosted by the Goethe-Institut, such as _auditorium_, which point to the artistic and cultural events organized by this institution. In Table 4, the words fall into categories similar to those in Table 5: the specific words differ, but the overall categorization is similar.
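The PMI scoring and the three-way classification of collocations used in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: PMI is estimated from raw counts within a symmetric token window, the target is assumed to be pre-merged into a single token, the two word lists are invented toy data rather than the paper's tables, and "overlap" is computed as the share of the PMI list that also appears in the similarity list, which is only one possible definition. All function names are hypothetical.

```python
import math
from collections import Counter

def pmi_scores(tokens, target, window=5):
    """PMI between `target` and each word found within +/- `window` tokens of it."""
    n = len(tokens)
    word_freq = Counter(tokens)
    pair_freq = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if j != i and tokens[j] != target:
                pair_freq[tokens[j]] += 1
    # PMI = log2( P(word, target) / (P(word) * P(target)) ), with P estimated as count / n
    return {w: math.log2((joint * n) / (word_freq[w] * word_freq[target]))
            for w, joint in pair_freq.items()}

def classify_collocates(pmi_top, sim_top):
    """General = high PMI only, specific = high similarity only, typical = both."""
    pmi_set, sim_set = set(pmi_top), set(sim_top)
    return {"general": pmi_set - sim_set,
            "specific": sim_set - pmi_set,
            "typical": pmi_set & sim_set}

def overlap_share(pmi_top, sim_top):
    """Share of the PMI list that also occurs in the similarity list (assumed definition)."""
    pmi_set = set(pmi_top)
    return len(pmi_set & set(sim_top)) / len(pmi_set)

tokens = ("confucius_institute teaches language and confucius_institute "
          "promotes language while officials said nothing").split()
scores = pmi_scores(tokens, "confucius_institute", window=2)

# Hypothetical top lists, for illustration only.
pmi_top = ["partnership", "collaboration", "cooperation", "bangladesh", "namibia"]
sim_top = ["partnership", "collaboration", "cooperation", "unesco", "kultur"]
groups = classify_collocates(pmi_top, sim_top)
print(sorted(groups["typical"]), overlap_share(pmi_top, sim_top))
```

A low overlap share means that the words a target co-occurs with locally differ from the words that are distributionally similar to it across the corpus, which is how the 15.5% and 42.2% figures in this section are interpreted.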


Notably, cultural organizations such as _UNESCO_ and _Alliance Francaise_, which are similar to the Goethe-Institut, also appear in Table 4. The fact that the Goethe-Institut is listed


alongside these organizations in the specific collocation category indicates that its international and collaborative nature has been recognized, giving the audience a strong cue for classifying it. Our research finds that the specific collocations of the "Goethe-Institut" overlap substantially with its general collocations, reaching 42.2%. This


indicates that the macrocontextual environment of the Goethe-Institut is relatively consistent with its microcontextual collocation, contributing to the formation of its image. Focusing on


typical collocations, we find that _partnership_, _collaboration_, and _cooperation_ are all included, suggesting that the concept of "friendly cooperation" has already formed


strong collocational priming for the Goethe-Institut. Moreover, the Goethe-Institut only forms typical collocations with its executives, founding organization, and host country but not with


German institutions, indicating that the Goethe-Institut, as a language dissemination institution (at least on the surface), has completely lost its connection with the government. Admittedly, there are many German words in the table. However, we ensured that all the news texts collected for this study are in English, meaning that these German words appear in English texts. Moreover, the table shows that the relevant German words, such as _kultur_, unlike the Chinese province names in Table 6, differ in form from the English words that express their meanings. This suggests that the use of German is closely tied to the Goethe-Institut, and that such usage is already common in news reports from developing countries.

FEATURES OF SEMANTIC ASSOCIATION

Further examination of the above-mentioned collocations (especially the typical collocations) with AntConc 3.5.8 shows that the contexts related to the Confucius Institute (Table 5) can be classified into Event, Location, Relationship and Functions. Similarly, the contexts related to the Goethe-Institut


could be classified in the same way. The contextualization in Table 7 reveals that, among the contexts related to events, a large part of the coverage of Confucius Institutes focuses on the


establishment of Confucius Institutes in various countries and regions. These reports are objective, only stating the establishment of Confucius Institutes without lexical bias. In addition,


most reports target “places where the second Confucius Institute was built” rather than “the first Confucius Institute was established in certain places”, which implies that the existing Confucius Institutes operate well enough to justify opening a second one. This kind of implication can be a positive image-building semantic association. In the absence of other biases, the audience would assume that the Confucius Institute is a well-known institution. In addition, it is interesting to note that such semantic associations are often preceded and followed by praise for the work of Confucius Institutes in teaching language and promoting culture, which has contributed to the image building of the Confucius Institute. Moreover, this shows that media narratives about Confucius Institutes in developing countries are positive and that the Confucius Institute enjoys positive elements in public opinion in the developing world. Such positive elements also exist in other semantic associations:

[Example 4]

1. The Confucius Institute will be a _bridge_ to unite the people of the countries of Tanzania and…;
2. Professor Samuel Kwame Offei, commended the Confucius Institute for its role in _bridging_ the language gap between China and Ghana…

Obviously, in these contexts, public


opinion has given a great deal of recognition to the contribution of Confucius Institutes in promoting language learning, enhancing cultural exchange, and optimizing relations between


countries. From this perspective, the image of Confucius Institutes in developing countries tends to be favourable and friendly. However, these semantic associations from a macro level give


the audience the impression that this is an “act of state”. Furthermore, the functions and organization of Confucius Institutes are often found in contexts of language teaching and cultural


dissemination, part of which describes the organizational structure of Confucius Institutes (two universities in the two countries jointly establish the Confucius Institute, or the two


headquarters codirect Confucius Institutes in both countries). This description is consistent with the statement on the official website that the Confucius Institute is a nonprofit education


institution jointly hosted by Chinese and foreign partners. Some of the contexts focus on essential tasks such as “teaching language” and “spreading culture”, objectively showing the


school-running activities of Confucius Institutes. The statement conforms to the purpose of Confucius Institutes on the official website: “The aim of Communicating Chinese, the current


situation of Chinese language and culture, And promoting people-to-people exchanges between China and the rest of the world”. Table 8 illustrates a number of terms associated with projects


that the Goethe-Institut has sponsored, such as _kultur_, _project_, and _supported_. A search for these words shows that the projects behind them lie in different fields. For example, _School_ refers to the “Schools: Partners for the Future” (PASCH) initiative, a school network supported by the Goethe-Institut. The semantic associations of these projects thus span quite different areas, implying that the Goethe-Institut is involved in a wide range of activities. It can be said that the semantic association of projects in different fields has blurred the German cultural attributes of the Goethe-Institut and weakened its fundamental purpose. This blurring, which masks fundamental purposes and attributes, is further reinforced by the words for art-related activities. Many words representing locations, such as _Namibia_, _Chennai_, and _Nicosia_, or words with fewer denotations, such as _cultural_, _bildung_, _offer_, and _auditorium_, are closely related to art activities.

FEATURE OF COLLIGATIONAL PRIMING

The data presented in Tables 9 and 10 reveal some notable differences in the language used in news related to the Goethe-Institut and the Confucius Institute. The passive voice is far more common than the active voice for both the Goethe-Institut (91.19%) and the Confucius Institute (87.23%), although its share is lower for the Confucius Institute. This difference may reflect divergent attitudes and perspectives towards the two institutions.


It is possible that news stories about the Confucius Institute are more inclined to depict the institute as active and initiating action, while articles on the Goethe-Institut may portray it


as passive and accepting. These differing portrayals could reflect contrasting attitudes towards the Confucius Institute. Indeed, some critics have suggested that the institute is used as a


propaganda tool by the Chinese government. In terms of subject count, the Confucius Institute has a higher overall count, at 240 compared to 134 for the Goethe-Institut. However, in the breakdown by sentence depth, the Goethe-Institut has a higher share of first-level subjects (25.4%) than the Confucius Institute (17.08%), which has a higher share of third-level subjects (64.2%). This suggests that descriptions of the Confucius Institute may focus more on the actions and effects of the institution, while descriptions of the Goethe-Institut may place more emphasis on its identity as a specific institution. The "Confucius Institute" also appeared more frequently than the "Goethe-Institut" and showed higher complexity in terms of adjective count and attribute count. Specifically, the number of attributes and adjectives at three or more layers was much larger for the "Confucius Institute" (170) than for the "Goethe-Institut" (117), indicating that the "Confucius Institute" was described and explained more extensively, with a greater focus on general descriptive language. For example, the syntax trees show that the standard naming of Confucius Institutes is “CI in area/university”, and the corresponding grammatical structure (Table 11) is _Confucius_NNP Institute_NNP IN_, which is counted under the adjective and attribute counts. This kind of colligation makes the audience think that “CI is a part of a university” or “CI is an organization in a region”. In contrast, descriptions of the Goethe-Institut may place more emphasis on specific contents and characteristics.

FEATURE OF TOPICS AND SEMANTIC PROSODIES

When LDA is used to analyse text, all the words appearing in the text are usually mapped to the topic space. Since the number of topics in the text varies


and the proportion of different topics in the corpus is also different, the size of each circle in Fig. 4 represents the proportion of the topic in the text. The larger the circle is, the


more texts are related to that topic, and vice versa. Each topic has some topic words, which are the words with high frequency in the topic space. Each word is assigned a probability value,


indicating its relevance to each topic. The higher the probability value is, the greater the weight of the word in that topic. In this study, we extracted the top ten topics in the CIDC and


GIDC, the high-frequency topic words in each topic, and the vector similarity between the target words "Confucius Institute" and "Goethe-Institut" and each topic, as


shown in Tables 12 and 13. The smaller the distance from the target word to the topic is, the higher the correlation between the topic and the target word. As shown in Fig. 4, the


distribution of the top ten topics in the CIDC is quite diverse, indicating that the corpus covers a wide range of events. According to the values of topic similarity and the top words


associated with each topic, which are shown in Table 12, Topics 5, 8, 9, and 10 have high vector similarity with the Confucius Institute, indicating that these topics may be the semantic


prosodies of the Confucius Institute in the CIDC. By analysing the top words associated with each topic, we can categorize the contexts into three types. The first type is related to the role


of Confucius Institutes in the field of education. For example, Topic 8 is associated with the establishment and operation of Confucius Institutes in Sri Lanka, while Topic 6 is related to


the establishment of Confucius Institutes in Africa and the promotion of the Chinese language in Africa. The second type of context is related to sudden news events. For instance, Topic 5


discusses the attack on a Confucius Institute teacher in Karachi, Pakistan. The third type is related to the international relations impact of Confucius Institutes. For example, Topic 9 is


associated with medical projects and exchanges, while Topic 3 is related to the policies and political parties in the United States and their impact on Confucius Institutes. As shown in Fig.


5, the top 30 most salient terms in the GIDC corpus are led by _film_ and _art_, indicating that the Goethe-Institut is closely linked to its artistic activities in the large corpus


environment. This suggests that most reports focus on the artistic collaborations of the Goethe-Institut. The appearance of terms such as _films_, _artists_, and _music_ also confirms this.
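The distance-based ranking of topics reported in Tables 12 and 13 can be illustrated with a small Python sketch. It assumes that the target word and each topic centre have already been embedded in the same vector space; how the study derives these vectors from the LDA output is not specified here, so the coordinates below are invented and the function name `rank_topics` is hypothetical.

```python
import math

def rank_topics(target_vec, topic_centres):
    """Return (topic id, Euclidean distance) pairs, closest topic first."""
    distances = {tid: math.dist(target_vec, centre)
                 for tid, centre in topic_centres.items()}
    return sorted(distances.items(), key=lambda item: item[1])

# Invented two-dimensional coordinates, for illustration only.
target = (0.0, 0.0)
centres = {1: (3.0, 4.0), 2: (1.0, 0.0), 3: (0.0, 2.0)}

ranking = rank_topics(target, centres)
print(ranking)
```

The topics at the head of the ranking are the ones treated as candidate semantic prosodies of the target word.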


The high proportion of art-related topics in the GIDC further supports this idea. Interestingly, _German_ appears as the third most salient term, indicating that the Goethe-Institut is


closely associated with Germany. Terms such as _language_, _students_, and _learning_ suggest that the Goethe-Institut has not abandoned its original purpose of teaching language and


promoting German language and culture. Unlike the top 30 most salient terms in the CIDC corpus for the Confucius Institute, there is no explicit connection between the Goethe-Institut and


the _government_. Moreover, terms such as _international_, _invited_, and _service_ suggest that the role of the Goethe-Institut is more focused on what it has done rather than the impact it


has had. This contrasts with the more politically oriented language present in the CIDC corpus for the Confucius Institute. Focusing on the top ten themes in Fig. 4, it is evident that the


distribution of themes is more uniform and dispersed than that of the Confucius Institute, implying a broader range of content covered in the corpus. Based on the values in Table 13 for


topic similarity and the cues provided by top words, it can be inferred that Topic 7, Topic 10, Topic 6, Topic 8, and Topic 2 are the semantic prosodies of the Goethe-Institut in the GIDC


corpus. According to the topic words, the relevant contexts of the Goethe-Institut can be divided into two categories. The first category is closely associated with artistic activities, such


as Topic 7, which is related to international cultural exchange programs and art festivals in Germany. Topic 6 is related to films and shorts, involving news reports on film festivals or


exhibitions. Topic 2 pertains to music, theatre, and information related to music concerts or theatre festivals. The second category pertains to purely cultural activities and cooperation


with other institutions, such as Topic 10, which involves partner relationships and cultural project applications. The above activities are disconnected from "German language


teaching," and the Goethe-Institut has been established as an art and culture planning and cooperative institution.

DISCUSSION

Through a comparison of the different priming features of the CIDC and GIDC, this study concludes that public opinion and the image portrayed (or forced primings) are significant factors that contribute to the differences in priming features. The findings of this study are as follows:

1. Reports on the Confucius Institute in developing countries focus on its teaching activities and events, which are related to the functions and operations of the Confucius Institute. The lexical items that are frequently used in these reports, such as _director_, _headquarters_, _language_, and _nonprofit_, all contribute to the nonprofit academic institution image of the Confucius Institute. The verbs used in these reports are mostly neutral, with some positive connotations, which indicates that most of the reports view the teaching activities of the Confucius Institute positively. However, these reports generally take a superficial perspective and evaluate the events themselves, rather than the overall image of the institute.

2. Reports on the Goethe-Institut in developing countries shape the public image of the institute as an art curator and facilitator of art exchange, reflecting its friendly collaboration with other institutions. In the corpus as a whole, the contexts in which the Goethe-Institut appears correlate strongly with art activities. Furthermore, the collocation of the Goethe-Institut with non-German organizations such as _UNESCO_ and _Alliance Francaise_ suggests that its international and collaborative nature has been recognized, thus giving the audience a strong cue for classification.

3. The institutional image and activity themes of the Goethe-Institut have become somewhat fixed, while those of the Confucius Institute are still undergoing significant changes. The degree of overlap between PMI and word vector similarity indicates that the Goethe-Institut has far more typical collocations than the Confucius Institute, and that it has already formed the conditions necessary to develop a unique semantic association.

Therefore, based on Findings 1–3, it can be concluded that the Goethe-Institut has moved beyond its initial goal of being a language teaching institution, and its institutional image has become more fixed. In


contrast, the Confucius Institute is still largely associated with language teaching activities and language-based cultural promotion activities, and public perception of the institute is


still developing. The Confucius Institute remains closely tied to China and its government, while the Goethe-Institut has fully separated from the German government. Through an examination


of typical collocations, it could be found that _partnership_, _collaboration_, and _cooperation_ are commonly used in connection with the Goethe-Institut, indicating a strong association


with friendly cooperation. In contrast, the term _Hanban_ has strong collocational priming with the Confucius Institute, and the word itself has become a symbol of China’s soft power. As


_Hanban_ was originally a part of the Chinese government, news content about the Confucius Institute can still be associated with government policies, as indicated by words such as


_economic_, _power_, and _government_ that appear in the context of discussions about the institute. The Confucius Institute is portrayed as active and assertive, while the Goethe-Institut


is perceived more as a supportive and collaborative entity. In terms of colligation, the active voice is used more frequently in discussions of the Confucius Institute than in those of the


Goethe-Institut, and the same trend is observed in actual semantic associations. Discussions about the Confucius Institute focus on what it is actively doing, while discussions about the


Goethe-Institut emphasize collaborative activities with other organizations. The Chinese and Chinese-language attributes of the Confucius Institute have been externalized and are not


well accepted by audiences, while the German language and German attributes of the Goethe-Institut have to a certain extent been internalized and accepted within news reporting. The corpus


shows that both Chinese Pinyin and German words are used, but the former refers to the organizing body of the Confucius Institute, while the latter has a variety of meanings and is used in


news reporting without an obvious connection to the Goethe-Institut. The English name of the Confucius Institute could confuse the media and politicians and may lead to negative


perceptions. As noted on Wikipedia, "Some commentators argue, unlike these organizations, many Confucius Institutes operate directly on university campuses, thus giving rise to what


they see as unique concerns related to academic freedom and political influence." When combined with the deliberate rhetoric of "CI impedes academic freedom" created by


politicians and media, audiences in developing countries may be more inclined to accept a distorted image of the Confucius Institute. Based on these characteristics, we conclude that the


overall image of the Goethe-Institut is peaceful and creative, while that of the Confucius Institute has a stronger "otherness" attribute and is more assertive, with a government


association. Due to its variable priming feature, audience perceptions of the Confucius Institute are more susceptible to external influences. Given the measures and negative media coverage


against the Confucius Institute taken by developed countries such as the United States since 2014, we believe that there is a tendency for the image of the institute in news media coverage


in developing countries to be questioned. We have identified several factors that may contribute to these differences. Firstly, the representation of public opinion holds political


significance. It is evident that the discourse content of both institutions in developing countries is based on actual events and analyzed with relative impartiality. This connection between


public opinion and the national circumstances of developing nations is inseparable. While these nations are not entirely reliant on developed countries, they still require assistance from


other nations for their progress. As a result, the study’s findings indicate that developing nations tend to maintain a more "neutral" stance while critically describing and


evaluating both the Goethe-Institut and the Confucius Institute. However, it cannot be denied that ideological differences and national perspectives play a role, causing some developing


countries to be cautious about Confucius Institutes, albeit to a lesser extent than Western countries. In particular, the lack of understanding regarding China’s approach to language and


cultural advancement instills fear among them that the Confucius Institutes may infringe upon their institutional academic freedom. Hence, the aforementioned results indicate that developing


countries demonstrate a more critical and occasionally hostile attitude, presenting a country-specific and government-oriented image of public opinion towards Confucius Institutes. This


finding aligns with the conclusions of Zhou et al. (2018). We believe that the Confucius Institute’s identity as an "other" will persist, preventing it from fully achieving its intended


mission of improving global understanding of Chinese language and culture, fostering friendly relations between China and foreign countries, promoting multiculturalism worldwide, and


contributing to the creation of a harmonious world. Secondly, the preferences and attitudes of the audience are closely linked to the image presented in public opinion. While it is widely recognized


that the media subconsciously influences the audience, it is also important to acknowledge that the audience’s requirements and preferences can shape media coverage. In other words, the


media may selectively report information or even bias their coverage to align with the preferences of their viewers. Audiences make selective choices regarding media content based on their


personal preferences, indirectly impacting the media’s "living space." As a result of intensified market competition and a shift towards audience orientation, the media has been


compelled to reassess the attitudes and demands of the audience at all levels. Lastly, the nomenclature of the Confucius Institutes could contribute to some confusion. The organizational


structure of Confucius Institutes involves collaborations with local colleges and universities worldwide. The English name of Confucius Institutes, coupled with their unique operational


model, may attract significant attention on social media platforms. When social media users and politicians repeatedly mention the name without providing an explanation of the organization,


audiences may develop the perception that the Confucius Institute is directly penetrating the local university campus. Such contextual preconceptions add a sense of "intrusion" to


the image of Confucius Institutes, accompanied by deliberate rhetoric suggesting that they impede academic freedom. Consequently, audiences are more inclined to accept a distorted image of


Confucius Institutes. CONCLUSION In summary, this study examines the differences in lexical priming features between the Confucius Institute and Goethe-Institut in developing countries. The


study used the NOW corpus to gather web URLs from 2014 to 2023 matching the strings "Confucius Institute" and "Goethe-Institut". The news content retrieved from these URLs was split into two corpora, CIDC and GIDC. After data processing, the corpora were examined from four aspects: collocation, colligation, semantic association, and semantic prosody. The aim was to determine whether the lexical priming features of GIDC and CIDC differ, and which attitudes and public opinion images lie behind those primings.
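As a rough illustration of the collocation aspect summarized above, the following minimal Python sketch computes window-based pointwise mutual information (PMI) for the collocates of a node word. It is not the pipeline used in this study: the function name `pmi_collocates`, the window size, and the toy token list are assumptions introduced for illustration only, and probabilities are simply estimated as raw counts divided by corpus length.

```python
import math
from collections import Counter


def pmi_collocates(tokens, node, window=4):
    """Return {collocate: PMI} for words within +/- `window` tokens of `node`.

    Hypothetical helper for illustration; not the study's actual code.
    """
    n = len(tokens)
    word_freq = Counter(tokens)
    pair_freq = Counter()

    # Count every token that falls inside the window around each node occurrence.
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - window), min(n, i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_freq[tokens[j]] += 1

    # PMI = log2( P(node, word) / (P(node) * P(word)) ), with P estimated as count / n.
    return {
        word: math.log2((f_pair * n) / (word_freq[node] * word_freq[word]))
        for word, f_pair in pair_freq.items()
    }


# Toy, invented token list (an assumption, not data from CIDC or GIDC).
toy_tokens = (
    "the goethe-institut announced a partnership with a local theatre "
    "and the goethe-institut renewed its partnership with the city"
).split()

scores = pmi_collocates(toy_tokens, "goethe-institut", window=3)
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}\t{score:.2f}")
```

On a real corpus, collocates would normally also be filtered by a minimum co-occurrence frequency, since PMI tends to overrate rare words.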


This study revealed that the Goethe-Institut has evolved beyond its initial objective of being a language-teaching establishment, and its institutional identity has become more stable. In


contrast, the Confucius Institute is still largely recognized for language teaching and cultural dissemination activities. Public perception of the institute is still evolving, and it


remains closely tied to China and its government. The Goethe-Institut, on the other hand, has completely separated from the German government. While the Confucius Institute is portrayed as


active and assertive, the Goethe-Institut is perceived more as a supportive and collaborative entity. The English name of the Confucius Institute could potentially create confusion among


media and politicians and lead to negative perceptions. The overall image of the Goethe-Institut is peaceful and creative, while that of the Confucius Institute has a stronger sense of


"otherness" and involves more assertiveness due to its association with the government. In news media coverage of developing countries, there is a tendency for the image of the


Confucius Institute to be questioned. This research has certain limitations. Firstly, the developing countries examined in this study are limited to those covered by the NOW corpus (India, Sri Lanka, Pakistan, Bangladesh, Malaysia, Philippines, South Africa, Nigeria, Ghana, and Kenya). Because of this geographical limitation, the findings may deviate from public opinion in developing countries in general. Secondly, the data analysis has limitations. For instance, in the analysis of _said_, neither concordance line analysis nor extensive quotation processing with illustrative examples was used. As a result, there is room for future research to delve


into the specific topics and associations found within quotations from spokespersons as well as those present in the narrative of news articles. Thirdly, it should be noted that alternative


measures of collocation and semantic association exist. For instance, Gries (2013) introduced an association measure that identifies asymmetric collocations and distinguishes between high and low association strengths, and Gries (2019) proposed tupleization as a research program that analyzes multiple dimensions of information. Both show that different collocation metrics can yield different top-ranked collocations. In this study, by contrast, the analysis was restricted to word2vec and LDA. Future research could therefore compare and contrast computational methods common in natural language processing with more traditional corpus linguistics methods. Additionally, it is


hoped that the Confucius Institute, as a window for China’s foreign language teaching and cultural dissemination, can be more objectively understood and recognized. DATA AVAILABILITY The


datasets generated and/or analysed during the current study are available in the GitHub repository: https://github.com/Bellahhm/Datasets-of-Otherness-and-Suspiciousness.git.


REFERENCES

* Acquaye JB (2020) Western perceptions on Confucius Institute advancement of Chinese language and culture: a narrative review. US-China Educ Rev 10(5):185–199
* An R, Xu M (2015) News discourse analysis of the suspension of Confucius Institute at the University of Chicago. Int Commun 2:43–45
* Brazys S, Dukalskis A (2019) Rising powers and grassroots image management: Confucius Institutes and China in the media. Chin J Int Politics 12(4):557–584. https://doi.org/10.1093/cjip/poz012
* Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3(Jan):993–1022
* Drake J (2009) A linguistic account of word play: the lexical priming of pinning. J Pragmatics 41:794–809. https://doi.org/10.1016/j.pragma.2008.09.025
* Gill B, Huang Y (2006) Sources and limits of Chinese “Soft Power”. Survival 48(2):17–36. https://doi.org/10.1080/00396330600765377
* Gries ST (2013) 50-something years of work on collocations: what is or should be next…. Int J Corpus Linguist 18(1):137–166. https://doi.org/10.1075/ijcl.18.1.09gri
* Gries ST (2019) 15 years of collostructions: some long overdue additions/corrections (to/of actually all sorts of corpus-linguistics measures). Int J Corpus Linguist 24(3):385–412. https://doi.org/10.1075/ijcl.00011.gri
* Gruenenfelder TM, Recchia G, Rubin T, Jones MN (2016) Graph-theoretic properties of networks based on word association norms: implications for models of lexical semantic memory. Cogn Sci 40(6):1460–1495. https://doi.org/10.1111/cogs.12299
* Huang W, Lien D, Xiang J (2019) The power transition and the U.S. response to China’s expanded soft power. Int Relat Asia-Pacific 24(2):249–266. https://doi.org/10.1093/irap/lcz008
* Harting F (2014) Confucius Institutes as innovative tools of China’s cultural diplomacy. Chinese Politics and International Relations. Routledge, pp.121–144. https://doi.org/10.4324/9781315866734
* Hoey M (2005) Lexical priming: A new theory of words and language. Routledge, London
* Hoey M, O’Donnell MB (2008) Lexicography, grammar, and textual position. Int J Lexicogr 21(3):293–309. https://doi.org/10.1093/ijl/ecn025
* Jantunen JH, Brunni S (2013) Morphology, lexical priming and second language acquisition: a corpus-study on learner Finnish. Twenty Years of Learner Corpus Research: Looking back, Moving ahead. Louvain-la-Neuve: Presses universitaires de Louvain, pp.235–245
* Jantunen JH (2017) Lexical and morphological priming: A holistic phraseological analysis of the Finnish time expression kello. In M Pace-Sigge, & KJ Patterson (Eds.), Lexical Priming: Applications and advances (pp. 254–272). John Benjamins. Studies in Corpus Linguistics, 79. https://doi.org/10.1075/scl.79.10jar
* Jones MN, Gruenenfelder TM, Recchia G (2011) In defense of spatial models of lexical semantics. In Proceedings of the annual meeting of the cognitive science society (Vol. 33, No. 33)
* Jatnika D, Bijaksana MA, Suryani AA (2019) Word2vec model analysis for semantic similarities in English words. Procedia Comput Sci 157:160–167. https://doi.org/10.1016/j.procs.2019.08.153
* Kaszubski P (2007) Michael Hoey. Lexical priming: a new theory of words and language. Funct Lang 14(2):283–294. https://doi.org/10.1075/fol.14.2.12kas
* Kluver R (2014) The sage as strategy: nodes, networks, and the quest for geopolitical power in the Confucius Institute. Commun Culture Critique 7(2):192–209. https://doi.org/10.1111/cccr.12046
* Kumar AA, Balota DA, Steyvers M (2020) Distant connectivity and multiple-step priming in large-scale semantic networks. J Exp Psychol: Learn Memory Cogn 46(12):2261. https://doi.org/10.1037/xlm0000793
* Leedham M, Cai G (2013) Using a corpus approach to explore the influence of teaching materials on Chinese students’ use of linking adverbials. J Second Lang Writing 12:374–389. https://doi.org/10.1016/j.jslw.2013.07.002
* Li H, Mirmirani S, Ilacqua JA (2009) Confucius Institutes: distributed leadership and knowledge sharing in a worldwide network. Learn Organization 16:469–482. https://doi.org/10.1108/09696470910993945
* Li K, Dai C (2011) Report of the U. S. Public Opinion of the Confucius Institute. World Economics and Politics (07):76–93+157–158
* Lien D, Oh CH, Selmier WT (2012) Confucius institute effects on China’s trade and FDI: isn’t it delightful when folks afar study Hanyu? Int Rev Econ Financ 21:147–155. https://doi.org/10.1016/j.iref.2011.05.010
* Lien D, Co CY (2013) The effect of Confucius Institutes on U.S. exports to China: a state level analysis. Int Rev Econ Financ 27:566–571. https://doi.org/10.1016/j.iref.2013.01.011
* Louw B (1993) Irony in the Text or Insincerity in the Writer? The Diagnostic Potential of Semantic Prosodies. In M Baker, G Francis, & E Tognini-Bonelli (Eds.), Text and Technology: In Honour of John Sinclair (pp. 157–176). Amsterdam: Benjamins. https://doi.org/10.1075/z.64.11lou
* Liu C, Zeng L (2017) Critical discourse analysis of news reports on Confucius Institutes in mainstream media in the United States. Int Commun (01):76–78
* Liu Y (2014) Research on China-related public opinion from the perspective of national cultural security: Taking the New York Times’ Report on Confucius Institutes as an Example. Academic Exchange (4):200–203
* Min YE (2012) A comparison between Confucius Institute and Cervantes Institute: the modern awareness in Chinese culture and the post-colonialism in Spanish culture. Contemp Foreign Lang Stud 3:41
* Paradise JF (2009) China and international harmony: the role of Confucius Institutes in Bolstering Beijing’s soft power. Asian Survey 49(4):647–669. https://doi.org/10.1525/as.2009.49.4.647
* Pace-Sigge M (2018) Spreading activation, lexical priming and the semantic web: early psycholinguistic theories, corpus linguistics and AI applications. Cham, Palgrave Macmillan, Switzerland. https://doi.org/10.1007/978-3-319-90719-2
* Patterson KJ (2016) The analysis of metaphor: to what extent can the theory of lexical priming help our understanding of metaphor usage and comprehension? J Psycholinguist Res 45(2):237–258. https://doi.org/10.1007/s10936-014-9343-1
* Patterson K (2018) Understanding Metaphor through Corpora: A Case Study of Metaphors in Nineteenth Century Writing (1st ed.). Routledge, London. https://doi.org/10.4324/9781351241090
* Patterson K, Pace-Sigge M (2017) Lexical Priming: Applications and Advances. (1 ed.) (Series in Corpus Linguistics). John Benjamins. https://doi.org/10.1075/scl.79
* Peng F, Yu X (2016) The image and discourse system of Confucius Institutes reported by British mainstream media. Academic Exploration (11):112–119
* Selezneva NV (2021) Learning Chinese in Vietnam: the role of the Confucius Institute. Rus J Vietnam Stud 5(4):71–86. https://doi.org/10.54631/VS.2021.54-71-86
* Starr D (2009) Chinese language education in Europe: the Confucius Institutes. Eur J Educat 44:65–82. https://doi.org/10.1111/j.1465-3435.2008.01371.x
* Xie T, Page BI (2013) What affects China’s National Image? A cross-national study of public opinion. J Contemp China 22:850–867. https://doi.org/10.1080/10670564.2013.782130
* Xing L, Zhao J (2021) New media and international communication of China’s national image. Xiandai Guoji Guanxi (11):51–59+61
* Yan X (2018) The Change of Public Opinion Environment for the Development of Confucius Institutes: Based on the Analysis of Chinese and Foreign newspapers’ reports on Confucius Institutes from 2005 to 2014. Chinese Culture Overseas Communication (02):219–226
* Ye Y (2015) The Image of the Confucius Institute in Foreign Media Reports. J Sichuan University (03):48–57
* Yeh Y, Wu C, Huang W (2021) China’s soft power and U.S. public opinion. Econ Political Stud 9:447–460. https://doi.org/10.1080/20954816.2021.1933766
* Yuan Z, Guo J, Zhu H (2016) Confucius Institutes and the limitations of China’s global cultural network. China Information 30(3):334–356. https://doi.org/10.1177/0920203X16672167
* Zhang D, He Y (2016) Public opinions over Confucius Institutes in the International Arena: an analysis of relevant media reports in Western Countries. Renmin University China Education J (01):91–110
* Zhang W (2021) Public opinion dilemma of Confucius institutes in the new situation: characteristics causes and countermeasures. Modern Commun 43(03):20–26
* Zhou T, Wen Y, Jia W (2018) The Dilemma and Countermeasures of International Public Opinion Related to China from the Perspective of “Other-plastic”: A Case Study of US Media Reports Related to Confucius Institutes. Int Commun (04):19–21
* Zhou Y, Luk SC (2016) Establishing Confucius institutes: a tool for promoting China’s soft power? J Contemp China 25:628–642. https://doi.org/10.1080/10670564.2015.1132961


ACKNOWLEDGEMENTS The author thanks Wang Jun-Ling for constructive discussions and supervisor Wu Cheng-Nian for great support. This research was supported by the Centre for Language Education and Cooperation of China (No.YHJC21ZD-011). AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * School of International Chinese Language Education at Beijing Normal University, Beijing, China: Ming Huang CORRESPONDING AUTHOR


Correspondence to Ming Huang. ETHICS DECLARATIONS COMPETING INTERESTS The author declares no competing interests. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with


regard to jurisdictional claims in published maps and institutional affiliations. RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative


Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the


original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in


the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended


use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit


http://creativecommons.org/licenses/by/4.0/. ABOUT THIS ARTICLE CITE THIS ARTICLE Huang, M. Otherness and suspiciousness: a comparative study of public opinions between the Confucius Institute and Goethe-Institut in developing countries. _Humanit Soc Sci Commun_ 10, 428 (2023). https://doi.org/10.1057/s41599-023-01920-7 * Received: 15 November 2022 * Accepted: 05 July 2023 * Published: 19 July 2023 * DOI: https://doi.org/10.1057/s41599-023-01920-7