
ABSTRACT

This study aims to examine the differences in lexical priming features between Confucius Institutes and Goethe-Instituts in developing countries using lexical priming theory and
natural language processing techniques. Drawing on news media coverage collected from 2014 to 2023, this study analyses the corpus through collocation, colligation, semantic association, and
semantic prosody. The results show that the Goethe-Institut has a more stable institutional identity, while the Confucius Institute is still largely recognized for language
teaching and cultural dissemination activities. The association of the Confucius Institute with China and its government creates a stronger sense of "otherness" and leads to
negative perceptions. Overall, this study contributes to a better understanding of public perception and institutional image in developing countries.

As a result of the spiral of silence, the blockage of semiotic hegemony and linguistic alliance, and the absence of Chinese
capacity for worldwide promotion, the Confucius Institute faces challenges in the current era. According to a 2019 Pew Research Center survey, developing countries view China more favourably
than developed countries do. To overcome the public opinion dilemma that currently exists under such a communication pattern, it is necessary to examine how the Confucius Institute is
depicted through language in social media in developing countries. Discourse analysis has been widely used to study public opinion and institutional image. However, previous studies on the
public perception of Confucius Institutes have exhibited gaps in research methodologies and a lack of comparative analyses. Therefore, this study aims to address these gaps by incorporating
a control group, the Goethe-Institut, and utilizing Hoey’s lexical priming theory with the help of artificial intelligence. The research questions of this study include the differences in
lexical priming features between news reports on Confucius Institutes and Goethe-Institut, their impact on the audience’s attitudes and stances, and the way in which these features shape
public opinion images of the institutions.

LITERATURE REVIEW

REVIEW OF STUDIES ON CONFUCIUS INSTITUTE

Confucius Institutes (CI) are public educational and cultural promotion programs that
are funded and arranged by the Chinese International Education Foundation. The program was started in 2004 and supported by the Chinese Ministry of Education-affiliated Hanban (renamed
the Center for Language Education and Cooperation in 2020), which cooperated with universities. The objectives of this program include fostering cross-cultural interactions, supporting local
Chinese teaching abroad, and promoting the Chinese language and culture.

REVIEW OF STUDIES ON THE PROBLEMS AND INFLUENCE OF CONFUCIUS INSTITUTES

The majority of studies on Confucius
Institutes (CI) published in English concentrate on communication across cultures, national strategic analysis, and teaching.

1. Language teaching: In these studies, the advantages of Confucius Institutes as educational institutions are highlighted, leading to a more favourable perception of them (e.g., Starr 2009; Selezneva 2021).
2. Strategic analysis: Some academics take a macro viewpoint when performing strategic analysis (e.g., Lien et al. 2012; Huang et al. 2019). Confucius Institutes are viewed more neutrally and critically since they are seen as diplomatic weapons, and their political implications are emphasized. The idea of Confucius as a "gentle teacher" contradicts Chinese intentions, according to Harting (2014), who asserts that Confucius Institutes have political aims rather than cultural aims and do not represent the "real" China. Additionally, several studies examine the commercial activities of the Confucius Institute and its effects on Sino-foreign commerce, treating it as an economic instrument (e.g., Li et al. 2009; Lien and Co 2013).
3. Cultural
communication: The research findings from this viewpoint are split into two opposing groups. One holds that the Confucius Institute significantly strengthens China’s soft power and that China has
been using the Confucius Institute to improve its reputation among the international public (e.g., Gill and Huang 2006; Kluver 2014). By comparing regions with and without Confucius Institutes, Brazys and Dukalskis (2019) suggested that the
Institutes systematically improved media attitudes towards China on a worldwide scale. However, other findings suggested that it is
challenging to determine whether the construction of Confucius Institutes is helping China make friends around the world (e.g., Paradise 2009; Selezneva 2021). Xie and Page (2013) examined
data from the 2007 Pew Global Attitudes Survey and found no evidence to substantiate the hypothesis that the construction of Confucius Institutes in 35 nations had a positive effect on
Chinese diplomacy. According to Zhou and Luk’s (2016) study, audiences are not particularly attracted to China’s soft power or to CIs. Some scholars even assert that the Chinese culture taught
at Confucius Institutes is a unique incarnation of Chinese policy and that Confucius Institutes interfere with academic freedom (e.g., Acquaye 2020; Yeh et al. 2021). Overall, these
investigations mostly carry an ideological bias against Confucius Institutes and frequently pre-emptively attack their role in disseminating political propaganda.

REVIEW OF STUDIES ON PUBLIC OPINIONS TOWARDS CONFUCIUS INSTITUTES

In 2011, a study on the perceptions of Confucius Institutes abroad was conducted. Li and Dai (2011) discovered that the topic and news content
determined whether the U.S. media made positive or negative comments regarding Confucius Institutes in its coverage of the institutions from 2005 to 2010. Since that time, there has been an
increase in the number of studies. The study topics have ranged from media coverage in a particular nation to regional or global statistical analysis to public opinion surveys. Among them,
the study of public opinion in a single nation started with that of the United States (An and Xu 2015; Liu and Zeng 2017; Zhang 2021), and it progressively expanded to include analyses
of British (Peng and Yu 2016), Spanish (Min 2012), and Australian and Canadian (Zhang and He 2016) public opinion. Additionally, some studies adopted a global outlook, gathering copious news stories
without considering the news source to synthesize public opinion and create a worldwide perception of the Confucius Institute. Yan (2018), for instance, summarized the changes in the public
opinion environment of Confucius Institutes over the period of a decade by conducting a comparative analysis of press reporting on the institutions from 2005 to 2014. Additionally, the
research methodologies have been improved, moving from qualitative to quantitative analysis and from content-based to discourse analysis. Content-based analysis is a popular analytical technique in
communication science that is well suited to examining the environment of public opinion. As a result, numerous researchers have conducted quantitative analyses of
foreign media and their reports using the content-based analysis approach, assessing the stances of the reports’ content, and drawing conclusions about trends in the stances of
the media reports (e.g., Zhang and He 2016). As researchers have progressively recognized how strongly news discourse affects audiences, the number of studies employing discourse analysis techniques such as critical discourse analysis has steadily expanded (e.g., Ye 2015; Liu
and Zeng 2017). For instance, Zhang (2021) examined the traits and propensities of the coverage of
Confucius Institutes by the New York Times in a recent study, focusing on the connection between reporters’ attitudes and social ideology. Furthermore, in the first half of 2021, Zhang
brought lexical priming theory, a different type of discourse analysis methodology, into the research and evaluated a number of news articles about Confucius Institutes in Western media. The
conclusions of research on public perceptions of Confucius Institutes are divided. Most of the research concluding that the media shaped public opinion to create a
passive image of society used American media as its subject. The results of these papers may be summarized as follows: foreign media politicized the interpretation of Confucius Institutes
(Liu 2014). According to Zhou et al. (2018), Confucius Institutes are portrayed as institutions with political characteristics that disseminate ideology and interfere with academic freedom
(e.g., Yuan et al. 2016; Xing and Zhao 2021). In contrast, a considerable number of studies have revealed that the public’s perception of Confucius Institutes is either neutral or positive
(e.g., Ye 2015). However, most of the studies found that there were distinct segments of public opinion. For instance, Liu and Zeng (2017) separated the sample reports into two segments:
"Chinese language education and cultural promotion" and "public diplomacy and soft power." Although these public opinion findings indicated that Confucius Institutes had
a generally positive reputation, there were also doubts regarding political and academic meddling (Zhang and He 2016). As soon as such themes or issues were associated, the perspectives
abruptly switched. In summary, scholars have paid attention to how Confucius Institutes are perceived by the global public and have compiled their findings in a number of studies. However,
objective, controlled quantitative comparison remains inadequate, and content and discourse analysis rely heavily on the researcher’s interpretation. In addition, in
earlier discourse analyses, researchers paid the most attention to the high-frequency terms (also known as content words) in the whole report since they might indicate the news value and
overall theme. Zhang (2021) notes the drawbacks of considering only high-frequency textual words when the news reports’ substance might not always be pertinent to the institutes being
targeted. Focusing on collocations (content terms with semantic associations) helps address this deficiency by revealing which words and phrases are frequently directly associated with the
targeted institutes.

REVIEW OF THE LEXICAL PRIMING THEORY

In 2005, British linguist Michael Hoey systematically set out the theory of lexical priming. It was formed against the background of an
increasing awareness that traditional views of the vocabulary of English were out of kilter with the facts about lexical items that are routinely being brought up by corpus investigation
(Hoey 2005). According to Hoey, the core idea of lexical priming is described as follows:

> Every time we use a word, and every time we encounter it anew, the experience either reinforces the priming by confirming an existing association between the word and its co-texts and contexts, or it weakens the priming, if the encounter introduces the word in an unfamiliar context or co-text. (Hoey 2005: 9)

In other words, every time we encounter a word or phrase, we store it along with all the words that accompanied it and with a note of the
kind of context in which it was found (Kaszubski 2007). Thus, when context is analysed through lexical priming theory, it can be divided into three parts: collocations, colligations and
semantic associations. Since the publication of his book, Michael Hoey’s theory of lexical priming has been applied in various ways. He himself explored the relationship between lexical
priming and creativity (e.g., Hoey and O’Donnell 2008), as well as its impact on second-language learning and in languages other than English. Other researchers, such as Jantunen and Brunni
(2013) and Jantunen (2017), have extended the theory to include morphology, and corpus-based lexical priming theory has been applied to different kinds of text research (e.g., Leedham and
Cai, 2013). Pace-Sigge (2018) and Patterson (2016; 2018) have looked at evidence of lexical priming in spoken English and its connection to metaphor use. The book _Lexical Priming:
Applications and Advances_, which was edited by Patterson and Pace-Sigge in 2017, covers a wide range of topics of further applications and advances of lexical priming theory. Several
studies have introduced lexical priming into news report analysis. To understand how lexical priming is used in practice, Drake (2009) examined a large amount of naturally occurring
phraseplay from a newspaper corpus. He then used corpus-based lexical priming theory on news reports. Zhang (2021) has applied the theory by examining the frequency of priming prepositions
and nouns in news stories to elucidate the attitude and posture of American media towards Confucius Institutes. Specifically, in the book _Lexical Priming: Applications and Advances_
(Patterson and Pace-Sigge 2017), the article _Forced lexical primings in transdiscoursive political messaging_ written by Alison Duguid & Alan Partington discusses how forced lexical
priming is generated and received in political messaging across disciplines, thus affecting listeners. In 2018, Michael Pace-Sigge delved deeper into the relationship between linguistics and
artificial intelligence in the book _Spreading Activation, Lexical Priming and the Semantic Web_. This book illustrated that linguistic knowledge supports the ability of computing devices
to process human language. In turn, these electronic devices are increasingly approaching the creation of a mirror image of language processing, thus supporting the foundational theory of
language structure. These books inspired the methodology of this paper. It can be inferred from the previous discussion that discourse analysis methods used to examine news articles about
Confucius Institutes still suffer from inadequacies. Additionally, there has been a dearth of substantial research comparing the public opinion of Confucius Institutes to that of other
institutions, highlighting the need for further investigation in this area. Furthermore, comparative analysis is an approach commonly used to increase the accuracy and validity of data
analysis. As such, this study incorporates a control group, the Goethe-Institut, into the analysis to mitigate potential gaps and increase the study’s validity. Moreover, inspired by Michael
Pace-Sigge’s work in 2018, this study aims to incorporate artificial intelligence into Hoey’s lexical priming theory. This will enable the expansion of the corpus of text types and volumes
that can be analysed using lexical priming theory. By integrating insights from corpus linguistics and natural language processing, this study seeks to apply lexical priming theory to
explore the relationship between public opinion and institutional image. This paper aims to provide answers to the following questions:

(1) What are the differences in lexical priming features in the collocation, colligation, semantic association and semantic prosody of news reports on the Confucius Institute and the Goethe-Institut?
(2) How can these lexical features influence audiences’ attitudes and stances differently?
(3) How do these features shape the public opinion images of the CIDC and GIDC?
(4) What are the differences in public opinion images of the CIDC and GIDC?

THEORETICAL FRAMEWORK

Lexical priming theory proposes an association-based network model that represents words as nodes in a large memory network,
with similar words connected to each other via edges (Kumar et al. 2020). This theory involves a spread of activation, where activation spreads from one concept to related concepts along
associative and semantic pathways. Although evidence suggests that association-based network models capture complementary semantic information compared to text-based distributional models
(Gruenenfelder et al. 2016), their validity has been questioned on the grounds of being constructed from retrieval-based processes involved in word association tasks (Jones et al. 2011).
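
The spread of activation through such a network can be illustrated with a toy example. The following is only a minimal sketch, not the model used in this study: the words, the edge weights, the decay factor and the number of steps are all hypothetical values chosen for illustration.

```python
from collections import defaultdict

# Hypothetical association network: keys are word nodes, values map each
# neighbour to a made-up association strength between 0 and 1.
NETWORK = {
    "institute": {"language": 0.8, "culture": 0.6},
    "language": {"teaching": 0.7, "institute": 0.8},
    "culture": {"festival": 0.5, "institute": 0.6},
    "teaching": {"language": 0.7},
    "festival": {"culture": 0.5},
}


def spread_activation(network, source, steps=2, decay=0.5):
    """Spread activation outwards from `source` for a fixed number of steps.

    At every step each active node passes activation to its neighbours,
    scaled by the edge weight and an (assumed) constant decay factor.
    """
    activation = defaultdict(float)
    activation[source] = 1.0
    frontier = {source: 1.0}
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, level in frontier.items():
            for neighbour, weight in network.get(node, {}).items():
                next_frontier[neighbour] += level * weight * decay
        # Accumulate what each node received during this step.
        for node, level in next_frontier.items():
            activation[node] += level
        frontier = next_frontier
    return dict(activation)


levels = spread_activation(NETWORK, "institute")
for word, level in sorted(levels.items(), key=lambda item: -item[1]):
    print(f"{word}: {level:.3f}")
```

In this sketch, words directly linked to the source receive more activation than words two steps away, which mirrors the idea that closely associated concepts are primed more strongly than distant ones.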
Therefore, a complete account of semantic memory should include an explanation of how such associations are formed and how the complex network structure that successfully explains
behavioural performance in semantic tasks is acquired. To address this issue, this study seeks to combine lexical priming theory and the word2vec model, which serves as a priming performance
for distant concepts. By doing so, this study aims to provide a comprehensive account of semantic memory and to gain insights into the underlying attitude behind the primings.

THE PROCEDURE OF LEXICAL PRIMING THEORY

Hoey (2005: 7) states that “collocation is pervasive” and that “any explanation for the pervasiveness of collocation has to be psychological, as … [it is] a
psychological concept”. The term “collocation” refers to a psychological association between words that are up to five words apart. This is evidenced by their occurrence together
in corpora more often than is explicable in terms of random distribution (Hoey 2005: 8). Because collocation is strongly tied to the psychological phenomenon of priming that results from a
language user’s frequent encounters with words, a concordance program can be used to determine the position of lexis. Therefore, a concordance is used in this paper to examine the primings that
underlie collocation. In addition, statistical probabilities can be taken into consideration (a mutual information score is just one of many possible options): a machine can give
predictions as to the degree of likelihood that WORD appears with any other word (Pace-Sigge 2018). Thus, pointwise mutual information (PMI) is used in this study to assess whether the
connection between a candidate collocate and the matching string is strong enough. PMI quantifies the degree to which two words appear together more frequently than
would be expected if they occurred independently in a corpus. In other words, PMI is a useful technique for identifying words that are semantically related to a particular phrase. As lexical priming focuses on the
psychological element of word choice, semantic association was defined by Hoey as follows:

> [semantic association] exists when a word or word sequence is associated in the mind of a language user with a semantic set or class, some members of which are also collocates for that user. (Hoey 2005: 24)

This definition highlights the status of collocates related to
specific keywords. The degree to which collocates are associated with the matching string serves as the foundation for determining the semantic associations. Therefore, pointwise mutual
information (PMI) can also be used to locate and summarize the matching string and to conclude the associations. It should be noted that this paper adopts the MI3 (Mutual Information Cubed)
calculation formula implemented in WordSmith 8.0: \(MI3 = \log_2\left( J^3 E / B \right)\), where \(J\) is the joint frequency; \(F_1\) is the frequency of word 1; \(F_2\) is the frequency of word 2;
\(F_{total}\) is the total number of tokens; \(E = J + \left( F_{total} - F_1 \right) + \left( F_{total} - F_2 \right) + \left( F_{total} - F_1 - F_2 \right)\); and \(B = \left( J + \left( F_{total} - F_1 \right) \right)\left( J + \left( F_{total} - F_2 \right) \right)\). Compared to Mutual Information, MI3 offers several distinct advantages. MI3 emphasizes joint
frequency by raising it to the power of three, prioritizing associations with robust co-occurrence. It considers total token counts through term E, providing a comprehensive understanding of
associations within the corpus. MI3 ensures enhanced precision by incorporating term B, capturing distinctiveness and minimizing chance associations. Adopting the MI3 formula deepens our
understanding of word associations, enhancing linguistic analysis accuracy. However, Hoey (2005) notes that semantic association can be particularly affected by local collocations that
might not appear in an average corpus or act as a complimentary (or, conversely, discourteous) complement to words or word sequences. The similarity of the word vectors calculated by the
word2vec model could compensate for the limitations of PMI in particular semantic contexts and be used to measure the semantic association in a specific semantic environment. Notably, semantic
prosody, the meaning conveyed by collocational links, typically conveys meanings that encode attitudes and evaluations. Semantic prosody goes further, as it highlights that a large number
of words in use have an underlying, subconscious prosody that, according to Louw, only became visible once computers made large-scale concordancing possible (Louw 1993). In this research, we
assumed that semantic prosody is defined as the larger textual context associated with the words in the corpus. The TOPICs extracted by LDA that are closely related to WORD and its stances
and attitudes are regarded as the semantic prosody (this will be explained later in “Word2vec model and LDA model”). Colligation concerns the grammatical functions preferred or
avoided by the group in which the word or word sequence participates (Hoey 2005). Hoey orients his definition of colligation towards Halliday’s use, i.e., “the relation [held] between a word
and a grammatical pattern” (Hoey 2005). Based on this, he defined colligation as follows:

1. the grammatical company a word or word sequence keeps (or avoids keeping) either within its own group or at a higher rank;
2. the grammatical functions preferred or avoided by the group in which the word or word sequence participates; and
3. the place in a sequence that a word or word sequence prefers (or avoids) (Hoey 2005).

To determine colligation, StanfordNLP was used in this study to carry out dependency parsing to analyse the grammatical structure of a sentence
and address the three aspects of grammatical company, grammatical functions, and the preferred position of a word or word sequence. Dependency parsing involves identifying the
syntactic relationships between words and representing them as nodes and edges, which reflect the grammatical roles of words in the sentence.

WORD2VEC MODEL AND LDA MODEL

WORD2VEC MODEL

The
word2vec model is an unsupervised shallow neural network model that is used in natural language processing to represent words as vectors. The model consists of only an input layer, a hidden
layer, and an output layer. In this model, a neural network is used to learn the relationships between words, and they are mapped to vectors in a high-dimensional space, such that the
proximity of words in the space reflects their semantic similarity. The word2vec model includes two main architectures: continuous bag of words (CBOW) and skip-gram (Jatnika et al. 2019). The CBOW
model predicts the target word \(\omega _t\) given its surrounding context words \(\omega _{t - 2}\), \(\omega _{t - 1}\), \(\omega _{t + 1}\), \(\omega _{t + 2}\). On the other hand, the
skip-gram model predicts the context words \(\omega _{t - 2}\), \(\omega _{t - 1}\), \(\omega _{t + 1}\), and \(\omega _{t + 2}\) given the target word \(\omega _t\). The structures of the
two neural network models are shown in Fig. 1. In the word2vec model, word vectors are designed to capture the semantic meaning of words, so that the representations of
similar words are located closer together within the vector space. This proximity reflects the underlying semantic similarity between words. Word vector similarity
is commonly calculated using cosine similarity, which measures the cosine of the angle between two vectors. When the angle between two vectors is small (approaching 0), their cosine
similarity approaches 1, indicating a high degree of similarity. Conversely, as the angle increases (approaching π/2), the cosine similarity approaches 0, indicating a low degree of
similarity. By evaluating the cosine similarity between word vectors, one can quantitatively determine their semantic similarity. A cosine similarity close to 1 suggests that the words
possess similar semantic meanings, while a cosine similarity close to 0 suggests significant differences in their semantic interpretations. Hence, the computation of word vector similarity
facilitates the identification of highly co-occurring words within similar linguistic contexts. These words exhibit both collocational patterns, indicative of their frequent co-occurrence,
as well as semantic associations, reflecting their interconnected meanings.

LATENT DIRICHLET ALLOCATION

Latent Dirichlet allocation (LDA) is an unsupervised learning algorithm that is based
on probabilistic graphical models and used for large-scale text corpus topic modelling (Blei et al. 2003). The LDA model assumes that each document consists of multiple topics, and each
topic consists of multiple words. The topic proportions of each document and the word proportions of each topic are each drawn from a Dirichlet distribution. Given a text matrix that has already
been vectorized, the probability distribution of each word \(\omega _i\) belonging to a topic \(z_i\) can be calculated as follows:

$$p\left( z_i = k \mid z_{-i}, \mathbf{x}, \alpha, \beta \right) \propto \left( n_{d_i,k} + \alpha_k \right) \frac{e^{\phi_{\mathbf{k}} \cdot \mathbf{x}_i}}{\sum\nolimits_{j = 1}^{V} e^{\phi_{\mathbf{k}} \cdot \mathbf{x}_j}}$$

where \(n_{d_i,k}\) is the number of words in document \(d_i\) that belong to topic _k_; \(\alpha _k\) is the weight of the _k_-th topic in
the document-topic distribution; \(\phi _{\mathbf{k}}\) is the vector of the _k_-th topic; \(\mathbf{x}_i\) is the vector of the _i_-th word; and _β_ is the hyperparameter of the
topic-word distribution. Similarly, the probability distribution of each word _ω_ belonging to a topic _k_ can be calculated as follows:

$$p\left( \omega \mid z = k, \mathbf{x}, \beta \right) \propto e^{\phi_{\mathbf{k}} \cdot \mathbf{x}_\omega}$$

where \(\mathbf{x}_\omega\) is the vector of word _ω_. In this study, a joint use of word2vec and
LDA for text modelling is proposed, which has several advantages. First, the semantic relationships between words are considered. Traditional LDA models only take into account the frequency
information of words in the text without considering their semantic relationships. However, by using word2vec to vectorize words, the semantic relationships between them can be better
captured, thus improving information extraction from the text. Second, the combination of word2vec and LDA enhances the effectiveness of text modelling. Word2vec represents words as vectors,
thus providing a better representation of the relationships between them, while LDA represents the text as a distribution of topics, providing a better representation of the content. The
combination of these two approaches leads to a more effective representation of the text. Third, LDA models may suffer from the inclusion of noise words, which can negatively impact the
model’s performance. However, by vectorizing words using word2vec, noise words can be more easily distinguished from relevant words, leading to a more robust model. Finally, the joint use of
word2vec and LDA allows for more granular text modelling. Word2vec captures detailed relationships between words, while LDA represents the text as a distribution of topics, providing a
better representation of the content. By combining these two approaches, more detailed information can be extracted from the text.

METHODOLOGY

The attitudes expressed in the language of social
media in developing nations can be detected by comparing the collocational words and phrases of the names "Confucius Institute" and "Goethe-Institut." It is possible
to determine the context in which international audiences regularly read about the institutions by evaluating the positive and negative semantic associations. Colligation analysis
is also helpful for determining whether the institute in the context is in a dominant or subsidiary role. As a result, the stances of audiences and social media in developing
countries towards the Confucius Institute and the Goethe-Institut can be compared.

RESEARCH DESIGN

In the new era, the restrictions of the West’s international communication pattern, the blockade of
semiotic hegemony and language alliances, targeted rumour propaganda against China with the influence of the spiral of silence, and the lack of China’s international promotion capacity (Xing
and Zhao 2021) all pose challenges to the development of the Confucius Institute. Several polls have shown that developing countries have a more favourable view of China than developed
countries. According to the Global Survey on China’s National Image 2019 by the Institute of Contemporary China and the World, 79% of developing countries consider their country’s
relationship with China important and increasingly recognize China’s performance in foreign affairs. According to the Pew Research Center, more affluent countries, such as Japan (85%),
Sweden (70%), Canada (67%), and the United States (60%), have a more negative attitude towards China. More than half of people in African and West Asian countries, such as Nigeria (70%) and
Kenya (58%), have a favourable opinion of China. Thus, although powerful Western media still restrict the international communication pattern, the demographic advantage and trust of
developing countries provide a breakthrough to resolve this dilemma. It is helpful to analyse the linguistic contexts and the attitudes behind the language of public opinion in
developing countries. Studying the image of Confucius Institutes in the media and on developing countries’ websites can therefore help us resolve the difficulties more effectively.
Accordingly, in this study, the sample sources are limited to developing countries. In addition, the Goethe-Institut belongs to Germany, which did not hold the developing countries where
the corpus was collected as colonies for a long period. That is, compared to the British Council and the Alliance Française, it has less political and cultural influence in developing countries. Moreover,
according to the literature review above, researchers have concluded that the Goethe-Institut is a well-developed institute for language teaching and culture promotion. It has overcome the
initial difficulties and has formed a favourable international public opinion environment. In this paper, a quantitative corpus-based analysis is adopted. Moreover, we take lexical priming
theory as the theoretical framework, analysing the collected discourse from three aspects: collocation, semantic associations, and colligation. The quantitative analysis is applied to the
analysis of lexical priming to find the collocation, semantic association words and colligational types. RESEARCH SAMPLES This study is based on the data in the NOW corpus. The NOW Corpus is
a subcorpus of the English Language Corpus created by Brigham Young University in the U.S. It is the most up-to-date corpus of English, containing a wide range of online newspapers and
magazines (technology, entertainment, sports, politics, etc.). This study takes “Confucius Institute” and “Goethe-Institut” as matching strings to collect data from the NOW corpus from
01/01/2014 to 04/01/2023. On June 1, 2014, the American Association of University Professors (AAUP) issued the statement “Confucius Institutes Threaten Academic Freedom”. Since then, Confucius Institutes in
the U.S. and other Western countries have been affected to varying degrees, and some have even been forced to close down. As discourses collected by the NOW corpus are mainly from countries
whose mother tongue or official language is English, the data from developing countries in the corpus are as follows: India, Sri Lanka, Pakistan, Bangladesh, Malaysia, Philippines, South
Africa, Nigeria, Ghana, Kenya, Tanzania, and Jamaica. In the NOW corpus, 1086 texts are found through the matching string “Confucius Institute”, while 1065 pieces are related to
“Goethe-Institut”. Due to limited permissions, it is difficult to analyse the data within the NOW corpus directly. Therefore, the original texts are copied and collected to create small new
corpora: CI Developing Country Corpus (CIDC) and Goethe-Institut Developing Country Corpus (GIDC). The number of texts from various nations in the CIDC and GIDC is compared in Fig. 2. In
developing nations such as Ghana and Bangladesh, opinions on the Confucius Institute and the Goethe-Institut are divisive. The polarization might be connected to the influence of
institutions in various nations. Additionally, Sri Lanka and Jamaica contribute few reports, suggesting that both institutes have little influence in those nations. However, a simple glance at
Chart 1 demonstrates that the influence of the Goethe-Institut and the Confucius Institute on social media in developing nations is equal, making this set of information eligible for
comparison. DATA APPROACHES The study encompasses a four-part approach to data processing (see Fig. 3), which entails data collection and preprocessing, collocation calculation, colligation
analysis, and semantic association construction. First, we retrieved news articles and established small-scale corpora CIDC and GIDC by searching for the keywords “Confucius Institute” and
“Goethe-Institut” in the NOW Corpus from 01/01/2014 to 03/31/2023. Complete news articles were obtained by retrieving the URLs of all relevant news through legal means. Prior to data
cleaning, the collected news articles were structurally processed, such as by removing URLs and non-English characters, followed by segmentation and tokenization based on English stop words.
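The cleaning step just described can be illustrated with a minimal sketch. It is not the authors' pipeline: the stop-word list is a tiny hypothetical sample, "non-English characters" is approximated here as "non-ASCII characters", and the function name `preprocess` is introduced only for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real study would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "was", "by"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")      # assumption: non-English ~ non-ASCII
TOKEN_RE = re.compile(r"[a-z]+(?:-[a-z]+)*")     # keeps hyphenated words such as goethe-institut


def preprocess(text):
    """Remove URLs and non-ASCII characters, lowercase, tokenise, drop stop words."""
    text = URL_RE.sub(" ", text)
    text = NON_ASCII_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


sample = "The Goethe-Institut in Dhaka opened 展览 see https://example.org/news"
tokens = preprocess(sample)
```

Keeping hyphenated forms as single tokens is a design choice made here so that the matching string “Goethe-Institut” survives tokenization intact.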
Second, collocation calculation involved PMI value computation and word vector similarity. By employing Skip-gram and CBOW models in the word2vec model, each word in the corpus was
transformed into a fixed-length word vector. Subsequently, the similarity between each word vector and the target words "Confucius Institute" and "Goethe-Institut" was
calculated, and the results were ranked based on the similarity. Complementary analysis of word vector similarity and PMI facilitated collocation analysis of the corpus. Third, we used the
StanfordNLP model to perform syntactic dependency parsing on sentences containing the target words "Confucius Institute" and "Goethe-Institut" to construct syntax trees.
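The PMI part of the second step can be sketched as follows. The study does not report its window size or counting scheme, so the symmetric window of five tokens, the merged target token `confucius_institute`, and the function name `pmi_scores` are all assumptions made for illustration. The score used is the common corpus-linguistic form PMI(w, t) = log2(cooc(w, t) · N / (f(w) · f(t))), where N is the number of tokens.

```python
import math
from collections import Counter


def pmi_scores(sentences, target, window=5, min_cooc=1):
    """Return {word: PMI(word, target)} for words co-occurring with the target.

    sentences: list of token lists; target: a single (pre-merged) token.
    """
    freq = Counter()   # corpus frequency of every token
    cooc = Counter()   # co-occurrence counts within +/- window of the target
    for tokens in sentences:
        freq.update(tokens)
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    cooc[tokens[j]] += 1
    total = sum(freq.values())
    scores = {}
    for word, count in cooc.items():
        if word != target and count >= min_cooc:
            scores[word] = math.log2(count * total / (freq[word] * freq[target]))
    return scores


# Invented toy corpus, for illustration only.
toy = [
    ["confucius_institute", "teaches", "language"],
    ["confucius_institute", "teaches", "culture"],
    ["students", "study", "language"],
]
toy_scores = pmi_scores(toy, "confucius_institute")
```

In this toy corpus, _teaches_ always occurs next to the target and therefore scores higher than _language_, which also occurs elsewhere.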
The part of speech and grammatical depth of the target words in the sentence were identified and counted for colligation analysis. Finally, the LDA model was employed to extract ten topics
and the top ten words related to each topic from the text matrix of CIDC and GIDC. Euclidean distance was used to calculate the distance between the target words and the centres of each
topic for constructing the semantic association of the corpus. Together, these steps yield the collocation, colligation, and semantic association features of the corpus. STATISTICS
ANALYSIS The CIDC has 16,483 lemma types and 420,605 lemma tokens, while the GIDC has 25,834 lemma types and 436,140 lemma tokens. These figures suggest that the GIDC may have a more
diverse range of vocabulary and potentially cover a wider range of topics or genres. Further analysis is carried out as follows from three perspectives, collocation, colligation and semantic
association, to determine if there are any significant differences in terms of linguistic features or usage patterns between the two corpora. FEATURES OF COLLOCATIONAL PRIMING FEATURES OF
CONCORD PATTERN According to Table 1, the word _with_ has a high frequency in the locations.
_Collaboration_ and _Partnership_ are also used in different lexical situations with high frequency. The meanings “in the company or presence of sb/sth” and “doing sth together or working
together towards a shared goal” have developed collocational priming with the Goethe-Institut. Such collocational priming shapes an image of an institution that communicates and cooperates with other
national institutions. In addition, several countries’ names, such as _Namibia_ and _Bangladesh_, appear in Table 1, helping associate the Goethe-Institut with other countries. It is also
noticeable that _German_, which stands for the national characteristics of the Goethe-Institut, appears only in positions L5, L4, R3 and R4, farther from the matching string than
other national names. All of the above helps the audience form a stable memory by repeating that the Goethe-Institut has excellent friendships and cooperation with various countries, while
its Germany-specific attributes are forgotten. Moreover, _Mueller_ and _bhavan_ occur with high frequency and are closely tied to the Goethe-Institut; they refer to Max Mueller Bhavan, a
language institute run by the Goethe-Institut in India. Additionally, _Kristen_, _Hackenbroch_, _director_ and _Bangladesh_, which appear in positions close to the matching string, refer to
Kristen Hackenbroch, who was employed by the Goethe-Institut in Dhaka, Bangladesh, and who performed research and taught urban studies. All these company and personal names imply for-profit
operation and privatization: “private acts are responsible for the institution’s works”. The most notable aspect of Table 2 is the placement of the terms _China_ and _Chinese_, which explicitly refer
to the country-specific characteristics of the Confucius Institute, in six lexical locations, including the first and second frequencies on L4, R2, R3, and R4. It is often emphasized that
the “Confucius Institute belongs to China”. _Chinese_ develops a more significant lexical priming and semantic connection with the Confucius Institute than _German_ does with the
Goethe-Institut in Table 1, where _German_ or _Germany_ only appears in lexical locations with lower frequency and longer distances in relation to the matching string. In addition, the word
_university_ is noteworthy, appearing in almost all lexical positions in the top three frequencies, meaning that _university_ also has a strong collocation with the Confucius Institute. In
addition to words such as _director_, _headquarters_, and _language_, these lexical items are all related to the function and operation of the Confucius Institute, implying that the
Confucius Institute has a strong connection to what it does, which may lead readers to associate Confucius Institutes with "active" and "dominant" institutions. PMI
AND WORD VECTOR The relationship between MI and word vectors in relation to collocation was discussed in Section 3. Here, we further classify collocations into
three categories: specific collocations (high similarity score only), general collocations (high PMI score only), and typical collocations (high similarity and PMI scores). Table 3 reveals
that most of the words that form general collocations with the Confucius Institute are closely associated with its operations, language teaching, and cultural exchange activities. For
instance, words such as _language_, _teachers_, _teaching_, and _students_ form a strong priming effect with the Confucius Institute, resulting in high PMI values. The underlying semantic
associations behind these collocations are elaborated in Section 4.2, where they are presented in tabular form. Apart from the words that have been previously
discussed, it is worth noting the high PMI value of the word _Hanban_. [Example 1] * 1. …Confucius Institutes themselves are not merely agreements between foreign universities and the
_Hanban_. Each institute is led by a Chinese partner university. * 2. …Talal Abu-Ghazaleh Confucius Institute, in cooperation with Confucius Institute Headquarters (_Hanban_) and Shenyang
Normal University, organized the Educators Delegation to China program for the fourth year. Hanban served as the headquarters that was closely tied to the Confucius Institute. However,
following the restructuring of the Confucius Institute in 2020, Hanban, which was the organization responsible for the administration of the institute, was abolished. Nevertheless, the
negative connotations associated with the word _Hanban_ may have been inadvertently transferred to the Confucius Institute through collocational priming effects. In addition,
_said_ appears in the cluster table and also as a fairly high-ranking MI3 collocation. [Example 2] * 1. He _said_ that despite the differences in the political and social systems of the
two countries, we had an excellent relationship. * 2. Ambassador Zhang _said_ that China-Tunisia relations are traditional and friendly. * 3. Professor Dr Khalid Iraqi _said_ the Chinese
language has now become an international language and over time, the number of Chinese language learners is increasing. It could be observed that _said_ predominantly appeared as quotations
from spokespersons within the context of news articles, rather than in the narrative itself. This finding suggests that news discourse often incorporates reported speech, which to some
extent reflects the neutral and fact-oriented stance of the news writers. However, it would be valuable to conduct further research on the specific topics and underlying attitudes conveyed
through the quotations behind _said_. Focusing solely on specific collocations, we found that words with vectors that are similar to those of the Confucius Institute can be classified into
two categories. The first category consists of place names linked to the primary sponsors of the Confucius Institute, such as _Hebei_, _Stellenbosch_, _Sargodha_, _Sichuan_, and _Chongqing_; the Chinese
place names among them are likely related to the Chinese partner schools of Confucius Institutes. As Chinese province names are a typical example of “foreign place names”, the audience tends to
directly associate them with _Chinese_. In large-scale contexts, the subjective image of the Confucius Institute is still dominated by the Chinese side. The second category is related to
"institutional cooperation," such as _Nnamdi_, _Azikiwe_, _Lagos_, and "institutional construction," such as _China-built_, _department_, and _faculty_. It is important
to note that these specific collocations are distinct from the general collocations discussed earlier. These specific collocations weaken the function of Confucius Institutes, such as
language and cultural dissemination, and focus on the establishment and expansion of Confucius Institutes, making them appear to be more aggressive. Moreover, it is worth noting the word
_nonprofit_, which stands in stark contrast to the for-profit Goethe-Institut. Confucius Institutes are dedicated to creating an image of a nonprofit academic institution. Finally,
we found a significant difference between the specific collocations and general collocations of the Confucius Institute. The degree of overlap between the two
tables is only approximately 15.5%, indicating that the impact of the general context on the Confucius Institute is significantly different from the effect of the microcontext. Focusing on
typical collocations, we found that the words _University_, _headquarters_, _held_, _Karachi_, _organized_, and _Hanban_ have a strong collocational relationship with the Confucius Institute
based on their PMI and word vector complementarity. [Example 3] …The event was co-_organised_ by the Pakistan Institute of China Studies (PICS) and the Confucius institute. …The visit was
_organised_ to renew the agreement for the Confucius institute and expand the future scope of collaboration with BFSU to potentially include undergraduate and graduate studies… Through these
words, we can see that the related news coverage still mostly focuses on surface-level activities, and the associated events align with the image that Confucius Institutes wish to portray
as language and cultural promotion institutions. However, the collocations formed by the audience still centre around "what Confucius Institutes have done". The events are complex,
and a consistent and fixed direction for collocational priming has not yet formed. As shown in Table 3, the words forming the general collocation with the Goethe-Institut consist of country
and organization names, such as _Bangladesh_, _Namibia_, and _Chennai_. Some words are related to the core of "collaboration," such as _cooperation_, _collaboration_, and _with_.
These two parts of words have already been described in the concord pattern, and the image of the Goethe-Institut as a friendly collaborative organization is already established. It is worth
noting that there are also words associated with the events hosted by the Goethe-Institut, such as _auditorium_, indicating the artistic and cultural events organized by this institution. In
Table 4, we observe a categorization of words that is similar to that in Table 5. Although there are differences in the specific words used, the overall categorization of words is similar.
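The three-way split into specific, general and typical collocations, and the overlap percentages reported in this section, can be illustrated with a short sketch. The study does not state how the lists were cut off or how overlap was computed, so the top-N cutoff and the definition of overlap as shared words divided by the union of both lists are assumptions; the scores below are invented, and `classify_collocates` is a hypothetical helper name.

```python
def classify_collocates(pmi, sim, top_n):
    """Split collocates by whether they rank in the top_n for PMI, similarity, or both.

    pmi, sim: dicts mapping word -> score (higher means stronger association).
    """
    top_pmi = set(sorted(pmi, key=pmi.get, reverse=True)[:top_n])
    top_sim = set(sorted(sim, key=sim.get, reverse=True)[:top_n])
    union = top_pmi | top_sim
    typical = top_pmi & top_sim       # high PMI and high similarity
    return {
        "typical": typical,
        "general": top_pmi - top_sim,   # high PMI only
        "specific": top_sim - top_pmi,  # high similarity only
        # Assumption: overlap = shared words / words in either list.
        "overlap": len(typical) / len(union) if union else 0.0,
    }


# Invented scores, for illustration only.
pmi_demo = {"language": 5.0, "hanban": 4.0, "teachers": 3.0, "said": 1.0}
sim_demo = {"hanban": 0.9, "sichuan": 0.8, "faculty": 0.7, "language": 0.1}
groups = classify_collocates(pmi_demo, sim_demo, top_n=3)
```

Under a different overlap definition (for example, shared words divided by the length of one list) the percentage would change, so the figure is only comparable across corpora when the same definition is applied to both.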
Notably, cultural organizations such as _UNESCO_ and _Alliance Francaise_, which are similar to the Goethe-Institut, also appear in Table 4. The fact that the Goethe-Institut is listed
alongside these organizations in the specific collocation category indicates that its international and collaborative nature has been recognized, providing a strong indication for
classification to the audience. Our research finds that the specific collocations of the "Goethe-Institut" have a high overlap with general collocations, reaching 42.2%. This
indicates that the macrocontextual environment of the Goethe-Institut is relatively consistent with its microcontextual collocation, contributing to the formation of its image. Focusing on
typical collocations, we find that _partnership_, _collaboration_, and _cooperation_ are all included, suggesting that the concept of "friendly cooperation" has already formed
strong collocational priming for the Goethe-Institut. Moreover, the Goethe-Institut only forms typical collocations with its executives, founding organization, and host country but not with
German institutions, indicating that the Goethe-Institut, as a language dissemination institution (at least on the surface), has completely lost its connection with the government. Of
course, it cannot be denied that there are many German words in the table. However, we ensure that all the news texts collected for this study are in English, meaning that these German words
appear in English texts. Moreover, the table shows that the relevant German words, unlike the Chinese province names in Table 6, are not the same as the English words representing their
meanings, such as _kultur_. This suggests that the use of German is closely related to the connection to the Goethe-Institut, and such usage is already common in news reports from developing
countries. FEATURES OF SEMANTIC ASSOCIATION After further examination of the above-mentioned collocations with AntConc 3.5.8 (especially focusing on typical collocations), it is found
that the contexts related to the Confucius Institute (Table 5) could be classified into Event, Location, Relationship and Functions. Similarly, the contexts related to the Goethe-Institut
could be classified in the same way. The contextualization in Table 7 reveals that, among the contexts related to events, a large part of the coverage of Confucius Institutes focuses on the
establishment of Confucius Institutes in various countries and regions. These reports are objective, only stating the establishment of Confucius Institutes without lexical bias. In addition,
most reports target “places where the second Confucius Institute was built” rather than “the first Confucius Institute was established in certain places”, which implies that the Confucius
Institutes may operate well before they can afford to open a second one. This kind of implication can be a positive image-building semantic association. In the absence of other biases, the
audience would assume that the Confucius Institute is a well-known institution. In addition, it is interesting to note that the text immediately before and after such semantic associations is often
accompanied by praise for the work of Confucius Institutes in teaching language and promoting culture, which has contributed to the image building of the Confucius Institute. Moreover,
this shows that media coverage of Confucius Institutes in developing countries is positive and that the Confucius Institute has positive elements in public opinion in
the developing world. Such positive elements also exist in other semantic associations: [Example 4] * 1. The Confucius Institute will be a _bridge_ to unite the people of the countries of
Tanzania and…; * 2. Professor Samuel Kwame Offei, commended the Confucius Institute for its role in _bridging_ the language gap between China and Ghana… Obviously, in these contexts, public
opinion has given a great deal of recognition to the contribution of Confucius Institutes in promoting language learning, enhancing cultural exchange, and optimizing relations between
countries. From this perspective, the image of Confucius Institutes in developing countries tends to be favourable and friendly. However, these semantic associations from a macro level give
the audience the impression that this is an “act of state”. Furthermore, the functions and organization of Confucius Institutes are often found in contexts of language teaching and cultural
dissemination, part of which describes the organizational structure of Confucius Institutes (two universities in the two countries jointly establish the Confucius Institute, or the two
headquarters codirect Confucius Institutes in both countries). This description is consistent with what had been stated on the website that the Confucius Institute is a nonprofit education
institution jointly hosted by Chinese and foreign partners. Some of the contexts focus on essential tasks such as “teaching language” and “spreading culture”, objectively showing the
school-running activities of Confucius Institutes. The statement conforms to the purpose of Confucius Institutes on the official website: “The aim of Communicating Chinese, the current
situation of Chinese language and culture, And promoting people-to-people exchanges between China and the rest of the world”. Table 8 illustrates a number of terms associated with projects
that the Goethe-Institut has sponsored, such as _kultur_, _project_, and _supported_. After these words were searched, it could be found that the projects behind these words are in different
fields. For example, _School_ refers to the Partners for the Future (PASCH) initiative, a language school supported by the Goethe-Institut. It appears that the semantic associations of these
projects are in completely different areas, implying that the Goethe-Institut involves a wide range of activities. It can be said that the semantic association of projects in different
fields has confused the German cultural attributes of the Goethe-Institut and weakened its fundamental purpose. This kind of confusion, masking fundamental purposes and attributes, is
further reinforced by the words of art-related activities. Many words representing locations, such as _Namibia, Chennai, and Nicosia_, or words with fewer denotations, such as _cultural,
bildung, offer, and auditorium_, are closely related to art activities. FEATURE OF COLLIGATIONAL PRIMING Based on the data presented in Tables 9 and 10, there are some notable differences in
the language used in news related to the Goethe-Institut and the Confucius Institute. The passive voice is markedly more
common than the active voice for both the Goethe-Institut (91.19%) and the Confucius Institute (87.23%), although the gap is somewhat smaller for the latter. This difference may reflect divergent attitudes and perspectives towards the two institutions.
It is possible that news stories about the Confucius Institute are more inclined to depict the institute as active and initiating action, while articles on the Goethe-Institut may portray it
as passive and accepting. These differing portrayals could reflect contrasting attitudes towards the Confucius Institute. Indeed, some critics have suggested that the institute is used as a
propaganda tool by the Chinese government. In terms of subject count, the Confucius Institute has a higher overall count at 240 compared to 134 for the Goethe-Institut. However, when
looking at the breakdown by sentence depth, the Goethe-Institut has a higher share of first-level subjects (25.4%) than the Confucius Institute (17.08%), which has a higher share of
third-level subjects (64.2%). This suggests that descriptions of the Confucius Institute may focus more on the actions and effects of the institution, while descriptions of the
Goethe-Institut may place more emphasis on its identity as a specific institution. It was found that the "Confucius Institute" appeared more frequently than the "Goethe-Institut"
and had a higher complexity in terms of adjective count and attribute count. Specifically, the number of attributes and adjectives at three or more layers for "Confucius
Institute" (170) was much larger than for "Goethe-Institut" (117), indicating that the "Confucius Institute" was described and explained more extensively, with a
greater focus on general descriptive language. For example, syntax trees show the standard naming of Confucius Institutes is “CI in area/university”, and the grammatical structures (Table
11) refer to _Confucius_NNP Institute_NNP IN_, which are included in terms of adjectives count and attribute count. This kind of colligation makes the audience think that “CI is a part of a
university” or “CI is an organization in a region”. In contrast, however, those descriptions of the Goethe-Institut may place more emphasis on specific contents and characteristics. FEATURE
OF TOPICS AND SEMANTIC PROSODIES When LDA is used to analyse text, all the words appearing in the text are usually mapped to the topic space. Since the number of topics in the text varies
and the proportion of different topics in the corpus is also different, the size of each circle in Fig. 4 represents the proportion of the topic in the text. The larger the circle is, the
more texts are related to that topic, and vice versa. Each topic has some topic words, which are the words with high frequency in the topic space. Each word is assigned a probability value,
indicating its relevance to each topic. The higher the probability value is, the greater the weight of the word in that topic. In this study, we extracted the top ten topics in the CIDC and
GIDC, the high-frequency topic words in each topic, and the vector similarity between the target words "Confucius Institute" and "Goethe-Institut" and each topic, as
shown in Tables 12 and 13. The smaller the distance from the target word to the topic is, the higher the correlation between the topic and the target word. As shown in Fig. 4, the
distribution of the top ten topics in the CIDC is quite diverse, indicating that the corpus covers a wide range of events. According to the values of topic similarity and the top words
associated with each topic, which are shown in Table 12, Topics 5, 8, 9, and 10 have high vector similarity with the Confucius Institute, indicating that these topics may be the semantic
prosodies of the Confucius Institute in the CIDC. By analysing the top words associated with each topic, we can categorize the contexts into three types. The first type is related to the role
of Confucius Institutes in the field of education. For example, Topic 8 is associated with the establishment and operation of Confucius Institutes in Sri Lanka, while Topic 6 is related to
the establishment of Confucius Institutes in Africa and the promotion of the Chinese language in Africa. The second type of context is related to sudden news events. For instance, Topic 5
discusses the attack on a Confucius Institute teacher in Karachi, Pakistan. The third type is related to the international relations impact of Confucius Institutes. For example, Topic 9 is
associated with medical projects and exchanges, while Topic 3 is related to the policies and political parties in the United States and their impact on Confucius Institutes. As shown in Fig.
5, the top 30 most salient terms in the GIDC corpus are led by _film_ and _art_, indicating that the Goethe-Institut is closely linked to its artistic activities in the large corpus
environment. This suggests that most reports focus on the artistic collaborations of the Goethe-Institut. The appearance of terms such as _films_, _artists_, and _music_ also confirms this.
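The distance-based ranking of topics behind Tables 12 and 13 (the smaller the distance from the target word to a topic, the stronger the association) can be sketched as follows. How the topic centres were obtained is not specified in the text, so this sketch assumes that a topic centre is the mean of the word vectors of that topic's top words; the two-dimensional vectors and the names `centroid`, `euclidean` and `rank_topics` are invented for illustration.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_topics(target_vec, topics, word_vectors):
    """Return (topic, distance) pairs sorted from nearest to farthest.

    topics: dict mapping topic name -> list of top words.
    word_vectors: dict mapping word -> vector; words without a vector are skipped.
    """
    distances = {}
    for name, words in topics.items():
        vecs = [word_vectors[w] for w in words if w in word_vectors]
        if vecs:
            distances[name] = euclidean(target_vec, centroid(vecs))
    return sorted(distances.items(), key=lambda item: item[1])


# Invented 2-D vectors, for illustration only.
vectors_demo = {"film": [1.0, 0.0], "art": [0.0, 1.0],
                "policy": [4.0, 4.0], "party": [6.0, 4.0]}
topics_demo = {"arts": ["film", "art"], "politics": ["policy", "party"]}
ranked = rank_topics([0.0, 0.0], topics_demo, vectors_demo)
```

With these toy values the "arts" topic lies closest to the target vector, which mirrors how a nearby topic would be read as part of an institute's semantic prosody.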
The high proportion of art-related topics in the GIDC further supports this idea. Interestingly, _German_ appears as the third most salient term, indicating that the Goethe-Institut is
closely associated with Germany. Terms such as _language_, _students_, and _learning_ suggest that the Goethe-Institut has not abandoned its original purpose of teaching language and
promoting German language and culture. Unlike the top 30 most salient terms in the CIDC corpus for the Confucius Institute, there is no explicit connection between the Goethe-Institut and
the _government_. Moreover, terms such as _international_, _invited_, and _service_ suggest that the role of the Goethe-Institut is more focused on what it has done rather than the impact it
has had. This contrasts with the more politically oriented language present in the CIDC corpus for the Confucius Institute. Focusing on the top ten themes in Fig. 4, it is evident that the
distribution of themes is more uniform and dispersed than that of the Confucius Institute, implying a broader range of content covered in the corpus. Based on the values in Table 13 for
topic similarity and the cues provided by top words, it can be inferred that Topic 7, Topic 10, Topic 6, Topic 8, and Topic 2 are the semantic prosodies of the Goethe-Institut in the GIDC
corpus. According to the topic words, the relevant contexts of the Goethe-Institut can be divided into two categories. The first category is closely associated with artistic activities, such
as Topic 7, which is related to international cultural exchange programs and art festivals in Germany. Topic 6 is related to films and shorts, involving news reports on film festivals or
exhibitions. Topic 2 pertains to music, theatre, and information related to music concerts or theatre festivals. The second category pertains to purely cultural activities and cooperation
with other institutions, such as Topic 10, which involves partner relationships and cultural project applications. The above activities are disconnected from "German language
teaching," and the Goethe-Institut has been established as an art and culture planning and cooperative institution. DISCUSSION Through a comparison of the different priming features of
the CIDC and GIDC, it is concluded in this study that public opinion and the image portrayed (or forced primings) are significant factors that contribute to the differences in priming
features. The findings of this study are as follows: Reports on the Confucius Institute in developing countries focus on its teaching activities and events, which are related to the
functions and operations of the Confucius Institute. The lexical items that are frequently used in these reports, such as _director_, _headquarters_, _language_, and _nonprofit_, all
contribute to the nonprofit academic institution image of the Confucius Institute. The verbs used in these reports are mostly neutral, with some positive connotations, which indicates that
most of the reports view the teaching activities of the Confucius Institute positively. However, these reports generally have a superficial perspective and evaluate the events themselves,
rather than the overall image of the institute. Reports on the Goethe-Institut in developing countries shape the public image of the institute as an art curator and facilitator of art
exchange, reflecting its friendly collaboration with other institutions. The large corpus data context in which the Goethe-Institut operates has a strong correlation with the field of art
activities. Furthermore, the collocation of the Goethe-Institut with non-German organizations such as _UNESCO_ and _Alliance Francaise_ suggests that its international and collaborative
nature has been recognized, thus providing the audience with a strong indication for classification. The institutional image and activity themes of the Goethe-Institut have become somewhat
fixed, while those of the Confucius Institute are still undergoing significant changes. The degree of overlap in PMI and word vector similarity indicates that the typical collocation of the
Goethe-Institut is much greater than that of the Confucius Institute, and it has already formed the conditions necessary to develop a unique semantic association. Therefore, based on
Findings 1–3, it can be concluded that the Goethe-Institut has moved beyond its initial goal of being a language teaching institution, and its institutional image has become more fixed. In
contrast, the Confucius Institute is still largely associated with language teaching activities and language-based cultural promotion activities, and public perception of the institute is
still developing. The Confucius Institute remains closely tied to China and its government, while the Goethe-Institut has fully separated from the German government. Through an examination
of typical collocations, it could be found that _partnership_, _collaboration_, and _cooperation_ are commonly used in connection with the Goethe-Institut, indicating a strong association
with friendly cooperation. In contrast, the term _Hanban_ has strong collocational priming with the Confucius Institute, and the word itself has become a symbol of China’s soft power. As
_Hanban_ was originally a part of the Chinese government, news content about the Confucius Institute can still be associated with government policies, as indicated by words such as
_economic_, _power_, and _government_ that appear in the context of discussions about the institute. The Confucius Institute is portrayed as active and assertive, while the Goethe-Institut
is perceived more as a supportive and collaborative entity. In terms of colligation, the active voice is used more frequently in discussions of the Confucius Institute than in those of the
Goethe-Institut, and the same trend is observed in actual semantic associations. Discussions about the Confucius Institute focus on what it is actively doing, while discussions about the
Goethe-Institut emphasize collaborative activities with other organizations. The Chinese and Chinese-language attributes of the Confucius Institute have been externalized and are not
well-accepted by audiences, while the German language and German attributes of the Goethe-Institut have to a certain extent been internalized and accepted within news reporting. The corpus
shows that both Chinese Pinyin and German words are used, but the former refers to the organizing body of the Confucius Institute, while the latter has a variety of meanings and is used in
news reporting without an obvious connection to the Goethe-Institut. The English name of the Confucius Institute could confuse the media and politicians and may lead to negative
perceptions. As noted on Wikipedia, "Some commentators argue, unlike these organizations, many Confucius Institutes operate directly on university campuses, thus giving rise to what
they see as unique concerns related to academic freedom and political influence." When combined with the deliberate rhetoric of "CI impedes academic freedom" created by
politicians and media, audiences in developing countries may be more inclined to accept a distorted image of the Confucius Institute. Based on these characteristics, we conclude that the
overall image of the Goethe-Institut is peaceful and creative, while that of the Confucius Institute has a stronger "otherness" attribute and is more assertive, with a government
association. Because its priming features are variable, audience perceptions of the Confucius Institute are more susceptible to external influences. Given the measures taken against the Confucius Institute in developed countries such as the United States since 2014, and the accompanying negative media coverage, we believe that the image of the institute in news media coverage in developing countries tends to be questioned. We have identified several factors that may contribute to these differences. Firstly, the representation of public opinion holds political
significance. It is evident that the discourse content of both institutions in developing countries is based on actual events and analyzed with relative impartiality. This connection between
public opinion and the national circumstances of developing nations is inseparable. While these nations are not entirely reliant on developed countries, they still require assistance from
other nations for their progress. As a result, the study’s findings indicate that developing nations tend to maintain a more "neutral" stance while critically describing and
evaluating both the Goethe-Institut and the Confucius Institute. However, it cannot be denied that ideological differences and national perspectives play a role, causing some developing
countries to be cautious about Confucius Institutes, albeit to a lesser extent than Western countries. Particularly, the lack of understanding regarding China’s approach to language and
cultural advancement instills fear among them that the Confucius Institutes may infringe upon their institutional academic freedom. Hence, the aforementioned results indicate that developing
countries demonstrate a more critical and occasionally hostile attitude, presenting a country-specific and government-oriented image of public opinion towards Confucius Institutes. This
finding aligns with the conclusions of Zhou et al. (2018). We believe that the Confucius Institute’s identity as an "other" will persist, preventing it from fully achieving its intended
mission of improving global understanding of Chinese language and culture, fostering friendly relations between China and foreign countries, promoting multiculturalism worldwide, and
contributing to the creation of a harmonious world. Secondly, the preferences and attitudes of the audience are closely linked to the image formed in public opinion. While it is widely recognized
that the media subconsciously influences the audience, it is also important to acknowledge that the audience’s requirements and preferences can shape media coverage. In other words, the
media may selectively report information or even bias their coverage to align with the preferences of their viewers. Audiences make selective choices regarding media content based on their
personal preferences, indirectly impacting the media’s "living space." As a result of intensified market competition and a shift towards audience orientation, the media has been
compelled to reassess the attitudes and demands of the audience at all levels. Lastly, the nomenclature of the Confucius Institutes could contribute to some confusion. The organizational
structure of Confucius Institutes involves collaborations with local colleges and universities worldwide. The English name of Confucius Institutes, coupled with their unique operational
model, may attract significant attention on social media platforms. When social media users and politicians repeatedly mention the name without providing an explanation of the organization,
audiences may develop the perception that the Confucius Institute is directly penetrating the local university campus. Such contextual preconceptions add a sense of "intrusion" to
the image of Confucius Institutes, accompanied by deliberate rhetoric suggesting that they impede academic freedom. Consequently, audiences are more inclined to accept a distorted image of
Confucius Institutes.

CONCLUSION

In summary, this study examines the differences in lexical priming features between the Confucius Institute and the Goethe-Institut in developing countries. The study used the NOW corpus to gather web URLs from 2014 to 2023, using _Confucius Institute_ and _Goethe-Institut_ as match strings. The news content obtained was split into two corpora, CIDC and GIDC. After data processing, the corpora were examined from four aspects: collocation, colligation, semantic association, and semantic prosody. The aim was to examine whether lexical priming features differ between GIDC and CIDC, and to determine the attitudes and public opinion image behind those primings.
This study revealed that the Goethe-Institut has evolved beyond its initial objective of being a language-teaching establishment, and its institutional identity has become more stable. In
contrast, the Confucius Institute is still largely recognized for language teaching and cultural dissemination activities. Public perception of the institute is still evolving, and it
remains closely tied to China and its government. The Goethe-Institut, on the other hand, has completely separated from the German government. While the Confucius Institute is portrayed as
active and assertive, the Goethe-Institut is perceived more as a supportive and collaborative entity. The English name of the Confucius Institute could potentially create confusion among
media and politicians and lead to negative perceptions. The overall image of the Goethe-Institut is peaceful and creative, while that of the Confucius Institute has a stronger sense of
"otherness" and involves more assertiveness due to its association with the government. In news media coverage of developing countries, there is a tendency for the image of the
Confucius Institute to be questioned. This research has certain limitations. Firstly, the developing countries examined are limited to those covered by the NOW corpus (India, Sri Lanka, Pakistan, Bangladesh, Malaysia, the Philippines, South Africa, Nigeria, Ghana, and Kenya). Because of this geographical limitation, the findings may deviate from public opinion in developing countries in general. Secondly, the data analysis has limitations. For instance, in the analysis of _said_, neither concordance line analysis nor extensive processing of quotations with illustrative examples was used. As a result, there is room for future research to delve into the specific topics and associations found within quotations from spokespersons as well as those present in the narrative of news articles. Thirdly, alternative measures of collocation and semantic association exist. For instance, Gries (2013) introduced an association measure that identifies asymmetric collocations and distinguishes between high and low association strengths, and Gries (2019) proposed tupleization as a research program that analyzes multiple dimensions of information; both show that different collocation metrics would result in different top-ranked collocations. In this study, however, the analysis was restricted to word2vec and LDA. Future research could therefore compare and contrast computational methods common in natural language processing with more traditional corpus linguistics methods. Additionally, it is
hoped that the Confucius Institute, as a window for China’s foreign language teaching and cultural dissemination, can be more objectively understood and recognized.

DATA AVAILABILITY

The datasets generated and/or analysed during the current study are available in the GitHub repository: https://github.com/Bellahhm/Datasets-of-Otherness-and-Suspiciousness.git.
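
The limitations discussed in the conclusion note that different collocation metrics can produce different top-ranked collocates. As a rough illustration only, the sketch below contrasts pointwise mutual information, which is symmetric, with ΔP, a directional measure of the kind Gries (2013) discusses. The word pair and all counts are invented for illustration and are not taken from CIDC or GIDC; the function names are likewise hypothetical, and this is not the procedure used in the study.

```python
import math


def pmi(a, b, c, d):
    """Pointwise mutual information (log base 2) from a 2x2 co-occurrence table.

    a: word1 with word2, b: word1 without word2,
    c: word2 without word1, d: neither word.
    """
    n = a + b + c + d
    return math.log2((a * n) / ((a + b) * (a + c)))


def delta_p(a, b, c, d):
    """Directional association: (dP(word2 | word1), dP(word1 | word2))."""
    forward = a / (a + b) - c / (c + d)
    backward = a / (a + c) - b / (b + d)
    return forward, backward


# Hypothetical counts (assumption): word1 is rare and usually co-occurs with
# word2, while word2 is frequent and mostly occurs without word1.
counts = dict(a=80, b=20, c=920, d=98_980)

symmetric_score = pmi(**counts)
forward, backward = delta_p(**counts)
print(f"PMI = {symmetric_score:.2f}")
print(f"dP(word2|word1) = {forward:.3f}, dP(word1|word2) = {backward:.3f}")
```

With these toy counts the pair receives a single PMI score regardless of direction, whereas ΔP is much larger from word1 to word2 than the reverse, so a ranking by a directional measure can differ from a ranking by a symmetric one.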

REFERENCES

* Acquaye JB (2020) Western perceptions on Confucius Institute advancement of Chinese language and culture: a narrative review. US-China Educ Rev 10(5):185–199
* An R, Xu M (2015) News discourse analysis of the suspension of Confucius Institute at the University of Chicago. Int Commun 2:43–45
* Brazys S, Dukalskis A (2019) Rising powers and grassroots image management: Confucius Institutes and China in the media. Chin J Int Politics 12(4):557–584. https://doi.org/10.1093/cjip/poz012
* Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3(Jan):993–1022
* Drake J (2009) A linguistic account of word play: the lexical priming of pinning. J Pragmatics 41:794–809. https://doi.org/10.1016/j.pragma.2008.09.025
* Gill B, Huang Y (2006) Sources and limits of Chinese “Soft Power”. Survival 48(2):17–36. https://doi.org/10.1080/00396330600765377
* Gries ST (2013) 50-something years of work on collocations: what is or should be next…. Int J Corpus Linguist 18(1):137–166. https://doi.org/10.1075/ijcl.18.1.09gri
* Gries ST (2019) 15 years of collostructions: some long overdue additions/corrections (to/of actually all sorts of corpus-linguistics measures). Int J Corpus Linguist 24(3):385–412. https://doi.org/10.1075/ijcl.00011.gri
* Gruenenfelder TM, Recchia G, Rubin T, Jones MN (2016) Graph-theoretic properties of networks based on word association norms: implications for models of lexical semantic memory. Cogn Sci 40(6):1460–1495. https://doi.org/10.1111/cogs.12299
* Huang W, Lien D, Xiang J (2019) The power transition and the U.S. response to China’s expanded soft power. Int Relat Asia-Pacific 24(2):249–266. https://doi.org/10.1093/irap/lcz008
* Harting F (2014) Confucius Institutes as innovative tools of China’s cultural diplomacy. Chinese Politics and International Relations. Routledge, pp. 121–144. https://doi.org/10.4324/9781315866734
* Hoey M (2005) Lexical priming: A new theory of words and language. Routledge, London
* Hoey M, O’Donnell MB (2008) Lexicography, grammar, and textual position. Int J Lexicogr 21(3):293–309. https://doi.org/10.1093/ijl/ecn025
* Jantunen JH, Brunni S (2013) Morphology, lexical priming and second language acquisition: a corpus-study on learner Finnish. Twenty Years of Learner Corpus Research: Looking back, Moving ahead. Louvain-la-Neuve: Presses universitaires de Louvain, pp. 235–245
* Jantunen JH (2017) Lexical and morphological priming: A holistic phraseological analysis of the Finnish time expression kello. In M Pace-Sigge, & KJ Patterson (Eds.), Lexical Priming: Applications and advances (pp. 254–272). John Benjamins. Studies in Corpus Linguistics, 79. https://doi.org/10.1075/scl.79.10jar
* Jones MN, Gruenenfelder TM, Recchia G (2011) In defense of spatial models of lexical semantics. In Proceedings of the annual meeting of the cognitive science society (Vol. 33, No. 33)
* Jatnika D, Bijaksana MA, Suryani AA (2019) Word2vec model analysis for semantic similarities in English words. Procedia Comput Sci 157:160–167. https://doi.org/10.1016/j.procs.2019.08.153
* Kaszubski P (2007) Michael Hoey. Lexical priming: a new theory of words and language. Funct Lang 14(2):283–294. https://doi.org/10.1075/fol.14.2.12kas
* Kluver R (2014) The sage as strategy: nodes, networks, and the quest for geopolitical power in the Confucius Institute. Commun Culture Critique 7(2):192–209. https://doi.org/10.1111/cccr.12046
* Kumar AA, Balota DA, Steyvers M (2020) Distant connectivity and multiple-step priming in large-scale semantic networks. J Exp Psychol: Learn Memory Cogn 46(12):2261. https://doi.org/10.1037/xlm0000793
* Leedham M, Cai G (2013) Using a corpus approach to explore the influence of teaching materials on Chinese students’ use of linking adverbials. J Second Lang Writing 12:374–389. https://doi.org/10.1016/j.jslw.2013.07.002
* Li H, Mirmirani S, Ilacqua JA (2009) Confucius Institutes: distributed leadership and knowledge sharing in a worldwide network. Learn Organization 16:469–482. https://doi.org/10.1108/09696470910993945
* Li K, Dai C (2011) Report of the U.S. Public Opinion of the Confucius Institute. World Economics and Politics (07):76–93+157–158
* Lien D, Oh CH, Selmier WT (2012) Confucius institute effects on China’s trade and FDI: isn’t it delightful when folks afar study Hanyu? Int Rev Econ Financ 21:147–155. https://doi.org/10.1016/j.iref.2011.05.010
* Lien D, Co CY (2013) The effect of Confucius Institutes on U.S. exports to China: a state level analysis. Int Rev Econ Financ 27:566–571. https://doi.org/10.1016/j.iref.2013.01.011
* Louw B (1993) Irony in the Text or Insincerity in the Writer? The Diagnostic Potential of Semantic Prosodies. In M Baker, G Francis, & E Tognini-Bonelli (Eds.), Text and Technology: In Honour of John Sinclair (pp. 157–176). Amsterdam: Benjamins. https://doi.org/10.1075/z.64.11lou
* Liu C, Zeng L (2017) Critical discourse analysis of news reports on Confucius Institutes in mainstream media in the United States. Int Commun (01):76–78
* Liu Y (2014) Research on China-related public opinion from the perspective of national cultural security: Taking the New York Times’ Report on Confucius Institutes as an Example. Academic Exchange (4):200–203
* Min YE (2012) A comparison between Confucius Institute and Cervantes Institute: the modern awareness in Chinese culture and the post-colonialism in Spanish culture. Contemp Foreign Lang Stud 3:41
* Paradise JF (2009) China and international harmony: the role of Confucius Institutes in Bolstering Beijing’s soft power. Asian Survey 49(4):647–669. https://doi.org/10.1525/as.2009.49.4.647
* Pace-Sigge M (2018) Spreading activation, lexical priming and the semantic web: early psycholinguistic theories, corpus linguistics and AI applications. Cham, Palgrave Macmillan, Switzerland. https://doi.org/10.1007/978-3-319-90719-2
* Patterson KJ (2016) The analysis of metaphor: to what extent can the theory of lexical priming help our understanding of metaphor usage and comprehension? J Psycholinguist Res 45(2):237–258. https://doi.org/10.1007/s10936-014-9343-1
* Patterson K (2018) Understanding Metaphor through Corpora: A Case Study of Metaphors in Nineteenth Century Writing (1st ed.). Routledge, London. https://doi.org/10.4324/9781351241090
* Patterson K, Pace-Sigge M (2017) Lexical Priming: Applications and Advances. (1 ed.) (Series in Corpus Linguistics). John Benjamins. https://doi.org/10.1075/scl.79
* Peng F, Yu X (2016) The image and discourse system of Confucius Institutes reported by British mainstream media. Academic Exploration (11):112–119
* Selezneva NV (2021) Learning Chinese in Vietnam: the role of the Confucius Institute. Rus J Vietnam Stud 5(4):71–86. https://doi.org/10.54631/VS.2021.54-71-86
* Starr D (2009) Chinese language education in Europe: the Confucius Institutes. Eur J Educat 44:65–82. https://doi.org/10.1111/j.1465-3435.2008.01371.x
* Xie T, Page BI (2013) What affects China’s National Image? A cross-national study of public opinion. J Contemp China 22:850–867. https://doi.org/10.1080/10670564.2013.782130
* Xing L, Zhao J (2021) New media and international communication of China’s national image. Xiandai Guoji Guanxi (11):51–59+61
* Yan X (2018) The Change of Public Opinion Environment for the Development of Confucius Institutes: Based on the Analysis of Chinese and Foreign newspapers’ reports on Confucius Institutes from 2005 to 2014. Chinese Culture Overseas Communication (02):219–226
* Ye Y (2015) The Image of the Confucius Institute in Foreign Media Reports. J Sichuan University (03):48–57
* Yeh Y, Wu C, Huang W (2021) China’s soft power and U.S. public opinion. Econ Political Stud 9:447–460. https://doi.org/10.1080/20954816.2021.1933766
* Yuan Z, Guo J, Zhu H (2016) Confucius Institutes and the limitations of China’s global cultural network. China Information 30(3):334–356. https://doi.org/10.1177/0920203X16672167
* Zhang D, He Y (2016) Public opinions over Confucius Institutes in the International Arena: an analysis of relevant media reports in Western Countries. Renmin University China Education J (01):91–110
* Zhang W (2021) Public opinion dilemma of Confucius institutes in the new situation: characteristics causes and countermeasures. Modern Commun 43(03):20–26
* Zhou T, Wen Y, Jia W (2018) The Dilemma and Countermeasures of International Public Opinion Related to China from the Perspective of “Other-plastic”: A Case Study of US Media Reports Related to Confucius Institutes. Int Commun (04):19–21
* Zhou Y, Luk SC (2016) Establishing Confucius institutes: a tool for promoting China’s soft power? J Contemp China 25:628–642. https://doi.org/10.1080/10670564.2015.1132961

ACKNOWLEDGEMENTS

The author thanks Wang Jun-Ling for constructive discussions and Supervisor Wu Cheng-Nian for great support. This research was supported by the Centre for Language Education and Cooperation of China (No. YHJC21ZD-011).

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS

* School of International Chinese Language Education at Beijing Normal University, Beijing, China: Ming Huang

CORRESPONDING AUTHOR

Correspondence to Ming Huang.

ETHICS DECLARATIONS

COMPETING INTERESTS

The author declares no competing interests.

ADDITIONAL INFORMATION

PUBLISHER’S NOTE

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

RIGHTS AND PERMISSIONS

OPEN ACCESS

This article is licensed under a Creative
Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in
the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit
http://creativecommons.org/licenses/by/4.0/.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Huang, M. Otherness and suspiciousness: a comparative study of public opinions between the Confucius Institute and Goethe-Institut in developing countries. _Humanit Soc Sci Commun_ 10, 428 (2023). https://doi.org/10.1057/s41599-023-01920-7

* Received: 15 November 2022
* Accepted: 05 July 2023
* Published: 19 July 2023
* DOI: https://doi.org/10.1057/s41599-023-01920-7