Using a professional communication corpus for business writing

A corpus is a collection of authentic texts compiled for the study of real-life language usage. Corpora are also compiled to train language models for machine learning or to compile dictionaries. Language users can search a compiled corpus for real-life examples to improve their communication proficiency.

Two well-known corpora available online are the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA). The BNC contains 100 million words of real-life written and spoken British English, while COCA is five times larger than the BNC. These two corpora contain real-life examples of daily conversation, fiction, popular magazines, newspapers, academic texts and more.[1]

Besides general corpora like the BNC and COCA, professional corpora that contain language examples from a specific industry are also available. The RCPCE Profession-specific Corpora, developed by the Research Centre for Professional Communication in English (RCPCE), Department of English at The Hong Kong Polytechnic University, are among the free, publicly available corpus resources.

Unlike dictionaries, which show a limited number of usage examples, a corpus search returns a list of concordances: segments of text surrounding each occurrence of the search word, a display known as Key Word in Context (KWIC). If a corpus user wants to read a larger segment of the text and its context, an expanded view of each concordance is available.
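To make the idea of a concordance concrete, here is a minimal Python sketch of a KWIC display. It is only an illustration, not how any of the corpus websites mentioned here are implemented: the function name kwic, the simple word-splitting rule and the three sample sentences are all assumptions made up for this example.

```python
import re

def kwic(text, keyword, window=5):
    """Return (left, node, right) tuples for each occurrence of keyword.

    `window` is the number of words kept on each side of the keyword.
    """
    # Very rough tokenisation: runs of letters and apostrophes count as words.
    tokens = re.findall(r"[A-Za-z']+", text)
    results = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            results.append((left, tok, right))
    return results

# Invented sample text, not taken from any real corpus.
sample = (
    "The board is committed to sustainable development. "
    "We aim to deliver sustainable returns to shareholders. "
    "A sustainable business model supports long-term growth."
)

# Print each hit with the keyword aligned in the middle of the line.
for left, node, right in kwic(sample, "sustainable", window=3):
    print(f"{' '.join(left):>30}  [{node}]  {' '.join(right)}")
```

A larger window value plays the role of the expanded view: it simply shows more of the text on either side of the search word.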

For example, suppose a corporate writer is preparing a script for the CEO about the company's CSR performance and would like to find the words that collocate with the adjective “sustainable”. The writer could use the corpus search feature of the Hong Kong Corpus of Corporate Governance Reports (HKCCGR) and examine the concordances of the word “sustainable”. The concordances can be sorted alphabetically by any of the first to fifth words to the right or left of the search word.
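The sorting option can be pictured with a short Python sketch. This is a hedged illustration of the general idea only; the function sort_concordances, its parameters and the three concordance lines are hypothetical and are not taken from the HKCCGR interface or its data.

```python
def sort_concordances(lines, side="right", position=1):
    """Sort (left, node, right) lines by the Nth word left or right of the node."""
    def key(line):
        left, _node, right = line
        # On the left side, position 1 is the word immediately before the node,
        # so the left context is read backwards.
        words = right if side == "right" else left[::-1]
        # Lines with too little context sort first.
        return words[position - 1].lower() if len(words) >= position else ""
    return sorted(lines, key=key)

# Invented concordance lines: (words before, search word, words after).
lines = [
    (["to", "deliver"], "sustainable", ["returns", "to", "shareholders"]),
    (["committed", "to"], "sustainable", ["development", "of", "the"]),
    (["maintain", "a"], "sustainable", ["business", "model", "that"]),
]

# Sort by the first word to the right of the search word.
for left, node, right in sort_concordances(lines, side="right", position=1):
    print(" ".join(left), f"[{node}]", " ".join(right))
```

Sorting by the first word on the right places lines with the same following noun next to each other, which makes recurring word pairs easier to spot when scanning a long list.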

The concordances show that the word “sustainable” in corporate governance reports is related to the following concepts:

1.     Sustainable business models, practices and development

2.     Sustainable financial position, such as cost savings, earnings, profit, etc.

3.     Sustainable performance, with collocates such as effectiveness, healthiness, remarkable, growth, etc.

4.     Sustainable returns.
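Groupings like the ones above come from noticing which words keep appearing next to the search word. As a rough sketch of how that could be counted automatically, the Python below tallies the word immediately to the right of the search word. The function right_collocates and the concordance lines are invented for illustration, so the counts say nothing about the real HKCCGR data.

```python
from collections import Counter

def right_collocates(lines, position=1):
    """Count the Nth word to the right of the node across concordance lines."""
    counts = Counter()
    for _left, _node, right in lines:
        if len(right) >= position:
            counts[right[position - 1].lower()] += 1
    return counts

# Invented concordance lines: (words before, search word, words after).
lines = [
    (["deliver"], "sustainable", ["returns", "to"]),
    (["achieve"], "sustainable", ["growth", "in"]),
    (["generate"], "sustainable", ["returns", "for"]),
    (["support"], "sustainable", ["development", "and"]),
    (["drive"], "sustainable", ["growth", "and"]),
    (["long-term"], "sustainable", ["growth", "of"]),
]

# List the collocates from most to least frequent.
for word, count in right_collocates(lines).most_common():
    print(word, count)
```

A writer would still need to read the concordance lines themselves, since a frequency list alone does not show how each word pair is used in context.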

The concordances from a professional corpus can provide a wider selection of collocated words, helping technical writers express a concept in their domain precisely. Junior workers could also use a profession-specific corpus resource to strengthen their language proficiency in their professional domain.