North Cooc & Genevieve Leung

Many scholars have called for disambiguating details about Asian Americans and Pacific Islanders (AAPIs) across backgrounds like ethnicity, education, civic participation, and time of migration. Equally muddled at times are the subgroups of various nationalities that are categorized together under the guise of sharing a similar language. “Chinese,” this project’s focal point, is spoken by nearly three million people in the U.S., yet the term can refer to a single language (e.g., Cantonese), an ethnic group (e.g., Han Chinese), or a group of languages in the Sino-Tibetan language family. Treating Chinese as a single category to identify a group overlooks diversity in language, immigration patterns, socioeconomic status, challenges, and communities of care in the United States.

This study disaggregates the population of Chinese speakers in the U.S. to understand how speakers identify with different varieties of Chinese. Using Census data on Chinese speakers from the American Community Survey (ACS) Public Use Microdata Sample (PUMS), the study examines areas where different Chinese speakers may struggle in terms of educational attainment, employment and income, and English proficiency. The availability of detailed records on demographic characteristics at the individual level provides a unique opportunity to analyze the population of Chinese speakers in the United States.
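The kind of disaggregation described above can be illustrated as a weighted tabulation of home-language labels. The following is a minimal sketch, not the study's actual procedure: the function name, the `language` and `weight` field names, and the sample records are all hypothetical, and the weights are made-up numbers rather than ACS estimates. With real PUMS person records, the label and weight would presumably come from the home-language and person-weight fields listed in the data dictionary.

```python
from collections import defaultdict


def weighted_language_shares(records, label_key="language", weight_key="weight"):
    """Return each language label's share of the total survey weight.

    Hypothetical helper: `records` is any iterable of dicts holding a
    language label and a person-level survey weight.
    """
    totals = defaultdict(float)
    for rec in records:
        totals[rec[label_key]] += rec[weight_key]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: weight / grand_total for label, weight in totals.items()}


# Made-up illustrative records, NOT actual ACS estimates.
sample = [
    {"language": "Chinese", "weight": 120},
    {"language": "Cantonese", "weight": 40},
    {"language": "Mandarin", "weight": 30},
    {"language": "Chinese", "weight": 10},
]

shares = weighted_language_shares(sample)
for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {share:.1%}")
```

Summing survey weights instead of counting rows matters because each microdata record stands in for a different number of people in the population; an unweighted count would describe the sample, not the population of speakers.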

The results indicate that almost two-thirds of Chinese speakers respond with only “Chinese” when asked in the ACS about the language spoken at home. About 16% to 17% each respond in more detail with Cantonese or Mandarin. Less than 3% identify Formosan as the language spoken at home, making it the smallest Chinese language category. Although “Chinese” can include the other three languages (or ones that the ACS excluded), a majority of Chinese speakers thus give only the broader label. Other results highlight distinct forms of Chinese language identification that are related to geography, citizenship, and place of birth. In states with larger Chinese populations, like California and Hawaii, respondents identify with specific Chinese languages. Cantonese speakers, in particular, are more likely to have lower levels of English proficiency and educational attainment. The results have implications for local organizations targeting social services and care within often diverse Chinese communities.

Findings from this analysis demonstrate the need for both macro and micro levels of disambiguation if AAPI data are to truly reflect the diversity of the AAPI community. Part of the challenge in disaggregating the data by language is related to the survey question and design. Currently, the ACS allows respondents to fill in the home language that they speak. While we agree with this approach, one recommendation is to encourage respondents to consider all forms or dialects of a language. Because of context and of current and past ideologies about what Chinese means, Chinese speakers may select Chinese broadly even if they are aware of the differences between Cantonese, Mandarin, and Hakka. Priming respondents with the idea of different forms of a language can improve data collection and disaggregation.