Twitter is an extremely popular social media platform. The platform was founded in 2006 and reached 500 million user accounts in June 2012, 140 million of those in the United States (Semiocast 2012). Users access Twitter through a web interface, through mobile applications on tablets or smartphones, or via SMS. The platform describes itself on its website as a real-time information network; it is also popularly described as a “micro-blogging” service, micro in the sense that users’ contributions are very brief, and blog-like in the expectation that users Tweet about themselves. In order to introduce the platform and the social organization of my corpus to the reader, I first explain what it means to be a user, then describe the publication and republication of texts, giving a partial account of how those texts appear throughout the platform and focusing on their organization through hashtags.
Though many pages on Twitter can be viewed without any authorization, active participation in the platform is possible only through logged-in accounts. Sign-up is free and open to anyone able to access the site. The process is very quick and requires a minimum of personal information. New users are prompted to enter a real name (my experience with the site indicates that these often do not follow a ‘first name, last name’ or ‘company name’ format, but may be used for self-description or parody, like many other social media handles). Users also enter an email address, with which they confirm the creation of the new account, and a password. Finally, they must create a username with any combination of letters and numbers, prefaced by ‘@’. This username, unlike the “real name,” must be unique in the platform. These fields are sufficient to create a new identity in the platform, but the user may offer more information for the profile that is created: for example, upload a small avatar, enter a biography of up to 160 characters, or indicate a location. From this profile, the user may then asymmetrically form connections with others: any user can “follow” any other user, meaning receive the Tweets of that user, and be “followed” by other users, meaning broadcast their Tweets to those users. These identities may also be used actively within the platform to create or propagate texts.
A Tweet is the primary mode of interaction in Twitter, and any user may Tweet. A Tweet begins with a user’s purely textual contribution of up to 140 Unicode characters. When that Tweet is submitted, it becomes integrated into the Twitter platform. For one, parts of the user-entered text become interactive, creating links between the Tweet and other texts, users, and sites. URLs in the text become hyperlinks. Alphanumeric strings prefaced with @ become hyperlinks to the profile page of that @username, whether occupied or not; this is called an @-mention. Alphanumeric strings prefaced with # become hashtags, which, as will be explained below, are a system of Tweet organization within the platform. This research is built around texts brought together by a hashtag, which, in addition to unifying them in topic, connects the texts to each other through this interactivity. As a whole, the text becomes integrated in a standardized and interactive display format used in the official website and app and required of any third-party client (Twitter 2012a). Each Tweet is encapsulated in a rectangular space, with the user’s avatar to the extreme left, and the user’s “real name” and @username above it, all of which link back to the user’s profile. A timestamp to the extreme right links to a permanent URL for that Tweet. The user’s Tweet text is displayed in the middle of the box. A row of “Tweet actions” can be seen along the bottom (on the website, when a user “mouses over” the Tweet). This is illustrated in Figure 1 below, a Tweet by the popular musician Lady Gaga (real name) or @ladygaga (@username) on January 31. The menu of Tweet actions for this Tweet is expanded by a mouse-over, and the options (reply, retweet, etc.) are available.


Figure 1: A Tweet by @ladygaga shows its elements and available interactions
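The linking behavior described above (URLs, @-mentions, and hashtags becoming interactive) can be approximated in a short Python sketch. This is only an illustration under stated assumptions, not Twitter's actual implementation: the function name `extract_entities` and the three regular expressions are hypothetical, and "alphanumeric" is approximated by Python's `\w` class, which also admits underscores and non-ASCII letters.

```python
import re

# Illustrative patterns only (assumptions, not Twitter's own rules).
URL_RE = re.compile(r"https?://\S+")
# (?<!\w) keeps strings like "name@example.com" from counting as @-mentions.
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


def extract_entities(text):
    """Return the URLs, @-mentions, and hashtags found in a Tweet's text."""
    urls = URL_RE.findall(text)
    # Blank out URLs first so that a '#fragment' inside a link
    # is not mistaken for a hashtag.
    remainder = URL_RE.sub(" ", text)
    return {
        "urls": urls,
        "mentions": MENTION_RE.findall(remainder),
        "hashtags": HASHTAG_RE.findall(remainder),
    }


if __name__ == "__main__":
    # Hypothetical example Tweet, invented for illustration.
    tweet = "RT @ladygaga: new single out now http://example.com/a#b #music #oomf"
    print(extract_entities(tweet))
```

A corpus built around a hashtag, as in this thesis, could in principle be filtered by checking whether the tag of interest appears in the `hashtags` list of each Tweet, though the collection method actually used here is not specified by this sketch.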

Not all Tweets, however, are unique, original texts; rather, many are reproduced from other users, especially through the practice of retweeting. The Retweet action re-publishes a Tweet under a new name, and it accounts for over half of all Tweets in the corpus analyzed in this thesis (as is true of every hashtag corpus I have examined thus far). A variety of conventions have emerged for marking texts as copied Tweets and, optionally, providing attribution, such as typing “RT @[USERNAME]” or “via @[USERNAME]:” before the copied text (see boyd, Lotan and Golder 2010). These Tweets are then displayed just the same as any original Tweet, including a hyperlink to the original user’s page through the @-mention in the typed Retweet tag, with any other hashtags or links also active in the copy. These conventions were supplemented in 2009 when the platform added a native Retweet function, which mobile apps have gradually come to support. These Retweets display slightly differently in some user interfaces, such as the Twitter website and its official app: the original Tweet, including the username, avatar, and real name of the original author, is reproduced, with a line underneath the Tweet text that identifies it as a Retweet and names the Retweeting user: “Retweeted by [REAL NAME].” Some other apps display the avatar of the Retweeting user instead of the original author, and some display “RT @[USERNAME]” before the whole Tweet. At times, users reproduce Tweets by copying and pasting the text without providing any link to the previous text.
These Tweets and Retweets are displayed across the site in several collocations, all of which are updated in real time. On each page, Tweets are displayed in reverse chronological order with the newest Tweets at the top, and a notification bar appears above all Tweets, allowing the user to load new Tweets as they are produced. Some streams are organized by user identity. The profile page of any user displays the stream of Tweets and Retweets by that user, with the newest at the top. Each user has a “Home” page, where the Tweets of all the users they follow are displayed in reverse chronological order (alternate views, organized by conversations created with the Reply action and @-mentions, are available for some Tweets through extra clicks).
This corpus is collected based on Tweets that would be included in hashtag streams, which bring together topically-tagged Tweets. In these streams, viewable as real-time search pages on Twitter, users exploit the hashtag as an organization system for Tweets. Tweets including the same hashtag are brought together by the platform on a search page stream for that hashtag. The searchability of the hashtag (whereby every iteration of a # followed by any alphanumeric string is hyperlinked to its search page) was added by the platform in 2009. However, the convention began before that, as users prefaced words with “#” in order to attach topical metadata to a Tweet, simultaneously creating and indexing content (for example, “#fail” at the end of a Tweet about a comically unsuccessful attempt to do something). A hashtag search page may display few or no Tweets if the particular hashtag string has been used only idiosyncratically or has not been used within the past two weeks. However, some hashtags develop enormous popularity and become the site of active interaction. They bring together and prompt interactions between users and texts.
The participants in a hashtag are distinct from the Twitter community in general. Participation in a hashtag is self-selecting, in that users participate only in hashtags they find personally relevant or that spark a personal reaction, and affiliative, in that a hashtag is only one of many ways to express a topic but is exploited by multiple users to bring their contributions together. The populations Tweeting about a hashtag cannot be taken to represent those using Twitter in general. However, in the case of #oomf, studied here, this offers unique possibilities of study. In what follows, I am able to use the texts that emerge around a hashtag not only to focus on one linguistic form, but also to capture texts in interaction with each other. The affiliative use of #oomf redefines it as an antecedent to introduce a specific referent, allowing the environment of Tweets and their antecedent to be held constant. Furthermore, the texts that emerge from this affiliation sometimes make apparent a deeper affiliation with each other, allowing me to trace linguistic form in use between many users.

Literature Review

This literature review proceeds by first surveying what has been written about singular they, a literature that so far has mostly emerged as a reaction against sexist generics (i.e. “generic he”). Then, it reviews trends in studies of online communication that demand that social media data be examined based on the actions and identities present in platforms, rather than offline categories. Finally, it outlines an approach to words through context.


Singular they

This paper draws from previous research on singular they, which often focuses on gender in pronouns. This research often aims to expose the gender that persists in so-called epicene (including no indication of gender), gender-neutral, or sex-inclusive pronouns, such as the once-prescribed use of “he” in a sentence like the following, where the gender of the referent is not yet determined: “If someone calls, tell him I just left.” Furthermore, the original research performed in this thesis asks about the relevance of social relations other than gender in the differential use of third-person singular pronouns, and I do not wish to presume that “singular they” is itself gender neutral. As such, calling third-person singular pronouns “masculine”, “feminine” or “epicene” seems to beg the question. The term “singular they” is also inelegant. I therefore begin this paper by establishing the convention of writing the third-person singular pronouns that are the object of my study in the nominative case and in capitals, to signify the entire class of pronouns. That is:
SHE shall stand for she / her / hers / herself, etc.
HE shall stand for he / his / him / himself / hisself, etc.
THEY shall stand for they / them / their / themselves / themself, etc.
THEY has been the object of much study since at least the 1970’s, when the issue of sexism in generic pronouns (specifically, HE in this sense) became a topic of feminist, prescriptivist, and thereafter descriptivist linguistic inquiry. These early feminist writings, in their critique specifically of generics, find sexism deeply embedded in language and the processes of prescription. They commonly observe that the history of THEY traces back to generics in such respected old tomes as Shakespeare’s ‘Comedy of Errors’, long before its proscription in favor of a more “Latin-like” HE in the 1800’s. They write that the use of a masculine pronoun can never be truly neutral, that HE necessarily renders women “invisible and silent” (Baron 1986, 100). They find the distinction between masculine and feminine gender in language to be imposed, rather than necessary, and find that it reflects and reproduces societal hierarchies (see, for example, Cameron 1992: 88). Cognitive studies corroborate the point, finding that HE used generically and for explicitly sex-inclusive groups nevertheless brings males to mind either exclusively or before females (see, for example, Moulton et al 1978). The point that gender-specific pronouns cannot stand in for referents whose gender does not match is illustrated especially clearly as McConnell-Ginet finds the “generic she” to be acceptable only when feminine referents have special salience (2011b, 196). The studies, however, tend to share the premise that THEY is motivated by lack of information about the gender of the referent, a condition that likely does not apply to most texts in the corpus studied here.
Much ink has been spilled as authors theorize a gap in the English pronominal system. Many authors write of a need for a third-person pronoun that does not specify gender, conceived mostly for generic purposes. In one functional linguistic account, Weidmann (1984), upon illustrating several sentences that do not determine gender but which require a third-person singular (and emphatically not plural) pronoun, puts it as follows: “Two principal ways of stopping our gap are open: either we invent a new pronoun to express the combination of features described above, or we make do with the existing pronouns.”
Prescriptive solutions that work toward gender neutrality have had little success in the English language. Pauwels (2003) observes that attempts to reform sexism in the English language have turned almost without debate to strategies of gender-neutralization (in comparison to e.g. German, which has relied more heavily on feminization), which have met with limited success (559). Various neologisms suggested (most since the 1970’s) to achieve sex-neutrality for singular, epicene or sex-inclusive pronouns are written about by Baron as “the Word that Failed” (1986, 190-216). Gender neutral noun phrases, where newly coined, are hardly stable. Where actor and actress have been leveled to actor, “woman actor” re-emerges in use (Mühlhäusler and Harré 1990, 240); newly-minted ‘gender neutral’ forms (e.g. “chairperson”) come, in practice, to be used as the marked term mostly or exclusively for females while men continue to be referred to with the original form of the word (“chairman”) (Ehrlich and King 1994). In literature, several authors have attempted to create stories about characters without gender, but as Livia observes in her 2001 survey of several such English language novels, the characters produced are “disjointed” and “distant.” These failures of sex neutrality are very much in contrast to the persistent and common use of THEY; here, rather than finding speakers “making do” with THEY, THEY is found to occur frequently and even to seem to be the optimal choice for some speakers and referents. Though the question and gap remain open from a prescriptivist lens, considerations of language in use show a different side of the story. Miller and Swift suggested in 1976 that THEY was most likely to succeed in the proposed “gap,” because it was “already commonly used both in speech and writing” with both indefinite pronouns and lexical noun phrase referents of “indeterminate or inclusive gender” (135).
Baron found the pronoun met the least resistance when used in places where it is syntactically singular but semantically plural, as with indefinite nouns such as “person, someone, or everyone” (1986, 193-6). Stringer and Hopper’s study of spoken English, based on data spanning from the 1960’s to the 1990’s, finds such infrequent use of generic HE that the authors question whether the form was ever “an unmarked usage in English conversational interaction” (1998, 209). Cameron observed in 1992 that THEY was used “in some spoken contexts almost invariably” and “generic he” not at all (95). Further work points to the lexical availability of the pronoun (for example, McKay’s (1980) analysis of the suitability of the pronoun for prescriptivist purposes). In studies of college compositions including generic third-person pronouns, Myers (1990) finds that THEY is used consistently by the largest percentage of students and does not seem to represent an error in number. Holmes (1998) finds that 80% of the non-specific referents in a corpus of (mostly informal) New Zealand speech are pronominalized with THEY and dismisses HE as so infrequent as to be a “pseudo-generic”. These studies, together, suggest that THEY is very much in use as a third-person pronoun, but, emerging from the premise that THEY is generic, do not attempt to locate it in broader contexts.
Since THEY is so often studied in contrast with generic HE, the research often produces conclusions about gender, showing that THEY use is affected both by the gender of the speaker and by gender stereotypes about the referent. Together, these studies show that women tend to use THEY more, perhaps to talk about generic men or perhaps to talk euphemistically about women, and that THEY, in cognitive tests, is interpreted as being about men. Multiple studies of generics find that HE and SHE are overwhelmingly used when information about gender is available (for example, for sex-exclusive groups (Newman 1997, Balhorn 2001) or sex-stereotyped NP antecedents (Matossian 1997, Balhorn 2001)). When THEY is used, it is disproportionately for masculine referents. Matossian (1997) illustrates a “male bias” in men’s use of THEY in elicitation tasks where subjects completed the missing generics in written sentences. Though the gender roles stereotypically associated with the gender of a given antecedent are typically matched by the gendered pronoun, a closer inspection of uses of THEY shows that the uses are more frequent for stereotypically masculine referents and less frequent for stereotypically feminine referents. Female respondents did not show this pattern. This tendency of THEY toward masculine referents is also reported in a 1989 study in which participants drew and talked about sentences containing generic pronouns. Each generic pronoun was interpreted with male figures more than with female figures, and with genderless figures the least. THEY produced even more masculine representations and fewer sex-neutral representations than sentences with “he and she”, except in female participants who themselves used non-sexist pronouns (Khosroshahi 1989).
These studies suggest a reevaluation of introspective studies that analyze THEY to lack gender marking; though formally it may lack such marking, authors seem to consistently find referent gender relevant to its use and its interpretation. Several further studies support the importance of speaker gender. Pauwels finds that women use more “non-sexist” alternatives including THEY and almost no HE (see Pauwels 2003, 564, for the synthesis of several of her studies). Myers (1990) finds that college-age women were significantly less likely to use generic HE in an elicitation task, and although their strategies to avoid it were quite mixed, she found many consistently using THEY. Balhorn (2001) reports that female authors in his newspaper corpus use a larger proportion of THEY and less generic HE, and suggests this reflects female authors’ feelings about inclusion. These findings contribute to THEY research by showing that THEY shows more nuanced, gender-based tendencies than those implied by its apparent formal gender neutrality; however, they all tend to arise from studies of formally gender-neutral (if sex-stereotyped), generic contexts, and the use of THEY where referent gender may be known is not yet considered as a possibility.
These and other studies have also found other factors related to generic reference, sometimes tied to the plurality of THEY, to be important – even more important – in predicting the use of THEY. Features of the discourse, such as the notional plurality and concreteness of the referent, and personalization, also condition THEY. Newman, analyzing pronouns co-referent with singular epicene antecedents in TV interviews, correlates pronouns with further semantic features of the referent. THEY is found to be more common than HE, while other strategies like SHE and “he or she” are extremely rare. Stemming from the connection of “they” as a plural pronoun and “singular they”, he introduces the concept of “notional plurality” in his analysis of the syntactically singular antecedents and quantifiers linked to THEY at three levels of sentential “notional number”: plural (ex. “every”), singular, and ambiguous (or neutral). However, upon closer analysis of the co-referent antecedents, Newman finds almost no notionally singular antecedents used with THEY and concludes that plurality is even more closely associated with THEY than gender neutrality, and that it is conditioned by less concrete antecedents (470). However, when Baranowski (2002) revisits the notional number in epicene referents, while she finds THEY to be used nearly invariably for notionally plural referents, she also finds that THEY is still not only possible but the most common pronoun for generics in the other two groups. Balhorn’s work with a newspaper corpus corroborates the finding that THEY is used with less concrete individuals, finding that existential antecedents (e.g. anyone/-body) are most often used with THEY, and the more notionally singular lexical NP antecedents (“a person”, “a CEO”) are most often used with HE, SHE, or “he or she”.
McConnell-Ginet’s paper arguing that the use of pronouns is attached to conceptualizations of “people” and “prototypes” suggests that THEY prevents personalization because of its gender neutrality: “The reason singular they is not very satisfactory with definite generics and intolerable with proper names is that both personalize their referents and give them a particular identity, endow them with personality. Personal identity, personality, is in our culture closely tied to sexual identity” (2011b, 200). This echoes an intuition that characterizes McKay’s 1980 exploration of prescriptivist possibilities of THEY, as the author is reluctant to recommend the pronoun in part because the author finds that it depersonalizes (generic) referents.
Finally, a few studies reach the edge of the feminist and generic underpinnings in THEY literature, uncovering uses of THEY that are not neutral or not generic. These accounts suffer from less systematic documentation, as they seem to mostly emerge from aberrant tokens noted in corpora or from introspection. In a 2003 paper, McConnell-Ginet illustrates what she characterizes as a restricted but increasing use of THEY: “A friend of Kim’s got their parents to buy them a Miata.” However, she qualifies this use, adding that THEY “is still unlikely to be used for a specific individual in many circumstances: if, for example, both interlocutors are likely to have attributed (the same) sex to that individual” (reprinted as McConnell-Ginet 2011a, 230); this seems strangely restricted, in view of subsequent research. In their corpora, Balhorn (2009) and Newman (1992, 1997) each observe the use of THEY for individuals whose gender is known to the speaker but who have “low individuation” and appear in discourse “merely as a type”: Balhorn provides the example of a restaurant review that includes an anecdote about a patron at another table as THEY. In 2004, Balhorn writes more decisively that THEY may be substituted in discourse where HE or SHE would make the feature of gender too salient, for example, where “Somebody called while you were out and he said he’d call back later” does not express an appropriate degree of disinterest in the caller (84). Lagunoff (1997) offers a similar analysis, suggesting that THEY may be pragmatically motivated as speakers attempt to avoid providing information of poor quality.
Based on a survey of old English texts, Balhorn (2004) creates an argument that he explicitly disconnects from the feminist critique of sexism in generics, claiming instead that internal pressures in the syntax of English slowly allowed THEY to “rise” to its current place in language as an “unmarked” pronoun that allows speakers to highlight animacy of the referent while not foregrounding gender. (Unfortunately, he provides little modern documentation of this happening in non-generic contexts that would add to this discussion.) The Language Log blog documents further examples for indefinite or unknown definite antecedents of known sex like ‘a user of a men’s restroom’ (see Zwicky 2010) and even with a personal name in a letter soliciting information about a job applicant (see Pullum 2010’s “Singular They with personal name antecedent”, including the user comments judging grammaticality and suggesting that the usage is politically hypercorrect or an unedited form letter). Finally, Lagunoff, whose 1997 doctoral dissertation focuses heavily on the different types of antecedents (ex: quantifiers, definite noun phrases), opens the door to possible combinations. She comes to conclude that “antecedents of Singular they can be of any kind, including where the gender is overt or implied, except names” and with pointing (xiii). In what follows, I systematize the search for such specific referents. In order to account for the data examined here, I recognize it also as inextricably tied to the online context it emerges from.
Qualitative work recognizes that the variability of terms creates the potential for rich social meaning. Terms of reference, especially, are encoded with social significance: Schiffrin (2006), working with multiple references over extended narratives, demonstrates how referring to a person in one way always represents choosing that term over other, available options and thereby constructs a certain type of relationship between speaker and referent. These terms of reference use both “denotation” and subjectively colored “connotation” to construct the person talked about (46). Pronouns, too, have been shown to project social categorizations. In their 1990 book Pronouns and People, Mühlhäusler and Harré explore how this small unit of language creates concepts of people; they find that down to the level of grammar, “person indicating expressions in most languages include reference to specific social relations” (5), especially gender, but also hierarchy. This encoding of relations into pronouns is inherently social, contingent upon practical knowledge of the world and society (53). Joining other researchers, I look to situate this in online communication.