New Texts Out Now: VJ Um Amel, A Digital Humanities Approach: Text, the Internet, and the Egyptian Uprising

Laila Shereen Sakr (VJ Um Amel), “A Digital Humanities Approach: Text, the Internet, and the Egyptian Uprising,” Middle East Critique Volume 22, Issue 3 (2013).

Jadaliyya: What made you write this article?

Laila Shereen Sakr/VJ Um Amel (LSS/VJUA): The idea for the article came from a discussion I had over a year ago with Middle East Critique Guest Editor Mark LeVine about why some scholars are averse to studying text that is accessed through a database. I explained to him why I believe it is critical to have people trained in humanistic and social research looking at what is posted on the Internet. At a time of great debate on how activists in the Middle East have used emerging technologies towards political mobilization, I believe that studies on digital or social media must carefully unpack the politics and the technology together.

I gave him the example of when I analyzed the frequency of several hashtags (#) in Twitter posts on Egypt over a year using R-Shief, a digital archive I built. The comparative, historical map showed a sudden surge of posts on #Tahrir in November 2011. At first glance, I assumed the surge of activity was about the controversial parliamentary elections. It was not until I read what people were posting that I realized this public outcry was about the deadly military attack on civilians on Mohammed Mahmoud Street. This sort of analysis of social media data, in my opinion, can help us write history more accurately. This is only one example. When we put these little gems of information together, we get a new picture that challenges the accepted portrayal of political resistance in the region.
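The two steps described here, counting a hashtag over time and then reading the posts behind a surge, can be illustrated with a minimal Python sketch. It is not R-Shief's implementation: the sample records are invented for illustration, and the function names `monthly_counts`, `peak_month`, and `posts_in_month` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Invented sample records (ISO timestamp, post text); not real R-Shief data.
tweets = [
    ("2011-10-03T10:00:00", "Heading downtown #Tahrir"),
    ("2011-11-19T18:30:00", "Tear gas on Mohammed Mahmoud Street #Tahrir"),
    ("2011-11-20T09:15:00", "Field hospital needs supplies #Tahrir #Egypt"),
    ("2011-11-21T21:40:00", "#tahrir is full again"),
    ("2011-12-02T12:00:00", "Election results today #Egypt"),
]


def has_hashtag(text, hashtag):
    """Case-insensitive match on whitespace-separated tokens."""
    return hashtag.lower() in {word.lower() for word in text.split()}


def monthly_counts(records, hashtag):
    """Count posts containing the hashtag, grouped by YYYY-MM."""
    counts = Counter()
    for stamp, text in records:
        if has_hashtag(text, hashtag):
            counts[datetime.fromisoformat(stamp).strftime("%Y-%m")] += 1
    return counts


def peak_month(counts):
    """Return the month with the most matching posts."""
    return max(counts, key=counts.get)


def posts_in_month(records, hashtag, month):
    """Return the text of matching posts so a researcher can read them."""
    return [
        text
        for stamp, text in records
        if has_hashtag(text, hashtag) and stamp.startswith(month)
    ]


counts = monthly_counts(tweets, "#Tahrir")
surge = peak_month(counts)
for text in posts_in_month(tweets, "#Tahrir", surge):
    print(surge, text)
```

The count only locates the surge. What the surge was about comes from the last step, which hands the text back to a human reader.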

In this article, I set out to make a case as to why researchers should incorporate computational analytics and content from digital archives in their research. What does this data tell us about the Egyptian revolution, as in this example, that historians, political scientists, and anthropologists do not already know? This is not a small task, and this article only begins to address the subject.

I aim to debunk several myths in the work and instead develop several principles that should inform our work in the field:

1. Technologies are not objective, accurate, or void of political or cultural formations.

In my research, I have expressed concerns over Arabic software localization—which is a means of adapting computer software to different languages and regional differences. At an even deeper level, the shift in programming from using C++ (a highly mathematical computer language) to using Java (a computer language that contains lots of English-based vocabulary) has meant a shift into what is culturally English-based. Yet, the academic field of Internet Studies is so young that its ethical standards and traditions are still being debated. Results are often published without public access to the data or tools used in the analysis.

Critiques of race, power, and colonialism are rarely brought into studies on how, for example, Egyptian activists have used Silicon Valley tools to design their own virtualities specific to local culture and history. As recently as 2006, developers were still building applications to enable Arabic characters on a keyboard. Several open source projects to develop software for Arabic on Drupal, Yamli, Google, and other platforms enabled Arabic-language content to grow dramatically in the years to follow.

2. It is not okay to study only English language posts when researching Arab social media.

Any scholarly project that studies digital media or social media on the Arab world must consider the Arabic content, which comprises 80% to 99% of posts on the region. Studying only the English-language tweets is simply not enough and will skew the results. In my visualized analysis of Facebook users, for example, the case for why it is imperative to read the Arabic text is clear.

[Figure One: This is an analysis of more than six million Facebook users from 2011-2012 conducted by R-Shief, Inc.]
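To see how much an English-only sample can leave out, one rough check is to measure what share of posts are written mainly in Arabic script. The Python sketch below is an assumption-laden illustration, not the method behind Figure One: the sample posts are invented, the function names are hypothetical, and script detection is only a proxy for language (it would count Persian or Urdu as Arabic script and would miss Arabic written in Latin letters).

```python
# Invented sample posts for illustration only.
posts = [
    "الشعب يريد إسقاط النظام",
    "Tahrir Square is full tonight",
    "ثورة #Jan25",
    "يسقط حكم العسكر",
]


def is_arabic_letter(ch):
    """True for characters in the Arabic and Arabic Supplement Unicode blocks."""
    return "\u0600" <= ch <= "\u06FF" or "\u0750" <= ch <= "\u077F"


def arabic_letter_share(text):
    """Fraction of alphabetic characters that are Arabic-script letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(1 for ch in letters if is_arabic_letter(ch)) / len(letters)


def share_of_arabic_posts(texts, threshold=0.5):
    """Fraction of posts whose letters are mostly Arabic script."""
    if not texts:
        return 0.0
    mostly_arabic = [t for t in texts if arabic_letter_share(t) > threshold]
    return len(mostly_arabic) / len(texts)


print(f"{share_of_arabic_posts(posts):.0%} of sample posts are mostly Arabic script")
```

In this toy sample, a study restricted to English-language posts would read only one post in four.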

3. Big data is not notable due to size, but because of its relationality to other data.

Another more common definition of big data, by business analyst Doug Laney, points to three key qualities of such data: volume, velocity, and variety. This has larger theoretical implications. Indeed, big data is a poor term to describe the storage and analysis of large and/or complex data sets using techniques such as NoSQL and machine learning.

The challenge of normalizing uneven data inputs into one system or database requires building a network where each data point has its position and function in relation to all the other data points. When the size, speed, and variety of the data reach a threshold, new, creative methods are developed to keep all the pieces working together. The traditional table of rows and columns is not always the most efficient way to store and process data. It is in the relationality of the data points to each other that machine learning can take place and programs can be built to be “smart.”
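One way to picture this relationality is to store posts as a network rather than as rows in a table. The Python sketch below builds a hashtag co-occurrence graph, in which each hashtag is defined by the other hashtags it appears with. It is a hypothetical illustration with invented records and an assumed function name, `cooccurrence_graph`, and does not describe how R-Shief or Kal3a actually stores data.

```python
from collections import defaultdict
from itertools import combinations

# Invented records for illustration; each post lists the hashtags it uses.
posts = [
    {"id": 1, "tags": ["#tahrir", "#egypt"]},
    {"id": 2, "tags": ["#tahrir", "#jan25", "#egypt"]},
    {"id": 3, "tags": ["#jan25"]},
]


def cooccurrence_graph(records):
    """Map each hashtag to the hashtags it co-occurs with and how often."""
    graph = defaultdict(lambda: defaultdict(int))
    for record in records:
        # Every unordered pair of distinct tags in a post adds one to the edge.
        for a, b in combinations(sorted(set(record["tags"])), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


graph = cooccurrence_graph(posts)
for neighbor, weight in sorted(graph["#tahrir"].items()):
    print("#tahrir", neighbor, weight)
```

A table would list each post on its own row. The graph instead records how data points relate to one another, which is the kind of structure that later analysis can build on.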

In a recent blog post on R-Shief’s new search engine, Kal3a (Arabic for “castle”), the new system’s data architecture is carefully explained. By 2014, I would say that the methodological shift to using big data in textual analysis in the humanities has meant a shift from big data to smart data – or big, smart data.

4. Visualizations of Internet data are not claims about material bodies or the intentions of communicators.

They are traces of an embodied moment of intentional use of digital media. Every data point has an embodied analogue at some moment. And tweets, as a particular category of digital data, have a very particular (historically specific, geo-specific) moment of origin that is exceedingly tangled with material bodies. My project is to figure out what the emerging patterns tell us about the virtual body politic.

A difference between the body politic and the virtual body politic is that while the former is understood as an abstraction of a group of people governed by one authority, the latter is an abstraction of people who exchange ideas publicly online about the governance of an authority. Yet, how do we analyze the negotiation between the materiality (the analogue) and information patterns (the data points)? In a world that is witnessing Arab revolution and counterrevolution and Twitter popularity, how might even minimal participation in virtual, networked publics (through retweeting, video documenting, blogging, etc.) affect the body politic?

J: What particular topics, issues, and literatures does it address?

LSS/VJUA: This article presents an argument for conducting humanities-based research on current events in the Middle East using computational means to examine the infinitely growing scale of social media streams. Specifically, I argue that the political uprisings in Egypt emerged from a matrix of influences, notably, the convergence of three cultural rubrics:

A technological infrastructure consisting of the Internet, blogs, Twitter, Facebook, smart phones, the convention of the hashtag (#), phone numbers stored in digitized contact lists, and personal mobile communication devices.
Embodied digital practices such as use of mobile devices, or behavior in public demonstrations towards media cameras, screens, and storytelling.
A national narrative of thawra (Arabic for “revolt” or “revolution”) woven into the discourse by the vox populi, pundits, and journalists alike.

Together, this convergence enabled the mobilization of a body politic that was identified by global witnesses as part of a broader "Arab Spring" originating in Tunisia, and the moment of what has become a cycle of revolution and counterrevolution in the Middle East.

The article begins with a brief description of the methodology used in this study – cultural analytics. Eschewing the technological fetishism that prevails in much of the discourse around Internet research, specifically social media analytics, this methodology instead insists on the integration of history and culture in a manner that links learning outcomes with the affordances of media. I then provide a review of scholarship on media and the Middle East that reveals a lack of engagement with digital media content, whether as primary sources or in critically questioning the tools and analytics provided.

The following sections discuss the process of building a knowledge management system and digital archive, R-Shief. Of course, there are many challenges to building a digital archive of streaming social media content. Much like when people started using Islamic court records for historical research, it is a lot of trouble to sift through these new sets of documents – bad handwriting, heavily coded technical language, etc.

The article ends with a close analysis of the tweets on #Tahrir from 2011-2012. This section is meant to serve as a foundation for future, in-depth close textual analysis of social media. When charting various hashtags in this exercise, it seemed evident that Tahrir has been imagined as a nationalist trope for the revolution.

J: How does this work connect to and/or depart from your previous research?

LSS/VJUA: Previously, I spent a lot of my time writing about the digital archiving process and the computational innovations. In a paper delivered via Skype at the World Congress for Middle East Studies in Barcelona in 2010, I wrote:

“To amass an archive is a leap of faith, not in the function of preserving data, but in the belief that there will be someone to use it, that the accumulation of these histories will continue to live, that they will have listeners….R-Shief joins a history of archival art works that urgently seek to critique historical information on the contemporary Middle East—information currently under siege, in real time and place, as cultures are destroyed or lost in conflict. However similar, R-Shief is a website that is concerned with archiving and indexing, rather than showcasing, on issues including but not limited to art and culture.”