History of Social Media
Contrary to popular perception, social media networks are by no means a recent phenomenon; they trace their origin to an idea conceived by Duke University graduate students Tom Truscott and Jim Ellis in 1979, which was launched as Usenet in 1980, over a decade before the World Wide Web was established and the general public got access to the Internet. Usenet offered a place for scientists and academics working on computers and related technologies to converge, thrash out ideas and work out possible solutions to their individual problems. The early Usenet had bulletin board services, newsgroups and other online fora, precursors to services that today’s social networks offer in a more refined and advanced form. In those days the World Wide Web did not exist and the Internet was not part of the common vocabulary. It took some time for the Internet to develop and enter the mainstream; interestingly, the launch of the World Wide Web (W3) was first announced on Usenet when, on August 6, 1991, Sir Tim Berners-Lee posted a short summary of the World Wide Web project on the “alt.hypertext” newsgroup. This date also marked the debut of the Web as a publicly available service on the Internet.
The change that brought the W3 out of the hands of scientists and academics and into the mainstream was the introduction of the Mosaic browser by Marc Lowell Andreessen, who came up with the idea of putting images alongside the endless reams of text on the Internet, with the ability to tag an image onto an HTML page. The Internet suddenly became graphic, and that gave a totally new meaning to a webpage, dramatically changing the way in which information could be presented. Usenet unfortunately stuck to the old text-based system and over a period of time started to lose users and popularity.
After the decline of Usenet, the first social networking site was designed around the concept of ‘six degrees of separation’ (Frigyes Karinthy (25 June 1887 – 29 August 1938), Chains (Láncszemek), 1929). This site was based on the web of contacts model, as opposed to the circle of friends model used by most social networking sites today. The site was aptly named sixdegrees.com when it was launched in 1997. Unfortunately, over time its use declined and the site closed down in 2001. In the intervening years, numerous other sites came and went until Google launched Orkut in 2004, which took hold and was a staggering success. Since then a number of social networking sites have been launched with varying degrees of success. Today the global mainstay appears to be Facebook.
The rise and popularity of social networking sites can, from a technological perspective, be attributed to the growth of bandwidth availability. Socially, it can probably be explained through a combination of social theories: the Aristotelian theory of humans being social animals, coupled with the practice of coming together encouraged by most religions. Ultimately, as humans we are inherently conditioned to search for others like us. Social networking, by offering an interest-based segregation of individuals, serves this purpose. Another aspect of humans that social networking sites exploit is our indulgence in nostalgia. Social networking sites are inherently structured to create a web-like linking of people. So, if we manage to trace an old school friend, the chances of our discovering more increase exponentially.
Today social networking sites have taken a strong hold and established a social networking environment, or “social economy,” which is valued at a staggering US$1.3 trillion (see McKinsey Global Institute, July 2012, The social economy: Unlocking value and productivity through social technologies), with over 1.5 billion users globally. Social networking sites have transformed from merely an online meeting space into an online marketplace.
But not everything about social networking is good. The UK Police have reported that crime involving social media has increased by 540% in the last 3 years. Closer to home, in India, the Mumbai police have published guidelines on what information to disclose on a social networking site. Cases of cyber-stalking have increased, with horrific consequences like social media-induced suicides. Take the case of a teenage girl, Megan Meier, who committed suicide; it was later revealed that a boy she admired on MySpace was actually a classmate’s mother antagonizing the teenager for being different. The mother, Lori Drew, allegedly communicated with Megan as “Josh” for over one month and then abruptly ended the relationship. Megan committed suicide the same day. Lori Drew was convicted of computer fraud and abuse, but was acquitted in respect of Meier’s death. (See http://www.reuters.com/article/2008/11/20/us-myspace-suicide-idUSTRE4AJ14O20081120.)
Over the last two decades the Internet has become a huge repository of information, and with the advent of social networks that repository has grown manifold. With such rich information about individuals ready for the taking at your fingertips, the Internet and all that it encompasses will always be a very good foraging ground both for law enforcers and for “agents of opportunity.”
Forensics in Social Media
Digital forensics is the application of forensic science methodologies to the recovery and investigation of material found in the digital world, often in relation to computer-related crime. The advent of social networks has presented unique challenges to digital forensic experts, as one of the chief concerns in digital forensics is the authenticity of the evidence, together with the challenge of establishing the chain of causation. A typical process of collecting evidence in digital forensics goes through the following steps:
- Collection of data;
- Classification of the data;
- Authentication of the data; and,
- Presentation of the data.
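As an illustration of the authentication step, the sketch below records a cryptographic digest of the collected data at the moment of collection, so that any later alteration can be detected. It is a minimal sketch and not a prescribed procedure: the choice of SHA-256, the record fields, and the names `fingerprint`, `collect` and `is_authentic` are assumptions made for the example, as are the sample data and file name.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the collected bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def collect(data: bytes, source: str, collector: str) -> dict:
    """Build a custody record describing the data at the time of collection."""
    return {
        "source": source,          # hypothetical label for where the data came from
        "collector": collector,    # who collected it
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "sha256": fingerprint(data),
        "size_bytes": len(data),
    }


def is_authentic(data: bytes, record: dict) -> bool:
    """Check that the data still matches the digest recorded at collection."""
    return fingerprint(data) == record["sha256"]


# Hypothetical sample data standing in for an exported profile snapshot.
snapshot = b'{"user": "example", "post": "hello"}'
record = collect(snapshot, source="hypothetical-export.json", collector="Investigator A")

print(json.dumps(record, indent=2))
print(is_authentic(snapshot, record))         # unchanged data matches the record
print(is_authentic(snapshot + b" ", record))  # altered data does not
```

A digest of this kind shows only that the data has not changed since it was collected; it says nothing about whether the data was complete or reliable when it was first obtained.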
From a forensic point of view, social networks have a unique architecture. They are a web of connections, as opposed to the linear connections of, for instance, online service providers, which makes them amenable to data collection with relative ease. The data (information) on a typical social network is dynamic in nature, and even popular digital forensic tools like Xplico (Xplico - Internet Traffic Decoder. Network Forensic Analysis Tool, online at http://www.xplico.org), for instance, simply cannot see or access all the data, as they are passive in their data acquisition methodology and not designed for the dynamically changing environment of a social network.
Social network forensics, as of now, has to rely on a limited set of data sources in many cases. Gaining access to the hard drives of a social network’s servers is just not feasible, and leveraging the service operator’s data directly requires the service operator’s cooperation. In some countries it is possible to obtain orders from courts to demand that the service provider directly hand over the data.
A forensic investigator can submit requests to the operator but may not receive all the relevant data, or may receive only selective or partial data, which compromises its purity and neutrality according to the established international guidelines for digital evidence collection (D. Brezinski and T. Killalea, RFC 3227: Guidelines for evidence collection and archiving, The Internet Society, accessible at http://www.ietf.org/rfc/rfc3227.txt). Consequently, the investigator is unable to show that the evidence is authentic, complete, and reliable. Hence gathering reliable data for forensic analysis is the first hurdle.
While traditional forensic methods can be used to extract information from the local web browser cache, there are numerous possibilities on the communication layer (in network terminology, the layer which is just on top of the physical layer (wire) and which breaks down all communication into packets). These range from passive sniffing on the network to active attacks like sniffing on unencrypted Wi-Fi networks and crawling with a social networking component. Crawling, however, is limited, as metadata and accurate timestamps are not shown on web pages. They are only available through the social network’s APIs (application programming interfaces). Even though it would be possible to collect data passively on the communication layer, this approach is limited: collecting the information would take a tremendous amount of time, and completeness is hardly possible. Furthermore, many social networks offer the possibility to encrypt data on the communication layer by using HTTPS, rendering passive attacks useless. The above extraction methods can be exercised only after orders are obtained from courts or other competent authorities; in their absence the steps would be open to challenge as violating privacy and technology laws. In the Indian context, for example, they would violate the IT Act because the methods involve unauthorised access to and intrusion into computer systems and networks.
Ways and means are being developed to extract data from social networking sites for meaningful forensic analysis. An interesting feature recently announced is Facebook Timeline, which encourages users never to delete anything from the social network and to use it as a historical archive. This opens up interesting possibilities for forensic examinations, as it would make historical and archival data readily available, already organised along a timeline.
Once a forensic investigator has access to the relevant data from the social networking site, he/she can set about assigning the data to the following datasets for a better interpretation of the incident at hand:
- The social footprint: What is the social graph of the user, and with whom is he/she connected as “friends”?
- Communications pattern: How is the social network used for communicating, what method is used, and with whom is the user communicating?
- Pictures and videos: What pictures and videos were uploaded by the user, and in which other people’s pictures is he/she tagged?
- Times of activity: When is a specific user connected to the social network, when exactly did a specific activity of interest take place?
- Apps: What apps is the user using, what is their purpose, and what information can be inferred in their social context?
- Groups: What groups is the specific user a member of, and what activities does the group endorse or undertake?
- Pages: What pages, organisations and viewpoints does the user endorse?
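The assignment of collected data to these datasets can be sketched roughly as follows. This is only an illustration under stated assumptions: the record format, the `kind` field and its values (`friend`, `message`, `page_like` and so on) are hypothetical, since every social network names and structures its data differently.

```python
from collections import defaultdict
from enum import Enum


class Dataset(Enum):
    """The seven datasets listed above."""
    SOCIAL_FOOTPRINT = "social footprint"
    COMMUNICATIONS = "communications pattern"
    MEDIA = "pictures and videos"
    ACTIVITY_TIMES = "times of activity"
    APPS = "apps"
    GROUPS = "groups"
    PAGES = "pages"


# Hypothetical record kinds; real networks use their own names.
KIND_TO_DATASET = {
    "friend": Dataset.SOCIAL_FOOTPRINT,
    "message": Dataset.COMMUNICATIONS,
    "photo": Dataset.MEDIA,
    "video": Dataset.MEDIA,
    "login": Dataset.ACTIVITY_TIMES,
    "app": Dataset.APPS,
    "group": Dataset.GROUPS,
    "page_like": Dataset.PAGES,
}


def classify(records):
    """Group records by dataset; unknown kinds are set aside for manual review."""
    buckets = defaultdict(list)
    unclassified = []
    for rec in records:
        dataset = KIND_TO_DATASET.get(rec.get("kind"))
        if dataset is None:
            unclassified.append(rec)
        else:
            buckets[dataset].append(rec)
    return dict(buckets), unclassified


# Hypothetical sample records.
records = [
    {"kind": "friend", "name": "A. Example"},
    {"kind": "photo", "tagged": ["A. Example"]},
    {"kind": "video", "title": "holiday"},
    {"kind": "poke", "from": "B. Example"},  # not in the mapping
]

buckets, unclassified = classify(records)
for dataset, items in buckets.items():
    print(dataset.value, len(items))
print("needs manual review:", len(unclassified))
```

Records that do not fit any dataset are kept rather than discarded, so that nothing collected is silently lost before an investigator has looked at it.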
Information related to the above data pools cannot be found on a suspect’s hard drive, as it is stored solely with the social network’s operator. Especially for people who use the social network on a daily basis, a plethora of information is stored on the social network operator’s servers.
Sometimes information is cached locally, but this is not a reliable source of information, as it is neither complete nor stored persistently. Depending on the implementation of the social network, the availability of the data itself and the possibility of retrieving it via API calls can vary among different social networks. However, most of this data can be either extracted directly or inferred without the collaboration of the social network operator. Once the data is available to the investigator, the full spectrum of social network data analysis can be conducted.
The easiest way of obtaining the data is of course with the consent of the user, who can provide the username and password. To a forensic investigator, data from all sources is important; in the case of a non-cooperative user, an analysis of the user’s computer using traditional digital forensic techniques will lead to the various personas that the user adopts in the digital world. Take, for example, the case of Justin Brown, who was arrested for impersonating a model named Bree Condon on the dating site Seekingmillionaire.com. Ms. Condon ultimately alerted the police that her name, likeness, and professional photographs were being used in a scam, which continued until Mr. Brown was arrested. Investigators later learned that Mr. Brown had phone conversations with wealthy men in exchange for money and gifts. On questioning, Mr. Brown said that he had created a plausible biography of Ms. Condon by using her online biographical information. In countries like the United States, where personality rights and privacy laws are well developed, fact situations such as this can be actionable under laws other than information technology laws.
While the data can easily be analyzed manually afterwards to answer specific questions, the massive amount of data collected from all the sources requires automated tools for a forensic investigator to see the full picture. If you add multiple social networking sites with different feature sets, the correlation of this data can be quite tiresome. Despite the ongoing development of tools to extract data, the correlation and analysis of this data still remains a very human task.
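One small part of this correlation that lends itself to automation is placing events from several networks on a single timeline. The sketch below assumes, purely for illustration, that each network’s data has already been reduced to records with an ISO 8601 timestamp carrying an explicit UTC offset and a short description; the network names, field names and sample events are hypothetical. Interpreting the resulting sequence remains the investigator’s job.

```python
from datetime import datetime, timezone


def to_utc(stamp: str) -> datetime:
    """Parse an ISO 8601 timestamp with an explicit offset and convert it to UTC."""
    return datetime.fromisoformat(stamp).astimezone(timezone.utc)


def merge_timelines(sources: dict) -> list:
    """Combine events from several networks into one list ordered by UTC time."""
    events = []
    for network, items in sources.items():
        for item in items:
            events.append((to_utc(item["time"]), network, item["action"]))
    return sorted(events)


# Hypothetical events from two networks, recorded in different time zones.
sources = {
    "network_a": [
        {"time": "2012-07-01T10:15:00+05:30", "action": "posted photo"},
        {"time": "2012-07-01T09:00:00+05:30", "action": "logged in"},
    ],
    "network_b": [
        {"time": "2012-07-01T04:00:00+00:00", "action": "sent message"},
    ],
}

timeline = merge_timelines(sources)
for when, network, action in timeline:
    print(when.isoformat(), network, action)
```

Converting every timestamp to UTC before sorting matters here, because the same moment may be reported with different offsets by different services.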
Ultimately the legal and technical objectives of digital forensics have to be streamlined, particularly in countries like ours, where white collar criminals are often able to slip through the cracks merely because the data gathering process is impeached, or because the neutrality and purity of the data collected is called into question in court under the Evidence Act on account of the possibility of tampering. With little or no digital forensics training, evidence gathering and crime solving by law enforcers, at least in our geography, is a far cry from the seamless process of investigating and booking criminals that CSI and other crime shows have popularised.
Gurjot Singh is a Chief Technologist with Fidus Law Chambers. He can be contacted at Gurjot@fiduslawchambers.com.
By Gurjot Singh