Authentication of digital documents
by Stephen Mason
This article was published in the e-Newsletter on the fight against cybercrime, Number 5, November 2009, 1 – 5
Copyright in this article is vested in the author, Stephen Mason, and the author has asserted his right under the Copyright, Designs and Patents Act 1988 to be identified as Author of this Work.
The author grants you a licence to download and print copies of this article PROVIDED THAT you (a) retain the copyright notices contained at the beginning and end of the article in its entirety, (b) clearly identify this article as being written by the author in electronic and printed versions and (c) only use it for your private use.
Testing authenticity is not new
The digital environment has caused us to more fully comprehend the complexities of the assumptions we take for granted in the physical world. One example is buying goods and services over the internet. Previously, buying at a distance was mainly by mail order catalogue, a means of selling that began in the nineteenth century. Catalogues are not immune from schemes to defraud buyers. However, because it takes time, trouble and expense to display merchandise in a catalogue, the buyer feels a degree of reassurance that the catalogue is trustworthy because it is a physical object, which in turn provides a high level of comfort. The catalogue confirms the existence of the supplier, and acts to reassure the buyer that the goods they order will be supplied.
The physical item of a catalogue should not, necessarily, act to reassure the buyer. The catalogue could be an elaborate swindle to defraud buyers by purporting to offer goods, and once sufficient money has been collected, the thieves disappear without supplying the goods ordered. However, for such a swindle to be worthwhile, the cost of setting up the deception has to be low enough to provide an adequate return. This means that attempts to defraud buyers by means of mail order catalogues are relatively rare, because the set-up costs are disproportionate to the return. The reverse applies in the digital world. The outlay in setting up a similar operation in the digital domain is minimal, and the returns can be significant. Mass education has helped budding crooks, because schools across the world teach children how to read and write, and provide sufficient knowledge about computers to encourage the fraudsters whilst young.
Until the emergence of the digital environment, we relied on our perceptions, based on experience, about the physical materials we handled in everyday life. A person will trust a physical object by considering its intrinsic properties. Items that are not trustworthy may be manifest because a letterhead does not quite look genuine, or a manuscript signature on a cheque differs a little too much from that which is usually observed. In essence, we all authenticate the provenance of physical objects every day: just as we are learning to verify our perceptions in the digital world.
In responding to the risks associated with the digital realm, attempts have been made to provide sufficient assurance that the person we are dealing with is the person we should be dealing with. This is often described incorrectly as ‘security’ by people at the end of a telephone in a call centre. Authenticating the identity of the other party certainly forms part of the security process, but in itself, the aim of authentication is to validate the person’s identity: it is necessary to exchange sufficient information to reassure one or both parties that the person they are communicating with is the person whom they claim to be.
In responding to this quandary, the digital world has moved us away from dependence on a signature, to reliance on the use of multiple items of information that, taken together, help to validate the identity of the user to an acceptable degree of trustworthiness. The digital world is with us for the foreseeable future, and IT will continue to be used for nefarious purposes in ways that cover both the criminal and civil law. It is for lawyers to more fully understand the nature of digital evidence, and it is also important to ensure proof of digital evidence is not sidelined into obscurity.
The term ‘authentic’ is used to describe whether a document is genuine. However, it is, perhaps, misleading to use the term ‘authentic’ when referring to a digital document or, perhaps more accurately, a digital object. This is because of the way a digital object is created and made visible. For digital data to be made intelligible to a human being, it must be interpreted. Digital data is processed through a sequence of commands, so a simple document containing written text, for example, will consist of a number of ASCII character codes that must be interpreted before the text is reproduced on a screen in human readable format. However, digital data is not restricted to simple text documents. The format of the data can be of a more elaborate nature, including active components such as macros and scripting language, which means the data might require more complex interpretation to read the text. Also, a file displayed on a different computer to the computer that originally created the file can, and often does, appear with a different font and different line breaks. This is why the appearance of the same document can differ from one computer to another.
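The point about interpretation can be illustrated with a minimal sketch in Python. The byte values and the two character sets chosen here are assumptions made purely for illustration; they are not taken from any document discussed in this article. The sketch shows that stored data is a sequence of numbers, that readable text appears only once those numbers are interpreted, and that the same stored values can be displayed differently under a different convention.

```python
# A short text as it is stored: a sequence of numeric character codes.
stored = bytes([65, 117, 116, 104, 101, 110, 116, 105, 99])

# Interpreting the codes as ASCII produces human readable text.
text = stored.decode("ascii")
print(list(stored))
print(text)

# A single byte outside the ASCII range (hypothetical example data).
# Two computers using different character sets display it differently,
# although the stored data is identical on both.
price = b"\xa3100"
as_latin1 = price.decode("latin-1")
as_cp437 = price.decode("cp437")
print(as_latin1)
print(as_cp437)
print(as_latin1 == as_cp437)
```

Nothing in the stored data changes between the last two interpretations; only the rules used to display it differ.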
The definition of authenticity in respect of a physical document comprises such attributes as the state of being the original, or of being faithful to an original, uncorrupted and, perhaps, with a verified provenance. In comparison, it is more difficult to be clear as to what is meant by an ‘authentic’ digital object. If, for instance, a particular macro (say a macro that is used to automate frequently used movements of the mouse) is missing from a computer upon which a copy of the digital document rests, the question that must be raised is whether the lack of the macro on the computer on which the data now rests renders the document something other than the genuine document.
To a certain extent, the technical focus of proving the authenticity of a digital object is to have checks and balances in place to demonstrate the history of how the data has been managed, which leads to the assertion that the data has not been modified, replaced, or corrupted and must, therefore, be ‘original’. This proposition rests on two conditions: first, the data is subject to a chain of custody; and second, the data has not been modified without authority between the time it was created or added to the depository and the moment it was required.
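One common technical check on the second condition is to record a cryptographic digest of the data when it enters the depository and to compare it with the digest of the data when it is later retrieved. The following is a minimal sketch in Python, assuming SHA-256 as the digest; the function names and the sample content are hypothetical and serve only as illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def unmodified(data: bytes, recorded_digest: str) -> bool:
    """True if the data still produces the digest recorded earlier."""
    return fingerprint(data) == recorded_digest


# Digest recorded when the object is added to the depository.
deposited = b"Draft agreement, version 1."
recorded_digest = fingerprint(deposited)

# Later, two copies are retrieved: one faithful, one altered.
faithful_copy = bytes(deposited)
altered_copy = deposited.replace(b"version 1", b"version 2")

print(unmodified(faithful_copy, recorded_digest))  # True
print(unmodified(altered_copy, recorded_digest))   # False
```

A matching digest indicates only that the data is the same as it was when the digest was recorded. It says nothing about who created the data, and it is only as trustworthy as the record of the digest itself, which is why the chain of custody remains a separate condition.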
The unique nature of digital data means that although the data may be created in program memory, it might be saved on a number of different storage media. Further, each digital object may be replicated in a number of places, which means there is no single ‘original’. This has implications for understanding the nature of digital data. In essence, there is a need to accept that the concept of an ‘original’ and ‘authentic’ digital object is meaningless. Therefore it is necessary to consider the meaning of ‘authentic’ in terms of a digital object in the relevant context.
Conceptually, a digital object is authenticated by verifying the claims that are associated with the object. This can be done in a number of ways: by considering the organizational criteria demonstrating the provenance of the digital object, including the documentation pertaining to the chain of custody (and the extent to which this documentation is trusted), and the extent to which the custodians can be trusted; by examining the object forensically to establish whether its characteristics and content are consistent with the claims made about it and the record of its provenance; and by using any signatures, seals and time stamps that may be attached to the object to help test the claims to consistency and provenance.
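By way of illustration only, the documentation of a chain of custody can itself be made harder to alter unnoticed by linking each entry to the one before it. The sketch below, in Python, is one possible approach and is not drawn from any system described in this article: the field names, the custodians and the sample document are all hypothetical. Each log entry stores a digest of the previous entry, so that a later change to an earlier entry can be detected, and the object in hand can be compared with the digest recorded at each step.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    """Hash one log entry in a canonical (sorted-key) JSON form."""
    canonical = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def append_entry(log: list, custodian: str, action: str, object_digest: str) -> None:
    """Add an entry that records the hash of the entry before it."""
    previous = entry_hash(log[-1]) if log else ""
    log.append({
        "custodian": custodian,
        "action": action,
        "object_digest": object_digest,
        "previous": previous,
    })


def chain_is_intact(log: list) -> bool:
    """True if every entry still matches the hash stored by its successor."""
    return all(later["previous"] == entry_hash(earlier)
               for earlier, later in zip(log, log[1:]))


def object_matches_record(log: list, data: bytes) -> bool:
    """True if the object in hand has the digest recorded at every step."""
    digest = hashlib.sha256(data).hexdigest()
    return all(entry["object_digest"] == digest for entry in log)


document = b"Example document held in a depository."
digest = hashlib.sha256(document).hexdigest()

custody_log = []
append_entry(custody_log, "Records clerk", "deposited", digest)
append_entry(custody_log, "Archivist", "transferred to archive", digest)
append_entry(custody_log, "Examiner", "retrieved for inspection", digest)

print(chain_is_intact(custody_log))                  # True
print(object_matches_record(custody_log, document))  # True

# Altering an earlier entry breaks the link held by the entry after it.
custody_log[0]["custodian"] = "Someone else"
print(chain_is_intact(custody_log))                  # False
```

The sketch has obvious limits. The most recent entry is not protected unless its hash is held somewhere else, and a log of this kind cannot show whether the custodians named in it deserve to be trusted, which remains a matter of evidence.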
In essence, the ability to prove the authenticity of a digital object is not proving that an original exists, especially when referring to something as dynamic as a database. The issue is about trust, or the lack of trust. Proving the authenticity of a digital object means providing sufficient evidence to convince an adjudicator that the object that has been retrieved is a faithful representation of what is claimed to be the ‘original,’ or a reliable representation of the object that was in turn relied upon by the originator.
Practical questions
However, the authentication of a digital document does not necessarily require an elaborate inspection by a digital evidence specialist. As observed in a case from Greece, Case No. 1327/2001 – Payment Order, when people (including lawyers and judges) are made aware that typing their name into an e-mail is a form of electronic signature, their first response is to ask the question ‘Is it safe?’ The reply to this question is: ‘You have asked the wrong question’. Nobody asks the question ‘Is it safe?’ when presented with a manuscript signature on a letter with the name of a firm, company or public body printed on the paper. Yet the entire letter may be a fabrication. The manuscript signature and the name of the firm or company may be forged or not even exist. The correct question to be asked of any document bearing a signature (whether in digital format or on paper) is this: ‘Is the document authentic?’ The recipient must enquire as to what action they should take to confirm the document is authentic. If the document is not authentic, then whether the signature is that of the person whose signature it purports to be, will probably not be relevant.
It will not always be necessary to request a digital evidence specialist to investigate the document technically to establish authenticity. For instance, the recipient of an e-mail or document might have sufficient information to determine for themselves whether they trust the source of the communication. Even if a person is not aware of the header information that is available to test the source of the e-mail (although this may be forged), there are a number of features that a recipient can assess, including the authenticity of the sender’s e-mail address, the linguistic structure of the text of the e-mail, and whether there are references to physical attributes, such as a postal address or a telephone number. Extrinsic evidence of this nature was held to be sufficient in criminal proceedings in England, in the case of R v Mawji (Rizwan) [2003] EWCA Crim 3067; [2003] All ER (D) 285 (Oct), 2003 WL 22477344 (CA (Crim Div)), and the robust comments of Ford Elliott J in the United States of America in the case of In the Interest of F.P., A Minor, 2005 P.A. Super 220, 878 A.2d 91 (Pa.Super. 2005), 2005 Pa. Super. LEXIS 1499 illustrate this point effectively.
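For readers who wish to see what the header information mentioned above looks like in practice, the following minimal sketch uses the e-mail module in the Python standard library to read the routing (‘Received’) headers of a message and to compare the domain of the sender’s address with that of the reply address. The message is invented for illustration, using reserved example domains and addresses; it is not taken from any of the cases cited.

```python
from email import message_from_string
from email.utils import parseaddr

# A hypothetical message, including the headers a mail program usually hides.
raw = """\
Received: from mail.example.org (mail.example.org [192.0.2.10])
    by mx.example.com; Mon, 2 Nov 2009 10:15:02 +0000
Received: from [198.51.100.7] (unknown [198.51.100.7])
    by mail.example.org; Mon, 2 Nov 2009 10:15:00 +0000
From: "Accounts Department" <accounts@example.org>
Reply-To: payments@example.net
To: recipient@example.com
Subject: Invoice
Date: Mon, 2 Nov 2009 10:14:58 +0000

Please see the attached invoice.
"""

msg = message_from_string(raw)

# Each server that handles a message adds a Received header at the top,
# so reading them in reverse order follows the route from the sender.
received = msg.get_all("Received", [])
for hop in reversed(received):
    print(" ".join(hop.split()))


def domain_of(header_value: str) -> str:
    """Return the domain part of the address in a header value."""
    address = parseaddr(header_value)[1]
    return address.rpartition("@")[2].lower()


from_domain = domain_of(msg["From"])
reply_domain = domain_of(msg.get("Reply-To", msg["From"]))

# A reply address in a different domain is not proof of anything,
# but it is one of the features a recipient may wish to question.
print(from_domain, reply_domain, from_domain == reply_domain)
```

As noted above, header information can be forged, so a check of this kind is one factor to weigh alongside the content and context of the message, not a conclusive test.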
The nature of the problems lawyers will continue to face respecting digital evidence is illustrated in the recent case of Campaign Against Arms Trade v BAE Systems PLC [2007] EWHC 330 (QB). Mr Justice King granted Norwich Pharmacal relief to the Campaign Against Arms Trade (CAAT) against BAE Systems PLC (BAE) in this instance. Ann Feltham sent an e-mail on the 29 December 2006 to the members of the CAAT steering committee internal e-mail list, a private list not open to the members of the public and comprising only the 12 members of the steering committee and seven members of CAAT’s staff. The e-mail contained privileged legal advice that CAAT received from its solicitors. A copy of the e-mail was sent to BAE. Solicitors for BAE returned a copy of the e-mail printed on paper to CAAT’s solicitors. This was the first time that CAAT came to know of the leak. The e-mail returned to CAAT was incomplete, as described by Mr Justice King, at 31:
‘It was a redacted version of that which had come into the possession of the Respondent and/or its own solicitors. All the routing information, the header address and so forth, which would give details of the email accounts through which the email had been received and sent before arriving at the Respondent and its solicitors, had been removed. Such removal must have been done either by the Respondent or by its solicitors acting on its instructions.’
The source of the leak could only be one of two possibilities, and CAAT did attempt, unsuccessfully, to trace the source, as described by Mr Justice King:
45. As Ann Feltham says, there are really only two broad possibilities: either the source is one of the authorised recipients of the email, i.e. a member of the Applicant’s steering committee or staff, or the email was intercepted or retrieved by other means by a person or persons unknown, be it by improper access to the Applicant’s or a recipient’s computer system, interception at riseup.net or at some point whilst the email was sent over the internet. In her first witness statement she explains how she made enquiries of each of the authorised recipients who each denied forwarding the email on. Her second witness statement was made in response to that part of the Respondent’s skeleton argument in which it is said that the Applicant has not done enough and that before seeking the present order the Applicant should have (skeleton para.27.) “examined the electronic data available to it on its own computer systems and those of ‘riseup.net’ and further should have asked any authorised recipients to provide it with access to their personal electronic data for purpose of determining whether their denials of involvement in the copying are accurate”.
46. In this later statement Ms Feltham says she did check the ‘sent folders’ on the personal computers of the staff based in the Applicant’s office, but explains that there was a major practical and logistical problem as regards access to the computers used by members of the steering committee. Unlike the staff they are not employees of the Applicant but volunteers who do not work in the office or use computer systems belonging to the Applicant. Some are members of other organisations who access emails from accounts and equipment owned by their employers. Some are based outside London. This all means that to have investigated further on the lines suggested by the Respondent, the Applicant would have needed access to computers to which the Applicant has no right of access and in any event the Applicant would have needed the “costly services of a computer expert to go on a fishing expedition for emails which might or might not have been sent which moreover would have been very time consuming”.’
The claim by BAE that CAAT ought to physically examine every computer to trace the route of the e-mail is somewhat unrealistic, as explained above, and also fails to grasp the fundamental issue, that digital data knows no geographical or physical bounds. Returning the e-mail without the source data is similar to returning a letter received through the post in an envelope, yet refusing to deliver up the envelope. That the routing and other technical data is ‘similar’ to the data included on an envelope is an understatement, because the routing and other metadata available in relation to an e-mail is far more extensive than the metadata contained on an envelope. In this instance, Mr Justice King concluded that the order sought ought to be granted, although not in the terms requested.
This application, and the decision by Mr Justice King, illustrates the importance of the metadata associated with a digital object. Documents in digital format include metadata as a matter of course, and it seems unrealistic for the recipient to refuse to deliver up the full document, including the associated metadata, in such circumstances.
© Stephen Mason, 2009
http://www.stephenmason.eu
